WorldWideScience

Sample records for accurate svm-based gene

  1. Accurate Multisteps Traffic Flow Prediction Based on SVM

    Directory of Open Access Journals (Sweden)

    Zhang Mingheng

    2013-01-01

    Accurate traffic flow prediction is a prerequisite for intelligent traffic control and guidance, and an objective requirement of intelligent traffic management. Because urban transport systems are strongly nonlinear, stochastic, and time-varying, artificial intelligence methods such as the support vector machine (SVM) are receiving increasing attention in this field. Compared with traditional single-step prediction, multistep prediction can forecast traffic state trends over a certain future period. From the perspective of dynamic decision-making, this is far more important than knowing the current traffic condition alone. This paper therefore proposes an accurate multistep traffic flow prediction model based on SVM. The input vectors were composed of actual traffic volumes, and four different types of input vector were compared to assess their prediction performance. Finally, the model was verified with actual data in an empirical analysis; the test results showed that the proposed SVM model predicts traffic flow well and that the SVM-HPT model outperformed the other three models.
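
The abstract above does not say how the input vectors were built, so the following is only a minimal sketch of one common way to frame multistep prediction: a sliding window over the observed volumes. The function name `make_multistep_samples` and the parameters `n_lags` and `horizon` are hypothetical, and the sample volumes are made up for illustration.

```python
from typing import List, Sequence, Tuple


def make_multistep_samples(
    volumes: Sequence[float], n_lags: int, horizon: int
) -> Tuple[List[List[float]], List[List[float]]]:
    """Turn a traffic-volume series into (inputs, targets) pairs.

    Each input holds the n_lags most recent observations; each target
    holds the next `horizon` observations that a model should predict.
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive")
    inputs, targets = [], []
    last_start = len(volumes) - n_lags - horizon
    for start in range(last_start + 1):
        split = start + n_lags
        inputs.append(list(volumes[start:split]))
        targets.append(list(volumes[split:split + horizon]))
    return inputs, targets


# Made-up volumes (vehicles per interval), for illustration only.
volumes = [120, 135, 150, 160, 155, 140, 130]
X, Y = make_multistep_samples(volumes, n_lags=3, horizon=2)
for x_row, y_row in zip(X, Y):
    print(x_row, "->", y_row)
```

Each row of `X` could then be passed to a regression model, with one model per prediction step or a single model applied recursively; which of these the paper used is not stated in this record.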

  2. CompareSVM: supervised, Support Vector Machine (SVM) inference of gene regularity networks.

    Science.gov (United States)

    Gillani, Zeeshan; Akash, Muhammad Sajid Hamid; Rahaman, M D Matiur; Chen, Ming

    2014-11-30

    Prediction of gene regulatory networks (GRN) from expression data is a challenging task. Many methods have been developed to address this challenge, ranging from supervised to unsupervised. The most promising methods are based on the support vector machine (SVM). There is a need for a comprehensive analysis of the prediction accuracy of the supervised SVM method using different kernels under different biological experimental conditions and network sizes. We developed a tool (CompareSVM) based on SVM to compare different kernel methods for inference of GRN. Using CompareSVM, we investigated and evaluated in detail different SVM kernel methods on simulated microarray datasets of different sizes. The results obtained from CompareSVM showed that the accuracy of an inference method depends on the nature of the experimental condition and the size of the network. For networks with [...] nodes, the SVM Gaussian kernel outperforms all the other inference methods on knockout, knockdown, and multifactorial datasets. For networks with a large number of nodes (~500), the choice of inference method depends on the nature of the experimental condition. CompareSVM is available at http://bis.zju.edu.cn/CompareSVM/ .
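
To make "different kernels" concrete, here is a small sketch of three standard SVM kernel functions written as plain Python. Only the Gaussian kernel is named in the abstract above; the linear and polynomial kernels, and the parameter names `gamma`, `degree`, and `coef0`, are assumptions borrowed from common SVM library conventions, not taken from CompareSVM itself.

```python
from math import exp
from typing import Sequence


def linear_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """K(x, y) = <x, y>."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(
    x: Sequence[float], y: Sequence[float],
    degree: int = 3, gamma: float = 1.0, coef0: float = 1.0,
) -> float:
    """K(x, y) = (gamma * <x, y> + coef0) ** degree."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def gaussian_kernel(
    x: Sequence[float], y: Sequence[float], gamma: float = 1.0
) -> float:
    """K(x, y) = exp(-gamma * ||x - y||^2), also called the RBF kernel."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * squared_distance)


u, v = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(u, v))
print(polynomial_kernel(u, v, degree=2))
print(gaussian_kernel(u, v, gamma=0.5))
```

A kernel comparison of the kind the abstract describes would train one SVM per kernel on the same data and compare held-out accuracy; that training step is left out here.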

  3. Accurate Fluid Level Measurement in Dynamic Environment Using Ultrasonic Sensor and ν-SVM

    Directory of Open Access Journals (Sweden)

    Jenny TERZIC

    2009-10-01

    A fluid level measurement system based on a single ultrasonic sensor and a Support Vector Machine (SVM) based signal processing and classification system has been developed to determine the fluid level in automotive fuel tanks. The novel approach, based on the ν-SVM classification method, uses the Radial Basis Function (RBF) kernel to compensate for the measurement error induced by sloshing in the tank caused by vehicle motion. A broad investigation of selected pre-processing filters, namely the Moving Mean, Moving Median, and Wavelet filters, is also presented. Field drive trials were performed under normal driving conditions at fuel volumes ranging from 5 L to 50 L to acquire sample data from the ultrasonic sensor for training the SVM model. Further drive trials were conducted to obtain data to verify the SVM results. A comparison of the accuracy of the predicted fluid level obtained using SVM and the pre-processing filters is provided. It is demonstrated that the ν-SVM model using the RBF kernel function and the Moving Median filter produced the most accurate fluid level measurements of the signal filtration methods compared.
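
As an illustration of the Moving Median pre-processing step mentioned above, the sketch below smooths a short run of level readings. The window handling at the edges (shrinking the window) and the sample readings are assumptions made for this example; the record does not describe the authors' filter settings.

```python
from statistics import median
from typing import List, Sequence


def moving_median(samples: Sequence[float], window: int) -> List[float]:
    """Replace each sample with the median of a centred window.

    Near the edges the window is shortened rather than padded.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(median(samples[lo:hi]))
    return smoothed


# Made-up level readings (litres) with two slosh spikes: 45.0 and 12.0.
readings = [30.0, 30.2, 45.0, 30.1, 29.9, 12.0, 30.0]
smoothed = moving_median(readings, window=3)
print(smoothed)
```

With a window of three, an isolated spike in the interior is replaced by one of its neighbours, which is why a median filter suits short slosh transients better than a mean that would smear them into adjacent samples. A spike next to the end of the series is only partly suppressed because the window there is shorter.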

  4. Predication of Crane Condition Parameters Based on SVM and AR

    International Nuclear Information System (INIS)

    Xu Xiuzhong; Hu Xiong; Zhou Congxiao

    2011-01-01

    Feature vectors are obtained through statistical analysis of the vibration signals of the motor on a container crane hoisting mechanism in a port. After data preprocessing, training models of the condition parameters based on the support vector machine (SVM) are established from the training data. The training models can then predict the testing data of the condition monitoring parameters. During model training, the penalty parameter and kernel function of the model are optimized by cross-validation. To analyze the accuracy of the SVM model, an autoregressive (AR) model is also used to predict the vibration trend. The research showed that the results predicted by the SVM model are better than those obtained by AR modeling.

  5. Identification of eggs from different production systems based on hyperspectra and CS-SVM.

    Science.gov (United States)

    Sun, J; Cong, S L; Mao, H P; Zhou, X; Wu, X H; Zhang, X D

    2017-06-01

    1. To identify the origin of table eggs more accurately, a method based on hyperspectral imaging technology was studied. 2. The hyperspectral data of 200 samples of intensive and extensive eggs were collected. Standard normalised variables combined with a Savitzky-Golay filter were used to eliminate noise, then stepwise regression (SWR) was used for feature selection. The grid search algorithm (GS), genetic search algorithm (GA), particle swarm optimisation algorithm (PSO) and cuckoo search algorithm (CS) were each applied to support vector machine (SVM) methods to establish an SVM identification model with the optimal parameters. The full spectrum data and the data after feature selection were the input of the model, while egg category was the output. 3. The SWR-CS-SVM model performed better than the other models, including SWR-GS-SVM, SWR-GA-SVM, SWR-PSO-SVM and others based on full spectral data. The training and test classification accuracies of the SWR-CS-SVM model were 99.3% and 96%, respectively. 4. SWR-CS-SVM proved effective for identifying egg varieties and could also be useful for the non-destructive identification of other types of egg.

  6. Lex-SVM: exploring the potential of exon expression profiling for disease classification.

    Science.gov (United States)

    Yuan, Xiongying; Zhao, Yi; Liu, Changning; Bu, Dongbo

    2011-04-01

    Exon expression profiling technologies, including exon arrays and RNA-Seq, measure the abundance of every exon in a gene. Compared with gene expression profiling technologies like 3' array, exon expression profiling technologies can detect alterations in both transcription and alternative splicing, and are therefore expected to be more sensitive in diagnosis. However, exon expression profiling also brings higher dimension, more redundancy, and significant correlation among features. Ignoring the correlation structure among exons of a gene, a popular classification method like L1-SVM selects exons individually from each gene and thus is vulnerable to noise. To overcome this limitation, we present in this paper a new variant of SVM named Lex-SVM that incorporates the correlation structure among exons and known splicing patterns to improve classification performance. Specifically, we construct a new norm, ex-norm, which includes our prior knowledge of the exon correlation structure, to regularize the coefficients of a linear SVM. Lex-SVM can be solved efficiently using standard linear programming techniques. The advantage of Lex-SVM is that it can select features group-wise, force features in a subgroup to take equal weights, and exclude the features that contradict the majority in the subgroup. Experimental results suggest that on exon expression profiles, Lex-SVM is more accurate than existing methods. Lex-SVM also generates a more compact model and selects genes more consistently in cross-validation. Unlike L1-SVM, which selects only one exon in a gene, Lex-SVM assigns equal weights to as many exons in a gene as possible, making the result easier to interpret.

  7. Lamb Wave Damage Quantification Using GA-Based LS-SVM

    Directory of Open Access Journals (Sweden)

    Fuqiang Sun

    2017-06-01

    Lamb waves have been reported to be an efficient tool for non-destructive evaluations (NDE) for various application scenarios. However, accurate and reliable damage quantification using the Lamb wave method is still a practical challenge, due to the complex underlying mechanism of Lamb wave propagation and damage detection. This paper presents a Lamb wave damage quantification method using a least square support vector machine (LS-SVM) and a genetic algorithm (GA). Three damage sensitive features, namely, normalized amplitude, phase change, and correlation coefficient, were proposed to describe changes of Lamb wave characteristics caused by damage. In view of commonly used data-driven methods, the GA-based LS-SVM model using the proposed three damage sensitive features was implemented to evaluate the crack size. The GA method was adopted to optimize the model parameters. The results of GA-based LS-SVM were validated using coupon test data and lap joint component test data with naturally developed fatigue cracks. Cases of different loading and manufacturer were also included to further verify the robustness of the proposed method for crack quantification.

  8. Lamb Wave Damage Quantification Using GA-Based LS-SVM.

    Science.gov (United States)

    Sun, Fuqiang; Wang, Ning; He, Jingjing; Guan, Xuefei; Yang, Jinsong

    2017-06-12

    Lamb waves have been reported to be an efficient tool for non-destructive evaluations (NDE) for various application scenarios. However, accurate and reliable damage quantification using the Lamb wave method is still a practical challenge, due to the complex underlying mechanism of Lamb wave propagation and damage detection. This paper presents a Lamb wave damage quantification method using a least square support vector machine (LS-SVM) and a genetic algorithm (GA). Three damage sensitive features, namely, normalized amplitude, phase change, and correlation coefficient, were proposed to describe changes of Lamb wave characteristics caused by damage. In view of commonly used data-driven methods, the GA-based LS-SVM model using the proposed three damage sensitive features was implemented to evaluate the crack size. The GA method was adopted to optimize the model parameters. The results of GA-based LS-SVM were validated using coupon test data and lap joint component test data with naturally developed fatigue cracks. Cases of different loading and manufacturer were also included to further verify the robustness of the proposed method for crack quantification.

  9. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics.

    Science.gov (United States)

    Lin, Xiaohui; Li, Chao; Zhang, Yanhui; Su, Benzhe; Fan, Meng; Wei, Hai

    2017-12-26

    Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.
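
The recursive ranking step of plain SVM-RFE can be sketched in a few lines. This is not the SVM-RFE-OA or M-SVM-RFE-OA method from the abstract above: the overlapping-ratio criterion and the sample screening are left out. The linear SVM here is a deliberately tiny full-batch subgradient trainer without a bias term, and the names `train_linear_svm`, `svm_rfe`, `lam`, `lr`, and `epochs`, along with the toy data, are all hypothetical.

```python
from typing import List, Sequence


def train_linear_svm(
    X: Sequence[Sequence[float]], y: Sequence[int],
    lam: float = 0.01, lr: float = 0.1, epochs: int = 200,
) -> List[float]:
    """Minimise lam/2 * ||w||^2 + mean hinge loss by subgradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def svm_rfe(
    X: Sequence[Sequence[float]], y: Sequence[int], names: Sequence[str]
) -> List[str]:
    """Return feature names ranked best-first by recursive elimination."""
    remaining = list(range(len(names)))
    eliminated = []
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        # Drop the feature with the smallest squared weight, then retrain.
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return [names[j] for j in reversed(eliminated)]


# Toy data: f0 separates the classes, f1 and f2 are small noise.
X = [[2.0, 0.1, -0.2], [1.5, -0.1, 0.1], [-2.0, 0.1, 0.2], [-1.5, -0.1, -0.1]]
y = [1, 1, -1, -1]
ranking = svm_rfe(X, y, ["f0", "f1", "f2"])
print(ranking)
```

The ranking alone does not say how many features to keep; that is the question the abstract's accuracy-plus-overlapping-ratio criterion is meant to answer.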

  10. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Xiaohui Lin

    2017-12-01

    Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.

  11. lncRScan-SVM: A Tool for Predicting Long Non-Coding RNAs Using Support Vector Machine.

    Science.gov (United States)

    Sun, Lei; Liu, Hui; Zhang, Lin; Meng, Jia

    2015-01-01

    Functional long non-coding RNAs (lncRNAs) have been bringing novel insight into biological study; however, it is still not trivial to accurately distinguish lncRNA transcripts (LNCTs) from protein-coding ones (PCTs). As previous studies have preserved a variety of information and data about lncRNAs, it is appealing to develop novel methods that identify lncRNAs more accurately. Our method, lncRScan-SVM, aims to classify PCTs and LNCTs using a support vector machine (SVM). The gold-standard datasets for lncRScan-SVM model training, lncRNA prediction and method comparison were constructed according to the GENCODE gene annotations of human and mouse, respectively. By integrating features derived from gene structure, transcript sequence, potential codon sequence and conservation, lncRScan-SVM outperforms other approaches, as evaluated by several criteria such as sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC) and area under curve (AUC). In addition, several known human lncRNA datasets were assessed using lncRScan-SVM. LncRScan-SVM is an efficient tool for predicting lncRNAs, and it is quite useful for current lncRNA study.
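
Four of the evaluation criteria listed above follow directly from the confusion-matrix counts. The sketch below uses their standard definitions; the counts are made up for illustration and are not results from the paper. AUC is omitted because it needs ranked classifier scores rather than counts.

```python
from math import sqrt
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy and MCC from confusion counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 105 true lncRNA transcripts, 95 protein-coding ones.
metrics = binary_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MCC is often preferred to accuracy when the two classes differ in size, because it uses all four cells of the confusion matrix.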

  12. SVM Classifier - a comprehensive java interface for support vector machine classification of microarray data.

    Science.gov (United States)

    Pirooznia, Mehdi; Deng, Youping

    2006-12-12

    Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1-BRCA2 samples with RBF kernel of SVM. We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at http://mfgn.usm.edu/ebl/svm/.

  13. Generalized SMO algorithm for SVM-based multitask learning.

    Science.gov (United States)

    Cai, Feng; Cherkassky, Vladimir

    2012-06-01

    Exploiting additional information to improve traditional inductive learning is an active research area in machine learning. In many supervised-learning applications, training data can be naturally separated into several groups, and incorporating this group information into learning may improve generalization. Recently, Vapnik proposed a general approach to formalizing such problems, known as "learning with structured data" and its support vector machine (SVM) based optimization formulation called SVM+. Liang and Cherkassky showed the connection between SVM+ and multitask learning (MTL) approaches in machine learning, and proposed an SVM-based formulation for MTL called SVM+MTL for classification. Training the SVM+MTL classifier requires the solution of a large quadratic programming optimization problem which scales as O(n^3) with sample size n. So there is a need to develop computationally efficient algorithms for implementing SVM+MTL. This brief generalizes Platt's sequential minimal optimization (SMO) algorithm to the SVM+MTL setting. Empirical results show that, for typical SVM+MTL problems, the proposed generalized SMO achieves over 100 times speed-up, in comparison with general-purpose optimization routines.

  14. SVM and SVM Ensembles in Breast Cancer Prediction.

    Science.gov (United States)

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.

  15. SVM and SVM Ensembles in Breast Cancer Prediction.

    Directory of Open Access Journals (Sweden)

    Min-Wei Huang

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.

  16. [Hyperspectral remote sensing image classification based on SVM optimized by clonal selection].

    Science.gov (United States)

    Liu, Qing-Jie; Jing, Lin-Hai; Wang, Meng-Fei; Lin, Qi-Zhong

    2013-03-01

    Model selection for the support vector machine (SVM), which involves choosing the kernel and margin parameter values, is usually time-consuming and greatly affects both the training efficiency of the SVM model and the final classification accuracy of an SVM hyperspectral remote sensing image classifier. First, based on combinatorial optimization theory and the cross-validation method, an artificial immune clonal selection algorithm is introduced to optimally select the SVM kernel parameter a and margin parameter C (CSSVM), improving the training efficiency of the SVM model. Then an experiment classifying AVIRIS imagery of the Indian Pines site in the USA was performed to test the novel CSSVM, with a traditional SVM classifier using the general grid-search cross-validation method (GSSVM) for comparison. Evaluation indexes including SVM model training time, overall classification accuracy (OA), and Kappa index were analyzed quantitatively for both CSSVM and GSSVM. The OA of CSSVM on the test samples and on the whole image is 85.1% and 81.58%, respectively, both within 0.08% of the GSSVM values; the Kappa indexes reach 0.8213 and 0.7728, both within 0.001 of the GSSVM values; and the ratio of CSSVM to GSSVM model training time is between 1/6 and 1/10. CSSVM is therefore a fast and accurate algorithm for hyperspectral image classification and is superior to GSSVM.
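
The grid-search baseline (GSSVM) in the abstract above amounts to scoring every pair of margin parameter C and kernel parameter a and keeping the best. The sketch below shows only that search loop. The scoring function `toy_cv_score` is a hypothetical stand-in for "train an SVM with these parameters and return its mean cross-validation accuracy", and the grid values are assumptions chosen for illustration.

```python
from itertools import product
from math import log2
from typing import Callable, Sequence, Tuple


def grid_search(
    c_values: Sequence[float],
    a_values: Sequence[float],
    cv_score: Callable[[float, float], float],
) -> Tuple[float, float, float]:
    """Evaluate every (C, a) pair and return (best_score, best_C, best_a)."""
    best = None
    for c, a in product(c_values, a_values):
        score = cv_score(c, a)
        if best is None or score > best[0]:
            best = (score, c, a)
    return best


def toy_cv_score(c: float, a: float) -> float:
    # Hypothetical stand-in: a smooth surface peaking at C = 16, a = 0.25.
    # A real run would train an SVM and cross-validate it here.
    return 1.0 - 0.01 * abs(log2(c) - 4) - 0.01 * abs(log2(a) + 2)


c_grid = [2.0 ** e for e in range(-2, 9, 2)]   # 0.25 ... 256
a_grid = [2.0 ** e for e in range(-6, 3, 2)]   # 2^-6 ... 4
best_score, best_c, best_a = grid_search(c_grid, a_grid, toy_cv_score)
print(len(c_grid) * len(a_grid), "evaluations; best:", best_c, best_a)
```

The cost is the product of the two grid sizes times the cost of one cross-validated training run, which is the expense a guided search such as clonal selection tries to reduce by evaluating fewer candidate pairs.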

  17. SVM Classifier – a comprehensive java interface for support vector machine classification of microarray data

    Science.gov (United States)

    Pirooznia, Mehdi; Deng, Youping

    2006-01-01

    Motivation: Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. Results: The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1–BRCA2 samples with RBF kernel of SVM. Conclusion: We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at http://mfgn.usm.edu/ebl/svm/. PMID:17217518

  18. Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods.

    Science.gov (United States)

    Tuo, Youlin; An, Ning; Zhang, Ming

    2018-03-01

    The aim of the present study was to investigate the feature genes in metastatic breast cancer samples. A total of 5 expression profiles of metastatic breast cancer samples were downloaded from the Gene Expression Omnibus database, which were then analyzed using the MetaQC and MetaDE packages in R language. The feature genes between metastasis and non‑metastasis samples were screened under the threshold of P [...] (SVM) classifier training and verification. The accuracy of the SVM classifier was then evaluated using another independent dataset from The Cancer Genome Atlas database. Finally, function and pathway enrichment analyses for genes in the SVM classifier were performed. A total of 541 feature genes were identified between metastatic and non‑metastatic samples. The top 10 genes with the highest betweenness centrality values in the PPI network of feature genes were Nuclear RNA Export Factor 1, cyclin‑dependent kinase 2 (CDK2), myelocytomatosis proto‑oncogene protein (MYC), Cullin 5, SHC Adaptor Protein 1, Clathrin heavy chain, Nucleolin, WD repeat domain 1, proteasome 26S subunit non‑ATPase 2 and telomeric repeat binding factor 2. The cyclin‑dependent kinase inhibitor 1A (CDKN1A), E2F transcription factor 1 (E2F1), and MYC interacted with CDK2. The SVM classifier constructed by the top 30 feature genes was able to distinguish metastatic samples from non‑metastatic samples [correct rate, specificity, positive predictive value and negative predictive value >0.89; sensitivity >0.84; area under the receiver operating characteristic curve (AUROC) >0.96]. The verification of the SVM classifier in an independent dataset (35 metastatic samples and 143 non‑metastatic samples) revealed an accuracy of 94.38% and AUROC of 0.958. Cell cycle associated functions and pathways were the most significant terms of the 30 feature genes. A SVM classifier was constructed to assess the possibility of breast cancer metastasis, which presented high accuracy in several

  19. Uniform design based SVM model selection for face recognition

    Science.gov (United States)

    Li, Weihong; Liu, Lijuan; Gong, Weiguo

    2010-02-01

    The support vector machine (SVM) has proved to be a powerful tool for face recognition. The generalization capacity of an SVM depends on a model with optimal hyperparameters, and the computational cost of SVM model selection makes it difficult to apply in face recognition. To overcome this shortcoming, we use uniform design, which draws on space-filling designs and uniformly scattering theory, to search for optimal SVM hyperparameters. We then propose a face recognition scheme based on an SVM whose optimal model is obtained by replacing the grid and gradient-based methods with uniform design. Experimental results on the Yale and PIE face databases show that the proposed method significantly improves the efficiency of SVM model selection.

  20. Optimised Selection of Stroke Biomarker Based on SVM and Information Theory

    Directory of Open Access Journals (Sweden)

    Wang Xiang

    2017-01-01

    With the development of molecular biology and gene-engineering technology, gene diagnosis has become an emerging approach in modern life sciences. Biological markers, a hot topic in the molecular and gene fields, are valuable for early diagnosis, malignant tumor staging, treatment, and evaluation of therapeutic efficacy. So far, researchers have not found an effective way to predict and distinguish different types of stroke. In this paper, we aim to optimize stroke biomarker selection and identify an effective stroke detection index based on SVM (support vector machine) and information theory. Biomarkers are selected through mutual information analysis and principal component analysis, and SVM is then used to verify the model. Using patient test data provided by Xuanwu Hospital, we explore the significant markers of stroke through data analysis. Our model predicts stroke well. We then discuss the effect of each biomarker on the incidence of stroke.

  1. GI-SVM: A sensitive method for predicting genomic islands based on unannotated sequence of a single genome.

    Science.gov (United States)

    Lu, Bingxin; Leong, Hon Wai

    2016-02-01

    Genomic islands (GIs) are clusters of functionally related genes acquired by lateral genetic transfer (LGT), and they are present in many bacterial genomes. GIs are extremely important for bacterial research, because they not only promote genome evolution but also contain genes that enhance adaption and enable antibiotic resistance. Many methods have been proposed to predict GI. But most of them rely on either annotations or comparisons with other closely related genomes. Hence these methods cannot be easily applied to new genomes. As the number of newly sequenced bacterial genomes rapidly increases, there is a need for methods to detect GI based solely on sequences of a single genome. In this paper, we propose a novel method, GI-SVM, to predict GIs given only the unannotated genome sequence. GI-SVM is based on one-class support vector machine (SVM), utilizing composition bias in terms of k-mer content. From our evaluations on three real genomes, GI-SVM can achieve higher recall compared with current methods, without much loss of precision. Besides, GI-SVM allows flexible parameter tuning to get optimal results for each genome. In short, GI-SVM provides a more sensitive method for researchers interested in a first-pass detection of GI in newly sequenced genomes.
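
"Composition bias in terms of k-mer content" can be made concrete by turning a stretch of sequence into a vector of k-mer frequencies, one entry per possible k-mer. The sketch below shows that step only. How GI-SVM segments the genome and feeds the vectors to the one-class SVM is not described in this record, so the function name `kmer_frequencies` and the short example sequence are assumptions for illustration.

```python
from collections import Counter
from itertools import product
from typing import List


def kmer_frequencies(sequence: str, k: int) -> List[float]:
    """Relative frequency of every DNA k-mer, in lexicographic ACGT order.

    k-mers containing characters other than A, C, G, T are ignored.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


vector = kmer_frequencies("ACGTACGT", k=2)
print(len(vector), "dimensions")
print([round(v, 3) for v in vector])
```

A region whose vector differs markedly from those of the rest of the genome is a candidate for laterally acquired DNA, which is the kind of outlier a one-class model is suited to flag.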

  2. Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology.

    Science.gov (United States)

    Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2014-09-07

    Due to the central roles of lipid binding proteins (LBPs) in many biological processes, sequence based identification of LBPs is of great interest. The major challenge is that LBPs are diverse in sequence, structure, and function, which results in low accuracy of sequence homology based methods. Therefore, there is a need to develop alternative functional prediction methods that do not depend on sequence similarity. To distinguish LBPs from non-LBPs, the performances of support vector machine (SVM) and neural network classifiers were compared in this study. Comprehensive protein features and various techniques were employed to create datasets. Five-fold cross-validation (CV) and independent evaluation (IE) tests were used to assess the validity of the two methods. The results indicated that SVM outperforms the neural network. SVM achieved 89.28% (CV) and 89.55% (IE) overall accuracy in distinguishing LBPs from non-LBPs, and 92.06% (CV) and 92.90% (IE) on average for classification of different LBP classes. Increasing the number and range of extracted protein features, as well as optimizing the SVM parameters, significantly increased the efficiency of LBP class prediction in comparison to the only previous report in this field. Altogether, the results showed that the SVM algorithm can be run on broad, computationally calculated protein features and offers a promising tool for detecting LBP classes. The proposed approach has the potential to integrate with and improve the common sequence alignment based methods.

  3. PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine.

    Science.gov (United States)

    Manavalan, Balachandran; Shin, Tae H; Lee, Gwang

    2018-01-01

    Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs) prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM)-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html.

  4. PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Balachandran Manavalan

    2018-03-01

    Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs) prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM)-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html.

  5. Density-based penalty parameter optimization on C-SVM.

    Science.gov (United States)

    Liu, Yun; Lian, Jie; Bartolacci, Michael R; Zeng, Qing-An

    2014-01-01

    The support vector machine (SVM) is one of the most widely used approaches for data classification and regression. SVM maximizes the distance between the positive and negative support vectors and neglects the remote instances far from the SVM interface. In order to avoid a position change of the SVM interface as the result of an erroneous outlier, C-SVM was implemented to decrease the influence of the system's outliers. Traditional C-SVM holds a uniform parameter C for both positive and negative instances; however, according to the different number proportions and the data distribution, positive and negative instances should be set with different weights for the penalty parameter of the error terms. Therefore, in this paper, we propose density-based penalty parameter optimization of C-SVM. The experimental results indicated that our proposed algorithm has outstanding performance with respect to both precision and recall.
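
    For orientation, the standard soft-margin formulation with separate penalties for the two classes can be written as follows. This is the generic class-weighted C-SVM, given here as a sketch; the symbols $C_{+}$ and $C_{-}$ are notation assumed for illustration, and the density-based rule the authors use to choose them is not described in the abstract.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2}
  + C_{+}\sum_{i:\,y_i=+1}\xi_i
  + C_{-}\sum_{i:\,y_i=-1}\xi_i \\
\text{s.t.}\quad & y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ \ge\ 1-\xi_i,
  \qquad \xi_i\ \ge\ 0,\quad i=1,\dots,n .
\end{aligned}
```

    Setting $C_{+}=C_{-}=C$ recovers the traditional C-SVM with a uniform penalty.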

  6. Comparing SVM and ANN based Machine Learning Methods for Species Identification of Food Contaminating Beetles.

    Science.gov (United States)

    Bisgin, Halil; Bera, Tanmay; Ding, Hongjian; Semey, Howard G; Wu, Leihong; Liu, Zhichao; Barnes, Amy E; Langley, Darryl A; Pava-Ripoll, Monica; Vyas, Himansu J; Tong, Weida; Xu, Joshua

    2018-04-25

    Insect pests, such as pantry beetles, are often associated with food contamination and public health risks. Machine learning has the potential to provide a more accurate and efficient solution in detecting their presence in food products, which is currently done manually. In our previous research, we demonstrated such feasibility where Artificial Neural Network (ANN) based pattern recognition techniques could be implemented for species identification in the context of food safety. In this study, we present a Support Vector Machine (SVM) model which improved the average accuracy up to 85%. In contrast, the ANN method yielded ~80% accuracy after extensive parameter optimization. Both methods showed excellent genus level identification, but SVM showed slightly better accuracy for most species. Highly accurate species level identification remains a challenge, especially in distinguishing between species from the same genus, which may require improvements in both imaging and machine learning techniques. In summary, our work illustrates a new SVM based technique and provides a good comparison with the ANN model in our context. We believe such insights will pave a better way forward for the application of machine learning towards species identification and food safety.

  7. Effective Sequential Classifier Training for SVM-Based Multitemporal Remote Sensing Image Classification

    Science.gov (United States)

    Guo, Yiqing; Jia, Xiuping; Paull, David

    2018-06-01

    The explosive availability of remote sensing images has challenged supervised classification algorithms such as Support Vector Machines (SVM), as training samples tend to be highly limited due to the expensive and laborious task of ground truthing. The temporal correlation and spectral similarity between multitemporal images have opened up an opportunity to alleviate this problem. In this study, an SVM-based Sequential Classifier Training (SCT-SVM) approach is proposed for multitemporal remote sensing image classification. The approach leverages the classifiers of previous images to reduce the required number of training samples for the classifier training of an incoming image. For each incoming image, a rough classifier is first predicted based on the temporal trend of a set of previous classifiers. The predicted classifier is then fine-tuned into a more accurate position with current training samples. This approach can be applied progressively to sequential image data, with only a small number of training samples being required from each image. Experiments were conducted with Sentinel-2A multitemporal data over an agricultural area in Australia. Results showed that the proposed SCT-SVM achieved better classification accuracies compared with two state-of-the-art model transfer algorithms. When training data are insufficient, the overall classification accuracy of the incoming image was improved from 76.18% to 94.02% with the proposed SCT-SVM, compared with those obtained without the assistance from previous images. These results demonstrate that leveraging a priori information from previous images can provide advantageous assistance for later images in multitemporal image classification.

  8. The generalization ability of online SVM classification based on Markov sampling.

    Science.gov (United States)

    Xu, Jie; Yan Tang, Yuan; Zou, Bin; Xu, Zongben; Li, Luoqing; Lu, Yang

    2015-03-01

    In this paper, we consider online support vector machine (SVM) classification learning algorithms with uniformly ergodic Markov chain (u.e.M.c.) samples. We establish the bound on the misclassification error of an online SVM classification algorithm with u.e.M.c. samples based on reproducing kernel Hilbert spaces and obtain a satisfactory convergence rate. We also introduce a novel online SVM classification algorithm based on Markov sampling, and present the numerical studies on the learning ability of online SVM classification based on Markov sampling for benchmark repository. The numerical studies show that the learning performance of the online SVM classification algorithm based on Markov sampling is better than that of classical online SVM classification based on random sampling as the size of training samples is larger.

  9. SVM-PB-Pred: SVM based protein block prediction method using sequence profiles and secondary structures.

    Science.gov (United States)

    Suresh, V; Parthasarathy, S

    2014-01-01

    We developed a support vector machine based web server called SVM-PB-Pred, to predict the Protein Block for any given amino acid sequence. The input features of SVM-PB-Pred include i) sequence profiles (PSSM) and ii) actual secondary structures (SS) from DSSP method or predicted secondary structures from NPS@ and GOR4 methods. There were three combined input features PSSM+SS(DSSP), PSSM+SS(NPS@) and PSSM+SS(GOR4) used to test and train the SVM models. Similarly, four datasets RS90, DB433, LI1264 and SP1577 were used to develop the SVM models. These four SVM models developed were tested using three different benchmarking tests namely; (i) self consistency, (ii) seven fold cross validation test and (iii) independent case test. The maximum possible prediction accuracy of ~70% was observed in self consistency test for the SVM models of both LI1264 and SP1577 datasets, where PSSM+SS(DSSP) input features was used to test. The prediction accuracies were reduced to ~53% for PSSM+SS(NPS@) and ~43% for PSSM+SS(GOR4) in independent case test, for the SVM models of above two same datasets. Using our method, it is possible to predict the protein block letters for any query protein sequence with ~53% accuracy, when the SP1577 dataset and predicted secondary structure from NPS@ server were used. The SVM-PB-Pred server can be freely accessed through http://bioinfo.bdu.ac.in/~svmpbpred.

  10. A Roller Bearing Fault Diagnosis Method Based on LCD Energy Entropy and ACROA-SVM

    Directory of Open Access Journals (Sweden)

    HungLinh Ao

    2014-01-01

    This study investigates a novel method for roller bearing fault diagnosis based on local characteristic-scale decomposition (LCD) energy entropy, together with a support vector machine designed using an Artificial Chemical Reaction Optimisation Algorithm, referred to as an ACROA-SVM. First, the original acceleration vibration signals are decomposed into intrinsic scale components (ISCs). Second, the concept of LCD energy entropy is introduced. Third, the energy features extracted from a number of ISCs that contain the most dominant fault information serve as input vectors for the support vector machine classifier. Finally, the ACROA-SVM classifier is proposed to recognize the faulty roller bearing pattern. The analysis of roller bearing signals with inner-race and outer-race faults shows that the diagnostic approach based on the ACROA-SVM and using LCD to extract the energy levels of the various frequency bands as features can identify roller bearing fault patterns accurately and effectively. The proposed method is superior to approaches based on the Empirical Mode Decomposition method and requires less time.

  11. SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.

    Science.gov (United States)

    Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru

    2014-01-01

    Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases.

  12. Short Term Prediction of Freeway Exiting Volume Based on SVM and KNN

    Directory of Open Access Journals (Sweden)

    Xiang Wang

    2015-09-01

    The model results indicate that the proposed algorithm is feasible and accurate. The Mean Absolute Percentage Error is under 10%. When comparing with the results of single KNN or SVM method, the results show that the combination of KNN and SVM can improve the reliability of the prediction significantly. The proposed method can be implemented in the on-line application of exiting volume prediction, which is able to consider different vehicle types.
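
    The Mean Absolute Percentage Error quoted above is a standard metric; a minimal sketch of how it is usually computed follows. The function name and the sample numbers in the usage note are illustrative assumptions, not data from the paper.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Raises ZeroDivisionError if any actual value is zero, because the
    relative error is undefined there.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    relative_errors = (abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * sum(relative_errors) / len(actual)
```

    For example, observed volumes of 100 and 200 vehicles with predictions of 110 and 180 each miss by 10%, so the MAPE is 10%.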

  13. [Non-destructive detection research for hollow heart of potato based on semi-transmission hyperspectral imaging and SVM].

    Science.gov (United States)

    Huang, Tao; Li, Xiao-yu; Xu, Meng-ling; Jin, Rui; Ku, Jing; Xu, Sen-miao; Wu, Zhen-zhong

    2015-01-01

    The quality of potatoes is directly related to their edible and industrial value. Hollow heart, a physiological disorder that occurs inside the tuber, is difficult to detect. This paper puts forward a non-destructive method that uses semi-transmission hyperspectral imaging with a support vector machine (SVM) to detect hollow heart of potato. Compared with reflection and transmission hyperspectral imaging, semi-transmission hyperspectral imaging yields clearer images that contain the internal quality information of agricultural products. In this study, 224 potato samples (149 normal samples and 75 hollow samples) were selected as the research object, and a semi-transmission hyperspectral image acquisition system was constructed to acquire hyperspectral images (390-1040 nm) of the potato samples; the average spectrum of the region of interest was then extracted for spectral characteristics analysis. Normalization was used to preprocess the original spectra, and a prediction model was developed based on SVM using all wave bands; the recognition accuracy on the test set was only 87.5%. In order to simplify the model, the competitive adaptive reweighted sampling algorithm (CARS) and the successive projection algorithm (SPA) were utilized to select important variables from all 520 spectral variables, and 8 variables were selected (454, 601, 639, 664, 748, 827, 874 and 936 nm). A test-set recognition accuracy of 94.64% was obtained by using these 8 variables to develop the SVM model. Parameter optimization algorithms, including the artificial fish swarm algorithm (AFSA), genetic algorithm (GA) and grid search algorithm, were used to optimize the SVM model parameters: penalty parameter c and kernel parameter g. After comparative analysis, AFSA, a new bionic optimization algorithm based on the foraging behavior of fish swarms, was shown to obtain the optimal model parameters (c=10.6591, g=0.3497), and the recognition accuracy of 10% were obtained for the AFSA-SVM

  14. A Statistical Parameter Analysis and SVM Based Fault Diagnosis Strategy for Dynamically Tuned Gyroscopes

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Gyroscope fault diagnosis plays a critical role in achieving higher reliability and precision in inertial navigation systems. A new fault diagnosis strategy based on statistical parameter analysis (SPA) and a support vector machine (SVM) classification model was proposed for dynamically tuned gyroscopes (DTG). The SPA, a time domain analysis approach, was introduced to compute a set of statistical parameters of the vibration signal as the state features of the DTG. The SVM model, a learning machine based on statistical learning theory (SLT), was then trained on these features to identify the working state of the DTG. The experimental results verify that the proposed diagnostic strategy can simply and effectively extract the state features of the DTG, and that it outperforms the radial-basis function (RBF) neural network based diagnostic method and can more reliably and accurately diagnose the working state of the DTG.

  15. Predicting enhancer activity and variant impact using gkm-SVM.

    Science.gov (United States)

    Beer, Michael A

    2017-09-01

    We participated in the Critical Assessment of Genome Interpretation eQTL challenge to further test computational models of regulatory variant impact and their association with human disease. Our prediction model is based on a discriminative gapped-kmer SVM (gkm-SVM) trained on genome-wide chromatin accessibility data in the cell type of interest. The comparisons with massively parallel reporter assays (MPRA) in lymphoblasts show that gkm-SVM is among the most accurate prediction models even though all other models used the MPRA data for model training, and gkm-SVM did not. In addition, we compare gkm-SVM with other MPRA datasets and show that gkm-SVM is a reliable predictor of expression and that deltaSVM is a reliable predictor of variant impact in K562 cells and mouse retina. We further show that DHS (DNase-I hypersensitive sites) and ATAC-seq (assay for transposase-accessible chromatin using sequencing) data are equally predictive substrates for training gkm-SVM, and that DHS regions flanked by H3K27Ac and H3K4me1 marks are more predictive than DHS regions alone. © 2017 Wiley Periodicals, Inc.

  16. A Mass Spectrometric Analysis Method Based on PPCA and SVM for Early Detection of Ovarian Cancer.

    Science.gov (United States)

    Wu, Jiang; Ji, Yanju; Zhao, Ling; Ji, Mengying; Ye, Zhuang; Li, Suyi

    2016-01-01

    Background. Surface-enhanced laser desorption-ionization-time of flight mass spectrometry (SELDI-TOF-MS) technology plays an important role in the early diagnosis of ovarian cancer. However, the raw MS data is highly dimensional and redundant. Therefore, it is necessary to study rapid and accurate detection methods for the massive MS data. Methods. The clinical data set used in the experiments for early cancer detection consisted of 216 SELDI-TOF-MS samples. An MS analysis method based on probabilistic principal components analysis (PPCA) and support vector machine (SVM) was proposed and applied to early ovarian cancer classification in the data set. Additionally, using the same data set, we also established a traditional PCA-SVM model. Finally, we compared the two models in detection accuracy, specificity, and sensitivity. Results. Using independent training and testing experiments 10 times to evaluate the ovarian cancer detection models, the average prediction accuracy, sensitivity, and specificity of the PCA-SVM model were 83.34%, 82.70%, and 83.88%, respectively. In contrast, those of the PPCA-SVM model were 90.80%, 92.98%, and 88.97%, respectively. Conclusions. The PPCA-SVM model had better detection performance, and the model combined with the SELDI-TOF-MS technology shows promise for early clinical detection and diagnosis of ovarian cancer.
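
    The three figures reported for each model follow from the confusion-matrix counts of a binary classifier. A minimal sketch of the standard definitions is given below; the function name is hypothetical and the counts in the usage note are invented for illustration, not taken from the study.

```python
def classification_rates(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts.

    tp/fn count the cancer samples that were detected/missed;
    tn/fp count the control samples that were cleared/falsely flagged.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity
```

    With 90 detected and 10 missed positives, and 80 cleared and 20 falsely flagged negatives, this gives an accuracy of 0.85, a sensitivity of 0.90 and a specificity of 0.80.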

  17. Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data

    Directory of Open Access Journals (Sweden)

    Harris Lyndsay N

    2006-04-01

    Background: Like microarray-based investigations, high-throughput proteomics techniques require machine learning algorithms to identify biomarkers that are informative for biological classification problems. Feature selection and classification algorithms need to be robust to noise and outliers in the data. Results: We developed a recursive support vector machine (R-SVM) algorithm to select important genes/biomarkers for the classification of noisy data. We compared its performance to a similar, state-of-the-art method (SVM recursive feature elimination, or SVM-RFE), paying special attention to the ability to recover the true informative genes/biomarkers and the robustness to outliers in the data. Simulation experiments show that a 5% to 20% improvement over SVM-RFE can be achieved with regard to these properties. The SVM-based methods are also compared with a conventional univariate method and their respective strengths and weaknesses are discussed. R-SVM was applied to two sets of SELDI-TOF-MS proteomics data, one from a human breast cancer study and the other from a study on rat liver cirrhosis. Important biomarkers found by the algorithm were validated by follow-up biological experiments. Conclusion: The proposed R-SVM method is suitable for analyzing noisy high-throughput proteomics and microarray data, and it outperforms SVM-RFE in robustness to noise and in the ability to recover informative features. The multivariate SVM-based method outperforms the univariate method in classification performance, but univariate methods can reveal more of the differentially expressed features, especially when there are correlations between the features.

  18. SVM-Based Spectral Analysis for Heart Rate from Multi-Channel WPPG Sensor Signals.

    Science.gov (United States)

    Xiong, Jiping; Cai, Lisang; Wang, Fei; He, Xiaowei

    2017-03-03

    Although wrist-type photoplethysmographic (hereafter referred to as WPPG) sensor signals can measure heart rate quite conveniently, the subjects' hand movements can cause strong motion artifacts, which heavily contaminate the WPPG signals. Hence, it is challenging to accurately estimate heart rate from WPPG signals during intense physical activities. The WPPG method has attracted more attention thanks to the popularity of wrist-worn wearable devices. In this paper, a mixed approach called Mix-SVM is proposed; it uses multi-channel WPPG sensor signals and simultaneous acceleration signals to measure heart rate. Firstly, we combine principal component analysis and an adaptive filter to remove a part of the motion artifacts. Due to the strong correlation between motion artifacts and acceleration signals, the further denoising problem is regarded as a sparse signal reconstruction problem. Then, we use a spectrum subtraction method to eliminate motion artifacts effectively. Finally, the spectral peak corresponding to heart rate is sought by an SVM-based spectral analysis method. On the public PPG database from the 2015 IEEE Signal Processing Cup, the average absolute error was 1.01 beats per minute, and the Pearson correlation was 0.9972. These results also confirm that the proposed Mix-SVM approach has potential for multi-channel WPPG-based heart rate estimation in the presence of intense physical exercise.
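
    The two evaluation figures in this abstract, average absolute error and Pearson correlation, compare estimated heart rates against reference values. A minimal sketch of both follows, assuming paired lists of per-window estimates and ground-truth rates; the function names and the numbers in the usage note are illustrative only.

```python
from math import sqrt


def mean_absolute_error(truth, estimate):
    """Average absolute difference, in the unit of the inputs (e.g. BPM)."""
    return sum(abs(t - e) for t, e in zip(truth, estimate)) / len(truth)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)
```

    For reference rates of 60 and 70 BPM and estimates of 61 and 68 BPM, the average absolute error is 1.5 BPM. The correlation is undefined (division by zero) when either series is constant.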

  19. Classification of cardiovascular tissues using LBP based descriptors and a cascade SVM.

    Science.gov (United States)

    Mazo, Claudia; Alegre, Enrique; Trujillo, Maria

    2017-08-01

    Histological images have characteristics, such as texture, shape, colour and spatial structure, that permit the differentiation of each fundamental tissue and organ. Texture is one of the most discriminative features. The automatic classification of tissues and organs based on histology images is an open problem, due to the lack of automatic solutions when treating tissues without pathologies. In this paper, we demonstrate that it is possible to automatically classify cardiovascular tissues using texture information and Support Vector Machines (SVM). Additionally, we found that it is feasible to recognise several cardiovascular organs following the same process. The texture of histological images was described using Local Binary Patterns (LBP), LBP Rotation Invariant (LBPri), Haralick features and different concatenations between them, representing in this way its content. Using an SVM with a linear kernel, we selected the most appropriate descriptor, which, for this problem, was a concatenation of LBP and LBPri. Due to the small number of images available, we could not follow an approach based on deep learning, but we selected the classifier that yielded the highest performance by comparing SVM with Random Forest and Linear Discriminant Analysis. Once SVM was selected as the classifier with the highest area under the curve, which represents both higher recall and precision, we tuned it by evaluating different kernels, finding that a linear SVM allowed us to accurately separate four classes of tissues: (i) cardiac muscle of the heart, (ii) smooth muscle of the muscular artery, (iii) loose connective tissue, and (iv) smooth muscle of the large vein and the elastic artery. The experimental validation was conducted using 3000 blocks of 100 × 100 pixels, with 600 blocks per class, and the classification was assessed using 10-fold cross-validation. Using LBP as the descriptor, concatenated with LBPri and an SVM with linear kernel, the main four classes of tissues were
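
    The LBP and rotation-invariant LBP descriptors named in this abstract can be sketched with the basic 3 × 3 operator. This is a simplified illustration, assuming a grayscale image stored as a list of rows; the neighbourhood radius and sampling used by the authors are not stated in the abstract, and the function names are hypothetical.

```python
def lbp_code(image, row, col):
    """8-bit LBP code of an interior pixel; neighbours >= centre give a 1 bit."""
    centre = image[row][col]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= centre:
            code |= 1 << bit
    return code


def rotation_invariant(code, bits=8):
    """Smallest value among all circular bit rotations of the code."""
    mask = (1 << bits) - 1
    return min(((code >> s) | (code << (bits - s))) & mask
               for s in range(bits))


def lbp_histogram(image, invariant=False):
    """256-bin histogram of LBP (or rotation-invariant LBP) codes."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            code = lbp_code(image, r, c)
            if invariant:
                code = rotation_invariant(code)
            hist[code] += 1
    return hist
```

    A concatenated descriptor in the spirit of the abstract would be `lbp_histogram(block) + lbp_histogram(block, invariant=True)`, computed per image block and passed to a linear classifier.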

  20. A SVM based AI design for interactive gaming

    OpenAIRE

    Jiang, Yang; Jiang, Jianmin; Palmer, Ian

    2008-01-01

    Interactive gaming requires automatic processing on large volume of random data produced by players on spot, such as shooting, football kicking, boxing etc. In this paper, we describe an artificial intelligence approach in processing such random data for interactive gaming by using a one-class support vector machine (OC-SVM). In comparison with existing techniques, our OC-SVM based interactive gaming design has the features of: (i): high speed processing, providing instant response to the pla...

  1. Quality-Oriented Classification of Aircraft Material Based on SVM

    Directory of Open Access Journals (Sweden)

    Hongxia Cai

    2014-01-01

    Existing material classification schemes are intended to improve inventory management. However, different materials have different quality-related attributes, especially in the aircraft industry. In order to reduce cost without sacrificing quality, we propose a quality-oriented material classification system that considers the material quality characteristics, quality cost, and quality influence. The Analytic Hierarchy Process supports feature selection and the classification decision. We use the improved Kraljic Portfolio Matrix to establish the three-dimensional classification model. The aircraft materials can be divided into eight types, including general type, key type, risk type, and leveraged type. To improve the classification accuracy for various materials, the Support Vector Machine algorithm is introduced. Finally, we compare the SVM and the BP neural network in the application. The results show that the SVM algorithm is more efficient and accurate and that the quality-oriented material classification is valuable.

  2. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations.

    Science.gov (United States)

    Zhang, Yi; Ren, Jinchang; Jiang, Jianmin

    2015-01-01

    Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  3. Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations

    Directory of Open Access Journals (Sweden)

    Yi Zhang

    2015-01-01

    Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization based nonparametric method in this context. Recently, it is found that SVM in some cases is equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitate soft decision making. In total four groups of data are used for evaluations, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized in terms of Gaussian/non-Gaussian distributed and balanced/unbalanced samples which are then further used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Interesting results are reported to indicate how the combined classifier may work under various conditions.

  4. A Multi-Classification Method of Improved SVM-based Information Fusion for Traffic Parameters Forecasting

    Directory of Open Access Journals (Sweden)

    Hongzhuan Zhao

    2016-04-01

    With the enrichment of perception methods, the modern transportation system has many physical objects whose states are influenced by many information factors, so it is a typical Cyber-Physical System (CPS). Thus, traffic information is generally multi-sourced, heterogeneous and hierarchical. Existing research results show that accurately classifying multi-sourced traffic information during information fusion can achieve better parameter forecasting performance. To solve the problem of accurate traffic information classification, a multi-classification method using an improved SVM in information fusion for traffic parameter forecasting is proposed; it analyses the characteristics of the multi-sourced traffic information and uses a redefined binary tree to overcome the shortcomings of the original Support Vector Machine (SVM) classification in information fusion. An experiment was conducted to examine the performance of the proposed scheme, and the results reveal that the method can produce more accurate and practical outcomes.

  5. Comparison of SVM RBF-NN and DT for crop and weed identification based on spectral measurement over corn fields

    Science.gov (United States)

    It is important to find an appropriate pattern-recognition method for in-field plant identification based on spectral measurement in order to classify the crop and weeds accurately. In this study, the method of Support Vector Machine (SVM) was evaluated and compared with two other methods, Decision ...

  6. SVM-Based Spectral Analysis for Heart Rate from Multi-Channel WPPG Sensor Signals

    Directory of Open Access Journals (Sweden)

    Jiping Xiong

    2017-03-01

    Full Text Available Although wrist-type photoplethysmographic (hereafter referred to as WPPG) sensor signals can measure heart rate quite conveniently, the subjects' hand movements can cause strong motion artifacts, which heavily contaminate WPPG signals. Hence, it is challenging to accurately estimate heart rate from WPPG signals during intense physical activities. The WPPG method has attracted more attention thanks to the popularity of wrist-worn wearable devices. In this paper, a mixed approach called Mix-SVM is proposed, which uses multi-channel WPPG sensor signals and simultaneous acceleration signals to measure heart rate. Firstly, we combine principal component analysis and an adaptive filter to remove a part of the motion artifacts. Due to the strong correlation between motion artifacts and acceleration signals, the further denoising problem is regarded as a sparse signal reconstruction problem. Then, we use a spectrum subtraction method to eliminate motion artifacts effectively. Finally, the spectral peak corresponding to heart rate is sought by an SVM-based spectral analysis method. On the public PPG database of the 2015 IEEE Signal Processing Cup, the average absolute error was 1.01 beats per minute, and the Pearson correlation was 0.9972. These results also confirm that the proposed Mix-SVM approach has potential for multi-channel WPPG-based heart rate estimation in the presence of intense physical exercise.

  7. [Rapid determination of COD in aquaculture water based on LS-SVM with ultraviolet/visible spectroscopy].

    Science.gov (United States)

    Liu, Xue-Mei; Zhang, Hai-Liang

    2014-10-01

    Ultraviolet/visible (UV/Vis) spectroscopy was studied for the rapid determination of chemical oxygen demand (COD), which was an indicator to measure the concentration of organic matter in aquaculture water. In order to reduce the influence of the absolute noises of the spectra, the extracted 135 absorbance spectra were preprocessed by Savitzky-Golay smoothing (SG), EMD, and wavelet transform (WT) methods. The preprocessed spectra were then used to select latent variables (LVs) by partial least squares (PLS) methods. Partial least squares (PLS) was used to build models with the full spectra, and back-propagation neural network (BPNN) and least square support vector machine (LS-SVM) were applied to build models with the selected LVs. The overall results showed that BPNN and LS-SVM models performed better than PLS models, and the LS-SVM models with LVs based on WT preprocessed spectra obtained the best results with the determination coefficient (r2) and RMSE being 0.83 and 14.78 mg · L(-1) for calibration set, and 0.82 and 14.82 mg · L(-1) for the prediction set respectively. The method showed the best performance in LS-SVM model. The results indicated that it was feasible to use UV/Vis with LVs which were obtained by PLS method, combined with LS-SVM calibration could be applied to the rapid and accurate determination of COD in aquaculture water. Moreover, this study laid the foundation for further implementation of online analysis of aquaculture water and rapid determination of other water quality parameters.

  8. A method of neighbor classes based SVM classification for optical printed Chinese character recognition.

    Science.gov (United States)

    Zhang, Jie; Wu, Xiaohong; Yu, Yanmei; Luo, Daisheng

    2013-01-01

    In optical printed Chinese character recognition (OPCCR), many classifiers have been proposed for the recognition. Among the classifiers, support vector machine (SVM) might be the best classifier. However, SVM is a classifier for two classes. When it is used for multi-classes in OPCCR, its computation is time-consuming. Thus, we propose a neighbor classes based SVM (NC-SVM) to reduce the computation consumption of SVM. Experiments of NC-SVM classification for OPCCR have been done. The results of the experiments have shown that the NC-SVM we proposed can effectively reduce the computation time in OPCCR.

  9. A support vector machine (SVM) based voltage stability classifier

    Energy Technology Data Exchange (ETDEWEB)

    Dosano, R.D.; Song, H. [Kunsan National Univ., Kunsan, Jeonbuk (Korea, Republic of)]; Lee, B. [Korea Univ., Seoul (Korea, Republic of)]

    2007-07-01

    Power system stability has become even more complex and critical with the advent of deregulated energy markets and the growing desire to completely employ existing transmission and infrastructure. The economic pressure on electricity markets forces the operation of power systems and components to their limit of capacity and performance. System conditions can be more exposed to instability due to greater uncertainty in day to day system operations and increase in the number of potential components for system disturbances potentially resulting in voltage stability. This paper proposed a support vector machine (SVM) based power system voltage stability classifier using local measurements of voltage and active power of load. It described the procedure for fast classification of long-term voltage stability using the SVM algorithm. The application of the SVM based voltage stability classifier was presented with reference to the choice of input parameters; input data preconditioning; moving window for feature vector; determination of learning samples; and other considerations in SVM applications. The paper presented a case study with numerical examples of an 11-bus test system. The test results for the feasibility study demonstrated that the classifier could offer an excellent performance in classification with time-series measurements in terms of long-term voltage stability. 9 refs., 14 figs.

  10. A novel one-class SVM based negative data sampling method for reconstructing proteome-wide HTLV-human protein interaction networks.

    Science.gov (United States)

    Mei, Suyu; Zhu, Hao

    2015-01-26

    Protein-protein interaction (PPI) prediction is generally treated as a problem of binary classification wherein negative data sampling is still an open problem to be addressed. The commonly used random sampling is prone to yield less representative negative data with considerable false negatives. Meanwhile rational constraints are seldom exerted on model selection to reduce the risk of false positive predictions for most of the existing computational methods. In this work, we propose a novel negative data sampling method based on one-class SVM (support vector machine, SVM) to predict proteome-wide protein interactions between HTLV retrovirus and Homo sapiens, wherein one-class SVM is used to choose reliable and representative negative data, and two-class SVM is used to yield proteome-wide outcomes as predictive feedback for rational model selection. Computational results suggest that one-class SVM is more suited to be used as negative data sampling method than two-class PPI predictor, and the predictive feedback constrained model selection helps to yield a rational predictive model that reduces the risk of false positive predictions. Some predictions have been validated by the recent literature. Lastly, gene ontology based clustering of the predicted PPI networks is conducted to provide valuable cues for the pathogenesis of HTLV retrovirus.

  11. Feature selection based on SVM significance maps for classification of dementia

    NARCIS (Netherlands)

    E.E. Bron (Esther); M. Smits (Marion); J.C. van Swieten (John); W.J. Niessen (Wiro); S. Klein (Stefan)

    2014-01-01

    Support vector machine significance maps (SVM p-maps) previously showed clusters of significantly different voxels in dementia-related brain regions. We propose a novel feature selection method for classification of dementia based on these p-maps. In our approach, the SVM p-maps are

  12. Research on Classification of Chinese Text Data Based on SVM

    Science.gov (United States)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data mining has important application value in today's industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. The Support Vector Machine (SVM) classification method is a good classifier in machine learning research. This paper studies the classification performance of the SVM method on Chinese text data, uses the support vector machine method to classify Chinese text, and aims to combine academic research with practical application.

  13. Time Reversal Reconstruction Algorithm Based on PSO Optimized SVM Interpolation for Photoacoustic Imaging

    Directory of Open Access Journals (Sweden)

    Mingjian Sun

    2015-01-01

    Full Text Available Photoacoustic imaging is an innovative imaging technique to image biomedical tissues. The time reversal reconstruction algorithm, in which a numerical model of the acoustic forward problem is run backwards in time, is widely used. In this paper, a time reversal reconstruction algorithm based on a particle swarm optimization (PSO) optimized support vector machine (SVM) interpolation method is proposed for photoacoustic imaging. Numerical results show that the reconstructed images of the proposed algorithm are more accurate than those of the nearest neighbor interpolation, linear interpolation, and cubic convolution interpolation based time reversal algorithm, which can provide higher imaging quality by using significantly fewer measurement positions or scanning times.

  14. SVM Based Descriptor Selection and Classification of Neurodegenerative Disease Drugs for Pharmacological Modeling.

    Science.gov (United States)

    Shahid, Mohammad; Shahzad Cheema, Muhammad; Klenner, Alexander; Younesi, Erfan; Hofmann-Apitius, Martin

    2013-03-01

    Systems pharmacological modeling of drug mode of action for the next generation of multitarget drugs may open new routes for drug design and discovery. Computational methods are widely used in this context amongst which support vector machines (SVM) have proven successful in addressing the challenge of classifying drugs with similar features. We have applied a variety of such SVM-based approaches, namely SVM-based recursive feature elimination (SVM-RFE). We use the approach to predict the pharmacological properties of drugs widely used against complex neurodegenerative disorders (NDD) and to build an in-silico computational model for the binary classification of NDD drugs from other drugs. Application of an SVM-RFE model to a set of drugs successfully classified NDD drugs from non-NDD drugs and resulted in overall accuracy of ∼80 % with 10 fold cross validation using 40 top ranked molecular descriptors selected out of total 314 descriptors. Moreover, SVM-RFE method outperformed linear discriminant analysis (LDA) based feature selection and classification. The model reduced the multidimensional descriptors space of drugs dramatically and predicted NDD drugs with high accuracy, while avoiding over fitting. Based on these results, NDD-specific focused libraries of drug-like compounds can be designed and existing NDD-specific drugs can be characterized by a well-characterized set of molecular descriptors. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  15. Area Determination of Diabetic Foot Ulcer Images Using a Cascaded Two-Stage SVM-Based Classification.

    Science.gov (United States)

    Wang, Lei; Pedersen, Peder C; Agu, Emmanuel; Strong, Diane M; Tulu, Bengisu

    2017-09-01

    The standard chronic wound assessment method based on visual examination is potentially inaccurate and also represents a significant clinical workload. Hence, computer-based systems providing quantitative wound assessment may be valuable for accurately monitoring wound healing status, with the wound area the best suited for automated analysis. Here, we present a novel approach, using support vector machines (SVM) to determine the wound boundaries on foot ulcer images captured with an image capture box, which provides controlled lighting and range. After superpixel segmentation, a cascaded two-stage classifier operates as follows: in the first stage, a set of k binary SVM classifiers are trained and applied to different subsets of the entire training images dataset, and incorrectly classified instances are collected. In the second stage, another binary SVM classifier is trained on the incorrectly classified set. We extracted various color and texture descriptors from superpixels that are used as input for each stage in the classifier training. Specifically, color and bag-of-word representations of local dense scale invariant feature transformation features are descriptors for ruling out irrelevant regions, and color and wavelet-based features are descriptors for distinguishing healthy tissue from wound regions. Finally, the detected wound boundary is refined by applying the conditional random field method. We have implemented the wound classification on a Nexus 5 smartphone platform, except for training which was done offline. Results are compared with other classifiers and show that our approach provides high global performance rates (average sensitivity = 73.3%, specificity = 94.6%) and is sufficiently efficient for a smartphone-based image analysis.

  16. Laos Organization Name Using Cascaded Model Based on SVM and CRF

    Directory of Open Access Journals (Sweden)

    Duan Shaopeng

    2017-01-01

    Full Text Available According to the characteristics of Laos organization names, this paper proposes a two-layer model based on conditional random field (CRF) and support vector machine (SVM) for Laos organization name recognition. The first layer uses CRF to recognize simple organization names, and its result is used to support the decision of the second layer. Based on the driving method, the second layer uses SVM and CRF to recognize complicated organization names. Finally, the results of the two layers are combined, and a subsequent treatment corrects recognition results of low confidence. The results of an open test on real linguistic data show that this approach based on SVM and CRF is efficient in recognizing organization names: the recall reaches 80.83% and the precision reaches 82.75%.

  17. PSO-SVM-Based Online Locomotion Mode Identification for Rehabilitation Robotic Exoskeletons.

    Science.gov (United States)

    Long, Yi; Du, Zhi-Jiang; Wang, Wei-Dong; Zhao, Guang-Yu; Xu, Guo-Qiang; He, Long; Mao, Xi-Wang; Dong, Wei

    2016-09-02

    Locomotion mode identification is essential for the control of robotic rehabilitation exoskeletons. This paper proposes an online support vector machine (SVM) optimized by particle swarm optimization (PSO) to identify different locomotion modes to realize a smooth and automatic locomotion transition. A PSO algorithm is used to obtain the optimal parameters of SVM for a better overall performance. Signals measured by the foot pressure sensors integrated in the insoles of wearable shoes and the MEMS-based attitude and heading reference systems (AHRS) attached on the shoes and shanks of leg segments are fused together as the input information of SVM. Based on the chosen window whose size is 200 ms (with sampling frequency of 40 Hz), a three-layer wavelet packet analysis (WPA) is used for feature extraction, after which the kernel principal component analysis (kPCA) is utilized to reduce the dimension of the feature set to reduce computation cost of the SVM. Since the signals are from two different types of sensors, normalization is conducted to scale the input into the interval of [0, 1]. Five-fold cross validation is adopted to train the classifier, which prevents the classifier from over-fitting. Based on the SVM model obtained offline in MATLAB, an online SVM algorithm is constructed for locomotion mode identification. Experiments are performed for different locomotion modes and experimental results show the effectiveness of the proposed algorithm with an accuracy of 96.00% ± 2.45%. To improve its accuracy, majority vote algorithm (MVA) is used for post-processing, with which the identification accuracy is better than 98.35% ± 1.65%. The proposed algorithm can be extended and employed in the field of robotic rehabilitation and assistance.
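
    The abstract above says a PSO algorithm is used to obtain the optimal SVM parameters. The sketch below shows a plain global-best PSO in pure Python, not the authors' implementation. The objective `toy_validation_error` is a hypothetical stand-in for a cross-validated SVM error over two parameters; the inertia weight `w`, the acceleration coefficients `c1` and `c2`, and the search bounds are assumed values for illustration.

```python
import random

def pso_minimize(f, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over a box using a basic global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Hypothetical objective: pretend the validation error is lowest at (1, -2),
# e.g. in (log C, log gamma) coordinates. A real run would train an SVM here.
def toy_validation_error(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

best_params, best_err = pso_minimize(toy_validation_error, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_params, best_err)
```

    Searching in log-scaled coordinates is a common choice for SVM parameters because useful values of C and the kernel width span several orders of magnitude.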

  18. PSO-SVM-Based Online Locomotion Mode Identification for Rehabilitation Robotic Exoskeletons

    Directory of Open Access Journals (Sweden)

    Yi Long

    2016-09-01

    Full Text Available Locomotion mode identification is essential for the control of robotic rehabilitation exoskeletons. This paper proposes an online support vector machine (SVM) optimized by particle swarm optimization (PSO) to identify different locomotion modes to realize a smooth and automatic locomotion transition. A PSO algorithm is used to obtain the optimal parameters of SVM for a better overall performance. Signals measured by the foot pressure sensors integrated in the insoles of wearable shoes and the MEMS-based attitude and heading reference systems (AHRS) attached on the shoes and shanks of leg segments are fused together as the input information of SVM. Based on the chosen window whose size is 200 ms (with sampling frequency of 40 Hz), a three-layer wavelet packet analysis (WPA) is used for feature extraction, after which the kernel principal component analysis (kPCA) is utilized to reduce the dimension of the feature set to reduce computation cost of the SVM. Since the signals are from two different types of sensors, normalization is conducted to scale the input into the interval of [0, 1]. Five-fold cross validation is adopted to train the classifier, which prevents the classifier from over-fitting. Based on the SVM model obtained offline in MATLAB, an online SVM algorithm is constructed for locomotion mode identification. Experiments are performed for different locomotion modes and experimental results show the effectiveness of the proposed algorithm with an accuracy of 96.00% ± 2.45%. To improve its accuracy, majority vote algorithm (MVA) is used for post-processing, with which the identification accuracy is better than 98.35% ± 1.65%. The proposed algorithm can be extended and employed in the field of robotic rehabilitation and assistance.
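
    The majority vote post-processing mentioned in the abstract can be illustrated with a short sketch. This is an assumed, simplified form (a causal sliding window over the stream of predicted labels); the abstract does not state the window length or tie-breaking rule, so `window` and the label names below are hypothetical.

```python
from collections import Counter

def majority_vote(labels, window=3):
    """Smooth a label stream: each output is the most common label among
    the current prediction and up to window-1 preceding predictions."""
    smoothed = []
    for i in range(len(labels)):
        recent = labels[max(0, i - window + 1): i + 1]
        # On a tie, Counter keeps the label that appeared first in the window.
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed

# Hypothetical classifier output with one isolated misclassification ("stairs")
# before the real transition from walking to stair climbing.
raw = ["walk", "walk", "stairs", "walk", "walk", "stairs", "stairs", "stairs"]
smoothed = majority_vote(raw, window=3)
print(smoothed)
```

    The isolated error is suppressed, at the cost of reporting the true transition one step late; a longer window trades more latency for more smoothing.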

  19. Determination of the carmine content based on spectrum fluorescence spectral and PSO-SVM

    Science.gov (United States)

    Wang, Shu-tao; Peng, Tao; Cheng, Qi; Wang, Gui-chuan; Kong, De-ming; Wang, Yu-tian

    2018-03-01

    Carmine is a food pigment widely used in food and beverage additives. Excessive consumption of synthetic pigments can seriously harm the body. Food generally contains a variety of colorants. Simulating the coexistence of various food pigments, we combined fluorescence spectroscopy with the PSO-SVM algorithm to establish a method for determining the carmine content in mixed solution. For the PSO-SVM predictions, the average recovery rate of carmine was 100.84%, the root mean square error of prediction (RMSEP) was 1.03e-04, and the correlation coefficient between the model output and the true values was 0.999. Compared with the prediction results of back propagation (BP), the correlation coefficient of PSO-SVM was 2.7% higher, the average recovery rate 0.6% higher, and the root mean square error nearly one order of magnitude lower. According to the analysis results, combining the fluorescence spectrum technique with PSO-SVM can effectively avoid the interference caused by other pigments and accurately determine the content of carmine in mixed solution, with better results than BP.

  20. SVM Classifiers: The Objects Identification on the Base of Their Hyperspectral Features

    Directory of Open Access Journals (Sweden)

    Demidova Liliya

    2017-01-01

    Full Text Available The problem of identifying objects on the basis of their hyperspectral features is considered. It is proposed to use SVM classifiers based on a modified PSO algorithm, adapted to the specifics of this identification problem. The results of object identification on the basis of hyperspectral features using the SVM classifiers are presented.

  1. A Sensor Dynamic Measurement Error Prediction Model Based on NAPSO-SVM.

    Science.gov (United States)

    Jiang, Minlan; Jiang, Lan; Jiang, Dingde; Li, Fei; Song, Houbing

    2018-01-15

    Dynamic measurement error correction is an effective way to improve sensor precision. Dynamic measurement error prediction is an important part of error correction, and support vector machine (SVM) is often used for predicting the dynamic measurement errors of sensors. Traditionally, the SVM parameters were always set manually, which cannot ensure the model's performance. In this paper, an SVM method based on an improved particle swarm optimization (NAPSO) is proposed to predict the dynamic measurement errors of sensors. Natural selection and simulated annealing are added to the PSO to improve its ability to avoid local optima. To verify the performance of NAPSO-SVM, three types of algorithms are selected to optimize the SVM's parameters: the particle swarm optimization algorithm (PSO), the improved PSO optimization algorithm (NAPSO), and the glowworm swarm optimization (GSO). The dynamic measurement error data of two sensors are applied as the test data. The root mean squared error and mean absolute percentage error are employed to evaluate the prediction models' performances. The experimental results show that among the three tested algorithms the NAPSO-SVM method has better prediction precision and smaller prediction errors, and it is an effective method for predicting the dynamic measurement errors of sensors.
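
    The two evaluation metrics named in the abstract, root mean squared error and mean absolute percentage error, follow standard definitions and can be written in a few lines. The sample values below are made up for illustration and are not data from the study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error of paired observations."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.
    Assumes no actual value is zero (the ratio is undefined there)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical measurement errors and model predictions.
actual = [2.0, 4.0, 5.0]
predicted = [1.0, 4.0, 7.0]
print(rmse(actual, predicted), mape(actual, predicted))
```

    RMSE is expressed in the units of the measured quantity and penalizes large deviations more heavily, while MAPE is scale-free, which is why the two are often reported together.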

  2. pDHS-SVM: A prediction method for plant DNase I hypersensitive sites based on support vector machine.

    Science.gov (United States)

    Zhang, Shanxin; Zhou, Zhiping; Chen, Xinmeng; Hu, Yong; Yang, Lindong

    2017-08-07

    DNase I hypersensitive sites (DHSs) are accessible chromatin regions hypersensitive to cleavages by DNase I endonucleases. DHSs are indicative of cis-regulatory DNA elements (CREs), all of which play important roles in global gene expression regulation. It is helpful for discovering CREs by recognition of DHSs in genome. To accelerate the investigation, it is an important complement to develop cost-effective computational methods to identify DHSs. However, there is a lack of tools used for identifying DHSs in plant genome. Here we presented pDHS-SVM, a computational predictor to identify plant DHSs. To integrate the global sequence-order information and local DNA properties, reverse complement kmer and dinucleotide-based auto covariance of DNA sequences were applied to construct the feature space. In this work, fifteen physical-chemical properties of dinucleotides were used and Support Vector Machine (SVM) was employed. To further improve the performance of the predictor and extract an optimized subset of nucleotide physical-chemical properties positive for the DHSs, a heuristic nucleotide physical-chemical property selection algorithm was introduced. With the optimized subset of properties, experimental results of Arabidopsis thaliana and rice (Oryza sativa) showed that pDHS-SVM could achieve accuracies up to 87.00%, and 85.79%, respectively. The results indicated the effectiveness of proposed method for predicting DHSs. Furthermore, pDHS-SVM could provide a helpful complement for predicting CREs in plant genome. Our implementation of the novel proposed method pDHS-SVM is freely available as source code, at https://github.com/shanxinzhang/pDHS-SVM. Copyright © 2017 Elsevier Ltd. All rights reserved.

  3. A Fault Diagnosis Approach for Gears Based on IMF AR Model and SVM

    Directory of Open Access Journals (Sweden)

    Yu Yang

    2008-05-01

    Full Text Available An accurate autoregressive (AR) model can reflect the characteristics of a dynamic system based on which the fault feature of gear vibration signal can be extracted without constructing mathematical model and studying the fault mechanism of gear vibration system, which are experienced by the time-frequency analysis methods. However, AR model can only be applied to stationary signals, while the gear fault vibration signals usually present nonstationary characteristics. Therefore, empirical mode decomposition (EMD), which can decompose the vibration signal into a finite number of intrinsic mode functions (IMFs), is introduced into feature extraction of gear vibration signals as a preprocessor before AR models are generated. On the other hand, by targeting the difficulties of obtaining sufficient fault samples in practice, support vector machine (SVM) is introduced into gear fault pattern recognition. In the proposed method in this paper, firstly, vibration signals are decomposed into a finite number of intrinsic mode functions, then the AR model of each IMF component is established; finally, the corresponding autoregressive parameters and the variance of remnant are regarded as the fault characteristic vectors and used as input parameters of SVM classifier to classify the working condition of gears. The experimental analysis results show that the proposed approach, in which IMF AR model and SVM are combined, can identify working condition of gears with a success rate of 100% even in the case of smaller number of samples.

  4. [Measurement of soil organic matter and available K based on SPA-LS-SVM].

    Science.gov (United States)

    Zhang, Hai-Liang; Liu, Xue-Mei; He, Yong

    2014-05-01

    Visible and short wave infrared spectroscopy (Vis/SW-NIRS) was investigated in the present study for measurement of soil organic matter (OM) and available potassium (K). Four types of pretreatments including smoothing, SNV, MSC and SG smoothing+first derivative were adopted to eliminate the system noises and external disturbances. Then partial least squares regression (PLSR) and least squares-support vector machine (LS-SVM) models were implemented for calibration models. The LS-SVM model was built by using characteristic wavelength based on successive projections algorithm (SPA). Simultaneously, the performance of LS-SVM models was compared with PLSR models. The results indicated that LS-SVM models using characteristic wavelength as inputs based on SPA outperformed PLSR models. The optimal SPA-LS-SVM models were achieved, and the correlation coefficient (r) and RMSEP were 0.8602 and 2.98 for OM and 0.7305 and 15.78 for K, respectively. The results indicated that visible and short wave near infrared spectroscopy (Vis/SW-NIRS) (325-1075 nm) combined with LS-SVM based on SPA could be utilized as a precision method for the determination of soil properties.

  5. TargetCrys: protein crystallization prediction by fusing multi-view features with two-layered SVM.

    Science.gov (United States)

    Hu, Jun; Han, Ke; Li, Yang; Yang, Jing-Yu; Shen, Hong-Bin; Yu, Dong-Jun

    2016-11-01

    The accurate prediction of whether a protein will crystallize plays a crucial role in improving the success rate of protein crystallization projects. A common critical problem in the development of machine-learning-based protein crystallization predictors is how to effectively utilize protein features extracted from different views. In this study, we aimed to improve the efficiency of fusing multi-view protein features by proposing a new two-layered SVM (2L-SVM) which switches the feature-level fusion problem to a decision-level fusion problem: the SVMs in the 1st layer of the 2L-SVM are trained on each of the multi-view feature sets; then, the outputs of the 1st layer SVMs, which are the "intermediate" decisions made based on the respective feature sets, are further ensembled by a 2nd layer SVM. Based on the proposed 2L-SVM, we implemented a sequence-based protein crystallization predictor called TargetCrys. Experimental results on several benchmark datasets demonstrated the efficacy of the proposed 2L-SVM for fusing multi-view features. We also compared TargetCrys with existing sequence-based protein crystallization predictors and demonstrated that the proposed TargetCrys outperformed most of the existing predictors and is competitive with the state-of-the-art predictors. The TargetCrys webserver and datasets used in this study are freely available for academic use at: http://csbio.njust.edu.cn/bioinf/TargetCrys .

  6. Quantitative analysis of glycated albumin in serum based on ATR-FTIR spectrum combined with SiPLS and SVM.

    Science.gov (United States)

    Li, Yuanpeng; Li, Fucui; Yang, Xinhao; Guo, Liu; Huang, Furong; Chen, Zhenqiang; Chen, Xingdan; Zheng, Shifu

    2018-08-05

    A rapid quantitative analysis model for determining the glycated albumin (GA) content based on attenuated total reflectance (ATR)-Fourier transform infrared spectroscopy (FTIR) combined with linear SiPLS and nonlinear SVM has been developed. Firstly, the real GA content in human serum was determined by the GA enzymatic method; meanwhile, the ATR-FTIR spectra of serum samples from the population of health examination were obtained. The spectral data of the whole mid-infrared region (4000-600 cm⁻¹) and GA's characteristic region (1800-800 cm⁻¹) were used as the research object of quantitative analysis. Secondly, several preprocessing steps including first derivative, second derivative, variable standardization and spectral normalization were performed. Lastly, quantitative analysis regression models were established by using SiPLS and SVM respectively. The SiPLS modeling results are as follows: root mean square error of cross validation (RMSECV_T) = 0.523 g/L, calibration coefficient (R_C) = 0.937, root mean square error of prediction (RMSEP_T) = 0.787 g/L, and prediction coefficient (R_P) = 0.938. The SVM modeling results are as follows: RMSECV_T = 0.0048 g/L, R_C = 0.998, RMSEP_T = 0.442 g/L, and R_P = 0.916. The results indicated that the model performance was improved significantly after preprocessing and optimization of characteristic regions, while the modeling performance of nonlinear SVM was considerably better than that of linear SiPLS. Hence, the quantitative analysis model for GA in human serum based on ATR-FTIR combined with SiPLS and SVM is effective. It does not need sample preprocessing and is characterized by simple operations and high time efficiency, providing a rapid and accurate method for GA content determination. Copyright © 2018 Elsevier B.V. All rights reserved.

  7. Geographical traceability of wild Boletus edulis based on data fusion of FT-MIR and ICP-AES coupled with data mining methods (SVM)

    Science.gov (United States)

    Li, Yun; Zhang, Ji; Li, Tao; Liu, Honggao; Li, Jieqing; Wang, Yuanzhong

    2017-04-01

    In this work, the data fusion strategy of Fourier transform mid infrared (FT-MIR) spectroscopy and inductively coupled plasma-atomic emission spectrometry (ICP-AES) was used in combination with Support Vector Machine (SVM) to determine the geographic origin of Boletus edulis collected from nine regions of Yunnan Province in China. Firstly, competitive adaptive reweighted sampling (CARS) was used for selecting an optimal combination of key wavenumbers of second derivative FT-MIR spectra, and thirteen elements were sorted with variable importance in projection (VIP) scores. Secondly, thirteen subsets of multi-elements with the best VIP score were generated and each subset was used to fuse with FT-MIR. Finally, the classification models were established by SVM, and the combination of parameter C and γ (gamma) of SVM models was calculated by the approaches of grid search (GS) and genetic algorithm (GA). The results showed that both GS-SVM and GA-SVM models achieved good performances based on the #9 subset and the prediction accuracy in calibration and validation sets of the two models were 81.40% and 90.91%, correspondingly. In conclusion, it indicated that the data fusion strategy of FT-MIR and ICP-AES coupled with the algorithm of SVM can be used as a reliable tool for accurate identification of B. edulis, and it can provide a useful way of thinking for the quality control of edible mushrooms.
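
    The grid search (GS) over the SVM parameters C and γ described above amounts to scoring every pair on a grid and keeping the best. The sketch below illustrates only that search loop. `toy_score` is a hypothetical stand-in for the cross-validated accuracy of an SVM trained with the given pair, and the power-of-two grids are an assumed convention, not values reported in the abstract.

```python
import math
from itertools import product

def grid_search(score, c_values, gamma_values):
    """Evaluate score(C, gamma) on the full grid and return
    (best_score, best_C, best_gamma)."""
    return max((score(c, g), c, g) for c, g in product(c_values, gamma_values))

# Hypothetical score surface that peaks at C = 2^3 and gamma = 2^-3.
# A real run would train and cross-validate an SVM for each pair instead.
def toy_score(c, gamma):
    return 1.0 - 0.01 * (math.log2(c) - 3) ** 2 - 0.01 * (math.log2(gamma) + 3) ** 2

c_grid = [2.0 ** k for k in range(-5, 16, 2)]       # 2^-5, 2^-3, ..., 2^15
gamma_grid = [2.0 ** k for k in range(-15, 4, 2)]   # 2^-15, 2^-13, ..., 2^3

best = grid_search(toy_score, c_grid, gamma_grid)
print(best)
```

    Grid search is exhaustive and easy to reproduce, whereas a genetic algorithm samples the same space stochastically; the abstract reports that both reached the same accuracy on the selected subset.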

  8. Signal peptide discrimination and cleavage site identification using SVM and NN.

    Science.gov (United States)

    Kazemian, H B; Yusuf, S A; White, K

    2014-02-01

    About 15% of all proteins in a genome contain a signal peptide (SP) sequence, at the N-terminus, that targets the protein to intracellular secretory pathways. Once the protein is targeted correctly in the cell, the SP is cleaved, releasing the mature protein. Accurate prediction of the presence of these short amino-acid SP chains is crucial for modelling the topology of membrane proteins, since SP sequences can be confused with transmembrane domains due to similar composition of hydrophobic amino acids. This paper presents a cascaded Support Vector Machine (SVM)-Neural Network (NN) classification methodology for SP discrimination and cleavage site identification. The proposed method utilises a dual phase classification approach using SVM as a primary classifier to discriminate SP sequences from Non-SP. The methodology further employs NNs to predict the most suitable cleavage site candidates. In phase one, an SVM classification utilises hydrophobic propensities as a primary feature vector extraction using symmetric sliding window amino-acid sequence analysis for discrimination of SP and Non-SP. In phase two, an NN classification uses asymmetric sliding window sequence analysis for cleavage site prediction. The proposed SVM-NN method was tested using Uni-Prot non-redundant datasets of eukaryotic and prokaryotic proteins with SP and Non-SP N-termini. Computer simulation results demonstrate an overall accuracy of 0.90 for SP and Non-SP discrimination based on Matthews Correlation Coefficient (MCC) tests using SVM. For SP cleavage site prediction, the overall accuracy is 91.5% based on cross-validation tests using the novel SVM-NN model. © 2013 Published by Elsevier Ltd.
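
    The symmetric sliding window over hydrophobic propensities described in phase one can be sketched as follows. The abstract does not name the hydrophobicity scale; the Kyte-Doolittle hydropathy index used here is an assumption, as are the zero padding at the sequence ends and the function names.

```python
# Kyte-Doolittle hydropathy index (assumed scale; the paper's scale is not stated).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def window_features(seq, center, half_width):
    """Hydropathy values in a symmetric window around `center`.

    Positions that fall outside the sequence are padded with 0.0.
    """
    feats = []
    for i in range(center - half_width, center + half_width + 1):
        if 0 <= i < len(seq):
            feats.append(KD.get(seq[i], 0.0))
        else:
            feats.append(0.0)
    return feats

def sliding_windows(seq, half_width):
    """One feature vector per residue, each of length 2 * half_width + 1."""
    return [window_features(seq, c, half_width) for c in range(len(seq))]

print(sliding_windows("MKAL", 1))
```

    An asymmetric window, as used in phase two, would differ only in taking separate left and right widths.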

  9. Predicting Protein-Protein Interaction Sites with a Novel Membership Based Fuzzy SVM Classifier.

    Science.gov (United States)

    Sriwastava, Brijesh K; Basu, Subhadip; Maulik, Ujjwal

    2015-01-01

    Predicting residues that participate in protein-protein interactions (PPI) helps to identify which amino acids are located at the interface. In this paper, we show that the performance of the classical support vector machine (SVM) algorithm can be further improved with the use of a custom-designed fuzzy membership function for the partner-specific PPI interface prediction problem. We evaluated the performance of both classical SVM and fuzzy SVM (F-SVM) on the PPI databases of three different model proteomes, Homo sapiens, Escherichia coli and Saccharomyces cerevisiae, and calculated the statistical significance of the developed F-SVM over the classical SVM algorithm. We also compared our performance with the available state-of-the-art fuzzy methods in this domain and observed significant performance improvements. To predict interaction sites in protein complexes, the local composition of amino acids together with their physico-chemical characteristics is used, where the F-SVM based prediction method exploits the membership function for each pair of sequence fragments. The average F-SVM performance (area under ROC curve) on the test samples in a 10-fold cross validation experiment is measured as 77.07, 78.39, and 74.91 percent for the aforementioned organisms respectively. Performances on independent test sets are obtained as 72.09, 73.24 and 82.74 percent respectively. The software is available for free download from http://code.google.com/p/cmater-bioinfo.
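
    The area under the ROC curve used as the performance measure here can be computed directly from classifier scores. The sketch below uses the pairwise-ranking definition (the fraction of positive/negative pairs ranked correctly, with ties counting half); the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))

# Hypothetical interface (1) / non-interface (0) labels with decision scores.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```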

  10. Human Walking Pattern Recognition Based on KPCA and SVM with Ground Reflex Pressure Signal

    Directory of Open Access Journals (Sweden)

    Zhaoqin Peng

    2013-01-01

    Full Text Available Algorithms based on the ground reflex pressure (GRF) signal obtained from a pair of sensing shoes for human walking pattern recognition were investigated. The dimensionality reduction algorithms based on principal component analysis (PCA) and kernel principal component analysis (KPCA) for walking pattern data compression were studied in order to obtain higher recognition speed. Classifiers based on support vector machine (SVM), SVM-PCA, and SVM-KPCA were designed, and the classification performances of these three kinds of algorithms were compared using data collected from a person who was wearing the sensing shoes. Experimental results showed that the algorithm fusing SVM and KPCA had better recognition performance than the other two methods. Experimental outcomes also confirmed that the sensing shoes developed in this paper can be employed for automatically recognizing human walking patterns in unlimited environments, which demonstrates the potential application in the control of exoskeleton robots.

  11. Absolute cosine-based SVM-RFE feature selection method for prostate histopathological grading.

    Science.gov (United States)

    Sahran, Shahnorbanun; Albashish, Dheeb; Abdullah, Azizi; Shukor, Nordashima Abd; Hayati Md Pauzi, Suria

    2018-04-18

    Feature selection (FS) methods are widely used in grading and diagnosing prostate histopathological images. In this context, FS is based on the texture features obtained from the lumen, nuclei, cytoplasm and stroma, all of which are important tissue components. However, it is difficult to represent the high-dimensional textures of these tissue components. To solve this problem, we propose a new FS method that enables the selection of features with minimal redundancy in the tissue components. We categorise tissue images based on the texture of individual tissue components via the construction of a single classifier and also construct an ensemble learning model by merging the values obtained by each classifier. Another issue that arises is overfitting due to the high-dimensional texture of individual tissue components. We propose a new FS method, SVM-RFE(AC), that integrates a Support Vector Machine-Recursive Feature Elimination (SVM-RFE) embedded procedure with an absolute cosine (AC) filter method to prevent redundancy in the selected features of the SVM-RFE and an unoptimised classifier in the AC. We conducted experiments on H&E histopathological prostate and colon cancer images with respect to three prostate classifications, namely benign vs. grade 3, benign vs. grade 4 and grade 3 vs. grade 4. The colon benchmark dataset requires a distinction between grades 1 and 2, which are the most difficult cases to distinguish in the colon domain. The results obtained by both the single and ensemble classification models (which use the product rule as the merging method) confirm that the proposed SVM-RFE(AC) is superior to the other SVM and SVM-RFE-based methods. We developed an FS method based on SVM-RFE and AC and successfully showed that its use enabled the identification of the most crucial texture feature of each tissue component. Thus, it makes possible the distinction between multiple Gleason grades (e.g. grade 3 vs. grade 4) and its performance is far superior to
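
    The idea of an absolute cosine filter for removing redundant features can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' SVM-RFE(AC) procedure: it assumes a feature ranking is already available (for example from SVM-RFE), and the threshold value and function names are hypothetical.

```python
import math

def abs_cosine(u, v):
    """Absolute cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return abs(dot) / (norm_u * norm_v)

def drop_redundant(features, ranked, threshold=0.95):
    """Keep features in ranked order, skipping any that are nearly collinear
    (|cos| >= threshold) with a feature that has already been kept.

    features: dict mapping feature name -> list of values, one per sample.
    ranked:   feature names, most important first (assumed to be given).
    """
    kept = []
    for name in ranked:
        if all(abs_cosine(features[name], features[k]) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical texture features measured on three samples.
feats = {"a": [1, 2, 3], "b": [2, 4, 6], "c": [1, -1, 0]}
print(drop_redundant(feats, ["a", "b", "c"]))
```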

  12. Sales Growth Rate Forecasting Using Improved PSO and SVM

    Directory of Open Access Journals (Sweden)

    Xibin Wang

    2014-01-01

    Full Text Available Accurate forecast of the sales growth rate plays a decisive role in determining the amount of advertising investment. In this study, we present a preclassification and later regression based method optimized by improved particle swarm optimization (IPSO) for sales growth rate forecasting. We use support vector machine (SVM) as a classification model. The nonlinear relationship in sales growth rate forecasting is efficiently represented by SVM, while IPSO optimizes the training parameters of SVM. IPSO addresses issues of traditional PSO, such as relapsing into local optimum, slow convergence speed, and low convergence precision in the later evolution. We performed two experiments. Firstly, three classic benchmark functions are used to verify the validity of the IPSO algorithm against PSO. Having shown that IPSO outperforms PSO in convergence speed, precision, and escaping local optima, in our second experiment we apply IPSO to the proposed model. The sales growth rate forecasting cases are used to test the forecasting performance of the proposed model. According to the requirements and industry knowledge, the sample data was first classified to obtain types of the test samples. Next, the values of the test samples were forecast using the SVM regression algorithm. The experimental results demonstrate that the proposed model has good forecasting performance.
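
    For orientation, the baseline that IPSO is compared against can be sketched as a standard global-best PSO minimizing a benchmark function. This is the textbook algorithm, not the improved variant from the paper; the sphere function, the inertia and acceleration constants, and the swarm size are assumptions chosen for illustration.

```python
import random

def sphere(x):
    """Classic benchmark function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

def pso(objective, dim, bounds, n_particles=20, iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

best, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, best_val)
```

    To tune SVM training parameters in this way, `objective` would be replaced by a validation error evaluated at a candidate parameter vector.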

  13. DCS-SVM: a novel semi-automated method for human brain MR image segmentation.

    Science.gov (United States)

    Ahmadvand, Ali; Daliri, Mohammad Reza; Hajiali, Mohammadtaghi

    2017-11-27

    In this paper, a novel method is proposed which appropriately segments magnetic resonance (MR) brain images into three main tissues. This paper proposes an extension of our previous work in which we suggested a combination of multiple classifiers (CMC)-based methods named dynamic classifier selection-dynamic local training local Tanimoto index (DCS-DLTLTI) for MR brain image segmentation into three main cerebral tissues. This idea is used here and a novel method is developed that tries to use more complex and accurate classifiers like support vector machine (SVM) in the ensemble. This work is challenging because the CMC-based methods are time consuming, especially on huge datasets like three-dimensional (3D) brain MR images. Moreover, SVM is a powerful method that is used for modeling datasets with complex feature space, but it also has huge computational cost for big datasets, especially those with strong interclass variability problems and with more than two classes such as 3D brain images; therefore, we cannot use SVM in DCS-DLTLTI. Therefore, we propose a novel approach named "DCS-SVM" to use SVM in DCS-DLTLTI to improve the accuracy of segmentation results. The proposed method is applied on well-known datasets of the Internet Brain Segmentation Repository (IBSR) and promising results are obtained.

  14. Diagnosis of Elevator Faults with LS-SVM Based on Optimization by K-CV

    Directory of Open Access Journals (Sweden)

    Zhou Wan

    2015-01-01

    Full Text Available Several common elevator malfunctions were diagnosed with a least squares support vector machine (LS-SVM). After acquiring vibration signals of various elevator functions, their energy characteristics and time domain indicators were extracted by theoretically analyzing the optimal wavelet packet, in order to construct a feature vector of malfunctions as the input of the LS-SVM for identifying the causes of the malfunctions. Meanwhile, the LS-SVM parameters were optimized by K-fold cross validation (K-CV). After diagnosing deviated elevator guide rail, deviated shape of guide shoe, abnormal running of tractor, erroneous rope groove of traction sheave, deviated guide wheel, and tension of wire rope, the results suggested that the LS-SVM based on K-CV optimization was an effective method for diagnosing elevator malfunctions.
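
    K-fold cross validation, used here to optimize the LS-SVM parameters, can be sketched as an index splitter plus an averaging loop. The function names and the `fit_and_score` callback are hypothetical; the callback stands in for training a model on the training indices and scoring it on the validation indices.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) for k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

def cross_val_score(fit_and_score, n_samples, k):
    """Average the score returned by fit_and_score(train_idx, val_idx) over k folds."""
    scores = [fit_and_score(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

for train_idx, val_idx in k_fold_indices(10, 3):
    print(val_idx)
```

    Parameter optimization then amounts to evaluating `cross_val_score` for each candidate parameter setting and keeping the best one. Shuffling the samples before splitting is usually advisable when the data are ordered by class.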

  15. Hadamard Kernel SVM with applications for breast cancer outcome predictions.

    Science.gov (United States)

    Jiang, Hao; Ching, Wai-Ki; Cheung, Wai-Shun; Hou, Wenpin; Yin, Hong

    2017-12-21

    Breast cancer is one of the leading causes of death for women. It is of great necessity to develop effective methods for breast cancer detection and diagnosis. Recent studies have focused on gene-based signatures for outcome predictions. Kernel SVM has attracted a lot of attention for its discriminative power in dealing with small sample pattern recognition problems. But how to select or construct an appropriate kernel for a specified problem still needs further investigation. Here we propose a novel kernel (Hadamard Kernel) in conjunction with Support Vector Machines (SVMs) to address the problem of breast cancer outcome prediction using gene expression data. Hadamard Kernel outperforms the classical kernels and correlation kernel in terms of Area under the ROC Curve (AUC) values, where a number of real-world data sets are adopted to test the performance of different methods. Hadamard Kernel SVM is effective for breast cancer predictions, either in terms of prognosis or diagnosis. It may benefit patients by guiding therapeutic options. Apart from that, it would be a valuable addition to the current SVM kernel families. We hope it will contribute to the wider biology and related communities.

  16. Applications of PCA and SVM-PSO Based Real-Time Face Recognition System

    Directory of Open Access Journals (Sweden)

    Ming-Yuan Shieh

    2014-01-01

    Full Text Available This paper incorporates principal component analysis (PCA) with support vector machine-particle swarm optimization (SVM-PSO) for developing real-time face recognition systems. The integrated scheme aims to adopt the SVM-PSO method to improve the validity of PCA based image recognition systems on dynamically visual perception. The face recognition for most human-robot interaction applications is accomplished by PCA based methods because of their dimensionality reduction. However, PCA based systems are only suitable for processing faces with the same facial expressions and/or under the same view directions. Since the facial feature selection process can be considered as a problem of global combinatorial optimization in machine learning, the SVM-PSO is usually used as an optimal classifier of the system. In this paper, the PSO is used to implement feature selection, and the SVMs serve as fitness functions of the PSO for classification problems. Experimental results demonstrate that the proposed method simplifies features effectively and obtains higher classification accuracy.

  17. Geographical traceability of wild Boletus edulis based on data fusion of FT-MIR and ICP-AES coupled with data mining methods (SVM).

    Science.gov (United States)

    Li, Yun; Zhang, Ji; Li, Tao; Liu, Honggao; Li, Jieqing; Wang, Yuanzhong

    2017-04-15

    In this work, the data fusion strategy of Fourier transform mid infrared (FT-MIR) spectroscopy and inductively coupled plasma-atomic emission spectrometry (ICP-AES) was used in combination with Support Vector Machine (SVM) to determine the geographic origin of Boletus edulis collected from nine regions of Yunnan Province in China. Firstly, competitive adaptive reweighted sampling (CARS) was used for selecting an optimal combination of key wavenumbers of second derivative FT-MIR spectra, and thirteen elements were sorted with variable importance in projection (VIP) scores. Secondly, thirteen subsets of multi-elements with the best VIP score were generated and each subset was used to fuse with FT-MIR. Finally, the classification models were established by SVM, and the combination of parameter C and γ (gamma) of SVM models was calculated by the approaches of grid search (GS) and genetic algorithm (GA). The results showed that both GS-SVM and GA-SVM models achieved good performances based on the #9 subset and the prediction accuracy in calibration and validation sets of the two models were 81.40% and 90.91%, correspondingly. In conclusion, it indicated that the data fusion strategy of FT-MIR and ICP-AES coupled with the algorithm of SVM can be used as a reliable tool for accurate identification of B. edulis, and it can provide a useful way of thinking for the quality control of edible mushrooms. Copyright © 2017. Published by Elsevier B.V.

  18. Comparison of two Classification methods (MLC and SVM) to extract land use and land cover in Johor Malaysia

    Science.gov (United States)

    Rokni Deilmai, B.; Ahmad, B. Bin; Zabihi, H.

    2014-06-01

    Mapping is essential for the analysis of land use and land cover, which influence many environmental processes and properties. When creating land cover maps, it is important to minimize error, because errors propagate into later analyses based on these maps. The reliability of land cover maps derived from remotely sensed data depends on accurate classification. In this study, we analyzed multispectral data using two different classifiers: Maximum Likelihood Classifier (MLC) and Support Vector Machine (SVM). To this end, Landsat Thematic Mapper data and identical field-based training sample datasets from Johor, Malaysia, were used for each classification method, yielding five land cover classes: forest, oil palm, urban area, water, and rubber. Classification results indicate that SVM was more accurate than MLC. With its demonstrated capability to produce reliable land cover results, the SVM method should be especially useful for land cover classification.

  19. Comparison of two Classification methods (MLC and SVM) to extract land use and land cover in Johor Malaysia

    International Nuclear Information System (INIS)

    Deilmai, B Rokni; Ahmad, B Bin; Zabihi, H

    2014-01-01

    Mapping is essential for the analysis of land use and land cover, which influence many environmental processes and properties. When creating land cover maps, it is important to minimize error, because errors propagate into later analyses based on these maps. The reliability of land cover maps derived from remotely sensed data depends on accurate classification. In this study, we analyzed multispectral data using two different classifiers: Maximum Likelihood Classifier (MLC) and Support Vector Machine (SVM). To this end, Landsat Thematic Mapper data and identical field-based training sample datasets from Johor, Malaysia, were used for each classification method, yielding five land cover classes: forest, oil palm, urban area, water, and rubber. Classification results indicate that SVM was more accurate than MLC. With its demonstrated capability to produce reliable land cover results, the SVM method should be especially useful for land cover classification.

  20. Comparative Study on KNN and SVM Based Weather Classification Models for Day Ahead Short Term Solar PV Power Forecasting

    Directory of Open Access Journals (Sweden)

    Fei Wang

    2017-12-01

    Full Text Available Accurate solar photovoltaic (PV) power forecasting is an essential tool for mitigating the negative effects caused by the uncertainty of PV output power in systems with high penetration levels of solar PV generation. Weather classification based modeling is an effective way to increase the accuracy of day-ahead short-term (DAST) solar PV power forecasting because PV output power is strongly dependent on the specific weather conditions in a given time period. However, the accuracy of daily weather classification relies on both the applied classifiers and the training data. This paper aims to reveal how these two factors impact the classification performance and to delineate the relation between classification accuracy and sample dataset scale. Two commonly used classification methods, K-nearest neighbors (KNN) and support vector machines (SVM), are applied to classify the daily local weather types for DAST solar PV power forecasting using the operation data from a grid-connected PV plant in Hohhot, Inner Mongolia, China. We assessed the performance of SVM and KNN approaches, and then investigated the influences of sample scale, the number of categories, and the data distribution in different categories on the daily weather classification results. The simulation results illustrate that SVM performs well with small sample scale, while KNN is more sensitive to the length of the training dataset and can achieve higher accuracy than SVM with sufficient samples.
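
    As a point of reference for the KNN side of this comparison, a minimal K-nearest-neighbors classifier is sketched below. The two-dimensional features and the weather labels are hypothetical and do not come from the Hohhot plant data; Euclidean distance and majority voting are the usual defaults and are assumed here.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical daily features, e.g. (normalized irradiance, cloud fraction).
train_x = [(0.9, 0.1), (0.8, 0.2), (0.2, 0.9), (0.1, 0.8), (0.85, 0.15)]
train_y = ["sunny", "sunny", "rainy", "rainy", "sunny"]

print(knn_predict(train_x, train_y, (0.7, 0.3)))
```

    Because every prediction is a vote among stored samples, the quality of a KNN classifier depends directly on how densely the training set covers each class, which is consistent with the sensitivity to training-set size reported above.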

  1. Settlement Prediction of Road Soft Foundation Using a Support Vector Machine (SVM) Based on Measured Data

    Directory of Open Access Journals (Sweden)

    Yu Huiling

    2016-01-01

    Full Text Available The support vector machine (SVM) is a relatively new artificial intelligence technique which is increasingly being applied to geotechnical problems and is yielding encouraging results. SVM is a machine learning method based on statistical learning theory. A case study based on a road foundation engineering project shows that the forecast results are in good agreement with the measured data. The SVM model is also compared with the BP artificial neural network model and the traditional hyperbola method. The prediction results indicate that the SVM model has a better prediction ability than the BP neural network model and the hyperbola method. Therefore, settlement prediction based on the SVM model can reflect the actual settlement process more correctly. The results indicate that it is effective and feasible to use this method, and that the nonlinear mapping relation between foundation settlement and its influence factors can be expressed well. It provides a new method to predict foundation settlement.

  2. Prediction of N-Methyl-D-Aspartate Receptor GluN1-Ligand Binding Affinity by a Novel SVM-Pose/SVM-Score Combinatorial Ensemble Docking Scheme.

    Science.gov (United States)

    Leong, Max K; Syu, Ren-Guei; Ding, Yi-Lung; Weng, Ching-Feng

    2017-01-06

    The glycine-binding site of the N-methyl-D-aspartate receptor (NMDAR) subunit GluN1 is a potential pharmacological target for neurodegenerative disorders. A novel combinatorial ensemble docking scheme using ligand and protein conformation ensembles and customized support vector machine (SVM)-based models to select the docked pose and to predict the docking score was generated for predicting the NMDAR GluN1-ligand binding affinity. The predicted root mean square deviation (RMSD) values in pose by SVM-Pose models were found to be in good agreement with the observed values (n = 30, r² = 0.928-0.988,  = 0.894-0.954, RMSE = 0.002-0.412, s = 0.001-0.214), and the predicted pKi values by SVM-Score were found to be in good agreement with the observed values for the training samples (n = 24, r² = 0.967,  = 0.899, RMSE = 0.295, s = 0.170) and test samples (n = 13, q² = 0.894, RMSE = 0.437, s = 0.202). When subjected to various statistical validations, the developed SVM-Pose and SVM-Score models consistently met the most stringent criteria. A mock test asserted the predictivity of this novel docking scheme. Collectively, this accurate novel combinatorial ensemble docking scheme can be used to predict the NMDAR GluN1-ligand binding affinity for facilitating drug discovery.

  3. Intelligent Recognition of Lung Nodule Combining Rule-based and C-SVM Classifiers

    Directory of Open Access Journals (Sweden)

    Bin Li

    2012-02-01

    Full Text Available Computer-aided detection (CAD) systems for lung nodules play an important role in the diagnosis of lung cancer. In this paper, an improved intelligent recognition method for lung nodules in HRCT, combining rule-based and cost-sensitive support vector machine (C-SVM) classifiers, is proposed for detecting both solid nodules and ground-glass opacity (GGO) nodules (part solid and nonsolid). This method consists of several steps. Firstly, segmentation of regions of interest (ROIs), including pulmonary parenchyma and lung nodule candidates, is a difficult task. On one hand, the presence of noise lowers the visibility of low-contrast objects. On the other hand, different types of nodules, including small nodules, nodules connecting to vasculature or other structures, and part-solid or nonsolid nodules, are complex, noisy, weak-edged, or have boundaries that are difficult to define. In order to overcome the problems of obvious boundary leakage and slow evolution speed in the segmentation of weak edges, an overall segmentation method is proposed as follows: the lung parenchyma is extracted based on a threshold and morphologic segmentation method; image denoising and enhancement are realized by the nonlinear anisotropic diffusion filtering (NADF) method; candidate pulmonary nodules are segmented by the improved C-V level set method, in which the segmentation result of the EM-based fuzzy threshold method is used as the initial contour of the active contour model and a constrained energy term is added into the PDE of the level set function. Then, lung nodules are classified by using the intelligent classifiers combining rules and C-SVM. Rule-based classification is first used to remove easily dismissible nonnodule objects, then C-SVM classification is used to further classify nodule candidates and reduce the number of false positive (FP) objects. In order to increase the efficiency of SVM, an improved training method is used to train SVM, which uses the grid search method to search the optimal

  4. Intelligent Recognition of Lung Nodule Combining Rule-based and C-SVM Classifiers

    Directory of Open Access Journals (Sweden)

    Bin Li

    2011-10-01

    Full Text Available Computer-aided detection (CAD) systems for lung nodules play an important role in the diagnosis of lung cancer. In this paper, an improved intelligent recognition method for lung nodules in HRCT, combining rule-based and cost-sensitive support vector machine (C-SVM) classifiers, is proposed for detecting both solid nodules and ground-glass opacity (GGO) nodules (part solid and nonsolid). This method consists of several steps. Firstly, segmentation of regions of interest (ROIs), including pulmonary parenchyma and lung nodule candidates, is a difficult task. On one hand, the presence of noise lowers the visibility of low-contrast objects. On the other hand, different types of nodules, including small nodules, nodules connecting to vasculature or other structures, and part-solid or nonsolid nodules, are complex, noisy, weak-edged, or have boundaries that are difficult to define. In order to overcome the problems of obvious boundary leakage and slow evolution speed in the segmentation of weak edges, an overall segmentation method is proposed as follows: the lung parenchyma is extracted based on a threshold and morphologic segmentation method; image denoising and enhancement are realized by the nonlinear anisotropic diffusion filtering (NADF) method; candidate pulmonary nodules are segmented by the improved C-V level set method, in which the segmentation result of the EM-based fuzzy threshold method is used as the initial contour of the active contour model and a constrained energy term is added into the PDE of the level set function. Then, lung nodules are classified by using the intelligent classifiers combining rules and C-SVM. Rule-based classification is first used to remove easily dismissible nonnodule objects, then C-SVM classification is used to further classify nodule candidates and reduce the number of false positive (FP) objects. In order to increase the efficiency of SVM, an improved training method is used to train SVM, which uses the grid search method to search the optimal parameters

  5. gkmSVM: an R package for gapped-kmer SVM.

    Science.gov (United States)

    Ghandi, Mahmoud; Mohammad-Noori, Morteza; Ghareghani, Narges; Lee, Dongwon; Garraway, Levi; Beer, Michael A

    2016-07-15

    We present a new R package for training gapped-kmer SVM classifiers for DNA and protein sequences. We describe an improved algorithm for kernel matrix calculation that speeds run time by about 2 to 5-fold over our original gkmSVM algorithm. This package supports several sequence kernels, including: gkmSVM, kmer-SVM, mismatch kernel and wildcard kernel. The gkmSVM package is freely available through the Comprehensive R Archive Network (CRAN) for Linux, Mac OS and Windows platforms. The C++ implementation is available at www.beerlab.org/gkmsvm. Contact: mghandi@gmail.com or mbeer@jhu.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
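
    The notion of a gapped k-mer feature can be illustrated with a naive sketch: every window of length l contributes one feature for each choice of k informative positions, with the remaining positions treated as gaps. This brute-force, unnormalized version is for intuition only; it is not the package's algorithm, and the function names and the use of "-" as the gap symbol are assumptions.

```python
from collections import Counter
from itertools import combinations

def gapped_kmer_counts(seq, l, k):
    """Count gapped k-mers: each length-l window with k kept positions, rest gapped."""
    counts = Counter()
    for start in range(len(seq) - l + 1):
        lmer = seq[start:start + l]
        for positions in combinations(range(l), k):
            chosen = set(positions)
            key = "".join(ch if i in chosen else "-" for i, ch in enumerate(lmer))
            counts[key] += 1
    return counts

def gkm_kernel(a, b, l, k):
    """Unnormalized kernel value: dot product of the two gapped k-mer count vectors."""
    ca = gapped_kmer_counts(a, l, k)
    cb = gapped_kmer_counts(b, l, k)
    return sum(ca[key] * cb[key] for key in ca if key in cb)

print(gapped_kmer_counts("ACG", 3, 2))
print(gkm_kernel("ACG", "ACT", 3, 2))
```

    The number of features per window grows as the binomial coefficient C(l, k), which is why an efficient kernel matrix calculation matters in practice.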

  6. Feature Selection and Parameters Optimization of SVM Using Particle Swarm Optimization for Fault Classification in Power Distribution Systems.

    Science.gov (United States)

    Cho, Ming-Yuan; Hoang, Thi Thom

    2017-01-01

    Fast and accurate fault classification is essential to power system operations. In this paper, in order to classify electrical faults in radial distribution systems, a particle swarm optimization (PSO) based support vector machine (SVM) classifier has been proposed. The proposed PSO based SVM classifier is able to select appropriate input features and optimize SVM parameters to increase classification accuracy. Further, a time-domain reflectometry (TDR) method with a pseudorandom binary sequence (PRBS) stimulus has been used to generate a dataset for purposes of classification. The proposed technique has been tested on a typical radial distribution network to identify ten different types of faults considering 12 given input features generated by using Simulink software and MATLAB Toolbox. The success rate of the SVM classifier is over 97%, which demonstrates the effectiveness and high efficiency of the developed method.

  7. Feature Selection and Parameters Optimization of SVM Using Particle Swarm Optimization for Fault Classification in Power Distribution Systems

    Directory of Open Access Journals (Sweden)

    Ming-Yuan Cho

    2017-01-01

    Full Text Available Fast and accurate fault classification is essential to power system operations. In this paper, in order to classify electrical faults in radial distribution systems, a particle swarm optimization (PSO) based support vector machine (SVM) classifier has been proposed. The proposed PSO based SVM classifier is able to select appropriate input features and optimize SVM parameters to increase classification accuracy. Further, a time-domain reflectometry (TDR) method with a pseudorandom binary sequence (PRBS) stimulus has been used to generate a dataset for purposes of classification. The proposed technique has been tested on a typical radial distribution network to identify ten different types of faults considering 12 given input features generated by using Simulink software and MATLAB Toolbox. The success rate of the SVM classifier is over 97%, which demonstrates the effectiveness and high efficiency of the developed method.

  8. Multiclass Posterior Probability Twin SVM for Motor Imagery EEG Classification.

    Science.gov (United States)

    She, Qingshan; Ma, Yuliang; Meng, Ming; Luo, Zhizeng

    2015-01-01

    Motor imagery electroencephalography is widely used in the brain-computer interface systems. Due to inherent characteristics of electroencephalography signals, accurate and real-time multiclass classification is always challenging. In order to solve this problem, a multiclass posterior probability solution for twin SVM is proposed by the ranking continuous output and pairwise coupling in this paper. First, two-class posterior probability model is constructed to approximate the posterior probability by the ranking continuous output techniques and Platt's estimating method. Secondly, a solution of multiclass probabilistic outputs for twin SVM is provided by combining every pair of class probabilities according to the method of pairwise coupling. Finally, the proposed method is compared with multiclass SVM and twin SVM via voting, and multiclass posterior probability SVM using different coupling approaches. The efficacy on the classification accuracy and time complexity of the proposed method has been demonstrated by both the UCI benchmark datasets and real world EEG data from BCI Competition IV Dataset 2a, respectively.
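
    The two building blocks named in this abstract, Platt's estimate of a two-class posterior and pairwise coupling of class-pair probabilities, can be sketched as follows. The abstract compares several coupling approaches without naming them; the closed-form rule used here, p_i proportional to 1 / (sum over j != i of 1/r_ij - (k - 2)), is one well-known option and is an assumption, as are the function names. The sigmoid parameters `a` and `b` would normally be fitted on held-out decision values.

```python
import math

def platt_probability(f, a, b):
    """Platt's sigmoid: P(y = 1 | f) = 1 / (1 + exp(a * f + b))."""
    return 1.0 / (1.0 + math.exp(a * f + b))

def couple_pairwise(r):
    """Combine pairwise estimates r[i][j] ~ P(class i | class i or j)
    into one multiclass distribution (diagonal entries are ignored)."""
    k = len(r)
    raw = []
    for i in range(k):
        s = sum(1.0 / r[i][j] for j in range(k) if j != i)
        raw.append(1.0 / (s - (k - 2)))
    total = sum(raw)
    return [p / total for p in raw]

# Consistency check on hypothetical class probabilities: build exact pairwise
# values from a known distribution and recover it.
p_true = [0.5, 0.3, 0.2]
r = [[p_true[i] / (p_true[i] + p_true[j]) for j in range(3)] for i in range(3)]
print(couple_pairwise(r))
```

    When the pairwise estimates are mutually consistent, this rule recovers the underlying distribution exactly; with noisy estimates it gives an approximation that is renormalized to sum to one.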

  9. A Method of Particle Swarm Optimized SVM Hyper-spectral Remote Sensing Image Classification

    International Nuclear Information System (INIS)

    Liu, Q J; Jing, L H; Wang, L M; Lin, Q Z

    2014-01-01

    Support Vector Machine (SVM) has been proved to be suitable for classification of remote sensing images and has been proposed to overcome the Hughes phenomenon. Hyper-spectral sensors are intrinsically designed to discriminate among a broad range of land cover classes, which may lead to high computational time in SVM multi-class algorithms. Model selection for SVM, which involves selecting the kernel and the margin parameter values, is usually time-consuming and greatly impacts the training efficiency of the SVM model and the final classification accuracy of the SVM hyper-spectral remote sensing image classifier. Firstly, based on combinatorial optimization theory and the cross-validation method, the particle swarm algorithm is introduced for the optimal selection of the SVM kernel parameter σ and margin parameter C (PSSVM) to improve the modelling efficiency of the SVM model. Then an experiment classifying AVIRIS data from the Indian Pines site in the USA was performed to evaluate the novel PSSVM, as well as a traditional SVM classifier with the general grid-search cross-validation method (GSSVM). Evaluation indexes including SVM model training time, classification Overall Accuracy (OA) and Kappa index of both PSSVM and GSSVM are then analyzed quantitatively. It is demonstrated that the OA of PSSVM on test samples and the whole image is 85% and 82% respectively, each within 0.08% of that of GSSVM. The Kappa indexes reach 0.82 and 0.77, each within 0.001 of that of GSSVM, while the modelling time of PSSVM can be only 1/10 of that of GSSVM. Therefore, PSSVM is a fast and accurate algorithm for hyper-spectral image classification and is superior to GSSVM.
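
    The two accuracy indexes reported here, Overall Accuracy (OA) and the Kappa index, are both derived from a confusion matrix. The sketch below assumes rows are reference classes and columns are predicted classes; the matrix values are hypothetical and unrelated to the AVIRIS experiment.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what row/column marginals give by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(n)) / total
    p_chance = sum(
        sum(cm[i]) * sum(cm[r][i] for r in range(n)) for i in range(n)
    ) / (total * total)
    return (p_observed - p_chance) / (1.0 - p_chance)

# Hypothetical two-class confusion matrix (rows: reference, columns: predicted).
cm = [[45, 5], [10, 40]]
print(overall_accuracy(cm), cohen_kappa(cm))
```

    Kappa is lower than OA because it discounts the agreement expected from the class proportions alone, which is why the two indexes are usually reported together.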

  10. The 2nu-SVM: A Cost-Sensitive Extension of the nu-SVM

    National Research Council Canada - National Science Library

    Davenport, Mark A

    2005-01-01

    .... In this report we review cost-sensitive extensions of standard support vector machines (SVMs). In particular, we describe cost-sensitive extensions of the C-SVM and the nu-SVM, which we denote the 2C-SVM and 2nu-SVM respectively...
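
    As a rough illustration of what a cost-sensitive extension of the C-SVM involves, the sketch below evaluates a soft-margin objective with separate misclassification costs for the two classes. It assumes the common form 0.5·||w||² + C₊·Σξ(positives) + C₋·Σξ(negatives); the parameter names c_pos and c_neg are hypothetical, and the report's exact parameterisation may differ.

```python
def cost_sensitive_objective(w, b, samples, c_pos, c_neg):
    """Primal objective of a linear soft-margin SVM with per-class costs.

    samples is a list of (x, y) pairs with y in {+1, -1}; c_pos and c_neg
    weight the hinge loss of positive and negative examples respectively.
    """
    regulariser = 0.5 * sum(wi * wi for wi in w)
    weighted_loss = 0.0
    for x, y in samples:
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        slack = max(0.0, 1.0 - margin)  # hinge loss for this example
        weighted_loss += (c_pos if y > 0 else c_neg) * slack
    return regulariser + weighted_loss
```

    Setting c_pos equal to c_neg reduces this to the ordinary C-SVM objective; raising one of them penalises errors on that class more heavily.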

  11. DSP Based Direct Torque Control of Permanent Magnet Synchronous Motor (PMSM) using Space Vector Modulation (DTC-SVM)

    DEFF Research Database (Denmark)

    Swierczynski, Dariusz; Kazmierkowski, Marian P.; Blaabjerg, Frede

    2002-01-01

    DSP Based Direct Torque Control of Permanent Magnet Synchronous Motor (PMSM) using Space Vector Modulation (DTC-SVM)...

  12. Biometric identification based on feature fusion with PCA and SVM

    Science.gov (United States)

    Lefkovits, László; Lefkovits, Szidónia; Emerich, Simina

    2018-04-01

    Biometric identification is gaining ground compared to traditional identification methods. Many biometric measurements may be used for secure human identification. The most reliable among them is the iris pattern because of its uniqueness, stability, unforgeability and inalterability over time. The approach presented in this paper is a fusion of different feature descriptor methods such as HOG, LIOP, LBP, used for extracting iris texture information. The classifiers obtained through the SVM and PCA methods demonstrate the effectiveness of our system applied to one and both irises. The performances measured are highly accurate and foreshadow a fusion system with a rate of identification approaching 100% on the UPOL database.

  13. Age and gender estimation using Region-SIFT and multi-layered SVM

    Science.gov (United States)

    Kim, Hyunduk; Lee, Sang-Heon; Sohn, Myoung-Kyu; Hwang, Byunghun

    2018-04-01

    In this paper, we propose an age and gender estimation framework using the region-SIFT feature and a multi-layered SVM classifier. The suggested framework entails three processes. The first step is landmark-based face alignment. The second step is feature extraction. In this step, we introduce the region-SIFT feature extraction method based on facial landmarks. First, we define sub-regions of the face. We then extract SIFT features from each sub-region. In order to reduce the dimensions of the features, we employ Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Finally, we classify age and gender using a multi-layered Support Vector Machine (SVM) for efficient classification. Rather than performing gender estimation and age estimation independently, the multi-layered SVM can improve the classification rate by constructing a classifier that estimates the age according to gender. Moreover, we collect a dataset of face images, called DGIST_C, from the internet. A performance evaluation of the proposed method was performed with the FERET database, CACD database, and DGIST_C database. The experimental results demonstrate that the proposed approach classifies age and performs gender estimation very efficiently and accurately.

  14. Comparative study of SVM methods combined with voxel selection for object category classification on fMRI data.

    Science.gov (United States)

    Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li

    2011-02-16

    Support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear SVM. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Unlike traditional studies, which focused merely on either the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes, in terms of classification accuracy and time consumption. Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) considering classification accuracy and time consumption holistically, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) could achieve better accuracy at a lower time cost. The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are the two suggested solutions; if users are more concerned about computational time, RBF SVM with a relatively small set of voxels, with part of the principal components kept as features, is the better choice.
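
    The two kernels compared in this study differ only in how they measure similarity between feature vectors. A minimal sketch of both follows; the gamma parameter name is the conventional one for the RBF width and is an assumption here, not a value from the paper.

```python
import math


def linear_kernel(x, z):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma):
    """RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)
```

    The cost of either kernel grows with the number of features, so each kernel evaluation becomes more expensive as more voxels are kept.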

  15. Learning using privileged information: SVM+ and weighted SVM.

    Science.gov (United States)

    Lapin, Maksim; Hein, Matthias; Schiele, Bernt

    2014-05-01

    Prior knowledge can be used to improve predictive performance of learning algorithms or reduce the amount of data required for training. The same goal is pursued within the learning using privileged information paradigm which was recently introduced by Vapnik et al. and is aimed at utilizing additional information available only at training time-a framework implemented by SVM+. We relate the privileged information to importance weighting and show that the prior knowledge expressible with privileged features can also be encoded by weights associated with every training example. We show that a weighted SVM can always replicate an SVM+ solution, while the converse is not true and we construct a counterexample highlighting the limitations of SVM+. Finally, we touch on the problem of choosing weights for weighted SVMs when privileged features are not available. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. SVM classifier to predict genes important for self-renewal and pluripotency of mouse embryonic stem cells

    Directory of Open Access Journals (Sweden)

    Xu Huilei

    2010-12-01

    Full Text Available Abstract. Background: Mouse embryonic stem cells (mESCs) are derived from the inner cell mass of a developing blastocyst and can be cultured indefinitely in vitro. Their distinct features are their ability to self-renew and to differentiate to all adult cell types. Genes that maintain mESCs self-renewal and pluripotency identity are of interest to stem cell biologists. Although significant steps have been made toward the identification and characterization of such genes, the list is still incomplete and controversial. For example, the overlap among candidate self-renewal and pluripotency genes across different RNAi screens is surprisingly small. Meanwhile, machine learning approaches have been used to analyze multi-dimensional experimental data and integrate results from many studies, yet they have not been applied to specifically tackle the task of predicting and classifying self-renewal and pluripotency gene membership. Results: For this study we developed a classifier, a supervised machine learning framework for predicting self-renewal and pluripotency mESCs stemness membership genes (MSMG) using support vector machines (SVM). The data used to train the classifier was derived from mESCs-related studies using mRNA microarrays, measuring gene expression in various stages of early differentiation, as well as ChIP-seq studies applied to mESCs profiling genome-wide binding of key transcription factors, such as Nanog, Oct4, and Sox2, to the regulatory regions of other genes. Comparison to other classification methods using the leave-one-out cross-validation method was employed to evaluate the accuracy and generality of the classification. Finally, two sets of candidate genes from genome-wide RNA interference screens are used to test the generality and potential application of the classifier. Conclusions: Our results reveal that an SVM approach can be useful for prioritizing genes for functional validation experiments and complement the analyses of high

  17. Fault diagnosis method based on FFT-RPCA-SVM for Cascaded-Multilevel Inverter.

    Science.gov (United States)

    Wang, Tianzhen; Qi, Jie; Xu, Hao; Wang, Yide; Liu, Lei; Gao, Diju

    2016-01-01

    Thanks to reduced switch stress, high quality of load wave, easy packaging and good extensibility, the cascaded H-bridge multilevel inverter is widely used in wind power systems. To guarantee stable operation of the system, a new fault diagnosis method, based on Fast Fourier Transform (FFT), Relative Principal Component Analysis (RPCA) and Support Vector Machine (SVM), is proposed for the H-bridge multilevel inverter. To avoid the influence of load variation on fault diagnosis, the output voltages of the inverter are chosen as the fault characteristic signals. To shorten the time of diagnosis and improve the diagnostic accuracy, the main features of the fault characteristic signals are extracted by FFT. To further reduce the training time of the SVM, the feature vector is reduced using RPCA, which yields a lower-dimensional feature space. The fault classifier is constructed via SVM. An experimental prototype of the inverter is built to test the proposed method. Compared to other fault diagnosis methods, the experimental results demonstrate the high accuracy and efficiency of the proposed method. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
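
    The feature-extraction step described here turns a sampled voltage signal into spectral magnitudes. The sketch below uses a naive discrete Fourier transform as a stand-in for the FFT (it produces the same values, only more slowly); the test signal and the bin count are made up for illustration and are not the paper's data.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of DFT bins 0..n/2 of a real-valued signal (naive O(n^2))."""
    n = len(signal)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(
            signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(total))
    return magnitudes


# Hypothetical signal: 4 full sine cycles across 32 samples.
n_samples = 32
signal = [math.sin(2 * math.pi * 4 * t / n_samples) for t in range(n_samples)]
features = dft_magnitudes(signal)
# Index of the strongest non-DC component; a candidate spectral feature.
dominant_bin = max(range(1, len(features)), key=features.__getitem__)
```

    In a pipeline like the one in the abstract, a vector such as features would then be reduced in dimension before being passed to the classifier.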

  18. Accurate, model-based tuning of synthetic gene expression using introns in S. cerevisiae.

    Directory of Open Access Journals (Sweden)

    Ido Yofe

    2014-06-01

    Full Text Available Introns are key regulators of eukaryotic gene expression and present a potentially powerful tool for the design of synthetic eukaryotic gene expression systems. However, intronic control over gene expression is governed by a multitude of complex, incompletely understood, regulatory mechanisms. Despite this lack of detailed mechanistic understanding, here we show how a relatively simple model enables accurate and predictable tuning of synthetic gene expression system in yeast using several predictive intron features such as transcript folding and sequence motifs. Using only natural Saccharomyces cerevisiae introns as regulators, we demonstrate fine and accurate control over gene expression spanning a 100 fold expression range. These results broaden the engineering toolbox of synthetic gene expression systems and provide a framework in which precise and robust tuning of gene expression is accomplished.

  19. VLSI Design of SVM-Based Seizure Detection System With On-Chip Learning Capability.

    Science.gov (United States)

    Feng, Lichen; Li, Zunchao; Wang, Yuanfa

    2018-02-01

    Portable automatic seizure detection system is very convenient for epilepsy patients to carry. In order to make the system on-chip trainable with high efficiency and attain high detection accuracy, this paper presents a very large scale integration (VLSI) design based on the nonlinear support vector machine (SVM). The proposed design mainly consists of a feature extraction (FE) module and an SVM module. The FE module performs the three-level Daubechies discrete wavelet transform to fit the physiological bands of the electroencephalogram (EEG) signal and extracts the time-frequency domain features reflecting the nonstationary signal properties. The SVM module integrates the modified sequential minimal optimization algorithm with the table-driven-based Gaussian kernel to enable efficient on-chip learning. The presented design is verified on an Altera Cyclone II field-programmable gate array and tested using the two publicly available EEG datasets. Experiment results show that the designed VLSI system improves the detection accuracy and training efficiency.

  20. [Identification of varieties of cashmere by Vis/NIR spectroscopy technology based on PCA-SVM].

    Science.gov (United States)

    Wu, Gui-Fang; He, Yong

    2009-06-01

    A mixed algorithm combining principal component analysis (PCA) and support vector machine (SVM) was presented to discriminate cashmere varieties. Cashmere fiber is threadlike, soft and glossy, and has high tensile strength. The quality characteristics and economic value of each breed of cashmere are very different. In order to safeguard consumers' rights and guarantee the quality of cashmere products, identifying cashmere quickly, efficiently and correctly is of great significance to the production and trading of cashmere material. The present research adopts Vis/NIR diffuse spectroscopy techniques to collect the spectral data of cashmere. The near infrared fingerprint of cashmere was acquired by PCA, and SVM methods were used to further identify the cashmere material. The result of PCA indicated that the score map made by the scores of PC1, PC2 and PC3 was used, and 10 principal components (PCs) were selected as the input of the SVM based on the reliabilities of PCs of 99.99%. One hundred cashmere samples were used for calibration and the remaining 75 cashmere samples were used for validation. A one-against-all multi-class SVM model was built, the capabilities of SVM with different kernel functions were comparatively analyzed, and the result showed that SVM with the Gaussian kernel function has the best identification capability, with an accuracy of 100%. This research indicated that the data mining method of PCA-SVM has a good identification effect, and can work as a new method for rapid identification of cashmere material varieties.
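
    One common way to decide how many principal components to feed into a classifier is to keep the smallest number whose cumulative explained variance reaches a threshold. The sketch below assumes that the "reliabilities of PCs of 99.99%" in the abstract refers to such a threshold, which the abstract does not state explicitly; the eigenvalues in the usage example are invented.

```python
def components_for_variance(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative share of the
    total variance is at least threshold (a fraction between 0 and 1)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return count
    return len(ordered)


# Hypothetical PCA eigenvalues, for illustration only.
n_components = components_for_variance([5.0, 3.0, 1.0, 1.0], 0.75)
```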

  1. SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition.

    Science.gov (United States)

    Melvin, Iain; Ie, Eugene; Kuang, Rui; Weston, Jason; Stafford, William Noble; Leslie, Christina

    2007-05-22

    significantly improves on the standard one-vs-all method for both the superfamily and fold prediction in the remote homology setting and on the fold recognition problem. Moreover, our code weight learning algorithm strongly outperforms nearest-neighbor methods based on PSI-BLAST in terms of prediction accuracy on every structure classification problem we consider. By combining state-of-the-art SVM kernel methods with a novel multi-class algorithm, the SVM-Fold system delivers efficient and accurate protein fold and superfamily recognition.

  2. SVM-based multisensor data fusion for phase concentration measurement in biomass-coal co-combustion

    Science.gov (United States)

    Wang, Xiaoxin; Hu, Hongli; Jia, Huiqin; Tang, Kaihao

    2018-05-01

    In this paper, the electrical method combines the electrostatic sensor and capacitance sensor to measure the phase concentration of pulverized coal/biomass/air three-phase flow through data fusion technology. In order to eliminate the effects of flow regimes and improve the accuracy of the phase concentration measurement, the mel frequency cepstrum coefficient features extracted from electrostatic signals are used to train the Continuous Gaussian Mixture Hidden Markov Model (CGHMM) for flow regime identification. Support Vector Machine (SVM) is introduced to establish the concentration information fusion model under identified flow regimes. The CGHMM models and SVM models are transplanted on digital signal processing (DSP) to realize on-line accurate measurement. The DSP flow regime identification time is 1.4 ms, and the concentration predict time is 164 μs, which can fully meet the real-time requirement. The average absolute value of the relative error of the pulverized coal is about 1.5% and that of the biomass is about 2.2%.

  3. Power quality events recognition using a SVM-based method

    Energy Technology Data Exchange (ETDEWEB)

    Cerqueira, Augusto Santiago; Ferreira, Danton Diego; Ribeiro, Moises Vidal; Duque, Carlos Augusto [Department of Electrical Circuits, Federal University of Juiz de Fora, Campus Universitario, 36036 900, Juiz de Fora MG (Brazil)

    2008-09-15

    In this paper, a novel SVM-based method for power quality event classification is proposed. A simple approach for feature extraction is introduced, based on the subtraction of the fundamental component from the acquired voltage signal. The resulting signal is presented to a support vector machine for event classification. Results from simulation are presented and compared with two other methods, the OTFR and the LCEC. The proposed method showed improved performance at a reasonable computational cost. (author)

  4. Assessment of ANN and SVM models for estimating normal direct irradiation (H_b)

    International Nuclear Information System (INIS)

    Santos, Cícero Manoel dos; Escobedo, João Francisco; Teramoto, Érico Tadao; Modenese Gorla da Silva, Silvia Helena

    2016-01-01

    Highlights: • The performance of SVM and ANN in estimating Normal Direct Irradiation (H_b) was evaluated. • 12 models using different input variables are developed (hourly and daily partitions). • The most relevant input variables for DNI are kt, H_sc and insolation ratio (r′ = n/N). • Support Vector Machine (SVM) provides accurate estimates and outperforms the Artificial Neural Network (ANN). - Abstract: This study evaluates the estimation of hourly and daily normal direct irradiation (H_b) using machine learning techniques (ML): Artificial Neural Network (ANN) and Support Vector Machine (SVM). Time series of different meteorological variables measured over thirteen years in Botucatu were used for training and validating ANN and SVM. Seven different sets of input variables were tested and evaluated, which were chosen based on statistical models reported in the literature. Relative Mean Bias Error (rMBE), Relative Root Mean Square Error (rRMSE), determination coefficient (R^2) and “d” Willmott index were used to evaluate ANN and SVM models. When compared to statistical models which use the same set of input variables (R^2 between 0.22 and 0.78), ANN and SVM show higher values of R^2 (hourly models between 0.52 and 0.88; daily models between 0.42 and 0.91). Considering the input variables, atmospheric transmissivity of global radiation (kt), integrated solar constant (H_sc) and insolation ratio (n/N, n is sunshine duration and N is photoperiod) were the most relevant in ANN and SVM models. The rMBE and rRMSE values in the two time partitions of SVM models are lower than those obtained with ANN. Hourly ANN and SVM models have higher rRMSE values than daily models. Optimal performance with hourly models was obtained with ANN4^h (rMBE = 12.24%, rRMSE = 23.99% and “d” = 0.96) and SVM4^h (rMBE = 1.75%, rRMSE = 20.10% and “d” = 0.96). Optimal performance with daily models was obtained with ANN2^d (rMBE = −3.09%, rRMSE = 18.95% and “d” = 0
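
    The error measures used in this evaluation can be computed as sketched below. The formulas are the commonly used definitions of rMBE, rRMSE (both as percentages of the mean observation) and Willmott's index of agreement; the paper's exact definitions may differ in detail, and the function name is hypothetical.

```python
import math


def relative_errors(observed, estimated):
    """Return (rMBE %, rRMSE %, Willmott d) for paired observations/estimates."""
    n = len(observed)
    mean_obs = sum(observed) / n
    diffs = [e - o for o, e in zip(observed, estimated)]
    squared_error = sum(d * d for d in diffs)
    # Mean bias and root mean square error, relative to the mean observation.
    rmbe = 100.0 * (sum(diffs) / n) / mean_obs
    rrmse = 100.0 * math.sqrt(squared_error / n) / mean_obs
    # Willmott's d: 1 minus squared error over the "potential error".
    potential = sum(
        (abs(e - mean_obs) + abs(o - mean_obs)) ** 2
        for o, e in zip(observed, estimated)
    )
    d_index = 1.0 - squared_error / potential
    return rmbe, rrmse, d_index
```

    A positive rMBE indicates overestimation on average and a negative one underestimation, while d approaches 1 as estimates agree more closely with observations.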

  5. SVM-based glioma grading. Optimization by feature reduction analysis

    International Nuclear Information System (INIS)

    Zoellner, Frank G.; Schad, Lothar R.; Emblem, Kyrre E.; Harvard Medical School, Boston, MA; Oslo Univ. Hospital

    2012-01-01

    We investigated the predictive power of feature reduction analysis approaches in support vector machine (SVM)-based classification of glioma grade. In 101 untreated glioma patients, three analytic approaches were evaluated to derive an optimal reduction in features: (i) Pearson's correlation coefficients (PCC), (ii) principal component analysis (PCA) and (iii) independent component analysis (ICA). Tumor grading was performed using a previously reported SVM approach including whole-tumor cerebral blood volume (CBV) histograms and patient age. Best classification accuracy was found using PCA at 85% (sensitivity = 89%, specificity = 84%) when reducing the feature vector from 101 (100-bins rCBV histogram + age) to 3 principal components. In comparison, classification accuracy by PCC was 82% (89%, 77%, 2 dimensions) and 79% by ICA (87%, 75%, 9 dimensions). For improved speed (up to 30%) and simplicity, feature reduction by all three methods provided similar classification accuracy to literature values (~87%) while reducing the number of features by up to 98%. (orig.)

  6. SVM-based glioma grading. Optimization by feature reduction analysis

    Energy Technology Data Exchange (ETDEWEB)

    Zoellner, Frank G.; Schad, Lothar R. [University Medical Center Mannheim, Heidelberg Univ., Mannheim (Germany). Computer Assisted Clinical Medicine; Emblem, Kyrre E. [Massachusetts General Hospital, Charlestown, A.A. Martinos Center for Biomedical Imaging, Boston MA (United States). Dept. of Radiology; Harvard Medical School, Boston, MA (United States); Oslo Univ. Hospital (Norway). The Intervention Center

    2012-11-01

    We investigated the predictive power of feature reduction analysis approaches in support vector machine (SVM)-based classification of glioma grade. In 101 untreated glioma patients, three analytic approaches were evaluated to derive an optimal reduction in features: (i) Pearson's correlation coefficients (PCC), (ii) principal component analysis (PCA) and (iii) independent component analysis (ICA). Tumor grading was performed using a previously reported SVM approach including whole-tumor cerebral blood volume (CBV) histograms and patient age. Best classification accuracy was found using PCA at 85% (sensitivity = 89%, specificity = 84%) when reducing the feature vector from 101 (100-bins rCBV histogram + age) to 3 principal components. In comparison, classification accuracy by PCC was 82% (89%, 77%, 2 dimensions) and 79% by ICA (87%, 75%, 9 dimensions). For improved speed (up to 30%) and simplicity, feature reduction by all three methods provided similar classification accuracy to literature values (~87%) while reducing the number of features by up to 98%. (orig.)

  7. Abnormal Gait Behavior Detection for Elderly Based on Enhanced Wigner-Ville Analysis and Cloud Incremental SVM Learning

    Directory of Open Access Journals (Sweden)

    Jian Luo

    2016-01-01

    Full Text Available A cloud-based health care system is proposed in this paper for the elderly, providing abnormal gait behavior detection, classification, online diagnosis, and remote aid service. Intelligent mobile terminals with embedded triaxial acceleration sensors are used to capture the movement and ambulation information of the elderly. The collected signals are first enhanced by a Kalman filter. The magnitude of the signal vector features is then extracted and decomposed into a linear combination of enhanced Gabor atoms. The Wigner-Ville analysis method is introduced and the problem is studied by joint time-frequency analysis. In order to solve the problem of lacking large-scale abnormal behavior data in the training process, a cloud-based incremental SVM (CI-SVM) learning method is proposed. The original abnormal behavior data are first used to obtain the initial SVM classifier. The larger set of abnormal behavior data of the elderly collected by mobile devices is then gathered in the cloud platform to conduct incremental training and obtain the new SVM classifier. With the CI-SVM learning method, the knowledge of the SVM classifier can be accumulated through dynamic incremental learning. Experimental results demonstrate that the proposed method is feasible and can be applied to aged care, emergency aid, and related fields.

  8. Optimal structural design of the midship of a VLCC based on the strategy integrating SVM and GA

    Science.gov (United States)

    Sun, Li; Wang, Deyu

    2012-03-01

    In this paper a hybrid process of modeling and optimization, which integrates a support vector machine (SVM) and genetic algorithm (GA), was introduced to reduce the high time cost in structural optimization of ships. SVM, which is rooted in statistical learning theory and an approximate implementation of the method of structural risk minimization, can provide a good generalization performance in metamodeling the input-output relationship of real problems and consequently cuts down on high time cost in the analysis of real problems, such as FEM analysis. The GA, as a powerful optimization technique, possesses remarkable advantages for the problems that can hardly be optimized with common gradient-based optimization methods, which makes it suitable for optimizing models built by SVM. Based on the SVM-GA strategy, optimization of structural scantlings in the midship of a very large crude carrier (VLCC) ship was carried out according to the direct strength assessment method in common structural rules (CSR), which eventually demonstrates the high efficiency of SVM-GA in optimizing the ship structural scantlings under heavy computational complexity. The time cost of this optimization with SVM-GA has been sharply reduced, many more loops have been processed within a small amount of time and the design has been improved remarkably.

  9. Fault Diagnosis of Complex Industrial Process Using KICA and Sparse SVM

    Directory of Open Access Journals (Sweden)

    Jie Xu

    2013-01-01

    Full Text Available New approaches are proposed for complex industrial process monitoring and fault diagnosis based on kernel independent component analysis (KICA) and sparse support vector machine (SVM). The KICA method is a two-phase algorithm: whitened kernel principal component analysis (KPCA) followed by ICA. The data are first mapped into a high-dimensional feature subspace. Then, the ICA algorithm seeks the projection directions in the KPCA whitened space. Performance monitoring is implemented by constructing the statistical index and control limit in the feature space. If the statistical indexes exceed the predefined control limit, a fault may have occurred. Then, the nonlinear score vectors are calculated and fed into the sparse SVM to identify the faults. The proposed method is applied to the simulation of the Tennessee Eastman (TE) chemical process. The simulation results show that the proposed method can identify various types of faults accurately and rapidly.

  10. Cross Validation Through Two-Dimensional Solution Surface for Cost-Sensitive SVM.

    Science.gov (United States)

    Gu, Bin; Sheng, Victor S; Tay, Keng Yeow; Romano, Walter; Li, Shuo

    2017-06-01

    Model selection plays an important role in cost-sensitive SVM (CS-SVM). It has been proven that the global minimum cross validation (CV) error can be efficiently computed based on the solution path for one parameter learning problems. However, it is a challenge to obtain the global minimum CV error for CS-SVM based on one-dimensional solution path and traditional grid search, because CS-SVM is with two regularization parameters. In this paper, we propose a solution and error surfaces based CV approach (CV-SES). More specifically, we first compute a two-dimensional solution surface for CS-SVM based on a bi-parameter space partition algorithm, which can fit solutions of CS-SVM for all values of both regularization parameters. Then, we compute a two-dimensional validation error surface for each CV fold, which can fit validation errors of CS-SVM for all values of both regularization parameters. Finally, we obtain the CV error surface by superposing K validation error surfaces, which can find the global minimum CV error of CS-SVM. Experiments are conducted on seven datasets for cost sensitive learning and on four datasets for imbalanced learning. Experimental results not only show that our proposed CV-SES has a better generalization ability than CS-SVM with various hybrids between grid search and solution path methods, and than recent proposed cost-sensitive hinge loss SVM with three-dimensional grid search, but also show that CV-SES uses less running time.
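
    The final step described above, superposing the K per-fold validation error surfaces and locating the minimum, can be illustrated on a discretised grid. This is only a simplified stand-in: the paper computes surfaces covering all values of both regularization parameters, whereas the sketch assumes each fold's errors have already been sampled on the same finite grid of parameter pairs.

```python
def superpose_and_minimise(fold_surfaces):
    """Average K per-fold error grids and return (min CV error, row, column).

    fold_surfaces[f][i][j] is the validation error of fold f at the i-th
    value of the first parameter and the j-th value of the second.
    """
    k = len(fold_surfaces)
    rows = len(fold_surfaces[0])
    cols = len(fold_surfaces[0][0])
    best = None
    for i in range(rows):
        for j in range(cols):
            cv_error = sum(surface[i][j] for surface in fold_surfaces) / k
            if best is None or cv_error < best[0]:
                best = (cv_error, i, j)
    return best
```

    The returned indices point at the parameter pair with the lowest averaged error on the grid; a finer grid approximates the continuous surface more closely at higher cost.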

  11. Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier.

    Science.gov (United States)

    Li, Qiang; Gu, Yu; Jia, Jing

    2017-01-30

    Chinese liquors are internationally well-known fermentative alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is of positive significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS) and support vector machine (SVM) algorithms in a quartz crystal microbalance (QCM)-based electronic nose (e-nose) we designed. We evaluated the comprehensive performance of the MDS-SVM classifier that predicted all ten brands of Chinese liquors individually. The prediction accuracy (98.3%) showed superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN) classifier (93.3%) and moving average-linear discriminant analysis (MA-LDA) classifier (87.6%). The MDS-SVM classifier has reasonable reliability, good fitting and prediction (generalization) performance in classification of the Chinese liquors. Taking both application of the e-nose and validation of the MDS-SVM classifier into account, we have thus created a useful method for the classification of multiple Chinese liquors.

  12. Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier

    Directory of Open Access Journals (Sweden)

    Qiang Li

    2017-01-01

    Full Text Available Chinese liquors are internationally well-known fermentative alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is of positive significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS) and support vector machine (SVM) algorithms in a quartz crystal microbalance (QCM)-based electronic nose (e-nose) we designed. We evaluated the comprehensive performance of the MDS-SVM classifier that predicted all ten brands of Chinese liquors individually. The prediction accuracy (98.3%) showed superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN) classifier (93.3%) and moving average-linear discriminant analysis (MA-LDA) classifier (87.6%). The MDS-SVM classifier has reasonable reliability, good fitting and prediction (generalization) performance in classification of the Chinese liquors. Taking both application of the e-nose and validation of the MDS-SVM classifier into account, we have thus created a useful method for the classification of multiple Chinese liquors.

  13. Universum Learning for Multiclass SVM

    OpenAIRE

    Dhar, Sauptik; Ramakrishnan, Naveen; Cherkassky, Vladimir; Shah, Mohak

    2016-01-01

    We introduce Universum learning for multiclass problems and propose a novel formulation for multiclass universum SVM (MU-SVM). We also propose a span bound for MU-SVM that can be used for model selection thereby avoiding resampling. Empirical results demonstrate the effectiveness of MU-SVM and the proposed bound.

  14. Adaptive SVM for Data Stream Classification

    Directory of Open Access Journals (Sweden)

    Isah A. Lawal

    2017-07-01

    In this paper, we address the problem of learning an adaptive classifier for the classification of continuous streams of data. We present a solution based on incremental extensions of the Support Vector Machine (SVM) learning paradigm that updates an existing SVM whenever new training data are acquired. To ensure that the SVM remains effective while exploiting the newly gathered data, we introduce an on-line model selection approach in the incremental learning process. We evaluated the proposed method on real-world applications including on-line spam email filtering and human action classification from videos. Experimental results show the effectiveness and the potential of the proposed approach.
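
The idea of updating a classifier as each new sample arrives can be illustrated with a minimal sketch. It uses a Pegasos-style stochastic subgradient update for a linear SVM, which is a stand-in technique and not the incremental SVM or the model selection procedure of the paper. The function names, the regularisation value `lam`, and the toy stream are all hypothetical.

```python
def online_linear_svm(stream, dim, lam=0.1):
    """Pegasos-style online update of a linear SVM without a bias term.

    stream yields (x, y) pairs with y in {-1, +1}; lam is an assumed
    L2 regularisation strength.
    """
    w = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        eta = 1.0 / (lam * t)  # decaying step size
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # Shrink the weights (gradient of the L2 regulariser).
        w = [(1.0 - eta * lam) * wi for wi in w]
        if margin < 1.0:
            # Hinge loss is active for this sample: move towards y * x.
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    """Sign of the linear decision function."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


# Hypothetical, linearly separable toy stream (not data from the paper).
toy_stream = [((2.0, 1.0), 1), ((-1.0, -2.0), -1),
              ((1.0, 2.0), 1), ((-2.0, -1.0), -1)] * 5
w = online_linear_svm(toy_stream, dim=2)
```

Each sample is seen once and then discarded, which is the property that makes this kind of update usable on a stream. A kernelised incremental SVM would instead maintain and revise a set of support vectors.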

  15. New KF-PP-SVM classification method for EEG in brain-computer interfaces.

    Science.gov (United States)

    Yang, Banghua; Han, Zhijun; Zan, Peng; Wang, Qian

    2014-01-01

    Classification methods are a crucial direction in the current study of brain-computer interfaces (BCIs). To improve the classification accuracy for electroencephalogram (EEG) signals, a novel KF-PP-SVM (kernel Fisher, posterior probability, and support vector machine) classification method is developed. The process uses common spatial patterns to obtain features, from which the within-class scatter is calculated. The scatter is then added to a radial basis function kernel to construct a new kernel function. This new kernel is integrated into the SVM to obtain a new classification model. Finally, the output of the SVM is calculated based on posterior probability and the final recognition result is obtained. To evaluate the effectiveness of the proposed KF-PP-SVM method, EEG data collected in the laboratory are processed with four different classification schemes (KF-PP-SVM, KF-SVM, PP-SVM, and SVM). The results showed that the overall average improvements arising from the use of the KF-PP-SVM scheme as opposed to the KF-SVM, PP-SVM and SVM schemes are 2.49%, 5.83% and 6.49%, respectively.
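
Two ingredients named in this abstract, a radial basis function kernel and a posterior-probability output, can be sketched as follows. The sigmoid mapping is the commonly used Platt-style form, and its parameters `a` and `b` here are placeholder values rather than fitted ones. The abstract does not say how the within-class scatter enters the kernel, so that step is not reproduced.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Standard RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def posterior_from_decision(f, a=-1.0, b=0.0):
    """Platt-style sigmoid: P(y = +1 | f) = 1 / (1 + exp(a * f + b)).

    f is the raw SVM decision value; a and b are assumed here and
    would normally be fitted on held-out decision values.
    """
    return 1.0 / (1.0 + math.exp(a * f + b))


# A point compared with itself has kernel value 1; a decision value
# of 0 (on the separating surface) maps to probability 0.5.
same_point = rbf_kernel((1.0, 2.0), (1.0, 2.0))
on_boundary = posterior_from_decision(0.0)
confident_positive = posterior_from_decision(2.0)
```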

  16. Comparison of sensorless FOC and SVM-DTFC of PMSM for low-speed applications

    DEFF Research Database (Denmark)

    Basar, M. Sertug; Bech, Michael Møller; Andersen, Torben Ole

    2013-01-01

    This article presents the performance analysis of Field Oriented Control (FOC) and Space Vector Modulation (SVM) Direct Torque and Flux Control (DTFC) of a Non-Salient Permanent Magnet Synchronous Machine (PMSM) under sensorless control within low speed region. The high-frequency alternating...... with a commercially available PMSM machine. Both controllers show satisfactory sensorless performance. FOC provides smoother and more accurate response while SVM-DTFC has the advantage of faster control....

  17. Comparison of sensorless FOC and SVM-DTFC of PMSM for low-speed applications

    DEFF Research Database (Denmark)

    Basar, Mehmet Sertug

    2016-01-01

    This article presents the performance analysis of Field Oriented Control (FOC) and Space Vector Modulation (SVM) Direct Torque and Flux Control (DTFC) of a Non-Salient Permanent Magnet Synchronous Machine (PMSM) under sensorless control within low speed region. The high-frequency alternating...... with a commercially available PMSM machine. Both controllers show satisfactory sensorless performance. FOC provides smoother and more accurate response while SVM-DTFC has the advantage of faster control....

  18. Protein-protein interaction site prediction in Homo sapiens and E. coli using an interaction-affinity based membership function in fuzzy SVM.

    Science.gov (United States)

    Sriwastava, Brijesh Kumar; Basu, Subhadip; Maulik, Ujjwal

    2015-10-01

    Protein-protein interaction (PPI) site prediction helps to ascertain the interface residues that participate in interaction processes. A fuzzy support vector machine (F-SVM) is proposed as an effective method to solve this problem, and we show that the performance of the classical SVM can be enhanced with the help of an interaction-affinity based fuzzy membership function. We evaluate the performances of both SVM and F-SVM on the PPI databases of the Homo sapiens and E. coli organisms and estimate the statistical significance of the developed method over the classical SVM and other fuzzy membership-based SVM methods available in the literature. Our membership function uses the residue-level interaction affinity scores for each pair of positive and negative sequence fragments. The average AUC scores in the 10-fold cross-validation experiments are 79.94% and 80.48% for the Homo sapiens and E. coli organisms, respectively. On the independent test datasets, AUC scores of 76.59% and 80.17% are obtained for the two organisms, respectively. In almost all cases, the developed F-SVM method improves on the performance obtained by the corresponding classical SVM and the other classifiers available in the literature.
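
For readers unfamiliar with the AUC figures quoted above, the following sketch computes AUC as the fraction of positive/negative pairs that the classifier ranks correctly, with ties counted as half. The scores and labels are made-up toy values, not results from the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking.

    labels use 1 for positive and 0 for negative; scores are the
    classifier outputs, where higher means more likely positive.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


perfect = auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])    # every pair ordered correctly
imperfect = auc([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0])  # one of four pairs misordered
```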

  19. A Hybrid Vehicle Detection Method Based on Viola-Jones and HOG + SVM from UAV Images

    Science.gov (United States)

    Xu, Yongzheng; Yu, Guizhen; Wang, Yunpeng; Wu, Xinkai; Ma, Yalong

    2016-01-01

    A new hybrid vehicle detection scheme which integrates the Viola-Jones (V-J) and linear SVM classifier with HOG feature (HOG + SVM) methods is proposed for vehicle detection from low-altitude unmanned aerial vehicle (UAV) images. As both V-J and HOG + SVM are sensitive to on-road vehicles’ in-plane rotation, the proposed scheme first adopts a roadway orientation adjustment method, which rotates each UAV image to align the roads with the horizontal direction so the original V-J or HOG + SVM method can be directly applied to achieve fast detection and high accuracy. To address the issue of descending detection speed for V-J and HOG + SVM, the proposed scheme further develops an adaptive switching strategy which sophistically integrates V-J and HOG + SVM methods based on their different descending trends of detection speed to improve detection efficiency. A comprehensive evaluation shows that the switching strategy, combined with the road orientation adjustment method, can significantly improve the efficiency and effectiveness of vehicle detection from UAV images. The results also show that the proposed vehicle detection method is competitive with other existing vehicle detection methods. Furthermore, since the proposed vehicle detection method can be performed on videos captured from moving UAV platforms without the need for image registration or an additional road database, it has great potential for field applications. Future research will focus on expanding the current method to detect other transportation modes such as buses, trucks, motors, bicycles, and pedestrians. PMID:27548179

  20. A Hybrid Vehicle Detection Method Based on Viola-Jones and HOG + SVM from UAV Images.

    Science.gov (United States)

    Xu, Yongzheng; Yu, Guizhen; Wang, Yunpeng; Wu, Xinkai; Ma, Yalong

    2016-08-19

    A new hybrid vehicle detection scheme which integrates the Viola-Jones (V-J) and linear SVM classifier with HOG feature (HOG + SVM) methods is proposed for vehicle detection from low-altitude unmanned aerial vehicle (UAV) images. As both V-J and HOG + SVM are sensitive to on-road vehicles' in-plane rotation, the proposed scheme first adopts a roadway orientation adjustment method, which rotates each UAV image to align the roads with the horizontal direction so the original V-J or HOG + SVM method can be directly applied to achieve fast detection and high accuracy. To address the issue of descending detection speed for V-J and HOG + SVM, the proposed scheme further develops an adaptive switching strategy which sophistically integrates V-J and HOG + SVM methods based on their different descending trends of detection speed to improve detection efficiency. A comprehensive evaluation shows that the switching strategy, combined with the road orientation adjustment method, can significantly improve the efficiency and effectiveness of vehicle detection from UAV images. The results also show that the proposed vehicle detection method is competitive with other existing vehicle detection methods. Furthermore, since the proposed vehicle detection method can be performed on videos captured from moving UAV platforms without the need for image registration or an additional road database, it has great potential for field applications. Future research will focus on expanding the current method to detect other transportation modes such as buses, trucks, motors, bicycles, and pedestrians.

  1. DESIGN AND EVALUATION OF A TEXTURE CLASSIFIER BASED ON LS-SVM

    Directory of Open Access Journals (Sweden)

    Beitmantt Cárdenas Quintero

    2013-07-01

    This work evaluates the performance and computational cost of different Least Squares Support Vector Machine (LS-SVM) architectures and methodologies for texture-based image segmentation and, from those results, proposes a model of an LS-SVM texture classifier. Methodology: Given a binary classification problem represented by the segmentation of 32 images, organized in 4 groups and formed by pairs of typical textures (granite/bark, brick/upholstery, wood/marble, fabric/fur), the performance and computational cost are measured and compared for two kernel types (radial / polynomial), two optimization functions (local minimum / exhaustive search) and two cost functions (random cross-validation / leave-one-out cross-validation) in an LS-SVM that takes as input the pixels forming the cross-shaped neighborhood of the pixel under evaluation (no feature extraction is performed). Results: As a texture classifier, the LS-SVM performs better and requires lower computational cost when it uses a radial basis kernel and an optimization function based on a local-minimum search algorithm, together with a cost function that uses random cross-validation.

  2. SVM classification model in depression recognition based on mutation PSO parameter optimization

    Directory of Open Access Journals (Sweden)

    Zhang Ming

    2017-01-01

    At present, the clinical diagnosis of depression relies mainly on structured interviews by psychiatrists; the lack of objective diagnostic methods leads to a higher rate of misdiagnosis. In this paper, a method of depression recognition based on SVM and a particle swarm optimization algorithm with mutation is proposed. To address the problem that the particle swarm optimization (PSO) algorithm is easily trapped in local optima, we propose a feedback mutation PSO algorithm (FBPSO) that balances local search and global exploration ability, so that the parameters of the classification model are optimal. We compared the depression classification accuracy of different PSO mutation algorithms and found that the support vector machine (SVM) classifier based on the feedback mutation PSO algorithm achieves the highest classification accuracy. Our study provides an important reference for establishing auxiliary diagnostic tools for depression recognition in clinical diagnosis.
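
A minimal sketch of particle swarm optimization with a simple random mutation step may help make the idea concrete. It is not the authors' FBPSO: the feedback rule is not described in the abstract, so a fixed `mutation_rate` is assumed. The objective is a hypothetical stand-in for a cross-validated SVM error surface over two hyperparameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 w=0.7, c1=1.5, c2=1.5, mutation_rate=0.1, seed=0):
    """Basic PSO with random-reset mutation (all settings are assumptions)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
                # Mutation: occasionally re-sample a coordinate to escape local optima.
                if rng.random() < mutation_rate:
                    pos[i][d] = rng.uniform(lo, hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(p):
    """Hypothetical error surface with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


search_bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best, best_val = pso_minimize(toy_objective, search_bounds)
```

In an SVM setting, `toy_objective` would be replaced by a function that trains the classifier with the candidate parameters and returns a validation error.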

  3. Prediction of the strength of concrete radiation shielding based on LS-SVM

    International Nuclear Information System (INIS)

    Juncai, Xu; Qingwen, Ren; Zhenzhong, Shen

    2015-01-01

    Highlights: • LS-SVM was introduced for prediction of the strength of RSC. • A model for prediction of the strength of RSC was implemented. • The grid search algorithm was used to optimize the parameters of the LS-SVM. • The performance of LS-SVM in predicting the strength of RSC was evaluated. - Abstract: Radiation-shielding concrete (RSC) and conventional concrete differ in strength because of their distinct constituents. Predicting the strength of RSC with different constituents plays a vital role in radiation shielding (RS) engineering design. In this study, a model to predict the strength of RSC is established using a least squares-support vector machine (LS-SVM) through grid search algorithm. The algorithm is used to optimize the parameters of the LS-SVM on the basis of traditional prediction methods for conventional concrete. The predicted results of the LS-SVM model are compared with the experimental data. The results of the prediction are stable and consistent with the experimental results. In addition, the studied parameters exhibit significant effects on the simulation results. Therefore, the proposed method can be applied in predicting the strength of RSC, and the predicted results can be adopted as an important reference for RS engineering design
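
The combination described here, an LS-SVM whose parameters are chosen by grid search, can be sketched in a few lines. The code below uses the standard LS-SVM regression linear system with an RBF kernel. The kernel choice, the parameter grids, and the toy data are assumptions for illustration, not details taken from the paper.

```python
import math


def rbf(x, z, sigma):
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / (2.0 * sigma ** 2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def lssvm_fit(X, y, gam, sigma):
    """Solve [[0, 1^T], [1, K + I/gam]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [rbf(X[i], X[j], sigma) + (1.0 / gam if i == j else 0.0)
                          for j in range(n)])
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, dual weights alpha


def lssvm_predict(model, X_train, sigma, x):
    b, alpha = model
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, X_train))


def grid_search(X_tr, y_tr, X_val, y_val, gammas, sigmas):
    """Return (validation MSE, gam, sigma) for the best grid point."""
    best = None
    for gam in gammas:
        for sigma in sigmas:
            model = lssvm_fit(X_tr, y_tr, gam, sigma)
            mse = sum((lssvm_predict(model, X_tr, sigma, x) - t) ** 2
                      for x, t in zip(X_val, y_val)) / len(y_val)
            if best is None or mse < best[0]:
                best = (mse, gam, sigma)
    return best


# Hypothetical toy data: y = x^2 on [0, 1], alternating train/validation points.
X_tr = [(i / 10.0,) for i in range(0, 11, 2)]
X_val = [(i / 10.0,) for i in range(1, 10, 2)]
y_tr = [x[0] ** 2 for x in X_tr]
y_val = [x[0] ** 2 for x in X_val]
gammas, sigmas = [1.0, 10.0, 100.0, 1000.0], [0.1, 0.3, 1.0]
best_mse, best_gam, best_sigma = grid_search(X_tr, y_tr, X_val, y_val, gammas, sigmas)
```

A single hold-out split keeps the sketch short. Cross-validation over several folds would be the more usual way to score each grid point.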

  4. In-Vivo Imaging of Cell Migration Using Contrast Enhanced MRI and SVM Based Post-Processing.

    Science.gov (United States)

    Weis, Christian; Hess, Andreas; Budinsky, Lubos; Fabry, Ben

    2015-01-01

    The migration of cells within a living organism can be observed with magnetic resonance imaging (MRI) in combination with iron oxide nanoparticles as an intracellular contrast agent. This method, however, suffers from low sensitivity and specificity. Here, we developed a quantitative non-invasive in-vivo cell localization method using contrast enhanced multiparametric MRI and support vector machines (SVM) based post-processing. Imaging phantoms consisting of agarose with compartments containing different concentrations of cancer cells labeled with iron oxide nanoparticles were used to train and evaluate the SVM for cell localization. From the magnitude and phase data acquired with a series of T2*-weighted gradient-echo scans at different echo-times, we extracted features that are characteristic for the presence of superparamagnetic nanoparticles, in particular hyper- and hypointensities, relaxation rates, short-range phase perturbations, and perturbation dynamics. High detection quality was achieved by SVM analysis of the multiparametric feature-space. The in-vivo applicability was validated in animal studies. The SVM detected the presence of iron oxide nanoparticles in the imaging phantoms with high specificity and sensitivity with a detection limit of 30 labeled cells per mm³, corresponding to 19 μM of iron oxide. As proof-of-concept, we applied the method to follow the migration of labeled cancer cells injected in rats. The combination of iron oxide labeled cells, multiparametric MRI and SVM-based post-processing provides high spatial resolution, specificity, and sensitivity, and is therefore suitable for non-invasive in-vivo cell detection and cell migration studies over prolonged time periods.

  5. Fault diagnosis of nuclear-powered equipment based on HMM and SVM

    International Nuclear Information System (INIS)

    Yue Xia; Zhang Chunliang; Zhu Houyao; Quan Yanming

    2012-01-01

    For the complexity and the small fault samples of nuclear-powered equipment, a hybrid HMM/SVM method was introduced in fault diagnosis. The hybrid method has two steps: first, HMM is utilized for primary diagnosis, in which the range of possible failure is reduced and the state trends can be observed; then faults can be recognized taking the advantage of the generalization ability of SVM. Experiments on the main pump failure simulator show that the HMM/SVM system has a high recognition rate and can be used in the fault diagnosis of nuclear-powered equipment. (authors)

  6. A Power Transformers Fault Diagnosis Model Based on Three DGA Ratios and PSO Optimization SVM

    Science.gov (United States)

    Ma, Hongzhe; Zhang, Wei; Wu, Rongrong; Yang, Chunyan

    2018-03-01

    To address the shortcomings of existing transformer fault diagnosis methods in dissolved gas-in-oil analysis (DGA) feature selection and parameter optimization, a transformer fault diagnosis model based on three DGA ratios and a support vector machine (SVM) optimized by particle swarm optimization (PSO) is proposed. The standard SVM is extended to a nonlinear, multi-class SVM, PSO is used to optimize the multi-class SVM model, and transformer fault diagnosis is carried out in combination with cross-validation. The fault diagnosis results show that the average accuracy of the tested method is better than that of the standard SVM and the genetic-algorithm-optimized SVM, demonstrating that the proposed method can effectively improve the accuracy of transformer fault diagnosis.

  7. SVM and SVM Ensembles in Breast Cancer Prediction

    OpenAIRE

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction per...

  8. Learning SVM in Kreĭn Spaces.

    Science.gov (United States)

    Loosli, Gaelle; Canu, Stephane; Ong, Cheng Soon

    2016-06-01

    This paper presents a theoretical foundation for an SVM solver in Kreĭn spaces. Up to now, all methods are based either on the matrix correction, or on non-convex minimization, or on feature-space embedding. Here we justify and evaluate a solution that uses the original (indefinite) similarity measure, in the original Kreĭn space. This solution is the result of a stabilization procedure. We establish the correspondence between the stabilization problem (which has to be solved) and a classical SVM based on minimization (which is easy to solve). We provide simple equations to go from one to the other (in both directions). This link between stabilization and minimization problems is the key to obtain a solution in the original Kreĭn space. Using KSVM, one can solve SVM with usually troublesome kernels (large negative eigenvalues or large numbers of negative eigenvalues). We show experiments showing that our algorithm KSVM outperforms all previously proposed approaches to deal with indefinite matrices in SVM-like kernel methods.

  9. Hybrid PSO–SVM-based method for forecasting of the remaining useful life for aircraft engines and evaluation of its reliability

    International Nuclear Information System (INIS)

    García Nieto, P.J.; García-Gonzalo, E.; Sánchez Lasheras, F.; Cos Juez, F.J. de

    2015-01-01

    The present paper describes a hybrid PSO–SVM-based model for the prediction of the remaining useful life of aircraft engines. The proposed hybrid model combines support vector machines (SVMs), which have been successfully adopted for regression problems, with the particle swarm optimization (PSO) technique. This optimization technique involves kernel parameter setting in the SVM training procedure, which significantly influences the regression accuracy. However, its use in reliability applications has not yet been widely explored. Bearing this in mind, remaining useful life values for aircraft engines have been successfully predicted here by using the hybrid PSO–SVM-based model from the remaining measured parameters (input variables). A coefficient of determination equal to 0.9034 was obtained when this hybrid PSO–RBF–SVM-based model was applied to experimental data. The agreement of this model with experimental data confirmed its good performance. One of the main advantages of this predictive model is that it does not require information about the previous operation states of the engine. Finally, the main conclusions of this study are presented. - Highlights: • A hybrid PSO–SVM-based model is built as a predictive model of the RUL values for aircraft engines. • The remaining physical–chemical variables in this process are studied in depth. • The obtained regression accuracy of our method is about 95%. • The results show that PSO–SVM-based model can assist in the diagnosis of the RUL values with accuracy
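
The coefficient of determination quoted in this abstract is conventionally defined as one minus the ratio of the residual sum of squares to the total sum of squares. A small sketch with made-up numbers (not the paper's data) shows the two reference points of the scale:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


exact_fit = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # predictions match exactly
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])   # always predicting the mean
```

A perfect model scores 1 and a model that always predicts the mean scores 0, so a value such as 0.9034 means the model explains roughly 90% of the variance in the measured values.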

  10. Research on gesture recognition of augmented reality maintenance guiding system based on improved SVM

    Science.gov (United States)

    Zhao, Shouwei; Zhang, Yong; Zhou, Bin; Ma, Dongxi

    2014-09-01

    Interaction is one of the key techniques of an augmented reality (AR) maintenance guiding system. Because of the complexity of the maintenance guiding system's image background and the high dimensionality of gesture characteristics, the whole process of gesture recognition can be divided into three stages: gesture segmentation, gesture characteristic feature modeling, and trick recognition. In the segmentation stage, to avoid the misrecognition of skin-like regions, a segmentation algorithm combining a background model and skin color is adopted to exclude some skin-like regions. In the gesture characteristic feature modeling stage, many characteristic features of the image attributes are analyzed and acquired, such as structure characteristics, Hu invariant moment features and Fourier descriptors. In the trick recognition stage, a classifier based on Support Vector Machine (SVM) is introduced into the augmented reality maintenance guiding process. SVM is a novel learning method based on statistical learning theory; it has a solid theoretical foundation and excellent learning ability, is widely studied in the machine learning area, and has special advantages in dealing with small-sample, non-linear, high-dimensional pattern recognition problems. The gesture recognition of the augmented reality maintenance guiding system is realized by SVM after the granulation of all the characteristic features. The experimental results of the simulation of number gesture recognition and its application in the augmented reality maintenance guiding system show that the real-time performance and robustness of gesture recognition of the AR maintenance guiding system can be greatly enhanced by the improved SVM.

  11. Recursive Cluster Elimination (RCE) for classification and feature selection from gene expression data

    Directory of Open Access Journals (Sweden)

    Showe Louise C

    2007-05-01

    Background: Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that are important for distinguishing the different sample classes being compared poses a challenging problem in high dimensional data analysis. We describe a new procedure for selecting significant genes, recursive cluster elimination (RCE), rather than recursive feature elimination (RFE). We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE. Results: We have developed a novel method for selecting significant genes in comparative gene expression studies. This method, which we refer to as SVM-RCE, combines K-means, a clustering method, to identify correlated gene clusters, and Support Vector Machines (SVMs), a supervised machine learning classification method, to identify and score (rank) those gene clusters for the purpose of classification. K-means is used initially to group genes into clusters. Recursive cluster elimination (RCE) is then applied to iteratively remove those clusters of genes that contribute the least to the classification performance. SVM-RCE identifies the clusters of correlated genes that are most significantly differentially expressed between the sample classes. Utilization of gene clusters, rather than individual genes, enhances the supervised classification accuracy of the same data as compared to the accuracy when either SVM or Penalized Discriminant Analysis (PDA) with recursive feature elimination (SVM-RFE and PDA-RFE) are used to remove genes based on their individual discriminant weights. Conclusion: SVM-RCE provides improved classification accuracy with complex microarray data sets when it is compared to the classification accuracy of the same datasets using either SVM-RFE or PDA-RFE. SVM-RCE identifies clusters of correlated genes that when considered together
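
The elimination loop described above can be sketched independently of the clustering and SVM steps. In the sketch below the clusters are given as a dictionary and `score_cluster` is a caller-supplied function. In SVM-RCE that score would come from an SVM's classification performance on the cluster's genes, whereas here a hypothetical lookup table stands in for it. The `drop_fraction` parameter and all names are assumptions.

```python
def recursive_cluster_elimination(clusters, score_cluster,
                                  drop_fraction=0.5, min_clusters=1):
    """Repeatedly drop the lowest-scoring clusters.

    clusters: dict mapping cluster name -> list of gene ids.
    score_cluster: callable taking a list of gene ids, returning a float.
    Returns the surviving clusters and the list of survivors per round.
    """
    active = dict(clusters)
    history = [sorted(active)]
    while len(active) > min_clusters:
        ranked = sorted(active, key=lambda name: score_cluster(active[name]))
        n_drop = max(1, int(len(ranked) * drop_fraction))
        n_drop = min(n_drop, len(active) - min_clusters)
        for name in ranked[:n_drop]:
            del active[name]
        history.append(sorted(active))
    return active, history


# Hypothetical per-gene usefulness values standing in for SVM performance.
toy_gene_score = {"g1": 0.9, "g2": 0.8, "g3": 0.2,
                  "g4": 0.6, "g5": 0.5, "g6": 0.1}
toy_clusters = {"a": ["g1", "g2"], "b": ["g3"], "c": ["g4", "g5"], "d": ["g6"]}


def mean_score(genes):
    return sum(toy_gene_score[g] for g in genes) / len(genes)


survivors, rounds = recursive_cluster_elimination(toy_clusters, mean_score)
```

Scoring whole clusters rather than single genes is the point of the method: correlated genes are kept or removed together.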

  12. sw-SVM: sensor weighting support vector machines for EEG-based brain-computer interfaces.

    Science.gov (United States)

    Jrad, N; Congedo, M; Phlypo, R; Rousseau, S; Flamary, R; Yger, F; Rakotomamonjy, A

    2011-10-01

    In many machine learning applications, like brain-computer interfaces (BCI), high-dimensional sensor array data are available. Sensor measurements are often highly correlated and signal-to-noise ratio is not homogeneously spread across sensors. Thus, collected data are highly variable and discrimination tasks are challenging. In this work, we focus on sensor weighting as an efficient tool to improve the classification procedure. We present an approach integrating sensor weighting in the classification framework. Sensor weights are considered as hyper-parameters to be learned by a support vector machine (SVM). The resulting sensor weighting SVM (sw-SVM) is designed to satisfy a margin criterion, that is, the generalization error. Experimental studies on two data sets are presented, a P300 data set and an error-related potential (ErrP) data set. For the P300 data set (BCI competition III), for which a large number of trials is available, the sw-SVM proves to perform equivalently with respect to the ensemble SVM strategy that won the competition. For the ErrP data set, for which a small number of trials are available, the sw-SVM shows superior performances as compared to three state-of-the art approaches. Results suggest that the sw-SVM promises to be useful in event-related potentials classification, even with a small number of training trials.

  13. F-SVM: Combination of Feature Transformation and SVM Learning via Convex Relaxation

    OpenAIRE

    Wu, Xiaohe; Zuo, Wangmeng; Zhu, Yuanyuan; Lin, Liang

    2015-01-01

    The generalization error bound of support vector machine (SVM) depends on the ratio of radius and margin, while standard SVM only considers the maximization of the margin but ignores the minimization of the radius. Several approaches have been proposed to integrate radius and margin for joint learning of feature transformation and SVM classifier. However, most of them either require the form of the transformation matrix to be diagonal, or are non-convex and computationally expensive. In this ...

  14. Support vector machine regression (LS-SVM)--an alternative to artificial neural networks (ANNs) for the analysis of quantum chemistry data?

    Science.gov (United States)

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-06-28

    A multilayer feed-forward artificial neural network (MLP-ANN) with a single, hidden layer that contains a finite number of neurons can be regarded as a universal non-linear approximator. Today, the ANN method and linear regression (MLR) model are widely used for quantum chemistry (QC) data analysis (e.g., thermochemistry) to improve their accuracy (e.g., Gaussian G2-G4, B3LYP/B3-LYP, X1, or W1 theoretical methods). In this study, an alternative approach based on support vector machines (SVMs) is used, the least squares support vector machine (LS-SVM) regression. It has been applied to ab initio (first principle) and density functional theory (DFT) quantum chemistry data. The QC + SVM methodology is thus an alternative to the QC + ANN one. The task of the study was to estimate the Møller-Plesset (MPn) or DFT (B3LYP, BLYP, BMK) energies calculated with large basis sets (e.g., 6-311G(3df,3pd)) using smaller ones (6-311G, 6-311G*, 6-311G**) plus molecular descriptors. A molecular set (BRM-208) containing a total of 208 organic molecules was constructed and used for the LS-SVM training, cross-validation, and testing. MP2, MP3, MP4(DQ), MP4(SDQ), and MP4/MP4(SDTQ) ab initio methods were tested. Hartree-Fock (HF/SCF) results were also reported for comparison. Furthermore, constitutional (CD: total number of atoms and mole fractions of different atoms) and quantum-chemical (QD: HOMO-LUMO gap, dipole moment, average polarizability, and quadrupole moment) molecular descriptors were used for the building of the LS-SVM calibration model. Prediction accuracies (MADs) of 1.62 ± 0.51 and 0.85 ± 0.24 kcal mol⁻¹ (1 kcal mol⁻¹ = 4.184 kJ mol⁻¹) were reached for SVM-based approximations of ab initio and DFT energies, respectively. The LS-SVM model was more accurate than the MLR model. A comparison with the artificial neural network approach shows that the accuracy of the LS-SVM method is similar to the accuracy of ANN. The extrapolation and interpolation results show that LS-SVM is

  15. Cerebral 18F-FDG PET in macrophagic myofasciitis: An individual SVM-based approach.

    Science.gov (United States)

    Blanc-Durand, Paul; Van Der Gucht, Axel; Guedj, Eric; Abulizi, Mukedaisi; Aoun-Sebaiti, Mehdi; Lerman, Lionel; Verger, Antoine; Authier, François-Jérôme; Itti, Emmanuel

    2017-01-01

    Macrophagic myofasciitis (MMF) is an emerging condition with highly specific myopathological alterations. A peculiar spatial pattern of cerebral glucose hypometabolism involving the occipito-temporal cortex and cerebellum has been reported in patients with MMF; however, the full pattern is not systematically present in routine interpretation of scans, and it varies in severity depending on the cognitive profile of patients. The aim was to generate and evaluate a support vector machine (SVM) procedure to classify subjects as having healthy or MMF 18F-FDG brain profiles. 18F-FDG PET brain images of 119 patients with MMF and 64 healthy subjects were retrospectively analyzed. The whole population was divided into two groups: a training set (100 MMF, 44 healthy subjects) and a testing set (19 MMF, 20 healthy subjects). Dimensionality reduction was performed using a t-map from statistical parametric mapping (SPM), and an SVM with a linear kernel was trained on the training set. To evaluate the performance of the SVM classifier, values of sensitivity (Se), specificity (Sp), positive predictive value (PPV), negative predictive value (NPV) and accuracy (Acc) were calculated. The SPM12 analysis on the training set exhibited the already reported hypometabolism pattern involving occipito-temporal and fronto-parietal cortices, limbic system and cerebellum. The SVM procedure, based on the t-test mask generated from the training set, correctly classified MMF patients of the testing set with the following Se, Sp, PPV, NPV and Acc: 89%, 85%, 85%, 89%, and 87%. We developed an original and individual approach including an SVM to classify subjects as having healthy or MMF metabolic brain profiles using 18F-FDG-PET. Machine learning algorithms are promising for computer-aided diagnosis but will need further validation in prospective cohorts.
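
The five performance measures listed above all follow from a 2×2 confusion matrix. The abstract does not report the matrix itself. The counts used below (17 true positives, 2 false negatives, 17 true negatives, 3 false positives) are an assumption chosen because they match the stated test-set sizes (19 MMF, 20 healthy) and reproduce the reported percentages after rounding.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic measures from confusion-matrix counts."""
    return {
        "Se": tp / (tp + fn),                   # sensitivity
        "Sp": tn / (tn + fp),                   # specificity
        "PPV": tp / (tp + fp),                  # positive predictive value
        "NPV": tn / (tn + fn),                  # negative predictive value
        "Acc": (tp + tn) / (tp + fn + tn + fp)  # accuracy
    }


# Assumed counts, consistent with the 19 MMF / 20 healthy testing set.
metrics = diagnostic_metrics(tp=17, fn=2, tn=17, fp=3)
percentages = {name: round(100 * value) for name, value in metrics.items()}
```

With these counts the rounded values come out as Se 89, Sp 85, PPV 85, NPV 89 and Acc 87, in line with the figures quoted in the abstract.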

  16. SVM models for analysing the headstreams of mine water inrush

    Energy Technology Data Exchange (ETDEWEB)

    Yan Zhi-gang; Du Pei-jun; Guo Da-zhi [China University of Science and Technology, Xuzhou (China). School of Environmental Science and Spatial Informatics

    2007-08-15

    The support vector machine (SVM) model was introduced to analyse the headstream of water inrush in a coal mine. The SVM model, based on a hydrogeochemical method, was constructed for recognising two kinds of headstreams and the H-SVMs model was constructed for recognising multiple headstreams. The SVM method was applied to analyse the conditions of two mixed headstreams and the value of the SVM decision function was investigated as a means of denoting the hydrogeochemical abnormality. The experimental results show that the SVM is based on a strict mathematical theory, has a simple structure and a good overall performance. Moreover, the parameter W in the decision function can describe the weights of the discrimination indices of the headstream of water inrush. The value of the decision function can denote hydrogeochemical abnormality, which is significant in the prevention of water inrush in a coal mine. 9 refs., 1 fig., 7 tabs.

  17. A method of distributed avionics data processing based on SVM classifier

    Science.gov (United States)

    Guo, Hangyu; Wang, Jinyan; Kang, Minyang; Xu, Guojing

    2018-03-01

    In a system combat environment, to address the management and analysis of massive heterogeneous data on multi-platform avionics systems, this paper proposes a management solution called the avionics "resource cloud", based on big data technology, and designs an aided decision classifier based on the SVM algorithm. We design an experiment with STK simulation; the results show that this method has high accuracy and broad application prospects.

  18. An SVM Based Approach for the Analysis Of Mammography Images

    Science.gov (United States)

    Gan, X.; Kapsokalivas, L.; Skaliotis, A.; Steinhöfel, K.; Tangaro, S.

    2007-09-01

    Mammography is among the most popular imaging techniques used in the diagnosis of breast cancer. Nevertheless distinguishing between healthy and ill images is hard even for an experienced radiologist, because a single image usually includes several regions of interest (ROIs). The hardness of this classification problem along with the substantial amount of data, gathered from patients' medical history, motivates the use of a machine learning approach as part of a CAD (Computer Aided Detection) tool, aiming to assist radiologists in the characterization of mammography images. Specifically, our approach involves: i) the ROI extraction, ii) the Feature Vector extraction, iii) the Support Vector Machine (SVM) classification of ROIs and iv) the characterization of the whole image. We evaluate the performance of our approach in terms of the SVM's training and testing error and in terms of ROI specificity—sensitivity. The results show a relation between the number of features used and the SVM's performance.

  19. An SVM Based Approach for the Analysis Of Mammography Images

    International Nuclear Information System (INIS)

    Gan, X.; Kapsokalivas, L.; Skaliotis, A.; Steinhoefel, K.; Tangaro, S.

    2007-01-01

    Mammography is among the most popular imaging techniques used in the diagnosis of breast cancer. Nevertheless distinguishing between healthy and ill images is hard even for an experienced radiologist, because a single image usually includes several regions of interest (ROIs). The hardness of this classification problem along with the substantial amount of data, gathered from patients' medical history, motivates the use of a machine learning approach as part of a CAD (Computer Aided Detection) tool, aiming to assist radiologists in the characterization of mammography images. Specifically, our approach involves: i) the ROI extraction, ii) the Feature Vector extraction, iii) the Support Vector Machine (SVM) classification of ROIs and iv) the characterization of the whole image. We evaluate the performance of our approach in terms of the SVM's training and testing error and in terms of ROI specificity - sensitivity. The results show a relation between the number of features used and the SVM's performance.

  20. Automatic epileptic seizure detection in EEGs using MF-DFA, SVM based on cloud computing.

    Science.gov (United States)

    Zhang, Zhongnan; Wen, Tingxi; Huang, Wei; Wang, Meihong; Li, Chunfeng

    2017-01-01

    Epilepsy is a chronic disease with transient brain dysfunction that results from the sudden abnormal discharge of neurons in the brain. Since electroencephalogram (EEG) is a harmless and noninvasive detection method, it plays an important role in the detection of neurological diseases. However, the process of analyzing EEG to detect neurological diseases is often difficult because the brain electrical signals are random, non-stationary and nonlinear. To overcome this difficulty, this study aims to develop a new computer-aided scheme for automatic epileptic seizure detection in EEGs based on multi-fractal detrended fluctuation analysis (MF-DFA) and support vector machine (SVM). In the first stage, the new scheme extracts features from EEG by MF-DFA. Then, the scheme applies a genetic algorithm (GA) to calculate the parameters used in the SVM and classifies the training data according to the selected features using the SVM. Finally, the trained SVM classifier is used to detect neurological diseases. The algorithm uses the MLlib library of Spark and runs on a cloud platform. Applied to a public dataset, the results show that the new feature extraction method and scheme can detect signals with fewer features and that the classification accuracy reached up to 99%. MF-DFA is a promising approach to extract features for analyzing EEG because of its simple algorithmic procedure and fewer parameters. The features obtained by MF-DFA can represent samples as well as the traditional wavelet transform and Lyapunov exponents. GA can always find useful parameters for SVM given enough execution time. The results illustrate that the classification model can achieve comparable accuracy, which means that it is effective in epileptic seizure detection.

  1. Linear SVM-Based Android Malware Detection for Reliable IoT Services

    Directory of Open Access Journals (Sweden)

    Hyo-Sik Ham

    2014-01-01

    Full Text Available Currently, many Internet of Things (IoT) services are monitored and controlled through smartphone applications. By combining IoT with smartphones, many convenient IoT services have been provided to users. However, there are adverse underlying effects in such services, including invasion of privacy and information leakage. In most cases, mobile devices have become cluttered with important personal user information as various services and contents are provided through them. Accordingly, attackers are expanding the scope of their attacks beyond the existing PC and Internet environment into mobile devices. In this paper, we apply a linear support vector machine (SVM) to detect Android malware and compare the malware detection performance of SVM with that of other machine learning classifiers. Through experimental validation, we show that the SVM outperforms other machine learning classifiers.

  2. Predicting the Types of Ion Channel-Targeted Conotoxins Based on AVC-SVM Model.

    Science.gov (United States)

    Xianfang, Wang; Junmei, Wang; Xiaolei, Wang; Yue, Zhang

    2017-01-01

    The conotoxin proteins are disulfide-rich small peptides. Predicting the types of ion channel-targeted conotoxins has great value in the treatment of chronic diseases, epilepsy, and cardiovascular diseases. To solve the problem of information redundancy in current methods, a new model is presented to predict the types of ion channel-targeted conotoxins based on AVC (Analysis of Variance and Correlation) and SVM (Support Vector Machine). First, the F value is used to measure the significance of each feature for the result, and attributes with smaller F values are filtered out by rough selection. Second, the degree of redundancy is calculated by the Pearson Correlation Coefficient, and a threshold is set to filter out attributes with weak independence, giving the refined result. Finally, SVM is used to predict the types of ion channel-targeted conotoxins. The experimental results show that the proposed AVC-SVM model reaches an overall accuracy of 91.98% and an average accuracy of 92.17% with a total of 68 parameters. The proposed model provides highly useful information for further experimental research. The prediction model will be accessible free of charge at our web server.
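
    The correlation-based refinement step described above can be sketched as a greedy filter. This is a minimal illustration under stated assumptions, not the authors' implementation: features are assumed to arrive already ordered by decreasing F value, the threshold of 0.9 is arbitrary, and the feature names and values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def filter_redundant(features, threshold=0.9):
    """Keep a feature only if it is weakly correlated with every kept feature.

    `features` maps name -> values and is assumed to be ordered by
    decreasing F value, so the more significant feature of a pair survives.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Made-up feature columns; f2 is almost a multiple of f1 and should be dropped.
features = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.0, 4.0, 6.0, 8.1],
    "f3": [4.0, 1.0, 3.0, 2.0],
}
selected = filter_redundant(features)
print(selected)
```

    The surviving columns would then be passed to the SVM; the F-value ranking itself is left out of this sketch.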

  3. A structural SVM approach for reference parsing.

    Science.gov (United States)

    Zhang, Xiaoli; Zou, Jie; Le, Daniel X; Thoma, George R

    2011-06-09

    Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references, is essential to the affordable creation of large citation databases. References, typically appearing at the end of journal articles, can also provide valuable information for extracting other bibliographic data. Therefore, parsing individual references to extract author, title, journal, year, etc. is sometimes a necessary preprocessing step in building citation-indexing systems. The regular structure in references enables us to consider reference parsing a sequence learning problem and to study the structural Support Vector Machine (structural SVM), a newly developed structured learning algorithm, on parsing references. In this study, we implemented structural SVM and used two types of contextual features to compare structural SVM with conventional SVM. Both methods achieve above 98% token classification accuracy and above 95% overall chunk-level accuracy for reference parsing. We also compared SVM and structural SVM to Conditional Random Field (CRF). The experimental results show that structural SVM and CRF achieve similar accuracies at token- and chunk-levels. When only basic observation features are used for each token, structural SVM achieves higher performance compared to SVM since it utilizes the contextual label features. However, when the contextual observation features from neighboring tokens are combined, SVM performance improves greatly, and is close to that of structural SVM after adding the second order contextual observation features. The comparison of these two methods with CRF using the same set of binary features shows that both structural SVM and CRF perform better than SVM, indicating their stronger sequence learning ability in reference parsing.

  4. An Efficient Normalized Rank Based SVM for Room Level Indoor WiFi Localization with Diverse Devices

    Directory of Open Access Journals (Sweden)

    Yasmine Rezgui

    2017-01-01

    Full Text Available This paper proposes an efficient and effective WiFi fingerprinting-based indoor localization algorithm, which uses the Received Signal Strength Indicator (RSSI) of WiFi signals. In practical harsh indoor environments, RSSI variation and hardware variance can significantly degrade the performance of fingerprinting-based localization methods. To address the problem of hardware variance and signal fluctuation in WiFi fingerprinting-based localization, we propose a novel normalized rank based Support Vector Machine classifier (NR-SVM). Moving from RSSI value based analysis to normalized rank transformation based analysis, the principal features are prioritized and the dimensionalities of signature vectors are taken into account. The proposed method has been tested using sixteen different devices in a shopping mall with 88 shops. The experimental results demonstrate its robustness, with no less than 98.75% correct estimation in 93.75% of the tested cases and a 100% correct rate in 56.25% of cases. In the experiments, the new method shows better performance than the KNN, Naïve Bayes, Random Forest, and Neural Network algorithms. Furthermore, we have compared the proposed approach with three popular calibration-free transformation based methods, including the difference method (DIFF), Signal Strength Difference (SSD), and the Hyperbolic Location Fingerprinting (HLF) based SVM. The results show that the NR-SVM outperforms these popular methods.
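
    The abstract does not give the exact form of the normalized rank transformation, so the following is only one plausible sketch of the idea: replace each RSSI value by its rank among the access points, divided by the number of access points. The function name, the tie handling (ties keep list order) and the sample readings are assumptions made for illustration.

```python
def normalized_rank(rssi):
    """Map RSSI readings to ranks in (0, 1]; the strongest signal gets 1/n.

    Assumed definition: rank position divided by the number of access
    points. Ties are broken by list order.
    """
    n = len(rssi)
    order = sorted(range(n), key=lambda i: rssi[i], reverse=True)
    ranks = [0.0] * n
    for position, ap_index in enumerate(order, start=1):
        ranks[ap_index] = position / n
    return ranks


# Hypothetical readings (dBm) of the same three access points on two devices.
# Device B reports every value 5 dB lower than device A.
device_a = [-50, -70, -60]
device_b = [-55, -75, -65]

print(normalized_rank(device_a))
print(normalized_rank(device_b))
```

    Because a constant hardware offset does not change the ordering of access points, both devices produce the same signature vector, which is the kind of device independence the abstract aims for.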

  5. Grouped fuzzy SVM with EM-based partition of sample space for clustered microcalcification detection.

    Science.gov (United States)

    Wang, Huiya; Feng, Jun; Wang, Hongyu

    2017-07-20

    Detection of clustered microcalcification (MC) from mammograms plays an essential role in computer-aided diagnosis for early stage breast cancer. To tackle problems associated with the diversity of data structures of MC lesions and the variability of normal breast tissues, multi-pattern sample space learning is required. In this paper, a novel grouped fuzzy Support Vector Machine (SVM) algorithm with sample space partition based on Expectation-Maximization (EM) (called G-FSVM) is proposed for clustered MC detection. The diversified pattern of training data is partitioned into several groups based on the EM algorithm. Then a series of fuzzy SVMs are integrated for classification with each group of samples from the MC lesions and normal breast tissues. From the DDSM database, a total of 1,064 suspicious regions are selected from 239 mammograms, and the measured Accuracy, True Positive Rate (TPR), False Positive Rate (FPR) and EVL = TPR* 1-FPR are 0.82, 0.78, 0.14 and 0.72, respectively. The proposed method incorporates the merits of fuzzy SVM and multi-pattern sample space learning, decomposing the MC detection problem into a series of simple two-class classifications. Experimental results from synthetic data and the DDSM database demonstrate that our integrated classification framework reduces the false positive rate significantly while maintaining the true positive rate.

  6. Evaluation of Effectiveness of Wavelet Based Denoising Schemes Using ANN and SVM for Bearing Condition Classification

    Directory of Open Access Journals (Sweden)

    Vijay G. S.

    2012-01-01

    Full Text Available Wavelet based denoising has proven its ability to denoise bearing vibration signals by improving the signal-to-noise ratio (SNR) and reducing the root-mean-square error (RMSE). In this paper, seven wavelet based denoising schemes have been evaluated based on the performance of the Artificial Neural Network (ANN) and the Support Vector Machine (SVM) for bearing condition classification. The work consists of two parts. In the first part, a synthetic signal simulating the defective bearing vibration signal with Gaussian noise was subjected to these denoising schemes, and the best scheme based on the SNR and the RMSE was identified. In the second part, the vibration signals collected from a customized Rolling Element Bearing (REB) test rig for four bearing conditions were subjected to these denoising schemes. Several time and frequency domain features were extracted from the denoised signals, out of which a few sensitive features were selected using Fisher's Criterion (FC). The extracted features were used to train and test the ANN and the SVM. The best denoising scheme identified, based on the classification performances of the ANN and the SVM, was found to be the same as the one obtained using the synthetic signal.

  7. Elucidation of Metallic Plume and Spatter Characteristics Based on SVM During High-Power Disk Laser Welding

    International Nuclear Information System (INIS)

    Gao Xiangdong; Liu Guiqian

    2015-01-01

    During deep penetration laser welding, there exist plume (weak plasma) and spatters, which are the results of weld material ejection due to strong laser heating. The characteristics of plume and spatters are related to welding stability and quality. Characteristics of metallic plume and spatters were investigated during high-power disk laser bead-on-plate welding of Type 304 austenitic stainless steel plates at a continuous wave laser power of 10 kW. An ultraviolet and visible sensitive high-speed camera was used to capture the metallic plume and spatter images. Plume area, laser beam path through the plume, swing angle, distance between laser beam focus and plume image centroid, abscissa of plume centroid and spatter numbers are defined as eigenvalues, and the weld bead width was used as a characteristic parameter that reflected welding stability. Welding status was distinguished by SVM (support vector machine) after data normalization and characteristic analysis. Also, PCA (principal components analysis) feature extraction was used to reduce the dimensions of feature space, and PSO (particle swarm optimization) was used to optimize the parameters of SVM. Finally a classification model based on SVM was established to estimate the weld bead width and welding stability. Experimental results show that the established algorithm based on SVM could effectively distinguish the variation of weld bead width, thus providing an experimental example of monitoring high-power disk laser welding quality. (plasma technology)

  8. Fault detection of Tennessee Eastman process based on topological features and SVM

    Science.gov (United States)

    Zhao, Huiyang; Hu, Yanzhu; Ai, Xinbo; Hu, Yu; Meng, Zhen

    2018-03-01

    Fault detection in industrial processes is a popular research topic. Although the distributed control system (DCS) has been introduced to monitor the state of industrial processes, it still cannot satisfy all the requirements for fault detection in all industrial systems. In this paper, we propose a novel method based on topological features and support vector machine (SVM) for fault detection of industrial processes. The proposed method takes global information of measured variables into account through a complex network model and uses an SVM to predict whether a system has generated faults. The proposed method can be divided into four steps: network construction, network analysis, model training and model testing. Finally, we apply the model to the Tennessee Eastman process (TEP). The results show that this method works well and can be a useful supplement for fault detection of industrial processes.

  9. Construction of Pancreatic Cancer Classifier Based on SVM Optimized by Improved FOA

    Science.gov (United States)

    Ma, Xiaoqi

    2015-01-01

    A novel method is proposed to establish a pancreatic cancer classifier. First, the concepts of quantum computing and the fruit fly optimization algorithm (FOA) are introduced. Then FOA is improved by quantum coding and quantum operation, and a new smell concentration determination function is defined. Finally, the improved FOA is used to optimize the parameters of the support vector machine (SVM), and the classifier is established with the optimized SVM. To verify the effectiveness of the proposed method, SVM and other classification methods were chosen for comparison. The experimental results show that the proposed method can improve classifier performance and takes less time. PMID:26543867

  10. Design and Evaluation of a Texture Classifier Based on LS-SVM

    OpenAIRE

    Beitmantt Cárdenas Quintero; Nelson Enrique Vera Parra; Pablo Emilio Rozo García

    2013-01-01

    To evaluate the performance and computational cost of different Least Square Support Vector Machine (LS-SVM) architectures and methodologies for texture-based image segmentation and, from these results, to propose a model of an LS-SVM texture classifier. Methodology: Given a binary classification problem represented by the segmentation of 32 images, organized in 4 groups and formed by pairs of typical textures (granite/bark, brick/upholstery, wood/marble, teji...

  11. Classification of different kinds of pesticide residues on lettuce based on fluorescence spectra and WT-BCC-SVM algorithm

    Science.gov (United States)

    Zhou, Xin; Jun, Sun; Zhang, Bing; Jun, Wu

    2017-07-01

    To improve the reliability of the spectral features extracted by the wavelet transform, this paper proposes a method (WT-BCC-SVM) that combines the wavelet transform (WT) with a bacterial colony chemotaxis and support vector machine (BCC-SVM) algorithm. In addition, we aimed to identify different kinds of pesticide residues on lettuce leaves in a novel, rapid and non-destructive way using fluorescence spectroscopy. The fluorescence spectral data of 150 lettuce leaf samples with five different kinds of pesticide residues on the leaf surface were obtained using a Cary Eclipse fluorescence spectrometer. Standard normalized variable detrending (SNV detrending) and Savitzky-Golay coupled with standard normalized variable detrending (SG-SNV detrending) were each used to preprocess the raw spectra. Bacterial colony chemotaxis combined with support vector machine (BCC-SVM) and support vector machine (SVM) classification models were established based on full spectra (FS) and wavelet transform characteristics (WTC), respectively. The WTC were selected by WT. The results showed that the accuracies on the training set, calibration set and prediction set of the best classification model (SG-SNV detrending-WT-BCC-SVM) were 100%, 98% and 93.33%, respectively. In addition, the results indicated that it is feasible to use WT-BCC-SVM to establish a diagnostic model of different kinds of pesticide residues on lettuce leaves.

  12. SVM classifier on chip for melanoma detection.

    Science.gov (United States)

    Afifi, Shereen; GholamHosseini, Hamid; Sinha, Roopak

    2017-07-01

    Support Vector Machine (SVM) is a common classifier used for efficient classification with high accuracy. SVM shows high accuracy for classifying melanoma (skin cancer) clinical images within computer-aided diagnosis systems used by skin cancer specialists to detect melanoma early and save lives. We aim to develop a low-cost handheld medical device that runs a real-time embedded SVM-based diagnosis system for use in primary care for early detection of melanoma. In this paper, an optimized SVM classifier is implemented on a recent FPGA platform using the latest design methodology, to be embedded into the proposed device for efficient online melanoma detection on a single system on chip/device. The hardware implementation results demonstrate a high classification accuracy of 97.9% and a significant acceleration factor of 26 over an equivalent software implementation on an embedded processor, with 34% resource utilization and 2 watts of power consumption. Consequently, the implemented system meets crucial embedded systems constraints of high performance and low cost, resource utilization and power consumption, while achieving high classification accuracy.

  13. Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood.

    Science.gov (United States)

    Zhang, Fan; Kaufman, Howard L; Deng, Youping; Drabier, Renee

    2013-01-01

    Breast cancer is the second most common type of cancer worldwide after lung cancer. Traditional mammography and Tissue Microarray have been studied for early cancer detection and cancer prediction. However, there is a need for more reliable diagnostic tools for early detection of breast cancer. This can be a challenge due to a number of factors and logistics. First, obtaining tissue biopsies can be difficult. Second, mammography may not detect small tumors, and is often unsatisfactory for younger women who typically have dense breast tissue. Lastly, breast cancer is not a single homogeneous disease but consists of multiple disease states, each arising from a distinct molecular mechanism and having a distinct clinical progression path, which makes the disease difficult to detect and predict in early stages. In this paper, we present a Support Vector Machine based on Recursive Feature Elimination and Cross Validation (SVM-RFE-CV) algorithm for early detection of breast cancer in peripheral blood and show how to use SVM-RFE-CV to model the classification and prediction problem of early detection of breast cancer in peripheral blood. The training set, which consists of 32 healthy and 33 cancer samples, and the testing set, consisting of 31 healthy and 34 cancer samples, were randomly separated from a peripheral blood breast cancer dataset downloaded from Gene Expression Omnibus. First, we identified the 42 differentially expressed biomarkers between "normal" and "cancer". Then, with the SVM-RFE-CV we extracted 15 biomarkers that yield zero cross validation score. Lastly, we compared the classification and prediction performance of SVM-RFE-CV with that of SVM and SVM Recursive Feature Elimination (SVM-RFE). We found that 1) the SVM-RFE-CV is suitable for analyzing noisy high-throughput microarray data, 2) it outperforms SVM-RFE in the robustness to noise and in the ability to recover informative features, and 3) it can improve the prediction performance (Area Under

  14. Image Interpolation Scheme based on SVM and Improved PSO

    Science.gov (United States)

    Jia, X. F.; Zhao, B. T.; Liu, X. X.; Song, H. P.

    2018-01-01

    To obtain visually pleasing images, a support vector machine (SVM) based interpolation scheme is proposed, in which improved particle swarm optimization is applied to optimize the support vector machine parameters. Training samples are constructed from the pixels around the pixel to be interpolated. The support vector machine with optimal parameters is then trained using the training samples. After training, we obtain the interpolation model, which can be employed to estimate the unknown pixel. Experimental results show that the interpolated images achieve improved PSNR compared with traditional interpolation methods, which agrees with the subjective quality.

  15. A Method for Aileron Actuator Fault Diagnosis Based on PCA and PGC-SVM

    Directory of Open Access Journals (Sweden)

    Wei-Li Qin

    2016-01-01

    Full Text Available Aileron actuators are pivotal components of aircraft flight control systems. Thus, the fault diagnosis of aileron actuators is vital to enhancing reliability and fault tolerance. This paper presents an aileron actuator fault diagnosis approach combining principal component analysis (PCA), grid search (GS), 10-fold cross validation (CV), and one-versus-one support vector machine (SVM). This method is referred to as PGC-SVM and utilizes the direct drive valve input, force motor current, and displacement feedback signal to realize fault detection and location. First, several common faults of aileron actuators, which include force motor coil break, sensor coil break, cylinder leakage, and amplifier gain reduction, are extracted from the fault quadrantal diagram; the corresponding fault mechanisms are analyzed. Second, the data feature extraction is performed with dimension reduction using PCA. Finally, the GS and CV algorithms are employed to train a one-versus-one SVM for fault classification, thus obtaining the optimal model parameters and assuring the generalization of the trained SVM, respectively. To verify the effectiveness of the proposed approach, four types of faults are introduced into the simulation model established by AMESim and Simulink. The results demonstrate its desirable diagnostic performance, which outperforms that of the traditional SVM.
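
    The combination of grid search and k-fold cross validation described above can be sketched without reference to a particular SVM library. In the sketch below, model training and scoring are abstracted into a caller-supplied function; the PCA step and the one-versus-one SVM itself are omitted. The parameter names `C` and `gamma`, the grid values, and the toy scoring function are assumptions for illustration, not values from the paper.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k interleaved folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


def grid_search(param_grid, n_samples, k, score_fn):
    """Return the parameter combination with the best mean fold score.

    `score_fn(params, train, test)` is expected to train a model on the
    `train` indices and return its accuracy on the `test` indices.
    """
    best_params, best_score = None, float("-inf")
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [score_fn(params, train, test)
                  for train, test in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_score(params, train, test):
    """Stand-in for 'train an SVM and score it'; peaks at C=10, gamma=0.1."""
    return -abs(params["C"] - 10) - abs(params["gamma"] - 0.1)


grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1]}
best_params, best_score = grid_search(grid, n_samples=50, k=10, score_fn=toy_score)
print(best_params, best_score)
```

    In a real pipeline the toy scoring function would be replaced by one that fits the classifier on the PCA-reduced training folds and evaluates it on the held-out fold.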

  16. MIEC-SVM: automated pipeline for protein peptide/ligand interaction prediction.

    Science.gov (United States)

    Li, Nan; Ainsworth, Richard I; Wu, Meixin; Ding, Bo; Wang, Wei

    2016-03-15

    MIEC-SVM is a structure-based method for predicting protein recognition specificity. Here, we present an automated MIEC-SVM pipeline providing an integrated and user-friendly workflow for construction and application of the MIEC-SVM models. This pipeline can handle standard amino acids and those with post-translational modifications (PTMs) or small molecules. Moreover, multi-threading and support for Sun Grid Engine (SGE) are implemented to significantly boost the computational efficiency. The program is available at http://wanglab.ucsd.edu/MIEC-SVM. Contact: wei-wang@ucsd.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  17. Accurate Gene Expression-Based Biodosimetry Using a Minimal Set of Human Gene Transcripts

    Energy Technology Data Exchange (ETDEWEB)

    Tucker, James D., E-mail: jtucker@biology.biosci.wayne.edu [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Joiner, Michael C. [Department of Radiation Oncology, Wayne State University, Detroit, Michigan (United States); Thomas, Robert A.; Grever, William E.; Bakhmutsky, Marina V. [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Chinkhota, Chantelle N.; Smolinski, Joseph M. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States); Divine, George W. [Department of Public Health Sciences, Henry Ford Hospital, Detroit, Michigan (United States); Auner, Gregory W. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States)

    2014-03-15

    Purpose: Rapid and reliable methods for conducting biological dosimetry are a necessity in the event of a large-scale nuclear event. Conventional biodosimetry methods lack the speed, portability, ease of use, and low cost required for triaging numerous victims. Here we address this need by showing that polymerase chain reaction (PCR) on a small number of gene transcripts can provide accurate and rapid dosimetry. The low cost and relative ease of PCR compared with existing dosimetry methods suggest that this approach may be useful in mass-casualty triage situations. Methods and Materials: Human peripheral blood from 60 adult donors was acutely exposed to cobalt-60 gamma rays at doses of 0 (control) to 10 Gy. mRNA expression levels of 121 selected genes were obtained 0.5, 1, and 2 days after exposure by reverse-transcriptase real-time PCR. Optimal dosimetry at each time point was obtained by stepwise regression of dose received against individual gene transcript expression levels. Results: Only 3 to 4 different gene transcripts, ASTN2, CDKN1A, GDF15, and ATM, are needed to explain ≥0.87 of the variance (R²). Receiver-operator characteristics, a measure of sensitivity and specificity, of 0.98 for these statistical models were achieved at each time point. Conclusions: The actual and predicted radiation doses agree very closely up to 6 Gy. Dosimetry at 8 and 10 Gy shows some effect of saturation, thereby slightly diminishing the ability to quantify higher exposures. Analyses of these gene transcripts may be advantageous for use in a field-portable device designed to assess exposures in mass casualty situations or in clinical radiation emergencies.

  18. Modulation transfer function (MTF) measurement method based on support vector machine (SVM)

    Science.gov (United States)

    Zhang, Zheng; Chen, Yueting; Feng, Huajun; Xu, Zhihai; Li, Qi

    2016-03-01

    An imaging system's spatial quality can be expressed by the system's modulation transfer function (MTF) as a function of spatial frequency in terms of linear response theory. Methods have been proposed to assess the MTF of an imaging system using point, slit or edge techniques. The edge method is widely used because of its low requirements on targets. However, traditional edge methods are limited by the edge angle. In addition, image noise impairs the measurement accuracy, making the measurement result unstable. In this paper, a novel measurement method based on the support vector machine (SVM) is proposed. Image patches with different edge angles and MTF levels are generated as the training set. Parameters related to MTF and image structure are extracted from the edge images. Trained with the image parameters and the corresponding MTF, the SVM classifier can assess the MTF of any edge image. The results show that the proposed method performs excellently in measurement accuracy and stability.

  19. Efficient HIK SVM learning for image classification.

    Science.gov (United States)

    Wu, Jianxin

    2012-10-01

    Histograms are used in almost every aspect of image processing and computer vision, from visual descriptors to image representations. Histogram intersection kernel (HIK) and support vector machine (SVM) classifiers are shown to be very effective in dealing with histograms. This paper presents contributions concerning HIK SVM for image classification. First, we propose intersection coordinate descent (ICD), a deterministic and scalable HIK SVM solver. ICD is much faster than, and has similar accuracies to, general purpose SVM solvers and other fast HIK SVM training methods. We also extend ICD to the efficient training of a broader family of kernels. Second, we show an important empirical observation that ICD is not sensitive to the C parameter in SVM, and we provide some theoretical analyses to explain this observation. ICD achieves high accuracies in many problems, using its default parameters. This is an attractive property for practitioners, because many image processing tasks are too large to choose SVM parameters using cross-validation.
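
    For reference, the histogram intersection kernel itself is simple to state: for two histograms it sums the bin-wise minimum. The sketch below shows only this kernel evaluation; it does not implement the ICD solver described in the abstract, and the function name and sample histograms are illustrative.

```python
def histogram_intersection(x, y):
    """Histogram intersection kernel: sum of bin-wise minima."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(x, y))


# Two illustrative 3-bin histograms.
h1 = [1, 2, 3]
h2 = [2, 1, 3]
similarity = histogram_intersection(h1, h2)
print(similarity)
```

    A kernel SVM would evaluate this function on pairs of training histograms in place of a dot product.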

  20. Gene Expression Signature in Endemic Osteoarthritis by Microarray Analysis

    Directory of Open Access Journals (Sweden)

    Xi Wang

    2015-05-01

    Full Text Available Kashin-Beck Disease (KBD) is an endemic osteochondropathy with an unknown pathogenesis. Diagnosis of KBD is effective only in advanced cases, which eliminates the possibility of early treatment and leads to an inevitable exacerbation of symptoms. Therefore, we aim to identify an accurate blood-based gene signature for the detection of KBD. Previously published gene expression profile data on cartilage and peripheral blood mononuclear cells (PBMCs) from adults with KBD were compared to select potential target genes. Microarray analysis was conducted to evaluate the expression of the target genes in a cohort of 100 KBD patients and 100 healthy controls. A gene expression signature was identified using a training set, which was subsequently validated using an independent test set with a minimum redundancy maximum relevance (mRMR) algorithm and support vector machine (SVM) algorithm. Fifty unique genes were differentially expressed between KBD patients and healthy controls. A 20-gene signature was identified that distinguished between KBD patients and controls with 90% accuracy, 85% sensitivity, and 95% specificity. This study identified a 20-gene signature that accurately distinguishes between patients with KBD and controls using peripheral blood samples. These results promote the further development of blood-based genetic biomarkers for detection of KBD.
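
    The three figures reported above are related through the confusion matrix. The sketch below shows the arithmetic; the counts are hypothetical (the abstract does not give the test-set size) and were chosen only so that a balanced set reproduces the reported percentages.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of patients correctly flagged
    specificity = tn / (tn + fp)   # fraction of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical balanced test set: 20 patients and 20 controls.
acc, sens, spec = diagnostic_metrics(tp=17, fn=3, tn=19, fp=1)
```

    On a balanced set, accuracy is the mean of sensitivity and specificity, which is consistent with 85% and 95% averaging to 90%.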

  1. Modeling the milling tool wear by using an evolutionary SVM-based model from milling runs experimental data

    Science.gov (United States)

    Nieto, Paulino José García; García-Gonzalo, Esperanza; Vilán, José Antonio Vilán; Robleda, Abraham Segade

    2015-12-01

    The main aim of this research work is to build a new practical hybrid regression model to predict the milling tool wear in a regular cut as well as entry cut and exit cut of a milling tool. The model was based on Particle Swarm Optimization (PSO) in combination with support vector machines (SVMs). This optimization mechanism involved kernel parameter setting in the SVM training procedure, which significantly influences the regression accuracy. Bearing this in mind, a PSO-SVM-based model, which is based on the statistical learning theory, was successfully used here to predict the milling tool flank wear (output variable) as a function of the following input variables: the time duration of experiment, depth of cut, feed, type of material, etc. To accomplish the objective of this study, the experimental dataset represents experiments from runs on a milling machine under various operating conditions. In this way, data sampled by three different types of sensors (acoustic emission sensor, vibration sensor and current sensor) were acquired at several positions. A second aim is to determine the factors with the greatest bearing on the milling tool flank wear with a view to proposing milling machine's improvements. Firstly, this hybrid PSO-SVM-based regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the flank wear (output variable) and input variables (time, depth of cut, feed, etc.). Indeed, regression with optimal hyperparameters was performed and a determination coefficient of 0.95 was obtained. The agreement of this model with experimental data confirmed its good performance. Secondly, the main advantages of this PSO-SVM-based model are its capacity to produce a simple, easy-to-interpret model, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, the main conclusions of this study are exposed.

  2. Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles

    Directory of Open Access Journals (Sweden)

    Liying Yang

    2016-01-01

    Full Text Available Background. Precisely predicting cancer is crucial for cancer treatment. Gene expression profiles make it possible to analyze patterns between genes and cancers on the genome-wide scale. Gene expression data analysis, however, is confronted with enormous challenges because of its characteristics, such as high dimensionality, small sample size, and low Signal-to-Noise Ratio. Results. This paper proposes a method, termed RS_SVM, to predict gene expression profiles via aggregating SVM trained on random subspaces. After choosing gene features through statistical analysis, RS_SVM randomly selects feature subsets to yield random subspaces, trains SVM classifiers accordingly, and then aggregates the SVM classifiers to capture the advantage of ensemble learning. Experiments on eight real gene expression datasets are performed to validate the RS_SVM method. Experimental results show that RS_SVM achieved better classification accuracy and generalization performance compared with single SVM, K-nearest neighbor, decision tree, Bagging, AdaBoost, and the state-of-the-art methods. Experiments also explored the effect of subspace size on prediction performance. Conclusions. The proposed RS_SVM method yielded superior performance in analyzing gene expression profiles, which demonstrates that RS_SVM provides a good channel for such biological data.
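
    The random-subspace idea described above can be sketched as follows, using scikit-learn and NumPy. This is an illustration under stated assumptions, not the authors' RS_SVM: the preliminary statistical gene selection is omitted, majority voting is assumed as the aggregation rule, and the data, function names, and parameter values are made up.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC


def fit_random_subspace_svms(X, y, n_models=15, subspace_size=10, seed=0):
    """Train one linear SVM per randomly drawn feature subset."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        features = rng.choice(X.shape[1], size=subspace_size, replace=False)
        models.append((features, SVC(kernel="linear").fit(X[:, features], y)))
    return models


def predict_majority(models, X):
    """Aggregate the ensemble by majority vote over binary 0/1 labels."""
    votes = np.stack([clf.predict(X[:, features]) for features, clf in models])
    return (votes.mean(axis=0) > 0.5).astype(int)


# Synthetic stand-in for a small-sample, high-dimensional expression matrix.
X, y = make_classification(n_samples=80, n_features=200, n_informative=20,
                           random_state=0)
ensemble = fit_random_subspace_svms(X[:60], y[:60])
predictions = predict_majority(ensemble, X[60:])
```

    An odd number of models avoids tied votes in the binary case.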

  3. Application of SVM classifier in thermographic image classification for early detection of breast cancer

    Science.gov (United States)

    Oleszkiewicz, Witold; Cichosz, Paweł; Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał

    2016-09-01

    This article presents the application of machine learning algorithms for early detection of breast cancer on the basis of thermographic images. A supervised learning model, the support vector machine (SVM), was implemented together with the Sequential Minimal Optimization (SMO) algorithm for training the SVM classifier. The SVM classifier was included in a client-server application which makes it possible to create a training set of examinations and to apply classifiers (including SVM) for the diagnosis and early detection of breast cancer. The sensitivity and specificity of the SVM classifier were calculated based on the thermographic images from studies. Furthermore, a heuristic method for tuning the SVM's parameters was proposed.

  4. SPECTRAL RECONSTRUCTION BASED ON SVM FOR CROSS CALIBRATION

    Directory of Open Access Journals (Sweden)

    H. Gao

    2017-05-01

    Full Text Available The Chinese HY-1C/1D satellites will use a visible-near infrared (VNIR) hyperspectral sensor with 5 nm/10 nm resolution and a solar calibrator to cross-calibrate with other sensors. Because the hyperspectral radiance data are composed of average radiance in the sensor's passbands and bear a spectral smoothing effect, a transform from the hyperspectral radiance data to the 1-nm-resolution apparent spectral radiance by spectral reconstruction needs to be implemented. To solve the problem of noise accumulation and deterioration after several iterations of the iterative algorithm, a novel regression method based on SVM is proposed, which can closely approximate arbitrarily complex non-linear relationships and provides better generalization capability by learning. From a system point of view, the relationship between the apparent radiance and the equivalent radiance is a nonlinear mapping introduced by the spectral response function (SRF); SVM transforms the low-dimensional non-linear problem into a high-dimensional linear problem through a kernel function, obtaining a globally optimal solution by virtue of its quadratic form. The experiment is performed using 6S-simulated spectra that account for the SRF and SNR of the hyperspectral sensor, measured reflectance spectra of water bodies and different atmospheric conditions. The comparison shows: firstly, the proposed method has higher reconstruction accuracy, especially for the high-frequency signal; secondly, as the spectral resolution of the hyperspectral sensor decreases, the proposed method performs better than the iterative method; finally, the root mean square relative error (RMSRE), which is used to evaluate the difference between the reconstructed spectrum and the real spectrum over the whole spectral range, decreases by at least one time with the proposed method.

  5. LMethyR-SVM: Predict Human Enhancers Using Low Methylated Regions based on Weighted Support Vector Machines.

    Science.gov (United States)

    Xu, Jingting; Hu, Hong; Dai, Yang

    The identification of enhancers is a challenging task. Various types of epigenetic information including histone modification have been utilized in the construction of enhancer prediction models based on a diverse panel of machine learning schemes. However, DNA methylation profiles generated from the whole genome bisulfite sequencing (WGBS) have not been fully explored for their potential in enhancer prediction despite the fact that low methylated regions (LMRs) have been implied to be distal active regulatory regions. In this work, we propose a prediction framework, LMethyR-SVM, using LMRs identified from cell-type-specific WGBS DNA methylation profiles and a weighted support vector machine learning framework. In LMethyR-SVM, the set of cell-type-specific LMRs is further divided into three sets: reliable positive, likely positive and likely negative, according to their resemblance to a small set of experimentally validated enhancers in the VISTA database based on an estimated non-parametric density distribution. Then, the prediction model is obtained by solving a weighted support vector machine. We demonstrate the performance of LMethyR-SVM by using the WGBS DNA methylation profiles derived from the human embryonic stem cell type (H1) and the fetal lung fibroblast cell type (IMR90). The predicted enhancers are highly conserved with a reasonable validation rate based on a set of commonly used positive markers including transcription factors, p300 binding and DNase-I hypersensitive sites. In addition, we show evidence that a large fraction of the LMethyR-SVM predicted enhancers are not predicted by ChromHMM in the H1 cell type and they are more enriched for the FANTOM5 enhancers. Our work suggests that low methylated regions detected from the WGBS data are useful as complementary resources to histone modification marks in developing models for the prediction of cell-type-specific enhancers.

  6. Multi-view L2-SVM and its multi-view core vector machine.

    Science.gov (United States)

    Huang, Chengquan; Chung, Fu-lai; Wang, Shitong

    2016-03-01

    In this paper, a novel L2-SVM based classifier Multi-view L2-SVM is proposed to address multi-view classification tasks. The proposed Multi-view L2-SVM classifier does not have any bias in its objective function and hence has the flexibility like μ-SVC in the sense that the number of the yielded support vectors can be controlled by a pre-specified parameter. The proposed Multi-view L2-SVM classifier can make full use of the coherence and the difference of different views through imposing the consensus among multiple views to improve the overall classification performance. Besides, based on the generalized core vector machine GCVM, the proposed Multi-view L2-SVM classifier is extended into its GCVM version MvCVM which can realize its fast training on large scale multi-view datasets, with its asymptotic linear time complexity with the sample size and its space complexity independent of the sample size. Our experimental results demonstrated the effectiveness of the proposed Multi-view L2-SVM classifier for small scale multi-view datasets and the proposed MvCVM classifier for large scale multi-view datasets. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. An SVM-based solution for fault detection in wind turbines.

    Science.gov (United States)

    Santos, Pedro; Villa, Luisa F; Reñones, Aníbal; Bustillo, Andres; Maudes, Jesús

    2015-03-09

    Research into fault diagnosis in machines with a wide range of variable loads and speeds, such as wind turbines, is of great industrial interest. Analysis of the power signals emitted by wind turbines for the diagnosis of mechanical faults in their mechanical transmission chain is insufficient. A successful diagnosis requires the inclusion of accelerometers to evaluate vibrations. This work presents a multi-sensory system for fault diagnosis in wind turbines, combined with a data-mining solution for the classification of the operational state of the turbine. The selected sensors are accelerometers, in which vibration signals are processed using angular resampling techniques and electrical, torque and speed measurements. Support vector machines (SVMs) are selected for the classification task, including two traditional and two promising new kernels. This multi-sensory system has been validated on a test-bed that simulates the real conditions of wind turbines with two fault typologies: misalignment and imbalance. Comparison of SVM performance with the results of artificial neural networks (ANNs) shows that linear kernel SVM outperforms other kernels and ANNs in terms of accuracy, training and tuning times. The suitability and superior performance of linear SVM is also experimentally analyzed, to conclude that this data acquisition technique generates linearly separable datasets.

  8. An SVM-Based Solution for Fault Detection in Wind Turbines

    Directory of Open Access Journals (Sweden)

    Pedro Santos

    2015-03-01

    Full Text Available Research into fault diagnosis in machines with a wide range of variable loads and speeds, such as wind turbines, is of great industrial interest. Analysis of the power signals emitted by wind turbines for the diagnosis of mechanical faults in their mechanical transmission chain is insufficient. A successful diagnosis requires the inclusion of accelerometers to evaluate vibrations. This work presents a multi-sensory system for fault diagnosis in wind turbines, combined with a data-mining solution for the classification of the operational state of the turbine. The selected sensors are accelerometers, in which vibration signals are processed using angular resampling techniques and electrical, torque and speed measurements. Support vector machines (SVMs) are selected for the classification task, including two traditional and two promising new kernels. This multi-sensory system has been validated on a test-bed that simulates the real conditions of wind turbines with two fault typologies: misalignment and imbalance. Comparison of SVM performance with the results of artificial neural networks (ANNs) shows that linear kernel SVM outperforms other kernels and ANNs in terms of accuracy, training and tuning times. The suitability and superior performance of linear SVM is also experimentally analyzed, to conclude that this data acquisition technique generates linearly separable datasets.

  9. [Study on application of SVM in prediction of coronary heart disease].

    Science.gov (United States)

    Zhu, Yue; Wu, Jianghua; Fang, Ying

    2013-12-01

    Based on physical examination data for blood pressure, plasma lipid, Glu and UA, Support Vector Machine (SVM) was applied to distinguish coronary heart disease (CHD) patients from non-CHD individuals in a south China population, to guide further prevention and treatment of the disease. Firstly, the SVM classifier was built using the radial basis kernel function, linear kernel function and polynomial kernel function, respectively. Secondly, the SVM penalty factor C and kernel parameter sigma were optimized by particle swarm optimization (PSO) and then employed to diagnose and predict CHD. By comparison with results from an artificial neural network with the back propagation (BP) model, linear discriminant analysis, the logistic regression method and non-optimized SVM, the overall results of our calculation demonstrated that the classification performance of the optimized RBF-SVM model could be superior to the other classifier algorithms, with higher accuracy rate, sensitivity and specificity, which were 94.51%, 92.31% and 96.67%, respectively. It is therefore concluded that SVM could be used as a valid method for assisting diagnosis of CHD.
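
    The PSO step described above searches over the penalty factor C and kernel parameter sigma. A minimal pure-Python sketch of a particle swarm is given below; it is not the authors' implementation. The objective is a toy stand-in for a validation error over (log C, log sigma), and the inertia weight, acceleration constants, swarm size, and bounds are all assumed values. In practice the objective would be a cross-validated error of an RBF-SVM.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) over box bounds with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the new position to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical smooth error surface with its minimum at (1.0, -0.5)."""
    log_c, log_sigma = params
    return (log_c - 1.0) ** 2 + (log_sigma + 0.5) ** 2


best_params, best_err = pso_minimize(toy_validation_error,
                                     [(-3.0, 3.0), (-3.0, 3.0)])
```

    Searching in log space is a common choice for C and kernel width because useful values span several orders of magnitude.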

  10. Face Verification using MLP and SVM

    OpenAIRE

    Cardinaux, Fabien; Marcel, Sébastien

    2002-01-01

    The performance of machine learning algorithms, such as MLP or more recently SVM, has steadily improved over the past few years. In this paper, we compare two successful discriminant machine learning algorithms applied to the problem of face verification: MLP and SVM. These two algorithms are tested on a benchmark database, namely XM2VTS. Results show that an MLP is better than an SVM on this particular task.

  11. An SVM-Based Classifier for Estimating the State of Various Rotating Components in Agro-Industrial Machinery with a Vibration Signal Acquired from a Single Point on the Machine Chassis

    Directory of Open Access Journals (Sweden)

    Ruben Ruiz-Gonzalez

    2014-11-01

    Full Text Available The goal of this article is to assess the feasibility of estimating the state of various rotating components in agro-industrial machinery by employing just one vibration signal acquired from a single point on the machine chassis. To do so, a Support Vector Machine (SVM)-based system is employed. Experimental tests evaluated this system by acquiring vibration data from a single point of an agricultural harvester, while varying several of its working conditions. The whole process included two major steps. Initially, the vibration data were preprocessed through twelve feature extraction algorithms, after which the Exhaustive Search method selected the most suitable features. Secondly, the SVM-based system accuracy was evaluated by using Leave-One-Out cross-validation, with the selected features as the input data. The results of this study provide evidence that (i) accurate estimation of the status of various rotating components in agro-industrial machinery is possible by processing the vibration signal acquired from a single point on the machine structure; (ii) the vibration signal can be acquired with a uniaxial accelerometer, the orientation of which does not significantly affect the classification accuracy; and, (iii) when using an SVM classifier, an 85% mean cross-validation accuracy can be reached, which only requires a maximum of seven features as its input, and no significant improvements are noted between the use of either nonlinear or linear kernels.
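
    The Leave-One-Out evaluation mentioned above can be sketched with scikit-learn as follows. The data are synthetic stand-ins with seven features, not the harvester measurements, and the linear kernel and standardization step are assumptions made for the illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for seven selected vibration features (not the paper's data).
X, y = make_classification(n_samples=40, n_features=7, n_informative=4,
                           random_state=0)

# Scaling sits inside the pipeline so it is refit on each training fold.
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))

# One fold per sample: each score is 1.0 (held-out sample correct) or 0.0.
scores = cross_val_score(model, X, y, cv=LeaveOneOut())
mean_accuracy = scores.mean()
```

    Leave-One-Out is a natural fit for small datasets, since every sample serves once as the test case while the rest are used for training.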

  12. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics.

    Science.gov (United States)

    Huang, Shujun; Cai, Nianguang; Pacheco, Pedro Penzuti; Narrandes, Shavira; Wang, Yang; Xu, Wayne

    2018-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications. Copyright© 2018, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.

  13. LMD Based Features for the Automatic Seizure Detection of EEG Signals Using SVM.

    Science.gov (United States)

    Zhang, Tao; Chen, Wanzhong

    2017-08-01

    Achieving the goal of detecting seizure activity automatically using electroencephalogram (EEG) signals is of great importance and significance for the treatment of epileptic seizures. To realize this aim, a newly-developed time-frequency analytical algorithm, namely local mean decomposition (LMD), is employed in the presented study. LMD is able to decompose an arbitrary signal into a series of product functions (PFs). Primarily, the raw EEG signal is decomposed into several PFs, and then the temporal statistical and non-linear features of the first five PFs are calculated. The features of each PF are fed into five classifiers, including back propagation neural network (BPNN), K-nearest neighbor (KNN), linear discriminant analysis (LDA), un-optimized support vector machine (SVM) and SVM optimized by genetic algorithm (GA-SVM), for five classification cases, respectively. Confluent features of all PFs and raw EEG are further passed into the high-performance GA-SVM for the same classification tasks. Experimental results on the international public Bonn epilepsy EEG dataset show that the average classification accuracies of the presented approach are equal to or higher than 98.10% in all five cases, which indicates the effectiveness of the proposed approach for automated seizure detection.

  14. LaSVM-based big data learning system for dynamic prediction of air pollution in Tehran.

    Science.gov (United States)

    Ghaemi, Z; Alimohammadi, A; Farnaghi, M

    2018-04-20

    Due to critical impacts of air pollution, prediction and monitoring of air quality in urban areas are important tasks. However, because of the dynamic nature and high spatio-temporal variability, prediction of the air pollutant concentrations is a complex spatio-temporal problem. Distribution of pollutant concentration is influenced by various factors such as the historical pollution data and weather conditions. Conventional methods such as the support vector machine (SVM) or artificial neural networks (ANN) show some deficiencies when huge amount of streaming data have to be analyzed for urban air pollution prediction. In order to overcome the limitations of the conventional methods and improve the performance of urban air pollution prediction in Tehran, a spatio-temporal system is designed using a LaSVM-based online algorithm. Pollutant concentration and meteorological data along with geographical parameters are continually fed to the developed online forecasting system. Performance of the system is evaluated by comparing the prediction results of the Air Quality Index (AQI) with those of a traditional SVM algorithm. Results show an outstanding increase of speed by the online algorithm while preserving the accuracy of the SVM classifier. Comparison of the hourly predictions for next coming 24 h, with those of the measured pollution data in Tehran pollution monitoring stations shows an overall accuracy of 0.71, root mean square error of 0.54 and coefficient of determination of 0.81. These results are indicators of the practical usefulness of the online algorithm for real-time spatial and temporal prediction of the urban air quality.

  15. A SVM-based quantitative fMRI method for resting-state functional network detection.

    Science.gov (United States)

    Song, Xiaomu; Chen, Nan-kuei

    2014-09-01

    Resting-state functional magnetic resonance imaging (fMRI) aims to measure baseline neuronal connectivity independent of specific functional tasks and to capture changes in the connectivity due to neurological diseases. Most existing network detection methods rely on a fixed threshold to identify functionally connected voxels under the resting state. Due to fMRI non-stationarity, the threshold cannot adapt to variation of data characteristics across sessions and subjects, and generates unreliable mapping results. In this study, a new method is presented for resting-state fMRI data analysis. Specifically, the resting-state network mapping is formulated as an outlier detection process that is implemented using one-class support vector machine (SVM). The results are refined by using a spatial-feature domain prototype selection method and two-class SVM reclassification. The final decision on each voxel is made by comparing its probabilities of functionally connected and unconnected instead of a threshold. Multiple features for resting-state analysis were extracted and examined using an SVM-based feature selection method, and the most representative features were identified. The proposed method was evaluated using synthetic and experimental fMRI data. A comparison study was also performed with independent component analysis (ICA) and correlation analysis. The experimental results show that the proposed method can provide comparable or better network detection performance than ICA and correlation analysis. The method is potentially applicable to various resting-state quantitative fMRI studies. Copyright © 2014 Elsevier Inc. All rights reserved.
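
    The one-class SVM outlier-detection step named above can be illustrated with scikit-learn. The sketch covers only that step, on synthetic two-dimensional features; the prototype selection and two-class reclassification stages are not reproduced, and the `nu` and kernel settings are assumed values.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for per-voxel features: a dense bulk plus a few distant points.
bulk = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
distant = rng.normal(loc=6.0, scale=0.5, size=(10, 2))
features = np.vstack([bulk, distant])

# nu upper-bounds the fraction of training points treated as outliers.
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(features)
labels = detector.predict(features)   # +1 = inlier, -1 = outlier
n_outliers = int((labels == -1).sum())
```

    Unlike a fixed threshold, the decision boundary here is learned from the data, which is the adaptivity the abstract argues for.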

  16. A Study on SVM Based on the Weighted Elitist Teaching-Learning-Based Optimization and Application in the Fault Diagnosis of Chemical Process

    Directory of Open Access Journals (Sweden)

    Cao Junxiang

    2015-01-01

    Full Text Available Teaching-Learning-Based Optimization (TLBO) is a new swarm intelligence optimization algorithm that simulates the class learning process. To address problems of the traditional TLBO such as low optimizing efficiency and poor stability, this paper proposes an improved TLBO algorithm, mainly by introducing the elite thought in TLBO and adopting different inertia weight decreasing strategies for elite and ordinary individuals of the teacher stage and the student stage. In this paper, the validity of the improved TLBO is verified by the optimizations of several typical test functions, and the SVM optimized by the weighted elitist TLBO is used in the diagnosis and classification of common failure data of the TE chemical process. Compared with the SVM combined with other traditional optimizing methods, the SVM optimized by the weighted elitist TLBO shows a certain improvement in the accuracy of fault diagnosis and classification.

  17. Soft-sensing Modeling Based on MLS-SVM Inversion for L-lysine Fermentation Processes

    Directory of Open Access Journals (Sweden)

    Bo Wang

    2015-06-01

    Full Text Available A modeling approach based on multiple output variables least squares support vector machine (MLS-SVM) inversion is presented by a combination of inverse system and support vector machine theory. Firstly, a dynamic system model is developed based on material balance relation of a fed-batch fermentation process, with which it is analyzed whether an inverse system exists or not, and into which characteristic information of a fermentation process is introduced to set up an extended inversion model. Secondly, an initial extended inversion model is developed off-line by the use of the fitting capacity of MLS-SVM; on-line correction is made by the use of a differential evolution (DE) algorithm on the basis of deviation information. Finally, a combined pseudo-linear system is formed by means of a serial connection of a corrected extended inversion model behind the L-lysine fermentation processes; thereby crucial biochemical parameters of a fermentation process could be predicted on-line. The simulation experiment shows that this soft-sensing modeling method features very high prediction precision and can predict crucial biochemical parameters of L-lysine fermentation process very well.

  18. Application of SVM on satellite images to detect hotspots in Jharia coal field region of India

    Energy Technology Data Exchange (ETDEWEB)

    Gautam, R.S.; Singh, D.; Mittal, A.; Sajin, P. [Indian Institute for Technology, Roorkee (India)]

    2008-07-01

    The present paper deals with the application of Support Vector Machine (SVM) and image analysis techniques on NOAA/AVHRR satellite image to detect hotspots on the Jharia coal field region of India. One of the major advantages of using these satellite data is that the data are free with very good temporal resolution; while, one drawback is that these have low spatial resolution (i.e., approximately 1.1 km at nadir). Therefore, it is important to do research by applying some efficient optimization techniques along with the image analysis techniques to rectify these drawbacks and use satellite images for efficient hotspot detection and monitoring. For this purpose, SVM and multi-threshold techniques are explored for hotspot detection. The multi-threshold algorithm is developed to remove the cloud coverage from the land coverage. This algorithm also highlights the hotspots or fire spots in the suspected regions. SVM has the advantage over multi-thresholding technique that it can learn patterns from the examples and therefore is used to optimize the performance by removing the false points which are highlighted in the threshold technique. Both approaches can be used separately or in combination depending on the size of the image. The RBF (Radial Basis Function) kernel is used in training of three sets of inputs: brightness temperature of channel 3, Normalized Difference Vegetation Index (NDVI) and Global Environment Monitoring Index (GEMI), respectively. This makes a classified image in the output that highlights the hotspot and non-hotspot pixels. The performance of the SVM is also compared with the performance obtained from the neural networks and SVM appears to detect hotspots more accurately (greater than 91% classification accuracy) with lesser false alarm rate. The results obtained are found to be in good agreement with the ground based observations of the hotspots.

  19. Optimization of Support Vector Machine (SVM) for Object Classification

    Science.gov (United States)

    Scholten, Matthew; Dhingra, Neil; Lu, Thomas T.; Chao, Tien-Hsin

    2012-01-01

    The Support Vector Machine (SVM) is a powerful algorithm, useful in classifying data into species. The SVMs implemented in this research were used as classifiers for the final stage in a Multistage Automatic Target Recognition (ATR) system. A single kernel SVM known as SVMlight, and a modified version known as a SVM with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.

  20. A hybrid particle swarm optimization-SVM classification for automatic cardiac auscultation

    Directory of Open Access Journals (Sweden)

    Prasertsak Charoen

    2017-04-01

    Full Text Available Cardiac auscultation is a method for a doctor to listen to heart sounds, using a stethoscope, for examining the condition of the heart. Automatic cardiac auscultation with machine learning is a promising technique to classify heart conditions without need of doctors or expertise. In this paper, we develop a classification model based on support vector machine (SVM) and particle swarm optimization (PSO) for an automatic cardiac auscultation system. The model consists of two parts: a heart sound signal processing part and a proposed PSO for weighted SVM (WSVM) classifier part. In this method, the PSO takes into account the degree of importance for each feature extracted from wavelet packet (WP) decomposition. Then, by using principal component analysis (PCA), the features can be selected. The PSO technique is used to assign diverse weights to different features for the WSVM classifier. Experimental results show that both continuous and binary PSO-WSVM models achieve better classification accuracy on the heart sound samples, by reducing system false negatives (FNs), compared to traditional SVM and genetic algorithm (GA) based SVM.

  1. Solution Path for Pin-SVM Classifiers With Positive and Negative τ Values.

    Science.gov (United States)

    Huang, Xiaolin; Shi, Lei; Suykens, Johan A K

    2017-07-01

    Applying the pinball loss in a support vector machine (SVM) classifier results in pin-SVM. The pinball loss is characterized by a parameter τ. Its value is related to the quantile level and different τ values are suitable for different problems. In this paper, we establish an algorithm to find the entire solution path for pin-SVM with different τ values. This algorithm is based on the fact that the optimal solution to pin-SVM is continuous and piecewise linear with respect to τ. We also show that the nonnegativity constraint on τ is not necessary, i.e., τ can be extended to negative values. First, in some applications, a negative τ leads to better accuracy. Second, τ = -1 corresponds to a simple solution that links SVM and the classical kernel rule. The solution for τ = -1 can be obtained directly and then be used as a starting point of the solution path. The proposed method efficiently traverses τ values through the solution path, and then achieves good performance by a suitable τ. In particular, τ = 0 corresponds to C-SVM, meaning that the traversal algorithm can output a result at least as good as C-SVM with respect to validation error.
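
    The role of τ can be illustrated with a minimal sketch of the pinball loss itself. The abstract does not restate the loss, so the definition below (loss u for u ≥ 0 and -τu for u < 0, with u = 1 - y·f(x)) is taken from the usual pin-SVM formulation and should be read as an assumption; the function names are illustrative only.

```python
def pinball_loss(u: float, tau: float) -> float:
    """Pinball loss for a margin residual u = 1 - y * f(x) (assumed definition).

    u >= 0: the loss is u, exactly as for the hinge loss.
    u <  0: the loss is -tau * u, whereas the hinge loss would be 0.
    """
    if u >= 0:
        return u
    return -tau * u


def hinge_loss(u: float) -> float:
    """Standard hinge loss used by C-SVM, for comparison."""
    return max(0.0, u)


if __name__ == "__main__":
    # Columns: residual, pinball (tau=0.5), pinball (tau=0), hinge.
    for u in (-2.0, -0.5, 0.0, 1.5):
        print(u, pinball_loss(u, 0.5), pinball_loss(u, 0.0), hinge_loss(u))
```

    With τ = 0 the two losses coincide on every residual, which matches the statement that τ = 0 corresponds to C-SVM; with τ = -1 the loss reduces to u for all residuals.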

  2. GenSVM: a generalized multiclass support vector machine

    NARCIS (Netherlands)

    G.J.J. van den Burg (Gertjan); P.J.F. Groenen (Patrick)

    2016-01-01

    Traditional extensions of the binary support vector machine (SVM) to multiclass problems are either heuristics or require solving a large dual optimization problem. Here, a generalized multiclass SVM called GenSVM is proposed. In this method classification boundaries for a K-class

  3. Tuning to optimize SVM approach for assisting ovarian cancer diagnosis with photoacoustic imaging.

    Science.gov (United States)

    Wang, Rui; Li, Rui; Lei, Yanyan; Zhu, Quing

    2015-01-01

    Support vector machine (SVM) is one of the most effective classification methods for cancer detection. The efficiency and quality of an SVM classifier depend strongly on several important features and a set of proper parameters. Here, a series of classification analyses, with one set of photoacoustic data from ovarian tissues ex vivo and a widely used breast cancer dataset, the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, revealed how the accuracy of an SVM classification varies with the number of features used and the parameters selected. A pattern recognition system is proposed by means of SVM-Recursive Feature Elimination (RFE) with the Radial Basis Function (RBF) kernel. To improve the effectiveness and robustness of the system, an optimized tuning ensemble algorithm called SVM-RFE(C) with correlation filter was implemented to quantify feature and parameter information based on cross validation. The proposed algorithm is first shown to outperform SVM-RFE on WDBC. Then the best accuracy of 94.643% and sensitivity of 94.595% were achieved when using SVM-RFE(C) to test 57 new PAT data from 19 patients. The experimental results show that the classifier constructed with the SVM-RFE(C) algorithm is able to learn additional information from new data and has significant potential in ovarian cancer diagnosis.

  4. COMPARISON OF SVM AND FUZZY CLASSIFIER FOR AN INDIAN SCRIPT

    Directory of Open Access Journals (Sweden)

    M. J. Baheti

    2012-01-01

    Full Text Available With the advent of the technological era, conversion of scanned documents (handwritten or printed) into machine-editable format has attracted many researchers. This paper deals with the problem of recognition of Gujarati handwritten numerals. Gujarati numeral recognition requires performing some specific steps as a part of preprocessing. For preprocessing, digitization, segmentation, normalization and thinning are done, assuming that the image has almost no noise. Further, an affine invariant moments based model is used for feature extraction, and finally Support Vector Machine (SVM) and Fuzzy classifiers are used for numeral classification. The comparison of SVM and Fuzzy classifiers is made, and it can be seen that SVM procured better results as compared to the Fuzzy classifier.

  5. An improved chaotic fruit fly optimization based on a mutation strategy for simultaneous feature selection and parameter optimization for SVM and its applications.

    Science.gov (United States)

    Ye, Fei; Lou, Xin Yuan; Sun, Lin Fu

    2017-01-01

    This paper proposes a new support vector machine (SVM) optimization scheme based on an improved chaotic fruit fly optimization algorithm (FOA) with a mutation strategy to simultaneously perform parameter tuning for the SVM and feature selection. In the improved FOA, the chaotic particle initializes the fruit fly swarm location and replaces the expression of distance for the fruit fly to find the food source. However, the proposed mutation strategy uses two distinct generative mechanisms for new food sources at the osphresis phase, allowing the algorithm procedure to search for the optimal solution in both the whole solution space and within the local solution space containing the fruit fly swarm location. In an evaluation based on a group of ten benchmark problems, the proposed algorithm's performance is compared with that of other well-known algorithms, and the results support the superiority of the proposed algorithm. Moreover, this algorithm is successfully applied in an SVM to perform both parameter tuning for the SVM and feature selection to solve real-world classification problems. This method is called chaotic fruit fly optimization algorithm (CIFOA)-SVM and has been shown to be a more robust and effective optimization method than other well-known methods, particularly in terms of solving the medical diagnosis problem and the credit card problem.

  6. Introducing instrumental variables in the LS-SVM based identification framework

    NARCIS (Netherlands)

    Laurain, V.; Zheng, W-X.; Toth, R.

    2011-01-01

    Least-Squares Support Vector Machines (LS-SVM) represent a promising approach to identify nonlinear systems via nonparametric estimation of the nonlinearities in a computationally and stochastically attractive way. All the methods dedicated to the solution of this problem rely on the minimization of

  7. Entropy-based gene ranking without selection bias for the predictive classification of microarray data

    Directory of Open Access Journals (Sweden)

    Serafini Maria

    2003-11-01

    Full Text Available Background: We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). Results: With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Conclusions: Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.
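
    To make the idea of an entropy measure of the SVM weights distribution concrete, the sketch below computes the Shannon entropy of a histogram of weight magnitudes. It is not the authors' E-RFE procedure: the rescaling to [0, 1], the number of bins, and the function name are all assumptions chosen for illustration.

```python
import math


def weight_entropy(weights, n_bins=10):
    """Shannon entropy (in bits) of a histogram of |weights| rescaled to [0, 1].

    Hypothetical helper: binning and rescaling are illustrative assumptions,
    not details taken from the E-RFE abstract.
    """
    magnitudes = [abs(w) for w in weights]
    lo, hi = min(magnitudes), max(magnitudes)
    if hi == lo:
        return 0.0  # every weight falls in a single bin
    counts = [0] * n_bins
    for m in magnitudes:
        scaled = (m - lo) / (hi - lo)
        # The largest value (scaled == 1.0) is placed in the last bin.
        counts[min(int(scaled * n_bins), n_bins - 1)] += 1
    total = len(magnitudes)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


if __name__ == "__main__":
    concentrated = [0.01] * 95 + [1.0] * 5   # most genes carry tiny weights
    spread = [i / 99 for i in range(100)]    # weights spread evenly
    print(weight_entropy(concentrated), weight_entropy(spread))
```

    A low entropy indicates that most weights sit in a few bins, which is the kind of situation where removing a whole chunk of low-weight genes in one step, rather than one gene at a time, is plausible.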

  8. Nonlinear Time Series Prediction Using LS-SVM with Chaotic Mutation Evolutionary Programming for Parameter Optimization

    International Nuclear Information System (INIS)

    Xu Ruirui; Chen Tianlun; Gao Chengfeng

    2006-01-01

    Nonlinear time series prediction is studied by using an improved least squares support vector machine (LS-SVM) regression based on chaotic mutation evolutionary programming (CMEP) approach for parameter optimization. We analyze how the prediction error varies with different parameters (σ, γ) in LS-SVM. In order to select appropriate parameters for the prediction model, we employ CMEP algorithm. Finally, Nasdaq stock data are predicted by using this LS-SVM regression based on CMEP, and satisfactory results are obtained.

  9. FaaPred: a SVM-based prediction method for fungal adhesins and adhesin-like proteins.

    Directory of Open Access Journals (Sweden)

    Jayashree Ramana

    Full Text Available Adhesion constitutes one of the initial stages of infection in microbial diseases and is mediated by adhesins. Hence, identification and comprehensive knowledge of adhesins and adhesin-like proteins is essential to understand adhesin mediated pathogenesis and how to exploit its therapeutic potential. However, the knowledge about fungal adhesins is rudimentary compared to that of bacterial adhesins. In addition to host cell attachment and mating, the fungal adhesins play a significant role in homotypic and xenotypic aggregation, foraging and biofilm formation. Experimental identification of fungal adhesins is labor- as well as time-intensive. In this work, we present a Support Vector Machine (SVM)-based method for the prediction of fungal adhesins and adhesin-like proteins. The SVM models were trained with different compositional features, namely, amino acid, dipeptide, multiplet fractions, charge and hydrophobic compositions, as well as PSI-BLAST derived PSSM matrices. The best classifiers are based on compositional properties as well as PSSM and yield an overall accuracy of 86%. The prediction method based on the best classifiers is freely accessible as a world wide web based server at http://bioinfo.icgeb.res.in/faap. This work will aid rapid and rational identification of fungal adhesins, expedite the pace of experimental characterization of novel fungal adhesins and enhance our knowledge about the role of adhesins in fungal infections.

  10. A SVM-based method for sentiment analysis in Persian language

    Science.gov (United States)

    Hajmohammadi, Mohammad Sadegh; Ibrahim, Roliana

    2013-03-01

    Persian is the official language of Iran, Tajikistan and Afghanistan. Local online users often express their opinions and experiences on the web in written Persian. Although the information in those reviews is valuable to potential consumers and sellers, the huge number of web reviews makes it difficult to give an unbiased evaluation of a product. In this paper, the standard machine learning techniques SVM and naive Bayes are applied to the domain of online Persian movie reviews to automatically classify user reviews as positive or negative, and the performance of these two classifiers is compared in this language. The effects of feature presentations on classification performance are discussed. We find that accuracy is influenced by the interaction between the classification models and the feature options. The SVM classifier achieves accuracy as good as or better than naive Bayes on Persian movie reviews. Unigrams prove to be better features than bigrams and trigrams in capturing Persian sentiment orientation.

  11. An improved chaotic fruit fly optimization based on a mutation strategy for simultaneous feature selection and parameter optimization for SVM and its applications

    Science.gov (United States)

    Lou, Xin Yuan; Sun, Lin Fu

    2017-01-01

    This paper proposes a new support vector machine (SVM) optimization scheme based on an improved chaotic fruit fly optimization algorithm (FOA) with a mutation strategy to simultaneously perform parameter tuning for the SVM and feature selection. In the improved FOA, the chaotic particle initializes the fruit fly swarm location and replaces the expression of distance for the fruit fly to find the food source. However, the proposed mutation strategy uses two distinct generative mechanisms for new food sources at the osphresis phase, allowing the algorithm procedure to search for the optimal solution in both the whole solution space and within the local solution space containing the fruit fly swarm location. In an evaluation based on a group of ten benchmark problems, the proposed algorithm’s performance is compared with that of other well-known algorithms, and the results support the superiority of the proposed algorithm. Moreover, this algorithm is successfully applied in an SVM to perform both parameter tuning for the SVM and feature selection to solve real-world classification problems. This method is called chaotic fruit fly optimization algorithm (CIFOA)-SVM and has been shown to be a more robust and effective optimization method than other well-known methods, particularly in terms of solving the medical diagnosis problem and the credit card problem. PMID:28369096

  12. Gene expression profiles reveal key genes for early diagnosis and treatment of adamantinomatous craniopharyngioma.

    Science.gov (United States)

    Yang, Jun; Hou, Ziming; Wang, Changjiang; Wang, Hao; Zhang, Hongbing

    2018-04-23

    Adamantinomatous craniopharyngioma (ACP) is an aggressive brain tumor that occurs predominantly in the pediatric population. Conventional diagnostic methods and standard therapy cannot treat ACPs effectively. In this paper, we aimed to identify key genes for ACP early diagnosis and treatment. Datasets GSE94349 and GSE68015 were obtained from the Gene Expression Omnibus database. Consensus clustering was applied to discover the gene clusters in the expression data of GSE94349, and functional enrichment analysis was performed on the gene set in each cluster. The protein-protein interaction (PPI) network was built by the Search Tool for the Retrieval of Interacting Genes, and hubs were selected. A support vector machine (SVM) model was built based on the signature genes identified from enrichment analysis and the PPI network. Dataset GSE94349 was used for training and testing, and GSE68015 was used for validation. In addition, RT-qPCR analysis was performed to analyze the expression of signature genes in ACP samples compared with normal controls. Seven gene clusters were discovered in the differentially expressed genes identified from the GSE94349 dataset. Enrichment analysis of each cluster identified 25 pathways that were highly associated with ACP. The PPI network was built and 46 hubs were determined. Twenty-five pathway-related genes that overlapped with the hubs in the PPI network were used as signatures to establish the SVM diagnosis model for ACP. The prediction accuracies of the SVM model for training, testing, and validation data were 94, 85, and 74%, respectively. The expression of CDH1, CCL2, ITGA2, COL8A1, COL6A2, and COL6A3 was significantly upregulated in ACP tumor samples, while CAMK2A, RIMS1, NEFL, SYT1, and STX1A were significantly downregulated, which was consistent with the differentially expressed gene analysis. The SVM model is a promising classification tool for screening and early diagnosis of ACP. The ACP-related pathways and signature genes will advance our knowledge of ACP pathogenesis

  13. [Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].

    Science.gov (United States)

    Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao

    2016-03-01

    Leaf area index (LAI) is the dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively. It can provide a reference for monitoring tree growth and estimating yield. Red Fuji apple trees in the full fruit-bearing period were the research objects. The canopy spectral reflectance and LAI values of ninety apple trees were measured with the ASD Fieldspec3 spectrometer and LAI-2200 in thirty orchards over two consecutive years in the Qixia research area of Shandong Province. The optimal vegetation indices were selected by correlation analysis of the original spectral reflectance and vegetation indices. The models for predicting the LAI were built with the multivariate regression methods of support vector machine (SVM) and random forest (RF). The new vegetation indices, GNDVI527, NDVI676, RVI682, FD-NVI656 and GRVI517, and the two main existing vegetation indices, NDVI670 and NDVI705, are in accordance with LAI. In the RF regression model, the calibration set coefficient of determination C-R2 of 0.920 and the validation set coefficient of determination V-R2 of 0.889 are higher than those of the SVM regression model by 0.045 and 0.033, respectively. The root mean square error of the calibration set, C-RMSE, of 0.249 and the root mean square error of the validation set, V-RMSE, of 0.236 are lower than those of the SVM regression model by 0.054 and 0.058, respectively. The relative analysis errors of the calibration set (C-RPD) and validation set (V-RPD) reached 3.363 and 2.520, exceeding those of the SVM regression model by 0.598 and 0.262, respectively. The slopes of the trend lines in the measured-versus-predicted scatterplots for the calibration and validation sets (C-S and V-S) are close to 1. The estimation result of the RF regression model is better than that of the SVM. The RF regression model can be used to estimate the LAI of red Fuji apple trees in the full fruit period.

  14. Simultaneous data pre-processing and SVM classification model selection based on a parallel genetic algorithm applied to spectroscopic data of olive oils.

    Science.gov (United States)

    Devos, Olivier; Downey, Gerard; Duponchel, Ludovic

    2014-04-01

    Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However, such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on a genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean-centred data, GENOPT-SVM) have been tested and statistically compared using McNemar's statistical test. For the two datasets, SVM with optimised pre-processing gives models with higher accuracy than the one obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps are required to obtain an SVM model with significant accuracy improvement (82.2%) compared to the one obtained with PLS-DA (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates. Copyright © 2013 Elsevier Ltd. All rights reserved.
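
    The abstract names McNemar's test for comparing classifiers on the same samples. A minimal sketch of the exact (binomial) form of that test is given below; the abstract does not say which variant the authors used, so the exact two-sided form and the function name are assumptions.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Exact two-sided McNemar test on paired per-sample correctness flags.

    Returns (b, c, p_value), where b counts samples only model A classified
    correctly and c counts samples only model B classified correctly.
    """
    b = sum(1 for a, o in zip(correct_a, correct_b) if a and not o)
    c = sum(1 for a, o in zip(correct_a, correct_b) if not a and o)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the two models never disagree
    # Under the null hypothesis, each disagreement favours A or B with p = 0.5.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical correctness flags for two models on eight test spectra.
    model_a = [True, True, True, True, True, True, False, True]
    model_b = [False, False, False, False, False, False, True, True]
    print(mcnemar_exact(model_a, model_b))
```

    Only the samples on which the two models disagree enter the statistic, which is why the test suits comparisons such as PLS-DA versus SVM evaluated on one shared test set.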

  15. Estimation of hydraulic jump characteristics of channels with sudden diverging side walls via SVM.

    Science.gov (United States)

    Roushangar, Kiyoumars; Valizadeh, Reyhaneh; Ghasempour, Roghayeh

    2017-10-01

    Sudden diverging channels are one of the energy dissipaters which can dissipate most of the kinetic energy of the flow through a hydraulic jump. An accurate prediction of hydraulic jump characteristics is an important step in designing hydraulic structures. This paper focuses on the capability of the support vector machine (SVM) as a meta-model approach for predicting hydraulic jump characteristics in different sudden diverging stilling basins (i.e. basins with and without appurtenances). In this regard, different models were developed and tested using 1,018 experimental data points. The obtained results proved the capability of the SVM technique in predicting hydraulic jump characteristics, and it was found that the developed models for a channel with a central block performed more successfully than models for channels without appurtenances or with a negative step. The best performance for the length of the hydraulic jump was obtained for the model with parameters F1 (Froude number) and (h2 - h1)/h1 (h1 and h2 are the sequent depths upstream and downstream, respectively). Concerning the relative energy dissipation and sequent depth ratio, the model with parameters F1 and h1/B (B is expansion ratio) led to the best results. According to the outcome of sensitivity analysis, the Froude number had the most significant effect on the modeling. A comparison between SVM and empirical equations also indicated the superior performance of the SVM.

  16. A novel application of wavelet based SVM to transient phenomena identification of power transformers

    International Nuclear Information System (INIS)

    Jazebi, S.; Vahidi, B.; Jannati, M.

    2011-01-01

    A novel differential protection approach is introduced in the present paper. The proposed scheme is a combination of Support Vector Machine (SVM) and wavelet transform theories. Two common transients such as magnetizing inrush current and internal fault are considered. A new wavelet feature is extracted which reduces the computational cost and enhances the discrimination accuracy of SVM. Particle swarm optimization technique (PSO) has been applied to tune SVM parameters. The suitable performance of this method is demonstrated by simulation of different faults and switching conditions on a power transformer in PSCAD/EMTDC software. The method has the advantages of high accuracy and low computational burden (less than a quarter of a cycle). The other advantage is that the method is not dependent on a specific threshold. Sympathetic and recovery inrush currents also have been simulated and investigated. Results show that the proposed method could remain stable even in noisy environments.

  17. A DWT and SVM based method for rolling element bearing fault diagnosis and its comparison with Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Sunil Tyagi

    2017-04-01

    Full Text Available A classification technique using a Support Vector Machine (SVM) classifier for detection of rolling element bearing faults is presented here. The SVM was fed with features that were extracted from vibration signals obtained from an experimental setup consisting of a rotating driveline mounted on rolling element bearings, which were run in normal conditions and with artificially induced faults. The time-domain vibration signals were divided into 40 segments, and simple features such as peaks in the time domain and spectrum, along with statistical features such as standard deviation, skewness, kurtosis etc., were extracted. The effectiveness of the SVM classifier was compared with the performance of an Artificial Neural Network (ANN) classifier, and it was found that the performance of the SVM classifier is superior to that of the ANN. The effect of pre-processing the vibration signal by Discrete Wavelet Transform (DWT) prior to feature extraction is also studied, and it is shown that pre-processing the vibration signal with DWT enhances the effectiveness of both ANN and SVM classifiers. The experimental results demonstrate that the SVM classifier performs better than the ANN in detecting bearing condition and that pre-processing the vibration signal with DWT improves the performance of the SVM classifier.
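
    The feature extraction step described above (split the signal into segments, then compute a peak value and statistical moments per segment) can be sketched as follows. The feature set, the use of population moments, and all names are assumptions for illustration; the abstract does not give the authors' exact definitions, and the spectral peaks and DWT step are left out.

```python
import math


def segment_features(signal, n_segments=40):
    """Split a time-domain signal into equal segments and compute simple features.

    Returns one dict per segment with peak amplitude, standard deviation,
    skewness and kurtosis (population moments, an illustrative choice).
    Samples left over after the last full segment are ignored.
    """
    seg_len = len(signal) // n_segments
    if seg_len < 2:
        raise ValueError("signal too short for the requested number of segments")
    features = []
    for i in range(n_segments):
        seg = signal[i * seg_len:(i + 1) * seg_len]
        mean = sum(seg) / seg_len
        std = math.sqrt(sum((x - mean) ** 2 for x in seg) / seg_len)
        if std == 0:
            skew = kurt = 0.0  # constant segment: moments are undefined, use 0
        else:
            skew = sum((x - mean) ** 3 for x in seg) / seg_len / std ** 3
            kurt = sum((x - mean) ** 4 for x in seg) / seg_len / std ** 4
        features.append({
            "peak": max(abs(x) for x in seg),
            "std": std,
            "skewness": skew,
            "kurtosis": kurt,
        })
    return features


if __name__ == "__main__":
    # Synthetic stand-in for a vibration signal: 40 periods of a sine wave.
    signal = [math.sin(2 * math.pi * k / 20) for k in range(800)]
    print(segment_features(signal)[0])
```

    Each resulting dict would become one feature vector for a classifier such as an SVM; impulsive bearing faults typically raise the kurtosis and peak values of the affected segments.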

  18. An intelligent framework for medical image retrieval using MDCT and multi SVM.

    Science.gov (United States)

    Balan, J A Alex Rajju; Rajan, S Edward

    2014-01-01

    Large volumes of medical images are rapidly generated in the medical field, and managing them effectively has become a great challenge. This paper studies the development of innovative medical image retrieval based on texture features and accuracy. The objective of the paper is to analyze image retrieval based on diagnosis in healthcare management systems. This paper traces the development of innovative medical image retrieval to estimate both the image texture features and accuracy. The texture features of medical images are extracted using MDCT and multi SVM. Both the theoretical approach and the simulation results revealed interesting observations, and they were corroborated using MDCT coefficients and SVM methodology. All attempts to extract the data about the image in response to the query have been computed successfully, and perfect image retrieval performance has been obtained. Experimental results on a database of 100 trademark medical images show that an integrated texture feature representation results in 98% of the images being retrieved using MDCT and multi SVM. Thus we have studied a multiclassification technique based on SVM which is well suited to medical images. The results show retrieval accuracies of 98% and 99% for different sets of medical images with respect to the class of image.

  19. Throughput Maximization Using an SVM for Multi-Class Hypothesis-Based Spectrum Sensing in Cognitive Radio

    Directory of Open Access Journals (Sweden)

    Sana Ullah Jan

    2018-03-01

    Full Text Available A framework of spectrum sensing with a multi-class hypothesis is proposed to maximize the achievable throughput in cognitive radio networks. The energy range of a sensing signal under the hypothesis that the primary user is absent (in a conventional two-class hypothesis) is further divided into quantized regions, whereas the hypothesis that the primary user is present is conserved. The non-radio frequency energy harvesting-equipped secondary user transmits, when the primary user is absent, with transmission power based on the hypothesis result (the energy level of the sensed signal) and the residual energy in the battery: the lower the energy of the received signal, the higher the transmission power, and vice versa. Conversely, the lower the residual energy in the node, the lower the transmission power. This technique increases the throughput of a secondary link by providing a higher number of transmission events, compared to the conventional two-class hypothesis. Furthermore, transmission with low power for higher energy levels in the sensed signal reduces the probability of interference with primary users if, for instance, detection was missed. The familiar machine learning algorithm known as a support vector machine (SVM) is used in a one-versus-rest approach to classify the input signal into predefined classes. The input signal to the SVM is composed of three statistical features extracted from the sensed signal and a number ranging from 0 to 100 representing the percentage of residual energy in the node’s battery. To increase the generalization of the classifier, k-fold cross-validation is utilized in the training phase. The experimental results show that an SVM with the given features performs satisfactorily for all kernels, but an SVM with a polynomial kernel outperforms linear and radial-basis function kernels in terms of accuracy. Furthermore, the proposed multi-class hypothesis achieves higher throughput compared to the

  20. A Comparison Study of Different Kernel Functions for SVM-Based Classification of Multi-Temporal Polarimetry SAR Data

    Science.gov (United States)

    Yekkehkhany, B.; Safari, A.; Homayouni, S.; Hasanlou, M.

    2014-10-01

    In this paper, a framework is developed based on Support Vector Machines (SVM) for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imagery. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates with respect to single-date data. Several kernel functions are employed and compared in this study for mapping the input space to a higher-dimensional Hilbert space. These kernel functions include linear, polynomial and Radial Basis Function (RBF) kernels. The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of the H/A/α decomposition method are used in classification. The experimental tests show that an SVM classifier with an RBF kernel for three dates of data increases the Overall Accuracy (OA) by up to 3% in comparison to using a linear kernel function, and by up to 1% in comparison to a 3rd-degree polynomial kernel function.
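
    For reference, the three kernel families compared in this record can be written out in a few lines. The forms below are the standard textbook definitions; the parameter names (gamma, coef0, degree) and their default values are assumptions, since the abstract reports only that a 3rd-degree polynomial was among the kernels tested.

```python
import math


def linear_kernel(x, z):
    """k(x, z) = <x, z>"""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """k(x, z) = (gamma * <x, z> + coef0) ** degree"""
    return (gamma * linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    """k(x, z) = exp(-gamma * ||x - z||^2)"""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


if __name__ == "__main__":
    # Two hypothetical feature vectors (e.g. alpha-angle values on two dates).
    x, z = [1.0, 2.0], [3.0, 4.0]
    print(linear_kernel(x, z), polynomial_kernel(x, z), rbf_kernel(x, z, gamma=0.5))
```

    The RBF kernel depends only on the distance between the two vectors and always equals 1 for identical inputs, whereas the linear and polynomial kernels grow with the magnitude of the features, which is one reason feature scaling matters when comparing them.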

  1. CNN-SVM for Microvascular Morphological Type Recognition with Data Augmentation.

    Science.gov (United States)

    Xue, Di-Xiu; Zhang, Rong; Feng, Hui; Wang, Ya-Lei

    2016-01-01

    This paper focuses on the problem of feature extraction and the classification of microvascular morphological types to aid esophageal cancer detection. We present a patch-based system with a hybrid SVM model with data augmentation for intraepithelial papillary capillary loop recognition. A greedy patch-generating algorithm and a specialized CNN named NBI-Net are designed to extract hierarchical features from patches. We investigate a series of data augmentation techniques to progressively improve the prediction invariance of image scaling and rotation. For classifier boosting, SVM is used as an alternative to softmax to enhance generalization ability. The effectiveness of CNN feature representation ability is discussed for a set of widely used CNN models, including AlexNet, VGG-16, and GoogLeNet. Experiments are conducted on the NBI-ME dataset. The recognition rate is up to 92.74% on the patch level with data augmentation and classifier boosting. The results show that the combined CNN-SVM model beats models of traditional features with SVM as well as the original CNN with softmax. The synthesis results indicate that our system is able to assist clinical diagnosis to a certain extent.

  2. SVM Pixel Classification on Colour Image Segmentation

    Science.gov (United States)

    Barui, Subhrajit; Latha, S.; Samiappan, Dhanalakshmi; Muthu, P.

    2018-04-01

    The aim of image segmentation is to simplify the representation of an image by clustering pixels into something meaningful to analyze. Segmentation is typically used to locate boundaries and curves in an image, and more precisely to label every pixel in an image so that each pixel has an independent identity. SVM pixel classification for colour image segmentation is the topic of this paper. It has useful applications in the fields of concept-based image retrieval, machine vision, medical imaging and object detection. The process is accomplished step by step. First, we need to recognize the type of colour and the texture used as input to the SVM classifier. These inputs are extracted via a local spatial similarity measure model and a steerable filter, also known as a Gabor filter. It is then trained using FCM (Fuzzy C-Means). Both the pixel-level information of the image and the ability of the SVM classifier undergo a sophisticated algorithm to form the final image. The method yields a well-segmented image and is efficient, with increased quality and faster processing of the segmented image compared with segmentation methods proposed earlier. One of the latest application results is the Light L16 camera.

  3. Intelligent Agent-Based Intrusion Detection System Using Enhanced Multiclass SVM

    Science.gov (United States)

    Ganapathy, S.; Yogesh, P.; Kannan, A.

    2012-01-01

    Intrusion detection systems were used in the past along with various techniques to detect intrusions in networks effectively. However, most of these systems are able to detect the intruders only with high false alarm rate. In this paper, we propose a new intelligent agent-based intrusion detection model for mobile ad hoc networks using a combination of attribute selection, outlier detection, and enhanced multiclass SVM classification methods. For this purpose, an effective preprocessing technique is proposed that improves the detection accuracy and reduces the processing time. Moreover, two new algorithms, namely, an Intelligent Agent Weighted Distance Outlier Detection algorithm and an Intelligent Agent-based Enhanced Multiclass Support Vector Machine algorithm are proposed for detecting the intruders in a distributed database environment that uses intelligent agents for trust management and coordination in transaction processing. The experimental results of the proposed model show that this system detects anomalies with low false alarm rate and high-detection rate when tested with KDD Cup 99 data set. PMID:23056036

  4. An improved PSO-SVM model for online recognition defects in eddy current testing

    Science.gov (United States)

    Liu, Baoling; Hou, Dibo; Huang, Pingjie; Liu, Banteng; Tang, Huayi; Zhang, Wubo; Chen, Peihua; Zhang, Guangxin

    2013-12-01

    Accurate and rapid recognition of defects is essential for structural integrity and health monitoring of in-service device using eddy current (EC) non-destructive testing. This paper introduces a novel model-free method that includes three main modules: a signal pre-processing module, a classifier module and an optimisation module. In the signal pre-processing module, a kind of two-stage differential structure is proposed to suppress the lift-off fluctuation that could contaminate the EC signal. In the classifier module, multi-class support vector machine (SVM) based on one-against-one strategy is utilised for its good accuracy. In the optimisation module, the optimal parameters of classifier are obtained by an improved particle swarm optimisation (IPSO) algorithm. The proposed IPSO technique can improve convergence performance of the primary PSO through the following strategies: nonlinear processing of inertia weight, introductions of the black hole and simulated annealing model with extremum disturbance. The good generalisation ability of the IPSO-SVM model has been validated through adding additional specimen into the testing set. Experiments show that the proposed algorithm can achieve higher recognition accuracy and efficiency than other well-known classifiers and the superiorities are more obvious with less training set, which contributes to online application.

  5. Novel Hybrid of LS-SVM and Kalman Filter for GPS/INS Integration

    Science.gov (United States)

    Xu, Zhenkai; Li, Yong; Rizos, Chris; Xu, Xiaosu

    Integration of Global Positioning System (GPS) and Inertial Navigation System (INS) technologies can overcome the drawbacks of the individual systems. One of the advantages is that the integrated solution can provide continuous navigation capability even during GPS outages. However, bridging the GPS outages is still a challenge when Micro-Electro-Mechanical System (MEMS) inertial sensors are used. Methods being currently explored by the research community include applying vehicle motion constraints, optimal smoother, and artificial intelligence (AI) techniques. In the research area of AI, the neural network (NN) approach has been extensively utilised up to the present. In an NN-based integrated system, a Kalman filter (KF) estimates position, velocity and attitude errors, as well as the inertial sensor errors, to output navigation solutions while GPS signals are available. At the same time, an NN is trained to map the vehicle dynamics with corresponding KF states, and to correct INS measurements when GPS measurements are unavailable. To achieve good performance it is critical to select suitable quality and an optimal number of samples for the NN. This is sometimes too rigorous a requirement which limits real world application of NN-based methods. The support vector machine (SVM) approach is based on the structural risk minimisation principle, instead of the minimised empirical error principle that is commonly implemented in an NN. The SVM can avoid local minimisation and over-fitting problems in an NN, and therefore potentially can achieve a higher level of global performance. This paper focuses on the least squares support vector machine (LS-SVM), which can solve highly nonlinear and noisy black-box modelling problems. This paper explores the application of the LS-SVM to aid the GPS/INS integrated system, especially during GPS outages. The paper describes the principles of the LS-SVM and of the KF hybrid method, and introduces the LS-SVM regression algorithm.
Field

  6. COMPARISON OF PERFORMANCES OF DIFFERENT SVM IMPLEMENTATIONS WHEN USED FOR AUTOMATED EVALUATION OF DESCRIPTIVE ANSWERS

    Directory of Open Access Journals (Sweden)

    C. Sunil Kumar

    2015-04-01

    Full Text Available In this paper, we studied the performances of models built using various SVM implementations during the multiclass classification task of automated evaluation of descriptive answers. The performances were evaluated on five datasets each with 900 samples and with each of the datasets treated using symmetric uncertainty feature selection filter. We quantitatively analyzed the best SVM implementation technique from amongst the 17 different SVM implementation combinations derived by using various SVM classifier libraries, SVM types and Kernel methods. Accuracy, F Score, Kappa and Area under ROC curve are used as model evaluation metrics in order to evaluate the models and rank them according to their performances. Based on the results, we derived the conclusion that SMO classifier when used with Polynomial kernel is the overall best performing classifier applicable for auto evaluation of descriptive answers.

  7. Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

    Science.gov (United States)

    Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

    2014-11-01

    As many structures of protein-DNA complexes have been known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, its inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure 66.3% and a MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and a MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and a MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. 
To the best of
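    The accuracy, F-measure and Matthews correlation coefficient (MCC) quoted in this record can all be derived from the four confusion-matrix counts. The following is a minimal sketch in Python, assuming a binary binding/non-binding labelling; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, F-measure and MCC from confusion-matrix counts.

    tp/fp/tn/fn are hypothetical counts of true positives, false positives,
    true negatives and false negatives for a binary classifier.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # MCC is defined as 0 here when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f_measure": f_measure, "mcc": mcc}
```

    Unlike accuracy, MCC uses all four counts symmetrically, which is one reason it is often reported alongside accuracy when the two classes are unbalanced.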

  8. Training set extension for SVM ensemble in P300-speller with familiar face paradigm.

    Science.gov (United States)

    Li, Qi; Shi, Kaiyang; Gao, Ning; Li, Jian; Bai, Ou

    2018-03-27

    P300-spellers are brain-computer interface (BCI)-based character input systems. Support vector machine (SVM) ensembles are trained with large-scale training sets and used as classifiers in these systems. However, the required large-scale training data necessitate a prolonged collection time for each subject, which results in data collected toward the end of the period being contaminated by the subject's fatigue. This study aimed to develop a method for acquiring more training data based on a collected small training set. A new method was developed in which two corresponding training datasets in two sequences are superposed and averaged to extend the training set. The proposed method was tested offline on a P300-speller with the familiar face paradigm. The SVM ensemble with extended training set achieved 85% classification accuracy for the averaged results of four sequences, and 100% for 11 sequences in the P300-speller. In contrast, the conventional SVM ensemble with non-extended training set achieved only 65% accuracy for four sequences, and 92% for 11 sequences. The SVM ensemble with extended training set achieves higher classification accuracies than the conventional SVM ensemble, which verifies that the proposed method effectively improves the classification performance of BCI P300-spellers, thus enhancing their practicality.

  9. A linear-RBF multikernel SVM to classify big text corpora.

    Science.gov (United States)

    Romero, R; Iglesias, E L; Borrajo, L

    2015-01-01

    Support vector machine (SVM) is a powerful technique for classification. However, SVM is not suitable for classification of large datasets or text corpora, because the training complexity of SVMs is highly dependent on the input size. Recent developments in the literature on the SVM and other kernel methods emphasize the need to consider multiple kernels or parameterizations of kernels because they provide greater flexibility. This paper shows a multikernel SVM to manage highly dimensional data, providing an automatic parameterization with low computational cost and improving results against SVMs parameterized under a brute-force search. The model consists in spreading the dataset into cohesive term slices (clusters) to construct a defined structure (multikernel). The new approach is tested on different text corpora. Experimental results show that the new classifier has good accuracy compared with the classic SVM, while the training is significantly faster than several other SVM classifiers.

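  10. Analisis Perbandingan KNN dengan SVM untuk Klasifikasi Penyakit Diabetes Retinopati berdasarkan Citra Eksudat dan Mikroaneurisma [Comparative Analysis of KNN and SVM for Classifying Diabetic Retinopathy Based on Exudate and Microaneurysm Images]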

    Directory of Open Access Journals (Sweden)

    SUCI AULIA

    2015-01-01

    Full Text Available ABSTRACT Research on classifying the severity of diabetic retinopathy using image processing is still an active topic. The images commonly used to detect this disease are the optic disc, microaneurysm, exudate, and hemorrhage images derived from fundus images. This study compares the SVM and KNN algorithms for classifying diabetic retinopathy (mild, moderate, severe) based on exudate and microaneurysm images. The wavelet method is used for feature extraction with each of the two classifiers. The study used 160 test images, 40 each for the normal, mild, moderate, and severe classes. The accuracy obtained with KNN was higher than with SVM: 65% versus 62%. KNN achieved its best results with the parameters K = 9 and the cityblock distance, while SVM achieved its best results with the One-Against-All scheme. Keywords: Diabetic Retinopathy, KNN, SVM, Wavelet.

  11. Sensitivity Analysis Based SVM Application on Automatic Incident Detection of Rural Road in China

    Directory of Open Access Journals (Sweden)

    Xingliang Liu

    2018-01-01

    Full Text Available Traditional automatic incident detection methods such as artificial neural networks, backpropagation neural networks, and Markov chains are not suitable for the incident detection problem on rural roads in China, which have a relatively high accident rate and a low reaction speed caused by their small traffic volume. This study applies the support vector machine (SVM) and parameter sensitivity analysis methods to build an accident detection algorithm for rural road conditions, based on real-time data collected in a field experiment. The sensitivity of four parameters (speed, front distance, vehicle group time interval, and free driving ratio) is analyzed, and the data sets of the two parameters with significant sensitivity are chosen to form the traffic state feature vector. The SVM and k-fold cross validation (K-CV) methods are used to build the accident detection algorithm, which shows excellent detection accuracy (98.15% on the training data set and 87.5% on the testing data set). Therefore, the problem of low incident reaction speed on rural roads in China could be solved to some extent.
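    The k-fold cross validation (K-CV) step mentioned in this record can be sketched as follows. This is an illustrative Python outline of the index splitting only, assuming contiguous folds with no shuffling; the function names are hypothetical and the record does not say how the authors partitioned their data.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def train_test_splits(n_samples, k):
    """Yield (train_indices, test_indices), holding out each fold once."""
    folds = k_fold_indices(n_samples, k)
    for held_out, test in enumerate(folds):
        train = [j for idx, fold in enumerate(folds) if idx != held_out for j in fold]
        yield train, test
```

    Each sample is used for testing exactly once and for training k - 1 times; the reported accuracy would then typically be averaged over the k held-out folds.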

  12. A Hybrid ICA-SVM Approach for Determining the Quality Variables at Fault in a Multivariate Process

    Directory of Open Access Journals (Sweden)

    Yuehjen E. Shao

    2012-01-01

    Full Text Available The monitoring of a multivariate process with multivariate statistical process control (MSPC) charts has received considerable attention. In practice, however, the use of an MSPC chart typically encounters a difficulty: determining which quality variable, or which set of quality variables, is responsible for generating the signal. This study proposes a hybrid scheme composed of independent component analysis (ICA) and a support vector machine (SVM) to determine the fault quality variables when a step-change disturbance exists in a multivariate process. The proposed hybrid ICA-SVM scheme initially applies ICA to the Hotelling T2 MSPC chart to generate independent components (ICs). The hidden information of the fault quality variables can be identified in these ICs. The ICs then serve as the input variables of the SVM classifier, which performs the classification. The performance of various process designs is investigated and compared with the typical classification method. Using the proposed approach, the fault quality variables for a multivariate process can be accurately and reliably determined.

  13. Detecting microcalcifications in mammograms by using SVM method for the diagnostics of breast cancer

    Science.gov (United States)

    Wan, Baikun; Wang, Ruiping; Qi, Hongzhi; Cao, Xuchen

    2005-01-01

    Support vector machine (SVM) is a new statistical learning method. In contrast to classical machine learning methods, which minimize the empirical risk, SVM learning minimizes the structural risk, and it gives better generalization performance. Because SVM training is a convex quadratic optimization problem, any local optimum is also the global optimum. In this paper an SVM algorithm is applied to detect micro-calcifications (MCCs) in mammograms for the diagnosis of breast cancer, an application that has not been reported before. It was tested with 10 mammograms, and the results show that the algorithm achieves a higher true positive rate than an artificial neural network (ANN) based on empirical risk minimization, and that it is valuable for further study and application in clinical engineering.
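    For reference, the convex quadratic problem referred to in this record is usually written in the following standard soft-margin form. The notation (training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, weight vector $w$, bias $b$, slack variables $\xi_i$, penalty $C$) is the common textbook convention and is an assumption here, not notation taken from the paper.

```latex
\begin{aligned}
\min_{w,\, b,\, \xi} \quad & \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{subject to} \quad & y_i \left( w^{\top} x_i + b \right) \ge 1 - \xi_i, \qquad i = 1, \dots, n, \\
& \xi_i \ge 0, \qquad i = 1, \dots, n .
\end{aligned}
```

    The objective is a convex quadratic function and the constraints are linear, so every local minimum is a global one. The $\frac{1}{2}\lVert w \rVert^{2}$ term controls model capacity (the structural risk part), while the slack sum penalizes training errors.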

  14. Hybrid NN/SVM Computational System for Optimizing Designs

    Science.gov (United States)

    Rai, Man Mohan

    2009-01-01

    A computational method and system based on a hybrid of an artificial neural network (NN) and a support vector machine (SVM) (see figure) has been conceived as a means of maximizing or minimizing an objective function, optionally subject to one or more constraints. Such maximization or minimization could be performed, for example, to solve a data-regression or data-classification problem or to optimize a design associated with a response function. A response function can be considered as a subset of a response surface, which is a surface in a vector space of design and performance parameters. A typical example of a design problem that the method and system can be used to solve is that of an airfoil, for which a response function could be the spatial distribution of pressure over the airfoil. In this example, the response surface would describe the pressure distribution as a function of the operating conditions and the geometric parameters of the airfoil. The use of NNs to analyze physical objects in order to optimize their responses under specified physical conditions is well known. NN analysis is suitable for multidimensional interpolation of data that lack structure and enables the representation and optimization of a succession of numerical solutions of increasing complexity or increasing fidelity to the real world. NN analysis is especially useful in helping to satisfy multiple design objectives. Feedforward NNs can be used to make estimates based on nonlinear mathematical models. One difficulty associated with use of a feedforward NN arises from the need for nonlinear optimization to determine connection weights among input, intermediate, and output variables. It can be very expensive to train an NN in cases in which it is necessary to model large amounts of information. Less widely known (in comparison with NNs) are support vector machines (SVMs), which were originally applied in statistical learning theory. In terms that are necessarily

  15. An SVM model with hybrid kernels for hydrological time series

    Science.gov (United States)

    Wang, C.; Wang, H.; Zhao, X.; Xie, Q.

    2017-12-01

    Support Vector Machine (SVM) models have been widely applied to the forecast of climate/weather and its impact on other environmental variables such as hydrologic response to climate/weather. When using SVM, the choice of the kernel function plays the key role. Conventional SVM models mostly use one single type of kernel function, e.g., radial basis kernel function. Provided that there are several featured kernel functions available, each having its own advantages and drawbacks, a combination of these kernel functions may give more flexibility and robustness to SVM approach, making it suitable for a wide range of application scenarios. This paper presents such a linear combination of radial basis kernel and polynomial kernel for the forecast of monthly flowrate in two gaging stations using SVM approach. The results indicate significant improvement in the accuracy of predicted series compared to the approach with either individual kernel function, thus demonstrating the feasibility and advantages of such hybrid kernel approach for SVM applications.
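    A linear combination of a radial basis kernel and a polynomial kernel, as described in this record, can be sketched in a few lines of Python. The mixing weight, `gamma`, `degree` and `coef0` values below are illustrative assumptions; the record does not state the parameters or the weighting the authors used.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis kernel exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def hybrid_kernel(x, y, weight=0.7, gamma=0.5, degree=2, coef0=1.0):
    """Convex combination: weight * RBF + (1 - weight) * polynomial."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (weight * rbf_kernel(x, y, gamma)
            + (1.0 - weight) * poly_kernel(x, y, degree, coef0))


def gram_matrix(samples, kernel):
    """Kernel (Gram) matrix for a list of feature vectors."""
    return [[kernel(a, b) for b in samples] for a in samples]
```

    Because a non-negative weighted sum of valid kernels is itself a valid kernel, the resulting Gram matrix could be passed to any SVM solver that accepts precomputed kernels. The RBF term captures local structure, while the polynomial term contributes a more global trend.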

  16. A Cancer Gene Selection Algorithm Based on the K-S Test and CFS

    Directory of Open Access Journals (Sweden)

    Qiang Su

    2017-01-01

    Full Text Available Background. To address the challenging problem of selecting distinguished genes from cancer gene expression datasets, this paper presents a gene subset selection algorithm based on the Kolmogorov-Smirnov (K-S) test and correlation-based feature selection (CFS) principles. The algorithm first selects distinguished genes using the K-S test and then uses CFS to select genes from those chosen by the K-S test. Results. We adopted support vector machines (SVM) as the classification tool and used accuracy as the criterion to evaluate the performance of the classifiers on the selected gene subsets. We compared the proposed gene subset selection algorithm with the K-S test, CFS, minimum-redundancy maximum-relevancy (mRMR), and ReliefF algorithms. The average experimental results of these gene selection algorithms on 5 gene expression datasets demonstrate that, based on accuracy, the new K-S and CFS-based algorithm performs better than the K-S test, CFS, mRMR, and ReliefF algorithms. Conclusions. The experimental results show that the K-S test-CFS gene selection algorithm is a very effective and promising approach compared to the K-S test, CFS, mRMR, and ReliefF algorithms.
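    The first stage of such a pipeline, ranking genes by a two-sample K-S statistic between two classes, might look like the sketch below. It is an illustrative Python outline only: the data layout (a dict mapping each gene to its per-sample expression values) and the function names are assumptions, and the subsequent CFS stage is not shown.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x (this handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


def rank_genes_by_ks(expression, labels):
    """Rank genes by K-S statistic between the class-0 and class-1 samples.

    expression: dict mapping gene name -> list of values, one per sample.
    labels: list of 0/1 class labels aligned with the samples.
    """
    scores = {}
    for gene, values in expression.items():
        group0 = [v for v, y in zip(values, labels) if y == 0]
        group1 = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = ks_statistic(group0, group1)
    return sorted(scores, key=scores.get, reverse=True)
```

    Genes near the top of the ranking have expression distributions that differ most between the two classes; a second-stage selector such as CFS would then remove redundancy among them.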

  17. Extraction of prostatic lumina and automated recognition for prostatic calculus image using PCA-SVM.

    Science.gov (United States)

    Wang, Zhuocai; Xu, Xiangmin; Ding, Xiaojun; Xiao, Hui; Huang, Yusheng; Liu, Jian; Xing, Xiaofen; Wang, Hua; Liao, D Joshua

    2011-01-01

    Identification of prostatic calculi is an important basis for determining the tissue origin. Computer-assisted diagnosis of prostatic calculi may have promising potential but is still little studied. We studied the extraction of prostatic lumina and automated recognition for calculus images. Extraction of lumina from prostate histology images was based on local entropy and the Otsu threshold; recognition used PCA-SVM and was based on the texture features of prostatic calculus. The SVM classifier showed an average time of 0.1432 seconds, an average training accuracy of 100%, an average test accuracy of 93.12%, a sensitivity of 87.74%, and a specificity of 94.82%. We concluded that the algorithm, based on texture features and PCA-SVM, can recognize the concentric structure and visualized features easily. Therefore, this method is effective for the automated recognition of prostatic calculi.

  18. Extraction of Prostatic Lumina and Automated Recognition for Prostatic Calculus Image Using PCA-SVM

    Science.gov (United States)

    Wang, Zhuocai; Xu, Xiangmin; Ding, Xiaojun; Xiao, Hui; Huang, Yusheng; Liu, Jian; Xing, Xiaofen; Wang, Hua; Liao, D. Joshua

    2011-01-01

    Identification of prostatic calculi is an important basis for determining the tissue origin. Computer-assisted diagnosis of prostatic calculi may have promising potential but is still little studied. We studied the extraction of prostatic lumina and automated recognition for calculus images. Extraction of lumina from prostate histology images was based on local entropy and the Otsu threshold; recognition used PCA-SVM and was based on the texture features of prostatic calculus. The SVM classifier showed an average time of 0.1432 seconds, an average training accuracy of 100%, an average test accuracy of 93.12%, a sensitivity of 87.74%, and a specificity of 94.82%. We concluded that the algorithm, based on texture features and PCA-SVM, can recognize the concentric structure and visualized features easily. Therefore, this method is effective for the automated recognition of prostatic calculi. PMID:21461364

  19. SVM-based Partial Discharge Pattern Classification for GIS

    Science.gov (United States)

    Ling, Yin; Bai, Demeng; Wang, Menglin; Gong, Xiaojin; Gu, Chao

    2018-01-01

    Partial discharges (PD) occur when there are localized dielectric breakdowns in small regions of gas insulated substations (GIS). It is highly important to recognize PD patterns, because they allow defects from different sources to be diagnosed so that predictive maintenance can be carried out to prevent unplanned power outages. In this paper, we propose an approach to partial discharge pattern classification. It first recovers the PRPD matrices from the PRPD2D images; statistical features are then extracted from each recovered PRPD matrix and fed into an SVM for classification. Experiments conducted on a dataset containing thousands of images demonstrate the high effectiveness of the method.

  20. A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics.

    Science.gov (United States)

    Halloran, John T; Rocke, David M

    2018-05-04

    Percolator is an important tool for greatly improving the results of a database search and subsequent downstream analysis. Using support vector machines (SVMs), Percolator recalibrates peptide-spectrum matches based on the learned decision boundary between targets and decoys. To improve analysis time for large-scale data sets, we update Percolator's SVM learning engine through software and algorithmic optimizations rather than heuristic approaches that necessitate the careful study of their impact on learned parameters across different search settings and data sets. We show that by optimizing Percolator's original learning algorithm, l2-SVM-MFN, large-scale SVM learning requires only about a third of the original runtime. Furthermore, we show that by employing the widely used Trust Region Newton (TRON) algorithm instead of l2-SVM-MFN, large-scale Percolator SVM learning is reduced to only about a fifth of the original runtime. Importantly, these speedups only affect the speed at which Percolator converges to a global solution and do not alter recalibration performance. The upgraded versions of both l2-SVM-MFN and TRON are optimized within the Percolator codebase for multithreaded and single-thread use and are available under Apache license at bitbucket.org/jthalloran/percolator_upgrade .

  1. Atterberg Limits Prediction Comparing SVM with ANFIS Model

    Directory of Open Access Journals (Sweden)

    Mohammad Murtaza Sherzoy

    2017-03-01

    Full Text Available Support Vector Machine (SVM) and Adaptive Neuro-Fuzzy Inference System (ANFIS) analytical methods are both used to predict the values of the Atterberg limits: the liquid limit, plastic limit and plasticity index. The main objective of this study is to compare the two forecasting methods (SVM and ANFIS). Data from 54 soil samples taken from Peninsular Malaysia were used; the samples were tested for liquid limit, plastic limit, plasticity index and grain size distribution. The input parameters in this case are the fractions of the grain size distribution, namely the percentages of silt, clay and sand. The actual values of the Atterberg limits and the values predicted by the SVM and ANFIS models are compared using the correlation coefficient R2 and the root mean squared error (RMSE). The results show that the ANFIS model is more accurate than the SVM model for the liquid limit (R2 = 0.987), plastic limit (R2 = 0.949) and plasticity index (R2 = 0.966). The RMSE values obtained for both methods show that the ANFIS model performs better than the SVM model in predicting the Atterberg limits as a whole.
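    The two comparison measures named in this record can be computed as in the following Python sketch. It assumes that R2 means the coefficient of determination, 1 - SS_res / SS_tot; the record calls it a "correlation coefficient", so the authors may instead have used the squared Pearson correlation. The function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between measured and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values must not all be identical")
    return 1.0 - ss_res / ss_tot
```

    A model that always predicts the mean of the measured values scores R2 = 0 under this definition, and a perfect model scores R2 = 1 with RMSE = 0, so higher R2 and lower RMSE both indicate a better fit.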

  2. Identification and optimization of classifier genes from multi-class earthworm microarray dataset.

    Directory of Open Access Journals (Sweden)

    Ying Li

    Full Text Available Monitoring, assessment and prediction of environmental risks that chemicals pose demand rapid and accurate diagnostic assays. A variety of toxicological effects have been associated with explosive compounds TNT and RDX. One important goal of microarray experiments is to discover novel biomarkers for toxicity evaluation. We have developed an earthworm microarray containing 15,208 unique oligo probes and have used it to profile gene expression in 248 earthworms exposed to TNT, RDX or neither. We assembled a new machine learning pipeline consisting of several well-established feature filtering/selection and classification techniques to analyze the 248-array dataset in order to construct classifier models that can separate earthworm samples into three groups: control, TNT-treated, and RDX-treated. First, a total of 869 genes differentially expressed in response to TNT or RDX exposure were identified using a univariate statistical algorithm of class comparison. Then, decision tree-based algorithms were applied to select a subset of 354 classifier genes, which were ranked by their overall weight of significance. A multiclass support vector machine (MC-SVM) method and an unsupervised K-means clustering method were applied to independently refine the classifier, producing smaller subsets of 39 and 30 classifier genes, respectively, with 11 common genes being potential biomarkers. The combined 58 genes were considered the refined subset and used to build MC-SVM and clustering models with classification accuracy of 83.5% and 56.9%, respectively. This study demonstrates that the machine learning approach can be used to identify and optimize a small subset of classifier/biomarker genes from high dimensional datasets and generate classification models of acceptable precision for multiple classes.

  3. A prediction model of drug-induced ototoxicity developed by an optimal support vector machine (SVM) method.

    Science.gov (United States)

    Zhou, Shu; Li, Guo-Bo; Huang, Lu-Yi; Xie, Huan-Zhang; Zhao, Ying-Lan; Chen, Yu-Zong; Li, Lin-Li; Yang, Sheng-Yong

    2014-08-01

    Drug-induced ototoxicity, as a toxic side effect, is an important issue that needs to be considered in drug discovery. Nevertheless, current experimental methods used to evaluate drug-induced ototoxicity are often time-consuming and expensive, which makes them unsuitable for large-scale evaluation of drug-induced ototoxicity in the early stage of drug discovery. In this investigation we therefore established an effective computational prediction model of drug-induced ototoxicity using an optimal support vector machine (SVM) method, GA-CG-SVM. Three GA-CG-SVM models were developed based on three training sets containing agents bearing different risk levels of drug-induced ototoxicity. For comparison, models based on naïve Bayesian (NB) and recursive partitioning (RP) methods were also built on the same training sets. Among all the prediction models, GA-CG-SVM model II showed the best performance, with prediction accuracies of 85.33% and 83.05% for two independent test sets, respectively. Overall, the good performance of GA-CG-SVM model II indicates that it could be used for the prediction of drug-induced ototoxicity in the early stage of drug discovery.

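  4. KOMPARASI MODEL SUPPORT VECTOR MACHINES (SVM) DAN NEURAL NETWORK UNTUK MENGETAHUI TINGKAT AKURASI PREDIKSI TERTINGGI HARGA SAHAM [Comparison of Support Vector Machine (SVM) and Neural Network Models to Determine the Highest Stock Price Prediction Accuracy]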

    Directory of Open Access Journals (Sweden)

    R. Hadapiningradja Kusumodestoni

    2017-09-01

    Full Text Available There are many types of investment for making money, one of which is shares. Shares are company securities traded in the global capital markets. The stock exchange, also called the stock market, is the activity of buying and selling such investments in private companies. To avoid losses when investing, a predictive analysis model is needed that has high accuracy and is supported by a large amount of accurate data. The right analysis technique can reduce the risk to investors. Many models are used to predict stock price movements; this study uses neural network (NN) and support vector machine (SVM) models. The problem can be formulated as follows: an algorithm is needed that can predict stock prices with high accuracy given an added data set. The two algorithms are investigated so that the researchers can determine which one gives the most accurate predictions. The purpose of this study is therefore to compare the Neural Network and Support Vector Machine algorithms and identify which has the highest stock price prediction accuracy, judged by its RMSE error value.
    The study used the neural network and SVM models to predict stock values from Hong Kong stock index data running from 20 July 2016 at 16:26 to 15 September 2016 at 17:40, a total of 729 data points at 5-minute intervals. After training, learning, and testing, the neural network model gave a prediction accuracy of 0.503 +/- 0.009 (micro 503 while using the model of support vector machine

  5. Prediction of protein-protein interactions between viruses and human by an SVM model

    Directory of Open Access Journals (Sweden)

    Cui Guangyu

    2012-05-01

    Full Text Available Abstract Background: Several computational methods have been developed to predict protein-protein interactions from amino acid sequences, but most of those methods are intended for interactions within a species rather than for interactions across different species. Methods for predicting interactions between homogeneous proteins are not appropriate for finding those between heterogeneous proteins since they do not distinguish the interactions between proteins of the same species from those of different species. Results: We developed a new method for representing a protein sequence of variable length in a frequency vector of fixed length, which encodes the relative frequency of three consecutive amino acids of a sequence. We built a support vector machine (SVM) model to predict human proteins that interact with virus proteins. In two types of viruses, human papillomaviruses (HPV) and hepatitis C virus (HCV), our SVM model achieved an average accuracy above 80%, which is higher than that of another SVM model with a different representation scheme. Using the SVM model and Gene Ontology (GO) annotations of proteins, we predicted new interactions between virus proteins and human proteins. Conclusions: Encoding the relative frequency of amino acid triplets of a protein sequence is a simple yet powerful representation method for predicting protein-protein interactions across different species. The representation method has several advantages: (1) it enables a prediction model to achieve a better performance than other representations, (2) it generates feature vectors of fixed length regardless of the sequence length, and (3) the same representation is applicable to different types of proteins.
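    The fixed-length triplet representation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 20-letter alphabet, the use of overlapping windows, the skipping of non-standard symbols, and the names `TRIPLET_INDEX` and `triplet_frequency_vector` are all assumptions made for the example.

```python
from itertools import product

# Assumption: the 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# One fixed position per possible triplet: 20**3 = 8000 entries.
TRIPLET_INDEX = {
    "".join(t): i for i, t in enumerate(product(AMINO_ACIDS, repeat=3))
}


def triplet_frequency_vector(sequence):
    """Return the relative frequency of every amino acid triplet.

    The vector always has 8000 entries, whatever the sequence length.
    Overlapping windows are used (an assumption of this sketch).
    """
    counts = [0] * len(TRIPLET_INDEX)
    total = 0
    for i in range(len(sequence) - 2):
        idx = TRIPLET_INDEX.get(sequence[i:i + 3])
        if idx is not None:  # skip windows with non-standard symbols
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Hypothetical toy sequence: windows are ACD, CDA, DAC, ACD.
vector = triplet_frequency_vector("ACDACD")
print(len(vector), vector[TRIPLET_INDEX["ACD"]])
```

    A vector like this could then be passed to any SVM implementation as a feature vector, which is why sequences of different lengths become directly comparable.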

  6. Comparison of hand-craft feature based SVM and CNN based deep learning framework for automatic polyp classification.

    Science.gov (United States)

    Younghak Shin; Balasingham, Ilangko

    2017-07-01

    Colonoscopy is a standard method by which highly trained physicians screen for polyps. Polyps missed during colonoscopy are a potential risk factor for colorectal cancer. In this study, we investigate an automatic polyp classification framework. We aim to compare two different approaches: a hand-crafted feature method and a convolutional neural network (CNN) based deep learning method. Combined shape and color features are used for hand-crafted feature extraction, and a support vector machine (SVM) is adopted for classification. For the CNN approach, a deep learning framework based on three convolution and pooling stages is used for classification. The proposed framework is evaluated using three public polyp databases. The experimental results show that the CNN based deep learning framework achieves better classification performance than the hand-crafted feature based methods, with over 90% classification accuracy, sensitivity, specificity and precision.

  7. A Classification Framework Applied to Cancer Gene Expression Profiles

    Directory of Open Access Journals (Sweden)

    Hussein Hijazi

    2013-01-01

    Full Text Available Classification of cancer based on gene expression has provided insight into possible treatment strategies. Thus, developing machine learning methods that can successfully distinguish among cancer subtypes or normal versus cancer samples is important. This work discusses supervised learning techniques that have been employed to classify cancers. Furthermore, a two-step feature selection method based on an attribute estimation method (e.g., ReliefF) and a genetic algorithm was employed to find a set of genes that can best differentiate between cancer subtypes or normal versus cancer samples. The application of different classification methods (e.g., decision tree, k-nearest neighbor, support vector machine (SVM), bagging, and random forest) on 5 cancer datasets shows that no classification method universally outperforms all the others. However, k-nearest neighbor and linear SVM generally improve the classification performance over other classifiers. Finally, incorporating diverse types of genomic data (e.g., protein-protein interaction data and gene expression) increases the prediction accuracy as compared to using gene expression alone.

  8. Damage Detection of Structures for Ambient Loading Based on Cross Correlation Function Amplitude and SVM

    Directory of Open Access Journals (Sweden)

    Lin-sheng Huo

    2016-01-01

    Full Text Available An effective method for the damage detection of skeletal structures, which combines the cross correlation function amplitude (CCFA) with the support vector machine (SVM), is presented in this paper. The proposed method consists of two stages. First, the data features are extracted from the CCFA, which is calculated from dynamic responses, represents the modal shapes of the structure, and changes when damage occurs. The data features are then input into the SVM with the one-against-one (OAO) algorithm to classify the damage status of the structure. Simulation data from the IASC-ASCE benchmark model and a vibration experiment on a truss structure are used to verify the feasibility of the proposed method. The results show that the proposed method is suitable for damage identification of skeletal structures subjected to ambient excitation with a limited number of sensors. As the CCFA based data features are sensitive to damage, the proposed method is reliable in diagnosing damaged structures, especially those with minor damage. In addition, the proposed method shows better noise robustness and is more suitable for noisy environments.
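    The one-against-one (OAO) scheme named in this abstract turns a two-class SVM into a multiclass classifier by training one binary classifier per pair of classes and taking a majority vote. The sketch below shows only the voting step. The pairwise classifiers are stand-in threshold rules on a single hypothetical damage feature, not trained SVMs, and the class names are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def one_against_one_predict(x, classes, pairwise_classifiers):
    """Majority vote over all k*(k-1)/2 pairwise classifiers.

    pairwise_classifiers maps a pair (class_a, class_b) to a callable
    that returns whichever of the two classes it assigns to x.
    Ties go to the class that received its first vote earliest.
    """
    votes = Counter()
    for pair in combinations(classes, 2):
        votes[pairwise_classifiers[pair](x)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical damage states and stand-in binary decision rules.
CLASSES = ["undamaged", "minor", "severe"]
PAIRWISE = {
    ("undamaged", "minor"): lambda x: "undamaged" if x < 1.0 else "minor",
    ("undamaged", "severe"): lambda x: "undamaged" if x < 2.0 else "severe",
    ("minor", "severe"): lambda x: "minor" if x < 3.0 else "severe",
}

for feature in (0.5, 1.5, 3.5):
    print(feature, one_against_one_predict(feature, CLASSES, PAIRWISE))
```

    In a real pipeline each entry of `PAIRWISE` would be a binary SVM trained only on the samples of its two classes.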

  9. Application of ANFIS and SVM Systems in Order to Estimate Monthly Reference Crop Evapotranspiration in the Northwest of Iran

    Directory of Open Access Journals (Sweden)

    F. Ahmadi

    2016-10-01

    Full Text Available Introduction: Crop evapotranspiration is mainly modeled with empirical, aerodynamic and energy balance methods. In these methods, evapotranspiration is calculated from the average values of meteorological parameters at different time steps. Linear models have not performed well in this field because of the high variability of evapotranspiration, so researchers have turned to nonlinear and intelligent models. Accurate estimation of this hydrologic variable requires much time and money to measure many data (19). Materials and Methods: Recently, new hybrid methods have been developed by combining methods such as artificial neural networks, fuzzy logic and evolutionary computation; these are called Soft Computing and Intelligent Systems. These soft techniques are used in various fields of engineering. A neuro-fuzzy system is a hybrid system that combines the decision ability of fuzzy logic with the computational ability of a neural network, which provides a high capability for modeling and estimation. Basically, the fuzzy part classifies the input data set and determines the degree of membership (each value lying between 0 and 1), and decisions for the next activity are made based on a set of rules before moving to the next stage. Adaptive Neuro-Fuzzy Inference Systems (ANFIS) include some parts of a typical fuzzy expert system, in which the calculations at each step are performed by the hidden layer neurons and the learning ability of the neural network is used to increase the system information (9). SVM is a supervised learning method used for classification and regression. This method was developed by Vapnik (15) based on statistical learning theory. The SVM is a method for binary classification in an arbitrary feature space, so it is suitable for prediction problems (12). The SVM is originally a two-class classifier that separates the classes

  10. Hybrid Model Based on Genetic Algorithms and SVM Applied to Variable Selection within Fruit Juice Classification

    Directory of Open Access Journals (Sweden)

    C. Fernandez-Lozano

    2013-01-01

    Full Text Available Given the background of the use of Neural Networks in problems of apple juice classification, this paper aims at implementing a newly developed method in the field of machine learning: Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is proposed in such a way that, when SVM is used as the fitness function of the Genetic Algorithm (GA), the most representative variables for a specific classification problem can be selected.

  11. A semi-supervised learning approach to predict synthetic genetic interactions by combining functional and topological properties of functional gene network

    Directory of Open Access Journals (Sweden)

    Han Kyungsook

    2010-06-01

    Full Text Available Abstract Background: Genetic interaction profiles are highly informative and helpful for understanding the functional linkages between genes, and therefore have been extensively exploited for annotating gene functions and dissecting specific pathway structures. However, our understanding is rather limited to the relationship between double concurrent perturbation and various higher level phenotypic changes, e.g. those in cells, tissues or organs. Modifier screens, such as synthetic genetic arrays (SGA), can help us to understand the phenotype caused by combined gene mutations. Unfortunately, exhaustive tests on all possible combined mutations in any genome are vulnerable to combinatorial explosion and are infeasible either technically or financially. Therefore, an accurate computational approach to predict genetic interaction is highly desirable, and such methods have the potential of alleviating the bottleneck on experiment design. Results: In this work, we introduce a computational systems biology approach for the accurate prediction of pairwise synthetic genetic interactions (SGI). First, a high-coverage and high-precision functional gene network (FGN) is constructed by integrating protein-protein interaction (PPI), protein complex and gene expression data; then, a graph-based semi-supervised learning (SSL) classifier is utilized to identify SGI, where the topological properties of protein pairs in the weighted FGN are used as input features of the classifier. We compare the proposed SSL method with the state-of-the-art supervised classifier, the support vector machine (SVM), on a benchmark dataset in S. cerevisiae to validate our method's ability to distinguish synthetic genetic interactions from non-interaction gene pairs. Experimental results show that the proposed method can accurately predict genetic interactions in S. cerevisiae (with a sensitivity of 92% and specificity of 91%).
Noticeably, the SSL method is more efficient than SVM, especially for

  12. Steady Modeling for an Ammonia Synthesis Reactor Based on a Novel CDEAS-LS-SVM Model

    Directory of Open Access Journals (Sweden)

    Zhuoqian Liu

    2014-01-01

    Full Text Available A steady-state mathematical model is built in order to represent plant behavior under stationary operating conditions. A novel modeling method using LS-SVR based on Cultural Differential Evolution with Ant Search is proposed. LS-SVM is adopted to establish the model of the net value of ammonia. The modeling method has fast convergence speed and good global adaptability for identification of the ammonia synthesis process. The LS-SVR model was established using the above-mentioned method. Simulation results verify the validity of the method.

  13. Exploring SVM-based intrusion detection through information entropy theory

    Institute of Scientific and Technical Information of China (English)

    朱文杰; 王强; 翟献军

    2013-01-01

    In traditional SVM-based intrusion detection, both kernel function construction and feature selection rely on prior knowledge, and such approaches commonly suffer from low accuracy and low efficiency. Combining information entropy theory with the SVM algorithm yields an entropy-based SVM intrusion detection algorithm that improves both the accuracy and the efficiency of intrusion detection. The algorithm has two aspects. On the one hand, based on the user information entropy and variance contained in the samples, the sample features are unified and measured by whether each feature falls within a confidence interval. The resulting sample feature confidence vector is used as the construction parameter of the SVM kernel function, which both guarantees the correspondence between the training sample set and the optimal separating hyperplane and yields the maximum classification margin needed for intrusion detection. On the other hand, using the amount of user information contained in the samples as the metric to substantially reduce the sample feature subset not only shrinks the computing scale but also speeds up classifier training. Experiments show that, in intrusion detection systems, the algorithm outperforms the traditional SVM algorithm.
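    The abstract does not give the exact entropy criterion used to reduce the feature subset, so the following is only a generic sketch of the underlying quantity: the Shannon entropy of each feature's empirical distribution, used here to rank features. The ranking rule, the function names, and the toy connection records are assumptions for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of values."""
    n = len(values)
    if n == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def rank_features_by_entropy(samples):
    """Return feature indices ordered from highest to lowest entropy."""
    n_features = len(samples[0])
    entropies = [
        shannon_entropy([row[j] for row in samples]) for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: entropies[j], reverse=True)


# Hypothetical records: (protocol, flag); the flag never varies.
records = [("tcp", 0), ("udp", 0), ("tcp", 0), ("icmp", 0)]
print(rank_features_by_entropy(records))
```

    A constant feature such as the flag above has zero entropy and carries no information for the classifier, so it would be a natural candidate for removal before SVM training.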

  14. Classification of EMG signals using PSO optimized SVM for diagnosis of neuromuscular disorders.

    Science.gov (United States)

    Subasi, Abdulhamit

    2013-06-01

    Support vector machine (SVM) is an extensively used machine learning method with many biomedical signal classification applications. In this study, a novel PSO-SVM model is proposed that hybridizes particle swarm optimization (PSO) and SVM to improve EMG signal classification accuracy. The optimization mechanism sets the kernel parameters in the SVM training procedure, which significantly influence the classification accuracy. The experiments classified EMG signals as normal, neurogenic or myopathic. In the proposed method, the EMG signals were decomposed into frequency sub-bands using the discrete wavelet transform (DWT), and a set of statistical features was extracted from these sub-bands to represent the distribution of wavelet coefficients. The results validate the superiority of the SVM method over conventional machine learning methods, and suggest that further significant gains in classification accuracy can be achieved by the proposed PSO-SVM classification system. The PSO-SVM yielded an overall accuracy of 97.41% on 1200 EMG signals selected from 27 subject records, against 96.75%, 95.17% and 94.08% for the SVM, the k-NN and the RBF classifiers, respectively. PSO-SVM is developed as an efficient tool so that various SVMs can be used conveniently as the core of PSO-SVM for diagnosis of neuromuscular disorders. Copyright © 2013 Elsevier Ltd. All rights reserved.

  15. A Realistic Seizure Prediction Study Based on Multiclass SVM.

    Science.gov (United States)

    Direito, Bruno; Teixeira, César A; Sales, Francisco; Castelo-Branco, Miguel; Dourado, António

    2017-05-01

    A patient-specific algorithm for epileptic seizure prediction, based on multiclass support-vector machines (SVM) and using multi-channel high-dimensional feature sets, is presented. The feature sets, combined with multiclass classification and post-processing schemes, aim at the generation of alarms and reduced influence of false positives. This study considers 216 patients from the European Epilepsy Database, and includes 185 patients with scalp EEG recordings and 31 with intracranial data. The strategy was tested over a total of 16,729.80 h of inter-ictal data, including 1206 seizures. We found an overall sensitivity of 38.47% and a false positive rate per hour of 0.20. The performance of the method achieved statistical significance in 24 patients (11% of the patients). Despite the encouraging results previously reported in specific datasets, the prospective demonstration on long-term EEG recording has been limited. Our study presents a prospective analysis of a large heterogeneous, multicentric dataset. The statistical framework, based on conservative assumptions, reflects a realistic approach compared to constrained datasets and/or in-sample evaluations. The improvement of these results, with the definition of an appropriate set of features able to improve the distinction between the pre-ictal and non-pre-ictal states, hence minimizing the effect of confounding variables, remains a key aspect.
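    The two figures of merit reported in this abstract, sensitivity and false positive rate per hour, are simple ratios. The sketch below uses the usual definitions (correctly predicted seizures over all seizures, and false alarms over inter-ictal recording time); the abstract does not spell out its exact counting rules, and the counts in the example are hypothetical, not taken from the study.

```python
def seizure_prediction_metrics(n_seizures, n_predicted, n_false_alarms,
                               interictal_hours):
    """Return (sensitivity, false positive rate per hour).

    sensitivity  = correctly predicted seizures / all seizures
    FPR per hour = false alarms / hours of inter-ictal recording
    """
    sensitivity = n_predicted / n_seizures
    fpr_per_hour = n_false_alarms / interictal_hours
    return sensitivity, fpr_per_hour


# Hypothetical patient: 10 seizures, 4 predicted, 5 false alarms in 25 h.
sens, fpr = seizure_prediction_metrics(10, 4, 5, 25.0)
print(f"sensitivity = {sens:.2%}, FPR = {fpr:.2f} per hour")
```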

  16. Customer and performance rating in QFD using SVM classification

    Science.gov (United States)

    Dzulkifli, Syarizul Amri; Salleh, Mohd Najib Mohd; Leman, A. M.

    2017-09-01

    In a classification problem, each input is associated with one output. Training data is used to create a model whose predictions approximate the true function. SVM is a popular method for binary classification due to its theoretical foundation and good generalization performance. However, when trained with noisy data, the decision hyperplane might deviate from its optimal position because of the sum of misclassification errors in the objective function. In this paper, we introduce a fuzzy weighted learning approach for improving the accuracy of Support Vector Machine (SVM) classification. The main aim of this work is to determine appropriate weights for SVM so as to adjust the parameters of the learning method from a given set of noisy input-output data. Customer and performance rating in Quality Function Deployment (QFD) is used as our case study to determine whether the fuzzy SVM is highly scalable for very large data sets and generates high classification accuracy.

  17. Application of EMD-Based SVD and SVM to Coal-Gangue Interface Detection

    Directory of Open Access Journals (Sweden)

    Wei Liu

    2014-01-01

    Full Text Available Coal-gangue interface detection during top-coal caving mining is a challenging problem. This paper proposes a new vibration signal analysis approach to detecting the coal-gangue interface based on singular value decomposition (SVD) techniques and support vector machines (SVMs). Due to the nonstationary characteristics of the vibration signals of the tail boom support of the longwall mining machine in this complicated environment, empirical mode decomposition (EMD) is used to decompose the raw vibration signals into a number of intrinsic mode functions (IMFs), from which the initial feature vector matrices can be formed automatically. By applying the SVD algorithm to the initial feature vector matrices, the singular values of the matrices can be obtained and used as the input feature vectors of the SVM classifier. The analysis results of vibration signals from the tail boom support of a longwall mining machine show that the method based on EMD, SVD, and SVM is effective for coal-gangue interface detection even when the number of samples is small.

  18. SVM-Based Control System for a Robot Manipulator

    Directory of Open Access Journals (Sweden)

    Foudil Abdessemed

    2012-12-01

    Full Text Available Real systems are usually non-linear, ill-defined, have variable parameters and are subject to external disturbances. Modelling these systems is often an approximation of the physical phenomena involved. However, it is from this approximate system of representation that we propose, in this paper, to build a robust control, in the sense that it must ensure low sensitivity towards parameters, uncertainties, variations and external disturbances. The computed torque method is a well-established robot control technique which takes account of the dynamic coupling between the robot links. However, its main disadvantage lies in the assumption of an exactly known dynamic model, which is not realizable in practice. To overcome this issue, we propose estimating the dynamic model of the nonlinear system with a machine learning regression method. The output of this regressor is used in conjunction with a PD controller to achieve the trajectory tracking task of a robot manipulator. In cases where some of the parameters of the plant undergo a change in their values, poor performance may result. To cope with this drawback, a fuzzy precompensator is inserted to reinforce the SVM computed torque-based controller and avoid any deterioration. The theory is developed and simulations are carried out on a two-degree-of-freedom robot manipulator to demonstrate the validity of the proposed approach.

  19. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    Directory of Open Access Journals (Sweden)

    Hala Alshamlan

    2015-01-01

    Full Text Available An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  20. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling.

    Science.gov (United States)

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  1. Constructing and Validating High-Performance MIEC-SVM Models in Virtual Screening for Kinases: A Better Way for Actives Discovery.

    Science.gov (United States)

    Sun, Huiyong; Pan, Peichen; Tian, Sheng; Xu, Lei; Kong, Xiaotian; Li, Youyong; Dan Li; Hou, Tingjun

    2016-04-22

    The MIEC-SVM approach, which combines molecular interaction energy components (MIEC) derived from free energy decomposition with support vector machine (SVM), has been found effective in capturing the energetic patterns of protein-peptide recognition. However, the performance of this approach in identifying small molecule inhibitors of drug targets has not been well assessed and validated by experiments. Here, by combining different model construction protocols, the issues related to developing the best MIEC-SVM models were first discussed for three kinase targets (ABL, ALK, and BRAF). For the investigated targets, the optimized MIEC-SVM models performed much better than the models based on the default SVM parameters and Autodock on the tested datasets. Then, the proposed strategy was used to screen the Specs database for potential inhibitors of the ALK kinase. The experimental results showed that the optimized MIEC-SVM model identified 7 actives with IC50 < 10 μM from 50 purchased compounds (a hit rate of 14%, with 4 at the nM level) and performed much better than Autodock (3 actives with IC50 < 10 μM from 50 purchased compounds, a hit rate of 6%, with 2 at the nM level), suggesting that the proposed strategy is a powerful tool in structure-based virtual screening.

  2. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification

    Directory of Open Access Journals (Sweden)

    D. Ramyachitra

    2015-09-01

    Full Text Available Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies in the ability to conduct parallel surveys of gene expression. The classification of tissue samples based on gene expression data is an important problem in the medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. The difficulty is thus that the data are high-dimensional while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  3. Interval-value Based Particle Swarm Optimization algorithm for cancer-type specific gene selection and sample classification.

    Science.gov (United States)

    Ramyachitra, D; Sofia, M; Manikandan, P

    2015-09-01

    Microarray technology allows simultaneous measurement of the expression levels of thousands of genes within a biological tissue sample. The fundamental power of microarrays lies in the ability to conduct parallel surveys of gene expression. The classification of tissue samples based on gene expression data is an important problem in the medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high compared to the number of data samples. The difficulty is thus that the data are high-dimensional while the sample size is small. This research work addresses the problem by classifying the resultant dataset using existing algorithms such as Support Vector Machine (SVM), K-nearest neighbor (KNN) and Interval Valued Classification (IVC), and the improvised Interval Value based Particle Swarm Optimization (IVPSO) algorithm. The results show that the IVPSO algorithm outperformed the other algorithms under several performance evaluation functions.

  4. An Improved TA-SVM Method Without Matrix Inversion and Its Fast Implementation for Nonstationary Datasets.

    Science.gov (United States)

    Shi, Yingzhong; Chung, Fu-Lai; Wang, Shitong

    2015-09-01

    Recently, a time-adaptive support vector machine (TA-SVM) was proposed for handling nonstationary datasets. While attractive performance has been reported, and the new classifier is distinctive in simultaneously solving several SVM subclassifiers locally and globally by using an elegant SVM formulation in an alternative kernel space, the coupling of subclassifiers requires matrix inversion and therefore incurs a high computational burden in large nonstationary dataset applications. To overcome this shortcoming, an improved TA-SVM (ITA-SVM) is proposed using a common vector shared by all the SVM subclassifiers involved. ITA-SVM not only keeps an SVM formulation, but also avoids the computation of matrix inversion. Thus, we can realize its fast version, the improved time-adaptive core vector machine (ITA-CVM), for large nonstationary datasets by using the CVM technique. ITA-CVM has the merit of asymptotic linear time complexity for large nonstationary datasets and inherits the advantages of TA-SVM. The effectiveness of the proposed classifiers ITA-SVM and ITA-CVM is also experimentally confirmed.

  5. Support vector machine based estimation of remaining useful life: current research status and future trends

    International Nuclear Information System (INIS)

    Huang, Hong Zhong; Wang, Hai Kun; Li, Yan Feng; Zhang, Longlong; Liu, Zhiliang

    2015-01-01

    Estimation of remaining useful life (RUL) helps to manage the life cycles of machines and to reduce maintenance cost. Support vector machine (SVM) is a promising algorithm for estimation of RUL because it can easily process small training sets and multi-dimensional data. Many SVM based methods have been proposed to predict the RUL of key components. We reviewed the literature on SVM based RUL estimation from the past decade. The references reviewed are classified into two categories: improved SVM algorithms and their applications to RUL estimation. The latter category can be further divided into two types: one, to predict the condition state in the future and then build a relationship between state and RUL; two, to establish a direct relationship between current state and RUL. However, SVM is seldom used to track the degradation process and build an accurate relationship between the current health condition state and RUL. Based on the above review and summary, this paper points out that continually improving SVM and obtaining novel ideas for RUL prediction using SVM are directions for future work.

  6. Cardiac arrhythmia beat classification using DOST and PSO tuned SVM.

    Science.gov (United States)

    Raj, Sandeep; Ray, Kailash Chandra; Shankar, Om

    2016-11-01

    The increase in the number of deaths due to cardiovascular diseases (CVDs) has drawn significant attention to the study of electrocardiogram (ECG) signals. These ECG signals are studied by experienced cardiologists for accurate and proper diagnosis, but this becomes difficult and time-consuming for long-term recordings. Various signal processing techniques have been studied to analyze the ECG signal, but they are limited by the non-stationary behavior of ECG signals. Hence, this study aims to improve the classification accuracy rate and provide an automated diagnostic solution for the detection of cardiac arrhythmias. The proposed methodology consists of four stages: filtering, R-peak detection, feature extraction and classification. In this study, a wavelet based approach is used to filter the raw ECG signal, whereas the Pan-Tompkins algorithm is used for detecting the R-peaks in the ECG signal. In the feature extraction stage, the discrete orthogonal Stockwell transform (DOST) is presented for an efficient time-frequency representation (i.e., morphological descriptors) of a time domain signal; it retains the absolute phase information to distinguish the various non-stationary ECG signals. Moreover, these morphological descriptors are reduced to a lower dimensional space by using principal component analysis and combined with the dynamic features (i.e., based on the RR-interval of the ECG signals) of the input signal. This combination of two different kinds of descriptors forms the feature set of an input signal, which is classified into categories by PSO tuned support vector machines (SVM). The proposed methodology is validated on the baseline MIT-BIH arrhythmia database and evaluated under two assessment schemes, yielding an improved overall accuracy of 99.18% for sixteen classes in the category-based and 89.10% for five classes (mapped according to AAMI standard) in the patient-based

  7. Hardware realization of an SVM algorithm implemented in FPGAs

    Science.gov (United States)

    Wiśniewski, Remigiusz; Bazydło, Grzegorz; Szcześniak, Paweł

    2017-08-01

    The paper proposes a technique for the hardware realization of space vector modulation (SVM) of state function switching in a matrix converter (MC), oriented toward implementation in a single field programmable gate array (FPGA). In an MC, the SVM method is based on the instantaneous space-vector representation of input currents and output voltages. Traditional computation algorithms usually involve digital signal processors (DSPs), which must handle a large number of power transistors (18 transistors and 18 independent PWM outputs) and "non-standard positions of control pulses" during the switching sequence. Recently, hardware implementations have become popular, since the computations can be executed much faster and more efficiently owing to the nature of digital devices (especially their concurrency). In the paper, we propose a hardware algorithm for SVM computation. In contrast to existing techniques, the presented solution applies the COordinate Rotation DIgital Computer (CORDIC) method to solve the trigonometric operations. Furthermore, the arithmetic modules (that is, sub-devices) used for intermediate calculations, such as code converters and sector selectors (for output voltages and input currents), are presented in detail. The proposed technique has been implemented as a design described in the Verilog hardware description language. Preliminary results of a logic implementation targeting a Xilinx FPGA (specifically, a low-cost device from the Xilinx Artix-7 family) are also presented.
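
    The CORDIC idea mentioned in this abstract can be sketched in software. The block below is a minimal floating-point illustration of rotation-mode CORDIC, not the authors' Verilog design; the iteration count and the function name `cordic_sin_cos` are assumptions made for this example.

```python
import math

# Number of CORDIC iterations; an illustrative choice, not taken from the paper.
ITERATIONS = 24

# Precomputed micro-rotation angles atan(2^-i).
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
# GAIN is the reciprocal of the accumulated stretch.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN /= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_sin_cos(theta):
    """Return (sin, cos) of theta, for |theta| <= pi/2, by rotation-mode CORDIC."""
    x, y, z = GAIN, 0.0, theta
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0.0 else -1.0   # rotate toward the remaining angle
        shift = 2.0 ** -i               # a right shift by i bits in fixed-point hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * ANGLES[i]
    return y, x


sin_val, cos_val = cordic_sin_cos(math.pi / 6)
print(sin_val, cos_val)
```

    In hardware the multiplications by `shift` become bit shifts, so each iteration needs only additions, subtractions and a small angle lookup table, which is why the method suits FPGAs.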

  8. Comparison of water extraction methods in Tibet based on GF-1 data

    Science.gov (United States)

    Jia, Lingjun; Shang, Kun; Liu, Jing; Sun, Zhongqing

    2018-03-01

    In this study, we compared four water extraction methods on GF-1 data for different water types in Tibet: Support Vector Machine (SVM), Principal Component Analysis (PCA), a Decision Tree Classifier based on the False Normalized Difference Water Index (FNDWI-DTC), and PCA-SVM. The results show that all four methods can extract large water bodies, but only SVM and PCA-SVM obtain satisfactory results for small water bodies. The methods were evaluated by both overall accuracy (OAA) and Kappa coefficient (KC). The OAAs of PCA-SVM, SVM, FNDWI-DTC, and PCA are 96.68%, 94.23%, 93.99%, and 93.01%, and the KCs are 0.9308, 0.8995, 0.8962, and 0.8842, respectively, consistent with visual inspection. In summary, SVM is better for extracting narrow rivers, and PCA-SVM is suitable for extracting water of various types. For dark blue lakes, the methods using PCA extract water more quickly and accurately.
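
    The two evaluation measures named in this abstract can both be computed from a confusion matrix. The sketch below assumes the usual definitions (overall accuracy as the diagonal share, Cohen's kappa as agreement corrected for chance); the 2x2 water / non-water matrix is made up for illustration and is not data from the study.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts: rows = reference (water, non-water), columns = predicted.
confusion = [[45, 5],
             [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(accuracy, kappa)
```

    For this toy matrix the overall accuracy is 0.85 while kappa is about 0.7, which shows why kappa is reported alongside accuracy: it discounts the agreement that would occur by chance.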

  9. Combined Forecasting Method of Landslide Deformation Based on MEEMD, Approximate Entropy, and WLS-SVM

    Directory of Open Access Journals (Sweden)

    Shaofeng Xie

    2017-01-01

    Full Text Available Given the chaotic characteristics of landslide time series, a new method based on modified ensemble empirical mode decomposition (MEEMD), approximate entropy and the weighted least square support vector machine (WLS-SVM) was proposed. The method starts from the time-frequency analysis of chaotic sequences and improves model performance as follows. First, a deformation time series is decomposed by MEEMD into a series of subsequences with significantly different complexity. Then the approximate entropy method is used to combine subsequences of similar complexity into new subsequences, which effectively concentrates the component feature information and reduces the computational scale. Finally, a WLS-SVM prediction model is established for each new subsequence. Phase space reconstruction theory and the grid search method are used to select the input dimension and the optimal parameters of the model, and the sum of the predicted values gives the final forecasting result. Taking the landslide deformation data of Danba as an example, experiments were carried out and compared with the wavelet neural network, support vector machine, least square support vector machine and various combination schemes. The experimental results show that the algorithm has high prediction accuracy. It ensures a better prediction effect even in landslide deformation periods of rapid fluctuation, and it also better controls the residual value and effectively reduces the error interval.
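
    Approximate entropy, which this abstract uses to group subsequences of similar complexity, can be sketched as follows. This is the textbook definition with self-matches counted; the embedding dimension `m=2` and tolerance `r=0.2` are assumed defaults for the example and are not parameters reported by the authors.

```python
import math


def approximate_entropy(series, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) of a sequence of numbers.

    Low values indicate a regular, predictable series; higher values
    indicate a more complex one.
    """
    n = len(series)

    def phi(dim):
        count = n - dim + 1
        templates = [series[i:i + dim] for i in range(count)]
        total = 0.0
        for a in templates:
            # Count templates within tolerance r (Chebyshev distance),
            # including the template itself.
            matches = sum(
                1 for b in templates
                if max(abs(u - v) for u, v in zip(a, b)) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)


# Two toy series: a strictly alternating one and a chaotic logistic-map one.
periodic = [float(i % 2) for i in range(40)]
chaotic = [0.4]
for _ in range(199):
    chaotic.append(3.99 * chaotic[-1] * (1.0 - chaotic[-1]))

print(approximate_entropy(periodic), approximate_entropy(chaotic))
```

    Components whose approximate entropy values are close could then be summed into one subsequence before a separate predictor is fitted to each group, which is the role the abstract assigns to this measure.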

  10. SVMRFE based approach for prediction of most discriminatory gene target for type II diabetes

    Directory of Open Access Journals (Sweden)

    Atul Kumar

    2017-06-01

    Full Text Available Type II diabetes is a chronic condition that affects the way the body metabolizes sugar, its main source of fuel, and it is becoming increasingly common all over the world. It is now necessary to identify new potential targets for drugs that not only control the disease but can also treat it. Support vector machines are classifiers with the potential to separate discriminatory genes from non-discriminatory genes. SVMRFE, a modification of SVM, ranks the genes by their discriminatory power and eliminates the genes that are not involved in causing the disease. A gene regulatory network was formed from the top-ranked coding genes to identify their role in causing diabetes. To further validate the results, a pathway study was performed to identify the involvement of the coding genes in type II diabetes. The genes obtained from this study showed a significant involvement in causing the disease and may be used as potential drug targets.

  11. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    Science.gov (United States)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimal user interaction, since the most important learning parameters affecting the classification accuracy are determined automatically by the learning algorithm. HiRLiC is applied to a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis shows that HiRLiC compares favorably to other interpretable classifiers in the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machine (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime required to produce the thematic map was orders of magnitude lower than that of the competitors.

  12. Robust LS-SVM-based adaptive constrained control for a class of uncertain nonlinear systems with time-varying predefined performance

    Science.gov (United States)

    Luo, Jianjun; Wei, Caisheng; Dai, Honghua; Yuan, Jianping

    2018-03-01

    This paper focuses on robust adaptive control for a class of uncertain nonlinear systems subject to input saturation and external disturbance with guaranteed predefined tracking performance. To reduce the limitations of the classical predefined performance control method in the presence of unknown initial tracking errors, a novel predefined performance function with time-varying design parameters is first proposed. Then, to reduce the complexity of nonlinear approximations, only two least-square-support-vector-machine-based (LS-SVM-based) approximators with two design parameters are required through norm form transformation of the original system. Further, a novel LS-SVM-based adaptive constrained control scheme is developed under the time-varying predefined performance using the backstepping technique. To avoid the tedious analysis and repeated differentiation of virtual control laws in the backstepping technique, a simple and robust finite-time-convergent differentiator is devised to extract only the first-order derivative at each step in the presence of external disturbance. In this sense, the inherent demerit of the backstepping technique, the "explosion of terms" brought by the recursive virtual controller design, is overcome. Moreover, an auxiliary system is designed to compensate for the control saturation. Finally, three groups of numerical simulations are employed to validate the effectiveness of the newly developed differentiator and the proposed adaptive constrained control scheme.

  13. Multiclass Classification of Cardiac Arrhythmia Using Improved Feature Selection and SVM Invariants.

    Science.gov (United States)

    Mustaqeem, Anam; Anwar, Syed Muhammad; Majid, Muahammad

    2018-01-01

    Arrhythmia is considered a life-threatening disease causing serious health issues in patients, when left untreated. An early diagnosis of arrhythmias would be helpful in saving lives. This study is conducted to classify patients into one of the sixteen subclasses, among which one class represents absence of disease and the other fifteen classes represent electrocardiogram records of various subtypes of arrhythmias. The research is carried out on the dataset taken from the University of California at Irvine Machine Learning Data Repository. The dataset contains a large volume of feature dimensions which are reduced using wrapper based feature selection technique. For multiclass classification, support vector machine (SVM) based approaches including one-against-one (OAO), one-against-all (OAA), and error-correction code (ECC) are employed to detect the presence and absence of arrhythmias. The SVM method results are compared with other standard machine learning classifiers using varying parameters and the performance of the classifiers is evaluated using accuracy, kappa statistics, and root mean square error. The results show that OAO method of SVM outperforms all other classifiers by achieving an accuracy rate of 81.11% when used with 80/20 data split and 92.07% using 90/10 data split option.
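
    The one-against-one (OAO) decomposition named in this abstract trains one binary classifier per pair of classes and lets them vote. The sketch below shows only that decomposition: a nearest-centroid rule stands in for each binary SVM, and the three-class toy data and all function names are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def train_one_against_one(samples, labels):
    """Build one binary model per unordered pair of classes.

    Each "model" is just the two class centroids, a stand-in for a binary SVM.
    For k classes this yields k * (k - 1) / 2 models (120 for sixteen classes).
    """
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    return {
        (a, b): (centroid(by_class[a]), centroid(by_class[b]))
        for a, b in combinations(sorted(by_class), 2)
    }


def predict_one_against_one(models, x):
    """Let every pairwise model vote and return the class with most votes."""
    votes = Counter()
    for (a, b), (centre_a, centre_b) in models.items():
        dist_a = sum((u - v) ** 2 for u, v in zip(x, centre_a))
        dist_b = sum((u - v) ** 2 for u, v in zip(x, centre_b))
        votes[a if dist_a <= dist_b else b] += 1
    return votes.most_common(1)[0][0]


samples = [[0, 0], [0, 1], [5, 5], [6, 5], [0, 9], [1, 10]]
labels = ["A", "A", "B", "B", "C", "C"]
models = train_one_against_one(samples, labels)
print(len(models), predict_one_against_one(models, [5.5, 4.5]))
```

    One-against-all would instead train one model per class (sixteen here), each separating that class from all the others.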

  14. Genome-wide prediction and analysis of human tissue-selective genes using microarray expression data

    Directory of Open Access Journals (Sweden)

    Teng Shaolei

    2013-01-01

    Full Text Available Background: Understanding how genes are expressed specifically in particular tissues is a fundamental question in developmental biology. Many tissue-specific genes are involved in the pathogenesis of complex human diseases. However, experimental identification of tissue-specific genes is time consuming and difficult. Accurate predictions of tissue-specific gene targets could provide useful information for biomarker development and drug target identification. Results: In this study, we have developed a machine learning approach for predicting human tissue-specific genes using microarray expression data. The lists of known tissue-specific genes for different tissues were collected from the UniProt database, and the expression data retrieved from the previously compiled dataset according to the lists were used for input vector encoding. Random Forests (RFs) and Support Vector Machines (SVMs) were used to construct accurate classifiers. The RF classifiers were found to outperform SVM models for tissue-specific gene prediction. The results suggest that the candidate genes for brain or liver specific expression can provide valuable information for further experimental studies. Our approach was also applied to identifying tissue-selective gene targets for different types of tissues. Conclusions: A machine learning approach has been developed for accurately identifying the candidate genes for tissue specific/selective expression. The approach provides an efficient way to select interesting genes for developing new biomedical markers and to improve our knowledge of tissue-specific expression.

  15. NINJA-OPS: Fast Accurate Marker Gene Alignment Using Concatenated Ribosomes.

    Directory of Open Access Journals (Sweden)

    Gabriel A Al-Ghalith

    2016-01-01

    Full Text Available The explosion of bioinformatics technologies in the form of next generation sequencing (NGS) has facilitated a massive influx of genomics data in the form of short reads. Short read mapping is therefore a fundamental component of next generation sequencing pipelines, which routinely match these short reads against reference genomes for contig assembly. However, such techniques have seldom been applied to microbial marker gene sequencing studies, which have mostly relied on novel heuristic approaches. We propose NINJA Is Not Just Another OTU-Picking Solution (NINJA-OPS, or NINJA for short), a fast and highly accurate novel method enabling reference-based marker gene matching (picking Operational Taxonomic Units, or OTUs). NINJA takes advantage of the Burrows-Wheeler (BW) alignment using an artificial reference chromosome composed of concatenated reference sequences, the "concatesome," as the BW input. Other features include automatic support for paired-end reads with arbitrary insert sizes. NINJA is also free and open source and implements several pre-filtering methods that elicit substantial speedup when coupled with existing tools. We applied NINJA to several published microbiome studies, obtaining accuracy similar to or better than previous reference-based OTU-picking methods while achieving an order of magnitude or more speedup and using a fraction of the memory footprint. NINJA is a complete pipeline that takes a FASTA-formatted input file and outputs a QIIME-formatted taxonomy-annotated BIOM file for an entire MiSeq run of human gut microbiome 16S genes in under 10 minutes on a dual-core laptop.
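
    The "concatesome" idea in this abstract, aligning against one artificial chromosome and translating hits back to individual references, can be sketched as below. This is not NINJA's code: plain exact substring search stands in for the Burrows-Wheeler aligner, and the reference names and sequences are invented for the example.

```python
from bisect import bisect_right


def build_concatesome(references):
    """Join all reference sequences and remember where each one starts."""
    names, starts, parts = [], [], []
    offset = 0
    for name, sequence in references.items():
        names.append(name)
        starts.append(offset)
        parts.append(sequence)
        offset += len(sequence)
    return "".join(parts), names, starts


def map_read(read, concatesome, names, starts):
    """Return (reference name, position within it) for the first valid hit.

    Exact matching via str.find is a stand-in for a real BW aligner.
    Hits that straddle the boundary between two references are rejected.
    """
    pos = concatesome.find(read)
    while pos != -1:
        idx = bisect_right(starts, pos) - 1
        end = starts[idx + 1] if idx + 1 < len(starts) else len(concatesome)
        if pos + len(read) <= end:
            return names[idx], pos - starts[idx]
        pos = concatesome.find(read, pos + 1)
    return None


# Hypothetical reference set.
references = {"otu_1": "ACGTACGTGG", "otu_2": "TTGACCA", "otu_3": "GGCATCG"}
concatesome, names, starts = build_concatesome(references)
print(map_read("GACCA", concatesome, names, starts))
print(map_read("GGTT", concatesome, names, starts))
```

    The boundary check matters because concatenation creates artificial junction sequences that exist in no real reference; the second call above hits only such a junction and is therefore reported as unmapped.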

  16. SVM-Based Dynamic Reconfiguration CPS for Manufacturing System in Industry 4.0

    Directory of Open Access Journals (Sweden)

    Hyun-Jun Shin

    2018-01-01

    Full Text Available Cyber-physical systems (CPS) have potential applications in various fields, such as medicine, healthcare, energy, transportation, and defense, as well as in Industry 4.0 in Germany. Although studies on equipment aging and problem prediction have been done by combining CPS with Industry 4.0, such studies are few in number, and the majority of the papers focus primarily on CPS methodology. Therefore, it is necessary to study active self-protection to enable self-management functions, such as self-healing, by applying CPS on the shop floor. In this paper, we propose a model of the shop floor and a dynamically reconfigurable CPS scheme that can predict the occurrence of anomalies and provide self-protection in the model. For this purpose, SVM was used as the machine learning technology, which made it possible to restrain overloading in the manufacturing process. In addition, we design a machine-learning-based CPS framework for Industry 4.0 and simulate it. Simulation results show that the simulation model autonomously detects the abnormal situation and is dynamically reconfigured through self-healing.

  17. Prediction of CO concentrations based on a hybrid Partial Least Square and Support Vector Machine model

    Science.gov (United States)

    Yeganeh, B.; Motlagh, M. Shafie Pour; Rashidi, Y.; Kamalan, H.

    2012-08-01

    Due to the health impacts caused by exposure to air pollutants in urban areas, monitoring and forecasting of air quality parameters have become an important topic in atmospheric and environmental research. Knowledge of the dynamics and complexity of air pollutant behavior has made artificial intelligence models a useful tool for more accurate pollutant concentration prediction. This paper focuses on an innovative method of daily air pollution prediction using a combination of Support Vector Machine (SVM) as predictor and Partial Least Square (PLS) as a data selection tool, based on measured values of CO concentrations. The CO concentrations of the Rey monitoring station in the south of Tehran, from Jan. 2007 to Feb. 2011, have been used to test the effectiveness of this method. The hourly CO concentrations have been predicted using the SVM and the hybrid PLS-SVM models. Similarly, daily CO concentrations have been predicted based on the aforementioned four years of measured data. Results demonstrated that both models have good prediction ability; however, the hybrid PLS-SVM has better accuracy. In the analysis presented in this paper, statistical estimators including relative mean errors, root mean squared errors and the mean absolute relative error have been employed to compare the performance of the models. It has been concluded that the errors decrease after size reduction and that the coefficients of determination increase from 56-81% for the SVM model to 65-85% for the hybrid PLS-SVM model. It was also found that the hybrid PLS-SVM model required less computational time than the SVM model, as expected, hence supporting the more accurate and faster prediction ability of the hybrid PLS-SVM model.
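
    The error measures listed in this abstract can be written down directly. The sketch below assumes the common definitions (root mean squared error, mean absolute relative error, and the coefficient of determination as 1 - SS_res / SS_tot); the abstract does not state which exact formulas were used, and the four observed / predicted values are invented.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mean_absolute_relative_error(observed, predicted):
    """Mean of |error| / |observed|; observed values must be non-zero."""
    n = len(observed)
    return sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical concentrations, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
print(rmse(observed, predicted),
      mean_absolute_relative_error(observed, predicted),
      r_squared(observed, predicted))
```

    On this toy data every prediction is off by 0.5, so the RMSE is 0.5 and the coefficient of determination is 0.95, while the relative error weights the misses on small observed values more heavily.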

  18. SVM-based feature extraction and classification of aflatoxin contaminated corn using fluorescence hyperspectral data

    Science.gov (United States)

    Support Vector Machine (SVM) was used in the Genetic Algorithms (GA) process to select and classify a subset of hyperspectral image bands. The method was applied to fluorescence hyperspectral data for the detection of aflatoxin contamination in Aspergillus flavus infected single corn kernels. In the...

  19. Extended SVM algorithms for multilevel trans-Z-source inverter

    Directory of Open Access Journals (Sweden)

    Aida Baghbany Oskouei

    2016-03-01

    Full Text Available This paper suggests extended algorithms for the multilevel trans-Z-source inverter. These algorithms are based on space vector modulation (SVM), which works with high switching frequency and does not generate the mean value of the desired load voltage in every switching interval. In this topology the output voltage is not limited to the dc source voltage, as it is in the traditional cascaded multilevel inverter, and can be increased with trans-Z-network shoot-through state control. Besides, the topology is more reliable against short circuits, and because of the several dc sources in each phase, it can be used in hybrid renewable energy systems. The proposed SVM algorithms are the combined modulation algorithm (SVPWM) and the algorithm implementing shoot-through in the dwell times of voltage vectors. These algorithms are compared in terms of simplicity, accuracy, number of switchings, and THD. Simulation and experimental results are presented to demonstrate the expected behavior.

  20. A novel transmission line protection using DOST and SVM

    Directory of Open Access Journals (Sweden)

    M. Jaya Bharata Reddy

    2016-06-01

    Full Text Available This paper proposes a smart fault detection, classification and location (SFDCL) methodology for transmission systems with multi-generators using the discrete orthogonal Stockwell transform (DOST). The methodology is based on synchronized current measurements from remote telemetry units (RTUs) installed at both ends of the transmission line. The energy coefficients extracted using DOST from the transient current signals caused by different types of faults are utilized for real-time fault detection and classification. A support vector machine (SVM) has been deployed for locating the fault distance using the extracted coefficients. A comparative study is performed to establish the superiority of SVM over other popular computational intelligence methods, such as the adaptive neuro-fuzzy inference system (ANFIS) and artificial neural network (ANN), for more precise and reliable estimation of fault distance. The results corroborate the effectiveness of the suggested SFDCL algorithm for real-time transmission line fault detection, classification and localization.

  1. Predicting conversion from MCI to AD using resting-state fMRI, graph theoretical approach and SVM.

    Science.gov (United States)

    Hojjati, Seyed Hani; Ebrahimzadeh, Ata; Khazaee, Ali; Babajani-Feremi, Abbas

    2017-04-15

    We investigated distinguishing patients with mild cognitive impairment (MCI) who progress to Alzheimer's disease (AD), MCI converters (MCI-C), from those with MCI who do not progress to AD, MCI non-converters (MCI-NC), based on resting-state fMRI (rs-fMRI). Graph theory and a machine learning approach were utilized to predict the progression of patients with MCI to AD using rs-fMRI. Eighteen MCI converters (average age 73.6 years; 11 male) and 62 age-matched MCI non-converters (average age 73.0 years; 28 male) were included in this study. We trained and tested a support vector machine (SVM) to classify MCI-C from MCI-NC using features constructed from the local and global graph measures. A novel feature selection algorithm was developed and utilized to select an optimal subset of features. Using the subset of optimal features in the SVM, we classified MCI-C from MCI-NC with an accuracy, sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve of 91.4%, 83.24%, 90.1%, and 0.95, respectively. Furthermore, the results of our statistical analyses were used to identify the affected brain regions in AD. To the best of our knowledge, this is the first study that combines graph measures (constructed from rs-fMRI) with a machine learning approach and accurately classifies MCI-C from MCI-NC. The results of this study demonstrate the potential of the proposed approach for early AD diagnosis and the capability of rs-fMRI to predict conversion from MCI to AD by identifying affected brain regions underlying this conversion. Copyright © 2017 Elsevier B.V. All rights reserved.
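
    Accuracy, sensitivity and specificity, as reported in this abstract, all follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not reconstructed from the study's results.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion counts.

    tp / fn: positives (e.g. converters) classified correctly / incorrectly.
    tn / fp: negatives (e.g. non-converters) classified correctly / incorrectly.
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # share of positives that were detected
        "specificity": tn / (tn + fp),   # share of negatives that were cleared
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=40, fn=10, tn=90, fp=10)
print(metrics)
```

    With unbalanced groups, such as 18 converters against 62 non-converters, accuracy alone can look good even when few positives are found, which is why sensitivity and specificity are reported separately.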

  2. Intrusion detection model using fusion of chi-square feature selection and multi class SVM

    Directory of Open Access Journals (Sweden)

    Ikram Sumaiya Thaseen

    2017-10-01

    Full Text Available Intrusion detection is a promising area of research in the domain of security, given the rapid development of the internet in everyday life. Many intrusion detection systems (IDS) employ a sole classifier algorithm for classifying network traffic as normal or abnormal. Due to the large amount of data, these sole classifier models fail to achieve a high attack detection rate with a reduced false alarm rate. However, by applying dimensionality reduction, data can be efficiently reduced to an optimal set of attributes without loss of information and then classified accurately using a multi class modeling technique for identifying the different network attacks. In this paper, we propose an intrusion detection model using chi-square feature selection and a multi class support vector machine (SVM). A parameter tuning technique is adopted to optimize the Radial Basis Function kernel parameter gamma, represented by ‘ϒ’, and the overfitting constant ‘C’. These are the two important parameters required for the SVM model. The main idea behind this model is to construct a multi class SVM, which has not been adopted for IDS so far, to decrease the training and testing time and increase the individual classification accuracy of the network attacks. The experimental results on the NSL-KDD dataset, an enhanced version of the KDDCup 1999 dataset, show that our proposed approach results in a better detection rate and a reduced false alarm rate. An experiment on the computational time required for training and testing is also carried out for usage in time critical applications.
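
    Chi-square feature selection, as named in this abstract, scores each feature by how strongly its values depend on the class label and keeps the highest-scoring ones. The sketch below computes the Pearson chi-square statistic for categorical features; the two toy features and their values are assumptions for illustration, not records from NSL-KDD.

```python
from collections import Counter


def chi_square_score(feature_values, labels):
    """Pearson chi-square statistic between one categorical feature and the class."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    score = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in label_totals.items():
            # Count expected in this cell if feature and class were independent.
            expected = value_count * label_count / n
            observed = joint.get((value, label), 0)
            score += (observed - expected) ** 2 / expected
    return score


labels = ["normal"] * 4 + ["attack"] * 4
features = {
    "informative": ["ok"] * 4 + ["bad"] * 4,    # separates the classes perfectly
    "uninformative": ["x", "y", "x", "y"] * 2,  # same mix in both classes
}
ranking = sorted(features, key=lambda f: chi_square_score(features[f], labels), reverse=True)
print(ranking)
```

    Keeping only the top-ranked features before training is what shortens training and testing time; how many to keep is a tuning choice that the abstract does not specify.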

  3. Elastic SCAD as a novel penalization method for SVM classification tasks in high-dimensional data.

    Science.gov (United States)

    Becker, Natalia; Toedt, Grischa; Lichter, Peter; Benner, Axel

    2011-05-09

    Classification and variable selection play an important role in knowledge discovery in high-dimensional data. Although Support Vector Machine (SVM) algorithms are among the most powerful classification and prediction methods with a wide range of scientific applications, the SVM does not include automatic feature selection and therefore a number of feature selection procedures have been developed. Regularisation approaches extend SVM to a feature selection method in a flexible way using penalty functions like LASSO, SCAD and Elastic Net. We propose a novel penalty function for SVM classification tasks, Elastic SCAD, a combination of SCAD and ridge penalties which overcomes the limitations of each penalty alone. Since SVM models are extremely sensitive to the choice of tuning parameters, we adopted an interval search algorithm, which in comparison to a fixed grid search finds rapidly and more precisely a global optimal solution. Feature selection methods with combined penalties (Elastic Net and Elastic SCAD SVMs) are more robust to a change of the model complexity than methods using single penalties. Our simulation study showed that Elastic SCAD SVM outperformed LASSO (L1) and SCAD SVMs. Moreover, Elastic SCAD SVM provided sparser classifiers in terms of median number of features selected than Elastic Net SVM and often better predicted than Elastic Net in terms of misclassification error. Finally, we applied the penalization methods described above on four publicly available breast cancer data sets. Elastic SCAD SVM was the only method providing robust classifiers in sparse and non-sparse situations. The proposed Elastic SCAD SVM algorithm provides the advantages of the SCAD penalty and at the same time avoids sparsity limitations for non-sparse data. We were first to demonstrate that the integration of the interval search algorithm and penalized SVM classification techniques provides fast solutions on the optimization of tuning parameters. The penalized SVM
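
    The combination of SCAD and ridge penalties described in this abstract can be sketched numerically. The block below uses the standard SCAD penalty with shape parameter `a` (3.7 is a commonly used default) and adds a squared L2 term; treating Elastic SCAD as exactly this sum, and the parameter names `lam1` and `lam2`, are assumptions of the example rather than details given in the abstract.

```python
def scad_penalty(w, lam, a=3.7):
    """Standard SCAD penalty for a single coefficient w (requires a > 2)."""
    w = abs(w)
    if w <= lam:
        return lam * w                                   # LASSO-like near zero
    if w <= a * lam:
        return -(w * w - 2.0 * a * lam * w + lam * lam) / (2.0 * (a - 1.0))
    return (a + 1.0) * lam * lam / 2.0                   # constant for large |w|


def elastic_scad_penalty(weights, lam1, lam2, a=3.7):
    """SCAD penalty on each weight plus a ridge (squared L2) penalty."""
    scad_part = sum(scad_penalty(w, lam1, a) for w in weights)
    ridge_part = lam2 * sum(w * w for w in weights)
    return scad_part + ridge_part


print(scad_penalty(0.5, lam=1.0), scad_penalty(10.0, lam=1.0))
print(elastic_scad_penalty([0.5, 10.0], lam1=1.0, lam2=0.1))
```

    The SCAD part stops growing once a coefficient exceeds `a * lam`, so large coefficients are not shrunk further, while the ridge part keeps penalising them; this interplay is what the abstract credits for the behaviour on non-sparse data.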

  4. Supervised learning methods for pathological arterial pulse wave differentiation: A SVM and neural networks approach.

    Science.gov (United States)

    Paiva, Joana S; Cardoso, João; Pereira, Tânia

    2018-01-01

    The main goal of this study was to develop an automatic method, based on supervised learning, able to distinguish healthy from pathologic arterial pulse waves (APW), and those two from noisy waveforms (non-relevant segments of the signal), in data acquired during a clinical examination with a novel optical system. The APW dataset analysed was composed of signals acquired in a clinical environment from a total of 213 subjects, including healthy volunteers and non-healthy patients. The signals were parameterised by means of 39 pulse features: morphologic, time domain statistics, cross-correlation features, and wavelet features. The multiclass Support Vector Machine Recursive Feature Elimination (SVM RFE) method was used to select the most relevant features. A comparative study was performed in order to evaluate the performance of two classifiers: Support Vector Machine (SVM) and Artificial Neural Network (ANN). SVM achieved a statistically significantly better performance for this problem, with an average accuracy of 0.9917±0.0024 and an F-Measure of 0.9925±0.0019, in comparison with ANN, which reached values of 0.9847±0.0032 and 0.9852±0.0031 for Accuracy and F-Measure, respectively. A significant difference was observed between the performances obtained with the SVM classifier using different numbers of features from the original set available. The comparison between SVM and ANN reaffirmed the higher performance of SVM. The results obtained in this study showed the potential of the proposed method to differentiate those three important signal outcomes (healthy, pathologic and noise) and to reduce bias associated with clinical diagnosis of cardiovascular disease using APW. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. A Machine Learned Classifier That Uses Gene Expression Data to Accurately Predict Estrogen Receptor Status

    Science.gov (United States)

    Bastani, Meysam; Vos, Larissa; Asgarian, Nasimeh; Deschenes, Jean; Graham, Kathryn; Mackey, John; Greiner, Russell

    2013-01-01

    Background Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin embedded samples, suffers from numerous technical and reproducibility issues. Assessment of ER-status based on RNA expression can provide more objective, quantitative and reproducible test results. Methods To learn a parsimonious RNA-based classifier of hormone receptor status, we applied a machine learning tool to a training dataset of gene expression microarray data obtained from 176 frozen breast tumors, whose ER-status was determined by applying ASCO-CAP guidelines to standardized immunohistochemical testing of formalin fixed tumor. Results This produced a three-gene classifier that can predict the ER-status of a novel tumor, with a cross-validation accuracy of 93.17±2.44%. When applied to an independent validation set and to four other public databases, some on different platforms, this classifier obtained over 90% accuracy in each. In addition, we found that this prediction rule separated the patients' recurrence-free survival curves with a hazard ratio lower than the one based on the IHC analysis of ER-status. Conclusions Our efficient and parsimonious classifier lends itself to high throughput, highly accurate and low-cost RNA-based assessments of ER-status, suitable for routine high-throughput clinical use. This analytic method provides a proof-of-principle that may be applicable to developing effective RNA-based tests for other biomarkers and conditions. PMID:24312637

  6. SVM and ANFIS Models for Precipitation Modeling (Case Study: GonbadKavouse)

    Directory of Open Access Journals (Sweden)

    N. Zabet Pishkhani

    2016-10-01

    Full Text Available Introduction: In recent years, intelligent models have increasingly been used as new techniques and tools in hydrological processes such as precipitation forecasting. The ANFIS model has good ability in training, construction and classification, and also has the advantage of allowing fuzzy rules to be extracted from numerical information or knowledge. Another intelligent technique that has been used in various areas in recent years is the support vector machine (SVM). In this paper, the ability of artificial intelligence methods, including the support vector machine (SVM) and the adaptive neuro-fuzzy inference system (ANFIS), was analyzed in monthly precipitation prediction. Materials and Methods: The study area was the city of Gonbad in Golestan Province. The climate is temperate in the southern highlands and southern plains, temperate and humid in the mountains, and semi-arid north of the Gorganroud river. Overall, the city's climate is temperate and humid. In the present study, monthly precipitation in Gonbad was modeled using ANFIS and SVM, and two different database structures were designed. In the first structure, the input layer consisted of mean temperature, relative humidity, pressure and wind speed at Gonbad station. In the second structure, monthly precipitation data were used from four stations (Arazkoose, Bahalke, Tamar and Aqqala) which, according to the Pearson coefficient, had a higher correlation with Gonbad station precipitation. Precipitation data from 1995 to 2012 were used; 80% of the data were used for model training and the remaining 20% for validation. SVM was developed by Vapnik in the 1990s. SVM has been widely recognized as a powerful tool to deal with function fitting problems. An Adaptive Neuro-Fuzzy Inference System (ANFIS) refers, in general, to an adaptive network which performs the function of a fuzzy inference system. The most commonly used fuzzy system in ANFIS architectures is the Sugeno model
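
The station selection and the 80/20 split described in this record can be sketched as follows. This is only an illustration under stated assumptions: the station names and values are made up, the helper names `pearson`, `rank_stations` and `chronological_split` are hypothetical, and a chronological split is assumed because the abstract does not say how the 80% and 20% parts were chosen.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_stations(target, candidates, top_k):
    """Return the names of the top_k candidate series most correlated with target."""
    scored = sorted(candidates.items(),
                    key=lambda kv: pearson(target, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:top_k]]

def chronological_split(series, train_fraction=0.8):
    """Assumed split: first part for training, remainder for validation."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

# Made-up monthly precipitation values, for illustration only.
target = [10, 20, 30, 40, 50]
candidates = {
    "A": [12, 19, 33, 41, 48],
    "B": [50, 40, 30, 20, 10],
    "C": [10, 30, 20, 50, 40],
}
selected = rank_stations(target, candidates, top_k=2)
train, valid = chronological_split(list(range(10)))
```

Ranking by the signed coefficient is a design choice here; ranking by its absolute value would also admit strongly anti-correlated stations.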

  7. Analyzing kernel matrices for the identification of differentially expressed genes.

    Directory of Open Access Journals (Sweden)

    Xiao-Lei Xia

    Full Text Available One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify the differentially expressed genes (DEGs), followed by the employment of state-of-the-art learning machines, the Support Vector Machine (SVM) in particular. The SVM is a typical sample-based classifier whose performance comes down to how discriminant the samples are. However, DEGs identified by statistical tests are not guaranteed to result in a training dataset composed of discriminant samples. To tackle this problem, a novel gene ranking method, namely Kernel Matrix Gene Selection (KMGS), is proposed. The rationale of the method, which is rooted in the fundamental ideas of the SVM algorithm, is described. The notion of ''the separability of a sample'', which is estimated by performing [Formula: see text]-like statistics on each column of the kernel matrix, is first introduced. The separability of a classification problem is then measured, from which the significance of a specific gene is deduced. Also described is a method of Kernel Matrix Sequential Forward Selection (KMSFS), which shares the KMGS method's essential ideas but proceeds in a greedy manner. On three public microarray datasets, our proposed algorithms achieved noticeably competitive performance in terms of the B.632+ error rate.

  8. Evaluation of new reference genes in papaya for accurate transcript normalization under different experimental conditions.

    Directory of Open Access Journals (Sweden)

    Xiaoyang Zhu

    Full Text Available Real-time reverse transcription PCR (RT-qPCR) is a preferred method for rapid and accurate quantification in gene expression studies. Appropriate application of RT-qPCR requires accurate normalization through the use of reference genes. As no single reference gene is universally suitable for all experiments, validation of reference gene(s) under different experimental conditions is crucial for RT-qPCR analysis. To date, only a few studies on reference genes have been done in other plants but none in papaya. In the present work, we selected 21 candidate reference genes, and evaluated their expression stability in 246 papaya fruit samples using three algorithms, geNorm, NormFinder and RefFinder. The samples consisted of 13 sets collected under different experimental conditions, including various tissues, different storage temperatures, different cultivars, developmental stages, postharvest ripening, modified atmosphere packaging, 1-methylcyclopropene (1-MCP) treatment, hot water treatment, biotic stress and hormone treatment. Our results demonstrated that expression stability varied greatly between reference genes and that suitable reference gene(s) or combinations of reference genes for normalization should be validated according to the experimental conditions. In general, the internal reference genes EIF (Eukaryotic initiation factor 4A), TBP1 (TATA binding protein 1) and TBP2 (TATA binding protein 2) performed well under most experimental conditions, whereas the reference genes most widely used at present, ACTIN (Actin 2), 18S rRNA (18S ribosomal RNA) and GAPDH (Glyceraldehyde-3-phosphate dehydrogenase), were not suitable in many experimental conditions. In addition, two commonly used programs, geNorm and NormFinder, proved sufficient for the validation. This work provides the first systematic analysis for the selection of superior reference genes for accurate transcript normalization in papaya under different experimental

  9. An SVM classifier to separate false signals from microcalcifications in digital mammograms

    Energy Technology Data Exchange (ETDEWEB)

    Bazzani, Armando; Bollini, Dante; Brancaccio, Rosa; Campanini, Renato; Riccardi, Alessandro; Romani, Davide [Department of Physics, University of Bologna (Italy); INFN, Bologna (Italy); Lanconelli, Nico [Department of Physics, University of Bologna, and INFN, Bologna (Italy). E-mail: nico.lanconelli@bo.infn.it; Bevilacqua, Alessandro [Department of Electronics, Computer Science and Systems, University of Bologna, and INFN, Bologna (Italy)

    2001-06-01

    In this paper we investigate the feasibility of using an SVM (support vector machine) classifier in our automatic system for the detection of clustered microcalcifications in digital mammograms. SVM is a technique for pattern recognition which relies on the statistical learning theory. It minimizes a function of two terms: the number of misclassified vectors of the training set and a term regarding the generalization classifier capability. We compare the SVM classifier with an MLP (multi-layer perceptron) in the false-positive reduction phase of our detection scheme: a detected signal is considered either microcalcification or false signal, according to the value of a set of its features. The SVM classifier gets slightly better results than the MLP one (Az value of 0.963 against 0.958) in the presence of a high number of training data; the improvement becomes much more evident (Az value of 0.952 against 0.918) in training sets of reduced size. Finally, the setting of the SVM classifier is much easier than the MLP one. (author)

  10. Reference genes for accurate transcript normalization in citrus genotypes under different experimental conditions.

    Directory of Open Access Journals (Sweden)

    Valéria Mafra

    Full Text Available Real-time reverse transcription PCR (RT-qPCR) has emerged as an accurate and widely used technique for expression profiling of selected genes. However, obtaining reliable measurements depends on the selection of appropriate reference genes for gene expression normalization. The aim of this work was to assess the expression stability of 15 candidate genes to determine which set of reference genes is best suited for transcript normalization in citrus in different tissues and organs and leaves challenged with five pathogens (Alternaria alternata, Phytophthora parasitica, Xylella fastidiosa and Candidatus Liberibacter asiaticus). We tested traditional genes used for transcript normalization in citrus and orthologs of Arabidopsis thaliana genes described as superior reference genes based on transcriptome data. The geNorm and NormFinder algorithms were used to find the best reference genes to normalize all samples and conditions tested. Additionally, each biotic stress was individually analyzed by geNorm. In general, FBOX (encoding a member of the F-box family) and GAPC2 (GAPDH) were the most stable candidate gene set assessed under the different conditions and subsets tested, while CYP (cyclophilin), TUB (tubulin) and CtP (cathepsin) were the least stably expressed genes found. Validation of the best suitable reference genes for normalizing the expression level of the WRKY70 transcription factor in leaves infected with Candidatus Liberibacter asiaticus showed that arbitrary use of reference genes without previous testing could lead to misinterpretation of data. Our results revealed FBOX, SAND (a SAND family protein), GAPC2 and UPL7 (ubiquitin protein ligase 7) to be superior reference genes, and we recommend their use in studies of gene expression in citrus species and relatives. This work constitutes the first systematic analysis for the selection of superior reference genes for transcript normalization in different citrus organs and under biotic stress.

  11. Network motif-based identification of transcription factor-target gene relationships by integrating multi-source biological data

    Directory of Open Access Journals (Sweden)

    de los Reyes Benildo G

    2008-04-01

    Full Text Available Abstract Background: Integrating data from multiple global assays and curated databases is essential to understand the spatio-temporal interactions within cells. Different experiments measure cellular processes at various widths and depths, while databases contain biological information based on established facts or published data. Integrating these complementary datasets helps infer a mutually consistent transcriptional regulatory network (TRN) with strong similarity to the structure of the underlying genetic regulatory modules. Decomposing the TRN into a small set of recurring regulatory patterns, called network motifs (NM), facilitates the inference. Identifying NMs defined by specific transcription factors (TF) establishes the framework structure of a TRN and allows the inference of TF-target gene relationships. This paper introduces a computational framework for utilizing data from multiple sources to infer TF-target gene relationships on the basis of NMs. The data include time course gene expression profiles, genome-wide location analysis data, binding sequence data, and gene ontology (GO) information. Results: The proposed computational framework was tested using gene expression data associated with cell cycle progression in yeast. Among 800 cell cycle related genes, 85 were identified as candidate TFs and classified into four previously defined NMs. The NMs for a subset of TFs are obtained from literature. Support vector machine (SVM) classifiers were used to estimate NMs for the remaining TFs. The potential downstream target genes for the TFs were clustered into 34 biologically significant groups. The relationships between TFs and potential target gene clusters were examined by training recurrent neural networks whose topologies mimic the NMs to which the TFs are classified. The identified relationships between TFs and gene clusters were evaluated using the following biological validation and statistical analyses: (1) Gene set enrichment

  12. A machine learned classifier that uses gene expression data to accurately predict estrogen receptor status.

    Directory of Open Access Journals (Sweden)

    Meysam Bastani

    Full Text Available BACKGROUND: Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin embedded samples, suffers from numerous technical and reproducibility issues. Assessment of ER-status based on RNA expression can provide more objective, quantitative and reproducible test results. METHODS: To learn a parsimonious RNA-based classifier of hormone receptor status, we applied a machine learning tool to a training dataset of gene expression microarray data obtained from 176 frozen breast tumors, whose ER-status was determined by applying ASCO-CAP guidelines to standardized immunohistochemical testing of formalin fixed tumor. RESULTS: This produced a three-gene classifier that can predict the ER-status of a novel tumor, with a cross-validation accuracy of 93.17±2.44%. When applied to an independent validation set and to four other public databases, some on different platforms, this classifier obtained over 90% accuracy in each. In addition, we found that this prediction rule separated the patients' recurrence-free survival curves with a hazard ratio lower than the one based on the IHC analysis of ER-status. CONCLUSIONS: Our efficient and parsimonious classifier lends itself to high throughput, highly accurate and low-cost RNA-based assessments of ER-status, suitable for routine high-throughput clinical use. This analytic method provides a proof-of-principle that may be applicable to developing effective RNA-based tests for other biomarkers and conditions.

  13. Tumor Classification Using High-Order Gene Expression Profiles Based on Multilinear ICA

    Directory of Open Access Journals (Sweden)

    Ming-gang Du

    2009-01-01

    Full Text Available Motivation. Independent Components Analysis (ICA) maximizes the statistical independence of the representational components of a training gene expression profiles (GEP) ensemble, but it cannot distinguish relations between the different factors, or different modes, and it is not applicable to high-order GEP data mining. In order to generalize ICA, we introduce Multilinear-ICA and apply it to tumor classification using high-order GEP. Firstly, we introduce the basic concepts and operations of tensors and present the Support Vector Machine (SVM) classifier and Multilinear-ICA. Secondly, the higher-scoring genes of the original high-order GEP are selected using t-statistics and arranged into tensors. Thirdly, Multilinear-ICA is applied to the tensors. Finally, the SVM is used to classify the tumor subtypes. Results. To show the validity of the proposed method, we apply it to tumor classification using high-order GEP. Though we only use three datasets, the experimental results show that the method is effective and feasible. Through this survey, we hope to gain some insight into the problem of high-order GEP tumor classification, in aid of further developing more effective tumor classification algorithms.

  14. An improved conjugate gradient scheme to the solution of least squares SVM.

    Science.gov (United States)

    Chu, Wei; Ong, Chong Jin; Keerthi, S Sathiya

    2005-03-01

    The least square support vector machines (LS-SVM) formulation corresponds to the solution of a linear system of equations. Several approaches to its numerical solutions have been proposed in the literature. In this letter, we propose an improved method to the numerical solution of LS-SVM and show that the problem can be solved using one reduced system of linear equations. Compared with the existing algorithm for LS-SVM, the approach used in this letter is about twice as efficient. Numerical results using the proposed method are provided for comparisons with other existing algorithms.
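
The statement that LS-SVM corresponds to a linear system can be illustrated with a small pure-Python sketch. It solves the standard (unreduced) LS-SVM system [[0, 1ᵀ], [1, K + I/C]] [b; α] = [0; y] by Gaussian elimination; it is not the reduced system or the conjugate gradient scheme proposed in the letter. The RBF kernel, the values of `C` and `gamma`, and all function names are assumptions chosen for illustration.

```python
import math

def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def lssvm_fit(X, y, C=10.0, gamma=1.0):
    """Build and solve the (n+1) x (n+1) LS-SVM system; return (b, alpha)."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [rbf(X[i], X[j], gamma) + (1.0 / C if i == j else 0.0)
                          for j in range(n)])
    sol = solve(A, [0.0] + [float(v) for v in y])
    return sol[0], sol[1:]

def lssvm_decision(X, alpha, b, x, gamma=1.0):
    """Decision value b + sum_i alpha_i k(x_i, x); its sign is the class."""
    return b + sum(a * rbf(xi, x, gamma) for a, xi in zip(alpha, X))

# XOR-like toy data with labels in {-1, +1}.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [-1, 1, 1, -1]
b, alpha = lssvm_fit(X, y)
signs = [1 if lssvm_decision(X, alpha, b, x) > 0 else -1 for x in X]
```

The first row of the system enforces that the coefficients `alpha` sum to zero, which is why a single linear solve yields both `alpha` and the bias `b`.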

  15. Accurate measurement of gene copy number for human alpha-defensin DEFA1A3.

    Science.gov (United States)

    Khan, Fayeza F; Carpenter, Danielle; Mitchell, Laura; Mansouri, Omniah; Black, Holly A; Tyson, Jess; Armour, John A L

    2013-10-20

    Multi-allelic copy number variants include examples of extensive variation between individuals in the copy number of important genes, most notably genes involved in immune function. The definition of this variation, and analysis of its impact on function, has been hampered by the technical difficulty of large-scale but accurate typing of genomic copy number. The copy-variable alpha-defensin locus DEFA1A3 on human chromosome 8 commonly varies between 4 and 10 copies per diploid genome, and presents considerable challenges for accurate high-throughput typing. In this study, we developed two paralogue ratio tests and three allelic ratio measurements that, in combination, provide an accurate and scalable method for measurement of DEFA1A3 gene number. We combined information from different measurements in a maximum-likelihood framework which suggests that most samples can be assigned to an integer copy number with high confidence, and applied it to typing 589 unrelated European DNA samples. Typing the members of three-generation pedigrees provided further reassurance that correct integer copy numbers had been assigned. Our results have allowed us to discover that the SNP rs4300027 is strongly associated with DEFA1A3 gene copy number in European samples. We have developed an accurate and robust method for measurement of DEFA1A3 copy number. Interrogation of rs4300027 and associated SNPs in Genome-Wide Association Study SNP data provides no evidence that alpha-defensin copy number is a strong risk factor for phenotypes such as Crohn's disease, type I diabetes, HIV progression and multiple sclerosis.

  16. Gene-Based Multiclass Cancer Diagnosis with Class-Selective Rejections

    Science.gov (United States)

    Jrad, Nisrine; Grall-Maës, Edith; Beauseroy, Pierre

    2009-01-01

    Supervised learning of microarray data has received much attention in recent years. Multiclass cancer diagnosis, based on selected gene profiles, is used as an adjunct to clinical diagnosis. However, supervised diagnosis may hinder patient care, add expense or confound a result. To avoid such misleading outcomes, a multiclass cancer diagnosis with class-selective rejection is proposed. It rejects some patients from one, some, or all classes in order to ensure higher reliability while reducing time and expense costs. Moreover, this classifier takes into account asymmetric penalties dependent on each class and on each wrong or partially correct decision. It is based on ν-1-SVM coupled with its regularization path and minimizes a general loss function defined in the class-selective rejection scheme. State-of-the-art multiclass algorithms can be considered a particular case of the proposed algorithm in which the number of decisions is given by the classes and the loss function is defined by the Bayesian risk. Two experiments are carried out in the Bayesian and the class-selective rejection frameworks. Five gene-selected datasets are used to assess the performance of the proposed method. Results are discussed and accuracies are compared with those computed by the Naive Bayes, Nearest Neighbor, Linear Perceptron, Multilayer Perceptron, and Support Vector Machines classifiers. PMID:19584932

  17. Towards a physiology-based measure of pain: patterns of human brain activity distinguish painful from non-painful thermal stimulation.

    Directory of Open Access Journals (Sweden)

    Justin E Brown

    Full Text Available Pain often exists in the absence of observable injury; therefore, the gold standard for pain assessment has long been self-report. Because the inability to verbally communicate can prevent effective pain management, research efforts have focused on the development of a tool that accurately assesses pain without depending on self-report. Those previous efforts have not proven successful at substituting self-report with a clinically valid, physiology-based measure of pain. Recent neuroimaging data suggest that functional magnetic resonance imaging (fMRI) and support vector machine (SVM) learning can be jointly used to accurately assess cognitive states. Therefore, we hypothesized that an SVM trained on fMRI data can assess pain in the absence of self-report. In fMRI experiments, 24 individuals were presented painful and nonpainful thermal stimuli. Using eight individuals, we trained a linear SVM to distinguish these stimuli using whole-brain patterns of activity. We assessed the performance of this trained SVM model by testing it on 16 individuals whose data were not used for training. The whole-brain SVM was 81% accurate at distinguishing painful from non-painful stimuli (p<0.0000001). Using distance from the SVM hyperplane as a confidence measure, accuracy was further increased to 84%, albeit at the expense of excluding 15% of the stimuli that were the most difficult to classify. Overall performance of the SVM was primarily affected by activity in pain-processing regions of the brain including the primary somatosensory cortex, secondary somatosensory cortex, insular cortex, primary motor cortex, and cingulate cortex. Region of interest (ROI) analyses revealed that whole-brain patterns of activity led to more accurate classification than localized activity from individual brain regions. Our findings demonstrate that fMRI with SVM learning can assess pain without requiring any communication from the person being tested. We outline tasks that should be
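
The idea of using distance from the SVM hyperplane as a confidence measure can be sketched as below: the samples whose decision values lie closest to zero are set aside, and accuracy is computed on the rest. The decision values and labels are made-up numbers, and `accuracy_with_rejection` is a hypothetical helper, not code from the study.

```python
def accuracy_with_rejection(decision_values, labels, reject_fraction=0.15):
    """Drop the least confident samples (smallest |decision value|) and
    return (accuracy on the kept samples, number of kept samples)."""
    order = sorted(range(len(decision_values)),
                   key=lambda i: abs(decision_values[i]))
    n_reject = int(reject_fraction * len(order))
    kept = order[n_reject:]
    correct = sum(1 for i in kept
                  if (decision_values[i] > 0) == (labels[i] > 0))
    return correct / len(kept), len(kept)

# Made-up signed distances from a hyperplane and true labels in {-1, +1}.
d = [2.0, -1.5, 0.1, -0.2, 1.0, -3.0, 0.9, -1.1]
labels = [1, -1, -1, 1, 1, -1, 1, -1]

acc_all, n_all = accuracy_with_rejection(d, labels, reject_fraction=0.0)
acc_kept, n_kept = accuracy_with_rejection(d, labels, reject_fraction=0.25)
```

In this toy data the two misclassified samples are also the two closest to the hyperplane, so rejecting them raises accuracy; on real data the gain depends on how well distance tracks difficulty.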

  18. Semiquantitative dynamic contrast-enhanced MRI for accurate classification of complex adnexal masses.

    Science.gov (United States)

    Kazerooni, Anahita Fathi; Malek, Mahrooz; Haghighatkhah, Hamidreza; Parviz, Sara; Nabil, Mahnaz; Torbati, Leila; Assili, Sanam; Saligheh Rad, Hamidreza; Gity, Masoumeh

    2017-02-01

    To identify the best dynamic contrast-enhanced (DCE) magnetic resonance imaging (MRI) descriptive parameters in predicting malignancy of complex ovarian masses, and develop an optimal decision tree for accurate classification of benign and malignant complex ovarian masses. Preoperative DCE-MR images of 55 sonographically indeterminate ovarian masses (27 benign and 28 malignant) were analyzed prospectively. Four descriptive parameters of the dynamic curve, namely, time-to-peak (TTP), wash-in-rate (WIR), relative signal intensity (SIrel), and the initial area under the curve (IAUC60) were calculated on the normalized curves of specified regions-of-interest (ROIs). A two-tailed Student's t-test and two automated classifiers, linear discriminant analysis (LDA) and support vector machines (SVMs), were used to compare the performance of the mentioned parameters individually and in combination with each other. TTP (P = 6.15E-8) and WIR (P = 5.65E-5) parameters induced the highest sensitivity (89% for LDA, and 97% for SVM) and specificity (93% for LDA, and 100% for SVM), respectively. Regarding the high sensitivity of TTP and high specificity of WIR and through their combination, an accurate and simple decision-tree classifier was designed using the line equation obtained by LDA classification model. The proposed classifier achieved an accuracy of 89% and area under the ROC curve of 93%. In this study an accurate decision-tree classifier based on a combination of TTP and WIR parameters was proposed, which provides a clinically flexible framework to aid radiologists/clinicians to reach a conclusive preoperative diagnosis and patient-specific therapy plan for distinguishing malignant from benign complex ovarian masses. 2 J. Magn. Reson. Imaging 2017;45:418-427. © 2016 International Society for Magnetic Resonance in Medicine.

  19. Parameters Optimization and Application to Glutamate Fermentation Model Using SVM

    OpenAIRE

    Zhang, Xiangsheng; Pan, Feng

    2015-01-01

    A new method is developed for parameter optimization in support vector machine (SVM) modelling of glutamate fermentation. It optimizes the SVM parameters via an improved particle swarm optimization (IPSO) algorithm which has better global searching ability. The algorithm includes detecting and handling local convergence and exhibits a strong ability to avoid being trapped in local minima. The main steps of the method are shown. Simulation experiments demonstrate the effective...

  20. In Silico Prediction of Gamma-Aminobutyric Acid Type-A Receptors Using Novel Machine-Learning-Based SVM and GBDT Approaches

    Directory of Open Access Journals (Sweden)

    Zhijun Liao

    2016-01-01

    Full Text Available Gamma-aminobutyric acid type-A receptors (GABAARs) belong to the multisubunit membrane-spanning ligand-gated ion channels (LGICs), which act as the principal mediators of rapid inhibitory synaptic transmission in the human brain. Therefore, the category prediction of GABAARs just from the protein amino acid sequence would be very helpful for the recognition and research of novel receptors. Based on the proteins’ physicochemical properties, amino acid composition and position, a GABAAR classifier was first constructed using a 188-dimensional (188D) algorithm at 90% cd-hit identity and compared with the pseudo-amino acid composition (PseAAC) and ProtrWeb web-based algorithms for human GABAAR proteins. Then, four classifiers, including gradient boosting decision tree (GBDT), random forest (RF), a library for support vector machine (libSVM), and k-nearest neighbor (k-NN), were compared on the dataset at a low cd-hit identity of 40%. This work obtained the highest correctly classified rate at 96.8% and the highest specificity at 99.29%, but the values of sensitivity, accuracy, and Matthews correlation coefficient were a little lower than those of PseAAC and ProtrWeb; GBDT and libSVM performed slightly better than RF and k-NN on the second dataset. In conclusion, a GABAAR classifier was successfully constructed using only the protein sequence information.

  1. Classification of Hyperspectral Images by SVM Using a Composite Kernel by Employing Spectral, Spatial and Hierarchical Structure Information

    Directory of Open Access Journals (Sweden)

    Yi Wang

    2018-03-01

    Full Text Available In this paper, we introduce a novel classification framework for hyperspectral images (HSIs) by jointly employing spectral, spatial, and hierarchical structure information. In this framework, the three types of information are integrated into the SVM classifier by means of multiple kernels. Specifically, the spectral kernel is constructed through each pixel’s vector value in the original HSI, and the spatial kernel is modeled by using the extended morphological profile method due to its simplicity and effectiveness. To accurately characterize hierarchical structure features, the techniques of Fish-Markov selector (FMS), marker-based hierarchical segmentation (MHSEG) and algebraic multigrid (AMG) are combined. First, the FMS algorithm is used on the original HSI for feature selection to produce its spectral subset. Then, the multigrid structure of this subset is constructed using the AMG method. Subsequently, the MHSEG algorithm is exploited to obtain a hierarchy consisting of a series of segmentation maps. Finally, the hierarchical structure information is represented by using these segmentation maps. The main contribution of this work is to present an effective composite kernel for HSI classification by utilizing spatial structure information at multiple scales. Experiments were conducted on two hyperspectral remote sensing images to validate that the proposed framework can achieve better classification results than several popular kernel-based classification methods in terms of both qualitative and quantitative analysis. Specifically, the overall accuracy of the proposed classification framework is on average 13.46–15.61% higher than that of the standard SVM classifier under different training sets.

  2. Identifying Regulatory Patterns at the 3'end Regions of Over-expressed and Under-expressed Genes

    KAUST Repository

    Othoum, Ghofran K

    2013-05-01

    Promoters, neighboring regulatory regions and those extending further upstream of the 5’end of genes, are considered one of the main components affecting the expression status of genes in a specific phenotype. More recently, research by Chen et al. (2006, 2012) and Mapendano et al. (2010) demonstrated that the 3’end regulatory regions of genes also influence gene expression. However, the association between the regulatory regions surrounding the 3’end of genes and their over- or under-expression status in a particular phenotype has not been systematically studied. The aim of this study is to ascertain if regulatory regions surrounding the 3’end of genes contain sufficient regulatory information to correlate genes with their expression status in a particular phenotype. Over- and under-expressed ovarian cancer (OC) genes were used as a model. Exploratory analysis of the 3’end regions was performed by transforming the annotated regions using principal component analysis (PCA), followed by clustering the transformed data, thereby achieving a clear separation of genes with different expression status. Additionally, several classification algorithms such as Naïve Bayes, Random Forest and Support Vector Machine (SVM) were tested with different parameter settings to analyze the discriminatory capacity of the 3’end regions of genes related to their gene expression status. The best performance was achieved using the SVM classification model with 10-fold cross-validation, which yielded an accuracy of 98.4%, sensitivity of 99.5% and specificity of 92.5%. To predict gene expression status for newly available instances, based on information derived from the 3’end regions, an SVM predictive model was developed with 10-fold cross-validation that yielded an accuracy of 67.0%, sensitivity of 73.2% and specificity of 61.0%. Moreover, building an SVM model with a polynomial kernel on PCA-transformed data yielded an accuracy of 83.1%, sensitivity of 92.5% and specificity of 74.8% using

  3. Identifying Regulatory Patterns at the 3'end Regions of Over-expressed and Under-expressed Genes

    KAUST Repository

    Othoum, Ghofran K

    2013-01-01

    Promoters, neighboring regulatory regions and those extending further upstream of the 5’end of genes, are considered one of the main components affecting the expression status of genes in a specific phenotype. More recently, research by Chen et al. (2006, 2012) and Mapendano et al. (2010) demonstrated that the 3’end regulatory regions of genes also influence gene expression. However, the association between the regulatory regions surrounding the 3’end of genes and their over- or under-expression status in a particular phenotype has not been systematically studied. The aim of this study is to ascertain if regulatory regions surrounding the 3’end of genes contain sufficient regulatory information to correlate genes with their expression status in a particular phenotype. Over- and under-expressed ovarian cancer (OC) genes were used as a model. Exploratory analysis of the 3’end regions was performed by transforming the annotated regions using principal component analysis (PCA), followed by clustering the transformed data, thereby achieving a clear separation of genes with different expression status. Additionally, several classification algorithms such as Naïve Bayes, Random Forest and Support Vector Machine (SVM) were tested with different parameter settings to analyze the discriminatory capacity of the 3’end regions of genes related to their gene expression status. The best performance was achieved using the SVM classification model with 10-fold cross-validation, which yielded an accuracy of 98.4%, sensitivity of 99.5% and specificity of 92.5%. To predict gene expression status for newly available instances, based on information derived from the 3’end regions, an SVM predictive model was developed with 10-fold cross-validation that yielded an accuracy of 67.0%, sensitivity of 73.2% and specificity of 61.0%. Moreover, building an SVM model with a polynomial kernel on PCA-transformed data yielded an accuracy of 83.1%, sensitivity of 92.5% and specificity of 74.8% using

  4. "Active Flux" DTFC-SVM Sensorless Control of IPMSM

    DEFF Research Database (Denmark)

    Boldea, Ion; Codruta Paicu, Mihaela; Gheorghe-Daniel, Andreescu,

    2009-01-01

    This paper proposes an implementation of a motion-sensorless control system over a wide speed range based on an "active flux" observer, and direct torque and flux control with space vector modulation (DTFC-SVM) for the interior permanent magnet synchronous motor (IPMSM), without signal injection....... The concept of "active flux" (or "torque producing flux") turns all the rotor salient-pole ac machines into fully nonsalient-pole ones. A new function for Lq inductance depending on torque is introduced to model the magnetic saturation. Notable simplification in the rotor position and speed estimation...

  5. [Application of optimized parameters SVM based on photoacoustic spectroscopy method in fault diagnosis of power transformer].

    Science.gov (United States)

    Zhang, Yu-xin; Cheng, Zhi-feng; Xu, Zheng-ping; Bai, Jing

    2015-01-01

    To address the complex operation, carrier gas consumption and long test period of the traditional power transformer fault diagnosis approach based on dissolved gas analysis (DGA), this paper proposes a new method that uses photoacoustic spectroscopy to detect the content of five characteristic gases in transformer oil (CH4, C2H2, C2H4, C2H6 and H2) and calculates the three ratios C2H2/C2H4, CH4/H2 and C2H4/C2H6. The support vector machine model was constructed using cross validation under five support vector machine functions and four kernel functions, and heuristic algorithms were used to optimize the penalty factor c and the parameter g, in order to establish the SVM model with the highest fault diagnosis accuracy and fast computing speed. Two heuristic algorithms, particle swarm optimization and the genetic algorithm, were compared for accuracy and speed of optimization. The simulation results show that the SVM model composed of C-SVC, the RBF kernel function and the genetic algorithm obtains 97.5% accuracy on the test sample set and 98.3333% accuracy on the training sample set, and that the genetic algorithm was about two times faster than particle swarm optimization. The method described in this paper has advantages such as simple operation, non-contact measurement, no carrier gas consumption, short test period, high stability and sensitivity. The results show that it can replace traditional transformer fault diagnosis by gas chromatography and meets practical engineering needs in transformer fault diagnosis.
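
The three-ratio feature computation mentioned in this record can be sketched as follows. The gas concentrations are made-up values, the function name `three_ratios` is hypothetical, and returning infinity for a zero denominator is an assumption of this sketch rather than something the abstract specifies.

```python
def three_ratios(gas):
    """Compute the C2H2/C2H4, CH4/H2 and C2H4/C2H6 ratios from a mapping of
    gas name to concentration (any consistent unit, e.g. ppm)."""
    def ratio(num, den):
        d = gas[den]
        # Assumed convention: an absent denominator gas gives an infinite ratio.
        return gas[num] / d if d > 0 else float("inf")
    return (ratio("C2H2", "C2H4"),
            ratio("CH4", "H2"),
            ratio("C2H4", "C2H6"))

# Made-up concentrations, for illustration only.
sample = {"H2": 100.0, "CH4": 50.0, "C2H2": 2.0, "C2H4": 20.0, "C2H6": 10.0}
features = three_ratios(sample)
```

A feature tuple like this would then be the input vector to the SVM, with the fault type as the class label.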

  6. A DDoS Attack Detection Method Based on SVM in Software Defined Network

    Directory of Open Access Journals (Sweden)

    Jin Ye

    2018-01-01

    The detection of DDoS attacks is an important topic in the field of network security. The emergence of the software defined network (SDN) (Zhang et al., 2018) brings some novel methods to this topic, in which deep learning algorithms are adopted to model attack behavior based on data collected from the SDN controller. However, existing methods such as neural network algorithms are not practical enough to be applied. In this paper, an SDN environment is constructed with the mininet and floodlight (Ning et al., 2014) simulation platform, 6-tuple characteristic values of the switch flow table are extracted, and a DDoS attack model is then built using SVM classification algorithms. The experiments show that the average accuracy rate of our method is 95.24% with a small amount of flow collection. Our work is of good value for the detection of DDoS attacks in SDN.

  7. Weighted Feature Gaussian Kernel SVM for Emotion Recognition.

    Science.gov (United States)

    Wei, Wei; Jia, Qingxuan

    2016-01-01

    Emotion recognition with weighted features based on facial expression is a challenging research topic and has attracted great attention in the past few years. This paper presents a novel method that uses subregion recognition rates to weight the kernel function. First, we divide the facial expression image into uniform subregions and calculate the corresponding recognition rates and weights. Then, we derive a weighted feature Gaussian kernel function and construct a classifier based on Support Vector Machine (SVM). Finally, the experimental results suggest that the approach based on the weighted feature Gaussian kernel function performs well in terms of the correct rate in emotion recognition. The experiments on the extended Cohn-Kanade (CK+) dataset show that our method achieves encouraging recognition results compared to the state-of-the-art methods.
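
    The abstract does not give the kernel formula. As a rough sketch, one plausible form weights each subregion's squared distance before the exponential, K(x, y) = exp(-gamma * sum_r w_r * ||x_r - y_r||^2). That form, the function name, and the parameter `gamma` are assumptions made here for illustration.

```python
import math


def weighted_gaussian_kernel(x_regions, y_regions, weights, gamma=1.0):
    """Gaussian kernel with one weight per subregion (assumed form).

    x_regions and y_regions are lists of per-subregion feature vectors;
    weights holds one non-negative weight per subregion.
    """
    total = 0.0
    for x_r, y_r, w in zip(x_regions, y_regions, weights):
        # Squared Euclidean distance within this subregion.
        squared = sum((a - b) ** 2 for a, b in zip(x_r, y_r))
        total += w * squared
    return math.exp(-gamma * total)


# Two toy images split into two subregions each (hypothetical features).
image_a = [[0.0, 0.0], [1.0, 1.0]]
image_b = [[1.0, 0.0], [1.0, 1.0]]
similarity = weighted_gaussian_kernel(image_a, image_b, weights=[0.5, 0.5])
```

    Giving a subregion a larger weight makes differences in that subregion reduce the kernel value faster, which is the intuition behind weighting by subregion recognition rate.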

  8. Parameters Optimization and Application to Glutamate Fermentation Model Using SVM

    Directory of Open Access Journals (Sweden)

    Xiangsheng Zhang

    2015-01-01

    A new method is developed for parameter optimization in support vector machine (SVM) modelling of glutamate fermentation. It optimizes the SVM parameters via an improved particle swarm optimization (IPSO) algorithm, which has better global searching ability. The algorithm detects and handles local convergence and exhibits a strong ability to avoid being trapped in local minima. The concrete steps of the method are presented. Simulation experiments demonstrate the effectiveness of the proposed algorithm.

  9. A metaheuristic optimization framework for informative gene selection

    Directory of Open Access Journals (Sweden)

    Kaberi Das

    This paper presents a metaheuristic framework using Harmony Search (HS) with a Genetic Algorithm (GA) for gene selection. The proposed model works in two phases. In the first phase, HS is hybridized with GA to compute and evaluate the fitness of randomly selected binary-string solutions, and HS then ranks the solutions in descending order of fitness. In the second phase, offspring are generated using the crossover and mutation operations of GA, and those offspring whose fitness, as evaluated by an SVM classifier, exceeds that of their parents are selected for the next generation. The accuracy of the final gene subsets obtained from this model has been evaluated using SVM classifiers. The merit of this approach is analyzed through experimental results on five benchmark datasets, and the results show an impressive accuracy over existing feature selection approaches. The occurrence of gene subsets selected by this model has also been computed, and the most often selected gene subsets, with probability in [0.1–0.9], have been chosen as optimal sets of informative genes. Finally, the performance of the selected informative gene subsets has been measured and established through probabilistic measures. Keywords: Gene Selection, Metaheuristic, Harmony Search Algorithm, Genetic Algorithm, SVM

  10. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features.

    Science.gov (United States)

    Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-03-29

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with a radial basis kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification, with CCRs of 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin.
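
    For readers unfamiliar with the two metrics quoted above, the following sketch shows how a correct classification rate and Cohen's kappa are obtained from a confusion matrix. The counts are invented for illustration and are not the study's data.

```python
def ccr_and_kappa(confusion):
    """Return (correct classification rate, Cohen's kappa).

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    size = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / n
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-diet confusion matrix (rows: true diet, columns: predicted).
example = [[40, 10],
           [10, 40]]
ccr, kappa = ccr_and_kappa(example)
```

    Kappa discounts the agreement expected by chance, which is why it is lower than the raw rate for the same matrix.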

  11. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features

    Directory of Open Access Journals (Sweden)

    Mohammadmehdi Saberioon

    2018-03-01

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with a radial basis kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification, with CCRs of 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin.

  12. A Method to Integrate GMM, SVM and DTW for Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Ing-Jr Ding

    2014-01-01

    This paper develops an effective and efficient scheme to integrate the Gaussian mixture model (GMM), support vector machine (SVM), and dynamic time warping (DTW) for automatic speaker recognition. GMM and SVM are two popular classifiers for speaker recognition applications. DTW is a fast and simple template matching method that is frequently seen in speech recognition applications. In this work, DTW is not used to perform speech recognition; instead, it is employed as a verifier of valid speakers. The proposed combination scheme of GMM, SVM and DTW, called SVMGMM-DTW, is a two-phase verification process comprising GMM-SVM verification in the first phase and DTW verification in the second phase. By providing a double check on the identity of a speaker, it makes it difficult for imposters to pass the security protection; therefore, the safety of speaker recognition systems is greatly increased. A series of experiments designed on door access control applications demonstrated the superiority of the developed SVMGMM-DTW in speaker recognition accuracy.

  13. Fuzzy Pruning Based LS-SVM Modeling Development for a Fermentation Process

    Directory of Open Access Journals (Sweden)

    Weili Xiong

    2014-01-01

    Due to the complexity and uncertainty of microbial fermentation processes, data coming from the plants often contain some outliers. However, these data may be treated as normal support vectors, which always deteriorates the performance of soft sensor modeling. Since the outliers also contaminate the correlation structure of the least square support vector machine (LS-SVM), a fuzzy pruning method is provided to deal with the problem. Furthermore, by assigning different fuzzy membership scores to data samples, the sensitivity of the model to the outliers can be reduced greatly. The effectiveness and efficiency of the proposed approach are demonstrated through two numerical examples as well as a simulator case of a penicillin fermentation process.

  14. Modeling of SVM Diode Clamping Three-Level Inverter Connected to Grid

    DEFF Research Database (Denmark)

    Guo, Yougui; Zeng, Ping; Zhu, Jieqiong

    2011-01-01

    PLECS is used to model the diode clamping three-level inverter connected to the grid, and good results are obtained. First, the output voltage SVM is described for the diode clamping three-level inverter with Y-connected loads. Then the output voltage SVM of the diode clamping three-level inverter is briefly analyzed with loads connected in △, a case that will be researched further in the future. Third, PLECS is briefly introduced. Fourth, the modeling of the diode clamping three-level inverter with PLECS is briefly presented. Finally, a series of simulations is carried out. The simulation results show that PLECS is a very powerful tool for real power circuits and makes them very easy to simulate. They also verify that the SVM control strategy is feasible for controlling the diode clamping three-level inverter.

  15. A comparative QSAR study on the estrogenic activities of persistent organic pollutants by PLS and SVM

    Directory of Open Access Journals (Sweden)

    Fei Li

    2015-11-01

    Quantitative structure-activity relationships (QSARs) were determined using partial least squares (PLS) and support vector machine (SVM) methods. The values predicted by the final QSAR models were in good agreement with the corresponding experimental values. Chemical estrogenic activities are related to atomic properties (atomic Sanderson electronegativities, van der Waals volumes and polarizabilities). Comparing the results obtained from the two models, the SVM method exhibited better overall performance. In addition, three PLS models were constructed for some specific families based on their chemical structures. These predictive models should be useful for rapidly identifying potential estrogenic endocrine disrupting chemicals.

  16. Comparison of ANN and SVM for classification of eye movements in EOG signals

    Science.gov (United States)

    Qi, Lim Jia; Alias, Norma

    2018-03-01

    Nowadays, the electrooculogram is regarded as one of the most important biomedical signals for measuring and analyzing eye movement patterns. Thus, it is helpful in designing EOG-based Human Computer Interfaces (HCI). In this research, electrooculography (EOG) data were obtained from five volunteers. The EOG data were then preprocessed before feature extraction methods were employed to further reduce the dimensionality of the data. Three feature extraction approaches were put forward, namely statistical parameters, autoregressive (AR) coefficients using the Burg method, and power spectral density (PSD) using the Yule-Walker method. These features then became the input to both an artificial neural network (ANN) and a support vector machine (SVM). The performance of the combinations of different feature extraction methods and classifiers was presented and analyzed. It was found that statistical parameters + SVM achieved the highest classification accuracy of 69.75%.

  17. Study on specificity of colon carcinoma-associated serum markers and establishment of SVM prediction model

    Directory of Open Access Journals (Sweden)

    Lu Li

    2017-03-01

    We aimed to evaluate the specificity of 12 tumor markers related to colon carcinoma and identify the most sensitive index. Logistic regression and Bhattacharyya distance were used to evaluate the index. Then, different index combinations were used to establish a support vector machine (SVM) diagnosis model of malignant colon carcinoma. The accuracy of the model was checked. High accuracy was assumed to indicate high specificity of the index. Through logistic regression, three indexes, CEA, HSP60 and CA199, were screened out. Using Bhattacharyya distance, the four indexes with the largest Bhattacharyya distance were screened out: CEA, NSE, AFP, and CA724. The specificity of the combination of the above six indexes was higher than that of other combinations, as was the accuracy of the established SVM identification model. Using logistic regression and Bhattacharyya distance for detection and establishing an SVM model based on different serum marker combinations can increase diagnostic accuracy, providing a theoretical basis for the application of mathematical models in cancer diagnosis.
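
    The abstract does not state how the Bhattacharyya distance was computed. As an illustration only, the sketch below uses the closed form for two univariate Gaussian distributions, which assumes that each marker is roughly normally distributed within each group; the marker statistics in the example are hypothetical.

```python
import math


def bhattacharyya_gaussian(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two univariate Gaussians.

    D = (mean_a - mean_b)^2 / (4 (var_a + var_b))
        + 0.5 * ln((var_a + var_b) / (2 * sqrt(var_a * var_b)))
    """
    mean_term = 0.25 * (mean_a - mean_b) ** 2 / (var_a + var_b)
    spread_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + spread_term


# Hypothetical marker statistics for a patient group and a control group.
distance = bhattacharyya_gaussian(0.0, 1.0, 2.0, 1.0)
```

    A larger distance means the two groups overlap less on that marker, which is why markers can be ranked by this quantity.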

  18. A RLS-SVM Aided Fusion Methodology for INS during GPS Outages.

    Science.gov (United States)

    Yao, Yiqing; Xu, Xiaosu

    2017-02-24

    In order to maintain a relatively high accuracy of navigation performance during global positioning system (GPS) outages, a novel robust least squares support vector machine (LS-SVM)-aided fusion methodology is explored to provide the pseudo-GPS position information for the inertial navigation system (INS). The relationship between the yaw, specific force, velocity, and the position increment is modeled. Rather than share the same weight in the traditional LS-SVM, the proposed algorithm allocates various weights for different data, which makes the system immune to the outliers. Field test data was collected to evaluate the proposed algorithm. The comparison results indicate that the proposed algorithm can effectively provide position corrections for standalone INS during the 300 s GPS outage, which outperforms the traditional LS-SVM method. Historical information is also involved to better represent the vehicle dynamics.

  19. A RLS-SVM Aided Fusion Methodology for INS during GPS Outages

    Directory of Open Access Journals (Sweden)

    Yiqing Yao

    2017-02-01

    In order to maintain a relatively high accuracy of navigation performance during global positioning system (GPS) outages, a novel robust least squares support vector machine (LS-SVM)-aided fusion methodology is explored to provide the pseudo-GPS position information for the inertial navigation system (INS). The relationship between the yaw, specific force, velocity, and the position increment is modeled. Rather than share the same weight in the traditional LS-SVM, the proposed algorithm allocates various weights for different data, which makes the system immune to the outliers. Field test data was collected to evaluate the proposed algorithm. The comparison results indicate that the proposed algorithm can effectively provide position corrections for standalone INS during the 300 s GPS outage, which outperforms the traditional LS-SVM method. Historical information is also involved to better represent the vehicle dynamics.

  20. SVM-Based System for Prediction of Epileptic Seizures from iEEG Signal

    Science.gov (United States)

    Cherkassky, Vladimir; Lee, Jieun; Veber, Brandon; Patterson, Edward E.; Brinkmann, Benjamin H.; Worrell, Gregory A.

    2017-01-01

    Objective: This paper describes a data-analytic modeling approach for prediction of epileptic seizures from intracranial electroencephalogram (iEEG) recording of brain activity. Even though it is widely accepted that statistical characteristics of iEEG signal change prior to seizures, robust seizure prediction remains a challenging problem due to subject-specific nature of data-analytic modeling. Methods: Our work emphasizes understanding of clinical considerations important for iEEG-based seizure prediction, and proper translation of these clinical considerations into data-analytic modeling assumptions. Several design choices during pre-processing and post-processing are considered and investigated for their effect on seizure prediction accuracy. Results: Our empirical results show that the proposed SVM-based seizure prediction system can achieve robust prediction of preictal and interictal iEEG segments from dogs with epilepsy. The sensitivity is about 90–100%, and the false-positive rate is about 0–0.3 times per day. The results also suggest good prediction is subject-specific (dog or human), in agreement with earlier studies. Conclusion: Good prediction performance is possible only if the training data contain sufficiently many seizure episodes, i.e., at least 5–7 seizures. Significance: The proposed system uses subject-specific modeling and unbalanced training data. This system also utilizes three different time scales during training and testing stages. PMID:27362758

  1. Predicting the Metabolic Sites by Flavin-Containing Monooxygenase on Drug Molecules Using SVM Classification on Computed Quantum Mechanics and Circular Fingerprints Molecular Descriptors.

    Directory of Open Access Journals (Sweden)

    Chien-Wei Fu

    As an important enzyme in Phase I drug metabolism, the flavin-containing monooxygenase (FMO) also metabolizes some xenobiotics with soft nucleophiles. The site of metabolism (SOM) on a molecule is the site where the metabolic reaction is exerted by an enzyme. Accurate prediction of SOMs on drug molecules will assist the search for drug leads during the optimization process. Here, some quantum mechanics features such as the condensed Fukui function and attributes from circular fingerprints (called Molprint2D) are computed and classified using the support vector machine (SVM) to predict potential SOMs on a series of drugs that can be metabolized by FMO enzymes. The condensed Fukui function fA- represents the nucleophilicity of the central atom A, and the attributes from circular fingerprints account for the influence of neighbors on the central atom. The total number of FMO substrates and non-substrates collected in the study is 85, and they are equally divided into training and test sets, each carrying roughly the same number of potential SOMs. However, only N-oxidation and S-oxidation features were considered in the prediction, since the available C-oxidation data were scarce. In the training process, the LibSVM package in WEKA and the option of 10-fold cross validation are employed. The prediction performance on the test set, evaluated by accuracy, Matthews correlation coefficient and area under the ROC curve, is 0.829, 0.659, and 0.877 respectively. This work reveals that the SVM model built can accurately predict the potential SOMs for drug molecules that are metabolizable by the FMO enzymes.
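
    Two of the reported metrics, accuracy and the Matthews correlation coefficient (MCC), follow directly from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the counts are made up and do not reproduce the study's test set.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, Matthews correlation coefficient) for binary counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return accuracy, mcc


# Hypothetical counts: 40 true positives, 45 true negatives,
# 5 false positives, 10 false negatives.
acc, mcc = accuracy_and_mcc(tp=40, tn=45, fp=5, fn=10)
```

    Unlike accuracy, MCC stays informative when the two classes are unbalanced, which is a common reason for reporting both.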

  2. Comparison of SVM, RF and ELM on an Electronic Nose for the Intelligent Evaluation of Paraffin Samples

    Directory of Open Access Journals (Sweden)

    Hong Men

    2018-01-01

    Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33–100%, and ELM, with an accuracy rate of 98.01–100%. For level assessment, the R² related to the training set was above 0.97 and the R² related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016–0.3494, lower than the error of 0.5–1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level.

  3. Comparison Between Wind Power Prediction Models Based on Wavelet Decomposition with Least-Squares Support Vector Machine (LS-SVM) and Artificial Neural Network (ANN)

    Directory of Open Access Journals (Sweden)

    Maria Grazia De Giorgi

    2014-08-01

    A high penetration of wind energy into the electricity market requires a parallel development of efficient wind power forecasting models. Different hybrid forecasting methods were applied to wind power prediction, using historical data and numerical weather predictions (NWP). A comparative study was carried out for the prediction of the power production of a wind farm located in complex terrain. The performances of Least-Squares Support Vector Machine (LS-SVM) with Wavelet Decomposition (WD) were evaluated at different time horizons and compared to hybrid Artificial Neural Network (ANN)-based methods. It is acknowledged that hybrid methods based on LS-SVM with WD mostly outperform other methods. A decomposition of the commonly known root mean square error was beneficial for a better understanding of the origin of the differences between prediction and measurement and to compare the accuracy of the different models. A sensitivity analysis was also carried out in order to underline the impact that each input had in the network training process for ANN. In the case of ANN with the WD technique, the sensitivity analysis was repeated on each component obtained by the decomposition.
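
    The abstract does not say which error decomposition was used. One common choice, assumed here purely for illustration, splits the mean square error into a bias term, a standard-deviation mismatch term, and a dispersion term: MSE = (mean_p - mean_o)^2 + (sd_p - sd_o)^2 + 2 (sd_p sd_o - cov_po). The data values below are hypothetical.

```python
import math


def mse_decomposition(predicted, observed):
    """Split MSE into bias, spread-mismatch, and dispersion terms (assumed form)."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Population (divide-by-n) moments, so the three terms sum exactly to MSE.
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed)) / n
    sd_p, sd_o = math.sqrt(var_p), math.sqrt(var_o)
    return {
        "mse": sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n,
        "bias_sq": (mean_p - mean_o) ** 2,
        "sd_bias_sq": (sd_p - sd_o) ** 2,
        "dispersion": 2.0 * (sd_p * sd_o - cov),
    }


# Hypothetical predicted and measured power values.
parts = mse_decomposition([2.0, 4.0, 6.0, 9.0], [1.0, 5.0, 5.0, 8.0])
rmse = math.sqrt(parts["mse"])
```

    Read this way, a large bias term points to a systematic offset, while a large dispersion term points to timing or phase errors between forecast and measurement.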

  4. LS-SVM: a new chemometric tool for multivariate regression. Comparison of LS-SVM and PLS regression for determination of common adulterants in powdered milk by NIR spectroscopy

    Directory of Open Access Journals (Sweden)

    Marco F. Ferrão

    2007-08-01

    Least-squares support vector machines (LS-SVM) were used as an alternative multivariate calibration method for the simultaneous quantification of some common adulterants found in powdered milk samples, using near-infrared spectroscopy. Excellent models were built using LS-SVM, as assessed by R², RMSECV and RMSEP values. LS-SVMs show superior performance to PLSR for quantifying starch, whey and sucrose in powdered milk samples. This study shows that it is possible to determine precisely the amount of one and two common adulterants simultaneously in powdered milk samples using LS-SVM and NIR spectra.

  5. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Maolong Xi

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms.
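
    The way a binary particle is scored can be sketched as follows: a 0-1 mask selects genes, and leave-one-out cross validation measures how well a classifier performs on the selected genes. To keep the example dependency-free, a nearest-centroid classifier stands in for the SVM; that substitution, the function names, and the toy data are all assumptions made for illustration.

```python
def apply_mask(sample, mask):
    """Keep only the features whose mask bit is 1."""
    return [value for value, keep in zip(sample, mask) if keep]


def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (not an SVM): label of the closest class mean."""
    centroids = {}
    for label in set(train_y):
        rows = [row for row, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(column) / len(rows) for column in zip(*rows)]

    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(x, center))

    return min(sorted(centroids), key=lambda label: squared_distance(centroids[label]))


def loocv_accuracy(samples, labels, mask):
    """Leave-one-out accuracy using only the features selected by mask."""
    reduced = [apply_mask(sample, mask) for sample in samples]
    correct = 0
    for i in range(len(reduced)):
        # Train on every sample except the i-th, then test on the i-th.
        train_x = reduced[:i] + reduced[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, reduced[i]) == labels[i]:
            correct += 1
    return correct / len(reduced)


# Toy data: gene 0 separates the classes, gene 1 is noise.
samples = [[0.0, 5.0], [0.2, 1.0], [0.1, 9.0], [1.0, 5.5], [0.9, 0.5], [1.1, 8.5]]
labels = [0, 0, 0, 1, 1, 1]
fitness_informative = loocv_accuracy(samples, labels, [1, 0])
fitness_noise = loocv_accuracy(samples, labels, [0, 1])
```

    A search algorithm such as BQPSO would call a fitness function like `loocv_accuracy` on each candidate mask and prefer masks that score higher, possibly with a penalty on the number of selected genes.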

  6. Cancer Feature Selection and Classification Using a Binary Quantum-Behaved Particle Swarm Optimization and Support Vector Machine

    Science.gov (United States)

    Sun, Jun; Liu, Li; Fan, Fangyun; Wu, Xiaojun

    2016-01-01

    This paper focuses on the feature gene selection for cancer classification, which employs an optimization algorithm to select a subset of the genes. We propose a binary quantum-behaved particle swarm optimization (BQPSO) for cancer feature gene selection, coupling support vector machine (SVM) for cancer classification. First, the proposed BQPSO algorithm is described, which is a discretized version of original QPSO for binary 0-1 optimization problems. Then, we present the principle and procedure for cancer feature gene selection and cancer classification based on BQPSO and SVM with leave-one-out cross validation (LOOCV). Finally, the BQPSO coupling SVM (BQPSO/SVM), binary PSO coupling SVM (BPSO/SVM), and genetic algorithm coupling SVM (GA/SVM) are tested for feature gene selection and cancer classification on five microarray data sets, namely, Leukemia, Prostate, Colon, Lung, and Lymphoma. The experimental results show that BQPSO/SVM has significant advantages in accuracy, robustness, and the number of feature genes selected compared with the other two algorithms. PMID:27642363

  7. Methods for discriminating gas-liquid two phase flow patterns based on gray neural networks and SVM

    International Nuclear Information System (INIS)

    Li Jingjing; Zhou Tao; Duan Jun; Zhang Lei

    2013-01-01

    Background: The flow patterns of two-phase flow directly influence the heat and mass transfer of the flow. Purpose: Wavelet analysis of experimental pressure drop data yields wavelet coefficients at different frequencies. Methods: The wavelet energy is computed and used to train a BP neural network model to distinguish the flow patterns. A gray neural network model is introduced and applied to two-phase flow for the first time. In addition, a method is set up for training a support vector machine on the pressure data and wavelet energy data. Results: With the gray layer, the neural network result is more accurate, and the effect of data marginalization is clearly reduced. The accuracy of the pressure drop Lib-SVM method is 95.2%. Conclusions: The results show that all three methods can distinguish among the different flow patterns; the Lib-SVM method gives the best result, followed by the gray neural network and then the BP neural network. (authors)

  8. A fast learning method for large scale and multi-class samples of SVM

    Science.gov (United States)

    Fan, Yu; Guo, Huiming

    2017-06-01

    A fast learning method for multi-class SVM (Support Vector Machine) classification based on a binary tree is presented to address the low learning efficiency of SVM when processing large-scale multi-class samples. This paper adopts a bottom-up method to set up the binary tree hierarchy; according to the resulting hierarchy, the sub-classifier at each node learns from the corresponding samples. During learning, several class clusters are generated by a first clustering of the training samples. Central points are extracted directly from those class clusters that contain only one type of sample. For clusters that contain two types of samples, the numbers of clusters for their positive and negative samples are set according to their degree of mixture, a secondary clustering is performed, and central points are then extracted from the resulting sub-class clusters. Sub-classifiers are obtained by learning from the reduced sample set formed by combining the extracted central points. Simulation experiments show that this fast learning method, based on multi-level clustering, can guarantee high classification accuracy, greatly reduce the number of samples, and effectively improve learning efficiency.

  9. The system evaluation for report writing skills of summary by HGA-SVM with Ontology: Medical case study in problem based learning

    Science.gov (United States)

    Yenaeng, Sasikanchana; Saelee, Somkid; Samai, Wirachai

    2018-01-01

    A system for evaluating summary report writing skills using Hybrid Genetic Algorithm-Support Vector Machines (HGA-SVM) with an ontology, applied to medical case studies in Problem Based Learning (PBL), was developed as a scoring guideline for facilitators or medical teachers. The essay answers come from third-year medical students (20 groups of 9-10 people) taking the Nervous System Motion and Behavior I and II subjects in the medical education courses of the Faculty of Medicine, Prince of Songkla University (PSU). The audit committee is of the opinion that the ratings of individual facilitators are inadequate, and this system is intended to solve that problem. This paper proposes the development of this evaluation system, for which the mean machine learning scores and the mean human (facilitator) scores were not significantly different at the .05 level in all three essay parts: the problem part, the hypothesis part, and the learning objective part.

  10. Improved Reliability-Based Optimization with Support Vector Machines and Its Application in Aircraft Wing Design

    Directory of Open Access Journals (Sweden)

    Yu Wang

    2015-01-01

    A new reliability-based design optimization (RBDO) method based on support vector machines (SVM) and the Most Probable Point (MPP) is proposed in this work. SVM is used to create a surrogate model of the limit-state function at the MPP with the gradient information in the reliability analysis. This guarantees that the surrogate model not only passes through the MPP but also is tangent to the limit-state function at the MPP. Then, importance sampling (IS) is used to calculate the probability of failure based on the surrogate model. This treatment significantly improves the accuracy of reliability analysis. For RBDO, the Sequential Optimization and Reliability Assessment (SORA) is employed as well, which decouples deterministic optimization from the reliability analysis. The improved SVM-based reliability analysis is used to amend the error from the linear approximation of the limit-state function in SORA. A mathematical example and a simplified aircraft wing design demonstrate that the improved SVM-based reliability analysis is more accurate than FORM and needs fewer training points than the Monte Carlo simulation, and that the proposed optimization strategy is efficient.

  11. Development of gene diagnosis for diabetes and cholecystis based on gene analysis of CCK-A receptor

    International Nuclear Information System (INIS)

    Kono, Akira

    1998-01-01

    The gene structures of the CCK type A receptor (CCKAR) in human, rat and mouse were investigated to clarify whether aberration of the gene is involved in the incidence of diabetes and cholecystis. In fiscal year 1997, the normal structure of the gene and its accurate base sequence were analyzed using DNA fragments bound to 32P-labelled cDNA of human CCKAR originating from a leucocyte gene library. The gene contained about 2.2 x 10^5 base pairs, and the base sequence was completely determined and registered in the Japan DNA data bank (D85606). In addition, the genome structures and base sequences of mouse and rat CCKAR were analyzed and registered (D85605 and D50608, respectively). Differences in the base sequence of CCKAR among the species were found in the promoter region and the intron regions, suggesting that there might be differences in splicing among species. (M.N.)

  12. Accurate phylogenetic classification of DNA fragments based on sequence composition

    Energy Technology Data Exchange (ETDEWEB)

    McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis; Hugenholtz, Philip; Rigoutsos, Isidore

    2006-05-01

    Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.
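
    As an illustration of what "sequence composition" can mean as classifier input, the sketch below turns a DNA fragment into a normalised k-mer frequency vector. The abstract does not state PhyloPythia's actual feature design, so the function name `kmer_composition`, the choice of k and the handling of ambiguous bases are all assumptions.

```python
from collections import Counter
from itertools import product


def kmer_composition(sequence, k=4):
    """Return the normalised k-mer frequency vector of a DNA fragment."""
    sequence = sequence.upper()
    # Fixed, lexicographic ordering of all 4**k possible k-mers.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Windows containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


# Hypothetical fragment; real inputs would be fragments of about 1 kb or more.
features = kmer_composition("ACGTACGTNACGT", k=2)
```

    Vectors of this kind have a fixed length regardless of fragment length, which is what makes them usable as input to a classifier such as an SVM.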

  13. An Integrative Approach to Accurate Vehicle Logo Detection

    Directory of Open Access Journals (Sweden)

    Hao Pan

    2013-01-01

    required for many applications in intelligent transportation systems and automatic surveillance. The task is challenging considering the small target of logos and the wide range of variability in shape, color, and illumination. A fast and reliable vehicle logo detection approach is proposed following the visual attention mechanism of human vision. Two prelogo detection steps, that is, vehicle region detection and a small RoI segmentation, rapidly focalize a small logo target. An enhanced Adaboost algorithm, together with two types of features, Haar and HOG, is proposed to detect vehicles. An RoI that covers logos is segmented based on our prior knowledge about the logos’ position relative to license plates, which can be accurately localized from frontal vehicle images. A two-stage cascade classifier proceeds with the segmented RoI, using a hybrid of Gentle Adaboost and Support Vector Machine (SVM), resulting in precise logo positioning. Extensive experiments were conducted to verify the efficiency of the proposed scheme.

  14. SVM-Maj: a majorization approach to linear support vector machines with different hinge errors

    NARCIS (Netherlands)

    P.J.F. Groenen (Patrick); G.I. Nalbantov (Georgi); J.C. Bioch (Cor)

    2007-01-01

    Support vector machines (SVM) are becoming increasingly popular for the prediction of a binary dependent variable. SVMs perform very well with respect to competing techniques. Often, the solution of an SVM is obtained by switching to the dual. In this paper, we stick to the primal

  15. Research progress in machine learning methods for gene-gene interaction detection.

    Science.gov (United States)

    Peng, Zhe-Ye; Tang, Zi-Jun; Xie, Min-Zhu

    2018-03-20

    Complex diseases are results of gene-gene and gene-environment interactions. However, the detection of high-dimensional gene-gene interactions is computationally challenging. In the last two decades, machine-learning approaches have been developed to detect gene-gene interactions with some successes. In this review, we summarize the progress in research on machine learning methods, as applied to gene-gene interaction detection. It systematically examines the principles and limitations of the current machine learning methods used in genome wide association studies (GWAS) to detect gene-gene interactions, such as neural networks (NN), random forest (RF), support vector machines (SVM) and multifactor dimensionality reduction (MDR), and provides some insights on the future research directions in the field.

  16. Static Voltage Stability Analysis by Using SVM and Neural Network

    Directory of Open Access Journals (Sweden)

    Mehdi Hajian

    2013-01-01

    Full Text Available Voltage stability is an important problem in power system networks. In this paper, the application of Neural Networks (NN) and Support Vector Machines (SVM) to estimating the voltage stability margin (VSM) and predicting voltage collapse is investigated in terms of static voltage stability. The paper considers voltage stability in power systems in two parts. The first part calculates the static voltage stability margin with a Radial Basis Function Neural Network (RBFNN). The advantage of this method is high accuracy in online detection of the VSM. In the second part, voltage collapse analysis of the power system is performed with a Probabilistic Neural Network (PNN) and SVM. The results indicate that the training time and number of training samples of SVM are less than those of NN. A new model of training samples for the detection system, using the normal distribution load curve at each load feeder, is used. Voltage stability is estimated by the well-known L and VSM indexes. To demonstrate the validity of the proposed methods, the IEEE 14-bus grid and the actual network of Yazd Province are used.

  17. A Fast SVM-Based Tongue's Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis

    Science.gov (United States)

    Ooi, Chia Yee; Kawanabe, Tadaaki; Odaguchi, Hiroshi; Kobayashi, Fuminori

    2017-01-01

    In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye's ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue's multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k-means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k-means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds. PMID:29065640

  18. A Fast SVM-Based Tongue's Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis.

    Science.gov (United States)

    Kamarudin, Nur Diyana; Ooi, Chia Yee; Kawanabe, Tadaaki; Odaguchi, Hiroshi; Kobayashi, Fuminori

    2017-01-01

    In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye's ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue's multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k-means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k-means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds.

  19. Dates fruits classification using SVM

    Science.gov (United States)

    Alzu'bi, Reem; Anushya, A.; Hamed, Ebtisam; Al Sha'ar, Eng. Abdelnour; Vincy, B. S. Angela

    2018-04-01

    In this paper, we used SVM to classify various types of dates from their images. Dates have distinctive characteristics that can be valuable for distinguishing and determining a particular date type. These characteristics include shape, texture, and color. A system that achieves 100% accuracy was built to classify dates as edible or inedible. The system helps the food industry and customers classify dates according to specific quality measures, giving the best performance with specific types of dates.

  20. Assessing reference genes for accurate transcript normalization using quantitative real-time PCR in pearl millet [Pennisetum glaucum (L.) R. Br.].

    Directory of Open Access Journals (Sweden)

    Prasenjit Saha

    Full Text Available Pearl millet [Pennisetum glaucum (L.) R. Br.], a close relative of Panicoideae food crops and bioenergy grasses, offers an ideal system to perform functional genomics studies related to C4 photosynthesis and abiotic stress tolerance. Quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR) provides a sensitive platform to conduct such gene expression analyses. However, the lack of suitable internal control reference genes for accurate transcript normalization during qRT-PCR analysis in pearl millet is the major limitation. Here, we conducted a comprehensive assessment of 18 reference genes on 234 samples which included an array of different developmental tissues, hormone treatments and abiotic stress conditions from three genotypes to determine appropriate reference genes for accurate normalization of qRT-PCR data. Analyses of Ct values using Stability Index, BestKeeper, ΔCt, Normfinder, geNorm and RefFinder programs ranked PP2A, TIP41, UBC2, UBQ5 and ACT as the most reliable reference genes for accurate transcript normalization under different experimental conditions. Furthermore, we validated the specificity of these genes for precise quantification of relative gene expression and provided evidence that a combination of the best reference genes is required to obtain optimal expression patterns for both endogenous genes and transgenes in pearl millet.
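
    To make the role of reference genes in normalization concrete, here is a minimal sketch of the widely used 2^-ΔΔCt calculation with several reference genes. It is not taken from the record: the Ct values are hypothetical, 100% amplification efficiency is assumed, and the function names are illustrative only.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Ct of the target gene minus the mean Ct of the reference genes.

    Averaging Ct values corresponds to taking the geometric mean of the
    reference genes' relative quantities.
    """
    return target_ct - mean(reference_cts)


def fold_change(sample, calibrator):
    """2^-ddCt fold change, assuming 100% amplification efficiency."""
    dd_ct = delta_ct(*sample) - delta_ct(*calibrator)
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: (target Ct, [Ct of each reference gene]).
stressed = (22.0, [20.0, 21.0, 19.0])
control = (24.0, [20.0, 21.0, 19.0])
fc = fold_change(stressed, control)
```

    If a reference gene is itself affected by the treatment, its Ct shifts and the fold change is biased, which is why stability ranking of candidate reference genes matters.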

  1. Microarray MAPH: accurate array-based detection of relative copy number in genomic DNA

    Directory of Open Access Journals (Sweden)

    Chan Alan

    2006-06-01

    Full Text Available Abstract Background: Current methods for measurement of copy number do not combine all the desirable qualities of convenience, throughput, economy, accuracy and resolution. In this study, to improve the throughput associated with Multiplex Amplifiable Probe Hybridisation (MAPH) we aimed to develop a modification based on the 3-Dimensional, Flow-Through Microarray Platform from PamGene International. In this new method, electrophoretic analysis of amplified products is replaced with photometric analysis of a probed oligonucleotide array. Copy number analysis of hybridised probes is based on a dual-label approach by comparing the intensity of Cy3-labelled MAPH probes amplified from test samples co-hybridised with similarly amplified Cy5-labelled reference MAPH probes. The key feature of using a hybridisation-based end point with MAPH is that discrimination of amplified probes is based on sequence and not fragment length. Results: In this study we showed that microarray MAPH measurement of PMP22 gene dosage correlates well with PMP22 gene dosage determined by capillary MAPH and that copy number was accurately reported in analyses of DNA from 38 individuals, 12 of which were known to have Charcot-Marie-Tooth disease type 1A (CMT1A). Conclusion: Measurement of microarray-based endpoints for MAPH appears to be of comparable accuracy to electrophoretic methods, and holds the prospect of fully exploiting the potential multiplicity of MAPH. The technology has the potential to simplify copy number assays for genes with a large number of exons, or of expanded sets of probes from dispersed genomic locations.

  2. Microarray MAPH: accurate array-based detection of relative copy number in genomic DNA.

    Science.gov (United States)

    Gibbons, Brian; Datta, Parikkhit; Wu, Ying; Chan, Alan; Al Armour, John

    2006-06-30

    Current methods for measurement of copy number do not combine all the desirable qualities of convenience, throughput, economy, accuracy and resolution. In this study, to improve the throughput associated with Multiplex Amplifiable Probe Hybridisation (MAPH) we aimed to develop a modification based on the 3-Dimensional, Flow-Through Microarray Platform from PamGene International. In this new method, electrophoretic analysis of amplified products is replaced with photometric analysis of a probed oligonucleotide array. Copy number analysis of hybridised probes is based on a dual-label approach by comparing the intensity of Cy3-labelled MAPH probes amplified from test samples co-hybridised with similarly amplified Cy5-labelled reference MAPH probes. The key feature of using a hybridisation-based end point with MAPH is that discrimination of amplified probes is based on sequence and not fragment length. In this study we showed that microarray MAPH measurement of PMP22 gene dosage correlates well with PMP22 gene dosage determined by capillary MAPH and that copy number was accurately reported in analyses of DNA from 38 individuals, 12 of which were known to have Charcot-Marie-Tooth disease type 1A (CMT1A). Measurement of microarray-based endpoints for MAPH appears to be of comparable accuracy to electrophoretic methods, and holds the prospect of fully exploiting the potential multiplicity of MAPH. The technology has the potential to simplify copy number assays for genes with a large number of exons, or of expanded sets of probes from dispersed genomic locations.
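
    The dual-label comparison described here can be sketched as a ratio calculation. The abstract does not give the authors' normalisation procedure, so the steps below (Cy3/Cy5 ratio per probe, scaling by the median ratio of control probes, two copies as the normal dosage) are assumptions, and all probe names and intensities are hypothetical.

```python
from statistics import median


def copy_number_estimates(test, reference, control_probes, normal_copies=2):
    """Estimate copy number per probe from test/reference intensity ratios."""
    # Test (e.g. Cy3) over reference (e.g. Cy5) signal for every probe.
    ratios = {probe: test[probe] / reference[probe] for probe in test}
    # Control probes are assumed to be present at the normal copy number.
    scale = median(ratios[probe] for probe in control_probes)
    return {probe: normal_copies * r / scale for probe, r in ratios.items()}


# Hypothetical intensities for three control probes and one PMP22 probe.
test_signal = {"ctrl1": 1000.0, "ctrl2": 1100.0, "ctrl3": 900.0, "PMP22_ex1": 1500.0}
ref_signal = {"ctrl1": 1000.0, "ctrl2": 1100.0, "ctrl3": 900.0, "PMP22_ex1": 1000.0}
estimates = copy_number_estimates(test_signal, ref_signal, ["ctrl1", "ctrl2", "ctrl3"])
```

    With these made-up numbers the PMP22 probe comes out at three copies, the pattern expected for the PMP22 duplication that underlies CMT1A.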

  3. Design of an SVM Inverter Based on the PIC 18F4431 Microcontroller for a VSD System

    OpenAIRE

    Tarmizi; Muyassar

    2013-01-01

    A motor speed control system is called a Variable Speed Drive (VSD) system. A VSD system for an induction motor uses an inverter to regulate the motor supply frequency. To obtain a motor supply that is close to sinusoidal, the inverter must be switched with a particular method. In this study, the three-phase inverter is switched using the SVM (Space Vector Modulation) method, controlled by a PIC18F4431 microcontroller. Before the experiments were carried out, this SVM inverter was ...
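
    Note that SVM in this record means Space Vector Modulation, not support vector machine. The arithmetic at the core of the method can be sketched as follows, using the textbook dwell-time formulas for a two-level three-phase inverter in the linear modulation range. The record does not show the authors' firmware, so this Python function and its parameter names are illustrative assumptions rather than the PIC18F4431 implementation.

```python
import math


def svm_dwell_times(v_ref, angle_rad, v_dc, t_s):
    """Return (sector, t1, t2, t0) for one switching period t_s.

    Valid for the linear range, v_ref <= v_dc / sqrt(3).
    """
    angle = angle_rad % (2.0 * math.pi)
    # Sector index 0..5; each sector spans 60 degrees.
    index = min(int(angle // (math.pi / 3.0)), 5)
    theta = angle - index * math.pi / 3.0               # angle inside the sector
    m = math.sqrt(3.0) * v_ref / v_dc                   # modulation index
    t1 = t_s * m * math.sin(math.pi / 3.0 - theta)      # first active vector
    t2 = t_s * m * math.sin(theta)                      # second active vector
    t0 = t_s - t1 - t2                                  # shared by the zero vectors
    return index + 1, t1, t2, t0


# Hypothetical operating point: 100 V reference at 30 degrees, 400 V DC link,
# 10 kHz switching frequency.
sector, t1, t2, t0 = svm_dwell_times(100.0, math.radians(30.0), 400.0, 1e-4)
```

    In the middle of a sector the two active vectors are applied for equal times, and the remainder of the period is filled with zero vectors.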

  4. An IPSO-SVM algorithm for security state prediction of mine production logistics system

    Science.gov (United States)

    Zhang, Yanliang; Lei, Junhui; Ma, Qiuli; Chen, Xin; Bi, Runfang

    2017-06-01

    A theoretical basis for the regulation of corporate security warning and resources was provided in order to reveal the laws behind the security state in mine production logistics. Because the mine production logistics system is complex and its variables are difficult to acquire, a security status prediction model of the mine production logistics system based on improved particle swarm optimization and support vector machine (IPSO-SVM) is proposed in this paper. Firstly, through linear adjustments of the inertia weight and learning weights, the convergence speed and search accuracy are enhanced to deal with the changeable complexity and the difficulty of data acquisition. The improved particle swarm optimization (IPSO) is then introduced to resolve the problem of parameter settings in traditional support vector machines (SVM). At the same time, a security status index system is built to determine the classification standards of safety status. The feasibility and effectiveness of this method are finally verified using the experimental results.
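
    The idea of a linearly adjusted inertia weight can be sketched with a small particle swarm optimizer. This is a generic illustration, not the authors' IPSO: the learning factors are held fixed for brevity (the abstract says they are also adjusted), the sphere function stands in for an SVM validation error, and every name and default value is an assumption.

```python
import random


def pso_minimise(objective, dim, bounds, n_particles=30, n_iter=200,
                 w_start=0.9, w_end=0.4, c1=1.5, c2=1.5, seed=0):
    """Particle swarm optimisation with a linearly decreasing inertia weight."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for it in range(n_iter):
        # Inertia weight falls linearly from w_start to w_end.
        w = w_start - (w_start - w_end) * it / (n_iter - 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Stand-in objective; an SVM tuner would return a validation error here."""
    return sum(v * v for v in x)


best_x, best_val = pso_minimise(sphere, dim=2, bounds=(-5.0, 5.0))
```

    A large inertia weight early on favours exploration of the search space, while the smaller weight late in the run favours fine convergence; for SVM tuning the two dimensions could be, for example, the penalty parameter and a kernel width.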

  5. SVM-Prot 2016: A Web-Server for Machine Learning Prediction of Protein Functional Families from Sequence Irrespective of Similarity.

    Science.gov (United States)

    Li, Ying Hong; Xu, Jing Yu; Tao, Lin; Li, Xiao Feng; Li, Shuang; Zeng, Xian; Chen, Shang Ying; Zhang, Peng; Qin, Chu; Zhang, Cheng; Chen, Zhe; Zhu, Feng; Chen, Yu Zong

    2016-01-01

    Knowledge of protein function is important for biological, medical and therapeutic studies, but many proteins are still unknown in function. There is a need for more improved functional prediction methods. Our SVM-Prot web-server employed a machine learning method for predicting protein functional families from protein sequences irrespective of similarity, which complemented those similarity-based and other methods in predicting diverse classes of proteins including the distantly-related proteins and homologous proteins of different functions. Since its publication in 2003, we made major improvements to SVM-Prot with (1) expanded coverage from 54 to 192 functional families, (2) more diverse protein descriptors protein representation, (3) improved predictive performances due to the use of more enriched training datasets and more variety of protein descriptors, (4) newly integrated BLAST analysis option for assessing proteins in the SVM-Prot predicted functional families that were similar in sequence to a query protein, and (5) newly added batch submission option for supporting the classification of multiple proteins. Moreover, 2 more machine learning approaches, K nearest neighbor and probabilistic neural networks, were added for facilitating collective assessment of protein functions by multiple methods. SVM-Prot can be accessed at http://bidd2.nus.edu.sg/cgi-bin/svmprot/svmprot.cgi.

  6. A Research of Speech Emotion Recognition Based on Deep Belief Network and SVM

    Directory of Open Access Journals (Sweden)

    Chenchen Huang

    2014-01-01

    Full Text Available Feature extraction is a very important part of speech emotion recognition. To address feature extraction in speech emotion recognition, this paper proposes a new method that uses DBNs in a DNN to extract emotional features from the speech signal automatically. A five-layer DBN is trained to extract speech emotion features, and multiple consecutive frames are incorporated to form a high-dimensional feature. The features produced by the trained DBNs are the input of a nonlinear SVM classifier, yielding a multiple-classifier system for speech emotion recognition. The speech emotion recognition rate of the system reached 86.5%, which was 7% higher than the original method.

  7. Application of SVM methods for mid-term load forecasting

    Directory of Open Access Journals (Sweden)

    Božić Miloš

    2011-01-01

    Full Text Available This paper presents an approach for medium-term load forecasting using Support Vector Machines (SVMs). The proposed SVM model was employed to predict the maximum daily load demand for the period of a month. Analyses of available data were performed and the most important features for the construction of the SVM model were selected. It was shown that the size and the structure of the training set may significantly affect the accuracy of predictions. The presented model was tested by applying it to real-life load data obtained from the distribution company 'ED Jugoistok' for the territory of the city of Niš and its surroundings. Experimental results show that the proposed approach gives acceptable results for the entire period of prediction, which are in range with other solutions in this area.

  8. A WFS-SVM Model for Soil Salinity Mapping in Keriya Oasis, Northwestern China Using Polarimetric Decomposition and Fully PolSAR Data

    Directory of Open Access Journals (Sweden)

    Ilyas Nurmemet

    2018-04-01

    Full Text Available Timely monitoring and mapping of salt-affected areas are essential for the prevention of land degradation and sustainable soil management in arid and semi-arid regions. The main objective of this study was to develop Synthetic Aperture Radar (SAR) polarimetry techniques for improved soil salinity mapping in the Keriya Oasis in the Xinjiang Uyghur Autonomous Region (Xinjiang), China, where salinized soil appears to be a major threat to local agricultural productivity. Multiple polarimetric target decomposition, optimal feature subset selection (wrapper feature selector, WFS), and support vector machine (SVM) algorithms were used for optimal soil salinization classification using quad-polarized PALSAR-2 data. A threefold exercise was conducted. First, 16 polarimetric decomposition methods were implemented and a wide range of polarimetric parameters and SAR discriminators were derived in order to mine hidden information in PolSAR data. Second, the optimal polarimetric feature subset that constitutes 19 polarimetric elements was selected adopting the WFS approach; optimum classification parameters were identified, and the optimal SVM classification model was obtained by employing a cross-validation method. Third, the WFS-SVM classification model was constructed, optimized, and implemented based on the optimal match of polarimetric features and optimum classification parameters. Soils with different salinization degrees (i.e., highly, moderately and slightly salinized soils) were extracted. Finally, classification results were compared with the Wishart supervised classification and conventional SVM classification to examine the performance of the proposed method for salinity mapping. Detailed field investigations and ground data were used for the validation of the adopted methods. The overall accuracy and kappa coefficient of the proposed WFS-SVM model were 87.57% and 0.85, respectively, which were much higher than those obtained by the Wishart supervised

  9. The efficacy of support vector machines (SVM)

    Indian Academy of Sciences (India)

    (2006) by applying an SVM statistical learning machine on the time-scale wavelet decomposition methods. We used the data of 108 events in central Japan with magnitude ranging from 3 to 7.4 recorded at KiK-net network stations, for a source–receiver distance of up to 150 km during the period 1998–2011. We applied a ...

  10. A Fast SVM-Based Tongue’s Colour Classification Aided by k-Means Clustering Identifiers and Colour Attributes as Computer-Assisted Tool for Tongue Diagnosis

    Directory of Open Access Journals (Sweden)

    Nur Diyana Kamarudin

    2017-01-01

    Full Text Available In tongue diagnosis, colour information of tongue body has kept valuable information regarding the state of disease and its correlation with the internal organs. Qualitatively, practitioners may have difficulty in their judgement due to the instable lighting condition and naked eye’s ability to capture the exact colour distribution on the tongue especially the tongue with multicolour substance. To overcome this ambiguity, this paper presents a two-stage tongue’s multicolour classification based on a support vector machine (SVM) whose support vectors are reduced by our proposed k-means clustering identifiers and red colour range for precise tongue colour diagnosis. In the first stage, k-means clustering is used to cluster a tongue image into four clusters of image background (black), deep red region, red/light red region, and transitional region. In the second-stage classification, red/light red tongue images are further classified into red tongue or light red tongue based on the red colour range derived in our work. Overall, true rate classification accuracy of the proposed two-stage classification to diagnose red, light red, and deep red tongue colours is 94%. The number of support vectors in SVM is improved by 41.2%, and the execution time for one image is recorded as 48 seconds.

  11. Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN

    Science.gov (United States)

    Zhu, Lianzhang; Chen, Leiming; Zhao, Dehai

    2017-01-01

    Accurate emotion recognition from speech is important for applications like smart health care, smart entertainment, and other smart services. High accuracy emotion recognition from Chinese speech is challenging due to the complexities of the Chinese language. In this paper, we explore how to improve the accuracy of speech emotion recognition, including speech signal feature extraction and emotion classification methods. Five types of features are extracted from a speech sample: mel frequency cepstrum coefficient (MFCC), pitch, formant, short-term zero-crossing rate and short-term energy. By comparing statistical features with deep features extracted by a Deep Belief Network (DBN), we attempt to find the best features to identify the emotion status for speech. We propose a novel classification method that combines DBN and SVM (support vector machine) instead of using only one of them. In addition, a conjugate gradient method is applied to train DBN in order to speed up the training process. Gender-dependent experiments are conducted using an emotional speech database created by the Chinese Academy of Sciences. The results show that DBN features can reflect emotion status better than artificial features, and our new classification approach achieves an accuracy of 95.8%, which is higher than using either DBN or SVM separately. Results also show that DBN can work very well for small training databases if it is properly designed. PMID:28737705

  12. Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN.

    Science.gov (United States)

    Zhu, Lianzhang; Chen, Leiming; Zhao, Dehai; Zhou, Jiehan; Zhang, Weishan

    2017-07-24

    Accurate emotion recognition from speech is important for applications like smart health care, smart entertainment, and other smart services. High accuracy emotion recognition from Chinese speech is challenging due to the complexities of the Chinese language. In this paper, we explore how to improve the accuracy of speech emotion recognition, including speech signal feature extraction and emotion classification methods. Five types of features are extracted from a speech sample: mel frequency cepstrum coefficient (MFCC), pitch, formant, short-term zero-crossing rate and short-term energy. By comparing statistical features with deep features extracted by a Deep Belief Network (DBN), we attempt to find the best features to identify the emotion status for speech. We propose a novel classification method that combines DBN and SVM (support vector machine) instead of using only one of them. In addition, a conjugate gradient method is applied to train DBN in order to speed up the training process. Gender-dependent experiments are conducted using an emotional speech database created by the Chinese Academy of Sciences. The results show that DBN features can reflect emotion status better than artificial features, and our new classification approach achieves an accuracy of 95.8%, which is higher than using either DBN or SVM separately. Results also show that DBN can work very well for small training databases if it is properly designed.
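
    Two of the five feature types listed in this abstract, short-term energy and short-term zero-crossing rate, are simple enough to sketch directly. The frame length, hop size and function names below are assumptions for illustration; the abstract does not specify the authors' framing parameters.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames of equal length."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_term_energy(frame):
    """Sum of squared sample values within one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Hypothetical toy signal that alternates in sign at every sample.
signal = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
frames = frame_signal(signal, frame_len=4, hop=2)
features = [(short_term_energy(f), zero_crossing_rate(f)) for f in frames]
```

    Per-frame values like these are typically summarised over an utterance (for example by mean and variance) before being passed to a classifier.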

  13. Common voltage eliminating of SVM diode clamping three-level inverter connected to grid

    DEFF Research Database (Denmark)

    Guo, Yougui; Zeng, Ping; Zhu, Jieqiong

    2011-01-01

    A novel method of common voltage eliminating is put forward for SVM diode clamping three-level inverter connected to grid by calculation of common voltage of its various switching states. PLECS is used to model this three-level inverter connected to grid and good results are obtained. First...... analysis of common mode voltage for switching states of diode clamping 3-level inverter is given in detail. Second the common mode voltage eliminating control strategy of SVM is described for diode clamping three-level inverter. Third, PLECS is briefly introduced. Fourth, the modeling of diode clamping...... three-level inverter is presented with PLECS. Finally, a series of simulations are carried out. The simulation results tell us PLECS is a very powerful tool to real power circuits modeling. They have also verified that proposed common mode voltage eliminating control strategy of SVM is feasible...

  14. Probability-based collaborative filtering model for predicting gene-disease associations.

    Science.gov (United States)

    Zeng, Xiangxiang; Ding, Ningxiang; Rodríguez-Patón, Alfonso; Zou, Quan

    2017-12-28

    Accurately predicting pathogenic human genes has been challenging in recent research. Considering extensive gene-disease data verified by biological experiments, we can apply computational methods to perform accurate predictions with reduced time and expenses. We propose a probability-based collaborative filtering model (PCFM) to predict pathogenic human genes. Several kinds of data sets, containing data of humans and data of other nonhuman species, are integrated in our model. Firstly, on the basis of a typical latent factorization model, we propose model I with an average heterogeneous regularization. Secondly, we develop modified model II with personal heterogeneous regularization to enhance the accuracy of the aforementioned models. In this model, vector space similarity or Pearson correlation coefficient metrics and data on related species are also used. We compared the results of PCFM with the results of four state-of-the-art approaches. The results show that PCFM performs better than other advanced approaches. The PCFM model can be leveraged for predictions of disease genes, especially for new human genes or diseases with no known relationships.

  15. PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants.

    Science.gov (United States)

    Vieira, Lucas Maciel; Grativol, Clicia; Thiebaut, Flavia; Carvalho, Thais G; Hardoim, Pablo R; Hemerly, Adriana; Lifschitz, Sergio; Ferreira, Paulo Cavalcanti Gomes; Walter, Maria Emilia M T

    2017-03-04

    Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they cannot be successfully applied to other species different from those that they have been originally designed to. Prediction of lncRNAs have been performed with machine learning techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored in recent literature. As far as we know, there are no methods nor workflows specially designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs on plants, considering a workflow that includes known bioinformatics tools together with machine learning techniques, here a support vector machine (SVM). We discuss two case studies that allowed to identify novel lincRNAs, in sugarcane ( Saccharum spp.) and in maize ( Zea mays ). From the results, we also could identify differentially-expressed lincRNAs in sugarcane and maize plants submitted to pathogenic and beneficial microorganisms.

  16. PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants

    Directory of Open Access Journals (Sweden)

    Lucas Maciel Vieira

    2017-03-01

    Full Text Available Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they cannot be successfully applied to other species different from those that they have been originally designed to. Prediction of lncRNAs have been performed with machine learning techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored in recent literature. As far as we know, there are no methods nor workflows specially designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs on plants, considering a workflow that includes known bioinformatics tools together with machine learning techniques, here a support vector machine (SVM). We discuss two case studies that allowed to identify novel lincRNAs, in sugarcane (Saccharum spp.) and in maize (Zea mays). From the results, we also could identify differentially-expressed lincRNAs in sugarcane and maize plants submitted to pathogenic and beneficial microorganisms.

  17. Adaptive predictors based on probabilistic SVM for real time disruption mitigation on JET

    Science.gov (United States)

    Murari, A.; Lungaroni, M.; Peluso, E.; Gaudio, P.; Vega, J.; Dormido-Canto, S.; Baruzzo, M.; Gelfusa, M.; Contributors, JET

    2018-05-01

    Detecting disruptions with sufficient anticipation time is essential to undertake any form of remedial strategy, mitigation or avoidance. Traditional predictors based on machine learning techniques can be very performing, if properly optimised, but do not provide a natural estimate of the quality of their outputs and they typically age very quickly. In this paper a new set of tools, based on probabilistic extensions of support vector machines (SVM), are introduced and applied for the first time to JET data. The probabilistic output constitutes a natural qualification of the prediction quality and provides additional flexibility. An adaptive training strategy ‘from scratch’ has also been devised, which allows preserving the performance even when the experimental conditions change significantly. Large JET databases of disruptions, covering entire campaigns and thousands of discharges, have been analysed, both for the case of the graphite and the ITER Like Wall. Performance significantly better than any previous predictor using adaptive training has been achieved, satisfying even the requirements of the next generation of devices. The adaptive approach to the training has also provided unique information about the evolution of the operational space. The fact that the developed tools give the probability of disruption improves the interpretability of the results, provides an estimate of the predictor quality and gives new insights into the physics. Moreover, the probabilistic treatment permits to insert more easily these classifiers into general decision support and control systems.

  18. Forecasting Caspian Sea level changes using satellite altimetry data (June 1992-December 2013) based on evolutionary support vector regression algorithms and gene expression programming

    Science.gov (United States)

    Imani, Moslem; You, Rey-Jer; Kuo, Chung-Yen

    2014-10-01

    Sea level forecasting at various time intervals is of great importance in water supply management. Evolutionary artificial intelligence (AI) approaches have been accepted as an appropriate tool for modeling complex nonlinear phenomena in water bodies. In the study, we investigated the ability of two AI techniques: support vector machine (SVM), which is mathematically well-founded and provides new insights into function approximation, and gene expression programming (GEP), which is used to forecast Caspian Sea level anomalies using satellite altimetry observations from June 1992 to December 2013. SVM demonstrates the best performance in predicting Caspian Sea level anomalies, given the minimum root mean square error (RMSE = 0.035) and maximum coefficient of determination (R2 = 0.96) during the prediction periods. A comparison between the proposed AI approaches and the cascade correlation neural network (CCNN) model also shows the superiority of the GEP and SVM models over the CCNN.
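
    The two scores quoted above (RMSE and the coefficient of determination) can be illustrated with a minimal sketch. The sample values are made up for illustration and are not the study's data, and R2 is computed here as 1 - SS_res/SS_tot, which is an assumption about the definition the authors used.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical sea level anomalies, for illustration only.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]

print(rmse(observed, predicted))
print(r_squared(observed, predicted))
```

    A lower RMSE and an R2 closer to 1 both indicate a closer fit between forecast and observation.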

  19. A Genetic Algorithm Based Support Vector Machine Model for Blood-Brain Barrier Penetration Prediction

    Directory of Open Access Journals (Sweden)

    Daqing Zhang

    2015-01-01

    Full Text Available Blood-brain barrier (BBB) is a highly complex physical barrier determining what substances are allowed to enter the brain. Support vector machine (SVM) is a kernel-based machine learning method that is widely used in QSAR study. For a successful SVM model, the kernel parameters for SVM and feature subset selection are the most important factors affecting prediction accuracy. In most studies, they are treated as two independent problems, but it has been proven that they could affect each other. We designed and implemented genetic algorithm (GA) to optimize kernel parameters and feature subset selection for SVM regression and applied it to the BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available log BB models. Therefore, to optimize both SVM parameters and feature subset simultaneously with genetic algorithm is a better approach than other methods that treat the two problems separately. Analysis of our log BB model suggests that carboxylic acid group, polar surface area (PSA)/hydrogen-bonding ability, lipophilicity, and molecular charge play important role in BBB penetration. Among those properties relevant to BBB penetration, lipophilicity could enhance the BBB penetration while all the others are negatively correlated with BBB penetration.

  20. Classification of EEG-P300 Signals Extracted from Brain Activities in BCI Systems Using ν-SVM and BLDA Algorithms

    Directory of Open Access Journals (Sweden)

    Ali MOMENNEZHAD

    2014-06-01

    Full Text Available In this paper, a linear predictive coding (LPC) model is used to improve classification accuracy, convergence speed to maximum accuracy, and maximum bitrate in a brain-computer interface (BCI) system based on extracting EEG-P300 signals. First, the EEG signal is filtered to eliminate high-frequency noise. Then, the parameters of the filtered EEG signal are extracted using the LPC model. Finally, the samples are reconstructed from the LPC coefficients, and two classifiers, (a) Bayesian linear discriminant analysis (BLDA) and (b) the ν-support vector machine (ν-SVM), are applied for classification. The performance of the proposed algorithm is compared with Fisher linear discriminant analysis (FLDA). Results show that the proposed algorithm is much better at improving classification accuracy and convergence speed to maximum accuracy. For example, with the 8-electrode configuration for subject S1, the total classification accuracy improves by 9.4% with BLDA plus the LPC model and by 1.7% with ν-SVM plus the LPC model. In addition, for subject 7, the BLDA and ν-SVM algorithms with the LPC model (LPC+BLDA and LPC+ν-SVM) converged to maximum accuracy after the 11th block, whereas the FLDA algorithm did not converge to maximum accuracy with the same configuration. The approach can therefore be used as a promising tool in designing BCI systems.

  1. Accurate radiotherapy positioning system investigation based on video

    International Nuclear Information System (INIS)

    Tao Shengxiang; Wu Yican

    2006-01-01

    This paper introduces the latest research results on patient positioning methods for accurate radiotherapy from the Accurate Radiotherapy Treating System (ARTS) research team of the Institute of Plasma Physics, Chinese Academy of Sciences, including a positioning system based on binocular vision, a position-measuring system based on contour matching, and a breath-gating control system for positioning. Their basic principles, application scenarios, and prospects are briefly described. (authors)

  2. A robust combination approach for short-term wind speed forecasting and analysis – Combination of the ARIMA (Autoregressive Integrated Moving Average), ELM (Extreme Learning Machine), SVM (Support Vector Machine) and LSSVM (Least Square SVM) forecasts using a GPR (Gaussian Process Regression) model

    International Nuclear Information System (INIS)

    Wang, Jianzhou; Hu, Jianming

    2015-01-01

    With the increasing importance of wind power as a component of power systems, the problems induced by the stochastic and intermittent nature of wind speed have compelled system operators and researchers to search for more reliable techniques to forecast wind speed. This paper proposes a combination model for probabilistic short-term wind speed forecasting. In this proposed hybrid approach, EWT (Empirical Wavelet Transform) is employed to extract meaningful information from a wind speed series by designing an appropriate wavelet filter bank. The GPR (Gaussian Process Regression) model is utilized to combine independent forecasts generated by various forecasting engines (ARIMA (Autoregressive Integrated Moving Average), ELM (Extreme Learning Machine), SVM (Support Vector Machine) and LSSVM (Least Square SVM)) in a nonlinear way rather than the commonly used linear way. The proposed approach provides more probabilistic information for wind speed predictions besides improving the forecasting accuracy for single-value predictions. The effectiveness of the proposed approach is demonstrated with wind speed data from two wind farms in China. The results indicate that the individual forecasting engines do not consistently forecast short-term wind speed for the two sites, and the proposed combination method can generate a more reliable and accurate forecast. - Highlights: • The proposed approach can make probabilistic modeling for wind speed series. • The proposed approach adapts to the time-varying characteristic of the wind speed. • The hybrid approach can extract the meaningful components from the wind speed series. • The proposed method can generate adaptive, reliable and more accurate forecasting results. • The proposed model combines four independent forecasting engines in a nonlinear way.

  3. Optical spectroscopy techniques can accurately distinguish benign and malignant renal tumours.

    Science.gov (United States)

    Couapel, Jean-Philippe; Senhadji, Lotfi; Rioux-Leclercq, Nathalie; Verhoest, Grégory; Lavastre, Olivier; de Crevoisier, Renaud; Bensalah, Karim

    2013-05-01

    WHAT'S KNOWN ON THE SUBJECT? AND WHAT DOES THE STUDY ADD?: There is little known about optical spectroscopy techniques ability to evaluate renal tumours. This study shows for the first time the ability of Raman and optical reflectance spectroscopy to distinguish benign and malignant renal tumours in an ex vivo environment. We plan to develop this optical assistance in the operating room in the near future. To evaluate the ability of Raman spectroscopy (RS) and optical reflectance spectroscopy (ORS) to distinguish benign and malignant renal tumours at surgery. Between March and October 2011, RS and ORS spectra were prospectively acquired on surgical renal specimens removed for suspicion of renal cell carcinoma (RCC). Optical measurements were done immediately after surgery. Optical signals were normalised to ensure comparison between spectra. Initial and final portions of each spectrum were removed to avoid artefacts. A support vector machine (SVM) was built and tested using a leave-one-out cross-validation. Classification scores, including accuracy, sensitivity and specificity were calculated on the entire population and in patients with tumours of 700 optical spectra were obtained and submitted to SVM classification. The SVM could recognise benign and malignant renal tumours with an accuracy of 96% (RS) and 88% (ORS) in the whole population and with an accuracy of 93% (RS) and 95% (ORS) in the present subset of small renal tumours (Benign and malignant renal tumours can be accurately discriminated by a combination of RS and ORS. In vivo experiments are needed to further assess the value of optical spectroscopy techniques. © 2012 BJU International.
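
    The leave-one-out cross-validation mentioned above can be sketched as follows. This is a minimal illustration, not the study's pipeline: a nearest-class-mean rule on a single made-up feature stands in for the SVM, and all names and values are hypothetical.

```python
def leave_one_out_accuracy(samples, labels, fit, predict):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        if predict(model, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

def fit_nearest_mean(xs, ys):
    """Stand-in classifier (not an SVM): store the mean feature value per class."""
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means

def predict_nearest_mean(means, x):
    """Assign the class whose stored mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))

# Hypothetical one-dimensional spectral feature, for illustration only.
samples = [1.0, 1.2, 0.9, 3.0, 3.2, 2.9]
labels = ["benign"] * 3 + ["malignant"] * 3

accuracy = leave_one_out_accuracy(samples, labels, fit_nearest_mean, predict_nearest_mean)
print(accuracy)
```

    Because every sample is used for testing exactly once, the scheme suits small data sets where a separate test set would be too costly.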

  4. APPLICATION OF FUSION WITH SAR AND OPTICAL IMAGES IN LAND USE CLASSIFICATION BASED ON SVM

    Directory of Open Access Journals (Sweden)

    C. Bao

    2012-07-01

    Full Text Available With the growth of multi-source remote sensing data at multiple spatial and spectral resolutions, data fusion technologies have been widely used in geological fields. Synthetic Aperture Radar (SAR) and optical cameras are currently the two most common sensors. Multi-spectral optical images express the spectral features of ground objects, while SAR images express backscatter information. The accuracy of image classification can be effectively improved by fusing the two kinds of images. In this paper, TerraSAR-X images and ALOS multi-spectral images were fused for land use classification. After preprocessing such as geometric rectification, radiometric rectification, noise suppression and so on, the two kinds of images were fused, and then an SVM model identification method was used for land use classification. Two different fusion methods were used: one joins the SAR image to the multi-spectral images as one band, and the other directly fuses the two kinds of images. The former can raise the resolution and preserve the texture information, and the latter can preserve spectral feature information and improve the capability of identifying different features. The experimental results showed that classification accuracy using fused images is better than using multi-spectral images alone. Classification accuracy for roads, habitation and water bodies was significantly improved. Compared to the traditional classification method, the method of this paper for fused images with an SVM classifier could achieve better results in identifying complicated land use classes, especially for small ground features.

  5. DisArticle: a web server for SVM-based discrimination of articles on traditional medicine.

    Science.gov (United States)

    Kim, Sang-Kyun; Nam, SeJin; Kim, SangHyun

    2017-01-28

    Much research has been done in Northeast Asia to show the efficacy of traditional medicine. While MEDLINE contains many biomedical articles including those on traditional medicine, it does not categorize those articles by specific research area. The aim of this study was to provide a method that searches for articles only on traditional medicine in Northeast Asia, including traditional Chinese medicine, from among the articles in MEDLINE. This research established an SVM-based classifier model to identify articles on traditional medicine. The TAK + HM classifier, trained with the features of title, abstract, keywords, herbal data, and MeSH, has a precision of 0.954 and a recall of 0.902. In particular, the feature of herbal data significantly increased the performance of the classifier. By using the TAK + HM classifier, a total of about 108,000 articles were discriminated as articles on traditional medicine from among all articles in MEDLINE. We also built a web server called DisArticle ( http://informatics.kiom.re.kr/disarticle ), in which users can search for the articles and obtain statistical data. Because much evidence-based research on traditional medicine has been published in recent years, it has become necessary to search for articles on traditional medicine exclusively in literature databases. DisArticle can help users to search for and analyze the research trends in traditional medicine.

  6. Gene expression signatures of radiation response are specific, durable and accurate in mice and humans.

    Directory of Open Access Journals (Sweden)

    Sarah K Meadows

    2008-04-01

    Full Text Available Previous work has demonstrated the potential for peripheral blood (PB) gene expression profiling for the detection of disease or environmental exposures. We have sought to determine the impact of several variables on the PB gene expression profile of an environmental exposure, ionizing radiation, and to determine the specificity of the PB signature of radiation versus other genotoxic stresses. Neither genotype differences nor the time of PB sampling caused any lessening of the accuracy of PB signatures to predict radiation exposure, but sex difference did influence the accuracy of the prediction of radiation exposure at the lowest level (50 cGy). A PB signature of sepsis was also generated and both the PB signature of radiation and the PB signature of sepsis were found to be 100% specific at distinguishing irradiated from septic animals. We also identified human PB signatures of radiation exposure and chemotherapy treatment which distinguished irradiated patients and chemotherapy-treated individuals within a heterogeneous population with accuracies of 90% and 81%, respectively. We conclude that PB gene expression profiles can be identified in mice and humans that are accurate in predicting medical conditions, are specific to each condition and remain highly accurate over time.

  7. Exploring QSARs of the interaction of flavonoids with GABA (A) receptor using MLR, ANN and SVM techniques.

    Science.gov (United States)

    Deeb, Omar; Shaik, Basheerulla; Agrawal, Vijay K

    2014-10-01

    Quantitative Structure-Activity Relationship (QSAR) models for binding affinity constants (log Ki) of 78 flavonoid ligands towards the benzodiazepine site of GABA (A) receptor complex were calculated using the machine learning methods: artificial neural network (ANN) and support vector machine (SVM) techniques. The models obtained were compared with those obtained using multiple linear regression (MLR) analysis. The descriptor selection and model building were performed with 10-fold cross-validation using the training data set. The SVM and MLR coefficient of determination values are 0.944 and 0.879, respectively, for the training set and are higher than those of ANN models. Though the SVM model shows improvement of training set fitting, the ANN model was superior to SVM and MLR in predicting the test set. Randomization test is employed to check the suitability of the models.

  8. SVM to detect the presence of visitors in a smart home environment.

    Science.gov (United States)

    Petersen, Johanna; Larimer, Nicole; Kaye, Jeffrey A; Pavel, Misha; Hayes, Tamara L

    2012-01-01

    With the rising age of the population, there is increased need to help elderly maintain their independence. Smart homes, employing passive sensor networks and pervasive computing techniques, enable the unobtrusive assessment of activities and behaviors of the elderly which can be useful for health state assessment and intervention. Due to the multiple health benefits associated with socializing, accurately tracking whether an individual has visitors to their home is one of the more important aspects of elders' behaviors that could be assessed with smart home technology. With this goal, we have developed a preliminary SVM model to identify periods where untagged visitors are present in the home. Using the dwell time, number of sensor firings, and number of transitions between major living spaces (living room, dining room, kitchen and bathroom) as features in the model, and self report from two subjects as ground truth, we were able to accurately detect the presence of visitors in the home with a sensitivity and specificity of 0.90 and 0.89 for subject 1, and of 0.67 and 0.78 for subject 2, respectively. These preliminary data demonstrate the feasibility of detecting visitors with in-home sensor data, but highlight the need for more advanced modeling techniques so the model performs well for all subjects and all types of visitors.
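
    Sensitivity and specificity, as reported above, follow directly from the counts of true and false positives and negatives. The sketch below uses made-up labels (1 = visitor present, 0 = no visitor) and is not the study's data.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of real visitor periods that were detected
    specificity = tn / (tn + fp)  # share of visitor-free periods correctly left unflagged
    return sensitivity, specificity

# Hypothetical ground truth (self report) and model output, for illustration only.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(truth, predicted)
print(sens, spec)
```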

  9. Gene regulatory network inference by point-based Gaussian approximation filters incorporating the prior information.

    Science.gov (United States)

    Jia, Bin; Wang, Xiaodong

    2013-12-17

    The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5) are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.

  10. A New Hybrid Model FPA-SVM Considering Cointegration for Particular Matter Concentration Forecasting: A Case Study of Kunming and Yuxi, China.

    Science.gov (United States)

    Li, Weide; Kong, Demeng; Wu, Jinran

    2017-01-01

    Air pollution in China is becoming more serious, especially for particulate matter (PM), because of rapid economic growth and fast expansion of urbanization. To solve the growing environment problems, daily PM2.5 and PM10 concentration data from January 1, 2015, to August 23, 2016, in Kunming and Yuxi (two important cities in Yunnan Province, China) are used to present a new hybrid model CI-FPA-SVM to forecast air PM2.5 and PM10 concentration in this paper. The proposed model involves two parts. Firstly, due to its deficiency to assess the possible correlation between different variables, the cointegration theory is introduced to get the input-output relationship and then obtain the nonlinear dynamical system with support vector machine (SVM), in which the parameters c and g are optimized by flower pollination algorithm (FPA). Six benchmark models, including FPA-SVM, CI-SVM, CI-GA-SVM, CI-PSO-SVM, CI-FPA-NN, and multiple linear regression model, are considered to verify the superiority of the proposed hybrid model. The empirical study results demonstrate that the proposed model CI-FPA-SVM is remarkably superior to all considered benchmark models for its high prediction accuracy, and the application of the model for forecasting can give effective monitoring and management of further air quality.

  11. Decision Fusion Based on Hyperspectral and Multispectral Satellite Imagery for Accurate Forest Species Mapping

    Directory of Open Access Journals (Sweden)

    Dimitris G. Stavrakoudis

    2014-07-01

    Full Text Available This study investigates the effectiveness of combining multispectral very high resolution (VHR) and hyperspectral satellite imagery through a decision fusion approach, for accurate forest species mapping. Initially, two fuzzy classifications are conducted, one for each satellite image, using a fuzzy output support vector machine (SVM). The classification result from the hyperspectral image is then resampled to the multispectral’s spatial resolution and the two sources are combined using a simple yet efficient fusion operator. Thus, the complementary information provided from the two sources is effectively exploited, without having to resort to computationally demanding and time-consuming typical data fusion or vector stacking approaches. The effectiveness of the proposed methodology is validated in a complex Mediterranean forest landscape, comprising spectrally similar and spatially intermingled species. The decision fusion scheme resulted in an accuracy increase of 8% compared to the classification using only the multispectral imagery, whereas the increase was even higher compared to the classification using only the hyperspectral satellite image. Perhaps most importantly, its accuracy was significantly higher than alternative multisource fusion approaches, although the latter are characterized by much higher computation, storage, and time requirements.

  12. Forecasting Dry Bulk Freight Index with Improved SVM

    Directory of Open Access Journals (Sweden)

    Qianqian Han

    2014-01-01

    Full Text Available An improved SVM model is presented in this paper to forecast the dry bulk freight index (BDI), which is a powerful tool for operators and investors to manage the market trend and avoid price risk in the shipping industry. The BDI is influenced by many factors, especially random incidents in the dry bulk market, which make the BDI difficult to forecast. Therefore, to eliminate the impact of random incidents in the dry bulk market, the wavelet transform is adopted to denoise the BDI data series. A combined model of wavelet transform and support vector machine is then developed to forecast the BDI. Lastly, the BDI data from 2005 to 2012 are used to test the proposed model. The 84 prior consecutive monthly BDI data are the inputs of the model, and the last 12 monthly BDI data are the outputs of the model. The parameters of the model are optimized by a genetic algorithm and the final model is confirmed through SVM training. This paper compares the forecasting result of the proposed method with those of three other forecasting methods. The result shows that the proposed method has higher accuracy and could be used to forecast the short-term trend of the BDI.

  13. Evaluation of liquefaction potential of soil based on standard penetration test using multi-gene genetic programming model

    Science.gov (United States)

    Muduli, Pradyut; Das, Sarat

    2014-06-01

    This paper discusses the evaluation of liquefaction potential of soil based on standard penetration test (SPT) dataset using evolutionary artificial intelligence technique, multi-gene genetic programming (MGGP). The liquefaction classification accuracy (94.19%) of the developed liquefaction index (LI) model is found to be better than that of available artificial neural network (ANN) model (88.37%) and at par with the available support vector machine (SVM) model (94.19%) on the basis of the testing data. Further, an empirical equation is presented using MGGP to approximate the unknown limit state function representing the cyclic resistance ratio (CRR) of soil based on developed LI model. Using an independent database of 227 cases, the overall rates of successful prediction of occurrence of liquefaction and non-liquefaction are found to be 87, 86, and 84% by the developed MGGP based model, available ANN and the statistical models, respectively, on the basis of calculated factor of safety (Fs) against the liquefaction occurrence.

  14. Support vector machine based diagnostic system for breast cancer using swarm intelligence.

    Science.gov (United States)

    Chen, Hui-Ling; Yang, Bo; Wang, Gang; Wang, Su-Jing; Liu, Jie; Liu, Da-You

    2012-08-01

    Breast cancer is becoming a leading cause of death among women worldwide; meanwhile, it is confirmed that early detection and accurate diagnosis of this disease can ensure long survival of patients. In this paper, a swarm intelligence based support vector machine classifier (PSO-SVM) is proposed for breast cancer diagnosis. In the proposed PSO-SVM, model selection and feature selection in SVM are solved simultaneously under the particle swarm optimization (PSO) framework. A weighted function is adopted to design the objective function of PSO, which takes into account the average accuracy rate of SVM (ACC), the number of support vectors (SVs) and the selected features simultaneously. Furthermore, time varying acceleration coefficients (TVAC) and inertia weight (TVIW) are employed to efficiently control the local and global search in the PSO algorithm. The effectiveness of PSO-SVM has been rigorously evaluated against the Wisconsin Breast Cancer Dataset (WBCD), which is commonly used among researchers who use machine learning methods for breast cancer diagnosis. The proposed system is compared with the grid search method with feature selection by F-score. The experimental results demonstrate that the proposed approach not only obtains much more appropriate model parameters and a more discriminative feature subset, but also needs a smaller set of SVs for training, giving high predictive accuracy. In addition, compared to the existing methods in previous studies, the proposed system can also be regarded as a promising success, with an excellent classification accuracy of 99.3% via 10-fold cross validation (CV) analysis. Moreover, a combination of five informative features is identified, which might provide important insights into the nature of breast cancer and give physicians an important clue that deserves closer attention. 
We believe the promising result can ensure that the physicians make very accurate diagnostic decision in
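
    A weighted objective of the kind described above, combining accuracy, the number of support vectors and the number of selected features, might look like the sketch below. The functional form, the weights and all names are assumptions made for illustration; the abstract does not give the authors' actual formula.

```python
def weighted_fitness(accuracy, n_support_vectors, n_train,
                     n_selected_features, n_total_features,
                     w_acc=0.8, w_sv=0.1, w_feat=0.1):
    """Hypothetical fitness: reward accuracy, fewer support vectors, fewer features.

    The weights and the linear combination are illustrative assumptions,
    not values taken from the paper.
    """
    sv_term = 1.0 - n_support_vectors / n_train              # fewer SVs scores higher
    feat_term = 1.0 - n_selected_features / n_total_features  # fewer features scores higher
    return w_acc * accuracy + w_sv * sv_term + w_feat * feat_term

# Made-up candidate solution, for illustration only.
score = weighted_fitness(accuracy=0.95, n_support_vectors=20, n_train=100,
                         n_selected_features=5, n_total_features=10)
print(score)
```

    In a PSO loop, each particle would encode candidate SVM parameters and a feature mask, and a score of this kind would rank the particles.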

  15. Using Generalized Entropies and OC-SVM with Mahalanobis Kernel for Detection and Classification of Anomalies in Network Traffic

    Directory of Open Access Journals (Sweden)

    Jayro Santiago-Paz

    2015-09-01

    Full Text Available Network anomaly detection and classification is an important open issue in network security. Several approaches and systems based on different mathematical tools have been studied and developed, among them, the Anomaly-Network Intrusion Detection System (A-NIDS), which monitors network traffic and compares it against an established baseline of a “normal” traffic profile. Then, it is necessary to characterize the “normal” Internet traffic. This paper presents an approach for anomaly detection and classification based on Shannon, Rényi and Tsallis entropies of selected features, and the construction of regions from entropy data employing the Mahalanobis distance (MD), and One Class Support Vector Machine (OC-SVM) with different kernels (Radial Basis Function (RBF) and Mahalanobis Kernel (MK)) for “normal” and abnormal traffic. Regular and non-regular regions built from “normal” traffic profiles allow anomaly detection, while the classification is performed under the assumption that regions corresponding to the attack classes have been previously characterized. Although this approach allows the use of as many features as required, only four well-known significant features were selected in our case. In order to evaluate our approach, two different data sets were used: one set of real traffic obtained from an Academic Local Area Network (LAN), and the other a subset of the 1998 MIT-DARPA set. For these data sets, a True positive rate up to 99.35%, a True negative rate up to 99.83% and a False negative rate at about 0.16% were yielded. Experimental results show that certain q-values of the generalized entropies and the use of OC-SVM with RBF kernel improve the detection rate in the detection stage, while the novel inclusion of MK kernel in OC-SVM and k-temporal nearest neighbors improve accuracy in classification. In addition, the results show that using the Box-Cox transformation, the Mahalanobis distance yielded high detection rates with
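
    The three entropies named in this record can be computed from the empirical distribution of a traffic feature, as in the minimal sketch below. The feature values (destination ports) are made up, natural logarithms are used, and the choice of q is arbitrary; none of this is taken from the paper.

```python
import math
from collections import Counter

def probabilities(values):
    """Empirical probability of each distinct value in a sample."""
    counts = Counter(values)
    total = len(values)
    return [c / total for c in counts.values()]

def shannon_entropy(p):
    """H = -sum(p_i * ln p_i)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def renyi_entropy(p, q):
    """H_q = ln(sum(p_i^q)) / (1 - q); reduces to Shannon entropy as q -> 1."""
    if q == 1:
        return shannon_entropy(p)
    return math.log(sum(pi ** q for pi in p)) / (1 - q)

def tsallis_entropy(p, q):
    """S_q = (1 - sum(p_i^q)) / (q - 1); reduces to Shannon entropy as q -> 1."""
    if q == 1:
        return shannon_entropy(p)
    return (1 - sum(pi ** q for pi in p)) / (q - 1)

# Hypothetical destination ports seen in one time window, for illustration only.
ports = [80, 443, 22, 53]
p = probabilities(ports)

print(shannon_entropy(p), renyi_entropy(p, 2), tsallis_entropy(p, 2))
```

    A window dominated by a single value (for example, a scan hitting one port) drives all three entropies toward zero, which is what makes them usable as anomaly features.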

  16. Estimation of Costs and Durations of Construction of Urban Roads Using ANN and SVM

    Directory of Open Access Journals (Sweden)

    Igor Peško

    2017-01-01

    Full Text Available Offer preparation has always been a specific part of the building process, with a significant impact on company business. Because income greatly depends on the offer's precision and on the balance between planned costs, both direct and overhead, and desired profit, it is necessary to prepare a precise offer within the required time and with the available resources, which are always insufficient. The paper presents a study of the precision that can be achieved when using artificial intelligence for estimation of cost and duration in construction projects. Both artificial neural networks (ANNs) and support vector machines (SVMs) are analysed and compared. When estimating costs, the best SVM showed higher precision, with a mean absolute percentage error (MAPE) of 7.06%, compared to the most precise ANN, which achieved 25.38%. Estimation of works duration proved to be more difficult. The best MAPEs were 22.77% and 26.26% for SVM and ANN, respectively.
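
    The MAPE figures quoted above can be reproduced on any set of estimates with a few lines of code. The following is a minimal sketch, assuming the usual definition of MAPE; the function name `mape` and the sample costs are hypothetical and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the true value.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical project costs and model estimates (illustration only).
actual_costs = [100.0, 200.0, 400.0]
estimated_costs = [110.0, 190.0, 400.0]
print(round(mape(actual_costs, estimated_costs), 2))
```

    Because every error is divided by the true value, MAPE is undefined when an actual value is zero, which is why it suits strictly positive quantities such as costs and durations.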

  17. Image Segmentation Using Support Vector Machine (SVM) and Ellipsoid Region Search Strategy (ERSS) Arimoto Entropy Based on Color and Texture Features

    Directory of Open Access Journals (Sweden)

    Lukman Hakim

    2016-02-01

    Full Text Available Abstract (Indonesian) Image segmentation is an important method in digital image processing that aims to divide an image into several homogeneous regions based on certain similarity criteria. One of the main requirements of an image segmentation method is that it produce an optimal boundary image. To meet this requirement, a segmentation method needs a pixel classification that can separate pixels both linearly and non-linearly. In this study, the authors propose an image segmentation method using SVM and ERSS-based Arimoto entropy that is robust to noise and has low complexity, in order to produce an optimal boundary image. First, color features are extracted with local homogeneity and texture features with the Gray Level Co-occurrence Matrix (GLCM), which yields several features. Second, labeling is performed with ERSS-based Arimoto entropy, and the labels are used as classes in the classification. Third, the extracted features and training results are classified by label with the trained SVM. The experiments show that the segmentation results are less than optimal, with an accuracy of 69%. Feature reduction is needed to produce well-segmented images. Keywords: image segmentation, support vector machine, ERSS Arimoto entropy, feature extraction. Abstract (English) Image segmentation is an important tool in image processing that divides an image into homogeneous regions based on certain similarity criteria, which ideally should be meaningful for a certain purpose. An optimal boundary is one of the main criteria that an image segmentation method should meet. An image segmentation method needs a classification method that can partition pixels linearly or non-linearly.
We propose a color image segmentation using Support Vector Machine (SVM) classification and ERSS Arimoto entropy thresholding to get an optimal boundary of the segmented image that is noise-free and low complexity

  18. A New Hybrid Model FPA-SVM Considering Cointegration for Particular Matter Concentration Forecasting: A Case Study of Kunming and Yuxi, China

    Directory of Open Access Journals (Sweden)

    Weide Li

    2017-01-01

    Full Text Available Air pollution in China is becoming more serious, especially for particulate matter (PM), because of rapid economic growth and fast expansion of urbanization. To address the growing environmental problems, daily PM2.5 and PM10 concentration data from January 1, 2015, to August 23, 2016, in Kunming and Yuxi (two important cities in Yunnan Province, China) are used to present a new hybrid model, CI-FPA-SVM, to forecast air PM2.5 and PM10 concentrations in this paper. The proposed model involves two parts. First, to address the deficiency in assessing the possible correlation between different variables, cointegration theory is introduced to obtain the input-output relationship; the nonlinear dynamical system is then obtained with a support vector machine (SVM), in which the parameters c and g are optimized by the flower pollination algorithm (FPA). Six benchmark models, including FPA-SVM, CI-SVM, CI-GA-SVM, CI-PSO-SVM, CI-FPA-NN, and a multiple linear regression model, are considered to verify the superiority of the proposed hybrid model. The empirical results demonstrate that the proposed CI-FPA-SVM model is remarkably superior to all considered benchmark models in prediction accuracy, and that applying the model to forecasting can support effective monitoring and management of future air quality.

  19. Meta-analytic approach to the accurate prediction of secreted virulence effectors in gram-negative bacteria

    Directory of Open Access Journals (Sweden)

    Sato Yoshiharu

    2011-11-01

    Full Text Available Abstract Background Many pathogens use a type III secretion system to translocate virulence proteins (called effectors) in order to adapt to the host environment. To date, many prediction tools for effector identification have been developed. However, these tools are insufficiently accurate for producing a list of putative effectors that can be applied directly for labor-intensive experimental verification. This also suggests that important features of effectors have yet to be fully characterized. Results In this study, we have constructed an accurate approach to predicting secreted virulence effectors from Gram-negative bacteria. This consists of a support vector machine-based discriminant analysis followed by simple criteria-based filtering. The accuracy was assessed by estimating the average number of true positives in the top-20 ranking in the genome-wide screening. In the validation, 10 sets of 20 training and 20 testing examples were randomly selected from 40 known effectors of Salmonella enterica serovar Typhimurium LT2. On average, the SVM portion of our system predicted 9.7 true positives from 20 testing examples in the top-20 of the prediction. Removal of the N-terminal instability, codon adaptation index and ProtParam indices decreased the score to 7.6, 8.9 and 7.9, respectively. These discrimination features suggested that the following characteristics of effectors had been uncovered: unstable N-terminus, non-optimal codon usage, hydrophilic, and less aliphatic. The secondary filtering process, represented by coexpression analysis and domain distribution analysis, further refined the average true positive count to 12.3. We further confirmed that our system can correctly predict known effectors of P. syringae DC3000, strongly indicating its feasibility. Conclusions We have successfully developed an accurate prediction system for screening effectors on a genome-wide scale.
We confirmed the accuracy of our system by external validation

  20. OPTIMIZATION OF SUPPORT VECTOR MACHINE (SVM) FOR K-MEANS-BASED CLASSIFICATION OF FINAL PROJECT THEMES

    Directory of Open Access Journals (Sweden)

    Oman Somantri

    2017-01-01

    Full Text Available Colleges often have difficulty classifying the themes of students' final projects. The purpose of this study is to provide decision support for policy makers in the study program, so that each student can be directed in accordance with their own competence. In this research, text mining using a Support Vector Machine (SVM) combined with K-Means produced a better accuracy of 86.21%, compared with 85.38% for the SVM without K-Means.

  1. Stationary Wavelet Transform and AdaBoost with SVM Based Pathological Brain Detection in MRI Scanning.

    Science.gov (United States)

    Nayak, Deepak Ranjan; Dash, Ratnakar; Majhi, Banshidhar

    2017-01-01

    This paper presents an automatic classification system for segregating pathological brains from normal brains in magnetic resonance imaging scanning. The proposed system employs a contrast limited adaptive histogram equalization scheme to enhance the diseased region in brain MR images. A two-dimensional stationary wavelet transform is used to extract features from the preprocessed images. The feature vector is constructed using the energy and entropy values computed from the level-2 SWT coefficients. Then, the relevant and uncorrelated features are selected using a symmetric uncertainty ranking filter. Subsequently, the selected features are given as input to the proposed AdaBoost with support vector machine classifier, where SVM is used as the base classifier of the AdaBoost algorithm. To validate the proposed system, three standard MR image datasets, Dataset-66, Dataset-160, and Dataset-255, have been utilized. Five runs of k-fold stratified cross-validation indicate that the suggested scheme offers better performance than other existing schemes in terms of accuracy and number of features. The proposed system achieves ideal classification over Dataset-66 and Dataset-160, whereas for Dataset-255 an accuracy of 99.45% is achieved. Copyright © Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  2. High resolution tempo-spatial ozone prediction with SVM and LSTM

    Science.gov (United States)

    Gao, D.; Zhang, Y.; Qu, Z.; Sadighi, K.; Coffey, E.; LIU, Q.; Hannigan, M.; Henze, D. K.; Dick, R.; Shang, L.; Lv, Q.

    2017-12-01

    To investigate and predict the exposure of ozone and other pollutants in urban areas, we utilize data from various infrastructures, including EPA, NOAA and RIITS from the government of Los Angeles, and construct statistical models to predict ozone concentration in the Los Angeles area at finer spatial and temporal granularity. Our work involves cyber data such as traffic, roads and population data as features for prediction. Two statistical models, Support Vector Machine (SVM) and Long Short-term Memory (LSTM, a deep learning method), are used for prediction. Our experiments show that kernelized SVM gains better prediction performance when taking traffic counts, road density and population density as features, with a prediction RMSE of 7.99 ppb for all-time ozone and 6.92 ppb for peak-value ozone. With simulated NOx from a Chemical Transport Model (CTM) as features, SVM generates even better prediction performance, with a prediction RMSE of 6.69 ppb. We also build an LSTM, which has shown great advantages in dealing with temporal sequences, to predict ozone concentration by treating ozone concentration as spatial-temporal sequences. Trained on ozone concentration measurements from the 13 EPA stations in the LA area, the model achieves 4.45 ppb RMSE. In addition, we build a variant of this model that adds spatial dynamics in the form of a transition matrix, which reveals new knowledge on pollutant transition. The forgetting gate of the trained LSTM is consistent with the delay effect of ozone concentration, and the trained transition matrix shows spatial consistency with the common direction of winds in the LA area.

  3. Vehicle Detection with Occlusion Handling, Tracking, and OC-SVM Classification: A High Performance Vision-Based System

    Science.gov (United States)

    Velazquez-Pupo, Roxana; Sierra-Romero, Alberto; Torres-Roman, Deni; Shkvarko, Yuriy V.; Romero-Delgado, Misael

    2018-01-01

    This paper presents a high performance vision-based system with a single static camera for traffic surveillance, covering moving vehicle detection with occlusion handling, tracking, counting, and One Class Support Vector Machine (OC-SVM) classification. In this approach, moving objects are first segmented from the background using the adaptive Gaussian Mixture Model (GMM). After that, several geometric features are extracted, such as vehicle area, height, width, centroid, and bounding box. As occlusion is present, an algorithm was implemented to reduce it. The tracking is performed with an adaptive Kalman filter. Finally, the selected geometric features (estimated area, height, and width) are used by different classifiers in order to sort vehicles into three classes: small, midsize, and large. Extensive experimental results in eight real traffic videos with more than 4000 ground truth vehicles have shown that the improved system can run in real time under an occlusion index of 0.312 and classify vehicles with a global detection rate or recall, precision, and F-measure of up to 98.190%, and an F-measure of up to 99.051% for midsize vehicles. PMID:29382078
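
    The recall, precision, and F-measure reported above are all derived from counts of true positives, false positives, and false negatives. The following is a minimal sketch using the standard definitions; the function name and the counts are hypothetical and do not come from the paper's experiments.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F-measure) from detection counts."""
    # Precision: share of reported detections that are correct.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall (detection rate): share of ground-truth objects that were found.
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure: harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical counts for one vehicle class (illustration only).
p, r, f = precision_recall_f(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```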

  4. Identification of Importin 8 (IPO8) as the most accurate reference gene for the clinicopathological analysis of lung specimens

    Directory of Open Access Journals (Sweden)

    Pio Ruben

    2008-11-01

    Full Text Available Abstract Background The accurate normalization of differentially expressed genes in lung cancer is essential for the identification of novel therapeutic targets and biomarkers by real time RT-PCR and microarrays. Although classical "housekeeping" genes, such as GAPDH, HPRT1, and beta-actin, have been widely used in the past, their accuracy as reference genes for lung tissues has not been proven. Results We have conducted a thorough analysis of a panel of 16 candidate reference genes for lung specimens and lung cell lines. Gene expression was measured by quantitative real time RT-PCR, and expression stability was analyzed with the software packages GeNorm and NormFinder, the mean of |ΔCt| (= |Ct normal - Ct tumor|) ± SEM, and correlation coefficients among genes. Systematic comparison between candidates led us to the identification of a subset of suitable reference genes for clinical samples: IPO8, ACTB, POLR2A, 18S, and PPIA. Further analysis showed that IPO8 had a very low mean of |ΔCt| (0.70 ± 0.09), with no statistically significant differences between normal and malignant samples and with excellent expression stability. Conclusion Our data show that IPO8 is the most accurate reference gene for clinical lung specimens. In addition, we demonstrate that the commonly used genes GAPDH and HPRT1 are inappropriate to normalize data derived from lung biopsies, although they are suitable as reference genes for lung cell lines. We thus propose IPO8 as a novel reference gene for lung cancer samples.
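
    The stability statistic quoted above, the mean of |ΔCt| ± SEM, is simple to compute once paired Ct values are available. The following is a minimal sketch, assuming each normal sample is paired with a tumor sample from the same patient; the function name and the Ct values are hypothetical and are not data from the study.

```python
import math
import statistics


def mean_abs_delta_ct(ct_normal, ct_tumor):
    """Return (mean, SEM) of |Ct normal - Ct tumor| over paired samples."""
    if len(ct_normal) != len(ct_tumor) or len(ct_normal) < 2:
        raise ValueError("need at least two paired samples")
    deltas = [abs(n - t) for n, t in zip(ct_normal, ct_tumor)]
    mean = statistics.mean(deltas)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    sem = statistics.stdev(deltas) / math.sqrt(len(deltas))
    return mean, sem


# Hypothetical Ct values for one candidate reference gene (illustration only).
normal = [20.0, 21.0, 22.0, 23.0]
tumor = [21.0, 21.0, 20.0, 22.0]
mean, sem = mean_abs_delta_ct(normal, tumor)
print(f"|dCt| = {mean:.2f} +/- {sem:.2f}")
```

    A gene whose mean |ΔCt| is close to zero is expressed at nearly the same level in normal and tumor tissue, which is the property a reference gene needs.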

  5. Fast and accurate methods for phylogenomic analyses

    Directory of Open Access Journals (Sweden)

    Warnow Tandy

    2011-10-01

    Full Text Available Abstract Background Species phylogenies are not estimated directly, but rather through phylogenetic analyses of different gene datasets. However, true gene trees can differ from the true species tree (and hence from one another) due to biological processes such as horizontal gene transfer, incomplete lineage sorting, and gene duplication and loss, so that no single gene tree is a reliable estimate of the species tree. Several methods have been developed to estimate species trees from estimated gene trees, differing according to the specific algorithmic technique used and the biological model used to explain differences between species and gene trees. Relatively little is known about the relative performance of these methods. Results We report on a study evaluating several different methods for estimating species trees from sequence datasets, simulating sequence evolution under a complex model including indels (insertions and deletions), substitutions, and incomplete lineage sorting. The most important finding of our study is that some fast and simple methods are nearly as accurate as the most accurate methods, which employ sophisticated statistical methods and are computationally quite intensive. We also observe that methods that explicitly consider errors in the estimated gene trees produce more accurate trees than methods that assume the estimated gene trees are correct. Conclusions Our study shows that highly accurate estimations of species trees are achievable, even when gene trees differ from each other and from the species tree, and that these estimations can be obtained using fairly simple and computationally tractable methods.

  6. Support vector machine regression (SVR/LS-SVM)--an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data.

    Science.gov (United States)

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-04-21

    In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.

  7. Coal demand prediction based on a support vector machine model

    Energy Technology Data Exchange (ETDEWEB)

    Jia, Cun-liang; Wu, Hai-shan; Gong, Dun-wei [China University of Mining & Technology, Xuzhou (China). School of Information and Electronic Engineering

    2007-01-15

    A forecasting model for the coal demand of China was constructed using support vector regression. With the selected embedding dimension, the output vectors and input vectors were constructed based on the coal demand of China from 1980 to 2002. After comparison with the linear kernel and the sigmoid kernel, a radial basis function (RBF) was adopted as the kernel function. By analyzing the relationship between the error margin of prediction and the model parameters, the proper parameters were chosen. A support vector machine (SVM) model with multiple inputs and a single output was proposed. Compared on test datasets with a predictor based on RBF neural networks, the SVM predictor shows higher precision and greater generalization ability. Finally, the coal demand from 2003 to 2006 is accurately forecasted. 10 refs., 2 figs., 4 tabs.
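
    Building input and output vectors from a single yearly series with a chosen embedding dimension is usually done with a sliding window. The following is a minimal sketch of that step only; the abstract does not state the embedding dimension, so `dim`, the function name, and the sample series are hypothetical.

```python
def delay_embed(series, dim):
    """Turn a 1-D series into (inputs, targets) for one-step-ahead regression.

    Each input is `dim` consecutive values; its target is the value that follows.
    """
    if dim < 1 or dim >= len(series):
        raise ValueError("dim must be between 1 and len(series) - 1")
    n_samples = len(series) - dim
    inputs = [list(series[i:i + dim]) for i in range(n_samples)]
    targets = [series[i + dim] for i in range(n_samples)]
    return inputs, targets


# Hypothetical yearly demand values (illustration only, not the paper's data).
demand = [6.1, 6.3, 6.8, 7.0, 7.4, 7.9]
X, y = delay_embed(demand, dim=3)
for features, target in zip(X, y):
    print(features, "->", target)
```

    Each row of `X` paired with the matching entry of `y` gives the multi-input, single-output structure that a regression model can then be trained on.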

  8. Robust Non-Linear Direct Torque and Flux Control of Adjustable Speed Sensorless PMSM Drive Based on SVM Using a PI Predictive Controller

    Directory of Open Access Journals (Sweden)

    F. Naceri

    2010-01-01

    Full Text Available This paper presents a new sensorless direct torque control method for a voltage-inverter-fed PMSM. The control method uses a modified Direct Torque Control scheme with constant inverter switching frequency using Space Vector Modulation (DTC-SVM). The variation of stator and rotor resistance due to changes in temperature or frequency deteriorates the performance of the DTC-SVM controller by introducing errors in the estimated flux linkage and the electromagnetic torque. As a result, this approach will not be suitable for high power drives such as those used in traction, as they require good torque control performance at considerably lower frequency. A novel stator resistance estimator is proposed. The estimation method is implemented using the Extended Kalman Filter. Finally, extensive simulation results are presented to validate the proposed technique. The system is tested at different speeds and a very satisfactory performance has been achieved.

  9. Enzymic colorimetry-based DNA chip: a rapid and accurate assay for detecting mutations for clarithromycin resistance in the 23S rRNA gene of Helicobacter pylori.

    Science.gov (United States)

    Xuan, Shi-Hai; Zhou, Yu-Gui; Shao, Bo; Cui, Ya-Lin; Li, Jian; Yin, Hong-Bo; Song, Xiao-Ping; Cong, Hui; Jing, Feng-Xiang; Jin, Qing-Hui; Wang, Hui-Min; Zhou, Jie

    2009-11-01

    Macrolide drugs, such as clarithromycin (CAM), are a key component of many combination therapies used to eradicate Helicobacter pylori. However, resistance to CAM is increasing in H. pylori and is becoming a serious problem in H. pylori eradication therapy. CAM resistance in H. pylori is mostly due to point mutations (A2142G/C, A2143G) in the peptidyltransferase-encoding region of the 23S rRNA gene. In this study an enzymic colorimetry-based DNA chip was developed to analyse single-nucleotide polymorphisms of the 23S rRNA gene to determine the prevalence of mutations in CAM-related resistance in H. pylori-positive patients. The results of the colorimetric DNA chip were confirmed by direct DNA sequencing. In 63 samples, the incidence of the A2143G mutation was 17.46% (11/63). The results of the colorimetric DNA chip were concordant with DNA sequencing in 96.83% of results (61/63). The colorimetric DNA chip could detect wild-type and mutant signals at every site, even at a DNA concentration of 1.53 × 10^2 copies μl^-1. Thus, the colorimetric DNA chip is a reliable assay for rapid and accurate detection of mutations in the 23S rRNA gene of H. pylori that lead to CAM-related resistance, directly from gastric tissues.

  10. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    Science.gov (United States)

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
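
    The k-mer sequence features mentioned above can be pictured as a fixed-length vector of substring counts. The following is a simplified sketch of that idea, not the server's implementation: the function name is hypothetical, and the published tool may differ in details such as strand handling or normalization.

```python
from collections import Counter
from itertools import product


def kmer_features(sequence, k):
    """Count every length-k substring of a DNA sequence.

    Returns a list with one count per possible k-mer over A, C, G, T,
    in lexicographic order, so every sequence maps to 4**k features.
    """
    sequence = sequence.upper()
    # Slide a window of width k along the sequence and tally what it sees.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[kmer] for kmer in all_kmers]


# Hypothetical short sequence (illustration only).
features = kmer_features("ACGTAC", k=2)
print(len(features), sum(features))
```

    Vectors of this kind, computed for bound and unbound regions, are the sort of input a support vector machine can be trained on to find predictive sequence features.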

  11. Using an Integrated Group Decision Method Based on SVM, TFN-RS-AHP, and TOPSIS-CD for Cloud Service Supplier Selection

    Directory of Open Access Journals (Sweden)

    Lian-hui Li

    2017-01-01

    Full Text Available To solve the cloud service supplier selection problem that has arisen with the emergence of cloud computing, an integrated group decision method is proposed. The cloud service supplier selection index framework is built from the two perspectives of technology and technology management. A support vector machine (SVM)-based classification model is applied for preliminary screening to reduce the number of candidate suppliers. A triangular fuzzy number-rough sets-analytic hierarchy process (TFN-RS-AHP) method is designed to calculate each supplier's index value from experts' wisdom and experience. The index weight is determined by criteria importance through intercriteria correlation (CRITIC). The suppliers are evaluated by an improved TOPSIS that replaces Euclidean distance with connection distance (TOPSIS-CD). A case from an electric power enterprise is given to illustrate the correctness and feasibility of the proposed method.
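
    For orientation, the classic TOPSIS ranking that the paper modifies can be sketched in a few lines. Note the swap: this sketch uses the ordinary Euclidean distance, not the paper's connection distance, and the function name, weights, and supplier scores are all hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives with classic TOPSIS (Euclidean distance).

    matrix  -- rows are alternatives, columns are criteria (non-zero columns assumed)
    weights -- one weight per criterion
    benefit -- True where larger is better, False where smaller is better
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    columns = list(zip(*weighted))
    # Ideal and anti-ideal points, criterion by criterion.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        total = d_pos + d_neg
        # Relative closeness: 1 means identical to the ideal point.
        scores.append(d_neg / total if total else 0.0)
    return scores


# Hypothetical suppliers scored on (price, reliability); lower price is better.
suppliers = [[250.0, 0.90], [200.0, 0.85], [300.0, 0.99]]
print(topsis(suppliers, weights=[0.4, 0.6], benefit=[False, True]))
```

    The alternative with the highest closeness score is ranked first; swapping in another distance measure only changes how `d_pos` and `d_neg` are computed.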

  12. Discrimination between Alzheimer's Disease and Mild Cognitive Impairment Using SOM and PSO-SVM

    Directory of Open Access Journals (Sweden)

    Shih-Ting Yang

    2013-01-01

    Full Text Available In this study, an MRI-based classification framework was proposed to distinguish patients with AD and MCI from normal participants by using multiple features and different classifiers. First, we extracted features (volume and shape) from MRI data by using a series of image processing steps. Subsequently, we applied principal component analysis (PCA) to convert a set of features of possibly correlated variables into a smaller set of values of linearly uncorrelated variables, decreasing the dimensions of the feature space. Finally, we developed a novel data mining framework combining a support vector machine (SVM) and particle swarm optimization (PSO) for the AD/MCI classification. In order to compare the hybrid method with traditional classifiers, two kinds of classifiers, that is, SVM and a self-organizing map (SOM), were trained for patient classification. With the proposed framework, the classification accuracy is improved up to 82.35% and 77.78% in patients with AD and MCI. The result achieved up to 94.12% and 88.89% in AD and MCI by combining the volumetric features and shape features and using PCA. The present results suggest that novel multivariate methods of pattern matching reach a clinically relevant accuracy for the a priori prediction of the progression from MCI to AD.

  13. Automated Classification and Removal of EEG Artifacts With SVM and Wavelet-ICA.

    Science.gov (United States)

    Sai, Chong Yeh; Mokhtar, Norrima; Arof, Hamzah; Cumming, Paul; Iwahashi, Masahiro

    2018-05-01

    Brain electrical activity recordings by electroencephalography (EEG) are often contaminated with signal artifacts. Procedures for automated removal of EEG artifacts are frequently sought for clinical diagnostics and brain-computer interface applications. In recent years, a combination of independent component analysis (ICA) and discrete wavelet transform has been introduced as standard technique for EEG artifact removal. However, in performing the wavelet-ICA procedure, visual inspection or arbitrary thresholding may be required for identifying artifactual components in the EEG signal. We now propose a novel approach for identifying artifactual components separated by wavelet-ICA using a pretrained support vector machine (SVM). Our method presents a robust and extendable system that enables fully automated identification and removal of artifacts from EEG signals, without applying any arbitrary thresholding. Using test data contaminated by eye blink artifacts, we show that our method performed better in identifying artifactual components than did existing thresholding methods. Furthermore, wavelet-ICA in conjunction with SVM successfully removed target artifacts, while largely retaining the EEG source signals of interest. We propose a set of features including kurtosis, variance, Shannon's entropy, and range of amplitude as training and test data of SVM to identify eye blink artifacts in EEG signals. This combinatorial method is also extendable to accommodate multiple types of artifacts present in multichannel EEG. We envision future research to explore other descriptive features corresponding to other types of artifactual components.
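
    The four descriptive features named above (kurtosis, variance, Shannon's entropy, and range of amplitude) can be computed per component before classification. The following is a minimal sketch under stated assumptions: entropy is estimated from a fixed-bin amplitude histogram, kurtosis is the non-excess form, and the function name, bin count, and sample signal are hypothetical rather than the authors' settings.

```python
import math
from collections import Counter


def component_features(signal, bins=10):
    """Return kurtosis, variance, Shannon entropy (bits) and amplitude range."""
    n = len(signal)
    mean = sum(signal) / n
    # Population variance and fourth central moment.
    variance = sum((v - mean) ** 2 for v in signal) / n
    fourth = sum((v - mean) ** 4 for v in signal) / n
    kurtosis = fourth / variance ** 2 if variance > 0 else 0.0
    low, high = min(signal), max(signal)
    amp_range = high - low
    if amp_range == 0:
        entropy = 0.0
    else:
        # Histogram the amplitudes; the maximum value goes into the last bin.
        width = amp_range / bins
        indices = [min(int((v - low) / width), bins - 1) for v in signal]
        counts = Counter(indices)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"kurtosis": kurtosis, "variance": variance,
            "entropy": entropy, "range": amp_range}


# Hypothetical component with one blink-like spike (illustration only).
component = [0.1, -0.2, 0.0, 0.3, 5.0, 0.2, -0.1, 0.0]
print(component_features(component, bins=4))
```

    A spiky component such as an eye blink tends to show high kurtosis and a large amplitude range, which is why features like these can help separate artifactual components from neural ones.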

  14. Combination of minimum enclosing balls classifier with SVM in coal-rock recognition

    Science.gov (United States)

    Song, QingJun; Jiang, HaiYan; Song, Qinghui; Zhao, XieGuang; Wu, Xiaoxuan

    2017-01-01

    Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, and the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed, and a caving dataset with 10 feature variables and three classes is obtained. The best combination of feature variables can be decided automatically by using multi-class F-score (MF-Score) feature selection. To handle nonlinear mapping in real-world optimization problems, an effective minimum enclosing ball (MEB) algorithm combined with a support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct an MEB-SVM classifier for coal-rock recognition, where the data exhibit an inherently complex distribution. The proposed method is examined on UCI data sets and the caving dataset, and compared with some recent, well-performing SVM classifiers. We compare the classifiers over multiple UCI data sets using accuracy and the Friedman test. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show better performance, which makes the approach promising for feature selection and multi-class recognition in coal-rock recognition. PMID:28937987

  15. Combination of minimum enclosing balls classifier with SVM in coal-rock recognition.

    Science.gov (United States)

    Song, QingJun; Jiang, HaiYan; Song, Qinghui; Zhao, XieGuang; Wu, Xiaoxuan

    2017-01-01

    Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, and the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed, and a caving dataset with 10 feature variables and three classes is obtained. The best combination of feature variables can be decided automatically by using multi-class F-score (MF-Score) feature selection. To handle nonlinear mapping in real-world optimization problems, an effective minimum enclosing ball (MEB) algorithm combined with a support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct an MEB-SVM classifier for coal-rock recognition, where the data exhibit an inherently complex distribution. The proposed method is examined on UCI data sets and the caving dataset, and compared with some recent, well-performing SVM classifiers. We compare the classifiers over multiple UCI data sets using accuracy and the Friedman test. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show better performance, which makes the approach promising for feature selection and multi-class recognition in coal-rock recognition.

  16. Integrated Features by Administering the Support Vector Machine (SVM) of Translational Initiations Sites in Alternative Polymorphic Context

    Directory of Open Access Journals (Sweden)

    Nurul Arneida Husin

    2012-04-01

    Full Text Available Many algorithms and methods have been proposed for classification problems in bioinformatics. In this study, a discriminative approach, in particular support vector machines (SVM), is employed to recognize the studied TIS patterns. The discriminative approach learns discriminant functions from samples that have been labelled as positive or negative. After learning, the discriminant functions are employed to decide whether a new sample is true or false. In this study, SVM is employed to recognize the patterns for the studied translational initiation sites in alternative weak context. The method has been optimized with the best parameters selected: c=100, E=10^-6 and ex=2 for the nonlinear kernel function. Results show that with the top 5 features and a nonlinear kernel, the best prediction accuracy achieved is 95.8%. The J48 algorithm is applied for comparison with SVM using the top 15 features, and the results show a good prediction accuracy of 95.8%. This indicates that the top 5 features selected by the IGR method and used by SVM are sufficient for the prediction of TIS in weak contexts.

  17. Prediction of size-fractionated airborne particle-bound metals using MLR, BP-ANN and SVM analyses.

    Science.gov (United States)

    Leng, Xiang'zi; Wang, Jinhua; Ji, Haibo; Wang, Qin'geng; Li, Huiming; Qian, Xin; Li, Fengying; Yang, Meng

    2017-08-01

    Size-fractionated heavy metal concentrations were observed in airborne particulate matter (PM) samples collected from 2014 to 2015 (spanning all four seasons) from suburban (Xianlin) and industrial (Pukou) areas in Nanjing, a megacity of southeast China. Rapid prediction models of size-fractionated metals were established based on multiple linear regression (MLR), back propagation artificial neural network (BP-ANN) and support vector machine (SVM) by using meteorological factors and PM concentrations as input parameters. About 38% and 77% of PM2.5 concentrations in Xianlin and Pukou, respectively, were beyond the Chinese National Ambient Air Quality Standard limit of 75 μg/m³. Nearly all elements had higher concentrations in industrial areas, and in winter among the four seasons. Anthropogenic elements such as Pb, Zn, Cd and Cu showed larger percentages in the fine fraction (ø ≤ 2.5 μm), whereas the crustal elements including Al, Ba, Fe, Ni, Sr and Ti showed larger percentages in the coarse fraction (ø > 2.5 μm). SVM showed a higher training correlation coefficient (R), and lower mean absolute error (MAE) as well as lower root mean square error (RMSE), than MLR and BP-ANN for most metals. All three methods showed better prediction results for Ni, Al, V, Cd and As, and relatively poor results for Cr and Fe. The daily airborne metal concentrations in 2015 were then predicted by the fully trained SVM models and the results showed the heaviest pollution of airborne heavy metals occurred in December and January, whereas the lightest pollution occurred in June and July. Copyright © 2017 Elsevier Ltd. All rights reserved.
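
    The three scores named in this abstract (R, MAE, RMSE) can be computed as in the minimal sketch below. It assumes R is the Pearson correlation coefficient, which the abstract does not state, and the `observed` and `predicted` values are made-up illustration data, not results from the study.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed meaning of R here)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (sd_t * sd_p)

# Hypothetical concentrations, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(mae(observed, predicted), rmse(observed, predicted), pearson_r(observed, predicted))
```

    A model that ranks better on all three scores has a higher R together with lower MAE and RMSE, which is the pattern the abstract reports for SVM.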

  18. Effective and efficient Grassfinch kernel for SVM classification and its application to recognition based on image set

    International Nuclear Information System (INIS)

    Du, Genyuan; Tian, Shengli; Qiu, Yingyu; Xu, Chunyan

    2016-01-01

    This paper presents an effective and efficient kernel approach to recognize an image set that is represented as a point on an extended Grassmannian manifold. Several recent studies focus on the applicability of discriminant analysis on the Grassmannian manifold and suffer from not capturing the inherent nonlinear structure of the data itself. Therefore, we propose an extension of the Grassmannian manifold to address this issue. Instead of using a linear data embedding with PCA, we develop a non-linear data embedding of such a manifold using kernel PCA. This paper makes three main contributions: 1) introducing a non-linear data embedding of the extended Grassmannian manifold, 2) deriving a distance metric of the Grassmannian manifold, 3) developing an effective and efficient Grassmannian kernel for SVM classification. The extended Grassmannian manifold naturally arises in the application to recognition based on image sets, such as face and object recognition. Experiments on several standard databases show better classification accuracy. Furthermore, experimental results indicate that our proposed approach significantly reduces time complexity in comparison to graph embedding discriminant analysis.

  19. Prediction of Flood Warning in Taiwan Using Nonlinear SVM with Simulated Annealing Algorithm

    Science.gov (United States)

    Lee, C.

    2013-12-01

    Flooding is an important issue in Taiwan because the narrow, high topography of the island makes many of its rivers steep. Tropical depressions such as typhoons regularly cause rivers to flood, and whenever a typhoon has passed through Taiwan there have been floods along some rivers. Prediction of river flow under extreme rainfall is therefore important for the government when announcing flood warnings. In Taiwan, warnings are classified into three levels according to warning water levels. The purpose of this study is to predict the flood warning level from precipitation, rainfall duration and riverbed slope. To model the problem, a machine learning model, the nonlinear support vector machine (SVM), is formulated to classify the flood warning level from this information. In addition, simulated annealing (SA), a probabilistic heuristic algorithm, is used to determine the optimal parameter of the SVM model. A case study of flood-prone rivers of different gradients in Taiwan is conducted. The contribution of this SVM model with simulated annealing is that it enables efficient announcement of flood warnings and helps keep residents along the rivers out of danger.
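
    The abstract does not say which SVM parameters were tuned or how the annealing was configured. The sketch below only illustrates the general idea of simulated annealing for parameter search. The function `validation_error` is a hypothetical stand-in for a cross-validated SVM error, and the two searched quantities (`log_c`, `log_gamma`), the step size and the cooling schedule are all assumptions.

```python
import math
import random

def validation_error(log_c, log_gamma):
    # Hypothetical stand-in for a cross-validated SVM error surface:
    # a smooth bowl whose minimum sits at log_c = 1, log_gamma = -2.
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

def simulated_annealing(objective, start, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Minimize `objective` by random local moves with a decreasing temperature."""
    rng = random.Random(seed)
    current = start
    current_val = objective(*current)
    best, best_val = current, current_val
    temp = t0
    for _ in range(steps):
        # Propose a nearby point by adding Gaussian noise to each coordinate.
        candidate = tuple(x + rng.gauss(0.0, 0.3) for x in current)
        cand_val = objective(*candidate)
        delta = cand_val - current_val
        # Always accept improvements; accept worse moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_val = candidate, cand_val
            if current_val < best_val:
                best, best_val = current, current_val
        temp *= cooling  # geometric cooling schedule
    return best, best_val

start_point = (0.0, 0.0)
best_params, best_error = simulated_annealing(validation_error, start_point)
print(best_params, best_error)
```

    In a real setting, `validation_error` would train and score an SVM for each candidate, so the number of steps would be limited by training cost.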

  20. Identification and evaluation of new reference genes in Gossypium hirsutum for accurate normalization of real-time quantitative RT-PCR data

    Directory of Open Access Journals (Sweden)

    Alves-Ferreira Marcio

    2010-03-01

    Full Text Available Abstract Background Normalizing through reference genes, or housekeeping genes, can make more accurate and reliable results from reverse transcription real-time quantitative polymerase chain reaction (qPCR). Recent studies have shown that no single housekeeping gene is universal for all experiments. Thus, suitable reference genes should be the first step of any qPCR analysis. Only a few studies on the identification of housekeeping gene have been carried on plants. Therefore qPCR studies on important crops such as cotton has been hampered by the lack of suitable reference genes. Results By the use of two distinct algorithms, implemented by geNorm and NormFinder, we have assessed the gene expression of nine candidate reference genes in cotton: GhACT4, GhEF1α5, GhFBX6, GhPP2A1, GhMZA, GhPTB, GhGAPC2, GhβTUB3 and GhUBQ14. The candidate reference genes were evaluated in 23 experimental samples consisting of six distinct plant organs, eight stages of flower development, four stages of fruit development and in flower verticils. The expression of GhPP2A1 and GhUBQ14 genes were the most stable across all samples and also when distinct plants organs are examined. GhACT4 and GhUBQ14 present more stable expression during flower development, GhACT4 and GhFBX6 in the floral verticils and GhMZA and GhPTB during fruit development. Our analysis provided the most suitable combination of reference genes for each experimental set tested as internal control for reliable qPCR data normalization. In addition, to illustrate the use of cotton reference genes we checked the expression of two cotton MADS-box genes in distinct plant and floral organs and also during flower development. Conclusion We have tested the expression stabilities of nine candidate genes in a set of 23 tissue samples from cotton plants divided into five different experimental sets. As a result of this evaluation, we recommend the use of GhUBQ14 and GhPP2A1 housekeeping genes as superior references

  1. Identification and evaluation of new reference genes in Gossypium hirsutum for accurate normalization of real-time quantitative RT-PCR data.

    Science.gov (United States)

    Artico, Sinara; Nardeli, Sarah M; Brilhante, Osmundo; Grossi-de-Sa, Maria Fátima; Alves-Ferreira, Marcio

    2010-03-21

    Normalizing through reference genes, or housekeeping genes, can make more accurate and reliable results from reverse transcription real-time quantitative polymerase chain reaction (qPCR). Recent studies have shown that no single housekeeping gene is universal for all experiments. Thus, suitable reference genes should be the first step of any qPCR analysis. Only a few studies on the identification of housekeeping gene have been carried on plants. Therefore qPCR studies on important crops such as cotton has been hampered by the lack of suitable reference genes. By the use of two distinct algorithms, implemented by geNorm and NormFinder, we have assessed the gene expression of nine candidate reference genes in cotton: GhACT4, GhEF1alpha5, GhFBX6, GhPP2A1, GhMZA, GhPTB, GhGAPC2, GhbetaTUB3 and GhUBQ14. The candidate reference genes were evaluated in 23 experimental samples consisting of six distinct plant organs, eight stages of flower development, four stages of fruit development and in flower verticils. The expression of GhPP2A1 and GhUBQ14 genes were the most stable across all samples and also when distinct plants organs are examined. GhACT4 and GhUBQ14 present more stable expression during flower development, GhACT4 and GhFBX6 in the floral verticils and GhMZA and GhPTB during fruit development. Our analysis provided the most suitable combination of reference genes for each experimental set tested as internal control for reliable qPCR data normalization. In addition, to illustrate the use of cotton reference genes we checked the expression of two cotton MADS-box genes in distinct plant and floral organs and also during flower development. We have tested the expression stabilities of nine candidate genes in a set of 23 tissue samples from cotton plants divided into five different experimental sets. As a result of this evaluation, we recommend the use of GhUBQ14 and GhPP2A1 housekeeping genes as superior references for normalization of gene expression measures in

  2. Detection of Cross Site Scripting Attack in Wireless Networks Using n-Gram and SVM

    Directory of Open Access Journals (Sweden)

    Jun-Ho Choi

    2012-01-01

    Full Text Available A large share of the attacks targeting the web aim at the weak points of web applications. Even though SQL injection, which is the form of XSS (Cross Site Scripting) attacks, is not a threat to the system that operates the web site, it is very critical to the places that deal with important information, because sensitive information can be obtained and falsified. In this paper, a method is proposed to detect the malicious SQL injection script code, which is the typical XSS attack, using n-Gram indexing and SVM (Support Vector Machine). In order to test the proposed method, each data set was classified as normal code or malicious code, and the malicious script code was detected by applying the index terms generated by n-Gram and the data set generated by the code dictionary to an SVM classifier. As a result, when malicious script code detection was conducted using n-Gram index terms and SVM, superior performance was observed in detecting malicious scripts, and recall for malicious script code detection improved over existing methods.
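
    The abstract does not give the n-gram size or the tokenization used. As a rough illustration of turning script text into n-gram index terms that a classifier such as an SVM could consume, the sketch below counts overlapping character trigrams. The choice of characters over tokens, `n=3`, the lowercasing and the two sample strings are all assumptions.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Count the overlapping character n-grams in `text` (lowercased)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def to_feature_vector(text, vocabulary, n=3):
    """Map `text` to n-gram counts in the fixed order given by `vocabulary`."""
    counts = char_ngrams(text, n)
    return [counts.get(gram, 0) for gram in vocabulary]

# Hypothetical samples: one script-bearing string and one benign string.
samples = ["<script>alert(1)</script>", "<p>hello</p>"]

# The vocabulary (index terms) is the sorted union of all n-grams seen in the samples.
vocabulary = sorted(set().union(*(char_ngrams(s) for s in samples)))
vectors = [to_feature_vector(s, vocabulary) for s in samples]

print(len(vocabulary), vectors[0][:10])
```

    Each row of `vectors` has one count per index term, so rows from labelled normal and malicious samples could be passed to any vector-based classifier.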

  3. A novel mutual information-based Boolean network inference method from time-series gene expression data.

    Directory of Open Access Journals (Sweden)

    Shohag Barman

    Full Text Available Inferring a gene regulatory network from time-series gene expression data in systems biology is a challenging problem. Many methods have been suggested, most of which have a scalability limitation due to the combinatorial cost of searching a regulatory set of genes. In addition, they have focused on the accurate inference of a network structure only. Therefore, there is a pressing need to develop a network inference method to search regulatory genes efficiently and to predict the network dynamics accurately. In this study, we employed a Boolean network model with a restricted update rule scheme to capture coarse-grained dynamics, and propose a novel mutual information-based Boolean network inference (MIBNI) method. Given time-series gene expression data as an input, the method first identifies a set of initial regulatory genes using mutual information-based feature selection, and then improves the dynamics prediction accuracy by iteratively swapping a pair of genes between sets of the selected regulatory genes and the other genes. Through extensive simulations with artificial datasets, MIBNI showed consistently better performance than six well-known existing methods, REVEAL, Best-Fit, RelNet, CST, CLR, and BIBN in terms of both structural and dynamics prediction accuracy. We further tested the proposed method with two real gene expression datasets for an Escherichia coli gene regulatory network and a fission yeast cell cycle network, and also observed better results using MIBNI compared to the six other methods. Taken together, MIBNI is a promising tool for predicting both the structure and the dynamics of a gene regulatory network.
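
    As a rough illustration of the first step described here (mutual information-based selection of candidate regulators from Boolean time series), the sketch below ranks genes by the mutual information between their state at time t and a target gene's state at time t+1. This is not the MIBNI implementation: the one-step lag, the plain ranking and the toy `expression` data are assumptions, and the iterative swapping step is not shown.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * math.log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

def rank_regulators(expression, target):
    """Rank genes by MI between their state at t and the target's state at t+1."""
    future = expression[target][1:]
    scores = {
        gene: mutual_information(series[:-1], future)
        for gene, series in expression.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical Boolean time series; B at time t+1 copies A at time t.
expression = {
    "A": [0, 1, 0, 1, 0, 1, 0, 1, 0],
    "B": [0, 0, 1, 0, 1, 0, 1, 0, 1],
    "C": [1, 1, 0, 0, 1, 1, 0, 0, 1],
}
ranking = rank_regulators(expression, "B")
print(ranking)
```

    In this toy data, gene A carries one full bit of information about the next state of B, so it ranks first, while C is statistically independent of it.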

  4. An accurate clone-based haplotyping method by overlapping pool sequencing.

    Science.gov (United States)

    Li, Cheng; Cao, Changchang; Tu, Jing; Sun, Xiao

    2016-07-08

    Chromosome-long haplotyping of human genomes is important to identify genetic variants with differing gene expression, in human evolution studies, clinical diagnosis, and other biological and medical fields. Although several methods have realized haplotyping based on sequencing technologies or population statistics, accuracy and cost are factors that prohibit their wide use. Borrowing ideas from group testing theories, we proposed a clone-based haplotyping method by overlapping pool sequencing. The clones from a single individual were pooled combinatorially and then sequenced. According to the distinct pooling pattern for each clone in the overlapping pool sequencing, alleles for the recovered variants could be assigned to their original clones precisely. Subsequently, the clone sequences could be reconstructed by linking these alleles accordingly and assembling them into haplotypes with high accuracy. To verify the utility of our method, we constructed 130 110 clones in silico for the individual NA12878 and simulated the pooling and sequencing process. Ultimately, 99.9% of variants on chromosome 1 that were covered by clones from both parental chromosomes were recovered correctly, and 112 haplotype contigs were assembled with an N50 length of 3.4 Mb and no switch errors. A comparison with current clone-based haplotyping methods indicated our method was more accurate. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. PCA-MLP SVM distinction of salivary Raman spectra of dengue fever infection.

    Science.gov (United States)

    Radzol, A R M; Lee, Khuan Y; Mansor, W; Wong, P S; Looi, I

    2017-07-01

    Dengue fever (DF) is a disease of major concern caused by flavivirus infection. Delayed diagnosis leads to severe stages, which could be deadly. Recently, non-structural protein (NS1) has been acknowledged as a biomarker, alternative to immunoglobulins, for early detection of dengue in blood. Further, non-invasive detection of NS1 in saliva makes the approach more appealing. However, since its concentration in saliva is lower than in blood, a sensitive and specific technique, Surface Enhanced Raman Spectroscopy (SERS), is employed. Our work here intends to define an optimal PCA-SVM (Principal Component Analysis-Support Vector Machine) model with a Multilayer Perceptron (MLP) kernel to distinguish between positive and negative NS1 infected samples from salivary SERS spectra, which, to the best of our knowledge, has never been explored. Salivary samples of DF positive and negative subjects were collected, pre-processed and analyzed. PCA and an SVM classifier were then used to differentiate the SERS analyzed spectra. Since performance of the model depends on the PCA criterion and MLP parameters, both are examined in tandem. Its performance is also compared to our previous works on simulated NS1 salivary samples. It is found that the best PCA-SVM (MLP) model can be defined by 95 PCs from the CPV criterion with P1 and P2 values of 0.01 and -0.2 respectively. A classification performance of [76.88%, 85.92%, 67.83%] is achieved.
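
    The CPV (cumulative percent variance) criterion mentioned here keeps the smallest number of leading principal components whose combined variance share reaches a chosen threshold. The sketch below shows that selection rule only. The threshold values and the `eigenvalues` list are illustrative assumptions, since the abstract does not state the threshold used.

```python
def components_for_cpv(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose cumulative variance share reaches `threshold`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)

# Hypothetical PCA eigenvalues (variance explained by each component).
eigenvalues = [5.0, 3.0, 1.0, 0.5, 0.5]

for threshold in (0.75, 0.85, 0.99):
    print(threshold, components_for_cpv(eigenvalues, threshold))
```

    Raising the threshold retains more components, which trades a larger classifier input for less discarded variance.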

  6. Combination of minimum enclosing balls classifier with SVM in coal-rock recognition.

    Directory of Open Access Journals (Sweden)

    QingJun Song

    Full Text Available Top-coal caving technology is a productive and efficient method in modern mechanized coal mining, the study of coal-rock recognition is key to realizing automation in comprehensive mechanized coal mining. In this paper we propose a new discriminant analysis framework for coal-rock recognition. In the framework, a data acquisition model with vibration and acoustic signals is designed and the caving dataset with 10 feature variables and three classes is got. And the perfect combination of feature variables can be automatically decided by using the multi-class F-score (MF-Score) feature selection. In terms of nonlinear mapping in real-world optimization problem, an effective minimum enclosing ball (MEB) algorithm plus Support vector machine (SVM) is proposed for rapid detection of coal-rock in the caving process. In particular, we illustrate how to construct MEB-SVM classifier in coal-rock recognition which exhibit inherently complex distribution data. The proposed method is examined on UCI data sets and the caving dataset, and compared with some new excellent SVM classifiers. We conduct experiments with accuracy and Friedman test for comparison of more classifiers over multiple on the UCI data sets. Experimental results demonstrate that the proposed algorithm has good robustness and generalization ability. The results of experiments on the caving dataset show the better performance which leads to a promising feature selection and multi-class recognition in coal-rock recognition.

  7. The more you learn, the less you store : Memory-controlled incremental SVM for visual place recognition

    OpenAIRE

    Pronobis, Andrzej; Jie, Luo; Caputo, Barbara

    2010-01-01

    The capability to learn from experience is a key property for autonomous cognitive systems working in realistic settings. To this end, this paper presents an SVM-based algorithm, capable of learning model representations incrementally while keeping under control memory requirements. We combine an incremental extension of SVMs [43] with a method reducing the number of support vectors needed to build the decision function without any loss in performance [15] introducing a parameter which permit...

  8. Rapid and accurate synthesis of TALE genes from synthetic oligonucleotides.

    Science.gov (United States)

    Wang, Fenghua; Zhang, Hefei; Gao, Jingxia; Chen, Fengjiao; Chen, Sijie; Zhang, Cuizhen; Peng, Gang

    2016-01-01

    Custom synthesis of transcription activator-like effector (TALE) genes has relied upon plasmid libraries of pre-fabricated TALE-repeat monomers or oligomers. Here we describe a novel synthesis method that directly incorporates annealed synthetic oligonucleotides into the TALE-repeat units. Our approach utilizes iterative sets of oligonucleotides and a translational frame check strategy to ensure the high efficiency and accuracy of TALE-gene synthesis. TALE arrays of more than 20 repeats can be constructed, and the majority of the synthesized constructs have perfect sequences. In addition, this novel oligonucleotide-based method can readily accommodate design changes to the TALE repeats. We demonstrated an increased gene targeting efficiency against a genomic site containing a potentially methylated cytosine by incorporating non-conventional repeat variable di-residue (RVD) sequences.

  9. A Regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data

    Directory of Open Access Journals (Sweden)

    Ruzzo Walter L

    2006-03-01

    Full Text Available Abstract Background As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. Methods In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN) algorithm. The choice of KNN is motivated by its simplicity, flexibility to incorporate different data types and adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. Results We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms the naive KNN methods and is competitive with support vector machine (SVM) algorithms for integrating heterogeneous data. We also show that by combining different data sources, prediction accuracy can improve significantly. Conclusion Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference enhances prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.
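
    The central idea here, a similarity metric formed as a weighted combination of base similarity measures and then used to pick nearest neighbors, can be sketched as follows. This is not the authors' method: the `weights` are fixed by hand rather than inferred by regression, the confidence is a plain vote fraction rather than the paper's voting scheme, and the labels and similarity values are hypothetical.

```python
from collections import Counter

def combined_similarity(base_sims, weights):
    """Weighted sum of base similarity values for one candidate neighbor."""
    return sum(w * s for w, s in zip(weights, base_sims))

def knn_predict(neighbors, weights, k=3):
    """Predict a label from the k most similar neighbors.

    `neighbors` is a list of (label, [base similarity values to the target gene]).
    Returns the majority label and the fraction of the k votes it received.
    """
    scored = sorted(
        ((combined_similarity(sims, weights), label) for label, sims in neighbors),
        reverse=True,
    )
    top = scored[:k]
    votes = Counter(label for _, label in top)
    label, count = votes.most_common(1)[0]
    return label, count / len(top)

# Hypothetical weights for two base measures (e.g. expression and sequence similarity).
weights = [0.7, 0.3]
neighbors = [
    ("metabolism", [0.9, 0.2]),
    ("metabolism", [0.8, 0.4]),
    ("transport", [0.3, 0.9]),
    ("transport", [0.2, 0.8]),
    ("metabolism", [0.6, 0.1]),
]
prediction, confidence = knn_predict(neighbors, weights)
print(prediction, confidence)
```

    Changing `weights` changes which neighbors count as nearest, which is why learning the weights from data, as the abstract describes, can matter more than the choice of k.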

  10. Real-time detection with AdaBoost-svm combination in various face orientation

    Science.gov (United States)

    Fhonna, R. P.; Nasution, M. K. M.; Tulus

    2018-03-01

    Most research has used the AdaBoost-SVM algorithm for face detection. However, to our knowledge, no research so far has performed face detection on real-time data with various orientations using the combination of AdaBoost and Support Vector Machine (SVM). The complex and diverse characteristics of face variations, real-time data in various orientations, and a very complex application slow down the performance of a face detection system; this is the challenge addressed in this research. The face orientations handled by the detection system are 90°, 45°, 0°, -45°, and -90°. This combination method is expected to be an effective and efficient solution for various face orientations. The results showed that the highest average detection rate is for faces oriented at 0° and the lowest detection rate is for the 90° face orientation.

  11. Parameter optimization using GA in SVM to predict damage level of non-reshaped berm breakwater.

    Digital Repository Service at National Institute of Oceanography (India)

    Harish, N.; Lokesha.; Mandal, S.; Rao, S.; Patil, S.G.

    tools, such as Artificial Neural Network (ANN), Support Vector Machine (SVM), Adaptive Neuro Fuzzy Inference System (ANFIS), etc., are successfully used in different fields (Kazperkiewiecz et al 1995, Voga and Belchior 2006, Dong et al 2005). Also... Balas C.E., Koc M.L. and Tur R.(2010) ‘’Artificial neural networks based on principal component analysis, fuzzy systems and fuzzy neural networks for preliminary design of rubble mound breakwaters’’, Applied Ocean Research, 32, 425 – 433. Dong B., Cao C...

  12. Maximized Inter-Class Weighted Mean for Fast and Accurate Mitosis Cells Detection in Breast Cancer Histopathology Images.

    Science.gov (United States)

    Nateghi, Ramin; Danyali, Habibollah; Helfroush, Mohammad Sadegh

    2017-08-14

    Based on the Nottingham criteria, the number of mitosis cells in histopathological slides is an important factor in diagnosis and grading of breast cancer. For manual grading of mitosis cells, histopathology slides of the tissue are examined by pathologists at 40× magnification for each patient. This task is very difficult and time-consuming even for experts. In this paper, a fully automated method is presented for accurate detection of mitosis cells in histopathology slide images. First a method based on maximum-likelihood is employed for segmentation and extraction of mitosis cell. Then a novel Maximized Inter-class Weighted Mean (MIWM) method is proposed that aims at reducing the number of extracted non-mitosis candidates that results in reducing the false positive mitosis detection rate. Finally, segmented candidates are classified into mitosis and non-mitosis classes by using a support vector machine (SVM) classifier. Experimental results demonstrate a significant improvement in accuracy of mitosis cells detection in different grades of breast cancer histopathological images.

  13. Multiclass classification for skin cancer profiling based on the integration of heterogeneous gene expression series.

    Science.gov (United States)

    Gálvez, Juan Manuel; Castillo, Daniel; Herrera, Luis Javier; San Román, Belén; Valenzuela, Olga; Ortuño, Francisco Manuel; Rojas, Ignacio

    2018-01-01

    Most of the research studies developed applying microarray technology to the characterization of different pathological states of any disease may fail in reaching statistically significant results. This is largely due to the small repertoire of analysed samples, and to the limitation in the number of states or pathologies usually addressed. Moreover, the influence of potential deviations on the gene expression quantification is usually disregarded. In spite of the continuous changes in omic sciences, reflected for instance in the emergence of new Next-Generation Sequencing-related technologies, the existing availability of a vast amount of gene expression microarray datasets should be properly exploited. Therefore, this work proposes a novel methodological approach involving the integration of several heterogeneous skin cancer series, and a later multiclass classifier design. This approach is thus a way to provide the clinicians with an intelligent diagnosis support tool based on the use of a robust set of selected biomarkers, which simultaneously distinguishes among different cancer-related skin states. To achieve this, a multi-platform combination of microarray datasets from Affymetrix and Illumina manufacturers was carried out. This integration is expected to strengthen the statistical robustness of the study as well as the finding of highly-reliable skin cancer biomarkers. Specifically, the designed operation pipeline has allowed the identification of a small subset of 17 differentially expressed genes (DEGs) from which to distinguish among 7 involved skin states. These genes were obtained from the assessment of a number of potential batch effects on the gene expression data. The biological interpretation of these genes was inspected in the specific literature to understand their underlying information in relation to skin cancer. Finally, in order to assess their possible effectiveness in cancer diagnosis, a cross-validation Support Vector Machines (SVM)-based

  14. Robust multi-tissue gene panel for cancer detection

    Directory of Open Access Journals (Sweden)

    Talantov Dmitri

    2010-06-01

    Full Text Available Abstract Background We have identified a set of genes whose relative mRNA expression levels in various solid tumors can be used to robustly distinguish cancer from matching normal tissue. Our current feature set consists of 113 gene probes for 104 unique genes, originally identified as differentially expressed in solid primary tumors in microarray data on Affymetrix HG-U133A platform in five tissue types: breast, colon, lung, prostate and ovary. For each dataset, we first identified a set of genes significantly differentially expressed in tumor vs. normal tissue at p-value = 0.05 using an experimentally derived error model. Our common cancer gene panel is the intersection of these sets of significantly dysregulated genes and can distinguish tumors from normal tissue on all these five tissue types. Methods Frozen tumor specimens were obtained from two commercial vendors, Clinomics (Pittsfield, MA) and Asterand (Detroit, MI). Biotinylated targets were prepared using published methods (Affymetrix, CA) and hybridized to Affymetrix U133A GeneChips (Affymetrix, CA). Expression values for each gene were calculated using Affymetrix GeneChip analysis software MAS 5.0. We then used a software package called Genes@Work for differential expression discovery, and SVM light linear kernel for building classification models. Results We validate the predictability of this gene list on several publicly available data sets generated on the same platform. Of note, when analysing the lung cancer data set of Spira et al, using an SVM linear kernel classifier, our gene panel had 94.7% leave-one-out accuracy compared to 87.8% using the gene panel in the original paper. In addition, we performed high-throughput validation on the Dana Farber Cancer Institute GCOD database and several GEO datasets. Conclusions Our result showed the potential for this panel as a robust classification tool for multiple tumor types on the Affymetrix platform, as well as other whole genome arrays

  15. Data on Support Vector Machines (SVM) model to forecast photovoltaic power

    Directory of Open Access Journals (Sweden)

    M. Malvoni

    2016-12-01

    Full Text Available The data concern the photovoltaic (PV) power, forecasted by a hybrid model that considers weather variations and applies a technique to reduce the input data size, as presented in the paper entitled “Photovoltaic forecast based on hybrid pca-lssvm using dimensionality reducted data” (M. Malvoni, M.G. De Giorgi, P.M. Congedo, 2015) [1]. The quadratic Renyi entropy criteria together with the principal component analysis (PCA) are applied to the Least Squares Support Vector Machines (LS-SVM) to predict the PV power in the day-ahead time frame. The data here shared represent the proposed approach results. Hourly PV power predictions for 1, 3, 6, 12 and 24 hours ahead and for different data reduction sizes are provided in Supplementary material.

  16. A review for detecting gene-gene interactions using machine learning methods in genetic epidemiology.

    Science.gov (United States)

    Koo, Ching Lee; Liew, Mei Jing; Mohamad, Mohd Saberi; Salleh, Abdul Hakim Mohamed

    2013-01-01

    Recently, the greatest statistical computational challenge in genetic epidemiology is to identify and characterize the genes that interact with other genes and environment factors that bring the effect on complex multifactorial disease. These gene-gene interactions are also denoted as epistasis in which this phenomenon cannot be solved by traditional statistical method due to the high dimensionality of the data and the occurrence of multiple polymorphism. Hence, there are several machine learning methods to solve such problems by identifying such susceptibility gene which are neural networks (NNs), support vector machine (SVM), and random forests (RFs) in such common and multifactorial disease. This paper gives an overview on machine learning methods, describing the methodology of each machine learning methods and its application in detecting gene-gene and gene-environment interactions. Lastly, this paper discussed each machine learning method and presents the strengths and weaknesses of each machine learning method in detecting gene-gene interactions in complex human disease.

  17. A Review for Detecting Gene-Gene Interactions Using Machine Learning Methods in Genetic Epidemiology

    Directory of Open Access Journals (Sweden)

    Ching Lee Koo

    2013-01-01

    Full Text Available Recently, the greatest statistical and computational challenge in genetic epidemiology has been to identify and characterize the genes that interact with other genes and with environmental factors to affect complex multifactorial diseases. These gene-gene interactions are also denoted as epistasis, a phenomenon that cannot be resolved by traditional statistical methods because of the high dimensionality of the data and the occurrence of multiple polymorphisms. Hence, several machine learning methods, namely neural networks (NNs), support vector machines (SVMs), and random forests (RFs), have been applied to identify susceptibility genes in common multifactorial diseases. This paper gives an overview of these machine learning methods, describing the methodology of each and its application in detecting gene-gene and gene-environment interactions. Lastly, the paper discusses each method and presents its strengths and weaknesses in detecting gene-gene interactions in complex human disease.

  18. STUDY COMPARISON OF SVM-, K-NN- AND BACKPROPAGATION-BASED CLASSIFIER FOR IMAGE RETRIEVAL

    Directory of Open Access Journals (Sweden)

    Muhammad Athoillah

    2015-03-01

    Full Text Available Classification is a method for organizing data systematically according to previously defined rules. In recent years, classification methods have proven useful in many fields, such as image classification, medical biology, traffic lights, and text classification. There are many methods for solving classification problems, and this variety makes it difficult for researchers to determine which method is best for a given problem. This study compares the performance of three classification methods, Support Vector Machine (SVM), K-Nearest Neighbor (K-NN), and Backpropagation, in an image retrieval case study with an image dataset of five categories. The results show that K-NN has the best average accuracy, at 82%. It is also the fastest, with an average computation time of 17.99 seconds per retrieval session across all categories. Backpropagation, however, is the slowest of the three: on average it needed 883 seconds for the training session and 41.7 seconds for the retrieval session.

  19. Gradient Evolution-based Support Vector Machine Algorithm for Classification

    Science.gov (United States)

    Zulvia, Ferani E.; Kuo, R. J.

    2018-03-01

    This paper proposes a classification algorithm based on the support vector machine (SVM) and gradient evolution (GE) algorithms. The SVM algorithm has been widely used in classification; however, its results are significantly influenced by its parameters. This paper therefore proposes an improved SVM algorithm that finds the best SVM parameters automatically. The proposed algorithm employs a GE algorithm as a global optimizer to determine the parameters used by the SVM algorithm. The proposed GE-SVM algorithm is verified using several benchmark datasets and compared with other metaheuristic-based SVM algorithms. The experimental results show that the proposed GE-SVM algorithm obtains better results than the other algorithms tested in this paper.

  20. Remaining Useful Life Prediction for Lithium-Ion Batteries Based on Gaussian Processes Mixture

    Science.gov (United States)

    Li, Lingling; Wang, Pengchong; Chao, Kuei-Hsiang; Zhou, Yatong; Xie, Yang

    2016-01-01

    The remaining useful life (RUL) prediction of Lithium-ion batteries is closely related to the capacity degeneration trajectories. Due to the self-charging and the capacity regeneration, the trajectories have the property of multimodality. Traditional prediction models such as the support vector machines (SVM) or the Gaussian Process regression (GPR) cannot accurately characterize this multimodality. This paper proposes a novel RUL prediction method based on the Gaussian Process Mixture (GPM). It can process multimodality by fitting different segments of trajectories with different GPR models separately, such that the tiny differences among these segments can be revealed. The method is demonstrated to be effective for prediction by the excellent predictive result of the experiments on the two commercial and chargeable Type 1850 Lithium-ion batteries, provided by NASA. The performance comparison among the models illustrates that the GPM is more accurate than the SVM and the GPR. In addition, GPM can yield the predictive confidence interval, which makes the prediction more reliable than that of traditional models. PMID:27632176

  1. Enhanced gene ranking approaches using modified trace ratio algorithm for gene expression data

    Directory of Open Access Journals (Sweden)

    Shruti Mishra

    Full Text Available Microarray technology enables the understanding and investigation of gene expression levels by analyzing high dimensional datasets that contain few samples. Over time, microarray expression data have been collected for studying the underlying biological mechanisms of disease. One application for understanding these mechanisms is constructing a gene regulatory network (GRN). One of the key criteria for GRN discovery is gene selection. Choosing a generous set of genes for the structure of the network is highly desirable. For this purpose, two methods were proposed for selecting appropriate genes. The first approach comprises a gene selection method called Information gain, where the dataset is reformed and fused with another distinct algorithm called Trace Ratio (TR). The second method is the implementation of our projected modified TR algorithm, where the scoring base for finding weight matrices has been re-designed. The efficiency of both methods was shown with different classifiers, including variants of the Artificial Neural Network classifier, such as Resilient Propagation, Quick Propagation, Back Propagation, Manhattan Propagation and Radial Basis Function Neural Network, as well as the Support Vector Machine (SVM) classifier. The study confirmed that both proposed methods worked well and offered high accuracy with fewer iterations than the original Trace Ratio algorithm. Keywords: Gene regulatory network, Gene selection, Information gain, Trace ratio, Canonical correlation analysis, Classification

  2. Whole genome sequencing options for bacterial strain typing and epidemiologic analysis based on single nucleotide polymorphism versus gene-by-gene-based approaches.

    Science.gov (United States)

    Schürch, A C; Arredondo-Alonso, S; Willems, R J L; Goering, R V

    2018-04-01

    Whole genome sequence (WGS)-based strain typing finds increasing use in the epidemiologic analysis of bacterial pathogens in both public health as well as more localized infection control settings. This minireview describes methodologic approaches that have been explored for WGS-based epidemiologic analysis and considers the challenges and pitfalls of data interpretation. Personal collection of relevant publications. When applying WGS to study the molecular epidemiology of bacterial pathogens, genomic variability between strains is translated into measures of distance by determining single nucleotide polymorphisms in core genome alignments or by indexing allelic variation in hundreds to thousands of core genes, assigning types to unique allelic profiles. Interpreting isolate relatedness from these distances is highly organism specific, and attempts to establish species-specific cutoffs are unlikely to be generally applicable. In cases where single nucleotide polymorphism or core gene typing do not provide the resolution necessary for accurate assessment of the epidemiology of bacterial pathogens, inclusion of accessory gene or plasmid sequences may provide the additional required discrimination. As with all epidemiologic analysis, realizing the full potential of the revolutionary advances in WGS-based approaches requires understanding and dealing with issues related to the fundamental steps of data generation and interpretation. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  3. Development of Health Parameter Model for Risk Prediction of CVD Using SVM

    Directory of Open Access Journals (Sweden)

    P. Unnikrishnan

    2016-01-01

    Full Text Available Current methods of cardiovascular risk assessment use health factors that are often based on the Framingham study. However, these methods have significant limitations due to their poor sensitivity and specificity. We compared the parameters from the Framingham equation with linear regression analysis to establish the effect of training the model on a local database. A support vector machine (SVM) was used to determine the effectiveness of a machine learning approach with the Framingham health parameters for risk assessment of cardiovascular disease (CVD). The results show that while the linear model trained on the local database was an improvement on the Framingham model, the SVM-based risk assessment model had high sensitivity and specificity in predicting CVD. This indicates that, using the health parameters identified by the Framingham study, a machine learning approach overcomes the low sensitivity and specificity of the Framingham model.

  4. Semi-supervised prediction of gene regulatory networks using machine learning algorithms.

    Science.gov (United States)

    Patel, Nihir; Wang, Jason T L

    2015-10-01

    Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods for GRN prediction by utilizing two machine learning algorithms, namely, support vector machines (SVM) and random forests (RF). The semi-supervised methods make use of unlabelled data for training. We investigated inductive and transductive learning approaches, both of which adopt an iterative procedure to obtain reliable negative training data from the unlabelled data. We then applied our semi-supervised methods to gene expression data of Escherichia coli and Saccharomyces cerevisiae, and evaluated the performance of our methods using the expression data. Our analysis indicated that the transductive learning approach outperformed the inductive learning approach for both organisms. However, there was no conclusive difference identified in the performance of SVM and RF. Experimental results also showed that the proposed semi-supervised methods performed better than existing supervised methods for both organisms.

  5. Parameters selection in gene selection using Gaussian kernel support vector machines by genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables, the small number of samples, and the non-linearity of the problem. It is difficult to get satisfactory results with conventional linear statistical methods. Recursive feature elimination based on the support vector machine (SVM RFE) is an effective algorithm for gene selection and cancer classification, which it integrates into a consistent framework. In this paper, we propose a new method for selecting the parameters of this algorithm when it is implemented with Gaussian kernel SVMs: a genetic algorithm searches for a pair of optimal parameters, as a better alternative to the common practice of selecting the apparently best parameters. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two representative datasets, hereditary breast cancer and acute leukaemia. The experimental results indicate that the proposed method performs well in selecting genes and achieves high classification accuracies with these genes.
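
    For reference, the Gaussian (RBF) kernel named in this abstract has the standard form below. The abstract does not say which parameters are searched; reading them as the kernel width σ and the SVM regularization constant C is an assumption, not something the source states.

```latex
K(\mathbf{x}, \mathbf{x}') = \exp\!\left( -\frac{\lVert \mathbf{x} - \mathbf{x}' \rVert^{2}}{2\sigma^{2}} \right), \qquad \sigma > 0
```

    Under that reading, each candidate evaluated by a genetic algorithm would be one (C, σ) pair, scored by the classification accuracy of the resulting SVM.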

  6. Forecasting of Power Grid Investment in China Based on Support Vector Machine Optimized by Differential Evolution Algorithm and Grey Wolf Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Shuyu Dai

    2018-04-01

    Full Text Available In recent years, the construction of China’s power grid has experienced rapid development, and its scale has leaped into first place in the world. Accurate and effective prediction of power grid investment can not only help pool funds and rationally arrange investment in power grid construction, but also reduce capital costs and economic risks, which plays a crucial role in promoting power grid investment planning and the construction process. In order to forecast the power grid investment of China accurately, the influencing factors of power grid investment are first analyzed and a system of influencing factors for China’s power grid investment forecasting is constructed in this article. Grey relational analysis is used to screen the main influencing factors, which serve as the prediction model input. Then, a novel power grid investment prediction model based on DE-GWO-SVM (support vector machine optimized by differential evolution and grey wolf optimization algorithm) is proposed. Next, two cases are taken for empirical analysis to show that the DE-GWO-SVM model has strong generalization capacity and achieves a good prediction effect for power grid investment forecasting in China. Finally, the DE-GWO-SVM model is adopted to forecast power grid investment in China from 2018 to 2022.

  7. A simple and accurate two-step long DNA sequences synthesis strategy to improve heterologous gene expression in pichia.

    Directory of Open Access Journals (Sweden)

    Jiang-Ke Yang

    Full Text Available In vitro gene chemical synthesis is a powerful tool to improve the expression of genes in heterologous systems. In this study, a two-step gene synthesis strategy that combines an assembly PCR and an overlap extension PCR (AOE) was developed. In this strategy, the chemically synthesized oligonucleotides were assembled into several 200-500 bp fragments with 20-25 bp overlap at each end by assembly PCR, and then an overlap extension PCR was conducted to assemble all these fragments into a full length DNA sequence. Using this method, we de novo designed and optimized the codons of the Rhizopus oryzae lipase gene ROL (810 bp) and the Aspergillus niger phytase gene phyA (1404 bp). Compared with the original ROL gene and phyA gene, the codon-optimized genes expressed at a significantly higher level in yeasts after methanol induction. We believe this AOE method to be of special interest as it is simple, accurate and has no limitation with respect to the size of the gene to be synthesized. Combined with de novo design, this method allows the rapid synthesis of a gene optimized for expression in the system of choice and production of sufficient biological material for molecular characterization and biotechnological application.
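
    The fragment layout described above (fragments of a few hundred bp sharing a short overlap with their neighbours) can be sketched as follows. This is only an illustration of the overlap bookkeeping, not the authors' design procedure; the function name, the 300 bp fragment length, the 20 bp overlap, and the random 810 bp sequence are all assumptions chosen for the example.

```python
import random


def split_with_overlap(seq, fragment_len=300, overlap=20):
    """Cut seq into consecutive fragments that share `overlap` bases.

    Hypothetical helper: the final fragment may be shorter than
    fragment_len, so a real design would rebalance fragment sizes.
    """
    if not 0 <= overlap < fragment_len:
        raise ValueError("overlap must be non-negative and shorter than fragment_len")
    step = fragment_len - overlap
    fragments = []
    start = 0
    while True:
        fragments.append(seq[start:start + fragment_len])
        if start + fragment_len >= len(seq):
            break
        start += step
    return fragments


# Stand-in sequence with the same length as the ROL gene (810 bp);
# the bases are random, not the real gene.
rng = random.Random(0)
gene = "".join(rng.choice("ACGT") for _ in range(810))

fragments = split_with_overlap(gene, fragment_len=300, overlap=20)
print([len(f) for f in fragments])  # prints [300, 300, 250]
```

    Dropping the first 20 bases of every fragment after the first and concatenating the pieces reproduces the full sequence, which is the property the overlap extension step relies on.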

  8. Metabolic changes in rat serum after administration of suberoylanilide hydroxamic acid and discriminated by SVM.

    Science.gov (United States)

    Yu, J; Wu, H; Lin, Z; Su, K; Zhang, J; Sun, F; Wang, X; Wen, C; Cao, H; Hu, L

    2017-12-01

    Suberoylanilide hydroxamic acid (SAHA) exerts marked anticancer effects via promotion of apoptosis, cell cycle arrest, and prevention of oncogene expression. In this study, serum metabolomics and artificial intelligence recognition were used to investigate SAHA toxicity. Forty rats (220 ± 20 g) were randomly divided into control and three SAHA groups (low, medium, and high); the experimental groups were treated with 12.3, 24.5, or 49.0 mg/kg SAHA once a day via intragastric administration. After 7 days, blood samples from the four groups were collected and analyzed by gas chromatography-mass spectrometry, and pathological changes in the liver were examined using microscopy. The results showed that increased levels of urea, oleic acid, and glutaconic acid were the most significant indicators of toxicity. Octadecanoic acid, pentadecanoic acid, glycerol, propanoic acid, and uric acid levels were lower in the high SAHA group. Microscopic observation revealed no obvious damage to the liver. Based on these data, a support vector machine (SVM) discrimination model was established that recognized the metabolic changes in the three SAHA groups and the control group with 100% accuracy. In conclusion, the main toxicity caused by SAHA was due to excessive metabolism of saturated fatty acids, which could be recognized by an SVM model.

  9. SVM prediction of ligand-binding sites in bacterial lipoproteins employing shape and physio-chemical descriptors.

    Science.gov (United States)

    Kadam, Kiran; Prabhakar, Prashant; Jayaraman, V K

    2012-11-01

    Bacterial lipoproteins play critical roles in various physiological processes, including the maintenance of pathogenicity, and a number of them are being considered as potential candidates for generating novel vaccines. In this work, we put forth an algorithm to identify and predict ligand-binding sites in bacterial lipoproteins. The method uses three types of pocket descriptors, namely fpocket descriptors, 3D Zernike descriptors and shell descriptors, and combines them with the Support Vector Machine (SVM) method for classification. The three types of descriptors represent shape-based properties of the pocket as well as its local physio-chemical features. All three types of descriptors, along with their hybrid combinations, are evaluated with SVM, and WEKA-InfoGain feature selection is applied to improve classification performance. Results obtained in the study show that the classifier successfully differentiates between ligand-binding and non-binding pockets. For the combination of the three types of descriptors, a 10-fold cross-validation accuracy of 86.83% is obtained for training, while the selected model achieved a test Matthews Correlation Coefficient (MCC) of 0.534. Individually or in combination with new and existing methods, our model can be a very useful tool for the prediction of potential ligand-binding sites in bacterial lipoproteins.
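
    The Matthews Correlation Coefficient reported above is computed from the four cells of a binary confusion matrix. A minimal sketch of the standard formula follows; the counts in the usage line are hypothetical and are not the study's data.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC for a binary confusion matrix.

    Returns 0.0 when the denominator is zero (a common convention
    for the undefined case where a whole row or column is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for binding (positive) vs. non-binding (negative) pockets.
print(round(matthews_corrcoef(tp=40, tn=45, fp=5, fn=10), 3))
```

    MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction right), so it remains informative when the two classes differ in size.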

  10. Using visible and near-infrared diffuse reflectance spectroscopy for predicting soil properties based on regression with peaks parameters as derived from continuum-removed spectra

    Science.gov (United States)

    Vasat, Radim; Klement, Ales; Jaksik, Ondrej; Kodesova, Radka; Drabek, Ondrej; Boruvka, Lubos

    2014-05-01

    Visible and near-infrared diffuse reflectance spectroscopy (VNIR-DRS) provides a rapid and inexpensive tool for simultaneous prediction of a variety of soil properties. Usually, sophisticated multivariate mathematical or statistical methods are employed to extract the required information from the raw spectral measurements. For this purpose, partial least squares regression (PLSR) and support vector machines (SVM) are the most frequently used. These methods generally benefit from the complexity with which the soil spectra are treated. Interestingly, techniques that focus on only a single spectral feature, such as a simple linear regression with a selected continuum-removed spectra (CRS) characteristic (e.g. peak depth), can often provide competitive results. Therefore, we decided to enhance the potential of CRS by taking into account all possible CRS peak parameters (area, width and depth) and to develop a comprehensive methodology based on a multiple linear regression approach. The eight considered soil properties were oxidizable carbon content (Cox), exchangeable (pHex) and active soil pH (pHa), particle and bulk density, CaCO3 content, crystalline and amorphous (Fed) and amorphous Fe (Feox) forms. In four cases (pHa, bulk density, Fed and Feox), of which two (Fed and Feox) were predicted reliably accurately (0.50 interestingly, in the case of particle density, the presented approach outperformed the PLSR and SVM dramatically, offering a fairly accurate prediction (R2cv = 0.827) against two failures (R2cv = 0.034 and 0.121 for PLSR and SVM, resp.). In the last two cases (Cox and CaCO3), slightly worse results were achieved than with PLSR and SVM, with an overall fairly accurate prediction (R2cv > 0.80). Acknowledgment: Authors acknowledge the financial support of the Ministry of Agriculture of the Czech Republic (grant No. QJ1230319).

  11. Penerapan Support Vector Machine (SVM) untuk Pengkategorian Penelitian [Application of Support Vector Machine (SVM) for Research Categorization]

    Directory of Open Access Journals (Sweden)

    Fithri Selva Jumeilah

    2017-07-01

    Full Text Available The volume of research produced at every college continues to grow, and it is stored in both softcopy and hardcopy. Research should be categorized so that people who need references can find it easily. Categorizing research calls for a text mining method, one of which is the Support Vector Machine (SVM). To learn the characteristics of each category, the study uses secondary data consisting of a collection of research abstracts. The data are pre-processed in several stages: case folding, which converts all letters to lowercase; stop-word removal, which removes very common words; tokenizing, which discards punctuation; and stemming, which finds root words by removing prefixes and suffixes. The pre-processed data are then converted into numerical form in the term weighting stage, which assigns a weight to the contribution of each word. The term weighting results yield the data used for training and testing. Training is done by providing text data whose class or category is known as input. Using the Support Vector Machine algorithm, the input data are then transformed into a rule, function, or knowledge model that can be used for prediction. The study found that the categorization of research produced by the SVM is very good, as shown by a test accuracy of 90%.
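
    The pre-processing stages listed above can be sketched in a few lines of standard-library Python. Everything specific here is an assumption made for illustration: the stop-word list and suffix stripper are toy English stand-ins (the study's language resources are not described in the abstract), and TF-IDF is used as one common term weighting scheme because the abstract does not name the scheme it applies.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"the", "of", "a", "is", "and", "for", "to", "in"}  # toy list (assumption)
SUFFIXES = ("ing", "es", "s")  # toy suffix stripper, not a real stemmer


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Case folding, tokenizing (punctuation dropped), stop-word removal, stemming."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per document (TF-IDF is an assumed scheme)."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weights


print(preprocess("Classifying the research abstracts."))
```

    The resulting weight dictionaries are the numerical form that a classifier such as an SVM would take as input; terms that occur in every document receive a weight of zero and so do not help separate categories.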

  12. Fast subcellular localization by cascaded fusion of signal-based and homology-based methods

    Directory of Open Access Journals (Sweden)

    Wang Wei

    2011-10-01

    Full Text Available Abstract. Background: The functions of proteins are closely related to their subcellular locations. In the post-genomics era, the amount of gene and protein data grows exponentially, which necessitates the prediction of subcellular localization by computational means. Results: This paper proposes mitigating the computational burden of alignment-based approaches to subcellular localization prediction by a cascaded fusion of cleavage site prediction and profile alignment. Specifically, the informative segments of protein sequences are identified by a cleavage site predictor using the information in their N-terminal sorting signals. Then, the sequences are truncated at the cleavage site positions, and the shortened sequences are passed to PSI-BLAST for computing their profiles. Subcellular localizations are subsequently predicted by a profile-to-profile alignment support-vector-machine (SVM) classifier. To further reduce the training and recognition time of the classifier, the SVM classifier is replaced by a new kernel method based on perturbational discriminant analysis (PDA). Conclusions: Experimental results on a new dataset based on Swiss-Prot Release 57.5 show that the method can make use of the best properties of signal- and homology-based approaches and can attain an accuracy comparable to that achieved by using full-length sequences. Analysis of profile-alignment score matrices suggests that both profile creation time and profile alignment time can be reduced without significant reduction in subcellular localization accuracy. It was found that PDA enjoys a short training time compared to the conventional SVM. We advocate that the method will be important for biologists conducting large-scale protein annotation and for bioinformaticians performing preliminary investigations on new algorithms that involve pairwise alignments.

  13. Damage level prediction of non-reshaped berm breakwater using ANN, SVM and ANFIS models

    Digital Repository Service at National Institute of Oceanography (India)

    Mandal, S.; SubbaRao; Harish, N.; Lokesha

    Marine Structures Laboratory, Department of Applied Mechanics and Hydraulics, NITK, Surathkal, India. Soft computing techniques like Artificial Neural Network (ANN), Support Vector Machine (SVM) and Adaptive Neuro Fuzzy Inference system (ANFIS) models...

  14. SVM-based prediction of propeptide cleavage sites in spider toxins identifies toxin innovation in an Australian tarantula.

    Directory of Open Access Journals (Sweden)

    Emily S W Wong

    Full Text Available Spider neurotoxins are commonly used as pharmacological tools and are a popular source of novel compounds with therapeutic and agrochemical potential. Since venom peptides are inherently toxic, the host spider must employ strategies to avoid adverse effects prior to venom use. It is partly for this reason that most spider toxins encode a protective proregion that upon enzymatic cleavage is excised from the mature peptide. In order to identify the mature toxin sequence directly from toxin transcripts, without resorting to protein sequencing, the propeptide cleavage site in the toxin precursor must be predicted bioinformatically. We evaluated different machine learning strategies (support vector machines, hidden Markov model and decision tree) and developed an algorithm (SpiderP) for prediction of propeptide cleavage sites in spider toxins. Our strategy uses a support vector machine (SVM) framework that combines both local and global sequence information. Our method is superior or comparable to current tools for prediction of propeptide sequences in spider toxins. Evaluation of the SVM method on an independent test set of known toxin sequences yielded 96% sensitivity and 100% specificity. Furthermore, we sequenced five novel peptides (not used to train the final predictor) from the venom of the Australian tarantula Selenotypus plumipes to test the accuracy of the predictor and found 80% sensitivity and 99.6% 8-mer specificity. Finally, we used the predictor together with homology information to predict and characterize seven groups of novel toxins from the deeply sequenced venom gland transcriptome of S. plumipes, which revealed structural complexity and innovations in the evolution of the toxins. The precursor prediction tool (SpiderP) is freely available on ArachnoServer (http://www.arachnoserver.org/spiderP.html), a web portal to a comprehensive relational database of spider toxins. All training data, test data, and scripts used are available from

  15. Energy-Based Metrics for Arthroscopic Skills Assessment.

    Science.gov (United States)

    Poursartip, Behnaz; LeBel, Marie-Eve; McCracken, Laura C; Escoto, Abelardo; Patel, Rajni V; Naish, Michael D; Trejos, Ana Luisa

    2017-08-05

    Minimally invasive skills assessment methods are essential in developing efficient surgical simulators and implementing consistent skills evaluation. Although numerous methods have been investigated in the literature, there is still a need to further improve the accuracy of surgical skills assessment. Energy expenditure can be an indication of motor skills proficiency. The goals of this study are to develop objective metrics based on energy expenditure, normalize these metrics, and investigate classifying trainees using these metrics. To this end, different forms of energy consisting of mechanical energy and work were considered and their values were divided by the related value of an ideal performance to develop normalized metrics. These metrics were used as inputs for various machine learning algorithms including support vector machines (SVM) and neural networks (NNs) for classification. The accuracy of the combination of the normalized energy-based metrics with these classifiers was evaluated through a leave-one-subject-out cross-validation. The proposed method was validated using 26 subjects at two experience levels (novices and experts) in three arthroscopic tasks. The results showed that there are statistically significant differences between novices and experts for almost all of the normalized energy-based metrics. The accuracy of classification using SVM and NN methods was between 70% and 95% for the various tasks. The results show that the normalized energy-based metrics and their combination with SVM and NN classifiers are capable of providing accurate classification of trainees. The assessment method proposed in this study can enhance surgical training by providing appropriate feedback to trainees about their level of expertise and can be used in the evaluation of proficiency.
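
    The normalization and leave-one-subject-out cross-validation described above can be sketched as follows. The function names, subject labels, and numeric values are hypothetical, and the classifier itself (SVM or NN in the study) is deliberately left out; the sketch only shows how trials are grouped so that no subject appears in both the training and the test fold.

```python
def normalize(value, ideal_value):
    """Divide a raw energy metric by the value from an ideal performance."""
    return value / ideal_value


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    All trials of the held-out subject go to the test fold, so the
    classifier is always evaluated on a person it has never seen.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical layout: one subject label per recorded trial.
subjects = ["s1", "s1", "s2", "s2", "s3"]
for held_out, train_idx, test_idx in leave_one_subject_out(subjects):
    print(held_out, train_idx, test_idx)
```

    With 26 subjects, as in the study, this scheme produces 26 folds, and the reported accuracy is the aggregate over those folds.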

  16. Shallow water bathymetry mapping using Support Vector Machine (SVM) technique and multispectral imagery

    NARCIS (Netherlands)

    Misra, Ankita; Vojinovic, Zoran; Ramakrishnan, Balaji; Luijendijk, Arjen; Ranasinghe, Roshanka

    2018-01-01

    Satellite imagery along with image processing techniques prove to be efficient tools for bathymetry retrieval as they provide time and cost-effective alternatives to traditional methods of water depth estimation. In this article, a nonlinear machine learning technique of Support Vector Machine (SVM)

  17. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics

    OpenAIRE

    HUANG, SHUJUN; CAI, NIANGUANG; PACHECO, PEDRO PENZUTI; NARANDES, SHAVIRA; WANG, YANG; XU, WAYNE

    2017-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better ...

  18. Inferring gene dependency network specific to phenotypic alteration based on gene expression data and clinical information of breast cancer.

    Science.gov (United States)

    Zhou, Xionghui; Liu, Juan

    2014-01-01

    Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying the phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, whereas earlier studies either neglected it or only used it to select the differentially expressed genes as inputs for constructing the network. To be specific, we integrate phenotype information with gene expression data to identify gene dependency pairs by using conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed the gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been justified by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms classification models built with published signatures. In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for

  19. An Object-Based Classification of Mangroves Using a Hybrid Decision Tree—Support Vector Machine Approach

    Directory of Open Access Journals (Sweden)

    Benjamin W. Heumann

    2011-11-01

    Full Text Available Mangroves provide valuable ecosystem goods and services such as carbon sequestration, habitat for terrestrial and marine fauna, and coastal hazard mitigation. The use of satellite remote sensing to map mangroves has become widespread as it can provide accurate, efficient, and repeatable assessments. Traditional remote sensing approaches have failed to accurately map fringe mangroves and true mangrove species due to relatively coarse spatial resolution and/or spectral confusion with landward vegetation. This study demonstrates the use of the new Worldview-2 sensor, object-based image analysis (OBIA), and support vector machine (SVM) classification to overcome both of these limitations. An exploratory spectral separability analysis showed that individual mangrove species could not be spectrally separated, but a distinction between true and associate mangrove species could be made. An OBIA classification was used that combined a decision-tree classification with the machine-learning SVM classification. Results showed an overall accuracy greater than 94% (kappa = 0.863) for classifying true mangrove species and other dense coastal vegetation at the object level. There remain serious challenges to accurately mapping fringe mangroves using remote sensing data due to spectral similarity of mangrove and associate species, lack of clear zonation between species, and mixed pixel effects, especially when vegetation is sparse or degraded.
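
    The abstract above reports agreement as Cohen's kappa alongside overall accuracy. As a minimal sketch of how the two relate, the snippet below computes both from a confusion matrix. The 2x2 counts are hypothetical and are not taken from the study; they only illustrate that kappa discounts the agreement expected by chance.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows = reference, columns = predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of objects on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical object counts (NOT from the paper): two balanced classes.
confusion = [[45, 5],
             [5, 45]]
accuracy = (45 + 45) / 100          # 0.90 overall accuracy
kappa = cohens_kappa(confusion)     # 0.80 once chance agreement (0.50) is removed
```

    With balanced classes, half of the agreement is expected by chance, so an overall accuracy of 0.90 corresponds to a kappa of 0.80 in this toy case.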

  20. Identification of a 251 gene expression signature that can accurately detect M. tuberculosis in patients with and without HIV co-infection.

    Directory of Open Access Journals (Sweden)

    Noor Dawany

    Full Text Available BACKGROUND: Co-infection with tuberculosis (TB) is the leading cause of death in HIV-infected individuals. However, diagnosis of TB, especially in the presence of an HIV co-infection, can be limited by the high inaccuracy of conventional diagnostic methods. Here we report a gene signature that can identify a tuberculosis infection in patients co-infected with HIV as well as in the absence of HIV. METHODS: We analyzed global gene expression data from peripheral blood mononuclear cell (PBMC) samples of patients that were either mono-infected with HIV or co-infected with HIV/TB and used support vector machines to identify a gene signature that can distinguish between the two classes. We then validated our results using publicly available gene expression data from patients mono-infected with TB. RESULTS: Our analysis successfully identified a 251-gene signature that accurately distinguishes patients co-infected with HIV/TB from those infected with HIV only, with an overall accuracy of 81.4% (sensitivity = 76.2%, specificity = 86.4%). Furthermore, we show that our 251-gene signature can also accurately distinguish patients with active TB in the absence of an HIV infection from both patients with a latent TB infection and healthy controls (88.9-94.7% accuracy; 69.2-90% sensitivity and 90.3-100% specificity). We also demonstrate that the expression levels of the 251-gene signature diminish as a correlate of the length of TB treatment. CONCLUSIONS: A 251-gene signature is described to (a) detect TB in the presence or absence of an HIV co-infection, and (b) assess response to treatment following anti-TB therapy.
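
    The three figures reported for the HIV/TB versus HIV-only comparison are linked through the confusion matrix. The sketch below shows the arithmetic. The patient counts are an assumption: the abstract does not give them, and they were chosen only because they are one set of integers that reproduces the reported percentages.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true cases that are detected
    specificity = tn / (tn + fp)   # fraction of non-cases that are ruled out
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# ASSUMED counts (not stated in the abstract): 21 co-infected patients,
# 16 of them detected; 22 HIV-only patients, 19 of them correctly ruled out.
acc, sens, spec = diagnostic_metrics(tp=16, fn=5, tn=19, fp=3)
print(f"accuracy={acc:.1%} sensitivity={sens:.1%} specificity={spec:.1%}")
# prints: accuracy=81.4% sensitivity=76.2% specificity=86.4%
```

    Accuracy is a prevalence-weighted mix of sensitivity and specificity, which is why it falls between the two.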

  1. Improving Accuracy of Intrusion Detection Model Using PCA and optimized SVM

    Directory of Open Access Journals (Sweden)

    Sumaiya Thaseen Ikram

    2016-06-01

    Full Text Available Intrusion detection is essential for providing security to different network domains and is mostly used for locating and tracing intruders. Traditional intrusion detection models (IDS) have many problems, such as low detection capability against unknown network attacks, a high false alarm rate, and insufficient analysis capability. Hence the major scope of research in this domain is to develop an intrusion detection model with improved accuracy and reduced training time. This paper proposes a hybrid intrusion detection model that integrates principal component analysis (PCA) and support vector machine (SVM). The novelty of the paper is the optimization of the kernel parameters of the SVM classifier using an automatic parameter selection technique. This technique optimizes the punishment factor (C) and the kernel parameter gamma (γ), thereby improving the accuracy of the classifier and reducing the training and testing time. The experimental results obtained on the NSL KDD and gurekddcup datasets show that the proposed technique performs better, with higher accuracy, faster convergence speed, and better generalization. Minimal resources are consumed, as the classifier input requires a reduced feature set for optimum classification. A comparative analysis of hybrid models with the proposed model is also performed.
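
    To make the role of the kernel parameter γ concrete, here is a minimal sketch of the standard Gaussian (RBF) kernel, in which γ scales the squared distance between two feature vectors. The abstract does not describe its automatic parameter selection technique, so the logarithmic candidate grid below is only an assumption about what a search over (C, γ) might iterate over; the feature vectors are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical PCA-reduced feature vectors for two connections.
x = [0.2, 1.0, -0.5]
y = [0.4, 0.7, -0.1]

# Small gamma: distant points still look similar (smoother decision boundary).
wide = rbf_kernel(x, y, gamma=0.1)
# Large gamma: similarity decays quickly (more flexible, easier to overfit).
narrow = rbf_kernel(x, y, gamma=10.0)

# Assumed candidate grid on a log scale; each (C, gamma) pair would be scored
# by cross-validation in a parameter search.
candidates = [(10.0 ** c, 10.0 ** g) for c in range(-1, 3) for g in range(-3, 1)]
```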

  2. Ab-initio conformational epitope structure prediction using genetic algorithm and SVM for vaccine design.

    Science.gov (United States)

    Moghram, Basem Ameen; Nabil, Emad; Badr, Amr

    2018-01-01

    T-cell epitope structure identification is a significant challenging immunoinformatic problem within epitope-based vaccine design. Epitopes or antigenic peptides are a set of amino acids that bind with the Major Histocompatibility Complex (MHC) molecules. The aim of this process is presented by Antigen Presenting Cells to be inspected by T-cells. MHC-molecule-binding epitopes are responsible for triggering the immune response to antigens. The epitope's three-dimensional (3D) molecular structure (i.e., tertiary structure) reflects its proper function. Therefore, the identification of MHC class-II epitopes structure is a significant step towards epitope-based vaccine design and understanding of the immune system. In this paper, we propose a new technique using a Genetic Algorithm for Predicting the Epitope Structure (GAPES), to predict the structure of MHC class-II epitopes based on their sequence. The proposed Elitist-based genetic algorithm for predicting the epitope's tertiary structure is based on Ab-Initio Empirical Conformational Energy Program for Peptides (ECEPP) Force Field Model. The developed secondary structure prediction technique relies on Ramachandran Plot. We used two alignment algorithms: the ROSS alignment and TM-Score alignment. We applied four different alignment approaches to calculate the similarity scores of the dataset under test. We utilized the support vector machine (SVM) classifier as an evaluation of the prediction performance. The prediction accuracy and the Area Under Receiver Operating Characteristic (ROC) Curve (AUC) were calculated as measures of performance. The calculations are performed on twelve similarity-reduced datasets of the Immune Epitope Data Base (IEDB) and a large dataset of peptide-binding affinities to HLA-DRB1*0101. The results showed that GAPES was reliable and very accurate. We achieved an average prediction accuracy of 93.50% and an average AUC of 0.974 in the IEDB dataset. Also, we achieved an accuracy of 95

  3. Traffic Flow Prediction Model for Large-Scale Road Network Based on Cloud Computing

    Directory of Open Access Journals (Sweden)

    Zhaosheng Yang

    2014-01-01

    Full Text Available To increase the efficiency and precision of large-scale road network traffic flow prediction, a genetic algorithm-support vector machine (GA-SVM) model based on cloud computing is proposed in this paper, which is based on the analysis of the characteristics and defects of genetic algorithm and support vector machine. In the cloud computing environment, SVM parameters are first optimized by the parallel genetic algorithm, and then this optimized parallel SVM model is used to predict traffic flow. On the basis of the traffic flow data of Haizhu District in Guangzhou City, the proposed model was verified and compared with the serial GA-SVM model and the parallel GA-SVM model based on MPI (message passing interface). The results demonstrate that the parallel GA-SVM model based on cloud computing has higher prediction accuracy, shorter running time, and higher speedup.

  4. An Entropy-based gene selection method for cancer classification using microarray data

    Directory of Open Access Journals (Sweden)

    Krishnan Arun

    2005-03-01

    Full Text Available Abstract. Background: Accurate diagnosis of cancer subtypes remains a challenging problem. Building classifiers based on gene expression data is a promising approach; yet the selection of non-redundant but relevant genes is difficult. The selected gene set should be small enough to allow diagnosis even in regular clinical laboratories and ideally identify genes involved in cancer-specific regulatory pathways. Here an entropy-based method is proposed that selects genes related to the different cancer classes while at the same time reducing the redundancy among the genes. Results: The present study identifies a subset of features by maximizing the relevance and minimizing the redundancy of the selected genes. A merit called normalized mutual information is employed to measure the relevance and the redundancy of the genes. In order to find a more representative subset of features, an iterative procedure is adopted that incorporates an initial clustering followed by data partitioning and the application of the algorithm to each of the partitions. A leave-one-out approach then selects the most commonly selected genes across all the different runs, and the gene selection algorithm is applied again to pare down the list of selected genes until a minimal subset is obtained that gives a satisfactory accuracy of classification. The algorithm was applied to three different data sets and the results obtained were compared to work done by others using the same data sets. Conclusion: This study presents an entropy-based iterative algorithm for selecting genes from microarray data that are able to classify various cancer sub-types with high accuracy. In addition, the feature set obtained is very compact, that is, the redundancy between genes is reduced to a large extent. This implies that classifiers can be built with a smaller subset of genes.
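
    The relevance and redundancy measure named above can be sketched for discretized expression values as follows. The abstract does not state which normalization is used, so dividing by the smaller of the two entropies is an assumption, and the gene profiles and labels are toy data.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def normalized_mutual_information(x, y):
    """I(X;Y) / min(H(X), H(Y)); the normalization choice is an assumption."""
    h_x, h_y = entropy(x), entropy(y)
    mutual_information = h_x + h_y - entropy(list(zip(x, y)))
    smaller = min(h_x, h_y)
    return mutual_information / smaller if smaller > 0 else 0.0


# Toy data: four samples, two cancer classes, discretized expression levels.
labels = ["A", "A", "B", "B"]
gene_1 = ["hi", "hi", "lo", "lo"]      # tracks the class exactly
gene_2 = ["hi", "lo", "hi", "lo"]      # carries no class information
gene_3 = ["up", "up", "down", "down"]  # relevant, but redundant with gene_1

relevance_1 = normalized_mutual_information(gene_1, labels)   # 1.0
relevance_2 = normalized_mutual_information(gene_2, labels)   # 0.0
redundancy_13 = normalized_mutual_information(gene_1, gene_3) # 1.0
```

    A selection procedure in this spirit would keep gene_1, discard gene_2 as irrelevant, and discard gene_3 as redundant once gene_1 is chosen.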

  5. Analysis And Voice Recognition In Indonesian Language Using MFCC And SVM Method

    Directory of Open Access Journals (Sweden)

    Harvianto Harvianto

    2016-06-01

    Full Text Available Voice recognition is one of the biometric technologies. Voice is a unique human trait by which one individual can easily be distinguished from another. Voice can also provide information such as the gender, emotion, and identity of the speaker. This research records human voices pronouncing the digits 0 to 9, with and without noise. Features of these recordings are extracted using Mel Frequency Cepstral Coefficients (MFCC). The mean, standard deviation, max, min, and combinations of them are used to construct the feature vectors. These feature vectors are then classified using a Support Vector Machine (SVM). There are two classification models: the first is based on the speaker and the other on the digits pronounced. The classification models are then validated by performing 10-fold cross-validation. The best average accuracy of the two classification models is 91.83%. This result was achieved using Mean + Standard deviation + Min + Max as features.
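
    The feature construction described above (summary statistics over MFCC frames) can be sketched as below. The frame values are made up, real MFCC extraction is not shown, and the use of the population standard deviation is an assumption.

```python
import statistics


def summarize_frames(frames):
    """Collapse per-frame coefficients into one fixed-length vector:
    mean, standard deviation, min, and max of each coefficient."""
    columns = list(zip(*frames))  # one tuple per coefficient across all frames
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]  # population std (assumed)
    mins = [min(c) for c in columns]
    maxs = [max(c) for c in columns]
    return means + stds + mins + maxs


# Hypothetical recording: 3 frames x 2 coefficients (real MFCCs have more of both).
frames = [[1.0, 4.0],
          [2.0, 4.0],
          [3.0, 4.0]]
features = summarize_frames(frames)  # length 4 * 2 = 8, independent of frame count
```

    Because the statistics are taken across frames, recordings of different durations map to vectors of the same length, which is what a standard SVM requires.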

  6. A New Feature Extraction Method Based on EEMD and Multi-Scale Fuzzy Entropy for Motor Bearing

    Directory of Open Access Journals (Sweden)

    Huimin Zhao

    2016-12-01

    Full Text Available Feature extraction is one of the most important, pivotal, and difficult problems in mechanical fault diagnosis, and it directly affects the accuracy of fault diagnosis and the reliability of early fault prediction. Therefore, a new fault feature extraction method, called the EDOMFE method, based on integrating ensemble empirical mode decomposition (EEMD), mode selection, and multi-scale fuzzy entropy, is proposed in this paper to accurately diagnose faults. The EEMD method is used to decompose the vibration signal into a series of intrinsic mode functions (IMFs) with different physical significance. The correlation coefficient analysis method is used to calculate and determine three improved IMFs, which are close to the original signal. The multi-scale fuzzy entropy, which can effectively distinguish the complexity of different signals, is used to calculate the entropy values of the three selected IMFs in order to form a feature vector with the complexity measure, which is regarded as the input of the support vector machine (SVM) model for training and constructing an SVM classifier (EOMSMFD), based on EDOMFE and SVM, for fulfilling fault pattern recognition. Finally, the effectiveness of the proposed method is validated by real bearing vibration signals of the motor with different loads and fault severities. The experimental results show that the proposed EDOMFE method can effectively extract fault features from the vibration signal and that the proposed EOMSMFD method can accurately diagnose the fault types and fault severities for the inner race fault, the outer race fault, and the rolling element fault of the motor bearing. Therefore, the proposed method provides a new fault diagnosis technology for rotating machinery.
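
    The mode selection step (keeping the three IMFs closest to the original signal) can be sketched with a plain Pearson correlation ranking. EEMD itself is not implemented here: the "IMFs" are synthetic sine components, and ranking by the raw correlation value is an assumption about how closeness is measured.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def select_imfs(signal, imfs, keep=3):
    """Return indices of the `keep` components most correlated with the signal."""
    scores = [pearson(signal, imf) for imf in imfs]
    ranked = sorted(range(len(imfs)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep]), scores


n = 64


def tone(amplitude, cycles):
    """Synthetic stand-in for an IMF: a sine with a whole number of cycles."""
    return [amplitude * math.sin(2 * math.pi * cycles * t / n) for t in range(n)]


imfs = [tone(1.0, 8), tone(0.5, 3), tone(0.05, 20), tone(0.3, 1)]
signal = [sum(parts) for parts in zip(*imfs)]  # "vibration signal" = sum of modes
selected, scores = select_imfs(signal, imfs)   # the weak component (index 2) is dropped
```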

  7. Mining Key Skeleton Poses with Latent SVM for Action Recognition

    Directory of Open Access Journals (Sweden)

    Xiaoqiang Li

    2017-01-01

    Full Text Available Human action recognition based on 3D skeletons has become an active research field in recent years with the recently developed commodity depth sensors. Most published methods analyze entire 3D depth data, construct mid-level part representations, or use trajectory descriptors of spatial-temporal interest points for recognizing human activities. Unlike previous work, a novel and simple action representation is proposed in this paper which models the action as a sequence of inconsecutive and discriminative skeleton poses, named key skeleton poses. The pairwise relative positions of skeleton joints are used as features of the skeleton poses, which are mined with the aid of the latent support vector machine (latent SVM). The advantage of our method is its robustness to intraclass variation such as noise and large nonlinear temporal deformation of human action. We evaluate the proposed approach on three benchmark action datasets captured by Kinect devices: MSR Action 3D dataset, UTKinect Action dataset, and Florence 3D Action dataset. The detailed experimental results demonstrate that the proposed approach achieves superior performance to the state-of-the-art skeleton-based action recognition methods.

  8. Stochastic Optimized Relevance Feedback Particle Swarm Optimization for Content Based Image Retrieval

    Directory of Open Access Journals (Sweden)

    Muhammad Imran

    2014-01-01

    Full Text Available One of the major challenges for CBIR is to bridge the gap between low-level features and high-level semantics according to the needs of the user. To overcome this gap, relevance feedback (RF) coupled with support vector machine (SVM) has been applied successfully. However, when the feedback sample is small, the performance of SVM-based RF is often poor. To improve the performance of RF, this paper proposes a new technique, namely PSO-SVM-RF, which combines SVM-based RF with particle swarm optimization (PSO). The aims of the proposed technique are to enhance the performance of SVM-based RF and to minimize user interaction with the system by minimizing the number of RF rounds. PSO-SVM-RF was tested on the coral photo gallery containing 10908 images. The experimental results showed that the proposed PSO-SVM-RF achieved 100% accuracy in 8 feedback iterations for the top 10 retrievals and 80% accuracy in 6 iterations for the top 100 retrievals. This implies that with the PSO-SVM-RF technique a high accuracy rate is achieved in a small number of iterations.

  9. Application of machine learning on brain cancer multiclass classification

    Science.gov (United States)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a problem of multiclass classification. One approach to solve this problem is by first transforming it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: extremely many features (genes) and only a small number of samples. The application of machine learning on microarray gene expression datasets mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on the support vector machine recursive feature elimination (SVM-RFE) principle, which is improved to solve multiclass classification and is called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the results of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features of each subset are put on separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the classifier to reduce computational complexity. While ordinary SVM finds a single optimum hyperplane, the main objective of Twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected by the multiple multiclass SVM-RFE feature selection method. Furthermore, the per-class results show that this method could classify data of the normal and MD classes with 100% accuracy.

  10. Global discriminative learning for higher-accuracy computational gene prediction.

    Directory of Open Access Journals (Sweden)

    Axel Bernal

    2007-03-01

    Full Text Available Most ab initio gene predictors use a probabilistic sequence model, typically a hidden Markov model, to combine separately trained models of genomic signals and content. By combining separate models of relevant genomic features, such gene predictors can exploit small training sets and incomplete annotations, and can be trained fairly efficiently. However, that type of piecewise training does not optimize prediction accuracy and has difficulty in accounting for statistical dependencies among different parts of the gene model. With genomic information being created at an ever-increasing rate, it is worth investigating alternative approaches in which many different types of genomic evidence, with complex statistical dependencies, can be integrated by discriminative learning to maximize annotation accuracy. Among discriminative learning methods, large-margin classifiers have become prominent because of the success of support vector machines (SVM) in many classification tasks. We describe CRAIG, a new program for ab initio gene prediction based on a conditional random field model with semi-Markov structure that is trained with an online large-margin algorithm related to multiclass SVMs. Our experiments on benchmark vertebrate datasets and on regions from the ENCODE project show significant improvements in prediction accuracy over published gene predictors that use intrinsic features only, particularly at the gene level and on genes with long introns.

  11. Protein-protein interaction inference based on semantic similarity of Gene Ontology terms.

    Science.gov (United States)

    Zhang, Shu-Bo; Tang, Qiang-Rong

    2016-07-21

    Identifying protein-protein interactions is important in molecular biology. Experimental methods for this task have their limitations, and computational approaches have attracted increasing attention from the biological community. The semantic similarity derived from the Gene Ontology (GO) annotation has been regarded as one of the most powerful indicators for protein interaction. However, conventional methods based on GO similarity fail to take advantage of the specificity of GO terms in the ontology graph. We proposed a GO-based method to predict protein-protein interaction by integrating different kinds of similarity measures derived from the intrinsic structure of the GO graph. We extended five existing methods to derive the semantic similarity measures from the descending part of two GO terms in the GO graph, then adopted a feature integration strategy that combines both the ascending and the descending similarity scores derived from the three sub-ontologies to construct various kinds of features to characterize each protein pair. Support vector machines (SVM) were employed as discriminative classifiers, and five-fold cross validation experiments were conducted on both human and yeast protein-protein interaction datasets to evaluate the performance of different kinds of integrated features. The experimental results suggest that the best performance is achieved by the feature that combines information from both the ascending and the descending parts of the three ontologies. Our method is appealing for effective prediction of protein-protein interaction. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. Optimal parameters of the SVM for temperature prediction

    Directory of Open Access Journals (Sweden)

    X. Shi

    2015-05-01

    Full Text Available This paper established three different optimization models to predict the temperature at the Foping station. Principal component analysis (PCA) was used to reduce the multivariate climate factors to a few variables. The parameters of the support vector machine (SVM) were optimized with a genetic algorithm (GA), particle swarm optimization (PSO), and a developed genetic algorithm. The most suitable method for parameter optimization was chosen by comparing the results of the three models. The results are as follows: the values predicted with parameters optimized by the developed genetic algorithm were closest to the measured values and followed their trend most closely, and its convergence speed was relatively fast.

  13. A meta-analysis based method for prioritizing candidate genes involved in a pre-specific function

    Directory of Open Access Journals (Sweden)

    Jingjing Zhai

    2016-12-01

    Full Text Available The identification of genes associated with a given biological function in plants remains a challenge, although network-based gene prioritization algorithms have been developed for Arabidopsis thaliana and many non-model plant species. Nevertheless, these network-based gene prioritization algorithms have encountered several problems; one in particular is that of unsatisfactory prediction accuracy due to limited network coverage, varying link quality, and/or uncertain network connectivity. Thus a model that integrates complementary biological data may be expected to increase the prediction accuracy of gene prioritization. Towards this goal, we developed a novel gene prioritization method named RafSee, to rank candidate genes using a random forest algorithm that integrates sequence, evolutionary, and epigenetic features of plants. Subsequently, we proposed an integrative approach named RAP (Rank Aggregation-based data fusion for gene Prioritization), in which an order statistics-based meta-analysis was used to aggregate the rank of the network-based gene prioritization method and RafSee, for accurately prioritizing candidate genes involved in a pre-specific biological function. Finally, we showcased the utility of RAP by prioritizing 380 flowering-time genes in Arabidopsis. The ‘leave-one-out’ cross-validation experiment showed that RafSee could work as a complement to a current state-of-the-art network-based gene prioritization system (AraNet v2). Moreover, RAP ranked 53.68% (204/380) of flowering-time genes higher than AraNet v2, resulting in a 39.46% improvement in terms of the first quartile rank. Further evaluations also showed that RAP was effective in prioritizing genes related to different abiotic stresses. To enhance the usability of RAP for Arabidopsis and non-model plant species, an R package implementing the method is freely available at http://bioinfo.nwafu.edu.cn/software.

  14. BLE-BASED ACCURATE INDOOR LOCATION TRACKING FOR HOME AND OFFICE

    OpenAIRE

    Joonghong Park; Jaehoon Kim; Sungwon Kang

    2015-01-01

    Nowadays the use of smart mobile devices and the accompanying needs for emerging services relying on indoor location-based services (LBS) for mobile devices are rapidly increasing. For more accurate location tracking using Bluetooth Low Energy (BLE), this paper proposes a novel trilateration-based algorithm and presents experimental results that demonstrate its effectiveness.
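
    For orientation, the snippet below sketches textbook two-dimensional trilateration from three beacons with known positions. It is not the paper's novel algorithm, which the abstract does not describe. Beacon coordinates and distances are hypothetical, and the conversion from BLE signal strength to distance is deliberately left out.

```python
import math


def trilaterate(beacons, distances):
    """Locate a point from three (x, y) beacons and the distances to each.

    Subtracting the first circle equation from the other two gives a
    2x2 linear system, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = distances
    a, b = 2 * (x2 - x1), 2 * (y2 - y1)
    c = d1**2 - d2**2 - x1**2 + x2**2 - y1**2 + y2**2
    d, e = 2 * (x3 - x1), 2 * (y3 - y1)
    f = d1**2 - d3**2 - x1**2 + x3**2 - y1**2 + y3**2
    determinant = a * e - b * d
    if determinant == 0:
        raise ValueError("beacons are collinear; position is not unique")
    return (c * e - f * b) / determinant, (a * f - d * c) / determinant


# Hypothetical room layout in metres: three wall-mounted beacons.
beacons = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
true_position = (1.0, 1.0)
distances = [math.hypot(true_position[0] - bx, true_position[1] - by)
             for bx, by in beacons]
x, y = trilaterate(beacons, distances)  # recovers approximately (1.0, 1.0)
```

    With noisy distance estimates the three circles no longer meet in a single point, which is presumably the kind of inaccuracy an improved algorithm needs to address.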

  15. A nonlinear QSAR study using oscillating search and SVM as an efficient algorithm to model the inhibition of reverse transcriptase by HEPT derivatives

    International Nuclear Information System (INIS)

    Ferkous, F.; Saihi, Y.

    2018-01-01

    Quantitative structure-activity relationships were constructed for 107 inhibitors of HIV-1 reverse transcriptase that are derivatives of 1-[(2-hydroxyethoxy)methyl]-6-(phenylthio)thymine (HEPT). A combination of a support vector machine (SVM) and oscillating search (OS) algorithms for feature selection was adopted to select the most appropriate descriptors. The application was optimized to obtain an SVM model to predict the biological activity EC50 of the HEPT derivatives with a minimum number of descriptors (SpMax4_Bh(e), MLOGP, MATS5m) and high values of R2 and Q2 (0.8662, 0.8769). The statistical results showed good correlation between the activity and the three best descriptors included in the best SVM model. The values of R2 and Q2 confirmed the stability and good predictive ability of the model. The SVM technique was adequate to produce an effective QSAR model and outperformed those in the literature and the predictive stages for the inhibitory activity of reverse transcriptase by HEPT derivatives. (author)

  16. Customized oligonucleotide microarray gene expression-based classification of neuroblastoma patients outperforms current clinical risk stratification.

    Science.gov (United States)

    Oberthuer, André; Berthold, Frank; Warnat, Patrick; Hero, Barbara; Kahlert, Yvonne; Spitz, Rüdiger; Ernestus, Karen; König, Rainer; Haas, Stefan; Eils, Roland; Schwab, Manfred; Brors, Benedikt; Westermann, Frank; Fischer, Matthias

    2006-11-01

    To develop a gene expression-based classifier for neuroblastoma patients that reliably predicts courses of the disease. Two hundred fifty-one neuroblastoma specimens were analyzed using a customized oligonucleotide microarray comprising 10,163 probes for transcripts with differential expression in clinical subgroups of the disease. Subsequently, the prediction analysis for microarrays (PAM) was applied to a first set of patients with maximally divergent clinical courses (n = 77). The classification accuracy was estimated by a complete 10-times-repeated 10-fold cross validation, and a 144-gene predictor was constructed from this set. This classifier's predictive power was evaluated in an independent second set (n = 174) by comparing results of the gene expression-based classification with those of risk stratification systems of current trials from Germany, Japan, and the United States. The first set of patients was accurately predicted by PAM (cross-validated accuracy, 99%). Within the second set, the PAM classifier significantly separated cohorts with distinct courses (3-year event-free survival [EFS] 0.86 +/- 0.03 [favorable; n = 115] v 0.52 +/- 0.07 [unfavorable; n = 59] and 3-year overall survival 0.99 +/- 0.01 v 0.84 +/- 0.05; both P model, the PAM predictor classified patients of the second set more accurately than risk stratification of current trials from Germany, Japan, and the United States (P < .001; hazard ratio, 4.756 [95% CI, 2.544 to 8.893]). Integration of gene expression-based class prediction of neuroblastoma patients may improve risk estimation of current neuroblastoma trials.
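
    The validation scheme named above, 10-times-repeated 10-fold cross validation, can be sketched as an index generator. The sample count of 77 matches the first patient set; the shuffling seed and the fold assignment by striding are assumptions. In a complete cross validation, any gene selection would be redone inside each training fold so that the held-out patients never influence the predictor.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (train_indices, test_indices) for `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        for fold in range(k):
            test_indices = order[fold::k]  # every k-th sample goes to this fold
            held_out = set(test_indices)
            train_indices = [i for i in order if i not in held_out]
            yield train_indices, test_indices


splits = list(repeated_kfold(n_samples=77))
# 10 repeats x 10 folds = 100 train/test splits; within one repeat,
# every patient is held out exactly once.
```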

  17. Vision-Based Recognition of Activities by a Humanoid Robot

    Directory of Open Access Journals (Sweden)

    Mounîm A. El-Yacoubi

    2015-12-01

    Full Text Available We present an autonomous assistive robotic system for human activity recognition from video sequences. Due to the large variability inherent to video capture from a non-fixed robot (as opposed to a fixed camera), as well as the robot's limited computing resources, implementation has been guided by robustness to this variability and by memory and computing speed efficiency. To accommodate motion speed variability across users, we encode motion using dense interest point trajectories. Our recognition model harnesses the dense interest point bag-of-words representation through an intersection kernel-based SVM that better accommodates the large intra-class variability stemming from a robot operating in different locations and conditions. To contextually assess the engine as implemented in the robot, we compare it with the most recent approaches of human action recognition performed on public datasets (non-robot-based), including a novel approach of our own that is based on a two-layer SVM-hidden conditional random field sequential recognition model. The latter's performance is among the best within the recent state of the art. We show that our robot-based recognition engine, while less accurate than the sequential model, nonetheless performs well, especially given the adverse test conditions of the robot, relative to those of a fixed camera.
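
    The intersection kernel mentioned above compares two bag-of-words histograms by summing their bin-wise minima. A minimal sketch follows; the four-bin histograms are hypothetical, and the resulting Gram matrix is the kind of precomputed kernel an SVM trainer could consume.

```python
def intersection_kernel(h1, h2):
    """Histogram intersection: the sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def gram_matrix(histograms):
    """Pairwise kernel values between all histograms."""
    return [[intersection_kernel(a, b) for b in histograms] for a in histograms]


# Hypothetical normalized visual-word histograms for three video clips.
clip_a = [0.5, 0.3, 0.2, 0.0]
clip_b = [0.4, 0.4, 0.1, 0.1]  # similar word usage to clip_a
clip_c = [0.0, 0.1, 0.1, 0.8]  # dominated by a different visual word

gram = gram_matrix([clip_a, clip_b, clip_c])
# clip_a vs clip_b overlap is about 0.8; clip_a vs clip_c is about 0.2.
```

    Because each bin contributes at most its smaller count, a single bin with an unusually large value cannot dominate the similarity.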

  18. Fluctuation localization imaging-based fluorescence in situ hybridization (fliFISH) for accurate detection and counting of RNA copies in single cells

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Yi; Hu, Dehong; Markillie, Lye Meng; Chrisler, William B.; Gaffrey, Matthew J.; Ansong, Charles; Sussel, Lori; Orr, Galya

    2017-10-04

    Quantitative gene expression analysis in intact single cells can be achieved using single molecule- based fluorescence in situ hybridization (smFISH). This approach relies on fluorescence intensity to distinguish between true signals, emitted from an RNA copy hybridized with multiple FISH sub-probes, and background noise. Thus, the precision in smFISH is often compromised by partial or nonspecific binding of sub-probes and tissue autofluorescence, limiting its accuracy. Here we provide an accurate approach for setting quantitative thresholds between true and false signals, which relies on blinking frequencies of photoswitchable dyes. This fluctuation localization imaging-based FISH (fliFISH) uses blinking frequency patterns, emitted from a transcript bound to multiple sub-probes, which are distinct from blinking patterns emitted from partial or nonspecifically bound sub-probes and autofluorescence. Using multicolor fliFISH, we identified radial gene expression patterns in mouse pancreatic islets for insulin, the transcription factor, NKX2-2, and their ratio (Nkx2-2/Ins2). These radial patterns, showing higher values in β cells at the islet core and lower values in peripheral cells, were lost in diabetic mouse islets. In summary, fliFISH provides an accurate, quantitative approach for detecting and counting true RNA copies and rejecting false signals by their distinct blinking frequency patterns, laying the foundation for reliable single-cell transcriptomics.

  19. A robust regression based on weighted LSSVM and penalized trimmed squares

    International Nuclear Information System (INIS)

    Liu, Jianyong; Wang, Yong; Fu, Chengqun; Guo, Jie; Yu, Qin

    2016-01-01

    Least squares support vector machine (LS-SVM) for nonlinear regression is sensitive to outliers. Weighted LS-SVM (WLS-SVM) overcomes this drawback by adding a weight to each training sample. However, as the number of outliers increases, the accuracy of WLS-SVM may decrease. In order to improve the robustness of WLS-SVM, a new robust regression method based on WLS-SVM and penalized trimmed squares (WLSSVM–PTS) has been proposed. The algorithm comprises three main stages. First, the initial parameters are obtained by least trimmed squares. Then, the significant outliers are identified and eliminated by the Fast-PTS algorithm. Finally, the remaining samples, which contain few outliers, are estimated by WLS-SVM. Statistical tests of experimental results on numerical and real-world datasets show that the proposed WLSSVM–PTS is significantly more robust than LS-SVM, WLS-SVM and LSSVM–LTS.
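
    For orientation, the weighted LS-SVM regression step can be written as a single linear system. This is the textbook formulation, not taken from the abstract, and the symbols are assumptions: K is the kernel matrix with entries k(x_i, x_j), γ is the regularization constant, and v_i is the weight of training sample i (all v_i = 1 recovers plain LS-SVM).

```latex
\begin{bmatrix}
  0 & \mathbf{1}^{\top} \\
  \mathbf{1} & K + \operatorname{diag}\!\left(\tfrac{1}{\gamma v_1}, \dots, \tfrac{1}{\gamma v_N}\right)
\end{bmatrix}
\begin{bmatrix} b \\ \boldsymbol{\alpha} \end{bmatrix}
=
\begin{bmatrix} 0 \\ \mathbf{y} \end{bmatrix},
\qquad
f(x) = \sum_{i=1}^{N} \alpha_i \, k(x, x_i) + b
```

    A small v_i inflates the corresponding diagonal entry, so a suspected outlier pulls the fitted function less strongly than a trusted sample does.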

  20. Fault detection and diagnosis of an industrial steam turbine using fusion of SVM (support vector machine) and ANFIS (adaptive neuro-fuzzy inference system) classifiers

    Energy Technology Data Exchange (ETDEWEB)

    Salahshoor, Karim [Department of Instrumentation and Automation, Petroleum University of Technology, Tehran (Iran, Islamic Republic of); Kordestani, Mojtaba; Khoshro, Majid S. [Department of Control Engineering, Islamic Azad University South Tehran branch (Iran, Islamic Republic of)

    2010-12-15

    The subject of FDD (fault detection and diagnosis) has gained widespread industrial interest in machine condition monitoring applications. This is mainly due to the potential advantage to be achieved from reduced maintenance costs, improved productivity and increased machine availability. This paper presents a new FDD scheme for condition monitoring of an industrial steam turbine using a data fusion methodology. Fusion of an SVM (support vector machine) classifier with an ANFIS (adaptive neuro-fuzzy inference system) classifier, integrated into a common framework, is utilized to enhance the fault detection and diagnostic tasks. For this purpose, multi-attribute data are fused into aggregated values of a single attribute by OWA (ordered weighted averaging) operators. The simulation studies indicate that the resulting fusion-based scheme outperforms the individual SVM and ANFIS systems in detecting and diagnosing incipient steam turbine faults.
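
    The OWA aggregation mentioned above can be sketched as follows. This is a minimal illustration of the generic OWA operator, not the authors' configuration: the weight vectors and attribute values are hypothetical. The defining feature is that weights attach to rank positions (largest value first), not to particular attributes.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical normalised attribute values for one operating sample.
attributes = [0.2, 0.9, 0.5]

if __name__ == "__main__":
    print(owa(attributes, [1.0, 0.0, 0.0]))   # behaves like max
    print(owa(attributes, [0.0, 0.0, 1.0]))   # behaves like min
    print(owa(attributes, [0.5, 0.3, 0.2]))   # emphasises the larger values
```

    Choosing the weight vector moves the operator anywhere between the minimum and the maximum, with equal weights giving the arithmetic mean.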

  1. Comparison between SARS CoV and MERS CoV Using Apriori Algorithm, Decision Tree, SVM

    Directory of Open Access Journals (Sweden)

    Jang Seongpil

    2016-01-01

    MERS (Middle East Respiratory Syndrome) is a worldwide disease these days. The number of infected people is 1038 (08/03/2015) in Saudi Arabia and 186 (08/03/2015) in South Korea. MERS has spread all over the world, including Europe, East Asia and the Middle East, and the fatality rate is 38.8%. MERS is also known as a cousin of SARS (Severe Acute Respiratory Syndrome) because both diseases show similar symptoms such as high fever and difficulty in breathing. This is why we compared MERS with SARS. We used data of the spike glycoprotein from NCBI. To analyze the protein, the apriori algorithm, a decision tree and SVM were used; in particular, SVM was run with normal, polynomial, and sigmoid kernels. The results showed that MERS and SARS are alike but also differ in some ways.

  2. How to perform RT-qPCR accurately in plant species? A case study on flower colour gene expression in an azalea (Rhododendron simsii hybrids) mapping population.

    Science.gov (United States)

    De Keyser, Ellen; Desmet, Laurence; Van Bockstaele, Erik; De Riek, Jan

    2013-06-24

    Flower colour variation is one of the most crucial selection criteria in the breeding of a flowering pot plant, as is also the case for azalea (Rhododendron simsii hybrids). Flavonoid biosynthesis was studied intensively in several species. In azalea, flower colour can be described by means of a 3-gene model. However, this model does not clarify pink coloration. In the last decade, gene expression studies have been implemented widely for studying flower colour. However, the methods used were often only semi-quantitative or quantification was not done according to the MIQE-guidelines. We aimed to develop an accurate protocol for RT-qPCR and to validate the protocol to study flower colour in an azalea mapping population. An accurate RT-qPCR protocol had to be established. RNA quality was evaluated in a combined approach by means of different techniques, e.g. SPUD-assay and Experion-analysis. We demonstrated the importance of testing noRT-samples for all genes under study to detect contaminating DNA. In spite of the limited sequence information available, we prepared a set of 11 reference genes which was validated in flower petals; a combination of three reference genes was optimal. Finally, we also used plasmids for the construction of standard curves. This allowed us to calculate gene-specific PCR efficiencies for every gene to assure an accurate quantification. The validity of the protocol was demonstrated by means of the study of six genes of the flavonoid biosynthesis pathway. No correlations were found between flower colour and the individual expression profiles. However, the combination of early pathway genes (CHS, F3H, F3'H and FLS) is clearly related to co-pigmentation with flavonols. The late pathway genes DFR and ANS are to a minor extent involved in differentiating between coloured and white flowers. Concerning pink coloration, we could demonstrate that the lower intensity in this type of flowers is correlated to the expression of F3'H.
Currently in plant

  3. Support vector machine-based exergetic modelling of a DI diesel engine running on biodiesel–diesel blends containing expanded polystyrene

    International Nuclear Information System (INIS)

    Shamshirband, Shahaboddin; Tabatabaei, Meisam; Aghbashlo, Mortaza; Yee, Por Lip; Petković, Dalibor

    2016-01-01

    Highlights: • SVM-based thermodynamic modelling of a DI diesel engine working with diesel/biodiesel blends containing EPS. • Comparison of SVM-WT, SVM-FFA, SVM-RBF, SVM-QPSO, and ANN approaches for exergetic modelling of the engine. • Satisfactory performance of the SVM-WT for performance modelling of the engine over the other approaches. - Abstract: In the present study, four Support Vector Machine-based (SVM-based) approaches and the standard artificial neural network (ANN) model were designed and compared in modelling the exergetic parameters of a DI diesel engine running on diesel/biodiesel blends containing expanded polystyrene (EPS) wastes. For this aim, the SVM was coupled with discrete wavelet transform (SVM-WT), firefly algorithm (SVM-FFA), radial basis function (SVM-RBF) and quantum particle swarm optimization (SVM-QPSO). The exergetic data were computed using mass, energy, and exergy balance equations for the engine at different speeds and loads as well as various biodiesel and EPS wastes quantities. Three statistical indicators, namely root mean square error, coefficient of determination and Pearson coefficient, were used to assess the capability of the developed approaches for exergetic performance modelling of the DI diesel engine. The modelling results indicated that the SVM-WT approach was more efficient in exergetic modelling of the engine than the other three approaches. Moreover, the results obtained confirmed the effectiveness of the SVM-WT model in identifying the most exergy-efficient combustion conditions and the best fuel composition for achieving the most cost-effective and eco-friendly combustion process.
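
    The three statistical indicators named in this abstract can be computed as in the sketch below. The formulas are the usual textbook definitions, which the abstract does not spell out, and the observed and predicted values are made-up numbers used only to exercise the functions.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observations and model predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical observed vs. predicted values (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

if __name__ == "__main__":
    print(rmse(observed, predicted))
    print(r_squared(observed, predicted))
    print(pearson_r(observed, predicted))
```

    Reporting all three is useful because the coefficient of determination penalises systematic bias in the predictions, whereas the Pearson coefficient only measures linear association.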

  4. Support Vector Machine Based on Adaptive Acceleration Particle Swarm Optimization

    Science.gov (United States)

    Abdulameer, Mohammed Hasan; Othman, Zulaiha Ali

    2014-01-01

    Existing face recognition methods utilize particle swarm optimizer (PSO) and opposition based particle swarm optimizer (OPSO) to optimize the parameters of SVM. However, the utilization of random values in the velocity calculation decreases the performance of these techniques; that is, during the velocity computation, we normally use random values for the acceleration coefficients and this creates randomness in the solution. To address this problem, an adaptive acceleration particle swarm optimization (AAPSO) technique is proposed. To evaluate our proposed method, we employ both face and iris recognition based on AAPSO with SVM (AAPSO-SVM). In the face and iris recognition systems, performance is evaluated using two human face databases, YALE and CASIA, and the UBiris dataset. In this method, we initially perform feature extraction and then recognition on the extracted features. In the recognition process, the extracted features are used for SVM training and testing. During the training and testing, the SVM parameters are optimized with the AAPSO technique, and in AAPSO, the acceleration coefficients are computed using the particle fitness values. The parameters in SVM, which are optimized by AAPSO, perform efficiently for both face and iris recognition. A comparative analysis between our proposed AAPSO-SVM and the PSO-SVM technique is presented. PMID:24790584

  5. Support Vector Machine Based on Adaptive Acceleration Particle Swarm Optimization

    Directory of Open Access Journals (Sweden)

    Mohammed Hasan Abdulameer

    2014-01-01

    Existing face recognition methods utilize particle swarm optimizer (PSO) and opposition based particle swarm optimizer (OPSO) to optimize the parameters of SVM. However, the utilization of random values in the velocity calculation decreases the performance of these techniques; that is, during the velocity computation, we normally use random values for the acceleration coefficients and this creates randomness in the solution. To address this problem, an adaptive acceleration particle swarm optimization (AAPSO) technique is proposed. To evaluate our proposed method, we employ both face and iris recognition based on AAPSO with SVM (AAPSO-SVM). In the face and iris recognition systems, performance is evaluated using two human face databases, YALE and CASIA, and the UBiris dataset. In this method, we initially perform feature extraction and then recognition on the extracted features. In the recognition process, the extracted features are used for SVM training and testing. During the training and testing, the SVM parameters are optimized with the AAPSO technique, and in AAPSO, the acceleration coefficients are computed using the particle fitness values. The parameters in SVM, which are optimized by AAPSO, perform efficiently for both face and iris recognition. A comparative analysis between our proposed AAPSO-SVM and the PSO-SVM technique is presented.

  6. Fault Diagnosis of Rotating Machinery Based on Multisensor Information Fusion Using SVM and Time-Domain Features

    Directory of Open Access Journals (Sweden)

    Ling-li Jiang

    2014-01-01

    When multisensor information fusion is applied to fault diagnosis, the time-space scope and the quantity of information are expanded compared to what could be acquired by a single sensor, so the diagnostic object can be described more comprehensively. This paper presents a methodology of fault diagnosis in rotating machinery using multisensor information fusion, in which all the features are calculated from vibration data in the time domain to constitute a fused feature vector, and the support vector machine (SVM) is used for classification. The effectiveness of the presented methodology is tested by three case studies: diagnosis of a faulty gear, diagnosis of a rolling bearing, and identification of a rotor crack. For each case study, the sensitivities of the features are analyzed. The results indicate that the peak factor is the most sensitive of the twelve time-domain features for identifying gear defects, and that the mean, amplitude square, root mean square, root amplitude, and standard deviation are all comparatively sensitive for identifying gear, rolling bearing, and rotor crack defects.
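
    A few of the time-domain features named above, and their fusion across sensors, can be sketched as follows. The abstract does not give the formulas, so the definitions here are the ones commonly used in vibration analysis and are assumptions: peak factor as peak absolute value over RMS, and root amplitude as the squared mean of the square roots of absolute values. The sensor readings are invented.

```python
import math


def time_domain_features(signal):
    """Return a small dictionary of common time-domain vibration features."""
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)  # population form
    root_amplitude = (sum(math.sqrt(abs(x)) for x in signal) / n) ** 2
    peak = max(abs(x) for x in signal)
    peak_factor = peak / rms if rms > 0 else float("nan")
    return {
        "mean": mean,
        "rms": rms,
        "std": std,
        "root_amplitude": root_amplitude,
        "peak_factor": peak_factor,
    }


def fused_vector(sensor_signals):
    """Concatenate the features of every sensor into one vector for a classifier."""
    vector = []
    for signal in sensor_signals:
        feats = time_domain_features(signal)
        vector.extend(feats[name] for name in sorted(feats))
    return vector


# Hypothetical readings from two accelerometers on the same machine.
sensor_a = [0.0, 0.0, 0.0, 4.0]
sensor_b = [1.0, -1.0, 1.0, -1.0]

if __name__ == "__main__":
    print(time_domain_features(sensor_a))
    print(fused_vector([sensor_a, sensor_b]))
```

    An impulsive signal such as `sensor_a` yields a higher peak factor than the steady oscillation in `sensor_b`, which is the intuition behind using that feature for localized defects.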

  7. Fluoroscopic gating without implanted fiducial markers for lung cancer radiotherapy based on support vector machines

    International Nuclear Information System (INIS)

    Cui Ying; Dy, Jennifer G; Alexander, Brian; Jiang, Steve B

    2008-01-01

    Various problems with the current state-of-the-art techniques for gated radiotherapy have prevented this new treatment modality from being widely implemented in clinical routine. These problems are caused mainly by applying various external respiratory surrogates. There might be large uncertainties in deriving the tumor position from external respiratory surrogates. While tracking implanted fiducial markers has sufficient accuracy, this procedure may not be widely accepted due to the risk of pneumothorax. Previously, we have developed a technique to generate gating signals from fluoroscopic images without implanted fiducial markers using template matching methods (Berbeco et al 2005 Phys. Med. Biol. 50 4481-90, Cui et al 2007b Phys. Med. Biol. 52 741-55). In this note, our main contribution is to provide a totally different new view of the gating problem by recasting it as a classification problem. Then, we solve this classification problem by a well-studied powerful classification method called a support vector machine (SVM). Note that the goal of an automated gating tool is to decide when to turn the beam ON or OFF. We treat ON and OFF as the two classes in our classification problem. We create our labeled training data during the patient setup session by utilizing the reference gating signal, manually determined by a radiation oncologist. We then pre-process these labeled training images and build our SVM prediction model. During treatment delivery, fluoroscopic images are continuously acquired, pre-processed and sent as an input to the SVM. Finally, our SVM model will output the predicted labels as gating signals. We test the proposed technique on five sequences of fluoroscopic images from five lung cancer patients against the reference gating signal as ground truth. We compare the performance of the SVM to our previous template matching method (Cui et al 2007b Phys. Med. Biol. 52 741-55). We find that the SVM is slightly more accurate on average (1-3%) than

  8. A least square support vector machine-based approach for contingency classification and ranking in a large power system

    Directory of Open Access Journals (Sweden)

    Bhanu Pratap Soni

    2016-12-01

    This paper proposes an effective supervised learning approach for static security assessment of a large power system. The supervised learning approach employs a least square support vector machine (LS-SVM) to rank the contingencies and predict the system severity level. The severity of a contingency is measured by two scalar performance indices (PIs): the line MVA performance index (PIMVA) and the voltage-reactive power performance index (PIVQ). The SVM works in two steps. In Step I, both standard indices (PIMVA and PIVQ) are estimated under different operating scenarios; in Step II, contingency ranking is carried out based on the values of the PIs. The effectiveness of the proposed methodology is demonstrated on the IEEE 39-bus (New England) system. The approach can be a beneficial tool for fast and accurate security assessment and contingency analysis at an energy management center.

  9. Using In-Service and Coaching to Increase Teachers' Accurate Use of Research-Based Strategies

    Science.gov (United States)

    Kretlow, Allison G.; Cooke, Nancy L.; Wood, Charles L.

    2012-01-01

    Increasing the accurate use of research-based practices in classrooms is a critical issue. Professional development is one of the most practical ways to provide practicing teachers with training related to research-based practices. This study examined the effects of in-service plus follow-up coaching on first grade teachers' accurate delivery of…

  10. A Nonlinear Model for Gene-Based Gene-Environment Interaction

    Directory of Open Access Journals (Sweden)

    Jian Sa

    2016-06-01

    A vast amount of literature has confirmed the role of gene-environment (G×E) interaction in the etiology of complex human diseases. Traditional methods are predominantly focused on the analysis of interaction between a single nucleotide polymorphism (SNP) and an environmental variable. Given that genes are the functional units, it is crucial to understand how gene effects (rather than single SNP effects) are influenced by an environmental variable to affect disease risk. Motivated by the increasing awareness of the power of gene-based association analysis over the single variant-based approach, in this work we propose a sparse principal component regression (sPCR) model to understand the gene-based G×E interaction effect on complex disease. We first extracted the sparse principal components for SNPs in a gene; then the effect of each principal component was modeled by a varying-coefficient (VC) model. The model can jointly model variants in a gene whose effects are nonlinearly influenced by an environmental variable. In addition, the varying-coefficient sPCR (VC-sPCR) model is easy to interpret, since the sparsity of the principal component loadings indicates the relative importance of the corresponding SNPs in each component. We applied our method to a human birth weight dataset in a Thai population. We analyzed 12,005 genes across 22 chromosomes and found one significant interaction effect using the Bonferroni correction method and one suggestive interaction. The model performance was further evaluated through simulation studies. Our model provides a systems approach to evaluate gene-based G×E interaction.
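
    To make the Bonferroni step concrete, the sketch below computes the per-gene significance threshold for 12,005 tests. The family-wise level of 0.05 is an assumption (the abstract does not state it), and the p-values are invented placeholders, not results from the study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant(p_values, alpha, n_tests):
    """Return the names whose p-value falls below the corrected threshold."""
    cutoff = bonferroni_threshold(alpha, n_tests)
    return [name for name, p in p_values.items() if p < cutoff]


N_GENES = 12005   # number of genes tested, from the abstract
ALPHA = 0.05      # assumed family-wise error rate

# Hypothetical gene-level p-values, for illustration only.
example_p = {"gene_1": 1.0e-7, "gene_2": 2.0e-5, "gene_3": 0.3}

if __name__ == "__main__":
    print(bonferroni_threshold(ALPHA, N_GENES))  # roughly 4.2e-06
    print(significant(example_p, ALPHA, N_GENES))
```

    Under these assumptions a gene needs a p-value below about 4.2e-06 to be declared significant, which shows how stringent a genome-wide gene-based scan is.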

  11. Novel gene sets improve set-level classification of prokaryotic gene expression data.

    Science.gov (United States)

    Holec, Matěj; Kuželka, Ondřej; Železný, Filip

    2015-10-28

    Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets defined typically on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will enable learning more accurate classifiers. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.
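
    The conversion from gene-level to set-level features described in this abstract can be sketched as below. The aggregation function (the mean over member genes) is an assumption, since the abstract does not say how set features are computed, and the gene and set names are hypothetical.

```python
def set_level_features(expression, gene_sets):
    """Map a gene-level expression profile to one feature per gene set.

    expression: dict mapping gene name to expression value for one sample.
    gene_sets:  dict mapping set name to a list of member gene names.
    Sets with no measured member gene are skipped.
    """
    features = {}
    for set_name, genes in gene_sets.items():
        measured = [expression[g] for g in genes if g in expression]
        if measured:
            features[set_name] = sum(measured) / len(measured)
    return features


# Hypothetical sample and regulatory gene sets.
sample = {"geneA": 2.0, "geneB": 4.0, "geneC": 6.0}
sets = {
    "regulon_1": ["geneA", "geneB"],
    "regulon_2": ["geneC", "geneD"],   # geneD is not measured
    "regulon_3": ["geneE"],            # no measured members
}

if __name__ == "__main__":
    print(set_level_features(sample, sets))
```

    The output has one value per gene set, so a classifier trained on it sees far fewer features than one trained on individual genes.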

  12. A network-based gene expression signature informs prognosis and treatment for colorectal cancer patients.

    Directory of Open Access Journals (Sweden)

    Mingguang Shi

    Several studies have reported gene expression signatures that predict recurrence risk in stage II and III colorectal cancer (CRC) patients with minimal gene membership overlap and undefined biological relevance. The goal of this study was to investigate biological themes underlying these signatures, to infer genes of potential mechanistic importance to the CRC recurrence phenotype and to test whether accurate prognostic models can be developed using mechanistically important genes. We investigated eight published CRC gene expression signatures and found no functional convergence in Gene Ontology enrichment analysis. Using a random walk-based approach, we integrated these signatures and publicly available somatic mutation data on a protein-protein interaction network and inferred 487 genes that were plausible candidate molecular underpinnings for the CRC recurrence phenotype. We named the list of 487 genes a NEM signature because it integrated information from Network, Expression, and Mutation. The signature showed significant enrichment in four biological processes closely related to cancer pathophysiology and provided good coverage of known oncogenes, tumor suppressors, and CRC-related signaling pathways. A NEM signature-based Survival Support Vector Machine prognostic model was trained using a microarray gene expression dataset and tested on an independent dataset. The model-based scores showed a 75.7% concordance with the real survival data and separated patients into two groups with significantly different relapse-free survival (p = 0.002). Similar results were obtained with reversed training and testing datasets (p = 0.007). Furthermore, adjuvant chemotherapy was significantly associated with prolonged survival of the high-risk patients (p = 0.006), but not beneficial to the low-risk patients (p = 0.491). The NEM signature not only reflects CRC biology but also informs patient prognosis and treatment response. Thus, the network-based

  13. Predicting Outcomes of Hormone and Chemotherapy in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) Study by Biochemically-inspired Machine Learning [version 3; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    Eliseos J. Mucaki

    2017-05-01

    Genomic aberrations and gene expression-defined subtypes in the large METABRIC patient cohort have been used to stratify and predict survival. The present study used normalized gene expression signatures of paclitaxel drug response to predict outcome for different survival times in METABRIC patients receiving hormone (HT) and, in some cases, chemotherapy (CT) agents. This machine learning method, which distinguishes sensitivity vs. resistance in breast cancer cell lines and validates predictions in patients, was also used to derive gene signatures of other HT (tamoxifen) and CT agents (methotrexate, epirubicin, doxorubicin, and 5-fluorouracil) used in METABRIC. Paclitaxel gene signatures exhibited the best performance; however, the other agents also predicted survival with acceptable accuracies. A support vector machine (SVM) model of paclitaxel response containing genes ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, and TUBB4B was 78.6% accurate in predicting survival of 84 patients treated with both HT and CT (median survival ≥ 4.4 yr). Accuracy was lower (73.4%) in 304 untreated patients. The performance of other machine learning approaches was also evaluated at different survival thresholds. Minimum redundancy maximum relevance feature selection of a paclitaxel-based SVM classifier based on expression of genes BCL2L1, BBC3, FGF2, FN1, and TWIST1 was 81.1% accurate in 53 CT patients. In addition, a random forest (RF) classifier using a gene signature (ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, and TUBB4B) predicted >3-year survival with 85.5% accuracy in 420 HT patients. A similar RF gene signature showed 82.7% accuracy in 504 patients treated with CT and/or HT. These results suggest that tumor gene expression signatures refined by machine learning techniques can be useful for

  14. Predicting Outcomes of Hormone and Chemotherapy in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) Study by Biochemically-inspired Machine Learning.

    Science.gov (United States)

    Mucaki, Eliseos J; Baranova, Katherina; Pham, Huy Q; Rezaeian, Iman; Angelov, Dimo; Ngom, Alioune; Rueda, Luis; Rogan, Peter K

    2016-01-01

    Genomic aberrations and gene expression-defined subtypes in the large METABRIC patient cohort have been used to stratify and predict survival. The present study used normalized gene expression signatures of paclitaxel drug response to predict outcome for different survival times in METABRIC patients receiving hormone (HT) and, in some cases, chemotherapy (CT) agents. This machine learning method, which distinguishes sensitivity vs. resistance in breast cancer cell lines and validates predictions in patients, was also used to derive gene signatures of other HT (tamoxifen) and CT agents (methotrexate, epirubicin, doxorubicin, and 5-fluorouracil) used in METABRIC. Paclitaxel gene signatures exhibited the best performance; however, the other agents also predicted survival with acceptable accuracies. A support vector machine (SVM) model of paclitaxel response containing genes ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, and TUBB4B was 78.6% accurate in predicting survival of 84 patients treated with both HT and CT (median survival ≥ 4.4 yr). Accuracy was lower (73.4%) in 304 untreated patients. The performance of other machine learning approaches was also evaluated at different survival thresholds. Minimum redundancy maximum relevance feature selection of a paclitaxel-based SVM classifier based on expression of genes BCL2L1, BBC3, FGF2, FN1, and TWIST1 was 81.1% accurate in 53 CT patients. In addition, a random forest (RF) classifier using a gene signature (ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, and TUBB4B) predicted >3-year survival with 85.5% accuracy in 420 HT patients. A similar RF gene signature showed 82.7% accuracy in 504 patients treated with CT and/or HT. These results suggest that tumor gene expression signatures refined by machine learning techniques can be useful for

  15. Noninvasive extraction of fetal electrocardiogram based on Support Vector Machine

    Science.gov (United States)

    Fu, Yumei; Xiang, Shihan; Chen, Tianyi; Zhou, Ping; Huang, Weiyan

    2015-10-01

    The fetal electrocardiogram (FECG) signal has important clinical value for doctors in diagnosing fetal heart diseases and choosing suitable therapeutic schemes. The noninvasive extraction of the FECG from electrocardiogram (ECG) signals has therefore become an active research topic. A new method based on the Support Vector Machine (SVM) is utilized for the extraction of the FECG from a limited amount of data. Firstly, the theory of the SVM and the principle of SVM-based extraction are studied. Secondly, the transformation of the maternal electrocardiogram (MECG) component in the abdominal composite signal is verified to be nonlinear and is fitted with the SVM. Then, the SVM is trained, and the training results are compared with the real data to verify the effect of the training. Meanwhile, the parameters of the SVM are optimized to achieve the best performance so that the learning machine can be utilized to fit unknown samples. Finally, the FECG is extracted by removing the optimal estimate of the MECG component from the abdominal composite signal. In order to evaluate the performance of SVM-based FECG extraction, the Signal-to-Noise Ratio (SNR) and a visual test are used. The experimental results show that an FECG of good quality can be extracted: its SNR is significantly increased, reaching 9.2349 dB, and the time cost is significantly decreased, to as little as 0.802 seconds. Compared with the traditional method, the noninvasive extraction method based on the SVM is simpler to realize, takes less processing time and gives better extraction quality under the same conditions.
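
    The SNR figure of merit used above can be illustrated with a short sketch. The abstract does not define its SNR, so the definition here is an assumption: the power of a reference signal divided by the power of the residual between reference and estimate, expressed in decibels. The sample values are invented.

```python
import math


def snr_db(reference, estimate):
    """SNR in dB: reference power over the power of (reference - estimate)."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise_power == 0:
        return float("inf")
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical reference FECG samples and an extracted estimate.
reference = [1.0, 1.0, 1.0, 1.0]
estimate = [1.1, 0.9, 1.1, 0.9]

if __name__ == "__main__":
    print(snr_db(reference, estimate))
```

    Each 10 dB corresponds to a tenfold ratio of signal power to residual power, so the reported 9.2349 dB would correspond to a ratio of a little under ten under this definition.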

  16. A support vector machine and a random forest classifier indicates a 15-miRNA set related to osteosarcoma recurrence

    Directory of Open Access Journals (Sweden)

    He Y

    2018-01-01

    Yunfei He,1,2,* Jun Ma,1,* An Wang,1,3,* Weiheng Wang,1 Shengchang Luo,1 Yaoming Liu,2 Xiaojian Ye1 1Department of Orthopaedics, Changzheng Hospital Affiliated with Second Military Medical University, Shanghai, 2Department of Orthopaedics, Lanzhou General Hospital of Lanzhou Military Command Region, Lanzhou, 3Department of Orthopaedics, Shanghai Armed Police Force Hospital, Shanghai, People’s Republic of China *These authors contributed equally to this work Background: Osteosarcoma, which originates in the mesenchymal tissue, is the prevalent primary solid malignancy of the bone. It is of great importance to explore the mechanisms of metastasis and recurrence, which are two primary reasons accounting for the high death rate in osteosarcoma. Data and methods: Three miRNA expression profiles related to osteosarcoma were downloaded from GEO DataSets. Differentially expressed miRNAs (DEmiRs) were screened using MetaDE.ES of the MetaDE package. A support vector machine (SVM) classifier was constructed using optimal miRNAs, and its prediction efficiency for recurrence was detected in independent datasets. Finally, a co-expression network was constructed based on the DEmiRs and their target genes. Results: In total, 78 significantly DEmiRs were screened. The SVM classifier constructed by 15 miRNAs could accurately classify 58 of 65 samples (89.2%) in the GSE39040 database, which was validated in another two databases, GSE39052 (84.62%, 22/26) and GSE79181 (91.3%, 21/23). Cox regression showed that four miRNAs, including hsa-miR-10b, hsa-miR-1227, hsa-miR-146b-3p, and hsa-miR-873, significantly correlated with tumor recurrence time. There were 137, 147, 145, and 77 target genes of the above four miRNAs, respectively, which were assigned to 17 gene ontology functionally annotated terms and 14 Kyoto Encyclopedia of Genes and Genomes pathways. Among them, the “Osteoclast differentiation” pathway contained a total of seven target genes and was

  17. Comparison of lists of genes based on functional profiles

    Directory of Open Access Journals (Sweden)

    Salicrú Miquel

    2011-10-01

    Abstract Background: How to compare studies on the basis of their biological significance is a problem of central importance in high-throughput genomics. Many methods for performing such comparisons are based on the information in databases of functional annotation, such as those that form the Gene Ontology (GO). Typically, they consist of analyzing gene annotation frequencies in some pre-specified GO classes, in a class-by-class way, followed by p-value adjustment for multiple testing. Enrichment analysis, where a list of genes is compared against a wider universe of genes, is the most common example. Results: A new global testing procedure and a method incorporating it are presented. Instead of testing separately for each GO class, a single global test for all classes under consideration is performed. The test is based on the distance between the functional profiles, defined as the joint frequencies of annotation in a given set of GO classes. These classes may be chosen at one or more GO levels. The new global test is more powerful and accurate with respect to type I errors than the usual class-by-class approach. When applied to some real datasets, the results suggest that the method may also provide useful information that complements the tests performed using a class-by-class approach if gene counts are sparse in some classes. An R library, goProfiles, implements these methods and is available from Bioconductor, http://bioconductor.org/packages/release/bioc/html/goProfiles.html. Conclusions: The method provides an inferential basis for deciding whether two lists are functionally different. For global comparisons it is preferable to the global chi-square test of homogeneity. Furthermore, it may provide additional information if used in conjunction with class-by-class methods.
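
    The idea of comparing two gene lists through their functional profiles can be sketched in a few lines. This is not the goProfiles implementation: the choice of squared Euclidean distance is an assumption (the abstract only says the test is based on a distance between profiles), the inferential step is omitted, and the gene names and GO class labels are placeholders.

```python
def functional_profile(annotations, go_classes):
    """Relative frequency of genes annotated in each GO class.

    annotations: dict mapping gene name to the set of GO classes it belongs to.
    go_classes:  ordered list of GO classes defining the profile.
    """
    n = len(annotations)
    if n == 0:
        raise ValueError("gene list must not be empty")
    return [
        sum(1 for terms in annotations.values() if c in terms) / n
        for c in go_classes
    ]


def squared_euclidean(p, q):
    """Squared Euclidean distance between two profiles of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


# Placeholder GO classes and two hypothetical annotated gene lists.
classes = ["GO:A", "GO:B"]
list_1 = {"g1": {"GO:A", "GO:B"}, "g2": {"GO:A"}}
list_2 = {"g3": {"GO:B"}, "g4": {"GO:B"}}

if __name__ == "__main__":
    p1 = functional_profile(list_1, classes)
    p2 = functional_profile(list_2, classes)
    print(p1, p2, squared_euclidean(p1, p2))
```

    A single distance over all classes is what allows one global test in place of one test per GO class followed by multiple-testing adjustment.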

  18. Reverse transcription-quantitative polymerase chain reaction: description of a RIN-based algorithm for accurate data normalization

    Directory of Open Access Journals (Sweden)

    Boissière-Michot Florence

    2009-04-01

    Full Text Available Abstract Background: Reverse transcription-quantitative polymerase chain reaction (RT-qPCR) is the gold standard technique for mRNA quantification, but appropriate normalization is required to obtain reliable data. Normalization to accurately quantitated RNA has been proposed as the most reliable method for in vivo biopsies. However, this approach does not correct for differences in RNA integrity. Results: In this study, we evaluated the effect of RNA degradation on the quantification of the relative expression of nine genes (18S, ACTB, ATUB, B2M, GAPDH, HPRT, POLR2L, PSMB6 and RPLP0) that cover a wide expression spectrum. Our results show that RNA degradation could introduce up to 100% error in gene expression measurements when RT-qPCR data were normalized to total RNA. To achieve greater resolution of small differences in transcript levels in degraded samples, we improved this normalization method by developing a corrective algorithm that compensates for the loss of RNA integrity. This approach allowed us to achieve higher accuracy, since the average error for quantitative measurements was reduced to 8%. Finally, we applied our normalization strategy to the quantification of EGFR, HER2 and HER3 in 104 rectal cancer biopsies. Taken together, our data show that normalizing gene expression measurements in a way that also takes RNA degradation into account allows much more reliable sample comparison. Conclusion: We developed a new normalization method of RT-qPCR data that compensates for loss of RNA integrity and therefore allows accurate gene expression quantification in human biopsies.

  19. Pre-cancer risk assessment in habitual smokers from DIC images of oral exfoliative cells using active contour and SVM analysis.

    Science.gov (United States)

    Dey, Susmita; Sarkar, Ripon; Chatterjee, Kabita; Datta, Pallab; Barui, Ananya; Maity, Santi P

    2017-04-01

    Habitual smokers are known to be at higher risk for developing oral cancer, which is increasing at an alarming rate globally. Conventionally, oral cancer is associated with high mortality rates, although recent reports show improved survival outcomes with early diagnosis of the disease. An effective prediction system that can identify the probability of cancer development among habitual smokers is thus expected to benefit a sizable population. The present work describes a non-invasive, integrated method for early detection of cellular abnormalities based on analysis of different cyto-morphological features of exfoliative oral epithelial cells. Differential interference contrast (DIC) microscopy provides a potential optical tool, as this mode provides a pseudo three-dimensional (3-D) image with detailed morphological and textural features obtained from noninvasive, label-free epithelial cells. For segmentation of DIC images, a gradient vector flow snake-model active contour process has been adopted. To evaluate cellular abnormalities among habitual smokers, the selected morphological and textural features of epithelial cells are compared with those of the non-smoker group (-ve control group) and clinically diagnosed pre-cancer patients (+ve control group) using a support vector machine (SVM) classifier. Accuracy of the developed SVM-based classification has been found to be 86% with 80% sensitivity and 89% specificity in classifying the features from the volunteers having a smoking habit. Copyright © 2017 Elsevier Ltd. All rights reserved.
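
    For readers unfamiliar with how accuracy, sensitivity, and specificity relate, the sketch below computes all three from the four cells of a binary confusion matrix. The counts are hypothetical and were chosen here only so that the resulting figures match the percentages quoted in the abstract; the study's actual confusion matrix is not given in this record.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts (NOT from the paper): 100 positives, 200 negatives.
acc, sens, spec = classification_metrics(tp=80, fn=20, tn=178, fp=22)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

    Accuracy is a prevalence-weighted mix of sensitivity and specificity, so it always lies between the two; which end it sits closer to depends on the class balance.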

  20. MO-DE-207B-05: Predicting Gene Mutations in Renal Cell Carcinoma Based On CT Imaging Features: Validation Using TCGA-TCIA Datasets

    Energy Technology Data Exchange (ETDEWEB)

    Chen, X; Zhou, Z; Thomas, K; Wang, J [UT Southwestern Medical Center, Dallas, TX (United States)

    2016-06-15

    Purpose: The goal of this work is to investigate the use of contrast enhanced computed tomographic (CT) features for the prediction of mutations of BAP1, PBRM1, and VHL genes in renal cell carcinoma (RCC). Methods: For this study, we used two patient databases with renal cell carcinoma (RCC). The first one consisted of 33 patients from our institution (UT Southwestern Medical Center, UTSW). The second one consisted of 24 patients from the Cancer Imaging Archive (TCIA), where each patient is connected by a unique identifier to the tissue samples from the Cancer Genome Atlas (TCGA). From the contrast enhanced CT image of each patient, tumor contour was first delineated by a physician. Geometry, intensity, and texture features were extracted from the delineated tumor. Based on the UTSW dataset, we completed feature selection and trained a support vector machine (SVM) classifier to predict mutations of BAP1, PBRM1 and VHL genes. We then used the TCIA-TCGA dataset to validate the predictive model built upon the UTSW dataset. Results: The prediction accuracy of gene expression of TCIA-TCGA patients was 0.83 (20 of 24), 0.83 (20 of 24), and 0.75 (18 of 24) for BAP1, PBRM1, and VHL respectively. For BAP1 gene, texture feature was the most prominent feature type. For PBRM1 gene, intensity feature was the most prominent. For VHL gene, geometry, intensity, and texture features were all important. Conclusion: Using our feature selection strategy and models, we achieved predictive accuracy over 0.75 for all three genes under the condition of using patient data from one institution for training and data from other institutions for testing. These results suggest that radiogenomics can be used to aid in prognosis and used as convenient surrogates for expensive and time consuming gene assay procedures.

  1. Supervised classification of combined copy number and gene expression data

    Directory of Open Access Journals (Sweden)

    Riccadonna S.

    2007-12-01

    Full Text Available In this paper we apply a predictive profiling method to genome copy number aberrations (CNA) in combination with gene expression and clinical data to identify molecular patterns of cancer pathophysiology. Predictive models and optimal feature lists for the platforms are developed by a complete validation SVM-based machine learning system. Ranked lists of genome CNA sites (assessed by comparative genomic hybridization arrays – aCGH) and of differentially expressed genes (assessed by microarray profiling with Affy HG-U133A chips) are computed and combined on a breast cancer dataset for the discrimination of Luminal/ER+ (Lum/ER+) and Basal-like/ER- classes. Different encodings are developed and applied to the CNA data, and predictive variable selection is discussed. We analyze the combination of profiling information between the platforms, also considering the pathophysiological data. A specific subset of patients is identified that has a different response to classification by chromosomal gains and losses and by differentially expressed genes, corroborating the idea that genomic CNA can represent an independent source for tumor classification.

  2. A Support Vector Machine-Based Gender Identification Using Speech Signal

    Science.gov (United States)

    Lee, Kye-Hwan; Kang, Sang-Ick; Kim, Deok-Hwan; Chang, Joon-Hyuk

    We propose an effective voice-based gender identification method using a support vector machine (SVM). The SVM is a binary classification algorithm that classifies two groups by finding an arbitrary nonlinear boundary in a feature space and is known to yield high classification performance. In the present work, we compare the identification performance of the SVM with that of a Gaussian mixture model (GMM)-based method using the mel frequency cepstral coefficients (MFCC). A novel approach of incorporating a feature fusion scheme based on a combination of the MFCC and the fundamental frequency is proposed with the aim of improving the performance of gender identification. Experimental results demonstrate that the gender identification performance using the SVM is significantly better than that of the GMM-based scheme. Moreover, the performance is substantially improved when the proposed feature fusion technique is applied.

  3. Hi-Plex for Simple, Accurate, and Cost-Effective Amplicon-based Targeted DNA Sequencing.

    Science.gov (United States)

    Pope, Bernard J; Hammet, Fleur; Nguyen-Dumont, Tu; Park, Daniel J

    2018-01-01

    Hi-Plex is a suite of methods to enable simple, accurate, and cost-effective highly multiplex PCR-based targeted sequencing (Nguyen-Dumont et al., Biotechniques 58:33-36, 2015). At its core is the principle of using gene-specific primers (GSPs) to "seed" (or target) the reaction and universal primers to "drive" the majority of the reaction. In this manner, effects on amplification efficiencies across the target amplicons can, to a large extent, be restricted to early seeding cycles. Product sizes are defined within a relatively narrow range to enable high-specificity size selection, replication uniformity across target sites (including in the context of fragmented input DNA such as that derived from fixed tumor specimens (Nguyen-Dumont et al., Biotechniques 55:69-74, 2013; Nguyen-Dumont et al., Anal Biochem 470:48-51, 2015)), and application of high-specificity genetic variant calling algorithms (Pope et al., Source Code Biol Med 9:3, 2014; Park et al., BMC Bioinformatics 17:165, 2016). Hi-Plex offers a streamlined workflow that is suitable for testing large numbers of specimens without the need for automation.

  4. Estimation of Anti-HIV Activity of HEPT Analogues Using MLR, ANN, and SVM Techniques

    Directory of Open Access Journals (Sweden)

    Basheerulla Shaik

    2013-01-01

    value than those of MLR and SVM techniques. Rm2 metrics and ridge regression analysis indicated that the proposed four-variable model, with MATS5e, RDF080u, T(O⋯O), and MATS5m as correlating descriptors, is the best for estimating the anti-HIV activity (log 1/C) of the present set of compounds.

  5. Sentiment analysis: a comparison of deep learning neural network algorithm with SVM and naïve Bayes for Indonesian text

    Science.gov (United States)

    Calvin Frans Mariel, Wahyu; Mariyah, Siti; Pramana, Setia

    2018-03-01

    Deep learning is a new era of machine learning techniques that essentially imitate the structure and function of the human brain. It is a development of the Artificial Neural Network (ANN) into deeper networks that use more than one hidden layer. Deep Learning Neural Network is highly capable of recognizing patterns in various data types such as pictures, audio, text, and many more. In this paper, the authors try to measure that algorithm’s ability by applying it to text classification. The classification task is done by considering the sentiment content of a text, which is also called sentiment analysis. By using several combinations of text preprocessing and feature extraction techniques, we aim to compare the modelling results of Deep Learning Neural Network with those of the other two commonly used algorithms, Naïve Bayes and Support Vector Machine (SVM). This algorithm comparison uses Indonesian text data with balanced and unbalanced sentiment composition. Based on the experimental simulation, Deep Learning Neural Network clearly outperforms Naïve Bayes and SVM and offers a better F-1 Score, while the feature extraction technique that most improves the modelling result is Bigram.
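
    The two technical ingredients named in the abstract, bigram feature extraction and the F-1 score, can be sketched in a few lines. This is a minimal illustration only: the whitespace tokenizer, the function names, the sample sentence, and the label vectors are all assumptions made here and do not come from the study.

```python
from collections import Counter


def bigrams(text):
    """Split on whitespace and return adjacent token pairs (simplified tokenizer)."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def f1_score(y_true, y_pred, positive):
    """F-1 score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative Indonesian sentence ("the film is really very good").
features = Counter(bigrams("Filmnya sangat bagus sekali"))

# Made-up labels and predictions for five documents.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]
score = f1_score(y_true, y_pred, positive="pos")
print(features, score)
```

    The F-1 score is often preferred over plain accuracy on unbalanced sentiment data, because it is not inflated by a classifier that mostly predicts the majority class.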

  6. A rapid method of accurate detection and differentiation of Newcastle disease virus pathotypes by demonstrating multiple bands in degenerate primer based nested RT-PCR.

    Science.gov (United States)

    Desingu, P A; Singh, S D; Dhama, K; Kumar, O R Vinodh; Singh, R; Singh, R K

    2015-02-01

    A rapid and accurate method of detection and differentiation of virulent and avirulent Newcastle disease virus (NDV) pathotypes was developed. The NDV detection was carried out for different domestic avian field isolates and pigeon paramyxovirus-1 (25 field isolates and 9 vaccine strains) by using reverse transcriptase PCR (RT-PCR) based on the APMV-I "fusion" (F) gene Class II specific external primers A and B (535bp) and internal primers C and D (238bp). The internal degenerate reverse primer D is specific for the F gene cleavage position of virulent strains of NDV. The nested RT-PCR products of avirulent strains showed two bands (535bp and 424bp) while virulent strains showed four bands (535bp, 424bp, 349bp and 238bp) on agar gel electrophoresis. This is the first report regarding development and use of degenerate primer based nested RT-PCR for accurate detection and differentiation of NDV pathotypes by demonstrating multiple PCR band patterns. Being a rapid, simple, and economical test, the developed method could serve as a valuable alternative diagnostic tool for characterizing NDV isolates and carrying out molecular epidemiological surveillance studies for this important pathogen of poultry. Copyright © 2014 Elsevier B.V. All rights reserved.
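
    The band-pattern rule stated in the abstract (two bands for avirulent strains, four for virulent strains) can be written down as a small lookup. The sketch below is only an illustration of that rule: the function name, the size tolerance, and the "indeterminate" fallback are assumptions introduced here, not part of the published protocol.

```python
# Expected band sizes (bp) as reported in the abstract.
AVIRULENT_BANDS = frozenset({535, 424})
VIRULENT_BANDS = frozenset({535, 424, 349, 238})


def classify_pathotype(observed_bp, tolerance_bp=10):
    """Map observed band sizes to a pathotype label.

    Each observed band is snapped to the nearest expected size if it lies
    within `tolerance_bp` (a hypothetical gel-reading tolerance); bands that
    match no expected size are ignored.
    """
    matched = set()
    for band in observed_bp:
        nearest = min(VIRULENT_BANDS, key=lambda size: abs(size - band))
        if abs(nearest - band) <= tolerance_bp:
            matched.add(nearest)
    if matched == VIRULENT_BANDS:
        return "virulent"
    if matched == AVIRULENT_BANDS:
        return "avirulent"
    return "indeterminate"


print(classify_pathotype([535, 424]))            # two-band pattern
print(classify_pathotype([530, 420, 350, 240]))  # four-band pattern, sizes read off a gel
```

    Any pattern other than the two described in the abstract is reported as indeterminate rather than forced into a class, since the abstract does not say how partial patterns should be read.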

  7. Accelerometer and Camera-Based Strategy for Improved Human Fall Detection

    KAUST Repository

    Zerrouki, Nabil; Harrou, Fouzi; Sun, Ying; Houacine, Amrane

    2016-01-01

    In this paper, we address the problem of detecting human falls using anomaly detection. Detection and classification of falls are based on accelerometric data and variations in human silhouette shape. First, we use the exponentially weighted moving average (EWMA) monitoring scheme to detect a potential fall in the accelerometric data. We used an EWMA to identify features that correspond with a particular type of fall, allowing us to classify falls. Only features corresponding with detected falls were used in the classification phase. A benefit of using a subset of the original data to design classification models is that it minimizes training time and simplifies the models. Based on features corresponding to detected falls, we used the support vector machine (SVM) algorithm to distinguish between true falls and fall-like events. We applied this strategy to the publicly available fall detection databases from the University of Rzeszow. Results indicated that our strategy accurately detected and classified fall events, suggesting its potential application to early alert mechanisms in the event of fall situations and its capability for classification of detected falls. Comparison of the classification results using the EWMA-based SVM classifier method with those achieved using three commonly used machine learning classifiers, neural network, K-nearest neighbor and naïve Bayes, proved our model superior.
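
    The EWMA monitoring step can be illustrated with a textbook control-chart sketch: smooth the signal, then flag samples whose smoothed value leaves a band around the fall-free baseline. This is a generic EWMA chart under stated assumptions, not the authors' implementation; the smoothing weight `lam`, the limit width `width`, the asymptotic control limit, and the toy acceleration values are all choices made here for illustration.

```python
from statistics import mean, stdev


def ewma_alarms(signal, baseline, lam=0.2, width=3.0):
    """Return indices where the EWMA statistic leaves its control limits.

    `baseline` is fall-free data used to estimate the in-control mean and
    standard deviation; `lam` and `width` are hypothetical tuning values.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    # Asymptotic EWMA control limit: width * sigma * sqrt(lam / (2 - lam)).
    limit = width * sigma * (lam / (2 - lam)) ** 0.5
    z = mu  # start the statistic at the in-control mean
    alarms = []
    for t, x in enumerate(signal):
        z = lam * x + (1 - lam) * z
        if abs(z - mu) > limit:
            alarms.append(t)
    return alarms


# Toy acceleration magnitudes (in g), invented for illustration.
baseline = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0]
signal = [1.0, 1.0, 1.0, 3.0, 3.0]
alarms = ewma_alarms(signal, baseline)
print(alarms)
```

    In the scheme the abstract describes, only the segments flagged at this stage would be passed on to the SVM, which is why the classifier sees a smaller training set.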

  8. Accelerometer and Camera-Based Strategy for Improved Human Fall Detection

    KAUST Repository

    Zerrouki, Nabil

    2016-10-29

    In this paper, we address the problem of detecting human falls using anomaly detection. Detection and classification of falls are based on accelerometric data and variations in human silhouette shape. First, we use the exponentially weighted moving average (EWMA) monitoring scheme to detect a potential fall in the accelerometric data. We used an EWMA to identify features that correspond with a particular type of fall, allowing us to classify falls. Only features corresponding with detected falls were used in the classification phase. A benefit of using a subset of the original data to design classification models is that it minimizes training time and simplifies the models. Based on features corresponding to detected falls, we used the support vector machine (SVM) algorithm to distinguish between true falls and fall-like events. We applied this strategy to the publicly available fall detection databases from the University of Rzeszow. Results indicated that our strategy accurately detected and classified fall events, suggesting its potential application to early alert mechanisms in the event of fall situations and its capability for classification of detected falls. Comparison of the classification results using the EWMA-based SVM classifier method with those achieved using three commonly used machine learning classifiers, neural network, K-nearest neighbor and naïve Bayes, proved our model superior.

  9. DNA regulatory motif selection based on support vector machine ...

    African Journals Online (AJOL)

    ... machine (SVM) and its application in microarray experiment of Kashin-Beck disease. ... speed and amount of the corresponding mRNA in gene replication process. ... and revealed that some motifs may be related to the immune reactions.

  10. Indexed variation graphs for efficient and accurate resistome profiling.

    Science.gov (United States)

    Rowe, Will P M; Winn, Martyn D

    2018-05-14

    Antimicrobial resistance remains a major threat to global health. Profiling the collective antimicrobial resistance genes within a metagenome (the "resistome") facilitates greater understanding of antimicrobial resistance gene diversity and dynamics. In turn, this can allow for gene surveillance, individualised treatment of bacterial infections and more sustainable use of antimicrobials. However, resistome profiling can be complicated by high similarity between reference genes, as well as the sheer volume of sequencing data and the complexity of analysis workflows. We have developed an efficient and accurate method for resistome profiling that addresses these complications and improves upon currently available tools. Our method combines a variation graph representation of gene sets with an LSH Forest indexing scheme to allow for fast classification of metagenomic sequence reads using similarity-search queries. Subsequent hierarchical local alignment of classified reads against graph traversals enables accurate reconstruction of full-length gene sequences using a scoring scheme. We provide our implementation, GROOT, and show it to be both faster and more accurate than a current reference-dependent tool for resistome profiling. GROOT runs on a laptop and can process a typical 2 gigabyte metagenome in 2 minutes using a single CPU. Our method is not restricted to resistom