WorldWideScience

Sample records for vector machine regression

  1. Phase Space Prediction of Chaotic Time Series with Nu-Support Vector Machine Regression

    International Nuclear Information System (INIS)

    Ye Meiying; Wang Xiaodong

    2005-01-01

    A new class of support vector machine, nu-support vector machine, is discussed which can handle both classification and regression. We focus on nu-support vector machine regression and use it for phase space prediction of chaotic time series. The effectiveness of the method is demonstrated by applying it to the Henon map. This study also compares nu-support vector machine with back propagation (BP) networks in order to better evaluate the performance of the proposed methods. The experimental results show that the nu-support vector machine regression obtains lower root mean squared error than the BP networks and provides an accurate chaotic time series prediction. These results can be attributable to the fact that nu-support vector machine implements the structural risk minimization principle and this leads to better generalization than the BP networks.

  2. Deep Support Vector Machines for Regression Problems

    NARCIS (Netherlands)

    Wiering, Marco; Schutten, Marten; Millea, Adrian; Meijster, Arnold; Schomaker, Lambertus

    2013-01-01

    In this paper we describe a novel extension of the support vector machine, called the deep support vector machine (DSVM). The original SVM has a single layer with kernel functions and is therefore a shallow model. The DSVM can use an arbitrary number of layers, in which lower-level layers contain

  3. Prediction of protein binding sites using physical and chemical descriptors and the support vector machine regression method

    International Nuclear Information System (INIS)

    Sun Zhong-Hua; Jiang Fan

    2010-01-01

    In this paper a new continuous variable called core-ratio is defined to describe the probability for a residue to be in a binding site, thereby replacing the previous binary description of the interface residue using 0 and 1. So we can use the support vector machine regression method to fit the core-ratio value and predict the protein binding sites. We also design a new group of physical and chemical descriptors to characterize the binding sites. The new descriptors are more effective, with an averaging procedure used. Our test shows that much better prediction results can be obtained by the support vector regression (SVR) method than by the support vector classification method. (rapid communication)

  4. Distributed collaborative probabilistic design for turbine blade-tip radial running clearance using support vector machine of regression

    Science.gov (United States)

    Fei, Cheng-Wei; Bai, Guang-Chen

    2014-12-01

    To improve the computational precision and efficiency of probabilistic design for mechanical dynamic assembly like the blade-tip radial running clearance (BTRRC) of gas turbine, a distribution collaborative probabilistic design method-based support vector machine of regression (SR)(called as DCSRM) is proposed by integrating distribution collaborative response surface method and support vector machine regression model. The mathematical model of DCSRM is established and the probabilistic design idea of DCSRM is introduced. The dynamic assembly probabilistic design of aeroengine high-pressure turbine (HPT) BTRRC is accomplished to verify the proposed DCSRM. The analysis results reveal that the optimal static blade-tip clearance of HPT is gained for designing BTRRC, and improving the performance and reliability of aeroengine. The comparison of methods shows that the DCSRM has high computational accuracy and high computational efficiency in BTRRC probabilistic analysis. The present research offers an effective way for the reliability design of mechanical dynamic assembly and enriches mechanical reliability theory and method.

  5. Twin support vector machines models, extensions and applications

    CERN Document Server

    Jayadeva; Chandra, Suresh

    2017-01-01

    This book provides a systematic and focused study of the various aspects of twin support vector machines (TWSVM) and related developments for classification and regression. In addition to presenting most of the basic models of TWSVM and twin support vector regression (TWSVR) available in the literature, it also discusses the important and challenging applications of this new machine learning methodology. A chapter on “Additional Topics” has been included to discuss kernel optimization and support tensor machine topics, which are comparatively new but have great potential in applications. It is primarily written for graduate students and researchers in the area of machine learning and related topics in computer science, mathematics, electrical engineering, management science and finance.

  6. A fuzzy regression with support vector machine approach to the estimation of horizontal global solar radiation

    International Nuclear Information System (INIS)

    Baser, Furkan; Demirhan, Haydar

    2017-01-01

    Accurate estimation of the amount of horizontal global solar radiation for a particular field is an important input for decision processes in solar radiation investments. In this article, we focus on the estimation of yearly mean daily horizontal global solar radiation by using an approach that utilizes fuzzy regression functions with support vector machine (FRF-SVM). This approach is not seriously affected by outlier observations and does not suffer from the over-fitting problem. To demonstrate the utility of the FRF-SVM approach in the estimation of horizontal global solar radiation, we conduct an empirical study over a dataset collected in Turkey and applied the FRF-SVM approach with several kernel functions. Then, we compare the estimation accuracy of the FRF-SVM approach to an adaptive neuro-fuzzy system and a coplot supported-genetic programming approach. We observe that the FRF-SVM approach with a Gaussian kernel function is not affected by both outliers and over-fitting problem and gives the most accurate estimates of horizontal global solar radiation among the applied approaches. Consequently, the use of hybrid fuzzy functions and support vector machine approaches is found beneficial in long-term forecasting of horizontal global solar radiation over a region with complex climatic and terrestrial characteristics. - Highlights: • A fuzzy regression functions with support vector machines approach is proposed. • The approach is robust against outlier observations and over-fitting problem. • Estimation accuracy of the model is superior to several existent alternatives. • A new solar radiation estimation model is proposed for the region of Turkey. • The model is useful under complex terrestrial and climatic conditions.

  7. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    Science.gov (United States)

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%.

  8. Vector regression introduced

    Directory of Open Access Journals (Sweden)

    Mok Tik

    2014-06-01

    Full Text Available This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as, polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to rep- resent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.

  9. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    Directory of Open Access Journals (Sweden)

    Suduan Chen

    2014-01-01

    Full Text Available As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%.

  10. Failure and reliability prediction by support vector machines regression of time series data

    International Nuclear Information System (INIS)

    Chagas Moura, Marcio das; Zio, Enrico; Lins, Isis Didier; Droguett, Enrique

    2011-01-01

    Support Vector Machines (SVMs) are kernel-based learning methods, which have been successfully adopted for regression problems. However, their use in reliability applications has not been widely explored. In this paper, a comparative analysis is presented in order to evaluate the SVM effectiveness in forecasting time-to-failure and reliability of engineered components based on time series data. The performance on literature case studies of SVM regression is measured against other advanced learning methods such as the Radial Basis Function, the traditional MultiLayer Perceptron model, Box-Jenkins autoregressive-integrated-moving average and the Infinite Impulse Response Locally Recurrent Neural Networks. The comparison shows that in the analyzed cases, SVM outperforms or is comparable to other techniques. - Highlights: → Realistic modeling of reliability demands complex mathematical formulations. → SVM is proper when the relation input/output is unknown or very costly to be obtained. → Results indicate the potential of SVM for reliability time series prediction. → Reliability estimates support the establishment of adequate maintenance strategies.

  11. Support vector machines optimization based theory, algorithms, and extensions

    CERN Document Server

    Deng, Naiyang; Zhang, Chunhua

    2013-01-01

    Support Vector Machines: Optimization Based Theory, Algorithms, and Extensions presents an accessible treatment of the two main components of support vector machines (SVMs)-classification problems and regression problems. The book emphasizes the close connection between optimization theory and SVMs since optimization is one of the pillars on which SVMs are built.The authors share insight on many of their research achievements. They give a precise interpretation of statistical leaning theory for C-support vector classification. They also discuss regularized twi

  12. Regression model of support vector machines for least squares prediction of crystallinity of cracking catalysts by infrared spectroscopy

    International Nuclear Information System (INIS)

    Comesanna Garcia, Yumirka; Dago Morales, Angel; Talavera Bustamante, Isneri

    2010-01-01

    The recently introduction of the least squares support vector machines method for regression purposes in the field of Chemometrics has provided several advantages to linear and nonlinear multivariate calibration methods. The objective of the paper was to propose the use of the least squares support vector machine as an alternative multivariate calibration method for the prediction of the percentage of crystallinity of fluidized catalytic cracking catalysts, by means of Fourier transform mid-infrared spectroscopy. A linear kernel was used in the calculations of the regression model. The optimization of its gamma parameter was carried out using the leave-one-out cross-validation procedure. The root mean square error of prediction was used to measure the performance of the model. The accuracy of the results obtained with the application of the method is in accordance with the uncertainty of the X-ray powder diffraction reference method. To compare the generalization capability of the developed method, a comparison study was carried out, taking into account the results achieved with the new model and those reached through the application of linear calibration methods. The developed method can be easily implemented in refinery laboratories

  13. A novel improved fuzzy support vector machine based stock price trend forecast model

    OpenAIRE

    Wang, Shuheng; Li, Guohao; Bao, Yifan

    2018-01-01

    Application of fuzzy support vector machine in stock price forecast. Support vector machine is a new type of machine learning method proposed in 1990s. It can deal with classification and regression problems very successfully. Due to the excellent learning performance of support vector machine, the technology has become a hot research topic in the field of machine learning, and it has been successfully applied in many fields. However, as a new technology, there are many limitations to support...

  14. Support vector machine for diagnosis cancer disease: A comparative study

    Directory of Open Access Journals (Sweden)

    Nasser H. Sweilam

    2010-12-01

    Full Text Available Support vector machine has become an increasingly popular tool for machine learning tasks involving classification, regression or novelty detection. Training a support vector machine requires the solution of a very large quadratic programming problem. Traditional optimization methods cannot be directly applied due to memory restrictions. Up to now, several approaches exist for circumventing the above shortcomings and work well. Another learning algorithm, particle swarm optimization, Quantum-behave Particle Swarm for training SVM is introduced. Another approach named least square support vector machine (LSSVM and active set strategy are introduced. The obtained results by these methods are tested on a breast cancer dataset and compared with the exact solution model problem.

  15. QSAR models for prediction study of HIV protease inhibitors using support vector machines, neural networks and multiple linear regression

    Directory of Open Access Journals (Sweden)

    Rachid Darnag

    2017-02-01

    Full Text Available Support vector machines (SVM represent one of the most promising Machine Learning (ML tools that can be applied to develop a predictive quantitative structure–activity relationship (QSAR models using molecular descriptors. Multiple linear regression (MLR and artificial neural networks (ANNs were also utilized to construct quantitative linear and non linear models to compare with the results obtained by SVM. The prediction results are in good agreement with the experimental value of HIV activity; also, the results reveal the superiority of the SVM over MLR and ANN model. The contribution of each descriptor to the structure–activity relationships was evaluated.

  16. Estimating transmitted waves of floating breakwater using support vector regression model

    Digital Repository Service at National Institute of Oceanography (India)

    Mandal, S.; Hegde, A.V.; Kumar, V.; Patil, S.G.

    is first mapped onto an m-dimensional feature space using some fixed (nonlinear) mapping, and then a linear model is constructed in this feature space (Ivanciuc Ovidiu 2007). Using mathematical notation, the linear model in the feature space f(x, w... regressive vector machines, Ocean Engineering Journal, Vol – 36, pp 339 – 347, 2009. 3. Ivanciuc Ovidiu, Applications of support vector machines in chemistry, Review in Computational Chemistry, Eds K. B. Lipkouitz and T. R. Cundari, Vol – 23...

  17. Least Square Support Vector Machine Classifier vs a Logistic Regression Classifier on the Recognition of Numeric Digits

    Directory of Open Access Journals (Sweden)

    Danilo A. López-Sarmiento

    2013-11-01

    Full Text Available In this paper is compared the performance of a multi-class least squares support vector machine (LSSVM mc versus a multi-class logistic regression classifier to problem of recognizing the numeric digits (0-9 handwritten. To develop the comparison was used a data set consisting of 5000 images of handwritten numeric digits (500 images for each number from 0-9, each image of 20 x 20 pixels. The inputs to each of the systems were vectors of 400 dimensions corresponding to each image (not done feature extraction. Both classifiers used OneVsAll strategy to enable multi-classification and a random cross-validation function for the process of minimizing the cost function. The metrics of comparison were precision and training time under the same computational conditions. Both techniques evaluated showed a precision above 95 %, with LS-SVM slightly more accurate. However the computational cost if we found a marked difference: LS-SVM training requires time 16.42 % less than that required by the logistic regression model based on the same low computational conditions.

  18. Prediction of Five Softwood Paper Properties from its Density using Support Vector Machine Regression Techniques

    Directory of Open Access Journals (Sweden)

    Esperanza García-Gonzalo

    2016-01-01

    Full Text Available Predicting paper properties based on a limited number of measured variables can be an important tool for the industry. Mathematical models were developed to predict mechanical and optical properties from the corresponding paper density for some softwood papers using support vector machine regression with the Radial Basis Function Kernel. A dataset of different properties of paper handsheets produced from pulps of pine (Pinus pinaster and P. sylvestris and cypress species (Cupressus lusitanica, C. sempervirens, and C. arizonica beaten at 1000, 4000, and 7000 revolutions was used. The results show that it is possible to obtain good models (with high coefficient of determination with two variables: the numerical variable density and the categorical variable species.

  19. Vector grammars and PN machines

    Institute of Scientific and Technical Information of China (English)

    蒋昌俊

    1996-01-01

    The concept of vector grammars under the string semantic is introduced.The dass of vector grammars is given,which is similar to the dass of Chomsky grammars.The regular vector grammar is divided further.The strong and weak relation between the vector grammar and scalar grammar is discussed,so the spectrum system graph of scalar and vector grammars is made.The equivalent relation between the regular vector grammar and Petri nets (also called PN machine) is pointed.The hybrid PN machine is introduced,and its language is proved equivalent to the language of the context-free vector grammar.So the perfect relation structure between vector grammars and PN machines is formed.

  20. Laser-induced Breakdown spectroscopy quantitative analysis method via adaptive analytical line selection and relevance vector machine regression model

    International Nuclear Information System (INIS)

    Yang, Jianhong; Yi, Cancan; Xu, Jinwu; Ma, Xianghong

    2015-01-01

    A new LIBS quantitative analysis method based on analytical line adaptive selection and Relevance Vector Machine (RVM) regression model is proposed. First, a scheme of adaptively selecting analytical line is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines which will be used as input variables of regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentration analysis results will be given with a form of confidence interval of probabilistic distribution, which is helpful for evaluating the uncertainness contained in the measured spectra. Chromium concentration analysis experiments of 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experiment results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine. - Highlights: • Both training and testing samples are considered for analytical lines selection. • The analytical lines are auto-selected based on the built-in characteristics of spectral lines. • The new method can achieve better prediction accuracy and modeling robustness. • Model predictions are given with confidence interval of probabilistic distribution

  1. Vector control of induction machines

    CERN Document Server

    Robyns, Benoit

    2012-01-01

    After a brief introduction to the main law of physics and fundamental concepts inherent in electromechanical conversion, ""Vector Control of Induction Machines"" introduces the standard mathematical models for induction machines - whichever rotor technology is used - as well as several squirrel-cage induction machine vector-control strategies. The use of causal ordering graphs allows systematization of the design stage, as well as standardization of the structure of control devices. ""Vector Control of Induction Machines"" suggests a unique approach aimed at reducing parameter sensitivity for

  2. Machine Learning Multi-Stage Classification and Regression in the Search for Vector-like Quarks and the Neyman Construction in Signal Searches

    CERN Document Server

    Leone, Robert Matthew

    A search for vector-like quarks (VLQs) decaying to a Z boson using multi-stage machine learning was compared to a search using a standard square cuts search strategy. VLQs are predicted by several new theories beyond the Standard Model. The searches used 20.3 inverse femtobarns of proton-proton collisions at a center-of-mass energy of 8 TeV collected with the ATLAS detector in 2012 at the CERN Large Hadron Collider. CLs upper limits on production cross sections of vector-like top and bottom quarks were computed for VLQs produced singly or in pairs, Tsingle, Bsingle, Tpair, and Bpair. The two stage machine learning classification search strategy did not provide any improvement over the standard square cuts strategy, but for Tpair, Bpair, and Tsingle, a third stage of machine learning regression was able to lower the upper limits of high signal masses by as much as 50%. Additionally, new test statistics were developed for use in the Neyman construction of confidence regions in order to address deficiencies in c...

  3. Prediction of Machine Tool Condition Using Support Vector Machine

    International Nuclear Information System (INIS)

    Wang Peigong; Meng Qingfeng; Zhao Jian; Li Junjie; Wang Xiufeng

    2011-01-01

    Condition monitoring and predicting of CNC machine tools are investigated in this paper. Considering the CNC machine tools are often small numbers of samples, a condition predicting method for CNC machine tools based on support vector machines (SVMs) is proposed, then one-step and multi-step condition prediction models are constructed. The support vector machines prediction models are used to predict the trends of working condition of a certain type of CNC worm wheel and gear grinding machine by applying sequence data of vibration signal, which is collected during machine processing. And the relationship between different eigenvalue in CNC vibration signal and machining quality is discussed. The test result shows that the trend of vibration signal Peak-to-peak value in surface normal direction is most relevant to the trend of surface roughness value. In trends prediction of working condition, support vector machine has higher prediction accuracy both in the short term ('One-step') and long term (multi-step) prediction compared to autoregressive (AR) model and the RBF neural network. Experimental results show that it is feasible to apply support vector machine to CNC machine tool condition prediction.

  4. Support vector machines applications

    CERN Document Server

    Guo, Guodong

    2014-01-01

    Support vector machines (SVM) have both a solid mathematical background and good performance in practical applications. This book focuses on the recent advances and applications of the SVM in different areas, such as image processing, medical practice, computer vision, pattern recognition, machine learning, applied statistics, business intelligence, and artificial intelligence. The aim of this book is to create a comprehensive source on support vector machine applications, especially some recent advances.

  5. Prediction of retention indices for frequently reported compounds of plant essential oils using multiple linear regression, partial least squares, and support vector machine.

    Science.gov (United States)

    Yan, Jun; Huang, Jian-Hua; He, Min; Lu, Hong-Bing; Yang, Rui; Kong, Bo; Xu, Qing-Song; Liang, Yi-Zeng

    2013-08-01

    Retention indices for frequently reported compounds of plant essential oils on three different stationary phases were investigated. Multivariate linear regression, partial least squares, and support vector machine combined with a new variable selection approach called random-frog recently proposed by our group, were employed to model quantitative structure-retention relationships. Internal and external validations were performed to ensure the stability and predictive ability. All the three methods could obtain an acceptable model, and the optimal results by support vector machine based on a small number of informative descriptors with the square of correlation coefficient for cross validation, values of 0.9726, 0.9759, and 0.9331 on the dimethylsilicone stationary phase, the dimethylsilicone phase with 5% phenyl groups, and the PEG stationary phase, respectively. The performances of two variable selection approaches, random-frog and genetic algorithm, are compared. The importance of the variables was found to be consistent when estimated from correlation coefficients in multivariate linear regression equations and selection probability in model spaces. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Generation of daily global solar irradiation with support vector machines for regression

    International Nuclear Information System (INIS)

    Antonanzas-Torres, F.; Urraca, R.; Antonanzas, J.; Fernandez-Ceniceros, J.; Martinez-de-Pison, F.J.

    2015-01-01

    Highlights: • New methodology for estimation of daily solar irradiation with SVR. • Automatic procedure for training models and selecting meteorological features. • This methodology outperforms other well-known parametric and numeric techniques. - Abstract: Solar global irradiation is barely recorded in isolated rural areas around the world. Traditionally, solar resource estimation has been performed using parametric-empirical models based on the relationship of solar irradiation with other atmospheric and commonly measured variables, such as temperatures, rainfall, and sunshine duration, achieving a relatively high level of certainty. Considerable improvement in soft-computing techniques, which have been applied extensively in many research fields, has lead to improvements in solar global irradiation modeling, although most of these techniques lack spatial generalization. This new methodology proposes support vector machines for regression with optimized variable selection via genetic algorithms to generate non-locally dependent and accurate models. A case of study in Spain has demonstrated the value of this methodology. It achieved a striking reduction in the mean absolute error (MAE) – 41.4% and 19.9% – as compared to classic parametric models; Bristow & Campbell and Antonanzas-Torres et al., respectively

  7. Support vector machines classifiers of physical activities in preschoolers

    Science.gov (United States)

    The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying preschool-aged children physical activity data acquired from an accelerometer. In this study, 69 children aged 3-5 years old were asked to participate in a s...

  8. Support Vector Machine and Application in Seizure Prediction

    KAUST Repository

    Qiu, Simeng

    2018-04-01

    Nowadays, Machine learning (ML) has been utilized in various kinds of area which across the range from engineering field to business area. In this paper, we first present several kernel machine learning methods of solving classification, regression and clustering problems. These have good performance but also have some limitations. We present examples to each method and analyze the advantages and disadvantages for solving different scenarios. Then we focus on one of the most popular classification methods, Support Vectors Machine (SVM). In addition, we introduce the basic theory, advantages and scenarios of using Support Vector Machine (SVM) deal with classification problems. We also explain a convenient approach of tacking SVM problems which are called Sequential Minimal Optimization (SMO). Moreover, one class SVM can be understood in a different way which is called Support Vector Data Description (SVDD). This is a famous non-linear model problem compared with SVM problems, SVDD can be solved by utilizing Gaussian RBF kernel function combined with SMO. At last, we compared the difference and performance of SVM-SMO implementation and SVM-SVDD implementation. About the application part, we utilized SVM method to handle seizure forecasting in canine epilepsy, after comparing the results from different methods such as random forest, extremely randomized tree, and SVM to classify preictal (pre-seizure) and interictal (interval-seizure) binary data. We draw the conclusion that SVM has the best performance.

  9. Twin Support Vector Machine: A review from 2007 to 2014

    Directory of Open Access Journals (Sweden)

    Divya Tomar

    2015-03-01

    Full Text Available Twin Support Vector Machine (TWSVM is an emerging machine learning method suitable for both classification and regression problems. It utilizes the concept of Generalized Eigen-values Proximal Support Vector Machine (GEPSVM and finds two non-parallel planes for each class by solving a pair of Quadratic Programming Problems. It enhances the computational speed as compared to the traditional Support Vector Machine (SVM. TWSVM was initially constructed to solve binary classification problems; later researchers successfully extended it for multi-class problem domain. TWSVM always gives promising empirical results, due to which it has many attractive features which enhance its applicability. This paper presents the research development of TWSVM in recent years. This study is divided into two main broad categories - variant based and multi-class based TWSVM methods. The paper primarily discusses the basic concept of TWSVM and highlights its applications in recent years. A comparative analysis of various research contributions based on TWSVM is also presented. This is helpful for researchers to effectively utilize the TWSVM as an emergent research methodology and encourage them to work further in the performance enhancement of TWSVM.

  10. Learning with Support Vector Machines

    CERN Document Server

    Campbell, Colin

    2010-01-01

    Support Vectors Machines have become a well established tool within machine learning. They work well in practice and have now been used across a wide range of applications from recognizing hand-written digits, to face identification, text categorisation, bioinformatics, and database marketing. In this book we give an introductory overview of this subject. We start with a simple Support Vector Machine for performing binary classification before considering multi-class classification and learning in the presence of noise. We show that this framework can be extended to many other scenarios such a

  11. Support vector machine regression (SVR/LS-SVM)--an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data.

    Science.gov (United States)

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-04-21

    In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.

  12. Near Real-Time Dust Aerosol Detection with Support Vector Machines for Regression

    Science.gov (United States)

    Rivas-Perea, P.; Rivas-Perea, P. E.; Cota-Ruiz, J.; Aragon Franco, R. A.

    2015-12-01

    Remote sensing instruments operating in the near-infrared spectrum usually provide the necessary information for further dust aerosol spectral analysis using statistical or machine learning algorithms. Such algorithms have proven to be effective in analyzing very specific case studies or dust events. However, very few make the analysis open to the public on a regular basis, fewer are designed specifically to operate in near real-time to higher resolutions, and almost none give a global daily coverage. In this research we investigated a large-scale approach to a machine learning algorithm called "support vector regression". The algorithm uses four near-infrared spectral bands from NASA MODIS instrument: B20 (3.66-3.84μm), B29 (8.40-8.70μm), B31 (10.78-11.28μm), and B32 (11.77-12.27μm). The algorithm is presented with ground truth from more than 30 distinct reported dust events, from different geographical regions, at different seasons, both over land and sea cover, in the presence of clouds and clear sky, and in the presence of fires. The purpose of our algorithm is to learn to distinguish the dust aerosols spectral signature from other spectral signatures, providing as output an estimate of the probability of a data point being consistent with dust aerosol signatures. During modeling with ground truth, our algorithm achieved more than 90% of accuracy, and the current live performance of the algorithm is remarkable. Moreover, our algorithm is currently operating in near real-time using NASA's Land, Atmosphere Near real-time Capability for EOS (LANCE) servers, providing a high resolution global overview including 64, 32, 16, 8, 4, 2, and 1km. The near real-time analysis of our algorithm is now available to the general public at http://dust.reev.us and archives of the results starting from 2012 are available upon request.

  13. Virtual Vector Machine for Bayesian Online Classification

    OpenAIRE

    Minka, Thomas P.; Xiang, Rongjing; Yuan; Qi

    2012-01-01

    In a typical online learning scenario, a learner is required to process a large data stream using a small memory buffer. Such a requirement is usually in conflict with a learner's primary pursuit of prediction accuracy. To address this dilemma, we introduce a novel Bayesian online classi cation algorithm, called the Virtual Vector Machine. The virtual vector machine allows you to smoothly trade-off prediction accuracy with memory size. The virtual vector machine summarizes the information con...

  14. Prediction of hourly PM2.5 using a space-time support vector regression model

    Science.gov (United States)

    Yang, Wentao; Deng, Min; Xu, Feng; Wang, Hang

    2018-05-01

    Real-time air quality prediction has been an active field of research in atmospheric environmental science. The existing methods of machine learning are widely used to predict pollutant concentrations because of their enhanced ability to handle complex non-linear relationships. However, because pollutant concentration data, as typical geospatial data, also exhibit spatial heterogeneity and spatial dependence, they may violate the assumptions of independent and identically distributed random variables in most of the machine learning methods. As a result, a space-time support vector regression model is proposed to predict hourly PM2.5 concentrations. First, to address spatial heterogeneity, spatial clustering is executed to divide the study area into several homogeneous or quasi-homogeneous subareas. To handle spatial dependence, a Gauss vector weight function is then developed to determine spatial autocorrelation variables as part of the input features. Finally, a local support vector regression model with spatial autocorrelation variables is established for each subarea. Experimental data on PM2.5 concentrations in Beijing are used to verify whether the results of the proposed model are superior to those of other methods.

  15. Clustering Categories in Support Vector Machines

    DEFF Research Database (Denmark)

    Carrizosa, Emilio; Nogales-Gómez, Amaya; Morales, Dolores Romero

    2017-01-01

    The support vector machine (SVM) is a state-of-the-art method in supervised classification. In this paper the Cluster Support Vector Machine (CLSVM) methodology is proposed with the aim to increase the sparsity of the SVM classifier in the presence of categorical features, leading to a gain in in...

  16. Mortality risk prediction in burn injury: Comparison of logistic regression with machine learning approaches.

    Science.gov (United States)

    Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W

    2015-08-01

    Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.

  17. Lithium-ion battery remaining useful life prediction based on grey support vector machines

    Directory of Open Access Journals (Sweden)

    Xiaogang Li

    2015-12-01

    Full Text Available In this article, an improved grey prediction model is proposed to address low-accuracy prediction issue of grey forecasting model. The first step is using a trigonometric function to transform the original data sequence to smooth the data, which is called smoothness of grey prediction model, and then a grey support vector machine model by integrating the improved grey model with support vector machine is introduced. At the initial stage of the model, trigonometric functions and accumulation generation operation can be used to preprocess the data, which enhances the smoothness of the data and reduces the associated randomness. In addition, support vector machine is implemented to establish a prediction model for the pre-processed data and select the optimal model parameters via genetic algorithms. Finally, the data are restored through the ‘regressive generate’ operation to obtain the forecasting data. To prove that the grey support vector machine model is superior to the other models, the battery life data from the Center for Advanced Life Cycle Engineering are selected, and the presented model is used to predict the remaining useful life of the battery. The predicted result is compared to that of grey model and support vector machines. For a more intuitive comparison of the three models, this article quantifies the root mean square errors for these three different models in the case of different ratio of training samples and prediction samples. The results show that the effect of grey support vector machine model is optimal, and the corresponding root mean square error is only 3.18%.

  18. Analysis of an environmental exposure health questionnaire in a metropolitan minority population utilizing logistic regression and Support Vector Machines.

    Science.gov (United States)

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D; Hood, Darryl B; Skelton, Tyler

    2013-02-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminate power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire.

  19. Estimation of perceptible water vapor of atmosphere using artificial neural network, support vector machine and multiple linear regression algorithm and their comparative study

    Science.gov (United States)

    Shastri, Niket; Pathak, Kamlesh

    2018-05-01

    The water vapor content in atmosphere plays very important role in climate. In this paper the application of GPS signal in meteorology is discussed, which is useful technique that is used to estimate the perceptible water vapor of atmosphere. In this paper various algorithms like artificial neural network, support vector machine and multiple linear regression are use to predict perceptible water vapor. The comparative studies in terms of root mean square error and mean absolute errors are also carried out for all the algorithms.

  20. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data.

    Science.gov (United States)

    Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L

    2012-08-07

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.

  1. Coal demand prediction based on a support vector machine model

    Energy Technology Data Exchange (ETDEWEB)

    Jia, Cun-liang; Wu, Hai-shan; Gong, Dun-wei [China University of Mining & Technology, Xuzhou (China). School of Information and Electronic Engineering

    2007-01-15

    A forecasting model for coal demand of China using a support vector regression was constructed. With the selected embedding dimension, the output vectors and input vectors were constructed based on the coal demand of China from 1980 to 2002. After compared with lineal kernel and Sigmoid kernel, a radial basis function(RBF) was adopted as the kernel function. By analyzing the relationship between the error margin of prediction and the model parameters, the proper parameters were chosen. The support vector machines (SVM) model with multi-input and single output was proposed. Compared the predictor based on RBF neural networks with test datasets, the results show that the SVM predictor has higher precision and greater generalization ability. In the end, the coal demand from 2003 to 2006 is accurately forecasted. l0 refs., 2 figs., 4 tabs.

  2. Collapse moment estimation by support vector machines for wall-thinned pipe bends and elbows

    International Nuclear Information System (INIS)

    Na, Man Gyun; Kim, Jin Weon; Hwang, In Joon

    2007-01-01

    The collapse moment due to wall-thinned defects is estimated through support vector machines with parameters optimized by a genetic algorithm. The support vector regression models are developed and applied to numerical data obtained from the finite element analysis for wall-thinned defects in piping systems. The support vector regression models are optimized by using both the data sets (training data and optimization data) prepared for training and optimization, and its performance verification is performed by using another data set (test data) different from the training data and the optimization data. In this work, three support vector regression models are developed, respectively, for three data sets divided into the three classes of extrados, intrados, and crown defects, which is because they have different characteristics. The relative root mean square (RMS) errors of the estimated collapse moment are 0.2333% for the training data, 0.5229% for the optimization data and 0.5011% for the test data. It is known from this result that the support vector regression models are sufficiently accurate to be used in the integrity evaluation of wall-thinned pipe bends and elbows

  3. Support vector regression for porosity prediction in a heterogeneous reservoir: A comparative study

    Science.gov (United States)

    Al-Anazi, A. F.; Gates, I. D.

    2010-12-01

    In wells with limited log and core data, porosity, a fundamental and essential property to characterize reservoirs, is challenging to estimate by conventional statistical methods from offset well log and core data in heterogeneous formations. Beyond simple regression, neural networks have been used to develop more accurate porosity correlations. Unfortunately, neural network-based correlations have limited generalization ability and global correlations for a field are usually less accurate compared to local correlations for a sub-region of the reservoir. In this paper, support vector machines are explored as an intelligent technique to correlate porosity to well log data. Recently, support vector regression (SVR), based on the statistical learning theory, have been proposed as a new intelligence technique for both prediction and classification tasks. The underlying formulation of support vector machines embodies the structural risk minimization (SRM) principle which has been shown to be superior to the traditional empirical risk minimization (ERM) principle employed by conventional neural networks and classical statistical methods. This new formulation uses margin-based loss functions to control model complexity independently of the dimensionality of the input space, and kernel functions to project the estimation problem to a higher dimensional space, which enables the solution of more complex nonlinear problem optimization methods to exist for a globally optimal solution. SRM minimizes an upper bound on the expected risk using a margin-based loss function ( ɛ-insensitivity loss function for regression) in contrast to ERM which minimizes the error on the training data. Unlike classical learning methods, SRM, indexed by margin-based loss function, can also control model complexity independent of dimensionality. The SRM inductive principle is designed for statistical estimation with finite data where the ERM inductive principle provides the optimal solution (the

  4. Automatic Modulation Recognition by Support Vector Machines Using Wavelet Kernel

    Energy Technology Data Exchange (ETDEWEB)

    Feng, X Z; Yang, J; Luo, F L; Chen, J Y; Zhong, X P [College of Mechatronic Engineering and Automation, National University of Defense Technology, Changsha (China)

    2006-10-15

    Automatic modulation identification plays a significant role in electronic warfare, electronic surveillance systems and electronic counter measure. The task of modulation recognition of communication signals is to determine the modulation type and signal parameters. In fact, automatic modulation identification can be range to an application of pattern recognition in communication field. The support vector machines (SVM) is a new universal learning machine which is widely used in the fields of pattern recognition, regression estimation and probability density. In this paper, a new method using wavelet kernel function was proposed, which maps the input vector xi into a high dimensional feature space F. In this feature space F, we can construct the optimal hyperplane that realizes the maximal margin in this space. That is to say, we can use SVM to classify the communication signals into two groups, namely analogue modulated signals and digitally modulated signals. In addition, computer simulation results are given at last, which show good performance of the method.

  5. Automatic Modulation Recognition by Support Vector Machines Using Wavelet Kernel

    International Nuclear Information System (INIS)

    Feng, X Z; Yang, J; Luo, F L; Chen, J Y; Zhong, X P

    2006-01-01

    Automatic modulation identification plays a significant role in electronic warfare, electronic surveillance systems and electronic counter measure. The task of modulation recognition of communication signals is to determine the modulation type and signal parameters. In fact, automatic modulation identification can be range to an application of pattern recognition in communication field. The support vector machines (SVM) is a new universal learning machine which is widely used in the fields of pattern recognition, regression estimation and probability density. In this paper, a new method using wavelet kernel function was proposed, which maps the input vector xi into a high dimensional feature space F. In this feature space F, we can construct the optimal hyperplane that realizes the maximal margin in this space. That is to say, we can use SVM to classify the communication signals into two groups, namely analogue modulated signals and digitally modulated signals. In addition, computer simulation results are given at last, which show good performance of the method

  6. T-wave end detection using neural networks and Support Vector Machines.

    Science.gov (United States)

    Suárez-León, Alexander Alexeis; Varon, Carolina; Willems, Rik; Van Huffel, Sabine; Vázquez-Seisdedos, Carlos Román

    2018-05-01

    In this paper we propose a new approach for detecting the end of the T-wave in the electrocardiogram (ECG) using Neural Networks and Support Vector Machines. Both, Multilayer Perceptron (MLP) neural networks and Fixed-Size Least-Squares Support Vector Machines (FS-LSSVM) were used as regression algorithms to determine the end of the T-wave. Different strategies for selecting the training set such as random selection, k-means, robust clustering and maximum quadratic (Rényi) entropy were evaluated. Individual parameters were tuned for each method during training and the results are given for the evaluation set. A comparison between MLP and FS-LSSVM approaches was performed. Finally, a fair comparison of the FS-LSSVM method with other state-of-the-art algorithms for detecting the end of the T-wave was included. The experimental results show that FS-LSSVM approaches are more suitable as regression algorithms than MLP neural networks. Despite the small training sets used, the FS-LSSVM methods outperformed the state-of-the-art techniques. FS-LSSVM can be successfully used as a T-wave end detection algorithm in ECG even with small training set sizes. Copyright © 2018 Elsevier Ltd. All rights reserved.

  7. Prediction of Spirometric Forced Expiratory Volume (FEV1) Data Using Support Vector Regression

    Science.gov (United States)

    Kavitha, A.; Sujatha, C. M.; Ramakrishnan, S.

    2010-01-01

    In this work, prediction of forced expiratory volume in 1 second (FEV1) in pulmonary function test is carried out using the spirometer and support vector regression analysis. Pulmonary function data are measured with flow volume spirometer from volunteers (N=175) using a standard data acquisition protocol. The acquired data are then used to predict FEV1. Support vector machines with polynomial kernel function with four different orders were employed to predict the values of FEV1. The performance is evaluated by computing the average prediction accuracy for normal and abnormal cases. Results show that support vector machines are capable of predicting FEV1 in both normal and abnormal cases and the average prediction accuracy for normal subjects was higher than that of abnormal subjects. Accuracy in prediction was found to be high for a regularization constant of C=10. Since FEV1 is the most significant parameter in the analysis of spirometric data, it appears that this method of assessment is useful in diagnosing the pulmonary abnormalities with incomplete data and data with poor recording.

  8. Efficient Prediction of Low-Visibility Events at Airports Using Machine-Learning Regression

    Science.gov (United States)

    Cornejo-Bueno, L.; Casanova-Mateo, C.; Sanz-Justo, J.; Cerro-Prada, E.; Salcedo-Sanz, S.

    2017-11-01

    We address the prediction of low-visibility events at airports using machine-learning regression. The proposed model successfully forecasts low-visibility events in terms of the runway visual range at the airport, with the use of support-vector regression, neural networks (multi-layer perceptrons and extreme-learning machines) and Gaussian-process algorithms. We assess the performance of these algorithms based on real data collected at the Valladolid airport, Spain. We also propose a study of the atmospheric variables measured at a nearby tower related to low-visibility atmospheric conditions, since they are considered as the inputs of the different regressors. A pre-processing procedure of these input variables with wavelet transforms is also described. The results show that the proposed machine-learning algorithms are able to predict low-visibility events well. The Gaussian process is the best algorithm among those analyzed, obtaining over 98% of the correct classification rate in low-visibility events when the runway visual range is {>}1000 m, and about 80% under this threshold. The performance of all the machine-learning algorithms tested is clearly affected in extreme low-visibility conditions ({algorithm performance in daytime and nighttime conditions, and for different prediction time horizons.

  9. 2D Quantitative Structure-Property Relationship Study of Mycotoxins by Multiple Linear Regression and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Fereshteh Shiri

    2010-08-01

    Full Text Available In the present work, support vector machines (SVMs and multiple linear regression (MLR techniques were used for quantitative structure–property relationship (QSPR studies of retention time (tR in standardized liquid chromatography–UV–mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins based on molecular descriptors calculated from the optimized 3D structures. By applying missing value, zero and multicollinearity tests with a cutoff value of 0.95, and genetic algorithm method of variable selection, the most relevant descriptors were selected to build QSPR models. MLRand SVMs methods were employed to build QSPR models. The robustness of the QSPR models was characterized by the statistical validation and applicability domain (AD. The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability measure by r2 and q2 are 0.931 and 0.932, repectively, for SVM and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using William’s plot. The effects of different descriptors on the retention times are described.

  10. Support vector regression model for the estimation of γ-ray buildup factors for multi-layer shields

    International Nuclear Information System (INIS)

    Trontl, Kresimir; Smuc, Tomislav; Pevec, Dubravko

    2007-01-01

    The accuracy of the point-kernel method, which is a widely used practical tool for γ-ray shielding calculations, strongly depends on the quality and accuracy of buildup factors used in the calculations. Although, buildup factors for single-layer shields comprised of a single material are well known, calculation of buildup factors for stratified shields, each layer comprised of different material or a combination of materials, represent a complex physical problem. Recently, a new compact mathematical model for multi-layer shield buildup factor representation has been suggested for embedding into point-kernel codes thus replacing traditionally generated complex mathematical expressions. The new regression model is based on support vector machines learning technique, which is an extension of Statistical Learning Theory. The paper gives complete description of the novel methodology with results pertaining to realistic engineering multi-layer shielding geometries. The results based on support vector regression machine learning confirm that this approach provides a framework for general, accurate and computationally acceptable multi-layer buildup factor model

  11. Subspace identification of Hammer stein models using support vector machines

    International Nuclear Information System (INIS)

    Al-Dhaifallah, Mujahed

    2011-01-01

    System identification is the art of finding mathematical tools and algorithms that build an appropriate mathematical model of a system from measured input and output data. Hammerstein model, consisting of a memoryless nonlinearity followed by a dynamic linear element, is often a good trade-off as it can represent some dynamic nonlinear systems very accurately, but is nonetheless quite simple. Moreover, the extensive knowledge about LTI system representations can be applied to the dynamic linear block. On the other hand, finding an effective representation for the nonlinearity is an active area of research. Recently, support vector machines (SVMs) and least squares support vector machines (LS-SVMs) have demonstrated powerful abilities in approximating linear and nonlinear functions. In contrast with other approximation methods, SVMs do not require a-priori structural information. Furthermore, there are well established methods with guaranteed convergence (ordinary least squares, quadratic programming) for fitting LS-SVMs and SVMs. The general objective of this research is to develop new subspace algorithms for Hammerstein systems based on SVM regression.

  12. Support Vector Hazards Machine: A Counting Process Framework for Learning Risk Scores for Censored Outcomes.

    Science.gov (United States)

    Wang, Yuanjia; Chen, Tianle; Zeng, Donglin

    2016-01-01

    Learning risk scores to predict dichotomous or continuous outcomes using machine learning approaches has been studied extensively. However, how to learn risk scores for time-to-event outcomes subject to right censoring has received little attention until recently. Existing approaches rely on inverse probability weighting or rank-based regression, which may be inefficient. In this paper, we develop a new support vector hazards machine (SVHM) approach to predict censored outcomes. Our method is based on predicting the counting process associated with the time-to-event outcomes among subjects at risk via a series of support vector machines. Introducing counting processes to represent time-to-event data leads to a connection between support vector machines in supervised learning and hazards regression in standard survival analysis. To account for different at risk populations at observed event times, a time-varying offset is used in estimating risk scores. The resulting optimization is a convex quadratic programming problem that can easily incorporate non-linearity using kernel trick. We demonstrate an interesting link from the profiled empirical risk function of SVHM to the Cox partial likelihood. We then formally show that SVHM is optimal in discriminating covariate-specific hazard function from population average hazard function, and establish the consistency and learning rate of the predicted risk using the estimated risk scores. Simulation studies show improved prediction accuracy of the event times using SVHM compared to existing machine learning methods and standard conventional approaches. Finally, we analyze two real world biomedical study data where we use clinical markers and neuroimaging biomarkers to predict age-at-onset of a disease, and demonstrate superiority of SVHM in distinguishing high risk versus low risk subjects.

  13. Relationship between rice yield and climate variables in southwest Nigeria using multiple linear regression and support vector machine analysis

    Science.gov (United States)

    Oguntunde, Philip G.; Lischeid, Gunnar; Dietrich, Ottfried

    2018-03-01

    This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used was for a period of 36 years between 1980 and 2015. Similar to the observed decrease ( P 1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of the PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide more in-depth regional-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations as well as providing information that may help optimized planting dates for improved radiation use efficiency in the study area.

  14. Indonesian Stock Prediction using Support Vector Machine (SVM

    Directory of Open Access Journals (Sweden)

    Santoso Murtiyanto

    2018-01-01

    Full Text Available This project is part of developing software to provide predictive information technology-based services artificial intelligence (Machine Intelligence or Machine Learning that will be utilized in the money market community. The prediction method used in this early stages uses the combination of Gaussian Mixture Model and Support Vector Machine with Python programming. The system predicts the price of Astra International (stock code: ASII.JK stock data. The data used was taken during 17 yr period of January 2000 until September 2017. Some data was used for training/modeling (80 % of data and the remainder (20 % was used for testing. An integrated model comprising Gaussian Mixture Model and Support Vector Machine system has been tested to predict stock market of ASII.JK for l d in advance. This model has been compared with the Market Cummulative Return. From the results, it is depicts that the Gaussian Mixture Model-Support Vector Machine based stock predicted model, offers significant improvement over the compared models resulting sharpe ratio of 3.22.

  15. A Wavelet Kernel-Based Primal Twin Support Vector Machine for Economic Development Prediction

    Directory of Open Access Journals (Sweden)

    Fang Su

    2013-01-01

    Full Text Available Economic development forecasting allows planners to choose the right strategies for the future. This study is to propose economic development prediction method based on the wavelet kernel-based primal twin support vector machine algorithm. As gross domestic product (GDP is an important indicator to measure economic development, economic development prediction means GDP prediction in this study. The wavelet kernel-based primal twin support vector machine algorithm can solve two smaller sized quadratic programming problems instead of solving a large one as in the traditional support vector machine algorithm. Economic development data of Anhui province from 1992 to 2009 are used to study the prediction performance of the wavelet kernel-based primal twin support vector machine algorithm. The comparison of mean error of economic development prediction between wavelet kernel-based primal twin support vector machine and traditional support vector machine models trained by the training samples with the 3–5 dimensional input vectors, respectively, is given in this paper. The testing results show that the economic development prediction accuracy of the wavelet kernel-based primal twin support vector machine model is better than that of traditional support vector machine.

  16. Image superresolution using support vector regression.

    Science.gov (United States)

    Ni, Karl S; Nguyen, Truong Q

    2007-06-01

    A thorough investigation of the application of support vector regression (SVR) to the superresolution problem is conducted through various frameworks. Prior to the study, the SVR problem is enhanced by finding the optimal kernel. This is done by formulating the kernel learning problem in SVR form as a convex optimization problem, specifically a semi-definite programming (SDP) problem. An additional constraint is added to reduce the SDP to a quadratically constrained quadratic programming (QCQP) problem. After this optimization, investigation of the relevancy of SVR to superresolution proceeds with the possibility of using a single and general support vector regression for all image content, and the results are impressive for small training sets. This idea is improved upon by observing structural properties in the discrete cosine transform (DCT) domain to aid in learning the regression. Further improvement involves a combination of classification and SVR-based techniques, extending works in resolution synthesis. This method, termed kernel resolution synthesis, uses specific regressors for isolated image content to describe the domain through a partitioned look of the vector space, thereby yielding good results.

  17. Online Support Vector Regression with Varying Parameters for Time-Dependent Data

    International Nuclear Information System (INIS)

    Omitaomu, Olufemi A.; Jeong, Myong K.; Badiru, Adedeji B.

    2011-01-01

    Support vector regression (SVR) is a machine learning technique that continues to receive interest in several domains including manufacturing, engineering, and medicine. In order to extend its application to problems in which datasets arrive constantly and in which batch processing of the datasets is infeasible or expensive, an accurate online support vector regression (AOSVR) technique was proposed. The AOSVR technique efficiently updates a trained SVR function whenever a sample is added to or removed from the training set without retraining the entire training data. However, the AOSVR technique assumes that the new samples and the training samples are of the same characteristics; hence, the same value of SVR parameters is used for training and prediction. This assumption is not applicable to data samples that are inherently noisy and non-stationary such as sensor data. As a result, we propose Accurate On-line Support Vector Regression with Varying Parameters (AOSVR-VP) that uses varying SVR parameters rather than fixed SVR parameters, and hence accounts for the variability that may exist in the samples. To accomplish this objective, we also propose a generalized weight function to automatically update the weights of SVR parameters in on-line monitoring applications. The proposed function allows for lower and upper bounds for SVR parameters. We tested our proposed approach and compared results with the conventional AOSVR approach using two benchmark time series data and sensor data from nuclear power plant. The results show that using varying SVR parameters is more applicable to time dependent data.

  18. Relationship between rice yield and climate variables in southwest Nigeria using multiple linear regression and support vector machine analysis.

    Science.gov (United States)

    Oguntunde, Philip G; Lischeid, Gunnar; Dietrich, Ottfried

    2018-03-01

    This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used was for a period of 36 years between 1980 and 2015. Similar to the observed decrease (P  1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of the PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide more in-depth regional-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations as well as providing information that may help optimized planting dates for improved radiation use efficiency in the study area.

  19. Hybrid genetic algorithm tuned support vector machine regression for wave transmission prediction of horizontally interlaced multilayer moored floating pipe breakwater

    Digital Repository Service at National Institute of Oceanography (India)

    Patil, S.G.; Mandal, S.; Hegde, A.V.; Muruganandam, A.

    Support Vector Machine (SVM) works on structural risk minimization principle that has greater generalization ability and is superior to the empirical risk minimization principle as adopted in conventional neural network models. However...

  20. Modeling of Soil Aggregate Stability using Support Vector Machines and Multiple Linear Regression

    Directory of Open Access Journals (Sweden)

    Ali Asghar Besalatpour

    2016-02-01

    Full Text Available Introduction: Soil aggregate stability is a key factor in soil resistivity to mechanical stresses, including the impacts of rainfall and surface runoff, and thus to water erosion (Canasveras et al., 2010. Various indicators have been proposed to characterize and quantify soil aggregate stability, for example percentage of water-stable aggregates (WSA, mean weight diameter (MWD, geometric mean diameter (GMD of aggregates, and water-dispersible clay (WDC content (Calero et al., 2008. Unfortunately, the experimental methods available to determine these indicators are laborious, time-consuming and difficult to standardize (Canasveras et al., 2010. Therefore, it would be advantageous if aggregate stability could be predicted indirectly from more easily available data (Besalatpour et al., 2014. The main objective of this study is to investigate the potential use of support vector machines (SVMs method for estimating soil aggregate stability (as quantified by GMD as compared to multiple linear regression approach. Materials and Methods: The study area was part of the Bazoft watershed (31° 37′ to 32° 39′ N and 49° 34′ to 50° 32′ E, which is located in the Northern part of the Karun river basin in central Iran. A total of 160 soil samples were collected from the top 5 cm of soil surface. Some easily available characteristics including topographic, vegetation, and soil properties were used as inputs. Soil organic matter (SOM content was determined by the Walkley-Black method (Nelson & Sommers, 1986. Particle size distribution in the soil samples (clay, silt, sand, fine sand, and very fine sand were measured using the procedure described by Gee & Bauder (1986 and calcium carbonate equivalent (CCE content was determined by the back-titration method (Nelson, 1982. The modified Kemper & Rosenau (1986 method was used to determine wet-aggregate stability (GMD. The topographic attributes of elevation, slope, and aspect were characterized using a 20-m

  1. Adaptive image denoising based on support vector machine and wavelet description

    Science.gov (United States)

    An, Feng-Ping; Zhou, Xian-Wei

    2017-12-01

    Adaptive image denoising method decomposes the original image into a series of basic pattern feature images on the basis of wavelet description and constructs the support vector machine regression function to realize the wavelet description of the original image. The support vector machine method allows the linear expansion of the signal to be expressed as a nonlinear function of the parameters associated with the SVM. Using the radial basis kernel function of SVM, the original image can be extended into a MEXICAN function and a residual trend. This MEXICAN represents a basic image feature pattern. If the residual does not fluctuate, it can also be represented as a characteristic pattern. If the residuals fluctuate significantly, it is treated as a new image and the same decomposition process is repeated until the residuals obtained by the decomposition do not significantly fluctuate. Experimental results show that the proposed method in this paper performs well; especially, it satisfactorily solves the problem of image noise removal. It may provide a new tool and method for image denoising.

  2. Modelling daily dissolved oxygen concentration using least square support vector machine, multivariate adaptive regression splines and M5 model tree

    Science.gov (United States)

    Heddam, Salim; Kisi, Ozgur

    2018-04-01

    In the present study, three types of artificial intelligence techniques, least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5T) are applied for modeling daily dissolved oxygen (DO) concentration using several water quality variables as inputs. The DO concentration and water quality variables data from three stations operated by the United States Geological Survey (USGS) were used for developing the three models. The water quality data selected consisted of daily measured of water temperature (TE, °C), pH (std. unit), specific conductance (SC, μS/cm) and discharge (DI cfs), are used as inputs to the LSSVM, MARS and M5T models. The three models were applied for each station separately and compared to each other. According to the results obtained, it was found that: (i) the DO concentration could be successfully estimated using the three models and (ii) the best model among all others differs from one station to another.

  3. Wind Power Ramp Events Prediction with Hybrid Machine Learning Regression Techniques and Reanalysis Data

    Directory of Open Access Journals (Sweden)

    Laura Cornejo-Bueno

    2017-11-01

    Full Text Available Wind Power Ramp Events (WPREs are large fluctuations of wind power in a short time interval, which lead to strong, undesirable variations in the electric power produced by a wind farm. Its accurate prediction is important in the effort of efficiently integrating wind energy in the electric system, without affecting considerably its stability, robustness and resilience. In this paper, we tackle the problem of predicting WPREs by applying Machine Learning (ML regression techniques. Our approach consists of using variables from atmospheric reanalysis data as predictive inputs for the learning machine, which opens the possibility of hybridizing numerical-physical weather models with ML techniques for WPREs prediction in real systems. Specifically, we have explored the feasibility of a number of state-of-the-art ML regression techniques, such as support vector regression, artificial neural networks (multi-layer perceptrons and extreme learning machines and Gaussian processes to solve the problem. Furthermore, the ERA-Interim reanalysis from the European Center for Medium-Range Weather Forecasts is the one used in this paper because of its accuracy and high resolution (in both spatial and temporal domains. Aiming at validating the feasibility of our predicting approach, we have carried out an extensive experimental work using real data from three wind farms in Spain, discussing the performance of the different ML regression tested in this wind power ramp event prediction problem.

  4. Using support vector machines to identify literacy skills: Evidence from eye movements.

    Science.gov (United States)

    Lou, Ya; Liu, Yanping; Kaakinen, Johanna K; Li, Xingshan

    2017-06-01

    Is inferring readers' literacy skills possible by analyzing their eye movements during text reading? This study used Support Vector Machines (SVM) to analyze eye movement data from 61 undergraduate students who read a multiple-paragraph, multiple-topic expository text. Forward fixation time, first-pass rereading time, second-pass fixation time, and regression path reading time on different regions of the text were provided as features. The SVM classification algorithm assisted in distinguishing high-literacy-skilled readers from low-literacy-skilled readers with 80.3 % accuracy. Results demonstrate the effectiveness of combining eye tracking and machine learning techniques to detect readers with low literacy skills, and suggest that such approaches can be potentially used in predicting other cognitive abilities.

  5. Improving model predictions for RNA interference activities that use support vector machine regression by combining and filtering features

    Directory of Open Access Journals (Sweden)

    Peek Andrew S

    2007-06-01

    Full Text Available Abstract Background RNA interference (RNAi is a naturally occurring phenomenon that results in the suppression of a target RNA sequence utilizing a variety of possible methods and pathways. To dissect the factors that result in effective siRNA sequences a regression kernel Support Vector Machine (SVM approach was used to quantitatively model RNA interference activities. Results Eight overall feature mapping methods were compared in their abilities to build SVM regression models that predict published siRNA activities. The primary factors in predictive SVM models are position specific nucleotide compositions. The secondary factors are position independent sequence motifs (N-grams and guide strand to passenger strand sequence thermodynamics. Finally, the factors that are least contributory but are still predictive of efficacy are measures of intramolecular guide strand secondary structure and target strand secondary structure. Of these, the site of the 5' most base of the guide strand is the most informative. Conclusion The capacity of specific feature mapping methods and their ability to build predictive models of RNAi activity suggests a relative biological importance of these features. Some feature mapping methods are more informative in building predictive models and overall t-test filtering provides a method to remove some noisy features or make comparisons among datasets. Together, these features can yield predictive SVM regression models with increased predictive accuracy between predicted and observed activities both within datasets by cross validation, and between independently collected RNAi activity datasets. Feature filtering to remove features should be approached carefully in that it is possible to reduce feature set size without substantially reducing predictive models, but the features retained in the candidate models become increasingly distinct. Software to perform feature prediction and SVM training and testing on nucleic acid

  6. Support vector machine in machine condition monitoring and fault diagnosis

    Science.gov (United States)

    Widodo, Achmad; Yang, Bo-Suk

    2007-08-01

    Recently, the issue of machine condition monitoring and fault diagnosis as a part of maintenance system became global due to the potential advantages to be gained from reduced maintenance costs, improved productivity and increased machine availability. This paper presents a survey of machine condition monitoring and fault diagnosis using support vector machine (SVM). It attempts to summarize and review the recent research and developments of SVM in machine condition monitoring and diagnosis. Numerous methods have been developed based on intelligent systems such as artificial neural network, fuzzy expert system, condition-based reasoning, random forest, etc. However, the use of SVM for machine condition monitoring and fault diagnosis is still rare. SVM has excellent performance in generalization so it can produce high accuracy in classification for machine condition monitoring and diagnosis. Until 2006, the use of SVM in machine condition monitoring and fault diagnosis is tending to develop towards expertise orientation and problem-oriented domain. Finally, the ability to continually change and obtain a novel idea for machine condition monitoring and fault diagnosis using SVM will be future works.

  7. Application of Support Vector Machine to Forex Monitoring

    Science.gov (United States)

    Kamruzzaman, Joarder; Sarker, Ruhul A.

    Previous studies have demonstrated superior performance of artificial neural network (ANN) based forex forecasting models over traditional regression models. This paper applies support vector machines to build a forecasting model from the historical data using six simple technical indicators and presents a comparison with an ANN based model trained by scaled conjugate gradient (SCG) learning algorithm. The models are evaluated and compared on the basis of five commonly used performance metrics that measure closeness of prediction as well as correctness in directional change. Forecasting results of six different currencies against Australian dollar reveal superior performance of SVM model using simple linear kernel over ANN-SCG model in terms of all the evaluation metrics. The effect of SVM parameter selection on prediction performance is also investigated and analyzed.

  8. Image Jacobian Matrix Estimation Based on Online Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Shangqin Mao

    2012-10-01

    Full Text Available Research into robotics visual servoing is an important area in the field of robotics. It has proven difficult to achieve successful results for machine vision and robotics in unstructured environments without using any a priori camera or kinematic models. In uncalibrated visual servoing, image Jacobian matrix estimation methods can be divided into two groups: the online method and the offline method. The offline method is not appropriate for most natural environments. The online method is robust but rough. Moreover, if the images feature configuration changes, it needs to restart the approximating procedure. A novel approach based on an online support vector regression (OL-SVR algorithm is proposed which overcomes the drawbacks and combines the virtues just mentioned.

  9. Reconfigurable support vector machine classifier with approximate computing

    NARCIS (Netherlands)

    van Leussen, M.J.; Huisken, J.; Wang, L.; Jiao, H.; De Gyvez, J.P.

    2017-01-01

    Support Vector Machine (SVM) is one of the most popular machine learning algorithms. An energy-efficient SVM classifier is proposed in this paper, where approximate computing is utilized to reduce energy consumption and silicon area. A hardware architecture with reconfigurable kernels and

  10. Using support vector regression to predict PM10 and PM2.5

    International Nuclear Information System (INIS)

    Weizhen, Hou; Zhengqiang, Li; Yuhuan, Zhang; Hua, Xu; Ying, Zhang; Kaitao, Li; Donghui, Li; Peng, Wei; Yan, Ma

    2014-01-01

    Support vector machine (SVM), as a novel and powerful machine learning tool, can be used for the prediction of PM 10 and PM 2.5 (particulate matter less or equal than 10 and 2.5 micrometer) in the atmosphere. This paper describes the development of a successive over relaxation support vector regress (SOR-SVR) model for the PM 10 and PM 2.5 prediction, based on the daily average aerosol optical depth (AOD) and meteorological parameters (atmospheric pressure, relative humidity, air temperature, wind speed), which were all measured in Beijing during the year of 2010–2012. The Gaussian kernel function, as well as the k-fold crosses validation and grid search method, are used in SVR model to obtain the optimal parameters to get a better generalization capability. The result shows that predicted values by the SOR-SVR model agree well with the actual data and have a good generalization ability to predict PM 10 and PM 2.5 . In addition, AOD plays an important role in predicting particulate matter with SVR model, which should be included in the prediction model. If only considering the meteorological parameters and eliminating AOD from the SVR model, the prediction results of predict particulate matter will be not satisfying

  11. Short-Term Prediction of Air Pollution in Macau Using Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Chi-Man Vong

    2012-01-01

    Full Text Available Forecasting of air pollution is a popular and important topic in recent years due to the health impact caused by air pollution. It is necessary to build an early warning system, which provides forecast and also alerts health alarm to local inhabitants by medical practitioners and the local government. Meteorological and pollutions data collected daily at monitoring stations of Macau can be used in this study to build a forecasting system. Support vector machines (SVMs, a novel type of machine learning technique based on statistical learning theory, can be used for regression and time series prediction. SVM is capable of good generalization while the performance of the SVM model is often hinged on the appropriate choice of the kernel.

  12. Infinite ensemble of support vector machines for prediction of ...

    African Journals Online (AJOL)

    user

    the support vector machines (SVMs), a machine learning algorithm used ... work designs so that specific, quantitative workplace assessments can be made ... with SVMs can be obtained by embedding the base learners (hypothesis) into a.

  13. Support vector machine for automatic pain recognition

    Science.gov (United States)

    Monwar, Md Maruf; Rezaei, Siamak

    2009-02-01

    Facial expressions are a key index of emotion and the interpretation of such expressions of emotion is critical to everyday social functioning. In this paper, we present an efficient video analysis technique for recognition of a specific expression, pain, from human faces. We employ an automatic face detector which detects face from the stored video frame using skin color modeling technique. For pain recognition, location and shape features of the detected faces are computed. These features are then used as inputs to a support vector machine (SVM) for classification. We compare the results with neural network based and eigenimage based automatic pain recognition systems. The experiment results indicate that using support vector machine as classifier can certainly improve the performance of automatic pain recognition system.

  14. Support Vector Machine Classification of Drunk Driving Behaviour.

    Science.gov (United States)

    Chen, Huiqin; Chen, Lei

    2017-01-23

    Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM) classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R-R intervals (SDNN), the root mean square value of the difference of the adjacent R-R interval series (RMSSD), low frequency (LF), high frequency (HF), the ratio of the low and high frequencies (LF/HF), and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.

  15. Support Vector Machine Classification of Drunk Driving Behaviour

    Directory of Open Access Journals (Sweden)

    Huiqin Chen

    2017-01-01

    Full Text Available Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R–R intervals (SDNN, the root mean square value of the difference of the adjacent R–R interval series (RMSSD, low frequency (LF, high frequency (HF, the ratio of the low and high frequencies (LF/HF, and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.

  16. Estimation of the wind turbine yaw error by support vector machines

    DEFF Research Database (Denmark)

    Sheibat-Othman, Nida; Othman, Sami; Tayari, Raoaa

    2015-01-01

    Wind turbine yaw error information is of high importance in controlling wind turbine power and structural load. Normally used wind vanes are imprecise. In this work, the estimation of yaw error in wind turbines is studied using support vector machines for regression (SVR). As the methodology...... is data-based, simulated data from a high fidelity aero-elastic model is used for learning. The model simulates a variable speed horizontal-axis wind turbine composed of three blades and a full converter. Both partial load (blade angles fixed at 0 deg) and full load zones (active pitch actuators...

  17. Support vector machine regression (LS-SVM)--an alternative to artificial neural networks (ANNs) for the analysis of quantum chemistry data?

    Science.gov (United States)

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-06-28

    A multilayer feed-forward artificial neural network (MLP-ANN) with a single, hidden layer that contains a finite number of neurons can be regarded as a universal non-linear approximator. Today, the ANN method and linear regression (MLR) model are widely used for quantum chemistry (QC) data analysis (e.g., thermochemistry) to improve their accuracy (e.g., Gaussian G2-G4, B3LYP/B3-LYP, X1, or W1 theoretical methods). In this study, an alternative approach based on support vector machines (SVMs) is used, the least squares support vector machine (LS-SVM) regression. It has been applied to ab initio (first principle) and density functional theory (DFT) quantum chemistry data. So, QC + SVM methodology is an alternative to QC + ANN one. The task of the study was to estimate the Møller-Plesset (MPn) or DFT (B3LYP, BLYP, BMK) energies calculated with large basis sets (e.g., 6-311G(3df,3pd)) using smaller ones (6-311G, 6-311G*, 6-311G**) plus molecular descriptors. A molecular set (BRM-208) containing a total of 208 organic molecules was constructed and used for the LS-SVM training, cross-validation, and testing. MP2, MP3, MP4(DQ), MP4(SDQ), and MP4/MP4(SDTQ) ab initio methods were tested. Hartree-Fock (HF/SCF) results were also reported for comparison. Furthermore, constitutional (CD: total number of atoms and mole fractions of different atoms) and quantum-chemical (QD: HOMO-LUMO gap, dipole moment, average polarizability, and quadrupole moment) molecular descriptors were used for the building of the LS-SVM calibration model. Prediction accuracies (MADs) of 1.62 ± 0.51 and 0.85 ± 0.24 kcal mol(-1) (1 kcal mol(-1) = 4.184 kJ mol(-1)) were reached for SVM-based approximations of ab initio and DFT energies, respectively. The LS-SVM model was more accurate than the MLR model. A comparison with the artificial neural network approach shows that the accuracy of the LS-SVM method is similar to the accuracy of ANN. The extrapolation and interpolation results show that LS-SVM is

  18. A Novel Support Vector Machine with Globality-Locality Preserving

    Directory of Open Access Journals (Sweden)

    Cheng-Long Ma

    2014-01-01

    Full Text Available Support vector machine (SVM is regarded as a powerful method for pattern classification. However, the solution of the primal optimal model of SVM is susceptible for class distribution and may result in a nonrobust solution. In order to overcome this shortcoming, an improved model, support vector machine with globality-locality preserving (GLPSVM, is proposed. It introduces globality-locality preserving into the standard SVM, which can preserve the manifold structure of the data space. We complete rich experiments on the UCI machine learning data sets. The results validate the effectiveness of the proposed model, especially on the Wine and Iris databases; the recognition rate is above 97% and outperforms all the algorithms that were developed from SVM.

  19. Prediction of Hydrocarbon Reservoirs Permeability Using Support Vector Machine

    Directory of Open Access Journals (Sweden)

    R. Gholami

    2012-01-01

    Full Text Available Permeability is a key parameter associated with the characterization of any hydrocarbon reservoir. In fact, it is not possible to have accurate solutions to many petroleum engineering problems without having accurate permeability value. The conventional methods for permeability determination are core analysis and well test techniques. These methods are very expensive and time consuming. Therefore, attempts have usually been carried out to use artificial neural network for identification of the relationship between the well log data and core permeability. In this way, recent works on artificial intelligence techniques have led to introduce a robust machine learning methodology called support vector machine. This paper aims to utilize the SVM for predicting the permeability of three gas wells in the Southern Pars field. Obtained results of SVM showed that the correlation coefficient between core and predicted permeability is 0.97 for testing dataset. Comparing the result of SVM with that of a general regression neural network (GRNN revealed that the SVM approach is faster and more accurate than the GRNN in prediction of hydrocarbon reservoirs permeability.

  20. A Two-Layer Least Squares Support Vector Machine Approach to Credit Risk Assessment

    Science.gov (United States)

    Liu, Jingli; Li, Jianping; Xu, Weixuan; Shi, Yong

    Least squares support vector machine (LS-SVM) is a revised version of support vector machine (SVM) and has been proved to be a useful tool for pattern recognition. LS-SVM had excellent generalization performance and low computational cost. In this paper, we propose a new method called two-layer least squares support vector machine which combines kernel principle component analysis (KPCA) and linear programming form of least square support vector machine. With this method sparseness and robustness is obtained while solving large dimensional and large scale database. A U.S. commercial credit card database is used to test the efficiency of our method and the result proved to be a satisfactory one.

  1. Ameliorated Austenite Carbon Content Control in Austempered Ductile Irons by Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Chan-Yun Yang

    2013-01-01

    Full Text Available Austempered ductile iron has emerged as a notable material in several engineering fields, including marine applications. The initial austenite carbon content after austenization transform but before austempering process for generating bainite matrix proved critical in controlling the resulted microstructure and thus mechanical properties. In this paper, support vector regression is employed in order to establish a relationship between the initial carbon concentration in the austenite with austenization temperature and alloy contents, thereby exercising improved control in the mechanical properties of the austempered ductile irons. Particularly, the paper emphasizes a methodology tailored to deal with a limited amount of available data with intrinsically contracted and skewed distribution. The collected information from a variety of data sources presents another challenge of highly uncertain variance. The authors present a hybrid model consisting of a procedure of a histogram equalizer and a procedure of a support-vector-machine (SVM- based regression to gain a more robust relationship to respond to the challenges. The results show greatly improved accuracy of the proposed model in comparison to two former established methodologies. The sum squared error of the present model is less than one fifth of that of the two previous models.

  2. Fast multi-output relevance vector regression

    OpenAIRE

    Ha, Youngmin

    2017-01-01

    This paper aims to decrease the time complexity of multi-output relevance vector regression from O(VM^3) to O(V^3+M^3), where V is the number of output dimensions, M is the number of basis functions, and V

  3. The Weighted Support Vector Machine Based on Hybrid Swarm Intelligence Optimization for Icing Prediction of Transmission Line

    Directory of Open Access Journals (Sweden)

    Xiaomin Xu

    2015-01-01

    Full Text Available Not only can the icing coat on transmission line cause the electrical fault of gap discharge and icing flashover but also it will lead to the mechanical failure of tower, conductor, insulators, and others. It will bring great harm to the people’s daily life and work. Thus, accurate prediction of ice thickness has important significance for power department to control the ice disaster effectively. Based on the analysis of standard support vector machine, this paper presents a weighted support vector machine regression model based on the similarity (WSVR. According to the different importance of samples, this paper introduces the weighted support vector machine and optimizes its parameters by hybrid swarm intelligence optimization algorithm with the particle swarm and ant colony (PSO-ACO, which improves the generalization ability of the model. In the case study, the actual data of ice thickness and climate in a certain area of Hunan province have been used to predict the icing thickness of the area, which verifies the validity and applicability of this proposed method. The predicted results show that the intelligent model proposed in this paper has higher precision and stronger generalization ability.

  4. Density Based Support Vector Machines for Classification

    OpenAIRE

    Zahra Nazari; Dongshik Kang

    2015-01-01

    Support Vector Machines (SVM) is the most successful algorithm for classification problems. SVM learns the decision boundary from two classes (for Binary Classification) of training points. However, sometimes there are some less meaningful samples amongst training points, which are corrupted by noises or misplaced in wrong side, called outliers. These outliers are affecting on margin and classification performance, and machine should better to discard them. SVM as a popular and widely used cl...

  5. Evaluating automatically parallelized versions of the support vector machine

    NARCIS (Netherlands)

    Codreanu, Valeriu; Droge, Bob; Williams, David; Yasar, Burhan; Yang, Fo; Liu, Baoquan; Dong, Feng; Surinta, Olarik; Schomaker, Lambertus; Roerdink, Jos; Wiering, Marco

    2014-01-01

    The support vector machine (SVM) is a supervised learning algorithm used for recognizing patterns in data. It is a very popular technique in machine learning and has been successfully used in applications such as image classification, protein classification, and handwriting recognition. However, the

  6. Evaluating automatically parallelized versions of the support vector machine

    NARCIS (Netherlands)

    Codreanu, V.; Dröge, B.; Williams, D.; Yasar, B.; Yang, P.; Liu, B.; Dong, F.; Surinta, O.; Schomaker, L.R.B.; Roerdink, J.B.T.M.; Wiering, M.A.

    2016-01-01

    The support vector machine (SVM) is a supervised learning algorithm used for recognizing patterns in data. It is a very popular technique in machine learning and has been successfully used in applications such as image classification, protein classification, and handwriting recognition. However, the

  7. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics

    OpenAIRE

    HUANG, SHUJUN; CAI, NIANGUANG; PACHECO, PEDRO PENZUTI; NARANDES, SHAVIRA; WANG, YANG; XU, WAYNE

    2017-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better ...

  8. Online Sequential Projection Vector Machine with Adaptive Data Mean Update.

    Science.gov (United States)

    Chen, Lin; Jia, Ji-Ting; Zhang, Qiong; Deng, Wan-Yu; Wei, Wei

    2016-01-01

    We propose a simple online learning algorithm especial for high-dimensional data. The algorithm is referred to as online sequential projection vector machine (OSPVM) which derives from projection vector machine and can learn from data in one-by-one or chunk-by-chunk mode. In OSPVM, data centering, dimension reduction, and neural network training are integrated seamlessly. In particular, the model parameters including (1) the projection vectors for dimension reduction, (2) the input weights, biases, and output weights, and (3) the number of hidden nodes can be updated simultaneously. Moreover, only one parameter, the number of hidden nodes, needs to be determined manually, and this makes it easy for use in real applications. Performance comparison was made on various high-dimensional classification problems for OSPVM against other fast online algorithms including budgeted stochastic gradient descent (BSGD) approach, adaptive multihyperplane machine (AMM), primal estimated subgradient solver (Pegasos), online sequential extreme learning machine (OSELM), and SVD + OSELM (feature selection based on SVD is performed before OSELM). The results obtained demonstrated the superior generalization performance and efficiency of the OSPVM.

  9. Online Sequential Projection Vector Machine with Adaptive Data Mean Update

    Directory of Open Access Journals (Sweden)

    Lin Chen

    2016-01-01

    Full Text Available We propose a simple online learning algorithm especial for high-dimensional data. The algorithm is referred to as online sequential projection vector machine (OSPVM which derives from projection vector machine and can learn from data in one-by-one or chunk-by-chunk mode. In OSPVM, data centering, dimension reduction, and neural network training are integrated seamlessly. In particular, the model parameters including (1 the projection vectors for dimension reduction, (2 the input weights, biases, and output weights, and (3 the number of hidden nodes can be updated simultaneously. Moreover, only one parameter, the number of hidden nodes, needs to be determined manually, and this makes it easy for use in real applications. Performance comparison was made on various high-dimensional classification problems for OSPVM against other fast online algorithms including budgeted stochastic gradient descent (BSGD approach, adaptive multihyperplane machine (AMM, primal estimated subgradient solver (Pegasos, online sequential extreme learning machine (OSELM, and SVD + OSELM (feature selection based on SVD is performed before OSELM. The results obtained demonstrated the superior generalization performance and efficiency of the OSPVM.

  10. Intelligent Quality Prediction Using Weighted Least Square Support Vector Regression

    Science.gov (United States)

    Yu, Yaojun

    A novel quality prediction method with mobile time window is proposed for small-batch producing process based on weighted least squares support vector regression (LS-SVR). The design steps and learning algorithm are also addressed. In the method, weighted LS-SVR is taken as the intelligent kernel, with which the small-batch learning is solved well and the nearer sample is set a larger weight, while the farther is set the smaller weight in the history data. A typical machining process of cutting bearing outer race is carried out and the real measured data are used to contrast experiment. The experimental results demonstrate that the prediction accuracy of the weighted LS-SVR based model is only 20%-30% that of the standard LS-SVR based one in the same condition. It provides a better candidate for quality prediction of small-batch producing process.

  11. Infinite ensemble of support vector machines for prediction of ...

    African Journals Online (AJOL)

    Many researchers have demonstrated the use of artificial neural networks (ANNs) to predict musculoskeletal disorders risk associated with occupational exposures. In order to improve the accuracy of LBDs risk classification, this paper proposes to use the support vector machines (SVMs), a machine learning algorithm used ...

  12. Variance inflation in high dimensional Support Vector Machines

    DEFF Research Database (Denmark)

    Abrahamsen, Trine Julie; Hansen, Lars Kai

    2013-01-01

    Many important machine learning models, supervised and unsupervised, are based on simple Euclidean distance or orthogonal projection in a high dimensional feature space. When estimating such models from small training sets we face the problem that the span of the training data set input vectors...... the case of Support Vector Machines (SVMS) and we propose a non-parametric scheme to restore proper generalizability. We illustrate the algorithm and its ability to restore performance on a wide range of benchmark data sets....... follow a different probability law with less variance. While the problem and basic means to reconstruct and deflate are well understood in unsupervised learning, the case of supervised learning is less well understood. We here investigate the effect of variance inflation in supervised learning including...

  13. A multi-scale relevance vector regression approach for daily urban water demand forecasting

    Science.gov (United States)

    Bai, Yun; Wang, Pu; Li, Chuan; Xie, Jingjing; Wang, Yin

    2014-09-01

    Water is one of the most important resources for economic and social developments. Daily water demand forecasting is an effective measure for scheduling urban water facilities. This work proposes a multi-scale relevance vector regression (MSRVR) approach to forecast daily urban water demand. The approach uses the stationary wavelet transform to decompose historical time series of daily water supplies into different scales. At each scale, the wavelet coefficients are used to train a machine-learning model using the relevance vector regression (RVR) method. The estimated coefficients of the RVR outputs for all of the scales are employed to reconstruct the forecasting result through the inverse wavelet transform. To better facilitate the MSRVR forecasting, the chaos features of the daily water supply series are analyzed to determine the input variables of the RVR model. In addition, an adaptive chaos particle swarm optimization algorithm is used to find the optimal combination of the RVR model parameters. The MSRVR approach is evaluated using real data collected from two waterworks and is compared with recently reported methods. The results show that the proposed MSRVR method can forecast daily urban water demand much more precisely in terms of the normalized root-mean-square error, correlation coefficient, and mean absolute percentage error criteria.

  14. Predicting subject-driven actions and sensory experience in a virtual world with relevance vector machine regression of fMRI data.

    Science.gov (United States)

    Valente, Giancarlo; De Martino, Federico; Esposito, Fabrizio; Goebel, Rainer; Formisano, Elia

    2011-05-15

    In this work we illustrate the approach of the Maastricht Brain Imaging Center to the PBAIC 2007 competition, where participants had to predict, based on fMRI measurements of brain activity, subject driven actions and sensory experience in a virtual world. After standard pre-processing (slice scan time correction, motion correction), we generated rating predictions based on linear Relevance Vector Machine (RVM) learning from all brain voxels. Spatial and temporal filtering of the time series was optimized rating by rating. For some of the ratings (e.g. Instructions, Hits, Faces, Velocity), linear RVM regression was accurate and very consistent within and between subjects. For other ratings (e.g. Arousal, Valence) results were less satisfactory. Our approach ranked overall second. To investigate the role of different brain regions in ratings prediction we generated predictive maps, i.e. maps of the weighted contribution of each voxel to the predicted rating. These maps generally included (but were not limited to) "specialized" regions which are consistent with results from conventional neuroimaging studies and known functional neuroanatomy. In conclusion, Sparse Bayesian Learning models, such as RVM, appear to be a valuable approach to the multivariate regression of fMRI time series. The implementation of the Automatic Relevance Determination criterion is particularly suitable and provides a good generalization, despite the limited number of samples which is typically available in fMRI. Predictive maps allow disclosing multi-voxel patterns of brain activity that predict perceptual and behavioral subjective experience. Copyright © 2010 Elsevier Inc. All rights reserved.

  15. Accelerating Relevance Vector Machine for Large-Scale Data on Spark

    Directory of Open Access Journals (Sweden)

    Liu Fang

    2017-01-01

    Full Text Available Relevance vector machine (RVM is a machine learning algorithm based on a sparse Bayesian framework, which performs well when running classification and regression tasks on small-scale datasets. However, RVM also has certain drawbacks which restricts its practical applications such as (1 slow training process, (2 poor performance on training large-scale datasets. In order to solve these problem, we propose Discrete AdaBoost RVM (DAB-RVM which incorporate ensemble learning in RVM at first. This method performs well with large-scale low-dimensional datasets. However, as the number of features increases, the training time of DAB-RVM increases as well. To avoid this phenomenon, we utilize the sufficient training samples of large-scale datasets and propose all features boosting RVM (AFB-RVM, which modifies the way of obtaining weak classifiers. In our experiments we study the differences between various boosting techniques with RVM, demonstrating the performance of the proposed approaches on Spark. As a result of this paper, two proposed approaches on Spark for different types of large-scale datasets are available.

  16. Representative Vector Machines: A Unified Framework for Classical Classifiers.

    Science.gov (United States)

    Gui, Jie; Liu, Tongliang; Tao, Dacheng; Sun, Zhenan; Tan, Tieniu

    2016-08-01

    Classifier design is a fundamental problem in pattern recognition. A variety of pattern classification methods such as the nearest neighbor (NN) classifier, support vector machine (SVM), and sparse representation-based classification (SRC) have been proposed in the literature. These typical and widely used classifiers were originally developed from different theory or application motivations and they are conventionally treated as independent and specific solutions for pattern classification. This paper proposes a novel pattern classification framework, namely, representative vector machines (or RVMs for short). The basic idea of RVMs is to assign the class label of a test example according to its nearest representative vector. The contributions of RVMs are twofold. On one hand, the proposed RVMs establish a unified framework of classical classifiers because NN, SVM, and SRC can be interpreted as the special cases of RVMs with different definitions of representative vectors. Thus, the underlying relationship among a number of classical classifiers is revealed for better understanding of pattern classification. On the other hand, novel and advanced classifiers are inspired in the framework of RVMs. For example, a robust pattern classification method called discriminant vector machine (DVM) is motivated from RVMs. Given a test example, DVM first finds its k -NNs and then performs classification based on the robust M-estimator and manifold regularization. Extensive experimental evaluations on a variety of visual recognition tasks such as face recognition (Yale and face recognition grand challenge databases), object categorization (Caltech-101 dataset), and action recognition (Action Similarity LAbeliNg) demonstrate the advantages of DVM over other classifiers.

  17. Landslide susceptibility mapping using support vector machine and ...

    Indian Academy of Sciences (India)

    the prediction rate methods, the validation process was performed by ... support vector machine (SVM); geographical information systems (GIS); ... 2012a), decision tree methods (Akgun .... gence or divergence of water during downhill flow.

  18. Research on intrusion detection based on Kohonen network and support vector machine

    Science.gov (United States)

    Shuai, Chunyan; Yang, Hengcheng; Gong, Zeweiyi

    2018-05-01

    In view of the problem of low detection accuracy and the long detection time of support vector machine, which directly applied to the network intrusion detection system. Optimization of SVM parameters can greatly improve the detection accuracy, but it can not be applied to high-speed network because of the long detection time. a method based on Kohonen neural network feature selection is proposed to reduce the optimization time of support vector machine parameters. Firstly, this paper is to calculate the weights of the KDD99 network intrusion data by Kohonen network and select feature by weight. Then, after the feature selection is completed, genetic algorithm (GA) and grid search method are used for parameter optimization to find the appropriate parameters and classify them by support vector machines. By comparing experiments, it is concluded that feature selection can reduce the time of parameter optimization, which has little influence on the accuracy of classification. The experiments suggest that the support vector machine can be used in the network intrusion detection system and reduce the missing rate.

  19. Wavelength detection in FBG sensor networks using least squares support vector regression

    Science.gov (United States)

    Chen, Jing; Jiang, Hao; Liu, Tundong; Fu, Xiaoli

    2014-04-01

    A wavelength detection method for a wavelength division multiplexing (WDM) fiber Bragg grating (FBG) sensor network is proposed based on least squares support vector regression (LS-SVR). As a kind of promising machine learning technique, LS-SVR is employed to approximate the inverse function of the reflection spectrum. The LS-SVR detection model is established from the training samples, and then the Bragg wavelength of each FBG can be directly identified by inputting the measured spectrum into the well-trained model. We also discuss the impact of the sample size and the preprocess of the input spectrum on the performance of the training effectiveness. The results demonstrate that our approach is effective in improving the accuracy for sensor networks with a large number of FBGs.

  20. LINEAR KERNEL SUPPORT VECTOR MACHINES FOR MODELING PORE-WATER PRESSURE RESPONSES

    Directory of Open Access Journals (Sweden)

    KHAMARUZAMAN W. YUSOF

    2017-08-01

    Full Text Available Pore-water pressure responses are vital in many aspects of slope management, design and monitoring. Its measurement however, is difficult, expensive and time consuming. Studies on its predictions are lacking. Support vector machines with linear kernel was used here to predict the responses of pore-water pressure to rainfall. Pore-water pressure response data was collected from slope instrumentation program. Support vector machine meta-parameter calibration and model development was carried out using grid search and k-fold cross validation. The mean square error for the model on scaled test data is 0.0015 and the coefficient of determination is 0.9321. Although pore-water pressure response to rainfall is a complex nonlinear process, the use of linear kernel support vector machine can be employed where high accuracy can be sacrificed for computational ease and time.

  1. Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution

    Science.gov (United States)

    Kisi, Ozgur; Parmar, Kulwinder Singh

    2016-03-01

    This study investigates the accuracy of least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5Tree) in modeling river water pollution. Various combinations of water quality parameters, Free Ammonia (AMM), Total Kjeldahl Nitrogen (TKN), Water Temperature (WT), Total Coliform (TC), Fecal Coliform (FC) and Potential of Hydrogen (pH) monitored at Nizamuddin, Delhi Yamuna River in India were used as inputs to the applied models. Results indicated that the LSSVM and MARS models had almost same accuracy and they performed better than the M5Tree model in modeling monthly chemical oxygen demand (COD). The average root mean square error (RMSE) of the LSSVM and M5Tree models was decreased by 1.47% and 19.1% using MARS model, respectively. Adding TC input to the models did not increase their accuracy in modeling COD while adding FC and pH inputs to the models generally decreased the accuracy. The overall results indicated that the MARS and LSSVM models could be successfully used in estimating monthly river water pollution level by using AMM, TKN and WT parameters as inputs.

  2. Model Predictive Engine Air-Ratio Control Using Online Sequential Relevance Vector Machine

    Directory of Open Access Journals (Sweden)

    Hang-cheong Wong

    2012-01-01

    Full Text Available Engine power, brake-specific fuel consumption, and emissions relate closely to air ratio (i.e., lambda among all the engine variables. An accurate and adaptive model for lambda prediction is essential to effective lambda control for long term. This paper utilizes an emerging technique, relevance vector machine (RVM, to build a reliable time-dependent lambda model which can be continually updated whenever a sample is added to, or removed from, the estimated lambda model. The paper also presents a new model predictive control (MPC algorithm for air-ratio regulation based on RVM. This study shows that the accuracy, training, and updating time of the RVM model are superior to the latest modelling methods, such as diagonal recurrent neural network (DRNN and decremental least-squares support vector machine (DLSSVM. Moreover, the control algorithm has been implemented on a real car to test. Experimental results reveal that the control performance of the proposed relevance vector machine model predictive controller (RVMMPC is also superior to DRNNMPC, support vector machine-based MPC, and conventional proportional-integral (PI controller in production cars. Therefore, the proposed RVMMPC is a promising scheme to replace conventional PI controller for engine air-ratio control.

  3. Weighted K-means support vector machine for cancer prediction.

    Science.gov (United States)

    Kim, SungHwan

    2016-01-01

    To date, the support vector machine (SVM) has been widely applied to diverse bio-medical fields to address disease subtype identification and pathogenicity of genetic variants. In this paper, I propose the weighted K-means support vector machine (wKM-SVM) and weighted support vector machine (wSVM), for which I allow the SVM to impose weights to the loss term. Besides, I demonstrate the numerical relations between the objective function of the SVM and weights. Motivated by general ensemble techniques, which are known to improve accuracy, I directly adopt the boosting algorithm to the newly proposed weighted KM-SVM (and wSVM). For predictive performance, a range of simulation studies demonstrate that the weighted KM-SVM (and wSVM) with boosting outperforms the standard KM-SVM (and SVM) including but not limited to many popular classification rules. I applied the proposed methods to simulated data and two large-scale real applications in the TCGA pan-cancer methylation data of breast and kidney cancer. In conclusion, the weighted KM-SVM (and wSVM) increases accuracy of the classification model, and will facilitate disease diagnosis and clinical treatment decisions to benefit patients. A software package (wSVM) is publicly available at the R-project webpage (https://www.r-project.org).

  4. Intelligent Design of Metal Oxide Gas Sensor Arrays Using Reciprocal Kernel Support Vector Regression

    Science.gov (United States)

    Dougherty, Andrew W.

    Metal oxides are a staple of the sensor industry. The combination of their sensitivity to a number of gases, and the electrical nature of their sensing mechanism, make the particularly attractive in solid state devices. The high temperature stability of the ceramic material also make them ideal for detecting combustion byproducts where exhaust temperatures can be high. However, problems do exist with metal oxide sensors. They are not very selective as they all tend to be sensitive to a number of reduction and oxidation reactions on the oxide's surface. This makes sensors with large numbers of sensors interesting to study as a method for introducing orthogonality to the system. Also, the sensors tend to suffer from long term drift for a number of reasons. In this thesis I will develop a system for intelligently modeling metal oxide sensors and determining their suitability for use in large arrays designed to analyze exhaust gas streams. It will introduce prior knowledge of the metal oxide sensors' response mechanisms in order to produce a response function for each sensor from sparse training data. The system will use the same technique to model and remove any long term drift from the sensor response. It will also provide an efficient means for determining the orthogonality of the sensor to determine whether they are useful in gas sensing arrays. The system is based on least squares support vector regression using the reciprocal kernel. The reciprocal kernel is introduced along with a method of optimizing the free parameters of the reciprocal kernel support vector machine. The reciprocal kernel is shown to be simpler and to perform better than an earlier kernel, the modified reciprocal kernel. Least squares support vector regression is chosen as it uses all of the training points and an emphasis was placed throughout this research for extracting the maximum information from very sparse data. The reciprocal kernel is shown to be effective in modeling the sensor

  5. Prediction of Banking Systemic Risk Based on Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Shouwei Li

    2013-01-01

    Full Text Available Banking systemic risk is a complex nonlinear phenomenon and has shed light on the importance of safeguarding financial stability by recent financial crisis. According to the complex nonlinear characteristics of banking systemic risk, in this paper we apply support vector machine (SVM to the prediction of banking systemic risk in an attempt to suggest a new model with better explanatory power and stability. We conduct a case study of an SVM-based prediction model for Chinese banking systemic risk and find the experiment results showing that support vector machine is an efficient method in such case.

  6. Experimental comparison of support vector machines with random ...

    Indian Academy of Sciences (India)

    dient method, support vector machines, and random forests to improve producer accuracy and overall classification accuracy. The performance comparison of these classifiers is valuable for a decision maker ... ping, surveillance system, resource management, tracking ... rocks, water bodies, and anthropogenic elements,.

  7. Product Quality Modelling Based on Incremental Support Vector Machine

    International Nuclear Information System (INIS)

    Wang, J; Zhang, W; Qin, B; Shi, W

    2012-01-01

    Incremental Support vector machine (ISVM) is a new learning method developed in recent years based on the foundations of statistical learning theory. It is suitable for the problem of sequentially arriving field data and has been widely used for product quality prediction and production process optimization. However, the traditional ISVM learning does not consider the quality of the incremental data which may contain noise and redundant data; it will affect the learning speed and accuracy to a great extent. In order to improve SVM training speed and accuracy, a modified incremental support vector machine (MISVM) is proposed in this paper. Firstly, the margin vectors are extracted according to the Karush-Kuhn-Tucker (KKT) condition; then the distance from the margin vectors to the final decision hyperplane is calculated to evaluate the importance of margin vectors, where the margin vectors are removed while their distance exceed the specified value; finally, the original SVs and remaining margin vectors are used to update the SVM. The proposed MISVM can not only eliminate the unimportant samples such as noise samples, but also can preserve the important samples. The MISVM has been experimented on two public data and one field data of zinc coating weight in strip hot-dip galvanizing, and the results shows that the proposed method can improve the prediction accuracy and the training speed effectively. Furthermore, it can provide the necessary decision supports and analysis tools for auto control of product quality, and also can extend to other process industries, such as chemical process and manufacturing process.

  8. Support vector methods for survival analysis: a comparison between ranking and regression approaches.

    Science.gov (United States)

    Van Belle, Vanya; Pelckmans, Kristiaan; Van Huffel, Sabine; Suykens, Johan A K

    2011-10-01

    To compare and evaluate ranking, regression and combined machine learning approaches for the analysis of survival data. The literature describes two approaches based on support vector machines to deal with censored observations. In the first approach the key idea is to rephrase the task as a ranking problem via the concordance index, a problem which can be solved efficiently in a context of structural risk minimization and convex optimization techniques. In a second approach, one uses a regression approach, dealing with censoring by means of inequality constraints. The goal of this paper is then twofold: (i) introducing a new model combining the ranking and regression strategy, which retains the link with existing survival models such as the proportional hazards model via transformation models; and (ii) comparison of the three techniques on 6 clinical and 3 high-dimensional datasets and discussing the relevance of these techniques over classical approaches fur survival data. We compare svm-based survival models based on ranking constraints, based on regression constraints and models based on both ranking and regression constraints. The performance of the models is compared by means of three different measures: (i) the concordance index, measuring the model's discriminating ability; (ii) the logrank test statistic, indicating whether patients with a prognostic index lower than the median prognostic index have a significant different survival than patients with a prognostic index higher than the median; and (iii) the hazard ratio after normalization to restrict the prognostic index between 0 and 1. Our results indicate a significantly better performance for models including regression constraints above models only based on ranking constraints. This work gives empirical evidence that svm-based models using regression constraints perform significantly better than svm-based models based on ranking constraints. Our experiments show a comparable performance for methods

  9. An Ensemble of Deep Support Vector Machines for Image Categorization

    NARCIS (Netherlands)

    Abdullah, Azizi; Veltkamp, Remco C.; Wiering, Marco

    2009-01-01

    This paper presents the deep support vector machine (D-SVM) inspired by the increasing popularity of deep belief networks for image recognition. Our deep SVM trains an SVM in the standard way and then uses the kernel activations of support vectors as inputs for training another SVM at the next

  10. Multiple-output support vector machine regression with feature selection for arousal/valence space emotion assessment.

    Science.gov (United States)

    Torres-Valencia, Cristian A; Álvarez, Mauricio A; Orozco-Gutiérrez, Alvaro A

    2014-01-01

    Human emotion recognition (HER) allows the assessment of an affective state of a subject. Until recently, such emotional states were described in terms of discrete emotions, like happiness or contempt. In order to cover a high range of emotions, researchers in the field have introduced different dimensional spaces for emotion description that allow the characterization of affective states in terms of several variables or dimensions that measure distinct aspects of the emotion. One of the most common of such dimensional spaces is the bidimensional Arousal/Valence space. To the best of our knowledge, all HER systems so far have modelled independently, the dimensions in these dimensional spaces. In this paper, we study the effect of modelling the output dimensions simultaneously and show experimentally the advantages in modeling them in this way. We consider a multimodal approach by including features from the Electroencephalogram and a few physiological signals. For modelling the multiple outputs, we employ a multiple output regressor based on support vector machines. We also include an stage of feature selection that is developed within an embedded approach known as Recursive Feature Elimination (RFE), proposed initially for SVM. The results show that several features can be eliminated using the multiple output support vector regressor with RFE without affecting the performance of the regressor. From the analysis of the features selected in smaller subsets via RFE, it can be observed that the signals that are more informative into the arousal and valence space discrimination are the EEG, Electrooculogram/Electromiogram (EOG/EMG) and the Galvanic Skin Response (GSR).

  11. Relevance vector machine technique for the inverse scattering problem

    International Nuclear Information System (INIS)

    Wang Fang-Fang; Zhang Ye-Rong

    2012-01-01

    A novel method based on the relevance vector machine (RVM) for the inverse scattering problem is presented in this paper. The nonlinearity and the ill-posedness inherent in this problem are simultaneously considered. The nonlinearity is embodied in the relation between the scattered field and the target property, which can be obtained through the RVM training process. Besides, rather than utilizing regularization, the ill-posed nature of the inversion is naturally accounted for because the RVM can produce a probabilistic output. Simulation results reveal that the proposed RVM-based approach can provide comparative performances in terms of accuracy, convergence, robustness, generalization, and improved performance in terms of sparse property in comparison with the support vector machine (SVM) based approach. (general)

  12. GenSVM: a generalized multiclass support vector machine

    NARCIS (Netherlands)

    G.J.J. van den Burg (Gertjan); P.J.F. Groenen (Patrick)

    2016-01-01

    textabstractTraditional extensions of the binary support vector machine (SVM) to multiclass problems are either heuristics or require solving a large dual optimization problem. Here, a generalized multiclass SVM is proposed called GenSVM. In this method classification boundaries for a K-class

  13. Soft sensor development and optimization of the commercial petrochemical plant integrating support vector regression and genetic algorithm

    Directory of Open Access Journals (Sweden)

    S.K. Lahiri

    2009-09-01

    Full Text Available Soft sensors have been widely used in the industrial process control to improve the quality of the product and assure safety in the production. The core of a soft sensor is to construct a soft sensing model. This paper introduces support vector regression (SVR, a new powerful machine learning methodbased on a statistical learning theory (SLT into soft sensor modeling and proposes a new soft sensing modeling method based on SVR. This paper presents an artificial intelligence based hybrid soft sensormodeling and optimization strategies, namely support vector regression – genetic algorithm (SVR-GA for modeling and optimization of mono ethylene glycol (MEG quality variable in a commercial glycol plant. In the SVR-GA approach, a support vector regression model is constructed for correlating the process data comprising values of operating and performance variables. Next, model inputs describing the process operating variables are optimized using genetic algorithm with a view to maximize the process performance. The SVR-GA is a new strategy for soft sensor modeling and optimization. The major advantage of the strategies is that modeling and optimization can be conducted exclusively from the historic process data wherein the detailed knowledge of process phenomenology (reaction mechanism, kinetics etc. is not required. Using SVR-GA strategy, a number of sets of optimized operating conditions were found. The optimized solutions, when verified in an actual plant, resulted in a significant improvement in the quality.

  14. Sistem Deteksi Retinopati Diabetik Menggunakan Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Wahyudi Setiawan

    2014-02-01

    Full Text Available Diabetic Retinopathy is a complication of Diabetes Melitus. It can be a blindness if untreated settled as early as possible. System created in this thesis is the detection of diabetic retinopathy level of the image obtained from fundus photographs. There are three main steps to resolve the problems, preprocessing, feature extraction and classification. Preprocessing methods that used in this system are Grayscale Green Channel, Gaussian Filter, Contrast Limited Adaptive Histogram Equalization and Masking. Two Dimensional Linear Discriminant Analysis (2DLDA is used for feature extraction. Support Vector Machine (SVM is used for classification. The test result performed by taking a dataset of MESSIDOR with number of images that vary for the training phase, otherwise is used for the testing phase. Test result show the optimal accuracy are 84% .   Keywords : Diabetic Retinopathy, Support Vector Machine, Two Dimensional Linear Discriminant Analysis, MESSIDOR

  15. Progressive Classification Using Support Vector Machines

    Science.gov (United States)

    Wagstaff, Kiri; Kocurek, Michael

    2009-01-01

    An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. The user

  16. Towards smart energy systems: application of kernel machine regression for medium term electricity load forecasting.

    Science.gov (United States)

    Alamaniotis, Miltiadis; Bargiotas, Dimitrios; Tsoukalas, Lefteri H

    2016-01-01

    Integration of energy systems with information technologies has facilitated the realization of smart energy systems that utilize information to optimize system operation. To that end, crucial in optimizing energy system operation is the accurate, ahead-of-time forecasting of load demand. In particular, load forecasting allows planning of system expansion, and decision making for enhancing system safety and reliability. In this paper, the application of two types of kernel machines for medium term load forecasting (MTLF) is presented and their performance is recorded based on a set of historical electricity load demand data. The two kernel machine models and more specifically Gaussian process regression (GPR) and relevance vector regression (RVR) are utilized for making predictions over future load demand. Both models, i.e., GPR and RVR, are equipped with a Gaussian kernel and are tested on daily predictions for a 30-day-ahead horizon taken from the New England Area. Furthermore, their performance is compared to the ARMA(2,2) model with respect to mean average percentage error and squared correlation coefficient. Results demonstrate the superiority of RVR over the other forecasting models in performing MTLF.

  17. A Bayesian least squares support vector machines based framework for fault diagnosis and failure prognosis

    Science.gov (United States)

    Khawaja, Taimoor Saleem

    A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, definition of a set of fault indicators and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful Least Squares Support Vector Machine (LS-SVM) algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVM machines are founded on the principle of Structural Risk Minimization (SRM) which tends to find a good trade-off between low empirical risk and small capacity. The key features in SVM are the use of non-linear kernels, the absence of local minima, the sparseness of the solution and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on the LS-SVM machines. The proposed scheme uses only baseline data to construct a 1-class LS-SVM machine which, when presented with online data is able to distinguish between normal behavior

  18. Investigation of support vector machine for the detection of architectural distortion in mammographic images

    International Nuclear Information System (INIS)

    Guo, Q; Shao, J; Ruiz, V

    2005-01-01

    This paper investigates detection of architectural distortion in mammographic images using support vector machine. Hausdorff dimension is used to characterise the texture feature of mammographic images. Support vector machine, a learning machine based on statistical learning theory, is trained through supervised learning to detect architectural distortion. Compared to the Radial Basis Function neural networks, SVM produced more accurate classification results in distinguishing architectural distortion abnormality from normal breast parenchyma

  19. Investigation of support vector machine for the detection of architectural distortion in mammographic images

    Energy Technology Data Exchange (ETDEWEB)

    Guo, Q [Department of Cybernetics, University of Reading, Reading RG6 6AY (United Kingdom); Shao, J [Department of Electronics, University of Kent at Canterbury, Kent CT2 7NT (United Kingdom); Ruiz, V [Department of Cybernetics, University of Reading, Reading RG6 6AY (United Kingdom)

    2005-01-01

    This paper investigates detection of architectural distortion in mammographic images using support vector machine. Hausdorff dimension is used to characterise the texture feature of mammographic images. Support vector machine, a learning machine based on statistical learning theory, is trained through supervised learning to detect architectural distortion. Compared to the Radial Basis Function neural networks, SVM produced more accurate classification results in distinguishing architectural distortion abnormality from normal breast parenchyma.

  20. Particle swarm optimization-based support vector machine for forecasting dissolved gases content in power transformer oil

    Energy Technology Data Exchange (ETDEWEB)

    Fei, Sheng-wei; Wang, Ming-Jun; Miao, Yu-bin; Tu, Jun; Liu, Cheng-liang [School of Mechanical Engineering, Shanghai Jiaotong University, Shanghai 200240 (China)

    2009-06-15

    Forecasting of dissolved gases content in power transformer oil is a complicated problem due to its nonlinearity and the small quantity of training data. Support vector machine (SVM) has been successfully employed to solve regression problem of nonlinearity and small sample. However, the practicability of SVM is effected due to the difficulty of selecting appropriate SVM parameters. Particle swarm optimization (PSO) is a new optimization method, which is motivated by social behaviour of organisms such as bird flocking and fish schooling. The method not only has strong global search capability, but also is very easy to implement. Thus, the proposed PSO-SVM model is applied to forecast dissolved gases content in power transformer oil in this paper, among which PSO is used to determine free parameters of support vector machine. The experimental data from several electric power companies in China is used to illustrate the performance of proposed PSO-SVM model. The experimental results indicate that the PSO-SVM method can achieve greater forecasting accuracy than grey model, artificial neural network under the circumstances of small sample. (author)

  1. Particle swarm optimization-based support vector machine for forecasting dissolved gases content in power transformer oil

    Energy Technology Data Exchange (ETDEWEB)

    Fei Shengwei [School of Mechanical Engineering, Shanghai Jiaotong University, Shanghai 200240 (China)], E-mail: feishengwei@sohu.com; Wang Mingjun; Miao Yubin; Tu Jun; Liu Chengliang [School of Mechanical Engineering, Shanghai Jiaotong University, Shanghai 200240 (China)

    2009-06-15

    Forecasting of dissolved gases content in power transformer oil is a complicated problem due to its nonlinearity and the small quantity of training data. Support vector machine (SVM) has been successfully employed to solve regression problem of nonlinearity and small sample. However, the practicability of SVM is effected due to the difficulty of selecting appropriate SVM parameters. Particle swarm optimization (PSO) is a new optimization method, which is motivated by social behaviour of organisms such as bird flocking and fish schooling. The method not only has strong global search capability, but also is very easy to implement. Thus, the proposed PSO-SVM model is applied to forecast dissolved gases content in power transformer oil in this paper, among which PSO is used to determine free parameters of support vector machine. The experimental data from several electric power companies in China is used to illustrate the performance of proposed PSO-SVM model. The experimental results indicate that the PSO-SVM method can achieve greater forecasting accuracy than grey model, artificial neural network under the circumstances of small sample.

  2. Support vector machine for the diagnosis of malignant mesothelioma

    Science.gov (United States)

    Ushasukhanya, S.; Nithyakalyani, A.; Sivakumar, V.

    2018-04-01

    Harmful mesothelioma is an illness in which threatening (malignancy) cells shape in the covering of the trunk or stomach area. Being presented to asbestos can influence the danger of threatening mesothelioma. Signs and side effects of threatening mesothelioma incorporate shortness of breath and agony under the rib confine. Tests that inspect within the trunk and belly are utilized to recognize (find) and analyse harmful mesothelioma. Certain elements influence forecast (shot of recuperation) and treatment choices. In this review, Support vector machine (SVM) classifiers were utilized for Mesothelioma sickness conclusion. SVM output is contrasted by concentrating on Mesothelioma’s sickness and findings by utilizing similar information set. The support vector machine algorithm gives 92.5% precision acquired by means of 3-overlap cross-approval. The Mesothelioma illness dataset were taken from an organization reports from Turkey.

  3. Multivariate calibration with least-squares support vector machines.

    NARCIS (Netherlands)

    Thissen, U.M.J.; Ustun, B.; Melssen, W.J.; Buydens, L.M.C.

    2004-01-01

    This paper proposes the use of least-squares support vector machines (LS-SVMs) as a relatively new nonlinear multivariate calibration method, capable of dealing with ill-posed problems. LS-SVMs are an extension of "traditional" SVMs that have been introduced recently in the field of chemistry and

  4. Fruit fly optimization based least square support vector regression for blind image restoration

    Science.gov (United States)

    Zhang, Jiao; Wang, Rui; Li, Junshan; Yang, Yawei

    2014-11-01

    The goal of image restoration is to reconstruct the original scene from a degraded observation. It is a critical and challenging task in image processing. Classical restorations require explicit knowledge of the point spread function and a description of the noise as priors. However, it is not practical for many real image processing. The recovery processing needs to be a blind image restoration scenario. Since blind deconvolution is an ill-posed problem, many blind restoration methods need to make additional assumptions to construct restrictions. Due to the differences of PSF and noise energy, blurring images can be quite different. It is difficult to achieve a good balance between proper assumption and high restoration quality in blind deconvolution. Recently, machine learning techniques have been applied to blind image restoration. The least square support vector regression (LSSVR) has been proven to offer strong potential in estimating and forecasting issues. Therefore, this paper proposes a LSSVR-based image restoration method. However, selecting the optimal parameters for support vector machine is essential to the training result. As a novel meta-heuristic algorithm, the fruit fly optimization algorithm (FOA) can be used to handle optimization problems, and has the advantages of fast convergence to the global optimal solution. In the proposed method, the training samples are created from a neighborhood in the degraded image to the central pixel in the original image. The mapping between the degraded image and the original image is learned by training LSSVR. The two parameters of LSSVR are optimized though FOA. The fitness function of FOA is calculated by the restoration error function. With the acquired mapping, the degraded image can be recovered. Experimental results show the proposed method can obtain satisfactory restoration effect. Compared with BP neural network regression, SVR method and Lucy-Richardson algorithm, it speeds up the restoration rate and

  5. Investigation of Pear Drying Performance by Different Methods and Regression of Convective Heat Transfer Coefficient with Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Mehmet Das

    2018-01-01

    Full Text Available In this study, an air heated solar collector (AHSC dryer was designed to determine the drying characteristics of the pear. Flat pear slices of 10 mm thickness were used in the experiments. The pears were dried both in the AHSC dryer and under the sun. Panel glass temperature, panel floor temperature, panel inlet temperature, panel outlet temperature, drying cabinet inlet temperature, drying cabinet outlet temperature, drying cabinet temperature, drying cabinet moisture, solar radiation, pear internal temperature, air velocity and mass loss of pear were measured at 30 min intervals. Experiments were carried out during the periods of June 2017 in Elazig, Turkey. The experiments started at 8:00 a.m. and continued till 18:00. The experiments were continued until the weight changes in the pear slices stopped. Wet basis moisture content (MCw, dry basis moisture content (MCd, adjustable moisture ratio (MR, drying rate (DR, and convective heat transfer coefficient (hc were calculated with both in the AHSC dryer and the open sun drying experiment data. It was found that the values of hc in both drying systems with a range 12.4 and 20.8 W/m2 °C. Three different kernel models were used in the support vector machine (SVM regression to construct the predictive model of the calculated hc values for both systems. The mean absolute error (MAE, root mean squared error (RMSE, relative absolute error (RAE and root relative absolute error (RRAE analysis were performed to indicate the predictive model’s accuracy. As a result, the rate of drying of the pear was examined for both systems and it was observed that the pear had dried earlier in the AHSC drying system. A predictive model was obtained using the SVM regression for the calculated hc values for the pear in the AHSC drying system. The normalized polynomial kernel was determined as the best kernel model in SVM for estimating the hc values.

  6. Deep neural mapping support vector machines.

    Science.gov (United States)

    Li, Yujian; Zhang, Ting

    2017-09-01

    The choice of kernel has an important effect on the performance of a support vector machine (SVM). The effect could be reduced by NEUROSVM, an architecture using multilayer perceptron for feature extraction and SVM for classification. In binary classification, a general linear kernel NEUROSVM can be theoretically simplified as an input layer, many hidden layers, and an SVM output layer. As a feature extractor, the sub-network composed of the input and hidden layers is first trained together with a virtual ordinary output layer by backpropagation, then with the output of its last hidden layer taken as input of the SVM classifier for further training separately. By taking the sub-network as a kernel mapping from the original input space into a feature space, we present a novel model, called deep neural mapping support vector machine (DNMSVM), from the viewpoint of deep learning. This model is also a new and general kernel learning method, where the kernel mapping is indeed an explicit function expressed as a sub-network, different from an implicit function induced by a kernel function traditionally. Moreover, we exploit a two-stage procedure of contrastive divergence learning and gradient descent for DNMSVM to jointly training an adaptive kernel mapping instead of a kernel function, without requirement of kernel tricks. As a whole of the sub-network and the SVM classifier, the joint training of DNMSVM is done by using gradient descent to optimize the objective function with the sub-network layer-wise pre-trained via contrastive divergence learning of restricted Boltzmann machines. Compared to the separate training of NEUROSVM, the joint training is a new algorithm for DNMSVM to have advantages over NEUROSVM. Experimental results show that DNMSVM can outperform NEUROSVM and RBFSVM (i.e., SVM with the kernel of radial basis function), demonstrating its effectiveness. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    Directory of Open Access Journals (Sweden)

    Santana Isabel

    2011-08-01

    Full Text Available Abstract Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI, but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing.

  8. Chord Recognition Based on Temporal Correlation Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Zhongyang Rao

    2016-05-01

    Full Text Available In this paper, we propose a method called temporal correlation support vector machine (TCSVM for automatic major-minor chord recognition in audio music. We first use robust principal component analysis to separate the singing voice from the music to reduce the influence of the singing voice and consider the temporal correlations of the chord features. Using robust principal component analysis, we expect the low-rank component of the spectrogram matrix to contain the musical accompaniment and the sparse component to contain the vocal signals. Then, we extract a new logarithmic pitch class profile (LPCP feature called enhanced LPCP from the low-rank part. To exploit the temporal correlation among the LPCP features of chords, we propose an improved support vector machine algorithm called TCSVM. We perform this study using the MIREX’09 (Music Information Retrieval Evaluation eXchange Audio Chord Estimation dataset. Furthermore, we conduct comprehensive experiments using different pitch class profile feature vectors to examine the performance of TCSVM. The results of our method are comparable to the state-of-the-art methods that entered the MIREX in 2013 and 2014 for the MIREX’09 Audio Chord Estimation task dataset.

  9. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics.

    Science.gov (United States)

    Huang, Shujun; Cai, Nianguang; Pacheco, Pedro Penzuti; Narrandes, Shavira; Wang, Yang; Xu, Wayne

    2018-01-01

    Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications. Copyright© 2018, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.

  10. Hyperspectral image classification using Support Vector Machine

    International Nuclear Information System (INIS)

    Moughal, T A

    2013-01-01

    Classification of land cover hyperspectral images is a very challenging task due to the unfavourable ratio between the number of spectral bands and the number of training samples. The focus in many applications is to investigate an effective classifier in terms of accuracy. The conventional multiclass classifiers have the ability to map the class of interest but the considerable efforts and large training sets are required to fully describe the classes spectrally. Support Vector Machine (SVM) is suggested in this paper to deal with the multiclass problem of hyperspectral imagery. The attraction to this method is that it locates the optimal hyper plane between the class of interest and the rest of the classes to separate them in a new high-dimensional feature space by taking into account only the training samples that lie on the edge of the class distributions known as support vectors and the use of the kernel functions made the classifier more flexible by making it robust against the outliers. A comparative study has undertaken to find an effective classifier by comparing Support Vector Machine (SVM) to the other two well known classifiers i.e. Maximum likelihood (ML) and Spectral Angle Mapper (SAM). At first, the Minimum Noise Fraction (MNF) was applied to extract the best possible features form the hyperspectral imagery and then the resulting subset of the features was applied to the classifiers. Experimental results illustrate that the integration of MNF and SVM technique significantly reduced the classification complexity and improves the classification accuracy.

  11. Prototype Vector Machine for Large Scale Semi-Supervised Learning

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Kai; Kwok, James T.; Parvin, Bahram

    2009-04-29

    Practicaldataminingrarelyfalls exactlyinto the supervisedlearning scenario. Rather, the growing amount of unlabeled data poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computationalintensivenessofgraph-based SSLarises largely from the manifold or graph regularization, which in turn lead to large models that are dificult to handle. To alleviate this, we proposed the prototype vector machine (PVM), a highlyscalable,graph-based algorithm for large-scale SSL. Our key innovation is the use of"prototypes vectors" for effcient approximation on both the graph-based regularizer and model representation. The choice of prototypes are grounded upon two important criteria: they not only perform effective low-rank approximation of the kernel matrix, but also span a model suffering the minimum information loss compared with the complete model. We demonstrate encouraging performance and appealing scaling properties of the PVM on a number of machine learning benchmark data sets.

  12. A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification

    Directory of Open Access Journals (Sweden)

    Wang Lily

    2008-07-01

    Full Text Available Abstract Background Cancer diagnosis and clinical outcome prediction are among the most important emerging applications of gene expression microarray technology with several molecular signatures on their way toward clinical deployment. Use of the most accurate classification algorithms available for microarray gene expression data is a critical ingredient in order to develop the best possible molecular signatures for patient care. As suggested by a large body of literature to date, support vector machines can be considered "best of class" algorithms for classification of such data. Recent work, however, suggests that random forest classifiers may outperform support vector machines in this domain. Results In the present paper we identify methodological biases of prior work comparing random forests and support vector machines and conduct a new rigorous evaluation of the two algorithms that corrects these limitations. Our experiments use 22 diagnostic and prognostic datasets and show that support vector machines outperform random forests, often by a large margin. Our data also underlines the importance of sound research design in benchmarking and comparison of bioinformatics algorithms. Conclusion We found that both on average and in the majority of microarray datasets, random forests are outperformed by support vector machines both in the settings when no gene selection is performed and when several popular gene selection methods are used.

  13. Reservoir Inflow Prediction under GCM Scenario Downscaled by Wavelet Transform and Support Vector Machine Hybrid Models

    Directory of Open Access Journals (Sweden)

    Gusfan Halik

    2015-01-01

    Full Text Available Climate change has significant impacts on changing precipitation patterns causing the variation of the reservoir inflow. Nowadays, Indonesian hydrologist performs reservoir inflow prediction according to the technical guideline of Pd-T-25-2004-A. This technical guideline does not consider the climate variables directly, resulting in significant deviation to the observation results. This research intends to predict the reservoir inflow using the statistical downscaling (SD of General Circulation Model (GCM outputs. The GCM outputs are obtained from the National Center for Environmental Prediction/National Center for Atmospheric Research Reanalysis (NCEP/NCAR Reanalysis. A new proposed hybrid SD model named Wavelet Support Vector Machine (WSVM was utilized. It is a combination of the Multiscale Principal Components Analysis (MSPCA and nonlinear Support Vector Machine regression. The model was validated at Sutami Reservoir, Indonesia. Training and testing were carried out using data of 1991–2008 and 2008–2012, respectively. The results showed that MSPCA produced better extracting data than PCA. The WSVM generated better reservoir inflow prediction than the one of technical guideline. Moreover, this research also applied WSVM for future reservoir inflow prediction based on GCM ECHAM5 and scenario SRES A1B.

  14. Parameter Selection Method for Support Vector Regression Based on Adaptive Fusion of the Mixed Kernel Function

    Directory of Open Access Journals (Sweden)

    Hailun Wang

    2017-01-01

    Full Text Available Support vector regression algorithm is widely used in fault diagnosis of rolling bearing. A new model parameter selection method for support vector regression based on adaptive fusion of the mixed kernel function is proposed in this paper. We choose the mixed kernel function as the kernel function of support vector regression. The mixed kernel function of the fusion coefficients, kernel function parameters, and regression parameters are combined together as the parameters of the state vector. Thus, the model selection problem is transformed into a nonlinear system state estimation problem. We use a 5th-degree cubature Kalman filter to estimate the parameters. In this way, we realize the adaptive selection of mixed kernel function weighted coefficients and the kernel parameters, the regression parameters. Compared with a single kernel function, unscented Kalman filter (UKF support vector regression algorithms, and genetic algorithms, the decision regression function obtained by the proposed method has better generalization ability and higher prediction accuracy.

  15. Ultrasonic fluid quantity measurement in dynamic vehicular applications a support vector machine approach

    CERN Document Server

    Terzic, Jenny; Nagarajah, Romesh; Alamgir, Muhammad

    2013-01-01

    Accurate fluid level measurement in dynamic environments can be assessed using a Support Vector Machine (SVM) approach. SVM is a supervised learning model that analyzes and recognizes patterns. It is a signal classification technique which has far greater accuracy than conventional signal averaging methods. Ultrasonic Fluid Quantity Measurement in Dynamic Vehicular Applications: A Support Vector Machine Approach describes the research and development of a fluid level measurement system for dynamic environments. The measurement system is based on a single ultrasonic sensor. A Support Vector Machines (SVM) based signal characterization and processing system has been developed to compensate for the effects of slosh and temperature variation in fluid level measurement systems used in dynamic environments including automotive applications. It has been demonstrated that a simple ν-SVM model with Radial Basis Function (RBF) Kernel with the inclusion of a Moving Median filter could be used to achieve the high levels...

  16. Support Vector Machines: Relevance Feedback and Information Retrieval.

    Science.gov (United States)

    Drucker, Harris; Shahrary, Behzad; Gibbon, David C.

    2002-01-01

    Compares support vector machines (SVMs) to Rocchio, Ide regular and Ide dec-hi algorithms in information retrieval (IR) of text documents using relevancy feedback. If the preliminary search is so poor that one has to search through many documents to find at least one relevant document, then SVM is preferred. Includes nine tables. (Contains 24…

  17. Predicting post-translational lysine acetylation using support vector machines

    DEFF Research Database (Denmark)

    Gnad, Florian; Ren, Shubin; Choudhary, Chunaram

    2010-01-01

    spectrometry to identify 3600 lysine acetylation sites on 1750 human proteins covering most of the previously annotated sites and providing the most comprehensive acetylome so far. This dataset should provide an excellent source to train support vector machines (SVMs) allowing the high accuracy in silico...

  18. Identifying saltcedar with hyperspectral data and support vector machines

    Science.gov (United States)

    Saltcedar (Tamarix spp.) are a group of dense phreatophytic shrubs and trees that are invasive to riparian areas throughout the United States. This study determined the feasibility of using hyperspectral data and a support vector machine (SVM) classifier to discriminate saltcedar from other cover t...

  19. Daily sea level prediction at Chiayi coast, Taiwan using extreme learning machine and relevance vector machine

    Science.gov (United States)

    Imani, Moslem; Kao, Huan-Chin; Lan, Wen-Hau; Kuo, Chung-Yen

    2018-02-01

    The analysis and the prediction of sea level fluctuations are core requirements of marine meteorology and operational oceanography. Estimates of sea level with hours-to-days warning times are especially important for low-lying regions and coastal zone management. The primary purpose of this study is to examine the applicability and capability of extreme learning machine (ELM) and relevance vector machine (RVM) models for predicting sea level variations and compare their performances with powerful machine learning methods, namely, support vector machine (SVM) and radial basis function (RBF) models. The input dataset from the period of January 2004 to May 2011 used in the study was obtained from the Dongshi tide gauge station in Chiayi, Taiwan. Results showed that the ELM and RVM models outperformed the other methods. The performance of the RVM approach was superior in predicting the daily sea level time series given the minimum root mean square error of 34.73 mm and the maximum determination coefficient of 0.93 (R2) during the testing periods. Furthermore, the obtained results were in close agreement with the original tide-gauge data, which indicates that RVM approach is a promising alternative method for time series prediction and could be successfully used for daily sea level forecasts.

  20. Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment

    Science.gov (United States)

    Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty

    2017-12-01

    Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can solve the problem of classification and linear regression or nonlinear kernel which can be a learning algorithm for the ability of classification and regression. However, SVM also has a weakness that is difficult to determine the optimal parameter value. SVM calculates the best linear separator on the input feature space according to the training data. To classify data which are non-linearly separable, SVM uses kernel tricks to transform the data into a linearly separable data on a higher dimension feature space. The kernel trick using various kinds of kernel functions, such as : linear kernel, polynomial, radial base function (RBF) and sigmoid. Each function has parameters which affect the accuracy of SVM classification. To solve the problem genetic algorithms are proposed to be applied as the optimal parameter value search algorithm thus increasing the best classification accuracy on SVM. Data taken from UCI repository of machine learning database: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms has been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. The best accuracy for data has been upgraded from kernel Linear: 85.12%, polynomial: 81.76%, RBF: 77.22% Sigmoid: 78.70%. However, for bigger data sizes, this method is not practical because it takes a lot of time.

  1. Fuzzy-based multi-kernel spherical support vector machine for ...

    Indian Academy of Sciences (India)

    In the proposed classifier, we design a new multi-kernel function based on the fuzzy triangular membership function. Finally, a newly developed multi-kernel function is incorporated into the spherical support vector machine to enhance the performance significantly. The experimental results are evaluated and performance is ...

  2. An implementation of support vector machine on sentiment classification of movie reviews

    Science.gov (United States)

    Yulietha, I. M.; Faraby, S. A.; Adiwijaya; Widyaningtyas, W. C.

    2018-03-01

    With technological advances, all information about movie is available on the internet. If the information is processed properly, it will get the quality of the information. This research proposes to the classify sentiments on movie review documents. This research uses Support Vector Machine (SVM) method because it can classify high dimensional data in accordance with the data used in this research in the form of text. Support Vector Machine is a popular machine learning technique for text classification because it can classify by learning from a collection of documents that have been classified previously and can provide good result. Based on number of datasets, the 90-10 composition has the best result that is 85.6%. Based on SVM kernel, kernel linear with constant 1 has the best result that is 84.9%

  3. Upport vector machines for nonlinear kernel ARMA system identification.

    Science.gov (United States)

    Martínez-Ramón, Manel; Rojo-Alvarez, José Luis; Camps-Valls, Gustavo; Muñioz-Marí, Jordi; Navia-Vázquez, Angel; Soria-Olivas, Emilio; Figueiras-Vidal, Aníbal R

    2006-11-01

    Nonlinear system identification based on support vector machines (SVM) has been usually addressed by means of the standard SVM regression (SVR), which can be seen as an implicit nonlinear autoregressive and moving average (ARMA) model in some reproducing kernel Hilbert space (RKHS). The proposal of this letter is twofold. First, the explicit consideration of an ARMA model in an RKHS (SVM-ARMA2K) is proposed. We show that stating the ARMA equations in an RKHS leads to solving the regularized normal equations in that RKHS, in terms of the autocorrelation and cross correlation of the (nonlinearly) transformed input and output discrete time processes. Second, a general class of SVM-based system identification nonlinear models is presented, based on the use of composite Mercer's kernels. This general class can improve model flexibility by emphasizing the input-output cross information (SVM-ARMA4K), which leads to straightforward and natural combinations of implicit and explicit ARMA models (SVR-ARMA2K and SVR-ARMA4K). Capabilities of these different SVM-based system identification schemes are illustrated with two benchmark problems.

  4. Measurement of food colour in L*a*b* units from RGB digital image using least squares support vector machine regression

    Directory of Open Access Journals (Sweden)

    Roberto Romaniello

    2015-12-01

    Full Text Available The aim of this work is to evaluate the potential of least squares support vector machine (LS-SVM regression to develop an efficient method to measure the colour of food materials in L*a*b* units by means of a computer vision systems (CVS. A laboratory CVS, based on colour digital camera (CDC, was implemented and three LS-SVM models were trained and validated, one for each output variables (L*, a*, and b* required by this problem, using the RGB signals generated by the CDC as input variables to these models. The colour target-based approach was used to camera characterization and a standard reference target of 242 colour samples was acquired using the CVS and a colorimeter. This data set was split in two sets of equal sizes, for training and validating the LS-SVM models. An effective two-stage grid search process on the parameters space was performed in MATLAB to tune the regularization parameters γ and the kernel parameters σ2 of the three LS-SVM models. A 3-8-3 multilayer feed-forward neural network (MFNN, according to the research conducted by León et al. (2006, was also trained in order to compare its performance with those of LS-SVM models. The LS-SVM models developed in this research have been shown better generalization capability then the MFNN, allowed to obtain high correlations between L*a*b* data acquired using the colorimeter and the corresponding data obtained by transformation of the RGB data acquired by the CVS. In particular, for the validation set, R2 values equal to 0.9989, 0.9987, and 0.9994 for L*, a* and b* parameters were obtained. The root mean square error values were 0.6443, 0.3226, and 0.2702 for L*, a*, and b* respectively, and the average of colour differences ΔEab was 0.8232±0.5033 units. Thus, LS-SVM regression seems to be a useful tool to measurement of food colour using a low cost CVS.

  5. Support vector machine: a tool for mapping mineral prospectivity

    NARCIS (Netherlands)

    Zuo, R.; Carranza, E.J.M

    2011-01-01

    In this contribution, we describe an application of support vector machine (SVM), a supervised learning algorithm, to mineral prospectivity mapping. The free R package e1071 is used to construct a SVM with sigmoid kernel function to map prospectivity for Au deposits in western Meguma Terrain of Nova

  6. A Wavelet Support Vector Machine Combination Model for Singapore Tourist Arrival to Malaysia

    Science.gov (United States)

    Rafidah, A.; Shabri, Ani; Nurulhuda, A.; Suhaila, Y.

    2017-08-01

    In this study, wavelet support vector machine model (WSVM) is proposed and applied for monthly data Singapore tourist time series prediction. The WSVM model is combination between wavelet analysis and support vector machine (SVM). In this study, we have two parts, first part we compare between the kernel function and second part we compare between the developed models with single model, SVM. The result showed that kernel function linear better than RBF while WSVM outperform with single model SVM to forecast monthly Singapore tourist arrival to Malaysia.

  7. A comparison of regression algorithms for wind speed forecasting at Alexander Bay

    CSIR Research Space (South Africa)

    Botha, Nicolene

    2016-12-01

    Full Text Available to forecast 1 to 24 hours ahead, in hourly intervals. Predictions are performed on a wind speed time series with three machine learning regression algorithms, namely support vector regression, ordinary least squares and Bayesian ridge regression. The resulting...

  8. DC Algorithm for Extended Robust Support Vector Machine.

    Science.gov (United States)

    Fujiwara, Shuhei; Takeda, Akiko; Kanamori, Takafumi

    2017-05-01

    Nonconvex variants of support vector machines (SVMs) have been developed for various purposes. For example, robust SVMs attain robustness to outliers by using a nonconvex loss function, while extended [Formula: see text]-SVM (E[Formula: see text]-SVM) extends the range of the hyperparameter by introducing a nonconvex constraint. Here, we consider an extended robust support vector machine (ER-SVM), a robust variant of E[Formula: see text]-SVM. ER-SVM combines two types of nonconvexity from robust SVMs and E[Formula: see text]-SVM. Because of the two nonconvexities, the existing algorithm we proposed needs to be divided into two parts depending on whether the hyperparameter value is in the extended range or not. The algorithm also heuristically solves the nonconvex problem in the extended range. In this letter, we propose a new, efficient algorithm for ER-SVM. The algorithm deals with two types of nonconvexity while never entailing more computations than either E[Formula: see text]-SVM or robust SVM, and it finds a critical point of ER-SVM. Furthermore, we show that ER-SVM includes the existing robust SVMs as special cases. Numerical experiments confirm the effectiveness of integrating the two nonconvexities.

  9. General Dimensional Multiple-Output Support Vector Regressions and Their Multiple Kernel Learning.

    Science.gov (United States)

    Chung, Wooyong; Kim, Jisu; Lee, Heejin; Kim, Euntai

    2015-11-01

    Support vector regression has been considered as one of the most important regression or function approximation methodologies in a variety of fields. In this paper, two new general dimensional multiple output support vector regressions (MSVRs) named SOCPL1 and SOCPL2 are proposed. The proposed methods are formulated in the dual space and their relationship with the previous works is clearly investigated. Further, the proposed MSVRs are extended into the multiple kernel learning and their training is implemented by the off-the-shelf convex optimization tools. The proposed MSVRs are applied to benchmark problems and their performances are compared with those of the previous methods in the experimental section.

  10. The vector and parallel processing of MORSE code on Monte Carlo Machine

    International Nuclear Information System (INIS)

    Hasegawa, Yukihiro; Higuchi, Kenji.

    1995-11-01

    Multi-group Monte Carlo Code for particle transport, MORSE is modified for high performance computing on Monte Carlo Machine Monte-4. The method and the results are described. Monte-4 was specially developed to realize high performance computing of Monte Carlo codes for particle transport, which have been difficult to obtain high performance in vector processing on conventional vector processors. Monte-4 has four vector processor units with the special hardware called Monte Carlo pipelines. The vectorization and parallelization of MORSE code and the performance evaluation on Monte-4 are described. (author)

  11. Shallow water bathymetry mapping using Support Vector Machine (SVM) technique and multispectral imagery

    NARCIS (Netherlands)

    Misra, Ankita; Vojinovic, Zoran; Ramakrishnan, Balaji; Luijendijk, Arjen; Ranasinghe, Roshanka

    2018-01-01

    Satellite imagery along with image processing techniques prove to be efficient tools for bathymetry retrieval as they provide time and cost-effective alternatives to traditional methods of water depth estimation. In this article, a nonlinear machine learning technique of Support Vector Machine (SVM)

  12. Optimized support vector regression for drilling rate of penetration estimation

    Science.gov (United States)

    Bodaghi, Asadollah; Ansari, Hamid Reza; Gholami, Mahsa

    2015-12-01

    In the petroleum industry, drilling optimization involves the selection of operating conditions for achieving the desired depth with the minimum expenditure while requirements of personal safety, environment protection, adequate information of penetrated formations and productivity are fulfilled. Since drilling optimization is highly dependent on the rate of penetration (ROP), estimation of this parameter is of great importance during well planning. In this research, a novel approach called `optimized support vector regression' is employed for making a formulation between input variables and ROP. Algorithms used for optimizing the support vector regression are the genetic algorithm (GA) and the cuckoo search algorithm (CS). Optimization implementation improved the support vector regression performance by virtue of selecting proper values for its parameters. In order to evaluate the ability of optimization algorithms in enhancing SVR performance, their results were compared to the hybrid of pattern search and grid search (HPG) which is conventionally employed for optimizing SVR. The results demonstrated that the CS algorithm achieved further improvement on prediction accuracy of SVR compared to the GA and HPG as well. Moreover, the predictive model derived from back propagation neural network (BPNN), which is the traditional approach for estimating ROP, is selected for comparisons with CSSVR. The comparative results revealed the superiority of CSSVR. This study inferred that CSSVR is a viable option for precise estimation of ROP.

  13. Theory of net analyte signal vectors in inverse regression

    DEFF Research Database (Denmark)

    Bro, R.; Andersen, Charlotte Møller

    2003-01-01

    The. net analyte signal and the net analyte signal vector are useful measures in building and optimizing multivariate calibration models. In this paper a theory for their use in inverse regression is developed. The theory of net analyte signal was originally derived from classical least squares...

  14. Relevance Vector Machine for Prediction of Soil Properties | Samui ...

    African Journals Online (AJOL)

    One of the first, most important steps in geotechnical engineering is site characterization. The ultimate goal of site characterization is to predict the in-situ soil properties at any half-space point for a site based on limited number of tests and data. In the present study, relevance vector machine (RVM) has been used to develop ...

  15. An assessment of support vector machines for land cover classification

    Science.gov (United States)

    Huang, C.; Davis, L.S.; Townshend, J.R.G.

    2002-01-01

    The support vector machine (SVM) is a group of theoretically superior machine learning algorithms. It was found competitive with the best available machine learning algorithms in classifying high-dimensional data sets. This paper gives an introduction to the theoretical development of the SVM and an experimental evaluation of its accuracy, stability and training speed in deriving land cover classifications from satellite images. The SVM was compared to three other popular classifiers, including the maximum likelihood classifier (MLC), neural network classifiers (NNC) and decision tree classifiers (DTC). The impacts of kernel configuration on the performance of the SVM and of the selection of training data and input variables on the four classifiers were also evaluated in this experiment.

  16. A vector machine formulation with application to the computer-aided diagnosis of breast cancer from DCE-MRI screening examinations.

    Science.gov (United States)

    Levman, Jacob E D; Warner, Ellen; Causer, Petrina; Martel, Anne L

    2014-02-01

    This study investigates the use of a proposed vector machine formulation with application to dynamic contrast-enhanced magnetic resonance imaging examinations in the context of the computer-aided diagnosis of breast cancer. This paper describes a method for generating feature measurements that characterize a lesion's vascular heterogeneity as well as a supervised learning formulation that represents an improvement over the conventional support vector machine in this application. Spatially varying signal-intensity measures were extracted from the examinations using principal components analysis and the machine learning technique known as the support vector machine (SVM) was used to classify the results. An alternative vector machine formulation was found to improve on the results produced by the established SVM in randomized bootstrap validation trials, yielding a receiver-operating characteristic curve area of 0.82 which represents a statistically significant improvement over the SVM technique in this application.

  17. Large-scale ligand-based predictive modelling using support vector machines.

    Science.gov (United States)

    Alvarsson, Jonathan; Lampa, Samuel; Schaal, Wesley; Andersson, Claes; Wikberg, Jarl E S; Spjuth, Ola

    2016-01-01

    The increasing size of datasets in drug discovery makes it challenging to build robust and accurate predictive models within a reasonable amount of time. In order to investigate the effect of dataset sizes on predictive performance and modelling time, ligand-based regression models were trained on open datasets of varying sizes of up to 1.2 million chemical structures. For modelling, two implementations of support vector machines (SVM) were used. Chemical structures were described by the signatures molecular descriptor. Results showed that for the larger datasets, the LIBLINEAR SVM implementation performed on par with the well-established libsvm with a radial basis function kernel, but with dramatically less time for model building even on modest computer resources. Using a non-linear kernel proved to be infeasible for large data sizes, even with substantial computational resources on a computer cluster. To deploy the resulting models, we extended the Bioclipse decision support framework to support models from LIBLINEAR and made our models of logD and solubility available from within Bioclipse.

  18. On-line transient stability assessment of large-scale power systems by using ball vector machines

    International Nuclear Information System (INIS)

    Mohammadi, M.; Gharehpetian, G.B.

    2010-01-01

    In this paper ball vector machine (BVM) has been used for on-line transient stability assessment of large-scale power systems. To classify the system transient security status, a BVM has been trained for all contingencies. The proposed BVM based security assessment algorithm has very small training time and space in comparison with artificial neural networks (ANN), support vector machines (SVM) and other machine learning based algorithms. In addition, the proposed algorithm has less support vectors (SV) and therefore is faster than existing algorithms for on-line applications. One of the main points, to apply a machine learning method is feature selection. In this paper, a new Decision Tree (DT) based feature selection technique has been presented. The proposed BVM based algorithm has been applied to New England 39-bus power system. The simulation results show the effectiveness and the stability of the proposed method for on-line transient stability assessment procedure of large-scale power system. The proposed feature selection algorithm has been compared with different feature selection algorithms. The simulation results demonstrate the effectiveness of the proposed feature algorithm.

  19. Deep Restricted Kernel Machines Using Conjugate Feature Duality.

    Science.gov (United States)

    Suykens, Johan A K

    2017-08-01

    The aim of this letter is to propose a theory of deep restricted kernel machines offering new foundations for deep learning with kernel machines. From the viewpoint of deep learning, it is partially related to restricted Boltzmann machines, which are characterized by visible and hidden units in a bipartite graph without hidden-to-hidden connections and deep learning extensions as deep belief networks and deep Boltzmann machines. From the viewpoint of kernel machines, it includes least squares support vector machines for classification and regression, kernel principal component analysis (PCA), matrix singular value decomposition, and Parzen-type models. A key element is to first characterize these kernel machines in terms of so-called conjugate feature duality, yielding a representation with visible and hidden units. It is shown how this is related to the energy form in restricted Boltzmann machines, with continuous variables in a nonprobabilistic setting. In this new framework of so-called restricted kernel machine (RKM) representations, the dual variables correspond to hidden features. Deep RKM are obtained by coupling the RKMs. The method is illustrated for deep RKM, consisting of three levels with a least squares support vector machine regression level and two kernel PCA levels. In its primal form also deep feedforward neural networks can be trained within this framework.

  20. Explaining Support Vector Machines: A Color Based Nomogram.

    Directory of Open Access Journals (Sweden)

    Vanya Van Belle

    Full Text Available Support vector machines (SVMs are very popular tools for classification, regression and other problems. Due to the large choice of kernels they can be applied with, a large variety of data can be analysed using these tools. Machine learning thanks its popularity to the good performance of the resulting models. However, interpreting the models is far from obvious, especially when non-linear kernels are used. Hence, the methods are used as black boxes. As a consequence, the use of SVMs is less supported in areas where interpretability is important and where people are held responsible for the decisions made by models.In this work, we investigate whether SVMs using linear, polynomial and RBF kernels can be explained such that interpretations for model-based decisions can be provided. We further indicate when SVMs can be explained and in which situations interpretation of SVMs is (hitherto not possible. Here, explainability is defined as the ability to produce the final decision based on a sum of contributions which depend on one single or at most two input variables.Our experiments on simulated and real-life data show that explainability of an SVM depends on the chosen parameter values (degree of polynomial kernel, width of RBF kernel and regularization constant. When several combinations of parameter values yield the same cross-validation performance, combinations with a lower polynomial degree or a larger kernel width have a higher chance of being explainable.This work summarizes SVM classifiers obtained with linear, polynomial and RBF kernels in a single plot. Linear and polynomial kernels up to the second degree are represented exactly. For other kernels an indication of the reliability of the approximation is presented. The complete methodology is available as an R package and two apps and a movie are provided to illustrate the possibilities offered by the method.

  1. Implicit Social Trust Dan Support Vector Regression Untuk Sistem Rekomendasi Berita

    Directory of Open Access Journals (Sweden)

    Melita Widya Ningrum

    2018-01-01

    Full Text Available Situs berita merupakan salah satu situs yang sering diakses masyarakat karena kemampuannya dalam menyajikan informasi terkini dari berbagai topik seperti olahraga, bisnis, politik, teknologi, kesehatan dan hiburan. Masyarakat dapat mencari dan melihat berita yang sedang populer dari seluruh dunia. Di sisi lain, melimpahnya artikel berita yang tersedia dapat menyulitkan pengguna dalam menemukan artikel berita yang sesuai dengan ketertarikannya. Pemilihan artikel berita yang ditampilkan ke halaman utama pengguna menjadi penting karena dapat meningkatkan minat pengguna untuk membaca artikel berita dari situs tersebut. Selain itu, pemilihan artikel berita yang sesuai dapat meminimalisir terjadinya banjir informasi yang tidak relevan. Dalam pemilihan artikel berita dibutuhkan sistem rekomendasi yang memiliki pengetahuan mengenai ketertarikan atau relevansi pengguna akan topik berita tertentu. Pada penelitian ini, peneliti membuat sistem rekomendasi artikel berita pada New York Times berbasis implicit social trust. Social trust dihasilkan dari interaksi antara pengguna dengan teman-temannya  dan bobot kepercayaan teman pengguna pada media sosial Twitter. Data yang diambil merupakan data pengguna Twitter, teman dan jumlah interaksi antar pengguna berupa retweet. Sistem ini memanfaatkan algoritma Support Vector Regression untuk memberikan estimasi penilaian pengguna terhadap suatu topik tertentu. Hasil pengolahan data dengan Support Vector Regression menunjukkan tingkat akurasi dengan MAPE sebesar 0,8243075902233644%.   Keywords : Twitter, Rekomendasi Berita, Social Trust, Support Vector Regression

  2. Design, development and evaluation of an online grading system for peeled pistachios equipped with machine vision technology and support vector machine

    Directory of Open Access Journals (Sweden)

    Hosein Nouri-Ahmadabadi

    2017-12-01

    Full Text Available In this study, an intelligent system based on combined machine vision (MV and Support Vector Machine (SVM was developed for sorting of peeled pistachio kernels and shells. The system was composed of conveyor belt, lighting box, camera, processing unit and sorting unit. A color CCD camera was used to capture images. The images were digitalized by a capture card and transferred to a personal computer for further analysis. Initially, images were converted from RGB color space to HSV color ones. For segmentation of the acquired images, H-component in the HSV color space and Otsu thresholding method were applied. A feature vector containing 30 color features was extracted from the captured images. A feature selection method based on sensitivity analysis was carried out to select superior features. The selected features were presented to SVM classifier. Various SVM models having a different kernel function were developed and tested. The SVM model having cubic polynomial kernel function and 38 support vectors achieved the best accuracy (99.17% and then was selected to use in online decision-making unit of the system. By launching the online system, it was found that limiting factors of the system capacity were related to the hardware parts of the system (conveyor belt and pneumatic valves used in the sorting unit. The limiting factors led to a distance of 8 mm between the samples. The overall accuracy and capacity of the sorter were obtained 94.33% and 22.74 kg/h, respectively. Keywords: Pistachio kernel, Sorting, Machine vision, Sensitivity analysis, Support vector machine

  3. SAM: Support Vector Machine Based Active Queue Management

    International Nuclear Information System (INIS)

    Shah, M.S.

    2014-01-01

    Recent years have seen an increasing interest in the design of AQM (Active Queue Management) controllers. The purpose of these controllers is to manage the network congestion under varying loads, link delays and bandwidth. In this paper, a new AQM controller is proposed which is trained by using the SVM (Support Vector Machine) with the RBF (Radial Basis Function) kernal. The proposed controller is called the support vector based AQM (SAM) controller. The performance of the proposed controller has been compared with three conventional AQM controllers, namely the Random Early Detection, Blue and Proportional Plus Integral Controller. The preliminary simulation studies show that the performance of the proposed controller is comparable to the conventional controllers. However, the proposed controller is more efficient in controlling the queue size than the conventional controllers. (author)

  4. Support vector machine used to diagnose the fault of rotor broken bars of induction motors

    DEFF Research Database (Denmark)

    Zhitong, Cao; Jiazhong, Fang; Hongpingn, Chen

    2003-01-01

    for the SVM. After a SVM is trained with learning sample vectors, so each kind of the rotor broken bar faults of induction motors can be classified. Finally the retest is demonstrated, which proves that the SVM really has preferable ability of classification. In this paper we tried applying the SVM......The data-based machine learning is an important aspect of modern intelligent technology, while statistical learning theory (SLT) is a new tool that studies the machine learning methods in the case of a small number of samples. As a common learning method, support vector machine (SVM) is derived...... from the SLT. Here we were done some analogical experiments of the rotor broken bar faults of induction motors used, analyzed the signals of the sample currents with Fourier transform, and constructed the spectrum characteristics from low frequency to high frequency used as learning sample vectors...

  5. Dual linear structured support vector machine tracking method via scale correlation filter

    Science.gov (United States)

    Li, Weisheng; Chen, Yanquan; Xiao, Bin; Feng, Chen

    2018-01-01

    Adaptive tracking-by-detection methods based on structured support vector machine (SVM) performed well on recent visual tracking benchmarks. However, these methods did not adopt an effective strategy of object scale estimation, which limits the overall tracking performance. We present a tracking method based on a dual linear structured support vector machine (DLSSVM) with a discriminative scale correlation filter. The collaborative tracker comprised of a DLSSVM model and a scale correlation filter obtains good results in tracking target position and scale estimation. The fast Fourier transform is applied for detection. Extensive experiments show that our tracking approach outperforms many popular top-ranking trackers. On a benchmark including 100 challenging video sequences, the average precision of the proposed method is 82.8%.

  6. On Weighted Support Vector Regression

    DEFF Research Database (Denmark)

    Han, Xixuan; Clemmensen, Line Katrine Harder

    2014-01-01

    We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... shrinks the coefficient of each observation in the estimated functions; thus, it is widely used for minimizing influence of outliers. We propose to additionally add weights to the slack variables in the constraints (CF‐weights) and call the combination of weights the doubly weighted SVR. We illustrate...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...

  7. Application of higher order spectral features and support vector machines for bearing faults classification.

    Science.gov (United States)

    Saidi, Lotfi; Ben Ali, Jaouher; Fnaiech, Farhat

    2015-01-01

    Condition monitoring and fault diagnosis of rolling element bearings timely and accurately are very important to ensure the reliability of rotating machinery. This paper presents a novel pattern classification approach for bearings diagnostics, which combines the higher order spectra analysis features and support vector machine classifier. The use of non-linear features motivated by the higher order spectra has been reported to be a promising approach to analyze the non-linear and non-Gaussian characteristics of the mechanical vibration signals. The vibration bi-spectrum (third order spectrum) patterns are extracted as the feature vectors presenting different bearing faults. The extracted bi-spectrum features are subjected to principal component analysis for dimensionality reduction. These principal components were fed to support vector machine to distinguish four kinds of bearing faults covering different levels of severity for each fault type, which were measured in the experimental test bench running under different working conditions. In order to find the optimal parameters for the multi-class support vector machine model, a grid-search method in combination with 10-fold cross-validation has been used. Based on the correct classification of bearing patterns in the test set, in each fold the performance measures are computed. The average of these performance measures is computed to report the overall performance of the support vector machine classifier. In addition, in fault detection problems, the performance of a detection algorithm usually depends on the trade-off between robustness and sensitivity. The sensitivity and robustness of the proposed method are explored by running a series of experiments. A receiver operating characteristic (ROC) curve made the results more convincing. The results indicated that the proposed method can reliably identify different fault patterns of rolling element bearings based on vibration signals. Copyright © 2014 ISA

  8. Bayesian nonlinear regression for large small problems

    KAUST Repository

    Chakraborty, Sounak; Ghosh, Malay; Mallick, Bani K.

    2012-01-01

    Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in the near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.

  9. Bayesian nonlinear regression for large small problems

    KAUST Repository

    Chakraborty, Sounak

    2012-07-01

    Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik\\'s ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in the near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.

  10. Oblique decision trees using embedded support vector machines in classifier ensembles

    NARCIS (Netherlands)

    Menkovski, V.; Christou, I.; Efremidis, S.

    2008-01-01

    Classifier ensembles have emerged in recent years as a promising research area for boosting pattern recognition systems' performance. We present a new base classifier that utilizes oblique decision tree technology based on support vector machines for the construction of oblique (non-axis parallel)

  11. A comparative analysis of support vector machines and extreme learning machines.

    Science.gov (United States)

    Liu, Xueyi; Gao, Chuanhou; Li, Ping

    2012-09-01

    The theory of extreme learning machines (ELMs) has recently become increasingly popular. As a new learning algorithm for single-hidden-layer feed-forward neural networks, an ELM offers the advantages of low computational cost, good generalization ability, and ease of implementation. Hence the comparison and model selection between ELMs and other kinds of state-of-the-art machine learning approaches has become significant and has attracted many research efforts. This paper performs a comparative analysis of the basic ELMs and support vector machines (SVMs) from two viewpoints that are different from previous works: one is the Vapnik-Chervonenkis (VC) dimension, and the other is their performance under different training sample sizes. It is shown that the VC dimension of an ELM is equal to the number of hidden nodes of the ELM with probability one. Additionally, their generalization ability and computational complexity are exhibited with changing training sample size. ELMs have weaker generalization ability than SVMs for small sample but can generalize as well as SVMs for large sample. Remarkably, great superiority in computational speed especially for large-scale sample problems is found in ELMs. The results obtained can provide insight into the essential relationship between them, and can also serve as complementary knowledge for their past experimental and theoretical comparisons. Copyright © 2012 Elsevier Ltd. All rights reserved.

  12. Learning Algorithms for Audio and Video Processing: Independent Component Analysis and Support Vector Machine Based Approaches

    National Research Council Canada - National Science Library

    Qi, Yuan

    2000-01-01

    In this thesis, we propose two new machine learning schemes, a subband-based Independent Component Analysis scheme and a hybrid Independent Component Analysis/Support Vector Machine scheme, and apply...

  13. SOLAR FLARE PREDICTION USING SDO/HMI VECTOR MAGNETIC FIELD DATA WITH A MACHINE-LEARNING ALGORITHM

    International Nuclear Information System (INIS)

    Bobra, M. G.; Couvidat, S.

    2015-01-01

    We attempt to forecast M- and X-class solar flares using a machine-learning algorithm, called support vector machine (SVM), and four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager, the first instrument to continuously map the full-disk photospheric vector magnetic field from space. Most flare forecasting efforts described in the literature use either line-of-sight magnetograms or a relatively small number of ground-based vector magnetograms. This is the first time a large data set of vector magnetograms has been used to forecast solar flares. We build a catalog of flaring and non-flaring active regions sampled from a database of 2071 active regions, comprised of 1.5 million active region patches of vector magnetic field data, and characterize each active region by 25 parameters. We then train and test the machine-learning algorithm and we estimate its performances using forecast verification metrics with an emphasis on the true skill statistic (TSS). We obtain relatively high TSS scores and overall predictive abilities. We surmise that this is partly due to fine-tuning the SVM for this purpose and also to an advantageous set of features that can only be calculated from vector magnetic field data. We also apply a feature selection algorithm to determine which of our 25 features are useful for discriminating between flaring and non-flaring active regions and conclude that only a handful are needed for good predictive abilities

  14. Linearity and Misspecification Tests for Vector Smooth Transition Regression Models

    DEFF Research Database (Denmark)

    Teräsvirta, Timo; Yang, Yukai

    The purpose of the paper is to derive Lagrange multiplier and Lagrange multiplier type specification and misspecification tests for vector smooth transition regression models. We report results from simulation studies in which the size and power properties of the proposed asymptotic tests in small...

  15. Slope Deformation Prediction Based on Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Lei JIA

    2013-07-01

    Full Text Available This paper principally studies the prediction of slope deformation based on Support Vector Machine (SVM. In the prediction process,explore how to reconstruct the phase space. The geological body’s displacement data obtained from chaotic time series are used as SVM’s training samples. Slope displacement caused by multivariable coupling is predicted by means of single variable. Results show that this model is of high fitting accuracy and generalization, and provides reference for deformation prediction in slope engineering.

  16. Support vector machine based fault classification and location of a long transmission line

    Directory of Open Access Journals (Sweden)

    Papia Ray

    2016-09-01

    Full Text Available This paper investigates support vector machine based fault type and distance estimation scheme in a long transmission line. The planned technique uses post fault single cycle current waveform and pre-processing of the samples is done by wavelet packet transform. Energy and entropy are obtained from the decomposed coefficients and feature matrix is prepared. Then the redundant features from the matrix are taken out by the forward feature selection method and normalized. Test and train data are developed by taking into consideration variables of a simulation situation like fault type, resistance path, inception angle, and distance. In this paper 10 different types of short circuit fault are analyzed. The test data are examined by support vector machine whose parameters are optimized by particle swarm optimization method. The anticipated method is checked on a 400 kV, 300 km long transmission line with voltage source at both the ends. Two cases were examined with the proposed method. The first one is fault very near to both the source end (front and rear and the second one is support vector machine with and without optimized parameter. Simulation result indicates that the anticipated method for fault classification gives high accuracy (99.21% and least fault distance estimation error (0.29%.

  17. Modeling and prediction of flotation performance using support vector regression

    Directory of Open Access Journals (Sweden)

    Despotović Vladimir

    2017-01-01

    Full Text Available Continuous efforts have been made in recent year to improve the process of paper recycling, as it is of critical importance for saving the wood, water and energy resources. Flotation deinking is considered to be one of the key methods for separation of ink particles from the cellulose fibres. Attempts to model the flotation deinking process have often resulted in complex models that are difficult to implement and use. In this paper a model for prediction of flotation performance based on Support Vector Regression (SVR, is presented. Representative data samples were created in laboratory, under a variety of practical control variables for the flotation deinking process, including different reagents, pH values and flotation residence time. Predictive model was created that was trained on these data samples, and the flotation performance was assessed showing that Support Vector Regression is a promising method even when dataset used for training the model is limited.

  18. Relevance Vector Machine and Support Vector Machine Classifier Analysis of Scanning Laser Polarimetry Retinal Nerve Fiber Layer Measurements

    Science.gov (United States)

    Bowd, Christopher; Medeiros, Felipe A.; Zhang, Zuohua; Zangwill, Linda M.; Hao, Jiucang; Lee, Te-Won; Sejnowski, Terrence J.; Weinreb, Robert N.; Goldbaum, Michael H.

    2010-01-01

    Purpose To classify healthy and glaucomatous eyes using relevance vector machine (RVM) and support vector machine (SVM) learning classifiers trained on retinal nerve fiber layer (RNFL) thickness measurements obtained by scanning laser polarimetry (SLP). Methods Seventy-two eyes of 72 healthy control subjects (average age = 64.3 ± 8.8 years, visual field mean deviation =−0.71 ± 1.2 dB) and 92 eyes of 92 patients with glaucoma (average age = 66.9 ± 8.9 years, visual field mean deviation =−5.32 ± 4.0 dB) were imaged with SLP with variable corneal compensation (GDx VCC; Laser Diagnostic Technologies, San Diego, CA). RVM and SVM learning classifiers were trained and tested on SLP-determined RNFL thickness measurements from 14 standard parameters and 64 sectors (approximately 5.6° each) obtained in the circumpapillary area under the instrument-defined measurement ellipse (total 78 parameters). Tenfold cross-validation was used to train and test RVM and SVM classifiers on unique subsets of the full 164-eye data set and areas under the receiver operating characteristic (AUROC) curve for the classification of eyes in the test set were generated. AUROC curve results from RVM and SVM were compared to those for 14 SLP software-generated global and regional RNFL thickness parameters. Also reported was the AUROC curve for the GDx VCC software-generated nerve fiber indicator (NFI). Results The AUROC curves for RVM and SVM were 0.90 and 0.91, respectively, and increased to 0.93 and 0.94 when the training sets were optimized with sequential forward and backward selection (resulting in reduced dimensional data sets). AUROC curves for optimized RVM and SVM were significantly larger than those for all individual SLP parameters. The AUROC curve for the NFI was 0.87. Conclusions Results from RVM and SVM trained on SLP RNFL thickness measurements are similar and provide accurate classification of glaucomatous and healthy eyes. RVM may be preferable to SVM, because it provides a

  19. Support vector machines in analysis of top quark production

    International Nuclear Information System (INIS)

    Vaiciulis, A.

    2003-01-01

    The Support Vector Machine (SVM) learning algorithm is a new alternative to multivariate methods such as neural networks. Potential applications of SVMs in high energy physics include the common classification problem of signal/background discrimination as well as particle identification. A comparison of a conventional method and an SVM algorithm is presented here for the case of identifying top quark events in Run II physics at the CDF experiment

  20. Support vector machine based battery model for electric vehicles

    International Nuclear Information System (INIS)

    Wang Junping; Chen Quanshi; Cao Binggang

    2006-01-01

    The support vector machine (SVM) is a novel type of learning machine based on statistical learning theory that can map a nonlinear function successfully. As a battery is a nonlinear system, it is difficult to establish the relationship between the load voltage and the current under different temperatures and state of charge (SOC). The SVM is used to model the battery nonlinear dynamics in this paper. Tests are performed on an 80Ah Ni/MH battery pack with the Federal Urban Driving Schedule (FUDS) cycle to set up the SVM model. Compared with the Nernst and Shepherd combined model, the SVM model can simulate the battery dynamics better with small amounts of experimental data. The maximum relative error is 3.61%

  1. A comparison study of support vector machines and hidden Markov models in machinery condition monitoring

    International Nuclear Information System (INIS)

    Miao, Qiang; Huang, Hong Zhong; Fan, Xianfeng

    2007-01-01

    Condition classification is an important step in machinery fault detection, which is a problem of pattern recognition. Currently, there are a lot of techniques in this area and the purpose of this paper is to investigate two popular recognition techniques, namely hidden Markov model and support vector machine. At the beginning, we briefly introduced the procedure of feature extraction and the theoretical background of this paper. The comparison experiment was conducted for gearbox fault detection and the analysis results from this work showed that support vector machine has better classification performance in this area

  2. Extraction of inland Nypa fruticans (Nipa Palm) using Support Vector Machine

    Science.gov (United States)

    Alberto, R. T.; Serrano, S. C.; Damian, G. B.; Camaso, E. E.; Biagtan, A. R.; Panuyas, N. Z.; Quibuyen, J. S.

    2017-09-01

    Mangroves are considered as one of the major habitats in coastal ecosystem, providing a lot of economic and ecological services in human society. Nypa fruticans (Nipa palm) is one of the important species of mangroves because of its versatility and uniqueness as halophytic palm. However, nipas are not only adaptable in saline areas, they can also managed to thrive away from the coastline depending on the favorable soil types available in the area. Because of this, mapping of this species are not limited alone in the near shore areas, but in areas where this species are present as well. The extraction process of Nypa fruticans were carried out using the available LiDAR data. Support Vector Machine (SVM) classification process was used to extract nipas in inland areas. The SVM classification process in mapping Nypa fruticans produced high accuracy of 95+%. The Support Vector Machine classification process to extract inland nipas was proven to be effective by utilizing different terrain derivatives from LiDAR data.

  3. Automatic Detection of P and S Phases by Support Vector Machine

    Science.gov (United States)

    Jiang, Y.; Ning, J.; Bao, T.

    2017-12-01

    Many methods in seismology rely on accurately picked phases. A well performed program on automatically phase picking will assure the application of these methods. Related researches before mostly focus on finding different characteristics between noise and phases, which are all not enough successful. We have developed a new method which mainly based on support vector machine to detect P and S phases. In it, we first input some waveform pieces into the support vector machine, then employ it to work out a hyper plane which can divide the space into two parts: respectively noise and phase. We further use the same method to find a hyper plane which can separate the phase space into P and S parts based on the three components' cross-correlation matrix. In order to further improve the ability of phase detection, we also employ array data. At last, we show that the overall effect of our method is robust by employing both synthetic and real data.

  4. Comparison of ν-support vector regression and logistic equation for ...

    African Journals Online (AJOL)

    Due to the complexity and high non-linearity of bioprocess, most simple mathematical models fail to describe the exact behavior of biochemistry systems. As a novel type of learning method, support vector regression (SVR) owns the powerful capability to characterize problems via small sample, nonlinearity, high dimension ...

  5. Efficient implementations of block sparse matrix operations on shared memory vector machines

    International Nuclear Information System (INIS)

    Washio, T.; Maruyama, K.; Osoda, T.; Doi, S.; Shimizu, F.

    2000-01-01

    In this paper, we propose vectorization and shared memory-parallelization techniques for block-type random sparse matrix operations in finite element (FEM) applications. Here, a block corresponds to unknowns on one node in the FEM mesh and we assume that the block size is constant over the mesh. First, we discuss some basic vectorization ideas (the jagged diagonal (JAD) format and the segmented scan algorithm) for the sparse matrix-vector product. Then, we extend these ideas to the shared memory parallelization. After that, we show that the techniques can be applied not only to the sparse matrix-vector product but also to the sparse matrix-matrix product, the incomplete or complete sparse LU factorization and preconditioning. Finally, we report the performance evaluation results obtained on an NEC SX-4 shared memory vector machine for linear systems in some FEM applications. (author)

  6. Advanced Machine Learning for Classification, Regression, and Generation in Jet Physics

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    There is a deep connection between machine learning and jet physics - after all, jets are defined by unsupervised learning algorithms. Jet physics has been a driving force for studying modern machine learning in high energy physics. Domain specific challenges require new techniques to make full use of the algorithms. A key focus is on understanding how and what the algorithms learn. Modern machine learning techniques for jet physics are demonstrated for classification, regression, and generation. In addition to providing powerful baseline performance, we show how to train complex models directly on data and to generate sparse stacked images with non-uniform granularity.

  7. Classification of Motor Imagery EEG Signals with Support Vector Machines and Particle Swarm Optimization

    Science.gov (United States)

    Ma, Yuliang; Ding, Xiaohui; She, Qingshan; Luo, Zhizeng; Potter, Thomas; Zhang, Yingchun

    2016-01-01

    Support vector machines are powerful tools used to solve the small sample and nonlinear classification problems, but their ultimate classification performance depends heavily upon the selection of appropriate kernel and penalty parameters. In this study, we propose using a particle swarm optimization algorithm to optimize the selection of both the kernel and penalty parameters in order to improve the classification performance of support vector machines. The performance of the optimized classifier was evaluated with motor imagery EEG signals in terms of both classification and prediction. Results show that the optimized classifier can significantly improve the classification accuracy of motor imagery EEG signals. PMID:27313656

  8. Soft-sensing model of temperature for aluminum reduction cell on improved twin support vector regression

    Science.gov (United States)

    Li, Tao

    2018-06-01

    The complexity of aluminum electrolysis process leads the temperature for aluminum reduction cells hard to measure directly. However, temperature is the control center of aluminum production. To solve this problem, combining some aluminum plant's practice data, this paper presents a Soft-sensing model of temperature for aluminum electrolysis process on Improved Twin Support Vector Regression (ITSVR). ITSVR eliminates the slow learning speed of Support Vector Regression (SVR) and the over-fit risk of Twin Support Vector Regression (TSVR) by introducing a regularization term into the objective function of TSVR, which ensures the structural risk minimization principle and lower computational complexity. Finally, the model with some other parameters as auxiliary variable, predicts the temperature by ITSVR. The simulation result shows Soft-sensing model based on ITSVR has short time-consuming and better generalization.

  9. Spatial prediction of landslides using a hybrid machine learning approach based on Random Subspace and Classification and Regression Trees

    Science.gov (United States)

    Pham, Binh Thai; Prakash, Indra; Tien Bui, Dieu

    2018-02-01

    A hybrid machine learning approach of Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. This model is a combination of the RSS method which is known as an efficient ensemble technique and the CART which is a state of the art classifier. The Luc Yen district of Yen Bai province, a prominent landslide prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In the development of model, ten important landslide affecting factors related with geomorphology, geology and geo-environment were considered namely slope angles, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Performance of the RSSCART model (AUC = 0.841) is the best compared with other popular landslide models namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that performance of the RSSCART is a promising method for spatial landslide prediction.

  10. Prediction of toxicity of nitrobenzenes using ab initio and least squares support vector machines

    International Nuclear Information System (INIS)

    Niazi, Ali; Jameh-Bozorghi, Saeed; Nori-Shargh, Davood

    2008-01-01

    A quantitative structure-property relationship (QSPR) study is suggested for the prediction of toxicity (IGC 50 ) of nitrobenzenes. Ab initio theory was used to calculate some quantum chemical descriptors including electrostatic potentials and local charges at each atom, HOMO and LUMO energies, etc. Modeling of the IGC 50 of nitrobenzenes as a function of molecular structures was established by means of the least squares support vector machines (LS-SVM). This model was applied for the prediction of the toxicity (IGC 50 ) of nitrobenzenes, which were not in the modeling procedure. The resulted model showed high prediction ability with root mean square error of prediction of 0.0049 for LS-SVM. Results have shown that the introduction of LS-SVM for quantum chemical descriptors drastically enhances the ability of prediction in QSAR studies superior to multiple linear regression and partial least squares

  11. A Modified Method Combined with a Support Vector Machine and Bayesian Algorithms in Biological Information

    Directory of Open Access Journals (Sweden)

    Wen-Gang Zhou

    2015-06-01

    Full Text Available With the deep research of genomics and proteomics, the number of new protein sequences has expanded rapidly. With the obvious shortcomings of high cost and low efficiency of the traditional experimental method, the calculation method for protein localization prediction has attracted a lot of attention due to its convenience and low cost. In the machine learning techniques, neural network and support vector machine (SVM are often used as learning tools. Due to its complete theoretical framework, SVM has been widely applied. In this paper, we make an improvement on the existing machine learning algorithm of the support vector machine algorithm, and a new improved algorithm has been developed, combined with Bayesian algorithms. The proposed algorithm can improve calculation efficiency, and defects of the original algorithm are eliminated. According to the verification, the method has proved to be valid. At the same time, it can reduce calculation time and improve prediction efficiency.

  12. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Directory of Open Access Journals (Sweden)

    Drzewiecki Wojciech

    2016-12-01

    Full Text Available In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.

  13. A support vector machine approach to detect financial statement fraud in South Africa: A first look

    CSIR Research Space (South Africa)

    Moepya, SO

    2014-04-01

    Full Text Available Auditors face the difficult task of detecting companies that issue manipulated financial statements. In recent years, machine learning methods have provided a feasible solution to this task. This study develops support vector machine (SVM) models...

  14. Active damage detection method based on support vector machine and impulse response

    International Nuclear Information System (INIS)

    Taniguchi, Ryuta; Mita, Akira

    2004-01-01

    An active damage detection method was proposed to characterize damage in bolted joints. The purpose of this study is to propose a damage detection method that can obtain the detailed information of the damage by creating feature vectors for pattern recognition. In the proposed method, the wavelet transform is applied to the sensor signals, and the feature vectors are defined by second power average of the amplitude. The feature vectors generated by experiments were successfully used as the training data for Support Vector Machine (SVM). By applying the wavelet transform to time-frequency analysis, the accuracy of pattern recognition was raised in both correlation coefficient and SVM applications. Moreover, the SVM could identify the damage with very strong discernment capability than others. Applicability of the proposed method was successfully demonstrated. (author)

  15. A Support Vector Machine Approach to Dutch Part-of-Speech Tagging

    NARCIS (Netherlands)

    Poel, Mannes; Stegeman, L.; op den Akker, Hendrikus J.A.; Berthold, M.R.; Shawe-Taylor, J.; Lavrac, N.

    Part-of-Speech tagging, the assignment of Parts-of-Speech to the words in a given context of use, is a basic technique in many systems that handle natural languages. This paper describes a method for supervised training of a Part-of-Speech tagger using a committee of Support Vector Machines on a

  16. Credit Scoring by Fuzzy Support Vector Machines with a Novel Membership Function

    Directory of Open Access Journals (Sweden)

    Jian Shi

    2016-11-01

    Full Text Available Due to the recent financial crisis and European debt crisis, credit risk evaluation has become an increasingly important issue for financial institutions. Reliable credit scoring models are crucial for commercial banks to evaluate the financial performance of clients and have been widely studied in the fields of statistics and machine learning. In this paper a novel fuzzy support vector machine (SVM credit scoring model is proposed for credit risk analysis, in which fuzzy membership is adopted to indicate different contribution of each input point to the learning of SVM classification hyperplane. Considering the methodological consistency, support vector data description (SVDD is introduced to construct the fuzzy membership function and to reduce the effect of outliers and noises. The SVDD-based fuzzy SVM model is tested against the traditional fuzzy SVM on two real-world datasets and the research results confirm the effectiveness of the presented method.

  17. A Bayesian least-squares support vector machine method for predicting the remaining useful life of a microwave component

    Directory of Open Access Journals (Sweden)

    Fuqiang Sun

    2017-01-01

    Full Text Available Rapid and accurate lifetime prediction of critical components in a system is important to maintaining the system’s reliable operation. To this end, many lifetime prediction methods have been developed to handle various failure-related data collected in different situations. Among these methods, machine learning and Bayesian updating are the most popular ones. In this article, a Bayesian least-squares support vector machine method that combines least-squares support vector machine with Bayesian inference is developed for predicting the remaining useful life of a microwave component. A degradation model describing the change in the component’s power gain over time is developed, and the point and interval remaining useful life estimates are obtained considering a predefined failure threshold. In our case study, the radial basis function neural network approach is also implemented for comparison purposes. The results indicate that the Bayesian least-squares support vector machine method is more precise and stable in predicting the remaining useful life of this type of components.

  18. Machine Learning Algorithms Outperform Conventional Regression Models in Predicting Development of Hepatocellular Carcinoma

    Science.gov (United States)

    Singal, Amit G.; Mukherjee, Ashin; Elmunzer, B. Joseph; Higgins, Peter DR; Lok, Anna S.; Zhu, Ji; Marrero, Jorge A; Waljee, Akbar K

    2015-01-01

    Background Predictive models for hepatocellular carcinoma (HCC) have been limited by modest accuracy and lack of validation. Machine learning algorithms offer a novel methodology, which may improve HCC risk prognostication among patients with cirrhosis. Our study's aim was to develop and compare predictive models for HCC development among cirrhotic patients, using conventional regression analysis and machine learning algorithms. Methods We enrolled 442 patients with Child A or B cirrhosis at the University of Michigan between January 2004 and September 2006 (UM cohort) and prospectively followed them until HCC development, liver transplantation, death, or study termination. Regression analysis and machine learning algorithms were used to construct predictive models for HCC development, which were tested on an independent validation cohort from the Hepatitis C Antiviral Long-term Treatment against Cirrhosis (HALT-C) Trial. Both models were also compared to the previously published HALT-C model. Discrimination was assessed using receiver operating characteristic curve analysis and diagnostic accuracy was assessed with net reclassification improvement and integrated discrimination improvement statistics. Results After a median follow-up of 3.5 years, 41 patients developed HCC. The UM regression model had a c-statistic of 0.61 (95%CI 0.56-0.67), whereas the machine learning algorithm had a c-statistic of 0.64 (95%CI 0.60–0.69) in the validation cohort. The machine learning algorithm had significantly better diagnostic accuracy as assessed by net reclassification improvement (pmachine learning algorithm (p=0.047). Conclusion Machine learning algorithms improve the accuracy of risk stratifying patients with cirrhosis and can be used to accurately identify patients at high-risk for developing HCC. PMID:24169273

  19. Multifrequency spiral vector model for the brushless doubly-fed induction machine

    DEFF Research Database (Denmark)

    Han, Peng; Cheng, Ming; Zhu, Xinkai

    2017-01-01

    This paper presents a multifrequency spiral vector model for both steady-state and dynamic performance analysis of the brushless doubly-fed induction machine (BDFIM) with a nested-loop rotor. Winding function theory is first employed to give a full picture of the inductance characteristics...... analytically, revealing the underlying relationship between harmonic components of stator-rotor mutual inductances and the airgap magnetic field distribution. Different from existing vector models, which only model the fundamental components of mutual inductances, the proposed vector model takes...... into consideration the low-order space harmonic coupling by incorporating nonsinusoidal inductances into modeling process. A new model order reduction approach is then proposed to transform the nested-loop rotor into an equivalent single-loop one. The effectiveness of the proposed modelling method is verified by 2D...

  20. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features.

    Science.gov (United States)

    Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-03-29

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one-hundred and sixty rainbow trout ( Oncorhynchus mykiss ) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets including Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k -Nearest neighbours ( k -NN). The SVM with radial based kernel provided the best classifier with correct classification rate (CCR) of 82% and Kappa coefficient of 0.65. Although the both LR and RF methods were less accurate than SVM, they achieved good classification with CCR 75% and 70% respectively. The k -NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as the fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, these was a close association between image-based features and fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during the cultivation by evaluating diet's effects on fish skin.

  1. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss Classification Using Image-Based Features

    Directory of Open Access Journals (Sweden)

    Mohammadmehdi Saberioon

    2018-03-01

    Full Text Available The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one-hundred and sixty rainbow trout (Oncorhynchus mykiss were fed either a fish-meal based diet (80 fish or a 100% plant-based diet (80 fish and photographed using consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets including Random forest (RF, Support vector machine (SVM, Logistic regression (LR and k-Nearest neighbours (k-NN. The SVM with radial based kernel provided the best classifier with correct classification rate (CCR of 82% and Kappa coefficient of 0.65. Although the both LR and RF methods were less accurate than SVM, they achieved good classification with CCR 75% and 70% respectively. The k-NN was the least accurate (40% classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as the fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, these was a close association between image-based features and fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during the cultivation by evaluating diet’s effects on fish skin.

  2. Short-term load forecasting with increment regression tree

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Jingfei; Stenzel, Juergen [Darmstadt University of Techonology, Darmstadt 64283 (Germany)

    2006-06-15

    This paper presents a new regression tree method for short-term load forecasting. Both increment and non-increment tree are built according to the historical data to provide the data space partition and input variable selection. Support vector machine is employed to the samples of regression tree nodes for further fine regression. Results of different tree nodes are integrated through weighted average method to obtain the comprehensive forecasting result. The effectiveness of the proposed method is demonstrated through its application to an actual system. (author)

  3. A comparison of machine learning techniques for predicting downstream acid mine drainage

    CSIR Research Space (South Africa)

    van Zyl, TL

    2014-07-01

    Full Text Available windowing approach over historical values to generate a prediction for the current value. We evaluate a number of Machine Learning techniques as regressors including Support Vector Regression, Random Forests, Stochastic Gradient Decent Regression, Linear...

  4. Sentiment Analysis in the Sales Review of Indonesian Marketplace by Utilizing Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Anang Anggono Lutfi

    2018-04-01

    Full Text Available The online store is changing people’s shopping behavior. Despite the fact, the potential customer’s distrust in the quality of products and service is one of the online store’s weaknesses. A review is provided by the online stores to overcome this weakness. Customers often write a review using languages that are not well structured. Sentiment analysis is used to extract the polarity of the unstructured texts. This research attempted to do a sentiment analysis in the sales review. Sentiment analysis in sales reviews can be used as a tool to evaluate the sales. This research intends to conduct a sentiment analysis in the sales review of Indonesian marketplace by utilizing Support Vector Machine and Naive Bayes. The reviews of the data are gathered from one of Indonesian marketplace, Bukalapak. The data are classified into positive or negative class. TF-IDF is used to feature extraction. The experiment shows that Support Vector Machine with linear kernel provides higher accuracy than Naive Bayes. Support Vector Machine shows the highest accuracy average. The generated accuracy is 93.65%. This approach of sentiment analysis in sales review can be used as the base of intelligent sales evaluation for online stores in the future.

  5. Predicting respiratory tumor motion with multi-dimensional adaptive filters and support vector regression

    International Nuclear Information System (INIS)

    Riaz, Nadeem; Wiersma, Rodney; Mao Weihua; Xing Lei; Shanker, Piyush; Gudmundsson, Olafur; Widrow, Bernard

    2009-01-01

    Intra-fraction tumor tracking methods can improve radiation delivery during radiotherapy sessions. Image acquisition for tumor tracking and subsequent adjustment of the treatment beam with gating or beam tracking introduces time latency and necessitates predicting the future position of the tumor. This study evaluates the use of multi-dimensional linear adaptive filters and support vector regression to predict the motion of lung tumors tracked at 30 Hz. We expand on the prior work of other groups who have looked at adaptive filters by using a general framework of a multiple-input single-output (MISO) adaptive system that uses multiple correlated signals to predict the motion of a tumor. We compare the performance of these two novel methods to conventional methods like linear regression and single-input, single-output adaptive filters. At 400 ms latency the average root-mean-square-errors (RMSEs) for the 14 treatment sessions studied using no prediction, linear regression, single-output adaptive filter, MISO and support vector regression are 2.58, 1.60, 1.58, 1.71 and 1.26 mm, respectively. At 1 s, the RMSEs are 4.40, 2.61, 3.34, 2.66 and 1.93 mm, respectively. We find that support vector regression most accurately predicts the future tumor position of the methods studied and can provide a RMSE of less than 2 mm at 1 s latency. Also, a multi-dimensional adaptive filter framework provides improved performance over single-dimension adaptive filters. Work is underway to combine these two frameworks to improve performance.

  6. Robust anti-synchronization of uncertain chaotic systems based on multiple-kernel least squares support vector machine modeling

    International Nuclear Information System (INIS)

    Chen Qiang; Ren Xuemei; Na Jing

    2011-01-01

    Highlights: Model uncertainty of the system is approximated by multiple-kernel LSSVM. Approximation errors and disturbances are compensated in the controller design. Asymptotical anti-synchronization is achieved with model uncertainty and disturbances. Abstract: In this paper, we propose a robust anti-synchronization scheme based on multiple-kernel least squares support vector machine (MK-LSSVM) modeling for two uncertain chaotic systems. The multiple-kernel regression, which is a linear combination of basic kernels, is designed to approximate system uncertainties by constructing a multiple-kernel Lagrangian function and computing the corresponding regression parameters. Then, a robust feedback control based on MK-LSSVM modeling is presented and an improved update law is employed to estimate the unknown bound of the approximation error. The proposed control scheme can guarantee the asymptotic convergence of the anti-synchronization errors in the presence of system uncertainties and external disturbances. Numerical examples are provided to show the effectiveness of the proposed method.

  7. Penerapan Support Vector Machine (SVM untuk Pengkategorian Penelitian

    Directory of Open Access Journals (Sweden)

    Fithri Selva Jumeilah

    2017-07-01

    Full Text Available Research every college will continue to grow. Research will be stored in softcopy and hardcopy. The preparation of the research should be categorized in order to facilitate the search for people who need reference. To categorize the research, we need a method for text mining, one of them is with the implementation of Support Vector Machines (SVM. The data used to recognize the characteristics of each category then it takes secondary data which is a collection of abstracts of research. The data will be pre-processed with several stages: case folding converts all the letters into lowercase, stop words removal removal of very common words, tokenizing discard punctuation, and stemming searching for root words by removing the prefix and suffix. Further data that has undergone preprocessing will be converted into a numerical form with for the term weighting stage that is the weighting contribution of each word. From the results of term weighting then obtained data that can be used for data training and test data. The training process is done by providing input in the form of text data that is known to the class or category. Then by using the Support Vector Machines algorithm, the input data is transformed into a rule, function, or knowledge model that can be used in the prediction process. From the results of this study obtained that the categorization of research produced by SVM has been very good. This is proven by the results of the test which resulted in an accuracy of 90%.

  8. Support Vector Machines as tools for mortality graduation

    Directory of Open Access Journals (Sweden)

    Alberto Olivares

    2011-01-01

    Full Text Available A topic of interest in demographic and biostatistical analysis as well as in actuarial practice,is the graduation of the age-specific mortality pattern. A classical graduation technique is to fit parametric models. Recently, particular emphasis has been given to graduation using nonparametric techniques. Support Vector Machines (SVM is an innovative methodology that could be utilized for mortality graduation purposes. This paper evaluates SVM techniques as tools for graduating mortality rates. We apply SVM to empirical death rates from a variety of populations and time periods. For comparison, we also apply standard graduation techniques to the same data.

  9. A Genetic Algorithm Based Support Vector Machine Model for Blood-Brain Barrier Penetration Prediction

    Directory of Open Access Journals (Sweden)

    Daqing Zhang

    2015-01-01

    Full Text Available Blood-brain barrier (BBB is a highly complex physical barrier determining what substances are allowed to enter the brain. Support vector machine (SVM is a kernel-based machine learning method that is widely used in QSAR study. For a successful SVM model, the kernel parameters for SVM and feature subset selection are the most important factors affecting prediction accuracy. In most studies, they are treated as two independent problems, but it has been proven that they could affect each other. We designed and implemented genetic algorithm (GA to optimize kernel parameters and feature subset selection for SVM regression and applied it to the BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available log BB models. Therefore, to optimize both SVM parameters and feature subset simultaneously with genetic algorithm is a better approach than other methods that treat the two problems separately. Analysis of our log BB model suggests that carboxylic acid group, polar surface area (PSA/hydrogen-bonding ability, lipophilicity, and molecular charge play important role in BBB penetration. Among those properties relevant to BBB penetration, lipophilicity could enhance the BBB penetration while all the others are negatively correlated with BBB penetration.

  10. Machine learning in virtual screening.

    Science.gov (United States)

    Melville, James L; Burke, Edmund K; Hirst, Jonathan D

    2009-05-01

    In this review, we highlight recent applications of machine learning to virtual screening, focusing on the use of supervised techniques to train statistical learning algorithms to prioritize databases of molecules as active against a particular protein target. Both ligand-based similarity searching and structure-based docking have benefited from machine learning algorithms, including naïve Bayesian classifiers, support vector machines, neural networks, and decision trees, as well as more traditional regression techniques. Effective application of these methodologies requires an appreciation of data preparation, validation, optimization, and search methodologies, and we also survey developments in these areas.

  11. Support Vector Regression Model Based on Empirical Mode Decomposition and Auto Regression for Electric Load Forecasting

    Directory of Open Access Journals (Sweden)

    Hong-Juan Li

    2013-04-01

    Full Text Available Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by strong non-linear learning capability of support vector regression (SVR, this paper presents a SVR model hybridized with the empirical mode decomposition (EMD method and auto regression (AR for electric load forecasting. The electric load data of the New South Wales (Australia market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.

  12. Modeling and prediction of Turkey's electricity consumption using Support Vector Regression

    International Nuclear Information System (INIS)

    Kavaklioglu, Kadir

    2011-01-01

    Support Vector Regression (SVR) methodology is used to model and predict Turkey's electricity consumption. Among various SVR formalisms, ε-SVR method was used since the training pattern set was relatively small. Electricity consumption is modeled as a function of socio-economic indicators such as population, Gross National Product, imports and exports. In order to facilitate future predictions of electricity consumption, a separate SVR model was created for each of the input variables using their current and past values; and these models were combined to yield consumption prediction values. A grid search for the model parameters was performed to find the best ε-SVR model for each variable based on Root Mean Square Error. Electricity consumption of Turkey is predicted until 2026 using data from 1975 to 2006. The results show that electricity consumption can be modeled using Support Vector Regression and the models can be used to predict future electricity consumption. (author)

  13. Vector control of three-phase AC machines system development in the practice

    CERN Document Server

    Quang, Nguyen Phung; Dittrich, J

    2015-01-01

    This book addresses the vector control of three-phase AC machines, in particular induction motors with squirrel-cage rotors (IM), permanent magnet synchronous motors (PMSM) and doubly-fed induction machines (DFIM), from a practical design and development perspective. The main focus is on the application of IM and PMSM in electrical drive systems, where field-orientated control has been successfully established in practice. It also discusses the use of grid-voltage oriented control of DFIMs in wind power plants. This second, enlarged edition includes new insights into flatness-based  nonlinear

  14. Performance and optimization of support vector machines in high-energy physics classification problems

    International Nuclear Information System (INIS)

    Sahin, M.Ö.; Krücker, D.; Melzer-Pellmann, I.-A.

    2016-01-01

    In this paper we promote the use of Support Vector Machines (SVM) as a machine learning tool for searches in high-energy physics. As an example for a new-physics search we discuss the popular case of Supersymmetry at the Large Hadron Collider. We demonstrate that the SVM is a valuable tool and show that an automated discovery-significance based optimization of the SVM hyper-parameters is a highly efficient way to prepare an SVM for such applications.

  15. Performance and optimization of support vector machines in high-energy physics classification problems

    Energy Technology Data Exchange (ETDEWEB)

    Sahin, M.Ö., E-mail: ozgur.sahin@desy.de; Krücker, D., E-mail: dirk.kruecker@desy.de; Melzer-Pellmann, I.-A., E-mail: isabell.melzer@desy.de

    2016-12-01

    In this paper we promote the use of Support Vector Machines (SVM) as a machine learning tool for searches in high-energy physics. As an example for a new-physics search we discuss the popular case of Supersymmetry at the Large Hadron Collider. We demonstrate that the SVM is a valuable tool and show that an automated discovery-significance based optimization of the SVM hyper-parameters is a highly efficient way to prepare an SVM for such applications.

  16. Reservoir rock permeability prediction using support vector regression in an Iranian oil field

    International Nuclear Information System (INIS)

    Saffarzadeh, Sadegh; Shadizadeh, Seyed Reza

    2012-01-01

    Reservoir permeability is a critical parameter for the evaluation of hydrocarbon reservoirs. It is often measured in the laboratory from reservoir core samples or evaluated from well test data. The prediction of reservoir rock permeability utilizing well log data is important because the core analysis and well test data are usually only available from a few wells in a field and have high coring and laboratory analysis costs. Since most wells are logged, the common practice is to estimate permeability from logs using correlation equations developed from limited core data; however, these correlation formulae are not universally applicable. Recently, support vector machines (SVMs) have been proposed as a new intelligence technique for both regression and classification tasks. The theory has a strong mathematical foundation for dependence estimation and predictive learning from finite data sets. The ultimate test for any technique that bears the claim of permeability prediction from well log data is the accurate and verifiable prediction of permeability for wells where only the well log data are available. The main goal of this paper is to develop the SVM method to obtain reservoir rock permeability based on well log data. (paper)

  17. Predicting hourly cooling load in the building: A comparison of support vector machine and different artificial neural networks

    International Nuclear Information System (INIS)

    Li Qiong; Meng Qinglin; Cai Jiejin; Yoshino, Hiroshi; Mochida, Akashi

    2009-01-01

    This study presents four modeling techniques for the prediction of hourly cooling load in the building. In addition to the traditional back propagation neural network (BPNN), the radial basis function neural network (RBFNN), general regression neural network (GRNN) and support vector machine (SVM) are considered. All the prediction models have been applied to an office building in Guangzhou, China. Evaluation of the prediction accuracy of the four models is based on the root mean square error (RMSE) and mean relative error (MRE). The simulation results demonstrate that the four discussed models can be effective for building cooling load prediction. The SVM and GRNN methods can achieve better accuracy and generalization than the BPNN and RBFNN methods

  18. Combining extreme learning machines using support vector machines for breast tissue classification.

    Science.gov (United States)

    Daliri, Mohammad Reza

    2015-01-01

    In this paper, we present a new approach for breast tissue classification using the features derived from electrical impedance spectroscopy. This method is composed of a feature extraction method, feature selection phase and a classification step. The feature extraction phase derives the features from the electrical impedance spectra. The extracted features consist of the impedivity at zero frequency (I0), the phase angle at 500 KHz, the high-frequency slope of phase angle, the impedance distance between spectral ends, the area under spectrum, the normalised area, the maximum of the spectrum, the distance between impedivity at I0 and the real part of the maximum frequency point and the length of the spectral curve. The system uses the information theoretic criterion as a strategy for feature selection and the combining extreme learning machines (ELMs) for the classification phase. The results of several ELMs are combined using the support vector machines classifier, and the result of classification is reported as a measure of the performance of the system. The results indicate that the proposed system achieves high accuracy in classification of breast tissues using the electrical impedance spectroscopy.

  19. Landslide susceptibility mapping & prediction using Support Vector Machine for Mandakini River Basin, Garhwal Himalaya, India

    Science.gov (United States)

    Kumar, Deepak; Thakur, Manoj; Dubey, Chandra S.; Shukla, Dericks P.

    2017-10-01

    In recent years, various machine learning techniques have been applied for landslide susceptibility mapping. In this study, three different variants of support vector machine viz., SVM, Proximal Support Vector Machine (PSVM) and L2-Support Vector Machine - Modified Finite Newton (L2-SVM-MFN) have been applied on the Mandakini River Basin in Uttarakhand, India to carry out the landslide susceptibility mapping. Eight thematic layers such as elevation, slope, aspect, drainages, geology/lithology, buffer of thrusts/faults, buffer of streams and soil along with the past landslide data were mapped in GIS environment and used for landslide susceptibility mapping in MATLAB. The study area covering 1625 km2 has merely 0.11% of area under landslides. There are 2009 pixels for past landslides out of which 50% (1000) landslides were considered as training set while remaining 50% as testing set. The performance of these techniques has been evaluated and the computational results show that L2-SVM-MFN obtains higher prediction values (0.829) of receiver operating characteristic curve (AUC-area under the curve) as compared to 0.807 for PSVM model and 0.79 for SVM. The results obtained from L2-SVM-MFN model are found to be superior than other SVM prediction models and suggest the usefulness of this technique to problem of landslide susceptibility mapping where training data is very less. However, these techniques can be used for satisfactory determination of susceptible zones with these inputs.

  20. Short-term traffic flow prediction model using particle swarm optimization–based combined kernel function-least squares support vector machine combined with chaos theory

    Directory of Open Access Journals (Sweden)

    Qiang Shang

    2016-08-01

    Full Text Available Short-term traffic flow prediction is an important part of intelligent transportation systems research and applications. For further improving the accuracy of short-time traffic flow prediction, a novel hybrid prediction model (multivariate phase space reconstruction–combined kernel function-least squares support vector machine based on multivariate phase space reconstruction and combined kernel function-least squares support vector machine is proposed. The C-C method is used to determine the optimal time delay and the optimal embedding dimension of traffic variables’ (flow, speed, and occupancy time series for phase space reconstruction. The G-P method is selected to calculate the correlation dimension of attractor which is an important index for judging chaotic characteristics of the traffic variables’ series. The optimal input form of combined kernel function-least squares support vector machine model is determined by multivariate phase space reconstruction, and the model’s parameters are optimized by particle swarm optimization algorithm. Finally, case validation is carried out using the measured data of an expressway in Xiamen, China. The experimental results suggest that the new proposed model yields better predictions compared with similar models (combined kernel function-least squares support vector machine, multivariate phase space reconstruction–generalized kernel function-least squares support vector machine, and phase space reconstruction–combined kernel function-least squares support vector machine, which indicates that the new proposed model exhibits stronger prediction ability and robustness.

  1. An Integrated Approach to Battery Health Monitoring using Bayesian Regression, Classification and State Estimation

    Data.gov (United States)

    National Aeronautics and Space Administration — The application of the Bayesian theory of managing uncertainty and complexity to regression and classification in the form of Relevance Vector Machine (RVM), and to...

  2. Support Vector Machine Based Tool for Plant Species Taxonomic Classification

    OpenAIRE

    Manimekalai .K; Vijaya.MS

    2014-01-01

    Plant species are living things and are generally categorized in terms of Domain, Kingdom, Phylum, Class, Order, Family, Genus and name of Species in a hierarchical fashion. This paper formulates the taxonomic leaf categorization problem as the hierarchical classification task and provides a suitable solution using a supervised learning technique namely support vector machine. Features are extracted from scanned images of plant leaves and trained using SVM. Only class, order, family of plants...

  3. MANCOVA for one way classification with homogeneity of regression coefficient vectors

    Science.gov (United States)

    Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.

    2017-11-01

    The MANOVA and MANCOVA are the extensions of the univariate ANOVA and ANCOVA techniques to multidimensional or vector valued observations. The assumption of a Gaussian distribution has been replaced with the Multivariate Gaussian distribution for the vectors data and residual term variables in the statistical models of these techniques. The objective of MANCOVA is to determine if there are statistically reliable mean differences that can be demonstrated between groups later modifying the newly created variable. When randomization assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting dependent variables as if all subjects scored the same on the covariates. In this research article, an extension has been made to the MANCOVA technique with more number of covariates and homogeneity of regression coefficient vectors is also tested.

  4. FUSION DECISION FOR A BIMODAL BIOMETRIC VERIFICATION SYSTEM USING SUPPORT VECTOR MACHINE AND ITS VARIATIONS

    Directory of Open Access Journals (Sweden)

    A. Teoh

    2017-12-01

    Full Text Available This paw presents fusion detection technique comparisons based on support vector machine and its variations for a bimodal biometric verification system that makes use of face images and speech utterances. The system is essentially constructed by a face expert, a speech expert and a fusion decision module. Each individual expert has been optimized to operate in automatic mode and designed for security access application. Fusion decision schemes considered are linear, weighted Support Vector Machine (SVM and linear SVM with quadratic transformation. The conditions tested include the balanced and unbalanced conditions between the two experts in order to obtain the optimum fusion module from  these techniques best suited to the target application.

  5. Magnitude And Distance Determination From The First Few Seconds Of One Three Components Seismological Station Signal Using Support Vector Machine Regression Methods

    Science.gov (United States)

    Ochoa Gutierrez, L. H.; Vargas Jimenez, C. A.; Niño Vasquez, L. F.

    2011-12-01

    The "Sabana de Bogota" (Bogota Savannah) is the most important social and economical center of Colombia. Almost the third of population is concentrated in this region and generates about the 40% of Colombia's Internal Brute Product (IBP). According to this, the zone presents an elevated vulnerability in case that a high destructive seismic event occurs. Historical evidences show that high magnitude events took place in the past with a huge damage caused to the city and indicate that is probable that such events can occur in the next years. This is the reason why we are working in an early warning generation system, using the first few seconds of a seismic signal registered by three components and wide band seismometers. Such system can be implemented using Computational Intelligence tools, designed and calibrated to the particular Geological, Structural and environmental conditions present in the region. The methods developed are expected to work on real time, thus suitable software and electronic tools need to be developed. We used Support Vector Machines Regression (SVMR) methods trained and tested with historic seismic events registered by "EL ROSAL" Station, located near Bogotá, calculating descriptors or attributes as the input of the model, from the first 6 seconds of signal. With this algorithm, we obtained less than 10% of mean absolute error and correlation coefficients greater than 85% in hypocentral distance and Magnitude estimation. With this results we consider that we can improve the method trying to have better accuracy with less signal time and that this can be a very useful model to be implemented directly in the seismological stations to generate a fast characterization of the event, broadcasting not only raw signal but pre-processed information that can be very useful for accurate Early Warning Generation.

  6. Application of Hybrid Quantum Tabu Search with Support Vector Regression (SVR for Load Forecasting

    Directory of Open Access Journals (Sweden)

    Cheng-Wen Lee

    2016-10-01

    Full Text Available Hybridizing chaotic evolutionary algorithms with support vector regression (SVR to improve forecasting accuracy is a hot topic in electricity load forecasting. Trapping at local optima and premature convergence are critical shortcomings of the tabu search (TS algorithm. This paper investigates potential improvements of the TS algorithm by applying quantum computing mechanics to enhance the search information sharing mechanism (tabu memory to improve the forecasting accuracy. This article presents an SVR-based load forecasting model that integrates quantum behaviors and the TS algorithm with the support vector regression model (namely SVRQTS to obtain a more satisfactory forecasting accuracy. Numerical examples demonstrate that the proposed model outperforms the alternatives.

  7. Reliable Fault Classification of Induction Motors Using Texture Feature Extraction and a Multiclass Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Jia Uddin

    2014-01-01

    Full Text Available This paper proposes a method for the reliable fault detection and classification of induction motors using two-dimensional (2D texture features and a multiclass support vector machine (MCSVM. The proposed model first converts time-domain vibration signals to 2D gray images, resulting in texture patterns (or repetitive patterns, and extracts these texture features by generating the dominant neighborhood structure (DNS map. The principal component analysis (PCA is then used for the purpose of dimensionality reduction of the high-dimensional feature vector including the extracted texture features due to the fact that the high-dimensional feature vector can degrade classification performance, and this paper configures an effective feature vector including discriminative fault features for diagnosis. Finally, the proposed approach utilizes the one-against-all (OAA multiclass support vector machines (MCSVMs to identify induction motor failures. In this study, the Gaussian radial basis function kernel cooperates with OAA MCSVMs to deal with nonlinear fault features. Experimental results demonstrate that the proposed approach outperforms three state-of-the-art fault diagnosis algorithms in terms of fault classification accuracy, yielding an average classification accuracy of 100% even in noisy environments.

  8. Precise on-machine extraction of the surface normal vector using an eddy current sensor array

    International Nuclear Information System (INIS)

    Wang, Yongqing; Lian, Meng; Liu, Haibo; Ying, Yangwei; Sheng, Xianjun

    2016-01-01

    To satisfy the requirements of on-machine measurement of the surface normal during complex surface manufacturing, a highly robust normal vector extraction method using an Eddy current (EC) displacement sensor array is developed, the output of which is almost unaffected by surface brightness, machining coolant and environmental noise. A precise normal vector extraction model based on a triangular-distributed EC sensor array is first established. Calibration of the effects of object surface inclination and coupling interference on measurement results, and the relative position of EC sensors, is involved. A novel apparatus employing three EC sensors and a force transducer was designed, which can be easily integrated into the computer numerical control (CNC) machine tool spindle and/or robot terminal execution. Finally, to test the validity and practicability of the proposed method, typical experiments were conducted with specified testing pieces using the developed approach and system, such as an inclined plane and cylindrical and spherical surfaces. (paper)

  9. Precise on-machine extraction of the surface normal vector using an eddy current sensor array

    Science.gov (United States)

    Wang, Yongqing; Lian, Meng; Liu, Haibo; Ying, Yangwei; Sheng, Xianjun

    2016-11-01

    To satisfy the requirements of on-machine measurement of the surface normal during complex surface manufacturing, a highly robust normal vector extraction method using an Eddy current (EC) displacement sensor array is developed, the output of which is almost unaffected by surface brightness, machining coolant and environmental noise. A precise normal vector extraction model based on a triangular-distributed EC sensor array is first established. Calibration of the effects of object surface inclination and coupling interference on measurement results, and the relative position of EC sensors, is involved. A novel apparatus employing three EC sensors and a force transducer was designed, which can be easily integrated into the computer numerical control (CNC) machine tool spindle and/or robot terminal execution. Finally, to test the validity and practicability of the proposed method, typical experiments were conducted with specified testing pieces using the developed approach and system, such as an inclined plane and cylindrical and spherical surfaces.

  10. Support vector machine incremental learning triggered by wrongly predicted samples

    Science.gov (United States)

    Tang, Ting-long; Guan, Qiu; Wu, Yi-rong

    2018-05-01

    According to the classic Karush-Kuhn-Tucker (KKT) theorem, at every step of incremental support vector machine (SVM) learning, the newly adding sample which violates the KKT conditions will be a new support vector (SV) and migrate the old samples between SV set and non-support vector (NSV) set, and at the same time the learning model should be updated based on the SVs. However, it is not exactly clear at this moment that which of the old samples would change between SVs and NSVs. Additionally, the learning model will be unnecessarily updated, which will not greatly increase its accuracy but decrease the training speed. Therefore, how to choose the new SVs from old sets during the incremental stages and when to process incremental steps will greatly influence the accuracy and efficiency of incremental SVM learning. In this work, a new algorithm is proposed to select candidate SVs and use the wrongly predicted sample to trigger the incremental processing simultaneously. Experimental results show that the proposed algorithm can achieve good performance with high efficiency, high speed and good accuracy.

  11. Pair- ${v}$ -SVR: A Novel and Efficient Pairing nu-Support Vector Regression Algorithm.

    Science.gov (United States)

    Hao, Pei-Yi

    This paper proposes a novel and efficient pairing nu-support vector regression (pair--SVR) algorithm that combines successfully the superior advantages of twin support vector regression (TSVR) and classical -SVR algorithms. In spirit of TSVR, the proposed pair--SVR solves two quadratic programming problems (QPPs) of smaller size rather than a single larger QPP, and thus has faster learning speed than classical -SVR. The significant advantage of our pair--SVR over TSVR is the improvement in the prediction speed and generalization ability by introducing the concepts of the insensitive zone and the regularization term that embodies the essence of statistical learning theory. Moreover, pair--SVR has additional advantage of using parameter for controlling the bounds on fractions of SVs and errors. Furthermore, the upper bound and lower bound functions of the regression model estimated by pair--SVR capture well the characteristics of data distributions, thus facilitating automatic estimation of the conditional mean and predictive variance simultaneously. This may be useful in many cases, especially when the noise is heteroscedastic and depends strongly on the input values. The experimental results validate the superiority of our pair--SVR in both training/prediction speed and generalization ability.This paper proposes a novel and efficient pairing nu-support vector regression (pair--SVR) algorithm that combines successfully the superior advantages of twin support vector regression (TSVR) and classical -SVR algorithms. In spirit of TSVR, the proposed pair--SVR solves two quadratic programming problems (QPPs) of smaller size rather than a single larger QPP, and thus has faster learning speed than classical -SVR. The significant advantage of our pair--SVR over TSVR is the improvement in the prediction speed and generalization ability by introducing the concepts of the insensitive zone and the regularization term that embodies the essence of statistical learning theory

  12. Support vector regression model based predictive control of water level of U-tube steam generators

    Energy Technology Data Exchange (ETDEWEB)

    Kavaklioglu, Kadir, E-mail: kadir.kavaklioglu@pau.edu.tr

    2014-10-15

    Highlights: • Water level of U-tube steam generators was controlled in a model predictive fashion. • Models for steam generator water level were built using support vector regression. • Cost function minimization for future optimal controls was performed by using the steepest descent method. • The results indicated the feasibility of the proposed method. - Abstract: A predictive control algorithm using support vector regression based models was proposed for controlling the water level of U-tube steam generators of pressurized water reactors. Steam generator data were obtained using a transfer function model of U-tube steam generators. Support vector regression based models were built using a time series type model structure for five different operating powers. Feedwater flow controls were calculated by minimizing a cost function that includes the level error, the feedwater change and the mismatch between feedwater and steam flow rates. Proposed algorithm was applied for a scenario consisting of a level setpoint change and a steam flow disturbance. The results showed that steam generator level can be controlled at all powers effectively by the proposed method.

  13. Pileup Subtraction and Jet Energy Prediction Using Machine Learning

    OpenAIRE

    Kong, Vein S; Li, Jiakun; Zhang, Yujia

    2015-01-01

    In the Large Hardron Collider (LHC), multiple proton-proton collisions cause pileup in reconstructing energy information for a single primary collision (jet). This project aims to select the most important features and create a model to accurately estimate jet energy. Different machine learning methods were explored, including linear regression, support vector regression and decision tree. The best result is obtained by linear regression with predictive features and the performance is improve...

  14. A Combination of Geographically Weighted Regression, Particle Swarm Optimization and Support Vector Machine for Landslide Susceptibility Mapping: A Case Study at Wanzhou in the Three Gorges Area, China.

    Science.gov (United States)

    Yu, Xianyu; Wang, Yi; Niu, Ruiqing; Hu, Youjian

    2016-05-11

    In this study, a novel coupling model for landslide susceptibility mapping is presented. In practice, environmental factors may have different impacts at a local scale in study areas. To provide better predictions, a geographically weighted regression (GWR) technique is firstly used in our method to segment study areas into a series of prediction regions with appropriate sizes. Meanwhile, a support vector machine (SVM) classifier is exploited in each prediction region for landslide susceptibility mapping. To further improve the prediction performance, the particle swarm optimization (PSO) algorithm is used in the prediction regions to obtain optimal parameters for the SVM classifier. To evaluate the prediction performance of our model, several SVM-based prediction models are utilized for comparison on a study area of the Wanzhou district in the Three Gorges Reservoir. Experimental results, based on three objective quantitative measures and visual qualitative evaluation, indicate that our model can achieve better prediction accuracies and is more effective for landslide susceptibility mapping. For instance, our model can achieve an overall prediction accuracy of 91.10%, which is 7.8%-19.1% higher than the traditional SVM-based models. In addition, the obtained landslide susceptibility map by our model can demonstrate an intensive correlation between the classified very high-susceptibility zone and the previously investigated landslides.

  15. A novel representation for apoptosis protein subcellular localization prediction using support vector machine.

    Science.gov (United States)

    Zhang, Li; Liao, Bo; Li, Dachao; Zhu, Wen

    2009-07-21

    Apoptosis, or programmed cell death, plays an important role in development of an organism. Obtaining information on subcellular location of apoptosis proteins is very helpful to understand the apoptosis mechanism. In this paper, based on the concept that the position distribution information of amino acids is closely related with the structure and function of proteins, we introduce the concept of distance frequency [Matsuda, S., Vert, J.P., Ueda, N., Toh, H., Akutsu, T., 2005. A novel representation of protein sequences for prediction of subcellular location using support vector machines. Protein Sci. 14, 2804-2813] and propose a novel way to calculate distance frequencies. In order to calculate the local features, each protein sequence is separated into p parts with the same length in our paper. Then we use the novel representation of protein sequences and adopt support vector machine to predict subcellular location. The overall prediction accuracy is significantly improved by jackknife test.

  16. Exact analytical modeling of magnetic vector potential in surface inset permanent magnet DC machines considering magnet segmentation

    Science.gov (United States)

    Jabbari, Ali

    2018-01-01

    Surface inset permanent magnet DC machine can be used as an alternative in automation systems due to their high efficiency and robustness. Magnet segmentation is a common technique in order to mitigate pulsating torque components in permanent magnet machines. An accurate computation of air-gap magnetic field distribution is necessary in order to calculate machine performance. An exact analytical method for magnetic vector potential calculation in surface inset permanent magnet machines considering magnet segmentation has been proposed in this paper. The analytical method is based on the resolution of Laplace and Poisson equations as well as Maxwell equation in polar coordinate by using sub-domain method. One of the main contributions of the paper is to derive an expression for the magnetic vector potential in the segmented PM region by using hyperbolic functions. The developed method is applied on the performance computation of two prototype surface inset magnet segmented motors with open circuit and on load conditions. The results of these models are validated through FEM method.

  17. Application of support vector machine model for enhancing the diagnostic value of tumor markers in gastric cancer

    International Nuclear Information System (INIS)

    Wang Hui; Huang Gang

    2010-01-01

    Objective: To evaluate the early diagnostic value of tumor markers for gastric cancer using support vector machine (SVM) model. Methods: Subjects involved in the study consisted of 262 cases with gastric cancer, 156 cases with benign gastric diseases and 149 healthy controls. From those subjects, five tumor markers, carcinoembryonic antigen (CEA), carbohydrate (CA) 125, CA19-9, alphafetoprotein (AFP) and CA50, were assayed and collected to make the datasets. To modify SVM model to fit the diagnostic classifiers, radial basis function was adopted and kernel function was optimized and validated by grid search and cross validation. For comparative study, methods of combination tests of five markers, Logistic regression, and decision tree were also used. Results: For gastric cancer, the diagnostic accuracy of the combination tests, Logistic regression, decision tree and SVM model were 46.2%, 64.5%, 63.9% and 95.1% respectively. SVM model significantly elevated the diagnostic value comparing with other three methods. Conclusion: The application of SVM model is of high value in enhancing the tumor marker for the diagnosis of gastric cancer. (authors)

  18. Performance and optimization of support vector machines in high-energy physics classification problems

    Energy Technology Data Exchange (ETDEWEB)

    Sahin, M.Oe.; Kruecker, D.; Melzer-Pellmann, I.A.

    2016-01-15

    In this paper we promote the use of Support Vector Machines (SVM) as a machine learning tool for searches in high-energy physics. As an example for a new-physics search we discuss the popular case of Supersymmetry at the Large Hadron Collider. We demonstrate that the SVM is a valuable tool and show that an automated discovery-significance based optimization of the SVM hyper-parameters is a highly efficient way to prepare an SVM for such applications. A new C++ LIBSVM interface called SVM-HINT is developed and available on Github.

  19. Performance and optimization of support vector machines in high-energy physics classification problems

    International Nuclear Information System (INIS)

    Sahin, M.Oe.; Kruecker, D.; Melzer-Pellmann, I.A.

    2016-01-01

    In this paper we promote the use of Support Vector Machines (SVM) as a machine learning tool for searches in high-energy physics. As an example for a new-physics search we discuss the popular case of Supersymmetry at the Large Hadron Collider. We demonstrate that the SVM is a valuable tool and show that an automated discovery-significance based optimization of the SVM hyper-parameters is a highly efficient way to prepare an SVM for such applications. A new C++ LIBSVM interface called SVM-HINT is developed and available on Github.

  20. A Numerical Comparison of Rule Ensemble Methods and Support Vector Machines

    Energy Technology Data Exchange (ETDEWEB)

    Meza, Juan C.; Woods, Mark

    2009-12-18

    Machine or statistical learning is a growing field that encompasses many scientific problems including estimating parameters from data, identifying risk factors in health studies, image recognition, and finding clusters within datasets, to name just a few examples. Statistical learning can be described as 'learning from data' , with the goal of making a prediction of some outcome of interest. This prediction is usually made on the basis of a computer model that is built using data where the outcomes and a set of features have been previously matched. The computer model is called a learner, hence the name machine learning. In this paper, we present two such algorithms, a support vector machine method and a rule ensemble method. We compared their predictive power on three supernova type 1a data sets provided by the Nearby Supernova Factory and found that while both methods give accuracies of approximately 95%, the rule ensemble method gives much lower false negative rates.

  1. Automated valve fault detection based on acoustic emission parameters and support vector machine

    Directory of Open Access Journals (Sweden)

    Salah M. Ali

    2018-03-01

    Full Text Available Reciprocating compressors are one of the most used types of compressors with wide applications in industry. The most common failure in reciprocating compressors is always related to the valves. Therefore, a reliable condition monitoring method is required to avoid the unplanned shutdown in this category of machines. Acoustic emission (AE technique is one of the effective recent methods in the field of valve condition monitoring. However, a major challenge is related to the analysis of AE signal which perhaps only depends on the experience and knowledge of technicians. This paper proposes automated fault detection method using support vector machine (SVM and AE parameters in an attempt to reduce human intervention in the process. Experiments were conducted on a single stage reciprocating air compressor by combining healthy and faulty valve conditions to acquire the AE signals. Valve functioning was identified through AE waveform analysis. SVM faults detection model was subsequently devised and validated based on training and testing samples respectively. The results demonstrated automatic valve fault detection model with accuracy exceeding 98%. It is believed that valve faults can be detected efficiently without human intervention by employing the proposed model for a single stage reciprocating compressor. Keywords: Condition monitoring, Faults detection, Signal analysis, Acoustic emission, Support vector machine

  2. Linking Simple Economic Theory Models and the Cointegrated Vector AutoRegressive Model

    DEFF Research Database (Denmark)

    Møller, Niels Framroze

    This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector-Auto-Regressive model (CVAR). By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail, how the theoretical model and its stru....... Further fundamental extensions and advances to more sophisticated theory models, such as those related to dynamics and expectations (in the structural relations) are left for future papers......This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector-Auto-Regressive model (CVAR). By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail, how the theoretical model and its......, it is demonstrated how other controversial hypotheses such as Rational Expectations can be formulated directly as restrictions on the CVAR-parameters. A simple example of a "Neoclassical synthetic" AS-AD model is also formulated. Finally, the partial- general equilibrium distinction is related to the CVAR as well...

  3. Fast Monte Carlo reliability evaluation using support vector machine

    International Nuclear Information System (INIS)

    Rocco, Claudio M.; Moreno, Jose Ali

    2002-01-01

    This paper deals with the feasibility of using support vector machine (SVM) to build empirical models for use in reliability evaluation. The approach takes advantage of the speed of SVM in the numerous model calculations typically required to perform a Monte Carlo reliability evaluation. The main idea is to develop an estimation algorithm, by training a model on a restricted data set, and replace system performance evaluation by a simpler calculation, which provides reasonably accurate model outputs. The proposed approach is illustrated by several examples. Excellent system reliability results are obtained by training a SVM with a small amount of information

  4. Feature Import Vector Machine: A General Classifier with Flexible Feature Selection.

    Science.gov (United States)

    Ghosh, Samiran; Wang, Yazhen

    2015-02-01

    The support vector machine (SVM) and other reproducing kernel Hilbert space (RKHS) based classifier systems are drawing much attention recently due to its robustness and generalization capability. General theme here is to construct classifiers based on the training data in a high dimensional space by using all available dimensions. The SVM achieves huge data compression by selecting only few observations which lie close to the boundary of the classifier function. However when the number of observations are not very large (small n ) but the number of dimensions/features are large (large p ), then it is not necessary that all available features are of equal importance in the classification context. Possible selection of an useful fraction of the available features may result in huge data compression. In this paper we propose an algorithmic approach by means of which such an optimal set of features could be selected. In short, we reverse the traditional sequential observation selection strategy of SVM to that of sequential feature selection. To achieve this we have modified the solution proposed by Zhu and Hastie (2005) in the context of import vector machine (IVM), to select an optimal sub-dimensional model to build the final classifier with sufficient accuracy.

  5. Successive overrelaxation for laplacian support vector machine.

    Science.gov (United States)

    Qi, Zhiquan; Tian, Yingjie; Shi, Yong

    2015-04-01

    Semisupervised learning (SSL) problem, which makes use of both a large amount of cheap unlabeled data and a few unlabeled data for training, in the last few years, has attracted amounts of attention in machine learning and data mining. Exploiting the manifold regularization (MR), Belkin et al. proposed a new semisupervised classification algorithm: Laplacian support vector machines (LapSVMs), and have shown the state-of-the-art performance in SSL field. To further improve the LapSVMs, we proposed a fast Laplacian SVM (FLapSVM) solver for classification. Compared with the standard LapSVM, our method has several improved advantages as follows: 1) FLapSVM does not need to deal with the extra matrix and burden the computations related to the variable switching, which make it more suitable for large scale problems; 2) FLapSVM’s dual problem has the same elegant formulation as that of standard SVMs. This means that the kernel trick can be applied directly into the optimization model; and 3) FLapSVM can be effectively solved by successive overrelaxation technology, which converges linearly to a solution and can process very large data sets that need not reside in memory. In practice, combining the strategies of random scheduling of subproblem and two stopping conditions, the computing speed of FLapSVM is rigidly quicker to that of LapSVM and it is a valid alternative to PLapSVM.

  6. A relevance vector machine technique for the automatic detection of clustered microcalcifications (Honorable Mention Poster Award)

    Science.gov (United States)

    Wei, Liyang; Yang, Yongyi; Nishikawa, Robert M.

    2005-04-01

    Microcalcification (MC) clusters in mammograms can be important early signs of breast cancer in women. Accurate detection of MC clusters is an important but challenging problem. In this paper, we propose the use of a recently developed machine learning technique -- relevance vector machine (RVM) -- for automatic detection of MCs in digitized mammograms. RVM is based on Bayesian estimation theory, and as a feature it can yield a decision function that depends on only a very small number of so-called relevance vectors. We formulate MC detection as a supervised-learning problem, and use RVM to classify if an MC object is present or not at each location in a mammogram image. MC clusters are then identified by grouping the detected MC objects. The proposed method is tested using a database of 141 clinical mammograms, and compared with a support vector machine (SVM) classifier which we developed previously. The detection performance is evaluated using the free-response receiver operating characteristic (FROC) curves. It is demonstrated that the RVM classifier matches closely with the SVM classifier in detection performance, and does so with a much sparser kernel representation than the SVM classifier. Consequently, the RVM classifier greatly reduces the computational complexity, making it more suitable for real-time processing of MC clusters in mammograms.

  7. Efficient Multiplicative Updates for Support Vector Machines

    DEFF Research Database (Denmark)

    Potluru, Vamsi K.; Plis, Sergie N; Mørup, Morten

    2009-01-01

    (NMF) problem. This allows us to derive a novel multiplicative algorithm for solving hard and soft margin SVM. The algorithm follows as a natural extension of the updates for NMF and semi-NMF. No additional parameter setting, such as choosing learning rate, is required. Exploiting the connection......The dual formulation of the support vector machine (SVM) objective function is an instance of a nonnegative quadratic programming problem. We reformulate the SVM objective function as a matrix factorization problem which establishes a connection with the regularized nonnegative matrix factorization...... between SVM and NMF formulation, we show how NMF algorithms can be applied to the SVM problem. Multiplicative updates that we derive for SVM problem also represent novel updates for semi-NMF. Further this unified view yields algorithmic insights in both directions: we demonstrate that the Kernel Adatron...

  8. Quantum optimization for training support vector machines.

    Science.gov (United States)

    Anguita, Davide; Ridella, Sandro; Rivieccio, Fabio; Zunino, Rodolfo

    2003-01-01

    Refined concepts, such as Rademacher estimates of model complexity and nonlinear criteria for weighting empirical classification errors, represent recent and promising approaches to characterize the generalization ability of Support Vector Machines (SVMs). The advantages of those techniques lie in both improving the SVM representation ability and yielding tighter generalization bounds. On the other hand, they often make Quadratic-Programming algorithms no longer applicable, and SVM training cannot benefit from efficient, specialized optimization techniques. The paper considers the application of Quantum Computing to solve the problem of effective SVM training, especially in the case of digital implementations. The presented research compares the behavioral aspects of conventional and enhanced SVMs; experiments in both a synthetic and real-world problems support the theoretical analysis. At the same time, the related differences between Quadratic-Programming and Quantum-based optimization techniques are considered.

  9. QSAR studies of the bioactivity of hepatitis C virus (HCV) NS3/4A protease inhibitors by multiple linear regression (MLR) and support vector machine (SVM).

    Science.gov (United States)

    Qin, Zijian; Wang, Maolin; Yan, Aixia

    2017-07-01

    In this study, quantitative structure-activity relationship (QSAR) models using various descriptor sets and training/test set selection methods were explored to predict the bioactivity of hepatitis C virus (HCV) NS3/4A protease inhibitors by using a multiple linear regression (MLR) and a support vector machine (SVM) method. 512 HCV NS3/4A protease inhibitors and their IC 50 values which were determined by the same FRET assay were collected from the reported literature to build a dataset. All the inhibitors were represented with selected nine global and 12 2D property-weighted autocorrelation descriptors calculated from the program CORINA Symphony. The dataset was divided into a training set and a test set by a random and a Kohonen's self-organizing map (SOM) method. The correlation coefficients (r 2 ) of training sets and test sets were 0.75 and 0.72 for the best MLR model, 0.87 and 0.85 for the best SVM model, respectively. In addition, a series of sub-dataset models were also developed. The performances of all the best sub-dataset models were better than those of the whole dataset models. We believe that the combination of the best sub- and whole dataset SVM models can be used as reliable lead designing tools for new NS3/4A protease inhibitors scaffolds in a drug discovery pipeline. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Linear and support vector regressions based on geometrical correlation of data

    Directory of Open Access Journals (Sweden)

    Kaijun Wang

    2007-10-01

    Full Text Available Linear regression (LR and support vector regression (SVR are widely used in data analysis. Geometrical correlation learning (GcLearn was proposed recently to improve the predictive ability of LR and SVR through mining and using correlations between data of a variable (inner correlation. This paper theoretically analyzes prediction performance of the GcLearn method and proves that GcLearn LR and SVR will have better prediction performance than traditional LR and SVR for prediction tasks when good inner correlations are obtained and predictions by traditional LR and SVR are far away from their neighbor training data under inner correlation. This gives the applicable condition of GcLearn method.

  11. ROBUSTNESS OF A FACE-RECOGNITION TECHNIQUE BASED ON SUPPORT VECTOR MACHINES

    OpenAIRE

    Prashanth Harshangi; Koshy George

    2010-01-01

    The ever-increasing requirements of security concerns have placed a greater demand for face recognition surveillance systems. However, most current face recognition techniques are not quite robust with respect to factors such as variable illumination, facial expression and detail, and noise in images. In this paper, we demonstrate that face recognition using support vector machines are sufficiently robust to different kinds of noise, does not require image pre-processing, and can be used with...

  12. Future Projection with an Extreme-Learning Machine and Support Vector Regression of Reference Evapotranspiration in a Mountainous Inland Watershed in North-West China

    Directory of Open Access Journals (Sweden)

    Zhenliang Yin

    2017-11-01

    Full Text Available This study aims to project future variability of reference evapotranspiration (ET0 using artificial intelligence methods, constructed with an extreme-learning machine (ELM and support vector regression (SVR in a mountainous inland watershed in north-west China. Eight global climate model (GCM outputs retrieved from the Coupled Model Inter-comparison Project Phase 5 (CMIP5 were employed to downscale monthly ET0 for the historical period 1960–2005 as a validation approach and for the future period 2010–2099 as a projection of ET0 under the Representative Concentration Pathway (RCP 4.5 and 8.5 scenarios. The following conclusions can be drawn: the ELM and SVR methods demonstrate a very good performance in estimating Food and Agriculture Organization (FAO-56 Penman–Monteith ET0. Variation in future ET0 mainly occurs in the spring and autumn seasons, while the summer and winter ET0 changes are moderately small. Annually, the ET0 values were shown to increase at a rate of approximately 7.5 mm, 7.5 mm, 0.0 mm (8.2 mm, 15.0 mm, 15.0 mm decade−1, respectively, for the near-term projection (2010–2039, mid-term projection (2040–2069, and long-term projection (2070–2099 under the RCP4.5 (RCP8.5 scenario. Compared to the historical period, the relative changes in ET0 were found to be approximately 2%, 5% and 6% (2%, 7% and 13%, during the near, mid- and long-term periods, respectively, under the RCP4.5 (RCP8.5 warming scenarios. In accordance with the analyses, we aver that the opportunity to downscale monthly ET0 with artificial intelligence is useful in practice for water-management policies.

  13. Neutron–gamma discrimination based on the support vector machine method

    International Nuclear Information System (INIS)

    Yu, Xunzhen; Zhu, Jingjun; Lin, ShinTed; Wang, Li; Xing, Haoyang; Zhang, Caixun; Xia, Yuxi; Liu, Shukui; Yue, Qian; Wei, Weiwei; Du, Qiang; Tang, Changjian

    2015-01-01

    In this study, the combination of the support vector machine (SVM) method with the moment analysis method (MAM) is proposed and utilized to perform neutron/gamma (n/γ) discrimination of the pulses from an organic liquid scintillator (OLS). Neutron and gamma events, which can be firmly separated on the scatter plot drawn by the charge comparison method (CCM), are detected to form the training data set and the test data set for the SVM, and the MAM is used to create the feature vectors for individual events in the data sets. Compared to the traditional methods, such as CCM, the proposed method can not only discriminate the neutron and gamma signals, even at lower energy levels, but also provide the corresponding classification accuracy for each event, which is useful in validating the discrimination. Meanwhile, the proposed method can also offer a predication of the classification for the under-energy-limit events

  14. Scorebox extraction from mobile sports videos using Support Vector Machines

    Science.gov (United States)

    Kim, Wonjun; Park, Jimin; Kim, Changick

    2008-08-01

    Scorebox plays an important role in understanding contents of sports videos. However, the tiny scorebox may give the small-display-viewers uncomfortable experience in grasping the game situation. In this paper, we propose a novel framework to extract the scorebox from sports video frames. We first extract candidates by using accumulated intensity and edge information after short learning period. Since there are various types of scoreboxes inserted in sports videos, multiple attributes need to be used for efficient extraction. Based on those attributes, the optimal information gain is computed and top three ranked attributes in terms of information gain are selected as a three-dimensional feature vector for Support Vector Machines (SVM) to distinguish the scorebox from other candidates, such as logos and advertisement boards. The proposed method is tested on various videos of sports games and experimental results show the efficiency and robustness of our proposed method.

  15. Discussion About Nonlinear Time Series Prediction Using Least Squares Support Vector Machine

    International Nuclear Information System (INIS)

    Xu Ruirui; Bian Guoxing; Gao Chenfeng; Chen Tianlun

    2005-01-01

    The least squares support vector machine (LS-SVM) is used to study the nonlinear time series prediction. First, the parameter γ and multi-step prediction capabilities of the LS-SVM network are discussed. Then we employ clustering method in the model to prune the number of the support values. The learning rate and the capabilities of filtering noise for LS-SVM are all greatly improved.

  16. Optimization of Support Vector Machine (SVM) for Object Classification

    Science.gov (United States)

    Scholten, Matthew; Dhingra, Neil; Lu, Thomas T.; Chao, Tien-Hsin

    2012-01-01

    The Support Vector Machine (SVM) is a powerful algorithm, useful in classifying data into species. The SVMs implemented in this research were used as classifiers for the final stage in a Multistage Automatic Target Recognition (ATR) system. A single kernel SVM known as SVMlight, and a modified version known as a SVM with K-Means Clustering were used. These SVM algorithms were tested as classifiers under varying conditions. Image noise levels varied, and the orientation of the targets changed. The classifiers were then optimized to demonstrate their maximum potential as classifiers. Results demonstrate the reliability of SVM as a method for classification. From trial to trial, SVM produces consistent results.

  17. Tyrosine Kinase Ligand-Receptor Pair Prediction by Using Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Masayuki Yarimizu

    2015-01-01

    Full Text Available Receptor tyrosine kinases are essential proteins involved in cellular differentiation and proliferation in vivo and are heavily involved in allergic diseases, diabetes, and onset/proliferation of cancerous cells. Identifying the interacting partner of this protein, a growth factor ligand, will provide a deeper understanding of cellular proliferation/differentiation and other cell processes. In this study, we developed a method for predicting tyrosine kinase ligand-receptor pairs from their amino acid sequences. We collected tyrosine kinase ligand-receptor pairs from the Database of Interacting Proteins (DIP and UniProtKB, filtered them by removing sequence redundancy, and used them as a dataset for machine learning and assessment of predictive performance. Our prediction method is based on support vector machines (SVMs, and we evaluated several input features suitable for tyrosine kinase for machine learning and compared and analyzed the results. Using sequence pattern information and domain information extracted from sequences as input features, we obtained 0.996 of the area under the receiver operating characteristic curve. This accuracy is higher than that obtained from general protein-protein interaction pair predictions.

  18. SVM-Maj: a majorization approach to linear support vector machines with different hinge errors

    NARCIS (Netherlands)

    P.J.F. Groenen (Patrick); G.I. Nalbantov (Georgi); J.C. Bioch (Cor)

    2007-01-01

    textabstractSupport vector machines (SVM) are becoming increasingly popular for the prediction of a binary dependent variable. SVMs perform very well with respect to competing techniques. Often, the solution of an SVM is obtained by switching to the dual. In this paper, we stick to the primal

  19. A Combination of Geographically Weighted Regression, Particle Swarm Optimization and Support Vector Machine for Landslide Susceptibility Mapping: A Case Study at Wanzhou in the Three Gorges Area, China

    Directory of Open Access Journals (Sweden)

    Xianyu Yu

    2016-05-01

    Full Text Available In this study, a novel coupling model for landslide susceptibility mapping is presented. In practice, environmental factors may have different impacts at a local scale in study areas. To provide better predictions, a geographically weighted regression (GWR technique is firstly used in our method to segment study areas into a series of prediction regions with appropriate sizes. Meanwhile, a support vector machine (SVM classifier is exploited in each prediction region for landslide susceptibility mapping. To further improve the prediction performance, the particle swarm optimization (PSO algorithm is used in the prediction regions to obtain optimal parameters for the SVM classifier. To evaluate the prediction performance of our model, several SVM-based prediction models are utilized for comparison on a study area of the Wanzhou district in the Three Gorges Reservoir. Experimental results, based on three objective quantitative measures and visual qualitative evaluation, indicate that our model can achieve better prediction accuracies and is more effective for landslide susceptibility mapping. For instance, our model can achieve an overall prediction accuracy of 91.10%, which is 7.8%–19.1% higher than the traditional SVM-based models. In addition, the obtained landslide susceptibility map by our model can demonstrate an intensive correlation between the classified very high-susceptibility zone and the previously investigated landslides.

  20. Analyzing Big Data with the Hybrid Interval Regression Methods

    Directory of Open Access Journals (Sweden)

    Chia-Hui Huang

    2014-01-01

    Full Text Available Big data is a new trend at present, forcing the significant impacts on information technologies. In big data applications, one of the most concerned issues is dealing with large-scale data sets that often require computation resources provided by public cloud services. How to analyze big data efficiently becomes a big challenge. In this paper, we collaborate interval regression with the smooth support vector machine (SSVM to analyze big data. Recently, the smooth support vector machine (SSVM was proposed as an alternative of the standard SVM that has been proved more efficient than the traditional SVM in processing large-scale data. In addition the soft margin method is proposed to modify the excursion of separation margin and to be effective in the gray zone that the distribution of data becomes hard to be described and the separation margin between classes.

  1. Evolutionary-driven support vector machines for determining the degree of liver fibrosis in chronic hepatitis C.

    Science.gov (United States)

    Stoean, Ruxandra; Stoean, Catalin; Lupsor, Monica; Stefanescu, Horia; Badea, Radu

    2011-01-01

    Hepatic fibrosis, the principal pointer to the development of a liver disease within chronic hepatitis C, can be measured through several stages. The correct evaluation of its degree, based on recent different non-invasive procedures, is of current major concern. The latest methodology for assessing it is the Fibroscan and the effect of its employment is impressive. However, the complex interaction between its stiffness indicator and the other biochemical and clinical examinations towards a respective degree of liver fibrosis is hard to be manually discovered. In this respect, the novel, well-performing evolutionary-powered support vector machines are proposed towards an automated learning of the relationship between medical attributes and fibrosis levels. The traditional support vector machines have been an often choice for addressing hepatic fibrosis, while the evolutionary option has been validated on many real-world tasks and proven flexibility and good performance. The evolutionary approach is simple and direct, resulting from the hybridization of the learning component within support vector machines and the optimization engine of evolutionary algorithms. It discovers the optimal coefficients of surfaces that separate instances of distinct classes. Apart from a detached manner of establishing the fibrosis degree for new cases, a resulting formula also offers insight upon the correspondence between the medical factors and the respective outcome. What is more, a feature selection genetic algorithm can be further embedded into the method structure, in order to dynamically concentrate search only on the most relevant attributes. The data set refers 722 patients with chronic hepatitis C infection and 24 indicators. The five possible degrees of fibrosis range from F0 (no fibrosis) to F4 (cirrhosis). Since the standard support vector machines are among the most frequently used methods in recent artificial intelligence studies for hepatic fibrosis staging, the

  2. Vision-Based Perception and Classification of Mosquitoes Using Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Masataka Fuchida

    2017-01-01

    Full Text Available The need for a novel automated mosquito perception and classification method is becoming increasingly essential in recent years, with steeply increasing number of mosquito-borne diseases and associated casualties. There exist remote sensing and GIS-based methods for mapping potential mosquito inhabitants and locations that are prone to mosquito-borne diseases, but these methods generally do not account for species-wise identification of mosquitoes in closed-perimeter regions. Traditional methods for mosquito classification involve highly manual processes requiring tedious sample collection and supervised laboratory analysis. In this research work, we present the design and experimental validation of an automated vision-based mosquito classification module that can deploy in closed-perimeter mosquito inhabitants. The module is capable of identifying mosquitoes from other bugs such as bees and flies by extracting the morphological features, followed by support vector machine-based classification. In addition, this paper presents the results of three variants of support vector machine classifier in the context of mosquito classification problem. This vision-based approach to the mosquito classification problem presents an efficient alternative to the conventional methods for mosquito surveillance, mapping and sample image collection. Experimental results involving classification between mosquitoes and a predefined set of other bugs using multiple classification strategies demonstrate the efficacy and validity of the proposed approach with a maximum recall of 98%.

  3. An empirical comparison of different approaches for combining multimodal neuroimaging data with support vector machine

    NARCIS (Netherlands)

    Pettersson-Yeo, W.; Benetti, S.; Marquand, A.F.; Joules, R.; Catani, M.; Williams, S.C.; Allen, P.; McGuire, P.; Mechelli, A.

    2014-01-01

    In the pursuit of clinical utility, neuroimaging researchers of psychiatric and neurological illness are increasingly using analyses, such as support vector machine, that allow inference at the single-subject level. Recent studies employing single-modality data, however, suggest that classification

  4. Extreme Learning Machine and Moving Least Square Regression Based Solar Panel Vision Inspection

    Directory of Open Access Journals (Sweden)

    Heng Liu

    2017-01-01

    Full Text Available In recent years, learning based machine intelligence has aroused a lot of attention across science and engineering. Particularly in the field of automatic industry inspection, the machine learning based vision inspection plays a more and more important role in defect identification and feature extraction. Through learning from image samples, many features of industry objects, such as shapes, positions, and orientations angles, can be obtained and then can be well utilized to determine whether there is defect or not. However, the robustness and the quickness are not easily achieved in such inspection way. In this work, for solar panel vision inspection, we present an extreme learning machine (ELM and moving least square regression based approach to identify solder joint defect and detect the panel position. Firstly, histogram peaks distribution (HPD and fractional calculus are applied for image preprocessing. Then an ELM-based defective solder joints identification is discussed in detail. Finally, moving least square regression (MLSR algorithm is introduced for solar panel position determination. Experimental results and comparisons show that the proposed ELM and MLSR based inspection method is efficient not only in detection accuracy but also in processing speed.

  5. A comparative study of machine learning models for ethnicity classification

    Science.gov (United States)

    Trivedi, Advait; Bessie Amali, D. Geraldine

    2017-11-01

    This paper endeavours to adopt a machine learning approach to solve the problem of ethnicity recognition. Ethnicity identification is an important vision problem with its use cases being extended to various domains. Despite the multitude of complexity involved, ethnicity identification comes naturally to humans. This meta information can be leveraged to make several decisions, be it in target marketing or security. With the recent development of intelligent systems a sub module to efficiently capture ethnicity would be useful in several use cases. Several attempts to identify an ideal learning model to represent a multi-ethnic dataset have been recorded. A comparative study of classifiers such as support vector machines, logistic regression has been documented. Experimental results indicate that the logical classifier provides a much accurate classification than the support vector machine.

  6. Support Vector Regression-Based Adaptive Divided Difference Filter for Nonlinear State Estimation Problems

    Directory of Open Access Journals (Sweden)

    Hongjian Wang

    2014-01-01

    Full Text Available We present a support vector regression-based adaptive divided difference filter (SVRADDF algorithm for improving the low state estimation accuracy of nonlinear systems, which are typically affected by large initial estimation errors and imprecise prior knowledge of process and measurement noises. The derivative-free SVRADDF algorithm is significantly simpler to compute than other methods and is implemented using only functional evaluations. The SVRADDF algorithm involves the use of the theoretical and actual covariance of the innovation sequence. Support vector regression (SVR is employed to generate the adaptive factor to tune the noise covariance at each sampling instant when the measurement update step executes, which improves the algorithm’s robustness. The performance of the proposed algorithm is evaluated by estimating states for (i an underwater nonmaneuvering target bearing-only tracking system and (ii maneuvering target bearing-only tracking in an air-traffic control system. The simulation results show that the proposed SVRADDF algorithm exhibits better performance when compared with a traditional DDF algorithm.

  7. Annual Electric Load Forecasting by a Least Squares Support Vector Machine with a Fruit Fly Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Bao Wang

    2012-11-01

    Full Text Available The accuracy of annual electric load forecasting plays an important role in the economic and social benefits of electric power systems. The least squares support vector machine (LSSVM has been proven to offer strong potential in forecasting issues, particularly by employing an appropriate meta-heuristic algorithm to determine the values of its two parameters. However, these meta-heuristic algorithms have the drawbacks of being hard to understand and reaching the global optimal solution slowly. As a novel meta-heuristic and evolutionary algorithm, the fruit fly optimization algorithm (FOA has the advantages of being easy to understand and fast convergence to the global optimal solution. Therefore, to improve the forecasting performance, this paper proposes a LSSVM-based annual electric load forecasting model that uses FOA to automatically determine the appropriate values of the two parameters for the LSSVM model. By taking the annual electricity consumption of China as an instance, the computational result shows that the LSSVM combined with FOA (LSSVM-FOA outperforms other alternative methods, namely single LSSVM, LSSVM combined with coupled simulated annealing algorithm (LSSVM-CSA, generalized regression neural network (GRNN and regression model.

  8. Noise model based ν-support vector regression with its application to short-term wind speed forecasting.

    Science.gov (United States)

    Hu, Qinghua; Zhang, Shiguang; Xie, Zongxia; Mi, Jusheng; Wan, Jie

    2014-09-01

    Support vector regression (SVR) techniques are aimed at discovering a linear or nonlinear structure hidden in sample data. Most existing regression techniques take the assumption that the error distribution is Gaussian. However, it was observed that the noise in some real-world applications, such as wind power forecasting and direction of the arrival estimation problem, does not satisfy Gaussian distribution, but a beta distribution, Laplacian distribution, or other models. In these cases the current regression techniques are not optimal. According to the Bayesian approach, we derive a general loss function and develop a technique of the uniform model of ν-support vector regression for the general noise model (N-SVR). The Augmented Lagrange Multiplier method is introduced to solve N-SVR. Numerical experiments on artificial data sets, UCI data and short-term wind speed prediction are conducted. The results show the effectiveness of the proposed technique. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. KOMPARASI MODEL SUPPORT VECTOR MACHINES (SVM DAN NEURAL NETWORK UNTUK MENGETAHUI TINGKAT AKURASI PREDIKSI TERTINGGI HARGA SAHAM

    Directory of Open Access Journals (Sweden)

    R. Hadapiningradja Kusumodestoni

    2017-09-01

    Full Text Available There are many types of investments to make money, one of which is in the form of shares. Shares is a trading company dealing with securities in the global capital markets. Stock Exchange or also called stock market is actually the activities of private companies in the form of buying and selling investments. To avoid losses in investing, we need a model of predictive analysis with high accuracy and supported by data - lots of data and accurately. The correct techniques in the analysis will be able to reduce the risk for investors in investing. There are many models used in the analysis of stock price movement prediction, in this study the researchers used models of neural networks (NN and a model of support vector machine (SVM. Based on the background of the problems that have been mentioned in the previous description it can be formulated the problem as follows: need an algorithm that can predict stock prices, and need a high accuracy rate by adding a data set on the prediction, two algorithms will be investigated expected results last researchers can deduce where the algorithm accuracy rate predictions are the highest or accurate, then the purpose of this study was to mengkomparasi or compare between the two algorithms are algorithms Neural Network algorithm and Support Vector Machine which later on the end result has an accuracy rate forecast stock prices highest to see the error value RMSEnya. After doing research using the model of neural network and model of support vector machine (SVM to predict the stock using the data value of the shares on the stock index hongkong dated July 20, 2016 at 16:26 pm until the date of 15 September 2016 at 17:40 pm as many as 729 data sets within an interval of 5 minute through a process of training, learning, and then continue the process of testing so the result is that by using a neural network model of the prediction accuracy of 0.503 +/- 0.009 (micro 503 while using the model of support vector machine

  10. Water demand prediction using artificial neural networks and support vector regression

    CSIR Research Space (South Africa)

    Msiza, IS

    2008-11-01

    Full Text Available Neural Networks and Support Vector Regression Ishmael S. Msiza1, Fulufhelo V. Nelwamondo1,2, Tshilidzi Marwala3 . 1Modelling and Digital Science, CSIR, Johannesburg,SOUTH AFRICA 2Graduate School of Arts and Sciences, Harvard University, Cambridge..., Massachusetts, USA 3School of Electrical and Information Engineering, University of the Witwatersrand, Johannesburg, SOUTH AFRICA Email: imsiza@csir.co.za, nelwamon@fas.harvard.edu, tshilidzi.marwala@wits.ac.za Abstract— Computational Intelligence techniques...

  11. Object Recognition System-on-Chip Using the Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Houzet Dominique

    2005-01-01

    Full Text Available The first aim of this work is to propose the design of a system-on-chip (SoC platform dedicated to digital image and signal processing, which is tuned to implement efficiently multiply-and-accumulate (MAC vector/matrix operations. The second aim of this work is to implement a recent promising neural network method, namely, the support vector machine (SVM used for real-time object recognition, in order to build a vision machine. With such a reconfigurable and programmable SoC platform, it is possible to implement any SVM function dedicated to any object recognition problem. The final aim is to obtain an automatic reconfiguration of the SoC platform, based on the results of the learning phase on an objects' database, which makes it possible to recognize practically any object without manual programming. Recognition can be of any kind that is from image to signal data. Such a system is a general-purpose automatic classifier. Many applications can be considered as a classification problem, but are usually treated specifically in order to optimize the cost of the implemented solution. The cost of our approach is more important than a dedicated one, but in a near future, hundreds of millions of gates will be common and affordable compared to the design cost. What we are proposing here is a general-purpose classification neural network implemented on a reconfigurable SoC platform. The first version presented here is limited in size and thus in object recognition performances, but can be easily upgraded according to technology improvements.

  12. Using the Relevance Vector Machine Model Combined with Local Phase Quantization to Predict Protein-Protein Interactions from Protein Sequences

    Directory of Open Access Journals (Sweden)

    Ji-Yong An

    2016-01-01

    Full Text Available We propose a novel computational method known as RVM-LPQ that combines the Relevance Vector Machine (RVM model and Local Phase Quantization (LPQ to predict PPIs from protein sequences. The main improvements are the results of representing protein sequences using the LPQ feature representation on a Position Specific Scoring Matrix (PSSM, reducing the influence of noise using a Principal Component Analysis (PCA, and using a Relevance Vector Machine (RVM based classifier. We perform 5-fold cross-validation experiments on Yeast and Human datasets, and we achieve very high accuracies of 92.65% and 97.62%, respectively, which is significantly better than previous works. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM classifier on the Yeast dataset. The experimental results demonstrate that our RVM-LPQ method is obviously better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool for future proteomics research.

  13. River flow prediction using hybrid models of support vector regression with the wavelet transform, singular spectrum analysis and chaotic approach

    Science.gov (United States)

    Baydaroğlu, Özlem; Koçak, Kasım; Duran, Kemal

    2018-06-01

    Prediction of water amount that will enter the reservoirs in the following month is of vital importance especially for semi-arid countries like Turkey. Climate projections emphasize that water scarcity will be one of the serious problems in the future. This study presents a methodology for predicting river flow for the subsequent month based on the time series of observed monthly river flow with hybrid models of support vector regression (SVR). Monthly river flow over the period 1940-2012 observed for the Kızılırmak River in Turkey has been used for training the method, which then has been applied for predictions over a period of 3 years. SVR is a specific implementation of support vector machines (SVMs), which transforms the observed input data time series into a high-dimensional feature space (input matrix) by way of a kernel function and performs a linear regression in this space. SVR requires a special input matrix. The input matrix was produced by wavelet transforms (WT), singular spectrum analysis (SSA), and a chaotic approach (CA) applied to the input time series. WT convolutes the original time series into a series of wavelets, and SSA decomposes the time series into a trend, an oscillatory and a noise component by singular value decomposition. CA uses a phase space formed by trajectories, which represent the dynamics producing the time series. These three methods for producing the input matrix for the SVR proved successful, while the SVR-WT combination resulted in the highest coefficient of determination and the lowest mean absolute error.

  14. Prediction of Skin Sensitization with a Particle Swarm Optimized Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Chenzhong Cao

    2009-07-01

    Full Text Available Skin sensitization is the most commonly reported occupational illness, causing much suffering to a wide range of people. Identification and labeling of environmental allergens is urgently required to protect people from skin sensitization. The guinea pig maximization test (GPMT and murine local lymph node assay (LLNA are the two most important in vivo models for identification of skin sensitizers. In order to reduce the number of animal tests, quantitative structure-activity relationships (QSARs are strongly encouraged in the assessment of skin sensitization of chemicals. This paper has investigated the skin sensitization potential of 162 compounds with LLNA results and 92 compounds with GPMT results using a support vector machine. A particle swarm optimization algorithm was implemented for feature selection from a large number of molecular descriptors calculated by Dragon. For the LLNA data set, the classification accuracies are 95.37% and 88.89% for the training and the test sets, respectively. For the GPMT data set, the classification accuracies are 91.80% and 90.32% for the training and the test sets, respectively. The classification performances were greatly improved compared to those reported in the literature, indicating that the support vector machine optimized by particle swarm in this paper is competent for the identification of skin sensitizers.

  15. Prediction of Skin Sensitization with a Particle Swarm Optimized Support Vector Machine

    Science.gov (United States)

    Yuan, Hua; Huang, Jianping; Cao, Chenzhong

    2009-01-01

    Skin sensitization is the most commonly reported occupational illness, causing much suffering to a wide range of people. Identification and labeling of environmental allergens is urgently required to protect people from skin sensitization. The guinea pig maximization test (GPMT) and murine local lymph node assay (LLNA) are the two most important in vivo models for identification of skin sensitizers. In order to reduce the number of animal tests, quantitative structure-activity relationships (QSARs) are strongly encouraged in the assessment of skin sensitization of chemicals. This paper has investigated the skin sensitization potential of 162 compounds with LLNA results and 92 compounds with GPMT results using a support vector machine. A particle swarm optimization algorithm was implemented for feature selection from a large number of molecular descriptors calculated by Dragon. For the LLNA data set, the classification accuracies are 95.37% and 88.89% for the training and the test sets, respectively. For the GPMT data set, the classification accuracies are 91.80% and 90.32% for the training and the test sets, respectively. The classification performances were greatly improved compared to those reported in the literature, indicating that the support vector machine optimized by particle swarm in this paper is competent for the identification of skin sensitizers. PMID:19742136

  16. Modeling DNA affinity landscape through two-round support vector regression with weighted degree kernels

    KAUST Repository

    Wang, Xiaolei; Kuwahara, Hiroyuki; Gao, Xin

    2014-01-01

    high-quality estimates of such complex affinity landscapes is, thus, essential to the control of gene expression and the advance of synthetic biology. Results: Here, we propose a two-round prediction method that is based on support vector regression

  17. Stochastic subset selection for learning with kernel machines.

    Science.gov (United States)

    Rhinelander, Jason; Liu, Xiaoping P

    2012-06-01

    Kernel machines have gained much popularity in applications of machine learning. Support vector machines (SVMs) are a subset of kernel machines and generalize well for classification, regression, and anomaly detection tasks. The training procedure for traditional SVMs involves solving a quadratic programming (QP) problem. The QP problem scales super linearly in computational effort with the number of training samples and is often used for the offline batch processing of data. Kernel machines operate by retaining a subset of observed data during training. The data vectors contained within this subset are referred to as support vectors (SVs). The work presented in this paper introduces a subset selection method for the use of kernel machines in online, changing environments. Our algorithm works by using a stochastic indexing technique when selecting a subset of SVs when computing the kernel expansion. The work described here is novel because it separates the selection of kernel basis functions from the training algorithm used. The subset selection algorithm presented here can be used in conjunction with any online training technique. It is important for online kernel machines to be computationally efficient due to the real-time requirements of online environments. Our algorithm is an important contribution because it scales linearly with the number of training samples and is compatible with current training techniques. Our algorithm outperforms standard techniques in terms of computational efficiency and provides increased recognition accuracy in our experiments. We provide results from experiments using both simulated and real-world data sets to verify our algorithm.

  18. Support vector machines and generalisation in HEP

    Science.gov (United States)

    Bevan, Adrian; Gamboa Goñi, Rodrigo; Hays, Jon; Stevenson, Tom

    2017-10-01

    We review the concept of Support Vector Machines (SVMs) and discuss examples of their use in a number of scenarios. Several SVM implementations have been used in HEP and we exemplify this algorithm using the Toolkit for Multivariate Analysis (TMVA) implementation. We discuss examples relevant to HEP including background suppression for H → τ + τ - at the LHC with several different kernel functions. Performance benchmarking leads to the issue of generalisation of hyper-parameter selection. The avoidance of fine tuning (over training or over fitting) in MVA hyper-parameter optimisation, i.e. the ability to ensure generalised performance of an MVA that is independent of the training, validation and test samples, is of utmost importance. We discuss this issue and compare and contrast performance of hold-out and k-fold cross-validation. We have extended the SVM functionality and introduced tools to facilitate cross validation in TMVA and present results based on these improvements.

  19. Non-linear HVAC computations using least square support vector machines

    International Nuclear Information System (INIS)

    Kumar, Mahendra; Kar, I.N.

    2009-01-01

    This paper aims to demonstrate application of least square support vector machines (LS-SVM) to model two complex heating, ventilating and air-conditioning (HVAC) relationships. The two applications considered are the estimation of the predicted mean vote (PMV) for thermal comfort and the generation of psychrometric chart. LS-SVM has the potential for quick, exact representations and also possesses a structure that facilitates hardware implementation. The results show very good agreement between function values computed from conventional model and LS-SVM model in real time. The robustness of LS-SVM models against input noises has also been analyzed.

  20. Single Directional SMO Algorithm for Least Squares Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Xigao Shao

    2013-01-01

    Full Text Available Working set selection is a major step in decomposition methods for training least squares support vector machines (LS-SVMs. In this paper, a new technique for the selection of working set in sequential minimal optimization- (SMO- type decomposition methods is proposed. By the new method, we can select a single direction to achieve the convergence of the optimality condition. A simple asymptotic convergence proof for the new algorithm is given. Experimental comparisons demonstrate that the classification accuracy of the new method is not largely different from the existing methods, but the training speed is faster than existing ones.

  1. New model for prediction binary mixture of antihistamine decongestant using artificial neural networks and least squares support vector machine by spectrophotometry method

    Science.gov (United States)

    Mofavvaz, Shirin; Sohrabi, Mahmoud Reza; Nezamzadeh-Ejhieh, Alireza

    2017-07-01

    In the present study, artificial neural networks (ANNs) and least squares support vector machines (LS-SVM) as intelligent methods based on absorption spectra in the range of 230-300 nm have been used for determination of antihistamine decongestant contents. In the first step, one type of network (feed-forward back-propagation) from the artificial neural network with two different training algorithms, Levenberg-Marquardt (LM) and gradient descent with momentum and adaptive learning rate back-propagation (GDX) algorithm, were employed and their performance was evaluated. The performance of the LM algorithm was better than the GDX algorithm. In the second one, the radial basis network was utilized and results compared with the previous network. In the last one, the other intelligent method named least squares support vector machine was proposed to construct the antihistamine decongestant prediction model and the results were compared with two of the aforementioned networks. The values of the statistical parameters mean square error (MSE), Regression coefficient (R2), correlation coefficient (r) and also mean recovery (%), relative standard deviation (RSD) used for selecting the best model between these methods. Moreover, the proposed methods were compared to the high- performance liquid chromatography (HPLC) as a reference method. One way analysis of variance (ANOVA) test at the 95% confidence level applied to the comparison results of suggested and reference methods that there were no significant differences between them.

  2. Sentiment Analysis of Comments on Rohingya Movement with Support Vector Machine

    OpenAIRE

    Chowdhury, Hemayet Ahmed; Nibir, Tanvir Alam; Islam, Md. Saiful

    2018-01-01

    The Rohingya Movement and Crisis caused a huge uproar in the political and economic state of Bangladesh. Refugee movement is a recurring event and a large amount of data in the form of opinions remains on social media such as Facebook, with very little analysis done on them.To analyse the comments based on all Rohingya related posts, we had to create and modify a classifier based on the Support Vector Machine algorithm. The code is implemented in python and uses scikit-learn library. A datase...

  3. A framework for multiple kernel support vector regression and its applications to siRNA efficacy prediction.

    Science.gov (United States)

    Qiu, Shibin; Lane, Terran

    2009-01-01

    The cell defense mechanism of RNA interference has applications in gene function analysis and promising potentials in human disease therapy. To effectively silence a target gene, it is desirable to select appropriate initiator siRNA molecules having satisfactory silencing capabilities. Computational prediction for silencing efficacy of siRNAs can assist this screening process before using them in biological experiments. String kernel functions, which operate directly on the string objects representing siRNAs and target mRNAs, have been applied to support vector regression for the prediction and improved accuracy over numerical kernels in multidimensional vector spaces constructed from descriptors of siRNA design rules. To fully utilize information provided by string and numerical data, we propose to unify the two in a kernel feature space by devising a multiple kernel regression framework where a linear combination of the kernels is used. We formulate the multiple kernel learning into a quadratically constrained quadratic programming (QCQP) problem, which although yields global optimal solution, is computationally demanding and requires a commercial solver package. We further propose three heuristics based on the principle of kernel-target alignment and predictive accuracy. Empirical results demonstrate that multiple kernel regression can improve accuracy, decrease model complexity by reducing the number of support vectors, and speed up computational performance dramatically. In addition, multiple kernel regression evaluates the importance of constituent kernels, which for the siRNA efficacy prediction problem, compares the relative significance of the design rules. Finally, we give insights into the multiple kernel regression mechanism and point out possible extensions.

  4. Vision based nutrient deficiency classification in maize plants using multi class support vector machines

    Science.gov (United States)

    Leena, N.; Saju, K. K.

    2018-04-01

    Nutritional deficiencies in plants are a major concern for farmers as it affects productivity and thus profit. The work aims to classify nutritional deficiencies in maize plant in a non-destructive mannerusing image processing and machine learning techniques. The colored images of the leaves are analyzed and classified with multi-class support vector machine (SVM) method. Several images of maize leaves with known deficiencies like nitrogen, phosphorous and potassium (NPK) are used to train the SVM classifier prior to the classification of test images. The results show that the method was able to classify and identify nutritional deficiencies.

  5. A Hybrid Least Square Support Vector Machine Model with Parameters Optimization for Stock Forecasting

    Directory of Open Access Journals (Sweden)

    Jian Chai

    2015-01-01

    Full Text Available This paper proposes an EMD-LSSVM (empirical mode decomposition least squares support vector machine model to analyze the CSI 300 index. A WD-LSSVM (wavelet denoising least squares support machine is also proposed as a benchmark to compare with the performance of EMD-LSSVM. Since parameters selection is vital to the performance of the model, different optimization methods are used, including simplex, GS (grid search, PSO (particle swarm optimization, and GA (genetic algorithm. Experimental results show that the EMD-LSSVM model with GS algorithm outperforms other methods in predicting stock market movement direction.

  6. Particle swarm optimization based support vector machine for damage level prediction of non-reshaped berm breakwater

    Digital Repository Service at National Institute of Oceanography (India)

    Harish, N.; Mandal, S.; Rao, S.; Patil, S.G.

    breakwater. Soft computing tools like Artificial Neural Network, Fuzzy Logic, Support Vector Machine (SVM), etc, are successfully used to solve complex problems. In the present study, SVM and hybrid of Particle Swarm Optimization (PSO) with SVM (PSO...

  7. Short-term electricity prices forecasting based on support vector regression and Auto-regressive integrated moving average modeling

    International Nuclear Information System (INIS)

    Che Jinxing; Wang Jianzhou

    2010-01-01

    In this paper, we present the use of different mathematical models to forecast electricity price under deregulated power. A successful prediction tool of electricity price can help both power producers and consumers plan their bidding strategies. Inspired by that the support vector regression (SVR) model, with the ε-insensitive loss function, admits of the residual within the boundary values of ε-tube, we propose a hybrid model that combines both SVR and Auto-regressive integrated moving average (ARIMA) models to take advantage of the unique strength of SVR and ARIMA models in nonlinear and linear modeling, which is called SVRARIMA. A nonlinear analysis of the time-series indicates the convenience of nonlinear modeling, the SVR is applied to capture the nonlinear patterns. ARIMA models have been successfully applied in solving the residuals regression estimation problems. The experimental results demonstrate that the model proposed outperforms the existing neural-network approaches, the traditional ARIMA models and other hybrid models based on the root mean square error and mean absolute percentage error.

  8. Normal mammogram detection based on local probability difference transforms and support vector machines

    International Nuclear Information System (INIS)

    Chiracharit, W.; Kumhom, P.; Chamnongthai, K.; Sun, Y.; Delp, E.J.; Babbs, C.F

    2007-01-01

    Automatic detection of normal mammograms, as a ''first look'' for breast cancer, is a new approach to computer-aided diagnosis. This approach may be limited, however, by two main causes. The first problem is the presence of poorly separable ''crossed-distributions'' in which the correct classification depends upon the value of each feature. The second problem is overlap of the feature distributions that are extracted from digitized mammograms of normal and abnormal patients. Here we introduce a new Support Vector Machine (SVM) based method utilizing with the proposed uncrossing mapping and Local Probability Difference (LPD). Crossed-distribution feature pairs are identified and mapped into a new features that can be separated by a zero-hyperplane of the new axis. The probability density functions of the features of normal and abnormal mammograms are then sampled and the local probability difference functions are estimated to enhance the features. From 1,000 ground-truth-known mammograms, 250 normal and 250 abnormal cases, including spiculated lesions, circumscribed masses or microcalcifications, are used for training a support vector machine. The classification results tested with another 250 normal and 250 abnormal sets show improved testing performances with 90% sensitivity and 89% specificity. (author)

  9. Urban air quality forecasting based on multi-dimensional collaborative Support Vector Regression (SVR): A case study of Beijing-Tianjin-Shijiazhuang.

    Science.gov (United States)

    Liu, Bing-Chun; Binaykia, Arihant; Chang, Pei-Chann; Tiwari, Manoj Kumar; Tsao, Cheng-Chin

    2017-01-01

    Today, China is facing a very serious issue of Air Pollution due to its dreadful impact on the human health as well as the environment. The urban cities in China are the most affected due to their rapid industrial and economic growth. Therefore, it is of extreme importance to come up with new, better and more reliable forecasting models to accurately predict the air quality. This paper selected Beijing, Tianjin and Shijiazhuang as three cities from the Jingjinji Region for the study to come up with a new model of collaborative forecasting using Support Vector Regression (SVR) for Urban Air Quality Index (AQI) prediction in China. The present study is aimed to improve the forecasting results by minimizing the prediction error of present machine learning algorithms by taking into account multiple city multi-dimensional air quality information and weather conditions as input. The results show that there is a decrease in MAPE in case of multiple city multi-dimensional regression when there is a strong interaction and correlation of the air quality characteristic attributes with AQI. Also, the geographical location is found to play a significant role in Beijing, Tianjin and Shijiazhuang AQI prediction.

  10. Vectorization in quantum chemistry

    International Nuclear Information System (INIS)

    Saunders, V.R.

    1987-01-01

    It is argued that the optimal vectorization algorithm for many steps (and sub-steps) in a typical ab initio calculation of molecular electronic structure is quite strongly dependent on the target vector machine. Details such as the availability (or lack) of a given vector construct in the hardware, vector startup times and asymptotic rates must all be considered when selecting the optimal algorithm. Illustrations are drawn from: gaussian integral evaluation, fock matrix construction, 4-index transformation of molecular integrals, direct-CI methods, the matrix multiply operation. A cross comparison of practical implementations on the CDC Cyber 205, the Cray-IS and Cray-XMP machines is presented. To achieve portability while remaining optimal on a wide range of machines it is necessary to code all available algorithms in a machine independent manner, and to select the appropriate algorithm using a procedure which is based on machine dependent parameters. Most such parameters concern the timing of certain vector loop kernals, which can usually be derived from a 'bench-marking' routine executed prior to the calculation proper

  11. Moving beyond regression techniques in cardiovascular risk prediction: applying machine learning to address analytic challenges.

    Science.gov (United States)

    Goldstein, Benjamin A; Navar, Ann Marie; Carter, Rickey E

    2017-06-14

    Risk prediction plays an important role in clinical cardiology research. Traditionally, most risk models have been based on regression models. While useful and robust, these statistical methods are limited to using a small number of predictors which operate in the same way on everyone, and uniformly throughout their range. The purpose of this review is to illustrate the use of machine-learning methods for development of risk prediction models. Typically presented as black box approaches, most machine-learning methods are aimed at solving particular challenges that arise in data analysis that are not well addressed by typical regression approaches. To illustrate these challenges, as well as how different methods can address them, we consider trying to predicting mortality after diagnosis of acute myocardial infarction. We use data derived from our institution's electronic health record and abstract data on 13 regularly measured laboratory markers. We walk through different challenges that arise in modelling these data and then introduce different machine-learning approaches. Finally, we discuss general issues in the application of machine-learning methods including tuning parameters, loss functions, variable importance, and missing data. Overall, this review serves as an introduction for those working on risk modelling to approach the diffuse field of machine learning. © The Author 2016. Published by Oxford University Press on behalf of the European Society of Cardiology.

  12. Support vector machines for nuclear reactor state estimation

    Energy Technology Data Exchange (ETDEWEB)

    Zavaljevski, N.; Gross, K. C.

    2000-02-14

    Validation of nuclear power reactor signals is often performed by comparing signal prototypes with the actual reactor signals. The signal prototypes are often computed based on empirical data. The implementation of an estimation algorithm which can make predictions on limited data is an important issue. A new machine learning algorithm called support vector machines (SVMS) recently developed by Vladimir Vapnik and his coworkers enables a high level of generalization with finite high-dimensional data. The improved generalization in comparison with standard methods like neural networks is due mainly to the following characteristics of the method. The input data space is transformed into a high-dimensional feature space using a kernel function, and the learning problem is formulated as a convex quadratic programming problem with a unique solution. In this paper the authors have applied the SVM method for data-based state estimation in nuclear power reactors. In particular, they implemented and tested kernels developed at Argonne National Laboratory for the Multivariate State Estimation Technique (MSET), a nonlinear, nonparametric estimation technique with a wide range of applications in nuclear reactors. The methodology has been applied to three data sets from experimental and commercial nuclear power reactor applications. The results are promising. The combination of MSET kernels with the SVM method has better noise reduction and generalization properties than the standard MSET algorithm.

  13. Support vector machines for nuclear reactor state estimation

    International Nuclear Information System (INIS)

    Zavaljevski, N.; Gross, K. C.

    2000-01-01

    Validation of nuclear power reactor signals is often performed by comparing signal prototypes with the actual reactor signals. The signal prototypes are often computed based on empirical data. The implementation of an estimation algorithm which can make predictions on limited data is an important issue. A new machine learning algorithm called support vector machines (SVMS) recently developed by Vladimir Vapnik and his coworkers enables a high level of generalization with finite high-dimensional data. The improved generalization in comparison with standard methods like neural networks is due mainly to the following characteristics of the method. The input data space is transformed into a high-dimensional feature space using a kernel function, and the learning problem is formulated as a convex quadratic programming problem with a unique solution. In this paper the authors have applied the SVM method for data-based state estimation in nuclear power reactors. In particular, they implemented and tested kernels developed at Argonne National Laboratory for the Multivariate State Estimation Technique (MSET), a nonlinear, nonparametric estimation technique with a wide range of applications in nuclear reactors. The methodology has been applied to three data sets from experimental and commercial nuclear power reactor applications. The results are promising. The combination of MSET kernels with the SVM method has better noise reduction and generalization properties than the standard MSET algorithm

  14. Time-frequency feature analysis and recognition of fission neutrons signal based on support vector machine

    International Nuclear Information System (INIS)

    Jin Jing; Wei Biao; Feng Peng; Tang Yuelin; Zhou Mi

    2010-01-01

    Based on the interdependent relationship between fission neutrons ( 252 Cf) and fission chain ( 235 U system), the paper presents the time-frequency feature analysis and recognition in fission neutron signal based on support vector machine (SVM) through the analysis on signal characteristics and the measuring principle of the 252 Cf fission neutron signal. The time-frequency characteristics and energy features of the fission neutron signal are extracted by using wavelet decomposition and de-noising wavelet packet decomposition, and then applied to training and classification by means of support vector machine based on statistical learning theory. The results show that, it is effective to obtain features of nuclear signal via wavelet decomposition and de-noising wavelet packet decomposition, and the latter can reflect the internal characteristics of the fission neutron system better. With the training accomplished, the SVM classifier achieves an accuracy rate above 70%, overcoming the lack of training samples, and verifying the effectiveness of the algorithm. (authors)

  15. Geodesic Flow Kernel Support Vector Machine for Hyperspectral Image Classification by Unsupervised Subspace Feature Transfer

    Directory of Open Access Journals (Sweden)

    Alim Samat

    2016-03-01

    Full Text Available In order to deal with scenarios where the training data, used to deduce a model, and the validation data have different statistical distributions, we study the problem of transformed subspace feature transfer for domain adaptation (DA in the context of hyperspectral image classification via a geodesic Gaussian flow kernel based support vector machine (GFKSVM. To show the superior performance of the proposed approach, conventional support vector machines (SVMs and state-of-the-art DA algorithms, including information-theoretical learning of discriminative cluster for domain adaptation (ITLDC, joint distribution adaptation (JDA, and joint transfer matching (JTM, are also considered. Additionally, unsupervised linear and nonlinear subspace feature transfer techniques including principal component analysis (PCA, randomized nonlinear principal component analysis (rPCA, factor analysis (FA and non-negative matrix factorization (NNMF are investigated and compared. Experiments on two real hyperspectral images show the cross-image classification performances of the GFKSVM, confirming its effectiveness and suitability when applied to hyperspectral images.

  16. Support Vector Machines for Hyperspectral Remote Sensing Classification

    Science.gov (United States)

    Gualtieri, J. Anthony; Cromp, R. F.

    1998-01-01

    The Support Vector Machine provides a new way to design classification algorithms which learn from examples (supervised learning) and generalize when applied to new data. We demonstrate its success on a difficult classification problem from hyperspectral remote sensing, where we obtain performances of 96%, and 87% correct for a 4 class problem, and a 16 class problem respectively. These results are somewhat better than other recent results on the same data. A key feature of this classifier is its ability to use high-dimensional data without the usual recourse to a feature selection step to reduce the dimensionality of the data. For this application, this is important, as hyperspectral data consists of several hundred contiguous spectral channels for each exemplar. We provide an introduction to this new approach, and demonstrate its application to classification of an agriculture scene.

  17. Signal Detection for QPSK Based Cognitive Radio Systems using Support Vector Machines

    Directory of Open Access Journals (Sweden)

    M. T. Mushtaq

    2015-04-01

    Full Text Available Cognitive radio based network enables opportunistic dynamic spectrum access by sensing, adopting and utilizing the unused portion of licensed spectrum bands. Cognitive radio is intelligent enough to adapt the communication parameters of the unused licensed spectrum. Spectrum sensing is one of the most important tasks of the cognitive radio cycle. In this paper, the auto-correlation function kernel based Support Vector Machine (SVM classifier along with Welch's Periodogram detector is successfully implemented for the detection of four QPSK (Quadrature Phase Shift Keying based signals propagating through an AWGN (Additive White Gaussian Noise channel. It is shown that the combination of statistical signal processing and machine learning concepts improve the spectrum sensing process and spectrum sensing is possible even at low Signal to Noise Ratio (SNR values up to -50 dB.

  18. Assessing the human cardiovascular response to moderate exercise: feature extraction by support vector regression

    International Nuclear Information System (INIS)

    Wang, Lu; Su, Steven W; Celler, Branko G; Chan, Gregory S H; Cheng, Teddy M; Savkin, Andrey V

    2009-01-01

    This study aims to quantitatively describe the steady-state relationships among percentage changes in key central cardiovascular variables (i.e. stroke volume, heart rate (HR), total peripheral resistance and cardiac output), measured using non-invasive means, in response to moderate exercise, and the oxygen uptake rate, using a new nonlinear regression approach—support vector regression. Ten untrained normal males exercised in an upright position on an electronically braked cycle ergometer with constant workloads ranging from 25 W to 125 W. Throughout the experiment, .VO 2 was determined breath by breath and the HR was monitored beat by beat. During the last minute of each exercise session, the cardiac output was measured beat by beat using a novel non-invasive ultrasound-based device and blood pressure was measured using a tonometric measurement device. Based on the analysis of experimental data, nonlinear steady-state relationships between key central cardiovascular variables and .VO 2 were qualitatively observed except for the HR which increased linearly as a function of increasing .VO 2 . Quantitative descriptions of these complex nonlinear behaviour were provided by nonparametric models which were obtained by using support vector regression

  19. Predicting the dissolution kinetics of silicate glasses using machine learning

    Science.gov (United States)

    Anoop Krishnan, N. M.; Mangalathu, Sujith; Smedskjaer, Morten M.; Tandia, Adama; Burton, Henry; Bauchy, Mathieu

    2018-05-01

    Predicting the dissolution rates of silicate glasses in aqueous conditions is a complex task as the underlying mechanism(s) remain poorly understood and the dissolution kinetics can depend on a large number of intrinsic and extrinsic factors. Here, we assess the potential of data-driven models based on machine learning to predict the dissolution rates of various aluminosilicate glasses exposed to a wide range of solution pH values, from acidic to caustic conditions. Four classes of machine learning methods are investigated, namely, linear regression, support vector machine regression, random forest, and artificial neural network. We observe that, although linear methods all fail to describe the dissolution kinetics, the artificial neural network approach offers excellent predictions, thanks to its inherent ability to handle non-linear data. Overall, we suggest that a more extensive use of machine learning approaches could significantly accelerate the design of novel glasses with tailored properties.

  20. Source localization in an ocean waveguide using supervised machine learning.

    Science.gov (United States)

    Niu, Haiqiang; Reeves, Emma; Gerstoft, Peter

    2017-09-01

    Source localization in ocean acoustics is posed as a machine learning problem in which data-driven methods learn source ranges directly from observed acoustic data. The pressure received by a vertical linear array is preprocessed by constructing a normalized sample covariance matrix and used as the input for three machine learning methods: feed-forward neural networks (FNN), support vector machines (SVM), and random forests (RF). The range estimation problem is solved both as a classification problem and as a regression problem by these three machine learning algorithms. The results of range estimation for the Noise09 experiment are compared for FNN, SVM, RF, and conventional matched-field processing and demonstrate the potential of machine learning for underwater source localization.

  1. Support vector machine learning model for the prediction of sentinel node status in patients with cutaneous melanoma.

    Science.gov (United States)

    Mocellin, Simone; Ambrosi, Alessandro; Montesco, Maria Cristina; Foletto, Mirto; Zavagno, Giorgio; Nitti, Donato; Lise, Mario; Rossi, Carlo Riccardo

    2006-08-01

    Currently, approximately 80% of melanoma patients undergoing sentinel node biopsy (SNB) have negative sentinel lymph nodes (SLNs), and no prediction system is reliable enough to be implemented in the clinical setting to reduce the number of SNB procedures. In this study, the predictive power of support vector machine (SVM)-based statistical analysis was tested. The clinical records of 246 patients who underwent SNB at our institution were used for this analysis. The following clinicopathologic variables were considered: the patient's age and sex and the tumor's histological subtype, Breslow thickness, Clark level, ulceration, mitotic index, lymphocyte infiltration, regression, angiolymphatic invasion, microsatellitosis, and growth phase. The results of SVM-based prediction of SLN status were compared with those achieved with logistic regression. The SLN positivity rate was 22% (52 of 234). When the accuracy was > or = 80%, the negative predictive value, positive predictive value, specificity, and sensitivity were 98%, 54%, 94%, and 77% and 82%, 41%, 69%, and 93% by using SVM and logistic regression, respectively. Moreover, SVM and logistic regression were associated with a diagnostic error and an SNB percentage reduction of (1) 1% and 60% and (2) 15% and 73%, respectively. The results from this pilot study suggest that SVM-based prediction of SLN status might be evaluated as a prognostic method to avoid the SNB procedure in 60% of patients currently eligible, with a very low error rate. If validated in larger series, this strategy would lead to obvious advantages in terms of both patient quality of life and costs for the health care system.

  2. Screw Remaining Life Prediction Based on Quantum Genetic Algorithm and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Xiaochen Zhang

    2017-01-01

    Full Text Available To predict the remaining life of ball screw, a screw remaining life prediction method based on quantum genetic algorithm (QGA and support vector machine (SVM is proposed. A screw accelerated test bench is introduced. Accelerometers are installed to monitor the performance degradation of ball screw. Combined with wavelet packet decomposition and isometric mapping (Isomap, the sensitive feature vectors are obtained and stored in database. Meanwhile, the sensitive feature vectors are randomly chosen from the database and constitute training samples and testing samples. Then the optimal kernel function parameter and penalty factor of SVM are searched with the method of QGA. Finally, the training samples are used to train optimized SVM while testing samples are adopted to test the prediction accuracy of the trained SVM so the screw remaining life prediction model can be got. The experiment results show that the screw remaining life prediction model could effectively predict screw remaining life.

  3. Genome-wide prediction of discrete traits using bayesian regressions and machine learning

    Directory of Open Access Journals (Sweden)

    Forni Selma

    2011-02-01

    Full Text Available Abstract Background Genomic selection has gained much attention and the main goal is to increase the predictive accuracy and the genetic gain in livestock using dense marker information. Most methods dealing with the large p (number of covariates small n (number of observations problem have dealt only with continuous traits, but there are many important traits in livestock that are recorded in a discrete fashion (e.g. pregnancy outcome, disease resistance. It is necessary to evaluate alternatives to analyze discrete traits in a genome-wide prediction context. Methods This study shows two threshold versions of Bayesian regressions (Bayes A and Bayesian LASSO and two machine learning algorithms (boosting and random forest to analyze discrete traits in a genome-wide prediction context. These methods were evaluated using simulated and field data to predict yet-to-be observed records. Performances were compared based on the models' predictive ability. Results The simulation showed that machine learning had some advantages over Bayesian regressions when a small number of QTL regulated the trait under pure additivity. However, differences were small and disappeared with a large number of QTL. Bayesian threshold LASSO and boosting achieved the highest accuracies, whereas Random Forest presented the highest classification performance. Random Forest was the most consistent method in detecting resistant and susceptible animals, phi correlation was up to 81% greater than Bayesian regressions. Random Forest outperformed other methods in correctly classifying resistant and susceptible animals in the two pure swine lines evaluated. Boosting and Bayes A were more accurate with crossbred data. Conclusions The results of this study suggest that the best method for genome-wide prediction may depend on the genetic basis of the population analyzed. All methods were less accurate at correctly classifying intermediate animals than extreme animals. Among the different

  4. Steganalysis using logistic regression

    Science.gov (United States)

    Lubenko, Ivans; Ker, Andrew D.

    2011-02-01

    We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-art 686-dimensional SPAM feature set, in three image sets.

  5. A robust combination approach for short-term wind speed forecasting and analysis – Combination of the ARIMA (Autoregressive Integrated Moving Average), ELM (Extreme Learning Machine), SVM (Support Vector Machine) and LSSVM (Least Square SVM) forecasts using a GPR (Gaussian Process Regression) model

    International Nuclear Information System (INIS)

    Wang, Jianzhou; Hu, Jianming

    2015-01-01

    With the increasing importance of wind power as a component of power systems, the problems induced by the stochastic and intermittent nature of wind speed have compelled system operators and researchers to search for more reliable techniques to forecast wind speed. This paper proposes a combination model for probabilistic short-term wind speed forecasting. In this proposed hybrid approach, EWT (Empirical Wavelet Transform) is employed to extract meaningful information from a wind speed series by designing an appropriate wavelet filter bank. The GPR (Gaussian Process Regression) model is utilized to combine independent forecasts generated by various forecasting engines (ARIMA (Autoregressive Integrated Moving Average), ELM (Extreme Learning Machine), SVM (Support Vector Machine) and LSSVM (Least Square SVM)) in a nonlinear way rather than the commonly used linear way. The proposed approach provides more probabilistic information for wind speed predictions besides improving the forecasting accuracy for single-value predictions. The effectiveness of the proposed approach is demonstrated with wind speed data from two wind farms in China. The results indicate that the individual forecasting engines do not consistently forecast short-term wind speed for the two sites, and the proposed combination method can generate a more reliable and accurate forecast. - Highlights: • The proposed approach can make probabilistic modeling for wind speed series. • The proposed approach adapts to the time-varying characteristic of the wind speed. • The hybrid approach can extract the meaningful components from the wind speed series. • The proposed method can generate adaptive, reliable and more accurate forecasting results. • The proposed model combines four independent forecasting engines in a nonlinear way.

  6. Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions.

    Science.gov (United States)

    Rativa, Diego; Fernandes, Bruno J T; Roque, Alexandre

    2018-01-01

    Height and weight are measurements explored to tracking nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a sequence of these factors may not allow accurate estimation or measurements; in those cases, it can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations which coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than that obtained with conventional linear regressions. In all the cases, the predictions are non-sensitive to ethnicity, and to gender, if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care.

  7. Identifying predictive features in drug response using machine learning: opportunities and challenges.

    Science.gov (United States)

    Vidyasagar, Mathukumalli

    2015-01-01

    This article reviews several techniques from machine learning that can be used to study the problem of identifying a small number of features, from among tens of thousands of measured features, that can accurately predict a drug response. Prediction problems are divided into two categories: sparse classification and sparse regression. In classification, the clinical parameter to be predicted is binary, whereas in regression, the parameter is a real number. Well-known methods for both classes of problems are briefly discussed. These include the SVM (support vector machine) for classification and various algorithms such as ridge regression, LASSO (least absolute shrinkage and selection operator), and EN (elastic net) for regression. In addition, several well-established methods that do not directly fall into machine learning theory are also reviewed, including neural networks, PAM (pattern analysis for microarrays), SAM (significance analysis for microarrays), GSEA (gene set enrichment analysis), and k-means clustering. Several references indicative of the application of these methods to cancer biology are discussed.

  8. A support vector machine (SVM) based voltage stability classifier

    Energy Technology Data Exchange (ETDEWEB)

    Dosano, R.D.; Song, H. [Kunsan National Univ., Kunsan, Jeonbuk (Korea, Republic of); Lee, B. [Korea Univ., Seoul (Korea, Republic of)

    2007-07-01

    Power system stability has become even more complex and critical with the advent of deregulated energy markets and the growing desire to completely employ existing transmission and infrastructure. The economic pressure on electricity markets forces the operation of power systems and components to their limit of capacity and performance. System conditions can be more exposed to instability due to greater uncertainty in day to day system operations and increase in the number of potential components for system disturbances potentially resulting in voltage stability. This paper proposed a support vector machine (SVM) based power system voltage stability classifier using local measurements of voltage and active power of load. It described the procedure for fast classification of long-term voltage stability using the SVM algorithm. The application of the SVM based voltage stability classifier was presented with reference to the choice of input parameters; input data preconditioning; moving window for feature vector; determination of learning samples; and other considerations in SVM applications. The paper presented a case study with numerical examples of an 11-bus test system. The test results for the feasibility study demonstrated that the classifier could offer an excellent performance in classification with time-series measurements in terms of long-term voltage stability. 9 refs., 14 figs.

  9. Solar Flare Prediction Model with Three Machine-learning Algorithms using Ultraviolet Brightening and Vector Magnetograms

    Science.gov (United States)

    Nishizuka, N.; Sugiura, K.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.

    2017-02-01

    We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010-2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions (ARs) from the full-disk magnetogram, from which ˜60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.

  10. Solar Flare Prediction Model with Three Machine-learning Algorithms using Ultraviolet Brightening and Vector Magnetograms

    International Nuclear Information System (INIS)

    Nishizuka, N.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.; Sugiura, K.

    2017-01-01

    We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010–2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite . We detected active regions (ARs) from the full-disk magnetogram, from which ∼60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.

  11. Solar Flare Prediction Model with Three Machine-learning Algorithms using Ultraviolet Brightening and Vector Magnetograms

    Energy Technology Data Exchange (ETDEWEB)

    Nishizuka, N.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M. [Applied Electromagnetic Research Institute, National Institute of Information and Communications Technology, 4-2-1, Nukui-Kitamachi, Koganei, Tokyo 184-8795 (Japan); Sugiura, K., E-mail: nishizuka.naoto@nict.go.jp [Advanced Speech Translation Research and Development Promotion Center, National Institute of Information and Communications Technology (Japan)

    2017-02-01

    We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010–2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite . We detected active regions (ARs) from the full-disk magnetogram, from which ∼60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.

  12. A support vector machine approach for detection of microcalcifications.

    Science.gov (United States)

    El-Naqa, Issam; Yang, Yongyi; Wernick, Miles N; Galatsanos, Nikolas P; Nishikawa, Robert M

    2002-12-01

    In this paper, we investigate an approach based on support vector machines (SVMs) for detection of microcalcification (MC) clusters in digital mammograms, and propose a successive enhancement learning scheme for improved performance. SVM is a machine-learning method, based on the principle of structural risk minimization, which performs well when applied to data outside the training set. We formulate MC detection as a supervised-learning problem and apply SVM to develop the detection algorithm. We use the SVM to detect at each location in the image whether an MC is present or not. We tested the proposed method using a database of 76 clinical mammograms containing 1120 MCs. We use free-response receiver operating characteristic curves to evaluate detection performance, and compare the proposed algorithm with several existing methods. In our experiments, the proposed SVM framework outperformed all the other methods tested. In particular, a sensitivity as high as 94% was achieved by the SVM method at an error rate of one false-positive cluster per image. The ability of SVM to out perform several well-known methods developed for the widely studied problem of MC detection suggests that SVM is a promising technique for object detection in a medical imaging application.

  13. Channelized relevance vector machine as a numerical observer for cardiac perfusion defect detection task

    Science.gov (United States)

    Kalayeh, Mahdi M.; Marin, Thibault; Pretorius, P. Hendrik; Wernick, Miles N.; Yang, Yongyi; Brankov, Jovan G.

    2011-03-01

    In this paper, we present a numerical observer for image quality assessment, aiming to predict human observer accuracy in a cardiac perfusion defect detection task for single-photon emission computed tomography (SPECT). In medical imaging, image quality should be assessed by evaluating the human observer accuracy for a specific diagnostic task. This approach is known as task-based assessment. Such evaluations are important for optimizing and testing imaging devices and algorithms. Unfortunately, human observer studies with expert readers are costly and time-demanding. To address this problem, numerical observers have been developed as a surrogate for human readers to predict human diagnostic performance. The channelized Hotelling observer (CHO) with internal noise model has been found to predict human performance well in some situations, but does not always generalize well to unseen data. We have argued in the past that finding a model to predict human observers could be viewed as a machine learning problem. Following this approach, in this paper we propose a channelized relevance vector machine (CRVM) to predict human diagnostic scores in a detection task. We have previously used channelized support vector machines (CSVM) to predict human scores and have shown that this approach offers better and more robust predictions than the classical CHO method. The comparison of the proposed CRVM with our previously introduced CSVM method suggests that CRVM can achieve similar generalization accuracy, while dramatically reducing model complexity and computation time.

  14. Application of Artificial Neural Network and Support Vector Machines in Predicting Metabolizable Energy in Compound Feeds for Pigs.

    Science.gov (United States)

    Ahmadi, Hamed; Rodehutscord, Markus

    2017-01-01

    In the nutrition literature, there are several reports on the use of artificial neural network (ANN) and multiple linear regression (MLR) approaches for predicting feed composition and nutritive value, while the use of support vector machines (SVM) method as a new alternative approach to MLR and ANN models is still not fully investigated. The MLR, ANN, and SVM models were developed to predict metabolizable energy (ME) content of compound feeds for pigs based on the German energy evaluation system from analyzed contents of crude protein (CP), ether extract (EE), crude fiber (CF), and starch. A total of 290 datasets from standardized digestibility studies with compound feeds was provided from several institutions and published papers, and ME was calculated thereon. Accuracy and precision of developed models were evaluated, given their produced prediction values. The results revealed that the developed ANN [ R 2  = 0.95; root mean square error (RMSE) = 0.19 MJ/kg of dry matter] and SVM ( R 2  = 0.95; RMSE = 0.21 MJ/kg of dry matter) models produced better prediction values in estimating ME in compound feed than those produced by conventional MLR ( R 2  = 0.89; RMSE = 0.27 MJ/kg of dry matter). The developed ANN and SVM models produced better prediction values in estimating ME in compound feed than those produced by conventional MLR; however, there were not obvious differences between performance of ANN and SVM models. Thus, SVM model may also be considered as a promising tool for modeling the relationship between chemical composition and ME of compound feeds for pigs. To provide the readers and nutritionist with the easy and rapid tool, an Excel ® calculator, namely, SVM_ME_pig, was created to predict the metabolizable energy values in compound feeds for pigs using developed support vector machine model.

  15. Fault trend prediction of device based on support vector regression

    International Nuclear Information System (INIS)

    Song Meicun; Cai Qi

    2011-01-01

    The research condition of fault trend prediction and the basic theory of support vector regression (SVR) were introduced. SVR was applied to the fault trend prediction of roller bearing, and compared with other methods (BP neural network, gray model, and gray-AR model). The results show that BP network tends to overlearn and gets into local minimum so that the predictive result is unstable. It also shows that the predictive result of SVR is stabilization, and SVR is superior to BP neural network, gray model and gray-AR model in predictive precision. SVR is a kind of effective method of fault trend prediction. (authors)

  16. A Forecasting Approach Combining Self-Organizing Map with Support Vector Regression for Reservoir Inflow during Typhoon Periods

    Directory of Open Access Journals (Sweden)

    Gwo-Fong Lin

    2016-01-01

    Full Text Available This study describes the development of a reservoir inflow forecasting model for typhoon events to improve short lead-time flood forecasting performance. To strengthen the forecasting ability of the original support vector machines (SVMs model, the self-organizing map (SOM is adopted to group inputs into different clusters in advance of the proposed SOM-SVM model. Two different input methods are proposed for the SVM-based forecasting method, namely, SOM-SVM1 and SOM-SVM2. The methods are applied to an actual reservoir watershed to determine the 1 to 3 h ahead inflow forecasts. For 1, 2, and 3 h ahead forecasts, improvements in mean coefficient of efficiency (MCE due to the clusters obtained from SOM-SVM1 are 21.5%, 18.5%, and 23.0%, respectively. Furthermore, improvement in MCE for SOM-SVM2 is 20.9%, 21.2%, and 35.4%, respectively. Another SOM-SVM2 model increases the SOM-SVM1 model for 1, 2, and 3 h ahead forecasts obtained improvement increases of 0.33%, 2.25%, and 10.08%, respectively. These results show that the performance of the proposed model can provide improved forecasts of hourly inflow, especially in the proposed SOM-SVM2 model. In conclusion, the proposed model, which considers limit and higher related inputs instead of all inputs, can generate better forecasts in different clusters than are generated from the SOM process. The SOM-SVM2 model is recommended as an alternative to the original SVR (Support Vector Regression model because of its accuracy and robustness.

  17. A Personalized Electronic Movie Recommendation System Based on Support Vector Machine and Improved Particle Swarm Optimization.

    Science.gov (United States)

    Wang, Xibin; Luo, Fengji; Qian, Ying; Ranzi, Gianluca

    2016-01-01

    With the rapid development of ICT and Web technologies, a large an amount of information is becoming available and this is producing, in some instances, a condition of information overload. Under these conditions, it is difficult for a person to locate and access useful information for making decisions. To address this problem, there are information filtering systems, such as the personalized recommendation system (PRS) considered in this paper, that assist a person in identifying possible products or services of interest based on his/her preferences. Among available approaches, collaborative Filtering (CF) is one of the most widely used recommendation techniques. However, CF has some limitations, e.g., the relatively simple similarity calculation, cold start problem, etc. In this context, this paper presents a new regression model based on the support vector machine (SVM) classification and an improved PSO (IPSO) for the development of an electronic movie PRS. In its implementation, a SVM classification model is first established to obtain a preliminary movie recommendation list based on which a SVM regression model is applied to predict movies' ratings. The proposed PRS not only considers the movie's content information but also integrates the users' demographic and behavioral information to better capture the users' interests and preferences. The efficiency of the proposed method is verified by a series of experiments based on the MovieLens benchmark data set.

  18. SVM Classifier - a comprehensive java interface for support vector machine classification of microarray data.

    Science.gov (United States)

    Pirooznia, Mehdi; Deng, Youping

    2006-12-12

    Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1-BRCA2 samples with RBF kernel of SVM. We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at http://mfgn.usm.edu/ebl/svm/.

  19. Gradient Evolution-based Support Vector Machine Algorithm for Classification

    Science.gov (United States)

    Zulvia, Ferani E.; Kuo, R. J.

    2018-03-01

    This paper proposes a classification algorithm based on a support vector machine (SVM) and gradient evolution (GE) algorithms. SVM algorithm has been widely used in classification. However, its result is significantly influenced by the parameters. Therefore, this paper aims to propose an improvement of SVM algorithm which can find the best SVMs’ parameters automatically. The proposed algorithm employs a GE algorithm to automatically determine the SVMs’ parameters. The GE algorithm takes a role as a global optimizer in finding the best parameter which will be used by SVM algorithm. The proposed GE-SVM algorithm is verified using some benchmark datasets and compared with other metaheuristic-based SVM algorithms. The experimental results show that the proposed GE-SVM algorithm obtains better results than other algorithms tested in this paper.

  20. Learning with Uncertainty - Gaussian Processes and Relevance Vector Machines

    DEFF Research Database (Denmark)

    Candela, Joaquin Quinonero

    2004-01-01

    This thesis is concerned with Gaussian Processes (GPs) and Relevance Vector Machines (RVMs), both of which are particular instances of probabilistic linear models. We look at both models from a Bayesian perspective, and are forced to adopt an approximate Bayesian treatment to learning for two...... reasons. The first reason is the analytical intractability of the full Bayesian treatment and the fact that we in principle do not want to resort to sampling methods. The second reason, which incidentally justifies our not wanting to sample, is that we are interested in computationally efficient models...... approaches that ignore the accumulated uncertainty are way overconfident. Finally we explore a much harder problem: that of training with uncertain inputs. We explore approximating the full Bayesian treatment, which implies an analytically intractable integral. We propose two preliminary approaches...

  1. Exploration of machine learning techniques in predicting multiple sclerosis disease course

    OpenAIRE

    Zhao, Yijun; Healy, Brian C.; Rotstein, Dalia; Guttmann, Charles R. G.; Bakshi, Rohit; Weiner, Howard L.; Brodley, Carla E.; Chitnis, Tanuja

    2017-01-01

    Objective To explore the value of machine learning methods for predicting multiple sclerosis disease course. Methods 1693 CLIMB study patients were classified as increased EDSS?1.5 (worsening) or not (non-worsening) at up to five years after baseline visit. Support vector machines (SVM) were used to build the classifier, and compared to logistic regression (LR) using demographic, clinical and MRI data obtained at years one and two to predict EDSS at five years follow-up. Results Baseline data...

  2. Graduating the age-specific fertility pattern using Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Anastasia Kostaki

    2009-06-01

    Full Text Available A topic of interest in demographic literature is the graduation of the age-specific fertility pattern. A standard graduation technique extensively used by demographers is to fit parametric models that accurately reproduce it. Non-parametric statistical methodology might be alternatively used for this graduation purpose. Support Vector Machines (SVM is a non-parametric methodology that could be utilized for fertility graduation purposes. This paper evaluates the SVM techniques as tools for graduating fertility rates In that we apply these techniques to empirical age specific fertility rates from a variety of populations, time period, and cohorts. Additionally, for comparison reasons we also fit known parametric models to the same empirical data sets.

  3. Long-Term Precipitation Analysis and Estimation of Precipitation Concentration Index Using Three Support Vector Machine Methods

    Directory of Open Access Journals (Sweden)

    Milan Gocic

    2016-01-01

    Full Text Available The monthly precipitation data from 29 stations in Serbia during the period of 1946–2012 were considered. Precipitation trends were calculated using linear regression method. Three CLINO periods (1961–1990, 1971–2000, and 1981–2010 in three subregions were analysed. The CLINO 1981–2010 period had a significant increasing trend. Spatial pattern of the precipitation concentration index (PCI was presented. For the purpose of PCI prediction, three Support Vector Machine (SVM models, namely, SVM coupled with the discrete wavelet transform (SVM-Wavelet, the firefly algorithm (SVM-FFA, and using the radial basis function (SVM-RBF, were developed and used. The estimation and prediction results of these models were compared with each other using three statistical indicators, that is, root mean square error, coefficient of determination, and coefficient of efficiency. The experimental results showed that an improvement in predictive accuracy and capability of generalization can be achieved by the SVM-Wavelet approach. Moreover, the results indicated the proposed SVM-Wavelet model can adequately predict the PCI.

  4. Predicting Antitumor Activity of Peptides by Consensus of Regression Models Trained on a Small Data Sample

    Directory of Open Access Journals (Sweden)

    Ivanka Jerić

    2011-11-01

    Full Text Available Predicting antitumor activity of compounds using regression models trained on a small number of compounds with measured biological activity is an ill-posed inverse problem. Yet, it occurs very often within the academic community. To counteract, up to some extent, overfitting problems caused by a small training data, we propose to use consensus of six regression models for prediction of biological activity of virtual library of compounds. The QSAR descriptors of 22 compounds related to the opioid growth factor (OGF, Tyr-Gly-Gly-Phe-Met with known antitumor activity were used to train regression models: the feed-forward artificial neural network, the k-nearest neighbor, sparseness constrained linear regression, the linear and nonlinear (with polynomial and Gaussian kernel support vector machine. Regression models were applied on a virtual library of 429 compounds that resulted in six lists with candidate compounds ranked by predicted antitumor activity. The highly ranked candidate compounds were synthesized, characterized and tested for an antiproliferative activity. Some of prepared peptides showed more pronounced activity compared with the native OGF; however, they were less active than highly ranked compounds selected previously by the radial basis function support vector machine (RBF SVM regression model. The ill-posedness of the related inverse problem causes unstable behavior of trained regression models on test data. These results point to high complexity of prediction based on the regression models trained on a small data sample.

  5. A Comparison of Supervised Machine Learning Algorithms and Feature Vectors for MS Lesion Segmentation Using Multimodal Structural MRI

    Science.gov (United States)

    Sweeney, Elizabeth M.; Vogelstein, Joshua T.; Cuzzocreo, Jennifer L.; Calabresi, Peter A.; Reich, Daniel S.; Crainiceanu, Ciprian M.; Shinohara, Russell T.

    2014-01-01

    Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance. PMID:24781953

  6. DNS Tunneling Detection Method Based on Multilabel Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Ahmed Almusawi

    2018-01-01

    Full Text Available DNS tunneling is a method used by malicious users who intend to bypass the firewall to send or receive commands and data. This has a significant impact on revealing or releasing classified information. Several researchers have examined the use of machine learning in terms of detecting DNS tunneling. However, these studies have treated the problem of DNS tunneling as a binary classification where the class label is either legitimate or tunnel. In fact, there are different types of DNS tunneling such as FTP-DNS tunneling, HTTP-DNS tunneling, HTTPS-DNS tunneling, and POP3-DNS tunneling. Therefore, there is a vital demand to not only detect the DNS tunneling but rather classify such tunnel. This study aims to propose a multilabel support vector machine in order to detect and classify the DNS tunneling. The proposed method has been evaluated using a benchmark dataset that contains numerous DNS queries and is compared with a multilabel Bayesian classifier based on the number of corrected classified DNS tunneling instances. Experimental results demonstrate the efficacy of the proposed SVM classification method by obtaining an f-measure of 0.80.

  7. Semisupervised Support Vector Machines With Tangent Space Intrinsic Manifold Regularization.

    Science.gov (United States)

    Sun, Shiliang; Xie, Xijiong

    2016-09-01

    Semisupervised learning has been an active research topic in machine learning and data mining. One main reason is that labeling examples is expensive and time-consuming, while there are large numbers of unlabeled examples available in many practical problems. So far, Laplacian regularization has been widely used in semisupervised learning. In this paper, we propose a new regularization method called tangent space intrinsic manifold regularization. It is intrinsic to data manifold and favors linear functions on the manifold. Fundamental elements involved in the formulation of the regularization are local tangent space representations, which are estimated by local principal component analysis, and the connections that relate adjacent tangent spaces. Simultaneously, we explore its application to semisupervised classification and propose two new learning algorithms called tangent space intrinsic manifold regularized support vector machines (TiSVMs) and tangent space intrinsic manifold regularized twin SVMs (TiTSVMs). They effectively integrate the tangent space intrinsic manifold regularization consideration. The optimization of TiSVMs can be solved by a standard quadratic programming, while the optimization of TiTSVMs can be solved by a pair of standard quadratic programmings. The experimental results of semisupervised classification problems show the effectiveness of the proposed semisupervised learning algorithms.

  8. Gear fault diagnosis under variable conditions with intrinsic time-scale decomposition-singular value decomposition and support vector machine

    Energy Technology Data Exchange (ETDEWEB)

    Xing, Zhanqiang; Qu, Jianfeng; Chai, Yi; Tang, Qiu; Zhou, Yuming [Chongqing University, Chongqing (China)

    2017-02-15

    The gear vibration signal is nonlinear and non-stationary, gear fault diagnosis under variable conditions has always been unsatisfactory. To solve this problem, an intelligent fault diagnosis method based on Intrinsic time-scale decomposition (ITD)-Singular value decomposition (SVD) and Support vector machine (SVM) is proposed in this paper. The ITD method is adopted to decompose the vibration signal of gearbox into several Proper rotation components (PRCs). Subsequently, the singular value decomposition is proposed to obtain the singular value vectors of the proper rotation components and improve the robustness of feature extraction under variable conditions. Finally, the Support vector machine is applied to classify the fault type of gear. According to the experimental results, the performance of ITD-SVD exceeds those of the time-frequency analysis methods with EMD and WPT combined with SVD for feature extraction, and the classifier of SVM outperforms those for K-nearest neighbors (K-NN) and Back propagation (BP). Moreover, the proposed approach can accurately diagnose and identify different fault types of gear under variable conditions.

  9. Support Vector Machine Diagnosis of Acute Abdominal Pain

    Science.gov (United States)

    Björnsdotter, Malin; Nalin, Kajsa; Hansson, Lars-Erik; Malmgren, Helge

    This study explores the feasibility of a decision-support system for patients seeking care for acute abdominal pain, and, specifically the diagnosis of acute diverticulitis. We used a linear support vector machine (SVM) to separate diverticulitis from all other reported cases of abdominal pain and from the important differential diagnosis non-specific abdominal pain (NSAP). On a database containing 3337 patients, the SVM obtained results comparable to those of the doctors in separating diverticulitis or NSAP from the remaining diseases. The distinction between diverticulitis and NSAP was, however, substantially improved by the SVM. For this patient group, the doctors achieved a sensitivity of 0.714 and a specificity of 0.963. When adjusted to the physicians' results, the SVM sensitivity/specificity was higher at 0.714/0.985 and 0.786/0.963 respectively. Age was found as the most important discriminative variable, closely followed by C-reactive protein level and lower left side pain.

  10. hERG classification model based on a combination of support vector machine method and GRIND descriptors

    DEFF Research Database (Denmark)

    Li, Qiyuan; Jorgensen, Flemming Steen; Oprea, Tudor

    2008-01-01

    and diverse library of 495 compounds. The models combine pharmacophore-based GRIND descriptors with a support vector machine (SVM) classifier in order to discriminate between hERG blockers and nonblockers. Our models were applied at different thresholds from 1 to 40 mu m and achieved an overall accuracy up...

  11. Research on bearing life prediction based on support vector machine and its application

    International Nuclear Information System (INIS)

    Sun Chuang; Zhang Zhousuo; He Zhengjia

    2011-01-01

    Life prediction of rolling element bearing is the urgent demand in engineering practice, and the effective life prediction technique is beneficial to predictive maintenance. Support vector machine (SVM) is a novel machine learning method based on statistical learning theory, and is of advantage in prediction. This paper develops SVM-based model for bearing life prediction. The inputs of the model are features of bearing vibration signal and the output is the bearing running time-bearing failure time ratio. The model is built base on a few failed bearing data, and it can fuse information of the predicted bearing. So it is of advantage to bearing life prediction in practice. The model is applied to life prediction of a bearing, and the result shows the proposed model is of high precision.

  12. ATLS Hypovolemic Shock Classification by Prediction of Blood Loss in Rats Using Regression Models.

    Science.gov (United States)

    Choi, Soo Beom; Choi, Joon Yul; Park, Jee Soo; Kim, Deok Won

    2016-07-01

    In our previous study, our input data set consisted of 78 rats, the blood loss in percent as a dependent variable, and 11 independent variables (heart rate, systolic blood pressure, diastolic blood pressure, mean arterial pressure, pulse pressure, respiration rate, temperature, perfusion index, lactate concentration, shock index, and new index (lactate concentration/perfusion)). The machine learning methods for multicategory classification were applied to a rat model in acute hemorrhage to predict the four Advanced Trauma Life Support (ATLS) hypovolemic shock classes for triage in our previous study. However, multicategory classification is much more difficult and complicated than binary classification. We introduce a simple approach for classifying ATLS hypovolaemic shock class by predicting blood loss in percent using support vector regression and multivariate linear regression (MLR). We also compared the performance of the classification models using absolute and relative vital signs. The accuracies of support vector regression and MLR models with relative values by predicting blood loss in percent were 88.5% and 84.6%, respectively. These were better than the best accuracy of 80.8% of the direct multicategory classification using the support vector machine one-versus-one model in our previous study for the same validation data set. Moreover, the simple MLR models with both absolute and relative values could provide possibility of the future clinical decision support system for ATLS classification. The perfusion index and new index were more appropriate with relative changes than absolute values.

  13. A technique to identify some typical radio frequency interference using support vector machine

    Science.gov (United States)

    Wang, Yuanchao; Li, Mingtao; Li, Dawei; Zheng, Jianhua

    2017-07-01

    In this paper, we present a technique to automatically identify some typical radio frequency interference from pulsar surveys using support vector machine. The technique has been tested by candidates. In these experiments, to get features of SVM, we use principal component analysis for mosaic plots and its classification accuracy is 96.9%; while we use mathematical morphology operation for smog plots and horizontal stripes plots and its classification accuracy is 86%. The technique is simple, high accurate and useful.

  14. Performance and optimization of support vector machines in high-energy physics classification problems

    Energy Technology Data Exchange (ETDEWEB)

    Sahin, Mehmet Oezguer; Kruecker, Dirk; Melzer-Pellmann, Isabell [DESY, Hamburg (Germany)

    2016-07-01

    In this talk, the use of Support Vector Machines (SVM) is promoted for new-physics searches in high-energy physics. We developed an interface, called SVM HEP Interface (SVM-HINT), for a popular SVM library, LibSVM, and introduced a statistical-significance based hyper-parameter optimization algorithm for the new-physics searches. As example case study, a search for Supersymmetry at the Large Hadron Collider is given to demonstrate the capabilities of SVM using SVM-HINT.

  15. DNBR Prediction Using a Support Vector Regression

    International Nuclear Information System (INIS)

    Yang, Heon Young; Na, Man Gyun

    2008-01-01

    PWRs (Pressurized Water Reactors) generally operate in the nucleate boiling state. However, the conversion of nucleate boiling into film boiling with conspicuously reduced heat transfer induces a boiling crisis that may cause the fuel clad melting in the long run. This type of boiling crisis is called Departure from Nucleate Boiling (DNB) phenomena. Because the prediction of minimum DNBR in a reactor core is very important to prevent the boiling crisis such as clad melting, a lot of research has been conducted to predict DNBR values. The object of this research is to predict minimum DNBR applying support vector regression (SVR) by using the measured signals of a reactor coolant system (RCS). The SVR has extensively and successfully been applied to nonlinear function approximation like the proposed problem for estimating DNBR values that will be a function of various input variables such as reactor power, reactor pressure, core mass flowrate, control rod positions and so on. The minimum DNBR in a reactor core is predicted using these various operating condition data as the inputs to the SVR. The minimum DBNR values predicted by the SVR confirm its correctness compared with COLSS values

  16. Support vector machine multiuser receiver for DS-CDMA signals in multipath channels.

    Science.gov (United States)

    Chen, S; Samingan, A K; Hanzo, L

    2001-01-01

    The problem of constructing an adaptive multiuser detector (MUD) is considered for direct sequence code division multiple access (DS-CDMA) signals transmitted through multipath channels. The emerging learning technique, called support vector machines (SVM), is proposed as a method of obtaining a nonlinear MUD from a relatively small training data block. Computer simulation is used to study this SVM MUD, and the results show that it can closely match the performance of the optimal Bayesian one-shot detector. Comparisons with an adaptive radial basis function (RBF) MUD trained by an unsupervised clustering algorithm are discussed.

  17. Classification of Listeria monocytogenes persistence in retail delicatessen environments using expert elicitation and machine learning.

    Science.gov (United States)

    Vangay, P; Steingrimsson, J; Wiedmann, M; Stasiewicz, M J

    2014-10-01

    Increasing evidence suggests that persistence of Listeria monocytogenes in food processing plants has been the underlying cause of a number of human listeriosis outbreaks. This study extracts criteria used by food safety experts in determining bacterial persistence in the environment, using retail delicatessen operations as a model. Using the Delphi method, we conducted an expert elicitation with 10 food safety experts from academia, industry, and government to classify L. monocytogenes persistence based on environmental sampling results collected over six months for 30 retail delicatessen stores. The results were modeled using variations of random forest, support vector machine, logistic regression, and linear regression; variable importance values of random forest and support vector machine models were consolidated to rank important variables in the experts' classifications. The duration of subtype isolation ranked most important across all expert categories. Sampling site category also ranked high in importance and validation errors doubled when this covariate was removed. Support vector machine and random forest models successfully classified the data with average validation errors of 3.1% and 2.2% (n = 144), respectively. Our findings indicate that (i) the frequency of isolations over time and sampling site information are critical factors for experts determining subtype persistence, (ii) food safety experts from different sectors may not use the same criteria in determining persistence, and (iii) machine learning models have potential for future use in environmental surveillance and risk management programs. Future work is necessary to validate the accuracy of expert and machine classification against biological measurement of L. monocytogenes persistence. © 2014 Society for Risk Analysis.

  18. A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy

    International Nuclear Information System (INIS)

    Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.

    2015-01-01

    The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO 2 , Fe 2 O 3 , CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na 2 O, K 2 O, TiO 2 , and P 2 O 5 , the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144

  19. The Construction of Support Vector Machine Classifier Using the Firefly Algorithm

    Directory of Open Access Journals (Sweden)

    Chih-Feng Chao

    2015-01-01

    Full Text Available The setting of parameters in the support vector machines (SVMs is very important with regard to its accuracy and efficiency. In this paper, we employ the firefly algorithm to train all parameters of the SVM simultaneously, including the penalty parameter, smoothness parameter, and Lagrangian multiplier. The proposed method is called the firefly-based SVM (firefly-SVM. This tool is not considered the feature selection, because the SVM, together with feature selection, is not suitable for the application in a multiclass classification, especially for the one-against-all multiclass SVM. In experiments, binary and multiclass classifications are explored. In the experiments on binary classification, ten of the benchmark data sets of the University of California, Irvine (UCI, machine learning repository are used; additionally the firefly-SVM is applied to the multiclass diagnosis of ultrasonic supraspinatus images. The classification performance of firefly-SVM is also compared to the original LIBSVM method associated with the grid search method and the particle swarm optimization based SVM (PSO-SVM. The experimental results advocate the use of firefly-SVM to classify pattern classifications for maximum accuracy.

  20. Machine learning in geosciences and remote sensing

    Directory of Open Access Journals (Sweden)

    David J. Lary

    2016-01-01

    Full Text Available Learning incorporates a broad range of complex procedures. Machine learning (ML is a subdivision of artificial intelligence based on the biological learning process. The ML approach deals with the design of algorithms to learn from machine readable data. ML covers main domains such as data mining, difficult-to-program applications, and software applications. It is a collection of a variety of algorithms (e.g. neural networks, support vector machines, self-organizing map, decision trees, random forests, case-based reasoning, genetic programming, etc. that can provide multivariate, nonlinear, nonparametric regression or classification. The modeling capabilities of the ML-based methods have resulted in their extensive applications in science and engineering. Herein, the role of ML as an effective approach for solving problems in geosciences and remote sensing will be highlighted. The unique features of some of the ML techniques will be outlined with a specific attention to genetic programming paradigm. Furthermore, nonparametric regression and classification illustrative examples are presented to demonstrate the efficiency of ML for tackling the geosciences and remote sensing problems.

  1. Application of Machine-Learning Models to Predict Tacrolimus Stable Dose in Renal Transplant Recipients

    Science.gov (United States)

    Tang, Jie; Liu, Rong; Zhang, Yue-Li; Liu, Mou-Ze; Hu, Yong-Fang; Shao, Ming-Jie; Zhu, Li-Jun; Xin, Hua-Wen; Feng, Gui-Wen; Shang, Wen-Jun; Meng, Xiang-Guang; Zhang, Li-Rong; Ming, Ying-Zi; Zhang, Wei

    2017-02-01

    Tacrolimus has a narrow therapeutic window and considerable variability in clinical use. Our goal was to compare the performance of multiple linear regression (MLR) and eight machine learning techniques in pharmacogenetic algorithm-based prediction of tacrolimus stable dose (TSD) in a large Chinese cohort. A total of 1,045 renal transplant patients were recruited, 80% of which were randomly selected as the “derivation cohort” to develop dose-prediction algorithm, while the remaining 20% constituted the “validation cohort” to test the final selected algorithm. MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied and their performances were compared in this work. Among all the machine learning models, RT performed best in both derivation [0.71 (0.67-0.76)] and validation cohorts [0.73 (0.63-0.82)]. In addition, the ideal rate of RT was 4% higher than that of MLR. To our knowledge, this is the first study to use machine learning models to predict TSD, which will further facilitate personalized medicine in tacrolimus administration in the future.

  2. Noise reduction by support vector regression with a Ricker wavelet kernel

    International Nuclear Information System (INIS)

    Deng, Xiaoying; Yang, Dinghui; Xie, Jing

    2009-01-01

    We propose a noise filtering technology based on the least-squares support vector regression (LS-SVR), to improve the signal-to-noise ratio (SNR) of seismic data. We modified it by using an admissible support vector (SV) kernel, namely the Ricker wavelet kernel, to replace the conventional radial basis function (RBF) kernel in seismic data processing. We investigated the selection of the regularization parameter for the LS-SVR and derived a concise selecting formula directly from the noisy data. We used the proposed method for choosing the regularization parameter which not only had the advantage of high speed but could also obtain almost the same effectiveness as an optimal parameter method. We conducted experiments using synthetic data corrupted by the random noise of different types and levels, and found that our method was superior to the wavelet transform-based approach and the Wiener filtering. We also applied the method to two field seismic data sets and concluded that it was able to effectively suppress the random noise and improve the data quality in terms of SNR

  3. Noise reduction by support vector regression with a Ricker wavelet kernel

    Science.gov (United States)

    Deng, Xiaoying; Yang, Dinghui; Xie, Jing

    2009-06-01

    We propose a noise filtering technology based on the least-squares support vector regression (LS-SVR), to improve the signal-to-noise ratio (SNR) of seismic data. We modified it by using an admissible support vector (SV) kernel, namely the Ricker wavelet kernel, to replace the conventional radial basis function (RBF) kernel in seismic data processing. We investigated the selection of the regularization parameter for the LS-SVR and derived a concise selecting formula directly from the noisy data. We used the proposed method for choosing the regularization parameter which not only had the advantage of high speed but could also obtain almost the same effectiveness as an optimal parameter method. We conducted experiments using synthetic data corrupted by the random noise of different types and levels, and found that our method was superior to the wavelet transform-based approach and the Wiener filtering. We also applied the method to two field seismic data sets and concluded that it was able to effectively suppress the random noise and improve the data quality in terms of SNR.

  4. Ranking Support Vector Machine with Kernel Approximation.

    Science.gov (United States)

    Chen, Kai; Li, Rongchun; Dou, Yong; Liang, Zhengfa; Lv, Qi

    2017-01-01

    Learning to rank algorithm has become important in recent years due to its successful application in information retrieval, recommender system, and computational biology, and so forth. Ranking support vector machine (RankSVM) is one of the state-of-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problem. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. Primal truncated Newton method is used to optimize the pairwise L2-loss (squared Hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method gets a much faster training speed than kernel RankSVM and achieves comparable or better performance over state-of-the-art ranking algorithms.

  5. Ranking Support Vector Machine with Kernel Approximation

    Directory of Open Access Journals (Sweden)

    Kai Chen

    2017-01-01

    Full Text Available Learning to rank algorithm has become important in recent years due to its successful application in information retrieval, recommender system, and computational biology, and so forth. Ranking support vector machine (RankSVM is one of the state-of-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels can give higher accuracy than linear RankSVM (RankSVM with a linear kernel for complex nonlinear ranking problem. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. Primal truncated Newton method is used to optimize the pairwise L2-loss (squared Hinge-loss objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method gets a much faster training speed than kernel RankSVM and achieves comparable or better performance over state-of-the-art ranking algorithms.

  6. Individualized prediction of illness course at the first psychotic episode: a support vector machine MRI study.

    LENUS (Irish Health Repository)

    Mourao-Miranda, J

    2012-05-01

    To date, magnetic resonance imaging (MRI) has made little impact on the diagnosis and monitoring of psychoses in individual patients. In this study, we used a support vector machine (SVM) whole-brain classification approach to predict future illness course at the individual level from MRI data obtained at the first psychotic episode.

  7. Aging Detection of Electrical Point Machines Based on Support Vector Data Description

    Directory of Open Access Journals (Sweden)

    Jaewon Sa

    2017-11-01

    Full Text Available Electrical point machines (EPM must be replaced at an appropriate time to prevent the occurrence of operational safety or stability problems in trains resulting from aging or budget constraints. However, it is difficult to replace EPMs effectively because the aging conditions of EPMs depend on the operating environments, and thus, a guideline is typically not be suitable for replacing EPMs at the most timely moment. In this study, we propose a method of classification for the detection of an aging effect to facilitate the timely replacement of EPMs. We employ support vector data description to segregate data of “aged” and “not-yet-aged” equipment by analyzing the subtle differences in normalized electrical signals resulting from aging. Based on the before and after-replacement data that was obtained from experimental studies that were conducted on EPMs, we confirmed that the proposed method was capable of classifying machines based on exhibited aging effects with adequate accuracy.

  8. A Novel Empirical Mode Decomposition With Support Vector Regression for Wind Speed Forecasting.

    Science.gov (United States)

    Ren, Ye; Suganthan, Ponnuthurai Nagaratnam; Srikanth, Narasimalu

    2016-08-01

    Wind energy is a clean and an abundant renewable energy source. Accurate wind speed forecasting is essential for power dispatch planning, unit commitment decision, maintenance scheduling, and regulation. However, wind is intermittent and wind speed is difficult to predict. This brief proposes a novel wind speed forecasting method by integrating empirical mode decomposition (EMD) and support vector regression (SVR) methods. The EMD is used to decompose the wind speed time series into several intrinsic mode functions (IMFs) and a residue. Subsequently, a vector combining one historical data from each IMF and the residue is generated to train the SVR. The proposed EMD-SVR model is evaluated with a wind speed data set. The proposed EMD-SVR model outperforms several recently reported methods with respect to accuracy or computational complexity.

  9. Predicting beta-turns in proteins using support vector machines with fractional polynomials.

    Science.gov (United States)

    Elbashir, Murtada; Wang, Jianxin; Wu, Fang-Xiang; Wang, Lusheng

    2013-11-07

    β-turns are secondary structure type that have essential role in molecular recognition, protein folding, and stability. They are found to be the most common type of non-repetitive structures since 25% of amino acids in protein structures are situated on them. Their prediction is considered to be one of the crucial problems in bioinformatics and molecular biology, which can provide valuable insights and inputs for the fold recognition and drug design. We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call (H-SVM-LR) to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets respectively. These values are the highest among other β-turns prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthew's correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when considering shape strings as additional features. In this paper, we present a comprehensive approach for β-turns prediction. Experiments show that our proposed approach achieves better performance compared to other competing prediction methods.

  10. Support Vector Machines Trained with Evolutionary Algorithms Employing Kernel Adatron for Large Scale Classification of Protein Structures.

    Science.gov (United States)

    Arana-Daniel, Nancy; Gallegos, Alberto A; López-Franco, Carlos; Alanís, Alma Y; Morales, Jacob; López-Franco, Adriana

    2016-01-01

    With the increasing power of computers, the amount of data that can be processed in small periods of time has grown exponentially, as has the importance of classifying large-scale data efficiently. Support vector machines have shown good results classifying large amounts of high-dimensional data, such as data generated by protein structure prediction, spam recognition, medical diagnosis, optical character recognition and text classification, etc. Most state of the art approaches for large-scale learning use traditional optimization methods, such as quadratic programming or gradient descent, which makes the use of evolutionary algorithms for training support vector machines an area to be explored. The present paper proposes an approach that is simple to implement based on evolutionary algorithms and Kernel-Adatron for solving large-scale classification problems, focusing on protein structure prediction. The functional properties of proteins depend upon their three-dimensional structures. Knowing the structures of proteins is crucial for biology and can lead to improvements in areas such as medicine, agriculture and biofuels.

  11. A tool for urban soundscape evaluation applying Support Vector Machines for developing a soundscape classification model.

    Science.gov (United States)

    Torija, Antonio J; Ruiz, Diego P; Ramos-Ridao, Angel F

    2014-06-01

    To ensure appropriate soundscape management in urban environments, the urban-planning authorities need a range of tools that enable such a task to be performed. An essential step during the management of urban areas from a sound standpoint should be the evaluation of the soundscape in such an area. In this sense, it has been widely acknowledged that a subjective and acoustical categorization of a soundscape is the first step to evaluate it, providing a basis for designing or adapting it to match people's expectations as well. In this sense, this work proposes a model for automatic classification of urban soundscapes. This model is intended for the automatic classification of urban soundscapes based on underlying acoustical and perceptual criteria. Thus, this classification model is proposed to be used as a tool for a comprehensive urban soundscape evaluation. Because of the great complexity associated with the problem, two machine learning techniques, Support Vector Machines (SVM) and Support Vector Machines trained with Sequential Minimal Optimization (SMO), are implemented in developing model classification. The results indicate that the SMO model outperforms the SVM model in the specific task of soundscape classification. With the implementation of the SMO algorithm, the classification model achieves an outstanding performance (91.3% of instances correctly classified). © 2013 Elsevier B.V. All rights reserved.

  12. Breast cancer risk assessment and diagnosis model using fuzzy support vector machine based expert system

    Science.gov (United States)

    Dheeba, J.; Jaya, T.; Singh, N. Albert

    2017-09-01

    Classification of cancerous masses is a challenging task in many computerised detection systems. Cancerous masses are difficult to detect because these masses are obscured and subtle in mammograms. This paper investigates an intelligent classifier - fuzzy support vector machine (FSVM) applied to classify the tissues containing masses on mammograms for breast cancer diagnosis. The algorithm utilises texture features extracted using Laws texture energy measures and a FSVM to classify the suspicious masses. The new FSVM treats every feature as both normal and abnormal samples, but with different membership. By this way, the new FSVM have more generalisation ability to classify the masses in mammograms. The classifier analysed 219 clinical mammograms collected from breast cancer screening laboratory. The tests made on the real clinical mammograms shows that the proposed detection system has better discriminating power than the conventional support vector machine. With the best combination of FSVM and Laws texture features, the area under the Receiver operating characteristic curve reached .95, which corresponds to a sensitivity of 93.27% with a specificity of 87.17%. The results suggest that detecting masses using FSVM contribute to computer-aided detection of breast cancer and as a decision support system for radiologists.

  13. SVM Classifier – a comprehensive java interface for support vector machine classification of microarray data

    Science.gov (United States)

    Pirooznia, Mehdi; Deng, Youping

    2006-01-01

    Motivation Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. Results The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1–BRCA2 samples with RBF kernel of SVM. Conclusion We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at . PMID:17217518

  14. Modeling a ground-coupled heat pump system by a support vector machine

    Energy Technology Data Exchange (ETDEWEB)

    Esen, Hikmet; Esen, Mehmet [Department of Mechanical Education, Faculty of Technical Education, Firat University, 23119 Elazig (Turkey); Inalli, Mustafa [Department of Mechanical Engineering, Faculty of Engineering, Firat University, 23279 Elazig (Turkey); Sengur, Abdulkadir [Department of Electronic and Computer Science, Faculty of Technical Education, Firat University, 23119 Elazig (Turkey)

    2008-08-15

    This paper reports on a modeling study of ground coupled heat pump (GCHP) system performance (COP) by using a support vector machine (SVM) method. A GCHP system is a multi-variable system that is hard to model by conventional methods. As regards the SVM, it has a superior capability for generalization, and this capability is independent of the dimensionality of the input data. In this study, a SVM based method was intended to adopt GCHP system for efficient modeling. The Lin-kernel SVM method was quite efficient in modeling purposes and did not require a pre-knowledge about the system. The performance of the proposed methodology was evaluated by using several statistical validation parameters. It is found that the root-mean squared (RMS) value is 0.002722, the coefficient of multiple determinations (R{sup 2}) value is 0.999999, coefficient of variation (cov) value is 0.077295, and mean error function (MEF) value is 0.507437 for the proposed Lin-kernel SVM method. The optimum parameters of the SVM method were determined by using a greedy search algorithm. This search algorithm was effective for obtaining the optimum parameters. The simulation results show that the SVM is a good method for prediction of the COP of the GCHP system. The computation of SVM model is faster compared with other machine learning techniques (artificial neural networks (ANN) and adaptive neuro-fuzzy inference system (ANFIS)); because there are fewer free parameters and only support vectors (only a fraction of all data) are used in the generalization process. (author)

  15. Applying Multi-Class Support Vector Machines for performance assessment of shipping operations: The case of tanker vessels

    DEFF Research Database (Denmark)

    Pagoropoulos, Aris; Møller, Anders H.; McAloone, Tim C.

    2017-01-01

    of feature selection algorithms. Afterwards, a model based on Multi- Class Support Vector Machines (SVM) was constructed and the efficacy of the approach is shown through the application of a test set. The results demonstrate the importance and benefits of machine learning algorithms in driving energy....... Identifying the potential of behavioural savings can be challenging, due to the inherent difficulty in analysing the data and operationalizing energy efficiency within the dynamic operating environment of the vessels. This article proposes a supervised learning model for identifying the presence of energy...

  16. Online Artifact Removal for Brain-Computer Interfaces Using Support Vector Machines and Blind Source Separation

    OpenAIRE

    Halder, Sebastian; Bensch, Michael; Mellinger, Jürgen; Bogdan, Martin; Kübler, Andrea; Birbaumer, Niels; Rosenstiel, Wolfgang

    2007-01-01

    We propose a combination of blind source separation (BSS) and independent component analysis (ICA) (signal decomposition into artifacts and nonartifacts) with support vector machines (SVMs) (automatic classification) that are designed for online usage. In order to select a suitable BSS/ICA method, three ICA algorithms (JADE, Infomax, and FastICA) and one BSS algorithm (AMUSE) are evaluated to determine their ability to isolate electromyographic (EMG) and electrooculographic...

  17. PARAMETER SELECTION IN LEAST SQUARES-SUPPORT VECTOR MACHINES REGRESSION ORIENTED, USING GENERALIZED CROSS-VALIDATION

    Directory of Open Access Journals (Sweden)

    ANDRÉS M. ÁLVAREZ MEZA

    2012-01-01

    Full Text Available RESUMEN: En este trabajo, se propone una metodología para la selección automática de los parámetros libres de la técnica de regresión basada en mínimos cuadrados máquinas de vectores de soporte (LS-SVM, a partir de un análisis de validación cruzada generalizada multidimensional sobre el conjunto de ecuaciones lineales de LS-SVM. La técnica desarrollada no requiere de un conocimiento a priori por parte del usuario acerca de la influencia de los parámetros libres en los resultados. Se realizan experimentos sobre dos bases de datos artificiales y dos bases de datos reales. De acuerdo a los resultados obtenidos, se concluye que el algoritmo desarrollado calcula regresiones apropiadas con errores relativos competentes.

  18. Environmental noise forecasting based on support vector machine

    Science.gov (United States)

    Fu, Yumei; Zan, Xinwu; Chen, Tianyi; Xiang, Shihan

    2018-01-01

    As an important pollution source, the noise pollution is always the researcher's focus. Especially in recent years, the noise pollution is seriously harmful to the human beings' environment, so the research about the noise pollution is a very hot spot. Some noise monitoring technologies and monitoring systems are applied in the environmental noise test, measurement and evaluation. But, the research about the environmental noise forecasting is weak. In this paper, a real-time environmental noise monitoring system is introduced briefly. This monitoring system is working in Mianyang City, Sichuan Province. It is monitoring and collecting the environmental noise about more than 20 enterprises in this district. Based on the large amount of noise data, the noise forecasting by the Support Vector Machine (SVM) is studied in detail. Compared with the time series forecasting model and the artificial neural network forecasting model, the SVM forecasting model has some advantages such as the smaller data size, the higher precision and stability. The noise forecasting results based on the SVM can provide the important and accuracy reference to the prevention and control of the environmental noise.

  19. A nonlinear support vector machine model with hard penalty function based on glowworm swarm optimization for forecasting daily global solar radiation

    International Nuclear Information System (INIS)

    Jiang, He; Dong, Yao

    2016-01-01

    Highlights: • Eclat data mining algorithm is used to determine the possible predictors. • Support vector machine is converted into a ridge regularization problem. • Hard penalty selects the number of radial basis functions to simply the structure. • Glowworm swarm optimization is utilized to determine the optimal parameters. - Abstract: For a portion of the power which is generated by grid connected photovoltaic installations, an effective solar irradiation forecasting approach must be crucial to ensure the quality and the security of power grid. This paper develops and investigates a novel model to forecast 30 daily global solar radiation at four given locations of the United States. Eclat data mining algorithm is first presented to discover association rules between solar radiation and several meteorological factors laying a theoretical foundation for these correlative factors as input vectors. An effective and innovative intelligent optimization model based on nonlinear support vector machine and hard penalty function is proposed to forecast solar radiation by converting support vector machine into a regularization problem with ridge penalty, adding a hard penalty function to select the number of radial basis functions, and using glowworm swarm optimization algorithm to determine the optimal parameters of the model. In order to illustrate our validity of the proposed method, the datasets at four sites of the United States are split to into training data and test data, separately. The experiment results reveal that the proposed model delivers the best forecasting performances comparing with other competitors.

  20. Daily Peak Load Forecasting Based on Complete Ensemble Empirical Mode Decomposition with Adaptive Noise and Support Vector Machine Optimized by Modified Grey Wolf Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Shuyu Dai

    2018-01-01

    Full Text Available Daily peak load forecasting is an important part of power load forecasting. The accuracy of its prediction has great influence on the formulation of power generation plan, power grid dispatching, power grid operation and power supply reliability of power system. Therefore, it is of great significance to construct a suitable model to realize the accurate prediction of the daily peak load. A novel daily peak load forecasting model, CEEMDAN-MGWO-SVM (Complete Ensemble Empirical Mode Decomposition with Adaptive Noise and Support Vector Machine Optimized by Modified Grey Wolf Optimization Algorithm, is proposed in this paper. Firstly, the model uses the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN algorithm to decompose the daily peak load sequence into multiple sub sequences. Then, the model of modified grey wolf optimization and support vector machine (MGWO-SVM is adopted to forecast the sub sequences. Finally, the forecasting sequence is reconstructed and the forecasting result is obtained. Using CEEMDAN can realize noise reduction for non-stationary daily peak load sequence, which makes the daily peak load sequence more regular. The model adopts the grey wolf optimization algorithm improved by introducing the population dynamic evolution operator and the nonlinear convergence factor to enhance the global search ability and avoid falling into the local optimum, which can better optimize the parameters of the SVM algorithm for improving the forecasting accuracy of daily peak load. In this paper, three cases are used to test the forecasting accuracy of the CEEMDAN-MGWO-SVM model. We choose the models EEMD-MGWO-SVM (Ensemble Empirical Mode Decomposition and Support Vector Machine Optimized by Modified Grey Wolf Optimization Algorithm, MGWO-SVM (Support Vector Machine Optimized by Modified Grey Wolf Optimization Algorithm, GWO-SVM (Support Vector Machine Optimized by Grey Wolf Optimization Algorithm, SVM (Support Vector

  1. Accelerated Monte Carlo system reliability analysis through machine-learning-based surrogate models of network connectivity

    International Nuclear Information System (INIS)

    Stern, R.E.; Song, J.; Work, D.B.

    2017-01-01

    The two-terminal reliability problem in system reliability analysis is known to be computationally intractable for large infrastructure graphs. Monte Carlo techniques can estimate the probability of a disconnection between two points in a network by selecting a representative sample of network component failure realizations and determining the source-terminal connectivity of each realization. To reduce the runtime required for the Monte Carlo approximation, this article proposes an approximate framework in which the connectivity check of each sample is estimated using a machine-learning-based classifier. The framework is implemented using both a support vector machine (SVM) and a logistic regression based surrogate model. Numerical experiments are performed on the California gas distribution network using the epicenter and magnitude of the 1989 Loma Prieta earthquake as well as randomly-generated earthquakes. It is shown that the SVM and logistic regression surrogate models are able to predict network connectivity with accuracies of 99% for both methods, and are 1–2 orders of magnitude faster than using a Monte Carlo method with an exact connectivity check. - Highlights: • Surrogate models of network connectivity are developed by machine-learning algorithms. • Developed surrogate models can reduce the runtime required for Monte Carlo simulations. • Support vector machine and logistic regressions are employed to develop surrogate models. • Numerical example of California gas distribution network demonstrate the proposed approach. • The developed models have accuracies 99%, and are 1–2 orders of magnitude faster than MCS.

  2. BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

    Directory of Open Access Journals (Sweden)

    V. Dheepa

    2012-07-01

    Full Text Available Along with the great increase of internet and e-commerce, the use of credit card is an unavoidable one. Due to the increase of credit card usage, the frauds associated with this have also increased. There are a lot of approaches used to detect the frauds. In this paper, behavior based classification approach using Support Vector Machines are employed and efficient feature extraction method also adopted. If any discrepancies occur in the behaviors transaction pattern then it is predicted as suspicious and taken for further consideration to find the frauds. Generally credit card fraud detection problem suffers from a large amount of data, which is rectified by the proposed method. Achieving finest accuracy, high fraud catching rate and low false alarms are the main tasks of this approach.

  3. A SUPPORT VECTOR MACHINE APPROACH FOR DEVELOPING TELEMEDICINE SOLUTIONS: MEDICAL DIAGNOSIS

    Directory of Open Access Journals (Sweden)

    Mihaela GHEORGHE

    2015-06-01

    Full Text Available Support vector machine represents an important tool for artificial neural networks techniques including classification and prediction. It offers a solution for a wide range of different issues in which cases the traditional optimization algorithms and methods cannot be applied directly due to different constraints, including memory restrictions, hidden relationships between variables, very high volume of computations that needs to be handled. One of these issues relates to medical diagnosis, a subset of the medical field. In this paper, the SVM learning algorithm is tested on a diabetes dataset and the results obtained for training with different kernel functions are presented and analyzed in order to determine a good approach from a telemedicine perspective.

  4. A robust regression based on weighted LSSVM and penalized trimmed squares

    International Nuclear Information System (INIS)

    Liu, Jianyong; Wang, Yong; Fu, Chengqun; Guo, Jie; Yu, Qin

    2016-01-01

    Least squares support vector machine (LS-SVM) for nonlinear regression is sensitive to outliers in the field of machine learning. Weighted LS-SVM (WLS-SVM) overcomes this drawback by adding weight to each training sample. However, as the number of outliers increases, the accuracy of WLS-SVM may decrease. In order to improve the robustness of WLS-SVM, a new robust regression method based on WLS-SVM and penalized trimmed squares (WLSSVM–PTS) has been proposed. The algorithm comprises three main stages. The initial parameters are obtained by least trimmed squares at first. Then, the significant outliers are identified and eliminated by the Fast-PTS algorithm. The remaining samples with little outliers are estimated by WLS-SVM at last. The statistical tests of experimental results carried out on numerical datasets and real-world datasets show that the proposed WLSSVM–PTS is significantly robust than LS-SVM, WLS-SVM and LSSVM–LTS.

  5. Using support vector machines to improve elemental ion identification in macromolecular crystal structures

    Energy Technology Data Exchange (ETDEWEB)

    Morshed, Nader [University of California, Berkeley, CA 94720 (United States); Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States); Echols, Nathaniel, E-mail: nechols@lbl.gov [Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States); Adams, Paul D., E-mail: nechols@lbl.gov [Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States); University of California, Berkeley, CA 94720 (United States)

    2015-05-01

    A method to automatically identify possible elemental ions in X-ray crystal structures has been extended to use support vector machine (SVM) classifiers trained on selected structures in the PDB, with significantly improved sensitivity over manually encoded heuristics. In the process of macromolecular model building, crystallographers must examine electron density for isolated atoms and differentiate sites containing structured solvent molecules from those containing elemental ions. This task requires specific knowledge of metal-binding chemistry and scattering properties and is prone to error. A method has previously been described to identify ions based on manually chosen criteria for a number of elements. Here, the use of support vector machines (SVMs) to automatically classify isolated atoms as either solvent or one of various ions is described. Two data sets of protein crystal structures, one containing manually curated structures deposited with anomalous diffraction data and another with automatically filtered, high-resolution structures, were constructed. On the manually curated data set, an SVM classifier was able to distinguish calcium from manganese, zinc, iron and nickel, as well as all five of these ions from water molecules, with a high degree of accuracy. Additionally, SVMs trained on the automatically curated set of high-resolution structures were able to successfully classify most common elemental ions in an independent validation test set. This method is readily extensible to other elemental ions and can also be used in conjunction with previous methods based on a priori expectations of the chemical environment and X-ray scattering.

  6. A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy

    Energy Technology Data Exchange (ETDEWEB)

    Boucher, Thomas F., E-mail: boucher@cs.umass.edu [School of Computer Science, University of Massachusetts Amherst, 140 Governor' s Drive, Amherst, MA 01003, United States. (United States); Ozanne, Marie V. [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Carmosino, Marco L. [School of Computer Science, University of Massachusetts Amherst, 140 Governor' s Drive, Amherst, MA 01003, United States. (United States); Dyar, M. Darby [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Mahadevan, Sridhar [School of Computer Science, University of Massachusetts Amherst, 140 Governor' s Drive, Amherst, MA 01003, United States. (United States); Breves, Elly A.; Lepore, Kate H. [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Clegg, Samuel M. [Los Alamos National Laboratory, P.O. Box 1663, MS J565, Los Alamos, NM 87545 (United States)

    2015-05-01

    The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO{sub 2}, Fe{sub 2}O{sub 3}, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na{sub 2}O, K{sub 2}O, TiO{sub 2}, and P{sub 2}O{sub 5}, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high

  7. Support Vector Regression and Genetic Algorithm for HVAC Optimal Operation

    Directory of Open Access Journals (Sweden)

    Ching-Wei Chen

    2016-01-01

    Full Text Available This study covers records of various parameters affecting the power consumption of air-conditioning systems. Using the Support Vector Machine (SVM, the chiller power consumption model, secondary chilled water pump power consumption model, air handling unit fan power consumption model, and air handling unit load model were established. In addition, it was found that R2 of the models all reached 0.998, and the training time was far shorter than that of the neural network. Through genetic programming, a combination of operating parameters with the least power consumption of air conditioning operation was searched. Moreover, the air handling unit load in line with the air conditioning cooling load was predicted. The experimental results show that for the combination of operating parameters with the least power consumption in line with the cooling load obtained through genetic algorithm search, the power consumption of the air conditioning systems under said combination of operating parameters was reduced by 22% compared to the fixed operating parameters, thus indicating significant energy efficiency.

  8. Automatic Task Classification via Support Vector Machine and Crowdsourcing

    Directory of Open Access Journals (Sweden)

    Hyungsik Shin

    2018-01-01

    Full Text Available Automatic task classification is a core part of personal assistant systems that are widely used in mobile devices such as smartphones and tablets. Even though many industry leaders are providing their own personal assistant services, their proprietary internals and implementations are not well known to the public. In this work, we show through real implementation and evaluation that automatic task classification can be implemented for mobile devices by using the support vector machine algorithm and crowdsourcing. To train our task classifier, we collected our training data set via crowdsourcing using the Amazon Mechanical Turk platform. Our classifier can classify a short English sentence into one of the thirty-two predefined tasks that are frequently requested while using personal mobile devices. Evaluation results show high prediction accuracy of our classifier ranging from 82% to 99%. By using large amount of crowdsourced data, we also illustrate the relationship between training data size and the prediction accuracy of our task classifier.

  9. Automatic inspection of textured surfaces by support vector machines

    Science.gov (United States)

    Jahanbin, Sina; Bovik, Alan C.; Pérez, Eduardo; Nair, Dinesh

    2009-08-01

    Automatic inspection of manufactured products with natural looking textures is a challenging task. Products such as tiles, textile, leather, and lumber project image textures that cannot be modeled as periodic or otherwise regular; therefore, a stochastic modeling of local intensity distribution is required. An inspection system to replace human inspectors should be flexible in detecting flaws such as scratches, cracks, and stains occurring in various shapes and sizes that have never been seen before. A computer vision algorithm is proposed in this paper that extracts local statistical features from grey-level texture images decomposed with wavelet frames into subbands of various orientations and scales. The local features extracted are second order statistics derived from grey-level co-occurrence matrices. Subsequently, a support vector machine (SVM) classifier is trained to learn a general description of normal texture from defect-free samples. This algorithm is implemented in LabVIEW and is capable of processing natural texture images in real-time.

  10. Modeling the Financial Distress of Microenterprise StartUps Using Support Vector Machines: A Case Study

    Directory of Open Access Journals (Sweden)

    Antonio Blanco-Oliver

    2014-10-01

    Full Text Available Despite the leading role that micro-entrepreneurship plays in economic development, and the high failure rate of microenterprise start-ups in their early years, very few studies have designed financial distress models to detect the financial problems of micro-entrepreneurs. Moreover, due to a lack of research, nothing is known about whether non-financial information and nonparametric statistical techniques improve the predictive capacity of these models. Therefore, this paper provides an innovative financial distress model specifically designed for microenterprise startups via support vector machines (SVMs that employs financial, non-financial, and macroeconomic variables. Based on a sample of almost 5,500 micro- entrepreneurs from a Peruvian Microfinance Institution (MFI, our findings show that the introduction of non-financial information related to the zone in which the entrepreneurs live and situate their business, the duration of the MFI-entrepreneur relationship, the number of loans granted by the MFI in the last year, the loan destination, and the opinion of experts on the probability that microenterprise start-ups may experience financial problems, significantly increases the accuracy performance of our financial distress model. Furthermore, the results reveal that the models that use SVMs outperform those which employ traditional logistic regression (LR analysis.

  11. A Conjunction Method of Wavelet Transform-Particle Swarm Optimization-Support Vector Machine for Streamflow Forecasting

    Directory of Open Access Journals (Sweden)

    Fanping Zhang

    2014-01-01

    Full Text Available Streamflow forecasting has an important role in water resource management and reservoir operation. Support vector machine (SVM is an appropriate and suitable method for streamflow prediction due to its best versatility, robustness, and effectiveness. In this study, a wavelet transform particle swarm optimization support vector machine (WT-PSO-SVM model is proposed and applied for streamflow time series prediction. Firstly, the streamflow time series were decomposed into various details (Ds and an approximation (A3 at three resolution levels (21-22-23 using Daubechies (db3 discrete wavelet. Correlation coefficients between each D subtime series and original monthly streamflow time series are calculated. Ds components with high correlation coefficients (D3 are added to the approximation (A3 as the input values of the SVM model. Secondly, the PSO is employed to select the optimal parameters, C, ε, and σ, of the SVM model. Finally, the WT-PSO-SVM models are trained and tested by the monthly streamflow time series of Tangnaihai Station located in Yellow River upper stream from January 1956 to December 2008. The test results indicate that the WT-PSO-SVM approach provide a superior alternative to the single SVM model for forecasting monthly streamflow in situations without formulating models for internal structure of the watershed.

  12. Using Support Vector Machine to Forecast Energy Usage of a Manhattan Skyscraper

    Science.gov (United States)

    Winter, R.; Boulanger, A.; Anderson, R.; Wu, L.

    2011-12-01

    As our society gains a better understanding of how humans have negatively impacted the environment, research related to reducing carbon emissions and overall energy consumption has become increasingly important. One of the simplest ways to reduce energy usage is by making current buildings less wasteful. By improving energy efficiency, this method of lowering our carbon footprint is particularly worthwhile because it actually reduces energy costs of operating the building, unlike many environmental initiatives that require large monetary investments. In order to improve the efficiency of the heating and air conditioning (HVAC) system of a Manhattan skyscraper, 345 Park Avenue, a predictive computer model was designed to forecast the amount of energy the building will consume. This model uses support vector machine (SVM), a method that builds a regression based purely on historical data of the building, requiring no knowledge of its size, heating and cooling methods, or any other physical properties. This pure dependence on historical data makes the model very easily applicable to different types of buildings with few model adjustments. The SVM model was built to predict a week of future energy usage based on past energy, temperature, and dew point temperature data. The predictive model closely approximated the actual values of energy usage for the spring and less closely for the winter. The prediction may be improved with additional historical data to help the model account for seasonal variability. This model is useful for creating a close approximation of future energy usage and predicting ways to diminish waste.

  13. Application of support vector machines to breast cancer screening using mammogram and clinical history data

    Science.gov (United States)

    Land, Walker H., Jr.; McKee, Dan; Velazquez, Roberto; Wong, Lut; Lo, Joseph Y.; Anderson, Francis R.

    2003-05-01

    The objectives of this paper are to discuss: (1) the development and testing of a new Evolutionary Programming (EP) method to optimally configure Support Vector Machine (SVM) parameters for facilitating the diagnosis of breast cancer; (2) evaluation of EP derived learning machines when the number of BI-RADS and clinical history discriminators are reduced from 16 to 7; (3) establishing system performance for several SVM kernels in addition to the EP/Adaptive Boosting (EP/AB) hybrid using the Digital Database for Screening Mammography, University of South Florida (DDSM USF) and Duke data sets; and (4) obtaining a preliminary evaluation of the measurement of SVM learning machine inter-institutional generalization capability using BI-RADS data. Measuring performance of the SVM designs and EP/AB hybrid against these objectives will provide quantative evidence that the software packages described can generalize to larger patient data sets from different institutions. Most iterative methods currently in use to optimize learning machine parameters are time consuming processes, which sometimes yield sub-optimal values resulting in performance degradation. SVMs are new machine intelligence paradigms, which use the Structural Risk Minimization (SRM) concept to develop learning machines. These learning machines can always be trained to provide global minima, given that the machine parameters are optimally computed. In addition, several system performance studies are described which include EP derived SVM performance as a function of: (a) population and generation size as well as a method for generating initial populations and (b) iteratively derived versus EP derived learning machine parameters. Finally, the authors describe a set of experiments providing preliminary evidence that both the EP/AB hybrid and SVM Computer Aided Diagnostic C++ software packages will work across a large population of patients, based on a data set of approximately 2,500 samples from five different

  14. A Support Vector Machine-Based Gender Identification Using Speech Signal

    Science.gov (United States)

    Lee, Kye-Hwan; Kang, Sang-Ick; Kim, Deok-Hwan; Chang, Joon-Hyuk

    We propose an effective voice-based gender identification method using a support vector machine (SVM). The SVM is a binary classification algorithm that classifies two groups by finding the voluntary nonlinear boundary in a feature space and is known to yield high classification performance. In the present work, we compare the identification performance of the SVM with that of a Gaussian mixture model (GMM)-based method using the mel frequency cepstral coefficients (MFCC). A novel approach of incorporating a features fusion scheme based on a combination of the MFCC and the fundamental frequency is proposed with the aim of improving the performance of gender identification. Experimental results demonstrate that the gender identification performance using the SVM is significantly better than that of the GMM-based scheme. Moreover, the performance is substantially improved when the proposed features fusion technique is applied.

  15. Probability Distribution and Deviation Information Fusion Driven Support Vector Regression Model and Its Application

    Directory of Open Access Journals (Sweden)

    Changhao Fan

    2017-01-01

    Full Text Available In modeling, only information from the deviation between the output of the support vector regression (SVR model and the training sample is considered, whereas the other prior information of the training sample, such as probability distribution information, is ignored. Probabilistic distribution information describes the overall distribution of sample data in a training sample that contains different degrees of noise and potential outliers, as well as helping develop a high-accuracy model. To mine and use the probability distribution information of a training sample, a new support vector regression model that incorporates probability distribution information weight SVR (PDISVR is proposed. In the PDISVR model, the probability distribution of each sample is considered as the weight and is then introduced into the error coefficient and slack variables of SVR. Thus, the deviation and probability distribution information of the training sample are both used in the PDISVR model to eliminate the influence of noise and outliers in the training sample and to improve predictive performance. Furthermore, examples with different degrees of noise were employed to demonstrate the performance of PDISVR, which was then compared with those of three SVR-based methods. The results showed that PDISVR performs better than the three other methods.

  16. Financial Distress Prediction using Linear Discriminant Analysis and Support Vector Machine

    Science.gov (United States)

    Santoso, Noviyanti; Wibowo, Wahyu

    2018-03-01

    A financial difficulty is the early stages before the bankruptcy. Bankruptcies caused by the financial distress can be seen from the financial statements of the company. The ability to predict financial distress became an important research topic because it can provide early warning for the company. In addition, predicting financial distress is also beneficial for investors and creditors. This research will be made the prediction model of financial distress at industrial companies in Indonesia by comparing the performance of Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM) combined with variable selection technique. The result of this research is prediction model based on hybrid Stepwise-SVM obtains better balance among fitting ability, generalization ability and model stability than the other models.

  17. Facial Expression Recognition using Multiclass Ensemble Least-Square Support Vector Machine

    Science.gov (United States)

    Lawi, Armin; Sya'Rani Machrizzandi, M.

    2018-03-01

    Facial expression is one of behavior characteristics of human-being. The use of biometrics technology system with facial expression characteristics makes it possible to recognize a person’s mood or emotion. The basic components of facial expression analysis system are face detection, face image extraction, facial classification and facial expressions recognition. This paper uses Principal Component Analysis (PCA) algorithm to extract facial features with expression parameters, i.e., happy, sad, neutral, angry, fear, and disgusted. Then Multiclass Ensemble Least-Squares Support Vector Machine (MELS-SVM) is used for the classification process of facial expression. The result of MELS-SVM model obtained from our 185 different expression images of 10 persons showed high accuracy level of 99.998% using RBF kernel.

  18. Data on Support Vector Machines (SVM model to forecast photovoltaic power

    Directory of Open Access Journals (Sweden)

    M. Malvoni

    2016-12-01

    Full Text Available The data concern the photovoltaic (PV power, forecasted by a hybrid model that considers weather variations and applies a technique to reduce the input data size, as presented in the paper entitled “Photovoltaic forecast based on hybrid pca-lssvm using dimensionality reducted data” (M. Malvoni, M.G. De Giorgi, P.M. Congedo, 2015 [1]. The quadratic Renyi entropy criteria together with the principal component analysis (PCA are applied to the Least Squares Support Vector Machines (LS-SVM to predict the PV power in the day-ahead time frame. The data here shared represent the proposed approach results. Hourly PV power predictions for 1,3,6,12, 24 ahead hours and for different data reduction sizes are provided in Supplementary material.

  19. Classifying machinery condition using oil samples and binary logistic regression

    Science.gov (United States)

    Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.

    2015-08-01

    The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding and hence some trust in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) can be difficult to provide ease of interpretability. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight to the drivers of the human classification process and to the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real world oil analysis data set from engines on mining trucks is presented and using cross-validation we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.

  20. Forecasting energy consumption of multi-family residential buildings using support vector regression: Investigating the impact of temporal and spatial monitoring granularity on performance accuracy

    International Nuclear Information System (INIS)

    Jain, Rishee K.; Smith, Kevin M.; Culligan, Patricia J.; Taylor, John E.

    2014-01-01

    Highlights: • We develop a building energy forecasting model using support vector regression. • Model is applied to data from a multi-family residential building in New York City. • We extend sensor based energy forecasting to multi-family residential buildings. • We examine the impact temporal and spatial granularity has on model accuracy. • Optimal granularity occurs at the by floor in hourly temporal intervals. - Abstract: Buildings are the dominant source of energy consumption and environmental emissions in urban areas. Therefore, the ability to forecast and characterize building energy consumption is vital to implementing urban energy management and efficiency initiatives required to curb emissions. Advances in smart metering technology have enabled researchers to develop “sensor based” approaches to forecast building energy consumption that necessitate less input data than traditional methods. Sensor-based forecasting utilizes machine learning techniques to infer the complex relationships between consumption and influencing variables (e.g., weather, time of day, previous consumption). While sensor-based forecasting has been studied extensively for commercial buildings, there is a paucity of research applying this data-driven approach to the multi-family residential sector. In this paper, we build a sensor-based forecasting model using Support Vector Regression (SVR), a commonly used machine learning technique, and apply it to an empirical data-set from a multi-family residential building in New York City. We expand our study to examine the impact of temporal (i.e., daily, hourly, 10 min intervals) and spatial (i.e., whole building, by floor, by unit) granularity have on the predictive power of our single-step model. Results indicate that sensor based forecasting models can be extended to multi-family residential buildings and that the optimal monitoring granularity occurs at the by floor level in hourly intervals. In addition to implications for

  1. Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.

    Science.gov (United States)

    Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo

    2015-08-01

    Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.

  2. A program system for ab initio MO calculations on vector and parallel processing machines. Pt. 1

    International Nuclear Information System (INIS)

    Ernenwein, R.; Rohmer, M.M.; Benard, M.

    1990-01-01

    We present a program system for ab initio molecular orbital calculations on vector and parallel computers. The present article is devoted to the computation of one- and two-electron integrals over contracted Gaussian basis sets involving s-, p-, d- and f-type functions. The McMurchie and Davidson (MMD) algorithm has been implemented and parallelized by distributing over a limited number of logical tasks the calculation of the 55 relevant classes of integrals. All sections of the MMD algorithm have been efficiently vectorized, leading to a scalar/vector ratio of 5.8. Different algorithms are proposed and compared for an optimal vectorization of the contraction of the 'intermediate integrals' generated by the MMD formalism. Advantage is taken of the dynamic storage allocation for tuning the length of the vector loops (i.e. the size of the vectorization buffer) as a function of (i) the total memory available for the job, (ii) the number of logical tasks defined by the user (≤13), and (iii) the storage requested by each specific class of integrals. Test calculations carried out on a CRAY-2 computer show that the average number of finite integrals computed over a (s, p, d, f) CGTO basis set is about 1180000 per second and per processor. The combination of vectorization and parallelism on this 4-processor machine reduces the CPU time by a factor larger than 20 with respect to the scalar and sequential performance. (orig.)

  3. Structure-activity relationship study of oxindole-based inhibitors of cyclin-dependent kinases based on least-squares support vector machines

    International Nuclear Information System (INIS)

    Li Jiazhong; Liu Huanxiang; Yao Xiaojun; Liu Mancang; Hu Zhide; Fan Botao

    2007-01-01

    The least-squares support vector machines (LS-SVMs), as an effective modified algorithm of support vector machine, was used to build structure-activity relationship (SAR) models to classify the oxindole-based inhibitors of cyclin-dependent kinases (CDKs) based on their activity. Each compound was depicted by the structural descriptors that encode constitutional, topological, geometrical, electrostatic and quantum-chemical features. The forward-step-wise linear discriminate analysis method was used to search the descriptor space and select the structural descriptors responsible for activity. The linear discriminant analysis (LDA) and nonlinear LS-SVMs method were employed to build classification models, and the best results were obtained by the LS-SVMs method with prediction accuracy of 100% on the test set and 90.91% for CDK1 and CDK2, respectively, as well as that of LDA models 95.45% and 86.36%. This paper provides an effective method to screen CDKs inhibitors

  4. Application of support vector machine to three-dimensional shape-based virtual screening using comprehensive three-dimensional molecular shape overlay with known inhibitors.

    Science.gov (United States)

    Sato, Tomohiro; Yuki, Hitomi; Takaya, Daisuke; Sasaki, Shunta; Tanaka, Akiko; Honma, Teruki

    2012-04-23

    In this study, machine learning using support vector machine was combined with three-dimensional (3D) molecular shape overlay, to improve the screening efficiency. Since the 3D molecular shape overlay does not use fingerprints or descriptors to compare two compounds, unlike 2D similarity methods, the application of machine learning to a 3D shape-based method has not been extensively investigated. The 3D similarity profile of a compound is defined as the array of 3D shape similarities with multiple known active compounds of the target protein and is used as the explanatory variable of support vector machine. As the measures of 3D shape similarity for our new prediction models, the prediction performances of the 3D shape similarity metrics implemented in ROCS, such as ShapeTanimoto and ScaledColor, were validated, using the known inhibitors of 15 target proteins derived from the ChEMBL database. The learning models based on the 3D similarity profiles stably outperformed the original ROCS when more than 10 known inhibitors were available as the queries. The results demonstrated the advantages of combining machine learning with the 3D similarity profile to process the 3D shape information of plural active compounds.

  5. Predicting Tunnel Squeezing Using Multiclass Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Yang Sun

    2018-01-01

    Full Text Available Tunnel squeezing is one of the major geological disasters that often occur during the construction of tunnels in weak rock masses subjected to high in situ stresses. It could cause shield jamming, budget overruns, and construction delays and could even lead to tunnel instability and casualties. Therefore, accurate prediction or identification of tunnel squeezing is extremely important in the design and construction of tunnels. This study presents a modified application of a multiclass support vector machine (SVM to predict tunnel squeezing based on four parameters, that is, diameter (D, buried depth (H, support stiffness (K, and rock tunneling quality index (Q. We compiled a database from the literature, including 117 case histories obtained from different countries such as India, Nepal, and Bhutan, to train the multiclass SVM model. The proposed model was validated using 8-fold cross validation, and the average error percentage was approximately 11.87%. Compared with existing approaches, the proposed multiclass SVM model yields a better performance in predictive accuracy. More importantly, one could estimate the severity of potential squeezing problems based on the predicted squeezing categories/classes.

  6. Nonlinear structural damage detection using support vector machines

    Science.gov (United States)

    Xiao, Li; Qu, Wenzhong

    2012-04-01

    An actual structure including connections and interfaces may exist nonlinear. Because of many complicated problems about nonlinear structural health monitoring (SHM), relatively little progress have been made in this aspect. Statistical pattern recognition techniques have been demonstrated to be competitive with other methods when applied to real engineering datasets. When a structure existing 'breathing' cracks that open and close under operational loading may cause a linear structural system to respond to its operational and environmental loads in a nonlinear manner nonlinear. In this paper, a vibration-based structural health monitoring when the structure exists cracks is investigated with autoregressive support vector machine (AR-SVM). Vibration experiments are carried out with a model frame. Time-series data in different cases such as: initial linear structure; linear structure with mass changed; nonlinear structure; nonlinear structure with mass changed are acquired.AR model of acceleration time-series is established, and different kernel function types and corresponding parameters are chosen and compared, which can more accurate, more effectively locate the damage. Different cases damaged states and different damage positions have been recognized successfully. AR-SVM method for the insufficient training samples is proved to be practical and efficient on structure nonlinear damage detection.

  7. A load predictive energy management system for supercapacitor-battery hybrid energy storage system in solar application using the Support Vector Machine

    International Nuclear Information System (INIS)

    Chia, Yen Yee; Lee, Lam Hong; Shafiabady, Niusha; Isa, Dino

    2015-01-01

    Highlights: • A novel energy management system (EMS) for supercapacitor-battery hybrid energy storage system is implemented. • It is a load predictive EMS which is implemented using Support Vector Machine (SVM). • An optimum SVM load prediction model is obtained, which yields 100% accuracy in 0.004866 s of training time. • The implemented load predictive EMS is compared with the conventional sequential programming control. • This methodology reduces the number of power electronics used and prolong battery lifespan. - Abstract: This paper presents the use of a Support Vector Machine load predictive energy management system to control the energy flow between a solar energy source, a supercapacitor-battery hybrid energy storage combination and the load. The supercapacitor-battery hybrid energy storage system is deployed in a solar energy system to improve the reliability of delivered power. The combination of batteries and supercapacitors makes use of complementary characteristic that allow the overlapping of a battery’s high energy density with a supercapacitors’ high power density. This hybrid system produces a straightforward benefit over either individual system, by taking advantage of each characteristic. When the supercapacitor caters for the instantaneous peak power which prolongs the battery lifespan, it also minimizes the system cost and ensures a greener system by reducing the number of batteries. The resulting performance is highly dependent on the energy controls implemented in the system to exploit the strengths of the energy storage devices and minimize its weaknesses. It is crucial to use energy from the supercapacitor and therefore minimize jeopardizing the power system reliability especially when there is a sudden peak power demand. This study has been divided into two stages. The first stage is to obtain the optimum SVM load prediction model, and the second stage carries out the performance comparison of the proposed SVM-load predictive

  8. Comparison of Support Vector Machine, Neural Network, and CART Algorithms for the Land-Cover Classification Using Limited Training Data Points

    Science.gov (United States)

    Support vector machine (SVM) was applied for land-cover characterization using MODIS time-series data. Classification performance was examined with respect to training sample size, sample variability, and landscape homogeneity (purity). The results were compared to two convention...

  9. Application of support vector machines to breast cancer screening using mammogram and history data

    Science.gov (United States)

    Land, Walker H., Jr.; Akanda, Anab; Lo, Joseph Y.; Anderson, Francis; Bryden, Margaret

    2002-05-01

    Support Vector Machines (SVMs) are a new and radically different type of classifiers and learning machines that use a hypothesis space of linear functions in a high dimensional feature space. This relatively new paradigm, based on Statistical Learning Theory (SLT) and Structural Risk Minimization (SRM), has many advantages when compared to traditional neural networks, which are based on Empirical Risk Minimization (ERM). Unlike neural networks, SVM training always finds a global minimum. Furthermore, SVMs have inherent ability to solve pattern classification without incorporating any problem-domain knowledge. In this study, the SVM was employed as a pattern classifier, operating on mammography data used for breast cancer detection. The main focus was to formulate the best learning machine configurations for optimum specificity and positive predictive value at very high sensitivities. Using a mammogram database of 500 biopsy-proven samples, the best performing SVM, on average, was able to achieve (under statistical 5-fold cross-validation) a specificity of 45.0% and a positive predictive value (PPV) of 50.1% at 100% sensitivity. At 97% sensitivity, a specificity of 55.8% and a PPV of 55.2% were obtained.

  10. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics.

    Science.gov (United States)

    Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.

  11. A support vector machine integrated system for the classification of operation anomalies in nuclear components and systems

    International Nuclear Information System (INIS)

    Rocco S, Claudio M.; Zio, Enrico

    2007-01-01

    A support vector machine (SVM) approach to the classification of transients in nuclear power plants is presented. SVM is a machine-learning algorithm that has been successfully used in pattern recognition for cluster analysis. In the present work, single- and multiclass SVM are combined into a hierarchical structure for distinguishing among transients in nuclear systems on the basis of measured data. An example of application of the approach is presented with respect to the classification of anomalies and malfunctions occurring in the feedwater system of a boiling water reactor. The data used in the example are provided by the HAMBO simulator of the Halden Reactor Project

  12. Fault Diagnosis of Plunger Pump in Truck Crane Based on Relevance Vector Machine with Particle Swarm Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Wenliao Du

    2013-01-01

    Full Text Available Promptly and accurately dealing with the equipment breakdown is very important in terms of enhancing reliability and decreasing downtime. A novel fault diagnosis method PSO-RVM based on relevance vector machines (RVM with particle swarm optimization (PSO algorithm for plunger pump in truck crane is proposed. The particle swarm optimization algorithm is utilized to determine the kernel width parameter of the kernel function in RVM, and the five two-class RVMs with binary tree architecture are trained to recognize the condition of mechanism. The proposed method is employed in the diagnosis of plunger pump in truck crane. The six states, including normal state, bearing inner race fault, bearing roller fault, plunger wear fault, thrust plate wear fault, and swash plate wear fault, are used to test the classification performance of the proposed PSO-RVM model, which compared with the classical models, such as back-propagation artificial neural network (BP-ANN, ant colony optimization artificial neural network (ANT-ANN, RVM, and support vectors, machines with particle swarm optimization (PSO-SVM, respectively. The experimental results show that the PSO-RVM is superior to the first three classical models, and has a comparative performance to the PSO-SVM, the corresponding diagnostic accuracy achieving as high as 99.17% and 99.58%, respectively. But the number of relevance vectors is far fewer than that of support vector, and the former is about 1/12–1/3 of the latter, which indicates that the proposed PSO-RVM model is more suitable for applications that require low complexity and real-time monitoring.

  13. Support vector machine based diagnostic system for breast cancer using swarm intelligence.

    Science.gov (United States)

    Chen, Hui-Ling; Yang, Bo; Wang, Gang; Wang, Su-Jing; Liu, Jie; Liu, Da-You

    2012-08-01

    Breast cancer is becoming a leading cause of death among women in the whole world, meanwhile, it is confirmed that the early detection and accurate diagnosis of this disease can ensure a long survival of the patients. In this paper, a swarm intelligence technique based support vector machine classifier (PSO_SVM) is proposed for breast cancer diagnosis. In the proposed PSO-SVM, the issue of model selection and feature selection in SVM is simultaneously solved under particle swarm (PSO optimization) framework. A weighted function is adopted to design the objective function of PSO, which takes into account the average accuracy rates of SVM (ACC), the number of support vectors (SVs) and the selected features simultaneously. Furthermore, time varying acceleration coefficients (TVAC) and inertia weight (TVIW) are employed to efficiently control the local and global search in PSO algorithm. The effectiveness of PSO-SVM has been rigorously evaluated against the Wisconsin Breast Cancer Dataset (WBCD), which is commonly used among researchers who use machine learning methods for breast cancer diagnosis. The proposed system is compared with the grid search method with feature selection by F-score. The experimental results demonstrate that the proposed approach not only obtains much more appropriate model parameters and discriminative feature subset, but also needs smaller set of SVs for training, giving high predictive accuracy. In addition, Compared to the existing methods in previous studies, the proposed system can also be regarded as a promising success with the excellent classification accuracy of 99.3% via 10-fold cross validation (CV) analysis. Moreover, a combination of five informative features is identified, which might provide important insights to the nature of the breast cancer disease and give an important clue for the physicians to take a closer attention. We believe the promising result can ensure that the physicians make very accurate diagnostic decision in

  14. Predicting sumoylation sites using support vector machines based on various sequence features, conformational flexibility and disorder.

    Science.gov (United States)

    Yavuz, Ahmet Sinan; Sezerman, Osman Ugur

    2014-01-01

    Sumoylation, which is a reversible and dynamic post-translational modification, is one of the vital processes in a cell. Before a protein matures to perform its function, sumoylation may alter its localization, interactions, and possibly structural conformation. Abberations in protein sumoylation has been linked with a variety of disorders and developmental anomalies. Experimental approaches to identification of sumoylation sites may not be effective due to the dynamic nature of sumoylation, laborsome experiments and their cost. Therefore, computational approaches may guide experimental identification of sumoylation sites and provide insights for further understanding sumoylation mechanism. In this paper, the effectiveness of using various sequence properties in predicting sumoylation sites was investigated with statistical analyses and machine learning approach employing support vector machines. These sequence properties were derived from windows of size 7 including position-specific amino acid composition, hydrophobicity, estimated sub-window volumes, predicted disorder, and conformational flexibility. 5-fold cross-validation results on experimentally identified sumoylation sites revealed that our method successfully predicts sumoylation sites with a Matthew's correlation coefficient, sensitivity, specificity, and accuracy equal to 0.66, 73%, 98%, and 97%, respectively. Additionally, we have showed that our method compares favorably to the existing prediction methods and basic regular expressions scanner. By using support vector machines, a new, robust method for sumoylation site prediction was introduced. Besides, the possible effects of predicted conformational flexibility and disorder on sumoylation site recognition were explored computationally for the first time to our knowledge as an additional parameter that could aid in sumoylation site prediction.

  15. PreBIND and Textomy – mining the biomedical literature for protein-protein interactions using a support vector machine

    Directory of Open Access Journals (Sweden)

    Baskin Berivan

    2003-03-01

    Full Text Available Abstract Background The majority of experimentally verified molecular interaction and biological pathway data are present in the unstructured text of biomedical journal articles where they are inaccessible to computational methods. The Biomolecular interaction network database (BIND seeks to capture these data in a machine-readable format. We hypothesized that the formidable task-size of backfilling the database could be reduced by using Support Vector Machine technology to first locate interaction information in the literature. We present an information extraction system that was designed to locate protein-protein interaction data in the literature and present these data to curators and the public for review and entry into BIND. Results Cross-validation estimated the support vector machine's test-set precision, accuracy and recall for classifying abstracts describing interaction information was 92%, 90% and 92% respectively. We estimated that the system would be able to recall up to 60% of all non-high throughput interactions present in another yeast-protein interaction database. Finally, this system was applied to a real-world curation problem and its use was found to reduce the task duration by 70% thus saving 176 days. Conclusions Machine learning methods are useful as tools to direct interaction and pathway database back-filling; however, this potential can only be realized if these techniques are coupled with human review and entry into a factual database such as BIND. The PreBIND system described here is available to the public at http://bind.ca. Current capabilities allow searching for human, mouse and yeast protein-interaction information.

  16. New fuzzy support vector machine for the class imbalance problem in medical datasets classification.

    Science.gov (United States)

    Gu, Xiaoqing; Ni, Tongguang; Wang, Hongyuan

    2014-01-01

    In medical datasets classification, support vector machine (SVM) is considered to be one of the most successful methods. However, most of the real-world medical datasets usually contain some outliers/noise and data often have class imbalance problems. In this paper, a fuzzy support machine (FSVM) for the class imbalance problem (called FSVM-CIP) is presented, which can be seen as a modified class of FSVM by extending manifold regularization and assigning two misclassification costs for two classes. The proposed FSVM-CIP can be used to handle the class imbalance problem in the presence of outliers/noise, and enhance the locality maximum margin. Five real-world medical datasets, breast, heart, hepatitis, BUPA liver, and pima diabetes, from the UCI medical database are employed to illustrate the method presented in this paper. Experimental results on these datasets show the outperformed or comparable effectiveness of FSVM-CIP.

  17. New Fuzzy Support Vector Machine for the Class Imbalance Problem in Medical Datasets Classification

    Directory of Open Access Journals (Sweden)

    Xiaoqing Gu

    2014-01-01

    Full Text Available In medical datasets classification, support vector machine (SVM is considered to be one of the most successful methods. However, most of the real-world medical datasets usually contain some outliers/noise and data often have class imbalance problems. In this paper, a fuzzy support machine (FSVM for the class imbalance problem (called FSVM-CIP is presented, which can be seen as a modified class of FSVM by extending manifold regularization and assigning two misclassification costs for two classes. The proposed FSVM-CIP can be used to handle the class imbalance problem in the presence of outliers/noise, and enhance the locality maximum margin. Five real-world medical datasets, breast, heart, hepatitis, BUPA liver, and pima diabetes, from the UCI medical database are employed to illustrate the method presented in this paper. Experimental results on these datasets show the outperformed or comparable effectiveness of FSVM-CIP.

  18. Coupling hydrological modeling and support vector regression to model hydropeaking in alpine catchments.

    Science.gov (United States)

    Chiogna, Gabriele; Marcolini, Giorgia; Liu, Wanying; Pérez Ciria, Teresa; Tuo, Ye

    2018-08-15

    Water management in the alpine region has an important impact on streamflow. In particular, hydropower production is known to cause hydropeaking i.e., sudden fluctuations in river stage caused by the release or storage of water in artificial reservoirs. Modeling hydropeaking with hydrological models, such as the Soil Water Assessment Tool (SWAT), requires knowledge of reservoir management rules. These data are often not available since they are sensitive information belonging to hydropower production companies. In this short communication, we propose to couple the results of a calibrated hydrological model with a machine learning method to reproduce hydropeaking without requiring the knowledge of the actual reservoir management operation. We trained a support vector machine (SVM) with SWAT model outputs, the day of the week and the energy price. We tested the model for the Upper Adige river basin in North-East Italy. A wavelet analysis showed that energy price has a significant influence on river discharge, and a wavelet coherence analysis demonstrated the improved performance of the SVM model in comparison to the SWAT model alone. The SVM model was also able to capture the fluctuations in streamflow caused by hydropeaking when both energy price and river discharge displayed a complex temporal dynamic. Copyright © 2018 Elsevier B.V. All rights reserved.

  19. Automatic Detection of Retinal Exudates using a Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Nualsawat HIRANSAKOLWONG

    2013-02-01

    Full Text Available Retinal exudates are among the preliminary signs of diabetic retinopathy, a major cause of vision loss in diabetic patients. Correct and efficient screening of exudates is very expensive in professional time and may cause human error. Nowadays, the digital retinal image is frequently used to follow-up and diagnoses eye diseases. Therefore, the retinal image is crucial and essential for experts to detect exudates. Unfortunately, it is a normal situation that retinal images in Thailand are poor quality images. In this paper, we present a series of experiments on feature selection and exudates classification using the support vector machine classifiers. The retinal images are segmented following key preprocessing steps, i.e., color normalization, contrast enhancement, noise removal and color space selection. On data sets of poor quality images, sensitivity, specificity and accuracy is 94.46%, 89.52% and 92.14%, respectively.

  20. EEG machine learning with Higuchi fractal dimension and Sample Entropy as features for successful detection of depression

    OpenAIRE

    Cukic, Milena; Pokrajac, David; Stokic, Miodrag; Simic, slobodan; Radivojevic, Vlada; Ljubisavljevic, Milos

    2018-01-01

    Reliable diagnosis of depressive disorder is essential for both optimal treatment and prevention of fatal outcomes. In this study, we aimed to elucidate the effectiveness of two non-linear measures, Higuchi Fractal Dimension (HFD) and Sample Entropy (SampEn), in detecting depressive disorders when applied on EEG. HFD and SampEn of EEG signals were used as features for seven machine learning algorithms including Multilayer Perceptron, Logistic Regression, Support Vector Machines with the linea...

  1. A Shellcode Detection Method Based on Full Native API Sequence and Support Vector Machine

    Science.gov (United States)

    Cheng, Yixuan; Fan, Wenqing; Huang, Wei; An, Jing

    2017-09-01

    Dynamic monitoring the behavior of a program is widely used to discriminate between benign program and malware. It is usually based on the dynamic characteristics of a program, such as API call sequence or API call frequency to judge. The key innovation of this paper is to consider the full Native API sequence and use the support vector machine to detect the shellcode. We also use the Markov chain to extract and digitize Native API sequence features. Our experimental results show that the method proposed in this paper has high accuracy and low detection rate.

  2. Support vector machines and evolutionary algorithms for classification single or together?

    CERN Document Server

    Stoean, Catalin

    2014-01-01

    When discussing classification, support vector machines are known to be a capable and efficient technique to learn and predict with high accuracy within a quick time frame. Yet, their black box means to do so make the practical users quite circumspect about relying on it, without much understanding of the how and why of its predictions. The question raised in this book is how can this ‘masked hero’ be made more comprehensible and friendly to the public: provide a surrogate model for its hidden optimization engine, replace the method completely or appoint a more friendly approach to tag along and offer the much desired explanations? Evolutionary algorithms can do all these and this book presents such possibilities of achieving high accuracy, comprehensibility, reasonable runtime as well as unconstrained performance.

  3. Support Vector Machines for decision support in electricity markets׳ strategic bidding

    DEFF Research Database (Denmark)

    Pinto, Tiago; Sousa, Tiago M.; Praça, Isabel

    2015-01-01

    . The ALBidS system allows MASCEM market negotiating players to take the best possible advantages from the market context. This paper presents the application of a Support Vector Machines (SVM) based approach to provide decision support to electricity market players. This strategy is tested and validated...... by being included in ALBidS and then compared with the application of an Artificial Neural Network (ANN), originating promising results: an effective electricity market price forecast in a fast execution time. The proposed approach is tested and validated using real electricity markets data from MIBEL......׳ research group has developed a multi-agent system: Multi-Agent System for Competitive Electricity Markets (MASCEM), which simulates the electricity markets environment. MASCEM is integrated with Adaptive Learning Strategic Bidding System (ALBidS) that works as a decision support system for market players...

  4. Multi-view L2-SVM and its multi-view core vector machine.

    Science.gov (United States)

    Huang, Chengquan; Chung, Fu-lai; Wang, Shitong

    2016-03-01

    In this paper, a novel L2-SVM based classifier Multi-view L2-SVM is proposed to address multi-view classification tasks. The proposed Multi-view L2-SVM classifier does not have any bias in its objective function and hence has the flexibility like μ-SVC in the sense that the number of the yielded support vectors can be controlled by a pre-specified parameter. The proposed Multi-view L2-SVM classifier can make full use of the coherence and the difference of different views through imposing the consensus among multiple views to improve the overall classification performance. Besides, based on the generalized core vector machine GCVM, the proposed Multi-view L2-SVM classifier is extended into its GCVM version MvCVM which can realize its fast training on large scale multi-view datasets, with its asymptotic linear time complexity with the sample size and its space complexity independent of the sample size. Our experimental results demonstrated the effectiveness of the proposed Multi-view L2-SVM classifier for small scale multi-view datasets and the proposed MvCVM classifier for large scale multi-view datasets. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Mining protein function from text using term-based support vector machines

    Science.gov (United States)

    Rice, Simon B; Nenadic, Goran; Stapley, Benjamin J

    2005-01-01

    Background Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We participated in Task 2, which addressed assigning Gene Ontology terms to human proteins and selecting relevant evidence from full-text documents. We approached it as a modified form of the document classification task. We used a supervised machine-learning approach (based on support vector machines) to assign protein function and select passages that support the assignments. As classification features, we used a protein's co-occurring terms that were automatically extracted from documents. Results The results evaluated by curators were modest, and quite variable for different problems: in many cases we have relatively good assignment of GO terms to proteins, but the selected supporting text was typically non-relevant (precision spanning from 3% to 50%). The method appears to work best when a substantial set of relevant documents is obtained, while it works poorly on single documents and/or short passages. The initial results suggest that our approach can also mine annotations from text even when an explicit statement relating a protein to a GO term is absent. Conclusion A machine learning approach to mining protein function predictions from text can yield good performance only if sufficient training data is available, and significant amount of supporting data is used for prediction. The most promising results are for combined document retrieval and GO term assignment, which calls for the integration of methods developed in BioCreAtIvE Task 1 and Task 2. PMID:15960835

  6. Machine learning and statistical methods for the prediction of maximal oxygen uptake: recent advances

    Directory of Open Access Journals (Sweden)

    Abut F

    2015-08-01

    Full Text Available Fatih Abut, Mehmet Fatih AkayDepartment of Computer Engineering, Çukurova University, Adana, TurkeyAbstract: Maximal oxygen uptake (VO2max indicates how many milliliters of oxygen the body can consume in a state of intense exercise per minute. VO2max plays an important role in both sport and medical sciences for different purposes, such as indicating the endurance capacity of athletes or serving as a metric in estimating the disease risk of a person. In general, the direct measurement of VO2max provides the most accurate assessment of aerobic power. However, despite a high level of accuracy, practical limitations associated with the direct measurement of VO2max, such as the requirement of expensive and sophisticated laboratory equipment or trained staff, have led to the development of various regression models for predicting VO2max. Consequently, a lot of studies have been conducted in the last years to predict VO2max of various target audiences, ranging from soccer athletes, nonexpert swimmers, cross-country skiers to healthy-fit adults, teenagers, and children. Numerous prediction models have been developed using different sets of predictor variables and a variety of machine learning and statistical methods, including support vector machine, multilayer perceptron, general regression neural network, and multiple linear regression. The purpose of this study is to give a detailed overview about the data-driven modeling studies for the prediction of VO2max conducted in recent years and to compare the performance of various VO2max prediction models reported in related literature in terms of two well-known metrics, namely, multiple correlation coefficient (R and standard error of estimate. The survey results reveal that with respect to regression methods used to develop prediction models, support vector machine, in general, shows better performance than other methods, whereas multiple linear regression exhibits the worst performance

  7. Support vector regression scoring of receptor-ligand complexes for rank-ordering and virtual screening of chemical libraries.

    Science.gov (United States)

    Li, Liwei; Wang, Bo; Meroueh, Samy O

    2011-09-26

    The community structure-activity resource (CSAR) data sets are used to develop and test a support vector machine-based scoring function in regression mode (SVR). Two scoring functions (SVR-KB and SVR-EP) are derived with the objective of reproducing the trend of the experimental binding affinities provided within the two CSAR data sets. The features used to train SVR-KB are knowledge-based pairwise potentials, while SVR-EP is based on physicochemical properties. SVR-KB and SVR-EP were compared to seven other widely used scoring functions, including Glide, X-score, GoldScore, ChemScore, Vina, Dock, and PMF. Results showed that SVR-KB trained with features obtained from three-dimensional complexes of the PDBbind data set outperformed all other scoring functions, including best performing X-score, by nearly 0.1 using three correlation coefficients, namely Pearson, Spearman, and Kendall. It was interesting that higher performance in rank ordering did not translate into greater enrichment in virtual screening assessed using the 40 targets of the Directory of Useful Decoys (DUD). To remedy this situation, a variant of SVR-KB (SVR-KBD) was developed by following a target-specific tailoring strategy that we had previously employed to derive SVM-SP. SVR-KBD showed a much higher enrichment, outperforming all other scoring functions tested, and was comparable in performance to our previously derived scoring function SVM-SP.

  8. A predictive model of chemical flooding for enhanced oil recovery purposes: Application of least square support vector machine

    Directory of Open Access Journals (Sweden)

    Mohammad Ali Ahmadi

    2016-06-01

    Full Text Available Applying chemical flooding in petroleum reservoirs turns into interesting subject of the recent researches. Developing strategies of the aforementioned method are more robust and precise when they consider both economical point of views (net present value (NPV and technical point of views (recovery factor (RF. In the present study huge attempts are made to propose predictive model for specifying efficiency of chemical flooding in oil reservoirs. To gain this goal, the new type of support vector machine method which evolved by Suykens and Vandewalle was employed. Also, high precise chemical flooding data banks reported in previous works were employed to test and validate the proposed vector machine model. According to the mean square error (MSE, correlation coefficient and average absolute relative deviation, the suggested LSSVM model has acceptable reliability; integrity and robustness. Thus, the proposed intelligent based model can be considered as an alternative model to monitor the efficiency of chemical flooding in oil reservoir when the required experimental data are not available or accessible.

  9. Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach

    International Nuclear Information System (INIS)

    Chen, Kuilin; Yu, Jie

    2014-01-01

    Highlights: • A novel hybrid modeling method is proposed for short-term wind speed forecasting. • Support vector regression model is constructed to formulate nonlinear state-space framework. • Unscented Kalman filter is adopted to recursively update states under random uncertainty. • The new SVR–UKF approach is compared to several conventional methods for short-term wind speed prediction. • The proposed method demonstrates higher prediction accuracy and reliability. - Abstract: Accurate wind speed forecasting is becoming increasingly important to improve and optimize renewable wind power generation. Particularly, reliable short-term wind speed prediction can enable model predictive control of wind turbines and real-time optimization of wind farm operation. However, this task remains challenging due to the strong stochastic nature and dynamic uncertainty of wind speed. In this study, unscented Kalman filter (UKF) is integrated with support vector regression (SVR) based state-space model in order to precisely update the short-term estimation of wind speed sequence. In the proposed SVR–UKF approach, support vector regression is first employed to formulate a nonlinear state-space model and then unscented Kalman filter is adopted to perform dynamic state estimation recursively on wind sequence with stochastic uncertainty. The novel SVR–UKF method is compared with artificial neural networks (ANNs), SVR, autoregressive (AR) and autoregressive integrated with Kalman filter (AR-Kalman) approaches for predicting short-term wind speed sequences collected from three sites in Massachusetts, USA. The forecasting results indicate that the proposed method has much better performance in both one-step-ahead and multi-step-ahead wind speed predictions than the other approaches across all the locations

  10. Fuzzy support vector machine for microarray imbalanced data classification

    Science.gov (United States)

    Ladayya, Faroh; Purnami, Santi Wulan; Irhamah

    2017-11-01

    DNA microarrays are data containing gene expression with small sample sizes and high number of features. Furthermore, imbalanced classes is a common problem in microarray data. This occurs when a dataset is dominated by a class which have significantly more instances than the other minority classes. Therefore, it is needed a classification method that solve the problem of high dimensional and imbalanced data. Support Vector Machine (SVM) is one of the classification methods that is capable of handling large or small samples, nonlinear, high dimensional, over learning and local minimum issues. SVM has been widely applied to DNA microarray data classification and it has been shown that SVM provides the best performance among other machine learning methods. However, imbalanced data will be a problem because SVM treats all samples in the same importance thus the results is bias for minority class. To overcome the imbalanced data, Fuzzy SVM (FSVM) is proposed. This method apply a fuzzy membership to each input point and reformulate the SVM such that different input points provide different contributions to the classifier. The minority classes have large fuzzy membership so FSVM can pay more attention to the samples with larger fuzzy membership. Given DNA microarray data is a high dimensional data with a very large number of features, it is necessary to do feature selection first using Fast Correlation based Filter (FCBF). In this study will be analyzed by SVM, FSVM and both methods by applying FCBF and get the classification performance of them. Based on the overall results, FSVM on selected features has the best classification performance compared to SVM.

  11. A Framework for Diagnosing the Out-of-Control Signals in Multivariate Process Using Optimized Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Tai-fu Li

    2013-01-01

    Full Text Available Multivariate statistical process control is the continuation and development of unitary statistical process control. Most multivariate statistical quality control charts are usually used (in manufacturing and service industries to determine whether a process is performing as intended or if there are some unnatural causes of variation upon an overall statistics. Once the control chart detects out-of-control signals, one difficulty encountered with multivariate control charts is the interpretation of an out-of-control signal. That is, we have to determine whether one or more or a combination of variables is responsible for the abnormal signal. A novel approach for diagnosing the out-of-control signals in the multivariate process is described in this paper. The proposed methodology uses the optimized support vector machines (support vector machine classification based on genetic algorithm to recognize set of subclasses of multivariate abnormal patters, identify the responsible variable(s on the occurrence of abnormal pattern. Multiple sets of experiments are used to verify this model. The performance of the proposed approach demonstrates that this model can accurately classify the source(s of out-of-control signal and even outperforms the conventional multivariate control scheme.

  12. Control-group feature normalization for multivariate pattern analysis of structural MRI data using the support vector machine.

    Science.gov (United States)

    Linn, Kristin A; Gaonkar, Bilwaj; Satterthwaite, Theodore D; Doshi, Jimit; Davatzikos, Christos; Shinohara, Russell T

    2016-05-15

    Normalization of feature vector values is a common practice in machine learning. Generally, each feature value is standardized to the unit hypercube or by normalizing to zero mean and unit variance. Classification decisions based on support vector machines (SVMs) or by other methods are sensitive to the specific normalization used on the features. In the context of multivariate pattern analysis using neuroimaging data, standardization effectively up- and down-weights features based on their individual variability. Since the standard approach uses the entire data set to guide the normalization, it utilizes the total variability of these features. This total variation is inevitably dependent on the amount of marginal separation between groups. Thus, such a normalization may attenuate the separability of the data in high dimensional space. In this work we propose an alternate approach that uses an estimate of the control-group standard deviation to normalize features before training. We study our proposed approach in the context of group classification using structural MRI data. We show that control-based normalization leads to better reproducibility of estimated multivariate disease patterns and improves the classifier performance in many cases. Copyright © 2016 Elsevier Inc. All rights reserved.

  13. Integrated Features by Administering the Support Vector Machine (SVM of Translational Initiations Sites in Alternative Polymorphic Contex

    Directory of Open Access Journals (Sweden)

    Nurul Arneida Husin

    2012-04-01

    Full Text Available Many algorithms and methods have been proposed for classification problems in bioinformatics. In this study, the discriminative approach in particular support vector machines (SVM is employed to recognize the studied TIS patterns. The applied discriminative approach is used to learn about some discriminant functions of samples that have been labelled as positive or negative. After learning, the discriminant functions are employed to decide whether a new sample is true or false. In this study, support vector machines (SVM is employed to recognize the patterns for studied translational initiation sites in alternative weak context. The method has been optimized with the best parameters selected; c=100, E=10-6 and ex=2 for non linear kernel function. Results show that with top 5 features and non linear kernel, the best prediction accuracy achieved is 95.8%. J48 algorithm is applied to compare with SVM with top 15 features and the results show a good prediction accuracy of 95.8%. This indicates that the top 5 features selected by the IGR method and that are performed by SVM are sufficient to use in the prediction of TIS in weak contexts.

  14. Fault Diagnosis of a Reconfigurable Crawling–Rolling Robot Based on Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Karthikeyan Elangovan

    2017-10-01

    Full Text Available As robots begin to perform jobs autonomously, with minimal or no human intervention, a new challenge arises: robots also need to autonomously detect errors and recover from faults. In this paper, we present a Support Vector Machine (SVM-based fault diagnosis system for a bio-inspired reconfigurable robot named Scorpio. The diagnosis system needs to detect and classify faults while Scorpio uses its crawling and rolling locomotion modes. Specifically, we classify between faulty and non-faulty conditions by analyzing onboard Inertial Measurement Unit (IMU sensor data. The data capture nine different locomotion gaits, which include rolling and crawling modes, at three different speeds. Statistical methods are applied to extract features and to reduce the dimensionality of original IMU sensor data features. These statistical features were given as inputs for training and testing. Additionally, the c-Support Vector Classification (c-SVC and nu-SVC models of SVM, and their fault classification accuracies, were compared. The results show that the proposed SVM approach can be used to autonomously diagnose locomotion gait faults while the reconfigurable robot is in operation.

  15. Output-only modal parameter estimator of linear time-varying structural systems based on vector TAR model and least squares support vector machine

    Science.gov (United States)

    Zhou, Si-Da; Ma, Yuan-Chen; Liu, Li; Kang, Jie; Ma, Zhi-Sai; Yu, Lei

    2018-01-01

    Identification of time-varying modal parameters contributes to the structural health monitoring, fault detection, vibration control, etc. of the operational time-varying structural systems. However, it is a challenging task because there is not more information for the identification of the time-varying systems than that of the time-invariant systems. This paper presents a vector time-dependent autoregressive model and least squares support vector machine based modal parameter estimator for linear time-varying structural systems in case of output-only measurements. To reduce the computational cost, a Wendland's compactly supported radial basis function is used to achieve the sparsity of the Gram matrix. A Gamma-test-based non-parametric approach of selecting the regularization factor is adapted for the proposed estimator to replace the time-consuming n-fold cross validation. A series of numerical examples have illustrated the advantages of the proposed modal parameter estimator on the suppression of the overestimate and the short data. A laboratory experiment has further validated the proposed estimator.

  16. Chaotic particle swarm optimization algorithm in a support vector regression electric load forecasting model

    International Nuclear Information System (INIS)

    Hong, W.-C.

    2009-01-01

    Accurate forecasting of electric load has always been the most important issues in the electricity industry, particularly for developing countries. Due to the various influences, electric load forecasting reveals highly nonlinear characteristics. Recently, support vector regression (SVR), with nonlinear mapping capabilities of forecasting, has been successfully employed to solve nonlinear regression and time series problems. However, it is still lack of systematic approaches to determine appropriate parameter combination for a SVR model. This investigation elucidates the feasibility of applying chaotic particle swarm optimization (CPSO) algorithm to choose the suitable parameter combination for a SVR model. The empirical results reveal that the proposed model outperforms the other two models applying other algorithms, genetic algorithm (GA) and simulated annealing algorithm (SA). Finally, it also provides the theoretical exploration of the electric load forecasting support system (ELFSS)

  17. Machine learning to predict the occurrence of bisphosphonate-related osteonecrosis of the jaw associated with dental extraction: A preliminary report.

    Science.gov (United States)

    Kim, Dong Wook; Kim, Hwiyoung; Nam, Woong; Kim, Hyung Jun; Cha, In-Ho

    2018-04-23

    The aim of this study was to build and validate five types of machine learning models that can predict the occurrence of BRONJ associated with dental extraction in patients taking bisphosphonates for the management of osteoporosis. A retrospective review of the medical records was conducted to obtain cases and controls for the study. Total 125 patients consisting of 41 cases and 84 controls were selected for the study. Five machine learning prediction algorithms including multivariable logistic regression model, decision tree, support vector machine, artificial neural network, and random forest were implemented. The outputs of these models were compared with each other and also with conventional methods, such as serum CTX level. Area under the receiver operating characteristic (ROC) curve (AUC) was used to compare the results. The performance of machine learning models was significantly superior to conventional statistical methods and single predictors. The random forest model yielded the best performance (AUC = 0.973), followed by artificial neural network (AUC = 0.915), support vector machine (AUC = 0.882), logistic regression (AUC = 0.844), decision tree (AUC = 0.821), drug holiday alone (AUC = 0.810), and CTX level alone (AUC = 0.630). Machine learning methods showed superior performance in predicting BRONJ associated with dental extraction compared to conventional statistical methods using drug holiday and serum CTX level. Machine learning can thus be applied in a wide range of clinical studies. Copyright © 2017. Published by Elsevier Inc.

  18. Machine learning and statistical methods for the prediction of maximal oxygen uptake: recent advances.

    Science.gov (United States)

    Abut, Fatih; Akay, Mehmet Fatih

    2015-01-01

    Maximal oxygen uptake (VO2max) indicates how many milliliters of oxygen the body can consume in a state of intense exercise per minute. VO2max plays an important role in both sport and medical sciences for different purposes, such as indicating the endurance capacity of athletes or serving as a metric in estimating the disease risk of a person. In general, the direct measurement of VO2max provides the most accurate assessment of aerobic power. However, despite a high level of accuracy, practical limitations associated with the direct measurement of VO2max, such as the requirement of expensive and sophisticated laboratory equipment or trained staff, have led to the development of various regression models for predicting VO2max. Consequently, a lot of studies have been conducted in the last years to predict VO2max of various target audiences, ranging from soccer athletes, nonexpert swimmers, cross-country skiers to healthy-fit adults, teenagers, and children. Numerous prediction models have been developed using different sets of predictor variables and a variety of machine learning and statistical methods, including support vector machine, multilayer perceptron, general regression neural network, and multiple linear regression. The purpose of this study is to give a detailed overview about the data-driven modeling studies for the prediction of VO2max conducted in recent years and to compare the performance of various VO2max prediction models reported in related literature in terms of two well-known metrics, namely, multiple correlation coefficient (R) and standard error of estimate. The survey results reveal that with respect to regression methods used to develop prediction models, support vector machine, in general, shows better performance than other methods, whereas multiple linear regression exhibits the worst performance.

  19. Support vector machines for TEC seismo-ionospheric anomalies detection

    Directory of Open Access Journals (Sweden)

    M. Akhoondzadeh

    2013-02-01

    Full Text Available Using time series prediction methods, it is possible to pursue the behaviors of earthquake precursors in the future and to announce early warnings when the differences between the predicted value and the observed value exceed the predefined threshold value. Support Vector Machines (SVMs are widely used due to their many advantages for classification and regression tasks. This study is concerned with investigating the Total Electron Content (TEC time series by using a SVM to detect seismo-ionospheric anomalous variations induced by the three powerful earthquakes of Tohoku (11 March 2011, Haiti (12 January 2010 and Samoa (29 September 2009. The duration of TEC time series dataset is 49, 46 and 71 days, for Tohoku, Haiti and Samoa earthquakes, respectively, with each at time resolution of 2 h. In the case of Tohoku earthquake, the results show that the difference between the predicted value obtained from the SVM method and the observed value reaches the maximum value (i.e., 129.31 TECU at earthquake time in a period of high geomagnetic activities. The SVM method detected a considerable number of anomalous occurrences 1 and 2 days prior to the Haiti earthquake and also 1 and 5 days before the Samoa earthquake in a period of low geomagnetic activities. In order to show that the method is acting sensibly with regard to the results extracted during nonevent and event TEC data, i.e., to perform some null-hypothesis tests in which the methods would also be calibrated, the same period of data from the previous year of the Samoa earthquake date has been taken into the account. Further to this, in this study, the detected TEC anomalies using the SVM method were compared to the previous results (Akhoondzadeh and Saradjian, 2011; Akhoondzadeh, 2012 obtained from the mean, median, wavelet and Kalman filter methods. The SVM detected anomalies are similar to those detected using the previous methods. It can be concluded that SVM can be a suitable learning method

  20. Integrating principal component analysis and vector quantization with support vector regression for sulfur content prediction in HDS process

    Directory of Open Access Journals (Sweden)

    Shokri Saeid

    2015-01-01

    Full Text Available An accurate prediction of sulfur content is very important for the proper operation and product quality control in hydrodesulfurization (HDS process. For this purpose, a reliable data- driven soft sensors utilizing Support Vector Regression (SVR was developed and the effects of integrating Vector Quantization (VQ with Principle Component Analysis (PCA were studied on the assessment of this soft sensor. First, in pre-processing step the PCA and VQ techniques were used to reduce dimensions of the original input datasets. Then, the compressed datasets were used as input variables for the SVR model. Experimental data from the HDS setup were employed to validate the proposed integrated model. The integration of VQ/PCA techniques with SVR model was able to increase the prediction accuracy of SVR. The obtained results show that integrated technique (VQ-SVR was better than (PCA-SVR in prediction accuracy. Also, VQ decreased the sum of the training and test time of SVR model in comparison with PCA. For further evaluation, the performance of VQ-SVR model was also compared to that of SVR. The obtained results indicated that VQ-SVR model delivered the best satisfactory predicting performance (AARE= 0.0668 and R2= 0.995 in comparison with investigated models.

  1. [Discrimination of types of polyacrylamide based on near infrared spectroscopy coupled with least square support vector machine].

    Science.gov (United States)

    Zhang, Hong-Guang; Yang, Qin-Min; Lu, Jian-Gang

    2014-04-01

    In this paper, a novel discriminant methodology based on near infrared spectroscopic analysis technique and least square support vector machine was proposed for rapid and nondestructive discrimination of different types of Polyacrylamide. The diffuse reflectance spectra of samples of Non-ionic Polyacrylamide, Anionic Polyacrylamide and Cationic Polyacrylamide were measured. Then principal component analysis method was applied to reduce the dimension of the spectral data and extract of the principal compnents. The first three principal components were used for cluster analysis of the three different types of Polyacrylamide. Then those principal components were also used as inputs of least square support vector machine model. The optimization of the parameters and the number of principal components used as inputs of least square support vector machine model was performed through cross validation based on grid search. 60 samples of each type of Polyacrylamide were collected. Thus a total of 180 samples were obtained. 135 samples, 45 samples for each type of Polyacrylamide, were randomly split into a training set to build calibration model and the rest 45 samples were used as test set to evaluate the performance of the developed model. In addition, 5 Cationic Polyacrylamide samples and 5 Anionic Polyacrylamide samples adulterated with different proportion of Non-ionic Polyacrylamide were also prepared to show the feasibilty of the proposed method to discriminate the adulterated Polyacrylamide samples. The prediction error threshold for each type of Polyacrylamide was determined by F statistical significance test method based on the prediction error of the training set of corresponding type of Polyacrylamide in cross validation. The discrimination accuracy of the built model was 100% for prediction of the test set. The prediction of the model for the 10 mixing samples was also presented, and all mixing samples were accurately discriminated as adulterated samples. The

  2. Application of support vector machine for classification of multispectral data

    International Nuclear Information System (INIS)

    Bahari, Nurul Iman Saiful; Ahmad, Asmala; Aboobaider, Burhanuddin Mohd

    2014-01-01

    In this paper, support vector machine (SVM) is used to classify satellite remotely sensed multispectral data. The data are recorded from a Landsat-5 TM satellite with resolution of 30x30m. SVM finds the optimal separating hyperplane between classes by focusing on the training cases. The study area of Klang Valley has more than 10 land covers and classification using SVM has been done successfully without any pixel being unclassified. The training area is determined carefully by visual interpretation and with the aid of the reference map of the study area. The result obtained is then analysed for the accuracy and visual performance. Accuracy assessment is done by determination and discussion of Kappa coefficient value, overall and producer accuracy for each class (in pixels and percentage). While, visual analysis is done by comparing the classification data with the reference map. Overall the study shows that SVM is able to classify the land covers within the study area with a high accuracy

  3. Using support vector machines in the multivariate state estimation technique

    International Nuclear Information System (INIS)

    Zavaljevski, N.; Gross, K.C.

    1999-01-01

    One approach to validate nuclear power plant (NPP) signals makes use of pattern recognition techniques. This approach often assumes that there is a set of signal prototypes that are continuously compared with the actual sensor signals. These signal prototypes are often computed based on empirical models with little or no knowledge about physical processes. A common problem of all data-based models is their limited ability to make predictions on the basis of available training data. Another problem is related to suboptimal training algorithms. Both of these potential shortcomings with conventional approaches to signal validation and sensor operability validation are successfully resolved by adopting a recently proposed learning paradigm called the support vector machine (SVM). The work presented here is a novel application of SVM for data-based modeling of system state variables in an NPP, integrated with a nonlinear, nonparametric technique called the multivariate state estimation technique (MSET), an algorithm developed at Argonne National Laboratory for a wide range of nuclear plant applications

  4. The Improved Relevance Voxel Machine

    DEFF Research Database (Denmark)

    Ganz, Melanie; Sabuncu, Mert; Van Leemput, Koen

    The concept of sparse Bayesian learning has received much attention in the machine learning literature as a means of achieving parsimonious representations of features used in regression and classification. It is an important family of algorithms for sparse signal recovery and compressed sensing....... Hence in its current form it is reminiscent of a greedy forward feature selection algorithm. In this report, we aim to solve the problems of the original RVoxM algorithm in the spirit of [7] (FastRVM).We call the new algorithm Improved Relevance Voxel Machine (IRVoxM). Our contributions...... and enables basis selection from overcomplete dictionaries. One of the trailblazers of Bayesian learning is MacKay who already worked on the topic in his PhD thesis in 1992 [1]. Later on Tipping and Bishop developed the concept of sparse Bayesian learning [2, 3] and Tipping published the Relevance Vector...

  5. Prediction of diffuse solar irradiance using machine learning and multivariable regression

    International Nuclear Information System (INIS)

    Lou, Siwei; Li, Danny H.W.; Lam, Joseph C.; Chan, Wilco W.H.

    2016-01-01

    Highlights: • 54.9% of the annual global irradiance is composed by its diffuse part in HK. • Hourly diffuse irradiance was predicted by accessible variables. • The importance of variable in prediction was assessed by machine learning. • Simple prediction equations were developed with the knowledge of variable importance. - Abstract: The paper studies the horizontal global, direct-beam and sky-diffuse solar irradiance data measured in Hong Kong from 2008 to 2013. A machine learning algorithm was employed to predict the horizontal sky-diffuse irradiance and conduct sensitivity analysis for the meteorological variables. Apart from the clearness index (horizontal global/extra atmospheric solar irradiance), we found that predictors including solar altitude, air temperature, cloud cover and visibility are also important in predicting the diffuse component. The mean absolute error (MAE) of the logistic regression using the aforementioned predictors was less than 21.5 W/m"2 and 30 W/m"2 for Hong Kong and Denver, USA, respectively. With the systematic recording of the five variables for more than 35 years, the proposed model would be appropriate to estimate of long-term diffuse solar radiation, study climate change and develope typical meteorological year in Hong Kong and places with similar climates.

  6. Seasonal prediction of winter extreme precipitation over Canada by support vector regression

    Directory of Open Access Journals (Sweden)

    Z. Zeng

    2011-01-01

    Full Text Available For forecasting the maximum 5-day accumulated precipitation over the winter season at lead times of 3, 6, 9 and 12 months over Canada from 1950 to 2007, two nonlinear and two linear regression models were used, where the models were support vector regression (SVR (nonlinear and linear versions, nonlinear Bayesian neural network (BNN and multiple linear regression (MLR. The 118 stations were grouped into six geographic regions by K-means clustering. For each region, the leading principal components of the winter maximum 5-d accumulated precipitation anomalies were the predictands. Potential predictors included quasi-global sea surface temperature anomalies and 500 hPa geopotential height anomalies over the Northern Hemisphere, as well as six climate indices (the Niño-3.4 region sea surface temperature, the North Atlantic Oscillation, the Pacific-North American teleconnection, the Pacific Decadal Oscillation, the Scandinavia pattern, and the East Atlantic pattern. The results showed that in general the two robust SVR models tended to have better forecast skills than the two non-robust models (MLR and BNN, and the nonlinear SVR model tended to forecast slightly better than the linear SVR model. Among the six regions, the Prairies region displayed the highest forecast skills, and the Arctic region the second highest. The strongest nonlinearity was manifested over the Prairies and the weakest nonlinearity over the Arctic.

  7. Constructing Support Vector Machine Ensembles for Cancer Classification Based on Proteomic Profiling

    Institute of Scientific and Technical Information of China (English)

    Yong Mao; Xiao-Bo Zhou; Dao-Ying Pi; You-Xian Sun

    2005-01-01

    In this study, we present a constructive algorithm for training cooperative support vector machine ensembles (CSVMEs). CSVME combines ensemble architecture design with cooperative training for individual SVMs in ensembles. Unlike most previous studies on training ensembles, CSVME puts emphasis on both accuracy and collaboration among individual SVMs in an ensemble. A group of SVMs selected on the basis of recursive classifier elimination is used in CSVME, and the number of the individual SVMs selected to construct CSVME is determined by 10-fold cross-validation. This kind of SVME has been tested on two ovarian cancer datasets previously obtained by proteomic mass spectrometry. By combining several individual SVMs, the proposed method achieves better performance than the SVME of all base SVMs.

  8. Classification of Autism Spectrum Disorder Using Random Support Vector Machine Cluster

    Directory of Open Access Journals (Sweden)

    Xia-an Bi

    2018-02-01

    Full Text Available Autism spectrum disorder (ASD is mainly reflected in the communication and language barriers, difficulties in social communication, and it is a kind of neurological developmental disorder. Most researches have used the machine learning method to classify patients and normal controls, among which support vector machines (SVM are widely employed. But the classification accuracy of SVM is usually low, due to the usage of a single SVM as classifier. Thus, we used multiple SVMs to classify ASD patients and typical controls (TC. Resting-state functional magnetic resonance imaging (fMRI data of 46 TC and 61 ASD patients were obtained from the Autism Brain Imaging Data Exchange (ABIDE database. Only 84 of 107 subjects are utilized in experiments because the translation or rotation of 7 TC and 16 ASD patients has surpassed ±2 mm or ±2°. Then the random SVM cluster was proposed to distinguish TC and ASD. The results show that this method has an excellent classification performance based on all the features. Furthermore, the accuracy based on the optimal feature set could reach to 96.15%. Abnormal brain regions could also be found, such as inferior frontal gyrus (IFG (orbital and opercula part, hippocampus, and precuneus. It is indicated that the method of random SVM cluster may apply to the auxiliary diagnosis of ASD.

  9. Fast metabolite identification with Input Output Kernel Regression

    Science.gov (United States)

    Brouard, Céline; Shen, Huibin; Dührkop, Kai; d'Alché-Buc, Florence; Böcker, Sebastian; Rousu, Juho

    2016-01-01

    Motivation: An important problematic of metabolomics is to identify metabolites using tandem mass spectrometry data. Machine learning methods have been proposed recently to solve this problem by predicting molecular fingerprint vectors and matching these fingerprints against existing molecular structure databases. In this work we propose to address the metabolite identification problem using a structured output prediction approach. This type of approach is not limited to vector output space and can handle structured output space such as the molecule space. Results: We use the Input Output Kernel Regression method to learn the mapping between tandem mass spectra and molecular structures. The principle of this method is to encode the similarities in the input (spectra) space and the similarities in the output (molecule) space using two kernel functions. This method approximates the spectra-molecule mapping in two phases. The first phase corresponds to a regression problem from the input space to the feature space associated to the output kernel. The second phase is a preimage problem, consisting in mapping back the predicted output feature vectors to the molecule space. We show that our approach achieves state-of-the-art accuracy in metabolite identification. Moreover, our method has the advantage of decreasing the running times for the training step and the test step by several orders of magnitude over the preceding methods. Availability and implementation: Contact: celine.brouard@aalto.fi Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307628

  10. Multiscale asymmetric orthogonal wavelet kernel for linear programming support vector learning and nonlinear dynamic systems identification.

    Science.gov (United States)

    Lu, Zhao; Sun, Jing; Butts, Kenneth

    2014-05-01

    Support vector regression for approximating nonlinear dynamic systems is more delicate than the approximation of indicator functions in support vector classification, particularly for systems that involve multitudes of time scales in their sampled data. The kernel used for support vector learning determines the class of functions from which a support vector machine can draw its solution, and the choice of kernel significantly influences the performance of a support vector machine. In this paper, to bridge the gap between wavelet multiresolution analysis and kernel learning, the closed-form orthogonal wavelet is exploited to construct new multiscale asymmetric orthogonal wavelet kernels for linear programming support vector learning. The closed-form multiscale orthogonal wavelet kernel provides a systematic framework to implement multiscale kernel learning via dyadic dilations and also enables us to represent complex nonlinear dynamics effectively. To demonstrate the superiority of the proposed multiscale wavelet kernel in identifying complex nonlinear dynamic systems, two case studies are presented that aim at building parallel models on benchmark datasets. The development of parallel models that address the long-term/mid-term prediction issue is more intricate and challenging than the identification of series-parallel models where only one-step ahead prediction is required. Simulation results illustrate the effectiveness of the proposed multiscale kernel learning.

  11. Estimating Frequency by Interpolation Using Least Squares Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Changwei Ma

    2015-01-01

    Full Text Available Discrete Fourier transform- (DFT- based maximum likelihood (ML algorithm is an important part of single sinusoid frequency estimation. As signal to noise ratio (SNR increases and is above the threshold value, it will lie very close to Cramer-Rao lower bound (CRLB, which is dependent on the number of DFT points. However, its mean square error (MSE performance is directly proportional to its calculation cost. As a modified version of support vector regression (SVR, least squares SVR (LS-SVR can not only still keep excellent capabilities for generalizing and fitting but also exhibit lower computational complexity. In this paper, therefore, LS-SVR is employed to interpolate on Fourier coefficients of received signals and attain high frequency estimation accuracy. Our results show that the proposed algorithm can make a good compromise between calculation cost and MSE performance under the assumption that the sample size, number of DFT points, and resampling points are already known.

  12. Electricity Load Forecasting Using Support Vector Regression with Memetic Algorithms

    Directory of Open Access Journals (Sweden)

    Zhongyi Hu

    2013-01-01

    Full Text Available Electricity load forecasting is an important issue that is widely explored and examined in power systems operation literature and commercial transactions in electricity markets literature as well. Among the existing forecasting models, support vector regression (SVR has gained much attention. Considering the performance of SVR highly depends on its parameters; this study proposed a firefly algorithm (FA based memetic algorithm (FA-MA to appropriately determine the parameters of SVR forecasting model. In the proposed FA-MA algorithm, the FA algorithm is applied to explore the solution space, and the pattern search is used to conduct individual learning and thus enhance the exploitation of FA. Experimental results confirm that the proposed FA-MA based SVR model can not only yield more accurate forecasting results than the other four evolutionary algorithms based SVR models and three well-known forecasting models but also outperform the hybrid algorithms in the related existing literature.

  13. Detection of sensor degradation using K-means clustering and support vector regression in nuclear power plant

    International Nuclear Information System (INIS)

    Seo, Inyong; Ha, Bokam; Lee, Sungwoo; Shin, Changhoon; Lee, Jaeyong; Kim, Seongjun

    2011-01-01

    In a nuclear power plant (NPP), periodic sensor calibrations are required to assure sensors are operating correctly. However, only a few faulty sensors are found to be rectified. For the safe operation of an NPP and the reduction of unnecessary calibration, on-line calibration monitoring is needed. In this study, an on-line calibration monitoring called KPCSVR using k-means clustering and principal component based Auto-Associative support vector regression (PCSVR) is proposed for nuclear power plant. To reduce the training time of the model, k-means clustering method was used. Response surface methodology is employed to efficiently determine the optimal values of support vector regression hyperparameters. The proposed KPCSVR model was confirmed with actual plant data of Kori Nuclear Power Plant Unit 3 which were measured from the primary and secondary systems of the plant, and compared with the PCSVR model. By using data clustering, the average accuracy of PCSVR improved from 1.228×10 -4 to 0.472×10 -4 and the average sensitivity of PCSVR from 0.0930 to 0.0909, which results in good detection of sensor drift. Moreover, the training time is greatly reduced from 123.5 to 31.5 sec. (author)

  14. Curriculum Assessment Using Artificial Neural Network and Support Vector Machine Modeling Approaches: A Case Study. IR Applications. Volume 29

    Science.gov (United States)

    Chen, Chau-Kuang

    2010-01-01

    Artificial Neural Network (ANN) and Support Vector Machine (SVM) approaches have been on the cutting edge of science and technology for pattern recognition and data classification. In the ANN model, classification accuracy can be achieved by using the feed-forward of inputs, back-propagation of errors, and the adjustment of connection weights. In…

  15. Biomarkers of Eating Disorders Using Support Vector Machine Analysis of Structural Neuroimaging Data: Preliminary Results

    Directory of Open Access Journals (Sweden)

    Antonio Cerasa

    2015-01-01

    Full Text Available Presently, there are no valid biomarkers to identify individuals with eating disorders (ED. The aim of this work was to assess the feasibility of a machine learning method for extracting reliable neuroimaging features allowing individual categorization of patients with ED. Support Vector Machine (SVM technique, combined with a pattern recognition method, was employed utilizing structural magnetic resonance images. Seventeen females with ED (six with diagnosis of anorexia nervosa and 11 with bulimia nervosa were compared against 17 body mass index-matched healthy controls (HC. Machine learning allowed individual diagnosis of ED versus HC with an Accuracy ≥ 0.80. Voxel-based pattern recognition analysis demonstrated that voxels influencing the classification Accuracy involved the occipital cortex, the posterior cerebellar lobule, precuneus, sensorimotor/premotor cortices, and the medial prefrontal cortex, all critical regions known to be strongly involved in the pathophysiological mechanisms of ED. Although these findings should be considered preliminary given the small size investigated, SVM analysis highlights the role of well-known brain regions as possible biomarkers to distinguish ED from HC at an individual level, thus encouraging the translational implementation of this new multivariate approach in the clinical practice.

  16. Biomarkers of Eating Disorders Using Support Vector Machine Analysis of Structural Neuroimaging Data: Preliminary Results

    Science.gov (United States)

    Cerasa, Antonio; Castiglioni, Isabella; Salvatore, Christian; Funaro, Angela; Martino, Iolanda; Alfano, Stefania; Donzuso, Giulia; Perrotta, Paolo; Gioia, Maria Cecilia; Gilardi, Maria Carla; Quattrone, Aldo

    2015-01-01

    Presently, there are no valid biomarkers to identify individuals with eating disorders (ED). The aim of this work was to assess the feasibility of a machine learning method for extracting reliable neuroimaging features allowing individual categorization of patients with ED. Support Vector Machine (SVM) technique, combined with a pattern recognition method, was employed utilizing structural magnetic resonance images. Seventeen females with ED (six with diagnosis of anorexia nervosa and 11 with bulimia nervosa) were compared against 17 body mass index-matched healthy controls (HC). Machine learning allowed individual diagnosis of ED versus HC with an Accuracy ≥ 0.80. Voxel-based pattern recognition analysis demonstrated that voxels influencing the classification Accuracy involved the occipital cortex, the posterior cerebellar lobule, precuneus, sensorimotor/premotor cortices, and the medial prefrontal cortex, all critical regions known to be strongly involved in the pathophysiological mechanisms of ED. Although these findings should be considered preliminary given the small size investigated, SVM analysis highlights the role of well-known brain regions as possible biomarkers to distinguish ED from HC at an individual level, thus encouraging the translational implementation of this new multivariate approach in the clinical practice. PMID:26648660

  17. Quantitative structure–activity relationship model for amino acids as corrosion inhibitors based on the support vector machine and molecular design

    International Nuclear Information System (INIS)

    Zhao, Hongxia; Zhang, Xiuhui; Ji, Lin; Hu, Haixiang; Li, Qianshu

    2014-01-01

    Highlights: • Nonlinear quantitative structure–activity relationship (QSAR) model was built by the support vector machine. • Descriptors for QSAR model were selected by principal component analysis. • Binding energy was taken as one of the descriptors for QSAR model. • Acidic solution and protonation of the inhibitor were considered. - Abstract: The inhibition performance of nineteen amino acids was studied by theoretical methods. The affection of acidic solution and protonation of inhibitor were considered in molecular dynamics simulation and the results indicated that the protonated amino-group was not adsorbed on Fe (1 1 0) surface. Additionally, a nonlinear quantitative structure–activity relationship (QSAR) model was built by the support vector machine. The correlation coefficient was 0.97 and the root mean square error, the differences between predicted and experimental inhibition efficiencies (%), was 1.48. Furthermore, five new amino acids were theoretically designed and their inhibition efficiencies were predicted by the built QSAR model

  18. Support vector machine learning-based fMRI data group analysis.

    Science.gov (United States)

    Wang, Ze; Childress, Anna R; Wang, Jiongjiong; Detre, John A

    2007-07-15

    To explore the multivariate nature of fMRI data and to consider the inter-subject brain response discrepancies, a multivariate and brain response model-free method is fundamentally required. Two such methods are presented in this paper by integrating a machine learning algorithm, the support vector machine (SVM), and the random effect model. Without any brain response modeling, SVM was used to extract a whole brain spatial discriminance map (SDM), representing the brain response difference between the contrasted experimental conditions. Population inference was then obtained through the random effect analysis (RFX) or permutation testing (PMU) on the individual subjects' SDMs. Applied to arterial spin labeling (ASL) perfusion fMRI data, SDM RFX yielded lower false-positive rates in the null hypothesis test and higher detection sensitivity for synthetic activations with varying cluster size and activation strengths, compared to the univariate general linear model (GLM)-based RFX. For a sensory-motor ASL fMRI study, both SDM RFX and SDM PMU yielded similar activation patterns to GLM RFX and GLM PMU, respectively, but with higher t values and cluster extensions at the same significance level. Capitalizing on the absence of temporal noise correlation in ASL data, this study also incorporated PMU in the individual-level GLM and SVM analyses accompanied by group-level analysis through RFX or group-level PMU. Providing inferences on the probability of being activated or deactivated at each voxel, these individual-level PMU-based group analysis methods can be used to threshold the analysis results of GLM RFX, SDM RFX or SDM PMU.

  19. Noninvasive extraction of fetal electrocardiogram based on Support Vector Machine

    Science.gov (United States)

    Fu, Yumei; Xiang, Shihan; Chen, Tianyi; Zhou, Ping; Huang, Weiyan

    2015-10-01

    The fetal electrocardiogram (FECG) signal has important clinical value for diagnosing the fetal heart diseases and choosing suitable therapeutics schemes to doctors. So, the noninvasive extraction of FECG from electrocardiogram (ECG) signals becomes a hot research point. A new method, the Support Vector Machine (SVM) is utilized for the extraction of FECG with limited size of data. Firstly, the theory of the SVM and the principle of the extraction based on the SVM are studied. Secondly, the transformation of maternal electrocardiogram (MECG) component in abdominal composite signal is verified to be nonlinear and fitted with the SVM. Then, the SVM is trained, and the training results are compared with the real data to ensure the effect of the training. Meanwhile, the parameters of the SVM are optimized to achieve the best performance so that the learning machine can be utilized to fit the unknown samples. Finally, the FECG is extracted by removing the optimal estimation of MECG component from the abdominal composite signal. In order to evaluate the performance of FECG extraction based on the SVM, the Signal-to-Noise Ratio (SNR) and the visual test are used. The experimental results show that the FECG with good quality can be extracted, its SNR ratio is significantly increased as high as 9.2349 dB and the time cost is significantly decreased as short as 0.802 seconds. Compared with the traditional method, the noninvasive extraction method based on the SVM has a simple realization, the shorter treatment time and the better extraction quality under the same conditions.

  20. Forecasting systems reliability based on support vector regression with genetic algorithms

    International Nuclear Information System (INIS)

    Chen, K.-Y.

    2007-01-01

    This study applies a novel neural-network technique, support vector regression (SVR), to forecast reliability in engine systems. The aim of this study is to examine the feasibility of SVR in systems reliability prediction by comparing it with the existing neural-network approaches and the autoregressive integrated moving average (ARIMA) model. To build an effective SVR model, SVR's parameters must be set carefully. This study proposes a novel approach, known as GA-SVR, which searches for SVR's optimal parameters using real-value genetic algorithms, and then adopts the optimal parameters to construct the SVR models. A real reliability data for 40 suits of turbochargers were employed as the data set. The experimental results demonstrate that SVR outperforms the existing neural-network approaches and the traditional ARIMA models based on the normalized root mean square error and mean absolute percentage error

  1. Estimation of the Power Peaking Factor in a Nuclear Reactor Using Support Vector Machines and Uncertainty Analysis

    International Nuclear Information System (INIS)

    Bae, In Ho; Na, Man Gyun; Lee, Yoon Joon; Park, Goon Cherl

    2009-01-01

    Knowing more about the Local Power Density (LPD) at the hottest part of a nuclear reactor core can provide more important information than knowledge of the LPD at any other position. The LPD at the hottest part needs to be estimated accurately in order to prevent the fuel rod from melting in a nuclear reactor. Support Vector Machines (SVMs) have successfully been applied in classification and regression problems. Therefore, in this paper, the power peaking factor, which is defined as the highest LPD to the average power density in a reactor core, was estimated by SVMs which use numerous measured signals of the reactor coolant system. The SVM models were developed by using a training data set and validated by an independent test data set. The SVM models' uncertainty was analyzed by using 100 sampled training data sets and verification data sets. The prediction intervals were very small, which means that the predicted values were very accurate. The predicted values were then applied to the first fuel cycle of the Yonggwang Nuclear Power Plant Unit 3. The root mean squared error was approximately 0.15%, which is accurate enough for use in LPD monitoring and for core protection that uses LPD estimation

  2. Hybrid RGSA and Support Vector Machine Framework for Three-Dimensional Magnetic Resonance Brain Tumor Classification

    Directory of Open Access Journals (Sweden)

    R. Rajesh Sharma

    2015-01-01

    algorithm (RGSA. Support vector machines, over backpropagation network, and k-nearest neighbor are used to evaluate the goodness of classifier approach. The preliminary evaluation of the system is performed using 320 real-time brain MRI images. The system is trained and tested by using a leave-one-case-out method. The performance of the classifier is tested using the receiver operating characteristic curve of 0.986 (±002. The experimental results demonstrate the systematic and efficient feature extraction and feature selection algorithm to the performance of state-of-the-art feature classification methods.

  3. Status self-validation of a multifunctional sensor using a multivariate relevance vector machine and predictive filters

    International Nuclear Information System (INIS)

    Shen, Zhengguang; Wang, Qi

    2013-01-01

    A novel strategy by using a multivariable relevance vector machine coupled with predictive filters for status self-validation of a multifunctional sensor is proposed. The working principle and online updating algorithm of predictive filters are emphasized for multiple fault detection, isolation and recovery (FDIR), and the incorrect sensor measurements are validated online. The multivariable relevance vector machine is then employed for the signal reconstruction of the multifunctional sensor to generate the final validated measurement values (VMV) of multiple measured components, in which its advantages of sparse models and multivariable simultaneous outputs are fully used. With all likely uncertainty sources of the multifunctional self-validating sensor taken into account, the uncertainty propagation model is deduced in detail to evaluate the online validated uncertainty (VU) under a fault-free situation while a qualitative uncertainty component is appended to indicate the accuracy changes of VMV under different types of fault. A real experimental system of a multifunctional self-validating sensor is designed to verify the performance of the proposed strategy. From the real-time capacity and fault recovery accuracy of FDIR, and runtime of signal reconstruction under small samples, a performance comparison among different methods is made. Results demonstrate that the proposed scheme provides a better solution to the status self-validation of a multifunctional self-validating sensor under both normal and abnormal situations. (paper)

  4. The Short-Term Power Load Forecasting Based on Sperm Whale Algorithm and Wavelet Least Square Support Vector Machine with DWT-IR for Feature Selection

    Directory of Open Access Journals (Sweden)

    Jin-peng Liu

    2017-07-01

    Full Text Available Short-term power load forecasting is an important basis for the operation of integrated energy system, and the accuracy of load forecasting directly affects the economy of system operation. To improve the forecasting accuracy, this paper proposes a load forecasting system based on wavelet least square support vector machine and sperm whale algorithm. Firstly, the methods of discrete wavelet transform and inconsistency rate model (DWT-IR are used to select the optimal features, which aims to reduce the redundancy of input vectors. Secondly, the kernel function of least square support vector machine LSSVM is replaced by wavelet kernel function for improving the nonlinear mapping ability of LSSVM. Lastly, the parameters of W-LSSVM are optimized by sperm whale algorithm, and the short-term load forecasting method of W-LSSVM-SWA is established. Additionally, the example verification results show that the proposed model outperforms other alternative methods and has a strong effectiveness and feasibility in short-term power load forecasting.

  5. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    Science.gov (United States)

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  6. Prediction on sunspot activity based on fuzzy information granulation and support vector machine

    Science.gov (United States)

    Peng, Lingling; Yan, Haisheng; Yang, Zhigang

    2018-04-01

    In order to analyze the range of sunspots, a combined prediction method of forecasting the fluctuation range of sunspots based on fuzzy information granulation (FIG) and support vector machine (SVM) was put forward. Firstly, employing the FIG to granulate sample data and extract va)alid information of each window, namely the minimum value, the general average value and the maximum value of each window. Secondly, forecasting model is built respectively with SVM and then cross method is used to optimize these parameters. Finally, the fluctuation range of sunspots is forecasted with the optimized SVM model. Case study demonstrates that the model have high accuracy and can effectively predict the fluctuation of sunspots.

  7. An alternative approach to the determination of scaling law expressions for the L–H transition in Tokamaks utilizing classification tools instead of regression

    International Nuclear Information System (INIS)

    Gaudio, P; Gelfusa, M; Lupelli, I; Murari, A; Vega, J

    2014-01-01

    A new approach to determine the power law expressions for the threshold between the H and L mode of confinement is presented. The method is based on two powerful machine learning tools for classification: neural networks and support vector machines. Using as inputs clear examples of the systems on either side of the transition, the machine learning tools learn the input–output mapping corresponding to the equations of the boundary separating the confinement regimes. Systematic tests with synthetic data show that the machine learning tools provide results competitive with traditional statistical regression and more robust against random noise and systematic errors. The developed tools have then been applied to the multi-machine International Tokamak Physics Activity International Global Threshold Database of validated ITER-like Tokamak discharges. The machine learning tools converge on the same scaling law parameters obtained with non-linear regression. On the other hand, the developed tools allow a reduction of 50% of the uncertainty in the extrapolations to ITER. Therefore the proposed approach can effectively complement traditional regression since its application poses much less stringent requirements on the experimental data, to be used to determine the scaling laws, because they do not require examples exactly at the moment of the transition. (paper)

  8. Multi-class parkinsonian disorders classification with quantitative MR markers and graph-based features using support vector machines.

    Science.gov (United States)

    Morisi, Rita; Manners, David Neil; Gnecco, Giorgio; Lanconelli, Nico; Testa, Claudia; Evangelisti, Stefania; Talozzi, Lia; Gramegna, Laura Ludovica; Bianchini, Claudio; Calandra-Buonaura, Giovanna; Sambati, Luisa; Giannini, Giulia; Cortelli, Pietro; Tonon, Caterina; Lodi, Raffaele

    2018-02-01

    In this study we attempt to automatically classify individual patients with different parkinsonian disorders, making use of pattern recognition techniques to distinguish among several forms of parkinsonisms (multi-class classification), based on a set of binary classifiers that discriminate each disorder from all others. We combine diffusion tensor imaging, proton spectroscopy and morphometric-volumetric data to obtain MR quantitative markers, which are provided to support vector machines with the aim of recognizing the different parkinsonian disorders. Feature selection is used to find the most important features for classification. We also exploit a graph-based technique on the set of quantitative markers to extract additional features from the dataset, and increase classification accuracy. When graph-based features are not used, the MR markers that are most frequently automatically extracted by the feature selection procedure reflect alterations in brain regions that are also usually considered to discriminate parkinsonisms in routine clinical practice. Graph-derived features typically increase the diagnostic accuracy, and reduce the number of features required. The results obtained in the work demonstrate that support vector machines applied to multimodal brain MR imaging and using graph-based features represent a novel and highly accurate approach to discriminate parkinsonisms, and a useful tool to assist the diagnosis. Copyright © 2017 Elsevier Ltd. All rights reserved.

  9. A Semisupervised Support Vector Machines Algorithm for BCI Systems

    Science.gov (United States)

    Qin, Jianzhao; Li, Yuanqing; Sun, Wei

    2007-01-01

    As an emerging technology, brain-computer interfaces (BCIs) bring us new communication interfaces which translate brain activities into control signals for devices like computers, robots, and so forth. In this study, we propose a semisupervised support vector machine (SVM) algorithm for brain-computer interface (BCI) systems, aiming at reducing the time-consuming training process. In this algorithm, we apply a semisupervised SVM for translating the features extracted from the electrical recordings of brain into control signals. This SVM classifier is built from a small labeled data set and a large unlabeled data set. Meanwhile, to reduce the time for training semisupervised SVM, we propose a batch-mode incremental learning method, which can also be easily applied to the online BCI systems. Additionally, it is suggested in many studies that common spatial pattern (CSP) is very effective in discriminating two different brain states. However, CSP needs a sufficient labeled data set. In order to overcome the drawback of CSP, we suggest a two-stage feature extraction method for the semisupervised learning algorithm. We apply our algorithm to two BCI experimental data sets. The offline data analysis results demonstrate the effectiveness of our algorithm. PMID:18368141

  10. Failure prognostics by support vector regression of time series data under stationary/nonstationary environmental and operational conditions

    International Nuclear Information System (INIS)

    Liu, Jie

    2015-01-01

    This Ph. D. work is motivated by the possibility of monitoring the conditions of components of energy systems for their extended and safe use, under proper practice of operation and adequate policies of maintenance. The aim is to develop a Support Vector Regression (SVR)-based framework for predicting time series data under stationary/nonstationary environmental and operational conditions. Single SVR and SVR-based ensemble approaches are developed to tackle the prediction problem based on both small and large datasets. Strategies are proposed for adaptively updating the single SVR and SVR-based ensemble models in the existence of pattern drifts. Comparisons with other online learning approaches for kernel-based modelling are provided with reference to time series data from a critical component in Nuclear Power Plants (NPPs) provided by Electricite de France (EDF). The results show that the proposed approaches achieve comparable prediction results, considering the Mean Squared Error (MSE) and Mean Relative Error (MRE), in much less computation time. Furthermore, by analyzing the geometrical meaning of the Feature Vector Selection (FVS) method proposed in the literature, a novel geometrically interpretable kernel method, named Reduced Rank Kernel Ridge Regression-II (RRKRR-II), is proposed to describe the linear relations between a predicted value and the predicted values of the Feature Vectors (FVs) selected by FVS. Comparisons with several kernel methods on a number of public datasets prove the good prediction accuracy and the easy-of-tuning of the hyper-parameters of RRKRR-II. (author)

  11. LMethyR-SVM: Predict Human Enhancers Using Low Methylated Regions based on Weighted Support Vector Machines.

    Science.gov (United States)

    Xu, Jingting; Hu, Hong; Dai, Yang

    The identification of enhancers is a challenging task. Various types of epigenetic information including histone modification have been utilized in the construction of enhancer prediction models based on a diverse panel of machine learning schemes. However, DNA methylation profiles generated from the whole genome bisulfite sequencing (WGBS) have not been fully explored for their potential in enhancer prediction despite the fact that low methylated regions (LMRs) have been implied to be distal active regulatory regions. In this work, we propose a prediction framework, LMethyR-SVM, using LMRs identified from cell-type-specific WGBS DNA methylation profiles and a weighted support vector machine learning framework. In LMethyR-SVM, the set of cell-type-specific LMRs is further divided into three sets: reliable positive, like positive and likely negative, according to their resemblance to a small set of experimentally validated enhancers in the VISTA database based on an estimated non-parametric density distribution. Then, the prediction model is obtained by solving a weighted support vector machine. We demonstrate the performance of LMethyR-SVM by using the WGBS DNA methylation profiles derived from the human embryonic stem cell type (H1) and the fetal lung fibroblast cell type (IMR90). The predicted enhancers are highly conserved with a reasonable validation rate based on a set of commonly used positive markers including transcription factors, p300 binding and DNase-I hypersensitive sites. In addition, we show evidence that the large fraction of the LMethyR-SVM predicted enhancers are not predicted by ChromHMM in H1 cell type and they are more enriched for the FANTOM5 enhancers. Our work suggests that low methylated regions detected from the WGBS data are useful as complementary resources to histone modification marks in developing models for the prediction of cell-type-specific enhancers.

  12. Blood glucose level prediction based on support vector regression using mobile platforms.

    Science.gov (United States)

    Reymann, Maximilian P; Dorschky, Eva; Groh, Benjamin H; Martindale, Christine; Blank, Peter; Eskofier, Bjoern M

    2016-08-01

    The correct treatment of diabetes is vital to a patient's health: Staying within defined blood glucose levels prevents dangerous short- and long-term effects on the body. Mobile devices informing patients about their future blood glucose levels could enable them to take counter-measures to prevent hypo or hyper periods. Previous work addressed this challenge by predicting the blood glucose levels using regression models. However, these approaches required a physiological model, representing the human body's response to insulin and glucose intake, or are not directly applicable to mobile platforms (smart phones, tablets). In this paper, we propose an algorithm for mobile platforms to predict blood glucose levels without the need for a physiological model. Using an online software simulator program, we trained a Support Vector Regression (SVR) model and exported the parameter settings to our mobile platform. The prediction accuracy of our mobile platform was evaluated with pre-recorded data of a type 1 diabetes patient. The blood glucose level was predicted with an error of 19 % compared to the true value. Considering the permitted error of commercially used devices of 15 %, our algorithm is the basis for further development of mobile prediction algorithms.

  13. Product demand forecasts using wavelet kernel support vector machine and particle swarm optimization in manufacture system

    Science.gov (United States)

    Wu, Qi

    2010-03-01

    Demand forecasts play a crucial role in supply chain management. The future demand for a certain product is the basis for the respective replenishment systems. Aiming at demand series with small samples, seasonal character, nonlinearity, randomicity and fuzziness, the existing support vector kernel does not approach the random curve of the sales time series in the space (quadratic continuous integral space). In this paper, we present a hybrid intelligent system combining the wavelet kernel support vector machine and particle swarm optimization for demand forecasting. The results of application in car sale series forecasting show that the forecasting approach based on the hybrid PSOWv-SVM model is effective and feasible, the comparison between the method proposed in this paper and other ones is also given, which proves that this method is, for the discussed example, better than hybrid PSOv-SVM and other traditional methods.

  14. Prediction of Carbohydrate-Binding Proteins from Sequences Using Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Seizi Someya

    2010-01-01

    Full Text Available Carbohydrate-binding proteins are proteins that can interact with sugar chains but do not modify them. They are involved in many physiological functions, and we have developed a method for predicting them from their amino acid sequences. Our method is based on support vector machines (SVMs. We first clarified the definition of carbohydrate-binding proteins and then constructed positive and negative datasets with which the SVMs were trained. By applying the leave-one-out test to these datasets, our method delivered 0.92 of the area under the receiver operating characteristic (ROC curve. We also examined two amino acid grouping methods that enable effective learning of sequence patterns and evaluated the performance of these methods. When we applied our method in combination with the homology-based prediction method to the annotated human genome database, H-invDB, we found that the true positive rate of prediction was improved.

  15. SYN Flood Attack Detection in Cloud Computing using Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Zerina Mašetić

    2017-11-01

    Full Text Available Cloud computing is a trending technology, as it reduces the cost of running a business. However, many companies are skeptic moving about towards cloud due to the security concerns. Based on the Cloud Security Alliance report, Denial of Service (DoS attacks are among top 12 attacks in the cloud computing. Therefore, it is important to develop a mechanism for detection and prevention of these attacks. The aim of this paper is to evaluate Support Vector Machine (SVM algorithm in creating the model for classification of DoS attacks and normal network behaviors. The study was performed in several phases: a attack simulation, b data collection, cfeature selection, and d classification. The proposedmodel achieved 100% classification accuracy with true positive rate (TPR of 100%. SVM showed outstanding performance in DoS attack detection and proves that it serves as a valuable asset in the network security area.

  16. Classification of ECG signal with Support Vector Machine Method for Arrhythmia Detection

    Science.gov (United States)

    Turnip, Arjon; Ilham Rizqywan, M.; Kusumandari, Dwi E.; Turnip, Mardi; Sihombing, Poltak

    2018-03-01

    An electrocardiogram is a potential bioelectric record that occurs as a result of cardiac activity. QRS Detection with zero crossing calculation is one method that can precisely determine peak R of QRS wave as part of arrhythmia detection. In this paper, two experimental scheme (2 minutes duration with different activities: relaxed and, typing) were conducted. From the two experiments it were obtained: accuracy, sensitivity, and positive predictivity about 100% each for the first experiment and about 79%, 93%, 83% for the second experiment, respectively. Furthermore, the feature set of MIT-BIH arrhythmia using the support vector machine (SVM) method on the WEKA software is evaluated. By combining the available attributes on the WEKA algorithm, the result is constant since all classes of SVM goes to the normal class with average 88.49% accuracy.

  17. Landslide susceptibility mapping based on Support Vector Machine: A case study on natural slopes of Hong Kong, China

    Science.gov (United States)

    Yao, X.; Tham, L. G.; Dai, F. C.

    2008-11-01

    The Support Vector Machine (SVM) is an increasingly popular learning procedure based on statistical learning theory, and involves a training phase in which the model is trained by a training dataset of associated input and target output values. The trained model is then used to evaluate a separate set of testing data. There are two main ideas underlying the SVM for discriminant-type problems. The first is an optimum linear separating hyperplane that separates the data patterns. The second is the use of kernel functions to convert the original non-linear data patterns into the format that is linearly separable in a high-dimensional feature space. In this paper, an overview of the SVM, both one-class and two-class SVM methods, is first presented followed by its use in landslide susceptibility mapping. A study area was selected from the natural terrain of Hong Kong, and slope angle, slope aspect, elevation, profile curvature of slope, lithology, vegetation cover and topographic wetness index (TWI) were used as environmental parameters which influence the occurrence of landslides. One-class and two-class SVM models were trained and then used to map landslide susceptibility respectively. The resulting susceptibility maps obtained by the methods were compared to that obtained by the logistic regression (LR) method. It is concluded that two-class SVM possesses better prediction efficiency than logistic regression and one-class SVM. However, one-class SVM, which only requires failed cases, has an advantage over the other two methods as only "failed" case information is usually available in landslide susceptibility mapping.

  18. Visual Thing Recognition with Binary Scale-Invariant Feature Transform and Support Vector Machine Classifiers Using Color Information

    OpenAIRE

    Wei-Jong Yang; Wei-Hau Du; Pau-Choo Chang; Jar-Ferr Yang; Pi-Hsia Hung

    2017-01-01

    The demands of smart visual thing recognition in various devices have been increased rapidly for daily smart production, living and learning systems in recent years. This paper proposed a visual thing recognition system, which combines binary scale-invariant feature transform (SIFT), bag of words model (BoW), and support vector machine (SVM) by using color information. Since the traditional SIFT features and SVM classifiers only use the gray information, color information is still an importan...

  19. Probability estimation with machine learning methods for dichotomous and multicategory outcome: theory.

    Science.gov (United States)

    Kruppa, Jochen; Liu, Yufeng; Biau, Gérard; Kohler, Michael; König, Inke R; Malley, James D; Ziegler, Andreas

    2014-07-01

    Probability estimation for binary and multicategory outcome using logistic and multinomial logistic regression has a long-standing tradition in biostatistics. However, biases may occur if the model is misspecified. In contrast, outcome probabilities for individuals can be estimated consistently with machine learning approaches, including k-nearest neighbors (k-NN), bagged nearest neighbors (b-NN), random forests (RF), and support vector machines (SVM). Because machine learning methods are rarely used by applied biostatisticians, the primary goal of this paper is to explain the concept of probability estimation with these methods and to summarize recent theoretical findings. Probability estimation in k-NN, b-NN, and RF can be embedded into the class of nonparametric regression learning machines; therefore, we start with the construction of nonparametric regression estimates and review results on consistency and rates of convergence. In SVMs, outcome probabilities for individuals are estimated consistently by repeatedly solving classification problems. For SVMs we review classification problem and then dichotomous probability estimation. Next we extend the algorithms for estimating probabilities using k-NN, b-NN, and RF to multicategory outcomes and discuss approaches for the multicategory probability estimation problem using SVM. In simulation studies for dichotomous and multicategory dependent variables we demonstrate the general validity of the machine learning methods and compare it with logistic regression. However, each method fails in at least one simulation scenario. We conclude with a discussion of the failures and give recommendations for selecting and tuning the methods. Applications to real data and example code are provided in a companion article (doi:10.1002/bimj.201300077). © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. Automatic Sleep Staging using Multi-dimensional Feature Extraction and Multi-kernel Fuzzy Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Yanjun Zhang

    2014-01-01

    Full Text Available This paper employed the clinical Polysomnographic (PSG data, mainly including all-night Electroencephalogram (EEG, Electrooculogram (EOG and Electromyogram (EMG signals of subjects, and adopted the American Academy of Sleep Medicine (AASM clinical staging manual as standards to realize automatic sleep staging. Authors extracted eighteen different features of EEG, EOG and EMG in time domains and frequency domains to construct the vectors according to the existing literatures as well as clinical experience. By adopting sleep samples self-learning, the linear combination of weights and parameters of multiple kernels of the fuzzy support vector machine (FSVM were learned and the multi-kernel FSVM (MK-FSVM was constructed. The overall agreement between the experts' scores and the results presented was 82.53%. Compared with previous results, the accuracy of N1 was improved to some extent while the accuracies of other stages were approximate, which well reflected the sleep structure. The staging algorithm proposed in this paper is transparent, and worth further investigation.

  1. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    Science.gov (United States)

    Park, Saerom; Lee, Jaewook; Son, Youngdoo

    2016-01-01

    Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide the predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a-state-of-the-art benchmark parametric model such as I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance.

  2. Localization of thermal anomalies in electrical equipment using Infrared Thermography and support vector machine

    Science.gov (United States)

    Laib dit Leksir, Y.; Mansour, M.; Moussaoui, A.

    2018-03-01

    Analysis and processing of databases obtained from infrared thermal inspections made on electrical installations require the development of new tools to obtain more information to visual inspections. Consequently, methods based on the capture of thermal images show a great potential and are increasingly employed in this field. However, there is a need for the development of effective techniques to analyse these databases in order to extract significant information relating to the state of the infrastructures. This paper presents a technique explaining how this approach can be implemented and proposes a system that can help to detect faults in thermal images of electrical installations. The proposed method classifies and identifies the region of interest (ROI). The identification is conducted using support vector machine (SVM) algorithm. The aim here is to capture the faults that exist in electrical equipments during an inspection of some machines using A40 FLIR camera. After that, binarization techniques are employed to select the region of interest. Later the comparative analysis of the obtained misclassification errors using the proposed method with Fuzzy c means and Ostu, has also be addressed.

  3. Identification of transformer fault based on dissolved gas analysis using hybrid support vector machine-modified evolutionary particle swarm optimisation

    Science.gov (United States)

    2018-01-01

    Early detection of power transformer fault is important because it can reduce the maintenance cost of the transformer and it can ensure continuous electricity supply in power systems. Dissolved Gas Analysis (DGA) technique is commonly used to identify oil-filled power transformer fault type but utilisation of artificial intelligence method with optimisation methods has shown convincing results. In this work, a hybrid support vector machine (SVM) with modified evolutionary particle swarm optimisation (EPSO) algorithm was proposed to determine the transformer fault type. The superiority of the modified PSO technique with SVM was evaluated by comparing the results with the actual fault diagnosis, unoptimised SVM and previous reported works. Data reduction was also applied using stepwise regression prior to the training process of SVM to reduce the training time. It was found that the proposed hybrid SVM-Modified EPSO (MEPSO)-Time Varying Acceleration Coefficient (TVAC) technique results in the highest correct identification percentage of faults in a power transformer compared to other PSO algorithms. Thus, the proposed technique can be one of the potential solutions to identify the transformer fault type based on DGA data on site. PMID:29370230

  4. Identification of transformer fault based on dissolved gas analysis using hybrid support vector machine-modified evolutionary particle swarm optimisation.

    Directory of Open Access Journals (Sweden)

    Hazlee Azil Illias

    Full Text Available Early detection of power transformer fault is important because it can reduce the maintenance cost of the transformer and it can ensure continuous electricity supply in power systems. Dissolved Gas Analysis (DGA technique is commonly used to identify oil-filled power transformer fault type but utilisation of artificial intelligence method with optimisation methods has shown convincing results. In this work, a hybrid support vector machine (SVM with modified evolutionary particle swarm optimisation (EPSO algorithm was proposed to determine the transformer fault type. The superiority of the modified PSO technique with SVM was evaluated by comparing the results with the actual fault diagnosis, unoptimised SVM and previous reported works. Data reduction was also applied using stepwise regression prior to the training process of SVM to reduce the training time. It was found that the proposed hybrid SVM-Modified EPSO (MEPSO-Time Varying Acceleration Coefficient (TVAC technique results in the highest correct identification percentage of faults in a power transformer compared to other PSO algorithms. Thus, the proposed technique can be one of the potential solutions to identify the transformer fault type based on DGA data on site.

  5. Identification of transformer fault based on dissolved gas analysis using hybrid support vector machine-modified evolutionary particle swarm optimisation.

    Science.gov (United States)

    Illias, Hazlee Azil; Zhao Liang, Wee

    2018-01-01

    Early detection of power transformer fault is important because it can reduce the maintenance cost of the transformer and it can ensure continuous electricity supply in power systems. Dissolved Gas Analysis (DGA) technique is commonly used to identify oil-filled power transformer fault type but utilisation of artificial intelligence method with optimisation methods has shown convincing results. In this work, a hybrid support vector machine (SVM) with modified evolutionary particle swarm optimisation (EPSO) algorithm was proposed to determine the transformer fault type. The superiority of the modified PSO technique with SVM was evaluated by comparing the results with the actual fault diagnosis, unoptimised SVM and previous reported works. Data reduction was also applied using stepwise regression prior to the training process of SVM to reduce the training time. It was found that the proposed hybrid SVM-Modified EPSO (MEPSO)-Time Varying Acceleration Coefficient (TVAC) technique results in the highest correct identification percentage of faults in a power transformer compared to other PSO algorithms. Thus, the proposed technique can be one of the potential solutions to identify the transformer fault type based on DGA data on site.

  6. Cost Forecasting of Substation Projects Based on Cuckoo Search Algorithm and Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Dongxiao Niu

    2018-01-01

    Full Text Available Accurate prediction of substation project cost is helpful to improve the investment management and sustainability. It is also directly related to the economy of substation project. Ensemble Empirical Mode Decomposition (EEMD can decompose variables with non-stationary sequence signals into significant regularity and periodicity, which is helpful in improving the accuracy of prediction model. Adding the Gauss perturbation to the traditional Cuckoo Search (CS algorithm can improve the searching vigor and precision of CS algorithm. Thus, the parameters and kernel functions of Support Vector Machines (SVM model are optimized. By comparing the prediction results with other models, this model has higher prediction accuracy.

  7. Impact of Health Care Employees’ Job Satisfaction on Organizational Performance Support Vector Machine Approach

    Directory of Open Access Journals (Sweden)

    CEMIL KUZEY

    2018-01-01

    Full Text Available This study is undertaken to search for key factors that contribute to job satisfaction among health care workers, and also to determine the impact of these underlying dimensions of employee satisfaction on organizational performance. Exploratory Factor Analysis (EFA is applied to initially uncover the key factors, and then, in the next stage of analysis, a popular data mining technique, Support Vector Machine (SVM is employed on a sample of 249 to determine the impact of job satisfaction factors on organizational performance. According to the proposed model, the main factors are revealed to be management’s attitude, pay/reward, job security and colleagues.

  8. Asynchronized synchronous machines

    CERN Document Server

    Botvinnik, M M

    1964-01-01

    Asynchronized Synchronous Machines focuses on the theoretical research on asynchronized synchronous (AS) machines, which are "hybrids” of synchronous and induction machines that can operate with slip. Topics covered in this book include the initial equations; vector diagram of an AS machine; regulation in cases of deviation from the law of full compensation; parameters of the excitation system; and schematic diagram of an excitation regulator. The possible applications of AS machines and its calculations in certain cases are also discussed. This publication is beneficial for students and indiv

  9. The Pattern Recognition in Cattle Brand using Bag of Visual Words and Support Vector Machines Multi-Class

    Directory of Open Access Journals (Sweden)

    Carlos Silva, Mr

    2018-03-01

    Full Text Available The recognition images of cattle brand in an automatic way is a necessity to governmental organs responsible for this activity. To help this process, this work presents a method that consists in using Bag of Visual Words for extracting of characteristics from images of cattle brand and Support Vector Machines Multi-Class for classification. This method consists of six stages: a select database of images; b extract points of interest (SURF; c create vocabulary (K-means; d create vector of image characteristics (visual words; e train and sort images (SVM; f evaluate the classification results. The accuracy of the method was tested on database of municipal city hall, where it achieved satisfactory results, reporting 86.02% of accuracy and 56.705 seconds of processing time, respectively.

  10. Figure of merit for macrouniformity based on image quality ruler evaluation and machine learning framework

    Science.gov (United States)

    Wang, Weibao; Overall, Gary; Riggs, Travis; Silveston-Keith, Rebecca; Whitney, Julie; Chiu, George; Allebach, Jan P.

    2013-01-01

    Assessment of macro-uniformity is a capability that is important for the development and manufacture of printer products. Our goal is to develop a metric that will predict macro-uniformity, as judged by human subjects, by scanning and analyzing printed pages. We consider two different machine learning frameworks for the metric: linear regression and the support vector machine. We have implemented the image quality ruler, based on the recommendations of the INCITS W1.1 macro-uniformity team. Using 12 subjects at Purdue University and 20 subjects at Lexmark, evenly balanced with respect to gender, we conducted subjective evaluations with a set of 35 uniform b/w prints from seven different printers with five levels of tint coverage. Our results suggest that the image quality ruler method provides a reliable means to assess macro-uniformity. We then defined and implemented separate features to measure graininess, mottle, large area variation, jitter, and large-scale non-uniformity. The algorithms that we used are largely based on ISO image quality standards. Finally, we used these features computed for a set of test pages and the subjects' image quality ruler assessments of these pages to train the two different predictors - one based on linear regression and the other based on the support vector machine (SVM). Using five-fold cross-validation, we confirmed the efficacy of our predictor.

  11. Measurement of the t (bar t) cross section at the Run II Tevatron using Support Vector Machines

    International Nuclear Information System (INIS)

    Whitehouse, Benjamin Eric

    2010-01-01

    This dissertation measures the t(bar t) production cross section at the Run II CDF detector using data from early 2001 through March 2007. The Tevatron at Fermilab is a p(bar p) collider with center of mass energy √s = 1.96 TeV. This data composes a sample with a time-integrated luminosity measured at 2.2 ± 0.1 fb -1 . A system of learning machines is developed to recognize t(bar t) events in the 'lepton plus jets' decay channel. Support Vector Machines are described, and their ability to cope with a multi-class discrimination problem is provided. The t(bar t) production cross section is then measured in this framework, and found to be σ t# bar t# = 7.14 ± 0.25 (stat) -0.86 +0.61 (sys) pb.

  12. PMSVM: An Optimized Support Vector Machine Classification Algorithm Based on PCA and Multilevel Grid Search Methods

    Directory of Open Access Journals (Sweden)

    Yukai Yao

    2015-01-01

    Full Text Available We propose an optimized Support Vector Machine classifier, named PMSVM, in which System Normalization, PCA, and Multilevel Grid Search methods are comprehensively considered for data preprocessing and parameters optimization, respectively. The main goals of this study are to improve the classification efficiency and accuracy of SVM. Sensitivity, Specificity, Precision, and ROC curve, and so forth, are adopted to appraise the performances of PMSVM. Experimental results show that PMSVM has relatively better accuracy and remarkable higher efficiency compared with traditional SVM algorithms.

  13. Mixed kernel function support vector regression for global sensitivity analysis

    Science.gov (United States)

    Cheng, Kai; Lu, Zhenzhou; Wei, Yuhao; Shi, Yan; Zhou, Yicheng

    2017-11-01

    Global sensitivity analysis (GSA) plays an important role in exploring the respective effects of input variables on an assigned output response. Amongst the wide sensitivity analyses in literature, the Sobol indices have attracted much attention since they can provide accurate information for most models. In this paper, a mixed kernel function (MKF) based support vector regression (SVR) model is employed to evaluate the Sobol indices at low computational cost. By the proposed derivation, the estimation of the Sobol indices can be obtained by post-processing the coefficients of the SVR meta-model. The MKF is constituted by the orthogonal polynomials kernel function and Gaussian radial basis kernel function, thus the MKF possesses both the global characteristic advantage of the polynomials kernel function and the local characteristic advantage of the Gaussian radial basis kernel function. The proposed approach is suitable for high-dimensional and non-linear problems. Performance of the proposed approach is validated by various analytical functions and compared with the popular polynomial chaos expansion (PCE). Results demonstrate that the proposed approach is an efficient method for global sensitivity analysis.

  14. Assessing the potential of support vector machine for estimating daily solar radiation using sunshine duration

    International Nuclear Information System (INIS)

    Chen, Ji-Long; Li, Guo-Sheng; Wu, Sheng-Jun

    2013-01-01

    Highlights: • Support vector machine is used to estimate daily solar radiation from sunshine duration. • Seven SVM models using different input attributes are evaluated using 35 years long term data. • SVM models significantly outperform the empirical models. • The optimal SVM model is proposed. - Abstract: Estimation of solar radiation from sunshine duration offers an important alternative in the absence of measured solar radiation. However, due to the dynamic nature of atmosphere, accurate estimation of daily solar radiation has been being a challenging task. This paper presents an application of Support vector machine (SVM) to estimation of daily solar radiation using sunshine duration. Seven SVM models using different input attributes and five empirical sunshine-based models are evaluated using meteorological data at three stations in Liaoning province in China. All the SVM models give good performances and significantly outperform the empirical models. The newly developed model, SVM1 using sunshine ratio as input attribute, is preferred due to its greater accuracy and simple input attribute. It performs better in winter, while highest root mean square error and relative root mean square error are obtained in summer. The season-dependent SVM model is superior to the fixed model in estimation of daily solar radiation for winter, while consideration of seasonal variation of the data sets cannot improve the results for spring, summer and autumn. Moreover, daily solar radiation could be well estimated by SVM1 using the data from nearby stations. The results indicate that the SVM method would be a promising alternative over the traditional approaches for estimation of daily solar radiation

  15. Urban Heat Island Growth Modeling Using Artificial Neural Networks and Support Vector Regression: A case study of Tehran, Iran

    Science.gov (United States)

    Sherafati, Sh. A.; Saradjian, M. R.; Niazmardi, S.

    2013-09-01

    Numerous investigations on Urban Heat Island (UHI) show that land cover change is the main factor of increasing Land Surface Temperature (LST) in urban areas. Therefore, to achieve a model which is able to simulate UHI growth, urban expansion should be concerned first. Considerable researches on urban expansion modeling have been done based on cellular automata. Accordingly the objective of this paper is to implement CA method for trend detection of Tehran UHI spatiotemporal growth based on urban sprawl parameters (such as Distance to nearest road, Digital Elevation Model (DEM), Slope and Aspect ratios). It should be mentioned that UHI growth modeling may have more complexities in comparison with urban expansion, since the amount of each pixel's temperature should be investigated instead of its state (urban and non-urban areas). The most challenging part of CA model is the definition of Transfer Rules. Here, two methods have used to find appropriate transfer Rules which are Artificial Neural Networks (ANN) and Support Vector Regression (SVR). The reason of choosing these approaches is that artificial neural networks and support vector regression have significant abilities to handle the complications of such a spatial analysis in comparison with other methods like Genetic or Swarm intelligence. In this paper, UHI change trend has discussed between 1984 and 2007. For this purpose, urban sprawl parameters in 1984 have calculated and added to the retrieved LST of this year. In order to achieve LST, Thematic Mapper (TM) and Enhanced Thematic Mapper (ETM+) night-time images have exploited. The reason of implementing night-time images is that UHI phenomenon is more obvious during night hours. After that multilayer feed-forward neural networks and support vector regression have used separately to find the relationship between this data and the retrieved LST in 2007. Since the transfer rules might not be the same in different regions, the satellite image of the city has

  16. Differentiation of Glioblastoma and Lymphoma Using Feature Extraction and Support Vector Machine.

    Science.gov (United States)

    Yang, Zhangjing; Feng, Piaopiao; Wen, Tian; Wan, Minghua; Hong, Xunning

    2017-01-01

    Differentiation of glioblastoma multiformes (GBMs) and lymphomas using multi-sequence magnetic resonance imaging (MRI) is an important task that is valuable for treatment planning. However, this task is a challenge because GBMs and lymphomas may have a similar appearance in MRI images. This similarity may lead to misclassification and could affect the treatment results. In this paper, we propose a semi-automatic method based on multi-sequence MRI to differentiate these two types of brain tumors. Our method consists of three steps: 1) the key slice is selected from 3D MRIs and region of interests (ROIs) are drawn around the tumor region; 2) different features are extracted based on prior clinical knowledge and validated using a t-test; and 3) features that are helpful for classification are used to build an original feature vector and a support vector machine is applied to perform classification. In total, 58 GBM cases and 37 lymphoma cases are used to validate our method. A leave-one-out crossvalidation strategy is adopted in our experiments. The global accuracy of our method was determined as 96.84%, which indicates that our method is effective for the differentiation of GBM and lymphoma and can be applied in clinical diagnosis. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  17. Numerical Control Machine Tool Fault Diagnosis Using Hybrid Stationary Subspace Analysis and Least Squares Support Vector Machine with a Single Sensor

    Directory of Open Access Journals (Sweden)

    Chen Gao

    2017-03-01

    Full Text Available Tool fault diagnosis in numerical control (NC machines plays a significant role in ensuring manufacturing quality. However, current methods of tool fault diagnosis lack accuracy. Therefore, in the present paper, a fault diagnosis method was proposed based on stationary subspace analysis (SSA and least squares support vector machine (LS-SVM using only a single sensor. First, SSA was used to extract stationary and non-stationary sources from multi-dimensional signals without the need for independency and without prior information of the source signals, after the dimensionality of the vibration signal observed by a single sensor was expanded by phase space reconstruction technique. Subsequently, 10 dimensionless parameters in the time-frequency domain for non-stationary sources were calculated to generate samples to train the LS-SVM. Finally, the measured vibration signals from tools of an unknown state and their non-stationary sources were separated by SSA to serve as test samples for the trained SVM. The experimental validation demonstrated that the proposed method has better diagnosis accuracy than three previous methods based on LS-SVM alone, Principal component analysis and LS-SVM or on SSA and Linear discriminant analysis.

  18. Prediction of endoplasmic reticulum resident proteins using fragmented amino acid composition and support vector machine

    Directory of Open Access Journals (Sweden)

    Ravindra Kumar

    2017-09-01

    Full Text Available Background The endoplasmic reticulum plays an important role in many cellular processes, which includes protein synthesis, folding and post-translational processing of newly synthesized proteins. It is also the site for quality control of misfolded proteins and entry point of extracellular proteins to the secretory pathway. Hence at any given point of time, endoplasmic reticulum contains two different cohorts of proteins, (i proteins involved in endoplasmic reticulum-specific function, which reside in the lumen of the endoplasmic reticulum, called as endoplasmic reticulum resident proteins and (ii proteins which are in process of moving to the extracellular space. Thus, endoplasmic reticulum resident proteins must somehow be distinguished from newly synthesized secretory proteins, which pass through the endoplasmic reticulum on their way out of the cell. Approximately only 50% of the proteins used in this study as training data had endoplasmic reticulum retention signal, which shows that these signals are not essentially present in all endoplasmic reticulum resident proteins. This also strongly indicates the role of additional factors in retention of endoplasmic reticulum-specific proteins inside the endoplasmic reticulum. Methods This is a support vector machine based method, where we had used different forms of protein features as inputs for support vector machine to develop the prediction models. During training leave-one-out approach of cross-validation was used. Maximum performance was obtained with a combination of amino acid compositions of different part of proteins. Results In this study, we have reported a novel support vector machine based method for predicting endoplasmic reticulum resident proteins, named as ERPred. During training we achieved a maximum accuracy of 81.42% with leave-one-out approach of cross-validation. When evaluated on independent dataset, ERPred did prediction with sensitivity of 72.31% and specificity of 83

  19. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.

    Science.gov (United States)

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano

    2015-06-17

    Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation-based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking-to reduce the dimensions of images-and binarization-to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.

  20. Machine learning of swimming data via wisdom of crowd and regression analysis.

    Science.gov (United States)

    Xie, Jiang; Xu, Junfu; Nie, Celine; Nie, Qing

    2017-04-01

    Every performance, in an officially sanctioned meet, by a registered USA swimmer is recorded into an online database with times dating back to 1980. For the first time, statistical analysis and machine learning methods are systematically applied to 4,022,631 swim records. In this study, we investigate performance features for all strokes as a function of age and gender. The variances in performance of males and females for different ages and strokes were studied, and the correlations of performances for different ages were estimated using the Pearson correlation. Regression analysis show the performance trends for both males and females at different ages and suggest critical ages for peak training. Moreover, we assess twelve popular machine learning methods to predict or classify swimmer performance. Each method exhibited different strengths or weaknesses in different cases, indicating no one method could predict well for all strokes. To address this problem, we propose a new method by combining multiple inference methods to derive Wisdom of Crowd Classifier (WoCC). Our simulation experiments demonstrate that the WoCC is a consistent method with better overall prediction accuracy. Our study reveals several new age-dependent trends in swimming and provides an accurate method for classifying and predicting swimming times.

  1. Voice based gender classification using machine learning

    Science.gov (United States)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problem speech analysis today. Tracing the gender from acoustic data i.e., pitch, median, frequency etc. Machine learning gives promising results for classification problem in all the research domains. There are several performance metrics to evaluate algorithms of an area. Our Comparative model algorithm for evaluating 5 different machine learning algorithms based on eight different metrics in gender classification from acoustic data. Agenda is to identify gender, with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM) on basis of eight different metrics. The main parameter in evaluating any algorithms is its performance. Misclassification rate must be less in classification problems, which says that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. Here with this comparative model algorithm, we are trying to assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  2. Pengaruh Jumlah Kelas dan Skema Klasifikasi terhadap Akurasi Informasi Penggunaan Lahan Hasil Klasifikasi Berbasis Objek dengan Teknik Support Vector Machine di Sebagian Kabupaten Kebumen Provinsi Jawa Tengah

    Directory of Open Access Journals (Sweden)

    Aria Jaka Dwiputra

    2016-10-01

    Full Text Available Produksi peta penggunaan lahan dari data geospasial satelit harus memperhatikan resolusi spasial yang digunakan, dengan menggunakan ilmu penginderaan jauh secara digital akurasi informasi geospasial tematik yang diperoleh dari data geospasial satelit dapat diukur secara kuantitatif. Jumlah kelas penggunaan lahan, skema klasifikasi, dan teknik ekstraksi informasi akan berpengaruh dalam overall accuracy.Penelitian ini menggunakan tiga variasi jumlah kelas dan skema klasifikasi yaitu 4, 7, dan 10 kelas penggunaan lahan. Tiga variasi tersebut diekstraksi dari citra ALOS AVNIR – 2 dengan resolusi 10 meter menggunakan metode klasifikasi berbasis objek dengan pendekatan support vector machine. Sesuatu yang lain dari klasifikasi berbasis objek adalah proses segmentasi yang mengelompokkan objek tutupan lahan dalam satu bagian, dan algoritma SVM mengklasifikasikan citra segmentasi dengan memanfaatkan empat tipe kernel yaitu linier, radial basis function, sigmoid, dan polynomial. Hasil ekstraksi informasi penggunaan lahan untuk jumlah kelas empat menggunakan klasifikasi berbasis objek dengan pendekatan support vector machine memiliki akurasi sebesar 87.2666% serta nilai koefesien kappa 0.8048 dan tipe kernel yang dipakai adalah kernel Linier. Hasil ekstraksi informasi penggunaan lahan untuk jumlah kelas tujuh menggunakan klasifikasi berbasis objek dengan pendekatan support vector machine memiliki akurasi sebesar 79.8021% serta nilai koefesien kappa 0.7293 dan tipe kernel yang dipakai adalah kernel Linier. Hasil ekstraksi informasi penggunaan lahan untuk jumlah kelas sepuluh menggunakan klasifikasi berbasis objek dengan pendekatan support vector machine memiliki akurasi sebesar 73.3377% serta nilai koefesien kappa 0.6466 dan tipe kernel yang dipakai adalah kernel Linier. Hasil ekstraksi informasi penggunaan lahan untuk jumlah kelas sepuluh menggunakan klasifikasi berbasis objek dengan pendekatan support vector machine dan ditambah dengan data bantu berupa

  3. Taxi-Out Time Prediction for Departures at Charlotte Airport Using Machine Learning Techniques

    Science.gov (United States)

    Lee, Hanbong; Malik, Waqar; Jung, Yoon C.

    2016-01-01

    Predicting the taxi-out times of departures accurately is important for improving airport efficiency and takeoff time predictability. In this paper, we attempt to apply machine learning techniques to actual traffic data at Charlotte Douglas International Airport for taxi-out time prediction. To find the key factors affecting aircraft taxi times, surface surveillance data is first analyzed. From this data analysis, several variables, including terminal concourse, spot, runway, departure fix and weight class, are selected for taxi time prediction. Then, various machine learning methods such as linear regression, support vector machines, k-nearest neighbors, random forest, and neural networks model are applied to actual flight data. Different traffic flow and weather conditions at Charlotte airport are also taken into account for more accurate prediction. The taxi-out time prediction results show that linear regression and random forest techniques can provide the most accurate prediction in terms of root-mean-square errors. We also discuss the operational complexity and uncertainties that make it difficult to predict the taxi times accurately.

  4. Patients on weaning trials classified with support vector machines

    International Nuclear Information System (INIS)

    Garde, Ainara; Caminal, Pere; Giraldo, Beatriz F; Schroeder, Rico; Voss, Andreas; Benito, Salvador

    2010-01-01

    The process of discontinuing mechanical ventilation is called weaning and is one of the most challenging problems in intensive care. An unnecessary delay in the discontinuation process and an early weaning trial are undesirable. This study aims to characterize the respiratory pattern through features that permit the identification of patients' conditions in weaning trials. Three groups of patients have been considered: 94 patients with successful weaning trials, who could maintain spontaneous breathing after 48 h (GSucc); 39 patients who failed the weaning trial (GFail) and 21 patients who had successful weaning trials, but required reintubation in less than 48 h (GRein). Patients are characterized by their cardiorespiratory interactions, which are described by joint symbolic dynamics (JSD) applied to the cardiac interbeat and breath durations. The most discriminating features in the classification of the different groups of patients (GSucc, GFail and GRein) are identified by support vector machines (SVMs). The SVM-based feature selection algorithm has an accuracy of 81% in classifying GSucc versus the rest of the patients, 83% in classifying GRein versus GSucc patients and 81% in classifying GRein versus the rest of the patients. Moreover, a good balance between sensitivity and specificity is achieved in all classifications

  5. Machine learning of the reactor core loading pattern critical parameters

    International Nuclear Information System (INIS)

    Trontl, K.; Pevec, D.; Smuc, T.

    2007-01-01

    The usual approach to loading pattern optimization involves high degree of engineering judgment, a set of heuristic rules, an optimization algorithm and a computer code used for evaluating proposed loading patterns. The speed of the optimization process is highly dependent on the computer code used for the evaluation. In this paper we investigate the applicability of a machine learning model which could be used for fast loading pattern evaluation. We employed a recently introduced machine learning technique, Support Vector Regression (SVR), which has a strong theoretical background in statistical learning theory. Superior empirical performance of the method has been reported on difficult regression problems in different fields of science and technology. SVR is a data driven, kernel based, nonlinear modelling paradigm, in which model parameters are automatically determined by solving a quadratic optimization problem. The main objective of the work reported in this paper was to evaluate the possibility of applying SVR method for reactor core loading pattern modelling. The starting set of experimental data for training and testing of the machine learning algorithm was obtained using a two-dimensional diffusion theory reactor physics computer code. We illustrate the performance of the solution and discuss its applicability, i.e., complexity, speed and accuracy, with a projection to a more realistic scenario involving machine learning from the results of more accurate and time consuming three-dimensional core modelling code. (author)

  6. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    Directory of Open Access Journals (Sweden)

    Saerom Park

    Full Text Available Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide the predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a-state-of-the-art benchmark parametric model such as I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance.

  7. Application of support vector machine modeling for prediction of common diseases: the case of diabetes and pre-diabetes

    OpenAIRE

    Yu, Wei; Liu, Tiebin; Valdez, Rodolfo; Gwinn, Marta; Khoury, Muin J

    2010-01-01

    Abstract Background We present a potentially useful alternative approach based on support vector machine (SVM) techniques to classify persons with and without common diseases. We illustrate the method to detect persons with diabetes and pre-diabetes in a cross-sectional representative sample of the U.S. population. Methods We used data from the 1999-2004 National Health and Nutrition Examination Survey (NHANES) to develop and validate SVM models for two classification schemes: Classification ...

  8. A machine learning model with human cognitive biases capable of learning from small and biased datasets.

    Science.gov (United States)

    Taniguchi, Hidetaka; Sato, Hiroshi; Shirakawa, Tomohiro

    2018-05-09

    Human learners can generalize a new concept from a small number of samples. In contrast, conventional machine learning methods require large amounts of data to address the same types of problems. Humans have cognitive biases that promote fast learning. Here, we developed a method to reduce the gap between human beings and machines in this type of inference by utilizing cognitive biases. We implemented a human cognitive model into machine learning algorithms and compared their performance with the currently most popular methods, naïve Bayes, support vector machine, neural networks, logistic regression and random forests. We focused on the task of spam classification, which has been studied for a long time in the field of machine learning and often requires a large amount of data to obtain high accuracy. Our models achieved superior performance with small and biased samples in comparison with other representative machine learning methods.

  9. A Novel Hybrid Model Based on Extreme Learning Machine, k-Nearest Neighbor Regression and Wavelet Denoising Applied to Short-Term Electric Load Forecasting

    Directory of Open Access Journals (Sweden)

    Weide Li

    2017-05-01

    Full Text Available Electric load forecasting plays an important role in electricity markets and power systems. Because electric load time series are complicated and nonlinear, it is very difficult to achieve a satisfactory forecasting accuracy. In this paper, a hybrid model, Wavelet Denoising-Extreme Learning Machine optimized by k-Nearest Neighbor Regression (EWKM, which combines k-Nearest Neighbor (KNN and Extreme Learning Machine (ELM based on a wavelet denoising technique is proposed for short-term load forecasting. The proposed hybrid model decomposes the time series into a low frequency-associated main signal and some detailed signals associated with high frequencies at first, then uses KNN to determine the independent and dependent variables from the low-frequency signal. Finally, the ELM is used to get the non-linear relationship between these variables to get the final prediction result for the electric load. Compared with three other models, Extreme Learning Machine optimized by k-Nearest Neighbor Regression (EKM, Wavelet Denoising-Extreme Learning Machine (WKM and Wavelet Denoising-Back Propagation Neural Network optimized by k-Nearest Neighbor Regression (WNNM, the model proposed in this paper can improve the accuracy efficiently. New South Wales is the economic powerhouse of Australia, so we use the proposed model to predict electric demand for that region. The accurate prediction has a significant meaning.

  10. Prediction of Baseflow Index of Catchments using Machine Learning Algorithms

    Science.gov (United States)

    Yadav, B.; Hatfield, K.

    2017-12-01

    We present the results of eight machine learning techniques for predicting the baseflow index (BFI) of ungauged basins using a surrogate of catchment scale climate and physiographic data. The tested algorithms include ordinary least squares, ridge regression, least absolute shrinkage and selection operator (lasso), elasticnet, support vector machine, gradient boosted regression trees, random forests, and extremely randomized trees. Our work seeks to identify the dominant controls of BFI that can be readily obtained from ancillary geospatial databases and remote sensing measurements, such that the developed techniques can be extended to ungauged catchments. More than 800 gauged catchments spanning the continental United States were selected to develop the general methodology. The BFI calculation was based on the baseflow separated from daily streamflow hydrograph using HYSEP filter. The surrogate catchment attributes were compiled from multiple sources including digital elevation model, soil, landuse, climate data, other publicly available ancillary and geospatial data. 80% catchments were used to train the ML algorithms, and the remaining 20% of the catchments were used as an independent test set to measure the generalization performance of fitted models. A k-fold cross-validation using exhaustive grid search was used to fit the hyperparameters of each model. Initial model development was based on 19 independent variables, but after variable selection and feature ranking, we generated revised sparse models of BFI prediction that are based on only six catchment attributes. These key predictive variables selected after the careful evaluation of bias-variance tradeoff include average catchment elevation, slope, fraction of sand, permeability, temperature, and precipitation. The most promising algorithms exceeding an accuracy score (r-square) of 0.7 on test data include support vector machine, gradient boosted regression trees, random forests, and extremely randomized

  11. Supplier Short Term Load Forecasting Using Support Vector Regression and Exogenous Input

    Science.gov (United States)

    Matijaš, Marin; Vukićcević, Milan; Krajcar, Slavko

    2011-09-01

    In power systems, task of load forecasting is important for keeping equilibrium between production and consumption. With liberalization of electricity markets, task of load forecasting changed because each market participant has to forecast their own load. Consumption of end-consumers is stochastic in nature. Due to competition, suppliers are not in a position to transfer their costs to end-consumers; therefore it is essential to keep forecasting error as low as possible. Numerous papers are investigating load forecasting from the perspective of the grid or production planning. We research forecasting models from the perspective of a supplier. In this paper, we investigate different combinations of exogenous input on the simulated supplier loads and show that using points of delivery as a feature for Support Vector Regression leads to lower forecasting error, while adding customer number in different datasets does the opposite.

  12. A Support Vector Machine Approach for Truncated Fingerprint Image Detection from Sweeping Fingerprint Sensors

    Science.gov (United States)

    Chen, Chi-Jim; Pai, Tun-Wen; Cheng, Mox

    2015-01-01

    A sweeping fingerprint sensor converts fingerprints on a row by row basis through image reconstruction techniques. However, a built fingerprint image might appear to be truncated and distorted when the finger was swept across a fingerprint sensor at a non-linear speed. If the truncated fingerprint images were enrolled as reference targets and collected by any automated fingerprint identification system (AFIS), successful prediction rates for fingerprint matching applications would be decreased significantly. In this paper, a novel and effective methodology with low time computational complexity was developed for detecting truncated fingerprints in a real time manner. Several filtering rules were implemented to validate existences of truncated fingerprints. In addition, a machine learning method of supported vector machine (SVM), based on the principle of structural risk minimization, was applied to reject pseudo truncated fingerprints containing similar characteristics of truncated ones. The experimental result has shown that an accuracy rate of 90.7% was achieved by successfully identifying truncated fingerprint images from testing images before AFIS enrollment procedures. The proposed effective and efficient methodology can be extensively applied to all existing fingerprint matching systems as a preliminary quality control prior to construction of fingerprint templates. PMID:25835186

  13. Fault diagnosis of automobile hydraulic brake system using statistical features and support vector machines

    Science.gov (United States)

    Jegadeeshwaran, R.; Sugumaran, V.

    2015-02-01

    Hydraulic brakes in automobiles are important components for the safety of passengers; therefore, the brakes are a good subject for condition monitoring. The condition of the brake components can be monitored by using the vibration characteristics. On-line condition monitoring by using machine learning approach is proposed in this paper as a possible solution to such problems. The vibration signals for both good as well as faulty conditions of brakes were acquired from a hydraulic brake test setup with the help of a piezoelectric transducer and a data acquisition system. Descriptive statistical features were extracted from the acquired vibration signals and the feature selection was carried out using the C4.5 decision tree algorithm. There is no specific method to find the right number of features required for classification for a given problem. Hence an extensive study is needed to find the optimum number of features. The effect of the number of features was also studied, by using the decision tree as well as Support Vector Machines (SVM). The selected features were classified using the C-SVM and Nu-SVM with different kernel functions. The results are discussed and the conclusion of the study is presented.

  14. Support vector machines for prediction and analysis of beta and gamma-turns in proteins.

    Science.gov (United States)

    Pham, Tho Hoan; Satou, Kenji; Ho, Tu Bao

    2005-04-01

    Tight turns have long been recognized as one of the three important features of proteins, together with alpha-helix and beta-sheet. Tight turns play an important role in globular proteins from both the structural and functional points of view. More than 90% tight turns are beta-turns and most of the rest are gamma-turns. Analysis and prediction of beta-turns and gamma-turns is very useful for design of new molecules such as drugs, pesticides, and antigens. In this paper we investigated two aspects of applying support vector machine (SVM), a promising machine learning method for bioinformatics, to prediction and analysis of beta-turns and gamma-turns. First, we developed two SVM-based methods, called BTSVM and GTSVM, which predict beta-turns and gamma-turns in a protein from its sequence. When compared with other methods, BTSVM has a superior performance and GTSVM is competitive. Second, we used SVMs with a linear kernel to estimate the support of amino acids for the formation of beta-turns and gamma-turns depending on their position in a protein. Our analysis results are more comprehensive and easier to use than the previous results in designing turns in proteins.

  15. A Support Vector Machine Approach for Truncated Fingerprint Image Detection from Sweeping Fingerprint Sensors

    Directory of Open Access Journals (Sweden)

    Chi-Jim Chen

    2015-03-01

    Full Text Available A sweeping fingerprint sensor converts fingerprints on a row by row basis through image reconstruction techniques. However, a built fingerprint image might appear to be truncated and distorted when the finger was swept across a fingerprint sensor at a non-linear speed. If the truncated fingerprint images were enrolled as reference targets and collected by any automated fingerprint identification system (AFIS, successful prediction rates for fingerprint matching applications would be decreased significantly. In this paper, a novel and effective methodology with low time computational complexity was developed for detecting truncated fingerprints in a real time manner. Several filtering rules were implemented to validate existences of truncated fingerprints. In addition, a machine learning method of supported vector machine (SVM, based on the principle of structural risk minimization, was applied to reject pseudo truncated fingerprints containing similar characteristics of truncated ones. The experimental result has shown that an accuracy rate of 90.7% was achieved by successfully identifying truncated fingerprint images from testing images before AFIS enrollment procedures. The proposed effective and efficient methodology can be extensively applied to all existing fingerprint matching systems as a preliminary quality control prior to construction of fingerprint templates.

  16. Stroke localization and classification using microwave tomography with k-means clustering and support vector machine.

    Science.gov (United States)

    Guo, Lei; Abbosh, Amin

    2018-05-01

    For any chance for stroke patients to survive, the stroke type should be classified to enable giving medication within a few hours of the onset of symptoms. In this paper, a microwave-based stroke localization and classification framework is proposed. It is based on microwave tomography, k-means clustering, and a support vector machine (SVM) method. The dielectric profile of the brain is first calculated using the Born iterative method, whereas the amplitude of the dielectric profile is then taken as the input to k-means clustering. The cluster is selected as the feature vector for constructing and testing the SVM. A database of MRI-derived realistic head phantoms at different signal-to-noise ratios is used in the classification procedure. The performance of the proposed framework is evaluated using the receiver operating characteristic (ROC) curve. The results based on a two-dimensional framework show that 88% classification accuracy, with a sensitivity of 91% and a specificity of 87%, can be achieved. Bioelectromagnetics. 39:312-324, 2018. © 2018 Wiley Periodicals, Inc. © 2018 Wiley Periodicals, Inc.

  17. Settlement Prediction of Road Soft Foundation Using a Support Vector Machine (SVM Based on Measured Data

    Directory of Open Access Journals (Sweden)

    Yu Huiling

    2016-01-01

    Full Text Available The suppor1t vector machine (SVM is a relatively new artificial intelligence technique which is increasingly being applied to geotechnical problems and is yielding encouraging results. SVM is a new machine learning method based on the statistical learning theory. A case study based on road foundation engineering project shows that the forecast results are in good agreement with the measured data. The SVM model is also compared with BP artificial neural network model and traditional hyperbola method. The prediction results indicate that the SVM model has a better prediction ability than BP neural network model and hyperbola method. Therefore, settlement prediction based on SVM model can reflect actual settlement process more correctly. The results indicate that it is effective and feasible to use this method and the nonlinear mapping relation between foundation settlement and its influence factor can be expressed well. It will provide a new method to predict foundation settlement.

  18. Estimating chlorophyll with thermal and broadband multispectral high resolution imagery from an unmanned aerial system using relevance vector machines for precision agriculture

    Science.gov (United States)

    Elarab, Manal; Ticlavilca, Andres M.; Torres-Rua, Alfonso F.; Maslova, Inga; McKee, Mac

    2015-12-01

    Precision agriculture requires high-resolution information to enable greater precision in the management of inputs to production. Actionable information about crop and field status must be acquired at high spatial resolution and at a temporal frequency appropriate for timely responses. In this study, high spatial resolution imagery was obtained through the use of a small, unmanned aerial system called AggieAirTM. Simultaneously with the AggieAir flights, intensive ground sampling for plant chlorophyll was conducted at precisely determined locations. This study reports the application of a relevance vector machine coupled with cross validation and backward elimination to a dataset composed of reflectance from high-resolution multi-spectral imagery (VIS-NIR), thermal infrared imagery, and vegetative indices, in conjunction with in situ SPAD measurements from which chlorophyll concentrations were derived, to estimate chlorophyll concentration from remotely sensed data at 15-cm resolution. The results indicate that a relevance vector machine with a thin plate spline kernel type and kernel width of 5.4, having LAI, NDVI, thermal and red bands as the selected set of inputs, can be used to spatially estimate chlorophyll concentration with a root-mean-squared-error of 5.31 μg cm-2, efficiency of 0.76, and 9 relevance vectors.

  19. Modeling and Forecast Biological Oxygen Demand (BOD using Combination Support Vector Machine with Wavelet Transform

    Directory of Open Access Journals (Sweden)

    Abazar Solgi

    2017-06-01

    Full Text Available Introduction: Chemical pollution of surface water is one of the serious issues that threaten the quality of water. This would be more important when the surface waters used for human drinking supply. One of the key parameters used to measure water pollution is BOD. Because many variables affect the water quality parameters and a complex nonlinear relationship between them is established conventional methods can not solve the problem of quality management of water resources. For years, the Artificial Intelligence methods were used for prediction of nonlinear time series and a good performance of them has been reported. Recently, the wavelet transform that is a signal processing method, has shown good performance in hydrological modeling and is widely used. Extensive research has been globally provided in use of Artificial Neural Network and Adaptive Neural Fuzzy Inference System models to forecast the BOD. But support vector machine has not yet been extensively studied. For this purpose, in this study the ability of support vector machine to predict the monthly BOD parameter based on the available data, temperature, river flow, DO and BOD was evaluated. Materials and Methods: SVM was introduced in 1992 by Vapnik that was a Russian mathematician. This method has been built based on the statistical learning theory. In recent years the use of SVM, is highly taken into consideration. SVM was used in applications such as handwriting recognition, face recognition and has good results. Linear SVM is simplest type of SVM, consists of a hyperplane that dataset of positive and negative is separated with maximum distance. The suitable separator has maximum distance from every one of two dataset. So about this machine that its output groups label (here -1 to +1, the aim is to obtain the maximum distance between categories. This is interpreted to have a maximum margin. Wavelet transform is one of methods in the mathematical science that its main idea was

  20. Support vector machine based estimation of remaining useful life: current research status and future trends

    International Nuclear Information System (INIS)

    Huang, Hong Zhong; Wang, Hai Kun; Li, Yan Feng; Zhang, Longlong; Liu, Zhiliang

    2015-01-01

    Estimation of remaining useful life (RUL) is helpful to manage life cycles of machines and to reduce maintenance cost. Support vector machine (SVM) is a promising algorithm for estimation of RUL because it can easily process small training sets and multi-dimensional data. Many SVM based methods have been proposed to predict RUL of some key components. We did a literature review related to SVM based RUL estimation within a decade. The references reviewed are classified into two categories: improved SVM algorithms and their applications to RUL estimation. The latter category can be further divided into two types: one, to predict the condition state in the future and then build a relationship between state and RUL; two, to establish a direct relationship between current state and RUL. However, SVM is seldom used to track the degradation process and build an accurate relationship between the current health condition state and RUL. Based on the above review and summary, this paper points out that the ability to continually improve SVM, and obtain a novel idea for RUL prediction using SVM will be future works.