Deep Support Vector Machines for Regression Problems
Wiering, Marco; Schutten, Marten; Millea, Adrian; Meijster, Arnold; Schomaker, Lambertus
2013-01-01
In this paper we describe a novel extension of the support vector machine, called the deep support vector machine (DSVM). The original SVM has a single layer with kernel functions and is therefore a shallow model. The DSVM can use an arbitrary number of layers, in which lower-level layers contain
Phase Space Prediction of Chaotic Time Series with Nu-Support Vector Machine Regression
International Nuclear Information System (INIS)
Ye Meiying; Wang Xiaodong
2005-01-01
A new class of support vector machine, nu-support vector machine, is discussed which can handle both classification and regression. We focus on nu-support vector machine regression and use it for phase space prediction of chaotic time series. The effectiveness of the method is demonstrated by applying it to the Henon map. This study also compares nu-support vector machine with back propagation (BP) networks in order to better evaluate the performance of the proposed methods. The experimental results show that the nu-support vector machine regression obtains lower root mean squared error than the BP networks and provides an accurate chaotic time series prediction. These results can be attributable to the fact that nu-support vector machine implements the structural risk minimization principle and this leads to better generalization than the BP networks.
Zhou, Lim Yi; Shan, Fam Pei; Shimizu, Kunio; Imoto, Tomoaki; Lateh, Habibah; Peng, Koay Swee
2017-08-01
A comparative study of logistic regression, support vector machine (SVM) and least square support vector machine (LSSVM) models has been done to predict the slope failure (landslide) along East-West Highway (Gerik-Jeli). The effects of two monsoon seasons (southwest and northeast) that occur in Malaysia are considered in this study. Two related factors of occurrence of slope failure are included in this study: rainfall and underground water. For each method, two predictive models are constructed, namely SOUTHWEST and NORTHEAST models. Based on the results obtained from logistic regression models, two factors (rainfall and underground water level) contribute to the occurrence of slope failure. The accuracies of the three statistical models for two monsoon seasons are verified by using Relative Operating Characteristics curves. The validation results showed that all models produced prediction of high accuracy. For the results of SVM and LSSVM, the models using RBF kernel showed better prediction compared to the models using linear kernel. The comparative results showed that, for SOUTHWEST models, three statistical models have relatively similar performance. For NORTHEAST models, logistic regression has the best predictive efficiency whereas the SVM model has the second best predictive efficiency.
Digital Repository Service at National Institute of Oceanography (India)
Patil, S.G.; Mandal, S.; Hegde, A; Muruganandam, A
, it is noticed that one particular model in isolation cannot capture all data patterns easily. In the present paper, a hybrid genetic algorithm tuned support vector machine regression (HGASVMR) model was developed to predict wave transmission of horizontally...
Optimization of Filter by using Support Vector Regression Machine with Cuckoo Search Algorithm
İlarslan, M.; Demirel, S.; Torpi, H.; Keskin, A. K.; Çağlar, M. F.
2014-01-01
Herein, a new methodology using a 3D Electromagnetic (EM) simulator-based Support Vector Regression Machine (SVRM) models of base elements is presented for band-pass filter (BPF) design. SVRM models of elements, which are as fast as analytical equations and as accurate as a 3D EM simulator, are employed in a simple and efficient Cuckoo Search Algorithm (CSA) to optimize an ultra-wideband (UWB) microstrip BPF. CSA performance is verified by comparing it with other Meta-Heuristics such as Genet...
Directory of Open Access Journals (Sweden)
Suduan Chen
2014-01-01
Full Text Available As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%.
International Nuclear Information System (INIS)
Baser, Furkan; Demirhan, Haydar
2017-01-01
Accurate estimation of the amount of horizontal global solar radiation for a particular field is an important input for decision processes in solar radiation investments. In this article, we focus on the estimation of yearly mean daily horizontal global solar radiation by using an approach that utilizes fuzzy regression functions with support vector machine (FRF-SVM). This approach is not seriously affected by outlier observations and does not suffer from the over-fitting problem. To demonstrate the utility of the FRF-SVM approach in the estimation of horizontal global solar radiation, we conduct an empirical study over a dataset collected in Turkey and applied the FRF-SVM approach with several kernel functions. Then, we compare the estimation accuracy of the FRF-SVM approach to an adaptive neuro-fuzzy system and a coplot supported-genetic programming approach. We observe that the FRF-SVM approach with a Gaussian kernel function is not affected by both outliers and over-fitting problem and gives the most accurate estimates of horizontal global solar radiation among the applied approaches. Consequently, the use of hybrid fuzzy functions and support vector machine approaches is found beneficial in long-term forecasting of horizontal global solar radiation over a region with complex climatic and terrestrial characteristics. - Highlights: • A fuzzy regression functions with support vector machines approach is proposed. • The approach is robust against outlier observations and over-fitting problem. • Estimation accuracy of the model is superior to several existent alternatives. • A new solar radiation estimation model is proposed for the region of Turkey. • The model is useful under complex terrestrial and climatic conditions.
Prediction of sweetness by multilinear regression analysis and support vector machine.
Zhong, Min; Chong, Yang; Nie, Xianglei; Yan, Aixia; Yuan, Qipeng
2013-09-01
The sweetness of a compound is of large interest for the food additive industry. In this work, 2 quantitative models were built to predict the logSw (the logarithm of sweetness) of 320 unique compounds with a molecular weight from 132 to 1287 and a sweetness from 22 to 22500000. The whole dataset was randomly split into a training set including 214 compounds and a test set including 106 compounds, represented by 12 selected molecular descriptors. Then, logSw was predicted using a multilinear regression (MLR) analysis and a support vector machine (SVM). For the test set, the correlation coefficients of 0.87 and 0.88 were obtained by MLR and SVM, respectively. The descriptors found in our quantitative structure-activity relationship models are prone to a structural interpretation and support the AH/B System model proposed by Shallenberger and Acree. © 2013 Institute of Food Technologists®
Optimization of Filter by using Support Vector Regression Machine with Cuckoo Search Algorithm
Directory of Open Access Journals (Sweden)
M. İlarslan
2014-09-01
Full Text Available Herein, a new methodology using a 3D Electromagnetic (EM simulator-based Support Vector Regression Machine (SVRM models of base elements is presented for band-pass filter (BPF design. SVRM models of elements, which are as fast as analytical equations and as accurate as a 3D EM simulator, are employed in a simple and efficient Cuckoo Search Algorithm (CSA to optimize an ultra-wideband (UWB microstrip BPF. CSA performance is verified by comparing it with other Meta-Heuristics such as Genetic Algorithm (GA and Particle Swarm Optimization (PSO. As an example of the proposed design methodology, an UWB BPF that operates between the frequencies of 3.1 GHz and 10.6 GHz is designed, fabricated and measured. The simulation and measurement results indicate in conclusion the superior performance of this optimization methodology in terms of improved filter response characteristics like return loss, insertion loss, harmonic suppression and group delay.
Failure and reliability prediction by support vector machines regression of time series data
International Nuclear Information System (INIS)
Chagas Moura, Marcio das; Zio, Enrico; Lins, Isis Didier; Droguett, Enrique
2011-01-01
Support Vector Machines (SVMs) are kernel-based learning methods, which have been successfully adopted for regression problems. However, their use in reliability applications has not been widely explored. In this paper, a comparative analysis is presented in order to evaluate the SVM effectiveness in forecasting time-to-failure and reliability of engineered components based on time series data. The performance on literature case studies of SVM regression is measured against other advanced learning methods such as the Radial Basis Function, the traditional MultiLayer Perceptron model, Box-Jenkins autoregressive-integrated-moving average and the Infinite Impulse Response Locally Recurrent Neural Networks. The comparison shows that in the analyzed cases, SVM outperforms or is comparable to other techniques. - Highlights: → Realistic modeling of reliability demands complex mathematical formulations. → SVM is proper when the relation input/output is unknown or very costly to be obtained. → Results indicate the potential of SVM for reliability time series prediction. → Reliability estimates support the establishment of adequate maintenance strategies.
Directory of Open Access Journals (Sweden)
Co Jan Miles
2016-01-01
Full Text Available Predicting links between the nodes of a graph has become an important Data Mining task because of its direct applications to biology, social networking, communication surveillance, and other domains. Recent literature in time-series link prediction has shown that the Vector Auto Regression (VAR technique is one of the most accurate for this problem. In this study, we apply Support Vector Machine (SVM to improve the VAR technique that uses an unweighted adjacency matrix along with 5 matrices: Common Neighbor (CN, Adamic-Adar (AA, Jaccard’s Coefficient (JC, Preferential Attachment (PA, and Research Allocation Index (RA. A DBLP dataset covering the years from 2003 until 2013 was collected and transformed into time-sliced graph representations. The appropriate matrices were computed from these graphs, mapped to the feature space, and then used to build baseline VAR models with lag of 2 and some corresponding SVM classifiers. Using the Area Under the Receiver Operating Characteristic Curve (AUC-ROC as the main fitness metric, the average result of 82.04% for the VAR was improved to 84.78% with SVM. Additional experiments to handle the highly imbalanced dataset by oversampling with SMOTE and undersampling with K-means clusters, however, did not improve the average AUC-ROC of the baseline SVM.
Near Real-Time Dust Aerosol Detection with Support Vector Machines for Regression
Rivas-Perea, P.; Rivas-Perea, P. E.; Cota-Ruiz, J.; Aragon Franco, R. A.
2015-12-01
Remote sensing instruments operating in the near-infrared spectrum usually provide the necessary information for further dust aerosol spectral analysis using statistical or machine learning algorithms. Such algorithms have proven to be effective in analyzing very specific case studies or dust events. However, very few make the analysis open to the public on a regular basis, fewer are designed specifically to operate in near real-time to higher resolutions, and almost none give a global daily coverage. In this research we investigated a large-scale approach to a machine learning algorithm called "support vector regression". The algorithm uses four near-infrared spectral bands from NASA MODIS instrument: B20 (3.66-3.84μm), B29 (8.40-8.70μm), B31 (10.78-11.28μm), and B32 (11.77-12.27μm). The algorithm is presented with ground truth from more than 30 distinct reported dust events, from different geographical regions, at different seasons, both over land and sea cover, in the presence of clouds and clear sky, and in the presence of fires. The purpose of our algorithm is to learn to distinguish the dust aerosols spectral signature from other spectral signatures, providing as output an estimate of the probability of a data point being consistent with dust aerosol signatures. During modeling with ground truth, our algorithm achieved more than 90% of accuracy, and the current live performance of the algorithm is remarkable. Moreover, our algorithm is currently operating in near real-time using NASA's Land, Atmosphere Near real-time Capability for EOS (LANCE) servers, providing a high resolution global overview including 64, 32, 16, 8, 4, 2, and 1km. The near real-time analysis of our algorithm is now available to the general public at http://dust.reev.us and archives of the results starting from 2012 are available upon request.
Mehdizadeh, Saeid; Behmanesh, Javad; Khalili, Keivan
2017-07-01
Soil temperature (T s) and its thermal regime are the most important factors in plant growth, biological activities, and water movement in soil. Due to scarcity of the T s data, estimation of soil temperature is an important issue in different fields of sciences. The main objective of the present study is to investigate the accuracy of multivariate adaptive regression splines (MARS) and support vector machine (SVM) methods for estimating the T s. For this aim, the monthly mean data of the T s (at depths of 5, 10, 50, and 100 cm) and meteorological parameters of 30 synoptic stations in Iran were utilized. To develop the MARS and SVM models, various combinations of minimum, maximum, and mean air temperatures (T min, T max, T); actual and maximum possible sunshine duration; sunshine duration ratio (n, N, n/N); actual, net, and extraterrestrial solar radiation data (R s, R n, R a); precipitation (P); relative humidity (RH); wind speed at 2 m height (u 2); and water vapor pressure (Vp) were used as input variables. Three error statistics including root-mean-square-error (RMSE), mean absolute error (MAE), and determination coefficient (R 2) were used to check the performance of MARS and SVM models. The results indicated that the MARS was superior to the SVM at different depths. In the test and validation phases, the most accurate estimations for the MARS were obtained at the depth of 10 cm for T max, T min, T inputs (RMSE = 0.71 °C, MAE = 0.54 °C, and R 2 = 0.995) and for RH, V p, P, and u 2 inputs (RMSE = 0.80 °C, MAE = 0.61 °C, and R 2 = 0.996), respectively.
Modeling of Soil Aggregate Stability using Support Vector Machines and Multiple Linear Regression
Directory of Open Access Journals (Sweden)
Ali Asghar Besalatpour
2016-02-01
Full Text Available Introduction: Soil aggregate stability is a key factor in soil resistivity to mechanical stresses, including the impacts of rainfall and surface runoff, and thus to water erosion (Canasveras et al., 2010. Various indicators have been proposed to characterize and quantify soil aggregate stability, for example percentage of water-stable aggregates (WSA, mean weight diameter (MWD, geometric mean diameter (GMD of aggregates, and water-dispersible clay (WDC content (Calero et al., 2008. Unfortunately, the experimental methods available to determine these indicators are laborious, time-consuming and difficult to standardize (Canasveras et al., 2010. Therefore, it would be advantageous if aggregate stability could be predicted indirectly from more easily available data (Besalatpour et al., 2014. The main objective of this study is to investigate the potential use of support vector machines (SVMs method for estimating soil aggregate stability (as quantified by GMD as compared to multiple linear regression approach. Materials and Methods: The study area was part of the Bazoft watershed (31° 37′ to 32° 39′ N and 49° 34′ to 50° 32′ E, which is located in the Northern part of the Karun river basin in central Iran. A total of 160 soil samples were collected from the top 5 cm of soil surface. Some easily available characteristics including topographic, vegetation, and soil properties were used as inputs. Soil organic matter (SOM content was determined by the Walkley-Black method (Nelson & Sommers, 1986. Particle size distribution in the soil samples (clay, silt, sand, fine sand, and very fine sand were measured using the procedure described by Gee & Bauder (1986 and calcium carbonate equivalent (CCE content was determined by the back-titration method (Nelson, 1982. The modified Kemper & Rosenau (1986 method was used to determine wet-aggregate stability (GMD. The topographic attributes of elevation, slope, and aspect were characterized using a 20-m
Directory of Open Access Journals (Sweden)
Mok Tik
2014-06-01
Full Text Available This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as, polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to rep- resent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
Directory of Open Access Journals (Sweden)
Rachid Darnag
2017-02-01
Full Text Available Support vector machines (SVM represent one of the most promising Machine Learning (ML tools that can be applied to develop a predictive quantitative structure–activity relationship (QSAR models using molecular descriptors. Multiple linear regression (MLR and artificial neural networks (ANNs were also utilized to construct quantitative linear and non linear models to compare with the results obtained by SVM. The prediction results are in good agreement with the experimental value of HIV activity; also, the results reveal the superiority of the SVM over MLR and ANN model. The contribution of each descriptor to the structure–activity relationships was evaluated.
International Nuclear Information System (INIS)
Sun Zhong-Hua; Jiang Fan
2010-01-01
In this paper a new continuous variable called core-ratio is defined to describe the probability for a residue to be in a binding site, thereby replacing the previous binary description of the interface residue using 0 and 1. So we can use the support vector machine regression method to fit the core-ratio value and predict the protein binding sites. We also design a new group of physical and chemical descriptors to characterize the binding sites. The new descriptors are more effective, with an averaging procedure used. Our test shows that much better prediction results can be obtained by the support vector regression (SVR) method than by the support vector classification method. (rapid communication)
The Neural Support Vector Machine
Wiering, Marco; van der Ree, Michiel; Embrechts, Mark; Stollenga, Marijn; Meijster, Arnold; Nolte, A; Schomaker, Lambertus
2013-01-01
This paper describes a new machine learning algorithm for regression and dimensionality reduction tasks. The Neural Support Vector Machine (NSVM) is a hybrid learning algorithm consisting of neural networks and support vector machines (SVMs). The output of the NSVM is given by SVMs that take a
Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D; Hood, Darryl B; Skelton, Tyler
2013-02-01
The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminate power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire.
Directory of Open Access Journals (Sweden)
Danilo A. López-Sarmiento
2013-11-01
Full Text Available In this paper is compared the performance of a multi-class least squares support vector machine (LSSVM mc versus a multi-class logistic regression classifier to problem of recognizing the numeric digits (0-9 handwritten. To develop the comparison was used a data set consisting of 5000 images of handwritten numeric digits (500 images for each number from 0-9, each image of 20 x 20 pixels. The inputs to each of the systems were vectors of 400 dimensions corresponding to each image (not done feature extraction. Both classifiers used OneVsAll strategy to enable multi-classification and a random cross-validation function for the process of minimizing the cost function. The metrics of comparison were precision and training time under the same computational conditions. Both techniques evaluated showed a precision above 95 %, with LS-SVM slightly more accurate. However the computational cost if we found a marked difference: LS-SVM training requires time 16.42 % less than that required by the logistic regression model based on the same low computational conditions.
Oguntunde, Philip G; Lischeid, Gunnar; Dietrich, Ottfried
2018-03-01
This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used was for a period of 36 years between 1980 and 2015. Similar to the observed decrease (P 1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of the PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide more in-depth regional-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations as well as providing information that may help optimized planting dates for improved radiation use efficiency in the study area.
Oguntunde, Philip G.; Lischeid, Gunnar; Dietrich, Ottfried
2018-03-01
This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used was for a period of 36 years between 1980 and 2015. Similar to the observed decrease ( P 1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of the PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide more in-depth regional-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations as well as providing information that may help optimized planting dates for improved radiation use efficiency in the study area.
Support vector machines applications
Guo, Guodong
2014-01-01
Support vector machines (SVM) have both a solid mathematical background and good performance in practical applications. This book focuses on the recent advances and applications of the SVM in different areas, such as image processing, medical practice, computer vision, pattern recognition, machine learning, applied statistics, business intelligence, and artificial intelligence. The aim of this book is to create a comprehensive source on support vector machine applications, especially some recent advances.
Directory of Open Access Journals (Sweden)
Peek Andrew S
2007-06-01
Full Text Available Abstract Background RNA interference (RNAi is a naturally occurring phenomenon that results in the suppression of a target RNA sequence utilizing a variety of possible methods and pathways. To dissect the factors that result in effective siRNA sequences a regression kernel Support Vector Machine (SVM approach was used to quantitatively model RNA interference activities. Results Eight overall feature mapping methods were compared in their abilities to build SVM regression models that predict published siRNA activities. The primary factors in predictive SVM models are position specific nucleotide compositions. The secondary factors are position independent sequence motifs (N-grams and guide strand to passenger strand sequence thermodynamics. Finally, the factors that are least contributory but are still predictive of efficacy are measures of intramolecular guide strand secondary structure and target strand secondary structure. Of these, the site of the 5' most base of the guide strand is the most informative. Conclusion The capacity of specific feature mapping methods and their ability to build predictive models of RNAi activity suggests a relative biological importance of these features. Some feature mapping methods are more informative in building predictive models and overall t-test filtering provides a method to remove some noisy features or make comparisons among datasets. Together, these features can yield predictive SVM regression models with increased predictive accuracy between predicted and observed activities both within datasets by cross validation, and between independently collected RNAi activity datasets. Feature filtering to remove features should be approached carefully in that it is possible to reduce feature set size without substantially reducing predictive models, but the features retained in the candidate models become increasingly distinct. Software to perform feature prediction and SVM training and testing on nucleic acid
Heddam, Salim; Kisi, Ozgur
2018-04-01
In the present study, three types of artificial intelligence techniques, least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5T) are applied for modeling daily dissolved oxygen (DO) concentration using several water quality variables as inputs. The DO concentration and water quality variables data from three stations operated by the United States Geological Survey (USGS) were used for developing the three models. The water quality data selected consisted of daily measured of water temperature (TE, °C), pH (std. unit), specific conductance (SC, μS/cm) and discharge (DI cfs), are used as inputs to the LSSVM, MARS and M5T models. The three models were applied for each station separately and compared to each other. According to the results obtained, it was found that: (i) the DO concentration could be successfully estimated using the three models and (ii) the best model among all others differs from one station to another.
Directory of Open Access Journals (Sweden)
Fereshteh Shiri
2010-08-01
Full Text Available In the present work, support vector machines (SVMs and multiple linear regression (MLR techniques were used for quantitative structure–property relationship (QSPR studies of retention time (tR in standardized liquid chromatography–UV–mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins based on molecular descriptors calculated from the optimized 3D structures. By applying missing value, zero and multicollinearity tests with a cutoff value of 0.95, and genetic algorithm method of variable selection, the most relevant descriptors were selected to build QSPR models. MLRand SVMs methods were employed to build QSPR models. The robustness of the QSPR models was characterized by the statistical validation and applicability domain (AD. The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability measure by r2 and q2 are 0.931 and 0.932, repectively, for SVM and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using William’s plot. The effects of different descriptors on the retention times are described.
Directory of Open Access Journals (Sweden)
Mehmet Das
2018-01-01
Full Text Available In this study, an air heated solar collector (AHSC dryer was designed to determine the drying characteristics of the pear. Flat pear slices of 10 mm thickness were used in the experiments. The pears were dried both in the AHSC dryer and under the sun. Panel glass temperature, panel floor temperature, panel inlet temperature, panel outlet temperature, drying cabinet inlet temperature, drying cabinet outlet temperature, drying cabinet temperature, drying cabinet moisture, solar radiation, pear internal temperature, air velocity and mass loss of pear were measured at 30 min intervals. Experiments were carried out during the periods of June 2017 in Elazig, Turkey. The experiments started at 8:00 a.m. and continued till 18:00. The experiments were continued until the weight changes in the pear slices stopped. Wet basis moisture content (MCw, dry basis moisture content (MCd, adjustable moisture ratio (MR, drying rate (DR, and convective heat transfer coefficient (hc were calculated with both in the AHSC dryer and the open sun drying experiment data. It was found that the values of hc in both drying systems with a range 12.4 and 20.8 W/m2 °C. Three different kernel models were used in the support vector machine (SVM regression to construct the predictive model of the calculated hc values for both systems. The mean absolute error (MAE, root mean squared error (RMSE, relative absolute error (RAE and root relative absolute error (RRAE analysis were performed to indicate the predictive model’s accuracy. As a result, the rate of drying of the pear was examined for both systems and it was observed that the pear had dried earlier in the AHSC drying system. A predictive model was obtained using the SVM regression for the calculated hc values for the pear in the AHSC drying system. The normalized polynomial kernel was determined as the best kernel model in SVM for estimating the hc values.
Abu Awad, Yara; Koutrakis, Petros; Coull, Brent A; Schwartz, Joel
2017-11-01
Fine ambient particulate matter has been widely associated with multiple health effects. Mitigation hinges on understanding which sources are contributing to its toxicity. Black Carbon (BC), an indicator of particles generated from traffic sources, has been associated with a number of health effects however due to its high spatial variability, its concentration is difficult to estimate. We previously fit a model estimating BC concentrations in the greater Boston area; however this model was built using limited monitoring data and could not capture the complex spatio-temporal patterns of ambient BC. In order to improve our predictive ability, we obtained more data for a total of 24,301 measurements from 368 monitors over a 12 year period in Massachusetts, Rhode Island and New Hampshire. We also used Nu-Support Vector Regression (nu-SVR) - a machine learning technique which incorporates nonlinear terms and higher order interactions, with appropriate regularization of parameter estimates. We then used a generalized additive model to refit the residuals from the nu-SVR and added the residual predictions to our earlier estimates. Both spatial and temporal predictors were included in the model which allowed us to capture the change in spatial patterns of BC over time. The 10 fold cross validated (CV) R 2 of the model was good in both cold (10-fold CV R 2 = 0.87) and warm seasons (CV R 2 = 0.79). We have successfully built a model that can be used to estimate short and long-term exposures to BC and will be useful for studies looking at various health outcomes in MA, RI and Southern NH. Copyright © 2017 Elsevier Inc. All rights reserved.
Deo, Ravinesh C.; Kisi, Ozgur; Singh, Vijay P.
2017-02-01
Drought forecasting using standardized metrics of rainfall is a core task in hydrology and water resources management. Standardized Precipitation Index (SPI) is a rainfall-based metric that caters for different time-scales at which the drought occurs, and due to its standardization, is well-suited for forecasting drought at different periods in climatically diverse regions. This study advances drought modelling using multivariate adaptive regression splines (MARS), least square support vector machine (LSSVM), and M5Tree models by forecasting SPI in eastern Australia. MARS model incorporated rainfall as mandatory predictor with month (periodicity), Southern Oscillation Index, Pacific Decadal Oscillation Index and Indian Ocean Dipole, ENSO Modoki and Nino 3.0, 3.4 and 4.0 data added gradually. The performance was evaluated with root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (r2). Best MARS model required different input combinations, where rainfall, sea surface temperature and periodicity were used for all stations, but ENSO Modoki and Pacific Decadal Oscillation indices were not required for Bathurst, Collarenebri and Yamba, and the Southern Oscillation Index was not required for Collarenebri. Inclusion of periodicity increased the r2 value by 0.5-8.1% and reduced RMSE by 3.0-178.5%. Comparisons showed that MARS superseded the performance of the other counterparts for three out of five stations with lower MAE by 15.0-73.9% and 7.3-42.2%, respectively. For the other stations, M5Tree was better than MARS/LSSVM with lower MAE by 13.8-13.4% and 25.7-52.2%, respectively, and for Bathurst, LSSVM yielded more accurate result. For droughts identified by SPI ≤ - 0.5, accurate forecasts were attained by MARS/M5Tree for Bathurst, Yamba and Peak Hill, whereas for Collarenebri and Barraba, M5Tree was better than LSSVM/MARS. Seasonal analysis revealed disparate results where MARS/M5Tree was better than LSSVM. The results highlight the
Stas, Michiel; Dong, Qinghan; Heremans, Stien; Zhang, Beier; Van Orshoven, Jos
2016-08-01
This paper compares two machine learning techniques to predict regional winter wheat yields. The models, based on Boosted Regression Trees (BRT) and Support Vector Machines (SVM), are constructed of Normalized Difference Vegetation Indices (NDVI) derived from low resolution SPOT VEGETATION satellite imagery. Three types of NDVI-related predictors were used: Single NDVI, Incremental NDVI and Targeted NDVI. BRT and SVM were first used to select features with high relevance for predicting the yield. Although the exact selections differed between the prefectures, certain periods with high influence scores for multiple prefectures could be identified. The same period of high influence stretching from March to June was detected by both machine learning methods. After feature selection, BRT and SVM models were applied to the subset of selected features for actual yield forecasting. Whereas both machine learning methods returned very low prediction errors, BRT seems to slightly but consistently outperform SVM.
Support Vector Regression with Interval-Input Interval-Output
Directory of Open Access Journals (Sweden)
Wensen An
2008-12-01
Full Text Available Support vector machines (classification and regression are powerful machine learning techniques for crisp data. In this paper, the problem is considered for interval data. Two methods to deal with the problem using support vector regression are proposed and two new methods for evaluating performance for estimating prediction interval are presented as well.
Balabin, Roman M; Lomakina, Ekaterina I
2011-04-21
In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.
Leone, Robert Matthew
A search for vector-like quarks (VLQs) decaying to a Z boson using multi-stage machine learning was compared to a search using a standard square cuts search strategy. VLQs are predicted by several new theories beyond the Standard Model. The searches used 20.3 inverse femtobarns of proton-proton collisions at a center-of-mass energy of 8 TeV collected with the ATLAS detector in 2012 at the CERN Large Hadron Collider. CLs upper limits on production cross sections of vector-like top and bottom quarks were computed for VLQs produced singly or in pairs, Tsingle, Bsingle, Tpair, and Bpair. The two stage machine learning classification search strategy did not provide any improvement over the standard square cuts strategy, but for Tpair, Bpair, and Tsingle, a third stage of machine learning regression was able to lower the upper limits of high signal masses by as much as 50%. Additionally, new test statistics were developed for use in the Neyman construction of confidence regions in order to address deficiencies in c...
Yan, Jun; Huang, Jian-Hua; He, Min; Lu, Hong-Bing; Yang, Rui; Kong, Bo; Xu, Qing-Song; Liang, Yi-Zeng
2013-08-01
Retention indices for frequently reported compounds of plant essential oils on three different stationary phases were investigated. Multivariate linear regression, partial least squares, and support vector machine combined with a new variable selection approach called random-frog recently proposed by our group, were employed to model quantitative structure-retention relationships. Internal and external validations were performed to ensure the stability and predictive ability. All the three methods could obtain an acceptable model, and the optimal results by support vector machine based on a small number of informative descriptors with the square of correlation coefficient for cross validation, values of 0.9726, 0.9759, and 0.9331 on the dimethylsilicone stationary phase, the dimethylsilicone phase with 5% phenyl groups, and the PEG stationary phase, respectively. The performances of two variable selection approaches, random-frog and genetic algorithm, are compared. The importance of the variables was found to be consistent when estimated from correlation coefficients in multivariate linear regression equations and selection probability in model spaces. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Active set support vector regression.
Musicant, David R; Feinberg, Alexander
2004-03-01
This paper presents active set support vector regression (ASVR), a new active set strategy to solve a straightforward reformulation of the standard support vector regression problem. This new algorithm is based on the successful ASVM algorithm for classification problems, and consists of solving a finite number of linear equations with a typically large dimensionality equal to the number of points to be approximated. However, by making use of the Sherman-Morrison-Woodbury formula, a much smaller matrix of the order of the original input space is inverted at each step. The algorithm requires no specialized quadratic or linear programming code, but merely a linear equation solver which is publicly available. ASVR is extremely fast, produces comparable generalization error to other popular algorithms, and is available on the web for download.
Vector control of induction machines
Robyns, Benoit
2012-01-01
After a brief introduction to the main law of physics and fundamental concepts inherent in electromechanical conversion, ""Vector Control of Induction Machines"" introduces the standard mathematical models for induction machines - whichever rotor technology is used - as well as several squirrel-cage induction machine vector-control strategies. The use of causal ordering graphs allows systematization of the design stage, as well as standardization of the structure of control devices. ""Vector Control of Induction Machines"" suggests a unique approach aimed at reducing parameter sensitivity for
On Weighted Support Vector Regression
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...
Learning with Support Vector Machines
Campbell, Colin
2010-01-01
Support Vectors Machines have become a well established tool within machine learning. They work well in practice and have now been used across a wide range of applications from recognizing hand-written digits, to face identification, text categorisation, bioinformatics, and database marketing. In this book we give an introductory overview of this subject. We start with a simple Support Vector Machine for performing binary classification before considering multi-class classification and learning in the presence of noise. We show that this framework can be extended to many other scenarios such a
Qin, Zijian; Wang, Maolin; Yan, Aixia
2017-07-01
In this study, quantitative structure-activity relationship (QSAR) models using various descriptor sets and training/test set selection methods were explored to predict the bioactivity of hepatitis C virus (HCV) NS3/4A protease inhibitors by using a multiple linear regression (MLR) and a support vector machine (SVM) method. 512 HCV NS3/4A protease inhibitors and their IC 50 values which were determined by the same FRET assay were collected from the reported literature to build a dataset. All the inhibitors were represented with selected nine global and 12 2D property-weighted autocorrelation descriptors calculated from the program CORINA Symphony. The dataset was divided into a training set and a test set by a random and a Kohonen's self-organizing map (SOM) method. The correlation coefficients (r 2 ) of training sets and test sets were 0.75 and 0.72 for the best MLR model, 0.87 and 0.85 for the best SVM model, respectively. In addition, a series of sub-dataset models were also developed. The performances of all the best sub-dataset models were better than those of the whole dataset models. We believe that the combination of the best sub- and whole dataset SVM models can be used as reliable lead designing tools for new NS3/4A protease inhibitors scaffolds in a drug discovery pipeline. Copyright © 2017 Elsevier Ltd. All rights reserved.
Kisi, Ozgur; Parmar, Kulwinder Singh
2016-03-01
This study investigates the accuracy of least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5Tree) in modeling river water pollution. Various combinations of water quality parameters, Free Ammonia (AMM), Total Kjeldahl Nitrogen (TKN), Water Temperature (WT), Total Coliform (TC), Fecal Coliform (FC) and Potential of Hydrogen (pH) monitored at Nizamuddin, Delhi Yamuna River in India were used as inputs to the applied models. Results indicated that the LSSVM and MARS models had almost same accuracy and they performed better than the M5Tree model in modeling monthly chemical oxygen demand (COD). The average root mean square error (RMSE) of the LSSVM and M5Tree models was decreased by 1.47% and 19.1% using MARS model, respectively. Adding TC input to the models did not increase their accuracy in modeling COD while adding FC and pH inputs to the models generally decreased the accuracy. The overall results indicated that the MARS and LSSVM models could be successfully used in estimating monthly river water pollution level by using AMM, TKN and WT parameters as inputs.
Valente, Giancarlo; De Martino, Federico; Esposito, Fabrizio; Goebel, Rainer; Formisano, Elia
2011-05-15
In this work we illustrate the approach of the Maastricht Brain Imaging Center to the PBAIC 2007 competition, where participants had to predict, based on fMRI measurements of brain activity, subject driven actions and sensory experience in a virtual world. After standard pre-processing (slice scan time correction, motion correction), we generated rating predictions based on linear Relevance Vector Machine (RVM) learning from all brain voxels. Spatial and temporal filtering of the time series was optimized rating by rating. For some of the ratings (e.g. Instructions, Hits, Faces, Velocity), linear RVM regression was accurate and very consistent within and between subjects. For other ratings (e.g. Arousal, Valence) results were less satisfactory. Our approach ranked overall second. To investigate the role of different brain regions in ratings prediction we generated predictive maps, i.e. maps of the weighted contribution of each voxel to the predicted rating. These maps generally included (but were not limited to) "specialized" regions which are consistent with results from conventional neuroimaging studies and known functional neuroanatomy. In conclusion, Sparse Bayesian Learning models, such as RVM, appear to be a valuable approach to the multivariate regression of fMRI time series. The implementation of the Automatic Relevance Determination criterion is particularly suitable and provides a good generalization, despite the limited number of samples which is typically available in fMRI. Predictive maps allow disclosing multi-voxel patterns of brain activity that predict perceptual and behavioral subjective experience. Copyright © 2010 Elsevier Inc. All rights reserved.
Balabin, Roman M; Lomakina, Ekaterina I
2011-06-28
A multilayer feed-forward artificial neural network (MLP-ANN) with a single, hidden layer that contains a finite number of neurons can be regarded as a universal non-linear approximator. Today, the ANN method and linear regression (MLR) model are widely used for quantum chemistry (QC) data analysis (e.g., thermochemistry) to improve their accuracy (e.g., Gaussian G2-G4, B3LYP/B3-LYP, X1, or W1 theoretical methods). In this study, an alternative approach based on support vector machines (SVMs) is used, the least squares support vector machine (LS-SVM) regression. It has been applied to ab initio (first principle) and density functional theory (DFT) quantum chemistry data. So, QC + SVM methodology is an alternative to QC + ANN one. The task of the study was to estimate the Møller-Plesset (MPn) or DFT (B3LYP, BLYP, BMK) energies calculated with large basis sets (e.g., 6-311G(3df,3pd)) using smaller ones (6-311G, 6-311G*, 6-311G**) plus molecular descriptors. A molecular set (BRM-208) containing a total of 208 organic molecules was constructed and used for the LS-SVM training, cross-validation, and testing. MP2, MP3, MP4(DQ), MP4(SDQ), and MP4/MP4(SDTQ) ab initio methods were tested. Hartree-Fock (HF/SCF) results were also reported for comparison. Furthermore, constitutional (CD: total number of atoms and mole fractions of different atoms) and quantum-chemical (QD: HOMO-LUMO gap, dipole moment, average polarizability, and quadrupole moment) molecular descriptors were used for the building of the LS-SVM calibration model. Prediction accuracies (MADs) of 1.62 ± 0.51 and 0.85 ± 0.24 kcal mol(-1) (1 kcal mol(-1) = 4.184 kJ mol(-1)) were reached for SVM-based approximations of ab initio and DFT energies, respectively. The LS-SVM model was more accurate than the MLR model. A comparison with the artificial neural network approach shows that the accuracy of the LS-SVM method is similar to the accuracy of ANN. The extrapolation and interpolation results show that LS-SVM is
Directory of Open Access Journals (Sweden)
Zhenliang Yin
2017-11-01
Full Text Available This study aims to project future variability of reference evapotranspiration (ET0 using artificial intelligence methods, constructed with an extreme-learning machine (ELM and support vector regression (SVR in a mountainous inland watershed in north-west China. Eight global climate model (GCM outputs retrieved from the Coupled Model Inter-comparison Project Phase 5 (CMIP5 were employed to downscale monthly ET0 for the historical period 1960–2005 as a validation approach and for the future period 2010–2099 as a projection of ET0 under the Representative Concentration Pathway (RCP 4.5 and 8.5 scenarios. The following conclusions can be drawn: the ELM and SVR methods demonstrate a very good performance in estimating Food and Agriculture Organization (FAO-56 Penman–Monteith ET0. Variation in future ET0 mainly occurs in the spring and autumn seasons, while the summer and winter ET0 changes are moderately small. Annually, the ET0 values were shown to increase at a rate of approximately 7.5 mm, 7.5 mm, 0.0 mm (8.2 mm, 15.0 mm, 15.0 mm decade−1, respectively, for the near-term projection (2010–2039, mid-term projection (2040–2069, and long-term projection (2070–2099 under the RCP4.5 (RCP8.5 scenario. Compared to the historical period, the relative changes in ET0 were found to be approximately 2%, 5% and 6% (2%, 7% and 13%, during the near, mid- and long-term periods, respectively, under the RCP4.5 (RCP8.5 warming scenarios. In accordance with the analyses, we aver that the opportunity to downscale monthly ET0 with artificial intelligence is useful in practice for water-management policies.
Directory of Open Access Journals (Sweden)
Roberto Romaniello
2015-12-01
Full Text Available The aim of this work is to evaluate the potential of least squares support vector machine (LS-SVM regression to develop an efficient method to measure the colour of food materials in L*a*b* units by means of a computer vision systems (CVS. A laboratory CVS, based on colour digital camera (CDC, was implemented and three LS-SVM models were trained and validated, one for each output variables (L*, a*, and b* required by this problem, using the RGB signals generated by the CDC as input variables to these models. The colour target-based approach was used to camera characterization and a standard reference target of 242 colour samples was acquired using the CVS and a colorimeter. This data set was split in two sets of equal sizes, for training and validating the LS-SVM models. An effective two-stage grid search process on the parameters space was performed in MATLAB to tune the regularization parameters γ and the kernel parameters σ2 of the three LS-SVM models. A 3-8-3 multilayer feed-forward neural network (MFNN, according to the research conducted by León et al. (2006, was also trained in order to compare its performance with those of LS-SVM models. The LS-SVM models developed in this research have been shown better generalization capability then the MFNN, allowed to obtain high correlations between L*a*b* data acquired using the colorimeter and the corresponding data obtained by transformation of the RGB data acquired by the CVS. In particular, for the validation set, R2 values equal to 0.9989, 0.9987, and 0.9994 for L*, a* and b* parameters were obtained. The root mean square error values were 0.6443, 0.3226, and 0.2702 for L*, a*, and b* respectively, and the average of colour differences ΔEab was 0.8232±0.5033 units. Thus, LS-SVM regression seems to be a useful tool to measurement of food colour using a low cost CVS.
Support vector machines optimization based theory, algorithms, and extensions
Deng, Naiyang; Zhang, Chunhua
2013-01-01
Support Vector Machines: Optimization Based Theory, Algorithms, and Extensions presents an accessible treatment of the two main components of support vector machines (SVMs)-classification problems and regression problems. The book emphasizes the close connection between optimization theory and SVMs since optimization is one of the pillars on which SVMs are built.The authors share insight on many of their research achievements. They give a precise interpretation of statistical leaning theory for C-support vector classification. They also discuss regularized twi
Directory of Open Access Journals (Sweden)
Santana Isabel
2011-08-01
Full Text Available Abstract Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI, but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing.
Clustering Categories in Support Vector Machines
DEFF Research Database (Denmark)
Carrizosa, Emilio; Nogales-Gómez, Amaya; Morales, Dolores Romero
2017-01-01
The support vector machine (SVM) is a state-of-the-art method in supervised classification. In this paper the Cluster Support Vector Machine (CLSVM) methodology is proposed with the aim to increase the sparsity of the SVM classifier in the presence of categorical features, leading to a gain...
Prediction of Machine Tool Condition Using Support Vector Machine
Wang, Peigong; Meng, Qingfeng; Zhao, Jian; Li, Junjie; Wang, Xiufeng
2011-07-01
Condition monitoring and predicting of CNC machine tools are investigated in this paper. Considering the CNC machine tools are often small numbers of samples, a condition predicting method for CNC machine tools based on support vector machines (SVMs) is proposed, then one-step and multi-step condition prediction models are constructed. The support vector machines prediction models are used to predict the trends of working condition of a certain type of CNC worm wheel and gear grinding machine by applying sequence data of vibration signal, which is collected during machine processing. And the relationship between different eigenvalue in CNC vibration signal and machining quality is discussed. The test result shows that the trend of vibration signal Peak-to-peak value in surface normal direction is most relevant to the trend of surface roughness value. In trends prediction of working condition, support vector machine has higher prediction accuracy both in the short term ('One-step') and long term (multi-step) prediction compared to autoregressive (AR) model and the RBF neural network. Experimental results show that it is feasible to apply support vector machine to CNC machine tool condition prediction.
Image superresolution using support vector regression.
Ni, Karl S; Nguyen, Truong Q
2007-06-01
A thorough investigation of the application of support vector regression (SVR) to the superresolution problem is conducted through various frameworks. Prior to the study, the SVR problem is enhanced by finding the optimal kernel. This is done by formulating the kernel learning problem in SVR form as a convex optimization problem, specifically a semi-definite programming (SDP) problem. An additional constraint is added to reduce the SDP to a quadratically constrained quadratic programming (QCQP) problem. After this optimization, investigation of the relevancy of SVR to superresolution proceeds with the possibility of using a single and general support vector regression for all image content, and the results are impressive for small training sets. This idea is improved upon by observing structural properties in the discrete cosine transform (DCT) domain to aid in learning the regression. Further improvement involves a combination of classification and SVR-based techniques, extending works in resolution synthesis. This method, termed kernel resolution synthesis, uses specific regressors for isolated image content to describe the domain through a partitioned look of the vector space, thereby yielding good results.
Cascade Support Vector Machines with Dimensionality Reduction
Directory of Open Access Journals (Sweden)
Oliver Kramer
2015-01-01
Full Text Available Cascade support vector machines have been introduced as extension of classic support vector machines that allow a fast training on large data sets. In this work, we combine cascade support vector machines with dimensionality reduction based preprocessing. The cascade principle allows fast learning based on the division of the training set into subsets and the union of cascade learning results based on support vectors in each cascade level. The combination with dimensionality reduction as preprocessing results in a significant speedup, often without loss of classifier accuracies, while considering the high-dimensional pendants of the low-dimensional support vectors in each new cascade level. We analyze and compare various instantiations of dimensionality reduction preprocessing and cascade SVMs with principal component analysis, locally linear embedding, and isometric mapping. The experimental analysis on various artificial and real-world benchmark problems includes various cascade specific parameters like intermediate training set sizes and dimensionalities.
Support vector machine for diagnosis cancer disease: A comparative study
Directory of Open Access Journals (Sweden)
Nasser H. Sweilam
2010-12-01
Full Text Available Support vector machine has become an increasingly popular tool for machine learning tasks involving classification, regression or novelty detection. Training a support vector machine requires the solution of a very large quadratic programming problem. Traditional optimization methods cannot be directly applied due to memory restrictions. Up to now, several approaches exist for circumventing the above shortcomings and work well. Another learning algorithm, particle swarm optimization, Quantum-behave Particle Swarm for training SVM is introduced. Another approach named least square support vector machine (LSSVM and active set strategy are introduced. The obtained results by these methods are tested on a breast cancer dataset and compared with the exact solution model problem.
Twin support vector machines models, extensions and applications
Jayadeva; Chandra, Suresh
2017-01-01
This book provides a systematic and focused study of the various aspects of twin support vector machines (TWSVM) and related developments for classification and regression. In addition to presenting most of the basic models of TWSVM and twin support vector regression (TWSVR) available in the literature, it also discusses the important and challenging applications of this new machine learning methodology. A chapter on “Additional Topics” has been included to discuss kernel optimization and support tensor machine topics, which are comparatively new but have great potential in applications. It is primarily written for graduate students and researchers in the area of machine learning and related topics in computer science, mathematics, electrical engineering, management science and finance.
Data classification using Support vector Machine (SVM), a simplified approach
S Amarappa; Dr. S V Sathyanarayana
2014-01-01
In all our day to day activities we will be classifying things based on situations and on our needs. Human beings do classification of any kind by their natural perception. Classifying data is a common task in machine learning which requires artificial intelligence. Support vector Machine (SVM) is a new technique suitable for binary classification tasks. SVMs are a set of supervised learning methods used for classification, regression and outliers detection. The SVM classifiers work for both...
Support vector machines classifiers of physical activities in preschoolers
The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying preschool-aged children physical activity data acquired from an accelerometer. In this study, 69 children aged 3-5 years old were asked to participate in a s...
The efficacy of support vector machines (SVM) in robust ...
Indian Academy of Sciences (India)
The efficacy of support vector machines (SVM) in robust determination of earthquake early warning magnitudes in central Japan. Ramakrushna Reddy and Rajesh R Nair ..... SVMs were developed to solve the classification problem. However, recently, SVMs have been suc- cessfully extended to regression and density esti-.
Visualisation and interpretation of Support Vector Regression models.
Ustün, B; Melssen, W J; Buydens, L M C
2007-07-09
This paper introduces a technique to visualise the information content of the kernel matrix and a way to interpret the ingredients of the Support Vector Regression (SVR) model. Recently, the use of Support Vector Machines (SVM) for solving classification (SVC) and regression (SVR) problems has increased substantially in the field of chemistry and chemometrics. This is mainly due to its high generalisation performance and its ability to model non-linear relationships in a unique and global manner. Modeling of non-linear relationships will be enabled by applying a kernel function. The kernel function transforms the input data, usually non-linearly related to the associated output property, into a high dimensional feature space where the non-linear relationship can be represented in a linear form. Usually, SVMs are applied as a black box technique. Hence, the model cannot be interpreted like, e.g., Partial Least Squares (PLS). For example, the PLS scores and loadings make it possible to visualise and understand the driving force behind the optimal PLS machinery. In this study, we have investigated the possibilities to visualise and interpret the SVM model. Here, we exclusively have focused on Support Vector Regression to demonstrate these visualisation and interpretation techniques. Our observations show that we are now able to turn a SVR black box model into a transparent and interpretable regression modeling technique.
Maroco, João; Silva, Dina; Rodrigues, Ana; Guerreiro, Manuela; Santana, Isabel; de Mendonça, Alexandre
2011-08-17
Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Press' Q test showed that all classifiers performed better than chance alone (p classification accuracy (Median (Me) = 0.76) an area under the ROC (Me = 0.90). However this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73) specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72) specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most
Density Based Support Vector Machines for Classification
Zahra Nazari; Dongshik Kang
2015-01-01
Support Vector Machines (SVM) is the most successful algorithm for classification problems. SVM learns the decision boundary from two classes (for Binary Classification) of training points. However, sometimes there are some less meaningful samples amongst training points, which are corrupted by noises or misplaced in wrong side, called outliers. These outliers are affecting on margin and classification performance, and machine should better to discard them. SVM as a popular and widely used cl...
Efficient Multiplicative Updates for Support Vector Machines
DEFF Research Database (Denmark)
Potluru, Vamsi K.; Plis, Sergie N; Mørup, Morten
2009-01-01
The dual formulation of the support vector machine (SVM) objective function is an instance of a nonnegative quadratic programming problem. We reformulate the SVM objective function as a matrix factorization problem which establishes a connection with the regularized nonnegative matrix factorization...
Cormanich, Rodrigo A; Goodarzi, Mohammad; Freitas, Matheus P
2009-02-01
Inhibition of tyrosine kinase enzyme WEE1 is an important step for the treatment of cancer. The bioactivities of a series of WEE1 inhibitors have been previously modeled through comparative molecular field analyses (CoMFA and CoMSIA), but a two-dimensional image-based quantitative structure-activity relationship approach has shown to be highly predictive for other compound classes. This method, called multivariate image analysis applied to quantitative structure-activity relationship, was applied here to derive quantitative structure-activity relationship models. Whilst the well-known bilinear and multilinear partial least squares regressions (PLS and N-PLS, respectively) correlated multivariate image analysis descriptors with the corresponding dependent variables only reasonably well, the use of wavelet and principal component ranking as variable selection methods, together with least-squares support vector machine, improved significantly the prediction statistics. These recently implemented mathematical tools, particularly novel in quantitative structure-activity relationship studies, represent an important advance for the development of more predictive quantitative structure-activity relationship models and, consequently, new drugs.
DNBR Prediction Using a Support Vector Regression
International Nuclear Information System (INIS)
Yang, Heon Young; Na, Man Gyun
2008-01-01
PWRs (Pressurized Water Reactors) generally operate in the nucleate boiling state. However, the conversion of nucleate boiling into film boiling with conspicuously reduced heat transfer induces a boiling crisis that may cause the fuel clad melting in the long run. This type of boiling crisis is called Departure from Nucleate Boiling (DNB) phenomena. Because the prediction of minimum DNBR in a reactor core is very important to prevent the boiling crisis such as clad melting, a lot of research has been conducted to predict DNBR values. The object of this research is to predict minimum DNBR applying support vector regression (SVR) by using the measured signals of a reactor coolant system (RCS). The SVR has extensively and successfully been applied to nonlinear function approximation like the proposed problem for estimating DNBR values that will be a function of various input variables such as reactor power, reactor pressure, core mass flowrate, control rod positions and so on. The minimum DNBR in a reactor core is predicted using these various operating condition data as the inputs to the SVR. The minimum DBNR values predicted by the SVR confirm its correctness compared with COLSS values
Image Jacobian Matrix Estimation Based on Online Support Vector Regression
Directory of Open Access Journals (Sweden)
Shangqin Mao
2012-10-01
Full Text Available Research into robotics visual servoing is an important area in the field of robotics. It has proven difficult to achieve successful results for machine vision and robotics in unstructured environments without using any a priori camera or kinematic models. In uncalibrated visual servoing, image Jacobian matrix estimation methods can be divided into two groups: the online method and the offline method. The offline method is not appropriate for most natural environments. The online method is robust but rough. Moreover, if the images feature configuration changes, it needs to restart the approximating procedure. A novel approach based on an online support vector regression (OL-SVR algorithm is proposed which overcomes the drawbacks and combines the virtues just mentioned.
Coal demand prediction based on a support vector machine model
Energy Technology Data Exchange (ETDEWEB)
Jia, Cun-liang; Wu, Hai-shan; Gong, Dun-wei [China University of Mining & Technology, Xuzhou (China). School of Information and Electronic Engineering
2007-01-15
A forecasting model for coal demand of China using a support vector regression was constructed. With the selected embedding dimension, the output vectors and input vectors were constructed based on the coal demand of China from 1980 to 2002. After compared with lineal kernel and Sigmoid kernel, a radial basis function(RBF) was adopted as the kernel function. By analyzing the relationship between the error margin of prediction and the model parameters, the proper parameters were chosen. The support vector machines (SVM) model with multi-input and single output was proposed. Compared the predictor based on RBF neural networks with test datasets, the results show that the SVM predictor has higher precision and greater generalization ability. In the end, the coal demand from 2003 to 2006 is accurately forecasted. l0 refs., 2 figs., 4 tabs.
Support vector machine for automatic pain recognition
Monwar, Md Maruf; Rezaei, Siamak
2009-02-01
Facial expressions are a key index of emotion and the interpretation of such expressions of emotion is critical to everyday social functioning. In this paper, we present an efficient video analysis technique for recognition of a specific expression, pain, from human faces. We employ an automatic face detector which detects face from the stored video frame using skin color modeling technique. For pain recognition, location and shape features of the detected faces are computed. These features are then used as inputs to a support vector machine (SVM) for classification. We compare the results with neural network based and eigenimage based automatic pain recognition systems. The experiment results indicate that using support vector machine as classifier can certainly improve the performance of automatic pain recognition system.
Energy Technology Data Exchange (ETDEWEB)
Chen, Shou Tung; Hsiao, Yi Hsuan; Kuo, Shou Jen; Tseng, Hsin Shun; Wu, Hwa Koon; Chen, Dar Ren [Changhua Christian Hospital, Changhua (China); Huang, Yu Len [Tunghai University, Taichung (China)
2009-10-15
Logistic regression analysis (LRA), Support Vector Machine (SVM) and a neural network (NN) are commonly used statistical models in computeraided diagnostic (CAD) systems for breast ultrasonography (US). The aim of this study was to clarify the diagnostic ability of the use of these statistical models for future applications of CAD systems, such as three-dimensional (3D) power Doppler imaging, vascularity evaluation and the differentiation of a solid mass. A database that contained 3D power Doppler imaging pairs of non-harmonic and tissue harmonic images for 97 benign and 86 malignant solid tumors was utilized. The virtual organ computer-aided analysis-imaging program was used to analyze the stored volumes of the 183 solid breast tumors. LRA, an SVM and NN were employed in comparative analyses for the characterization of benign and malignant solid breast masses from the database. The values of area under receiver operating characteristic (ROC) curve, referred to as Az values for the use of non-harmonic 3D power Doppler US with LRA, SVM and NN were 0.9341, 0.9185 and 0.9086, respectively. The Az values for the use of harmonic 3D power Doppler US with LRA, SVM and NN were 0.9286, 0.8979 and 0.9009, respectively. The Az values of six ROC curves for the use of LRA, SVM and NN for non-harmonic or harmonic 3D power Doppler imaging were similar. The diagnostic performances of these three models (LRA, SVM and NN) are not different as demonstrated by ROC curve analysis. Depending on user emphasis for the use of ROC curve findings, the use of LRA appears to provide better sensitivity as compared to the other statistical models.
Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L
2012-08-07
Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.
Twin Support Vector Machine: A review from 2007 to 2014
Directory of Open Access Journals (Sweden)
Divya Tomar
2015-03-01
Full Text Available Twin Support Vector Machine (TWSVM is an emerging machine learning method suitable for both classification and regression problems. It utilizes the concept of Generalized Eigen-values Proximal Support Vector Machine (GEPSVM and finds two non-parallel planes for each class by solving a pair of Quadratic Programming Problems. It enhances the computational speed as compared to the traditional Support Vector Machine (SVM. TWSVM was initially constructed to solve binary classification problems; later researchers successfully extended it for multi-class problem domain. TWSVM always gives promising empirical results, due to which it has many attractive features which enhance its applicability. This paper presents the research development of TWSVM in recent years. This study is divided into two main broad categories - variant based and multi-class based TWSVM methods. The paper primarily discusses the basic concept of TWSVM and highlights its applications in recent years. A comparative analysis of various research contributions based on TWSVM is also presented. This is helpful for researchers to effectively utilize the TWSVM as an emergent research methodology and encourage them to work further in the performance enhancement of TWSVM.
Vectors, a tool in statistical regression theory
Corsten, L.C.A.
1958-01-01
Using linear algebra this thesis developed linear regression analysis including analysis of variance, covariance analysis, special experimental designs, linear and fertility adjustments, analysis of experiments at different places and times. The determination of the orthogonal projection, yielding
Scope of Support Vector Machine in Steganography
Tanwar, Rohit; Malhotrab, Sona
2017-08-01
Steganography is a technique used for secure transmission of data. Using audio as a cover file opens path for many extra features. In order to overcome the limitations of conventional LSB technique, various variants were proposed by different authors. In order to achieve robustness, use of various optimization techniques has been tradition. In this paper the focus is put on use of Genetic Algorithm and Particle Swarm Intelligence in steganography. To list detailed scope, merits and de-merits of the two optimization techniques is the main constituent of this paper. In spite of analyzing the two techniques, the motivation and applicability of machine learning algorithm in the problem statement is also discussed. This paper will guide the path in using Support Vector Machine for optimizing the data hiding.
SUPPORT VECTOR MACHINE METHOD FOR PREDICTING INVESTMENT MEASURES
Directory of Open Access Journals (Sweden)
Olga V. Kitova
2016-01-01
Full Text Available Possibilities of applying intelligent machine learning technique based on support vectors for predicting investment measures are considered in the article. The base features of support vector method over traditional econometric techniques for improving the forecast quality are described. Computer modeling results in terms of tuning support vector machine models developed with programming language Python for predicting some investment measures are shown.
Jeffrey T. Walton
2008-01-01
Three machine learning subpixel estimation methods (Cubist, Random Forests, and support vector regression) were applied to estimate urban cover. Urban forest canopy cover and impervious surface cover were estimated from Landsat-7 ETM+ imagery using a higher resolution cover map resampled to 30 m as training and reference data. Three different band combinations (...
Deep neural mapping support vector machines.
Li, Yujian; Zhang, Ting
2017-09-01
The choice of kernel has an important effect on the performance of a support vector machine (SVM). The effect could be reduced by NEUROSVM, an architecture using multilayer perceptron for feature extraction and SVM for classification. In binary classification, a general linear kernel NEUROSVM can be theoretically simplified as an input layer, many hidden layers, and an SVM output layer. As a feature extractor, the sub-network composed of the input and hidden layers is first trained together with a virtual ordinary output layer by backpropagation, then with the output of its last hidden layer taken as input of the SVM classifier for further training separately. By taking the sub-network as a kernel mapping from the original input space into a feature space, we present a novel model, called deep neural mapping support vector machine (DNMSVM), from the viewpoint of deep learning. This model is also a new and general kernel learning method, where the kernel mapping is indeed an explicit function expressed as a sub-network, different from an implicit function induced by a kernel function traditionally. Moreover, we exploit a two-stage procedure of contrastive divergence learning and gradient descent for DNMSVM to jointly training an adaptive kernel mapping instead of a kernel function, without requirement of kernel tricks. As a whole of the sub-network and the SVM classifier, the joint training of DNMSVM is done by using gradient descent to optimize the objective function with the sub-network layer-wise pre-trained via contrastive divergence learning of restricted Boltzmann machines. Compared to the separate training of NEUROSVM, the joint training is a new algorithm for DNMSVM to have advantages over NEUROSVM. Experimental results show that DNMSVM can outperform NEUROSVM and RBFSVM (i.e., SVM with the kernel of radial basis function), demonstrating its effectiveness. Copyright © 2017 Elsevier Ltd. All rights reserved.
Support vector machines and generalisation in HEP
Bevan, Adrian; Gamboa Goñi, Rodrigo; Hays, Jon; Stevenson, Tom
2017-10-01
We review the concept of Support Vector Machines (SVMs) and discuss examples of their use in a number of scenarios. Several SVM implementations have been used in HEP and we exemplify this algorithm using the Toolkit for Multivariate Analysis (TMVA) implementation. We discuss examples relevant to HEP including background suppression for H → τ + τ ‑ at the LHC with several different kernel functions. Performance benchmarking leads to the issue of generalisation of hyper-parameter selection. The avoidance of fine tuning (over training or over fitting) in MVA hyper-parameter optimisation, i.e. the ability to ensure generalised performance of an MVA that is independent of the training, validation and test samples, is of utmost importance. We discuss this issue and compare and contrast performance of hold-out and k-fold cross-validation. We have extended the SVM functionality and introduced tools to facilitate cross validation in TMVA and present results based on these improvements.
Support Vector Machines for Petrophysical Modelling and Lithoclassification
Al-Anazi, Ammal Fannoush Khalifah
2011-12-01
Given increasing challenges of oil and gas production from partially depleted conventional or unconventional reservoirs, reservoir characterization is a key element of the reservoir development workflow. Reservoir characterization impacts well placement, injection and production strategies, and field management. Reservoir characterization projects point and line data to a large three-dimensional volume. The relationship between variables, e.g. porosity and permeability, is often established by regression yet the complexities between measured variables often lead to poor correlation coefficients between the regressed variables. Recent advances in machine learning methods have provided attractive alternatives for constructing interpretation models of rock properties in heterogeneous reservoirs. Here, Support Vector Machines (SVMs), a class of a learning machine that is formulated to output regression models and classifiers of competitive generalization capability, has been explored to determine its capabilities for determining the relationship, both in regression and in classification, between reservoir rock properties. This thesis documents research on the capability of SVMs to model petrophysical and elastic properties in heterogeneous sandstone and carbonate reservoirs. Specifically, the capabilities of SVM regression and classification has been examined and compared to neural network-based methods, namely multilayered neural networks, radial basis function neural networks, general regression neural networks, probabilistic neural networks, and linear discriminant analysis. The petrophysical properties that have been evaluated include porosity, permeability, Poisson's ratio and Young's modulus. Statistical error analysis reveals that the SVM method yields comparable or superior predictions of petrophysical and elastic rock properties and classification of the lithology compared to neural networks. The SVM method also shows uniform prediction capability under the
Successive overrelaxation for laplacian support vector machine.
Qi, Zhiquan; Tian, Yingjie; Shi, Yong
2015-04-01
Semisupervised learning (SSL) problem, which makes use of both a large amount of cheap unlabeled data and a few unlabeled data for training, in the last few years, has attracted amounts of attention in machine learning and data mining. Exploiting the manifold regularization (MR), Belkin et al. proposed a new semisupervised classification algorithm: Laplacian support vector machines (LapSVMs), and have shown the state-of-the-art performance in SSL field. To further improve the LapSVMs, we proposed a fast Laplacian SVM (FLapSVM) solver for classification. Compared with the standard LapSVM, our method has several improved advantages as follows: 1) FLapSVM does not need to deal with the extra matrix and burden the computations related to the variable switching, which make it more suitable for large scale problems; 2) FLapSVM’s dual problem has the same elegant formulation as that of standard SVMs. This means that the kernel trick can be applied directly into the optimization model; and 3) FLapSVM can be effectively solved by successive overrelaxation technology, which converges linearly to a solution and can process very large data sets that need not reside in memory. In practice, combining the strategies of random scheduling of subproblem and two stopping conditions, the computing speed of FLapSVM is rigidly quicker to that of LapSVM and it is a valid alternative to PLapSVM.
International Nuclear Information System (INIS)
Wang, Jianzhou; Hu, Jianming
2015-01-01
With the increasing importance of wind power as a component of power systems, the problems induced by the stochastic and intermittent nature of wind speed have compelled system operators and researchers to search for more reliable techniques to forecast wind speed. This paper proposes a combination model for probabilistic short-term wind speed forecasting. In this proposed hybrid approach, EWT (Empirical Wavelet Transform) is employed to extract meaningful information from a wind speed series by designing an appropriate wavelet filter bank. The GPR (Gaussian Process Regression) model is utilized to combine independent forecasts generated by various forecasting engines (ARIMA (Autoregressive Integrated Moving Average), ELM (Extreme Learning Machine), SVM (Support Vector Machine) and LSSVM (Least Square SVM)) in a nonlinear way rather than the commonly used linear way. The proposed approach provides more probabilistic information for wind speed predictions besides improving the forecasting accuracy for single-value predictions. The effectiveness of the proposed approach is demonstrated with wind speed data from two wind farms in China. The results indicate that the individual forecasting engines do not consistently forecast short-term wind speed for the two sites, and the proposed combination method can generate a more reliable and accurate forecast. - Highlights: • The proposed approach can make probabilistic modeling for wind speed series. • The proposed approach adapts to the time-varying characteristic of the wind speed. • The hybrid approach can extract the meaningful components from the wind speed series. • The proposed method can generate adaptive, reliable and more accurate forecasting results. • The proposed model combines four independent forecasting engines in a nonlinear way.
Recursive support vector machines for dimensionality reduction.
Tao, Qing; Chu, Dejun; Wang, Jue
2008-01-01
The usual dimensionality reduction technique in supervised learning is mainly based on linear discriminant analysis (LDA), but it suffers from singularity or undersampled problems. On the other hand, a regular support vector machine (SVM) separates the data only in terms of one single direction of maximum margin, and the classification accuracy may be not good enough. In this letter, a recursive SVM (RSVM) is presented, in which several orthogonal directions that best separate the data with the maximum margin are obtained. Theoretical analysis shows that a completely orthogonal basis can be derived in feature subspace spanned by the training samples and the margin is decreasing along the recursive components in linearly separable cases. As a result, a new dimensionality reduction technique based on multilevel maximum margin components and then a classifier with high accuracy are achieved. Experiments in synthetic and several real data sets show that RSVM using multilevel maximum margin features can do efficient dimensionality reduction and outperform regular SVM in binary classification problems.
Ranking Support Vector Machine with Kernel Approximation
Directory of Open Access Journals (Sweden)
Kai Chen
2017-01-01
Full Text Available Learning to rank algorithm has become important in recent years due to its successful application in information retrieval, recommender system, and computational biology, and so forth. Ranking support vector machine (RankSVM is one of the state-of-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels can give higher accuracy than linear RankSVM (RankSVM with a linear kernel for complex nonlinear ranking problem. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. Primal truncated Newton method is used to optimize the pairwise L2-loss (squared Hinge-loss objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method gets a much faster training speed than kernel RankSVM and achieves comparable or better performance over state-of-the-art ranking algorithms.
Computerized Interactive Gaming via Supporting Vector Machines
Directory of Open Access Journals (Sweden)
Y. Jiang
2008-01-01
Full Text Available Computerized interactive gaming requires automatic processing of large volume of random data produced by players on spot, such as shooting, football kicking, and boxing. This paper describes a supporting vector machine-based artificial intelligence algorithm as one of the possible solutions to the problem of random data processing and the provision of interactive indication for further actions. In comparison with existing techniques, such as rule-based and neural networks, and so forth, our SVM-based interactive gaming algorithm has the features of (i high-speed processing, providing instant response to the players, (ii winner selection and control by one parameter, which can be predesigned and adjusted according to the needs of interaction and game design or specific level of difficulties, and (iii detection of interaction points is adaptive to the input changes, and no labelled training data is required. Experiments on numerical simulation support that the proposed algorithm is robust to random noise, accurate in picking up winning data, and convenient for all interactive gaming designs.
Subspace identification of Hammer stein models using support vector machines
International Nuclear Information System (INIS)
Al-Dhaifallah, Mujahed
2011-01-01
System identification is the art of finding mathematical tools and algorithms that build an appropriate mathematical model of a system from measured input and output data. Hammerstein model, consisting of a memoryless nonlinearity followed by a dynamic linear element, is often a good trade-off as it can represent some dynamic nonlinear systems very accurately, but is nonetheless quite simple. Moreover, the extensive knowledge about LTI system representations can be applied to the dynamic linear block. On the other hand, finding an effective representation for the nonlinearity is an active area of research. Recently, support vector machines (SVMs) and least squares support vector machines (LS-SVMs) have demonstrated powerful abilities in approximating linear and nonlinear functions. In contrast with other approximation methods, SVMs do not require a-priori structural information. Furthermore, there are well established methods with guaranteed convergence (ordinary least squares, quadratic programming) for fitting LS-SVMs and SVMs. The general objective of this research is to develop new subspace algorithms for Hammerstein systems based on SVM regression.
Linearity and Misspecification Tests for Vector Smooth Transition Regression Models
DEFF Research Database (Denmark)
Teräsvirta, Timo; Yang, Yukai
The purpose of the paper is to derive Lagrange multiplier and Lagrange multiplier type specification and misspecification tests for vector smooth transition regression models. We report results from simulation studies in which the size and power properties of the proposed asymptotic tests in small...
Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W
2015-08-01
Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.
Fault Isolation for Nonlinear Systems Using Flexible Support Vector Regression
Directory of Open Access Journals (Sweden)
Yufang Liu
2014-01-01
Full Text Available While support vector regression is widely used as both a function approximating tool and a residual generator for nonlinear system fault isolation, a drawback for this method is the freedom in selecting model parameters. Moreover, for samples with discordant distributing complexities, the selection of reasonable parameters is even impossible. To alleviate this problem we introduce the method of flexible support vector regression (F-SVR, which is especially suited for modelling complicated sample distributions, as it is free from parameters selection. Reasonable parameters for F-SVR are automatically generated given a sample distribution. Lastly, we apply this method in the analysis of the fault isolation of high frequency power supplies, where satisfactory results have been obtained.
A novel stepwise support vector machine (SVM) method based on ...
African Journals Online (AJOL)
ajl yemi
2011-11-23
Nov 23, 2011 ... began to use computational approaches, particularly machine learning methods to identify pre-miRNAs (Xue et al., 2005; Huang et al., 2007; Jiang et al., 2007). Xue et al. (2005) presented a support vector machine (SVM)- based classifier called triplet-SVM, which classifies human pre-miRNAs from pseudo ...
Infinite ensemble of support vector machines for prediction of ...
African Journals Online (AJOL)
Many researchers have demonstrated the use of artificial neural networks (ANNs) to predict musculoskeletal disorders risk associated with occupational exposures. In order to improve the accuracy of LBDs risk classification, this paper proposes to use the support vector machines (SVMs), a machine learning algorithm used ...
Infinite ensemble of support vector machines for prediction of ...
African Journals Online (AJOL)
user
Keeping these issues in view, the main objective of this study is to develop a support vector machines based classification system that could classify industrial ..... His current research interest areas include sequencing and scheduling, ergonomics, supply chain management, inventory management, machine learning, etc.
Optimized support vector regression for drilling rate of penetration estimation
Bodaghi, Asadollah; Ansari, Hamid Reza; Gholami, Mahsa
2015-12-01
In the petroleum industry, drilling optimization involves the selection of operating conditions for achieving the desired depth with the minimum expenditure while requirements of personal safety, environment protection, adequate information of penetrated formations and productivity are fulfilled. Since drilling optimization is highly dependent on the rate of penetration (ROP), estimation of this parameter is of great importance during well planning. In this research, a novel approach called `optimized support vector regression' is employed for making a formulation between input variables and ROP. Algorithms used for optimizing the support vector regression are the genetic algorithm (GA) and the cuckoo search algorithm (CS). Optimization implementation improved the support vector regression performance by virtue of selecting proper values for its parameters. In order to evaluate the ability of optimization algorithms in enhancing SVR performance, their results were compared to the hybrid of pattern search and grid search (HPG) which is conventionally employed for optimizing SVR. The results demonstrated that the CS algorithm achieved further improvement on prediction accuracy of SVR compared to the GA and HPG as well. Moreover, the predictive model derived from back propagation neural network (BPNN), which is the traditional approach for estimating ROP, is selected for comparisons with CSSVR. The comparative results revealed the superiority of CSSVR. This study inferred that CSSVR is a viable option for precise estimation of ROP.
Day-Ahead Wind Speed Forecasting Using Relevance Vector Machine
Directory of Open Access Journals (Sweden)
Guoqiang Sun
2014-01-01
Full Text Available With the development of wind power technology, the security of the power system, power quality, and stable operation will meet new challenges. So, in this paper, we propose a recently developed machine learning technique, relevance vector machine (RVM, for day-ahead wind speed forecasting. We combine Gaussian kernel function and polynomial kernel function to get mixed kernel for RVM. Then, RVM is compared with back propagation neural network (BP and support vector machine (SVM for wind speed forecasting in four seasons in precision and velocity; the forecast results demonstrate that the proposed method is reasonable and effective.
Theory of net analyte signal vectors in inverse regression
DEFF Research Database (Denmark)
Bro, R.; Andersen, Charlotte Møller
2003-01-01
The. net analyte signal and the net analyte signal vector are useful measures in building and optimizing multivariate calibration models. In this paper a theory for their use in inverse regression is developed. The theory of net analyte signal was originally derived from classical least squares i...... recently suggested by Faber (Anal. Chem. 1998; 70: 5108-5110). A required correction of the net analyte signal in situations with negative predicted responses is also discussed. Copyright (C) 2004 John Wiley Sons, Ltd.......The. net analyte signal and the net analyte signal vector are useful measures in building and optimizing multivariate calibration models. In this paper a theory for their use in inverse regression is developed. The theory of net analyte signal was originally derived from classical least squares...... in spectral calibration where the responses of all pure analytes and interferents are assumed to be known. However, in chemometrics, inverse calibration models such as partial least squares regression are more abundant and several tools for calculating the net analyte signal in inverse regression models have...
Vector machine techniques for modeling of seismic liquefaction data
Directory of Open Access Journals (Sweden)
Pijush Samui
2014-06-01
Full Text Available This article employs three soft computing techniques, Support Vector Machine (SVM; Least Square Support Vector Machine (LSSVM and Relevance Vector Machine (RVM, for prediction of liquefaction susceptibility of soil. SVM and LSSVM are based on the structural risk minimization (SRM principle which seeks to minimize an upper bound of the generalization error consisting of the sum of the training error and a confidence interval. RVM is a sparse Bayesian kernel machine. SVM, LSSVM and RVM have been used as classification tools. The developed SVM, LSSVM and RVM give equations for prediction of liquefaction susceptibility of soil. A comparative study has been carried out between the developed SVM, LSSVM and RVM models. The results from this article indicate that the developed SVM gives the best performance for prediction of liquefaction susceptibility of soil.
Support Vector Machine Classification of Drunk Driving Behaviour.
Chen, Huiqin; Chen, Lei
2017-01-23
Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM) classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R-R intervals (SDNN), the root mean square value of the difference of the adjacent R-R interval series (RMSSD), low frequency (LF), high frequency (HF), the ratio of the low and high frequencies (LF/HF), and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.
Support Vector Machine Classification of Drunk Driving Behaviour
Directory of Open Access Journals (Sweden)
Huiqin Chen
2017-01-01
Full Text Available Alcohol is the root cause of numerous traffic accidents due to its pharmacological action on the human central nervous system. This study conducted a detection process to distinguish drunk driving from normal driving under simulated driving conditions. The classification was performed by a support vector machine (SVM classifier trained to distinguish between these two classes by integrating both driving performance and physiological measurements. In addition, principal component analysis was conducted to rank the weights of the features. The standard deviation of R–R intervals (SDNN, the root mean square value of the difference of the adjacent R–R interval series (RMSSD, low frequency (LF, high frequency (HF, the ratio of the low and high frequencies (LF/HF, and average blink duration were the highest weighted features in the study. The results show that SVM classification can successfully distinguish drunk driving from normal driving with an accuracy of 70%. The driving performance data and the physiological measurements reported by this paper combined with air-alcohol concentration could be integrated using the support vector regression classification method to establish a better early warning model, thereby improving vehicle safety.
Directory of Open Access Journals (Sweden)
Hong-Juan Li
2013-04-01
Full Text Available Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by strong non-linear learning capability of support vector regression (SVR, this paper presents a SVR model hybridized with the empirical mode decomposition (EMD method and auto regression (AR for electric load forecasting. The electric load data of the New South Wales (Australia market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.
Fault trend prediction of device based on support vector regression
International Nuclear Information System (INIS)
Song Meicun; Cai Qi
2011-01-01
The research condition of fault trend prediction and the basic theory of support vector regression (SVR) were introduced. SVR was applied to the fault trend prediction of roller bearing, and compared with other methods (BP neural network, gray model, and gray-AR model). The results show that BP network tends to overlearn and gets into local minimum so that the predictive result is unstable. It also shows that the predictive result of SVR is stabilization, and SVR is superior to BP neural network, gray model and gray-AR model in predictive precision. SVR is a kind of effective method of fault trend prediction. (authors)
Prediction of hourly PM2.5 using a space-time support vector regression model
Yang, Wentao; Deng, Min; Xu, Feng; Wang, Hang
2018-05-01
Real-time air quality prediction has been an active field of research in atmospheric environmental science. The existing methods of machine learning are widely used to predict pollutant concentrations because of their enhanced ability to handle complex non-linear relationships. However, because pollutant concentration data, as typical geospatial data, also exhibit spatial heterogeneity and spatial dependence, they may violate the assumptions of independent and identically distributed random variables in most of the machine learning methods. As a result, a space-time support vector regression model is proposed to predict hourly PM2.5 concentrations. First, to address spatial heterogeneity, spatial clustering is executed to divide the study area into several homogeneous or quasi-homogeneous subareas. To handle spatial dependence, a Gauss vector weight function is then developed to determine spatial autocorrelation variables as part of the input features. Finally, a local support vector regression model with spatial autocorrelation variables is established for each subarea. Experimental data on PM2.5 concentrations in Beijing are used to verify whether the results of the proposed model are superior to those of other methods.
Klasifikasi Penyakit Ayam Menggunakan Metode Support Vector Machine
Directory of Open Access Journals (Sweden)
Eka Dwi Nurcahya
2017-04-01
Full Text Available Penyakit pada ayam memiliki banyak jenis tetapi mempunyai beberapa gejala yang mirip dengan perbedaan yang sedikit. Gejala penyakit yang sulit dibedakan membuat peternak rentan melakukan kesalahan penanganan. Klasifikasi menggunakan teknik komputasi dapat membedakan jenis penyakit dengan lebih baik. Support Vector Machine adalah salah satu teknik komputasi yang dapat digunakan. Penelitian ini menggunakan jenis penyakit ayam antara lain Avian Influenza, Cronic Respiratory Disease, Corryza, Newcastle Disease, Gumboro, dan Koksidiosis. Gejala-gejala yang ditimbulkan dari enam penyakit yang diteliti berjumlah 26 jenis gejala dengan objek penelitian berjumlah 105 ekor ayam. Metode pengambilan data menggunakan data lapangan hasil dari pengamatan peternak sesuai rujukan ahli peternakan. Support Vector Machine sebagai pengklasifikasi memberikan bobot pembeda pada setiap penyakit dengan dasar gejala yang dimiliki setiap penyakit. Hasil penerapan Support Vector Machine pada klasifikasi penyakit ayam medapatkan nilai akurasi sebesar 84,7% atau 89 data sesuai dengan rujukan ahli.
A Support Vector Machine-Based Voice Activity Detection Employing Effective Feature Vectors
Jo, Q.-Haing; Park, Yun-Sik; Lee, Kye-Hwan; Chang, Joon-Hyuk
In this letter, we propose effective feature vectors to improve the performance of voice activity detection (VAD) employing a support vector machine (SVM), which is known to incorporate an optimized nonlinear decision over two different classes. To extract the effective feature vectors, we present a novel scheme that combines the a posteriori SNR, a priori SNR, and predicted SNR, widely adopted in conventional statistical model-based VAD.
[Model transfer method based on support vector machine].
Xiong, Yu-hong; Wen, Zhi-yu; Liang, Yu-qian; Chen, Qin; Zhang, Bo; Liu, Yu; Xiang, Xian-yi
2007-01-01
The model transfer is a basic method to build up universal and comparable performance of spectrometer data by seeking a mathematical transformation relation among different spectrometers. Because of nonlinear effect and small calibration sample set in fact, it is important to solve the problem of model transfer under the condition of nonlinear effect in evidence and small sample set. This paper summarizes support vector machines theory, puts forward the method of model transfer based on support vector machine and piecewise direct standardization, and makes use of computer simulation method, giving a example to explain the method and compare it with artificial neural network in the end.
Prediction of Banking Systemic Risk Based on Support Vector Machine
Directory of Open Access Journals (Sweden)
Shouwei Li
2013-01-01
Full Text Available Banking systemic risk is a complex nonlinear phenomenon and has shed light on the importance of safeguarding financial stability by recent financial crisis. According to the complex nonlinear characteristics of banking systemic risk, in this paper we apply support vector machine (SVM to the prediction of banking systemic risk in an attempt to suggest a new model with better explanatory power and stability. We conduct a case study of an SVM-based prediction model for Chinese banking systemic risk and find the experiment results showing that support vector machine is an efficient method in such case.
Support vector machine for day ahead electricity price forecasting
Razak, Intan Azmira binti Wan Abdul; Abidin, Izham bin Zainal; Siah, Yap Keem; Rahman, Titik Khawa binti Abdul; Lada, M. Y.; Ramani, Anis Niza binti; Nasir, M. N. M.; Ahmad, Arfah binti
2015-05-01
Electricity price forecasting has become an important part of power system operation and planning. In a pool- based electric energy market, producers submit selling bids consisting in energy blocks and their corresponding minimum selling prices to the market operator. Meanwhile, consumers submit buying bids consisting in energy blocks and their corresponding maximum buying prices to the market operator. Hence, both producers and consumers use day ahead price forecasts to derive their respective bidding strategies to the electricity market yet reduce the cost of electricity. However, forecasting electricity prices is a complex task because price series is a non-stationary and highly volatile series. Many factors cause for price spikes such as volatility in load and fuel price as well as power import to and export from outside the market through long term contract. This paper introduces an approach of machine learning algorithm for day ahead electricity price forecasting with Least Square Support Vector Machine (LS-SVM). Previous day data of Hourly Ontario Electricity Price (HOEP), generation's price and demand from Ontario power market are used as the inputs for training data. The simulation is held using LSSVMlab in Matlab with the training and testing data of 2004. SVM that widely used for classification and regression has great generalization ability with structured risk minimization principle rather than empirical risk minimization. Moreover, same parameter settings in trained SVM give same results that absolutely reduce simulation process compared to other techniques such as neural network and time series. The mean absolute percentage error (MAPE) for the proposed model shows that SVM performs well compared to neural network.
Explaining Support Vector Machines: A Color Based Nomogram.
Directory of Open Access Journals (Sweden)
Vanya Van Belle
Full Text Available Support vector machines (SVMs are very popular tools for classification, regression and other problems. Due to the large choice of kernels they can be applied with, a large variety of data can be analysed using these tools. Machine learning thanks its popularity to the good performance of the resulting models. However, interpreting the models is far from obvious, especially when non-linear kernels are used. Hence, the methods are used as black boxes. As a consequence, the use of SVMs is less supported in areas where interpretability is important and where people are held responsible for the decisions made by models.In this work, we investigate whether SVMs using linear, polynomial and RBF kernels can be explained such that interpretations for model-based decisions can be provided. We further indicate when SVMs can be explained and in which situations interpretation of SVMs is (hitherto not possible. Here, explainability is defined as the ability to produce the final decision based on a sum of contributions which depend on one single or at most two input variables.Our experiments on simulated and real-life data show that explainability of an SVM depends on the chosen parameter values (degree of polynomial kernel, width of RBF kernel and regularization constant. When several combinations of parameter values yield the same cross-validation performance, combinations with a lower polynomial degree or a larger kernel width have a higher chance of being explainable.This work summarizes SVM classifiers obtained with linear, polynomial and RBF kernels in a single plot. Linear and polynomial kernels up to the second degree are represented exactly. For other kernels an indication of the reliability of the approximation is presented. The complete methodology is available as an R package and two apps and a movie are provided to illustrate the possibilities offered by the method.
Online Support Vector Regression with Varying Parameters for Time-Dependent Data
International Nuclear Information System (INIS)
Omitaomu, Olufemi A.; Jeong, Myong K.; Badiru, Adedeji B.
2011-01-01
Support vector regression (SVR) is a machine learning technique that continues to receive interest in several domains including manufacturing, engineering, and medicine. In order to extend its application to problems in which datasets arrive constantly and in which batch processing of the datasets is infeasible or expensive, an accurate online support vector regression (AOSVR) technique was proposed. The AOSVR technique efficiently updates a trained SVR function whenever a sample is added to or removed from the training set without retraining the entire training data. However, the AOSVR technique assumes that the new samples and the training samples are of the same characteristics; hence, the same value of SVR parameters is used for training and prediction. This assumption is not applicable to data samples that are inherently noisy and non-stationary such as sensor data. As a result, we propose Accurate On-line Support Vector Regression with Varying Parameters (AOSVR-VP) that uses varying SVR parameters rather than fixed SVR parameters, and hence accounts for the variability that may exist in the samples. To accomplish this objective, we also propose a generalized weight function to automatically update the weights of SVR parameters in on-line monitoring applications. The proposed function allows for lower and upper bounds for SVR parameters. We tested our proposed approach and compared results with the conventional AOSVR approach using two benchmark time series data and sensor data from nuclear power plant. The results show that using varying SVR parameters is more applicable to time dependent data.
Single Object Tracking With Fuzzy Least Squares Support Vector Machine.
Zhang, Shunli; Zhao, Sicong; Sui, Yao; Zhang, Li
2015-12-01
Single object tracking, in which a target is often initialized manually in the first frame and then is tracked and located automatically in the subsequent frames, is a hot topic in computer vision. The traditional tracking-by-detection framework, which often formulates tracking as a binary classification problem, has been widely applied and achieved great success in single object tracking. However, there are some potential issues in this formulation. For instance, the boundary between the positive and negative training samples is fuzzy, and the objectives of tracking and classification are inconsistent. In this paper, we attempt to address the above issues from the fuzzy system perspective and propose a novel tracking method by formulating tracking as a fuzzy classification problem. First, we introduce the fuzzy strategy into tracking and propose a novel fuzzy tracking framework, which can measure the importance of the training samples by assigning different memberships to them and offer more strict spatial constraints. Second, we develop a fuzzy least squares support vector machine (FLS-SVM) approach and employ it to implement a concrete tracker. In particular, the primal form, dual form, and kernel form of FLS-SVM are analyzed and the corresponding closed-form solutions are derived for efficient realizations. Besides, a least squares regression model is built to control the update adaptively, retaining the robustness of the appearance model. The experimental results demonstrate that our method can achieve comparable or superior performance to many state-of-the-art methods.
Upport vector machines for nonlinear kernel ARMA system identification.
Martínez-Ramón, Manel; Rojo-Alvarez, José Luis; Camps-Valls, Gustavo; Muñioz-Marí, Jordi; Navia-Vázquez, Angel; Soria-Olivas, Emilio; Figueiras-Vidal, Aníbal R
2006-11-01
Nonlinear system identification based on support vector machines (SVM) has been usually addressed by means of the standard SVM regression (SVR), which can be seen as an implicit nonlinear autoregressive and moving average (ARMA) model in some reproducing kernel Hilbert space (RKHS). The proposal of this letter is twofold. First, the explicit consideration of an ARMA model in an RKHS (SVM-ARMA2K) is proposed. We show that stating the ARMA equations in an RKHS leads to solving the regularized normal equations in that RKHS, in terms of the autocorrelation and cross correlation of the (nonlinearly) transformed input and output discrete time processes. Second, a general class of SVM-based system identification nonlinear models is presented, based on the use of composite Mercer's kernels. This general class can improve model flexibility by emphasizing the input-output cross information (SVM-ARMA4K), which leads to straightforward and natural combinations of implicit and explicit ARMA models (SVR-ARMA2K and SVR-ARMA4K). Capabilities of these different SVM-based system identification schemes are illustrated with two benchmark problems.
A support vector machine approach to the development of an ...
African Journals Online (AJOL)
... the results with a standard benchmark, 99.78% and 98.89% were obtained for the detection of normal and attacks when tested with 5500 packets. The results suggest an improved detection efficiency and false alarm rate. Key words: Support Vector Machine, Classification algorithm, Network Security, Network Packets ...
Identifying saltcedar with hyperspectral data and support vector machines
Saltcedar (Tamarix spp.) are a group of dense phreatophytic shrubs and trees that are invasive to riparian areas throughout the United States. This study determined the feasibility of using hyperspectral data and a support vector machine (SVM) classifier to discriminate saltcedar from other cover t...
Variance inflation in high dimensional Support Vector Machines
DEFF Research Database (Denmark)
Abrahamsen, Trine Julie; Hansen, Lars Kai
2013-01-01
Many important machine learning models, supervised and unsupervised, are based on simple Euclidean distance or orthogonal projection in a high dimensional feature space. When estimating such models from small training sets we face the problem that the span of the training data set input vectors i...
Landslide susceptibility mapping using support vector machine and ...
Indian Academy of Sciences (India)
based support vector machine (SVM) at Kalaleh Township area of the Golestan province, Iran. In this paper, six different types of kernel classifiers such as linear, polynomial degree of 2, polynomial degree of 3, polynomial degree of 4, radial ...
Landslide susceptibility mapping using support vector machine and ...
Indian Academy of Sciences (India)
port vector machine by employing six types of kernel function classifiers. Subsequently, the results were plotted in ArcGIS and six landslide susceptibility maps were produced. Then, using the success rate and the prediction rate methods, the validation process was performed by comparing the existing landslide data with ...
The efficacy of support vector machines (SVM) in robust ...
Indian Academy of Sciences (India)
The efficacy of support vector machines (SVM) in robust determination of earthquake early ... This work deals with a methodology applied to seismic early warning systems which are designed to provide real-time estimation of the ... The effectiveness of warning systems can be pre- dicted by using P-wave rather than S-wave ...
Relevance Vector Machine for Prediction of Soil Properties | Samui ...
African Journals Online (AJOL)
One of the first, most important steps in geotechnical engineering is site characterization. The ultimate goal of site characterization is to predict the in-situ soil properties at any half-space point for a site based on limited number of tests and data. In the present study, relevance vector machine (RVM) has been used to develop ...
Perturbation to enhance support vector machines for classification
To, K. N.; Lim, C. C.
2004-02-01
This paper presents a perturbation method to obtain a sensitivity measure of trained solution from a set of input patterns using support vector machine learning. Applying to classify images of binary classes, the sensitivity measure adds ability to extract content of interest of images for two classes in image classification.
Experimental comparison of support vector machines with random ...
Indian Academy of Sciences (India)
a positive contribution to the problem of land cover classification by exploring generalized reduced gra- dient method, support vector machines, and random forests to improve producer accuracy and overall classification accuracy. The performance ..... first performed using the MNF transform (Karaska et al. 2004; Plaza et al.
Estimating Frequency by Interpolation Using Least Squares Support Vector Regression
Directory of Open Access Journals (Sweden)
Changwei Ma
2015-01-01
Full Text Available Discrete Fourier transform- (DFT- based maximum likelihood (ML algorithm is an important part of single sinusoid frequency estimation. As signal to noise ratio (SNR increases and is above the threshold value, it will lie very close to Cramer-Rao lower bound (CRLB, which is dependent on the number of DFT points. However, its mean square error (MSE performance is directly proportional to its calculation cost. As a modified version of support vector regression (SVR, least squares SVR (LS-SVR can not only still keep excellent capabilities for generalizing and fitting but also exhibit lower computational complexity. In this paper, therefore, LS-SVR is employed to interpolate on Fourier coefficients of received signals and attain high frequency estimation accuracy. Our results show that the proposed algorithm can make a good compromise between calculation cost and MSE performance under the assumption that the sample size, number of DFT points, and resampling points are already known.
Electricity Load Forecasting Using Support Vector Regression with Memetic Algorithms
Directory of Open Access Journals (Sweden)
Zhongyi Hu
2013-01-01
Full Text Available Electricity load forecasting is an important issue that is widely explored and examined in power systems operation literature and commercial transactions in electricity markets literature as well. Among the existing forecasting models, support vector regression (SVR has gained much attention. Considering the performance of SVR highly depends on its parameters; this study proposed a firefly algorithm (FA based memetic algorithm (FA-MA to appropriately determine the parameters of SVR forecasting model. In the proposed FA-MA algorithm, the FA algorithm is applied to explore the solution space, and the pattern search is used to conduct individual learning and thus enhance the exploitation of FA. Experimental results confirm that the proposed FA-MA based SVR model can not only yield more accurate forecasting results than the other four evolutionary algorithms based SVR models and three well-known forecasting models but also outperform the hybrid algorithms in the related existing literature.
Lithium-ion battery remaining useful life prediction based on grey support vector machines
Directory of Open Access Journals (Sweden)
Xiaogang Li
2015-12-01
Full Text Available In this article, an improved grey prediction model is proposed to address low-accuracy prediction issue of grey forecasting model. The first step is using a trigonometric function to transform the original data sequence to smooth the data, which is called smoothness of grey prediction model, and then a grey support vector machine model by integrating the improved grey model with support vector machine is introduced. At the initial stage of the model, trigonometric functions and accumulation generation operation can be used to preprocess the data, which enhances the smoothness of the data and reduces the associated randomness. In addition, support vector machine is implemented to establish a prediction model for the pre-processed data and select the optimal model parameters via genetic algorithms. Finally, the data are restored through the ‘regressive generate’ operation to obtain the forecasting data. To prove that the grey support vector machine model is superior to the other models, the battery life data from the Center for Advanced Life Cycle Engineering are selected, and the presented model is used to predict the remaining useful life of the battery. The predicted result is compared to that of grey model and support vector machines. For a more intuitive comparison of the three models, this article quantifies the root mean square errors for these three different models in the case of different ratio of training samples and prediction samples. The results show that the effect of grey support vector machine model is optimal, and the corresponding root mean square error is only 3.18%.
A Novel Support Vector Machine with Globality-Locality Preserving
Directory of Open Access Journals (Sweden)
Cheng-Long Ma
2014-01-01
Full Text Available Support vector machine (SVM is regarded as a powerful method for pattern classification. However, the solution of the primal optimal model of SVM is susceptible for class distribution and may result in a nonrobust solution. In order to overcome this shortcoming, an improved model, support vector machine with globality-locality preserving (GLPSVM, is proposed. It introduces globality-locality preserving into the standard SVM, which can preserve the manifold structure of the data space. We complete rich experiments on the UCI machine learning data sets. The results validate the effectiveness of the proposed model, especially on the Wine and Iris databases; the recognition rate is above 97% and outperforms all the algorithms that were developed from SVM.
A novel support vector machine with globality-locality preserving.
Ma, Cheng-Long; Yuan, Yu-Bo
2014-01-01
Support vector machine (SVM) is regarded as a powerful method for pattern classification. However, the solution of the primal optimal model of SVM is susceptible for class distribution and may result in a nonrobust solution. In order to overcome this shortcoming, an improved model, support vector machine with globality-locality preserving (GLPSVM), is proposed. It introduces globality-locality preserving into the standard SVM, which can preserve the manifold structure of the data space. We complete rich experiments on the UCI machine learning data sets. The results validate the effectiveness of the proposed model, especially on the Wine and Iris databases; the recognition rate is above 97% and outperforms all the algorithms that were developed from SVM.
Support vector regression for porosity prediction in a heterogeneous reservoir: A comparative study
Al-Anazi, A. F.; Gates, I. D.
2010-12-01
In wells with limited log and core data, porosity, a fundamental and essential property to characterize reservoirs, is challenging to estimate by conventional statistical methods from offset well log and core data in heterogeneous formations. Beyond simple regression, neural networks have been used to develop more accurate porosity correlations. Unfortunately, neural network-based correlations have limited generalization ability and global correlations for a field are usually less accurate compared to local correlations for a sub-region of the reservoir. In this paper, support vector machines are explored as an intelligent technique to correlate porosity to well log data. Recently, support vector regression (SVR), based on the statistical learning theory, have been proposed as a new intelligence technique for both prediction and classification tasks. The underlying formulation of support vector machines embodies the structural risk minimization (SRM) principle which has been shown to be superior to the traditional empirical risk minimization (ERM) principle employed by conventional neural networks and classical statistical methods. This new formulation uses margin-based loss functions to control model complexity independently of the dimensionality of the input space, and kernel functions to project the estimation problem to a higher dimensional space, which enables the solution of more complex nonlinear problem optimization methods to exist for a globally optimal solution. SRM minimizes an upper bound on the expected risk using a margin-based loss function ( ɛ-insensitivity loss function for regression) in contrast to ERM which minimizes the error on the training data. Unlike classical learning methods, SRM, indexed by margin-based loss function, can also control model complexity independent of dimensionality. The SRM inductive principle is designed for statistical estimation with finite data where the ERM inductive principle provides the optimal solution (the
Directory of Open Access Journals (Sweden)
Laura Cornejo-Bueno
2017-11-01
Full Text Available Wind Power Ramp Events (WPREs are large fluctuations of wind power in a short time interval, which lead to strong, undesirable variations in the electric power produced by a wind farm. Its accurate prediction is important in the effort of efficiently integrating wind energy in the electric system, without affecting considerably its stability, robustness and resilience. In this paper, we tackle the problem of predicting WPREs by applying Machine Learning (ML regression techniques. Our approach consists of using variables from atmospheric reanalysis data as predictive inputs for the learning machine, which opens the possibility of hybridizing numerical-physical weather models with ML techniques for WPREs prediction in real systems. Specifically, we have explored the feasibility of a number of state-of-the-art ML regression techniques, such as support vector regression, artificial neural networks (multi-layer perceptrons and extreme learning machines and Gaussian processes to solve the problem. Furthermore, the ERA-Interim reanalysis from the European Center for Medium-Range Weather Forecasts is the one used in this paper because of its accuracy and high resolution (in both spatial and temporal domains. Aiming at validating the feasibility of our predicting approach, we have carried out an extensive experimental work using real data from three wind farms in Spain, discussing the performance of the different ML regression tested in this wind power ramp event prediction problem.
Robust support vector machine-trained fuzzy system.
Forghani, Yahya; Yazdi, Hadi Sadoghi
2014-02-01
Because the SVM (support vector machine) classifies data with the widest symmetric margin to decrease the probability of the test error, modern fuzzy systems use SVM to tune the parameters of fuzzy if-then rules. But, solving the SVM model is time-consuming. To overcome this disadvantage, we propose a rapid method to solve the robust SVM model and use it to tune the parameters of fuzzy if-then rules. The robust SVM is an extension of SVM for interval-valued data classification. We compare our proposed method with SVM, robust SVM, ISVM-FC (incremental support vector machine-trained fuzzy classifier), BSVM-FC (batch support vector machine-trained fuzzy classifier), SOTFN-SV (a self-organizing TS-type fuzzy network with support vector learning) and SCLSE (a TS-type fuzzy system with subtractive clustering for antecedent parameter tuning and LSE for consequent parameter tuning) by using some real datasets. According to experimental results, the use of proposed approach leads to very low training and testing time with good misclassification rate. Copyright © 2013 Elsevier Ltd. All rights reserved.
Online Sequential Projection Vector Machine with Adaptive Data Mean Update
Directory of Open Access Journals (Sweden)
Lin Chen
2016-01-01
Full Text Available We propose a simple online learning algorithm especial for high-dimensional data. The algorithm is referred to as online sequential projection vector machine (OSPVM which derives from projection vector machine and can learn from data in one-by-one or chunk-by-chunk mode. In OSPVM, data centering, dimension reduction, and neural network training are integrated seamlessly. In particular, the model parameters including (1 the projection vectors for dimension reduction, (2 the input weights, biases, and output weights, and (3 the number of hidden nodes can be updated simultaneously. Moreover, only one parameter, the number of hidden nodes, needs to be determined manually, and this makes it easy for use in real applications. Performance comparison was made on various high-dimensional classification problems for OSPVM against other fast online algorithms including budgeted stochastic gradient descent (BSGD approach, adaptive multihyperplane machine (AMM, primal estimated subgradient solver (Pegasos, online sequential extreme learning machine (OSELM, and SVD + OSELM (feature selection based on SVD is performed before OSELM. The results obtained demonstrated the superior generalization performance and efficiency of the OSPVM.
Collapse moment estimation by support vector machines for wall-thinned pipe bends and elbows
International Nuclear Information System (INIS)
Na, Man Gyun; Kim, Jin Weon; Hwang, In Joon
2007-01-01
The collapse moment due to wall-thinned defects is estimated through support vector machines with parameters optimized by a genetic algorithm. The support vector regression models are developed and applied to numerical data obtained from the finite element analysis for wall-thinned defects in piping systems. The support vector regression models are optimized by using both the data sets (training data and optimization data) prepared for training and optimization, and its performance verification is performed by using another data set (test data) different from the training data and the optimization data. In this work, three support vector regression models are developed, respectively, for three data sets divided into the three classes of extrados, intrados, and crown defects, which is because they have different characteristics. The relative root mean square (RMS) errors of the estimated collapse moment are 0.2333% for the training data, 0.5229% for the optimization data and 0.5011% for the test data. It is known from this result that the support vector regression models are sufficiently accurate to be used in the integrity evaluation of wall-thinned pipe bends and elbows
A Support Vector Regression Approach for Investigating Multianticipative Driving Behavior
Directory of Open Access Journals (Sweden)
Bin Lu
2015-01-01
Full Text Available This paper presents a Support Vector Regression (SVR approach that can be applied to predict the multianticipative driving behavior using vehicle trajectory data. Building upon the SVR approach, a multianticipative car-following model is developed and enhanced in learning speed and predication accuracy. The model training and validation are conducted by using the field trajectory data extracted from the Next Generation Simulation (NGSIM project. During the model training and validation tests, the estimation results show that the SVR model performs as well as IDM model with respect to the model prediction accuracy. In addition, this paper performs a relative importance analysis to quantify the multianticipation in terms of the different stimuli to which drivers react in platoon car following. The analysis results confirm that drivers respond to the behavior of not only the immediate leading vehicle in front but also the second, third, and even fourth leading vehicles. Specifically, in congested traffic conditions, drivers are observed to be more sensitive to the relative speed than to the gap. These findings provide insight into multianticipative driving behavior and illustrate the necessity of taking into account multianticipative car-following model in microscopic traffic simulation.
Mixed kernel function support vector regression for global sensitivity analysis
Cheng, Kai; Lu, Zhenzhou; Wei, Yuhao; Shi, Yan; Zhou, Yicheng
2017-11-01
Global sensitivity analysis (GSA) plays an important role in exploring the respective effects of input variables on an assigned output response. Amongst the wide sensitivity analyses in literature, the Sobol indices have attracted much attention since they can provide accurate information for most models. In this paper, a mixed kernel function (MKF) based support vector regression (SVR) model is employed to evaluate the Sobol indices at low computational cost. By the proposed derivation, the estimation of the Sobol indices can be obtained by post-processing the coefficients of the SVR meta-model. The MKF is constituted by the orthogonal polynomials kernel function and Gaussian radial basis kernel function, thus the MKF possesses both the global characteristic advantage of the polynomials kernel function and the local characteristic advantage of the Gaussian radial basis kernel function. The proposed approach is suitable for high-dimensional and non-linear problems. Performance of the proposed approach is validated by various analytical functions and compared with the popular polynomial chaos expansion (PCE). Results demonstrate that the proposed approach is an efficient method for global sensitivity analysis.
Support vector machine classifier for diagnosis in electrical machines: Application to broken bar
Matic, Dragan; Kulic, Filip; Pineda-Sanchez, Manuel; Kamenko, Ilija
2012-01-01
This paper presents a support vector machine classifier for broken bar detection in electrical induction machine. It is a reliable online method, which has high robustness to load variations and changing operating conditions. The phase current is only physical value to be measured. The steady state current is analyzed for broken bar fault via motor current signature analysis technique based on Hilbert transform. A two dimensional feature space is proposed. The features are: magnitude and freq...
Product Quality Modelling Based on Incremental Support Vector Machine
International Nuclear Information System (INIS)
Wang, J; Zhang, W; Qin, B; Shi, W
2012-01-01
Incremental Support vector machine (ISVM) is a new learning method developed in recent years based on the foundations of statistical learning theory. It is suitable for the problem of sequentially arriving field data and has been widely used for product quality prediction and production process optimization. However, the traditional ISVM learning does not consider the quality of the incremental data which may contain noise and redundant data; it will affect the learning speed and accuracy to a great extent. In order to improve SVM training speed and accuracy, a modified incremental support vector machine (MISVM) is proposed in this paper. Firstly, the margin vectors are extracted according to the Karush-Kuhn-Tucker (KKT) condition; then the distance from the margin vectors to the final decision hyperplane is calculated to evaluate the importance of margin vectors, where the margin vectors are removed while their distance exceed the specified value; finally, the original SVs and remaining margin vectors are used to update the SVM. The proposed MISVM can not only eliminate the unimportant samples such as noise samples, but also can preserve the important samples. The MISVM has been experimented on two public data and one field data of zinc coating weight in strip hot-dip galvanizing, and the results shows that the proposed method can improve the prediction accuracy and the training speed effectively. Furthermore, it can provide the necessary decision supports and analysis tools for auto control of product quality, and also can extend to other process industries, such as chemical process and manufacturing process.
Product Quality Modelling Based on Incremental Support Vector Machine
Wang, J.; Zhang, W.; Qin, B.; Shi, W.
2012-05-01
Incremental Support vector machine (ISVM) is a new learning method developed in recent years based on the foundations of statistical learning theory. It is suitable for the problem of sequentially arriving field data and has been widely used for product quality prediction and production process optimization. However, the traditional ISVM learning does not consider the quality of the incremental data which may contain noise and redundant data; it will affect the learning speed and accuracy to a great extent. In order to improve SVM training speed and accuracy, a modified incremental support vector machine (MISVM) is proposed in this paper. Firstly, the margin vectors are extracted according to the Karush-Kuhn-Tucker (KKT) condition; then the distance from the margin vectors to the final decision hyperplane is calculated to evaluate the importance of margin vectors, where the margin vectors are removed while their distance exceed the specified value; finally, the original SVs and remaining margin vectors are used to update the SVM. The proposed MISVM can not only eliminate the unimportant samples such as noise samples, but also can preserve the important samples. The MISVM has been experimented on two public data and one field data of zinc coating weight in strip hot-dip galvanizing, and the results shows that the proposed method can improve the prediction accuracy and the training speed effectively. Furthermore, it can provide the necessary decision supports and analysis tools for auto control of product quality, and also can extend to other process industries, such as chemical process and manufacturing process.
Fuzzy support vector machines for adaptive Morse code recognition.
Yang, Cheng-Hong; Jin, Li-Cheng; Chuang, Li-Yeh
2006-11-01
Morse code is now being harnessed for use in rehabilitation applications of augmentative-alternative communication and assistive technology, facilitating mobility, environmental control and adapted worksite access. In this paper, Morse code is selected as a communication adaptive device for persons who suffer from muscle atrophy, cerebral palsy or other severe handicaps. A stable typing rate is strictly required for Morse code to be effective as a communication tool. Therefore, an adaptive automatic recognition method with a high recognition rate is needed. The proposed system uses both fuzzy support vector machines and the variable-degree variable-step-size least-mean-square algorithm to achieve these objectives. We apply fuzzy memberships to each point, and provide different contributions to the decision learning function for support vector machines. Statistical analyses demonstrated that the proposed method elicited a higher recognition rate than other algorithms in the literature.
Sistem Deteksi Retinopati Diabetik Menggunakan Support Vector Machine
Directory of Open Access Journals (Sweden)
Wahyudi Setiawan
2014-02-01
Full Text Available Diabetic Retinopathy is a complication of Diabetes Melitus. It can be a blindness if untreated settled as early as possible. System created in this thesis is the detection of diabetic retinopathy level of the image obtained from fundus photographs. There are three main steps to resolve the problems, preprocessing, feature extraction and classification. Preprocessing methods that used in this system are Grayscale Green Channel, Gaussian Filter, Contrast Limited Adaptive Histogram Equalization and Masking. Two Dimensional Linear Discriminant Analysis (2DLDA is used for feature extraction. Support Vector Machine (SVM is used for classification. The test result performed by taking a dataset of MESSIDOR with number of images that vary for the training phase, otherwise is used for the testing phase. Test result show the optimal accuracy are 84% . Keywords : Diabetic Retinopathy, Support Vector Machine, Two Dimensional Linear Discriminant Analysis, MESSIDOR
Prototype Vector Machine for Large Scale Semi-Supervised Learning
Energy Technology Data Exchange (ETDEWEB)
Zhang, Kai; Kwok, James T.; Parvin, Bahram
2009-04-29
Practicaldataminingrarelyfalls exactlyinto the supervisedlearning scenario. Rather, the growing amount of unlabeled data poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computationalintensivenessofgraph-based SSLarises largely from the manifold or graph regularization, which in turn lead to large models that are dificult to handle. To alleviate this, we proposed the prototype vector machine (PVM), a highlyscalable,graph-based algorithm for large-scale SSL. Our key innovation is the use of"prototypes vectors" for effcient approximation on both the graph-based regularizer and model representation. The choice of prototypes are grounded upon two important criteria: they not only perform effective low-rank approximation of the kernel matrix, but also span a model suffering the minimum information loss compared with the complete model. We demonstrate encouraging performance and appealing scaling properties of the PVM on a number of machine learning benchmark data sets.
Held, Elizabeth; Cape, Joshua; Tintle, Nathan
2016-01-01
Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.
Representative Vector Machines: A Unified Framework for Classical Classifiers.
Gui, Jie; Liu, Tongliang; Tao, Dacheng; Sun, Zhenan; Tan, Tieniu
2016-08-01
Classifier design is a fundamental problem in pattern recognition. A variety of pattern classification methods such as the nearest neighbor (NN) classifier, support vector machine (SVM), and sparse representation-based classification (SRC) have been proposed in the literature. These typical and widely used classifiers were originally developed from different theory or application motivations and they are conventionally treated as independent and specific solutions for pattern classification. This paper proposes a novel pattern classification framework, namely, representative vector machines (or RVMs for short). The basic idea of RVMs is to assign the class label of a test example according to its nearest representative vector. The contributions of RVMs are twofold. On one hand, the proposed RVMs establish a unified framework of classical classifiers because NN, SVM, and SRC can be interpreted as the special cases of RVMs with different definitions of representative vectors. Thus, the underlying relationship among a number of classical classifiers is revealed for better understanding of pattern classification. On the other hand, novel and advanced classifiers are inspired in the framework of RVMs. For example, a robust pattern classification method called discriminant vector machine (DVM) is motivated from RVMs. Given a test example, DVM first finds its k -NNs and then performs classification based on the robust M-estimator and manifold regularization. Extensive experimental evaluations on a variety of visual recognition tasks such as face recognition (Yale and face recognition grand challenge databases), object categorization (Caltech-101 dataset), and action recognition (Action Similarity LAbeliNg) demonstrate the advantages of DVM over other classifiers.
Weighted K-means support vector machine for cancer prediction.
Kim, SungHwan
2016-01-01
To date, the support vector machine (SVM) has been widely applied to diverse bio-medical fields to address disease subtype identification and pathogenicity of genetic variants. In this paper, I propose the weighted K-means support vector machine (wKM-SVM) and weighted support vector machine (wSVM), for which I allow the SVM to impose weights to the loss term. Besides, I demonstrate the numerical relations between the objective function of the SVM and weights. Motivated by general ensemble techniques, which are known to improve accuracy, I directly adopt the boosting algorithm to the newly proposed weighted KM-SVM (and wSVM). For predictive performance, a range of simulation studies demonstrate that the weighted KM-SVM (and wSVM) with boosting outperforms the standard KM-SVM (and SVM) including but not limited to many popular classification rules. I applied the proposed methods to simulated data and two large-scale real applications in the TCGA pan-cancer methylation data of breast and kidney cancer. In conclusion, the weighted KM-SVM (and wSVM) increases accuracy of the classification model, and will facilitate disease diagnosis and clinical treatment decisions to benefit patients. A software package (wSVM) is publicly available at the R-project webpage (https://www.r-project.org).
Using support vector regression to predict PM10 and PM2.5
International Nuclear Information System (INIS)
Weizhen, Hou; Zhengqiang, Li; Yuhuan, Zhang; Hua, Xu; Ying, Zhang; Kaitao, Li; Donghui, Li; Peng, Wei; Yan, Ma
2014-01-01
Support vector machine (SVM), as a novel and powerful machine learning tool, can be used for the prediction of PM 10 and PM 2.5 (particulate matter less or equal than 10 and 2.5 micrometer) in the atmosphere. This paper describes the development of a successive over relaxation support vector regress (SOR-SVR) model for the PM 10 and PM 2.5 prediction, based on the daily average aerosol optical depth (AOD) and meteorological parameters (atmospheric pressure, relative humidity, air temperature, wind speed), which were all measured in Beijing during the year of 2010–2012. The Gaussian kernel function, as well as the k-fold crosses validation and grid search method, are used in SVR model to obtain the optimal parameters to get a better generalization capability. The result shows that predicted values by the SOR-SVR model agree well with the actual data and have a good generalization ability to predict PM 10 and PM 2.5 . In addition, AOD plays an important role in predicting particulate matter with SVR model, which should be included in the prediction model. If only considering the meteorological parameters and eliminating AOD from the SVR model, the prediction results of predict particulate matter will be not satisfying
An assessment of support vector machines for land cover classification
Huang, C.; Davis, L.S.; Townshend, J.R.G.
2002-01-01
The support vector machine (SVM) is a group of theoretically superior machine learning algorithms. It was found competitive with the best available machine learning algorithms in classifying high-dimensional data sets. This paper gives an introduction to the theoretical development of the SVM and an experimental evaluation of its accuracy, stability and training speed in deriving land cover classifications from satellite images. The SVM was compared to three other popular classifiers, including the maximum likelihood classifier (MLC), neural network classifiers (NNC) and decision tree classifiers (DTC). The impacts of kernel configuration on the performance of the SVM and of the selection of training data and input variables on the four classifiers were also evaluated in this experiment.
A Parallel Vector Machine for the PM Programming Language
Bellerby, Tim
2016-04-01
PM is a new programming language which aims to make the writing of computational geoscience models on parallel hardware accessible to scientists who are not themselves expert parallel programmers. It is based around the concept of communicating operators: language constructs that enable variables local to a single invocation of a parallelised loop to be viewed as if they were arrays spanning the entire loop domain. This mechanism enables different loop invocations (which may or may not be executing on different processors) to exchange information in a manner that extends the successful Communicating Sequential Processes idiom from single messages to collective communication. Communicating operators avoid the additional synchronisation mechanisms, such as atomic variables, required when programming using the Partitioned Global Address Space (PGAS) paradigm. Using a single loop invocation as the fundamental unit of concurrency enables PM to uniformly represent different levels of parallelism from vector operations through shared memory systems to distributed grids. This paper describes an implementation of PM based on a vectorised virtual machine. On a single processor node, concurrent operations are implemented using masked vector operations. Virtual machine instructions operate on vectors of values and may be unmasked, masked using a Boolean field, or masked using an array of active vector cell locations. Conditional structures (such as if-then-else or while statement implementations) calculate and apply masks to the operations they control. A shift in mask representation from Boolean to location-list occurs when active locations become sufficiently sparse. Parallel loops unfold data structures (or vectors of data structures for nested loops) into vectors of values that may additionally be distributed over multiple computational nodes and then split into micro-threads compatible with the size of the local cache. Inter-node communication is accomplished using
Directory of Open Access Journals (Sweden)
Chan-Yun Yang
2013-01-01
Full Text Available Austempered ductile iron has emerged as a notable material in several engineering fields, including marine applications. The initial austenite carbon content after austenization transform but before austempering process for generating bainite matrix proved critical in controlling the resulted microstructure and thus mechanical properties. In this paper, support vector regression is employed in order to establish a relationship between the initial carbon concentration in the austenite with austenization temperature and alloy contents, thereby exercising improved control in the mechanical properties of the austempered ductile irons. Particularly, the paper emphasizes a methodology tailored to deal with a limited amount of available data with intrinsically contracted and skewed distribution. The collected information from a variety of data sources presents another challenge of highly uncertain variance. The authors present a hybrid model consisting of a procedure of a histogram equalizer and a procedure of a support-vector-machine (SVM- based regression to gain a more robust relationship to respond to the challenges. The results show greatly improved accuracy of the proposed model in comparison to two former established methodologies. The sum squared error of the present model is less than one fifth of that of the two previous models.
Directory of Open Access Journals (Sweden)
Hailun Wang
2017-01-01
Full Text Available Support vector regression algorithm is widely used in fault diagnosis of rolling bearing. A new model parameter selection method for support vector regression based on adaptive fusion of the mixed kernel function is proposed in this paper. We choose the mixed kernel function as the kernel function of support vector regression. The mixed kernel function of the fusion coefficients, kernel function parameters, and regression parameters are combined together as the parameters of the state vector. Thus, the model selection problem is transformed into a nonlinear system state estimation problem. We use a 5th-degree cubature Kalman filter to estimate the parameters. In this way, we realize the adaptive selection of mixed kernel function weighted coefficients and the kernel parameters, the regression parameters. Compared with a single kernel function, unscented Kalman filter (UKF support vector regression algorithms, and genetic algorithms, the decision regression function obtained by the proposed method has better generalization ability and higher prediction accuracy.
Nonlinear regularization path for quadratic loss support vector machines.
Karasuyama, Masayuki; Takeuchi, Ichiro
2011-10-01
Regularization path algorithms have been proposed to deal with model selection problem in several machine learning approaches. These algorithms allow computation of the entire path of solutions for every value of regularization parameter using the fact that their solution paths have piecewise linear form. In this paper, we extend the applicability of regularization path algorithm to a class of learning machines that have quadratic loss and quadratic penalty term. This class contains several important learning machines such as squared hinge loss support vector machine (SVM) and modified Huber loss SVM. We first show that the solution paths of this class of learning machines have piecewise nonlinear form, and piecewise segments between two breakpoints are characterized by a class of rational functions. Then we develop an algorithm that can efficiently follow the piecewise nonlinear path by solving these rational equations. To solve these rational equations, we use rational approximation technique with quadratic convergence rate, and thus, our algorithm can follow the nonlinear path much more precisely than existing approaches such as predictor-corrector type nonlinear-path approximation. We show the algorithm performance on some artificial and real data sets. © 2011 IEEE
Wavelength detection in FBG sensor networks using least squares support vector regression
Chen, Jing; Jiang, Hao; Liu, Tundong; Fu, Xiaoli
2014-04-01
A wavelength detection method for a wavelength division multiplexing (WDM) fiber Bragg grating (FBG) sensor network is proposed based on least squares support vector regression (LS-SVR). As a kind of promising machine learning technique, LS-SVR is employed to approximate the inverse function of the reflection spectrum. The LS-SVR detection model is established from the training samples, and then the Bragg wavelength of each FBG can be directly identified by inputting the measured spectrum into the well-trained model. We also discuss the impact of the sample size and the preprocess of the input spectrum on the performance of the training effectiveness. The results demonstrate that our approach is effective in improving the accuracy for sensor networks with a large number of FBGs.
Chord Recognition Based on Temporal Correlation Support Vector Machine
Directory of Open Access Journals (Sweden)
Zhongyang Rao
2016-05-01
Full Text Available In this paper, we propose a method called temporal correlation support vector machine (TCSVM for automatic major-minor chord recognition in audio music. We first use robust principal component analysis to separate the singing voice from the music to reduce the influence of the singing voice and consider the temporal correlations of the chord features. Using robust principal component analysis, we expect the low-rank component of the spectrogram matrix to contain the musical accompaniment and the sparse component to contain the vocal signals. Then, we extract a new logarithmic pitch class profile (LPCP feature called enhanced LPCP from the low-rank part. To exploit the temporal correlation among the LPCP features of chords, we propose an improved support vector machine algorithm called TCSVM. We perform this study using the MIREX’09 (Music Information Retrieval Evaluation eXchange Audio Chord Estimation dataset. Furthermore, we conduct comprehensive experiments using different pitch class profile feature vectors to examine the performance of TCSVM. The results of our method are comparable to the state-of-the-art methods that entered the MIREX in 2013 and 2014 for the MIREX’09 Audio Chord Estimation task dataset.
Directory of Open Access Journals (Sweden)
Qingsong Xu
2014-01-01
Full Text Available Extreme learning machine (ELM is a learning algorithm for single-hidden layer feedforward neural network dedicated to an extremely fast learning. However, the performance of ELM in structural impact localization is unknown yet. In this paper, a comparison study of ELM with least squares support vector machine (LSSVM is presented for the application on impact localization of a plate structure with surface-mounted piezoelectric sensors. Both basic and kernel-based ELM regression models have been developed for the location prediction. Comparative studies of the basic ELM, kernel-based ELM, and LSSVM models are carried out. Results show that the kernel-based ELM requires the shortest learning time and it is capable of producing suboptimal localization accuracy among the three models. Hence, ELM paves a promising way in structural impact detection.
Wang, Yuanjia; Chen, Tianle; Zeng, Donglin
2016-01-01
Learning risk scores to predict dichotomous or continuous outcomes using machine learning approaches has been studied extensively. However, how to learn risk scores for time-to-event outcomes subject to right censoring has received little attention until recently. Existing approaches rely on inverse probability weighting or rank-based regression, which may be inefficient. In this paper, we develop a new support vector hazards machine (SVHM) approach to predict censored outcomes. Our method is based on predicting the counting process associated with the time-to-event outcomes among subjects at risk via a series of support vector machines. Introducing counting processes to represent time-to-event data leads to a connection between support vector machines in supervised learning and hazards regression in standard survival analysis. To account for different at risk populations at observed event times, a time-varying offset is used in estimating risk scores. The resulting optimization is a convex quadratic programming problem that can easily incorporate non-linearity using kernel trick. We demonstrate an interesting link from the profiled empirical risk function of SVHM to the Cox partial likelihood. We then formally show that SVHM is optimal in discriminating covariate-specific hazard function from population average hazard function, and establish the consistency and learning rate of the predicted risk using the estimated risk scores. Simulation studies show improved prediction accuracy of the event times using SVHM compared to existing machine learning methods and standard conventional approaches. Finally, we analyze two real world biomedical study data where we use clinical markers and neuroimaging biomarkers to predict age-at-onset of a disease, and demonstrate superiority of SVHM in distinguishing high risk versus low risk subjects.
Support vector machines for TEC seismo-ionospheric anomalies detection
Directory of Open Access Journals (Sweden)
M. Akhoondzadeh
2013-02-01
Full Text Available Using time series prediction methods, it is possible to pursue the behaviors of earthquake precursors in the future and to announce early warnings when the differences between the predicted value and the observed value exceed the predefined threshold value. Support Vector Machines (SVMs are widely used due to their many advantages for classification and regression tasks. This study is concerned with investigating the Total Electron Content (TEC time series by using a SVM to detect seismo-ionospheric anomalous variations induced by the three powerful earthquakes of Tohoku (11 March 2011, Haiti (12 January 2010 and Samoa (29 September 2009. The duration of TEC time series dataset is 49, 46 and 71 days, for Tohoku, Haiti and Samoa earthquakes, respectively, with each at time resolution of 2 h. In the case of Tohoku earthquake, the results show that the difference between the predicted value obtained from the SVM method and the observed value reaches the maximum value (i.e., 129.31 TECU at earthquake time in a period of high geomagnetic activities. The SVM method detected a considerable number of anomalous occurrences 1 and 2 days prior to the Haiti earthquake and also 1 and 5 days before the Samoa earthquake in a period of low geomagnetic activities. In order to show that the method is acting sensibly with regard to the results extracted during nonevent and event TEC data, i.e., to perform some null-hypothesis tests in which the methods would also be calibrated, the same period of data from the previous year of the Samoa earthquake date has been taken into the account. Further to this, in this study, the detected TEC anomalies using the SVM method were compared to the previous results (Akhoondzadeh and Saradjian, 2011; Akhoondzadeh, 2012 obtained from the mean, median, wavelet and Kalman filter methods. The SVM detected anomalies are similar to those detected using the previous methods. It can be concluded that SVM can be a suitable learning method
Support vector machines for TEC seismo-ionospheric anomalies detection
Akhoondzadeh, M.
2013-02-01
Using time series prediction methods, it is possible to pursue the behaviors of earthquake precursors in the future and to announce early warnings when the differences between the predicted value and the observed value exceed the predefined threshold value. Support Vector Machines (SVMs) are widely used due to their many advantages for classification and regression tasks. This study is concerned with investigating the Total Electron Content (TEC) time series by using a SVM to detect seismo-ionospheric anomalous variations induced by the three powerful earthquakes of Tohoku (11 March 2011), Haiti (12 January 2010) and Samoa (29 September 2009). The duration of TEC time series dataset is 49, 46 and 71 days, for Tohoku, Haiti and Samoa earthquakes, respectively, with each at time resolution of 2 h. In the case of Tohoku earthquake, the results show that the difference between the predicted value obtained from the SVM method and the observed value reaches the maximum value (i.e., 129.31 TECU) at earthquake time in a period of high geomagnetic activities. The SVM method detected a considerable number of anomalous occurrences 1 and 2 days prior to the Haiti earthquake and also 1 and 5 days before the Samoa earthquake in a period of low geomagnetic activities. In order to show that the method is acting sensibly with regard to the results extracted during nonevent and event TEC data, i.e., to perform some null-hypothesis tests in which the methods would also be calibrated, the same period of data from the previous year of the Samoa earthquake date has been taken into the account. Further to this, in this study, the detected TEC anomalies using the SVM method were compared to the previous results (Akhoondzadeh and Saradjian, 2011; Akhoondzadeh, 2012) obtained from the mean, median, wavelet and Kalman filter methods. The SVM detected anomalies are similar to those detected using the previous methods. It can be concluded that SVM can be a suitable learning method to detect
Scorebox extraction from mobile sports videos using Support Vector Machines
Kim, Wonjun; Park, Jimin; Kim, Changick
2008-08-01
Scorebox plays an important role in understanding contents of sports videos. However, the tiny scorebox may give the small-display-viewers uncomfortable experience in grasping the game situation. In this paper, we propose a novel framework to extract the scorebox from sports video frames. We first extract candidates by using accumulated intensity and edge information after short learning period. Since there are various types of scoreboxes inserted in sports videos, multiple attributes need to be used for efficient extraction. Based on those attributes, the optimal information gain is computed and top three ranked attributes in terms of information gain are selected as a three-dimensional feature vector for Support Vector Machines (SVM) to distinguish the scorebox from other candidates, such as logos and advertisement boards. The proposed method is tested on various videos of sports games and experimental results show the efficiency and robustness of our proposed method.
Support vector machine based battery model for electric vehicles
International Nuclear Information System (INIS)
Wang Junping; Chen Quanshi; Cao Binggang
2006-01-01
The support vector machine (SVM) is a novel type of learning machine based on statistical learning theory that can map a nonlinear function successfully. As a battery is a nonlinear system, it is difficult to establish the relationship between the load voltage and the current under different temperatures and state of charge (SOC). The SVM is used to model the battery nonlinear dynamics in this paper. Tests are performed on an 80Ah Ni/MH battery pack with the Federal Urban Driving Schedule (FUDS) cycle to set up the SVM model. Compared with the Nernst and Shepherd combined model, the SVM model can simulate the battery dynamics better with small amounts of experimental data. The maximum relative error is 3.61%
Predicting post-translational lysine acetylation using support vector machines
DEFF Research Database (Denmark)
Gnad, Florian; Ren, Shubin; Choudhary, Chunaram
2010-01-01
Lysine acetylation is a post-translational protein modification and a primary regulatory mechanism that controls many cell signaling processes. Lysine acetylation sites are recognized by acetyltransferases and deacetylases through sequence patterns (motifs). Recently, we used high-resolution mass...... spectrometry to identify 3600 lysine acetylation sites on 1750 human proteins covering most of the previously annotated sites and providing the most comprehensive acetylome so far. This dataset should provide an excellent source to train support vector machines (SVMs) allowing the high accuracy in silico...
Engineering support vector machine kernels that recognize translation initiation sites.
Zien, A; Rätsch, G; Mika, S; Schölkopf, B; Lengauer, T; Müller, K R
2000-09-01
In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points at which regions start that code for proteins. These points are called translation initiation sites (TIS). The task of finding TIS can be modeled as a classification problem. We demonstrate the applicability of support vector machines for this task, and show how to incorporate prior biological knowledge by engineering an appropriate kernel function. With the described techniques the recognition performance can be improved by 26% over leading existing approaches. We provide evidence that existing related methods (e.g. ESTScan) could profit from advanced TIS recognition.
Single directional SMO algorithm for least squares support vector machines.
Shao, Xigao; Wu, Kun; Liao, Bifeng
2013-01-01
Working set selection is a major step in decomposition methods for training least squares support vector machines (LS-SVMs). In this paper, a new technique for the selection of working set in sequential minimal optimization- (SMO-) type decomposition methods is proposed. By the new method, we can select a single direction to achieve the convergence of the optimality condition. A simple asymptotic convergence proof for the new algorithm is given. Experimental comparisons demonstrate that the classification accuracy of the new method is not largely different from the existing methods, but the training speed is faster than existing ones.
An Incremental Support Vector Machine based Speech Activity Detection Algorithm.
Xianbo, Xiao; Guangshu, Hu
2005-01-01
Traditional voice activity detection algorithms are mostly threshold-based or statistical model-based. All those methods are absent of the ability to react quickly to variations of environments. This paper describes an incremental SVM (Support Vector Machine) method for speech activity detection. The proposed incremental procedure makes it adaptive to variation of environments and the special construction of incremental training data set decreases computing consumption effectively. Experiments results demonstrated its higher end point detection accuracy. Further work will be focused on decreasing computing consumption and importing multi-class SVM classifiers.
Slope Deformation Prediction Based on Support Vector Machine
Directory of Open Access Journals (Sweden)
Lei JIA
2013-07-01
Full Text Available This paper principally studies the prediction of slope deformation based on Support Vector Machine (SVM. In the prediction process，explore how to reconstruct the phase space. The geological body’s displacement data obtained from chaotic time series are used as SVM’s training samples. Slope displacement caused by multivariable coupling is predicted by means of single variable. Results show that this model is of high fitting accuracy and generalization, and provides reference for deformation prediction in slope engineering.
Gas detonation cell width prediction model based on support vector regression
Directory of Open Access Journals (Sweden)
Jiyang Yu
2017-10-01
Full Text Available Detonation cell width is an important parameter in hydrogen explosion assessments. The experimental data on gas detonation are statistically analyzed to establish a universal method to numerically predict detonation cell widths. It is commonly understood that detonation cell width, λ, is highly correlated with the characteristic reaction zone width, δ. Classical parametric regression methods were widely applied in earlier research to build an explicit semiempirical correlation for the ratio of λ/δ. The obtained correlations formulate the dependency of the ratio λ/δ on a dimensionless effective chemical activation energy and a dimensionless temperature of the gas mixture. In this paper, support vector regression (SVR, which is based on nonparametric machine learning, is applied to achieve functions with better fitness to experimental data and more accurate predictions. Furthermore, a third parameter, dimensionless pressure, is considered as an additional independent variable. It is found that three-parameter SVR can significantly improve the performance of the fitting function. Meanwhile, SVR also provides better adaptability and the model functions can be easily renewed when experimental database is updated or new regression parameters are considered.
Study on fault diagnosis technology based on support vector machine
International Nuclear Information System (INIS)
Xia Hong; Zhang Nan; Du Xingfu
2010-01-01
In this paper, two fault diagnostic casts were constructed using SVM theory for model fault such as cracks of steam generator heat transfer tubes and small break loss of coolant accident in nuclear power plant. One fault diagnostic cast was constructed based on least squares support vector machines using C++ programming language. The other was constructed based on traditionary support vector machines using Matlab7.0 program. The results in the Simulation Test have shown that the performance of the two models based on two kinds of SVMs both depends on the choice of nuclear function model and the parameters. In this study, after the suitable choice, the same diagnostic performance was obtained using two kinds of SVMs. The fault can be diagnosed exactly during the period between the third second until shutdown. In the first three seconds, the fault data were not yet shown or the data were fluctuant. The simulation results demonstrated that the methods could diagnose the fault phenomenon accurately under the circumstances of small example sizes, and the precision was very high. (authors)
DC Algorithm for Extended Robust Support Vector Machine.
Fujiwara, Shuhei; Takeda, Akiko; Kanamori, Takafumi
2017-05-01
Nonconvex variants of support vector machines (SVMs) have been developed for various purposes. For example, robust SVMs attain robustness to outliers by using a nonconvex loss function, while extended [Formula: see text]-SVM (E[Formula: see text]-SVM) extends the range of the hyperparameter by introducing a nonconvex constraint. Here, we consider an extended robust support vector machine (ER-SVM), a robust variant of E[Formula: see text]-SVM. ER-SVM combines two types of nonconvexity from robust SVMs and E[Formula: see text]-SVM. Because of the two nonconvexities, the existing algorithm we proposed needs to be divided into two parts depending on whether the hyperparameter value is in the extended range or not. The algorithm also heuristically solves the nonconvex problem in the extended range. In this letter, we propose a new, efficient algorithm for ER-SVM. The algorithm deals with two types of nonconvexity while never entailing more computations than either E[Formula: see text]-SVM or robust SVM, and it finds a critical point of ER-SVM. Furthermore, we show that ER-SVM includes the existing robust SVMs as special cases. Numerical experiments confirm the effectiveness of integrating the two nonconvexities.
Penerapan Support Vector Machine (SVM untuk Pengkategorian Penelitian
Directory of Open Access Journals (Sweden)
Fithri Selva Jumeilah
2017-07-01
Full Text Available Research every college will continue to grow. Research will be stored in softcopy and hardcopy. The preparation of the research should be categorized in order to facilitate the search for people who need reference. To categorize the research, we need a method for text mining, one of them is with the implementation of Support Vector Machines (SVM. The data used to recognize the characteristics of each category then it takes secondary data which is a collection of abstracts of research. The data will be pre-processed with several stages: case folding converts all the letters into lowercase, stop words removal removal of very common words, tokenizing discard punctuation, and stemming searching for root words by removing the prefix and suffix. Further data that has undergone preprocessing will be converted into a numerical form with for the term weighting stage that is the weighting contribution of each word. From the results of term weighting then obtained data that can be used for data training and test data. The training process is done by providing input in the form of text data that is known to the class or category. Then by using the Support Vector Machines algorithm, the input data is transformed into a rule, function, or knowledge model that can be used in the prediction process. From the results of this study obtained that the categorization of research produced by SVM has been very good. This is proven by the results of the test which resulted in an accuracy of 90%.
Deng, Cheng; Xu, Jie; Zhang, Kaibing; Tao, Dacheng; Gao, Xinbo; Li, Xuelong
2016-12-01
For regression-based single-image super-resolution (SR) problem, the key is to establish a mapping relation between high-resolution (HR) and low-resolution (LR) image patches for obtaining a visually pleasing quality image. Most existing approaches typically solve it by dividing the model into several single-output regression problems, which obviously ignores the circumstance that a pixel within an HR patch affects other spatially adjacent pixels during the training process, and thus tends to generate serious ringing artifacts in resultant HR image as well as increase computational burden. To alleviate these problems, we propose to use structured output regression machine (SORM) to simultaneously model the inherent spatial relations between the HR and LR patches, which is propitious to preserve sharp edges. In addition, to further improve the quality of reconstructed HR images, a nonlocal (NL) self-similarity prior in natural images is introduced to formulate as a regularization term to further enhance the SORM-based SR results. To offer a computation-effective SORM method, we use a relative small nonsupport vector samples to establish the accurate regression model and an accelerating algorithm for NL self-similarity calculation. Extensive SR experiments on various images indicate that the proposed method can achieve more promising performance than the other state-of-the-art SR methods in terms of both visual quality and computational cost.
Imani, Moslem; Kao, Huan-Chin; Lan, Wen-Hau; Kuo, Chung-Yen
2018-02-01
The analysis and the prediction of sea level fluctuations are core requirements of marine meteorology and operational oceanography. Estimates of sea level with hours-to-days warning times are especially important for low-lying regions and coastal zone management. The primary purpose of this study is to examine the applicability and capability of extreme learning machine (ELM) and relevance vector machine (RVM) models for predicting sea level variations and compare their performances with powerful machine learning methods, namely, support vector machine (SVM) and radial basis function (RBF) models. The input dataset from the period of January 2004 to May 2011 used in the study was obtained from the Dongshi tide gauge station in Chiayi, Taiwan. Results showed that the ELM and RVM models outperformed the other methods. The performance of the RVM approach was superior in predicting the daily sea level time series given the minimum root mean square error of 34.73 mm and the maximum determination coefficient of 0.93 (R2) during the testing periods. Furthermore, the obtained results were in close agreement with the original tide-gauge data, which indicates that RVM approach is a promising alternative method for time series prediction and could be successfully used for daily sea level forecasts.
Estimation of the wind turbine yaw error by support vector machines
DEFF Research Database (Denmark)
Sheibat-Othman, Nida; Othman, Sami; Tayari, Raoaa
2015-01-01
Wind turbine yaw error information is of high importance in controlling wind turbine power and structural load. Normally used wind vanes are imprecise. In this work, the estimation of yaw error in wind turbines is studied using support vector machines for regression (SVR). As the methodology...... is data-based, simulated data from a high fidelity aero-elastic model is used for learning. The model simulates a variable speed horizontal-axis wind turbine composed of three blades and a full converter. Both partial load (blade angles fixed at 0 deg) and full load zones (active pitch actuators...
A support vector machine approach to detect financial statement fraud in South Africa: A first look
CSIR Research Space (South Africa)
Moepya, SO
2014-04-01
Full Text Available Auditors face the difficult task of detecting companies that issue manipulated financial statements. In recent years, machine learning methods have provided a feasible solution to this task. This study develops support vector machine (SVM) models...
Fault Identification in an Unbalanced Distribution System Using Support Vector Machine
Directory of Open Access Journals (Sweden)
Sophi Shilpa Gururajapathy
2016-12-01
Full Text Available Fast and effective fault location in distribution system is important to improve the power system reliability. Most of the researches rarely mention about effective fault location consisting of faulted phase, fault type, faulty section and fault distance identification. This work presents a method using support vector machine to identify the faulted phase, fault type, faulty section and distance at the same time. Support vector classification and regression analysis are performed to locate fault. The method uses the voltage sag data during fault condition measured at the primary substation. The faulted phase and the fault type are identified using three-dimensional support vector classification. The possible faulty sections are identified by matching voltage sag at fault condition to the voltage sag in database and the possible sections are ranked using shortest distance principle. The fault distance for the possible faulty sections isthen identified using support vector regression analysis. The performance of the proposed method was tested on an unbalanced distribution system from SaskPower, Canada. The results show that the accuracy of the proposed method is satisfactory.
Learning with Uncertainty - Gaussian Processes and Relevance Vector Machines
DEFF Research Database (Denmark)
Candela, Joaquin Quinonero
2004-01-01
This thesis is concerned with Gaussian Processes (GPs) and Relevance Vector Machines (RVMs), both of which are particular instances of probabilistic linear models. We look at both models from a Bayesian perspective, and are forced to adopt an approximate Bayesian treatment to learning for two....... Computational efficiency is obtained through sparseness: sparse linear models have a significant number of their weights set to zero. For the RVM, which we treat in Chap. 2, we show that it is precisely the particular choice of Bayesian approximation that enforces sparseness. Probabilistic models have...... family of approximations to Gaussian Processes, Reduced Rank Gaussian Processes (RRGPs), which take the form of nite extended linear models; we show that GPs are in general equivalent to in nite extended linear models. We also show that RRGPs result in degenerate GPs, which suffer, like RVMs...
Power transformer protection using support vector machine network
Energy Technology Data Exchange (ETDEWEB)
Moravej, Z. [Semnan Univ., Semnan (Iran, Islamic Republic of). Dept. of Electrical Engineering
2008-07-01
Power transformers are among the most important components in power systems. As such, damage to power transformers must be avoided in order to ensure continuous power delivery. This paper described the advantages and limitations of many commonly used techniques that prevent false tripping. It then presented the fault detection method using a support vector machine network (SVMN). This new and powerful technique for solving supervised classification problems is particularly useful due to its generalization ability. Computer simulations using the EMTP/ATP software package revealed that the proposed network performs well and can detect faults in power transformers quickly and correctly in real time applications. The use of a pattern recognizer for power system diagnosis offers great advancement in the protection field. The fault detector based radial basis function (RBF) network was then used for performance comparison. Both the RBF and SVM network was considered to be an appropriate tool for classification problems. 23 refs., 5 tabs., 2 figs.
Support Vector Machine Diagnosis of Acute Abdominal Pain
Björnsdotter, Malin; Nalin, Kajsa; Hansson, Lars-Erik; Malmgren, Helge
This study explores the feasibility of a decision-support system for patients seeking care for acute abdominal pain, and, specifically the diagnosis of acute diverticulitis. We used a linear support vector machine (SVM) to separate diverticulitis from all other reported cases of abdominal pain and from the important differential diagnosis non-specific abdominal pain (NSAP). On a database containing 3337 patients, the SVM obtained results comparable to those of the doctors in separating diverticulitis or NSAP from the remaining diseases. The distinction between diverticulitis and NSAP was, however, substantially improved by the SVM. For this patient group, the doctors achieved a sensitivity of 0.714 and a specificity of 0.963. When adjusted to the physicians' results, the SVM sensitivity/specificity was higher at 0.714/0.985 and 0.786/0.963 respectively. Age was found as the most important discriminative variable, closely followed by C-reactive protein level and lower left side pain.
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES
Directory of Open Access Journals (Sweden)
V. Dheepa
2012-07-01
Full Text Available Along with the great increase of internet and e-commerce, the use of credit card is an unavoidable one. Due to the increase of credit card usage, the frauds associated with this have also increased. There are a lot of approaches used to detect the frauds. In this paper, behavior based classification approach using Support Vector Machines are employed and efficient feature extraction method also adopted. If any discrepancies occur in the behaviors transaction pattern then it is predicted as suspicious and taken for further consideration to find the frauds. Generally credit card fraud detection problem suffers from a large amount of data, which is rectified by the proposed method. Achieving finest accuracy, high fraud catching rate and low false alarms are the main tasks of this approach.
Application of support vector machine for classification of multispectral data
International Nuclear Information System (INIS)
Bahari, Nurul Iman Saiful; Ahmad, Asmala; Aboobaider, Burhanuddin Mohd
2014-01-01
In this paper, support vector machine (SVM) is used to classify satellite remotely sensed multispectral data. The data are recorded from a Landsat-5 TM satellite with resolution of 30x30m. SVM finds the optimal separating hyperplane between classes by focusing on the training cases. The study area of Klang Valley has more than 10 land covers and classification using SVM has been done successfully without any pixel being unclassified. The training area is determined carefully by visual interpretation and with the aid of the reference map of the study area. The result obtained is then analysed for the accuracy and visual performance. Accuracy assessment is done by determination and discussion of Kappa coefficient value, overall and producer accuracy for each class (in pixels and percentage). While, visual analysis is done by comparing the classification data with the reference map. Overall the study shows that SVM is able to classify the land covers within the study area with a high accuracy
Gradient Evolution-based Support Vector Machine Algorithm for Classification
Zulvia, Ferani E.; Kuo, R. J.
2018-03-01
This paper proposes a classification algorithm based on a support vector machine (SVM) and gradient evolution (GE) algorithms. SVM algorithm has been widely used in classification. However, its result is significantly influenced by the parameters. Therefore, this paper aims to propose an improvement of SVM algorithm which can find the best SVMs’ parameters automatically. The proposed algorithm employs a GE algorithm to automatically determine the SVMs’ parameters. The GE algorithm takes a role as a global optimizer in finding the best parameter which will be used by SVM algorithm. The proposed GE-SVM algorithm is verified using some benchmark datasets and compared with other metaheuristic-based SVM algorithms. The experimental results show that the proposed GE-SVM algorithm obtains better results than other algorithms tested in this paper.
Automatic Detection of Retinal Exudates using a Support Vector Machine
Directory of Open Access Journals (Sweden)
Nualsawat HIRANSAKOLWONG
2013-02-01
Full Text Available Retinal exudates are among the preliminary signs of diabetic retinopathy, a major cause of vision loss in diabetic patients. Correct and efficient screening of exudates is very expensive in professional time and may cause human error. Nowadays, the digital retinal image is frequently used to follow-up and diagnoses eye diseases. Therefore, the retinal image is crucial and essential for experts to detect exudates. Unfortunately, it is a normal situation that retinal images in Thailand are poor quality images. In this paper, we present a series of experiments on feature selection and exudates classification using the support vector machine classifiers. The retinal images are segmented following key preprocessing steps, i.e., color normalization, contrast enhancement, noise removal and color space selection. On data sets of poor quality images, sensitivity, specificity and accuracy is 94.46%, 89.52% and 92.14%, respectively.
Importance Weighted Import Vector Machine for Unsupervised Domain Adaptation.
Khalighi, Sirvan; Ribeiro, Bernardete; Nunes, Urbano J
2017-10-01
In real-world applications, the assumption of independent and identical distribution is no longer consistent. To alleviate the significant mismatch between source and target domains, importance weighting import vector machine, which is an adaptive classifier, is proposed. This adaptive probabilistic classification method, which is sparse and computationally efficient, can be used for unsupervised domain adaptation (DA). The effectiveness of the proposed approach is demonstrated via a toy problem, and a real-world cross-domain object recognition task. Even though the sparseness, the proposed method outperforms the state-of-the-art in both unsupervised and semisupervised DA scenarios. We also introduce a reliable importance weighted cross validation (RIWCV), which is an improvement of importance weighted cross validation, for parameter and model selection. The RIWCV avoid falling down in local minimum, by selecting a more reliable combination of the parameters instead of the best parameters.
Support Vector Machines for Hyperspectral Remote Sensing Classification
Gualtieri, J. Anthony; Cromp, R. F.
1998-01-01
The Support Vector Machine provides a new way to design classification algorithms which learn from examples (supervised learning) and generalize when applied to new data. We demonstrate its success on a difficult classification problem from hyperspectral remote sensing, where we obtain performances of 96%, and 87% correct for a 4 class problem, and a 16 class problem respectively. These results are somewhat better than other recent results on the same data. A key feature of this classifier is its ability to use high-dimensional data without the usual recourse to a feature selection step to reduce the dimensionality of the data. For this application, this is important, as hyperspectral data consists of several hundred contiguous spectral channels for each exemplar. We provide an introduction to this new approach, and demonstrate its application to classification of an agriculture scene.
Deep Learning for Person Reidentification Using Support Vector Machines
Directory of Open Access Journals (Sweden)
Mengyu Xu
2017-01-01
Full Text Available Due to the variations of viewpoint, pose, and illumination, a given individual may appear considerably different across different camera views. Tracking individuals across camera networks with no overlapping fields is still a challenging problem. Previous works mainly focus on feature representation and metric learning individually which tend to have a suboptimal solution. To address this issue, in this work, we propose a novel framework to do the feature representation learning and metric learning jointly. Different from previous works, we represent the pairs of pedestrian images as new resized input and use linear Support Vector Machine to replace softmax activation function for similarity learning. Particularly, dropout and data augmentation techniques are also employed in this model to prevent the network from overfitting. Extensive experiments on two publically available datasets VIPeR and CUHK01 demonstrate the effectiveness of our proposed approach.
A Wavelet Kernel-Based Primal Twin Support Vector Machine for Economic Development Prediction
Directory of Open Access Journals (Sweden)
Fang Su
2013-01-01
Full Text Available Economic development forecasting allows planners to choose the right strategies for the future. This study is to propose economic development prediction method based on the wavelet kernel-based primal twin support vector machine algorithm. As gross domestic product (GDP is an important indicator to measure economic development, economic development prediction means GDP prediction in this study. The wavelet kernel-based primal twin support vector machine algorithm can solve two smaller sized quadratic programming problems instead of solving a large one as in the traditional support vector machine algorithm. Economic development data of Anhui province from 1992 to 2009 are used to study the prediction performance of the wavelet kernel-based primal twin support vector machine algorithm. The comparison of mean error of economic development prediction between wavelet kernel-based primal twin support vector machine and traditional support vector machine models trained by the training samples with the 3–5 dimensional input vectors, respectively, is given in this paper. The testing results show that the economic development prediction accuracy of the wavelet kernel-based primal twin support vector machine model is better than that of traditional support vector machine.
Support vector machine-based feature extractor for L/H transitions in JET
Energy Technology Data Exchange (ETDEWEB)
Gonzalez, S.; Vega, J.; Pereira, A. [Asociacion EURATOM/CIEMAT para Fusion, Madrid 28040 (Spain); Murari, A. [Consorzio RFX, Associazione EURATOM ENEA per la Fusione, Padova 4-35127 (Italy); Ramirez, J. M.; Dormido-Canto, S. [Departamento de Informatica y Automatica, UNED, Madrid 28040 (Spain); Collaboration: JET-EFDA Contributors
2010-10-15
Support vector machines (SVM) are machine learning tools originally developed in the field of artificial intelligence to perform both classification and regression. In this paper, we show how SVM can be used to determine the most relevant quantities to characterize the confinement transition from low to high confinement regimes in tokamak plasmas. A set of 27 signals is used as starting point. The signals are discarded one by one until an optimal number of relevant waveforms is reached, which is the best tradeoff between keeping a limited number of quantities and not loosing essential information. The method has been applied to a database of 749 JET discharges and an additional database of 150 JET discharges has been used to test the results obtained.
Using support vector machines to identify literacy skills: Evidence from eye movements.
Lou, Ya; Liu, Yanping; Kaakinen, Johanna K; Li, Xingshan
2017-06-01
Is inferring readers' literacy skills possible by analyzing their eye movements during text reading? This study used Support Vector Machines (SVM) to analyze eye movement data from 61 undergraduate students who read a multiple-paragraph, multiple-topic expository text. Forward fixation time, first-pass rereading time, second-pass fixation time, and regression path reading time on different regions of the text were provided as features. The SVM classification algorithm assisted in distinguishing high-literacy-skilled readers from low-literacy-skilled readers with 80.3 % accuracy. Results demonstrate the effectiveness of combining eye tracking and machine learning techniques to detect readers with low literacy skills, and suggest that such approaches can be potentially used in predicting other cognitive abilities.
Short-Term Prediction of Air Pollution in Macau Using Support Vector Machines
Directory of Open Access Journals (Sweden)
Chi-Man Vong
2012-01-01
Full Text Available Forecasting of air pollution is a popular and important topic in recent years due to the health impact caused by air pollution. It is necessary to build an early warning system, which provides forecast and also alerts health alarm to local inhabitants by medical practitioners and the local government. Meteorological and pollutions data collected daily at monitoring stations of Macau can be used in this study to build a forecasting system. Support vector machines (SVMs, a novel type of machine learning technique based on statistical learning theory, can be used for regression and time series prediction. SVM is capable of good generalization while the performance of the SVM model is often hinged on the appropriate choice of the kernel.
A support vector machine approach for detection of microcalcifications.
El-Naqa, Issam; Yang, Yongyi; Wernick, Miles N; Galatsanos, Nikolas P; Nishikawa, Robert M
2002-12-01
In this paper, we investigate an approach based on support vector machines (SVMs) for detection of microcalcification (MC) clusters in digital mammograms, and propose a successive enhancement learning scheme for improved performance. SVM is a machine-learning method, based on the principle of structural risk minimization, which performs well when applied to data outside the training set. We formulate MC detection as a supervised-learning problem and apply SVM to develop the detection algorithm. We use the SVM to detect at each location in the image whether an MC is present or not. We tested the proposed method using a database of 76 clinical mammograms containing 1120 MCs. We use free-response receiver operating characteristic curves to evaluate detection performance, and compare the proposed algorithm with several existing methods. In our experiments, the proposed SVM framework outperformed all the other methods tested. In particular, a sensitivity as high as 94% was achieved by the SVM method at an error rate of one false-positive cluster per image. The ability of SVM to out perform several well-known methods developed for the widely studied problem of MC detection suggests that SVM is a promising technique for object detection in a medical imaging application.
DNS Tunneling Detection Method Based on Multilabel Support Vector Machine
Directory of Open Access Journals (Sweden)
Ahmed Almusawi
2018-01-01
Full Text Available DNS tunneling is a method used by malicious users who intend to bypass the firewall to send or receive commands and data. This has a significant impact on revealing or releasing classified information. Several researchers have examined the use of machine learning in terms of detecting DNS tunneling. However, these studies have treated the problem of DNS tunneling as a binary classification where the class label is either legitimate or tunnel. In fact, there are different types of DNS tunneling such as FTP-DNS tunneling, HTTP-DNS tunneling, HTTPS-DNS tunneling, and POP3-DNS tunneling. Therefore, there is a vital demand to not only detect the DNS tunneling but rather classify such tunnel. This study aims to propose a multilabel support vector machine in order to detect and classify the DNS tunneling. The proposed method has been evaluated using a benchmark dataset that contains numerous DNS queries and is compared with a multilabel Bayesian classifier based on the number of corrected classified DNS tunneling instances. Experimental results demonstrate the efficacy of the proposed SVM classification method by obtaining an f-measure of 0.80.
Semisupervised Support Vector Machines With Tangent Space Intrinsic Manifold Regularization.
Sun, Shiliang; Xie, Xijiong
2016-09-01
Semisupervised learning has been an active research topic in machine learning and data mining. One main reason is that labeling examples is expensive and time-consuming, while there are large numbers of unlabeled examples available in many practical problems. So far, Laplacian regularization has been widely used in semisupervised learning. In this paper, we propose a new regularization method called tangent space intrinsic manifold regularization. It is intrinsic to data manifold and favors linear functions on the manifold. Fundamental elements involved in the formulation of the regularization are local tangent space representations, which are estimated by local principal component analysis, and the connections that relate adjacent tangent spaces. Simultaneously, we explore its application to semisupervised classification and propose two new learning algorithms called tangent space intrinsic manifold regularized support vector machines (TiSVMs) and tangent space intrinsic manifold regularized twin SVMs (TiTSVMs). They effectively integrate the tangent space intrinsic manifold regularization consideration. The optimization of TiSVMs can be solved by a standard quadratic programming, while the optimization of TiTSVMs can be solved by a pair of standard quadratic programmings. The experimental results of semisupervised classification problems show the effectiveness of the proposed semisupervised learning algorithms.
Support vector machines for nuclear reactor state estimation
International Nuclear Information System (INIS)
Zavaljevski, N.; Gross, K. C.
2000-01-01
Validation of nuclear power reactor signals is often performed by comparing signal prototypes with the actual reactor signals. The signal prototypes are often computed based on empirical data. The implementation of an estimation algorithm which can make predictions on limited data is an important issue. A new machine learning algorithm called support vector machines (SVMS) recently developed by Vladimir Vapnik and his coworkers enables a high level of generalization with finite high-dimensional data. The improved generalization in comparison with standard methods like neural networks is due mainly to the following characteristics of the method. The input data space is transformed into a high-dimensional feature space using a kernel function, and the learning problem is formulated as a convex quadratic programming problem with a unique solution. In this paper the authors have applied the SVM method for data-based state estimation in nuclear power reactors. In particular, they implemented and tested kernels developed at Argonne National Laboratory for the Multivariate State Estimation Technique (MSET), a nonlinear, nonparametric estimation technique with a wide range of applications in nuclear reactors. The methodology has been applied to three data sets from experimental and commercial nuclear power reactor applications. The results are promising. The combination of MSET kernels with the SVM method has better noise reduction and generalization properties than the standard MSET algorithm
T-wave end detection using neural networks and Support Vector Machines.
Suárez-León, Alexander Alexeis; Varon, Carolina; Willems, Rik; Van Huffel, Sabine; Vázquez-Seisdedos, Carlos Román
2018-03-06
In this paper we propose a new approach for detecting the end of the T-wave in the electrocardiogram (ECG) using Neural Networks and Support Vector Machines. Both, Multilayer Perceptron (MLP) neural networks and Fixed-Size Least-Squares Support Vector Machines (FS-LSSVM) were used as regression algorithms to determine the end of the T-wave. Different strategies for selecting the training set such as random selection, k-means, robust clustering and maximum quadratic (Rényi) entropy were evaluated. Individual parameters were tuned for each method during training and the results are given for the evaluation set. A comparison between MLP and FS-LSSVM approaches was performed. Finally, a fair comparison of the FS-LSSVM method with other state-of-the-art algorithms for detecting the end of the T-wave was included. The experimental results show that FS-LSSVM approaches are more suitable as regression algorithms than MLP neural networks. Despite the small training sets used, the FS-LSSVM methods outperformed the state-of-the-art techniques. FS-LSSVM can be successfully used as a T-wave end detection algorithm in ECG even with small training set sizes. Copyright © 2018 Elsevier Ltd. All rights reserved.
Automatic pathology classification using a single feature machine learning support - vector machines
Yepes-Calderon, Fernando; Pedregosa, Fabian; Thirion, Bertrand; Wang, Yalin; Lepore, Natasha
2014-03-01
Magnetic Resonance Imaging (MRI) has been gaining popularity in the clinic in recent years as a safe in-vivo imaging technique. As a result, large troves of data are being gathered and stored daily that may be used as clinical training sets in hospitals. While numerous machine learning (ML) algorithms have been implemented for Alzheimer's disease classification, their outputs are usually difficult to interpret in the clinical setting. Here, we propose a simple method of rapid diagnostic classification for the clinic using Support Vector Machines (SVM)1 and easy to obtain geometrical measurements that, together with a cortical and sub-cortical brain parcellation, create a robust framework capable of automatic diagnosis with high accuracy. On a significantly large imaging dataset consisting of over 800 subjects taken from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, classification-success indexes of up to 99.2% are reached with a single measurement.
Kernel Learning in Support Vector Machines using Dual-Objective Optimization
Pietersma, Auke-Dirk; Schomaker, Lambertus; Wiering, Marco
2011-01-01
Support vector machines (SVMs) are very popular methods for solving classification problems that require mapping input features to target labels. When dealing with real-world data sets, the different classes are usually not linearly separable, and therefore support vector machines employ a
Spatial Support Vector Regression to Detect Silent Errors in the Exascale Era
Energy Technology Data Exchange (ETDEWEB)
Subasi, Omer; Di, Sheng; Bautista-Gomez, Leonardo; Balaprakash, Prasanna; Unsal, Osman; Labarta, Jesus; Cristal, Adrian; Cappello, Franck
2016-01-01
As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs) or silent errors are one of the major sources that corrupt the executionresults of HPC applications without being detected. In this work, we explore a low-memory-overhead SDC detector, by leveraging epsilon-insensitive support vector machine regression, to detect SDCs that occur in HPC applications that can be characterized by an impact error bound. The key contributions are three fold. (1) Our design takes spatialfeatures (i.e., neighbouring data values for each data point in a snapshot) into training data, such that little memory overhead (less than 1%) is introduced. (2) We provide an in-depth study on the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show thatour detector can achieve the detection sensitivity (i.e., recall) up to 99% yet suffer a less than 1% of false positive rate for most cases. Our detector incurs low performance overhead, 5% on average, for all benchmarks studied in the paper. Compared with other state-of-the-art techniques, our detector exhibits the best tradeoff considering the detection ability and overheads.
Reservoir rock permeability prediction using support vector regression in an Iranian oil field
International Nuclear Information System (INIS)
Saffarzadeh, Sadegh; Shadizadeh, Seyed Reza
2012-01-01
Reservoir permeability is a critical parameter for the evaluation of hydrocarbon reservoirs. It is often measured in the laboratory from reservoir core samples or evaluated from well test data. The prediction of reservoir rock permeability utilizing well log data is important because the core analysis and well test data are usually only available from a few wells in a field and have high coring and laboratory analysis costs. Since most wells are logged, the common practice is to estimate permeability from logs using correlation equations developed from limited core data; however, these correlation formulae are not universally applicable. Recently, support vector machines (SVMs) have been proposed as a new intelligence technique for both regression and classification tasks. The theory has a strong mathematical foundation for dependence estimation and predictive learning from finite data sets. The ultimate test for any technique that bears the claim of permeability prediction from well log data is the accurate and verifiable prediction of permeability for wells where only the well log data are available. The main goal of this paper is to develop the SVM method to obtain reservoir rock permeability based on well log data. (paper)
Bowd, Christopher; Medeiros, Felipe A.; Zhang, Zuohua; Zangwill, Linda M.; Hao, Jiucang; Lee, Te-Won; Sejnowski, Terrence J.; Weinreb, Robert N.; Goldbaum, Michael H.
2010-01-01
Purpose To classify healthy and glaucomatous eyes using relevance vector machine (RVM) and support vector machine (SVM) learning classifiers trained on retinal nerve fiber layer (RNFL) thickness measurements obtained by scanning laser polarimetry (SLP). Methods Seventy-two eyes of 72 healthy control subjects (average age = 64.3 ± 8.8 years, visual field mean deviation =−0.71 ± 1.2 dB) and 92 eyes of 92 patients with glaucoma (average age = 66.9 ± 8.9 years, visual field mean deviation =−5.32 ± 4.0 dB) were imaged with SLP with variable corneal compensation (GDx VCC; Laser Diagnostic Technologies, San Diego, CA). RVM and SVM learning classifiers were trained and tested on SLP-determined RNFL thickness measurements from 14 standard parameters and 64 sectors (approximately 5.6° each) obtained in the circumpapillary area under the instrument-defined measurement ellipse (total 78 parameters). Tenfold cross-validation was used to train and test RVM and SVM classifiers on unique subsets of the full 164-eye data set and areas under the receiver operating characteristic (AUROC) curve for the classification of eyes in the test set were generated. AUROC curve results from RVM and SVM were compared to those for 14 SLP software-generated global and regional RNFL thickness parameters. Also reported was the AUROC curve for the GDx VCC software-generated nerve fiber indicator (NFI). Results The AUROC curves for RVM and SVM were 0.90 and 0.91, respectively, and increased to 0.93 and 0.94 when the training sets were optimized with sequential forward and backward selection (resulting in reduced dimensional data sets). AUROC curves for optimized RVM and SVM were significantly larger than those for all individual SLP parameters. The AUROC curve for the NFI was 0.87. Conclusions Results from RVM and SVM trained on SLP RNFL thickness measurements are similar and provide accurate classification of glaucomatous and healthy eyes. RVM may be preferable to SVM, because it provides a
Directory of Open Access Journals (Sweden)
S.K. Lahiri
2009-09-01
Full Text Available Soft sensors have been widely used in the industrial process control to improve the quality of the product and assure safety in the production. The core of a soft sensor is to construct a soft sensing model. This paper introduces support vector regression (SVR, a new powerful machine learning methodbased on a statistical learning theory (SLT into soft sensor modeling and proposes a new soft sensing modeling method based on SVR. This paper presents an artificial intelligence based hybrid soft sensormodeling and optimization strategies, namely support vector regression – genetic algorithm (SVR-GA for modeling and optimization of mono ethylene glycol (MEG quality variable in a commercial glycol plant. In the SVR-GA approach, a support vector regression model is constructed for correlating the process data comprising values of operating and performance variables. Next, model inputs describing the process operating variables are optimized using genetic algorithm with a view to maximize the process performance. The SVR-GA is a new strategy for soft sensor modeling and optimization. The major advantage of the strategies is that modeling and optimization can be conducted exclusively from the historic process data wherein the detailed knowledge of process phenomenology (reaction mechanism, kinetics etc. is not required. Using SVR-GA strategy, a number of sets of optimized operating conditions were found. The optimized solutions, when verified in an actual plant, resulted in a significant improvement in the quality.
International Nuclear Information System (INIS)
Guo, Q; Shao, J; Ruiz, V
2005-01-01
This paper investigates detection of architectural distortion in mammographic images using support vector machine. Hausdorff dimension is used to characterise the texture feature of mammographic images. Support vector machine, a learning machine based on statistical learning theory, is trained through supervised learning to detect architectural distortion. Compared to the Radial Basis Function neural networks, SVM produced more accurate classification results in distinguishing architectural distortion abnormality from normal breast parenchyma
Guo, Q.; Shao, J.; Ruiz, V.
2005-01-01
This paper investigates detection of architectural distortion in mammographic images using support vector machine. Hausdorff dimension is used to characterise the texture feature of mammographic images. Support vector machine, a learning machine based on statistical learning theory, is trained through supervised learning to detect architectural distortion. Compared to the Radial Basis Function neural networks, SVM produced more accurate classification results in distinguishing architectural distortion abnormality from normal breast parenchyma.
Comparison of ν-support vector regression and logistic equation for ...
African Journals Online (AJOL)
Jane
2011-07-04
Jul 4, 2011 ... Due to the complexity and high non-linearity of bioprocess, most simple mathematical models fail to describe the exact behavior of ... Key words: Support vector regression, genetic algorithm, logistic model, prediction of biomass. .... method for solving non-linear regression problem, which depends on the ...
Noninvasive extraction of fetal electrocardiogram based on Support Vector Machine
Fu, Yumei; Xiang, Shihan; Chen, Tianyi; Zhou, Ping; Huang, Weiyan
2015-10-01
The fetal electrocardiogram (FECG) signal has important clinical value for diagnosing the fetal heart diseases and choosing suitable therapeutics schemes to doctors. So, the noninvasive extraction of FECG from electrocardiogram (ECG) signals becomes a hot research point. A new method, the Support Vector Machine (SVM) is utilized for the extraction of FECG with limited size of data. Firstly, the theory of the SVM and the principle of the extraction based on the SVM are studied. Secondly, the transformation of maternal electrocardiogram (MECG) component in abdominal composite signal is verified to be nonlinear and fitted with the SVM. Then, the SVM is trained, and the training results are compared with the real data to ensure the effect of the training. Meanwhile, the parameters of the SVM are optimized to achieve the best performance so that the learning machine can be utilized to fit the unknown samples. Finally, the FECG is extracted by removing the optimal estimation of MECG component from the abdominal composite signal. In order to evaluate the performance of FECG extraction based on the SVM, the Signal-to-Noise Ratio (SNR) and the visual test are used. The experimental results show that the FECG with good quality can be extracted, its SNR ratio is significantly increased as high as 9.2349 dB and the time cost is significantly decreased as short as 0.802 seconds. Compared with the traditional method, the noninvasive extraction method based on the SVM has a simple realization, the shorter treatment time and the better extraction quality under the same conditions.
Prediction of cell penetrating peptides by support vector machines.
Directory of Open Access Journals (Sweden)
William S Sanders
2011-07-01
Full Text Available Cell penetrating peptides (CPPs are those peptides that can transverse cell membranes to enter cells. Once inside the cell, different CPPs can localize to different cellular components and perform different roles. Some generate pore-forming complexes resulting in the destruction of cells while others localize to various organelles. Use of machine learning methods to predict potential new CPPs will enable more rapid screening for applications such as drug delivery. We have investigated the influence of the composition of training datasets on the ability to classify peptides as cell penetrating using support vector machines (SVMs. We identified 111 known CPPs and 34 known non-penetrating peptides from the literature and commercial vendors and used several approaches to build training data sets for the classifiers. Features were calculated from the datasets using a set of basic biochemical properties combined with features from the literature determined to be relevant in the prediction of CPPs. Our results using different training datasets confirm the importance of a balanced training set with approximately equal number of positive and negative examples. The SVM based classifiers have greater classification accuracy than previously reported methods for the prediction of CPPs, and because they use primary biochemical properties of the peptides as features, these classifiers provide insight into the properties needed for cell-penetration. To confirm our SVM classifications, a subset of peptides classified as either penetrating or non-penetrating was selected for synthesis and experimental validation. Of the synthesized peptides predicted to be CPPs, 100% of these peptides were shown to be penetrating.
Khawaja, Taimoor Saleem
A high-belief low-overhead Prognostics and Health Management (PHM) system is desired for online real-time monitoring of complex non-linear systems operating in a complex (possibly non-Gaussian) noise environment. This thesis presents a Bayesian Least Squares Support Vector Machine (LS-SVM) based framework for fault diagnosis and failure prognosis in nonlinear non-Gaussian systems. The methodology assumes the availability of real-time process measurements, definition of a set of fault indicators and the existence of empirical knowledge (or historical data) to characterize both nominal and abnormal operating conditions. An efficient yet powerful Least Squares Support Vector Machine (LS-SVM) algorithm, set within a Bayesian Inference framework, not only allows for the development of real-time algorithms for diagnosis and prognosis but also provides a solid theoretical framework to address key concepts related to classification for diagnosis and regression modeling for prognosis. SVM machines are founded on the principle of Structural Risk Minimization (SRM) which tends to find a good trade-off between low empirical risk and small capacity. The key features in SVM are the use of non-linear kernels, the absence of local minima, the sparseness of the solution and the capacity control obtained by optimizing the margin. The Bayesian Inference framework linked with LS-SVMs allows a probabilistic interpretation of the results for diagnosis and prognosis. Additional levels of inference provide the much coveted features of adaptability and tunability of the modeling parameters. The two main modules considered in this research are fault diagnosis and failure prognosis. With the goal of designing an efficient and reliable fault diagnosis scheme, a novel Anomaly Detector is suggested based on the LS-SVM machines. The proposed scheme uses only baseline data to construct a 1-class LS-SVM machine which, when presented with online data is able to distinguish between normal behavior
General Dimensional Multiple-Output Support Vector Regressions and Their Multiple Kernel Learning.
Chung, Wooyong; Kim, Jisu; Lee, Heejin; Kim, Euntai
2015-11-01
Support vector regression has been considered as one of the most important regression or function approximation methodologies in a variety of fields. In this paper, two new general dimensional multiple output support vector regressions (MSVRs) named SOCPL1 and SOCPL2 are proposed. The proposed methods are formulated in the dual space and their relationship with the previous works is clearly investigated. Further, the proposed MSVRs are extended into the multiple kernel learning and their training is implemented by the off-the-shelf convex optimization tools. The proposed MSVRs are applied to benchmark problems and their performances are compared with those of the previous methods in the experimental section.
Environmental noise forecasting based on support vector machine
Fu, Yumei; Zan, Xinwu; Chen, Tianyi; Xiang, Shihan
2018-01-01
As an important pollution source, the noise pollution is always the researcher's focus. Especially in recent years, the noise pollution is seriously harmful to the human beings' environment, so the research about the noise pollution is a very hot spot. Some noise monitoring technologies and monitoring systems are applied in the environmental noise test, measurement and evaluation. But, the research about the environmental noise forecasting is weak. In this paper, a real-time environmental noise monitoring system is introduced briefly. This monitoring system is working in Mianyang City, Sichuan Province. It is monitoring and collecting the environmental noise about more than 20 enterprises in this district. Based on the large amount of noise data, the noise forecasting by the Support Vector Machine (SVM) is studied in detail. Compared with the time series forecasting model and the artificial neural network forecasting model, the SVM forecasting model has some advantages such as the smaller data size, the higher precision and stability. The noise forecasting results based on the SVM can provide the important and accuracy reference to the prevention and control of the environmental noise.
Nonlinear structural damage detection using support vector machines
Xiao, Li; Qu, Wenzhong
2012-04-01
An actual structure including connections and interfaces may exist nonlinear. Because of many complicated problems about nonlinear structural health monitoring (SHM), relatively little progress have been made in this aspect. Statistical pattern recognition techniques have been demonstrated to be competitive with other methods when applied to real engineering datasets. When a structure existing 'breathing' cracks that open and close under operational loading may cause a linear structural system to respond to its operational and environmental loads in a nonlinear manner nonlinear. In this paper, a vibration-based structural health monitoring when the structure exists cracks is investigated with autoregressive support vector machine (AR-SVM). Vibration experiments are carried out with a model frame. Time-series data in different cases such as: initial linear structure; linear structure with mass changed; nonlinear structure; nonlinear structure with mass changed are acquired.AR model of acceleration time-series is established, and different kernel function types and corresponding parameters are chosen and compared, which can more accurate, more effectively locate the damage. Different cases damaged states and different damage positions have been recognized successfully. AR-SVM method for the insufficient training samples is proved to be practical and efficient on structure nonlinear damage detection.
Automatic inspection of textured surfaces by support vector machines
Jahanbin, Sina; Bovik, Alan C.; Pérez, Eduardo; Nair, Dinesh
2009-08-01
Automatic inspection of manufactured products with natural looking textures is a challenging task. Products such as tiles, textile, leather, and lumber project image textures that cannot be modeled as periodic or otherwise regular; therefore, a stochastic modeling of local intensity distribution is required. An inspection system to replace human inspectors should be flexible in detecting flaws such as scratches, cracks, and stains occurring in various shapes and sizes that have never been seen before. A computer vision algorithm is proposed in this paper that extracts local statistical features from grey-level texture images decomposed with wavelet frames into subbands of various orientations and scales. The local features extracted are second order statistics derived from grey-level co-occurrence matrices. Subsequently, a support vector machine (SVM) classifier is trained to learn a general description of normal texture from defect-free samples. This algorithm is implemented in LabVIEW and is capable of processing natural texture images in real-time.
Using support vector machines in the multivariate state estimation technique
International Nuclear Information System (INIS)
Zavaljevski, N.; Gross, K.C.
1999-01-01
One approach to validate nuclear power plant (NPP) signals makes use of pattern recognition techniques. This approach often assumes that there is a set of signal prototypes that are continuously compared with the actual sensor signals. These signal prototypes are often computed based on empirical models with little or no knowledge about physical processes. A common problem of all data-based models is their limited ability to make predictions on the basis of available training data. Another problem is related to suboptimal training algorithms. Both of these potential shortcomings with conventional approaches to signal validation and sensor operability validation are successfully resolved by adopting a recently proposed learning paradigm called the support vector machine (SVM). The work presented here is a novel application of SVM for data-based modeling of system state variables in an NPP, integrated with a nonlinear, nonparametric technique called the multivariate state estimation technique (MSET), an algorithm developed at Argonne National Laboratory for a wide range of nuclear plant applications
Identifying translation initiation sites in prokaryotes using support vector machine.
Gao, Tingting; Yang, Zhixia; Wang, Yong; Jing, Ling
2010-02-21
Gene identification in genomes has been a fundamental and long-standing task in bioinformatics and computational biology. Many computational methods have been developed to predict genes in prokaryote genomes by identifying translation initiation site (TIS) in transcript data. However, the pseudo-TISs at the genome level make these methods suffer from a high number of false positive predictions. In addition, most of the existing tools use an unsupervised learning framework, whose predictive accuracy may depend on the choice of specific organism. In this paper, we present a supervised learning method, support vector machine (SVM), to identify translation initiation site at the genome level. The features are extracted from the sequence data by modeling the sequence segment around predicted TISs as a position specific weight matrix (PSWM). We train the parameters of our SVM through well constructed positive and negative TIS datasets. Then we apply the method to recognize translation initiation sites in E. coli, B. subtilis, and validate our method on two GC-rich bacteria genomes: Pseudomonas aeruginosa and Burkholderia pseudomallei K96243. We show that translation initiation sites can be recognized accurately at the genome level by our method, irrespective of their GC content. Furthermore, we compare our method with four existing methods and demonstrate that our method outperform these methods by obtaining better performance in all the four organisms. (c) 2009. Published by Elsevier Ltd.
Targeted Local Support Vector Machine for Age-Dependent Classification.
Chen, Tianle; Wang, Yuanjia; Chen, Huaihou; Marder, Karen; Zeng, Donglin
2014-09-01
We develop methods to accurately predict whether pre-symptomatic individuals are at risk of a disease based on their various marker profiles, which offers an opportunity for early intervention well before definitive clinical diagnosis. For many diseases, existing clinical literature may suggest the risk of disease varies with some markers of biological and etiological importance, for example age. To identify effective prediction rules using nonparametric decision functions, standard statistical learning approaches treat markers with clear biological importance (e.g., age) and other markers without prior knowledge on disease etiology interchangeably as input variables. Therefore, these approaches may be inadequate in singling out and preserving the effects from the biologically important variables, especially in the presence of potential noise markers. Using age as an example of a salient marker to receive special care in the analysis, we propose a local smoothing large margin classifier implemented with support vector machine (SVM) to construct effective age-dependent classification rules. The method adaptively adjusts age effect and separately tunes age and other markers to achieve optimal performance. We derive the asymptotic risk bound of the local smoothing SVM, and perform extensive simulation studies to compare with standard approaches. We apply the proposed method to two studies of premanifest Huntington's disease (HD) subjects and controls to construct age-sensitive predictive scores for the risk of HD and risk of receiving HD diagnosis during the study period.
Mobile Phonocardiogram Diagnosis in Newborns Using Support Vector Machine
Directory of Open Access Journals (Sweden)
Amir Mohammad Amiri
2017-03-01
Full Text Available Phonocardiogram (PCG monitoring on newborns is one of the most important and challenging tasks in the heart assessment in the early ages of life. In this paper, we present a novel approach for cardiac monitoring applied in PCG data. This basic system coupled with denoising, segmentation, cardiac cycle selection and classification of heart sound can be used widely for a large number of the data. This paper describes the problems and additional advantages of the PCG method including the possibility of recording heart sound at home, removing unwanted noises and data reduction on a mobile device, and an intelligent system to diagnose heart diseases on the cloud server. A wide range of physiological features from various analysis domains, including modeling, time/frequency domain analysis, an algorithm, etc., is proposed in order to extract features which will be considered as inputs for the classifier. In order to record the PCG data set from multiple subjects over one year, an electronic stethoscope was used for collecting data that was connected to a mobile device. In this study, we used different types of classifiers in order to distinguish between healthy and pathological heart sounds, and a comparison on the performances revealed that support vector machine (SVM provides 92.2% accuracy and AUC = 0.98 in a time of 1.14 seconds for training, on a dataset of 116 samples.
Voice Activity Detection Using Fuzzy Entropy and Support Vector Machine
Directory of Open Access Journals (Sweden)
R. Johny Elton
2016-08-01
Full Text Available This paper proposes support vector machine (SVM based voice activity detection using FuzzyEn to improve detection performance under noisy conditions. The proposed voice activity detection (VAD uses fuzzy entropy (FuzzyEn as a feature extracted from noise-reduced speech signals to train an SVM model for speech/non-speech classification. The proposed VAD method was tested by conducting various experiments by adding real background noises of different signal-to-noise ratios (SNR ranging from −10 dB to 10 dB to actual speech signals collected from the TIMIT database. The analysis proves that FuzzyEn feature shows better results in discriminating noise and corrupted noisy speech. The efficacy of the SVM classifier was validated using 10-fold cross validation. Furthermore, the results obtained by the proposed method was compared with those of previous standardized VAD algorithms as well as recently developed methods. Performance comparison suggests that the proposed method is proven to be more efficient in detecting speech under various noisy environments with an accuracy of 93.29%, and the FuzzyEn feature detects speech efficiently even at low SNR levels.
Classification of masses on mammograms using support vector machine
Chu, Yong; Li, Lihua; Goldgof, Dmitry B.; Qui, Yan; Clark, Robert A.
2003-05-01
Mammography is the most effective method for early detection of breast cancer. However, the positive predictive value for classification of malignant and benign lesion from mammographic images is not very high. Clinical studies have shown that most biopsies for cancer are very low, between 15% and 30%. It is important to increase the diagnostic accuracy by improving the positive predictive value to reduce the number of unnecessary biopsies. In this paper, a new classification method was proposed to distinguish malignant from benign masses in mammography by Support Vector Machine (SVM) method. Thirteen features were selected based on receiver operating characteristic (ROC) analysis of classification using individual feature. These features include four shape features, two gradient features and seven Laws features. With these features, SVM was used to classify the masses into two categories, benign and malignant, in which a Gaussian kernel and sequential minimal optimization learning technique are performed. The data set used in this study consists of 193 cases, in which there are 96 benign cases and 97 malignant cases. The leave-one-out evaluation of SVM classifier was taken. The results show that the positive predict value of the presented method is 81.6% with the sensitivity of 83.7% and the false-positive rate of 30.2%. It demonstrated that the SVM-based classifier is effective in mass classification.
Fault size classification of rotating machinery using support vector machine
Energy Technology Data Exchange (ETDEWEB)
Kim, Y. S.; Lee, D. H.; Park, S. K. [Korea Hydro and Nuclear Power Co. Ltd., Daejeon (Korea, Republic of)
2012-03-15
Studies on fault diagnosis of rotating machinery have been carried out to obtain a machinery condition in two ways. First is a classical approach based on signal processing and analysis using vibration and acoustic signals. Second is to use artificial intelligence techniques to classify machinery conditions into normal or one of the pre-determined fault conditions. Support Vector Machine (SVM) is well known as intelligent classifier with robust generalization ability. In this study, a two-step approach is proposed to predict fault types and fault sizes of rotating machinery in nuclear power plants using multi-class SVM technique. The model firstly classifies normal and 12 fault types and then identifies their sizes in case of predicting any faults. The time and frequency domain features are extracted from the measured vibration signals and used as input to SVM. A test rig is used to simulate normal and the well-know 12 artificial fault conditions with three to six fault sizes of rotating machinery. The application results to the test data show that the present method can estimate fault types as well as fault sizes with high accuracy for bearing an shaft-related faults and misalignment. Further research, however, is required to identify fault size in case of unbalance, rubbing, looseness, and coupling-related faults.
Incremental Support Vector Machine Framework for Visual Sensor Networks
Directory of Open Access Journals (Sweden)
Yuichi Motai
2007-01-01
Full Text Available Motivated by the emerging requirements of surveillance networks, we present in this paper an incremental multiclassification support vector machine (SVM technique as a new framework for action classification based on real-time multivideo collected by homogeneous sites. The technique is based on an adaptation of least square SVM (LS-SVM formulation but extends beyond the static image-based learning of current SVM methodologies. In applying the technique, an initial supervised offline learning phase is followed by a visual behavior data acquisition and an online learning phase during which the cluster head performs an ensemble of model aggregations based on the sensor nodes inputs. The cluster head then selectively switches on designated sensor nodes for future incremental learning. Combining sensor data offers an improvement over single camera sensing especially when the latter has an occluded view of the target object. The optimization involved alleviates the burdens of power consumption and communication bandwidth requirements. The resulting misclassification error rate, the iterative error reduction rate of the proposed incremental learning, and the decision fusion technique prove its validity when applied to visual sensor networks. Furthermore, the enabled online learning allows an adaptive domain knowledge insertion and offers the advantage of reducing both the model training time and the information storage requirements of the overall system which makes it even more attractive for distributed sensor networks communication.
A Semisupervised Support Vector Machines Algorithm for BCI Systems
Qin, Jianzhao; Li, Yuanqing; Sun, Wei
2007-01-01
As an emerging technology, brain-computer interfaces (BCIs) bring us new communication interfaces which translate brain activities into control signals for devices like computers, robots, and so forth. In this study, we propose a semisupervised support vector machine (SVM) algorithm for brain-computer interface (BCI) systems, aiming at reducing the time-consuming training process. In this algorithm, we apply a semisupervised SVM for translating the features extracted from the electrical recordings of brain into control signals. This SVM classifier is built from a small labeled data set and a large unlabeled data set. Meanwhile, to reduce the time for training semisupervised SVM, we propose a batch-mode incremental learning method, which can also be easily applied to the online BCI systems. Additionally, it is suggested in many studies that common spatial pattern (CSP) is very effective in discriminating two different brain states. However, CSP needs a sufficient labeled data set. In order to overcome the drawback of CSP, we suggest a two-stage feature extraction method for the semisupervised learning algorithm. We apply our algorithm to two BCI experimental data sets. The offline data analysis results demonstrate the effectiveness of our algorithm. PMID:18368141
A Semisupervised Support Vector Machines Algorithm for BCI Systems
Directory of Open Access Journals (Sweden)
Jianzhao Qin
2007-07-01
Full Text Available As an emerging technology, brain-computer interfaces (BCIs bring us new communication interfaces which translate brain activities into control signals for devices like computers, robots, and so forth. In this study, we propose a semisupervised support vector machine (SVM algorithm for brain-computer interface (BCI systems, aiming at reducing the time-consuming training process. In this algorithm, we apply a semisupervised SVM for translating the features extracted from the electrical recordings of brain into control signals. This SVM classifier is built from a small labeled data set and a large unlabeled data set. Meanwhile, to reduce the time for training semisupervised SVM, we propose a batch-mode incremental learning method, which can also be easily applied to the online BCI systems. Additionally, it is suggested in many studies that common spatial pattern (CSP is very effective in discriminating two different brain states. However, CSP needs a sufficient labeled data set. In order to overcome the drawback of CSP, we suggest a two-stage feature extraction method for the semisupervised learning algorithm. We apply our algorithm to two BCI experimental data sets. The offline data analysis results demonstrate the effectiveness of our algorithm.
Incremental support vector machines for fast reliable image recognition
Energy Technology Data Exchange (ETDEWEB)
Makili, L., E-mail: makili_le@yahoo.com [Instituto Superior Politécnico da Universidade Katyavala Bwila, Benguela (Angola); Vega, J. [Asociación EURATOM/CIEMAT para Fusión, Madrid (Spain); Dormido-Canto, S. [Dpto. Informática y Automática – UNED, Madrid (Spain)
2013-10-15
Highlights: ► A conformal predictor using SVM as the underlying algorithm was implemented. ► It was applied to image recognition in the TJ–II's Thomson Scattering Diagnostic. ► To improve time efficiency an approach to incremental SVM training has been used. ► Accuracy is similar to the one reached when standard SVM is used. ► Computational time saving is significant for large training sets. -- Abstract: This paper addresses the reliable classification of images in a 5-class problem. To this end, an automatic recognition system, based on conformal predictors and using Support Vector Machines (SVM) as the underlying algorithm has been developed and applied to the recognition of images in the Thomson Scattering Diagnostic of the TJ–II fusion device. Using such conformal predictor based classifier is a computationally intensive task since it implies to train several SVM models to classify a single example and to perform this training from scratch takes a significant amount of time. In order to improve the classification time efficiency, an approach to the incremental training of SVM has been used as the underlying algorithm. Experimental results show that the overall performance of the new classifier is high, comparable to the one corresponding to the use of standard SVM as the underlying algorithm and there is a significant improvement in time efficiency.
Distributed support vector machine in master-slave mode.
Chen, Qingguo; Cao, Feilong
2018-05-01
It is well known that the support vector machine (SVM) is an effective learning algorithm. The alternating direction method of multipliers (ADMM) algorithm has emerged as a powerful technique for solving distributed optimisation models. This paper proposes a distributed SVM algorithm in a master-slave mode (MS-DSVM), which integrates a distributed SVM and ADMM acting in a master-slave configuration where the master node and slave nodes are connected, meaning the results can be broadcasted. The distributed SVM is regarded as a regularised optimisation problem and modelled as a series of convex optimisation sub-problems that are solved by ADMM. Additionally, the over-relaxation technique is utilised to accelerate the convergence rate of the proposed MS-DSVM. Our theoretical analysis demonstrates that the proposed MS-DSVM has linear convergence, meaning it possesses the fastest convergence rate among existing standard distributed ADMM algorithms. Numerical examples demonstrate that the convergence and accuracy of the proposed MS-DSVM are superior to those of existing methods under the ADMM framework. Copyright © 2018 Elsevier Ltd. All rights reserved.
Fault size classification of rotating machinery using support vector machine
International Nuclear Information System (INIS)
Kim, Y. S.; Lee, D. H.; Park, S. K.
2012-01-01
Studies on fault diagnosis of rotating machinery have been carried out to obtain a machinery condition in two ways. First is a classical approach based on signal processing and analysis using vibration and acoustic signals. Second is to use artificial intelligence techniques to classify machinery conditions into normal or one of the pre-determined fault conditions. Support Vector Machine (SVM) is well known as intelligent classifier with robust generalization ability. In this study, a two-step approach is proposed to predict fault types and fault sizes of rotating machinery in nuclear power plants using multi-class SVM technique. The model firstly classifies normal and 12 fault types and then identifies their sizes in case of predicting any faults. The time and frequency domain features are extracted from the measured vibration signals and used as input to SVM. A test rig is used to simulate normal and the well-know 12 artificial fault conditions with three to six fault sizes of rotating machinery. The application results to the test data show that the present method can estimate fault types as well as fault sizes with high accuracy for bearing an shaft-related faults and misalignment. Further research, however, is required to identify fault size in case of unbalance, rubbing, looseness, and coupling-related faults
Support vector regression model for the estimation of γ-ray buildup factors for multi-layer shields
International Nuclear Information System (INIS)
Trontl, Kresimir; Smuc, Tomislav; Pevec, Dubravko
2007-01-01
The accuracy of the point-kernel method, which is a widely used practical tool for γ-ray shielding calculations, strongly depends on the quality and accuracy of buildup factors used in the calculations. Although, buildup factors for single-layer shields comprised of a single material are well known, calculation of buildup factors for stratified shields, each layer comprised of different material or a combination of materials, represent a complex physical problem. Recently, a new compact mathematical model for multi-layer shield buildup factor representation has been suggested for embedding into point-kernel codes thus replacing traditionally generated complex mathematical expressions. The new regression model is based on support vector machines learning technique, which is an extension of Statistical Learning Theory. The paper gives complete description of the novel methodology with results pertaining to realistic engineering multi-layer shielding geometries. The results based on support vector regression machine learning confirm that this approach provides a framework for general, accurate and computationally acceptable multi-layer buildup factor model
Directory of Open Access Journals (Sweden)
Xiaoqian Jiang
Full Text Available Historically, probabilistic models for decision support have focused on discrimination, e.g., minimizing the ranking error of predicted outcomes. Unfortunately, these models ignore another important aspect, calibration, which indicates the magnitude of correctness of model predictions. Using discrimination and calibration simultaneously can be helpful for many clinical decisions. We investigated tradeoffs between these goals, and developed a unified maximum-margin method to handle them jointly. Our approach called, Doubly Optimized Calibrated Support Vector Machine (DOC-SVM, concurrently optimizes two loss functions: the ridge regression loss and the hinge loss. Experiments using three breast cancer gene-expression datasets (i.e., GSE2034, GSE2990, and Chanrion's datasets showed that our model generated more calibrated outputs when compared to other state-of-the-art models like Support Vector Machine (p=0.03, p=0.13, and p<0.001 and Logistic Regression (p=0.006, p=0.008, and p<0.001. DOC-SVM also demonstrated better discrimination (i.e., higher AUCs when compared to Support Vector Machine (p=0.38, p=0.29, and p=0.047 and Logistic Regression (p=0.38, p=0.04, and p<0.0001. DOC-SVM produced a model that was better calibrated without sacrificing discrimination, and hence may be helpful in clinical decision making.
Adaptive image denoising based on support vector machine and wavelet description
An, Feng-Ping; Zhou, Xian-Wei
2017-12-01
Adaptive image denoising method decomposes the original image into a series of basic pattern feature images on the basis of wavelet description and constructs the support vector machine regression function to realize the wavelet description of the original image. The support vector machine method allows the linear expansion of the signal to be expressed as a nonlinear function of the parameters associated with the SVM. Using the radial basis kernel function of SVM, the original image can be extended into a MEXICAN function and a residual trend. This MEXICAN represents a basic image feature pattern. If the residual does not fluctuate, it can also be represented as a characteristic pattern. If the residuals fluctuate significantly, it is treated as a new image and the same decomposition process is repeated until the residuals obtained by the decomposition do not significantly fluctuate. Experimental results show that the proposed method in this paper performs well; especially, it satisfactorily solves the problem of image noise removal. It may provide a new tool and method for image denoising.
Directory of Open Access Journals (Sweden)
Marcel Schwieder
2014-04-01
Full Text Available Anthropogenic interventions in natural and semi-natural ecosystems often lead to substantial changes in their functioning and may ultimately threaten ecosystem service provision. It is, therefore, necessary to monitor these changes in order to understand their impacts and to support management decisions that help ensuring sustainability. Remote sensing has proven to be a valuable tool for these purposes, and especially hyperspectral sensors are expected to provide valuable data for quantitative characterization of land change processes. In this study, simulated EnMAP data were used for mapping shrub cover fractions along a gradient of shrub encroachment, in a study region in southern Portugal. We compared three machine learning regression techniques: Support Vector Regression (SVR; Random Forest Regression (RF; and Partial Least Squares Regression (PLSR. Additionally, we compared the influence of training sample size on the prediction performance. All techniques showed reasonably good results when trained with large samples, while SVR always outperformed the other algorithms. The best model was applied to produce a fractional shrub cover map for the whole study area. The predicted patterns revealed a gradient of shrub cover between regions affected by special agricultural management schemes for nature protection and areas without land use incentives. Our results highlight the value of EnMAP data in combination with machine learning regression techniques for monitoring gradual land change processes.
Classification of Regional Ionospheric Disturbances Based on Support Vector Machines
Begüm Terzi, Merve; Arikan, Feza; Arikan, Orhan; Karatay, Secil
2016-07-01
Ionosphere is an anisotropic, inhomogeneous, time varying and spatio-temporally dispersive medium whose parameters can be estimated almost always by using indirect measurements. Geomagnetic, gravitational, solar or seismic activities cause variations of ionosphere at various spatial and temporal scales. This complex spatio-temporal variability is challenging to be identified due to extensive scales in period, duration, amplitude and frequency of disturbances. Since geomagnetic and solar indices such as Disturbance storm time (Dst), F10.7 solar flux, Sun Spot Number (SSN), Auroral Electrojet (AE), Kp and W-index provide information about variability on a global scale, identification and classification of regional disturbances poses a challenge. The main aim of this study is to classify the regional effects of global geomagnetic storms and classify them according to their risk levels. For this purpose, Total Electron Content (TEC) estimated from GPS receivers, which is one of the major parameters of ionosphere, will be used to model the regional and local variability that differs from global activity along with solar and geomagnetic indices. In this work, for the automated classification of the regional disturbances, a classification technique based on a robust machine learning technique that have found wide spread use, Support Vector Machine (SVM) is proposed. SVM is a supervised learning model used for classification with associated learning algorithm that analyze the data and recognize patterns. In addition to performing linear classification, SVM can efficiently perform nonlinear classification by embedding data into higher dimensional feature spaces. Performance of the developed classification technique is demonstrated for midlatitude ionosphere over Anatolia using TEC estimates generated from the GPS data provided by Turkish National Permanent GPS Network (TNPGN-Active) for solar maximum year of 2011. As a result of implementing the developed classification
Combining extreme learning machines using support vector machines for breast tissue classification.
Daliri, Mohammad Reza
2015-01-01
In this paper, we present a new approach for breast tissue classification using the features derived from electrical impedance spectroscopy. This method is composed of a feature extraction method, feature selection phase and a classification step. The feature extraction phase derives the features from the electrical impedance spectra. The extracted features consist of the impedivity at zero frequency (I0), the phase angle at 500 KHz, the high-frequency slope of phase angle, the impedance distance between spectral ends, the area under spectrum, the normalised area, the maximum of the spectrum, the distance between impedivity at I0 and the real part of the maximum frequency point and the length of the spectral curve. The system uses the information theoretic criterion as a strategy for feature selection and the combining extreme learning machines (ELMs) for the classification phase. The results of several ELMs are combined using the support vector machines classifier, and the result of classification is reported as a measure of the performance of the system. The results indicate that the proposed system achieves high accuracy in classification of breast tissues using the electrical impedance spectroscopy.
Classification of burn wounds using support vector machines
Acha, Begona; Serrano, Carmen; Palencia, Sergio; Murillo, Juan Jose
2004-05-01
The purpose of this work is to improve a previous method developed by the authors for the classification of burn wounds into their depths. The inputs of the system are color and texture information, as these are the characteristics observed by physicians in order to give a diagnosis. Our previous work consisted in segmenting the burn wound from the rest of the image and classifying the burn into its depth. In this paper we focus on the classification problem only. We already proposed to use a Fuzzy-ARTMAP neural network (NN). However, we may take advantage of new powerful classification tools such as Support Vector Machines (SVM). We apply the five-folded cross validation scheme to divide the database into training and validating sets. Then, we apply a feature selection method for each classifier, which will give us the set of features that yields the smallest classification error for each classifier. Features used to classify are first-order statistical parameters extracted from the L*, u* and v* color components of the image. The feature selection algorithms used are the Sequential Forward Selection (SFS) and the Sequential Backward Selection (SBS) methods. As data of the problem faced here are not linearly separable, the SVM was trained using some different kernels. The validating process shows that the SVM method, when using a Gaussian kernel of variance 1, outperforms classification results obtained with the rest of the classifiers, yielding an error classification rate of 0.7% whereas the Fuzzy-ARTMAP NN attained 1.6 %.
Transmembrane protein topology prediction using support vector machines
Directory of Open Access Journals (Sweden)
Nugent Timothy
2009-05-01
Full Text Available Abstract Background Alpha-helical transmembrane (TM proteins are involved in a wide range of important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion. Many are also prime drug targets, and it has been estimated that more than half of all drugs currently on the market target membrane proteins. However, due to the experimental difficulties involved in obtaining high quality crystals, this class of protein is severely under-represented in structural databases. In the absence of structural data, sequence-based prediction methods allow TM protein topology to be investigated. Results We present a support vector machine-based (SVM TM protein topology predictor that integrates both signal peptide and re-entrant helix prediction, benchmarked with full cross-validation on a novel data set of 131 sequences with known crystal structures. The method achieves topology prediction accuracy of 89%, while signal peptides and re-entrant helices are predicted with 93% and 44% accuracy respectively. An additional SVM trained to discriminate between globular and TM proteins detected zero false positives, with a low false negative rate of 0.4%. We present the results of applying these tools to a number of complete genomes. Source code, data sets and a web server are freely available from http://bioinf.cs.ucl.ac.uk/psipred/. Conclusion The high accuracy of TM topology prediction which includes detection of both signal peptides and re-entrant helices, combined with the ability to effectively discriminate between TM and globular proteins, make this method ideally suited to whole genome annotation of alpha-helical transmembrane proteins.
Profiled support vector machines for antisense oligonucleotide efficacy prediction
Directory of Open Access Journals (Sweden)
Martín-Guerrero José D
2004-09-01
Full Text Available Abstract Background This paper presents the use of Support Vector Machines (SVMs for prediction and analysis of antisense oligonucleotide (AO efficacy. The collected database comprises 315 AO molecules including 68 features each, inducing a problem well-suited to SVMs. The task of feature selection is crucial given the presence of noisy or redundant features, and the well-known problem of the curse of dimensionality. We propose a two-stage strategy to develop an optimal model: (1 feature selection using correlation analysis, mutual information, and SVM-based recursive feature elimination (SVM-RFE, and (2 AO prediction using standard and profiled SVM formulations. A profiled SVM gives different weights to different parts of the training data to focus the training on the most important regions. Results In the first stage, the SVM-RFE technique was most efficient and robust in the presence of low number of samples and high input space dimension. This method yielded an optimal subset of 14 representative features, which were all related to energy and sequence motifs. The second stage evaluated the performance of the predictors (overall correlation coefficient between observed and predicted efficacy, r; mean error, ME; and root-mean-square-error, RMSE using 8-fold and minus-one-RNA cross-validation methods. The profiled SVM produced the best results (r = 0.44, ME = 0.022, and RMSE= 0.278 and predicted high (>75% inhibition of gene expression and low efficacy (http://aosvm.cgb.ki.se/. Conclusions The SVM approach is well suited to the AO prediction problem, and yields a prediction accuracy superior to previous methods. The profiled SVM was found to perform better than the standard SVM, suggesting that it could lead to improvements in other prediction problems as well.
Application of Hybrid Quantum Tabu Search with Support Vector Regression (SVR for Load Forecasting
Directory of Open Access Journals (Sweden)
Cheng-Wen Lee
2016-10-01
Full Text Available Hybridizing chaotic evolutionary algorithms with support vector regression (SVR to improve forecasting accuracy is a hot topic in electricity load forecasting. Trapping at local optima and premature convergence are critical shortcomings of the tabu search (TS algorithm. This paper investigates potential improvements of the TS algorithm by applying quantum computing mechanics to enhance the search information sharing mechanism (tabu memory to improve the forecasting accuracy. This article presents an SVR-based load forecasting model that integrates quantum behaviors and the TS algorithm with the support vector regression model (namely SVRQTS to obtain a more satisfactory forecasting accuracy. Numerical examples demonstrate that the proposed model outperforms the alternatives.
Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment
Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty
2017-12-01
Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can solve the problem of classification and linear regression or nonlinear kernel which can be a learning algorithm for the ability of classification and regression. However, SVM also has a weakness that is difficult to determine the optimal parameter value. SVM calculates the best linear separator on the input feature space according to the training data. To classify data which are non-linearly separable, SVM uses kernel tricks to transform the data into a linearly separable data on a higher dimension feature space. The kernel trick using various kinds of kernel functions, such as : linear kernel, polynomial, radial base function (RBF) and sigmoid. Each function has parameters which affect the accuracy of SVM classification. To solve the problem genetic algorithms are proposed to be applied as the optimal parameter value search algorithm thus increasing the best classification accuracy on SVM. Data taken from UCI repository of machine learning database: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms has been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. The best accuracy for data has been upgraded from kernel Linear: 85.12%, polynomial: 81.76%, RBF: 77.22% Sigmoid: 78.70%. However, for bigger data sizes, this method is not practical because it takes a lot of time.
Prediction of toxicity of nitrobenzenes using ab initio and least squares support vector machines
Energy Technology Data Exchange (ETDEWEB)
Niazi, Ali [Department of Chemistry, Faculty of Sciences, Azad University of Arak, Arak (Iran, Islamic Republic of)], E-mail: ali.niazi@gmail.com; Jameh-Bozorghi, Saeed; Nori-Shargh, Davood [Department of Chemistry, Faculty of Sciences, Azad University of Arak, Arak (Iran, Islamic Republic of)
2008-03-01
A quantitative structure-property relationship (QSPR) study is suggested for the prediction of toxicity (IGC{sub 50}) of nitrobenzenes. Ab initio theory was used to calculate some quantum chemical descriptors including electrostatic potentials and local charges at each atom, HOMO and LUMO energies, etc. Modeling of the IGC{sub 50} of nitrobenzenes as a function of molecular structures was established by means of the least squares support vector machines (LS-SVM). This model was applied for the prediction of the toxicity (IGC{sub 50}) of nitrobenzenes, which were not in the modeling procedure. The resulted model showed high prediction ability with root mean square error of prediction of 0.0049 for LS-SVM. Results have shown that the introduction of LS-SVM for quantum chemical descriptors drastically enhances the ability of prediction in QSAR studies superior to multiple linear regression and partial least squares.
Unresolved Galaxy Classifier for ESA/Gaia mission: Support Vector Machines approach
Bellas-Velidis, Ioannis; Kontizas, Mary; Dapergolas, Anastasios; Livanou, Evdokia; Kontizas, Evangelos; Karampelas, Antonios
A software package Unresolved Galaxy Classifier (UGC) is being developed for the ground-based pipeline of ESA's Gaia mission. It aims to provide an automated taxonomic classification and specific parameters estimation analyzing Gaia BP/RP instrument low-dispersion spectra of unresolved galaxies. The UGC algorithm is based on a supervised learning technique, the Support Vector Machines (SVM). The software is implemented in Java as two separate modules. An offline learning module provides functions for SVM-models training. Once trained, the set of models can be repeatedly applied to unknown galaxy spectra by the pipeline's application module. A library of galaxy models synthetic spectra, simulated for the BP/RP instrument, is used to train and test the modules. Science tests show a very good classification performance of UGC and relatively good regression performance, except for some of the parameters. Possible approaches to improve the performance are discussed.
Prediction of toxicity of nitrobenzenes using ab initio and least squares support vector machines
International Nuclear Information System (INIS)
Niazi, Ali; Jameh-Bozorghi, Saeed; Nori-Shargh, Davood
2008-01-01
A quantitative structure-property relationship (QSPR) study is suggested for the prediction of toxicity (IGC 50 ) of nitrobenzenes. Ab initio theory was used to calculate some quantum chemical descriptors including electrostatic potentials and local charges at each atom, HOMO and LUMO energies, etc. Modeling of the IGC 50 of nitrobenzenes as a function of molecular structures was established by means of the least squares support vector machines (LS-SVM). This model was applied for the prediction of the toxicity (IGC 50 ) of nitrobenzenes, which were not in the modeling procedure. The resulted model showed high prediction ability with root mean square error of prediction of 0.0049 for LS-SVM. Results have shown that the introduction of LS-SVM for quantum chemical descriptors drastically enhances the ability of prediction in QSAR studies superior to multiple linear regression and partial least squares
Comparison of ν-support vector regression and logistic equation for ...
African Journals Online (AJOL)
Due to the complexity and high non-linearity of bioprocess, most simple mathematical models fail to describe the exact behavior of biochemistry systems. As a novel type of learning method, support vector regression (SVR) owns the powerful capability to characterize problems via small sample, nonlinearity, high dimension ...
Linking Simple Economic Theory Models and the Cointegrated Vector AutoRegressive Model
DEFF Research Database (Denmark)
Møller, Niels Framroze
This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector-Auto-Regressive model (CVAR). By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail, how the theoretical model and its...
Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.
Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi
2013-01-01
The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.
Vector spot compensation applied in a CNC system for laser cutting machine
Xie, Qiong; Li, Shimin; Xu, Zhene; Wu, Heqing; Yin, Quan
1999-09-01
The vector spot compensation (VSP) is discussed in CNC system for laser cutting machine. The accuracy of laser cutting will be improved by VSP. The modularized vector algorithm which is different from that of traditional geometry algorithm not only unites the symbol system and simplifies the programming complexity, but also improves the spot compensation speed and offers the essential data for the spline interpolation.
Misra, Ankita; Vojinovic, Zoran; Ramakrishnan, Balaji; Luijendijk, Arjen; Ranasinghe, Roshanka
2018-01-01
Satellite imagery along with image processing techniques prove to be efficient tools for bathymetry retrieval as they provide time and cost-effective alternatives to traditional methods of water depth estimation. In this article, a nonlinear machine learning technique of Support Vector Machine (SVM)
Vector control of three-phase AC machines system development in the practice
Quang, Nguyen Phung
2008-01-01
Covers the area of vector control of 3-phase AC machines, in particular induction motors with squirrel-cage rotor, permanent excited synchronous motors and doubly-fed induction machines. This title summarizes the basic structure of a field-oriented controlled 3-phase AC drive and grid voltage orientated controlled wind power plant.
Support vector machine used to diagnose the fault of rotor broken bars of induction motors
DEFF Research Database (Denmark)
Zhitong, Cao; Jiazhong, Fang; Hongpingn, Chen
2003-01-01
The data-based machine learning is an important aspect of modern intelligent technology, while statistical learning theory (SLT) is a new tool that studies the machine learning methods in the case of a small number of samples. As a common learning method, support vector machine (SVM) is derived...... to diagnose the faults of induction motors, and the results suggested that the SVM could yet be regarded as a new method in the fault diagnosis....
Advanced Machine Learning for Classification, Regression, and Generation in Jet Physics
CERN. Geneva
2017-01-01
There is a deep connection between machine learning and jet physics - after all, jets are defined by unsupervised learning algorithms. Jet physics has been a driving force for studying modern machine learning in high energy physics. Domain specific challenges require new techniques to make full use of the algorithms. A key focus is on understanding how and what the algorithms learn. Modern machine learning techniques for jet physics are demonstrated for classification, regression, and generation. In addition to providing powerful baseline performance, we show how to train complex models directly on data and to generate sparse stacked images with non-uniform granularity.
Hamidi, Omid; Tapak, Leili; Abbasi, Hamed; Maryanaji, Zohreh
2017-10-01
We have conducted a case study to investigate the performance of support vector machine, multivariate adaptive regression splines, and random forest time series methods in snowfall modeling. These models were applied to a data set of monthly snowfall collected during six cold months at Hamadan Airport sample station located in the Zagros Mountain Range in Iran. We considered monthly data of snowfall from 1981 to 2008 during the period from October/November to April/May as the training set and the data from 2009 to 2015 as the testing set. The root mean square errors (RMSE), mean absolute errors (MAE), determination coefficient (R 2), coefficient of efficiency (E%), and intra-class correlation coefficient (ICC) statistics were used as evaluation criteria. Our results indicated that the random forest time series model outperformed the support vector machine and multivariate adaptive regression splines models in predicting monthly snowfall in terms of several criteria. The RMSE, MAE, R 2, E, and ICC for the testing set were 7.84, 5.52, 0.92, 0.89, and 0.93, respectively. The overall results indicated that the random forest time series model could be successfully used to estimate monthly snowfall values. Moreover, the support vector machine model showed substantial performance as well, suggesting it may also be applied to forecast snowfall in this area.
Diagnosis of dental deformities in cephalometry images using support vector machine.
Banumathi, Arumugam; Raju, S; Abhaikumar, Varathan
2011-02-01
This paper proposes an automated target recognition algorithm using Support Vector Machine (SVM) to extract landmark points for craniofacial features in cephalometry radiograph. The features are extracted by subjecting the radiograph to the Projected Principle Edge Distribution (PPED) algorithm. Edge flags are accumulated in every four columns and spatial distribution of edge flags are represented by a histogram. The resultants are the front end of support vector machine. Vectors, which possess land marks, are separated from all other vectors. The centroid points, automatically determined from PPED vectors, are the location of landmarks. The landmark points which are serving as a guide for construction and measurement of planes, are used to evaluate the dento-facial relationship, study of growth and development, and also for treatment planning.
Directory of Open Access Journals (Sweden)
Xiaomin Xu
2015-01-01
Full Text Available Not only can the icing coat on transmission line cause the electrical fault of gap discharge and icing flashover but also it will lead to the mechanical failure of tower, conductor, insulators, and others. It will bring great harm to the people’s daily life and work. Thus, accurate prediction of ice thickness has important significance for power department to control the ice disaster effectively. Based on the analysis of standard support vector machine, this paper presents a weighted support vector machine regression model based on the similarity (WSVR. According to the different importance of samples, this paper introduces the weighted support vector machine and optimizes its parameters by hybrid swarm intelligence optimization algorithm with the particle swarm and ant colony (PSO-ACO, which improves the generalization ability of the model. In the case study, the actual data of ice thickness and climate in a certain area of Hunan province have been used to predict the icing thickness of the area, which verifies the validity and applicability of this proposed method. The predicted results show that the intelligent model proposed in this paper has higher precision and stronger generalization ability.
MANCOVA for one way classification with homogeneity of regression coefficient vectors
Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.
2017-11-01
The MANOVA and MANCOVA are the extensions of the univariate ANOVA and ANCOVA techniques to multidimensional or vector valued observations. The assumption of a Gaussian distribution has been replaced with the Multivariate Gaussian distribution for the vectors data and residual term variables in the statistical models of these techniques. The objective of MANCOVA is to determine if there are statistically reliable mean differences that can be demonstrated between groups later modifying the newly created variable. When randomization assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting dependent variables as if all subjects scored the same on the covariates. In this research article, an extension has been made to the MANCOVA technique with more number of covariates and homogeneity of regression coefficient vectors is also tested.
Modeling and prediction of Turkey's electricity consumption using Support Vector Regression
International Nuclear Information System (INIS)
Kavaklioglu, Kadir
2011-01-01
Support Vector Regression (SVR) methodology is used to model and predict Turkey's electricity consumption. Among various SVR formalisms, ε-SVR method was used since the training pattern set was relatively small. Electricity consumption is modeled as a function of socio-economic indicators such as population, Gross National Product, imports and exports. In order to facilitate future predictions of electricity consumption, a separate SVR model was created for each of the input variables using their current and past values; and these models were combined to yield consumption prediction values. A grid search for the model parameters was performed to find the best ε-SVR model for each variable based on Root Mean Square Error. Electricity consumption of Turkey is predicted until 2026 using data from 1975 to 2006. The results show that electricity consumption can be modeled using Support Vector Regression and the models can be used to predict future electricity consumption. (author)
Directory of Open Access Journals (Sweden)
Wang Lily
2008-07-01
Full Text Available Abstract Background Cancer diagnosis and clinical outcome prediction are among the most important emerging applications of gene expression microarray technology with several molecular signatures on their way toward clinical deployment. Use of the most accurate classification algorithms available for microarray gene expression data is a critical ingredient in order to develop the best possible molecular signatures for patient care. As suggested by a large body of literature to date, support vector machines can be considered "best of class" algorithms for classification of such data. Recent work, however, suggests that random forest classifiers may outperform support vector machines in this domain. Results In the present paper we identify methodological biases of prior work comparing random forests and support vector machines and conduct a new rigorous evaluation of the two algorithms that corrects these limitations. Our experiments use 22 diagnostic and prognostic datasets and show that support vector machines outperform random forests, often by a large margin. Our data also underlines the importance of sound research design in benchmarking and comparison of bioinformatics algorithms. Conclusion We found that both on average and in the majority of microarray datasets, random forests are outperformed by support vector machines both in the settings when no gene selection is performed and when several popular gene selection methods are used.
International Nuclear Information System (INIS)
Riaz, Nadeem; Wiersma, Rodney; Mao Weihua; Xing Lei; Shanker, Piyush; Gudmundsson, Olafur; Widrow, Bernard
2009-01-01
Intra-fraction tumor tracking methods can improve radiation delivery during radiotherapy sessions. Image acquisition for tumor tracking and subsequent adjustment of the treatment beam with gating or beam tracking introduces time latency and necessitates predicting the future position of the tumor. This study evaluates the use of multi-dimensional linear adaptive filters and support vector regression to predict the motion of lung tumors tracked at 30 Hz. We expand on the prior work of other groups who have looked at adaptive filters by using a general framework of a multiple-input single-output (MISO) adaptive system that uses multiple correlated signals to predict the motion of a tumor. We compare the performance of these two novel methods to conventional methods like linear regression and single-input, single-output adaptive filters. At 400 ms latency the average root-mean-square-errors (RMSEs) for the 14 treatment sessions studied using no prediction, linear regression, single-output adaptive filter, MISO and support vector regression are 2.58, 1.60, 1.58, 1.71 and 1.26 mm, respectively. At 1 s, the RMSEs are 4.40, 2.61, 3.34, 2.66 and 1.93 mm, respectively. We find that support vector regression most accurately predicts the future tumor position of the methods studied and can provide a RMSE of less than 2 mm at 1 s latency. Also, a multi-dimensional adaptive filter framework provides improved performance over single-dimension adaptive filters. Work is underway to combine these two frameworks to improve performance.
Pham, Binh Thai; Prakash, Indra; Tien Bui, Dieu
2018-02-01
A hybrid machine learning approach of Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. This model is a combination of the RSS method which is known as an efficient ensemble technique and the CART which is a state of the art classifier. The Luc Yen district of Yen Bai province, a prominent landslide prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In the development of model, ten important landslide affecting factors related with geomorphology, geology and geo-environment were considered namely slope angles, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Performance of the RSSCART model (AUC = 0.841) is the best compared with other popular landslide models namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that performance of the RSSCART is a promising method for spatial landslide prediction.
LINEAR KERNEL SUPPORT VECTOR MACHINES FOR MODELING PORE-WATER PRESSURE RESPONSES
Directory of Open Access Journals (Sweden)
KHAMARUZAMAN W. YUSOF
2017-08-01
Full Text Available Pore-water pressure responses are vital in many aspects of slope management, design and monitoring. Its measurement however, is difficult, expensive and time consuming. Studies on its predictions are lacking. Support vector machines with linear kernel was used here to predict the responses of pore-water pressure to rainfall. Pore-water pressure response data was collected from slope instrumentation program. Support vector machine meta-parameter calibration and model development was carried out using grid search and k-fold cross validation. The mean square error for the model on scaled test data is 0.0015 and the coefficient of determination is 0.9321. Although pore-water pressure response to rainfall is a complex nonlinear process, the use of linear kernel support vector machine can be employed where high accuracy can be sacrificed for computational ease and time.
Terzic, Jenny; Nagarajah, Romesh; Alamgir, Muhammad
2013-01-01
Accurate fluid level measurement in dynamic environments can be assessed using a Support Vector Machine (SVM) approach. SVM is a supervised learning model that analyzes and recognizes patterns. It is a signal classification technique which has far greater accuracy than conventional signal averaging methods. Ultrasonic Fluid Quantity Measurement in Dynamic Vehicular Applications: A Support Vector Machine Approach describes the research and development of a fluid level measurement system for dynamic environments. The measurement system is based on a single ultrasonic sensor. A Support Vector Machines (SVM) based signal characterization and processing system has been developed to compensate for the effects of slosh and temperature variation in fluid level measurement systems used in dynamic environments including automotive applications. It has been demonstrated that a simple ν-SVM model with Radial Basis Function (RBF) Kernel with the inclusion of a Moving Median filter could be used to achieve the high levels...
Goldstein, Benjamin A; Navar, Ann Marie; Carter, Rickey E
2017-06-14
Risk prediction plays an important role in clinical cardiology research. Traditionally, most risk models have been based on regression models. While useful and robust, these statistical methods are limited to using a small number of predictors which operate in the same way on everyone, and uniformly throughout their range. The purpose of this review is to illustrate the use of machine-learning methods for development of risk prediction models. Typically presented as black box approaches, most machine-learning methods are aimed at solving particular challenges that arise in data analysis that are not well addressed by typical regression approaches. To illustrate these challenges, as well as how different methods can address them, we consider trying to predicting mortality after diagnosis of acute myocardial infarction. We use data derived from our institution's electronic health record and abstract data on 13 regularly measured laboratory markers. We walk through different challenges that arise in modelling these data and then introduce different machine-learning approaches. Finally, we discuss general issues in the application of machine-learning methods including tuning parameters, loss functions, variable importance, and missing data. Overall, this review serves as an introduction for those working on risk modelling to approach the diffuse field of machine learning. © The Author 2016. Published by Oxford University Press on behalf of the European Society of Cardiology.
Pair- ${v}$ -SVR: A Novel and Efficient Pairing nu-Support Vector Regression Algorithm.
Hao, Pei-Yi
This paper proposes a novel and efficient pairing nu-support vector regression (pair--SVR) algorithm that combines successfully the superior advantages of twin support vector regression (TSVR) and classical -SVR algorithms. In spirit of TSVR, the proposed pair--SVR solves two quadratic programming problems (QPPs) of smaller size rather than a single larger QPP, and thus has faster learning speed than classical -SVR. The significant advantage of our pair--SVR over TSVR is the improvement in the prediction speed and generalization ability by introducing the concepts of the insensitive zone and the regularization term that embodies the essence of statistical learning theory. Moreover, pair--SVR has additional advantage of using parameter for controlling the bounds on fractions of SVs and errors. Furthermore, the upper bound and lower bound functions of the regression model estimated by pair--SVR capture well the characteristics of data distributions, thus facilitating automatic estimation of the conditional mean and predictive variance simultaneously. This may be useful in many cases, especially when the noise is heteroscedastic and depends strongly on the input values. The experimental results validate the superiority of our pair--SVR in both training/prediction speed and generalization ability.This paper proposes a novel and efficient pairing nu-support vector regression (pair--SVR) algorithm that combines successfully the superior advantages of twin support vector regression (TSVR) and classical -SVR algorithms. In spirit of TSVR, the proposed pair--SVR solves two quadratic programming problems (QPPs) of smaller size rather than a single larger QPP, and thus has faster learning speed than classical -SVR. The significant advantage of our pair--SVR over TSVR is the improvement in the prediction speed and generalization ability by introducing the concepts of the insensitive zone and the regularization term that embodies the essence of statistical learning theory
Directory of Open Access Journals (Sweden)
Shokri Saeid
2015-01-01
Full Text Available An accurate prediction of sulfur content is very important for the proper operation and product quality control in hydrodesulfurization (HDS process. For this purpose, a reliable data- driven soft sensors utilizing Support Vector Regression (SVR was developed and the effects of integrating Vector Quantization (VQ with Principle Component Analysis (PCA were studied on the assessment of this soft sensor. First, in pre-processing step the PCA and VQ techniques were used to reduce dimensions of the original input datasets. Then, the compressed datasets were used as input variables for the SVR model. Experimental data from the HDS setup were employed to validate the proposed integrated model. The integration of VQ/PCA techniques with SVR model was able to increase the prediction accuracy of SVR. The obtained results show that integrated technique (VQ-SVR was better than (PCA-SVR in prediction accuracy. Also, VQ decreased the sum of the training and test time of SVR model in comparison with PCA. For further evaluation, the performance of VQ-SVR model was also compared to that of SVR. The obtained results indicated that VQ-SVR model delivered the best satisfactory predicting performance (AARE= 0.0668 and R2= 0.995 in comparison with investigated models.
A Wavelet Support Vector Machine Combination Model for Singapore Tourist Arrival to Malaysia
Rafidah, A.; Shabri, Ani; Nurulhuda, A.; Suhaila, Y.
2017-08-01
In this study, wavelet support vector machine model (WSVM) is proposed and applied for monthly data Singapore tourist time series prediction. The WSVM model is combination between wavelet analysis and support vector machine (SVM). In this study, we have two parts, first part we compare between the kernel function and second part we compare between the developed models with single model, SVM. The result showed that kernel function linear better than RBF while WSVM outperform with single model SVM to forecast monthly Singapore tourist arrival to Malaysia.
Novel Automatic Filter-Class Feature Selection for Machine Learning Regression
DEFF Research Database (Denmark)
Wollsen, Morten Gill; Hallam, John; Jørgensen, Bo Nørregaard
2017-01-01
With the increased focus on application of Big Data in all sectors of society, the performance of machine learning becomes essential. Efficient machine learning depends on efficient feature selection algorithms. Filter feature selection algorithms are model-free and therefore very fast, but require...... 4 other common automatic feature selection algorithms: Backward selection, forward selection, NLPCA and PCA as well as using no algorithms at all. The benchmarking will be performed through two experiments with two different data sets that are both time-series regression-based problems...... selection algorithms that can be used for many different applications....
Short-Term Prediction of Air Pollution in Macau Using Support Vector Machines
Chi-Man Vong; Weng-Fai Ip; Pak-kin Wong; Jing-yi Yang
2012-01-01
Forecasting of air pollution is a popular and important topic in recent years due to the health impact caused by air pollution. It is necessary to build an early warning system, which provides forecast and also alerts health alarm to local inhabitants by medical practitioners and the local government. Meteorological and pollutions data collected daily at monitoring stations of Macau can be used in this study to build a forecasting system. Support vector machines (SVMs), a novel type of machin...
A Novel Empirical Mode Decomposition With Support Vector Regression for Wind Speed Forecasting.
Ren, Ye; Suganthan, Ponnuthurai Nagaratnam; Srikanth, Narasimalu
2016-08-01
Wind energy is a clean and an abundant renewable energy source. Accurate wind speed forecasting is essential for power dispatch planning, unit commitment decision, maintenance scheduling, and regulation. However, wind is intermittent and wind speed is difficult to predict. This brief proposes a novel wind speed forecasting method by integrating empirical mode decomposition (EMD) and support vector regression (SVR) methods. The EMD is used to decompose the wind speed time series into several intrinsic mode functions (IMFs) and a residue. Subsequently, a vector combining one historical data from each IMF and the residue is generated to train the SVR. The proposed EMD-SVR model is evaluated with a wind speed data set. The proposed EMD-SVR model outperforms several recently reported methods with respect to accuracy or computational complexity.
International Nuclear Information System (INIS)
Wang, Lu; Su, Steven W; Celler, Branko G; Chan, Gregory S H; Cheng, Teddy M; Savkin, Andrey V
2009-01-01
This study aims to quantitatively describe the steady-state relationships among percentage changes in key central cardiovascular variables (i.e. stroke volume, heart rate (HR), total peripheral resistance and cardiac output), measured using non-invasive means, in response to moderate exercise, and the oxygen uptake rate, using a new nonlinear regression approach—support vector regression. Ten untrained normal males exercised in an upright position on an electronically braked cycle ergometer with constant workloads ranging from 25 W to 125 W. Throughout the experiment, .VO 2 was determined breath by breath and the HR was monitored beat by beat. During the last minute of each exercise session, the cardiac output was measured beat by beat using a novel non-invasive ultrasound-based device and blood pressure was measured using a tonometric measurement device. Based on the analysis of experimental data, nonlinear steady-state relationships between key central cardiovascular variables and .VO 2 were qualitatively observed except for the HR which increased linearly as a function of increasing .VO 2 . Quantitative descriptions of these complex nonlinear behaviour were provided by nonparametric models which were obtained by using support vector regression
Extreme Learning Machine With Subnetwork Hidden Nodes for Regression and Classification.
Yang, Yimin; Wu, Q M Jonathan
2016-12-01
As demonstrated earlier, the learning effectiveness and learning speed of single-hidden-layer feedforward neural networks are in general far slower than required, which has been a major bottleneck for many applications. Huang et al. proposed extreme learning machine (ELM) which improves the training speed by hundreds of times as compared to its predecessor learning techniques. This paper offers an ELM-based learning method that can grow subnetwork hidden nodes by pulling back residual network error to the hidden layer. Furthermore, the proposed method provides a similar or better generalization performance with remarkably fewer hidden nodes as compared to other ELM methods employing huge number of hidden nodes. Thus, the learning speed of the proposed technique is hundred times faster compared to other ELMs as well as to back propagation and support vector machines. The experimental validations for all methods are carried out on 32 data sets.
Multifrequency spiral vector model for the brushless doubly-fed induction machine
DEFF Research Database (Denmark)
Han, Peng; Cheng, Ming; Zhu, Xinkai
2017-01-01
This paper presents a multifrequency spiral vector model for both steady-state and dynamic performance analysis of the brushless doubly-fed induction machine (BDFIM) with a nested-loop rotor. Winding function theory is first employed to give a full picture of the inductance characteristics...... analytically, revealing the underlying relationship between harmonic components of stator-rotor mutual inductances and the airgap magnetic field distribution. Different from existing vector models, which only model the fundamental components of mutual inductances, the proposed vector model takes...
Margin Error Bounds for Support Vector Machines on Reproducing Kernel Banach Spaces.
Chen, Liangzhi; Zhang, Haizhang
2017-11-01
Support vector machines, which maximize the margin from patterns to the separation hyperplane subject to correct classification, have received remarkable success in machine learning. Margin error bounds based on Hilbert spaces have been introduced in the literature to justify the strategy of maximizing the margin in SVM. Recently, there has been much interest in developing Banach space methods for machine learning. Large margin classification in Banach spaces is a focus of such attempts. In this letter we establish a margin error bound for the SVM on reproducing kernel Banach spaces, thus supplying statistical justification for large-margin classification in Banach spaces.
Directory of Open Access Journals (Sweden)
Cheng-Wen Lee
2017-11-01
Full Text Available Accurate electricity forecasting is still the critical issue in many energy management fields. The applications of hybrid novel algorithms with support vector regression (SVR models to overcome the premature convergence problem and improve forecasting accuracy levels also deserve to be widely explored. This paper applies chaotic function and quantum computing concepts to address the embedded drawbacks including crossover and mutation operations of genetic algorithms. Then, this paper proposes a novel electricity load forecasting model by hybridizing chaotic function and quantum computing with GA in an SVR model (named SVRCQGA to achieve more satisfactory forecasting accuracy levels. Experimental examples demonstrate that the proposed SVRCQGA model is superior to other competitive models.
Improving Non-Destructive Concrete Strength Tests Using Support Vector Machines
Directory of Open Access Journals (Sweden)
Yi-Fan Shih
2015-10-01
Full Text Available Non-destructive testing (NDT methods are important alternatives when destructive tests are not feasible to examine the in situ concrete properties without damaging the structure. The rebound hammer test and the ultrasonic pulse velocity test are two popular NDT methods to examine the properties of concrete. The rebound of the hammer depends on the hardness of the test specimen and ultrasonic pulse travelling speed is related to density, uniformity, and homogeneity of the specimen. Both of these two methods have been adopted to estimate the concrete compressive strength. Statistical analysis has been implemented to establish the relationship between hammer rebound values/ultrasonic pulse velocities and concrete compressive strength. However, the estimated results can be unreliable. As a result, this research proposes an Artificial Intelligence model using support vector machines (SVMs for the estimation. Data from 95 cylinder concrete samples are collected to develop and validate the model. The results show that combined NDT methods (also known as SonReb method yield better estimations than single NDT methods. The results also show that the SVM model is more accurate than the statistical regression model.
A novel support vector machine-based approach for rare variant detection.
Directory of Open Access Journals (Sweden)
Yao-Hwei Fang
Full Text Available Advances in next-generation sequencing technologies have enabled the identification of multiple rare single nucleotide polymorphisms involved in diseases or traits. Several strategies for identifying rare variants that contribute to disease susceptibility have recently been proposed. An important feature of many of these statistical methods is the pooling or collapsing of multiple rare single nucleotide variants to achieve a reasonably high frequency and effect. However, if the pooled rare variants are associated with the trait in different directions, then the pooling may weaken the signal, thereby reducing its statistical power. In the present paper, we propose a backward support vector machine (BSVM-based variant selection procedure to identify informative disease-associated rare variants. In the selection procedure, the rare variants are weighted and collapsed according to their positive or negative associations with the disease, which may be associated with common variants and rare variants with protective, deleterious, or neutral effects. This nonparametric variant selection procedure is able to account for confounding factors and can also be adopted in other regression frameworks. The results of a simulation study and a data example show that the proposed BSVM approach is more powerful than four other approaches under the considered scenarios, while maintaining valid type I errors.
Using Support Vector Machine to Forecast Energy Usage of a Manhattan Skyscraper
Winter, R.; Boulanger, A.; Anderson, R.; Wu, L.
2011-12-01
As our society gains a better understanding of how humans have negatively impacted the environment, research related to reducing carbon emissions and overall energy consumption has become increasingly important. One of the simplest ways to reduce energy usage is by making current buildings less wasteful. By improving energy efficiency, this method of lowering our carbon footprint is particularly worthwhile because it actually reduces energy costs of operating the building, unlike many environmental initiatives that require large monetary investments. In order to improve the efficiency of the heating and air conditioning (HVAC) system of a Manhattan skyscraper, 345 Park Avenue, a predictive computer model was designed to forecast the amount of energy the building will consume. This model uses support vector machine (SVM), a method that builds a regression based purely on historical data of the building, requiring no knowledge of its size, heating and cooling methods, or any other physical properties. This pure dependence on historical data makes the model very easily applicable to different types of buildings with few model adjustments. The SVM model was built to predict a week of future energy usage based on past energy, temperature, and dew point temperature data. The predictive model closely approximated the actual values of energy usage for the spring and less closely for the winter. The prediction may be improved with additional historical data to help the model account for seasonal variability. This model is useful for creating a close approximation of future energy usage and predicting ways to diminish waste.
Dynamic temperature modeling of an SOFC using least squares support vector machines
Energy Technology Data Exchange (ETDEWEB)
Kang, Ying-Wei; Li, Jun; Cao, Guang-Yi; Tu, Heng-Yong [Institute of Fuel Cell, Shanghai Jiao Tong University, Shanghai 200240 (China); Li, Jian; Yang, Jie [School of Materials Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074 (China)
2008-05-01
Cell temperature control plays a crucial role in SOFC operation. In order to design effective temperature control strategies by model-based control methods, a dynamic temperature model of an SOFC is presented in this paper using least squares support vector machines (LS-SVMs). The nonlinear temperature dynamics of the SOFC is represented by a nonlinear autoregressive with exogenous inputs (NARXs) model that is implemented using an LS-SVM regression model. Issues concerning the development of the LS-SVM temperature model are discussed in detail, including variable selection, training set construction and tuning of the LS-SVM parameters (usually referred to as hyperparameters). Comprehensive validation tests demonstrate that the developed LS-SVM model is sufficiently accurate to be used independently from the SOFC process, emulating its temperature response from the only process input information over a relatively wide operating range. The powerful ability of the LS-SVM temperature model benefits from the approaches of constructing the training set and tuning hyperparameters automatically by the genetic algorithm (GA), besides the modeling method itself. The proposed LS-SVM temperature model can be conveniently employed to design temperature control strategies of the SOFC. (author)
Directory of Open Access Journals (Sweden)
Antonio Blanco-Oliver
2014-10-01
Full Text Available Despite the leading role that micro-entrepreneurship plays in economic development, and the high failure rate of microenterprise start-ups in their early years, very few studies have designed financial distress models to detect the financial problems of micro-entrepreneurs. Moreover, due to a lack of research, nothing is known about whether non-financial information and nonparametric statistical techniques improve the predictive capacity of these models. Therefore, this paper provides an innovative financial distress model specifically designed for microenterprise startups via support vector machines (SVMs that employs financial, non-financial, and macroeconomic variables. Based on a sample of almost 5,500 micro- entrepreneurs from a Peruvian Microfinance Institution (MFI, our findings show that the introduction of non-financial information related to the zone in which the entrepreneurs live and situate their business, the duration of the MFI-entrepreneur relationship, the number of loans granted by the MFI in the last year, the loan destination, and the opinion of experts on the probability that microenterprise start-ups may experience financial problems, significantly increases the accuracy performance of our financial distress model. Furthermore, the results reveal that the models that use SVMs outperform those which employ traditional logistic regression (LR analysis.
Predicting beta-turns in proteins using support vector machines with fractional polynomials
2013-01-01
Background β-turns are secondary structure type that have essential role in molecular recognition, protein folding, and stability. They are found to be the most common type of non-repetitive structures since 25% of amino acids in protein structures are situated on them. Their prediction is considered to be one of the crucial problems in bioinformatics and molecular biology, which can provide valuable insights and inputs for the fold recognition and drug design. Results We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call (H-SVM-LR) to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets respectively. These values are the highest among other β-turns prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthew's correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when considering shape strings as additional features. Conclusions In this paper, we present a comprehensive approach for β-turns prediction. Experiments show that our proposed approach achieves better performance compared to other competing prediction methods. PMID:24565438
Directory of Open Access Journals (Sweden)
Naradasu Kumar Ravi
2013-01-01
Full Text Available Diesel engine designers are constantly on the look-out for performance enhancement through efficient control of operating parameters. In this paper, the concept of an intelligent engine control system is proposed that seeks to ensure optimized performance under varying operating conditions. The concept is based on arriving at the optimum engine operating parameters to ensure the desired output in terms of efficiency. In addition, a Support Vector Machines based prediction model has been developed to predict the engine performance under varying operating conditions. Experiments were carried out at varying loads, compression ratios and amounts of exhaust gas recirculation using a variable compression ratio diesel engine for data acquisition. It was observed that the SVM model was able to predict the engine performance accurately.
Model Predictive Engine Air-Ratio Control Using Online Sequential Relevance Vector Machine
Directory of Open Access Journals (Sweden)
Hang-cheong Wong
2012-01-01
Full Text Available Engine power, brake-specific fuel consumption, and emissions relate closely to air ratio (i.e., lambda among all the engine variables. An accurate and adaptive model for lambda prediction is essential to effective lambda control for long term. This paper utilizes an emerging technique, relevance vector machine (RVM, to build a reliable time-dependent lambda model which can be continually updated whenever a sample is added to, or removed from, the estimated lambda model. The paper also presents a new model predictive control (MPC algorithm for air-ratio regulation based on RVM. This study shows that the accuracy, training, and updating time of the RVM model are superior to the latest modelling methods, such as diagonal recurrent neural network (DRNN and decremental least-squares support vector machine (DLSSVM. Moreover, the control algorithm has been implemented on a real car to test. Experimental results reveal that the control performance of the proposed relevance vector machine model predictive controller (RVMMPC is also superior to DRNNMPC, support vector machine-based MPC, and conventional proportional-integral (PI controller in production cars. Therefore, the proposed RVMMPC is a promising scheme to replace conventional PI controller for engine air-ratio control.
Fuzzy-based multi-kernel spherical support vector machine for ...
Indian Academy of Sciences (India)
In the proposed classifier, we design a new multi-kernel function based on the fuzzy triangular membership function. Finally, a newly developed multi-kernel function is incorporated into the spherical support vector machine to enhance the performance significantly. The experimental results are evaluated and performance is ...
Fuzzy-based multi-kernel spherical support vector machine for ...
Indian Academy of Sciences (India)
A K Sampath
2017-08-08
Aug 8, 2017 ... performance is compared with existing systems based on the percentage of training data samples. Thus, the outcome of our proposed ... be correlated with the training data samples since the character is varied in style and size; ...... port vector machine (SVM) for English handwritten char- acter recognition.
LENUS (Irish Health Repository)
Mourao-Miranda, J
2012-05-01
To date, magnetic resonance imaging (MRI) has made little impact on the diagnosis and monitoring of psychoses in individual patients. In this study, we used a support vector machine (SVM) whole-brain classification approach to predict future illness course at the individual level from MRI data obtained at the first psychotic episode.
This paper presents a novel wrinkle evaluation method that uses modified wavelet coefficients and an optimized support-vector-machine (SVM) classification scheme to characterize and classify wrinkle appearance of fabric. Fabric images were decomposed with the wavelet transform (WT), and five parame...
Parallelization of multicategory support vector machines (PMC-SVM for classifying microarray data
Directory of Open Access Journals (Sweden)
Deng Youping
2006-12-01
Full Text Available Abstract Background Multicategory Support Vector Machines (MC-SVM are powerful classification systems with excellent performance in a variety of data classification problems. Since the process of generating models in traditional multicategory support vector machines for large datasets is very computationally intensive, there is a need to improve the performance using high performance computing techniques. Results In this paper, Parallel Multicategory Support Vector Machines (PMC-SVM have been developed based on the sequential minimum optimization-type decomposition method for support vector machines (SMO-SVM. It was implemented in parallel using MPI and C++ libraries and executed on both shared memory supercomputer and Linux cluster for multicategory classification of microarray data. PMC-SVM has been analyzed and evaluated using four microarray datasets with multiple diagnostic categories, such as different cancer types and normal tissue types. Conclusion The experiments show that the PMC-SVM can significantly improve the performance of classification of microarray data without loss of accuracy, compared with previous work.
A Support Vector Machine Approach to Dutch Part-of-Speech Tagging
Poel, Mannes; Stegeman, L.; op den Akker, Hendrikus J.A.; Berthold, M.R.; Shawe-Taylor, J.; Lavrac, N.
Part-of-Speech tagging, the assignment of Parts-of-Speech to the words in a given context of use, is a basic technique in many systems that handle natural languages. This paper describes a method for supervised training of a Part-of-Speech tagger using a committee of Support Vector Machines on a
Linear and support vector regressions based on geometrical correlation of data
Directory of Open Access Journals (Sweden)
Kaijun Wang
2007-10-01
Full Text Available Linear regression (LR and support vector regression (SVR are widely used in data analysis. Geometrical correlation learning (GcLearn was proposed recently to improve the predictive ability of LR and SVR through mining and using correlations between data of a variable (inner correlation. This paper theoretically analyzes prediction performance of the GcLearn method and proves that GcLearn LR and SVR will have better prediction performance than traditional LR and SVR for prediction tasks when good inner correlations are obtained and predictions by traditional LR and SVR are far away from their neighbor training data under inner correlation. This gives the applicable condition of GcLearn method.
An implementation of support vector machine on sentiment classification of movie reviews
Yulietha, I. M.; Faraby, S. A.; Adiwijaya; Widyaningtyas, W. C.
2018-03-01
With technological advances, all information about movie is available on the internet. If the information is processed properly, it will get the quality of the information. This research proposes to the classify sentiments on movie review documents. This research uses Support Vector Machine (SVM) method because it can classify high dimensional data in accordance with the data used in this research in the form of text. Support Vector Machine is a popular machine learning technique for text classification because it can classify by learning from a collection of documents that have been classified previously and can provide good result. Based on number of datasets, the 90-10 composition has the best result that is 85.6%. Based on SVM kernel, kernel linear with constant 1 has the best result that is 84.9%
Chiogna, Gabriele; Marcolini, Giorgia; Liu, Wanying; Pérez Ciria, Teresa; Tuo, Ye
2018-03-21
Water management in the alpine region has an important impact on streamflow. In particular, hydropower production is known to cause hydropeaking i.e., sudden fluctuations in river stage caused by the release or storage of water in artificial reservoirs. Modeling hydropeaking with hydrological models, such as the Soil Water Assessment Tool (SWAT), requires knowledge of reservoir management rules. These data are often not available since they are sensitive information belonging to hydropower production companies. In this short communication, we propose to couple the results of a calibrated hydrological model with a machine learning method to reproduce hydropeaking without requiring the knowledge of the actual reservoir management operation. We trained a support vector machine (SVM) with SWAT model outputs, the day of the week and the energy price. We tested the model for the Upper Adige river basin in North-East Italy. A wavelet analysis showed that energy price has a significant influence on river discharge, and a wavelet coherence analysis demonstrated the improved performance of the SVM model in comparison to the SWAT model alone. The SVM model was also able to capture the fluctuations in streamflow caused by hydropeaking when both energy price and river discharge displayed a complex temporal dynamic. Copyright © 2018 Elsevier B.V. All rights reserved.
International Nuclear Information System (INIS)
Chen Qiang; Ren Xuemei; Na Jing
2011-01-01
Highlights: Model uncertainty of the system is approximated by multiple-kernel LSSVM. Approximation errors and disturbances are compensated in the controller design. Asymptotical anti-synchronization is achieved with model uncertainty and disturbances. Abstract: In this paper, we propose a robust anti-synchronization scheme based on multiple-kernel least squares support vector machine (MK-LSSVM) modeling for two uncertain chaotic systems. The multiple-kernel regression, which is a linear combination of basic kernels, is designed to approximate system uncertainties by constructing a multiple-kernel Lagrangian function and computing the corresponding regression parameters. Then, a robust feedback control based on MK-LSSVM modeling is presented and an improved update law is employed to estimate the unknown bound of the approximation error. The proposed control scheme can guarantee the asymptotic convergence of the anti-synchronization errors in the presence of system uncertainties and external disturbances. Numerical examples are provided to show the effectiveness of the proposed method.
Directory of Open Access Journals (Sweden)
Changhao Fan
2017-01-01
Full Text Available In modeling, only information from the deviation between the output of the support vector regression (SVR model and the training sample is considered, whereas the other prior information of the training sample, such as probability distribution information, is ignored. Probabilistic distribution information describes the overall distribution of sample data in a training sample that contains different degrees of noise and potential outliers, as well as helping develop a high-accuracy model. To mine and use the probability distribution information of a training sample, a new support vector regression model that incorporates probability distribution information weight SVR (PDISVR is proposed. In the PDISVR model, the probability distribution of each sample is considered as the weight and is then introduced into the error coefficient and slack variables of SVR. Thus, the deviation and probability distribution information of the training sample are both used in the PDISVR model to eliminate the influence of noise and outliers in the training sample and to improve predictive performance. Furthermore, examples with different degrees of noise were employed to demonstrate the performance of PDISVR, which was then compared with those of three SVR-based methods. The results showed that PDISVR performs better than the three other methods.
Height and Weight Estimation From Anthropometric Measurements Using Machine Learning Regressions.
Rativa, Diego; Fernandes, Bruno J T; Roque, Alexandre
2018-01-01
Height and weight are measurements explored to tracking nutritional diseases, energy expenditure, clinical conditions, drug dosages, and infusion rates. Many patients are not ambulant or may be unable to communicate, and a sequence of these factors may not allow accurate estimation or measurements; in those cases, it can be estimated approximately by anthropometric means. Different groups have proposed different linear or non-linear equations which coefficients are obtained by using single or multiple linear regressions. In this paper, we present a complete study of the application of different learning models to estimate height and weight from anthropometric measurements: support vector regression, Gaussian process, and artificial neural networks. The predicted values are significantly more accurate than that obtained with conventional linear regressions. In all the cases, the predictions are non-sensitive to ethnicity, and to gender, if more than two anthropometric parameters are analyzed. The learning model analysis creates new opportunities for anthropometric applications in industry, textile technology, security, and health care.
Churpek, Matthew M; Yuen, Trevor C; Winslow, Christopher; Meltzer, David O; Kattan, Michael W; Edelson, Dana P
2016-02-01
Machine learning methods are flexible prediction algorithms that may be more accurate than conventional regression. We compared the accuracy of different techniques for detecting clinical deterioration on the wards in a large, multicenter database. Observational cohort study. Five hospitals, from November 2008 until January 2013. Hospitalized ward patients None Demographic variables, laboratory values, and vital signs were utilized in a discrete-time survival analysis framework to predict the combined outcome of cardiac arrest, intensive care unit transfer, or death. Two logistic regression models (one using linear predictor terms and a second utilizing restricted cubic splines) were compared to several different machine learning methods. The models were derived in the first 60% of the data by date and then validated in the next 40%. For model derivation, each event time window was matched to a non-event window. All models were compared to each other and to the Modified Early Warning score, a commonly cited early warning score, using the area under the receiver operating characteristic curve (AUC). A total of 269,999 patients were admitted, and 424 cardiac arrests, 13,188 intensive care unit transfers, and 2,840 deaths occurred in the study. In the validation dataset, the random forest model was the most accurate model (AUC, 0.80 [95% CI, 0.80-0.80]). The logistic regression model with spline predictors was more accurate than the model utilizing linear predictors (AUC, 0.77 vs 0.74; p machine learning methods more accurately predicted clinical deterioration than logistic regression. Use of detection algorithms derived from these techniques may result in improved identification of critically ill patients on the wards.
Extreme Learning Machine and Moving Least Square Regression Based Solar Panel Vision Inspection
Directory of Open Access Journals (Sweden)
Heng Liu
2017-01-01
Full Text Available In recent years, learning based machine intelligence has aroused a lot of attention across science and engineering. Particularly in the field of automatic industry inspection, the machine learning based vision inspection plays a more and more important role in defect identification and feature extraction. Through learning from image samples, many features of industry objects, such as shapes, positions, and orientations angles, can be obtained and then can be well utilized to determine whether there is defect or not. However, the robustness and the quickness are not easily achieved in such inspection way. In this work, for solar panel vision inspection, we present an extreme learning machine (ELM and moving least square regression based approach to identify solder joint defect and detect the panel position. Firstly, histogram peaks distribution (HPD and fractional calculus are applied for image preprocessing. Then an ELM-based defective solder joints identification is discussed in detail. Finally, moving least square regression (MLSR algorithm is introduced for solar panel position determination. Experimental results and comparisons show that the proposed ELM and MLSR based inspection method is efficient not only in detection accuracy but also in processing speed.
A support vector machine approach for classification of welding defects from ultrasonic signals
Chen, Yuan; Ma, Hong-Wei; Zhang, Guang-Ming
2014-07-01
Defect classification is an important issue in ultrasonic non-destructive evaluation. A layered multi-class support vector machine (LMSVM) classification system, which combines multiple SVM classifiers through a layered architecture, is proposed in this paper. The proposed LMSVM classification system is applied to the classification of welding defects from ultrasonic test signals. The measured ultrasonic defect echo signals are first decomposed into wavelet coefficients by the wavelet packet transform. The energy of the wavelet coefficients at different frequency channels are used to construct the feature vectors. The bees algorithm (BA) is then used for feature selection and SVM parameter optimisation for the LMSVM classification system. The BA-based feature selection optimises the energy feature vectors. The optimised feature vectors are input to the LMSVM classification system for training and testing. Experimental results of classifying welding defects demonstrate that the proposed technique is highly robust, precise and reliable for ultrasonic defect classification.
Wang, Hongjin; Hsieh, Sheng-Jen; Peng, Bo; Zhou, Xunfei
2016-07-01
A method without requirements on knowledge about thermal properties of coatings or those of substrates will be interested in the industrial application. Supervised machine learning regressions may provide possible solution to the problem. This paper compares the performances of two regression models (artificial neural networks (ANN) and support vector machines for regression (SVM)) with respect to coating thickness estimations made based on surface temperature increments collected via time resolved thermography. We describe SVM roles in coating thickness prediction. Non-dimensional analyses are conducted to illustrate the effects of coating thicknesses and various factors on surface temperature increments. It's theoretically possible to correlate coating thickness with surface increment. Based on the analyses, the laser power is selected in such a way: during the heating, the temperature increment is high enough to determine the coating thickness variance but low enough to avoid surface melting. Sixty-one pain-coated samples with coating thicknesses varying from 63.5 μm to 571 μm are used to train models. Hyper-parameters of the models are optimized by 10-folder cross validation. Another 28 sets of data are then collected to test the performance of the three methods. The study shows that SVM can provide reliable predictions of unknown data, due to its deterministic characteristics, and it works well when used for a small input data group. The SVM model generates more accurate coating thickness estimates than the ANN model.
Kumar, Deepak; Thakur, Manoj; Dubey, Chandra S.; Shukla, Dericks P.
2017-10-01
In recent years, various machine learning techniques have been applied for landslide susceptibility mapping. In this study, three different variants of support vector machine viz., SVM, Proximal Support Vector Machine (PSVM) and L2-Support Vector Machine - Modified Finite Newton (L2-SVM-MFN) have been applied on the Mandakini River Basin in Uttarakhand, India to carry out the landslide susceptibility mapping. Eight thematic layers such as elevation, slope, aspect, drainages, geology/lithology, buffer of thrusts/faults, buffer of streams and soil along with the past landslide data were mapped in GIS environment and used for landslide susceptibility mapping in MATLAB. The study area covering 1625 km2 has merely 0.11% of area under landslides. There are 2009 pixels for past landslides out of which 50% (1000) landslides were considered as training set while remaining 50% as testing set. The performance of these techniques has been evaluated and the computational results show that L2-SVM-MFN obtains higher prediction values (0.829) of receiver operating characteristic curve (AUC-area under the curve) as compared to 0.807 for PSVM model and 0.79 for SVM. The results obtained from L2-SVM-MFN model are found to be superior than other SVM prediction models and suggest the usefulness of this technique to problem of landslide susceptibility mapping where training data is very less. However, these techniques can be used for satisfactory determination of susceptible zones with these inputs.
Lumbar Ultrasound Image Feature Extraction and Classification with Support Vector Machine.
Yu, Shuang; Tan, Kok Kiong; Sng, Ban Leong; Li, Shengjin; Sia, Alex Tiong Heng
2015-10-01
Needle entry site localization remains a challenge for procedures that involve lumbar puncture, for example, epidural anesthesia. To solve the problem, we have developed an image classification algorithm that can automatically identify the bone/interspinous region for ultrasound images obtained from lumbar spine of pregnant patients in the transverse plane. The proposed algorithm consists of feature extraction, feature selection and machine learning procedures. A set of features, including matching values, positions and the appearance of black pixels within pre-defined windows along the midline, were extracted from the ultrasound images using template matching and midline detection methods. A support vector machine was then used to classify the bone images and interspinous images. The support vector machine model was trained with 1,040 images from 26 pregnant subjects and tested on 800 images from a separate set of 20 pregnant patients. A success rate of 95.0% on training set and 93.2% on test set was achieved with the proposed method. The trained support vector machine model was further tested on 46 off-line collected videos, and successfully identified the proper needle insertion site (interspinous region) in 45 of the cases. Therefore, the proposed method is able to process the ultrasound images of lumbar spine in an automatic manner, so as to facilitate the anesthetists' work of identifying the needle entry site. Copyright © 2015 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
Favaro L.
2011-07-01
Full Text Available The white-clawed crayfish’s habitat has been profoundly modified in Piedmont (NW Italy due to environmental changes caused by human impact. Consequently, native populations have decreased markedly. In this research project, support vector machines were tested as possible tools for evaluating the ecological factors that determine the presence of white-clawed crayfish. A system of 175 sites was investigated, 98 of which recorded the presence of Austropotamobius pallipes. At each site 27 physical-chemical, environmental and climatic variables were measured according to their importance to A. pallipes. Various feature selection methods were employed. These yielded three subsets of variables that helped build three different types of models: (1 models with no variable selection; (2 models built by applying Goldberg’s genetic algorithm after variable selection; (3 models built by using a combination of four supervised-filter evaluators after variable selection. These different model types helped us realise how important it was to select the right features if we wanted to build support vector machines that perform as well as possible. In addition, support vector machines have a high potential for predicting indigenous crayfish occurrence, according to our findings. Therefore, they are valuable tools for freshwater management, tools that may prove to be much more promising than traditional and other machine-learning techniques.
Directory of Open Access Journals (Sweden)
Gwo-Fong Lin
2016-01-01
Full Text Available This study describes the development of a reservoir inflow forecasting model for typhoon events to improve short lead-time flood forecasting performance. To strengthen the forecasting ability of the original support vector machines (SVMs model, the self-organizing map (SOM is adopted to group inputs into different clusters in advance of the proposed SOM-SVM model. Two different input methods are proposed for the SVM-based forecasting method, namely, SOM-SVM1 and SOM-SVM2. The methods are applied to an actual reservoir watershed to determine the 1 to 3 h ahead inflow forecasts. For 1, 2, and 3 h ahead forecasts, improvements in mean coefficient of efficiency (MCE due to the clusters obtained from SOM-SVM1 are 21.5%, 18.5%, and 23.0%, respectively. Furthermore, improvement in MCE for SOM-SVM2 is 20.9%, 21.2%, and 35.4%, respectively. Another SOM-SVM2 model increases the SOM-SVM1 model for 1, 2, and 3 h ahead forecasts obtained improvement increases of 0.33%, 2.25%, and 10.08%, respectively. These results show that the performance of the proposed model can provide improved forecasts of hourly inflow, especially in the proposed SOM-SVM2 model. In conclusion, the proposed model, which considers limit and higher related inputs instead of all inputs, can generate better forecasts in different clusters than are generated from the SOM process. The SOM-SVM2 model is recommended as an alternative to the original SVR (Support Vector Regression model because of its accuracy and robustness.
Saidi, Lotfi; Ben Ali, Jaouher; Fnaiech, Farhat
2015-01-01
Condition monitoring and fault diagnosis of rolling element bearings timely and accurately are very important to ensure the reliability of rotating machinery. This paper presents a novel pattern classification approach for bearings diagnostics, which combines the higher order spectra analysis features and support vector machine classifier. The use of non-linear features motivated by the higher order spectra has been reported to be a promising approach to analyze the non-linear and non-Gaussian characteristics of the mechanical vibration signals. The vibration bi-spectrum (third order spectrum) patterns are extracted as the feature vectors presenting different bearing faults. The extracted bi-spectrum features are subjected to principal component analysis for dimensionality reduction. These principal components were fed to support vector machine to distinguish four kinds of bearing faults covering different levels of severity for each fault type, which were measured in the experimental test bench running under different working conditions. In order to find the optimal parameters for the multi-class support vector machine model, a grid-search method in combination with 10-fold cross-validation has been used. Based on the correct classification of bearing patterns in the test set, in each fold the performance measures are computed. The average of these performance measures is computed to report the overall performance of the support vector machine classifier. In addition, in fault detection problems, the performance of a detection algorithm usually depends on the trade-off between robustness and sensitivity. The sensitivity and robustness of the proposed method are explored by running a series of experiments. A receiver operating characteristic (ROC) curve made the results more convincing. The results indicated that the proposed method can reliably identify different fault patterns of rolling element bearings based on vibration signals. Copyright © 2014 ISA
Study on Parameter Optimization for Support Vector Regression in Solving the Inverse ECG Problem
Jiang, Mingfeng; Jiang, Shanshan; Zhu, Lingyan; Wang, Yaming; Huang, Wenqing; Zhang, Heng
2013-01-01
The typical inverse ECG problem is to noninvasively reconstruct the transmembrane potentials (TMPs) from body surface potentials (BSPs). In the study, the inverse ECG problem can be treated as a regression problem with multi-inputs (body surface potentials) and multi-outputs (transmembrane potentials), which can be solved by the support vector regression (SVR) method. In order to obtain an effective SVR model with optimal regression accuracy and generalization performance, the hyperparameters of SVR must be set carefully. Three different optimization methods, that is, genetic algorithm (GA), differential evolution (DE) algorithm, and particle swarm optimization (PSO), are proposed to determine optimal hyperparameters of the SVR model. In this paper, we attempt to investigate which one is the most effective way in reconstructing the cardiac TMPs from BSPs, and a full comparison of their performances is also provided. The experimental results show that these three optimization methods are well performed in finding the proper parameters of SVR and can yield good generalization performance in solving the inverse ECG problem. Moreover, compared with DE and GA, PSO algorithm is more efficient in parameters optimization and performs better in solving the inverse ECG problem, leading to a more accurate reconstruction of the TMPs. PMID:23983808
Seasonal prediction of winter extreme precipitation over Canada by support vector regression
Directory of Open Access Journals (Sweden)
Z. Zeng
2011-01-01
Full Text Available For forecasting the maximum 5-day accumulated precipitation over the winter season at lead times of 3, 6, 9 and 12 months over Canada from 1950 to 2007, two nonlinear and two linear regression models were used, where the models were support vector regression (SVR (nonlinear and linear versions, nonlinear Bayesian neural network (BNN and multiple linear regression (MLR. The 118 stations were grouped into six geographic regions by K-means clustering. For each region, the leading principal components of the winter maximum 5-d accumulated precipitation anomalies were the predictands. Potential predictors included quasi-global sea surface temperature anomalies and 500 hPa geopotential height anomalies over the Northern Hemisphere, as well as six climate indices (the Niño-3.4 region sea surface temperature, the North Atlantic Oscillation, the Pacific-North American teleconnection, the Pacific Decadal Oscillation, the Scandinavia pattern, and the East Atlantic pattern. The results showed that in general the two robust SVR models tended to have better forecast skills than the two non-robust models (MLR and BNN, and the nonlinear SVR model tended to forecast slightly better than the linear SVR model. Among the six regions, the Prairies region displayed the highest forecast skills, and the Arctic region the second highest. The strongest nonlinearity was manifested over the Prairies and the weakest nonlinearity over the Arctic.
Directory of Open Access Journals (Sweden)
Jia Uddin
2014-01-01
Full Text Available This paper proposes a method for the reliable fault detection and classification of induction motors using two-dimensional (2D texture features and a multiclass support vector machine (MCSVM. The proposed model first converts time-domain vibration signals to 2D gray images, resulting in texture patterns (or repetitive patterns, and extracts these texture features by generating the dominant neighborhood structure (DNS map. The principal component analysis (PCA is then used for the purpose of dimensionality reduction of the high-dimensional feature vector including the extracted texture features due to the fact that the high-dimensional feature vector can degrade classification performance, and this paper configures an effective feature vector including discriminative fault features for diagnosis. Finally, the proposed approach utilizes the one-against-all (OAA multiclass support vector machines (MCSVMs to identify induction motor failures. In this study, the Gaussian radial basis function kernel cooperates with OAA MCSVMs to deal with nonlinear fault features. Experimental results demonstrate that the proposed approach outperforms three state-of-the-art fault diagnosis algorithms in terms of fault classification accuracy, yielding an average classification accuracy of 100% even in noisy environments.
Bearing Fault Diagnosis Based on Multiscale Permutation Entropy and Support Vector Machine
Directory of Open Access Journals (Sweden)
Jian-Jiun Ding
2012-07-01
Full Text Available Bearing fault diagnosis has attracted significant attention over the past few decades. It consists of two major parts: vibration signal feature extraction and condition classification for the extracted features. In this paper, multiscale permutation entropy (MPE was introduced for feature extraction from faulty bearing vibration signals. After extracting feature vectors by MPE, the support vector machine (SVM was applied to automate the fault diagnosis procedure. Simulation results demonstrated that the proposed method is a very powerful algorithm for bearing fault diagnosis and has much better performance than the methods based on single scale permutation entropy (PE and multiscale entropy (MSE.
Single Image Super-Resolution by Non-Linear Sparse Representation and Support Vector Regression
Directory of Open Access Journals (Sweden)
Yungang Zhang
2017-02-01
Full Text Available Sparse representations are widely used tools in image super-resolution (SR tasks. In the sparsity-based SR methods, linear sparse representations are often used for image description. However, the non-linear data distributions in images might not be well represented by linear sparse models. Moreover, many sparsity-based SR methods require the image patch self-similarity assumption; however, the assumption may not always hold. In this paper, we propose a novel method for single image super-resolution (SISR. Unlike most prior sparsity-based SR methods, the proposed method uses non-linear sparse representation to enhance the description of the non-linear information in images, and the proposed framework does not need to assume the self-similarity of image patches. Based on the minimum reconstruction errors, support vector regression (SVR is applied for predicting the SR image. The proposed method was evaluated on various benchmark images, and promising results were obtained.
International Nuclear Information System (INIS)
Bae, In Ho; Naa, Man Gyun; Lee, Yoon Joon; Park, Goon Cherl
2009-01-01
The monitoring of detailed 3-dimensional (3D) reactor core power distribution is a prerequisite in the operation of nuclear power reactors to ensure that various safety limits imposed on the LPD and DNBR, are not violated during nuclear power reactor operation. The LPD and DNBR should be calculated in order to perform the two major functions of the core protection calculator system (CPCS) and the core operation limit supervisory system (COLSS). The LPD at the hottest part of a hot fuel rod, which is related to the power peaking factor (PPF, F q ), is more important than the LPD at any other position in a reactor core. The LPD needs to be estimated accurately to prevent nuclear fuel rods from melting. In this study, support vector regression (SVR) and uncertainty analysis have been applied to estimation of reactor core power peaking factor
Applying the Support Vector Machine Method to Matching IRAS and SDSS Catalogues
Directory of Open Access Journals (Sweden)
Chen Cao
2007-10-01
Full Text Available This paper presents results of applying a machine learning technique, the Support Vector Machine (SVM, to the astronomical problem of matching the Infra-Red Astronomical Satellite (IRAS and Sloan Digital Sky Survey (SDSS object catalogues. In this study, the IRAS catalogue has much larger positional uncertainties than those of the SDSS. A model was constructed by applying the supervised learning algorithm (SVM to a set of training data. Validation of the model shows a good identification performance (∼ 90% correct, better than that derived from classical cross-matching algorithms, such as the likelihood-ratio method used in previous studies.
International Nuclear Information System (INIS)
Sahin, M.Oe.; Kruecker, D.; Melzer-Pellmann, I.A.
2016-01-01
In this paper we promote the use of Support Vector Machines (SVM) as a machine learning tool for searches in high-energy physics. As an example for a new-physics search we discuss the popular case of Supersymmetry at the Large Hadron Collider. We demonstrate that the SVM is a valuable tool and show that an automated discovery-significance based optimization of the SVM hyper-parameters is a highly efficient way to prepare an SVM for such applications. A new C++ LIBSVM interface called SVM-HINT is developed and available on Github.
A novel license plate recognition method based on least squares support vector machine
Gui, Lin
2011-12-01
Support vector machine (SVM) is a new type of machine learning method based on statistical learning. It avoids the neural network of some inherent disadvantages, such as the local minimum in the training process, structure, selection, slow convergence speed problem, have very strong nonlinear system identification and generalization ability and small sample. In this paper, we established multi-classifier based on binary tree model. Category rules are based on experience. But this method cans classifier to come to less than optimal results. Our future work focused on how to find a category to get the optimal rules of multi-classifier method, experience risk analysis, cluster analysis will consider further.
Vector control of three-phase AC machines system development in the practice
Quang, Nguyen Phung; Dittrich, J
2015-01-01
This book addresses the vector control of three-phase AC machines, in particular induction motors with squirrel-cage rotors (IM), permanent magnet synchronous motors (PMSM) and doubly-fed induction machines (DFIM), from a practical design and development perspective. The main focus is on the application of IM and PMSM in electrical drive systems, where field-orientated control has been successfully established in practice. It also discusses the use of grid-voltage oriented control of DFIMs in wind power plants. This second, enlarged edition includes new insights into flatness-based nonlinear
Combining Bootstrap Aggregation with Support Vector Regression for Small Blood Pressure Measurement.
Lee, Soojeong; Ahmad, Awais; Jeon, Gwanggil
2018-02-28
Blood pressure measurement based on oscillometry is one of the most popular techniques to check a health condition of individual subjects. This paper proposes a support vector using fusion estimator with a bootstrap technique for oscillometric blood pressure (BP) estimation. However, some inherent problems exist with this approach. First, it is not simple to identify the best support vector regression (SVR) estimator, and worthy information might be omitted when selecting one SVR estimator and discarding others. Additionally, our input feature data, acquired from only five BP measurements per subject, represent a very small sample size. This constitutes a critical limitation when utilizing the SVR technique and can cause overfitting or underfitting, depending on the structure of the algorithm. To overcome these challenges, a fusion with an asymptotic approach (based on combining the bootstrap with the SVR technique) is utilized to generate the pseudo features needed to predict the BP values. This ensemble estimator using the SVR technique can learn to effectively mimic the non-linear relations between the input data acquired from the oscillometry and the nurse's BPs.
Noise reduction by support vector regression with a Ricker wavelet kernel
International Nuclear Information System (INIS)
Deng, Xiaoying; Yang, Dinghui; Xie, Jing
2009-01-01
We propose a noise filtering technology based on the least-squares support vector regression (LS-SVR), to improve the signal-to-noise ratio (SNR) of seismic data. We modified it by using an admissible support vector (SV) kernel, namely the Ricker wavelet kernel, to replace the conventional radial basis function (RBF) kernel in seismic data processing. We investigated the selection of the regularization parameter for the LS-SVR and derived a concise selecting formula directly from the noisy data. We used the proposed method for choosing the regularization parameter which not only had the advantage of high speed but could also obtain almost the same effectiveness as an optimal parameter method. We conducted experiments using synthetic data corrupted by the random noise of different types and levels, and found that our method was superior to the wavelet transform-based approach and the Wiener filtering. We also applied the method to two field seismic data sets and concluded that it was able to effectively suppress the random noise and improve the data quality in terms of SNR
Noise reduction by support vector regression with a Ricker wavelet kernel
Deng, Xiaoying; Yang, Dinghui; Xie, Jing
2009-06-01
We propose a noise filtering technology based on the least-squares support vector regression (LS-SVR), to improve the signal-to-noise ratio (SNR) of seismic data. We modified it by using an admissible support vector (SV) kernel, namely the Ricker wavelet kernel, to replace the conventional radial basis function (RBF) kernel in seismic data processing. We investigated the selection of the regularization parameter for the LS-SVR and derived a concise selecting formula directly from the noisy data. We used the proposed method for choosing the regularization parameter which not only had the advantage of high speed but could also obtain almost the same effectiveness as an optimal parameter method. We conducted experiments using synthetic data corrupted by the random noise of different types and levels, and found that our method was superior to the wavelet transform-based approach and the Wiener filtering. We also applied the method to two field seismic data sets and concluded that it was able to effectively suppress the random noise and improve the data quality in terms of SNR.
A Vector Approach to Regression Analysis and Its Implications to Heavy-Duty Diesel Emissions
Energy Technology Data Exchange (ETDEWEB)
McAdams, H.T.
2001-02-14
An alternative approach is presented for the regression of response data on predictor variables that are not logically or physically separable. The methodology is demonstrated by its application to a data set of heavy-duty diesel emissions. Because of the covariance of fuel properties, it is found advantageous to redefine the predictor variables as vectors, in which the original fuel properties are components, rather than as scalars each involving only a single fuel property. The fuel property vectors are defined in such a way that they are mathematically independent and statistically uncorrelated. Because the available data set does not allow definitive separation of vehicle and fuel effects, and because test fuels used in several of the studies may be unrealistically contrived to break the association of fuel variables, the data set is not considered adequate for development of a full-fledged emission model. Nevertheless, the data clearly show that only a few basic patterns of fuel-property variation affect emissions and that the number of these patterns is considerably less than the number of variables initially thought to be involved. These basic patterns, referred to as ''eigenfuels,'' may reflect blending practice in accordance with their relative weighting in specific circumstances. The methodology is believed to be widely applicable in a variety of contexts. It promises an end to the threat of collinearity and the frustration of attempting, often unrealistically, to separate variables that are inseparable.
Estelles-Lopez, Lucia; Ropodi, Athina; Pavlidis, Dimitris; Fotopoulou, Jenny; Gkousari, Christina; Peyrodie, Audrey; Panagou, Efstathios; Nychas, George-John; Mohareb, Fady
2017-09-01
Over the past decade, analytical approaches based on vibrational spectroscopy, hyperspectral/multispectral imagining and biomimetic sensors started gaining popularity as rapid and efficient methods for assessing food quality, safety and authentication; as a sensible alternative to the expensive and time-consuming conventional microbiological techniques. Due to the multi-dimensional nature of the data generated from such analyses, the output needs to be coupled with a suitable statistical approach or machine-learning algorithms before the results can be interpreted. Choosing the optimum pattern recognition or machine learning approach for a given analytical platform is often challenging and involves a comparative analysis between various algorithms in order to achieve the best possible prediction accuracy. In this work, "MeatReg", a web-based application is presented, able to automate the procedure of identifying the best machine learning method for comparing data from several analytical techniques, to predict the counts of microorganisms responsible of meat spoilage regardless of the packaging system applied. In particularly up to 7 regression methods were applied and these are ordinary least squares regression, stepwise linear regression, partial least square regression, principal component regression, support vector regression, random forest and k-nearest neighbours. MeatReg" was tested with minced beef samples stored under aerobic and modified atmosphere packaging and analysed with electronic nose, HPLC, FT-IR, GC-MS and Multispectral imaging instrument. Population of total viable count, lactic acid bacteria, pseudomonads, Enterobacteriaceae and B. thermosphacta, were predicted. As a result, recommendations of which analytical platforms are suitable to predict each type of bacteria and which machine learning methods to use in each case were obtained. The developed system is accessible via the link: www.sorfml.com. Copyright © 2017 Elsevier Ltd. All rights
Directory of Open Access Journals (Sweden)
N. Sujay Raghavendra
2015-12-01
Full Text Available This research demonstrates the state-of-the-art capability of Wavelet packet analysis in improving the forecasting efficiency of Support vector regression (SVR through the development of a novel hybrid Wavelet packet–Support vector regression (WP–SVR model for forecasting monthly groundwater level fluctuations observed in three shallow unconfined coastal aquifers. The Sequential Minimal Optimization Algorithm-based SVR model is also employed for comparative study with WP–SVR model. The input variables used for modeling were monthly time series of total rainfall, average temperature, mean tide level, and past groundwater level observations recorded during the period 1996–2006 at three observation wells located near Mangalore, India. The Radial Basis function is employed as a kernel function during SVR modeling. Model parameters are calibrated using the first seven years of data, and the remaining three years data are used for model validation using various input combinations. The performance of both the SVR and WP–SVR models is assessed using different statistical indices. From the comparative result analysis of the developed models, it can be seen that WP–SVR model outperforms the classic SVR model in predicting groundwater levels at all the three well locations (e.g. NRMSE(WP–SVR = 7.14, NRMSE(SVR = 12.27; NSE(WP–SVR = 0.91, NSE(SVR = 0.8 during the test phase with respect to well location at Surathkal. Therefore, using the WP–SVR model is highly acceptable for modeling and forecasting of groundwater level fluctuations.
Energy Technology Data Exchange (ETDEWEB)
Fei, Sheng-wei; Wang, Ming-Jun; Miao, Yu-bin; Tu, Jun; Liu, Cheng-liang [School of Mechanical Engineering, Shanghai Jiaotong University, Shanghai 200240 (China)
2009-06-15
Forecasting of dissolved gases content in power transformer oil is a complicated problem due to its nonlinearity and the small quantity of training data. Support vector machine (SVM) has been successfully employed to solve regression problem of nonlinearity and small sample. However, the practicability of SVM is effected due to the difficulty of selecting appropriate SVM parameters. Particle swarm optimization (PSO) is a new optimization method, which is motivated by social behaviour of organisms such as bird flocking and fish schooling. The method not only has strong global search capability, but also is very easy to implement. Thus, the proposed PSO-SVM model is applied to forecast dissolved gases content in power transformer oil in this paper, among which PSO is used to determine free parameters of support vector machine. The experimental data from several electric power companies in China is used to illustrate the performance of proposed PSO-SVM model. The experimental results indicate that the PSO-SVM method can achieve greater forecasting accuracy than grey model, artificial neural network under the circumstances of small sample. (author)
Energy Technology Data Exchange (ETDEWEB)
Fei Shengwei [School of Mechanical Engineering, Shanghai Jiaotong University, Shanghai 200240 (China)], E-mail: feishengwei@sohu.com; Wang Mingjun; Miao Yubin; Tu Jun; Liu Chengliang [School of Mechanical Engineering, Shanghai Jiaotong University, Shanghai 200240 (China)
2009-06-15
Forecasting of dissolved gases content in power transformer oil is a complicated problem due to its nonlinearity and the small quantity of training data. Support vector machine (SVM) has been successfully employed to solve regression problem of nonlinearity and small sample. However, the practicability of SVM is effected due to the difficulty of selecting appropriate SVM parameters. Particle swarm optimization (PSO) is a new optimization method, which is motivated by social behaviour of organisms such as bird flocking and fish schooling. The method not only has strong global search capability, but also is very easy to implement. Thus, the proposed PSO-SVM model is applied to forecast dissolved gases content in power transformer oil in this paper, among which PSO is used to determine free parameters of support vector machine. The experimental data from several electric power companies in China is used to illustrate the performance of proposed PSO-SVM model. The experimental results indicate that the PSO-SVM method can achieve greater forecasting accuracy than grey model, artificial neural network under the circumstances of small sample.
Directory of Open Access Journals (Sweden)
Chao Dong
2012-01-01
Full Text Available Benefiting from the kernel skill and the sparse property, the relevance vector machine (RVM could acquire a sparse solution, with an equivalent generalization ability compared with the support vector machine. The sparse property requires much less time in the prediction, making RVM potential in classifying the large-scale hyperspectral image. However, RVM is not widespread influenced by its slow training procedure. To solve the problem, the classification of the hyperspectral image using RVM is accelerated by the parallel computing technique in this paper. The parallelization is revealed from the aspects of the multiclass strategy, the ensemble of multiple weak classifiers, and the matrix operations. The parallel RVMs are implemented using the C language plus the parallel functions of the linear algebra packages and the message passing interface library. The proposed methods are evaluated by the AVIRIS Indian Pines data set on the Beowulf cluster and the multicore platforms. It shows that the parallel RVMs accelerate the training procedure obviously.
Credit Scoring by Fuzzy Support Vector Machines with a Novel Membership Function
Directory of Open Access Journals (Sweden)
Jian Shi
2016-11-01
Full Text Available Due to the recent financial crisis and European debt crisis, credit risk evaluation has become an increasingly important issue for financial institutions. Reliable credit scoring models are crucial for commercial banks to evaluate the financial performance of clients and have been widely studied in the fields of statistics and machine learning. In this paper a novel fuzzy support vector machine (SVM credit scoring model is proposed for credit risk analysis, in which fuzzy membership is adopted to indicate different contribution of each input point to the learning of SVM classification hyperplane. Considering the methodological consistency, support vector data description (SVDD is introduced to construct the fuzzy membership function and to reduce the effect of outliers and noises. The SVDD-based fuzzy SVM model is tested against the traditional fuzzy SVM on two real-world datasets and the research results confirm the effectiveness of the presented method.
Dual linear structured support vector machine tracking method via scale correlation filter
Li, Weisheng; Chen, Yanquan; Xiao, Bin; Feng, Chen
2018-01-01
Adaptive tracking-by-detection methods based on structured support vector machine (SVM) performed well on recent visual tracking benchmarks. However, these methods did not adopt an effective strategy of object scale estimation, which limits the overall tracking performance. We present a tracking method based on a dual linear structured support vector machine (DLSSVM) with a discriminative scale correlation filter. The collaborative tracker comprised of a DLSSVM model and a scale correlation filter obtains good results in tracking target position and scale estimation. The fast Fourier transform is applied for detection. Extensive experiments show that our tracking approach outperforms many popular top-ranking trackers. On a benchmark including 100 challenging video sequences, the average precision of the proposed method is 82.8%.
Automatic Detection of P and S Phases by Support Vector Machine
Jiang, Y.; Ning, J.; Bao, T.
2017-12-01
Many methods in seismology rely on accurately picked phases. A well performed program on automatically phase picking will assure the application of these methods. Related researches before mostly focus on finding different characteristics between noise and phases, which are all not enough successful. We have developed a new method which mainly based on support vector machine to detect P and S phases. In it, we first input some waveform pieces into the support vector machine, then employ it to work out a hyper plane which can divide the space into two parts: respectively noise and phase. We further use the same method to find a hyper plane which can separate the phase space into P and S parts based on the three components' cross-correlation matrix. In order to further improve the ability of phase detection, we also employ array data. At last, we show that the overall effect of our method is robust by employing both synthetic and real data.
An Enhanced MEMS Error Modeling Approach Based on Nu-Support Vector Regression
Directory of Open Access Journals (Sweden)
Deepak Bhatt
2012-07-01
Full Text Available Micro Electro Mechanical System (MEMS-based inertial sensors have made possible the development of a civilian land vehicle navigation system by offering a low-cost solution. However, the accurate modeling of the MEMS sensor errors is one of the most challenging tasks in the design of low-cost navigation systems. These sensors exhibit significant errors like biases, drift, noises; which are negligible for higher grade units. Different conventional techniques utilizing the Gauss Markov model and neural network method have been previously utilized to model the errors. However, Gauss Markov model works unsatisfactorily in the case of MEMS units due to the presence of high inherent sensor errors. On the other hand, modeling the random drift utilizing Neural Network (NN is time consuming, thereby affecting its real-time implementation. We overcome these existing drawbacks by developing an enhanced Support Vector Machine (SVM based error model. Unlike NN, SVMs do not suffer from local minimisation or over-fitting problems and delivers a reliable global solution. Experimental results proved that the proposed SVM approach reduced the noise standard deviation by 10–35% for gyroscopes and 61–76% for accelerometers. Further, positional error drifts under static conditions improved by 41% and 80% in comparison to NN and GM approaches.
Discussion About Nonlinear Time Series Prediction Using Least Squares Support Vector Machine
International Nuclear Information System (INIS)
Xu Ruirui; Bian Guoxing; Gao Chenfeng; Chen Tianlun
2005-01-01
The least squares support vector machine (LS-SVM) is used to study the nonlinear time series prediction. First, the parameter γ and multi-step prediction capabilities of the LS-SVM network are discussed. Then we employ clustering method in the model to prune the number of the support values. The learning rate and the capabilities of filtering noise for LS-SVM are all greatly improved.
A Fast Classification Method of Faults in Power Electronic Circuits Based on Support Vector Machines
Cui Jiang; Shi Ge; Gong Chunying
2017-01-01
Fault detection and location are important and front-end tasks in assuring the reliability of power electronic circuits. In essence, both tasks can be considered as the classification problem. This paper presents a fast fault classification method for power electronic circuits by using the support vector machine (SVM) as a classifier and the wavelet transform as a feature extraction technique. Using one-against-rest SVM and one-against-one SVM are two general approaches to fault classificatio...
Yan, Xing; Chowdhury, Nurul A.
2015-01-01
Currently, there are many techniques available for short-term forecasting of the electricity market clearing price (MCP), but very little work has been done in the area of midterm forecasting of the electricity MCP. The midterm forecasting of the electricity MCP is essential for maintenance scheduling, planning, bilateral contracting, resources reallocation, and budgeting. A two-stage multiple support vector machine (SVM) based midterm forecasting model of the electricity MCP is proposed in t...
Lv, Jie; Yan, Zhenguo; Wei, Jingyi
2014-11-01
Accurate retrieval of crop chlorophyll content is of great importance for crop growth monitoring, crop stress situations, and the crop yield estimation. This study focused on retrieval of rice chlorophyll content from data through radiative transfer model inversion. A field campaign was carried out in September 2009 in the farmland of ChangChun, Jinlin province, China. A different set of 10 sites of the same species were used in 2009 for validation of methodologies. Reflectance of rice was collected using ASD field spectrometer for the solar reflective wavelengths (350-2500 nm), chlorophyll content of rice was measured by SPAD-502 chlorophyll meter. Each sample sites was recorded with a Global Position System (GPS).Firstly, the PROSPECT radiative transfer model was inverted using support vector machine in order to link rice spectrum and the corresponding chlorophyll content. Secondly, genetic algorithms were adopted to select parameters of support vector machine, then support vector machine was trained the training data set, in order to establish leaf chlorophyll content estimation model. Thirdly, a validation data set was established based on hyperspectral data, and the leaf chlorophyll content estimation model was applied to the validation data set to estimate leaf chlorophyll content of rice in the research area. Finally, the outcome of the inversion was evaluated using the calculated R2 and RMSE values with the field measurements. The results of the study highlight the significance of support vector machine in estimating leaf chlorophyll content of rice. Future research will concentrated on the view of the definition of satellite images and the selection of the best measurement configuration for accurate estimation of rice characteristics.
A technique to identify some typical radio frequency interference using support vector machine
Wang, Yuanchao; Li, Mingtao; Li, Dawei; Zheng, Jianhua
2017-07-01
In this paper, we present a technique to automatically identify some typical radio frequency interference from pulsar surveys using support vector machine. The technique has been tested by candidates. In these experiments, to get features of SVM, we use principal component analysis for mosaic plots and its classification accuracy is 96.9%; while we use mathematical morphology operation for smog plots and horizontal stripes plots and its classification accuracy is 86%. The technique is simple, high accurate and useful.
Land, Walker H., Jr.; McKee, Dan; Velazquez, Roberto; Wong, Lut; Lo, Joseph Y.; Anderson, Francis R.
2003-05-01
The objectives of this paper are to discuss: (1) the development and testing of a new Evolutionary Programming (EP) method to optimally configure Support Vector Machine (SVM) parameters for facilitating the diagnosis of breast cancer; (2) evaluation of EP derived learning machines when the number of BI-RADS and clinical history discriminators are reduced from 16 to 7; (3) establishing system performance for several SVM kernels in addition to the EP/Adaptive Boosting (EP/AB) hybrid using the Digital Database for Screening Mammography, University of South Florida (DDSM USF) and Duke data sets; and (4) obtaining a preliminary evaluation of the measurement of SVM learning machine inter-institutional generalization capability using BI-RADS data. Measuring performance of the SVM designs and EP/AB hybrid against these objectives will provide quantative evidence that the software packages described can generalize to larger patient data sets from different institutions. Most iterative methods currently in use to optimize learning machine parameters are time consuming processes, which sometimes yield sub-optimal values resulting in performance degradation. SVMs are new machine intelligence paradigms, which use the Structural Risk Minimization (SRM) concept to develop learning machines. These learning machines can always be trained to provide global minima, given that the machine parameters are optimally computed. In addition, several system performance studies are described which include EP derived SVM performance as a function of: (a) population and generation size as well as a method for generating initial populations and (b) iteratively derived versus EP derived learning machine parameters. Finally, the authors describe a set of experiments providing preliminary evidence that both the EP/AB hybrid and SVM Computer Aided Diagnostic C++ software packages will work across a large population of patients, based on a data set of approximately 2,500 samples from five different
González, Alvaro J; Liao, Li
2010-10-29
Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMM) where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB). Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. Datasets and source code are freely available on the web at http
Directory of Open Access Journals (Sweden)
Liao Li
2010-10-01
Full Text Available Abstract Background Protein-protein interaction (PPI plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. Results In this paper, we propose a computational method to predict DDI using support vector machines (SVMs, based on domains represented as interaction profile hidden Markov models (ipHMM where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB. Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD. Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure, an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. Conclusions We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. Datasets and source code are freely available on
Wei, Liyang; Yang, Yongyi; Nishikawa, Robert M.
2005-04-01
Microcalcification (MC) clusters in mammograms can be important early signs of breast cancer in women. Accurate detection of MC clusters is an important but challenging problem. In this paper, we propose the use of a recently developed machine learning technique -- relevance vector machine (RVM) -- for automatic detection of MCs in digitized mammograms. RVM is based on Bayesian estimation theory, and as a feature it can yield a decision function that depends on only a very small number of so-called relevance vectors. We formulate MC detection as a supervised-learning problem, and use RVM to classify if an MC object is present or not at each location in a mammogram image. MC clusters are then identified by grouping the detected MC objects. The proposed method is tested using a database of 141 clinical mammograms, and compared with a support vector machine (SVM) classifier which we developed previously. The detection performance is evaluated using the free-response receiver operating characteristic (FROC) curves. It is demonstrated that the RVM classifier matches closely with the SVM classifier in detection performance, and does so with a much sparser kernel representation than the SVM classifier. Consequently, the RVM classifier greatly reduces the computational complexity, making it more suitable for real-time processing of MC clusters in mammograms.
Implementation of algorithms based on support vector machine (SVM for electric systems: topic review
Directory of Open Access Journals (Sweden)
Jefferson Jara Estupiñan
2016-06-01
Full Text Available Objective: To perform a review of implementation of algorithms based on support vectore machine applied to electric systems. Method: A paper search is done mainly on Bibliographic Indexes (BI and Bibliographic Bases with Selection Committee (BBSC about support vector machine. This work shows a qualitative and/or quantitative description about advances and applications in the electrical environment, approaching topics such as: electrical market prediction, demand prediction, non-technical losses (theft, alternative energy source and transformers, among others, in each work the respective citation is done in order to guarantee the copy right and allow to the reader a dynamic movement between the reading and the cited works. Results: A detailed review is done, focused on the searching of implemented algorithms in electric systems and innovating application areas. Conclusion: Support vector machines have a lot of applications due to their multiple benefits, however in the electric energy area; they have not been totally applied, this allow to identify a promising area of researching.
Support vector machine based fault classification and location of a long transmission line
Directory of Open Access Journals (Sweden)
Papia Ray
2016-09-01
Full Text Available This paper investigates support vector machine based fault type and distance estimation scheme in a long transmission line. The planned technique uses post fault single cycle current waveform and pre-processing of the samples is done by wavelet packet transform. Energy and entropy are obtained from the decomposed coefficients and feature matrix is prepared. Then the redundant features from the matrix are taken out by the forward feature selection method and normalized. Test and train data are developed by taking into consideration variables of a simulation situation like fault type, resistance path, inception angle, and distance. In this paper 10 different types of short circuit fault are analyzed. The test data are examined by support vector machine whose parameters are optimized by particle swarm optimization method. The anticipated method is checked on a 400 kV, 300 km long transmission line with voltage source at both the ends. Two cases were examined with the proposed method. The first one is fault very near to both the source end (front and rear and the second one is support vector machine with and without optimized parameter. Simulation result indicates that the anticipated method for fault classification gives high accuracy (99.21% and least fault distance estimation error (0.29%.
Sweeney, Elizabeth M; Vogelstein, Joshua T; Cuzzocreo, Jennifer L; Calabresi, Peter A; Reich, Daniel S; Crainiceanu, Ciprian M; Shinohara, Russell T
2014-01-01
Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance.
Genome-wide prediction of discrete traits using bayesian regressions and machine learning
Directory of Open Access Journals (Sweden)
Forni Selma
2011-02-01
Full Text Available Abstract Background Genomic selection has gained much attention and the main goal is to increase the predictive accuracy and the genetic gain in livestock using dense marker information. Most methods dealing with the large p (number of covariates small n (number of observations problem have dealt only with continuous traits, but there are many important traits in livestock that are recorded in a discrete fashion (e.g. pregnancy outcome, disease resistance. It is necessary to evaluate alternatives to analyze discrete traits in a genome-wide prediction context. Methods This study shows two threshold versions of Bayesian regressions (Bayes A and Bayesian LASSO and two machine learning algorithms (boosting and random forest to analyze discrete traits in a genome-wide prediction context. These methods were evaluated using simulated and field data to predict yet-to-be observed records. Performances were compared based on the models' predictive ability. Results The simulation showed that machine learning had some advantages over Bayesian regressions when a small number of QTL regulated the trait under pure additivity. However, differences were small and disappeared with a large number of QTL. Bayesian threshold LASSO and boosting achieved the highest accuracies, whereas Random Forest presented the highest classification performance. Random Forest was the most consistent method in detecting resistant and susceptible animals, phi correlation was up to 81% greater than Bayesian regressions. Random Forest outperformed other methods in correctly classifying resistant and susceptible animals in the two pure swine lines evaluated. Boosting and Bayes A were more accurate with crossbred data. Conclusions The results of this study suggest that the best method for genome-wide prediction may depend on the genetic basis of the population analyzed. All methods were less accurate at correctly classifying intermediate animals than extreme animals. Among the different
PSO-based support vector machine with cuckoo search technique for clinical disease diagnoses.
Liu, Xiaoyong; Fu, Hui
2014-01-01
Disease diagnosis is conducted with a machine learning method. We have proposed a novel machine learning method that hybridizes support vector machine (SVM), particle swarm optimization (PSO), and cuckoo search (CS). The new method consists of two stages: firstly, a CS based approach for parameter optimization of SVM is developed to find the better initial parameters of kernel function, and then PSO is applied to continue SVM training and find the best parameters of SVM. Experimental results indicate that the proposed CS-PSO-SVM model achieves better classification accuracy and F-measure than PSO-SVM and GA-SVM. Therefore, we can conclude that our proposed method is very efficient compared to the previously reported algorithms.
Reducing U2R and R2L category false negative rates with support vector machines
Directory of Open Access Journals (Sweden)
Maček Nemanja
2014-01-01
Full Text Available The KDD Cup '99 is commonly used dataset for training and testing IDS machine learning algorithms. Some of the major downsides of the dataset are the distribution and the proportions of U2R and R2L instances, which represent the most dangerous attack types, as well as the existence of R2L attack instances identical to normal traffic. This enforces minor category detection complexity and causes problems while building a machine learning model capable of detecting these attacks with sufficiently low false negative rate. This paper presents a new support vector machine based intrusion detection system that classifies unknown data instances according both to the feature values and weight factors that represent importance of features towards the classification. Increased detection rate and significantly decreased false negative rate for U2R and R2L categories, that have a very few instances in the training set, have been empirically proven.
A Numerical Comparison of Rule Ensemble Methods and Support Vector Machines
Energy Technology Data Exchange (ETDEWEB)
Meza, Juan C.; Woods, Mark
2009-12-18
Machine or statistical learning is a growing field that encompasses many scientific problems including estimating parameters from data, identifying risk factors in health studies, image recognition, and finding clusters within datasets, to name just a few examples. Statistical learning can be described as 'learning from data' , with the goal of making a prediction of some outcome of interest. This prediction is usually made on the basis of a computer model that is built using data where the outcomes and a set of features have been previously matched. The computer model is called a learner, hence the name machine learning. In this paper, we present two such algorithms, a support vector machine method and a rule ensemble method. We compared their predictive power on three supernova type 1a data sets provided by the Nearby Supernova Factory and found that while both methods give accuracies of approximately 95%, the rule ensemble method gives much lower false negative rates.
International Nuclear Information System (INIS)
Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.
2015-01-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO 2 , Fe 2 O 3 , CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na 2 O, K 2 O, TiO 2 , and P 2 O 5 , the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144
Energy Technology Data Exchange (ETDEWEB)
Boucher, Thomas F., E-mail: boucher@cs.umass.edu [School of Computer Science, University of Massachusetts Amherst, 140 Governor' s Drive, Amherst, MA 01003, United States. (United States); Ozanne, Marie V. [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Carmosino, Marco L. [School of Computer Science, University of Massachusetts Amherst, 140 Governor' s Drive, Amherst, MA 01003, United States. (United States); Dyar, M. Darby [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Mahadevan, Sridhar [School of Computer Science, University of Massachusetts Amherst, 140 Governor' s Drive, Amherst, MA 01003, United States. (United States); Breves, Elly A.; Lepore, Kate H. [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Clegg, Samuel M. [Los Alamos National Laboratory, P.O. Box 1663, MS J565, Los Alamos, NM 87545 (United States)
2015-05-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO{sub 2}, Fe{sub 2}O{sub 3}, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na{sub 2}O, K{sub 2}O, TiO{sub 2}, and P{sub 2}O{sub 5}, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high
Borin, Alessandra; Ferrão, Marco Flôres; Mello, Cesar; Maretto, Danilo Althmann; Poppi, Ronei Jesus
2006-10-02
This paper proposes the use of the least-squares support vector machine (LS-SVM) as an alternative multivariate calibration method for the simultaneous quantification of some common adulterants (starch, whey or sucrose) found in powdered milk samples, using near-infrared spectroscopy with direct measurements by diffuse reflectance. Due to the spectral differences of the three adulterants a nonlinear behavior is present when all groups of adulterants are in the same data set, making the use of linear methods such as partial least squares regression (PLSR) difficult. Excellent models were built using LS-SVM, with low prediction errors and superior performance in relation to PLSR. These results show it possible to built robust models to quantify some common adulterants in powdered milk using near-infrared spectroscopy and LS-SVM as a nonlinear multivariate calibration procedure.
International Nuclear Information System (INIS)
Li Qiong; Meng Qinglin; Cai Jiejin; Yoshino, Hiroshi; Mochida, Akashi
2009-01-01
This study presents four modeling techniques for the prediction of hourly cooling load in the building. In addition to the traditional back propagation neural network (BPNN), the radial basis function neural network (RBFNN), general regression neural network (GRNN) and support vector machine (SVM) are considered. All the prediction models have been applied to an office building in Guangzhou, China. Evaluation of the prediction accuracy of the four models is based on the root mean square error (RMSE) and mean relative error (MRE). The simulation results demonstrate that the four discussed models can be effective for building cooling load prediction. The SVM and GRNN methods can achieve better accuracy and generalization than the BPNN and RBFNN methods
Irrechukwu, Onyi N; Reiter, David A; Lin, Ping-Chang; Roque, Remigio A; Fishbein, Kenneth W; Spencer, Richard G
2012-06-01
Increased sensitivity in the characterization of cartilage matrix status by magnetic resonance (MR) imaging, through the identification of surrogate markers for tissue quality, would be of great use in the noninvasive evaluation of engineered cartilage. Recent advances in MR evaluation of cartilage include multiexponential and multiparametric analysis, which we now extend to engineered cartilage. We studied constructs which developed from chondrocytes seeded in collagen hydrogels. MR measurements of transverse relaxation times were performed on samples after 1, 2, 3, and 4 weeks of development. Corresponding biochemical measurements of sulfated glycosaminoglycan (sGAG) were also performed. sGAG per wet weight increased from 7.74±1.34 μg/mg in week 1 to 21.06±4.14 μg/mg in week 4. Using multiexponential T₂ analysis, we detected at least three distinct water compartments, with T₂ values and weight fractions of (45 ms, 3%), (200 ms, 4%), and (500 ms, 97%), respectively. These values are consistent with known properties of engineered cartilage and previous studies of native cartilage. Correlations between sGAG and MR measurements were examined using conventional univariate analysis with T₂ data from monoexponential fits with individual multiexponential compartment fractions and sums of these fractions, through multiple linear regression based on linear combinations of fractions, and, finally, with multivariate analysis using the support vector regression (SVR) formalism. The phenomenological relationship between T₂ from monoexponential fitting and sGAG exhibited a correlation coefficient of r²=0.56, comparable to the more physically motivated correlations between individual fractions or sums of fractions and sGAG; the correlation based on the sum of the two proteoglycan-associated fractions was r²=0.58. Correlations between measured sGAG and those calculated using standard linear regression were more modest, with r² in the range 0
Forecast daily indices of solar activity, F10.7, using support vector regression method
International Nuclear Information System (INIS)
Huang Cong; Liu Dandan; Wang Jingsong
2009-01-01
The 10.7 cm solar radio flux (F10.7), the value of the solar radio emission flux density at a wavelength of 10.7 cm, is a useful index of solar activity as a proxy for solar extreme ultraviolet radiation. It is meaningful and important to predict F10.7 values accurately for both long-term (months-years) and short-term (days) forecasting, which are often used as inputs in space weather models. This study applies a novel neural network technique, support vector regression (SVR), to forecasting daily values of F10.7. The aim of this study is to examine the feasibility of SVR in short-term F10.7 forecasting. The approach, based on SVR, reduces the dimension of feature space in the training process by using a kernel-based learning algorithm. Thus, the complexity of the calculation becomes lower and a small amount of training data will be sufficient. The time series of F10.7 from 2002 to 2006 are employed as the data sets. The performance of the approach is estimated by calculating the norm mean square error and mean absolute percentage error. It is shown that our approach can perform well by using fewer training data points than the traditional neural network. (research paper)
Directory of Open Access Journals (Sweden)
Shahrbanoo Goli
2016-01-01
Full Text Available The Support Vector Regression (SVR model has been broadly used for response prediction. However, few researchers have used SVR for survival analysis. In this study, a new SVR model is proposed and SVR with different kernels and the traditional Cox model are trained. The models are compared based on different performance measures. We also select the best subset of features using three feature selection methods: combination of SVR and statistical tests, univariate feature selection based on concordance index, and recursive feature elimination. The evaluations are performed using available medical datasets and also a Breast Cancer (BC dataset consisting of 573 patients who visited the Oncology Clinic of Hamadan province in Iran. Results show that, for the BC dataset, survival time can be predicted more accurately by linear SVR than nonlinear SVR. Based on the three feature selection methods, metastasis status, progesterone receptor status, and human epidermal growth factor receptor 2 status are the best features associated to survival. Also, according to the obtained results, performance of linear and nonlinear kernels is comparable. The proposed SVR model performs similar to or slightly better than other models. Also, SVR performs similar to or better than Cox when all features are included in model.
Predictive based monitoring of nuclear plant component degradation using support vector regression
Energy Technology Data Exchange (ETDEWEB)
Agarwal, Vivek [Idaho National Lab. (INL), Idaho Falls, ID (United States). Dept. of Human Factors, Controls, Statistics; Alamaniotis, Miltiadis [Purdue Univ., West Lafayette, IN (United States). School of Nuclear Engineering; Tsoukalas, Lefteri H. [Purdue Univ., West Lafayette, IN (United States). School of Nuclear Engineering
2015-02-01
Nuclear power plants (NPPs) are large installations comprised of many active and passive assets. Degradation monitoring of all these assets is expensive (labor cost) and highly demanding task. In this paper a framework based on Support Vector Regression (SVR) for online surveillance of critical parameter degradation of NPP components is proposed. In this case, on time replacement or maintenance of components will prevent potential plant malfunctions, and reduce the overall operational cost. In the current work, we apply SVR equipped with a Gaussian kernel function to monitor components. Monitoring includes the one-step-ahead prediction of the component’s respective operational quantity using the SVR model, while the SVR model is trained using a set of previous recorded degradation histories of similar components. Predictive capability of the model is evaluated upon arrival of a sensor measurement, which is compared to the component failure threshold. A maintenance decision is based on a fuzzy inference system that utilizes three parameters: (i) prediction evaluation in the previous steps, (ii) predicted value of the current step, (iii) and difference of current predicted value with components failure thresholds. The proposed framework will be tested on turbine blade degradation data.
Hybrid ARIMA and Support Vector Regression in Short‑term Electricity Price Forecasting
Directory of Open Access Journals (Sweden)
Jindřich Pokora
2017-01-01
Full Text Available The literature suggests that, in short‑term electricity‑price forecasting, a combination of ARIMA and support vector regression (SVR yields performance improvement over separate use of each method. The objective of the research is to investigate the circumstances under which these hybrid models are superior for day‑ahead hourly price forecasting. Analysis of the Nord Pool market with 16 interconnected areas and 6 investigated monthly periods allows not only for a considerable level of generalizability but also for assessment of the effect of transmission congestion since this causes differences in prices between the Nord Pool areas. The paper finds that SVR, SVRARIMA and ARIMASVR provide similar performance, at the same time, hybrid methods outperform single models in terms of RMSE in 98 % of investigated time series. Furthermore, it seems that higher flexibility of hybrid models improves modeling of price spikes at a slight cost of imprecision during steady periods. Lastly, superiority of hybrid models is pronounced under transmission congestions, measured as first and second moments of the electricity price.
Predictive based monitoring of nuclear plant component degradation using support vector regression
International Nuclear Information System (INIS)
Agarwal, Vivek; Alamaniotis, Miltiadis; Tsoukalas, Lefteri H.
2015-01-01
Nuclear power plants (NPPs) are large installations comprised of many active and passive assets. Degradation monitoring of all these assets is expensive (labor cost) and highly demanding task. In this paper a framework based on Support Vector Regression (SVR) for online surveillance of critical parameter degradation of NPP components is proposed. In this case, on time replacement or maintenance of components will prevent potential plant malfunctions, and reduce the overall operational cost. In the current work, we apply SVR equipped with a Gaussian kernel function to monitor components. Monitoring includes the one-step-ahead prediction of the component's respective operational quantity using the SVR model, while the SVR model is trained using a set of previous recorded degradation histories of similar components. Predictive capability of the model is evaluated upon arrival of a sensor measurement, which is compared to the component failure threshold. A maintenance decision is based on a fuzzy inference system that utilizes three parameters: (i) prediction evaluation in the previous steps, (ii) predicted value of the current step, (iii) and difference of current predicted value with components failure thresholds. The proposed framework will be tested on turbine blade degradation data.
Estimation of the laser cutting operating cost by support vector regression methodology
Jović, Srđan; Radović, Aleksandar; Šarkoćević, Živče; Petković, Dalibor; Alizamir, Meysam
2016-09-01
Laser cutting is a popular manufacturing process utilized to cut various types of materials economically. The operating cost is affected by laser power, cutting speed, assist gas pressure, nozzle diameter and focus point position as well as the workpiece material. In this article, the process factors investigated were: laser power, cutting speed, air pressure and focal point position. The aim of this work is to relate the operating cost to the process parameters mentioned above. CO2 laser cutting of stainless steel of medical grade AISI316L has been investigated. The main goal was to analyze the operating cost through the laser power, cutting speed, air pressure, focal point position and material thickness. Since the laser operating cost is a complex, non-linear task, soft computing optimization algorithms can be used. Intelligent soft computing scheme support vector regression (SVR) was implemented. The performance of the proposed estimator was confirmed with the simulation results. The SVR results are then compared with artificial neural network and genetic programing. According to the results, a greater improvement in estimation accuracy can be achieved through the SVR compared to other soft computing methodologies. The new optimization methods benefit from the soft computing capabilities of global optimization and multiobjective optimization rather than choosing a starting point by trial and error and combining multiple criteria into a single criterion.
Directory of Open Access Journals (Sweden)
Xian-Xia Zhang
2013-01-01
Full Text Available This paper presents a reference function based 3D FLC design methodology using support vector regression (SVR learning. The concept of reference function is introduced to 3D FLC for the generation of 3D membership functions (MF, which enhance the capability of the 3D FLC to cope with more kinds of MFs. The nonlinear mathematical expression of the reference function based 3D FLC is derived, and spatial fuzzy basis functions are defined. Via relating spatial fuzzy basis functions of a 3D FLC to kernel functions of an SVR, an equivalence relationship between a 3D FLC and an SVR is established. Therefore, a 3D FLC can be constructed using the learned results of an SVR. Furthermore, the universal approximation capability of the proposed 3D fuzzy system is proven in terms of the finite covering theorem. Finally, the proposed method is applied to a catalytic packed-bed reactor and simulation results have verified its effectiveness.
Directory of Open Access Journals (Sweden)
Mustakim Mustakim
2016-02-01
Full Text Available The largest region that produces oil palm in Indonesia has an important role in improving the welfare of society and economy. Oil palm has increased significantly in Riau Province in every period, to determine the production development for the next few years with the functions and benefits of oil palm carried prediction production results that were seen from time series data last 8 years (2005-2013. In its prediction implementation, it was done by comparing the performance of Support Vector Regression (SVR method and Artificial Neural Network (ANN. From the experiment, SVR produced the best model compared with ANN. It is indicated by the correlation coefficient of 95% and 6% for MSE in the kernel Radial Basis Function (RBF, whereas ANN produced only 74% for R2 and 9% for MSE on the 8th experiment with hiden neuron 20 and learning rate 0,1. SVR model generates predictions for next 3 years which increased between 3% - 6% from actual data and RBF model predictions.
Neutron Buildup Factors Calculation for Support Vector Regression Application in Shielding Analysis
International Nuclear Information System (INIS)
Duckic, P.; Matijevic, M.; Grgic, D.
2016-01-01
In this paper initial set of data for neutron buildup factors determination using Support Vector Regression (SVR) method is prepared. The performance of SVR technique strongly depends on the quality of information used for model training. Thus it is very important to provide representable data to the SVR. SVR is a supervised type of learning so it demands data in the input/output form. In the case of neutron buildup factors estimation, the input parameters are the incident neutron energy, shielding thickness and shielding material and the output parameter is the neutron buildup factor value. So far the initial sets of data for different shielding configurations have been obtained using SCALE4.4 sequence SAS3. However, this results were obtained using group constants, thus the incident neutron energy was determined as the average value for each energy group. Obtained this way, the data provided to the SVR are fewer and therefore insufficient. More valuable information is obtained using SCALE6.2beta5 sequence MAVRIC which can perform calculations for the explicit incident neutron energy, which leads to greater maneuvering possibilities when active learning measures are employed, and consequently improves the quality of the developed SVR model.(author).
A soft-sensing model for feedwater flow rate using fuzzy support vector regression
Energy Technology Data Exchange (ETDEWEB)
Na, Man Gyun; Yang, Heon Young; Lim, Dong Hyuk [Chosun University, Gwangju (Korea, Republic of)
2008-02-15
Most pressurized water reactors use Venturi flow meters to measure the feedwater flow rate. However, fouling phenomena, which allow corrosion products to accumulate and increase the differential pressure across the Venturi flow meter, can result in an overestimation of the flow rate. In this study, a soft-sensing model based on fuzzy support vector regression was developed to enable accurate on-line prediction of the feedwater flow rate. The available data was divided into two groups by fuzzy c-means clustering in order to reduce the training time. The data for training the soft-sensing model was selected from each data group with the aid of a subtractive clustering scheme because informative data increases the learning effect. The proposed soft-sensing model was confirmed with the real plant data of Yonggwang Nuclear Power Plant Unit 3. The root mean square error and relative maximum error of the model were quite small. Hence, this model can be used to validate and monitor existing hardware feedwater flow meters.
Support vector regression methodology for estimating global solar radiation in Algeria
Guermoui, Mawloud; Rabehi, Abdelaziz; Gairaa, Kacem; Benkaciali, Said
2018-01-01
Accurate estimation of Daily Global Solar Radiation (DGSR) has been a major goal for solar energy applications. In this paper we show the possibility of developing a simple model based on the Support Vector Regression (SVM-R), which could be used to estimate DGSR on the horizontal surface in Algeria based only on sunshine ratio as input. The SVM model has been developed and tested using a data set recorded over three years (2005-2007). The data was collected at the Applied Research Unit for Renewable Energies (URAER) in Ghardaïa city. The data collected between 2005-2006 are used to train the model while the 2007 data are used to test the performance of the selected model. The measured and the estimated values of DGSR were compared during the testing phase statistically using the Root Mean Square Error (RMSE), Relative Square Error (rRMSE), and correlation coefficient (r2), which amount to 1.59(MJ/m2), 8.46 and 97,4%, respectively. The obtained results show that the SVM-R is highly qualified for DGSR estimation using only sunshine ratio.
Zhou, Pei-pei; Shan, Jin-feng; Jiang, Jian-lan
2015-12-01
To optimize the optimal microwave-assisted extraction method of curcuminoids from Curcuma longa. On the base of single factor experiment, the ethanol concentration, the ratio of liquid to solid and the microwave time were selected for further optimization. Support Vector Regression (SVR) and Central Composite Design-Response Surface Methodology (CCD) algorithm were utilized to design and establish models respectively, while Particle Swarm Optimization (PSO) was introduced to optimize the parameters of SVR models and to search optimal points of models. The evaluation indicator, the sum of curcumin, demethoxycurcumin and bisdemethoxycurcumin by HPLC, were used. The optimal parameters of microwave-assisted extraction were as follows: ethanol concentration of 69%, ratio of liquid to solid of 21 : 1, microwave time of 55 s. On those conditions, the sum of three curcuminoids was 28.97 mg/g (per gram of rhizomes powder). Both the CCD model and the SVR model were credible, for they have predicted the similar process condition and the deviation of yield were less than 1.2%.
Dai, Wensheng
2014-01-01
Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating feature extraction method and prediction tool, such as support vector regression (SVR), is a useful method for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to deal with various forecasting problems. But, up to now, only the basic ICA method (i.e., temporal ICA model) was applied to sale forecasting problem. In this paper, we utilize three different ICA methods including spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA) to extract features from the sales data and compare their performance in sales forecasting of IT chain store. Experimental results from a real sales data show that the sales forecasting scheme by integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data and the extracted features can improve the prediction performance of SVR for sales forecasting. PMID:25165740
Tian, Y.; Xu, Y. P.
2017-12-01
In this paper, the Support Vector Regression (SVR) model incorporating climate indices and drought indices are developed to predict agriculture drought in Xiangjiang River basin, Central China. The agriculture droughts are presented with the Precipitation-Evapotranspiration Index (SPEI). According to the analysis of the relationship between SPEI with different time scales and soil moisture, it is found that SPEI of six months time scales (SPEI-6) could reflect the soil moisture better than that of three and one month time scale from the drought features including drought duration, severity and peak. Climate forcing like El Niño Southern Oscillation and western Pacific subtropical high (WPSH) are represented by climate indices such as MEI and series indices of WPSH. Ridge Point of WPSH is found to be the key factor that influences the agriculture drought mainly through the control of temperature. Based on the climate indices analysis, the predictions of SPEI-6 are conducted using the SVR model. The results show that the SVR model incorperating climate indices, especially ridge point of WPSH, could improve the prediction accuracy compared to that using drought index only. The improvement was more significant for the prediction of one month lead time than that of three months lead time. However, it needs to be cautious in selection of the input parameters, since adding more useless information could have a counter effect in attaining a better prediction.
Prediction of diffuse solar irradiance using machine learning and multivariable regression
International Nuclear Information System (INIS)
Lou, Siwei; Li, Danny H.W.; Lam, Joseph C.; Chan, Wilco W.H.
2016-01-01
Highlights: • 54.9% of the annual global irradiance is composed by its diffuse part in HK. • Hourly diffuse irradiance was predicted by accessible variables. • The importance of variable in prediction was assessed by machine learning. • Simple prediction equations were developed with the knowledge of variable importance. - Abstract: The paper studies the horizontal global, direct-beam and sky-diffuse solar irradiance data measured in Hong Kong from 2008 to 2013. A machine learning algorithm was employed to predict the horizontal sky-diffuse irradiance and conduct sensitivity analysis for the meteorological variables. Apart from the clearness index (horizontal global/extra atmospheric solar irradiance), we found that predictors including solar altitude, air temperature, cloud cover and visibility are also important in predicting the diffuse component. The mean absolute error (MAE) of the logistic regression using the aforementioned predictors was less than 21.5 W/m 2 and 30 W/m 2 for Hong Kong and Denver, USA, respectively. With the systematic recording of the five variables for more than 35 years, the proposed model would be appropriate to estimate of long-term diffuse solar radiation, study climate change and develope typical meteorological year in Hong Kong and places with similar climates.
Georga, Eleni I; Protopappas, Vasilios C; Ardigò, Diego; Polyzos, Demosthenes; Fotiadis, Dimitrios I
2013-08-01
The prevention of hypoglycemic events is of paramount importance in the daily management of insulin-treated diabetes. The use of short-term prediction algorithms of the subcutaneous (s.c.) glucose concentration may contribute significantly toward this direction. The literature suggests that, although the recent glucose profile is a prominent predictor of hypoglycemia, the overall patient's context greatly impacts its accurate estimation. The objective of this study is to evaluate the performance of a support vector for regression (SVR) s.c. glucose method on hypoglycemia prediction. We extend our SVR model to predict separately the nocturnal events during sleep and the non-nocturnal (i.e., diurnal) ones over 30-min and 60-min horizons using information on recent glucose profile, meals, insulin intake, and physical activities for a hypoglycemic threshold of 70 mg/dL. We also introduce herein additional variables accounting for recurrent nocturnal hypoglycemia due to antecedent hypoglycemia, exercise, and sleep. SVR predictions are compared with those from two other machine learning techniques. The method is assessed on a dataset of 15 patients with type 1 diabetes under free-living conditions. Nocturnal hypoglycemic events are predicted with 94% sensitivity for both horizons and with time lags of 5.43 min and 4.57 min, respectively. As concerns the diurnal events, when physical activities are not considered, the sensitivity is 92% and 96% for a 30-min and 60-min horizon, respectively, with both time lags being less than 5 min. However, when such information is introduced, the diurnal sensitivity decreases by 8% and 3%, respectively. Both nocturnal and diurnal predictions show a high (>90%) precision. Results suggest that hypoglycemia prediction using SVR can be accurate and performs better in most diurnal and nocturnal cases compared with other techniques. It is advised that the problem of hypoglycemia prediction should be handled differently for nocturnal
Liu, Shelley H; Bobb, Jennifer F; Lee, Kyu Ha; Gennings, Chris; Claus Henn, Birgit; Bellinger, David; Austin, Christine; Schnaas, Lourdes; Tellez-Rojo, Martha M; Hu, Howard; Wright, Robert O; Arora, Manish; Coull, Brent A
2017-09-06
The impact of neurotoxic chemical mixtures on children's health is a critical public health concern. It is well known that during early life, toxic exposures may impact cognitive function during critical time intervals of increased vulnerability, known as windows of susceptibility. Knowledge on time windows of susceptibility can help inform treatment and prevention strategies, as chemical mixtures may affect a developmental process that is operating at a specific life phase. There are several statistical challenges in estimating the health effects of time-varying exposures to multi-pollutant mixtures, such as: multi-collinearity among the exposures both within time points and across time points, and complex exposure-response relationships. To address these concerns, we develop a flexible statistical method, called lagged kernel machine regression (LKMR). LKMR identifies critical exposure windows of chemical mixtures, and accounts for complex non-linear and non-additive effects of the mixture at any given exposure window. Specifically, LKMR estimates how the effects of a mixture of exposures change with the exposure time window using a Bayesian formulation of a grouped, fused lasso penalty within a kernel machine regression (KMR) framework. A simulation study demonstrates the performance of LKMR under realistic exposure-response scenarios, and demonstrates large gains over approaches that consider each time window separately, particularly when serial correlation among the time-varying exposures is high. Furthermore, LKMR demonstrates gains over another approach that inputs all time-specific chemical concentrations together into a single KMR. We apply LKMR to estimate associations between neurodevelopment and metal mixtures in Early Life Exposures in Mexico and Neurotoxicology, a prospective cohort study of child health in Mexico City. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Directory of Open Access Journals (Sweden)
L. Gregor
2017-12-01
Full Text Available The Southern Ocean accounts for 40 % of oceanic CO2 uptake, but the estimates are bound by large uncertainties due to a paucity in observations. Gap-filling empirical methods have been used to good effect to approximate pCO2 from satellite observable variables in other parts of the ocean, but many of these methods are not in agreement in the Southern Ocean. In this study we propose two additional methods that perform well in the Southern Ocean: support vector regression (SVR and random forest regression (RFR. The methods are used to estimate ΔpCO2 in the Southern Ocean based on SOCAT v3, achieving similar trends to the SOM-FFN method by Landschützer et al. (2014. Results show that the SOM-FFN and RFR approaches have RMSEs of similar magnitude (14.84 and 16.45 µatm, where 1 atm = 101 325 Pa where the SVR method has a larger RMSE (24.40 µatm. However, the larger errors for SVR and RFR are, in part, due to an increase in coastal observations from SOCAT v2 to v3, where the SOM-FFN method used v2 data. The success of both SOM-FFN and RFR depends on the ability to adapt to different modes of variability. The SOM-FFN achieves this by having independent regression models for each cluster, while this flexibility is intrinsic to the RFR method. Analyses of the estimates shows that the SVR and RFR's respective sensitivity and robustness to outliers define the outcome significantly. Further analyses on the methods were performed by using a synthetic dataset to assess the following: which method (RFR or SVR has the best performance? What is the effect of using time, latitude and longitude as proxy variables on ΔpCO2? What is the impact of the sampling bias in the SOCAT v3 dataset on the estimates? We find that while RFR is indeed better than SVR, the ensemble of the two methods outperforms either one, due to complementary strengths and weaknesses of the methods. Results also show that for the RFR and SVR implementations, it is
Gregor, Luke; Kok, Schalk; Monteiro, Pedro M. S.
2017-12-01
The Southern Ocean accounts for 40 % of oceanic CO2 uptake, but the estimates are bound by large uncertainties due to a paucity in observations. Gap-filling empirical methods have been used to good effect to approximate pCO2 from satellite observable variables in other parts of the ocean, but many of these methods are not in agreement in the Southern Ocean. In this study we propose two additional methods that perform well in the Southern Ocean: support vector regression (SVR) and random forest regression (RFR). The methods are used to estimate ΔpCO2 in the Southern Ocean based on SOCAT v3, achieving similar trends to the SOM-FFN method by Landschützer et al. (2014). Results show that the SOM-FFN and RFR approaches have RMSEs of similar magnitude (14.84 and 16.45 µatm, where 1 atm = 101 325 Pa) where the SVR method has a larger RMSE (24.40 µatm). However, the larger errors for SVR and RFR are, in part, due to an increase in coastal observations from SOCAT v2 to v3, where the SOM-FFN method used v2 data. The success of both SOM-FFN and RFR depends on the ability to adapt to different modes of variability. The SOM-FFN achieves this by having independent regression models for each cluster, while this flexibility is intrinsic to the RFR method. Analyses of the estimates shows that the SVR and RFR's respective sensitivity and robustness to outliers define the outcome significantly. Further analyses on the methods were performed by using a synthetic dataset to assess the following: which method (RFR or SVR) has the best performance? What is the effect of using time, latitude and longitude as proxy variables on ΔpCO2? What is the impact of the sampling bias in the SOCAT v3 dataset on the estimates? We find that while RFR is indeed better than SVR, the ensemble of the two methods outperforms either one, due to complementary strengths and weaknesses of the methods. Results also show that for the RFR and SVR implementations, it is better to include coordinates as proxy
Kalayeh, Mahdi M.; Marin, Thibault; Pretorius, P. Hendrik; Wernick, Miles N.; Yang, Yongyi; Brankov, Jovan G.
2011-03-01
In this paper, we present a numerical observer for image quality assessment, aiming to predict human observer accuracy in a cardiac perfusion defect detection task for single-photon emission computed tomography (SPECT). In medical imaging, image quality should be assessed by evaluating the human observer accuracy for a specific diagnostic task. This approach is known as task-based assessment. Such evaluations are important for optimizing and testing imaging devices and algorithms. Unfortunately, human observer studies with expert readers are costly and time-demanding. To address this problem, numerical observers have been developed as a surrogate for human readers to predict human diagnostic performance. The channelized Hotelling observer (CHO) with internal noise model has been found to predict human performance well in some situations, but does not always generalize well to unseen data. We have argued in the past that finding a model to predict human observers could be viewed as a machine learning problem. Following this approach, in this paper we propose a channelized relevance vector machine (CRVM) to predict human diagnostic scores in a detection task. We have previously used channelized support vector machines (CSVM) to predict human scores and have shown that this approach offers better and more robust predictions than the classical CHO method. The comparison of the proposed CRVM with our previously introduced CSVM method suggests that CRVM can achieve similar generalization accuracy, while dramatically reducing model complexity and computation time.
Torija, Antonio J; Ruiz, Diego P; Ramos-Ridao, Angel F
2014-06-01
To ensure appropriate soundscape management in urban environments, the urban-planning authorities need a range of tools that enable such a task to be performed. An essential step during the management of urban areas from a sound standpoint should be the evaluation of the soundscape in such an area. In this sense, it has been widely acknowledged that a subjective and acoustical categorization of a soundscape is the first step to evaluate it, providing a basis for designing or adapting it to match people's expectations as well. In this sense, this work proposes a model for automatic classification of urban soundscapes. This model is intended for the automatic classification of urban soundscapes based on underlying acoustical and perceptual criteria. Thus, this classification model is proposed to be used as a tool for a comprehensive urban soundscape evaluation. Because of the great complexity associated with the problem, two machine learning techniques, Support Vector Machines (SVM) and Support Vector Machines trained with Sequential Minimal Optimization (SMO), are implemented in developing model classification. The results indicate that the SMO model outperforms the SVM model in the specific task of soundscape classification. With the implementation of the SMO algorithm, the classification model achieves an outstanding performance (91.3% of instances correctly classified). © 2013 Elsevier B.V. All rights reserved.
Yavuz, Ahmet Sinan; Sezerman, Osman Ugur
2014-01-01
Sumoylation, which is a reversible and dynamic post-translational modification, is one of the vital processes in a cell. Before a protein matures to perform its function, sumoylation may alter its localization, interactions, and possibly structural conformation. Abberations in protein sumoylation has been linked with a variety of disorders and developmental anomalies. Experimental approaches to identification of sumoylation sites may not be effective due to the dynamic nature of sumoylation, laborsome experiments and their cost. Therefore, computational approaches may guide experimental identification of sumoylation sites and provide insights for further understanding sumoylation mechanism. In this paper, the effectiveness of using various sequence properties in predicting sumoylation sites was investigated with statistical analyses and machine learning approach employing support vector machines. These sequence properties were derived from windows of size 7 including position-specific amino acid composition, hydrophobicity, estimated sub-window volumes, predicted disorder, and conformational flexibility. 5-fold cross-validation results on experimentally identified sumoylation sites revealed that our method successfully predicts sumoylation sites with a Matthew's correlation coefficient, sensitivity, specificity, and accuracy equal to 0.66, 73%, 98%, and 97%, respectively. Additionally, we have showed that our method compares favorably to the existing prediction methods and basic regular expressions scanner. By using support vector machines, a new, robust method for sumoylation site prediction was introduced. Besides, the possible effects of predicted conformational flexibility and disorder on sumoylation site recognition were explored computationally for the first time to our knowledge as an additional parameter that could aid in sumoylation site prediction.
Automated valve fault detection based on acoustic emission parameters and support vector machine
Directory of Open Access Journals (Sweden)
Salah M. Ali
2018-03-01
Full Text Available Reciprocating compressors are one of the most used types of compressors with wide applications in industry. The most common failure in reciprocating compressors is always related to the valves. Therefore, a reliable condition monitoring method is required to avoid the unplanned shutdown in this category of machines. Acoustic emission (AE technique is one of the effective recent methods in the field of valve condition monitoring. However, a major challenge is related to the analysis of AE signal which perhaps only depends on the experience and knowledge of technicians. This paper proposes automated fault detection method using support vector machine (SVM and AE parameters in an attempt to reduce human intervention in the process. Experiments were conducted on a single stage reciprocating air compressor by combining healthy and faulty valve conditions to acquire the AE signals. Valve functioning was identified through AE waveform analysis. SVM faults detection model was subsequently devised and validated based on training and testing samples respectively. The results demonstrated automatic valve fault detection model with accuracy exceeding 98%. It is believed that valve faults can be detected efficiently without human intervention by employing the proposed model for a single stage reciprocating compressor. Keywords: Condition monitoring, Faults detection, Signal analysis, Acoustic emission, Support vector machine
Hu, Qinghua; Zhang, Shiguang; Xie, Zongxia; Mi, Jusheng; Wan, Jie
2014-09-01
Support vector regression (SVR) techniques are aimed at discovering a linear or nonlinear structure hidden in sample data. Most existing regression techniques take the assumption that the error distribution is Gaussian. However, it was observed that the noise in some real-world applications, such as wind power forecasting and direction of the arrival estimation problem, does not satisfy Gaussian distribution, but a beta distribution, Laplacian distribution, or other models. In these cases the current regression techniques are not optimal. According to the Bayesian approach, we derive a general loss function and develop a technique of the uniform model of ν-support vector regression for the general noise model (N-SVR). The Augmented Lagrange Multiplier method is introduced to solve N-SVR. Numerical experiments on artificial data sets, UCI data and short-term wind speed prediction are conducted. The results show the effectiveness of the proposed technique. Copyright © 2014 Elsevier Ltd. All rights reserved.
Predicting and improving the protein sequence alignment quality by support vector regression
Directory of Open Access Journals (Sweden)
Kim Dongsup
2007-12-01
Full Text Available Abstract Background For successful protein structure prediction by comparative modeling, in addition to identifying a good template protein with known structure, obtaining an accurate sequence alignment between a query protein and a template protein is critical. It has been known that the alignment accuracy can vary significantly depending on our choice of various alignment parameters such as gap opening penalty and gap extension penalty. Because the accuracy of sequence alignment is typically measured by comparing it with its corresponding structure alignment, there is no good way of evaluating alignment accuracy without knowing the structure of a query protein, which is obviously not available at the time of structure prediction. Moreover, there is no universal alignment parameter option that would always yield the optimal alignment. Results In this work, we develop a method to predict the quality of the alignment between a query and a template. We train the support vector regression (SVR models to predict the MaxSub scores as a measure of alignment quality. The alignment between a query protein and a template of length n is transformed into a (n + 1-dimensional feature vector, then it is used as an input to predict the alignment quality by the trained SVR model. Performance of our work is evaluated by various measures including Pearson correlation coefficient between the observed and predicted MaxSub scores. Result shows high correlation coefficient of 0.945. For a pair of query and template, 48 alignments are generated by changing alignment options. Trained SVR models are then applied to predict the MaxSub scores of those and to select the best alignment option which is chosen specifically to the query-template pair. This adaptive selection procedure results in 7.4% improvement of MaxSub scores, compared to those when the single best parameter option is used for all query-template pairs. Conclusion The present work demonstrates that the
Assimilation of PFISR Data Using Support Vector Regression and Ground Based Camera Constraints
Clayton, R.; Lynch, K. A.; Nicolls, M. J.; Hampton, D. L.; Michell, R.; Samara, M.; Guinther, J.
2013-12-01
In order to best interpret the information gained from multipoint in situ measurements, a Support Vector Regression algorithm is being developed to interpret the data collected from the instruments in the context of ground observations (such as those from camera or radar array). The idea behind SVR is to construct the simplest function that models the data with the least squared error, subject to constraints given by the user. Constraints can be brought into the algorithm from other data sources or from models. As is often the case with data, a perfect solution to such a problem may be impossible, thus 'slack' may be introduced to control how closely the model adheres to the data. The algorithm employs kernels, and chooses radial basis functions as an appropriate kernel. The current SVR code can take input data as one to three dimensional scalars or vectors, and may also include time. External data can be incorporated and assimilated into a model of the environment. Regions of minimal and maximal values are allowed to relax to the sample average (or a user-supplied model) on size and time scales determined by user input, known as feature sizes. These feature sizes can vary for each degree of freedom if the user desires. The user may also select weights for each data point, if it is desirable to weight parts of the data differently. In order to test the algorithm, Poker Flat Incoherent Scatter Radar (PFISR) and MICA sounding rocket data are being used as sample data. The PFISR data consists of many beams, each with multiple ranges. In addition to analyzing the radar data as it stands, the algorithm is being used to simulate data from a localized ionospheric swarm of Cubesats using existing PFISR data. The sample points of the radar at one altitude slice can serve as surrogates for satellites in a cubeswarm. The number of beams of the PFISR radar can then be used to see what the algorithm would output for a swarm of similar size. By using PFISR data in the 15-beam to
Directory of Open Access Journals (Sweden)
Indah Agustien
2010-10-01
Full Text Available Human pose interpretation using Particle filter with Binary Gaussian Weighting and Support Vector Machine is proposed. In the proposed system, Particle filter is used to track human object, then this human object is skeletonized using thinning algorithm and classified using Support Vector Machine. The classification is to identify human pose, whether a normal or abnormal behavior. Here Particle filter is modified through weight calculation using Gaussiandistribution to reduce the computational time. The modified particle filter consists of four main phases. First, particles are generated to predict target’s location. Second, weight of certain particles is calculated and these particles are used to build Gaussian distribution. Third, weight of all particles is calculated based on Gaussian distribution. Fourth, update particles based on each weight. The modified particle filter could reduce computational time of object tracking since this method does not have to calculate particle’s weight one by one. To calculate weight, the proposed method builds Gaussian distribution and calculates particle’s weight using this distribution. Through experiment using video data taken in front of cashier of convenient store, the proposed method reduced computational time in tracking process until 68.34% in average compare to the conventional one, meanwhile the accuracy of tracking with this new method is comparable with particle filter method i.e. 90.3%. Combination particle filter with binary Gaussian weighting and support vector machine is promising for advanced early crime scene investigation.
Wu, Qi
2010-03-01
Demand forecasts play a crucial role in supply chain management. The future demand for a certain product is the basis for the respective replenishment systems. Aiming at demand series with small samples, seasonal character, nonlinearity, randomicity and fuzziness, the existing support vector kernel does not approach the random curve of the sales time series in the space (quadratic continuous integral space). In this paper, we present a hybrid intelligent system combining the wavelet kernel support vector machine and particle swarm optimization for demand forecasting. The results of application in car sale series forecasting show that the forecasting approach based on the hybrid PSOWv-SVM model is effective and feasible, the comparison between the method proposed in this paper and other ones is also given, which proves that this method is, for the discussed example, better than hybrid PSOv-SVM and other traditional methods.
Neutron–gamma discrimination based on the support vector machine method
International Nuclear Information System (INIS)
Yu, Xunzhen; Zhu, Jingjun; Lin, ShinTed; Wang, Li; Xing, Haoyang; Zhang, Caixun; Xia, Yuxi; Liu, Shukui; Yue, Qian; Wei, Weiwei; Du, Qiang; Tang, Changjian
2015-01-01
In this study, the combination of the support vector machine (SVM) method with the moment analysis method (MAM) is proposed and utilized to perform neutron/gamma (n/γ) discrimination of the pulses from an organic liquid scintillator (OLS). Neutron and gamma events, which can be firmly separated on the scatter plot drawn by the charge comparison method (CCM), are detected to form the training data set and the test data set for the SVM, and the MAM is used to create the feature vectors for individual events in the data sets. Compared to the traditional methods, such as CCM, the proposed method can not only discriminate the neutron and gamma signals, even at lower energy levels, but also provide the corresponding classification accuracy for each event, which is useful in validating the discrimination. Meanwhile, the proposed method can also offer a predication of the classification for the under-energy-limit events
Research on bearing life prediction based on support vector machine and its application
International Nuclear Information System (INIS)
Sun Chuang; Zhang Zhousuo; He Zhengjia
2011-01-01
Life prediction of rolling element bearing is the urgent demand in engineering practice, and the effective life prediction technique is beneficial to predictive maintenance. Support vector machine (SVM) is a novel machine learning method based on statistical learning theory, and is of advantage in prediction. This paper develops SVM-based model for bearing life prediction. The inputs of the model are features of bearing vibration signal and the output is the bearing running time-bearing failure time ratio. The model is built base on a few failed bearing data, and it can fuse information of the predicted bearing. So it is of advantage to bearing life prediction in practice. The model is applied to life prediction of a bearing, and the result shows the proposed model is of high precision.
Signal Detection for QPSK Based Cognitive Radio Systems using Support Vector Machines
Directory of Open Access Journals (Sweden)
M. T. Mushtaq
2015-04-01
Full Text Available Cognitive radio based network enables opportunistic dynamic spectrum access by sensing, adopting and utilizing the unused portion of licensed spectrum bands. Cognitive radio is intelligent enough to adapt the communication parameters of the unused licensed spectrum. Spectrum sensing is one of the most important tasks of the cognitive radio cycle. In this paper, the auto-correlation function kernel based Support Vector Machine (SVM classifier along with Welch's Periodogram detector is successfully implemented for the detection of four QPSK (Quadrature Phase Shift Keying based signals propagating through an AWGN (Additive White Gaussian Noise channel. It is shown that the combination of statistical signal processing and machine learning concepts improve the spectrum sensing process and spectrum sensing is possible even at low Signal to Noise Ratio (SNR values up to -50 dB.
Aging Detection of Electrical Point Machines Based on Support Vector Data Description
Directory of Open Access Journals (Sweden)
Jaewon Sa
2017-11-01
Full Text Available Electrical point machines (EPM must be replaced at an appropriate time to prevent the occurrence of operational safety or stability problems in trains resulting from aging or budget constraints. However, it is difficult to replace EPMs effectively because the aging conditions of EPMs depend on the operating environments, and thus, a guideline is typically not be suitable for replacing EPMs at the most timely moment. In this study, we propose a method of classification for the detection of an aging effect to facilitate the timely replacement of EPMs. We employ support vector data description to segregate data of “aged” and “not-yet-aged” equipment by analyzing the subtle differences in normalized electrical signals resulting from aging. Based on the before and after-replacement data that was obtained from experimental studies that were conducted on EPMs, we confirmed that the proposed method was capable of classifying machines based on exhibited aging effects with adequate accuracy.
New fuzzy support vector machine for the class imbalance problem in medical datasets classification.
Gu, Xiaoqing; Ni, Tongguang; Wang, Hongyuan
2014-01-01
In medical datasets classification, support vector machine (SVM) is considered to be one of the most successful methods. However, most of the real-world medical datasets usually contain some outliers/noise and data often have class imbalance problems. In this paper, a fuzzy support machine (FSVM) for the class imbalance problem (called FSVM-CIP) is presented, which can be seen as a modified class of FSVM by extending manifold regularization and assigning two misclassification costs for two classes. The proposed FSVM-CIP can be used to handle the class imbalance problem in the presence of outliers/noise, and enhance the locality maximum margin. Five real-world medical datasets, breast, heart, hepatitis, BUPA liver, and pima diabetes, from the UCI medical database are employed to illustrate the method presented in this paper. Experimental results on these datasets show the outperformed or comparable effectiveness of FSVM-CIP.
New Fuzzy Support Vector Machine for the Class Imbalance Problem in Medical Datasets Classification
Directory of Open Access Journals (Sweden)
Xiaoqing Gu
2014-01-01
Full Text Available In medical datasets classification, support vector machine (SVM is considered to be one of the most successful methods. However, most of the real-world medical datasets usually contain some outliers/noise and data often have class imbalance problems. In this paper, a fuzzy support machine (FSVM for the class imbalance problem (called FSVM-CIP is presented, which can be seen as a modified class of FSVM by extending manifold regularization and assigning two misclassification costs for two classes. The proposed FSVM-CIP can be used to handle the class imbalance problem in the presence of outliers/noise, and enhance the locality maximum margin. Five real-world medical datasets, breast, heart, hepatitis, BUPA liver, and pima diabetes, from the UCI medical database are employed to illustrate the method presented in this paper. Experimental results on these datasets show the outperformed or comparable effectiveness of FSVM-CIP.
Directory of Open Access Journals (Sweden)
Yu Huiling
2016-01-01
Full Text Available The suppor1t vector machine (SVM is a relatively new artificial intelligence technique which is increasingly being applied to geotechnical problems and is yielding encouraging results. SVM is a new machine learning method based on the statistical learning theory. A case study based on road foundation engineering project shows that the forecast results are in good agreement with the measured data. The SVM model is also compared with BP artificial neural network model and traditional hyperbola method. The prediction results indicate that the SVM model has a better prediction ability than BP neural network model and hyperbola method. Therefore, settlement prediction based on SVM model can reflect actual settlement process more correctly. The results indicate that it is effective and feasible to use this method and the nonlinear mapping relation between foundation settlement and its influence factor can be expressed well. It will provide a new method to predict foundation settlement.
Rutishauser, David
2006-01-01
The motivation for this work comes from an observation that amidst the push for Massively Parallel (MP) solutions to high-end computing problems such as numerical physical simulations, large amounts of legacy code exist that are highly optimized for vector supercomputers. Because re-hosting legacy code often requires a complete re-write of the original code, which can be a very long and expensive effort, this work examines the potential to exploit reconfigurable computing machines in place of a vector supercomputer to implement an essentially unmodified legacy source code. Custom and reconfigurable computing resources could be used to emulate an original application's target platform to the extent required to achieve high performance. To arrive at an architecture that delivers the desired performance subject to limited resources involves solving a multi-variable optimization problem with constraints. Prior research in the area of reconfigurable computing has demonstrated that designing an optimum hardware implementation of a given application under hardware resource constraints is an NP-complete problem. The premise of the approach is that the general issue of applying reconfigurable computing resources to the implementation of an application, maximizing the performance of the computation subject to physical resource constraints, can be made a tractable problem by assuming a computational paradigm, such as vector processing. This research contributes a formulation of the problem and a methodology to design a reconfigurable vector processing implementation of a given application that satisfies a performance metric. A generic, parametric, architectural framework for vector processing implemented in reconfigurable logic is developed as a target for a scheduling/mapping algorithm that maps an input computation to a given instance of the architecture. This algorithm is integrated with an optimization framework to arrive at a specification of the architecture parameters
A Shellcode Detection Method Based on Full Native API Sequence and Support Vector Machine
Cheng, Yixuan; Fan, Wenqing; Huang, Wei; An, Jing
2017-09-01
Dynamic monitoring the behavior of a program is widely used to discriminate between benign program and malware. It is usually based on the dynamic characteristics of a program, such as API call sequence or API call frequency to judge. The key innovation of this paper is to consider the full Native API sequence and use the support vector machine to detect the shellcode. We also use the Markov chain to extract and digitize Native API sequence features. Our experimental results show that the method proposed in this paper has high accuracy and low detection rate.
Bidding strategy with forecast technology based on support vector machine in the electricity market
Gao, Ciwei; Bompard, Ettore; Napoli, Roberto; Wan, Qiulan; Zhou, Jian
2008-06-01
The participants in the electricity market are concerned very much with the market price evolution. Various technologies have been developed for price forecasting. The SVM (Support Vector Machine) has shown its good performance in market price forecasting. Two approaches for forming the market bidding strategies based on SVM are proposed. One is based on the price forecasting accuracy, with which the rejection risk is defined. The other takes into account the impact of the producer’s own bid. The risks associated with the bidding are controlled by the parameter settings. The proposed approaches have been tested on a numerical example.
Multimodal biometric authentication based on score level fusion using support vector machine
Wang, F.; Han, J.
2009-03-01
Fusion of multiple biometrics for human authentication performance improvement has received considerable attention. This paper presents a novel multimodal biometric authentication method integrating face and iris based on score level fusion. For score level fusion, support vector machine (SVM) based fusion rule is applied to combine two matching scores, respectively from Laplacianface based face verifier and phase information based iris verifier, to generate a single scalar score which is used to make the final decision. Experimental results show that the performance of the proposed method can bring obvious improvement comparing to the unimodal biometric identification methods and the previous fused face-iris methods.
Directory of Open Access Journals (Sweden)
Yukai Yao
2015-01-01
Full Text Available We propose an optimized Support Vector Machine classifier, named PMSVM, in which System Normalization, PCA, and Multilevel Grid Search methods are comprehensively considered for data preprocessing and parameters optimization, respectively. The main goals of this study are to improve the classification efficiency and accuracy of SVM. Sensitivity, Specificity, Precision, and ROC curve, and so forth, are adopted to appraise the performances of PMSVM. Experimental results show that PMSVM has relatively better accuracy and remarkable higher efficiency compared with traditional SVM algorithms.
Automatic parametrization of Support Vector Machines for short texts polarity detection
Directory of Open Access Journals (Sweden)
Aurelio Sanabria Rodríguez
2017-04-01
Full Text Available The information from social media is emerging as a valuable source in decision-making, unfortunately the tools to turn these data into useful information still need some work. Using Support Vector Machines for polarity detection in short texts are popular among researchers for their good results, but parameter optimization to train classification models is a complex and costly process. This article compares two algorithms for automated parameter optimization in the process of creating classification models for polarity detection: the recently created Grey Wolf Optimizer and the Grid Search, using accuracy and f-score metrics.
Schmidt, Andreas; Lausch, Angela; Vogel, Hans-Jörg
2016-04-01
Diffuse reflectance spectroscopy as a soil analytical tool is spreading more and more. There is a wide range of possible applications ranging from the point scale (e.g. simple soil samples, drill cores, vertical profile scans) through the field scale to the regional and even global scale (UAV, airborne and space borne instruments, soil reflectance databases). The basic idea is that the soil's reflectance spectrum holds information about its properties (like organic matter content or mineral composition). The relation between soil properties and the observable spectrum is usually not exactly know and is typically derived from statistical methods. Nowadays these methods are classified in the term machine learning, which comprises a vast pool of algorithms and methods for learning the relationship between pairs if input - output data (training data set). Within this pool of methods a Gaussian Process Regression (GPR) is newly emerging method (originating from Bayesian statistics) which is increasingly applied to applications in different fields. For example, it was successfully used to predict vegetation parameters from hyperspectral remote sensing data. In this study we apply GPR to predict soil organic carbon from soil spectroscopy data (400 - 2500 nm). We compare it to more traditional and widely used methods such as Partitial Least Squares Regression (PLSR), Random Forest (RF) and Gradient Boosted Regression Trees (GBRT). All these methods have the common ability to calculate a measure for the variable importance (wavelengths importance). The main advantage of GPR is its ability to also predict the variance of the target parameter. This makes it easy to see whether a prediction is reliable or not. The ability to choose from various covariance functions makes GPR a flexible method. This allows for including different assumptions or a priori knowledge about the data. For this study we use samples from three different locations to test the prediction accuracies. One
Estimation of Electrically-Evoked Knee Torque from Mechanomyography Using Support Vector Regression
Directory of Open Access Journals (Sweden)
Morufu Olusola Ibitoye
2016-07-01
Full Text Available The difficulty of real-time muscle force or joint torque estimation during neuromuscular electrical stimulation (NMES in physical therapy and exercise science has motivated recent research interest in torque estimation from other muscle characteristics. This study investigated the accuracy of a computational intelligence technique for estimating NMES-evoked knee extension torque based on the Mechanomyographic signals (MMG of contracting muscles that were recorded from eight healthy males. Simulation of the knee torque was modelled via Support Vector Regression (SVR due to its good generalization ability in related fields. Inputs to the proposed model were MMG amplitude characteristics, the level of electrical stimulation or contraction intensity, and knee angle. Gaussian kernel function, as well as its optimal parameters were identified with the best performance measure and were applied as the SVR kernel function to build an effective knee torque estimation model. To train and test the model, the data were partitioned into training (70% and testing (30% subsets, respectively. The SVR estimation accuracy, based on the coefficient of determination (R2 between the actual and the estimated torque values was up to 94% and 89% during the training and testing cases, with root mean square errors (RMSE of 9.48 and 12.95, respectively. The knee torque estimations obtained using SVR modelling agreed well with the experimental data from an isokinetic dynamometer. These findings support the realization of a closed-loop NMES system for functional tasks using MMG as the feedback signal source and an SVR algorithm for joint torque estimation.
A dynamic particle filter-support vector regression method for reliability prediction
International Nuclear Information System (INIS)
Wei, Zhao; Tao, Tao; ZhuoShu, Ding; Zio, Enrico
2013-01-01
Support vector regression (SVR) has been applied to time series prediction and some works have demonstrated the feasibility of its use to forecast system reliability. For accuracy of reliability forecasting, the selection of SVR's parameters is important. The existing research works on SVR's parameters selection divide the example dataset into training and test subsets, and tune the parameters on the training data. However, these fixed parameters can lead to poor prediction capabilities if the data of the test subset differ significantly from those of training. Differently, the novel method proposed in this paper uses particle filtering to estimate the SVR model parameters according to the whole measurement sequence up to the last observation instance. By treating the SVR training model as the observation equation of a particle filter, our method allows updating the SVR model parameters dynamically when a new observation comes. Because of the adaptability of the parameters to dynamic data pattern, the new PF–SVR method has superior prediction performance over that of standard SVR. Four application results show that PF–SVR is more robust than SVR to the decrease of the number of training data and the change of initial SVR parameter values. Also, even if there are trends in the test data different from those in the training data, the method can capture the changes, correct the SVR parameters and obtain good predictions. -- Highlights: •A dynamic PF–SVR method is proposed to predict the system reliability. •The method can adjust the SVR parameters according to the change of data. •The method is robust to the size of training data and initial parameter values. •Some cases based on both artificial and real data are studied. •PF–SVR shows superior prediction performance over standard SVR
Tian, Ye; Xu, Yue-Ping; Wang, Guoqing
2018-05-01
Drought can have a substantial impact on the ecosystem and agriculture of the affected region and does harm to local economy. This study aims to analyze the relation between soil moisture and drought and predict agricultural drought in Xiangjiang River basin. The agriculture droughts are presented with the Precipitation-Evapotranspiration Index (SPEI). The Support Vector Regression (SVR) model incorporating climate indices is developed to predict the agricultural droughts. Analysis of climate forcing including El Niño Southern Oscillation and western Pacific subtropical high (WPSH) are carried out to select climate indices. The results show that SPEI of six months time scales (SPEI-6) represents the soil moisture better than that of three and one month time scale on drought duration, severity and peaks. The key factor that influences the agriculture drought is the Ridge Point of WPSH, which mainly controls regional temperature. The SVR model incorporating climate indices, especially ridge point of WPSH, could improve the prediction accuracy compared to that solely using drought index by 4.4% in training and 5.1% in testing measured by Nash Sutcliffe efficiency coefficient (NSE) for three month lead time. The improvement is more significant for the prediction with one month lead (15.8% in training and 27.0% in testing) than that with three months lead time. However, it needs to be cautious in selection of the input parameters, since adding redundant information could have a counter effect in attaining a better prediction. Copyright © 2017 Elsevier B.V. All rights reserved.
International Nuclear Information System (INIS)
Nishizuka, N.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.; Sugiura, K.
2017-01-01
We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010–2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite . We detected active regions (ARs) from the full-disk magnetogram, from which ∼60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.
Nishizuka, N.; Sugiura, K.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.
2017-02-01
We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010-2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions (ARs) from the full-disk magnetogram, from which ˜60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.
Energy Technology Data Exchange (ETDEWEB)
Nishizuka, N.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M. [Applied Electromagnetic Research Institute, National Institute of Information and Communications Technology, 4-2-1, Nukui-Kitamachi, Koganei, Tokyo 184-8795 (Japan); Sugiura, K., E-mail: nishizuka.naoto@nict.go.jp [Advanced Speech Translation Research and Development Promotion Center, National Institute of Information and Communications Technology (Japan)
2017-02-01
We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010–2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite . We detected active regions (ARs) from the full-disk magnetogram, from which ∼60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.
River Flow Estimation from Upstream Flow Records Using Support Vector Machines
Directory of Open Access Journals (Sweden)
Halil Karahan
2014-01-01
Full Text Available A novel architecture for flood routing model has been proposed and its efficiency is validated on several problems by employing support vector machines. The architecture is designed by including the inputs and observed and calculated outflows from the previous time step output. Whole observed data have been used for determining the model parameters in the heuristic methods given in the literature, which constitutes the major disadvantage of the existing approaches. Moreover, using the whole data for training may lead to overtraining problem that causes overfitting of estimations and data. Therefore, in this study, 60–90% of the data are randomly selected for training and then the remaining data are used for validation. In order to take the effects of the measurement errors into consideration, the data are corrupted by some additive noise. The results show that the proposed architecture improves the model performance under noisy and missing data conditions and that support vector machines can be powerful alternative in flood routing modeling.
International Nuclear Information System (INIS)
Chiracharit, W.; Kumhom, P.; Chamnongthai, K.; Sun, Y.; Delp, E.J.; Babbs, C.F
2007-01-01
Automatic detection of normal mammograms, as a ''first look'' for breast cancer, is a new approach to computer-aided diagnosis. This approach may be limited, however, by two main causes. The first problem is the presence of poorly separable ''crossed-distributions'' in which the correct classification depends upon the value of each feature. The second problem is overlap of the feature distributions that are extracted from digitized mammograms of normal and abnormal patients. Here we introduce a new Support Vector Machine (SVM) based method utilizing with the proposed uncrossing mapping and Local Probability Difference (LPD). Crossed-distribution feature pairs are identified and mapped into a new features that can be separated by a zero-hyperplane of the new axis. The probability density functions of the features of normal and abnormal mammograms are then sampled and the local probability difference functions are estimated to enhance the features. From 1,000 ground-truth-known mammograms, 250 normal and 250 abnormal cases, including spiculated lesions, circumscribed masses or microcalcifications, are used for training a support vector machine. The classification results tested with another 250 normal and 250 abnormal sets show improved testing performances with 90% sensitivity and 89% specificity. (author)
Prediction of water quality index in constructed wetlands using support vector machine.
Mohammadpour, Reza; Shaharuddin, Syafiq; Chang, Chun Kiat; Zakaria, Nor Azazi; Ab Ghani, Aminuddin; Chan, Ngai Weng
2015-04-01
Poor water quality is a serious problem in the world which threatens human health, ecosystems, and plant/animal life. Prediction of surface water quality is a main concern in water resource and environmental systems. In this research, the support vector machine and two methods of artificial neural networks (ANNs), namely feed forward back propagation (FFBP) and radial basis function (RBF), were used to predict the water quality index (WQI) in a free constructed wetland. Seventeen points of the wetland were monitored twice a month over a period of 14 months, and an extensive dataset was collected for 11 water quality variables. A detailed comparison of the overall performance showed that prediction of the support vector machine (SVM) model with coefficient of correlation (R(2)) = 0.9984 and mean absolute error (MAE) = 0.0052 was either better or comparable with neural networks. This research highlights that the SVM and FFBP can be successfully employed for the prediction of water quality in a free surface constructed wetland environment. These methods simplify the calculation of the WQI and reduce substantial efforts and time by optimizing the computations.
Vision-Based Perception and Classification of Mosquitoes Using Support Vector Machine
Directory of Open Access Journals (Sweden)
Masataka Fuchida
2017-01-01
Full Text Available The need for a novel automated mosquito perception and classification method is becoming increasingly essential in recent years, with steeply increasing number of mosquito-borne diseases and associated casualties. There exist remote sensing and GIS-based methods for mapping potential mosquito inhabitants and locations that are prone to mosquito-borne diseases, but these methods generally do not account for species-wise identification of mosquitoes in closed-perimeter regions. Traditional methods for mosquito classification involve highly manual processes requiring tedious sample collection and supervised laboratory analysis. In this research work, we present the design and experimental validation of an automated vision-based mosquito classification module that can deploy in closed-perimeter mosquito inhabitants. The module is capable of identifying mosquitoes from other bugs such as bees and flies by extracting the morphological features, followed by support vector machine-based classification. In addition, this paper presents the results of three variants of support vector machine classifier in the context of mosquito classification problem. This vision-based approach to the mosquito classification problem presents an efficient alternative to the conventional methods for mosquito surveillance, mapping and sample image collection. Experimental results involving classification between mosquitoes and a predefined set of other bugs using multiple classification strategies demonstrate the efficacy and validity of the proposed approach with a maximum recall of 98%.
Beam Structure Damage Identification Based on BP Neural Network and Support Vector Machine
Directory of Open Access Journals (Sweden)
Bo Yan
2014-01-01
Full Text Available It is not easy to find marine cracks of structures by directly manual testing. When the cracks of important components are extended under extreme offshore environment, the whole structure would lose efficacy, endanger the staff’s safety, and course a significant economic loss and marine environment pollution. Thus, early discovery of structure cracks is very important. In this paper, a beam structure damage identification model based on intelligent algorithm is firstly proposed to identify partial cracks in supported beams on ocean platform. In order to obtain the replacement mode and strain mode of the beams, the paper takes simple supported beam with single crack and double cracks as an example. The results show that the difference curves of strain mode change drastically only on the injured part and different degrees of injury would result in different mutation degrees of difference curve more or less. While the model based on support vector machine (SVM and BP neural network can identify cracks of supported beam intelligently, the methods can discern injured degrees of sound condition, single crack, and double cracks. Furthermore, the two methods are compared. The results show that the two methods presented in the paper have a preferable identification precision and adaptation. And damage identification based on support vector machine (SVM has smaller error results.
Pirooznia, Mehdi; Deng, Youping
2006-01-01
Motivation Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. Results The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1–BRCA2 samples with RBF kernel of SVM. Conclusion We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at . PMID:17217518
Pirooznia, Mehdi; Deng, Youping
2006-12-12
Graphical user interface (GUI) software promotes novelty by allowing users to extend the functionality. SVM Classifier is a cross-platform graphical application that handles very large datasets well. The purpose of this study is to create a GUI application that allows SVM users to perform SVM training, classification and prediction. The GUI provides user-friendly access to state-of-the-art SVM methods embodied in the LIBSVM implementation of Support Vector Machine. We implemented the java interface using standard swing libraries. We used a sample data from a breast cancer study for testing classification accuracy. We achieved 100% accuracy in classification among the BRCA1-BRCA2 samples with RBF kernel of SVM. We have developed a java GUI application that allows SVM users to perform SVM training, classification and prediction. We have demonstrated that support vector machines can accurately classify genes into functional categories based upon expression data from DNA microarray hybridization experiments. Among the different kernel functions that we examined, the SVM that uses a radial basis kernel function provides the best performance. The SVM Classifier is available at http://mfgn.usm.edu/ebl/svm/.
Classification of EEG Signals Using a Multiple Kernel Learning Support Vector Machine
Directory of Open Access Journals (Sweden)
Xiaoou Li
2014-07-01
Full Text Available In this study, a multiple kernel learning support vector machine algorithm is proposed for the identification of EEG signals including mental and cognitive tasks, which is a key component in EEG-based brain computer interface (BCI systems. The presented BCI approach included three stages: (1 a pre-processing step was performed to improve the general signal quality of the EEG; (2 the features were chosen, including wavelet packet entropy and Granger causality, respectively; (3 a multiple kernel learning support vector machine (MKL-SVM based on a gradient descent optimization algorithm was investigated to classify EEG signals, in which the kernel was defined as a linear combination of polynomial kernels and radial basis function kernels. Experimental results showed that the proposed method provided better classification performance compared with the SVM based on a single kernel. For mental tasks, the average accuracies for 2-class, 3-class, 4-class, and 5-class classifications were 99.20%, 81.25%, 76.76%, and 75.25% respectively. Comparing stroke patients with healthy controls using the proposed algorithm, we achieved the average classification accuracies of 89.24% and 80.33% for 0-back and 1-back tasks respectively. Our results indicate that the proposed approach is promising for implementing human-computer interaction (HCI, especially for mental task classification and identifying suitable brain impairment candidates.
Wen, Congcong; Wang, Zhiyi; Zhang, Meiling; Wang, Shuanghu; Geng, Peiwu; Sun, Fa; Chen, Mengchun; Lin, Guanyang; Hu, Lufeng; Ma, Jianshe; Wang, Xianqin
2016-01-01
Paraquat is quick-acting and non-selective, killing green plant tissue on contact; it is also toxic to human beings and animals. In this study, we developed a urine metabonomic method by gas chromatography-mass spectrometry to evaluate the effect of acute paraquat poisoning on rats. Pattern recognition analysis, including both partial least squares discriminate analysis and principal component analysis revealed that acute paraquat poisoning induced metabolic perturbations. Compared with the control group, the levels of benzeneacetic acid and hexadecanoic acid of the acute paraquat poisoning group (intragastric administration 36 mg/kg) increased, while the levels of butanedioic acid, pentanedioic acid, altronic acid decreased. Based on these urinary metabolomics data, support vector machine was applied to discriminate the metabolomic change of paraquat groups from the control group, which achieved 100% classification accuracy. In conclusion, metabonomic method combined with support vector machine can be used as a useful diagnostic tool in paraquat-poisoned rats. Copyright © 2015 John Wiley & Sons, Ltd.
Directory of Open Access Journals (Sweden)
Alim Samat
2016-03-01
Full Text Available In order to deal with scenarios where the training data, used to deduce a model, and the validation data have different statistical distributions, we study the problem of transformed subspace feature transfer for domain adaptation (DA in the context of hyperspectral image classification via a geodesic Gaussian flow kernel based support vector machine (GFKSVM. To show the superior performance of the proposed approach, conventional support vector machines (SVMs and state-of-the-art DA algorithms, including information-theoretical learning of discriminative cluster for domain adaptation (ITLDC, joint distribution adaptation (JDA, and joint transfer matching (JTM, are also considered. Additionally, unsupervised linear and nonlinear subspace feature transfer techniques including principal component analysis (PCA, randomized nonlinear principal component analysis (rPCA, factor analysis (FA and non-negative matrix factorization (NNMF are investigated and compared. Experiments on two real hyperspectral images show the cross-image classification performances of the GFKSVM, confirming its effectiveness and suitability when applied to hyperspectral images.
Feature Import Vector Machine: A General Classifier with Flexible Feature Selection.
Ghosh, Samiran; Wang, Yazhen
2015-02-01
The support vector machine (SVM) and other reproducing kernel Hilbert space (RKHS) based classifier systems are drawing much attention recently due to its robustness and generalization capability. General theme here is to construct classifiers based on the training data in a high dimensional space by using all available dimensions. The SVM achieves huge data compression by selecting only few observations which lie close to the boundary of the classifier function. However when the number of observations are not very large (small n ) but the number of dimensions/features are large (large p ), then it is not necessary that all available features are of equal importance in the classification context. Possible selection of an useful fraction of the available features may result in huge data compression. In this paper we propose an algorithmic approach by means of which such an optimal set of features could be selected. In short, we reverse the traditional sequential observation selection strategy of SVM to that of sequential feature selection. To achieve this we have modified the solution proposed by Zhu and Hastie (2005) in the context of import vector machine (IVM), to select an optimal sub-dimensional model to build the final classifier with sufficient accuracy.
Directory of Open Access Journals (Sweden)
Elizabeth M Sweeney
Full Text Available Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS lesion segmentation in structural magnetic resonance imaging (MRI. We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w, T2-weighted (T2-w and fluid-attenuated inversion recovery (FLAIR MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance.
Sweeney, Elizabeth M.; Vogelstein, Joshua T.; Cuzzocreo, Jennifer L.; Calabresi, Peter A.; Reich, Daniel S.; Crainiceanu, Ciprian M.; Shinohara, Russell T.
2014-01-01
Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance. PMID:24781953
Directory of Open Access Journals (Sweden)
Bao Wang
2012-11-01
Full Text Available The accuracy of annual electric load forecasting plays an important role in the economic and social benefits of electric power systems. The least squares support vector machine (LSSVM has been proven to offer strong potential in forecasting issues, particularly by employing an appropriate meta-heuristic algorithm to determine the values of its two parameters. However, these meta-heuristic algorithms have the drawbacks of being hard to understand and reaching the global optimal solution slowly. As a novel meta-heuristic and evolutionary algorithm, the fruit fly optimization algorithm (FOA has the advantages of being easy to understand and fast convergence to the global optimal solution. Therefore, to improve the forecasting performance, this paper proposes a LSSVM-based annual electric load forecasting model that uses FOA to automatically determine the appropriate values of the two parameters for the LSSVM model. By taking the annual electricity consumption of China as an instance, the computational result shows that the LSSVM combined with FOA (LSSVM-FOA outperforms other alternative methods, namely single LSSVM, LSSVM combined with coupled simulated annealing algorithm (LSSVM-CSA, generalized regression neural network (GRNN and regression model.
International Nuclear Information System (INIS)
Wang Hui; Huang Gang
2010-01-01
Objective: To evaluate the early diagnostic value of tumor markers for gastric cancer using support vector machine (SVM) model. Methods: Subjects involved in the study consisted of 262 cases with gastric cancer, 156 cases with benign gastric diseases and 149 healthy controls. From those subjects, five tumor markers, carcinoembryonic antigen (CEA), carbohydrate (CA) 125, CA19-9, alphafetoprotein (AFP) and CA50, were assayed and collected to make the datasets. To modify SVM model to fit the diagnostic classifiers, radial basis function was adopted and kernel function was optimized and validated by grid search and cross validation. For comparative study, methods of combination tests of five markers, Logistic regression, and decision tree were also used. Results: For gastric cancer, the diagnostic accuracy of the combination tests, Logistic regression, decision tree and SVM model were 46.2%, 64.5%, 63.9% and 95.1% respectively. SVM model significantly elevated the diagnostic value comparing with other three methods. Conclusion: The application of SVM model is of high value in enhancing the tumor marker for the diagnosis of gastric cancer. (authors)
Developing a support vector machine based QSPR model for prediction of half-life of some herbicides.
Samghani, Kobra; HosseinFatemi, Mohammad
2016-07-01
The half-life (t1/2) of 58 herbicides were modeled by quantitative structure-property relationship (QSPR) based molecular structure descriptors. After calculation and the screening of a large number of molecular descriptors, the most relevant those ones selected by stepwise multiple linear regression were used for developing linear and nonlinear models which developed by using multiple linear regression and support vector machine, respectively. Comparison between statistical parameters of linear and nonlinear models indicates the suitability of SVM over MLR model for predicting the half-life of herbicides. The statistical parameters of R(2) and standard error for training set of SVM model were; 0.96 and 0.087, respectively, and were 0.93 and 0.092 for the test set. The SVM model was evaluated by leave one out cross validation test, which its result indicates the robustness and predictability of the model. The established SVM model was used for predicting the half-life of other herbicides that are located in the applicability domain of model that were determined via leverage approach. The results of this study indicate that the relationship among selected molecular descriptors and herbicide's half-life is non-linear. These results emphases that the process of degradation of herbicides in the environment is very complex and can be affected by various environmental and structural features, therefore simple linear model cannot be able to successfully predict it. Copyright © 2016. Published by Elsevier Inc.
Wang, Xiaolei
2014-12-12
Background: A quantitative understanding of interactions between transcription factors (TFs) and their DNA binding sites is key to the rational design of gene regulatory networks. Recent advances in high-throughput technologies have enabled high-resolution measurements of protein-DNA binding affinity. Importantly, such experiments revealed the complex nature of TF-DNA interactions, whereby the effects of nucleotide changes on the binding affinity were observed to be context dependent. A systematic method to give high-quality estimates of such complex affinity landscapes is, thus, essential to the control of gene expression and the advance of synthetic biology. Results: Here, we propose a two-round prediction method that is based on support vector regression (SVR) with weighted degree (WD) kernels. In the first round, a WD kernel with shifts and mismatches is used with SVR to detect the importance of subsequences with different lengths at different positions. The subsequences identified as important in the first round are then fed into a second WD kernel to fit the experimentally measured affinities. To our knowledge, this is the first attempt to increase the accuracy of the affinity prediction by applying two rounds of string kernels and by identifying a small number of crucial k-mers. The proposed method was tested by predicting the binding affinity landscape of Gcn4p in Saccharomyces cerevisiae using datasets from HiTS-FLIP. Our method explicitly identified important subsequences and showed significant performance improvements when compared with other state-of-the-art methods. Based on the identified important subsequences, we discovered two surprisingly stable 10-mers and one sensitive 10-mer which were not reported before. Further test on four other TFs in S. cerevisiae demonstrated the generality of our method. Conclusion: We proposed in this paper a two-round method to quantitatively model the DNA binding affinity landscape. Since the ability to modify
A Support Vector Machine Classification of Thyroid Bioptic Specimens Using MALDI-MSI Data
Directory of Open Access Journals (Sweden)
Manuel Galli
2016-01-01
Full Text Available Biomarkers able to characterise and predict multifactorial diseases are still one of the most important targets for all the “omics” investigations. In this context, Matrix-Assisted Laser Desorption/Ionisation-Mass Spectrometry Imaging (MALDI-MSI has gained considerable attention in recent years, but it also led to a huge amount of complex data to be elaborated and interpreted. For this reason, computational and machine learning procedures for biomarker discovery are important tools to consider, both to reduce data dimension and to provide predictive markers for specific diseases. For instance, the availability of protein and genetic markers to support thyroid lesion diagnoses would impact deeply on society due to the high presence of undetermined reports (THY3 that are generally treated as malignant patients. In this paper we show how an accurate classification of thyroid bioptic specimens can be obtained through the application of a state-of-the-art machine learning approach (i.e., Support Vector Machines on MALDI-MSI data, together with a particular wrapper feature selection algorithm (i.e., recursive feature elimination. The model is able to provide an accurate discriminatory capability using only 20 out of 144 features, resulting in an increase of the model performances, reliability, and computational efficiency. Finally, tissue areas rather than average proteomic profiles are classified, highlighting potential discriminating areas of clinical interest.
Application of support vector machines to breast cancer screening using mammogram and history data
Land, Walker H., Jr.; Akanda, Anab; Lo, Joseph Y.; Anderson, Francis; Bryden, Margaret
2002-05-01
Support Vector Machines (SVMs) are a new and radically different type of classifiers and learning machines that use a hypothesis space of linear functions in a high dimensional feature space. This relatively new paradigm, based on Statistical Learning Theory (SLT) and Structural Risk Minimization (SRM), has many advantages when compared to traditional neural networks, which are based on Empirical Risk Minimization (ERM). Unlike neural networks, SVM training always finds a global minimum. Furthermore, SVMs have inherent ability to solve pattern classification without incorporating any problem-domain knowledge. In this study, the SVM was employed as a pattern classifier, operating on mammography data used for breast cancer detection. The main focus was to formulate the best learning machine configurations for optimum specificity and positive predictive value at very high sensitivities. Using a mammogram database of 500 biopsy-proven samples, the best performing SVM, on average, was able to achieve (under statistical 5-fold cross-validation) a specificity of 45.0% and a positive predictive value (PPV) of 50.1% at 100% sensitivity. At 97% sensitivity, a specificity of 55.8% and a PPV of 55.2% were obtained.
Tyrosine Kinase Ligand-Receptor Pair Prediction by Using Support Vector Machine
Directory of Open Access Journals (Sweden)
Masayuki Yarimizu
2015-01-01
Full Text Available Receptor tyrosine kinases are essential proteins involved in cellular differentiation and proliferation in vivo and are heavily involved in allergic diseases, diabetes, and onset/proliferation of cancerous cells. Identifying the interacting partner of this protein, a growth factor ligand, will provide a deeper understanding of cellular proliferation/differentiation and other cell processes. In this study, we developed a method for predicting tyrosine kinase ligand-receptor pairs from their amino acid sequences. We collected tyrosine kinase ligand-receptor pairs from the Database of Interacting Proteins (DIP and UniProtKB, filtered them by removing sequence redundancy, and used them as a dataset for machine learning and assessment of predictive performance. Our prediction method is based on support vector machines (SVMs, and we evaluated several input features suitable for tyrosine kinase for machine learning and compared and analyzed the results. Using sequence pattern information and domain information extracted from sequences as input features, we obtained 0.996 of the area under the receiver operating characteristic curve. This accuracy is higher than that obtained from general protein-protein interaction pair predictions.
Carbon Nanotube Growth Rate Regression using Support Vector Machines and Artificial Neural Networks
2014-03-27
from [18]. ©1999 Elsevier 12 2.6 CNT resistance versus cyclohexane vapor pressure showing the ability of SWNTs to detect certain gases. Reproduced from...1 output neuron. (b) A zoomed in view of one individual neuron. The inputs are multiplied by the weights and this product is summed before reaching...of applications across science and technology [12, 23]. CNTs are the only nanostructured carbon material to reach large scale production to include
Xiong, Tao; Bao, Yukun; Hu, Zhongyi
2014-01-01
Highly accurate interval forecasting of a stock price index is fundamental to successfully making a profit when making investment decisions, by providing a range of values rather than a point estimate. In this study, we investigate the possibility of forecasting an interval-valued stock price index series over short and long horizons using multi-output support vector regression (MSVR). Furthermore, this study proposes a firefly algorithm (FA)-based approach, built on the established MSVR, for...
Directory of Open Access Journals (Sweden)
Ravindra Kumar
2017-09-01
Full Text Available Background The endoplasmic reticulum plays an important role in many cellular processes, which includes protein synthesis, folding and post-translational processing of newly synthesized proteins. It is also the site for quality control of misfolded proteins and entry point of extracellular proteins to the secretory pathway. Hence at any given point of time, endoplasmic reticulum contains two different cohorts of proteins, (i proteins involved in endoplasmic reticulum-specific function, which reside in the lumen of the endoplasmic reticulum, called as endoplasmic reticulum resident proteins and (ii proteins which are in process of moving to the extracellular space. Thus, endoplasmic reticulum resident proteins must somehow be distinguished from newly synthesized secretory proteins, which pass through the endoplasmic reticulum on their way out of the cell. Approximately only 50% of the proteins used in this study as training data had endoplasmic reticulum retention signal, which shows that these signals are not essentially present in all endoplasmic reticulum resident proteins. This also strongly indicates the role of additional factors in retention of endoplasmic reticulum-specific proteins inside the endoplasmic reticulum. Methods This is a support vector machine based method, where we had used different forms of protein features as inputs for support vector machine to develop the prediction models. During training leave-one-out approach of cross-validation was used. Maximum performance was obtained with a combination of amino acid compositions of different part of proteins. Results In this study, we have reported a novel support vector machine based method for predicting endoplasmic reticulum resident proteins, named as ERPred. During training we achieved a maximum accuracy of 81.42% with leave-one-out approach of cross-validation. When evaluated on independent dataset, ERPred did prediction with sensitivity of 72.31% and specificity of 83
Tang, Buzhou; Cao, Hongxin; Wu, Yonghui; Jiang, Min; Xu, Hua
2013-01-01
Named entity recognition (NER) is an important task in clinical natural language processing (NLP) research. Machine learning (ML) based NER methods have shown good performance in recognizing entities in clinical text. Algorithms and features are two important factors that largely affect the performance of ML-based NER systems. Conditional Random Fields (CRFs), a sequential labelling algorithm, and Support Vector Machines (SVMs), which is based on large margin theory, are two typical machine learning algorithms that have been widely applied to clinical NER tasks. For features, syntactic and semantic information of context words has often been used in clinical NER systems. However, Structural Support Vector Machines (SSVMs), an algorithm that combines the advantages of both CRFs and SVMs, and word representation features, which contain word-level back-off information over large unlabelled corpus by unsupervised algorithms, have not been extensively investigated for clinical text processing. Therefore, the primary goal of this study is to evaluate the use of SSVMs and word representation features in clinical NER tasks. In this study, we developed SSVMs-based NER systems to recognize clinical entities in hospital discharge summaries, using the data set from the concept extration task in the 2010 i2b2 NLP challenge. We compared the performance of CRFs and SSVMs-based NER classifiers with the same feature sets. Furthermore, we extracted two different types of word representation features (clustering-based representation features and distributional representation features) and integrated them with the SSVMs-based clinical NER system. We then reported the performance of SSVM-based NER systems with different types of word representation features. Using the same training (N = 27,837) and test (N = 45,009) sets in the challenge, our evaluation showed that the SSVMs-based NER systems achieved better performance than the CRFs-based systems for clinical entity recognition, when
Illuminance Flow Estimation by Regression
Karlsson, S.M.; Pont, S.C.; Koenderink, J.J.; Zisserman, A.
2010-01-01
We investigate the estimation of illuminance flow using Histograms of Oriented Gradient features (HOGs). In a regression setting, we found for both ridge regression and support vector machines, that the optimal solution shows close resemblance to the gradient based structure tensor (also known as
Qiu, Shibin; Lane, Terran
2009-01-01
The cell defense mechanism of RNA interference has applications in gene function analysis and promising potentials in human disease therapy. To effectively silence a target gene, it is desirable to select appropriate initiator siRNA molecules having satisfactory silencing capabilities. Computational prediction for silencing efficacy of siRNAs can assist this screening process before using them in biological experiments. String kernel functions, which operate directly on the string objects representing siRNAs and target mRNAs, have been applied to support vector regression for the prediction and improved accuracy over numerical kernels in multidimensional vector spaces constructed from descriptors of siRNA design rules. To fully utilize information provided by string and numerical data, we propose to unify the two in a kernel feature space by devising a multiple kernel regression framework where a linear combination of the kernels is used. We formulate the multiple kernel learning into a quadratically constrained quadratic programming (QCQP) problem, which although yields global optimal solution, is computationally demanding and requires a commercial solver package. We further propose three heuristics based on the principle of kernel-target alignment and predictive accuracy. Empirical results demonstrate that multiple kernel regression can improve accuracy, decrease model complexity by reducing the number of support vectors, and speed up computational performance dramatically. In addition, multiple kernel regression evaluates the importance of constituent kernels, which for the siRNA efficacy prediction problem, compares the relative significance of the design rules. Finally, we give insights into the multiple kernel regression mechanism and point out possible extensions.
A Vector AutoRegressive (VAR) Approach to the Credit Channel for ...
African Journals Online (AJOL)
This paper is an attempt to determine the presence and empirical significance of monetary policy and the bank lending view of the credit channel for Mauritius, which is particularly relevant at these times. A vector autoregressive (VAR) model of order three is used to examine the monetary transmission mechanism using ...
SOLAR FLARE PREDICTION USING SDO/HMI VECTOR MAGNETIC FIELD DATA WITH A MACHINE-LEARNING ALGORITHM
International Nuclear Information System (INIS)
Bobra, M. G.; Couvidat, S.
2015-01-01
We attempt to forecast M- and X-class solar flares using a machine-learning algorithm, called support vector machine (SVM), and four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager, the first instrument to continuously map the full-disk photospheric vector magnetic field from space. Most flare forecasting efforts described in the literature use either line-of-sight magnetograms or a relatively small number of ground-based vector magnetograms. This is the first time a large data set of vector magnetograms has been used to forecast solar flares. We build a catalog of flaring and non-flaring active regions sampled from a database of 2071 active regions, comprised of 1.5 million active region patches of vector magnetic field data, and characterize each active region by 25 parameters. We then train and test the machine-learning algorithm and we estimate its performances using forecast verification metrics with an emphasis on the true skill statistic (TSS). We obtain relatively high TSS scores and overall predictive abilities. We surmise that this is partly due to fine-tuning the SVM for this purpose and also to an advantageous set of features that can only be calculated from vector magnetic field data. We also apply a feature selection algorithm to determine which of our 25 features are useful for discriminating between flaring and non-flaring active regions and conclude that only a handful are needed for good predictive abilities
Hybrid Radar Emitter Recognition Based on Rough k-Means Classifier and Relevance Vector Machine
Yang, Zhutian; Wu, Zhilu; Yin, Zhendong; Quan, Taifan; Sun, Hongjian
2013-01-01
Due to the increasing complexity of electromagnetic signals, there exists a significant challenge for recognizing radar emitter signals. In this paper, a hybrid recognition approach is presented that classifies radar emitter signals by exploiting the different separability of samples. The proposed approach comprises two steps, namely the primary signal recognition and the advanced signal recognition. In the former step, a novel rough k-means classifier, which comprises three regions, i.e., certain area, rough area and uncertain area, is proposed to cluster the samples of radar emitter signals. In the latter step, the samples within the rough boundary are used to train the relevance vector machine (RVM). Then RVM is used to recognize the samples in the uncertain area; therefore, the classification accuracy is improved. Simulation results show that, for recognizing radar emitter signals, the proposed hybrid recognition approach is more accurate, and presents lower computational complexity than traditional approaches. PMID:23344380
Facial Expression Recognition using Multiclass Ensemble Least-Square Support Vector Machine
Lawi, Armin; Sya'Rani Machrizzandi, M.
2018-03-01
Facial expression is one of behavior characteristics of human-being. The use of biometrics technology system with facial expression characteristics makes it possible to recognize a person’s mood or emotion. The basic components of facial expression analysis system are face detection, face image extraction, facial classification and facial expressions recognition. This paper uses Principal Component Analysis (PCA) algorithm to extract facial features with expression parameters, i.e., happy, sad, neutral, angry, fear, and disgusted. Then Multiclass Ensemble Least-Squares Support Vector Machine (MELS-SVM) is used for the classification process of facial expression. The result of MELS-SVM model obtained from our 185 different expression images of 10 persons showed high accuracy level of 99.998% using RBF kernel.
Directory of Open Access Journals (Sweden)
Kai Hu
Full Text Available Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of the geographic information discovery keeps improving., There are however, still two problems in Web Map Service (WMS discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery.
Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback
Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi
2016-01-01
Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of the geographic information discovery keeps improving., There are however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery. PMID:27861505
Fault Detection and Diagnosis in Process Data Using Support Vector Machines
Directory of Open Access Journals (Sweden)
Fang Wu
2014-01-01
Full Text Available For the complex industrial process, it has become increasingly challenging to effectively diagnose complicated faults. In this paper, a combined measure of the original Support Vector Machine (SVM and Principal Component Analysis (PCA is provided to carry out the fault classification, and compare its result with what is based on SVM-RFE (Recursive Feature Elimination method. RFE is used for feature extraction, and PCA is utilized to project the original data onto a lower dimensional space. PCA T2, SPE statistics, and original SVM are proposed to detect the faults. Some common faults of the Tennessee Eastman Process (TEP are analyzed in terms of the practical system and reflections of the dataset. PCA-SVM and SVM-RFE can effectively detect and diagnose these common faults. In RFE algorithm, all variables are decreasingly ordered according to their contributions. The classification accuracy rate is improved by choosing a reasonable number of features.
Modulation transfer function (MTF) measurement method based on support vector machine (SVM)
Zhang, Zheng; Chen, Yueting; Feng, Huajun; Xu, Zhihai; Li, Qi
2016-03-01
An imaging system's spatial quality can be expressed by the system's modulation spread function (MTF) as a function of spatial frequency in terms of the linear response theory. Methods have been proposed to assess the MTF of an imaging system using point, slit or edge techniques. The edge method is widely used for the low requirement of targets. However, the traditional edge methods are limited by the edge angle. Besides, image noise will impair the measurement accuracy, making the measurement result unstable. In this paper, a novel measurement method based on the support vector machine (SVM) is proposed. Image patches with different edge angles and MTF levels are generated as the training set. Parameters related with MTF and image structure are extracted from the edge images. Trained with image parameters and the corresponding MTF, the SVM classifier can assess the MTF of any edge image. The result shows that the proposed method has an excellent performance on measuring accuracy and stability.
Biometric gait recognition for mobile devices using wavelet transform and support vector machines
DEFF Research Database (Denmark)
Hestbek, Martin Reese; Nickel, C.; Busch, C.
2012-01-01
The ever growing number of mobile devices has turned the attention to security and usability. If a mobile device is lost or stolen this can lead to loss of personal information and the possibility of identity theft. People often tend not to use passwords which leads to lack of personal security...... mainly due to convenience and frequent use. This paper suggests to serve both convenience and security needs at the same time. Thus we suggest to observe the user's gait characteristic. Our approach realizes user authentication by applying the discrete wavelet transform (DWT) to acceleration signals...... obtained from mobile devices. Gait templates were constructed of Bark-frequency cepstral coefficients (BFCC) from the wavelet coefficients and these were arranged to train a support vector machine (SVM). A cross-day scenario demonstrates that the proposed approach shows competitive recognition performance...
Fachrurrozi, Muhammad; Saparudin; Erwin
2017-04-01
Real-time Monitoring and early detection system which measures the quality standard of waste in Musi River, Palembang, Indonesia is a system for determining air and water pollution level. This system was designed in order to create an integrated monitoring system and provide real time information that can be read. It is designed to measure acidity and water turbidity polluted by industrial waste, as well as to show and provide conditional data integrated in one system. This system consists of inputting and processing the data, and giving output based on processed data. Turbidity, substances, and pH sensor is used as a detector that produce analog electrical direct current voltage (DC). Early detection system works by determining the value of the ammonia threshold, acidity, and turbidity level of water in Musi River. The results is then presented based on the level group pollution by the Support Vector Machine classification method.
Bearing Degradation Process Prediction Based on the Support Vector Machine and Markov Model
Directory of Open Access Journals (Sweden)
Shaojiang Dong
2014-01-01
Full Text Available Predicting the degradation process of bearings before they reach the failure threshold is extremely important in industry. This paper proposed a novel method based on the support vector machine (SVM and the Markov model to achieve this goal. Firstly, the features are extracted by time and time-frequency domain methods. However, the extracted original features are still with high dimensional and include superfluous information, and the nonlinear multifeatures fusion technique LTSA is used to merge the features and reduces the dimension. Then, based on the extracted features, the SVM model is used to predict the bearings degradation process, and the CAO method is used to determine the embedding dimension of the SVM model. After the bearing degradation process is predicted by SVM model, the Markov model is used to improve the prediction accuracy. The proposed method was validated by two bearing run-to-failure experiments, and the results proved the effectiveness of the methodology.
Application of the Support Vector Machine to Predict Subclinical Mastitis in Dairy Cattle
Directory of Open Access Journals (Sweden)
Nazira Mammadova
2013-01-01
Full Text Available This study presented a potentially useful alternative approach to ascertain the presence of subclinical and clinical mastitis in dairy cows using support vector machine (SVM techniques. The proposed method detected mastitis in a cross-sectional representative sample of Holstein dairy cattle milked using an automatic milking system. The study used such suspected indicators of mastitis as lactation rank, milk yield, electrical conductivity, average milking duration, and control season as input data. The output variable was somatic cell counts obtained from milk samples collected monthly throughout the 15 months of the control period. Cattle were judged to be healthy or infected based on those somatic cell counts. This study undertook a detailed scrutiny of the SVM methodology, constructing and examining a model which showed 89% sensitivity, 92% specificity, and 50% error in mastitis detection.
Directory of Open Access Journals (Sweden)
Xing Yan
2015-01-01
Full Text Available Currently, there are many techniques available for short-term forecasting of the electricity market clearing price (MCP, but very little work has been done in the area of midterm forecasting of the electricity MCP. The midterm forecasting of the electricity MCP is essential for maintenance scheduling, planning, bilateral contracting, resources reallocation, and budgeting. A two-stage multiple support vector machine (SVM based midterm forecasting model of the electricity MCP is proposed in this paper. The first stage is utilized to separate the input data into corresponding price zones by using a single SVM. Then, the second stage is applied utilizing four parallel designed SVMs to forecast the electricity price in four different price zones. Compared to the forecasting model using a single SVM, the proposed model showed improved forecasting accuracy in both peak prices and overall system. PJM interconnection data are used to test the proposed model.
Support Vector Machines for decision support in electricity markets׳ strategic bidding
DEFF Research Database (Denmark)
Pinto, Tiago; Sousa, Tiago M.; Praça, Isabel
2015-01-01
. The ALBidS system allows MASCEM market negotiating players to take the best possible advantages from the market context. This paper presents the application of a Support Vector Machines (SVM) based approach to provide decision support to electricity market players. This strategy is tested and validated...... by being included in ALBidS and then compared with the application of an Artificial Neural Network (ANN), originating promising results: an effective electricity market price forecast in a fast execution time. The proposed approach is tested and validated using real electricity markets data from MIBEL......׳ research group has developed a multi-agent system: Multi-Agent System for Competitive Electricity Markets (MASCEM), which simulates the electricity markets environment. MASCEM is integrated with Adaptive Learning Strategic Bidding System (ALBidS) that works as a decision support system for market players...
Support vector machines and evolutionary algorithms for classification single or together?
Stoean, Catalin
2014-01-01
When discussing classification, support vector machines are known to be a capable and efficient technique to learn and predict with high accuracy within a quick time frame. Yet, their black box means to do so make the practical users quite circumspect about relying on it, without much understanding of the how and why of its predictions. The question raised in this book is how can this ‘masked hero’ be made more comprehensible and friendly to the public: provide a surrogate model for its hidden optimization engine, replace the method completely or appoint a more friendly approach to tag along and offer the much desired explanations? Evolutionary algorithms can do all these and this book presents such possibilities of achieving high accuracy, comprehensibility, reasonable runtime as well as unconstrained performance.
Kareim, Ameer A.; Mansor, Muhamad Bin
2013-06-01
The aim of this paper is to improve efficiency of maximum power point tracking (MPPT) for PV systems. The Support Vector Machine (SVM) was proposed to achieve the MPPT controller. The theoretical, the perturbation and observation (P&O), and incremental conductance (IC) algorithms were used to compare with proposed SVM algorithm. MATLAB models for PV module, theoretical, SVM, P&O, and IC algorithms are implemented. The improved MPPT uses the SVM method to predict the optimum voltage of the PV system in order to extract the maximum power point (MPP). The SVM technique used two inputs which are solar radiation and ambient temperature of the modeled PV module. The results show that the proposed SVM technique has less Root Mean Square Error (RMSE) and higher efficiency than P&O and IC methods.
Hu, Kai; Gui, Zhipeng; Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi
2016-01-01
Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of the geographic information discovery keeps improving., There are however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery.
Towards human behavior recognition based on spatio temporal features and support vector machines
Ghabri, Sawsen; Ouarda, Wael; Alimi, Adel M.
2017-03-01
Security and surveillance are vital issues in today's world. The recent acts of terrorism have highlighted the urgent need for efficient surveillance. There is indeed a need for an automated system for video surveillance which can detect identity and activity of person. In this article, we propose a new paradigm to recognize an aggressive human behavior such as boxing action. Our proposed system for human activity detection includes the use of a fusion between Spatio Temporal Interest Point (STIP) and Histogram of Oriented Gradient (HoG) features. The novel feature called Spatio Temporal Histogram Oriented Gradient (STHOG). To evaluate the robustness of our proposed paradigm with a local application of HoG technique on STIP points, we made experiments on KTH human action dataset based on Multi Class Support Vector Machines classification. The proposed scheme outperforms basic descriptors like HoG and STIP to achieve 82.26% us an accuracy value of classification rate.
Financial Distress Prediction using Linear Discriminant Analysis and Support Vector Machine
Santoso, Noviyanti; Wibowo, Wahyu
2018-03-01
A financial difficulty is the early stages before the bankruptcy. Bankruptcies caused by the financial distress can be seen from the financial statements of the company. The ability to predict financial distress became an important research topic because it can provide early warning for the company. In addition, predicting financial distress is also beneficial for investors and creditors. This research will be made the prediction model of financial distress at industrial companies in Indonesia by comparing the performance of Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM) combined with variable selection technique. The result of this research is prediction model based on hybrid Stepwise-SVM obtains better balance among fitting ability, generalization ability and model stability than the other models.
Nemmour, Hassiba; Chibani, Youcef
The reliability of support vector machines for classifying hyper-spectral images of remote sensing has been proven in various studies. In this paper, we investigate their applicability for land cover change detection. First, SVM-based change detection is presented and performed for mapping urban growth in the Algerian capital. Different performance indicators, as well as a comparison with artificial neural networks, are used to support our experimental analysis. In a second step, a combination framework is proposed to improve change detection accuracy. Two combination rules, namely, Fuzzy Integral and Attractor Dynamics, are implemented and evaluated with respect to individual SVMs. Recognition rates achieved by individual SVMs, compared to neural networks, confirm their efficiency for land cover change detection. Furthermore, the relevance of SVM combination is highlighted.
Directory of Open Access Journals (Sweden)
Yizhou Yang
2017-01-01
Full Text Available To diagnose mechanical faults of rotor-bearing-casing system by analyzing its casing vibration signal, this paper proposes a training procedure of a fault classifier based on variational mode decomposition (VMD, local linear embedding (LLE, and support vector machine (SVM. VMD is used first to decompose the casing signal into several modes, which are subsignals usually modulated by fault frequencies. Vibrational features are extracted from both VMD subsignals and the original one. LLE is employed here to reduce the dimensionality of these extracted features and make the samples more separable. Then low-dimensional data sets are used to train the multiclass SVM whose accuracy is tested by classifying the test samples. When the parameters of LLE and SVM are well optimized, this proposed method performs well on experimental data, showing its capacity of diagnosing casing vibration faults.
Prediction of Carbohydrate-Binding Proteins from Sequences Using Support Vector Machines
Directory of Open Access Journals (Sweden)
Seizi Someya
2010-01-01
Full Text Available Carbohydrate-binding proteins are proteins that can interact with sugar chains but do not modify them. They are involved in many physiological functions, and we have developed a method for predicting them from their amino acid sequences. Our method is based on support vector machines (SVMs. We first clarified the definition of carbohydrate-binding proteins and then constructed positive and negative datasets with which the SVMs were trained. By applying the leave-one-out test to these datasets, our method delivered 0.92 of the area under the receiver operating characteristic (ROC curve. We also examined two amino acid grouping methods that enable effective learning of sequence patterns and evaluated the performance of these methods. When we applied our method in combination with the homology-based prediction method to the annotated human genome database, H-invDB, we found that the true positive rate of prediction was improved.
Xiao, Guoqiang; Jiang, Yang; Song, Gang; Jiang, Jianmin
2010-12-01
We propose a support-vector-machine (SVM) tree to hierarchically learn from domain knowledge represented by low-level features toward automatic classification of sports videos. The proposed SVM tree adopts a binary tree structure to exploit the nature of SVM's binary classification, where each internal node is a single SVM learning unit, and each external node represents the classified output type. Such a SVM tree presents a number of advantages, which include: 1. low computing cost; 2. integrated learning and classification while preserving individual SVM's learning strength; and 3. flexibility in both structure and learning modules, where different numbers of nodes and features can be added to address specific learning requirements, and various learning models can be added as individual nodes, such as neural networks, AdaBoost, hidden Markov models, dynamic Bayesian networks, etc. Experiments support that the proposed SVM tree achieves good performances in sports video classifications.
Data on Support Vector Machines (SVM model to forecast photovoltaic power
Directory of Open Access Journals (Sweden)
M. Malvoni
2016-12-01
Full Text Available The data concern the photovoltaic (PV power, forecasted by a hybrid model that considers weather variations and applies a technique to reduce the input data size, as presented in the paper entitled “Photovoltaic forecast based on hybrid pca-lssvm using dimensionality reducted data” (M. Malvoni, M.G. De Giorgi, P.M. Congedo, 2015 [1]. The quadratic Renyi entropy criteria together with the principal component analysis (PCA are applied to the Least Squares Support Vector Machines (LS-SVM to predict the PV power in the day-ahead time frame. The data here shared represent the proposed approach results. Hourly PV power predictions for 1,3,6,12, 24 ahead hours and for different data reduction sizes are provided in Supplementary material.
A SUPPORT VECTOR MACHINE APPROACH FOR DEVELOPING TELEMEDICINE SOLUTIONS: MEDICAL DIAGNOSIS
Directory of Open Access Journals (Sweden)
Mihaela GHEORGHE
2015-06-01
Full Text Available Support vector machine represents an important tool for artificial neural networks techniques including classification and prediction. It offers a solution for a wide range of different issues in which cases the traditional optimization algorithms and methods cannot be applied directly due to different constraints, including memory restrictions, hidden relationships between variables, very high volume of computations that needs to be handled. One of these issues relates to medical diagnosis, a subset of the medical field. In this paper, the SVM learning algorithm is tested on a diabetes dataset and the results obtained for training with different kernel functions are presented and analyzed in order to determine a good approach from a telemedicine perspective.
Liang, Long; Zhang, Tianlong; Wang, Kang; Tang, Hongsheng; Yang, Xiaofeng; Zhu, Xiaoqin; Duan, Yixiang; Li, Hua
2014-02-01
The feasibility of steel materials classification by support vector machines (SVMs), in combination with laser-induced breakdown spectroscopy (LIBS) technology, was investigated. Multi-classification methods based on SVM, the one-against-all and the one-against-one models, and a combination model, are applied to classify nine types of round steel. Due to the inhomogeneity of steel composition, the data obtained using the one-against-all and one-against-one models were ambiguous and difficult to discriminate; whereas, the combination model, was able to successfully distinguish most of the ambiguous data and control the computation cost within an acceptable range. The studies presented here demonstrate that LIBS-SVM is a useful technique for the identification and discrimination of steel materials, and would be very well-suited for process analysis in the steelmaking industry.
Shen, Chia-Ping; Chen, Wei-Hsin; Chen, Jia-Ming; Hsu, Kai-Ping; Lin, Jeng-Wei; Chiu, Ming-Jang; Chen, Chi-Huang; Lai, Feipei
2010-01-01
Today, many bio-signals such as Electroencephalography (EEG) are recorded in digital format. It is an emerging research area of analyzing these digital bio-signals to extract useful health information in biomedical engineering. In this paper, a bio-signal analyzing cloud computing architecture, called BACCA, is proposed. The system has been designed with the purpose of seamless integration into the National Taiwan University Health Information System. Based on the concept of. NET Service Oriented Architecture, the system integrates heterogeneous platforms, protocols, as well as applications. In this system, we add modern analytic functions such as approximated entropy and adaptive support vector machine (SVM). It is shown that the overall accuracy of EEG bio-signal analysis has increased to nearly 98% for different data sets, including open-source and clinical data sets.
[Study on predicting model for acute hypotensive episodes in ICU based on support vector machine].
Lai, Lijuan; Wang, Zhigang; Wu, Xiaoming; Xiong, Dongsheng
2011-06-01
The occurrence of acute hypotensive episodes (AHE) in intensive care units (ICU) seriously endangers the lives of patients, and the treatment is mainly depended on the expert experience of doctors. In this paper, a model for predicting the occurrence of AHE in ICU has been developed using the theory of medical Informatics. We analyzed the trend and characteristics of the mean arterial blood pressure (MAP) between the patients who were suffering AHE and those who were not, and extracted the median, mean and other statistical parameters for learning and training based on support vector machine (SVM), then developed a predicting model. On this basis, we also compared different models consisted of different kernel functions. Experiments demonstrated that this approach performed well on classification and prediction, which would contribute to forecast the occurrence of AHE.
Cao, Hongliang; Xin, Ya; Yuan, Qiaoxia
2016-02-01
To predict conveniently the biochar yield from cattle manure pyrolysis, intelligent modeling approach was introduced in this research. A traditional artificial neural networks (ANN) model and a novel least squares support vector machine (LS-SVM) model were developed. For the identification and prediction evaluation of the models, a data set with 33 experimental data was used, which were obtained using a laboratory-scale fixed bed reaction system. The results demonstrated that the intelligent modeling approach is greatly convenient and effective for the prediction of the biochar yield. In particular, the novel LS-SVM model has a more satisfying predicting performance and its robustness is better than the traditional ANN model. The introduction and application of the LS-SVM modeling method gives a successful example, which is a good reference for the modeling study of cattle manure pyrolysis process, even other similar processes. Copyright © 2015 Elsevier Ltd. All rights reserved.
Wu, Jingzhu; Dong, Jingjing; Dong, Wenfei; Chen, Yan; Liu, Cuiling
2016-10-01
A classification method of support vector machines with linear kernel was employed to authenticate genuine olive oil based on near-infrared spectroscopy. There were three types of adulteration of olive oil experimented in the study. The adulterated oil was respectively soybean oil, rapeseed oil and the mixture of soybean and rapeseed oil. The average recognition rate of second experiment was more than 90% and that of the third experiment was reach to 100%. The results showed the method had good performance in classifying genuine olive oil and the adulteration with small variation range of adulterated concentration and it was a promising and rapid technique for the detection of oil adulteration and fraud in the food industry.
Astawa, INGA; Gusti Ngurah Bagus Caturbawa, I.; Made Sajayasa, I.; Dwi Suta Atmaja, I. Made Ari
2018-01-01
The license plate recognition usually used as part of system such as parking system. License plate detection considered as the most important step in the license plate recognition system. We propose methods that can be used to detect the vehicle plate on mobile phone. In this paper, we used Sliding Window, Histogram of Oriented Gradient (HOG), and Support Vector Machines (SVM) method to license plate detection so it will increase the detection level even though the image is not in a good quality. The image proceed by Sliding Window method in order to find plate position. Feature extraction in every window movement had been done by HOG and SVM method. Good result had shown in this research, which is 96% of accuracy.
Automatic road sign detecion and classification based on support vector machines and HOG descriptos
Adam, A.; Ioannidis, C.
2014-05-01
This paper examines the detection and classification of road signs in color-images acquired by a low cost camera mounted on a moving vehicle. A new method for the detection and classification of road signs is proposed based on color based detection, in order to locate regions of interest. Then, a circular Hough transform is applied to complete detection taking advantage of the shape properties of the road signs. The regions of interest are finally represented using HOG descriptors and are fed into trained Support Vector Machines (SVMs) in order to be recognized. For the training procedure, a database with several training examples depicting Greek road sings has been developed. Many experiments have been conducted and are presented, to measure the efficiency of the proposed methodology especially under adverse weather conditions and poor illumination. For the experiments training datasets consisting of different number of examples were used and the results are presented, along with some possible extensions of this work.
SYN Flood Attack Detection in Cloud Computing using Support Vector Machine
Directory of Open Access Journals (Sweden)
Zerina Mašetić
2017-11-01
Full Text Available Cloud computing is a trending technology, as it reduces the cost of running a business. However, many companies are skeptic moving about towards cloud due to the security concerns. Based on the Cloud Security Alliance report, Denial of Service (DoS attacks are among top 12 attacks in the cloud computing. Therefore, it is important to develop a mechanism for detection and prevention of these attacks. The aim of this paper is to evaluate Support Vector Machine (SVM algorithm in creating the model for classification of DoS attacks and normal network behaviors. The study was performed in several phases: a attack simulation, b data collection, cfeature selection, and d classification. The proposedmodel achieved 100% classification accuracy with true positive rate (TPR of 100%. SVM showed outstanding performance in DoS attack detection and proves that it serves as a valuable asset in the network security area.
Directory of Open Access Journals (Sweden)
Mario Sansone
2013-01-01
Full Text Available Computer systems for Electrocardiogram (ECG analysis support the clinician in tedious tasks (e.g., Holter ECG monitored in Intensive Care Units or in prompt detection of dangerous events (e.g., ventricular fibrillation. Together with clinical applications (arrhythmia detection and heart rate variability analysis, ECG is currently being investigated in biometrics (human identification, an emerging area receiving increasing attention. Methodologies for clinical applications can have both differences and similarities with respect to biometrics. This paper reviews methods of ECG processing from a pattern recognition perspective. In particular, we focus on features commonly used for heartbeat classification. Considering the vast literature in the field and the limited space of this review, we dedicated a detailed discussion only to a few classifiers (Artificial Neural Networks and Support Vector Machines because of their popularity; however, other techniques such as Hidden Markov Models and Kalman Filtering will be also mentioned.
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Semi-automatic classification of birdsong elements using a linear support vector machine.
Tachibana, Ryosuke O; Oosugi, Naoya; Okanoya, Kazuo
2014-01-01
Birdsong provides a unique model for understanding the behavioral and neural bases underlying complex sequential behaviors. However, birdsong analyses require laborious effort to make the data quantitatively analyzable. The previous attempts had succeeded to provide some reduction of human efforts involved in birdsong segment classification. The present study was aimed to further reduce human efforts while increasing classification performance. In the current proposal, a linear-kernel support vector machine was employed to minimize the amount of human-generated label samples for reliable element classification in birdsong, and to enable the classifier to handle highly-dimensional acoustic features while avoiding the over-fitting problem. Bengalese finch's songs in which distinct elements (i.e., syllables) were aligned in a complex sequential pattern were used as a representative test case in the neuroscientific research field. Three evaluations were performed to test (1) algorithm validity and accuracy with exploring appropriate classifier settings, (2) capability to provide accuracy with reducing amount of instruction dataset, and (3) capability in classifying large dataset with minimized manual labeling. The results from the evaluation (1) showed that the algorithm is 99.5% reliable in song syllables classification. This accuracy was indeed maintained in evaluation (2), even when the instruction data classified by human were reduced to one-minute excerpt (corresponding to 300-400 syllables) for classifying two-minute excerpt. The reliability remained comparable, 98.7% accuracy, when a large target dataset of whole day recordings (∼30,000 syllables) was used. Use of a linear-kernel support vector machine showed sufficient accuracies with minimized manually generated instruction data in bird song element classification. The methodology proposed would help reducing laborious processes in birdsong analysis without sacrificing reliability, and therefore can help
Identification of Type 2 Diabetes-associated combination of SNPs using Support Vector Machine
Directory of Open Access Journals (Sweden)
Park Keun-Joon
2010-04-01
Full Text Available Abstract Background Type 2 diabetes mellitus (T2D, a metabolic disorder characterized by insulin resistance and relative insulin deficiency, is a complex disease of major public health importance. Its incidence is rapidly increasing in the developed countries. Complex diseases are caused by interactions between multiple genes and environmental factors. Most association studies aim to identify individual susceptibility single markers using a simple disease model. Recent studies are trying to estimate the effects of multiple genes and multi-locus in genome-wide association. However, estimating the effects of association is very difficult. We aim to assess the rules for classifying diseased and normal subjects by evaluating potential gene-gene interactions in the same or distinct biological pathways. Results We analyzed the importance of gene-gene interactions in T2D susceptibility by investigating 408 single nucleotide polymorphisms (SNPs in 87 genes involved in major T2D-related pathways in 462 T2D patients and 456 healthy controls from the Korean cohort studies. We evaluated the support vector machine (SVM method to differentiate between cases and controls using SNP information in a 10-fold cross-validation test. We achieved a 65.3% prediction rate with a combination of 14 SNPs in 12 genes by using the radial basis function (RBF-kernel SVM. Similarly, we investigated subpopulation data sets of men and women and identified different SNP combinations with the prediction rates of 70.9% and 70.6%, respectively. As the high-throughput technology for genome-wide SNPs improves, it is likely that a much higher prediction rate with biologically more interesting combination of SNPs can be acquired by using this method. Conclusions Support Vector Machine based feature selection method in this research found novel association between combinations of SNPs and T2D in a Korean population.
SVMQA: support-vector-machine-based protein single-model quality assessment.
Manavalan, Balachandran; Lee, Jooyoung
2017-08-15
The accurate ranking of predicted structural models and selecting the best model from a given candidate pool remain as open problems in the field of structural bioinformatics. The quality assessment (QA) methods used to address these problems can be grouped into two categories: consensus methods and single-model methods. Consensus methods in general perform better and attain higher correlation between predicted and true quality measures. However, these methods frequently fail to generate proper quality scores for native-like structures which are distinct from the rest of the pool. Conversely, single-model methods do not suffer from this drawback and are better suited for real-life applications where many models from various sources may not be readily available. In this study, we developed a support-vector-machine-based single-model global quality assessment (SVMQA) method. For a given protein model, the SVMQA method predicts TM-score and GDT_TS score based on a feature vector containing statistical potential energy terms and consistency-based terms between the actual structural features (extracted from the three-dimensional coordinates) and predicted values (from primary sequence). We trained SVMQA using CASP8, CASP9 and CASP10 targets and determined the machine parameters by 10-fold cross-validation. We evaluated the performance of our SVMQA method on various benchmarking datasets. Results show that SVMQA outperformed the existing best single-model QA methods both in ranking provided protein models and in selecting the best model from the pool. According to the CASP12 assessment, SVMQA was the best method in selecting good-quality models from decoys in terms of GDTloss. SVMQA method can be freely downloaded from http://lee.kias.re.kr/SVMQA/SVMQA_eval.tar.gz. jlee@kias.re.kr. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Support vector machine based diagnostic system for breast cancer using swarm intelligence.
Chen, Hui-Ling; Yang, Bo; Wang, Gang; Wang, Su-Jing; Liu, Jie; Liu, Da-You
2012-08-01
Breast cancer is becoming a leading cause of death among women in the whole world, meanwhile, it is confirmed that the early detection and accurate diagnosis of this disease can ensure a long survival of the patients. In this paper, a swarm intelligence technique based support vector machine classifier (PSO_SVM) is proposed for breast cancer diagnosis. In the proposed PSO-SVM, the issue of model selection and feature selection in SVM is simultaneously solved under particle swarm (PSO optimization) framework. A weighted function is adopted to design the objective function of PSO, which takes into account the average accuracy rates of SVM (ACC), the number of support vectors (SVs) and the selected features simultaneously. Furthermore, time varying acceleration coefficients (TVAC) and inertia weight (TVIW) are employed to efficiently control the local and global search in PSO algorithm. The effectiveness of PSO-SVM has been rigorously evaluated against the Wisconsin Breast Cancer Dataset (WBCD), which is commonly used among researchers who use machine learning methods for breast cancer diagnosis. The proposed system is compared with the grid search method with feature selection by F-score. The experimental results demonstrate that the proposed approach not only obtains much more appropriate model parameters and discriminative feature subset, but also needs smaller set of SVs for training, giving high predictive accuracy. In addition, Compared to the existing methods in previous studies, the proposed system can also be regarded as a promising success with the excellent classification accuracy of 99.3% via 10-fold cross validation (CV) analysis. Moreover, a combination of five informative features is identified, which might provide important insights to the nature of the breast cancer disease and give an important clue for the physicians to take a closer attention. We believe the promising result can ensure that the physicians make very accurate diagnostic decision in
Object Recognition System-on-Chip Using the Support Vector Machines
Directory of Open Access Journals (Sweden)
Houzet Dominique
2005-01-01
Full Text Available The first aim of this work is to propose the design of a system-on-chip (SoC platform dedicated to digital image and signal processing, which is tuned to implement efficiently multiply-and-accumulate (MAC vector/matrix operations. The second aim of this work is to implement a recent promising neural network method, namely, the support vector machine (SVM used for real-time object recognition, in order to build a vision machine. With such a reconfigurable and programmable SoC platform, it is possible to implement any SVM function dedicated to any object recognition problem. The final aim is to obtain an automatic reconfiguration of the SoC platform, based on the results of the learning phase on an objects' database, which makes it possible to recognize practically any object without manual programming. Recognition can be of any kind that is from image to signal data. Such a system is a general-purpose automatic classifier. Many applications can be considered as a classification problem, but are usually treated specifically in order to optimize the cost of the implemented solution. The cost of our approach is more important than a dedicated one, but in a near future, hundreds of millions of gates will be common and affordable compared to the design cost. What we are proposing here is a general-purpose classification neural network implemented on a reconfigurable SoC platform. The first version presented here is limited in size and thus in object recognition performances, but can be easily upgraded according to technology improvements.
Lajnef, Tarek; Chaibi, Sahbi; Ruby, Perrine; Aguera, Pierre-Emmanuel; Eichenlaub, Jean-Baptiste; Samet, Mounir; Kachouri, Abdennaceur; Jerbi, Karim
2015-07-30
Sleep staging is a critical step in a range of electrophysiological signal processing pipelines used in clinical routine as well as in sleep research. Although the results currently achievable with automatic sleep staging methods are promising, there is need for improvement, especially given the time-consuming and tedious nature of visual sleep scoring. Here we propose a sleep staging framework that consists of a multi-class support vector machine (SVM) classification based on a decision tree approach. The performance of the method was evaluated using polysomnographic data from 15 subjects (electroencephalogram (EEG), electrooculogram (EOG) and electromyogram (EMG) recordings). The decision tree, or dendrogram, was obtained using a hierarchical clustering technique and a wide range of time and frequency-domain features were extracted. Feature selection was carried out using forward sequential selection and classification was evaluated using k-fold cross-validation. The dendrogram-based SVM (DSVM) achieved mean specificity, sensitivity and overall accuracy of 0.92, 0.74 and 0.88 respectively, compared to expert visual scoring. Restricting DSVM classification to data where both experts' scoring was consistent (76.73% of the data) led to a mean specificity, sensitivity and overall accuracy of 0.94, 0.82 and 0.92 respectively. The DSVM framework outperforms classification with more standard multi-class "one-against-all" SVM and linear-discriminant analysis. The promising results of the proposed methodology suggest that it may be a valuable alternative to existing automatic methods and that it could accelerate visual scoring by providing a robust starting hypnogram that can be further fine-tuned by expert inspection. Copyright © 2015 Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
Hong-Hai Tran
2014-01-01
Full Text Available Permeation grouting is a commonly used approach for soil improvement in construction engineering. Thus, predicting the results of grouting activities is a crucial task that needs to be carried out in the planning phase of any grouting project. In this research, a novel artificial intelligence approach—autotuning support vector machine—is proposed to forecast the result of grouting activities that employ microfine cement grouts. In the new model, the support vector machine (SVM algorithm is utilized to classify grouting activities into two classes: success and failure. Meanwhile, the differential evolution (DE optimization algorithm is employed to identify the optimal tuning parameters of the SVM algorithm, namely, the penalty parameter and the kernel function parameter. The integration of the SVM and DE algorithms allows the newly established method to operate automatically without human prior knowledge or tedious processes for parameter setting. An experiment using a set of in situ data samples demonstrates that the newly established method can produce an outstanding prediction performance.
Classification of fruits using computer vision and a multiclass support vector machine.
Zhang, Yudong; Wu, Lenan
2012-01-01
Automatic classification of fruits via computer vision is still a complicated task due to the various properties of numerous types of fruits. We propose a novel classification method based on a multi-class kernel support vector machine (kSVM) with the desirable goal of accurate and fast classification of fruits. First, fruit images were acquired by a digital camera, and then the background of each image was removed by a split-and-merge algorithm; Second, the color histogram, texture and shape features of each fruit image were extracted to compose a feature space; Third, principal component analysis (PCA) was used to reduce the dimensions of feature space; Finally, three kinds of multi-class SVMs were constructed, i.e., Winner-Takes-All SVM, Max-Wins-Voting SVM, and Directed Acyclic Graph SVM. Meanwhile, three kinds of kernels were chosen, i.e., linear kernel, Homogeneous Polynomial kernel, and Gaussian Radial Basis kernel; finally, the SVMs were trained using 5-fold stratified cross validation with the reduced feature vectors as input. The experimental results demonstrated that the Max-Wins-Voting SVM with Gaussian Radial Basis kernel achieves the best classification accuracy of 88.2%. For computation time, the Directed Acyclic Graph SVMs performs swiftest.
Guo, Lei; Abbosh, Amin
2018-05-01
For any chance for stroke patients to survive, the stroke type should be classified to enable giving medication within a few hours of the onset of symptoms. In this paper, a microwave-based stroke localization and classification framework is proposed. It is based on microwave tomography, k-means clustering, and a support vector machine (SVM) method. The dielectric profile of the brain is first calculated using the Born iterative method, whereas the amplitude of the dielectric profile is then taken as the input to k-means clustering. The cluster is selected as the feature vector for constructing and testing the SVM. A database of MRI-derived realistic head phantoms at different signal-to-noise ratios is used in the classification procedure. The performance of the proposed framework is evaluated using the receiver operating characteristic (ROC) curve. The results based on a two-dimensional framework show that 88% classification accuracy, with a sensitivity of 91% and a specificity of 87%, can be achieved. Bioelectromagnetics. 39:312-324, 2018. © 2018 Wiley Periodicals, Inc. © 2018 Wiley Periodicals, Inc.
Gentle, J. N., Jr.; Kahn, A.; Pierce, S. A.; Wang, S.; Wade, C.; Moran, S.
2016-12-01
With the continued spread of the zika virus in the United States in both Florida and Virginia, increased public awareness, prevention and targeted prediction is necessary to effectively mitigate further infection and propagation of the virus throughout the human population. The goal of this project is to utilize publicly accessible data and HPC resources coupled with machine learning algorithms to identify potential threat vectors for the spread of the zika virus in Texas, the United States and globally by correlating available zika case data collected from incident reports in medical databases (e.g., CDC, Florida Department of Health) with known bodies of water in various earth science databases (e.g., USGS NAQWA Data, NASA ASTER Data, TWDB Data) and by using known mosquito population centers as a proxy for trends in population distribution (e.g., WHO, European CDC, Texas Data) while correlating historical trends in the spread of other mosquito borne diseases (e.g., chikungunya, malaria, dengue, yellow fever, west nile, etc.). The resulting analysis should refine the identification of the specific threat vectors for the spread of the virus which will correspondingly increase the effectiveness of the limited resources allocated towards combating the disease through better strategic implementation of defense measures. The minimal outcome of this research is a better understanding of the factors involved in the spread of the zika virus, with the greater potential to save additional lives through more effective resource utilization and public outreach.
Fault Diagnosis of a Reconfigurable Crawling–Rolling Robot Based on Support Vector Machines
Directory of Open Access Journals (Sweden)
Karthikeyan Elangovan
2017-10-01
Full Text Available As robots begin to perform jobs autonomously, with minimal or no human intervention, a new challenge arises: robots also need to autonomously detect errors and recover from faults. In this paper, we present a Support Vector Machine (SVM-based fault diagnosis system for a bio-inspired reconfigurable robot named Scorpio. The diagnosis system needs to detect and classify faults while Scorpio uses its crawling and rolling locomotion modes. Specifically, we classify between faulty and non-faulty conditions by analyzing onboard Inertial Measurement Unit (IMU sensor data. The data capture nine different locomotion gaits, which include rolling and crawling modes, at three different speeds. Statistical methods are applied to extract features and to reduce the dimensionality of original IMU sensor data features. These statistical features were given as inputs for training and testing. Additionally, the c-Support Vector Classification (c-SVC and nu-SVC models of SVM, and their fault classification accuracies, were compared. The results show that the proposed SVM approach can be used to autonomously diagnose locomotion gait faults while the reconfigurable robot is in operation.
Directory of Open Access Journals (Sweden)
Luo Wei
2017-01-01
Full Text Available Power transformer is one of the most important equipment in power system. In order to predict the potential fault of power transformer and identify the fault types correctly, we proposed a transformer fault intelligent diagnosis model based on chemical reaction optimization (CRO algorithm and relevance vector machine(RVM. RVM is a powerful machine learning method, which can solve nonlinear, high-dimensional classification problems with a limited number of samples. CRO algorithm has well global optimization and simple calculation, so it is suitable to solve parameter optimization problems. In this paper, firstly, a multi-layer RVM classification model was built by binary tree recognition strategy. Secondly, CRO algorithm was adopted to optimize the kernel function parameters which could enhance the performance of RVM classifiers. Compared with IEC three-ratio method and the RVM model, the CRO-RVM model not only overcomes the coding defect problem of IEC three-ratio method, but also has higher classification accuracy than the RVM model. Finally, the new method was applied to analyze a transformer fault case, Its predicted result accord well with the real situation. The research provides a practical method for transformer fault intelligent diagnosis and prediction.
BacHbpred: Support Vector Machine Methods for the Prediction of Bacterial Hemoglobin-Like Proteins
Directory of Open Access Journals (Sweden)
MuthuKrishnan Selvaraj
2016-01-01
Full Text Available The recent upsurge in microbial genome data has revealed that hemoglobin-like (HbL proteins may be widely distributed among bacteria and that some organisms may carry more than one HbL encoding gene. However, the discovery of HbL proteins has been limited to a small number of bacteria only. This study describes the prediction of HbL proteins and their domain classification using a machine learning approach. Support vector machine (SVM models were developed for predicting HbL proteins based upon amino acid composition (AC, dipeptide composition (DC, hybrid method (AC + DC, and position specific scoring matrix (PSSM. In addition, we introduce for the first time a new prediction method based on max to min amino acid residue (MM profiles. The average accuracy, standard deviation (SD, false positive rate (FPR, confusion matrix, and receiver operating characteristic (ROC were analyzed. We also compared the performance of our proposed models in homology detection databases. The performance of the different approaches was estimated using fivefold cross-validation techniques. Prediction accuracy was further investigated through confusion matrix and ROC curve analysis. All experimental results indicate that the proposed BacHbpred can be a perspective predictor for determination of HbL related proteins. BacHbpred, a web tool, has been developed for HbL prediction.
Orrù, Graziella; Pettersson-Yeo, William; Marquand, Andre F; Sartori, Giuseppe; Mechelli, Andrea
2012-04-01
Standard univariate analysis of neuroimaging data has revealed a host of neuroanatomical and functional differences between healthy individuals and patients suffering a wide range of neurological and psychiatric disorders. Significant only at group level however these findings have had limited clinical translation, and recent attention has turned toward alternative forms of analysis, including Support-Vector-Machine (SVM). A type of machine learning, SVM allows categorisation of an individual's previously unseen data into a predefined group using a classification algorithm, developed on a training data set. In recent years, SVM has been successfully applied in the context of disease diagnosis, transition prediction and treatment prognosis, using both structural and functional neuroimaging data. Here we provide a brief overview of the method and review those studies that applied it to the investigation of Alzheimer's disease, schizophrenia, major depression, bipolar disorder, presymptomatic Huntington's disease, Parkinson's disease and autistic spectrum disorder. We conclude by discussing the main theoretical and practical challenges associated with the implementation of this method into the clinic and possible future directions. Copyright © 2012 Elsevier Ltd. All rights reserved.
International Nuclear Information System (INIS)
Huang, Hong Zhong; Wang, Hai Kun; Li, Yan Feng; Zhang, Longlong; Liu, Zhiliang
2015-01-01
Estimation of remaining useful life (RUL) is helpful to manage life cycles of machines and to reduce maintenance cost. Support vector machine (SVM) is a promising algorithm for estimation of RUL because it can easily process small training sets and multi-dimensional data. Many SVM based methods have been proposed to predict RUL of some key components. We did a literature review related to SVM based RUL estimation within a decade. The references reviewed are classified into two categories: improved SVM algorithms and their applications to RUL estimation. The latter category can be further divided into two types: one, to predict the condition state in the future and then build a relationship between state and RUL; two, to establish a direct relationship between current state and RUL. However, SVM is seldom used to track the degradation process and build an accurate relationship between the current health condition state and RUL. Based on the above review and summary, this paper points out that the ability to continually improve SVM, and obtain a novel idea for RUL prediction using SVM will be future works.
Classification of Autism Spectrum Disorder Using Random Support Vector Machine Cluster
Directory of Open Access Journals (Sweden)
Xia-an Bi
2018-02-01
Full Text Available Autism spectrum disorder (ASD is mainly reflected in the communication and language barriers, difficulties in social communication, and it is a kind of neurological developmental disorder. Most researches have used the machine learning method to classify patients and normal controls, among which support vector machines (SVM are widely employed. But the classification accuracy of SVM is usually low, due to the usage of a single SVM as classifier. Thus, we used multiple SVMs to classify ASD patients and typical controls (TC. Resting-state functional magnetic resonance imaging (fMRI data of 46 TC and 61 ASD patients were obtained from the Autism Brain Imaging Data Exchange (ABIDE database. Only 84 of 107 subjects are utilized in experiments because the translation or rotation of 7 TC and 16 ASD patients has surpassed ±2 mm or ±2°. Then the random SVM cluster was proposed to distinguish TC and ASD. The results show that this method has an excellent classification performance based on all the features. Furthermore, the accuracy based on the optimal feature set could reach to 96.15%. Abnormal brain regions could also be found, such as inferior frontal gyrus (IFG (orbital and opercula part, hippocampus, and precuneus. It is indicated that the method of random SVM cluster may apply to the auxiliary diagnosis of ASD.
Directory of Open Access Journals (Sweden)
Rizky Maulana
2017-05-01
Full Text Available Twitter merupakan jejaring sosial dengan pertumbuhan tercepat sejak tahun 2006 menurut MIT Technology Review (2013, Indonesia menempati Negara ketiga penyumbang tweet terbanyak dengan jumlah 1 milyar tweet. Fakta tersebut menjadikan Twitter menjadi salah satu sumber data text yang dapat digali dan dimanfaatkan untuk berbagai keperluan melalui metode-metode pengambilan data teks atau text mining, salah satunya adalah analisis sentimen pengguna terhadap tokoh-tokoh publik indonesia. Penelitian ini membuat sebuah sistem yang dapat melakukan analisis sentimen pengguna twitter terhadap tokoh publik secara real time dengan menggunakan Twitter Streming API dan metode Support Vectore Machine (SVM memanfaatkan pustaka libSVM sebagai salah satu machine learning untuk text classification. Algoritma Porter digunakan dalam proses stemming untuk ekstraksi fitur dan metode Term Frequency untuk pembobotan. Perangkat lunak dibangun dengan menggunakan bahasa pemrograman PHP untuk sisi server yang berjalan pada platform cloud Windows Azure dan Java untuk sisi client yang berjalan pada platform Android. Dari hasil penelitian dengan 1.400 tweet pada dataset dan 200 data uji didapatkan akurasi sebesar 79,5%.
Support vector machines classification of hERG liabilities based on atom types.
Jia, Lei; Sun, Hongmao
2008-06-01
Drug-induced long QT syndrome (LQTS) can cause critical cardiovascular side effects and has accounted for the withdrawal of several drugs from the market. Blockade of the potassium ion channel encoded by the human ether-a-go-go-related gene (hERG) has been identified as a major contributor to drug-induced LQTS. Experimental measurement of hERG activity for each compound in development is costly and time-consuming, thus it is beneficial to develop a predictive hERG model. Here, we present a hERG classification model formulated using support vector machines (SVM) as machine learning method and using atom types as molecular descriptors. The training set used in this study was composed of 977 corporate compounds with hERG activities measured under the same conditions. The impact of soft margin and kernel function on the performance of the SVM models was examined. The robustness of SVM was evaluated by comparing the predictive power of the models built with 90%, 50%, and 10% of the training set data. The final SVM model was able to correctly classify 94% of an external testing set containing 66 drug molecules. The most important atom types with respect to discriminative power were extracted and analyzed.
Identifying Sugarcane Plantation using LANDSAT-8 Images with Support Vector Machines
Mulyono, Sidik; Nadirah
2016-11-01
The use of remote sensing has been highly beneficial in the identification and also mapping and monitoring of plantations. The identification of plantations includes the physiology, disease, environmental conditions, and also the production and time of harvesting. It can be done by doing satellite imagery classification. However, to reach the final result of identification, it could be carried out by getting the solid ground truth information. This paper will discuss about detection of sugarcane plantation in Magetan district of East Java province area by using LANDSAT-8 image with specific approach of phenology profile using EVI (Enhanced Vegetation Index) value from satellite data, as an alternative vegetation index to address some of the limitation of the NDVI (Normalized Difference Vegetation Index). Method of classification used for detecting sugarcane plantation is Support Vector machines (SVM), which is a promising machine learning methodology. It has the ability to generalize well even with limited training samples and complex data. A number of samples of phenology profile for training purpose using SVMs are obtained from the area that identified as sugarcane plantation during field campaign in 2015. The same manner is also done for the objects instead of sugarcane plantation with relatively the same number of samples. The result of the research shows that Remote Sensing is able to detect the sugarcane plantation cross the district with good accuracy.
Chen, Chi-Jim; Pai, Tun-Wen; Cheng, Mox
2015-03-31
A sweeping fingerprint sensor converts fingerprints on a row by row basis through image reconstruction techniques. However, a built fingerprint image might appear to be truncated and distorted when the finger was swept across a fingerprint sensor at a non-linear speed. If the truncated fingerprint images were enrolled as reference targets and collected by any automated fingerprint identification system (AFIS), successful prediction rates for fingerprint matching applications would be decreased significantly. In this paper, a novel and effective methodology with low time computational complexity was developed for detecting truncated fingerprints in a real time manner. Several filtering rules were implemented to validate existences of truncated fingerprints. In addition, a machine learning method of supported vector machine (SVM), based on the principle of structural risk minimization, was applied to reject pseudo truncated fingerprints containing similar characteristics of truncated ones. The experimental result has shown that an accuracy rate of 90.7% was achieved by successfully identifying truncated fingerprint images from testing images before AFIS enrollment procedures. The proposed effective and efficient methodology can be extensively applied to all existing fingerprint matching systems as a preliminary quality control prior to construction of fingerprint templates.
Laib dit Leksir, Y.; Mansour, M.; Moussaoui, A.
2018-03-01
Analysis and processing of databases obtained from infrared thermal inspections made on electrical installations require the development of new tools to obtain more information to visual inspections. Consequently, methods based on the capture of thermal images show a great potential and are increasingly employed in this field. However, there is a need for the development of effective techniques to analyse these databases in order to extract significant information relating to the state of the infrastructures. This paper presents a technique explaining how this approach can be implemented and proposes a system that can help to detect faults in thermal images of electrical installations. The proposed method classifies and identifies the region of interest (ROI). The identification is conducted using support vector machine (SVM) algorithm. The aim here is to capture the faults that exist in electrical equipments during an inspection of some machines using A40 FLIR camera. After that, binarization techniques are employed to select the region of interest. Later the comparative analysis of the obtained misclassification errors using the proposed method with Fuzzy c means and Ostu, has also be addressed.
Jegadeeshwaran, R.; Sugumaran, V.
2015-02-01
Hydraulic brakes in automobiles are important components for the safety of passengers; therefore, the brakes are a good subject for condition monitoring. The condition of the brake components can be monitored by using the vibration characteristics. On-line condition monitoring by using machine learning approach is proposed in this paper as a possible solution to such problems. The vibration signals for both good as well as faulty conditions of brakes were acquired from a hydraulic brake test setup with the help of a piezoelectric transducer and a data acquisition system. Descriptive statistical features were extracted from the acquired vibration signals and the feature selection was carried out using the C4.5 decision tree algorithm. There is no specific method to find the right number of features required for classification for a given problem. Hence an extensive study is needed to find the optimum number of features. The effect of the number of features was also studied, by using the decision tree as well as Support Vector Machines (SVM). The selected features were classified using the C-SVM and Nu-SVM with different kernel functions. The results are discussed and the conclusion of the study is presented.
Directory of Open Access Journals (Sweden)
Lei Jiang
2012-01-01
Full Text Available Energy signature analysis of power appliance is the core of nonintrusive load monitoring (NILM where the detailed data of the appliances used in houses are obtained by analyzing changes in the voltage and current. This paper focuses on developing an automatic power load event detection and appliance classification based on machine learning. In power load event detection, the paper presents a new transient detection algorithm. By turn-on and turn-off transient waveforms analysis, it can accurately detect the edge point when a device is switched on or switched off. The proposed load classification technique can identify different power appliances with improved recognition accuracy and computational speed. The load classification method is composed of two processes including frequency feature analysis and support vector machine. The experimental results indicated that the incorporation of the new edge detection and turn-on and turn-off transient signature analysis into NILM revealed more information than traditional NILM methods. The load classification method has achieved more than ninety percent recognition rate.
Delgado, Juan A.; Altuve, Miguel; Nabhan Homsi, Masun
2015-12-01
This paper introduces a robust method based on the Support Vector Machine (SVM) algorithm to detect the presence of Fetal QRS (fQRS) complexes in electrocardiogram (ECG) recordings provided by the PhysioNet/CinC challenge 2013. ECG signals are first segmented into contiguous frames of 250 ms duration and then labeled in six classes. Fetal segments are tagged according to the position of fQRS complex within each one. Next, segment features extraction and dimensionality reduction are obtained by applying principal component analysis on Haar-wavelet transform. After that, two sub-datasets are generated to separate representative segments from atypical ones. Imbalanced class problem is dealt by applying sampling without replacement on each sub-dataset. Finally, two SVMs are trained and cross-validated using the two balanced sub-datasets separately. Experimental results show that the proposed approach achieves high performance rates in fetal heartbeats detection that reach up to 90.95% of accuracy, 92.16% of sensitivity, 88.51% of specificity, 94.13% of positive predictive value and 84.96% of negative predictive value. A comparative study is also carried out to show the performance of other two machine learning algorithms for fQRS complex estimation, which are K-nearest neighborhood and Bayesian network.
Directory of Open Access Journals (Sweden)
Marijana Zekić-Sušac
2010-12-01
Full Text Available Entrepreneurial intentions of students are important to recognize during the study in order to provide those students with educational background that will support such intentions and lead them to successful entrepreneurship after the study. The paper aims to develop a model that will classify students according to their entrepreneurial intentions by benchmarking three machine learning classifiers: neural networks, decision trees, and support vector machines. A survey was conducted at a Croatian university including a sample of students at the first year of study. Input variables described students’ demographics, importance of business objectives, perception of entrepreneurial carrier, and entrepreneurial predispositions. Due to a large dimension of input space, a feature selection method was used in the pre-processing stage. For comparison reasons, all tested models were validated on the same out-of-sample dataset, and a cross-validation procedure for testing generalization ability of the models was conducted. The models were compared according to its classification accuracy, as well according to input variable importance. The results show that although the best neural network model produced the highest average hit rate, the difference in performance is not statistically significant. All three models also extract similar set of features relevant for classifying students, which can be suggested to be taken into consideration by universities while designing their academic programs.
Guidi, Gabriele; Maffei, Nicola; Vecchi, Claudio; Ciarmatori, Alberto; Mistretta, Grazia Maria; Gottardi, Giovanni; Meduri, Bruno; Baldazzi, Giuseppe; Bertoni, Filippo; Costi, Tiziana
2015-07-01
Adaptive radiation therapy (ART) is an advanced field of radiation oncology. Image-guided radiation therapy (IGRT) methods can support daily setup and assess anatomical variations during therapy, which could prevent incorrect dose distribution and unexpected toxicities. A re-planning to correct these anatomical variations should be done daily/weekly, but to be applicable to a large number of patients, still require time consumption and resources. Using unsupervised machine learning on retrospective data, we have developed a predictive network, to identify patients that would benefit of a re-planning. 1200 MVCT of 40 head and neck (H&N) cases were re-contoured, automatically, using deformable hybrid registration and structures mapping. Deformable algorithm and MATLAB(®) homemade machine learning process, developed, allow prediction of criticalities for Tomotherapy treatments. Using retrospective analysis of H&N treatments, we have investigated and predicted tumor shrinkage and organ at risk (OAR) deformations. Support vector machine (SVM) and cluster analysis have identified cases or treatment sessions with potential criticalities, based on dose and volume discrepancies between fractions. During 1st weeks of treatment, 84% of patients shown an output comparable to average standard radiation treatment behavior. Starting from the 4th week, significant morpho-dosimetric changes affect 77% of patients, suggesting need for re-planning. The comparison of treatment delivered and ART simulation was carried out with receiver operating characteristic (ROC) curves, showing monotonous increase of ROC area. Warping methods, supported by daily image analysis and predictive tools, can improve personalization and monitoring of each treatment, thereby minimizing anatomic and dosimetric divergences from initial constraints. Copyright © 2015 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
Mining protein function from text using term-based support vector machines
Rice, Simon B; Nenadic, Goran; Stapley, Benjamin J
2005-01-01
Background Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We participated in Task 2, which addressed assigning Gene Ontology terms to human proteins and selecting relevant evidence from full-text documents. We approached it as a modified form of the document classification task. We used a supervised machine-learning approach (based on support vector machines) to assign protein function and select passages that support the assignments. As classification features, we used a protein's co-occurring terms that were automatically extracted from documents. Results The results evaluated by curators were modest, and quite variable for different problems: in many cases we have relatively good assignment of GO terms to proteins, but the selected supporting text was typically non-relevant (precision spanning from 3% to 50%). The method appears to work best when a substantial set of relevant documents is obtained, while it works poorly on single documents and/or short passages. The initial results suggest that our approach can also mine annotations from text even when an explicit statement relating a protein to a GO term is absent. Conclusion A machine learning approach to mining protein function predictions from text can yield good performance only if sufficient training data is available, and significant amount of supporting data is used for prediction. The most promising results are for combined document retrieval and GO term assignment, which calls for the integration of methods developed in BioCreAtIvE Task 1 and Task 2. PMID:15960835
Feres, Magda; Louzoun, Yoram; Haber, Simi; Faveri, Marcelo; Figueiredo, Luciene C; Levin, Liran
2018-02-01
The existence of specific microbial profiles for different periodontal conditions is still a matter of debate. The aim of this study was to test the hypothesis that 40 bacterial species could be used to classify patients, utilising machine learning, into generalised chronic periodontitis (ChP), generalised aggressive periodontitis (AgP) and periodontal health (PH). Subgingival biofilm samples were collected from patients with AgP, ChP and PH and analysed for their content of 40 bacterial species using checkerboard DNA-DNA hybridisation. Two stages of machine learning were then performed. First of all, we tested whether there was a difference between the composition of bacterial communities in PH and in disease, and then we tested whether a difference existed in the composition of bacterial communities between ChP and AgP. The data were split in each analysis to 70% train and 30% test. A support vector machine (SVM) classifier was used with a linear kernel and a Box constraint of 1. The analysis was divided into two parts. Overall, 435 patients (3,915 samples) were included in the analysis (PH = 53; ChP = 308; AgP = 74). The variance of the healthy samples in all principal component analysis (PCA) directions was smaller than that of the periodontally diseased samples, suggesting that PH is characterised by a uniform bacterial composition and that the bacterial composition of periodontally diseased samples is much more diverse. The relative bacterial load could distinguish between AgP and ChP. An SVC classifier using a panel of 40 bacterial species was able to distinguish between PH, AgP in young individuals and ChP. © 2017 FDI World Dental Federation.
Directory of Open Access Journals (Sweden)
Hazlee Azil Illias
Full Text Available Early detection of power transformer fault is important because it can reduce the maintenance cost of the transformer and it can ensure continuous electricity supply in power systems. Dissolved Gas Analysis (DGA technique is commonly used to identify oil-filled power transformer fault type but utilisation of artificial intelligence method with optimisation methods has shown convincing results. In this work, a hybrid support vector machine (SVM with modified evolutionary particle swarm optimisation (EPSO algorithm was proposed to determine the transformer fault type. The superiority of the modified PSO technique with SVM was evaluated by comparing the results with the actual fault diagnosis, unoptimised SVM and previous reported works. Data reduction was also applied using stepwise regression prior to the training process of SVM to reduce the training time. It was found that the proposed hybrid SVM-Modified EPSO (MEPSO-Time Varying Acceleration Coefficient (TVAC technique results in the highest correct identification percentage of faults in a power transformer compared to other PSO algorithms. Thus, the proposed technique can be one of the potential solutions to identify the transformer fault type based on DGA data on site.
Illias, Hazlee Azil; Zhao Liang, Wee
2018-01-01
Early detection of power transformer fault is important because it can reduce the maintenance cost of the transformer and it can ensure continuous electricity supply in power systems. Dissolved Gas Analysis (DGA) technique is commonly used to identify oil-filled power transformer fault type but utilisation of artificial intelligence method with optimisation methods has shown convincing results. In this work, a hybrid support vector machine (SVM) with modified evolutionary particle swarm optimisation (EPSO) algorithm was proposed to determine the transformer fault type. The superiority of the modified PSO technique with SVM was evaluated by comparing the results with the actual fault diagnosis, unoptimised SVM and previous reported works. Data reduction was also applied using stepwise regression prior to the training process of SVM to reduce the training time. It was found that the proposed hybrid SVM-Modified EPSO (MEPSO)-Time Varying Acceleration Coefficient (TVAC) technique results in the highest correct identification percentage of faults in a power transformer compared to other PSO algorithms. Thus, the proposed technique can be one of the potential solutions to identify the transformer fault type based on DGA data on site.
Chen, Y. M.; Lin, P.; He, J. Q.; He, Y.; Li, X.L.
2016-01-01
This study was carried out for rapid and noninvasive determination of the class of sorghum species by using the manifold dimensionality reduction (MDR) method and the nonlinear regression method of least squares support vector machines (LS-SVM) combing with the mid-infrared spectroscopy (MIRS) techniques. The methods of Durbin and Run test of augmented partial residual plot (APaRP) were performed to diagnose the nonlinearity of the raw spectral data. The nonlinear MDR methods of isometric feature mapping (ISOMAP), local linear embedding, laplacian eigenmaps and local tangent space alignment, as well as the linear MDR methods of principle component analysis and metric multidimensional scaling were employed to extract the feature variables. The extracted characteristic variables were utilized as the input of LS-SVM and established the relationship between the spectra and the target attributes. The mean average precision (MAP) scores and prediction accuracy were respectively used to evaluate the performance of models. The prediction results showed that the ISOMAP-LS-SVM model obtained the best classification performance, where the MAP scores and prediction accuracy were 0.947 and 92.86%, respectively. It can be concluded that the ISOMAP-LS-SVM model combined with the MIRS technique has the potential of classifying the species of sorghum in a reasonable accuracy. PMID:26817580
Directory of Open Access Journals (Sweden)
Milan Gocic
2016-01-01
Full Text Available The monthly precipitation data from 29 stations in Serbia during the period of 1946–2012 were considered. Precipitation trends were calculated using linear regression method. Three CLINO periods (1961–1990, 1971–2000, and 1981–2010 in three subregions were analysed. The CLINO 1981–2010 period had a significant increasing trend. Spatial pattern of the precipitation concentration index (PCI was presented. For the purpose of PCI prediction, three Support Vector Machine (SVM models, namely, SVM coupled with the discrete wavelet transform (SVM-Wavelet, the firefly algorithm (SVM-FFA, and using the radial basis function (SVM-RBF, were developed and used. The estimation and prediction results of these models were compared with each other using three statistical indicators, that is, root mean square error, coefficient of determination, and coefficient of efficiency. The experimental results showed that an improvement in predictive accuracy and capability of generalization can be achieved by the SVM-Wavelet approach. Moreover, the results indicated the proposed SVM-Wavelet model can adequately predict the PCI.
Chen, Y M; Lin, P; He, J Q; He, Y; Li, X L
2016-01-28
This study was carried out for rapid and noninvasive determination of the class of sorghum species by using the manifold dimensionality reduction (MDR) method and the nonlinear regression method of least squares support vector machines (LS-SVM) combing with the mid-infrared spectroscopy (MIRS) techniques. The methods of Durbin and Run test of augmented partial residual plot (APaRP) were performed to diagnose the nonlinearity of the raw spectral data. The nonlinear MDR methods of isometric feature mapping (ISOMAP), local linear embedding, laplacian eigenmaps and local tangent space alignment, as well as the linear MDR methods of principle component analysis and metric multidimensional scaling were employed to extract the feature variables. The extracted characteristic variables were utilized as the input of LS-SVM and established the relationship between the spectra and the target attributes. The mean average precision (MAP) scores and prediction accuracy were respectively used to evaluate the performance of models. The prediction results showed that the ISOMAP-LS-SVM model obtained the best classification performance, where the MAP scores and prediction accuracy were 0.947 and 92.86%, respectively. It can be concluded that the ISOMAP-LS-SVM model combined with the MIRS technique has the potential of classifying the species of sorghum in a reasonable accuracy.
Directory of Open Access Journals (Sweden)
Jinman Wang
2014-01-01
Full Text Available The effect of gypsum on the physical and chemical characteristics of sodic soils is nonlinear and controlled by multiple factors. The support vector machine (SVM is able to solve practical problems such as small samples, nonlinearity, high dimensions, and local minima points. This paper reports the use of the SVM regression method to predict changes in the chemical properties of sodic soils under different gypsum application rates in a soil column experiment and to evaluate the effect of gypsum reclamation on sodic soils. The research results show that (1 the SVM soil solute transport model using the Matlab toolbox represents the change in Ca2+ and Na+ in the soil solution and leachate well, with a high prediction accuracy. (2 Using the SVM model to predict the spatial and temporal variations in the soil solute content is feasible and does not require a specific mathematical model. The SVM model can take full advantage of the distribution characteristics of the training sample. (3 The workload of the soil solute transport prediction model based on the SVM is greatly reduced by not having to determine the hydrodynamic dispersion coefficient and retardation coefficient, and the model is thus highly practical.
2018-01-01
Early detection of power transformer fault is important because it can reduce the maintenance cost of the transformer and it can ensure continuous electricity supply in power systems. Dissolved Gas Analysis (DGA) technique is commonly used to identify oil-filled power transformer fault type but utilisation of artificial intelligence method with optimisation methods has shown convincing results. In this work, a hybrid support vector machine (SVM) with modified evolutionary particle swarm optimisation (EPSO) algorithm was proposed to determine the transformer fault type. The superiority of the modified PSO technique with SVM was evaluated by comparing the results with the actual fault diagnosis, unoptimised SVM and previous reported works. Data reduction was also applied using stepwise regression prior to the training process of SVM to reduce the training time. It was found that the proposed hybrid SVM-Modified EPSO (MEPSO)-Time Varying Acceleration Coefficient (TVAC) technique results in the highest correct identification percentage of faults in a power transformer compared to other PSO algorithms. Thus, the proposed technique can be one of the potential solutions to identify the transformer fault type based on DGA data on site. PMID:29370230
International Nuclear Information System (INIS)
Liu, Jie
2015-01-01
This Ph. D. work is motivated by the possibility of monitoring the conditions of components of energy systems for their extended and safe use, under proper practice of operation and adequate policies of maintenance. The aim is to develop a Support Vector Regression (SVR)-based framework for predicting time series data under stationary/nonstationary environmental and operational conditions. Single SVR and SVR-based ensemble approaches are developed to tackle the prediction problem based on both small and large datasets. Strategies are proposed for adaptively updating the single SVR and SVR-based ensemble models in the existence of pattern drifts. Comparisons with other online learning approaches for kernel-based modelling are provided with reference to time series data from a critical component in Nuclear Power Plants (NPPs) provided by Electricite de France (EDF). The results show that the proposed approaches achieve comparable prediction results, considering the Mean Squared Error (MSE) and Mean Relative Error (MRE), in much less computation time. Furthermore, by analyzing the geometrical meaning of the Feature Vector Selection (FVS) method proposed in the literature, a novel geometrically interpretable kernel method, named Reduced Rank Kernel Ridge Regression-II (RRKRR-II), is proposed to describe the linear relations between a predicted value and the predicted values of the Feature Vectors (FVs) selected by FVS. Comparisons with several kernel methods on a number of public datasets prove the good prediction accuracy and the easy-of-tuning of the hyper-parameters of RRKRR-II. (author)
Chen, Chau-Kuang
2010-01-01
Artificial Neural Network (ANN) and Support Vector Machine (SVM) approaches have been on the cutting edge of science and technology for pattern recognition and data classification. In the ANN model, classification accuracy can be achieved by using the feed-forward of inputs, back-propagation of errors, and the adjustment of connection weights. In…
Directory of Open Access Journals (Sweden)
Shovasis Kumar Biswas
2015-02-01
Full Text Available Abstract Support Vector Machine SVM and back-propagation neural network BPNN has been applied successfully in many areas for example rule extraction classification and evaluation. In this paper we studied the back-propagation algorithm for training the multilayer artificial neural network and a support vector machine for data classification and image reconstruction aspects. A model focused on SVM with Gaussian RBF kernel is utilized here for data classification. Back propagation neural network is viewed as one of the most straightforward and is most general methods used for supervised training of multilayered neural network. We compared a support vector machine SVM with a back-propagation neural network BPNN for the task of data classification and image reconstruction. We made a comparison between the performances of the multi-class classification of these two learning methods. Comparing with these two methods we can conclude that the classification accuracy of the support vector machine is better and algorithm is much faster than the MLP with back propagation algorithm.
DEFF Research Database (Denmark)
Andersen, Rasmus S.; Poulsen, Erik S.; Puthusserypady, Sadasivan
2017-01-01
for AF detection based on Inter Beat Intervals (IBI) extracted from long term electrocardiogram (ECG) recordings. Five time-domain features are extracted from the IBIs and a Support Vector Machine (SVM) is used for classiﬁcation. The results are compared to a state of the art algorithm based on raw ECG...
A Machine Learning Regression scheme to design a FR-Image Quality Assessment Algorithm
Charrier, Christophe; Lezoray, Olivier; Lebrun, Gilles
2012-01-01
International audience; A crucial step in image compression is the evaluation of its performance, and more precisely available ways to measure the quality of compressed images. In this paper, a machine learning expert, providing a quality score is proposed. This quality measure is based on a learned classification process in order to respect that of human observers. The proposed method namely Machine Learning-based Image Quality Measurment (MLIQM) first classifies the quality using multi Supp...
Sherafati, Sh. A.; Saradjian, M. R.; Niazmardi, S.
2013-09-01
Numerous investigations on Urban Heat Island (UHI) show that land cover change is the main factor of increasing Land Surface Temperature (LST) in urban areas. Therefore, to achieve a model which is able to simulate UHI growth, urban expansion should be concerned first. Considerable researches on urban expansion modeling have been done based on cellular automata. Accordingly the objective of this paper is to implement CA method for trend detection of Tehran UHI spatiotemporal growth based on urban sprawl parameters (such as Distance to nearest road, Digital Elevation Model (DEM), Slope and Aspect ratios). It should be mentioned that UHI growth modeling may have more complexities in comparison with urban expansion, since the amount of each pixel's temperature should be investigated instead of its state (urban and non-urban areas). The most challenging part of CA model is the definition of Transfer Rules. Here, two methods have used to find appropriate transfer Rules which are Artificial Neural Networks (ANN) and Support Vector Regression (SVR). The reason of choosing these approaches is that artificial neural networks and support vector regression have significant abilities to handle the complications of such a spatial analysis in comparison with other methods like Genetic or Swarm intelligence. In this paper, UHI change trend has discussed between 1984 and 2007. For this purpose, urban sprawl parameters in 1984 have calculated and added to the retrieved LST of this year. In order to achieve LST, Thematic Mapper (TM) and Enhanced Thematic Mapper (ETM+) night-time images have exploited. The reason of implementing night-time images is that UHI phenomenon is more obvious during night hours. After that multilayer feed-forward neural networks and support vector regression have used separately to find the relationship between this data and the retrieved LST in 2007. Since the transfer rules might not be the same in different regions, the satellite image of the city has
Directory of Open Access Journals (Sweden)
Hamed Ahmadi
2017-06-01
Full Text Available BackgroundIn the nutrition literature, there are several reports on the use of artificial neural network (ANN and multiple linear regression (MLR approaches for predicting feed composition and nutritive value, while the use of support vector machines (SVM method as a new alternative approach to MLR and ANN models is still not fully investigated.MethodsThe MLR, ANN, and SVM models were developed to predict metabolizable energy (ME content of compound feeds for pigs based on the German energy evaluation system from analyzed contents of crude protein (CP, ether extract (EE, crude fiber (CF, and starch. A total of 290 datasets from standardized digestibility studies with compound feeds was provided from several institutions and published papers, and ME was calculated thereon. Accuracy and precision of developed models were evaluated, given their produced prediction values.ResultsThe results revealed that the developed ANN [R2 = 0.95; root mean square error (RMSE = 0.19 MJ/kg of dry matter] and SVM (R2 = 0.95; RMSE = 0.21 MJ/kg of dry matter models produced better prediction values in estimating ME in compound feed than those produced by conventional MLR (R2 = 0.89; RMSE = 0.27 MJ/kg of dry matter.ConclusionThe developed ANN and SVM models produced better prediction values in estimating ME in compound feed than those produced by conventional MLR; however, there were not obvious differences between performance of ANN and SVM models. Thus, SVM model may also be considered as a promising tool for modeling the relationship between chemical composition and ME of compound feeds for pigs. To provide the readers and nutritionist with the easy and rapid tool, an Excel® calculator, namely, SVM_ME_pig, was created to predict the metabolizable energy values in compound feeds for pigs using developed support vector machine model.
Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue
2016-01-01
Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.
Liu, Bing-Chun; Binaykia, Arihant; Chang, Pei-Chann; Tiwari, Manoj Kumar; Tsao, Cheng-Chin
2017-01-01
Today, China is facing a very serious issue of Air Pollution due to its dreadful impact on the human health as well as the environment. The urban cities in China are the most affected due to their rapid industrial and economic growth. Therefore, it is of extreme importance to come up with new, better and more reliable forecasting models to accurately predict the air quality. This paper selected Beijing, Tianjin and Shijiazhuang as three cities from the Jingjinji Region for the study to come up with a new model of collaborative forecasting using Support Vector Regression (SVR) for Urban Air Quality Index (AQI) prediction in China. The present study is aimed to improve the forecasting results by minimizing the prediction error of present machine learning algorithms by taking into account multiple city multi-dimensional air quality information and weather conditions as input. The results show that there is a decrease in MAPE in case of multiple city multi-dimensional regression when there is a strong interaction and correlation of the air quality characteristic attributes with AQI. Also, the geographical location is found to play a significant role in Beijing, Tianjin and Shijiazhuang AQI prediction. PMID:28708836
Liu, Bing-Chun; Binaykia, Arihant; Chang, Pei-Chann; Tiwari, Manoj Kumar; Tsao, Cheng-Chin
2017-01-01
Today, China is facing a very serious issue of Air Pollution due to its dreadful impact on the human health as well as the environment. The urban cities in China are the most affected due to their rapid industrial and economic growth. Therefore, it is of extreme importance to come up with new, better and more reliable forecasting models to accurately predict the air quality. This paper selected Beijing, Tianjin and Shijiazhuang as three cities from the Jingjinji Region for the study to come up with a new model of collaborative forecasting using Support Vector Regression (SVR) for Urban Air Quality Index (AQI) prediction in China. The present study is aimed to improve the forecasting results by minimizing the prediction error of present machine learning algorithms by taking into account multiple city multi-dimensional air quality information and weather conditions as input. The results show that there is a decrease in MAPE in case of multiple city multi-dimensional regression when there is a strong interaction and correlation of the air quality characteristic attributes with AQI. Also, the geographical location is found to play a significant role in Beijing, Tianjin and Shijiazhuang AQI prediction.
Directory of Open Access Journals (Sweden)
Bing-Chun Liu
Full Text Available Today, China is facing a very serious issue of Air Pollution due to its dreadful impact on the human health as well as the environment. The urban cities in China are the most affected due to their rapid industrial and economic growth. Therefore, it is of extreme importance to come up with new, better and more reliable forecasting models to accurately predict the air quality. This paper selected Beijing, Tianjin and Shijiazhuang as three cities from the Jingjinji Region for the study to come up with a new model of collaborative forecasting using Support Vector Regression (SVR for Urban Air Quality Index (AQI prediction in China. The present study is aimed to improve the forecasting results by minimizing the prediction error of present machine learning algorithms by taking into account multiple city multi-dimensional air quality information and weather conditions as input. The results show that there is a decrease in MAPE in case of multiple city multi-dimensional regression when there is a strong interaction and correlation of the air quality characteristic attributes with AQI. Also, the geographical location is found to play a significant role in Beijing, Tianjin and Shijiazhuang AQI prediction.
Delbari, Masoomeh; Sharifazari, Salman; Mohammadi, Ehsan
2018-02-01
The knowledge of soil temperature at different depths is important for agricultural industry and for understanding climate change. The aim of this study is to evaluate the performance of a support vector regression (SVR)-based model in estimating daily soil temperature at 10, 30 and 100 cm depth at different climate conditions over Iran. The obtained results were compared to those obtained from a more classical multiple linear regression (MLR) model. The correlation sensitivity for the input combinations and periodicity effect were also investigated. Climatic data used as inputs to the models were minimum and maximum air temperature, solar radiation, relative humidity, dew point, and the atmospheric pressure (reduced to see level), collected from five synoptic stations Kerman, Ahvaz, Tabriz, Saghez, and Rasht located respectively in the hyper-arid, arid, semi-arid, Mediterranean, and hyper-humid climate conditions. According to the results, the performance of both MLR and SVR models was quite well at surface layer, i.e., 10-cm depth. However, SVR performed better than MLR in estimating soil temperature at deeper layers especially 100 cm depth. Moreover, both models performed better in humid climate condition than arid and hyper-arid areas. Further, adding a periodicity component into the modeling process considerably improved the models' performance especially in the case of SVR.
Yao, X.; Tham, L. G.; Dai, F. C.
2008-11-01
The Support Vector Machine (SVM) is an increasingly popular learning procedure based on statistical learning theory, and involves a training phase in which the model is trained by a training dataset of associated input and target output values. The trained model is then used to evaluate a separate set of testing data. There are two main ideas underlying the SVM for discriminant-type problems. The first is an optimum linear separating hyperplane that separates the data patterns. The second is the use of kernel functions to convert the original non-linear data patterns into the format that is linearly separable in a high-dimensional feature space. In this paper, an overview of the SVM, both one-class and two-class SVM methods, is first presented followed by its use in landslide susceptibility mapping. A study area was selected from the natural terrain of Hong Kong, and slope angle, slope aspect, elevation, profile curvature of slope, lithology, vegetation cover and topographic wetness index (TWI) were used as environmental parameters which influence the occurrence of landslides. One-class and two-class SVM models were trained and then used to map landslide susceptibility respectively. The resulting susceptibility maps obtained by the methods were compared to that obtained by the logistic regression (LR) method. It is concluded that two-class SVM possesses better prediction efficiency than logistic regression and one-class SVM. However, one-class SVM, which only requires failed cases, has an advantage over the other two methods as only "failed" case information is usually available in landslide susceptibility mapping.
Astuti, Yuniar Andi
2011-01-01
This study examines techniques Support Vector Regression and Decision Tree C4.5 has been used in studies in various fields, in order to know the advantages and disadvantages of both techniques that appear in Data Mining. From the ten studies that use both techniques, the results of the analysis showed that the accuracy of the SVR technique for 59,64% and C4.5 for 76,97% So in this study obtained a statement that C4.5 is better than SVR 097038020
Directory of Open Access Journals (Sweden)
Hosein Nouri-Ahmadabadi
2017-12-01
Full Text Available In this study, an intelligent system based on combined machine vision (MV and Support Vector Machine (SVM was developed for sorting of peeled pistachio kernels and shells. The system was composed of conveyor belt, lighting box, camera, processing unit and sorting unit. A color CCD camera was used to capture images. The images were digitalized by a capture card and transferred to a personal computer for further analysis. Initially, images were converted from RGB color space to HSV color ones. For segmentation of the acquired images, H-component in the HSV color space and Otsu thresholding method were applied. A feature vector containing 30 color features was extracted from the captured images. A feature selection method based on sensitivity analysis was carried out to select superior features. The selected features were presented to SVM classifier. Various SVM models having a different kernel function were developed and tested. The SVM model having cubic polynomial kernel function and 38 support vectors achieved the best accuracy (99.17% and then was selected to use in online decision-making unit of the system. By launching the online system, it was found that limiting factors of the system capacity were related to the hardware parts of the system (conveyor belt and pneumatic valves used in the sorting unit. The limiting factors led to a distance of 8 mm between the samples. The overall accuracy and capacity of the sorter were obtained 94.33% and 22.74 kg/h, respectively. Keywords: Pistachio kernel, Sorting, Machine vision, Sensitivity analysis, Support vector machine
Alouene, Y.; Petropoulos, G. P.; Kalogrias, A.; Papanikolaou, F.
2012-04-01
Floods are a water-related natural disaster affecting and often threatening different aspects of human life, such as property damage, economic degradation, and in some instances even loss of precious human lives. Being able to provide accurately and cost-effectively assessment of damage from floods is essential to both scientists and policy makers in many aspects ranging from mitigating to assessing damage extent as well as in rehabilitation of affected areas. Remote Sensing often combined with Geographical Information Systems (GIS) has generally shown a very promising potential in performing rapidly and cost-effectively flooding damage assessment, particularly so in remote, otherwise inaccessible locations. The progress in remote sensing during the last twenty years or so has resulted to the development of a large number of image processing techniques suitable for use with a range of remote sensing data in performing flooding damage assessment. Supervised image classification is regarded as one of the most widely used approaches employed for this purpose. Yet, the use of recently developed image classification algorithms such as of machine learning-based Support Vector Machines (SVMs) classifier has not been adequately investigated for this purpose. The objective of our work had been to quantitatively evaluate the ability of SVMs combined with Landsat TM multispectral imagery in performing a damage assessment of a flood occurred in a Mediterranean region. A further objective has been to examine if the inclusion of additional spectral information apart from the original TM bands in SVMs can improve flooded area extraction accuracy. As a case study is used the case of a river Evros flooding of 2010 located in the north of Greece, in which TM imagery before and shortly after the flooding was available. Assessment of the flooded area is performed in a GIS environment on the basis of classification accuracy assessment metrics as well as comparisons versus a vector
PENERAPAN METODE VECTOR AUTO REGRESSION DALAM INTERAKSI KEBIJAKAN FISKAL DAN MONETER DI INDONESIA
Directory of Open Access Journals (Sweden)
Adrian Sutawijaya
2013-06-01
Full Text Available The purpose of this study is to analyze the interaction of fiscal and monetary policy in Indonesia, especially after the introduction of fiscal and monetary policy shocks. The research method used is the vector autoregression (VAR. VAR is usually used for projecting coherent system variables and time to analyze the dynamic impact of disturbance factors contained in the system variables. Variables used in this study is the level of interest rates as a proxy for monetary policy instruments, government expenditures as a proxy for fiscal policy, inflation rates and national income. The results show that fiscal policy is a negative shock to inflation and responded with a tight monetary policy, while the shock in monetary policy will reduce national income. The application of fiscal and monetary policies that will effectively promote economic growth.
Peramalan Crude Palm Oil (CPO Menggunakan Support Vector Regression Kernel Radial Basis
Directory of Open Access Journals (Sweden)
Rezzy Eko Caraka
2017-06-01
Full Text Available Recently, instead of selecting a kernel has been proposed which uses SVR, where the weight of each kernel is optimized during training. Along this line of research, many pioneering kernel learning algorithms have been proposed. The use of kernels provides a powerful and principled approach to modeling nonlinear patterns through linear patterns in a feature space. Another bene?t is that the design of kernels and linear methods can be decoupled, which greatly facilitates the modularity of machine learning methods. We perform experiments on real data sets crude palm oil prices for application and better illustration using kernel radial basis. We see that evaluation gives a good to fit prediction and actual also good values showing the validity and accuracy of the realized model based on MAPE and R2. Keywords: Crude Palm Oil; Forecasting; SVR; Radial Basis; Kernel
Clustering and Support Vector Regression for Water Demand Forecasting and Anomaly Detection
Directory of Open Access Journals (Sweden)
Antonio Candelieri
2017-03-01
Full Text Available This paper presents a completely data-driven and machine-learning-based approach, in two stages, to first characterize and then forecast hourly water demand in the short term with applications of two different data sources: urban water demand (SCADA data and individual customer water consumption (AMR data. In the first case, reliable forecasting can be used to optimize operations, particularly the pumping schedule, in order to reduce energy-related costs, while in the second case, the comparison between forecast and actual values may support the online detection of anomalies, such as smart meter faults, fraud or possible cyber-physical attacks. Results are presented for a real case: the water distribution network in Milan.
Directory of Open Access Journals (Sweden)
Gabere MN
2016-06-01
Full Text Available Musa Nur Gabere,1 Mohamed Aly Hussein,1 Mohammad Azhar Aziz2 1Department of Bioinformatics, King Abdullah International Medical Research Center/King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia; 2Colorectal Cancer Research Program, Department of Medical Genomics, King Abdullah International Medical Research Center, Riyadh, Saudi Arabia Purpose: There has been considerable interest in using whole-genome expression profiles for the classification of colorectal cancer (CRC. The selection of important features is a crucial step before training a classifier.Methods: In this study, we built a model that uses support vector machine (SVM to classify cancer and normal samples using Affymetrix exon microarray data obtained from 90 samples of 48 patients diagnosed with CRC. From the 22,011 genes, we selected the 20, 30, 50, 100, 200, 300, and 500 genes most relevant to CRC using the minimum-redundancy–maximum-relevance (mRMR technique. With these gene sets, an SVM model was designed using four different kernel types (linear, polynomial, radial basis function [RBF], and sigmoid.Results: The best model, which used 30 genes and RBF kernel, outperformed other combinations; it had an accuracy of 84% for both ten fold and leave-one-out cross validations in discriminating the cancer samples from the normal samples. With this 30 genes set from mRMR, six classifiers were trained using random forest (RF, Bayes net (BN, multilayer perceptron (MLP, naïve Bayes (NB, reduced error pruning tree (REPT, and SVM. Two hybrids, mRMR + SVM and mRMR + BN, were the best models when tested on other datasets, and they achieved a prediction accuracy of 95.27% and 91.99%, respectively, compared to other mRMR hybrid models (mRMR + RF, mRMR + NB, mRMR + REPT, and mRMR + MLP. Ingenuity pathway analysis was used to analyze the functions of the 30 genes selected for this model and their potential association with CRC: CDH3, CEACAM7, CLDN1, IL8, IL6R, MMP1
Directory of Open Access Journals (Sweden)
Abazar Solgi
2017-06-01
Full Text Available Introduction: Chemical pollution of surface water is one of the serious issues that threaten the quality of water. This would be more important when the surface waters used for human drinking supply. One of the key parameters used to measure water pollution is BOD. Because many variables affect the water quality parameters and a complex nonlinear relationship between them is established conventional methods can not solve the problem of quality management of water resources. For years, the Artificial Intelligence methods were used for prediction of nonlinear time series and a good performance of them has been reported. Recently, the wavelet transform that is a signal processing method, has shown good performance in hydrological modeling and is widely used. Extensive research has been globally provided in use of Artificial Neural Network and Adaptive Neural Fuzzy Inference System models to forecast the BOD. But support vector machine has not yet been extensively studied. For this purpose, in this study the ability of support vector machine to predict the monthly BOD parameter based on the available data, temperature, river flow, DO and BOD was evaluated. Materials and Methods: SVM was introduced in 1992 by Vapnik that was a Russian mathematician. This method has been built based on the statistical learning theory. In recent years the use of SVM, is highly taken into consideration. SVM was used in applications such as handwriting recognition, face recognition and has good results. Linear SVM is simplest type of SVM, consists of a hyperplane that dataset of positive and negative is separated with maximum distance. The suitable separator has maximum distance from every one of two dataset. So about this machine that its output groups label (here -1 to +1, the aim is to obtain the maximum distance between categories. This is interpreted to have a maximum margin. Wavelet transform is one of methods in the mathematical science that its main idea was
Directory of Open Access Journals (Sweden)
P. Arockia Jansi Rani
2010-08-01
Full Text Available Image compression is very important in reducing the costs of data storage and transmission in relatively slow channels. In this paper, a still image compression scheme driven by Self-Organizing Map with polynomial regression modeling and entropy coding, employed within the wavelet framework is presented. The image compressibility and interpretability are improved by incorporating noise reduction into the compression scheme. The implementation begins with the classical wavelet decomposition, quantization followed by Huffman encoder. The codebook for the quantization process is designed using an unsupervised learning algorithm and further modified using polynomial regression to control the amount of noise reduction. Simulation results show that the proposed method reduces bit rate significantly and provides better perceptual quality than earlier methods.
A support vector machine to search for metal-poor galaxies
Shi, Fei; Liu, Yu-Yan; Kong, Xu; Chen, Yang; Li, Zhong-Hua; Zhi, Shu-Teng
2014-10-01
To develop a fast and reliable method for selecting metal-poor galaxies (MPGs), especially in large surveys and huge data bases, a support vector machine (SVM) supervized learning algorithms is applied to a sample of star-forming galaxies from the Sloan Digital Sky Survey data release 9 provided by the Max Planck Institute and the Johns Hopkins University (http://www.sdss3.org/dr9/spectro/spectroaccess.php). A two-step approach is adopted: (i) the SVM must be trained with a subset of objects that are known to be either MPGs or metal-rich galaxies (MRGs), treating the strong emission line flux measurements as input feature vectors in n-dimensional space, where n is the number of strong emission line flux ratios. (ii) After training on a sample of star-forming galaxies, the remaining galaxies are classified in the automatic test analysis as either MPGs or MRGs using a 10-fold cross-validation technique. For target selection, we have achieved an acquisition accuracy for MPGs of ˜96 and ˜95 per cent for an MPG threshold of 12 + log(O/H) = 8.00 and 12 + log(O/H) = 8.39, respectively. Running the code takes minutes in most cases under the MATLAB 2013a software environment. The code in the Letter is available on the web (http://fshi5388.blog.163.com). The SVM method can easily be extended to any MPGs target selection task and can be regarded as an efficient classification method particularly suitable for modern large surveys.
TJ-II wave forms analysis with wavelets and support vector machines
International Nuclear Information System (INIS)
Dormido-Canto, S.; Farias, G.; Dormido, R.; Vega, J.; Sanchez, J.; Santos, M.
2004-01-01
Since the fusion plasma experiment generates hundreds of signals, it is essential to have automatic mechanisms for searching similarities and retrieving of specific data in the wave form database. Wavelet transform (WT) is a transformation that allows one to map signals to spaces of lower dimensionality. Support vector machine (SVM) is a very effective method for general purpose pattern recognition. Given a set of input vectors which belong to two different classes, the SVM maps the inputs into a high-dimensional feature space through some nonlinear mapping, where an optimal separating hyperplane is constructed. In this work, the combined use of WT and SVM is proposed for searching and retrieving similar wave forms in the TJ-II database. In a first stage, plasma signals will be preprocessed by WT to reduce their dimensionality and to extract their main features. In the next stage, and using the smoothed signals produced by the WT, SVM will be applied to show up the efficiency of the proposed method to deal with the problem of sorting out thousands of fusion plasma signals.From observation of several experiments, our WT+SVM method is very viable, and the results seems promising. However, we have further work to do. We have to finish the development of a Matlab toolbox for WT+SVM processing and to include new relevant features in the SVM inputs to improve the technique. We have also to make a better preprocessing of the input signals and to study the performance of other generic and self custom kernels. To reach it, and since the preprocessing stages are very time consuming, we are going to study the viability of using DSPs, RPGAs or parallel programming techniques to reduce the execution time
Support vector machine-based facial-expression recognition method combining shape and appearance
Han, Eun Jung; Kang, Byung Jun; Park, Kang Ryoung; Lee, Sangyoun
2010-11-01
Facial expression recognition can be widely used for various applications, such as emotion-based human-machine interaction, intelligent robot interfaces, face recognition robust to expression variation, etc. Previous studies have been classified as either shape- or appearance-based recognition. The shape-based method has the disadvantage that the individual variance of facial feature points exists irrespective of similar expressions, which can cause a reduction of the recognition accuracy. The appearance-based method has a limitation in that the textural information of the face is very sensitive to variations in illumination. To overcome these problems, a new facial-expression recognition method is proposed, which combines both shape and appearance information, based on the support vector machine (SVM). This research is novel in the following three ways as compared to previous works. First, the facial feature points are automatically detected by using an active appearance model. From these, the shape-based recognition is performed by using the ratios between the facial feature points based on the facial-action coding system. Second, the SVM, which is trained to recognize the same and different expression classes, is proposed to combine two matching scores obtained from the shape- and appearance-based recognitions. Finally, a single SVM is trained to discriminate four different expressions, such as neutral, a smile, anger, and a scream. By determining the expression of the input facial image whose SVM output is at a minimum, the accuracy of the expression recognition is much enhanced. The experimental results showed that the recognition accuracy of the proposed method was better than previous researches and other fusion methods.
Sørensen, Lauge; Nielsen, Mads
2018-05-15
The International Challenge for Automated Prediction of MCI from MRI data offered independent, standardized comparison of machine learning algorithms for multi-class classification of normal control (NC), mild cognitive impairment (MCI), converting MCI (cMCI), and Alzheimer's disease (AD) using brain imaging and general cognition. We proposed to use an ensemble of support vector machines (SVMs) that combined bagging without replacement and feature selection. SVM is the most commonly used algorithm in multivariate classification of dementia, and it was therefore valuable to evaluate the potential benefit of ensembling this type of classifier. The ensemble SVM, using either a linear or a radial basis function (RBF) kernel, achieved multi-class classification accuracies of 55.6% and 55.0% in the challenge test set (60 NC, 60 MCI, 60 cMCI, 60 AD), resulting in a third place in the challenge. Similar feature subset sizes were obtained for both kernels, and the most frequently selected MRI features were the volumes of the two hippocampal subregions left presubiculum and right subiculum. Post-challenge analysis revealed that enforcing a minimum number of selected features and increasing the number of ensemble classifiers improved classification accuracy up to 59.1%. The ensemble SVM outperformed single SVM classifications consistently in the challenge test set. Ensemble methods using bagging and feature selection can improve the performance of the commonly applied SVM classifier in dementia classification. This resulted in competitive classification accuracies in the International Challenge for Automated Prediction of MCI from MRI data. Copyright © 2018 Elsevier B.V. All rights reserved.
An efficient scheme for automatic web pages categorization using the support vector machine
Bhalla, Vinod Kumar; Kumar, Neeraj
2016-07-01
In the past few years, with an evolution of the Internet and related technologies, the number of the Internet users grows exponentially. These users demand access to relevant web pages from the Internet within fraction of seconds. To achieve this goal, there is a requirement of an efficient categorization of web page contents. Manual categorization of these billions of web pages to achieve high accuracy is a challenging task. Most of the existing techniques reported in the literature are semi-automatic. Using these techniques, higher level of accuracy cannot be achieved. To achieve these goals, this paper proposes an automatic web pages categorization into the domain category. The proposed scheme is based on the identification of specific and relevant features of the web pages. In the proposed scheme, first extraction and evaluation of features are done followed by filtering the feature set for categorization of domain web pages. A feature extraction tool based on the HTML document object model of the web page is developed in the proposed scheme. Feature extraction and weight assignment are based on the collection of domain-specific keyword list developed by considering various domain pages. Moreover, the keyword list is reduced on the basis of ids of keywords in keyword list. Also, stemming of keywords and tag text is done to achieve a higher accuracy. An extensive feature set is generated to develop a robust classification technique. The proposed scheme was evaluated using a machine learning method in combination with feature extraction and statistical analysis using support vector machine kernel as the classification tool. The results obtained confirm the effectiveness of the proposed scheme in terms of its accuracy in different categories of web pages.
Support vector machines for automated snoring detection: proof-of-concept.
Samuelsson, Laura B; Rangarajan, Anusha A; Shimada, Kenji; Krafty, Robert T; Buysse, Daniel J; Strollo, Patrick J; Kravitz, Howard M; Zheng, Huiyong; Hall, Martica H
2017-03-01
Snoring has been shown to be associated with adverse physical and mental health, independent of the effects of sleep disordered breathing. Despite increasing evidence for the risks of snoring, few studies on sleep and health include objective measures of snoring. One reason for this methodological limitation is the difficulty of quantifying snoring. Conventional methods may rely on manual scoring of snore events by trained human scorers, but this process is both time- and labor-intensive, making the measurement of objective snoring impractical for large or multi-night studies. The current study is a proof-of-concept to validate the use of support vector machines (SVM), a form of machine learning, for the automated scoring of an objective snoring signal. An SVM algorithm was trained and tested on a set of approximately 150,000 snoring and non-snoring data segments, and F-scores for SVM performance compared to visual scoring performance were calculated using the Wilcoxon signed rank test for paired data. The ability of the SVM algorithm to discriminate snore from non-snore segments of data did not differ statistically from visual scorer performance (SVM F-score = 82.46 ± 7.93 versus average visual F-score = 88.35 ± 4.61, p = 0.2786), supporting SVM snore classification ability comparable to visual scorers. In this proof-of-concept, we established that the SVM algorithm performs comparably to trained visual scorers, supporting the use of SVM for automated snoring detection in future studies.
Ben Salem, Samira; Bacha, Khmais; Chaari, Abdelkader
2012-09-01
In this work we suggest an original fault signature based on an improved combination of Hilbert and Park transforms. Starting from this combination we can create two fault signatures: Hilbert modulus current space vector (HMCSV) and Hilbert phase current space vector (HPCSV). These two fault signatures are subsequently analysed using the classical fast Fourier transform (FFT). The effects of mechanical faults on the HMCSV and HPCSV spectrums are described, and the related frequencies are determined. The magnitudes of spectral components, relative to the studied faults (air-gap eccentricity and outer raceway ball bearing defect), are extracted in order to develop the input vector necessary for learning and testing the support vector machine with an aim of classifying automatically the various states of the induction motor. Copyright © 2012 ISA. Published by Elsevier Ltd. All rights reserved.