WorldWideScience

Sample records for machine regression svr

  1. Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.

    Science.gov (United States)

    Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo

    2015-08-01

    Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of the support vector machine. Among its many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas, and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, orthogonal matching pursuit, is two orders of magnitude faster than a well-known RBF network design algorithm, the orthogonal least squares algorithm.
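
To make the l1-norm SVR idea concrete, the sketch below evaluates one common textbook form of the objective for a linear model: the l1 norm of the weights plus C times the epsilon-insensitive loss. The abstract does not give the paper's exact formulation, so the function name `l1_svr_objective`, the parameters `C` and `eps`, and the toy data are all illustrative assumptions.

```python
def l1_svr_objective(w, b, X, y, C=1.0, eps=0.1):
    """l1 penalty on the weights plus C times the epsilon-insensitive loss.

    Assumed textbook form, not necessarily the formulation used in the paper.
    """
    penalty = sum(abs(wj) for wj in w)
    loss = 0.0
    for xi, yi in zip(X, y):
        pred = sum(wj * xj for wj, xj in zip(w, xi)) + b
        # Residuals smaller than eps cost nothing.
        loss += max(0.0, abs(yi - pred) - eps)
    return penalty + C * loss


# Toy data: the second feature is always zero, so it carries no information.
X = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
y = [2.0, 4.0, 6.0]

sparse_cost = l1_svr_objective([2.0, 0.0], 0.0, X, y)  # ignores the useless feature
dense_cost = l1_svr_objective([2.0, 1.5], 0.0, X, y)   # same fit, larger l1 norm
```

Both weight vectors fit the toy data equally well, but the sparse one has the lower objective, which is the sense in which the l1 penalty favours dropping redundant features.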

  2. Application of Hybrid Quantum Tabu Search with Support Vector Regression (SVR) for Load Forecasting

    Directory of Open Access Journals (Sweden)

    Cheng-Wen Lee

    2016-10-01

    Hybridizing chaotic evolutionary algorithms with support vector regression (SVR) to improve forecasting accuracy is a hot topic in electricity load forecasting. Trapping at local optima and premature convergence are critical shortcomings of the tabu search (TS) algorithm. This paper investigates potential improvements of the TS algorithm by applying quantum computing mechanics to enhance the search information sharing mechanism (tabu memory) and thereby improve forecasting accuracy. It presents an SVR-based load forecasting model that integrates quantum behaviors and the TS algorithm with the support vector regression model (namely SVRQTS) to obtain more satisfactory forecasting accuracy. Numerical examples demonstrate that the proposed model outperforms the alternatives.

  3. Direct Surge Margin Control for Aeroengines Based on Improved SVR Machine and LQR Method

    Directory of Open Access Journals (Sweden)

    Haibo Zhang

    2013-01-01

    A novel high stability engine control (HISTEC) scheme based on an improved linear quadratic regulator (ILQR), called direct surge margin control, is derived for super-maneuver flights. Unlike conventional control schemes, direct surge margin control puts the surge margin into the engine closed-loop system and takes it directly as the controlled variable. In this way, direct surge margin control can exploit the potential performance of the engine more effectively with the decrease of engine stability margin that usually occurs in super-maneuver flights. To overcome the difficulty that the aeroengine surge margin cannot be measured, an approach based on an improved support vector regression (SVR) machine is proposed to construct a surge margin prediction model. The surge margin model contains two parts: a baseline model for states without inlet distortion and a calculation of the surge margin loss under super-maneuvering flight conditions. The former is developed using a neural network method, the inputs of which are selected by a weighted feature selection algorithm. Considering the hysteresis between pilot input and angle of attack output, an online scrolling window least square support vector regression (LSSVR) method is employed to first estimate the inlet distortion index and then compute the surge margin loss via empirical look-up tables.

  4. An adaptive online learning approach for Support Vector Regression: Online-SVR-FID

    Science.gov (United States)

    Liu, Jie; Zio, Enrico

    2016-08-01

    Support Vector Regression (SVR) is a popular supervised data-driven approach for building empirical models from available data. Like all data-driven methods, under non-stationary environmental and operational conditions it needs to be provided with adaptive learning capabilities, which might become computationally burdensome with large datasets accumulating dynamically. In this paper, a cost-efficient online adaptive learning approach is proposed for SVR by combining Feature Vector Selection (FVS) and Incremental and Decremental Learning. The proposed approach adaptively modifies the model only when different pattern drifts are detected according to proposed criteria. Two tolerance parameters are introduced in the approach to control the computational complexity, reduce the influence of the intrinsic noise in the data and avoid the overfitting problem of SVR. The prediction results are compared with those of other online learning approaches (e.g., NORMA, SOGA, KRLS, Incremental Learning) on several artificial datasets and on a real case study concerning time series prediction based on data recorded on a component of a nuclear power generation system. The performance indicators MSE and MARE computed on the test dataset demonstrate the efficiency of the proposed online learning method.
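
The two performance indicators named above can be sketched in a few lines. MSE is the mean squared error; MARE is read here as the mean absolute relative error, which is an assumption because the abstract does not spell out the acronym. The function names and sample values are illustrative only.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mare(y_true, y_pred):
    """Mean absolute relative error (assumed reading of MARE).

    Requires every true value to be non-zero.
    """
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [2.0, 4.0]
y_pred = [1.0, 5.0]
test_mse = mse(y_true, y_pred)
test_mare = mare(y_true, y_pred)
```

Because MARE divides by the true value, the same absolute error counts for more on small targets than on large ones, which MSE does not capture.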

  5. The seam offset identification based on support vector regression machines

    Institute of Scientific and Technical Information of China (English)

    Zeng Songsheng; Shi Yonghua; Wang Guorong; Huang Guoxing

    2009-01-01

    The principle of the support vector regression machine (SVR) is first analysed. Then a new data-dependent kernel function is constructed from an information geometry perspective. The current waveforms change regularly with the horizontal offset when the rotational frequency of the high speed rotational arc sensor is in the range from 15 Hz to 30 Hz. The welding current data are pretreated by wavelet filtering, mean filtering and normalization. The SVR model is constructed by making use of these evolvement laws; the decision function is obtained by training the SVR, and the seam offset can then be identified. The experimental results show that the precision of the offset identification can be greatly improved by modifying the SVR and applying mean filtering from the longitudinal direction.

  6. Prediction of Tourism Demand in Iran by Using Artificial Neural Network (ANN) and Supporting Vector Machine (SVR)

    Directory of Open Access Journals (Sweden)

    Seyedehelham Sadatiseyedmahalleh

    2016-02-01

    This research examines and demonstrates the effectiveness of artificial neural networks (ANNs) as an alternative to the Support Vector Machine (SVR) in tourism research. The method can be used by the tourism industry to estimate tourism demand in Iran. The results suggest that using ANNs in tourism research may yield better estimates in terms of prediction bias and accuracy. Further applications of ANNs to tourism demand evaluation are needed to establish and validate these effects.

  7. Hybrid Forecasting Approach Based on GRNN Neural Network and SVR Machine for Electricity Demand Forecasting

    Directory of Open Access Journals (Sweden)

    Weide Li

    2017-01-01

    Accurate electric power demand forecasting plays a key role in electricity markets and power systems. Electric power demand is usually non-linear for various unknown reasons, which makes accurate prediction difficult with traditional methods. The purpose of this paper is to propose a novel hybrid forecasting method for managing and scheduling electric power. The proposed method, EEMD-SCGRNN-PSVR, combines ensemble empirical mode decomposition (EEMD), seasonal adjustment (S), cross validation (C), general regression neural network (GRNN), and support vector regression machine optimized by the particle swarm optimization algorithm (PSVR). The main idea of EEMD-SCGRNN-PSVR is to forecast separately the waveform and trend components hidden in the demand series, instead of forecasting the original electric demand directly. EEMD-SCGRNN-PSVR is used to predict half-hourly electricity demand one week ahead in two data sets (New South Wales (NSW) and Victorian State (VIC) in Australia). Experimental results show that the new hybrid model outperforms the other three models in terms of forecasting accuracy and model robustness.

  8. Urban air quality forecasting based on multi-dimensional collaborative Support Vector Regression (SVR): A case study of Beijing-Tianjin-Shijiazhuang

    Science.gov (United States)

    Liu, Bing-Chun; Binaykia, Arihant; Chang, Pei-Chann; Tiwari, Manoj Kumar; Tsao, Cheng-Chin

    2017-01-01

    Today, China is facing a very serious issue of Air Pollution due to its dreadful impact on the human health as well as the environment. The urban cities in China are the most affected due to their rapid industrial and economic growth. Therefore, it is of extreme importance to come up with new, better and more reliable forecasting models to accurately predict the air quality. This paper selected Beijing, Tianjin and Shijiazhuang as three cities from the Jingjinji Region for the study to come up with a new model of collaborative forecasting using Support Vector Regression (SVR) for Urban Air Quality Index (AQI) prediction in China. The present study is aimed to improve the forecasting results by minimizing the prediction error of present machine learning algorithms by taking into account multiple city multi-dimensional air quality information and weather conditions as input. The results show that there is a decrease in MAPE in case of multiple city multi-dimensional regression when there is a strong interaction and correlation of the air quality characteristic attributes with AQI. Also, the geographical location is found to play a significant role in Beijing, Tianjin and Shijiazhuang AQI prediction. PMID:28708836

  9. Prediction of Pressure Drop of Slurry Flow in Pipeline by Hybrid Support Vector Regression and Genetic Algorithm Model

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    This paper describes a robust support vector regression (SVR) methodology, which can offer superior performance for important process engineering problems. The method incorporates a hybrid support vector regression and genetic algorithm technique (SVR-GA) for efficient tuning of SVR meta-parameters. The algorithm has been applied to the prediction of the pressure drop of solid-liquid slurry flow. A comparison with selected correlations in the literature showed that the developed SVR correlation noticeably improved the prediction of pressure drop over a wide range of operating conditions, physical properties, and pipe diameters.

  10. A Novel Homogenous Hybridization Scheme for Performance Improvement of Support Vector Machines Regression in Reservoir Characterization

    Directory of Open Access Journals (Sweden)

    Kabiru O. Akande

    2016-01-01

    Hybrid computational intelligence is defined as a combination of multiple intelligent algorithms such that the resulting model has superior performance to the individual algorithms. Therefore, the importance of fusing two or more intelligent algorithms to achieve better performance cannot be overemphasized. In this work, a novel homogenous hybridization scheme is proposed for the improvement of the generalization and predictive ability of support vector machines regression (SVR). The proposed and developed hybrid SVR (HSVR) works by considering the initial SVR prediction as a feature extraction process and then employs the SVR output, which is the extracted feature, as its sole descriptor. The developed hybrid model is applied to the prediction of reservoir permeability, and the predicted permeability is compared to core permeability, which is regarded as the standard in the petroleum industry. The results show that the proposed hybrid scheme (HSVR) performed better than the existing SVR in both generalization and prediction ability. The outcome of this research will assist petroleum engineers in effectively predicting the permeability of carbonate reservoirs with a higher degree of accuracy and will invariably lead to better reservoir. Furthermore, the encouraging performance of this hybrid will serve as an impetus for further exploring homogenous hybrid systems.

  11. [Calibration transfer between two FTNIR spectrophotometers using SVR].

    Science.gov (United States)

    Zhao, Long-lian; Li, Jun-hui; Zhang, Wen-juan; Wang, Jian-cai; Zhang, Lu-da

    2008-10-01

    In the present research, a set of maize powder samples was used to study the calibration transfer between two Fourier transform near-infrared (FTNIR) spectrophotometers, and a method of moving window support vector regression machines (SVR) was used to correct the differences between the two instruments. The Bruker Vector 22/N was referred to as the "master" instrument, on which the maize protein calibration model was built. The Bruker MPA was referred to as the "slave" instrument. A transformation matrix was constructed based on the spectra of a sample set (for calibration transfer) measured on both instruments. After transfer, NIR spectra acquired on the "slave" appear as if they were measured on the master instrument. The calibration model available for the master can then be used to predict the transformed spectra measured on the slave. The transfer parameters were computed as follows. For wavelength i, the absorbance vector obtained on the master instrument was regressed against the corresponding absorbance matrix of a spectral window obtained on the slave instrument. SVR was used for the regression. By moving the wavelength i and the corresponding window, the transfer parameter for each wavelength can be obtained. For the two FTNIR spectrophotometers, a window size of 31 wavelengths and a subset of 15 transfer samples were chosen to establish the SVR regression model between "master" and "slave". When the calibration model is applied to the prediction samples after correction by the transfer parameters, a good transfer performance can be achieved. The correlation coefficient (r) is 0.9434, while the relative standard deviation (RSD) is 4.23%. These results suggest that the SVR method can be used to successfully transfer the calibration model for maize protein developed on one FTNIR spectrophotometer to another.
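
The moving-window step described above can be sketched as an index computation: for each master wavelength i, pick the slave-instrument channels that form the regression inputs. The window size of 31 comes from the abstract; centring the window on i and clipping it at the spectrum edges are assumptions, as are the function name and the 100-channel example.

```python
def window_indices(i, n_wavelengths, window_size=31):
    """Slave-spectrum channel indices used to predict master channel i.

    Assumes a window centred on i and clipped at the spectrum edges;
    the abstract does not say how edge channels were handled.
    """
    half = window_size // 2
    lo = max(0, i - half)
    hi = min(n_wavelengths, i + half + 1)
    return list(range(lo, hi))


# Hypothetical spectrum with 100 channels.
middle = window_indices(50, 100)  # full 31-channel window
edge = window_indices(0, 100)     # truncated window at the left edge
```

One regression model would then be fitted per wavelength, with the windowed slave absorbances of the transfer samples as inputs and the master absorbance at channel i as the target.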

  12. Estimating Fractional Shrub Cover Using Simulated EnMAP Data: A Comparison of Three Machine Learning Regression Techniques

    Directory of Open Access Journals (Sweden)

    Marcel Schwieder

    2014-04-01

    Anthropogenic interventions in natural and semi-natural ecosystems often lead to substantial changes in their functioning and may ultimately threaten ecosystem service provision. It is, therefore, necessary to monitor these changes in order to understand their impacts and to support management decisions that help ensure sustainability. Remote sensing has proven to be a valuable tool for these purposes, and hyperspectral sensors in particular are expected to provide valuable data for the quantitative characterization of land change processes. In this study, simulated EnMAP data were used for mapping shrub cover fractions along a gradient of shrub encroachment in a study region in southern Portugal. We compared three machine learning regression techniques: Support Vector Regression (SVR), Random Forest Regression (RF), and Partial Least Squares Regression (PLSR). Additionally, we compared the influence of training sample size on the prediction performance. All techniques showed reasonably good results when trained with large samples, while SVR always outperformed the other algorithms. The best model was applied to produce a fractional shrub cover map for the whole study area. The predicted patterns revealed a gradient of shrub cover between regions affected by special agricultural management schemes for nature protection and areas without land use incentives. Our results highlight the value of EnMAP data in combination with machine learning regression techniques for monitoring gradual land change processes.

  13. Deep Support Vector Machines for Regression Problems

    NARCIS (Netherlands)

    Wiering, Marco; Schutten, Marten; Millea, Adrian; Meijster, Arnold; Schomaker, Lambertus

    2013-01-01

    In this paper we describe a novel extension of the support vector machine, called the deep support vector machine (DSVM). The original SVM has a single layer with kernel functions and is therefore a shallow model. The DSVM can use an arbitrary number of layers, in which lower-level layers contain su

  15. Generating Fuzzy Rule-based Systems from Examples Based on Robust Support Vector Machine

    Institute of Scientific and Technical Information of China (English)

    JIA Jiong; ZHANG Hao-ran

    2006-01-01

    This paper first proposes a new support vector machine regression (SVR) with a robust loss function and designs a gradient-based algorithm for implementing the SVR, then uses the SVR to extract fuzzy rules and design a fuzzy rule-based system. Simulations show that the fuzzy rule-based system technique based on robust SVR achieves superior performance to the conventional fuzzy inference method, and that the proposed method provides better approximation and generalization properties than the existing algorithm.

  16. Research on milk electrical conductivity NIR regression model based on PSO-SVR algorithm

    Institute of Scientific and Technical Information of China (English)

    谈爱玲; 赵勇; 王思远; 陈静雯

    2016-01-01

    Milk electrical conductivity is an important indicator of whether a dairy cow has mastitis, so its rapid and accurate measurement is of great significance to the healthy development of dairy farming. This paper proposes a novel rapid measurement method that combines near-infrared (NIR) spectroscopy with chemometrics. For the NIR spectra of 90 farm milk samples, support vector regression models of the conductivity were built, with the model parameters optimized by a genetic algorithm (GA) and by a particle swarm algorithm (PSO), respectively. The results show that the PSO-SVR model has better performance and higher prediction precision than the GA-SVR model and the traditional PLS model. The NIR model based on the PSO-SVR algorithm can be applied to the rapid and accurate measurement of milk electrical conductivity.

  17. Novel algorithm for constructing support vector machine regression ensemble

    Institute of Scientific and Technical Information of China (English)

    Li Bo; Li Xinjun; Zhao Zhiyan

    2006-01-01

    A novel algorithm for constructing a support vector machine regression ensemble is proposed. For regression prediction, a support vector machine regression (SVMR) ensemble is built by repeatedly resampling from the given training data set and aggregating several independent SVMRs, each of which is trained on a replicated training set. After training, the independently trained SVMRs need to be aggregated in an appropriate combination manner. Linear weighting, such as the expert weighting score in Boosting Regression, is usually used, and it has no optimization capacity. Three combination techniques are proposed: simple arithmetic mean, linear least square error weighting, and nonlinear hierarchical combining that uses another upper-layer SVMR to combine several lower-layer SVMRs. Finally, simulation experiments demonstrate the accuracy and validity of the presented algorithm.
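
The resampling and the simplest of the three combination rules can be sketched as follows. The code draws bootstrap replicates of a training set and averages member predictions point by point; it does not train any SVMR, and the function names, the fixed seed, and the sample numbers are illustrative assumptions rather than details from the paper.

```python
import random


def bootstrap_sets(data, n_sets, seed=0):
    """Draw n_sets replicated training sets by sampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    return [[data[rng.randrange(n)] for _ in range(n)] for _ in range(n_sets)]


def mean_combine(predictions):
    """Simple arithmetic mean of the member predictions, point by point."""
    n_models = len(predictions)
    return [sum(col) / n_models for col in zip(*predictions)]


replicates = bootstrap_sets(list(range(10)), n_sets=3)

# Hypothetical predictions of two ensemble members on two test points.
combined = mean_combine([[1.0, 2.0], [3.0, 4.0]])
```

The other two rules in the abstract would replace `mean_combine`: least square error weighting fits one weight per member, and hierarchical combining feeds the member predictions into a further regressor.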

  18. Accurate Descriptions of Hot Flow Behaviors Across β Transus of Ti-6Al-4V Alloy by Intelligence Algorithm GA-SVR

    Science.gov (United States)

    Wang, Li-yong; Li, Le; Zhang, Zhi-hua

    2016-09-01

    Hot compression tests of Ti-6Al-4V alloy over a wide temperature range of 1023-1323 K and a strain rate range of 0.01-10 s⁻¹ were conducted on a servo-hydraulic, computer-controlled Gleeble-3500 machine. To characterize the highly nonlinear flow behaviors accurately and effectively, support vector regression (SVR), a machine learning method, was combined with a genetic algorithm (GA); the combination is referred to as GA-SVR. A prominent property of GA-SVR is that, with identical training parameters, it keeps training accuracy and prediction accuracy at a stable level across repeated attempts on a given dataset. The learning abilities, generalization abilities, and modeling efficiencies of the mathematical regression model, ANN, and GA-SVR for Ti-6Al-4V alloy were compared in detail. The comparison shows that the learning ability of GA-SVR is stronger than that of the mathematical regression model. The generalization abilities and modeling efficiencies of these models were shown as follows in ascending order: the mathematical regression model [...] work hardening and dynamic recovery, characterizing dynamic recrystallization evolution, and improving processing maps.

  19. A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy

    Energy Technology Data Exchange (ETDEWEB)

    Boucher, Thomas F., E-mail: boucher@cs.umass.edu [School of Computer Science, University of Massachusetts Amherst, 140 Governor's Drive, Amherst, MA 01003 (United States)]; Ozanne, Marie V. [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States)]; Carmosino, Marco L. [School of Computer Science, University of Massachusetts Amherst, 140 Governor's Drive, Amherst, MA 01003 (United States)]; Dyar, M. Darby [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States)]; Mahadevan, Sridhar [School of Computer Science, University of Massachusetts Amherst, 140 Governor's Drive, Amherst, MA 01003 (United States)]; Breves, Elly A.; Lepore, Kate H. [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States)]; Clegg, Samuel M. [Los Alamos National Laboratory, P.O. Box 1663, MS J565, Los Alamos, NM 87545 (United States)]

    2015-05-01

    dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO₂, MgO, Fe₂O₃, and Na₂O, lasso for Al₂O₃, elastic net for MnO, and PLS-1 for CaO, TiO₂, and K₂O. Although these differences in model performance between methods were identified, most of the models produce comparable results when p ≤ 0.05 and all techniques except kNN produced statistically indistinguishable results. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user. - Highlights: • We compared 9 machine learning regression models for predicting mineral composition from LIBS. • These models vary over factors: linear/nonlinear, sparse/dense, univariate/multivariate. • The linear models evaluated generalized well for out-of-sample predictions. • The nonlinear models evaluated tended to overfit the training data and generalize poorly. • Sparse models best predicted the elements with a small number of high transition probability emission lines.

  20. Clifford support vector machines for classification, regression, and recurrence.

    Science.gov (United States)

    Bayro-Corrochano, Eduardo Jose; Arana-Daniel, Nancy

    2010-11-01

    This paper introduces the Clifford support vector machines (CSVM) as a generalization of the real and complex-valued support vector machines using the Clifford geometric algebra. In this framework, we handle the design of kernels involving the Clifford or geometric product. In this approach, one redefines the optimization variables as multivectors. This allows us to have a multivector as output. Therefore, we can represent multiple classes according to the dimension of the geometric algebra in which we work. We show that one can apply CSVM for classification and regression and also to build a recurrent CSVM. The CSVM is an attractive approach for the multiple input multiple output processing of high-dimensional geometric entities. We carried out comparisons between CSVM and the current approaches to solve multiclass classification and regression. We also study the performance of the recurrent CSVM with experiments involving time series. The authors believe that this paper can be of great use for researchers and practitioners interested in multiclass hypercomplex computing, particularly for applications in complex and quaternion signal and image processing, satellite control, neurocomputation, pattern recognition, computer vision, augmented virtual reality, robotics, and humanoids.

  1. Prediction of Profitability of Industries using Weighted SVR

    Directory of Open Access Journals (Sweden)

    Divya Tomar

    2011-05-01

    The profitability of an industry is measured by predicting the Pre-Tax Operating Margin, applying a regression technique to the Price/Sales Ratio and Net Margin of various industries. Prediction of the Pre-Tax Operating Margin is done using support vector regression (SVR). We present a model in this paper to address the over-fitting caused by noise and outliers in the dataset. For this, a weighted-coefficient approach is proposed that reduces the prediction error and provides higher accuracy than simple support vector regression. Finally, SVR with different kernel functions and weights is compared, and the experimental results show that LS-SVR with an RBF kernel function and weighted coefficients has better accuracy.

  2. Experimental and Analytical Studies on Improved Feedforward ML Estimation Based on LS-SVR

    Directory of Open Access Journals (Sweden)

    Xueqian Liu

    2013-01-01

    The maximum likelihood (ML) algorithm is the most common and effective parameter estimation method. However, with small samples and a low signal-to-noise ratio (SNR), threshold effects arise and estimation performance degrades greatly. The support vector machine (SVM) has been shown to be suitable for small samples. Consequently, we exploit the linear relationship between the inputs and outputs of least squares support vector regression (LS-SVR) and regard the LS-SVR process as a time-varying linear filter that increases the input SNR of received signals and decreases the threshold value of the mean square error (MSE) curve. Furthermore, taking single-tone sinusoidal frequency estimation as an example and combining data analysis with experimental validation, it is verified that, if the LS-SVR parameters are set appropriately, the LS-SVR process not only preserves the single-tone sinusoid and additive white Gaussian noise (AWGN) channel characteristics of the original signals well but also improves the frequency estimation performance. In the experimental simulations, the LS-SVR process is applied to two common and representative single-tone sinusoidal ML frequency estimation algorithms, the DFT-based frequency-domain periodogram (FDP) and the phase-based Kay algorithm. The threshold values of their MSE curves are decreased by 0.3 dB and 1.2 dB, respectively, which clearly demonstrates the advantage of the proposed algorithm.
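
The linear input-output relationship mentioned above follows from the way LS-SVR is trained: in the standard formulation the bias b and coefficients α come from a single linear system, [[0, 1ᵀ], [1, K + I/γ]] [b; α] = [0; y]. The sketch below solves that system in plain Python with an RBF kernel. It illustrates the textbook formulation only, not the paper's filter design; the function names, `gamma`, `sigma`, and the toy data are assumptions.

```python
import math


def rbf(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two points."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def lssvr_fit(X, y, gamma=100.0, sigma=1.0):
    """Return (b, alpha) from the standard LS-SVR linear system."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [rbf(X[i], X[j], sigma) + (1.0 / gamma if i == j else 0.0)
                          for j in range(n)])
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]


def lssvr_predict(x, X, b, alpha, sigma=1.0):
    """Evaluate f(x) = b + sum_i alpha_i * K(x, x_i)."""
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, X))


# Toy data; a large gamma means weak regularization, so the fit is tight.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 4.0, 9.0]
b, alpha = lssvr_fit(X, y, gamma=1e4)
fitted = [lssvr_predict(x, X, b, alpha) for x in X]
```

Because α is a linear function of y for fixed kernel parameters, the fitted values are a linear transformation of the training targets, which is what allows the process to be viewed as a linear filter.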

  3. SVR

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Severe Thunderstorm Warnings (SVRs) are issued by NWS Weather Forecast Office (WFO) meteorologists when there is radar or satellite indication and/or reliable...

  4. Evaluating a coupled discrete wavelet transform and support vector regression for daily and monthly streamflow forecasting

    Science.gov (United States)

    Liu, Zhiyong; Zhou, Ping; Chen, Gang; Guo, Ledong

    2014-11-01

    This study investigated the performance and potential of a hybrid model that combined the discrete wavelet transform and support vector regression (the DWT-SVR model) for daily and monthly streamflow forecasting. Three key factors of the wavelet decomposition phase (mother wavelet, decomposition level, and edge effect) were proposed to consider for improving the accuracy of the DWT-SVR model. The performance of DWT-SVR models with different combinations of these three factors was compared with the regular SVR model. The effectiveness of these models was evaluated using the root-mean-squared error (RMSE) and Nash-Sutcliffe model efficiency coefficient (NSE). Daily and monthly streamflow data observed at two stations in Indiana, United States, were used to test the forecasting skill of these models. The results demonstrated that the different hybrid models did not always outperform the SVR model for 1-day and 1-month lead time streamflow forecasting. This suggests that it is crucial to consider and compare the three key factors when using the DWT-SVR model (or other machine learning methods coupled with the wavelet transform), rather than choosing them based on personal preferences. We then combined forecasts from multiple candidate DWT-SVR models using a model averaging technique based upon Akaike's information criterion (AIC). This ensemble prediction was superior to the single best DWT-SVR model and regular SVR model for both 1-day and 1-month ahead predictions. With respect to longer lead times (i.e., 2- and 3-day and 2-month), the ensemble predictions using the AIC averaging technique were consistently better than the best DWT-SVR model and SVR model. Therefore, integrating model averaging techniques with the hybrid DWT-SVR model would be a promising approach for daily and monthly streamflow forecasting. Additionally, we strongly recommend considering these three key factors when using wavelet-based SVR models (or other wavelet-based forecasting models).
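
The AIC-based model averaging and the NSE score mentioned above can be sketched as follows. The weights use the usual Akaike-weight formula, w_i proportional to exp(-0.5 * (AIC_i - AIC_min)); the abstract does not say how the AIC values were computed for each DWT-SVR candidate, so the AIC inputs, function names, and sample numbers are hypothetical.

```python
import math


def akaike_weights(aic_values):
    """Akaike weights: proportional to exp(-0.5 * (AIC_i - AIC_min))."""
    best = min(aic_values)
    raw = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(raw)
    return [r / total for r in raw]


def average_forecasts(forecasts, weights):
    """Weighted average of candidate forecasts, one value per time step."""
    return [sum(w * f for w, f in zip(weights, step)) for step in zip(*forecasts)]


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


# Hypothetical AIC values for three candidate models.
weights = akaike_weights([100.0, 102.0, 110.0])
blend = average_forecasts([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5])
obs = [1.0, 2.0, 3.0]
```

Subtracting the minimum AIC before exponentiating leaves the weights unchanged and avoids underflow when the AIC values are large.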

  5. Research on Application of Regression Least Squares Support Vector Machine on Performance Prediction of Hydraulic Excavator

    Directory of Open Access Journals (Sweden)

    Zhan-bo Chen

    2014-01-01

    In order to improve the performance prediction accuracy of hydraulic excavators, the regression least squares support vector machine is applied. First, the mathematical model of the regression least squares support vector machine is studied, and then the algorithm of the regression least squares support vector machine is designed. Finally, a performance prediction simulation of a hydraulic excavator based on the regression least squares support vector machine is carried out, and the simulation results show that this method can correctly predict how the performance of the hydraulic excavator changes.

  6. Susceptibility mapping of shallow landslides using kernel-based Gaussian process, support vector machines and logistic regression

    Science.gov (United States)

    Colkesen, Ismail; Sahin, Emrehan Kutlug; Kavzoglu, Taskin

    2016-06-01

    Identification of landslide prone areas and production of accurate landslide susceptibility zonation maps have been crucial topics for hazard management studies. Since the prediction of susceptibility is one of the main processing steps in landslide susceptibility analysis, selection of a suitable prediction method plays an important role in the success of the susceptibility zonation process. Although simple statistical algorithms (e.g. logistic regression) have been widely used in the literature, the use of advanced non-parametric algorithms in landslide susceptibility zonation has recently become an active research topic. The main purpose of this study is to investigate the possible application of kernel-based Gaussian process regression (GPR) and support vector regression (SVR) for producing landslide susceptibility map of Tonya district of Trabzon, Turkey. Results of these two regression methods were compared with logistic regression (LR) method that is regarded as a benchmark method. Results showed that while kernel-based GPR and SVR methods generally produced similar results (90.46% and 90.37%, respectively), they outperformed the conventional LR method by about 18%. While confirming the superiority of the GPR method, statistical tests based on ROC statistics, success rate and prediction rate curves revealed the significant improvement in susceptibility map accuracy by applying kernel-based GPR and SVR methods.

  7. SVR-Boosting ensemble model for electricity price forecasting in electric power market

    Institute of Scientific and Technical Information of China (English)

    ZHOU Dian-min; GAO Lin; GUAN Xiao-hong; GAO Feng

    2008-01-01

    A revised support vector regression (SVR) ensemble model based on boosting algorithm (SVR-Boosting) is presented in this paper for electricity price forecasting in electric power market. In the light of charac-the forecasting model to inhibit the learning from abnormal data in electricity price sequence. The results from actual data indicate that, compared with the single support vector regression model, the proposed SVR-Boosting ensemble model is able to enhance the stability of the model output remarkably, acquire higher predicting accuracy, and possess comparatively satisfactory generalization capability.

  8. Phase Space Prediction of Chaotic Time Series with Nu-Support Vector Machine Regression

    Institute of Scientific and Technical Information of China (English)

    YE Mei-Ying; WANG Xiao-Dong

    2005-01-01

    A new class of support vector machine, nu-support vector machine, is discussed which can handle both classification and regression. We focus on nu-support vector machine regression and use it for phase space prediction of compares nu-support vector machine with back propagation (BP) networks in order to better evaluate the performance of the proposed methods. The experimental results show that the nu-support vector machine regression obtains lower root mean squared error than the BP networks and provides an accurate chaotic time series prediction. These results can be attributed to the fact that nu-support vector machine implements the structural risk minimization principle, which leads to better generalization than the BP networks.
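
    Phase space prediction usually starts from a time-delay embedding of the scalar series. The sketch below shows that step only, assuming an embedding dimension `dim` and a delay `tau` chosen by the user; the function names and the toy series are illustrative and are not taken from the paper. The resulting input/target pairs could then be passed to any regressor, such as a nu-support vector machine.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors [x(t), x(t - tau), ..., x(t - (dim - 1) * tau)]."""
    start = (dim - 1) * tau  # first index with a full history available
    return [[series[t - k * tau] for k in range(dim)]
            for t in range(start, len(series))]


def one_step_pairs(series, dim, tau):
    """Pair each delay vector with the next value of the series."""
    start = (dim - 1) * tau
    vectors = delay_embed(series, dim, tau)
    inputs = vectors[:-1]  # the last vector has no successor to predict
    targets = [series[t + 1] for t in range(start, len(series) - 1)]
    return inputs, targets


# Toy series (hypothetical); a real study would use measured chaotic data.
toy_series = [0, 1, 2, 3, 4, 5]
embedded = delay_embed(toy_series, dim=3, tau=2)
inputs, targets = one_step_pairs(toy_series, dim=3, tau=2)
```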

  9. Path Following Control of an AUV under the Current Using the SVR-ADRC

    Directory of Open Access Journals (Sweden)

    Zheping Yan

    2014-01-01

    A novel active disturbance rejection control (ADRC) controller is proposed based on support vector regression (SVR). The SVR-ADRC is designed to force an underactuated autonomous underwater vehicle (AUV) to follow a path in the horizontal plane under ocean current disturbance. The SVR algorithm is used to adjust the coefficients of the nonlinear state error feedback (ELSEF) part of the ADRC to deal with nonlinear variations at different operating points. The trend of the ELSEF coefficients in the simulation shows that the designed SVR algorithm maintains convergence and stability. Furthermore, the path following errors under current in the simulation demonstrate the high accuracy, strong robustness, and stability of the proposed SVR-ADRC. The contribution of the proposed controller is to improve the ADRC by accounting for changing parameters in the operating environment, which makes the controller more adaptive.

  10. Short-Term Wind Power Prediction Based on Support Vector Regression Machine Optimized by Adaptive Disturbance Quantum-Behaved Particle Swarm Optimization

    Institute of Scientific and Technical Information of China (English)

    陈道君; 龚庆武; 金朝意; 张静; 王定美

    2013-01-01

    The construction of smart grids and the large-scale grid connection of wind farms place higher demands on the accuracy of short-term wind power prediction. To overcome the drawback that the support vector regression machine (SVR) relies on human experience for selecting its learning parameters, an adaptive premature-convergence criterion, a mixed disturbance operator, and a dynamic expansion-contraction coefficient are added to the quantum-behaved particle swarm optimization (QPSO) algorithm, yielding the adaptive disturbance quantum-behaved particle swarm optimization (ADQPSO) algorithm, which is then used to select the learning parameters of the SVR. A case study shows that ADQPSO has strong global search ability, good robustness, and short computation time, and that SVR parameters optimized by ADQPSO effectively improve the prediction accuracy of the model. Compared with the back propagation neural network (BPNN) and the radial basis function neural network (RBFNN), the proposed ADQPSO-SVR improves the accuracy and stability of short-term wind power prediction.

  11. Simulation of groundwater level variations using wavelet combined with neural network, linear regression and support vector machine

    Science.gov (United States)

    Ebrahimi, Hadi; Rajaee, Taher

    2017-01-01

    Simulation of groundwater level (GWL) fluctuations is an important task in the management of groundwater resources. In this study, the effect of wavelet analysis on the training of the artificial neural network (ANN), multiple linear regression (MLR) and support vector regression (SVR) approaches was investigated, and the ANN, MLR and SVR models, along with the wavelet-ANN (WNN), wavelet-MLR (WLR) and wavelet-SVR (WSVR) models, were compared in simulating GWL one month ahead. The only variable used to develop the models was the monthly GWL data recorded over a period of 11 years from two wells in the Qom plain, Iran. The results showed that decomposing the GWL time series into several sub-time series greatly improved the training of the models. For both wells 1 and 2, the Meyer and Db5 wavelets produced better results than the other wavelets, which indicated that wavelet types behave similarly in similar case studies. The optimal number of delays was 6 months, which seems to be due to natural phenomena. The best WNN model, using the Meyer mother wavelet with two decomposition levels, simulated GWL one month ahead with RMSE values of 0.069 m and 0.154 m for wells 1 and 2, respectively. The RMSE values for the WLR model were 0.058 m and 0.111 m, and for the WSVR model 0.136 m and 0.060 m, for wells 1 and 2, respectively.

  12. Particle Swarm Optimization Based Support Vector Regression for Blind Image Restoration

    Institute of Scientific and Technical Information of China (English)

    Ratnakar Dash; Pankaj Kumar Sa; Banshidhar Majhi

    2012-01-01

    This paper presents a swarm intelligence based parameter optimization of the support vector machine (SVM) for blind image restoration. In this work, SVM is used to solve a regression problem. Support vector regression (SVR) has been utilized to obtain a true mapping of images from the observed noisy blurred images. The parameters of SVR are optimized through the particle swarm optimization (PSO) technique. The restoration error function has been utilized as the fitness function for PSO. The suggested scheme tries to adapt the SVM parameters depending on the type of blur and noise strength, and the experimental results validate its effectiveness. The results show that the parameter optimization of the SVR model gives better performance than the conventional SVR model as well as other competent schemes for blind image restoration.
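
    The parameter search described here can be pictured with a basic particle swarm optimizer. The sketch below is a generic PSO, not the authors' implementation: the inertia and acceleration constants, the search bounds, and the quadratic `toy_restoration_error` (a stand-in for a real restoration error evaluated over two log-scaled SVR hyperparameters) are all assumptions made for illustration.

```python
import random


def pso_minimize(fitness, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize fitness over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [fitness(p) for p in pos]      # personal best fitness values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Keep the particle inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = fitness(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_restoration_error(params):
    """Hypothetical fitness with its minimum at log10(C) = 1, log10(gamma) = -2."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_error = pso_minimize(toy_restoration_error,
                                       bounds=[(-3.0, 3.0), (-5.0, 1.0)])
```

    In a real setting the fitness call would train an SVR with the candidate parameters and return the measured restoration error, which makes each evaluation far more expensive than in this toy case.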

  13. Mortality risk prediction in burn injury: Comparison of logistic regression with machine learning approaches.

    Science.gov (United States)

    Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W

    2015-08-01

    Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.
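
    The evaluation measures named in this abstract follow directly from a confusion matrix. The sketch below computes sensitivity, specificity, positive predictive value and Youden's index; the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classifier measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "youden_j": sensitivity + specificity - 1.0,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=50, tn=900, fn=10)
```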

  14. [Comparative Efficiency of Algorithms Based on Support Vector Machines for Regression].

    Science.gov (United States)

    Kadyrova, N O; Pavlova, L V

    2015-01-01

    Methods of construction of support vector machines do not require additional a priori information and can be used to process large scale data set. It is especially important for various problems in computational biology. The main set of algorithms of support vector machines for regression is presented. The comparative efficiency of a number of support-vector-algorithms for regression is investigated. A thorough analysis of the study results found the most efficient support vector algorithms for regression. The description of the presented algorithms, sufficient for their practical implementation is given.

  15. A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy

    Science.gov (United States)

    Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.

    2015-05-01

    The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels
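
    The predicted residual sum of squares (PRESS) used for the model comparison in this record can be computed by leave-one-out refitting. The sketch below does this for a one-variable least squares line, which is an assumption chosen to keep the example short; the study itself compares multivariate models, and the data here are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def press(xs, ys):
    """Sum of squared errors when each point is predicted by a model fit without it."""
    total = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        residual = ys[i] - (slope * xs[i] + intercept)
        total += residual ** 2
    return total


# Hypothetical calibration data: one exact line and one with a perturbed point.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
press_exact = press(xs, [1.0, 3.0, 5.0, 7.0, 9.0])
press_noisy = press(xs, [1.0, 3.0, 6.0, 7.0, 9.0])
```

    A lower PRESS indicates better out-of-sample prediction, which is why it is preferred to the in-sample residual sum of squares when comparing models of different flexibility.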

  16. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    Directory of Open Access Journals (Sweden)

    Suduan Chen

    2014-01-01

    As fraudulent financial statements become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study uses logistic regression, support vector machine, and decision tree to construct classification models and compares them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies that issued fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification accuracy, 85.71%.

  17. Time-series gas prediction model using LS-SVR within a Bayesian framework

    Institute of Scientific and Technical Information of China (English)

    Qiao Meiying; Ma Xiaoping; Lan Jianyi; Wang Ying

    2011-01-01

    The traditional least squares support vector regression (LS-SVR) model, which uses cross validation to determine the regularization parameter and kernel parameter, is time-consuming. We propose a Bayesian evidence framework to infer the LS-SVR model parameters. Three levels of Bayesian inference are used to determine the model parameters, determine the regularization hyper-parameters, and tune the kernel parameters by model comparison. On this basis, we establish Bayesian LS-SVR time-series gas forecasting models and provide steps for the algorithm. Gas outburst data from a working face of the Hebi 10th mine are used to validate the model. The optimal embedding dimension and delay time of the time series were obtained by the smallest differential entropy method. Finally, in a MATLAB 7.1 environment, we used actual coal gas data to compare the traditional LS-SVR and the Bayesian LS-SVR in simulation with the LS-SVMlab1.5 Toolbox. The results show that the Bayesian framework significantly improves the speed and accuracy of the LS-SVR forecast.

  18. Prediction of Carbofuran Residue in Duck Meat by Synchronous Fluorescence Spectroscopy Based on Support Vector Regression (SVR)

    Institute of Scientific and Technical Information of China (English)

    肖海斌; 赵进辉; 袁海超; 徐将; 李倩; 刘木华

    2013-01-01

    To meet the need for analysis and rapid detection of carbofuran residue in duck meat, a prediction model was established for determining carbofuran residue in duck meat by synchronous fluorescence spectroscopy, based on the reaction in which carbofuran hydrolyzate reacts with o-phthaldialdehyde (OPA) in the presence of mercaptoethanol to produce a strongly fluorescent derivative. The 3D synchronous fluorescence spectra of duck meat samples containing carbofuran were analyzed, and the optimum wavelength difference Δλ was determined to be 120 nm. Using a genetic algorithm (GA) combined with the root mean square error of cross-validation (RMSECV), 19 wavelengths were selected from the 240 nm to 450 nm spectra as input feature variables for the quantitative analysis models. The performance of three regression models (SVR, PCR and PLS) was compared; the SVR model gave the best predictions, with a determination coefficient (r2) of 0.9994 and a root mean squared error of prediction (RMSEP) of 0.8787 on the prediction set. The results indicate that synchronous fluorescence spectroscopy combined with support vector regression is fast and accurate, and offers a feasible method for detecting carbofuran residue in duck meat.

  19. Load forecast method of electric vehicle charging station using SVR based on GA-PSO

    Science.gov (United States)

    Lu, Kuan; Sun, Wenxue; Ma, Changhui; Yang, Shenquan; Zhu, Zijian; Zhao, Pengfei; Zhao, Xin; Xu, Nan

    2017-06-01

    This paper presents a Support Vector Regression (SVR) method for electric vehicle (EV) charging station load forecasting based on a genetic algorithm (GA) and particle swarm optimization (PSO). Fuzzy C-Means (FCM) clustering is used to establish similar-day samples. GA is used for global parameter searching and PSO is used for a more accurate local search. The load forecast is then regressed using SVR. Practical load data from an EV charging station were used to illustrate the proposed method. The result indicates an obvious improvement in forecasting accuracy compared with SVRs based on PSO or GA alone.

  20. Moving beyond regression techniques in cardiovascular risk prediction: applying machine learning to address analytic challenges.

    Science.gov (United States)

    Goldstein, Benjamin A; Navar, Ann Marie; Carter, Rickey E

    2016-07-19

    Risk prediction plays an important role in clinical cardiology research. Traditionally, most risk models have been based on regression models. While useful and robust, these statistical methods are limited to using a small number of predictors which operate in the same way on everyone, and uniformly throughout their range. The purpose of this review is to illustrate the use of machine-learning methods for development of risk prediction models. Typically presented as black box approaches, most machine-learning methods are aimed at solving particular challenges that arise in data analysis that are not well addressed by typical regression approaches. To illustrate these challenges, as well as how different methods can address them, we consider predicting mortality after diagnosis of acute myocardial infarction. We use data derived from our institution's electronic health record and abstract data on 13 regularly measured laboratory markers. We walk through different challenges that arise in modelling these data and then introduce different machine-learning approaches. Finally, we discuss general issues in the application of machine-learning methods including tuning parameters, loss functions, variable importance, and missing data. Overall, this review serves as an introduction for those working on risk modelling to approach the diffuse field of machine learning.

  1. A comparative study of slope failure prediction using logistic regression, support vector machine and least square support vector machine models

    Science.gov (United States)

    Zhou, Lim Yi; Shan, Fam Pei; Shimizu, Kunio; Imoto, Tomoaki; Lateh, Habibah; Peng, Koay Swee

    2017-08-01

    A comparative study of logistic regression, support vector machine (SVM) and least square support vector machine (LSSVM) models has been done to predict the slope failure (landslide) along East-West Highway (Gerik-Jeli). The effects of two monsoon seasons (southwest and northeast) that occur in Malaysia are considered in this study. Two related factors of occurrence of slope failure are included in this study: rainfall and underground water. For each method, two predictive models are constructed, namely SOUTHWEST and NORTHEAST models. Based on the results obtained from logistic regression models, two factors (rainfall and underground water level) contribute to the occurrence of slope failure. The accuracies of the three statistical models for two monsoon seasons are verified by using Relative Operating Characteristics curves. The validation results showed that all models produced prediction of high accuracy. For the results of SVM and LSSVM, the models using RBF kernel showed better prediction compared to the models using linear kernel. The comparative results showed that, for SOUTHWEST models, three statistical models have relatively similar performance. For NORTHEAST models, logistic regression has the best predictive efficiency whereas the SVM model has the second best predictive efficiency.
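
    Comparing models by their operating characteristic curves is often summarized by the area under the curve (AUC). The sketch below computes the AUC through its rank interpretation, the probability that a randomly chosen positive case is scored above a randomly chosen negative one; the scores and labels are hypothetical, and this is not necessarily the procedure used in the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical failure scores for four slopes; 1 marks an observed failure.
auc = roc_auc(scores=[0.9, 0.8, 0.3, 0.2], labels=[1, 0, 1, 0])
```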

  2. Small-time scale network traffic prediction based on a local support vector machine regression model

    Institute of Scientific and Technical Information of China (English)

    Meng Qing-Fang; Chen Yue-Hui; Peng Yu-Hua

    2009-01-01

    In this paper we apply the nonlinear time series analysis method to small-time scale traffic measurement data. The prediction-based method is used to determine the embedding dimension of the traffic data. Based on the reconstructed phase space, the local support vector machine prediction method is used to predict the traffic measurement data, and the BIC-based neighbouring point selection method is used to choose the number of the nearest neighbouring points for the local support vector machine regression model. The experimental results show that the local support vector machine prediction method whose neighbouring points are optimized can effectively predict the small-time scale traffic measurement data and can reproduce the statistical features of real traffic measurements.

  3. A Machine Learning Tool for Weighted Regressions in Time, Discharge, and Season

    Directory of Open Access Journals (Sweden)

    Alexander Maestre

    2014-01-01

    A new machine learning tool has been developed to classify water stations with similar water quality trends. The tool is based on the statistical method Weighted Regressions in Time, Discharge, and Season (WRTDS), developed by the United States Geological Survey (USGS) to estimate daily concentrations of water constituents in rivers and streams based on continuous daily discharge data and discrete water quality samples collected at the same or nearby locations. WRTDS is based on parametric survival regressions using a jack-knife cross validation procedure that generates unbiased estimates of the prediction errors. One of the disadvantages of WRTDS is that it needs a large number of samples (n > 200) collected during at least two decades. In this article, the tool is used to evaluate the use of Boosted Regression Trees (BRT) as an alternative to the parametric survival regressions for water quality stations with a small number of samples. We describe the development of the machine learning tool as well as an evaluation comparison of the two methods, WRTDS and BRT. The purpose of the tool is to evaluate the reduction in variability of the estimates by clustering data from nearby stations with similar concentration and discharge characteristics. The results indicate that, using clustering, the predicted concentrations using BRT are in general higher than the observed concentrations. In addition, it appears that BRT generates higher sum of square residuals than the parametric survival regressions.

  4. Soft sensor development and optimization of the commercial petrochemical plant integrating support vector regression and genetic algorithm

    Directory of Open Access Journals (Sweden)

    S.K. Lahiri

    2009-09-01

    Soft sensors have been widely used in industrial process control to improve product quality and assure production safety. The core of a soft sensor is the construction of a soft sensing model. This paper introduces support vector regression (SVR), a powerful machine learning method based on statistical learning theory (SLT), into soft sensor modeling and proposes a new soft sensing modeling method based on SVR. It presents an artificial intelligence based hybrid soft sensor modeling and optimization strategy, namely support vector regression – genetic algorithm (SVR-GA), for modeling and optimizing the mono ethylene glycol (MEG) quality variable in a commercial glycol plant. In the SVR-GA approach, a support vector regression model is constructed to correlate the process data comprising values of operating and performance variables. Next, the model inputs describing the process operating variables are optimized using a genetic algorithm with a view to maximizing process performance. The SVR-GA is a new strategy for soft sensor modeling and optimization. Its major advantage is that modeling and optimization can be conducted exclusively from historic process data, so detailed knowledge of the process phenomenology (reaction mechanism, kinetics, etc.) is not required. Using the SVR-GA strategy, a number of sets of optimized operating conditions were found. The optimized solutions, when verified in an actual plant, resulted in a significant improvement in quality.

  5. Efficient Prediction of Low-Visibility Events at Airports Using Machine-Learning Regression

    Science.gov (United States)

    Cornejo-Bueno, L.; Casanova-Mateo, C.; Sanz-Justo, J.; Cerro-Prada, E.; Salcedo-Sanz, S.

    2017-06-01

    We address the prediction of low-visibility events at airports using machine-learning regression. The proposed model successfully forecasts low-visibility events in terms of the runway visual range at the airport, with the use of support-vector regression, neural networks (multi-layer perceptrons and extreme-learning machines) and Gaussian-process algorithms. We assess the performance of these algorithms based on real data collected at the Valladolid airport, Spain. We also propose a study of the atmospheric variables measured at a nearby tower related to low-visibility atmospheric conditions, since they are considered as the inputs of the different regressors. A pre-processing procedure of these input variables with wavelet transforms is also described. The results show that the proposed machine-learning algorithms are able to predict low-visibility events well. The Gaussian process is the best algorithm among those analyzed, obtaining over 98% of the correct classification rate in low-visibility events when the runway visual range is > 1000 m, and about 80% under this threshold. The performance of all the machine-learning algorithms tested is clearly affected in extreme low-visibility conditions (< 500 m). However, we show improved results of all the methods when data from a neighbouring meteorological tower are included, and also with a pre-processing scheme using a wavelet transform. Also presented are results of the algorithm performance in daytime and nighttime conditions, and for different prediction time horizons.

  6. Statistical regression modeling and machinability study of hardened AISI 52100 steel using cemented carbide insert

    Directory of Open Access Journals (Sweden)

    Amlana Panda

    2017-01-01

    The present study investigates the performance and feasibility of a low-cost cemented carbide insert in dry machining of AISI 52100 steel hardened to (55 ± 1) HRC, which is rarely researched as far as machining of bearing steel is concerned. Machinability, i.e. flank wear, surface roughness and chip morphology, has been investigated and a statistical regression model has been developed. The tests were conducted using a Taguchi L16 orthogonal array (OA) with cutting speed, feed and depth of cut as machining parameters. It is observed that the uncoated cemented carbide insert performs well at some selected runs (Runs 1, 5 and 9), which shows its feasibility for hard turning applications. The serrated saw-tooth chip of burnt blue colour that develops adversely affects the surface quality. The adequacy of the developed statistical regression model has been checked using ANOVA (based on the F value, P value and R2 value) and a normal probability plot at the 95% confidence level. The optimal parametric combinations found may be adopted when turning hardened AISI 52100 steel in a dry environment with an uncoated cemented carbide insert.

  7. Edge Detector Design Based on LS-SVR

    Directory of Open Access Journals (Sweden)

    Zhongdang Yu

    2014-01-01

    To address the inaccurate localization of the discrete localization criterion proposed by Demigny, a new expression of the “good localization” criterion is proposed. First, discrete expressions of the good detection and good localization criteria for a two-dimensional edge detection operator are given, and then an experiment to measure the optimal parameters of the two-dimensional Canny edge detection operator is introduced. Moreover, a detailed performance comparison and analysis of the two-dimensional optimal filter obtained as the tensor product of one-dimensional optimal filters is provided, which shows that least squares support vector regression (LS-SVR) is a smoothing filter and gives the construction method of the derivative operator. This paper uses LS-SVR as the objective function constructor and thereby approximates the two-dimensional optimal edge detection operator. Borrowing the multiscale analysis technique of wavelet theory, it also proposes a practical method for multiscale edge detection using a single operator. Experiments show that the method is practical and efficient.

  8. SVR learning-based spatiotemporal fuzzy logic controller for nonlinear spatially distributed dynamic systems.

    Science.gov (United States)

    Zhang, Xian-Xia; Jiang, Ye; Li, Han-Xiong; Li, Shao-Yuan

    2013-10-01

    A data-driven 3-D fuzzy-logic controller (3-D FLC) design methodology based on support vector regression (SVR) learning is developed for nonlinear spatially distributed dynamic systems. Initially, the spatial information expression and processing as well as the fuzzy linguistic expression and rule inference of a 3-D FLC are integrated into spatial fuzzy basis functions (SFBFs), and then the 3-D FLC can be depicted by a three-layer network structure. By relating SFBFs of the 3-D FLC directly to spatial kernel functions of an SVR, an equivalence relationship of the 3-D FLC and the SVR is established, which means that the 3-D FLC can be designed with the help of the SVR learning. Subsequently, for an easy implementation, a systematic SVR learning-based 3-D FLC design scheme is formulated. In addition, the universal approximation capability of the proposed 3-D FLC is presented. Finally, the control of a nonlinear catalytic packed-bed reactor is considered as an application to demonstrate the effectiveness of the proposed 3-D FLC.

  9. Study on the medical meteorological forecast of the number of hypertension inpatient based on SVR

    Science.gov (United States)

    Zhai, Guangyu; Chai, Guorong; Zhang, Haifeng

    2017-06-01

    The purpose of this study is to build a hypertension prediction model from the meteorological factors associated with hypertension incidence. Standard data on relative humidity, air temperature, visibility, wind speed and air pressure in Lanzhou from 2010 to 2012 (the maximum, minimum and average values computed over 5-day units) are used as the input variables of support vector regression (SVR), and the standard data on hypertension incidence for the same period are used as the output variable. The optimal prediction parameters are obtained by cross validation, and an SVR forecast model for hypertension incidence is then built by SVR learning and training. The results show that the hypertension prediction model has 15 input variables, a training accuracy of 0.005 and a final error of 0.0026389. The forecast accuracy of the SVR model is 97.1429%, which is higher than that of the statistical forecast equation and the neural network prediction method. It is concluded that the SVR model provides a new method for hypertension prediction, with simple calculation, small error, and good capability in both fitting historical samples and forecasting independent samples.
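
    The 15 input variables mentioned in this abstract are consistent with five meteorological variables each summarized by three statistics per 5-day unit. The sketch below builds such feature rows; the variable names, the synthetic daily values and the use of non-overlapping blocks are assumptions for illustration and are not details confirmed by the study.

```python
def block_features(daily, block=5):
    """Summarize a daily series as (max, min, mean) per non-overlapping block."""
    features = []
    for start in range(0, len(daily) - block + 1, block):
        window = daily[start:start + block]
        features.append((max(window), min(window), sum(window) / block))
    return features


def build_inputs(variables, block=5):
    """Return one row per block with 3 * len(variables) features.

    variables maps a variable name to its list of daily values; columns are
    ordered by sorted variable name, then (max, min, mean).
    """
    per_variable = [block_features(variables[name], block)
                    for name in sorted(variables)]
    return [[value for stats in row for value in stats]
            for row in zip(*per_variable)]


# Synthetic 10-day records (hypothetical values) for five variables.
daily_records = {
    name: [base + day for day in range(10)]
    for name, base in [("humidity", 40), ("temperature", 10), ("visibility", 5),
                       ("wind_speed", 2), ("pressure", 850)]
}
rows = build_inputs(daily_records)
```

    Each row could then serve as one SVR input vector, paired with the incidence count for the same 5-day unit.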

  10. Acoustic emission localization based on FBG sensing network and SVR algorithm

    Science.gov (United States)

    Sai, Yaozhang; Zhao, Xiuxia; Hou, Dianli; Jiang, Mingshun

    2016-11-01

    In practical applications, carbon fiber reinforced plastic (CFRP) structures are prone to many kinds of invisible damage, so damage should be located and detected promptly to keep CFRP structures safe. In this paper, an acoustic emission (AE) localization system based on a fiber Bragg grating (FBG) sensing network and support vector regression (SVR) is proposed for damage localization. AE signals, which are caused by damage, are acquired by high-speed FBG interrogation. Using the Shannon wavelet transform, time differences between AE signals are extracted for the SVR-based localization algorithm. With the SVR model, the coordinates of the AE source can be accurately predicted without knowing the wave velocity. The FBG system and localization algorithm are verified on a 500 mm×500 mm×2 mm CFRP plate. The experimental results show that the average error of the localization system is 2.8 mm and the training time is 0.07 s.

  11. Acoustic emission localization based on FBG sensing network and SVR algorithm

    Science.gov (United States)

    Sai, Yaozhang; Zhao, Xiuxia; Hou, Dianli; Jiang, Mingshun

    2017-03-01

    In practical applications, carbon fiber reinforced plastic (CFRP) structures are prone to many kinds of invisible damage, so damage should be located and detected promptly to keep CFRP structures safe. In this paper, an acoustic emission (AE) localization system based on a fiber Bragg grating (FBG) sensing network and support vector regression (SVR) is proposed for damage localization. AE signals, which are caused by damage, are acquired by high-speed FBG interrogation. Using the Shannon wavelet transform, time differences between AE signals are extracted for the SVR-based localization algorithm. With the SVR model, the coordinates of the AE source can be accurately predicted without knowing the wave velocity. The FBG system and localization algorithm are verified on a 500 mm×500 mm×2 mm CFRP plate. The experimental results show that the average error of the localization system is 2.8 mm and the training time is 0.07 s.

  12. Extreme Learning Machine and Moving Least Square Regression Based Solar Panel Vision Inspection

    Directory of Open Access Journals (Sweden)

    Heng Liu

    2017-01-01

    Full Text Available In recent years, learning-based machine intelligence has attracted a lot of attention across science and engineering. Particularly in the field of automatic industrial inspection, machine learning based vision inspection plays an increasingly important role in defect identification and feature extraction. By learning from image samples, many features of industrial objects, such as shapes, positions, and orientation angles, can be obtained and then used to determine whether a defect is present. However, robustness and speed are not easily achieved with this kind of inspection. In this work, for solar panel vision inspection, we present an approach based on an extreme learning machine (ELM) and moving least square regression to identify solder joint defects and detect the panel position. Firstly, histogram peaks distribution (HPD) and fractional calculus are applied for image preprocessing. Then an ELM-based identification of defective solder joints is discussed in detail. Finally, a moving least square regression (MLSR) algorithm is introduced for solar panel position determination. Experimental results and comparisons show that the proposed ELM and MLSR based inspection method is efficient in both detection accuracy and processing speed.

  13. Estimation of the wind turbine yaw error by support vector machines

    DEFF Research Database (Denmark)

    Sheibat-Othman, Nida; Othman, Sami; Tayari, Raoaa

    2015-01-01

    Wind turbine yaw error information is of high importance in controlling wind turbine power and structural load. Normally used wind vanes are imprecise. In this work, the estimation of yaw error in wind turbines is studied using support vector machines for regression (SVR). As the methodology...

  14. Reinforced Extreme Learning Machines for Fast Robust Regression in the Presence of Outliers.

    Science.gov (United States)

    Frenay, Benoit; Verleysen, Michel

    2016-12-01

    Extreme learning machines (ELMs) are fast methods that obtain state-of-the-art results in regression. However, they are not robust to outliers and their meta-parameter (i.e., the number of neurons for standard ELMs and the regularization constant of output weights for L2-regularized ELMs) selection is biased by such instances. This paper proposes a new robust inference algorithm for ELMs which is based on the pointwise probability reinforcement methodology. Experiments show that the proposed approach produces results which are comparable to the state of the art, while being often faster.

  15. Thrust estimator design based on least squares support vector regression machine

    Institute of Scientific and Technical Information of China (English)

    ZHAO Yong-ping; SUN Jian-guo

    2010-01-01

    In order to realize direct thrust control instead of traditional sensor-based control for aero-engines, it is indispensable to design a thrust estimator with high accuracy, so a scheme for thrust estimator design based on the least squares support vector regression machine is proposed to solve this problem. Furthermore, numerical simulations confirm the effectiveness of the presented scheme. During the process of estimator design, a wrapper criterion that can not only reduce the computational complexity but also enhance the generalization performance is proposed to select the input variables for the estimator.

  16. Forecasting daily and monthly exchange rates with machine learning techniques

    OpenAIRE

    Papadimitriou, Theophilos; Gogas, Periklis; Plakandaras, Vasilios

    2013-01-01

    We combine signal processing with machine learning methodologies by introducing a hybrid Ensemble Empirical Mode Decomposition (EEMD), Multivariate Adaptive Regression Splines (MARS) and Support Vector Regression (SVR) model in order to forecast the monthly and daily Euro (EUR)/United States Dollar (USD), USD/Japanese Yen (JPY), Australian Dollar (AUD)/Norwegian Krone (NOK), New Zealand Dollar (NZD)/Brazilian Real (BRL) and South African Rand (ZAR)/Philippine Peso (PHP) exchange rates. After th...

  17. Support vector machine applied in QSAR modelling

    Institute of Scientific and Technical Information of China (English)

    MEI Hu; ZHOU Yuan; LIANG Guizhao; LI Zhiliang

    2005-01-01

    Support vector machine (SVM), partial least squares (PLS), and back-propagation (BP) artificial neural network (ANN) were employed to establish QSAR models of 2 dipeptide datasets. In order to validate the predictive capabilities of the resulting models on external datasets, both internal and external validations were performed. The division of each dataset into training and test sets was carried out by D-optimal design. The results showed that SVM behaved well in both calibration and prediction. For the dataset of 48 bitter tasting dipeptides (BTD), the results obtained by support vector regression (SVR) were superior to those by PLS in both calibration and prediction. When compared with the BP artificial neural network, SVR showed less calibration power but more predictive capability. For the dataset of angiotensin-converting enzyme (ACE) inhibitors, the results obtained by SVM regression were equivalent to those by PLS and the BP artificial neural network. In both datasets, SVR using a linear kernel function behaved as well as SVR using a radial basis kernel function. The results showed that there are wide prospects for the application of SVM to QSAR modeling.

  18. Towards smart energy systems: application of kernel machine regression for medium term electricity load forecasting.

    Science.gov (United States)

    Alamaniotis, Miltiadis; Bargiotas, Dimitrios; Tsoukalas, Lefteri H

    2016-01-01

    Integration of energy systems with information technologies has facilitated the realization of smart energy systems that utilize information to optimize system operation. To that end, crucial in optimizing energy system operation is the accurate, ahead-of-time forecasting of load demand. In particular, load forecasting allows planning of system expansion, and decision making for enhancing system safety and reliability. In this paper, the application of two types of kernel machines for medium term load forecasting (MTLF) is presented and their performance is recorded based on a set of historical electricity load demand data. The two kernel machine models and more specifically Gaussian process regression (GPR) and relevance vector regression (RVR) are utilized for making predictions over future load demand. Both models, i.e., GPR and RVR, are equipped with a Gaussian kernel and are tested on daily predictions for a 30-day-ahead horizon taken from the New England Area. Furthermore, their performance is compared to the ARMA(2,2) model with respect to mean average percentage error and squared correlation coefficient. Results demonstrate the superiority of RVR over the other forecasting models in performing MTLF.
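
    The two comparison criteria named in this abstract can be sketched in a few lines. This is an illustration only: it assumes the "mean average percentage error" is the usual mean absolute percentage error (MAPE) and that the squared correlation coefficient is the square of the Pearson correlation between actual and predicted load; the function names are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def squared_correlation(actual, predicted):
    """Square of the Pearson correlation between the two sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_a * var_p)
```

    A lower MAPE and a squared correlation closer to 1 both indicate a better forecast, so the two criteria are read in opposite directions.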

  19. An Adaptive Support Vector Regression Machine for the State Prognosis of Mechanical Systems

    Directory of Open Access Journals (Sweden)

    Qing Zhang

    2015-01-01

    Full Text Available Due to the unsteady state evolution of mechanical systems, the time series of state indicators exhibits volatile behavior and staged characteristics. To model hidden trends and predict deterioration failure utilizing volatile state indicators, an adaptive support vector regression (ASVR) machine is proposed. In ASVR, the width of an error-insensitive tube, which is a constant in the traditional support vector regression, is set as a variable determined by the transient distribution boundary of local regions in the training time series. Thus, the localized regions are obtained using a sliding time window, and their boundaries are defined by a robust measure known as the truncated range. Utilizing an adaptive error-insensitive tube, a stabilized tolerance level for noise is achieved, whether the time series occurs in low-volatility regions or in high-volatility regions. The proposed method is evaluated by vibrational data measured on descaling pumps. The results show that ASVR is capable of capturing the local trends of the volatile time series of state indicators and is superior to the standard support vector regression for state prediction.
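
    The idea of a tube width that follows local volatility can be sketched as below. The abstract does not define the truncated range or how it maps to the tube width, so this sketch makes its own assumptions: the truncated range is taken as the range left after dropping a fixed fraction of the smallest and largest values in a trailing window, and the width is a fixed multiple of it. The names `truncated_range`, `adaptive_tube_widths`, `trim` and `scale` are hypothetical.

```python
def truncated_range(values, trim=0.1):
    """Range of the values after discarding a fraction `trim` at each end.

    Assumes 0 <= trim < 0.5 so that at least one value is kept.
    """
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return kept[-1] - kept[0]


def adaptive_tube_widths(series, window=10, trim=0.1, scale=0.5):
    """One tube width per sample, from a trailing sliding window."""
    widths = []
    for i in range(len(series)):
        start = max(0, i - window + 1)
        local = series[start:i + 1]
        widths.append(scale * truncated_range(local, trim))
    return widths
```

    In this sketch a quiet stretch of the series gets a narrow tube and a volatile stretch gets a wide one, which is the qualitative behavior the abstract describes.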

  20. On Weighted Support Vector Regression

    DEFF Research Database (Denmark)

    Han, Xixuan; Clemmensen, Line Katrine Harder

    2014-01-01

    We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...
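
    For orientation, a weighted SVR with weights attached to the slack variables in the objective function is commonly written as below. This is the standard textbook form under assumed notation (per-sample weights c_i, tube width ε, penalty C), not necessarily the exact formulation used by the authors.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad & \tfrac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} c_{i}\,(\xi_{i} + \xi_{i}^{*}) \\
\text{s.t.} \quad & y_{i} - \langle w, x_{i} \rangle - b \le \varepsilon + \xi_{i}, \\
& \langle w, x_{i} \rangle + b - y_{i} \le \varepsilon + \xi_{i}^{*}, \\
& \xi_{i} \ge 0, \quad \xi_{i}^{*} \ge 0, \qquad i = 1, \dots, n .
\end{aligned}
```

    Setting every c_i to 1 recovers the ordinary ε-insensitive SVR.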

  1. Prediction of chaotic time series based on modified minimax probability machine regression

    Institute of Scientific and Technical Information of China (English)

    Sun Jian-Cheng

    2007-01-01

    Long-term prediction of chaotic time series is very difficult, for the chaos restricts predictability. In this paper a new method is studied to model and predict chaotic time series based on minimax probability machine regression (MPMR). Since the positive global Lyapunov exponents lead the errors to increase exponentially in modelling the chaotic time series, a weighted term is introduced to compensate a cost function. Using mean square error (MSE) and absolute error (AE) as criteria, simulation results show that the proposed method is more effective and accurate for multistep prediction. It can identify the system characteristics quite well and provide a new way to make long-term predictions of the chaotic time series.

  2. Optimization of Filter by using Support Vector Regression Machine with Cuckoo Search Algorithm

    Directory of Open Access Journals (Sweden)

    M. İlarslan

    2014-09-01

    Full Text Available Herein, a new methodology using 3D electromagnetic (EM) simulator-based Support Vector Regression Machine (SVRM) models of base elements is presented for band-pass filter (BPF) design. SVRM models of elements, which are as fast as analytical equations and as accurate as a 3D EM simulator, are employed in a simple and efficient Cuckoo Search Algorithm (CSA) to optimize an ultra-wideband (UWB) microstrip BPF. CSA performance is verified by comparing it with other meta-heuristics such as the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). As an example of the proposed design methodology, a UWB BPF that operates between 3.1 GHz and 10.6 GHz is designed, fabricated and measured. The simulation and measurement results indicate the superior performance of this optimization methodology in terms of improved filter response characteristics such as return loss, insertion loss, harmonic suppression and group delay.

  3. A Comparison between Regression, Artificial Neural Networks and Support Vector Machines for Predicting Stock Market Index

    Directory of Open Access Journals (Sweden)

    Alaa F. Sheta

    2015-07-01

    Full Text Available Obtaining accurate predictions of a stock index significantly helps decision makers take correct actions to develop a better economy. The inability to predict fluctuations of the stock market might cause serious profit loss. The challenge is that we always deal with a dynamic market which is influenced by many factors, including political, financial and reserve occasions. Thus, stable, robust and adaptive approaches that can provide models capable of accurately predicting the stock index are urgently needed. In this paper, we explore the use of Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs) to build prediction models for the S&P 500 stock index. We will also show how traditional models such as multiple linear regression (MLR) behave in this case. The developed models will be evaluated and compared based on a number of evaluation criteria.

  4. Greedy and Linear Ensembles of Machine Learning Methods Outperform Single Approaches for QSPR Regression Problems.

    Science.gov (United States)

    Kew, William; Mitchell, John B O

    2015-09-01

    The application of Machine Learning to cheminformatics is a large and active field of research, but there exist few papers which discuss whether ensembles of different Machine Learning methods can improve upon the performance of their component methodologies. Here we investigated a variety of methods, including kernel-based, tree, linear, neural networks, and both greedy and linear ensemble methods. These were all tested against a standardised methodology for regression with data relevant to the pharmaceutical development process. This investigation focused on QSPR problems within drug-like chemical space. We aimed to investigate which methods perform best, and how the 'wisdom of crowds' principle can be applied to ensemble predictors. It was found that no single method performs best for all problems, but that a dynamic, well-structured ensemble predictor would perform very well across the board, usually providing an improvement in performance over the best single method. Its use of weighting factors allows the greedy ensemble to acquire a bigger contribution from the better performing models, and this helps the greedy ensemble generally to outperform the simpler linear ensemble. Choice of data preprocessing methodology was found to be crucial to performance of each method too. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. Machine learning of swimming data via wisdom of crowd and regression analysis.

    Science.gov (United States)

    Xie, Jiang; Xu, Junfu; Nie, Celine; Nie, Qing

    2017-04-01

    Every performance, in an officially sanctioned meet, by a registered USA swimmer is recorded into an online database with times dating back to 1980. For the first time, statistical analysis and machine learning methods are systematically applied to 4,022,631 swim records. In this study, we investigate performance features for all strokes as a function of age and gender. The variances in performance of males and females for different ages and strokes were studied, and the correlations of performances for different ages were estimated using the Pearson correlation. Regression analysis shows the performance trends for both males and females at different ages and suggests critical ages for peak training. Moreover, we assess twelve popular machine learning methods to predict or classify swimmer performance. Each method exhibited different strengths or weaknesses in different cases, indicating that no one method could predict well for all strokes. To address this problem, we propose a new method that combines multiple inference methods to derive a Wisdom of Crowd Classifier (WoCC). Our simulation experiments demonstrate that the WoCC is a consistent method with better overall prediction accuracy. Our study reveals several new age-dependent trends in swimming and provides an accurate method for classifying and predicting swimming times.
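
    One simple way to combine several classifiers is plurality voting, sketched below. The abstract does not state how the WoCC merges its base methods, so this is only an assumed stand-in for the general idea; the function name `crowd_vote` and the tie-breaking rule are hypothetical.

```python
from collections import Counter


def crowd_vote(predictions):
    """Return the label predicted by the most base classifiers.

    predictions: list of labels, one per base classifier.
    Ties are broken in favor of the label that appears first in the list.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label
```

    For example, three base classifiers voting "fast", "slow" and "fast" would produce "fast" under this rule.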

  6. Simultaneous chemiluminescence determination of thebaine and noscapine using support vector machine regression

    Science.gov (United States)

    Ensafi, Ali A.; Hasanpour, F.; Khayamian, T.; Mokhtari, A.; Taei, M.

    2010-02-01

    In this work, a batch chemiluminescence (CL) method has been proposed for the simultaneous determination of two structurally similar alkaloids, noscapine and thebaine. The method is based on the kinetic distinction of the CL reactions of noscapine and thebaine with the Ru(bipy)₃²⁺ and Ce(IV) system in a sulfuric acid medium. Least squares support vector machine (LS-SVM) regression was applied for relating the concentrations of both compounds to their CL profiles. The parameters of the model, σ² and γ, were optimized by constructing LS-SVM models with all possible combinations of these two parameters and selecting the model with the minimum root mean squared error of cross validation (RMSECV) as the best. The parameters of this model were then selected as optimized values. Under the optimized experimental conditions for both compounds, the detection limits obtained using the LS-SVM regression were 0.08 and 0.1 μmol L⁻¹ for noscapine and thebaine, respectively. The proposed method was utilized for the simultaneous determination of the compounds in pharmaceutical formulations and plasma samples with satisfactory results.
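
    The parameter selection described here (try every combination of σ² and γ, keep the pair with the lowest RMSECV) can be sketched with a small pure-Python LS-SVM. This is a minimal illustration under stated assumptions: a Gaussian kernel of the form exp(-d²/(2σ²)), the standard LS-SVM linear system, and interleaved folds; all function names are hypothetical and none of this is taken from the paper.

```python
import math
from itertools import product


def rbf(x, z, sigma2):
    """Gaussian kernel with variance parameter sigma2 (assumed form)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / (2.0 * sigma2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lssvm_fit(X, y, sigma2, gamma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [rbf(X[i], X[j], sigma2) + (1.0 / gamma if i == j else 0.0)
               for j in range(n)]
        A.append([1.0] + row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, dual weights alpha


def lssvm_predict(X, model, sigma2, x):
    b, alpha = model
    return b + sum(a * rbf(xi, x, sigma2) for a, xi in zip(alpha, X))


def rmsecv(X, y, sigma2, gamma, folds=5):
    """Root mean squared error of cross validation with interleaved folds."""
    sq = 0.0
    for k in range(folds):
        train = [i for i in range(len(X)) if i % folds != k]
        Xt = [X[i] for i in train]
        yt = [y[i] for i in train]
        model = lssvm_fit(Xt, yt, sigma2, gamma)
        for i in range(k, len(X), folds):
            sq += (lssvm_predict(Xt, model, sigma2, X[i]) - y[i]) ** 2
    return math.sqrt(sq / len(X))


def grid_search(X, y, sigma2_grid, gamma_grid):
    """Return the (sigma2, gamma) pair with the smallest RMSECV."""
    return min(product(sigma2_grid, gamma_grid),
               key=lambda p: rmsecv(X, y, p[0], p[1]))
```

    An exhaustive grid is simple and reproducible, but its cost grows with the product of the two grid sizes, so in practice the grids are usually kept coarse or logarithmically spaced.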

  7. Gaussian Process Regression as a machine learning tool for predicting organic carbon from soil spectra - a machine learning comparison study

    Science.gov (United States)

    Schmidt, Andreas; Lausch, Angela; Vogel, Hans-Jörg

    2016-04-01

    Diffuse reflectance spectroscopy is increasingly used as a soil analytical tool. There is a wide range of possible applications, ranging from the point scale (e.g. simple soil samples, drill cores, vertical profile scans) through the field scale to the regional and even global scale (UAV, airborne and space-borne instruments, soil reflectance databases). The basic idea is that the soil's reflectance spectrum holds information about its properties (like organic matter content or mineral composition). The relation between soil properties and the observable spectrum is usually not exactly known and is typically derived from statistical methods. Nowadays these methods are grouped under the term machine learning, which comprises a vast pool of algorithms and methods for learning the relationship between pairs of input-output data (training data set). Within this pool of methods, Gaussian Process Regression (GPR) is a newly emerging method (originating from Bayesian statistics) which is increasingly applied in different fields. For example, it was successfully used to predict vegetation parameters from hyperspectral remote sensing data. In this study we apply GPR to predict soil organic carbon from soil spectroscopy data (400 - 2500 nm). We compare it to more traditional and widely used methods such as Partial Least Squares Regression (PLSR), Random Forest (RF) and Gradient Boosted Regression Trees (GBRT). All these methods have the common ability to calculate a measure for the variable importance (wavelength importance). The main advantage of GPR is its ability to also predict the variance of the target parameter. This makes it easy to see whether a prediction is reliable or not. The ability to choose from various covariance functions makes GPR a flexible method. This allows for including different assumptions or a priori knowledge about the data. For this study we use samples from three different locations to test the prediction accuracies. One...

  8. Comprehensive modeling of monthly mean soil temperature using multivariate adaptive regression splines and support vector machine

    Science.gov (United States)

    Mehdizadeh, Saeid; Behmanesh, Javad; Khalili, Keivan

    2017-07-01

    Soil temperature (T_s) and its thermal regime are the most important factors in plant growth, biological activities, and water movement in soil. Due to scarcity of the T_s data, estimation of soil temperature is an important issue in different fields of sciences. The main objective of the present study is to investigate the accuracy of multivariate adaptive regression splines (MARS) and support vector machine (SVM) methods for estimating the T_s. For this aim, the monthly mean data of the T_s (at depths of 5, 10, 50, and 100 cm) and meteorological parameters of 30 synoptic stations in Iran were utilized. To develop the MARS and SVM models, various combinations of minimum, maximum, and mean air temperatures (T_min, T_max, T); actual and maximum possible sunshine duration; sunshine duration ratio (n, N, n/N); actual, net, and extraterrestrial solar radiation data (R_s, R_n, R_a); precipitation (P); relative humidity (RH); wind speed at 2 m height (u_2); and water vapor pressure (V_p) were used as input variables. Three error statistics including root-mean-square-error (RMSE), mean absolute error (MAE), and determination coefficient (R²) were used to check the performance of MARS and SVM models. The results indicated that the MARS was superior to the SVM at different depths. In the test and validation phases, the most accurate estimations for the MARS were obtained at the depth of 10 cm for T_max, T_min, T inputs (RMSE = 0.71 °C, MAE = 0.54 °C, and R² = 0.995) and for RH, V_p, P, and u_2 inputs (RMSE = 0.80 °C, MAE = 0.61 °C, and R² = 0.996), respectively.
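
    The three error statistics used to compare the models can be computed as sketched below. The definitions are the common textbook ones (with R² taken as one minus the ratio of residual to total sum of squares), which is an assumption because the abstract does not spell them out; the function names are hypothetical.

```python
import math


def rmse(observed, estimated):
    """Root-mean-square error."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def mae(observed, estimated):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / n


def r_squared(observed, estimated):
    """Determination coefficient as 1 - SS_res / SS_tot (assumed definition)."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

    RMSE and MAE carry the units of the target (here °C), while R² is dimensionless, which is why the abstract reports them side by side.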

  9. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    Science.gov (United States)

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R² = 0.58 vs. 0.55) or a cross-validation procedure (R² = 0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure.

  10. Multivariate analysis of fMRI time series: classification and regression of brain responses using machine learning.

    Science.gov (United States)

    Formisano, Elia; De Martino, Federico; Valente, Giancarlo

    2008-09-01

    Machine learning and pattern recognition techniques are being increasingly employed in functional magnetic resonance imaging (fMRI) data analysis. By taking into account the full spatial pattern of brain activity measured simultaneously at many locations, these methods allow detecting subtle, non-strictly localized effects that may remain invisible to the conventional analysis with univariate statistical methods. In typical fMRI applications, pattern recognition algorithms "learn" a functional relationship between brain response patterns and a perceptual, cognitive or behavioral state of a subject expressed in terms of a label, which may assume discrete (classification) or continuous (regression) values. This learned functional relationship is then used to predict the unseen labels from a new data set ("brain reading"). In this article, we describe the mathematical foundations of machine learning applications in fMRI. We focus on two methods, support vector machines and relevance vector machines, which are respectively suited for the classification and regression of fMRI patterns. Furthermore, by means of several examples and applications, we illustrate and discuss the methodological challenges of using machine learning algorithms in the context of fMRI data analysis.

  11. Modeling of Soil Aggregate Stability using Support Vector Machines and Multiple Linear Regression

    Directory of Open Access Journals (Sweden)

    Ali Asghar Besalatpour

    2016-02-01

    Full Text Available Introduction: Soil aggregate stability is a key factor in soil resistance to mechanical stresses, including the impacts of rainfall and surface runoff, and thus to water erosion (Canasveras et al., 2010). Various indicators have been proposed to characterize and quantify soil aggregate stability, for example percentage of water-stable aggregates (WSA), mean weight diameter (MWD), geometric mean diameter (GMD) of aggregates, and water-dispersible clay (WDC) content (Calero et al., 2008). Unfortunately, the experimental methods available to determine these indicators are laborious, time-consuming and difficult to standardize (Canasveras et al., 2010). Therefore, it would be advantageous if aggregate stability could be predicted indirectly from more easily available data (Besalatpour et al., 2014). The main objective of this study is to investigate the potential use of the support vector machines (SVMs) method for estimating soil aggregate stability (as quantified by GMD) as compared to the multiple linear regression approach. Materials and Methods: The study area was part of the Bazoft watershed (31° 37′ to 32° 39′ N and 49° 34′ to 50° 32′ E), which is located in the northern part of the Karun river basin in central Iran. A total of 160 soil samples were collected from the top 5 cm of the soil surface. Some easily available characteristics including topographic, vegetation, and soil properties were used as inputs. Soil organic matter (SOM) content was determined by the Walkley-Black method (Nelson & Sommers, 1986). Particle size distribution in the soil samples (clay, silt, sand, fine sand, and very fine sand) was measured using the procedure described by Gee & Bauder (1986), and calcium carbonate equivalent (CCE) content was determined by the back-titration method (Nelson, 1982). The modified Kemper & Rosenau (1986) method was used to determine wet-aggregate stability (GMD). The topographic attributes of elevation, slope, and aspect were characterized using a 20-m

  12. Hybridizing DEMD and Quantum PSO with SVR in Electric Load Forecasting

    Directory of Open Access Journals (Sweden)

    Li-Ling Peng

    2016-03-01

    Full Text Available Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by the strong non-linear learning capability of support vector regression (SVR), this paper presents an SVR model hybridized with the differential empirical mode decomposition (DEMD) method and the quantum particle swarm optimization (QPSO) algorithm for electric load forecasting. The DEMD method is employed to decompose the electric load into several detail parts associated with high frequencies (intrinsic mode functions, IMFs) and an approximate part associated with low frequencies. Hybridized with quantum theory to enhance particle searching performance, the so-called QPSO is used to optimize the parameters of the SVR. The electric load data of the New South Wales (Sydney, Australia) market and the New York Independent System Operator (NYISO, New York, USA) are used for comparing the forecasting performances of different forecasting models. The results illustrate the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.

  13. Estimation of biomass in wheat using random forest regression algorithm and remote sensing data

    Institute of Scientific and Technical Information of China (English)

    Li'ai Wang; Xudong Zhou; Xinkai Zhu; Zhaodi Dong; Wenshan Guo

    2016-01-01

    Wheat biomass can be estimated using appropriate spectral vegetation indices. However, the accuracy of estimation should be further improved for on-farm crop management. Previous studies focused on developing vegetation indices, however limited research exists on modeling algorithms. The emerging Random Forest (RF) machine-learning algorithm is regarded as one of the most precise prediction methods for regression modeling. The objectives of this study were to (1) investigate the applicability of the RF regression algorithm for remotely estimating wheat biomass, (2) test the performance of the RF regression model, and (3) compare the performance of the RF algorithm with support vector regression (SVR) and artificial neural network (ANN) machine-learning algorithms for wheat biomass estimation. Single HJ-CCD images of wheat from test sites in Jiangsu province were obtained during the jointing, booting, and anthesis stages of growth. Fifteen vegetation indices were calculated based on these images. In-situ wheat above-ground dry biomass was measured during the HJ-CCD data acquisition. The results showed that the RF model produced more accurate estimates of wheat biomass than the SVR and ANN models at each stage, and its robustness is as good as SVR but better than ANN. The RF algorithm provides a useful exploratory and predictive tool for estimating wheat biomass on a large scale in Southern China.

  15. Adaptive support vector regression for UAV flight control.

    Science.gov (United States)

    Shin, Jongho; Kim, H Jin; Kim, Youdan

    2011-01-01

    This paper explores an application of support vector regression for adaptive control of an unmanned aerial vehicle (UAV). Unlike neural networks, support vector regression (SVR) generates global solutions, because SVR basically solves quadratic programming (QP) problems. With this advantage, the input-output feedback-linearized inverse dynamic model and the compensation term for the inversion error are identified off-line, which we call I-SVR (inversion SVR) and C-SVR (compensation SVR), respectively. In order to compensate for the inversion error and the unexpected uncertainty, an online adaptation algorithm for the C-SVR is proposed. Then, the stability of the overall error dynamics is analyzed by the uniformly ultimately bounded property in the nonlinear system theory. In order to validate the effectiveness of the proposed adaptive controller, numerical simulations are performed on the UAV model.

  16. Machine Learning of the Reactor Core Loading Pattern Critical Parameters

    Directory of Open Access Journals (Sweden)

    Krešimir Trontl

    2008-01-01

    The usual approach to loading pattern optimization involves a high degree of engineering judgment, a set of heuristic rules, an optimization algorithm, and a computer code used for evaluating proposed loading patterns. The speed of the optimization process is highly dependent on the computer code used for the evaluation. In this paper, we investigate the applicability of a machine learning model which could be used for fast loading pattern evaluation. We employ a recently introduced machine learning technique, support vector regression (SVR), which is a data-driven, kernel-based, nonlinear modeling paradigm in which model parameters are automatically determined by solving a quadratic optimization problem. The main objective of the work reported in this paper was to evaluate the possibility of applying the SVR method to reactor core loading pattern modeling. We illustrate the performance of the solution and discuss its applicability, that is, complexity, speed, and accuracy.

  17. Novel Automatic Filter-Class Feature Selection for Machine Learning Regression

    DEFF Research Database (Denmark)

    Wollsen, Morten Gill; Hallam, John; Jørgensen, Bo Nørregaard

    2017-01-01

    With the increased focus on application of Big Data in all sectors of society, the performance of machine learning becomes essential. Efficient machine learning depends on efficient feature selection algorithms. Filter feature selection algorithms are model-free and therefore very fast, but require...... model in the feature selection process. PCA is often used in machine learning literature and can be considered the default feature selection method. RDESF outperformed PCA in both experiments in both prediction error and computational speed. RDESF is a new step into filter-based automatic feature...

  18. Novel Automatic Filter-Class Feature Selection for Machine Learning Regression

    DEFF Research Database (Denmark)

    Wollsen, Morten Gill; Hallam, John; Jørgensen, Bo Nørregaard

    2016-01-01

    With the increased focus on application of Big Data in all sectors of society, the performance of machine learning becomes essential. Efficient machine learning depends on efficient feature selection algorithms. Filter feature selection algorithms are model-free and therefore very fast, but require...... model in the feature selection process. PCA is often used in machine learning literature and can be considered the default feature selection method. RDESF outperformed PCA in both experiments in both prediction error and computational speed. RDESF is a new step into filter-based automatic feature...

  19. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics.

    Science.gov (United States)

    Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.

  20. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics

    Science.gov (United States)

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075

  1. A data-driven SVR model for long-term runoff prediction and uncertainty analysis based on the Bayesian framework

    Science.gov (United States)

    Liang, Zhongmin; Li, Yujie; Hu, Yiming; Li, Binquan; Wang, Jun

    2017-06-01

    Accurate and reliable long-term forecasting plays an important role in water resources management and utilization. In this paper, a hybrid model called SVR-HUP is presented to predict long-term runoff and quantify the prediction uncertainty. The model is built in three steps. First, appropriate predictors are selected according to the correlations between meteorological factors and runoff. Second, a support vector regression (SVR) model is structured and optimized based on the LibSVM toolbox and a genetic algorithm. Finally, using forecasted and observed runoff, a hydrologic uncertainty processor (HUP) based on a Bayesian framework is used to estimate the posterior probability distribution of the simulated values, and the associated uncertainty of prediction is quantitatively analyzed. Six precision evaluation indexes, including the correlation coefficient (CC), relative root mean square error (RRMSE), relative error (RE), mean absolute percentage error (MAPE), Nash-Sutcliffe efficiency (NSE), and qualification rate (QR), are used to measure the prediction accuracy. As a case study, the proposed approach is applied in the Han River basin, South Central China. Three types of SVR models are established to forecast the monthly, flood season and annual runoff volumes. The results indicate that SVR yields satisfactory accuracy and reliability at all three scales. In addition, the results suggest that the HUP can not only quantify the uncertainty of prediction based on a confidence interval but also provide a more accurate single value prediction than the initial SVR forecasting result. Thus, the SVR-HUP model provides an alternative method for long-term runoff forecasting.
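
    Three of the evaluation indexes named in this abstract (CC, MAPE, and NSE) have widely used textbook definitions, which the following minimal sketch illustrates. The function names and the runoff values are hypothetical and are not taken from the paper; the paper's exact formulas, in particular for RRMSE and QR, may differ and are not reproduced here.

```python
import math

def correlation_coefficient(obs, sim):
    """Pearson correlation coefficient (CC) between observed and simulated values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

def mape(obs, sim):
    """Mean absolute percentage error in percent (observations must be non-zero)."""
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_o = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - residual / spread

# Hypothetical runoff volumes (arbitrary units), for illustration only.
observed = [120.0, 95.0, 150.0, 210.0, 180.0]
simulated = [110.0, 100.0, 160.0, 200.0, 170.0]

print("CC  :", correlation_coefficient(observed, simulated))
print("MAPE:", mape(observed, simulated))
print("NSE :", nse(observed, simulated))
```

    An NSE of 0 corresponds to a forecast that always returns the observed mean, which is why values close to 1 are read as good agreement.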

  2. Ligand efficiency-based support vector regression models for predicting bioactivities of ligands to drug target proteins.

    Science.gov (United States)

    Sugaya, Nobuyoshi

    2014-10-27

    The concept of ligand efficiency (LE) indices is widely accepted throughout the drug design community and is frequently used in a retrospective manner in the process of drug development. For example, LE indices are used to investigate LE optimization processes of already-approved drugs and to re-evaluate hit compounds obtained from structure-based virtual screening methods and/or high-throughput experimental assays. However, LE indices could also be applied in a prospective manner to explore drug candidates. Here, we describe the construction of machine learning-based regression models in which LE indices are adopted as an end point and show that LE-based regression models can outperform regression models based on pIC50 values. In addition to pIC50 values traditionally used in machine learning studies based on chemogenomics data, three representative LE indices (ligand lipophilicity efficiency (LLE), binding efficiency index (BEI), and surface efficiency index (SEI)) were adopted, then used to create four types of training data. We constructed regression models by applying a support vector regression (SVR) method to the training data. In cross-validation tests of the SVR models, the LE-based SVR models showed higher correlations between the observed and predicted values than the pIC50-based models. Application tests on new data showed that, generally, the predictive performance of SVR models follows the order SEI > BEI > LLE > pIC50. Close examination of the distributions of the activity values (pIC50, LLE, BEI, and SEI) in the training and validation data implied that the performance order of the SVR models may be ascribed to the much higher diversity of the LE-based training and validation data. In the application tests, the LE-based SVR models can offer better predictive performance of compound-protein pairs with a wider range of ligand potencies than the pIC50-based models. This finding strongly suggests that LE-based SVR models are better than pIC50-based
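
    To make the four regression end points concrete, the sketch below computes LLE, BEI, and SEI from a pIC50 value using definitions that are common in the medicinal chemistry literature: LLE = pIC50 - logP, BEI = pIC50 per kDa of molecular weight, and SEI = pIC50 per 100 square angstroms of polar surface area. These definitions are an assumption here; the abstract does not state which variants the author used, and the function name and compound values are hypothetical.

```python
def ligand_efficiency_indices(pic50, logp, mol_weight_da, polar_surface_area_a2):
    """Return pIC50 together with three ligand efficiency indices.

    Assumed (commonly cited) definitions:
      LLE = pIC50 - logP
      BEI = pIC50 / (molecular weight in kDa)
      SEI = pIC50 / (polar surface area / 100 square angstroms)
    """
    lle = pic50 - logp
    bei = pic50 / (mol_weight_da / 1000.0)
    sei = pic50 / (polar_surface_area_a2 / 100.0)
    return {"pIC50": pic50, "LLE": lle, "BEI": bei, "SEI": sei}

# Hypothetical compound, for illustration only.
example = ligand_efficiency_indices(
    pic50=8.0, logp=3.0, mol_weight_da=400.0, polar_surface_area_a2=80.0
)
print(example)
```

    Each of the four values could serve as the target column of a separate training set, which matches the four types of training data described in the abstract.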

  3. A “Salt and Pepper” Noise Reduction Scheme for Digital Images Based on Support Vector Machines Classification and Regression

    Directory of Open Access Journals (Sweden)

    Hilario Gómez-Moreno

    2014-01-01

    We present a new impulse noise removal technique based on Support Vector Machines (SVM). Both classification and regression were used to reduce the “salt and pepper” noise found in digital images. Classification enables identification of noisy pixels, while regression provides a means to determine reconstruction values. The training vectors necessary for the SVM were generated synthetically in order to maintain control over quality and complexity. A modified median filter based on a previous noise detection stage and a regression-based filter are presented and compared to other well-known state-of-the-art noise reduction algorithms. The results show that the filters proposed achieved good results, outperforming other state-of-the-art algorithms for low and medium noise ratios, and were comparable for very highly corrupted images.

  4. Forecasting the NOK/USD Exchange Rate with Machine Learning Techniques

    OpenAIRE

    Theophilos Papadimitriou; Periklis Gogas; Vasilios Plakandaras

    2013-01-01

    In this paper, we approximate the empirical findings of Papadamou and Markopoulos (2012) on the NOK/USD exchange rate under a Machine Learning (ML) framework. By applying Support Vector Regression (SVR) on a general monetary exchange rate model and a Dynamic Evolving Neuro-Fuzzy Inference System (DENFIS) to extract model structure, we test for the validity of popular monetary exchange rate models. We reach mixed results since the coefficient sign of the interest rate differential is in favor o...

  5. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    Science.gov (United States)

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy.
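
    The validation procedure described in this abstract rests on two simple computations: a roughly 70/30 split of the 864 springs, and the area under the ROC curve. The sketch below reproduces the split sizes and computes AUC through its pairwise-ranking interpretation (the fraction of positive/negative pairs in which the positive case receives the higher score). The labels and scores are hypothetical; the study's own data and software are not shown here.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Split sizes reported in the abstract: 864 springs, about 70 % for training.
total_springs = 864
n_train = round(0.70 * total_springs)
n_valid = total_springs - n_train
print(n_train, n_valid)

# Hypothetical validation labels (1 = spring present) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(roc_auc(labels, scores))
```

    The pairwise form is quadratic in the number of samples, which is acceptable for a few hundred validation points; a rank-based formula would be preferable for large data sets.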

  6. Regional Flood Frequency Analysis using Support Vector Regression under historical and future climate

    Science.gov (United States)

    Gizaw, Mesgana Seyoum; Gan, Thian Yew

    2016-07-01

    Regional Flood Frequency Analysis (RFFA) is a statistical method widely used to estimate flood quantiles of catchments with limited streamflow data. In addition, when estimating the flood quantiles of ungauged sites, only a limited number of stations with complete datasets may be available from hydrologically similar, surrounding catchments. Besides traditional regression-based RFFA methods, recent applications of machine learning algorithms such as the artificial neural network (ANN) have shown encouraging results in regional flood quantile estimations. Another novel machine learning technique that is becoming widely applicable in the hydrologic community is Support Vector Regression (SVR). In this study, an RFFA model based on SVR was developed to estimate regional flood quantiles for two study areas, one with 26 catchments located in southeastern British Columbia (BC) and another with 23 catchments located in southern Ontario (ON), Canada. The SVR-RFFA model for both study sites was developed from 13 sets of physiographic and climatic predictors for the historical period. The Ef (Nash Sutcliffe coefficient) and R2 of the SVR-RFFA model were about 0.7 when estimating flood quantiles of 10, 25, 50 and 100 year return periods, which indicates satisfactory model performance in both study areas. In addition, the SVR-RFFA model also performed well based on other goodness-of-fit statistics such as BIAS (mean bias) and BIASr (relative BIAS). When the amount of data available for training RFFA models is limited, the SVR-RFFA model was found to perform better than an ANN-based RFFA model, with a significantly lower median CV (coefficient of variation) of the estimated flood quantiles. The SVR-RFFA model was then used to project changes in flood quantiles over the two study areas under the impact of climate change using the RCP4.5 and RCP8.5 climate projections of five Coupled Model Intercomparison Project (CMIP5) GCMs (Global Climate Models) for the 2041

  7. Carbon Nanotube Growth Rate Regression using Support Vector Machines and Artificial Neural Networks

    Science.gov (United States)

    2014-03-27

    rates are realized by this faster search. 1.3 Assumptions The machine learning approach used for extracting optimal growth parameters assumes the catalyst...and high strength polymers. [25] All carbon to carbon bonds are filled in a CNT so they are chemically inert and stable in acids, bases and solvents ...research in maximizing CNT length. SWNTs of 18.5 cm in length were obtained by using an ethanol precursor and an iron molybdenum catalyst [10]. Also, by

  8. Water Quality Prediction and Validation by Using Support Vector Regression Machines

    Institute of Scientific and Technical Information of China (English)

    武国正; 徐宗学; 李畅游

    2012-01-01

    Taking the multi-year observed pH values of Wuliangsuhai Lake, a shallow lake in an arid region, as an example, the ε-SVR support vector regression algorithm and the choice of its kernel function are analyzed. A model using ε-SVR with a radial basis kernel is then established to fit and predict the pH values. The results are compared with those of linear regression (LR), a back propagation (BP) neural network, and a radial basis function (RBF) network. The results indicate that (1) ε-SVR with the radial basis kernel simulates the data more accurately than ε-SVR with other kernels, and (2) ε-SVR achieves higher fitting accuracy and prediction accuracy than LR, the BP neural network, and the RBF network. These results support the conclusion that SVR has strong learning and generalization capacity and that the method can be used in water quality prediction.

  9. Application of Support Vector Machines Regression in Prediction Shanghai Stock Composite Index

    Institute of Scientific and Technical Information of China (English)

    Wang Dong; Wu Wen-feng

    2003-01-01

    SVMs for regression are used to forecast the Shanghai stock composite index (SSCI). By implementing the structural risk minimization principle, SVMs can overcome the over-fitting problem. The regression uses the ε-insensitive loss function. Training the SVMs leads to a quadratic programming problem that has a unique global solution. The experiment uses BP neural networks as a benchmark for comparison. The results demonstrate that the predicted SSCI values can help to find the timing for buying or selling, that the forecasting variation of SVMs is smaller than that of BP, and that the direction forecasting of SVMs is more accurate than that of BP.
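
    The ε-insensitive loss mentioned in this abstract ignores errors smaller than ε and grows linearly beyond that tube. A minimal sketch follows; the function name, the value of ε, and the sample values are illustrative assumptions and do not come from the paper.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Loss is zero inside the epsilon tube and |error| - epsilon outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Hypothetical normalized index values and forecasts, for illustration only.
actual = [1.00, 1.02, 0.98, 1.05]
forecast = [1.05, 1.30, 0.97, 0.80]

losses = [epsilon_insensitive_loss(a, f) for a, f in zip(actual, forecast)]
print(losses)
```

    Only the second and fourth forecasts fall outside the tube and contribute to the loss, which is the mechanism that makes the fitted function depend on a subset of the training points (the support vectors).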

  10. Blind identification of threshold auto-regressive model for machine fault diagnosis

    Institute of Scientific and Technical Information of China (English)

    LI Zhinong; HE Yongyong; CHU Fulei; WU Zhaotong

    2007-01-01

    A blind identification method was developed for the threshold auto-regressive (TAR) model. The method had good identification accuracy and rapid convergence, especially for higher order systems. The proposed method was then combined with the hidden Markov model (HMM) to determine the auto-regressive (AR) coefficients for each interval used for feature extraction, with the HMM as a classifier. The fault diagnoses during the speed-up and speed-down processes for rotating machinery have been successfully completed. The result of the experiment shows that the proposed method is practical and effective.

  11. Prediction of Solar Flare Size and Time-to-Flare Using Support Vector Machine Regression

    CERN Document Server

    Boucheron, Laura E; McAteer, R T James

    2015-01-01

    We study the prediction of solar flare size and time-to-flare using 38 features describing magnetic complexity of the photospheric magnetic field. This work uses support vector regression to formulate a mapping from the 38-dimensional feature space to a continuous-valued label vector representing flare size or time-to-flare. When we consider flaring regions only, we find an average error in estimating flare size of approximately half a Geostationary Operational Environmental Satellite (GOES) class. When we additionally consider non-flaring regions, we find an increased average error of approximately 3/4 of a GOES class. We also consider thresholding the regressed flare size for the experiment containing both flaring and non-flaring regions and find a true positive rate of 0.69 and a true negative rate of 0.86 for flare prediction. The results for both of these size regression experiments are consistent across a wide range of predictive time windows, indicating that the magnetic complexity fe...
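
    Thresholding a regressed value turns a regression output into a binary prediction, from which true positive and true negative rates follow directly. The sketch below shows that step under stated assumptions: the threshold, the regressed sizes, and the flaring labels are hypothetical and are not the study's data.

```python
def true_positive_and_negative_rates(is_flaring, predicted_size, threshold):
    """Classify a region as flaring when its regressed size reaches the threshold."""
    tp = fn = tn = fp = 0
    for actual, size in zip(is_flaring, predicted_size):
        predicted = size >= threshold
        if actual and predicted:
            tp += 1
        elif actual and not predicted:
            fn += 1
        elif not actual and not predicted:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels and regressed flare sizes (arbitrary units).
is_flaring = [True, True, True, False, False, False, False]
predicted_size = [2.1, 0.4, 1.5, 0.2, 1.2, 0.1, 0.3]

tpr, tnr = true_positive_and_negative_rates(is_flaring, predicted_size, threshold=1.0)
print(tpr, tnr)
```

    Raising the threshold trades true positives for true negatives, so the pair of rates reported for any such experiment depends on the threshold chosen.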

  12. Analysis and design of machine learning techniques evolutionary solutions for regression, prediction, and control problems

    CERN Document Server

    Stalph, Patrick

    2014-01-01

    Manipulating or grasping objects seems like a trivial task for humans, as these are motor skills of everyday life. Nevertheless, motor skills are not easy for humans to learn, and this is also an active research topic in robotics. However, most solutions are optimized for industrial applications and, thus, few are plausible explanations for human learning. The fundamental challenge that motivates Patrick Stalph originates from cognitive science: How do humans learn their motor skills? The author makes a connection between robotics and cognitive sciences by analyzing motor skill learning using implementations that could be found in the human brain – at least to some extent. Therefore three suitable machine learning algorithms are selected – algorithms that are plausible from a cognitive viewpoint and feasible for the roboticist. The power and scalability of those algorithms are evaluated in theoretical simulations and more realistic scenarios with the iCub humanoid robot. Convincing results confirm the...

  13. Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data.

    Science.gov (United States)

    Held, Elizabeth; Cape, Joshua; Tintle, Nathan

    2016-01-01

    Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.

  14. Study on Ice Regime Forecast Based on SVR Optimized by Particle Swarm Optimization Algorithm

    Institute of Scientific and Technical Information of China (English)

    WANG Fu-qiang; RONG Fei

    2012-01-01

    [Objective] The research aimed to study forecast models for frozen and melted dates of the river water in Ningxia-Inner Mongolia section of the Yellow River based on SVR optimized by particle swarm optimization algorithm. [Method] Correlation analysis and cause analysis were used to select suitable forecast factor combination of the ice regime. Particle swarm optimization algorithm was used to determine the optimal parameter to construct forecast model. The model was used to forecast frozen and melted dates of the river water in Ningxia-Inner Mongolia section of the Yellow River. [Result] The model had high prediction accuracy and short running time. Average forecast error was 3.51 d, and average running time was 10.464 s. Its forecast effect was better than that of the support vector regression optimized by genetic algorithm (GA) and back propagation type neural network (BPNN). It could accurately forecast frozen and melted dates of the river water. [Conclusion] SVR based on particle swarm optimization algorithm could be used for ice regime forecast.

  15. Multiobjective Optimization for Fixture Locating Layout of Sheet Metal Part Using SVR and NSGA-II

    Directory of Open Access Journals (Sweden)

    Yuan Yang

    2017-01-01

    Fixtures play a significant role in determining the spatial position of a sheet metal part (SMP) and restraining its excessive deformation in many manufacturing operations. However, it is still difficult to design and optimize the SMP fixture locating layout because there are multiple conflicting objectives and the finite element analysis (FEA) performed during the optimization process has an excessive computational cost. To this end, a new multiobjective optimization method for SMP fixture locating layout is proposed in this paper based on the support vector regression (SVR) surrogate model and the elitist nondominated sorting genetic algorithm (NSGA-II). A parametric FEA model is established using the ABAQUS™ Python script interface. The fixture locating layout is treated as the design variables, while the overall deformation and maximum deformation of the SMP under external forces serve as the multiple objective functions. First, a limited number of training and testing samples are generated by combining Latin hypercube design (LHD) with FEA. Second, two SVR prediction models corresponding to the multiple objectives are established by learning from the limited training samples and are integrated as the multiobjective optimization surrogate model. Third, NSGA-II is applied to determine the Pareto optimal solutions of the SMP fixture locating layout. Finally, a multiobjective optimization of the fixture locating layout for an aircraft fuselage skin case is conducted to illustrate and verify the proposed method.

  16. Gaussian Process Regression for Predictive But Interpretable Machine Learning Models: An Example of Predicting Mental Workload across Tasks.

    Science.gov (United States)

    Caywood, Matthew S; Roberts, Daniel M; Colombe, Jeffrey B; Greenwald, Hal S; Weiland, Monica Z

    2016-01-01

    There is increasing interest in real-time brain-computer interfaces (BCIs) for the passive monitoring of human cognitive state, including cognitive workload. Too often, however, effective BCIs based on machine learning techniques may function as "black boxes" that are difficult to analyze or interpret. In an effort toward more interpretable BCIs, we studied a family of N-back working memory tasks using a machine learning model, Gaussian Process Regression (GPR), which was both powerful and amenable to analysis. Participants performed the N-back task with three stimulus variants, auditory-verbal, visual-spatial, and visual-numeric, each at three working memory loads. GPR models were trained and tested on EEG data from all three task variants combined, in an effort to identify a model that could be predictive of mental workload demand regardless of stimulus modality. To provide a comparison for GPR performance, a model was additionally trained using multiple linear regression (MLR). The GPR model was effective when trained on individual participant EEG data, resulting in an average standardized mean squared error (sMSE) between true and predicted N-back levels of 0.44. In comparison, the MLR model using the same data resulted in an average sMSE of 0.55. We additionally demonstrate how GPR can be used to identify which EEG features are relevant for prediction of cognitive workload in an individual participant. A fraction of EEG features accounted for the majority of the model's predictive power; using only the top 25% of features performed nearly as well as using 100% of features. Subsets of features identified by linear models (ANOVA) were not as efficient as subsets identified by GPR. This raises the possibility of BCIs that require fewer model features while capturing all of the information needed to achieve high predictive accuracy.
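
    The standardized mean squared error (sMSE) used to compare GPR and MLR in this abstract is commonly defined as the mean squared error divided by the variance of the true targets, so that always predicting the target mean scores about 1. The sketch below assumes that definition, which the abstract itself does not spell out; the N-back levels and predictions are hypothetical.

```python
def standardized_mse(y_true, y_pred):
    """Mean squared error divided by the variance of the true targets."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    variance = sum((t - mean_true) ** 2 for t in y_true) / n
    return mse / variance

# Hypothetical N-back levels and model outputs, for illustration only.
true_levels = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
predictions = [1.4, 2.1, 2.5, 0.8, 2.6, 2.9]

print(standardized_mse(true_levels, predictions))
# Baseline: predicting the mean level for every trial.
print(standardized_mse(true_levels, [2.0] * 6))
```

    Under this definition, the reported values of 0.44 for GPR and 0.55 for MLR would both be improvements over the mean-prediction baseline of 1.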

  17. Gaussian Process Regression for Predictive But Interpretable Machine Learning Models: An Example of Predicting Mental Workload across Tasks

    Science.gov (United States)

    Caywood, Matthew S.; Roberts, Daniel M.; Colombe, Jeffrey B.; Greenwald, Hal S.; Weiland, Monica Z.

    2017-01-01

    There is increasing interest in real-time brain-computer interfaces (BCIs) for the passive monitoring of human cognitive state, including cognitive workload. Too often, however, effective BCIs based on machine learning techniques may function as “black boxes” that are difficult to analyze or interpret. In an effort toward more interpretable BCIs, we studied a family of N-back working memory tasks using a machine learning model, Gaussian Process Regression (GPR), which was both powerful and amenable to analysis. Participants performed the N-back task with three stimulus variants, auditory-verbal, visual-spatial, and visual-numeric, each at three working memory loads. GPR models were trained and tested on EEG data from all three task variants combined, in an effort to identify a model that could be predictive of mental workload demand regardless of stimulus modality. To provide a comparison for GPR performance, a model was additionally trained using multiple linear regression (MLR). The GPR model was effective when trained on individual participant EEG data, resulting in an average standardized mean squared error (sMSE) between true and predicted N-back levels of 0.44. In comparison, the MLR model using the same data resulted in an average sMSE of 0.55. We additionally demonstrate how GPR can be used to identify which EEG features are relevant for prediction of cognitive workload in an individual participant. A fraction of EEG features accounted for the majority of the model’s predictive power; using only the top 25% of features performed nearly as well as using 100% of features. Subsets of features identified by linear models (ANOVA) were not as efficient as subsets identified by GPR. This raises the possibility of BCIs that require fewer model features while capturing all of the information needed to achieve high predictive accuracy. PMID:28123359

  18. Spectroscopic determination of aboveground biomass in grasslands using spectral transformations, Support Vector Machine and Partial Least Squares Regression.

    Science.gov (United States)

    Marabel, Miguel; Alvarez-Taboada, Flor

    2013-08-06

    Aboveground biomass (AGB) is one of the strategic biophysical variables of interest in vegetation studies. The main objective of this study was to evaluate the Support Vector Machine (SVM) and Partial Least Squares Regression (PLSR) for estimating the AGB of grasslands from field spectrometer data and to find out which data pre-processing approach was the most suitable. The most accurate model to predict the total AGB involved PLSR and the Maximum Band Depth index derived from the continuum removed reflectance in the absorption features between 916-1,120 nm and 1,079-1,297 nm (R2 = 0.939, RMSE = 7.120 g/m2). Regarding the green fraction of the AGB, the Area Over the Minimum index derived from the continuum removed spectra provided the most accurate model overall (R2 = 0.939, RMSE = 3.172 g/m2). Identifying the appropriate absorption features was proved to be crucial to improve the performance of PLSR to estimate the total and green aboveground biomass, by using the indices derived from those spectral regions. Ordinary Least Square Regression could be used as a surrogate for the PLSR approach with the Area Over the Minimum index as the independent variable, although the resulting model would not be as accurate.

  19. Spectroscopic Determination of Aboveground Biomass in Grasslands Using Spectral Transformations, Support Vector Machine and Partial Least Squares Regression

    Directory of Open Access Journals (Sweden)

    Miguel Marabel

    2013-08-01

    Full Text Available Aboveground biomass (AGB) is one of the strategic biophysical variables of interest in vegetation studies. The main objective of this study was to evaluate the Support Vector Machine (SVM) and Partial Least Squares Regression (PLSR) for estimating the AGB of grasslands from field spectrometer data and to find out which data pre-processing approach was the most suitable. The most accurate model to predict the total AGB involved PLSR and the Maximum Band Depth index derived from the continuum removed reflectance in the absorption features between 916–1,120 nm and 1,079–1,297 nm (R2 = 0.939, RMSE = 7.120 g/m2). Regarding the green fraction of the AGB, the Area Over the Minimum index derived from the continuum removed spectra provided the most accurate model overall (R2 = 0.939, RMSE = 3.172 g/m2). Identifying the appropriate absorption features was proved to be crucial to improve the performance of PLSR to estimate the total and green aboveground biomass, by using the indices derived from those spectral regions. Ordinary Least Square Regression could be used as a surrogate for the PLSR approach with the Area Over the Minimum index as the independent variable, although the resulting model would not be as accurate.

  20. Least Square Support Vector Machine Classifier vs a Logistic Regression Classifier on the Recognition of Numeric Digits

    Directory of Open Access Journals (Sweden)

    Danilo A. López-Sarmiento

    2013-11-01

    Full Text Available This paper compares the performance of a multi-class least squares support vector machine (LSSVM mc) with that of a multi-class logistic regression classifier on the problem of recognizing handwritten numeric digits (0-9). The comparison used a data set of 5000 images of handwritten numeric digits (500 images for each number from 0-9), each image of 20 x 20 pixels. The inputs to each of the systems were vectors of 400 dimensions corresponding to each image (no feature extraction was performed). Both classifiers used the OneVsAll strategy to enable multi-classification and a random cross-validation function in the process of minimizing the cost function. The metrics of comparison were precision and training time under the same computational conditions. Both techniques showed a precision above 95 %, with LS-SVM slightly more accurate. The computational cost, however, differed markedly: LS-SVM training required 16.42 % less time than the logistic regression model under the same low computational conditions.

  1. Evaluation of SVR: A Wireless Sensor Network Routing Protocol

    Directory of Open Access Journals (Sweden)

    Javed Ali Baloch

    2014-07-01

    Full Text Available Advances in technology have made it possible to create small, low-cost sensor nodes. However, the small size and low cost of such nodes come at a price: reduced processing power, low memory and significantly smaller battery energy storage. WSNs (Wireless Sensor Networks) are inherently ad hoc in nature and are assumed to work in the toughest terrain. The network lifetime plays a pivotal role in a wireless sensor network. A long network lifetime could be achieved either by making significant changes to these low-cost devices, which is not a feasible solution, or by improving the means of communication throughout the network. Communication in such networks could be improved by employing energy-efficient routing protocols to route the data throughout the network. In this paper the SVR (Spatial Vector Routing) protocol is compared against the most common WSN routing protocols, and from the results it can be inferred that the SVR protocol outperforms its counterparts. The protocol provides an energy-efficient means of communication in the network.

  2. LS-SVR and AGO Based Time Series Prediction Method

    Institute of Scientific and Technical Information of China (English)

    ZHANG Shou-peng; LIU Shan; CHAI Wang-xu; ZHANG Jia-qi; GUO Yang-ming

    2016-01-01

    Recently, fault or health condition prediction of complex systems has become an interesting research topic. However, it is difficult to establish a precise physical model for complex systems, and time series properties often need to be incorporated for prediction in practice. Currently, LS-SVR is widely adopted for the prediction of systems with time series data. In this paper, in order to improve the prediction accuracy, the accumulated generating operation (AGO) is carried out to improve the data quality and regularity of raw time series data based on grey system theory; then, the inverse accumulated generating operation (IAGO) is performed to obtain the prediction results. In addition, because an appropriate kernel function plays an important role in improving the accuracy of prediction through LS-SVR, a modified Gaussian radial basis function (RBF) is proposed. The requirements of distance function-based kernel functions are satisfied, which ensures fast damping near the test point and moderate damping at infinity. The presented model is applied to the analysis of benchmarks. As indicated by the results, the proposed method is an effective prediction method with good precision.
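
    The AGO/IAGO pair mentioned in this abstract can be sketched briefly. The sketch assumes the first-order operation used in grey system theory, where AGO is a running sum and IAGO is first differencing. The function names and sample values are illustrative only, and the LS-SVR step in between is omitted.

```python
from itertools import accumulate

def ago(series):
    """First-order accumulated generating operation: running sum of the raw series."""
    return list(accumulate(series))

def iago(accumulated):
    """Inverse AGO: recover the raw series by differencing consecutive sums."""
    if not accumulated:
        return []
    return [accumulated[0]] + [b - a for a, b in zip(accumulated, accumulated[1:])]

raw = [2, 3, 1, 4]      # hypothetical raw measurements
smoothed = ago(raw)     # monotone for non-negative data, hence more regular
print(smoothed)         # [2, 5, 6, 10]
print(iago(smoothed))   # [2, 3, 1, 4]
```

    In a workflow like the one described, a model would be fitted to the accumulated series and its forecasts passed through `iago` to return to the original scale.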

  3. River flow prediction using hybrid models of support vector regression with the wavelet transform, singular spectrum analysis and chaotic approach

    Science.gov (United States)

    Baydaroğlu, Özlem; Koçak, Kasım; Duran, Kemal

    2017-03-01

    Prediction of water amount that will enter the reservoirs in the following month is of vital importance especially for semi-arid countries like Turkey. Climate projections emphasize that water scarcity will be one of the serious problems in the future. This study presents a methodology for predicting river flow for the subsequent month based on the time series of observed monthly river flow with hybrid models of support vector regression (SVR). Monthly river flow over the period 1940-2012 observed for the Kızılırmak River in Turkey has been used for training the method, which then has been applied for predictions over a period of 3 years. SVR is a specific implementation of support vector machines (SVMs), which transforms the observed input data time series into a high-dimensional feature space (input matrix) by way of a kernel function and performs a linear regression in this space. SVR requires a special input matrix. The input matrix was produced by wavelet transforms (WT), singular spectrum analysis (SSA), and a chaotic approach (CA) applied to the input time series. WT convolutes the original time series into a series of wavelets, and SSA decomposes the time series into a trend, an oscillatory and a noise component by singular value decomposition. CA uses a phase space formed by trajectories, which represent the dynamics producing the time series. These three methods for producing the input matrix for the SVR proved successful, while the SVR-WT combination resulted in the highest coefficient of determination and the lowest mean absolute error.
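
    The "special input matrix" that SVR needs can be illustrated with the phase-space idea behind the chaotic approach: each row holds lagged values of the series and the target is the following observation. This is only a sketch. The embedding dimension, the delay and the flow values are hypothetical, and the abstract does not say how the authors chose them.

```python
def delay_embed(series, dim, delay=1):
    """Build (rows, targets): each row has `dim` values spaced `delay` apart,
    and the target is the observation right after the last value in the row."""
    span = (dim - 1) * delay
    rows, targets = [], []
    for start in range(len(series) - span - 1):
        rows.append([series[start + k * delay] for k in range(dim)])
        targets.append(series[start + span + 1])
    return rows, targets

flow = [10, 12, 15, 11, 9, 14, 18]   # hypothetical monthly river flows
X, y = delay_embed(flow, dim=3, delay=1)
print(X[0], y[0])   # [10, 12, 15] 11
print(len(X))       # 4
```

    The resulting `X` and `y` have the shape that a regression routine, such as an SVR implementation, would expect for one-month-ahead prediction.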

  4. Improving model predictions for RNA interference activities that use support vector machine regression by combining and filtering features

    Directory of Open Access Journals (Sweden)

    Peek Andrew S

    2007-06-01

    Full Text Available Abstract. Background: RNA interference (RNAi) is a naturally occurring phenomenon that results in the suppression of a target RNA sequence utilizing a variety of possible methods and pathways. To dissect the factors that result in effective siRNA sequences, a regression kernel Support Vector Machine (SVM) approach was used to quantitatively model RNA interference activities. Results: Eight overall feature mapping methods were compared in their abilities to build SVM regression models that predict published siRNA activities. The primary factors in predictive SVM models are position specific nucleotide compositions. The secondary factors are position independent sequence motifs (N-grams) and guide strand to passenger strand sequence thermodynamics. Finally, the factors that are least contributory but are still predictive of efficacy are measures of intramolecular guide strand secondary structure and target strand secondary structure. Of these, the site of the 5' most base of the guide strand is the most informative. Conclusion: The capacity of specific feature mapping methods and their ability to build predictive models of RNAi activity suggest a relative biological importance of these features. Some feature mapping methods are more informative in building predictive models, and overall t-test filtering provides a method to remove some noisy features or make comparisons among datasets. Together, these features can yield predictive SVM regression models with increased predictive accuracy between predicted and observed activities both within datasets by cross validation, and between independently collected RNAi activity datasets. Feature filtering to remove features should be approached carefully in that it is possible to reduce feature set size without substantially reducing predictive models, but the features retained in the candidate models become increasingly distinct.
Software to perform feature prediction and SVM training and testing on nucleic acid

  5. Estimate of error bounds in the improved support vector regression

    Institute of Scientific and Technical Information of China (English)

    SUN Yanfeng; LIANG Yanchun; WU Chunguo; YANG Xiaowei; LEE Heow Pueh; LIN Wu Zhong

    2004-01-01

    An estimate of a generalization error bound of the improved support vector regression (SVR) is provided based on our previous work. The boundedness of the error of the improved SVR is proved when the algorithm is applied to function approximation.

  6. Prediction of Mind-Wandering with Electroencephalogram and Non-linear Regression Modeling.

    Science.gov (United States)

    Kawashima, Issaku; Kumano, Hiroaki

    2017-01-01

    Mind-wandering (MW), task-unrelated thought, has been examined by researchers in an increasing number of articles using models to predict whether subjects are in MW, using numerous physiological variables. However, these models are not applicable in general situations. Moreover, they output only binary classification. The current study suggests that the combination of electroencephalogram (EEG) variables and non-linear regression modeling can be a good indicator of MW intensity. We recorded EEGs of 50 subjects during the performance of a Sustained Attention to Response Task, including a thought sampling probe that inquired the focus of attention. We calculated the power and coherence value and prepared 35 patterns of variable combinations and applied Support Vector machine Regression (SVR) to them. Finally, we chose four SVR models: two of them non-linear models and the others linear models; two of the four models are composed of a limited number of electrodes to satisfy model usefulness. Examination using the held-out data indicated that all models had robust predictive precision and provided significantly better estimations than a linear regression model using single electrode EEG variables. Furthermore, in limited electrode condition, non-linear SVR model showed significantly better precision than linear SVR model. The method proposed in this study helps investigations into MW in various little-examined situations. Further, by measuring MW with a high temporal resolution EEG, unclear aspects of MW, such as time series variation, are expected to be revealed. Furthermore, our suggestion that a few electrodes can also predict MW contributes to the development of neuro-feedback studies.

  7. Robust prediction of B-factor profile from sequence using two-stage SVR based on random forest feature selection.

    Science.gov (United States)

    Pan, Xiao-Yong; Shen, Hong-Bin

    2009-01-01

    B-factor is highly correlated with protein internal motion and is used to measure the uncertainty in the position of an atom within a crystal structure. Although the rapid progress of structural biology in recent years makes more accurate protein structures available than ever, with the avalanche of new protein sequences emerging during the post-genomic era, the gap between the known protein sequences and the known protein structures becomes wider and wider. It is urgent to develop automated methods to predict the B-factor profile directly from amino acid sequences, so that the predictions can be used in basic research in a timely manner. In this article, we propose a novel approach, called PredBF, to predict the real value of B-factor. We first extract both global and local features from the protein sequences as well as their evolution information; random forest feature selection is then applied to rank their importance, and the most important features are input to a two-stage support vector regression (SVR) for prediction, where the initial predicted outputs from the first SVR are further input to the second-layer SVR for final refinement. Our results reveal that a systematic analysis of the importance of different features gives deep insight into their different contributions and is necessary for developing effective B-factor prediction tools. The two-layer SVR prediction model designed in this study further enhanced the robustness of predicting the B-factor profile. As a web server, PredBF is freely available at: http://www.csbio.sjtu.edu.cn/bioinf/PredBF for academic use.

  8. Machine Learning Based Statistical Prediction Model for Improving Performance of Live Virtual Machine Migration

    Directory of Open Access Journals (Sweden)

    Minal Patel

    2016-01-01

    Full Text Available Service can be delivered anywhere and anytime in cloud computing using virtualization. The main issue in handling virtualized resources is to balance ongoing workloads. The migration of virtual machines has two major techniques: (i) reducing dirty pages using CPU scheduling and (ii) compressing memory pages. The available techniques for live migration are not able to predict dirty pages in advance. In the proposed framework, time series based prediction techniques are developed using historical analysis of past data. The time series is generated by transferring memory pages iteratively. Here, two different regression based models of time series are proposed. The first model is developed using a statistical probability based regression model and is based on the ARIMA (autoregressive integrated moving average) model. The second one is developed using a statistical learning based regression model and uses the SVR (support vector regression) model. These models are tested on a real data set of Xen to compute downtime, total number of pages transferred, and total migration time. The ARIMA model is able to predict dirty pages with 91.74% accuracy, and the SVR model is able to predict dirty pages with 94.61% accuracy, which is higher than ARIMA.

  9. Combining Self-organizing Feature Map with Support Vector Regression Based on Expert System

    Institute of Scientific and Technical Information of China (English)

    WANG Ling; MU Zhi-Chun; GUO Hui

    2005-01-01

    A new approach is proposed to model nonlinear dynamic systems by combining a self-organizing feature map (SOM) with support vector regression (SVR) based on an expert system. The whole system has a two-stage neural network architecture. In the first stage, SOM is used as a clustering algorithm to partition the whole input space into several disjoint regions. A hierarchical architecture is adopted in the partition to avoid the problem of predetermining the number of partitioned regions. In the second stage, multiple SVRs, also called SVR experts, are built to best fit each partitioned region by combining different SVR kernel functions, which also facilitates the configuration and tuning of SVR. Finally, the approach is applied to time-series prediction problems based on the Mackey-Glass differential equation and Santa Fe data; the results show that the SVR experts improve generalization performance in comparison with the single SVR model.

  10. 2D Quantitative Structure-Property Relationship Study of Mycotoxins by Multiple Linear Regression and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Fereshteh Shiri

    2010-08-01

    Full Text Available In the present work, support vector machines (SVMs) and multiple linear regression (MLR) techniques were used for quantitative structure–property relationship (QSPR) studies of retention time (tR) in standardized liquid chromatography–UV–mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins) based on molecular descriptors calculated from the optimized 3D structures. By applying missing value, zero and multicollinearity tests with a cutoff value of 0.95, and the genetic algorithm method of variable selection, the most relevant descriptors were selected to build QSPR models. MLR and SVM methods were employed to build the QSPR models. The robustness of the QSPR models was characterized by statistical validation and the applicability domain (AD). The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability measured by r2 and q2 are 0.931 and 0.932, respectively, for SVM and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using William’s plot. The effects of different descriptors on the retention times are described.

  11. Effects of solar wind ultralow-frequency fluctuations on plasma sheet electron temperature: Regression analysis with support vector machine

    Science.gov (United States)

    Wang, Chih-Ping; Kim, Hee-Jeong; Yue, Chao; Weygand, James M.; Hsu, Tung-Shin; Chu, Xiangning

    2017-04-01

    To investigate whether ultralow-frequency (ULF) fluctuations from 0.5 to 8.3 mHz in the solar wind and interplanetary magnetic field (IMF) can affect the plasma sheet electron temperature (Te) near geosynchronous distances, we use a support vector regression machine technique to decouple the effects from different solar wind parameters and their ULF fluctuation power. Te in this region varies from 0.1 to 10 keV with a median of 1.3 keV. We find that when the solar wind ULF power is weak, Te increases with increasing southward IMF Bz and solar wind speed, while it varies weakly with solar wind density. As the ULF power becomes stronger during weak IMF Bz ( 0) or northward IMF, Te becomes significantly enhanced, by a factor of up to 10. We also find that mesoscale disturbances in a time scale of a few to tens of minutes as indicated by AE during substorm expansion and recovery phases are more enhanced when the ULF power is stronger. The effect of ULF powers may be explained by stronger inward radial diffusion resulting from stronger mesoscale disturbances under higher ULF powers, which can bring high-energy plasma sheet electrons further toward geosynchronous distance. This effect of ULF powers is particularly important during weak southward IMF or northward IMF when convection electric drift is weak.

  12. Improving the vector auto regression technique for time-series link prediction by using support vector machine

    Directory of Open Access Journals (Sweden)

    Co Jan Miles

    2016-01-01

    Full Text Available Predicting links between the nodes of a graph has become an important Data Mining task because of its direct applications to biology, social networking, communication surveillance, and other domains. Recent literature in time-series link prediction has shown that the Vector Auto Regression (VAR) technique is one of the most accurate for this problem. In this study, we apply Support Vector Machine (SVM) to improve the VAR technique that uses an unweighted adjacency matrix along with 5 matrices: Common Neighbor (CN), Adamic-Adar (AA), Jaccard’s Coefficient (JC), Preferential Attachment (PA), and Research Allocation Index (RA). A DBLP dataset covering the years from 2003 until 2013 was collected and transformed into time-sliced graph representations. The appropriate matrices were computed from these graphs, mapped to the feature space, and then used to build baseline VAR models with lag of 2 and some corresponding SVM classifiers. Using the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) as the main fitness metric, the average result of 82.04% for the VAR was improved to 84.78% with SVM. Additional experiments to handle the highly imbalanced dataset by oversampling with SMOTE and undersampling with K-means clusters, however, did not improve the average AUC-ROC of the baseline SVM.
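
    The five neighbourhood-based scores named in this abstract have standard textbook definitions, sketched below for a single node pair in an undirected graph. The adjacency dictionary and function name are hypothetical. The RA score follows the definition usually called the resource allocation index in the link-prediction literature, which is assumed to be what the abstract means.

```python
from math import log

def link_scores(adj, u, v):
    """Neighbourhood similarity scores for the node pair (u, v) of an undirected graph."""
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        "CN": len(common),                                 # common neighbours
        "JC": len(common) / len(union) if union else 0.0,  # Jaccard's coefficient
        "AA": sum(1 / log(len(adj[z])) for z in common),   # Adamic-Adar
        "RA": sum(1 / len(adj[z]) for z in common),        # resource allocation (assumed)
        "PA": len(nu) * len(nv),                           # preferential attachment
    }

# Hypothetical co-authorship graph; every edge is listed in both directions.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"b", "d"},
}
scores = link_scores(adj, "a", "b")
print(scores["CN"], scores["PA"])   # 2 6
```

    Computing these scores for every node pair in each time slice would give one matrix per score, which is the kind of input the VAR and SVM models in the abstract are described as using.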

  13. Identifying Environmental and Social Factors Predisposing to Pathological Gambling Combining Standard Logistic Regression and Logic Learning Machine.

    Science.gov (United States)

    Parodi, Stefano; Dosi, Corrado; Zambon, Antonella; Ferrari, Enrico; Muselli, Marco

    2017-03-02

    Identifying potential risk factors for problem gambling (PG) is of primary importance for planning preventive and therapeutic interventions. We illustrate a new approach based on the combination of standard logistic regression and an innovative method of supervised data mining (Logic Learning Machine or LLM). Data were taken from a pilot cross-sectional study to identify subjects with PG behaviour, assessed by two internationally validated scales (SOGS and Lie/Bet). Information was obtained from 251 gamblers recruited in six betting establishments. Data on socio-demographic characteristics, lifestyle and cognitive-related factors, and type, place and frequency of preferred gambling were obtained by a self-administered questionnaire. The following variables associated with PG were identified: instant gratification games, alcohol abuse, cognitive distortion, illegal behaviours and having started gambling with a relative or a friend. Furthermore, the combination of LLM and LR indicated the presence of two different types of PG, namely: (a) daily gamblers, more prone to illegal behaviour, with poor money management skills and who started gambling at an early age, and (b) non-daily gamblers, characterised by superstitious beliefs and a higher preference for immediate reward games. Finally, instant gratification games were strongly associated with the number of games usually played. Studies on gamblers habitually frequenting betting shops are rare. The finding of different types of PG by habitual gamblers deserves further analysis in larger studies. Advanced data mining algorithms, like LLM, are powerful tools and potentially useful in identifying risk factors for PG.

  14. Fuzzy Clustering Support Vector Machine for Predicting Regional PM2.5 Concentration

    Institute of Scientific and Technical Information of China (English)

    李海琴; 杨忠; 俞杰; 史旭华

    2016-01-01

    A fuzzy clustering support vector machine regression (FCM-SVR) algorithm is proposed by analyzing and combining the characteristics of the fuzzy C-means clustering algorithm and support vector machine regression (SVR), and is used to predict the PM2.5 particulate concentration in the air. First, the fuzzy C-means clustering algorithm divides a complex data set into multiple groups. An SVR model is then established on each group, and the models are integrated to forecast the regional PM2.5 concentration. The predictions are compared with those of a self-organizing competitive neural network SVR (SOM-SVR) model and of a single SVR model. The results show that the forecast accuracy of the FCM-SVR model is higher than that of the SOM-SVR model and the SVR model.
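
    The fuzzy C-means step can be illustrated with its standard membership rule, in which a point belongs to every cluster to a degree that decreases with its distance from the cluster centre. The sketch below uses scalar data and the usual fuzzifier m = 2. The centres and the reading are hypothetical, and the centre-update loop and the per-cluster SVR models are left out.

```python
def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy C-means membership of a scalar `point` in each cluster centre."""
    dists = [abs(point - c) for c in centers]
    zero = [d == 0 for d in dists]
    if any(zero):
        # The point sits exactly on one or more centres: share full membership among them.
        hits = sum(zero)
        return [1 / hits if z else 0.0 for z in zero]
    power = 2 / (m - 1)
    return [1 / sum((di / dj) ** power for dj in dists) for di in dists]

u = fcm_memberships(1.0, [0.0, 4.0])   # hypothetical reading and two cluster centres
print([round(x, 3) for x in u])        # [0.9, 0.1]
```

    The memberships for a point always sum to 1, so they can serve as weights when combining the predictions of the per-cluster models.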

  15. Improved Correction of Atmospheric Pressure Data Obtained by Smartphones through Machine Learning

    Directory of Open Access Journals (Sweden)

    Yong-Hyuk Kim

    2016-01-01

    Full Text Available A correction method using machine learning aims to improve the conventional linear regression (LR) based method for correction of atmospheric pressure data obtained by smartphones. The method proposed in this study conducts clustering and regression analysis with time domain classification. Data obtained using smartphones in Gyeonggi-do, one of the most populous provinces in South Korea, surrounding Seoul and with an area of 10,000 km2, from July 2014 through December 2014 were classified with respect to time of day (daytime or nighttime) as well as day of the week (weekday or weekend) and the user’s mobility, prior to the expectation-maximization (EM) clustering. Subsequently, the results were analyzed for comparison by applying machine learning methods such as multilayer perceptron (MLP) and support vector regression (SVR). The results showed a mean absolute error (MAE) 26% lower on average when regression analysis was performed through EM clustering compared to that obtained without EM clustering. For machine learning methods, the MAE for SVR was around 31% lower than for LR and about 19% lower than for MLP. It is concluded that pressure data from smartphones are as good as those from the national automatic weather station (AWS) network.

  16. An automated ranking platform for machine learning regression models for meat spoilage prediction using multi-spectral imaging and metabolic profiling.

    Science.gov (United States)

    Estelles-Lopez, Lucia; Ropodi, Athina; Pavlidis, Dimitris; Fotopoulou, Jenny; Gkousari, Christina; Peyrodie, Audrey; Panagou, Efstathios; Nychas, George-John; Mohareb, Fady

    2017-09-01

    Over the past decade, analytical approaches based on vibrational spectroscopy, hyperspectral/multispectral imaging and biomimetic sensors have gained popularity as rapid and efficient methods for assessing food quality, safety and authentication, as a sensible alternative to the expensive and time-consuming conventional microbiological techniques. Due to the multi-dimensional nature of the data generated from such analyses, the output needs to be coupled with a suitable statistical approach or machine-learning algorithms before the results can be interpreted. Choosing the optimum pattern recognition or machine learning approach for a given analytical platform is often challenging and involves a comparative analysis between various algorithms in order to achieve the best possible prediction accuracy. In this work, "MeatReg", a web-based application, is presented that automates the procedure of identifying the best machine learning method for comparing data from several analytical techniques, to predict the counts of microorganisms responsible for meat spoilage regardless of the packaging system applied. In particular, up to 7 regression methods were applied: ordinary least squares regression, stepwise linear regression, partial least square regression, principal component regression, support vector regression, random forest and k-nearest neighbours. "MeatReg" was tested with minced beef samples stored under aerobic and modified atmosphere packaging and analysed with electronic nose, HPLC, FT-IR, GC-MS and multispectral imaging instruments. Populations of total viable count, lactic acid bacteria, pseudomonads, Enterobacteriaceae and B. thermosphacta were predicted. As a result, recommendations were obtained on which analytical platforms are suitable to predict each type of bacteria and which machine learning methods to use in each case. The developed system is accessible via the link: www.sorfml.com.

  17. Soft sensor design for hydrodesulfurization process using support vector regression based on WT and PCA

    Institute of Scientific and Technical Information of China (English)

    Saeid Shokri; Mohammad Taghi Sadeghi; Mahdi Ahmadi Marvast; Shankar Narasimhan

    2015-01-01

    A novel method is proposed for developing a reliable data-driven soft sensor to improve the prediction accuracy of sulfur content in the hydrodesulfurization (HDS) process. To this end, an integrated approach using support vector regression (SVR) based on wavelet transform (WT) and principal component analysis (PCA) was used. Experimental data from the HDS setup were employed to validate the proposed model. The results reveal that integrating WT-PCA with the SVR model increased the prediction accuracy of the SVR model. The proposed model delivers the best predictive performance (EAARE = 0.058 and R2 = 0.97) in comparison with SVR. The obtained results indicate that the proposed model is more reliable and more precise than multiple linear regression (MLR), SVR and PCA-SVR.

  18. Application of support vector regression (SVR) for stream flow prediction on the Amazon basin

    CSIR Research Space (South Africa)

    Du Toit, Melise

    2016-10-01


  19. A hybrid AR-EMD-SVR model for the short-term prediction of nonlinear and non-stationary ship motion

    Institute of Scientific and Technical Information of China (English)

    Wen-yang DUAN; Li-min HUANG; Yang HAN; Ya-hui ZHANG; Shuo HUANG

    2015-01-01

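    Title: A hybrid autoregressive empirical mode decomposition support vector regression model for the very-short-term prediction of nonlinear and non-stationary ship motion. Aim: Building on the ability of the support vector regression (SVR) model to predict nonlinear time series and the strength of the empirical mode decomposition (EMD) method in handling nonlinearity and non-stationarity, a hybrid AR-EMD-SVR model is proposed to improve the accuracy of very-short-term prediction of nonlinear and non-stationary ship motion. Innovations: 1. The very-short-term prediction of nonlinear and non-stationary ship motion is studied and a hybrid prediction method is proposed; 2. Using prediction models at different levels and model test data, the influence of nonlinearity and non-stationarity on very-short-term prediction accuracy is analyzed. Methods: 1. An EMD method with end-point extension based on autoregressive (AR) prediction is introduced into the SVR model to form the hybrid AR-EMD-SVR prediction model; 2. Using motion data from tank tests of a container ship model, the AR-EMD-SVR model is compared with the AR, SVR and EMD-AR models to analyze the influence of nonlinearity and non-stationarity on very-short-term prediction and the predictive performance of the different models. Conclusions: 1. The AR-EMD method effectively overcomes the adverse effect of non-stationarity on the accuracy of very-short-term prediction models (AR and SVR); 2. The predictions based on ship model test data show that, compared with the AR, SVR and EMD-AR models, the AR-EMD-SVR model gives more accurate very-short-term predictions of nonlinear and non-stationary ship motion. Accurate and reliable short-term prediction of ship motions offers improvements in both safety and control quality in ship motion sensitive maritime operations. Inspired by the satisfactory nonlinear learning capability of a support vector regression (SVR) model and the strong non-stationary processing ability of empirical mode decomposition (EMD), this paper develops a hybrid autoregressive (AR)-EMD-SVR model for the short-term forecast of nonlinear and non-stationary ship motion. The proposed hybrid model is designed by coupling the SVR model with an AR-EMD technique, which employs an AR model in ends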

  20. Monthly evaporation forecasting using artificial neural networks and support vector machines

    Science.gov (United States)

    Tezel, Gulay; Buyukyildiz, Meral

    2016-04-01

    Evaporation is one of the most important components of the hydrological cycle, but is relatively difficult to estimate, due to its complexity, as it can be influenced by numerous factors. Estimation of evaporation is important for the design of reservoirs, especially in arid and semi-arid areas. Artificial neural network methods and support vector machines (SVM) are frequently utilized to estimate evaporation and other hydrological variables. In this study, the usability of artificial neural networks (ANNs) (multilayer perceptron (MLP) and radial basis function network (RBFN)) and ɛ-support vector regression (SVR) artificial intelligence methods was investigated to estimate monthly pan evaporation. For this aim, temperature, relative humidity, wind speed, and precipitation data for the period 1972 to 2005 from Beysehir meteorology station were used as input variables while pan evaporation values were used as output. The Romanenko and Meyer method was also considered for the comparison. The results were compared with observed class A pan evaporation data. In the MLP method, four different training algorithms, gradient descent with momentum and adaptive learning rule backpropagation (GDX), Levenberg-Marquardt (LVM), scaled conjugate gradient (SCG), and resilient backpropagation (RBP), were used. Also, the ɛ-SVR model was used as the SVR model. The models were designed via 10-fold cross-validation (CV); algorithm performance was assessed via mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²). According to the performance criteria, the ANN algorithms and ɛ-SVR had similar results. The ANNs and ɛ-SVR methods were found to perform better than the Romanenko and Meyer methods. Consequently, the best performance using the test data was obtained using SCG(4,2,2,1) with R² = 0.905.
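The three performance criteria named in this abstract (MAE, RMSE and the coefficient of determination) recur throughout the records in this list. As a minimal sketch of how they are commonly computed, the functions below use only the Python standard library; the function names and the sample values are illustrative assumptions, not data from the study. Note that R² is computed here as 1 − SS_res/SS_tot, whereas some studies report the squared correlation coefficient instead.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted values (arbitrary units).
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 8.0]

print("MAE :", mae(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("R2  :", r2(observed, predicted))
```

In a k-fold cross-validation setting, these functions would be applied to the held-out fold of each split and the results averaged.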

  1. Drought forecasting in eastern Australia using multivariate adaptive regression spline, least square support vector machine and M5Tree model

    Science.gov (United States)

    Deo, Ravinesh C.; Kisi, Ozgur; Singh, Vijay P.

    2017-02-01

    Drought forecasting using standardized metrics of rainfall is a core task in hydrology and water resources management. Standardized Precipitation Index (SPI) is a rainfall-based metric that caters for different time-scales at which the drought occurs, and due to its standardization, is well-suited for forecasting drought at different periods in climatically diverse regions. This study advances drought modelling using multivariate adaptive regression splines (MARS), least square support vector machine (LSSVM), and M5Tree models by forecasting SPI in eastern Australia. MARS model incorporated rainfall as mandatory predictor with month (periodicity), Southern Oscillation Index, Pacific Decadal Oscillation Index and Indian Ocean Dipole, ENSO Modoki and Nino 3.0, 3.4 and 4.0 data added gradually. The performance was evaluated with root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (r²). Best MARS model required different input combinations, where rainfall, sea surface temperature and periodicity were used for all stations, but ENSO Modoki and Pacific Decadal Oscillation indices were not required for Bathurst, Collarenebri and Yamba, and the Southern Oscillation Index was not required for Collarenebri. Inclusion of periodicity increased the r² value by 0.5-8.1% and reduced RMSE by 3.0-178.5%. Comparisons showed that MARS superseded the performance of the other counterparts for three out of five stations with lower MAE by 15.0-73.9% and 7.3-42.2%, respectively. For the other stations, M5Tree was better than MARS/LSSVM with lower MAE by 13.8-13.4% and 25.7-52.2%, respectively, and for Bathurst, LSSVM yielded more accurate result. For droughts identified by SPI ≤ −0.5, accurate forecasts were attained by MARS/M5Tree for Bathurst, Yamba and Peak Hill, whereas for Collarenebri and Barraba, M5Tree was better than LSSVM/MARS. Seasonal analysis revealed disparate results where MARS/M5Tree was better than LSSVM. The results highlight the

  2. Monthly river flow forecasting using artificial neural network and support vector regression models coupled with wavelet transform

    Science.gov (United States)

    Kalteh, Aman Mohammad

    2013-04-01

    Reliable and accurate forecasts of river flow are needed in many water resources planning, design development, operation and maintenance activities. In this study, the relative accuracy of artificial neural network (ANN) and support vector regression (SVR) models coupled with wavelet transform in monthly river flow forecasting is investigated, and compared to regular ANN and SVR models, respectively. The relative performance of regular ANN and SVR models is also compared to each other. For this, monthly river flow data of Kharjegil and Ponel stations in Northern Iran are used. The comparison of the results reveals that both ANN and SVR models coupled with wavelet transform are able to provide more accurate forecasting results than the regular ANN and SVR models. However, it is found that SVR models coupled with wavelet transform provide better forecasting results than ANN models coupled with wavelet transform. The results also indicate that regular SVR models perform slightly better than regular ANN models.

  3. Retrieval of aerosol optical depth from surface solar radiation measurements using machine learning algorithms, non-linear regression and a radiative transfer-based look-up table

    Science.gov (United States)

    Huttunen, Jani; Kokkola, Harri; Mielonen, Tero; Esa Juhani Mononen, Mika; Lipponen, Antti; Reunanen, Juha; Vilhelm Lindfors, Anders; Mikkonen, Santtu; Erkki Juhani Lehtinen, Kari; Kouremeti, Natalia; Bais, Alkiviadis; Niska, Harri; Arola, Antti

    2016-07-01

    In order to have a good estimate of the current forcing by anthropogenic aerosols, knowledge on past aerosol levels is needed. Aerosol optical depth (AOD) is a good measure for aerosol loading. However, dedicated measurements of AOD are only available from the 1990s onward. One option to lengthen the AOD time series beyond the 1990s is to retrieve AOD from surface solar radiation (SSR) measurements taken with pyranometers. In this work, we have evaluated several inversion methods designed for this task. We compared a look-up table method based on radiative transfer modelling, a non-linear regression method and four machine learning methods (Gaussian process, neural network, random forest and support vector machine) with AOD observations carried out with a sun photometer at an Aerosol Robotic Network (AERONET) site in Thessaloniki, Greece. Our results show that most of the machine learning methods produce AOD estimates comparable to the look-up table and non-linear regression methods. All of the applied methods produced AOD values that corresponded well to the AERONET observations with the lowest correlation coefficient value being 0.87 for the random forest method. While many of the methods tended to slightly overestimate low AODs and underestimate high AODs, neural network and support vector machine showed overall better correspondence for the whole AOD range. The differences in producing both ends of the AOD range seem to be caused by differences in the aerosol composition. High AODs were in most cases those with high water vapour content which might affect the aerosol single scattering albedo (SSA) through uptake of water into aerosols. Our study indicates that machine learning methods benefit from the fact that they do not constrain the aerosol SSA in the retrieval, whereas the LUT method assumes a constant value for it. This would also mean that machine learning methods could have potential in reproducing AOD from SSR even though SSA would have changed during

  4. SVR-D1.2: A Prediction Model for Population Occurrence of Paddy Stem Borer

    Directory of Open Access Journals (Sweden)

    Lichuan Gu

    2012-08-01

    Full Text Available In this study, we analyse the SVR-based prediction method for selecting the optimal model framework based on the kernel matrix. Moreover, SVR-D1.2 is proposed with the help of the kernel matrix’s symmetry and positive definiteness and kernel alignment. Test results show that a nonlinear relation does exist between the insect population occurrence and the meteorological factors, and that the new prediction model, SVR-D1.2, improves prediction accuracy compared with other methods.

  5. Short-Term Wind Speed Forecasting Using Support Vector Regression Optimized by Cuckoo Optimization Algorithm

    Directory of Open Access Journals (Sweden)

    Jianzhou Wang

    2015-01-01

    Full Text Available This paper develops an effective intelligent model to forecast short-term wind speed series. A hybrid forecasting technique is proposed based on the recurrence plot (RP) and optimized support vector regression (SVR). Wind, caused by the interaction of meteorological systems, is extremely unsteady and difficult to forecast. To understand the wind system, the wind speed series is analyzed using RP. Then, the SVR model is employed to forecast wind speed, in which the input variables are selected by RP, and two crucial parameters, the penalty factor and the gamma of the RBF kernel function, are optimized by various optimization algorithms. These optimization algorithms are the genetic algorithm (GA), particle swarm optimization algorithm (PSO), and cuckoo optimization algorithm (COA). Finally, the optimized SVR models, including COA-SVR, PSO-SVR, and GA-SVR, are evaluated based on some criteria and a hypothesis test. The experimental results show that (1) analysis of RP reveals that wind speed has short-term predictability on a short-term time scale, (2) the performance of the COA-SVR model is superior to that of the PSO-SVR and GA-SVR methods, especially for the jumping samplings, and (3) the COA-SVR method is statistically robust in multi-step-ahead prediction and can be applied to practical wind farm applications.
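The two tuned parameters named in this abstract have distinct roles in an RBF-kernel SVR. As a minimal sketch (the function name `rbf_kernel` and the sample values are illustrative assumptions, not taken from the paper), the kernel below shows how gamma sets how quickly similarity decays with distance. The penalty factor, usually written C, separately controls how strongly training errors outside the tolerance tube are penalized, and is not shown here.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel k(x, z) = exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

# Hypothetical lagged wind speeds (m/s) forming two input vectors.
x = [5.0, 5.5, 6.0]
z = [5.0, 6.5, 6.0]

for gamma in (0.1, 1.0, 10.0):
    # A larger gamma makes the similarity drop faster with distance,
    # so each training point influences a smaller neighbourhood.
    print(gamma, rbf_kernel(x, z, gamma))
```

An optimizer such as GA, PSO or COA would search over (C, gamma) pairs and keep the pair with the lowest validation error; a plain grid search is the simplest baseline for the same task.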

  6. Application of phase space reconstruction and v-SVR algorithm in predicting displacement of underground engineering surrounding rock

    Institute of Scientific and Technical Information of China (English)

    SHI Chao; CHEN Yi-feng; YU Zhi-xiong; YANG Kun

    2006-01-01

    A new method for predicting the trend of displacement evolution of surrounding rock is presented in this paper. Based on the nonlinear characteristics of the displacement time series of underground engineering surrounding rock, phase space reconstruction theory, and the powerful nonlinear mapping ability of support vector machines, the information offered by the time series data sets is fully exploited and the non-linearity of the displacement evolution system of surrounding rock is well described. The example suggests that the methods based on phase space reconstruction and the modified v-SVR algorithm are very accurate, and that the study can help to build a displacement forecast system to analyze the stability of underground engineering surrounding rock.
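Phase space reconstruction is typically done by time-delay embedding, which turns a scalar series into vectors of lagged values that a regressor such as SVR can take as input. The sketch below illustrates the idea under assumed settings: the function names, the embedding dimension `dim`, the delay `tau`, and the displacement values are all hypothetical and not taken from the paper.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors [x(t), x(t + tau), ..., x(t + (dim - 1) * tau)]."""
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[t + k * tau] for k in range(dim)] for t in range(n_vectors)]

def make_training_pairs(series, dim, tau, horizon=1):
    """Pair each delay vector with the value `horizon` steps after its last entry."""
    pairs = []
    for t, vector in enumerate(delay_embed(series, dim, tau)):
        target_index = t + (dim - 1) * tau + horizon
        if target_index < len(series):
            pairs.append((vector, series[target_index]))
    return pairs

# Hypothetical displacement readings (mm) at equal time intervals.
displacement = [1.0, 1.2, 1.5, 1.9, 2.4, 3.0, 3.7]
pairs = make_training_pairs(displacement, dim=3, tau=1)

for inputs, target in pairs:
    print(inputs, "->", target)
```

Each (inputs, target) pair is one training sample for the regressor. In practice `dim` and `tau` are chosen from the data, for example with false-nearest-neighbour and mutual-information criteria.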

  7. Simulator verification of thyristor controlled series capacitor SVR (Synchronous Voltage Reversal) scheme

    Energy Technology Data Exchange (ETDEWEB)

    Dickmander, D.L. [ABB Power T and D Company Inc., Raleigh, NC (United States). Transmission Technology Inst.; Rudin, S. [ABB Power Systems AB, Vaesteraas (Sweden). Reactive Power Compensation Div.

    1995-12-31

    This paper presents a simulator small-signal verification study conducted for a new Thyristor Controlled Series Capacitor (TCSC) control scheme referred to as the Synchronous Voltage Reversal (SVR) scheme. The goal of the SVR scheme is to achieve an inherently inductive sub-synchronous impedance characteristic for the TCSC, while preserving the capability to add higher level control loops. A detailed TCSC control model using field-proven digital control hardware, and programmed with the SVR scheme, was incorporated into a simulator representation of the IEEE First Benchmark system. Detailed measurements are presented in the paper to demonstrate that the SVR scheme successfully mitigates SSR (sub-synchronous resonance) conditions for the studied system. 8 refs, 13 figs, 2 tabs

  8. A soft self-repairing for FBG sensor network in SHM system based on PSO-SVR model reconstruction

    Science.gov (United States)

    Zhang, Xiaoli; Wang, Peng; Liang, Dakai; Fan, Chunfeng; Li, Cailing

    2015-05-01

    A structural health monitoring (SHM) system uses an array of sensors to continuously monitor a structure and provide early predictions, such as the damage position and damage degree. Such a system must monitor the structure under any conditions, including adverse ones. Therefore, it must be robust and survivable, and should even have a self-repairing ability. In this study, a model reconstruction predicting algorithm based on particle swarm optimization-support vector regression (PSO-SVR) is proposed to achieve self-repair of the Fiber Bragg Grating (FBG) sensor network in an SHM system. Furthermore, an eight-point FBG sensor SHM system is tested on an aircraft wing box. For damage loading position prediction on the aircraft wing box, six kinds of disabled modes are experimentally studied to verify the self-repairing ability of the FBG sensor network in the SHM system, and the prediction performance is compared with that of a PSO-SVR model without reconstruction. The results indicate that the model reconstruction algorithm outperforms the non-reconstruction model: when some sensors in the FBG-based SHM system are invalid, the prediction performance of the model reconstruction algorithm is almost the same as when no sensor is invalid. In this way, self-repair of the FBG sensor network is achieved for the SHM system, so the reliability and survivability of the FBG-based SHM system are enhanced when some FBG sensors are invalid.

  9. Multiple local feature representations and their fusion based on an SVR model for iris recognition using optimized Gabor filters

    Science.gov (United States)

    He, Fei; Liu, Yuanning; Zhu, Xiaodong; Huang, Chun; Han, Ye; Dong, Hongxing

    2014-12-01

    Gabor descriptors have been widely used in iris texture representations. However, fixed basic Gabor functions cannot match the changing nature of diverse iris datasets. Furthermore, a single form of iris feature cannot overcome difficulties in iris recognition, such as illumination variations, environmental conditions, and device variations. This paper provides multiple local feature representations and their fusion scheme based on a support vector regression (SVR) model for iris recognition using optimized Gabor filters. In our iris system, a particle swarm optimization (PSO)- and a Boolean particle swarm optimization (BPSO)-based algorithm is proposed to provide suitable Gabor filters for each involved test dataset without predefinition or manual modulation. Several comparative experiments on JLUBR-IRIS, CASIA-I, and CASIA-V4-Interval iris datasets are conducted, and the results show that our work can generate improved local Gabor features by using optimized Gabor filters for each dataset. In addition, our SVR fusion strategy may make full use of their discriminative ability to improve accuracy and reliability. Other comparative experiments show that our approach may outperform other popular iris systems.

  10. Fast Prediction with Sparse Multikernel LS-SVR Using Multiple Relevant Time Series and Its Application in Avionics System

    Directory of Open Access Journals (Sweden)

    Yang M. Guo

    2015-01-01

    Full Text Available Health trend prediction is critical to ensure the safe operation of highly reliable systems. However, complex systems often present complex dynamic behaviors and uncertainty, which makes it difficult to develop a precise physical prediction model. Therefore, time series are often used for prediction in this case. In this paper, in order to obtain better prediction accuracy in shorter computation time, we propose a new scheme which utilizes multiple relevant time series to enhance the completeness of the information and adopts a prediction model based on least squares support vector regression (LS-SVR) to perform prediction. In the scheme, we apply two innovative ways to overcome the drawbacks of the reported approaches. One is to remove certain support vectors by measuring the linear correlation to increase the sparseness of LS-SVR; the other is to determine the linear combination weights of multiple kernels by calculating the root mean squared error of each basis kernel. The results of prediction experiments indicate preliminarily that the proposed method is an effective approach, with good prediction accuracy and low computation time, and that it is a valuable method in applications.

  11. Support vector regression-based internal model control

    Institute of Scientific and Technical Information of China (English)

    HUANG Yan-wei; PENG Tie-gen

    2007-01-01

    This paper proposes a design of internal model control systems for processes with delay using support vector regression (SVR). The proposed system fully exploits the excellent nonlinear estimation performance of SVR under the structural risk minimization principle. Closed-system stability and steady error are analyzed in the presence of modeling errors. The simulations show that the proposed control systems achieve better control performance than neural-network-based ones when the training samples are few and noisy.

  12. Survival Prediction and Feature Selection in Patients with Breast Cancer Using Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Shahrbanoo Goli

    2016-01-01

    Full Text Available The Support Vector Regression (SVR) model has been broadly used for response prediction. However, few researchers have used SVR for survival analysis. In this study, a new SVR model is proposed and SVR with different kernels and the traditional Cox model are trained. The models are compared based on different performance measures. We also select the best subset of features using three feature selection methods: combination of SVR and statistical tests, univariate feature selection based on concordance index, and recursive feature elimination. The evaluations are performed using available medical datasets and also a Breast Cancer (BC) dataset consisting of 573 patients who visited the Oncology Clinic of Hamadan province in Iran. Results show that, for the BC dataset, survival time can be predicted more accurately by linear SVR than nonlinear SVR. Based on the three feature selection methods, metastasis status, progesterone receptor status, and human epidermal growth factor receptor 2 status are the best features associated with survival. Also, according to the obtained results, performance of linear and nonlinear kernels is comparable. The proposed SVR model performs similarly to or slightly better than other models. Also, SVR performs similarly to or better than Cox when all features are included in the model.

  13. Application of Multi-task Sparse Group Lasso Feature Extraction and Support Vector Machine Regression in the Stellar Atmospheric Parametrization

    Science.gov (United States)

    Gao, W.; Li, X. R.

    2016-07-01

    Multi-task learning analyses and computes multiple tasks together to discover the correlations between them, which can improve the accuracy of the analysis results. Such methods have been widely studied in machine learning, pattern recognition, computer vision, and other related fields. This paper investigates the application of multi-task learning in estimating the effective temperature (T_eff), surface gravity (lg g), and chemical abundance ([Fe/H]). Firstly, the spectral characteristics of the three atmospheric physical parameters are extracted by using the multi-task Sparse Group Lasso algorithm, and then the support vector machine is used to estimate the atmospheric physical parameters. The proposed scheme is evaluated on both Sloan stellar spectra and theoretical spectra computed from Kurucz's New Opacity Distribution Function (NEWODF) model. The mean absolute errors (MAEs) on the Sloan spectra are: 0.0064 for lg (T_eff/K), 0.1622 for lg (g/(cm·s⁻²)), and 0.1221 dex for [Fe/H]; the MAEs on synthetic spectra are 0.0006 for lg (T_eff/K), 0.0098 for lg (g/(cm·s⁻²)), and 0.0082 dex for [Fe/H]. Experimental results show that the proposed scheme is excellent for atmospheric parameter estimation.

  14. Application of Multi-task Sparse Lasso Feature Extraction and Support Vector Machine Regression in the Stellar Atmospheric Parameterization

    Science.gov (United States)

    Gao, Wei; Li, Xiang-ru

    2017-07-01

    Multi-task learning analyses and computes multiple tasks together so as to dig out the correlations among them, and thereby improves the accuracy of the analyzed results. Such methods have been widely applied in machine learning, pattern recognition, computer vision, and other related fields. This paper investigates the application of multi-task learning in estimating the stellar atmospheric parameters, including the surface temperature (T_eff), surface gravitational acceleration (lg g), and chemical abundance ([Fe/H]). Firstly, the spectral features of the three stellar atmospheric parameters are extracted by using the multi-task sparse group Lasso algorithm, then the support vector machine is used to estimate the atmospheric physical parameters. The proposed scheme is evaluated on both the Sloan stellar spectra and the theoretical spectra computed from the Kurucz's New Opacity Distribution Function (NEWODF) model. The mean absolute errors (MAEs) on the Sloan spectra are: 0.0064 for lg (T_eff/K), 0.1622 for lg (g/(cm·s⁻²)), and 0.1221 dex for [Fe/H]; the MAEs on the synthetic spectra are 0.0006 for lg (T_eff/K), 0.0098 for lg (g/(cm·s⁻²)), and 0.0082 dex for [Fe/H]. Experimental results show that the proposed scheme has a rather high accuracy for the estimation of stellar atmospheric parameters.

  15. Linear and support vector regressions based on geometrical correlation of data

    Directory of Open Access Journals (Sweden)

    Kaijun Wang

    2007-10-01

    Full Text Available Linear regression (LR) and support vector regression (SVR) are widely used in data analysis. Geometrical correlation learning (GcLearn) was proposed recently to improve the predictive ability of LR and SVR through mining and using correlations between data of a variable (inner correlation). This paper theoretically analyzes the prediction performance of the GcLearn method and proves that GcLearn LR and SVR will have better prediction performance than traditional LR and SVR for prediction tasks when good inner correlations are obtained and predictions by traditional LR and SVR are far away from their neighbor training data under inner correlation. This gives the applicable condition of the GcLearn method.

  16. Application of Machine-Learning Models to Predict Tacrolimus Stable Dose in Renal Transplant Recipients

    Science.gov (United States)

    Tang, Jie; Liu, Rong; Zhang, Yue-Li; Liu, Mou-Ze; Hu, Yong-Fang; Shao, Ming-Jie; Zhu, Li-Jun; Xin, Hua-Wen; Feng, Gui-Wen; Shang, Wen-Jun; Meng, Xiang-Guang; Zhang, Li-Rong; Ming, Ying-Zi; Zhang, Wei

    2017-01-01

    Tacrolimus has a narrow therapeutic window and considerable variability in clinical use. Our goal was to compare the performance of multiple linear regression (MLR) and eight machine learning techniques in pharmacogenetic algorithm-based prediction of tacrolimus stable dose (TSD) in a large Chinese cohort. A total of 1,045 renal transplant patients were recruited, 80% of whom were randomly selected as the “derivation cohort” to develop a dose-prediction algorithm, while the remaining 20% constituted the “validation cohort” to test the final selected algorithm. MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied and their performances were compared in this work. Among all the machine learning models, RT performed best in both derivation [0.71 (0.67–0.76)] and validation cohorts [0.73 (0.63–0.82)]. In addition, the ideal rate of RT was 4% higher than that of MLR. To our knowledge, this is the first study to use machine learning models to predict TSD, which will further facilitate personalized medicine in tacrolimus administration in the future. PMID:28176850

  17. Application of Machine-Learning Models to Predict Tacrolimus Stable Dose in Renal Transplant Recipients

    Science.gov (United States)

    Tang, Jie; Liu, Rong; Zhang, Yue-Li; Liu, Mou-Ze; Hu, Yong-Fang; Shao, Ming-Jie; Zhu, Li-Jun; Xin, Hua-Wen; Feng, Gui-Wen; Shang, Wen-Jun; Meng, Xiang-Guang; Zhang, Li-Rong; Ming, Ying-Zi; Zhang, Wei

    2017-02-01

    Tacrolimus has a narrow therapeutic window and considerable variability in clinical use. Our goal was to compare the performance of multiple linear regression (MLR) and eight machine learning techniques in pharmacogenetic algorithm-based prediction of tacrolimus stable dose (TSD) in a large Chinese cohort. A total of 1,045 renal transplant patients were recruited, 80% of whom were randomly selected as the “derivation cohort” to develop a dose-prediction algorithm, while the remaining 20% constituted the “validation cohort” to test the final selected algorithm. MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied and their performances were compared in this work. Among all the machine learning models, RT performed best in both derivation [0.71 (0.67–0.76)] and validation cohorts [0.73 (0.63–0.82)]. In addition, the ideal rate of RT was 4% higher than that of MLR. To our knowledge, this is the first study to use machine learning models to predict TSD, which will further facilitate personalized medicine in tacrolimus administration in the future.

  18. Integrating principal component analysis and vector quantization with support vector regression for sulfur content prediction in HDS process

    Directory of Open Access Journals (Sweden)

    Shokri Saeid

    2015-01-01

    Full Text Available An accurate prediction of sulfur content is very important for the proper operation and product quality control in the hydrodesulfurization (HDS) process. For this purpose, a reliable data-driven soft sensor utilizing Support Vector Regression (SVR) was developed, and the effects of integrating Vector Quantization (VQ) with Principal Component Analysis (PCA) on the performance of this soft sensor were studied. First, in the pre-processing step, the PCA and VQ techniques were used to reduce the dimensions of the original input datasets. Then, the compressed datasets were used as input variables for the SVR model. Experimental data from the HDS setup were employed to validate the proposed integrated model. The integration of VQ/PCA techniques with the SVR model was able to increase the prediction accuracy of SVR. The obtained results show that the integrated technique (VQ-SVR) was better than (PCA-SVR) in prediction accuracy. Also, VQ decreased the sum of the training and test time of the SVR model in comparison with PCA. For further evaluation, the performance of the VQ-SVR model was also compared to that of SVR. The obtained results indicated that the VQ-SVR model delivered the best prediction performance (AARE = 0.0668 and R² = 0.995) in comparison with the investigated models.

  19. Machine Learning Multi-Stage Classification and Regression in the Search for Vector-like Quarks and the Neyman Construction in Signal Searches

    CERN Document Server

    Leone, Robert Matthew

    A search for vector-like quarks (VLQs) decaying to a Z boson using multi-stage machine learning was compared to a search using a standard square cuts search strategy. VLQs are predicted by several new theories beyond the Standard Model. The searches used 20.3 inverse femtobarns of proton-proton collisions at a center-of-mass energy of 8 TeV collected with the ATLAS detector in 2012 at the CERN Large Hadron Collider. CLs upper limits on production cross sections of vector-like top and bottom quarks were computed for VLQs produced singly or in pairs, Tsingle, Bsingle, Tpair, and Bpair. The two stage machine learning classification search strategy did not provide any improvement over the standard square cuts strategy, but for Tpair, Bpair, and Tsingle, a third stage of machine learning regression was able to lower the upper limits of high signal masses by as much as 50%. Additionally, new test statistics were developed for use in the Neyman construction of confidence regions in order to address deficiencies in c...

  20. Support Vector Regression Model Based on Empirical Mode Decomposition and Auto Regression for Electric Load Forecasting

    Directory of Open Access Journals (Sweden)

    Hong-Juan Li

    2013-04-01

    Full Text Available Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by the strong non-linear learning capability of support vector regression (SVR), this paper presents an SVR model hybridized with the empirical mode decomposition (EMD) method and auto regression (AR) for electric load forecasting. The electric load data of the New South Wales (Australia) market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.

  1. Determination of the cutting forces regression functions for milling machining of the X105CrMo17 material

    Science.gov (United States)

    Popovici, T. D.; Dijmărescu, M. R.

    2017-08-01

    The aim of the research presented in this paper is to determine a cutting force prediction model for milling of the X105CrMo17 stainless steel. The analysed material is a martensitic stainless steel which, due to its high carbon (∼1%) and chromium (∼17%) content, has high hardness and good corrosion resistance. This material is used for steel structure parts that are subject to wear in corrosive environments, and for making valve seats, bearings, various types of cutters, high hardness bushings, casting shells and nozzles, measuring instruments, etc. The paper is structured into three main parts in accordance with the considered research program; they are preceded by an introduction and followed by relevant conclusions. In the first part, for a more detailed knowledge of the material characteristics, a qualitative and quantitative X-ray micro-analysis and a spectral analysis were performed. The second part presents the physical experiment in terms of input, necessary means, process and registration of the experimental data. In the third part, the experimental data is analysed and the cutting force model is developed in terms of the cutting regime parameters such as cutting speed, feed rate, axial depth and radial depth.

  2. Incremental learning for ν-Support Vector Regression.

    Science.gov (United States)

    Gu, Bin; Sheng, Victor S; Wang, Zhijie; Ho, Derek; Osman, Said; Li, Shuo

    2015-07-01

    The ν-Support Vector Regression (ν-SVR) is an effective regression learning algorithm, which has the advantage of using a parameter ν on controlling the number of support vectors and adjusting the width of the tube automatically. However, compared to ν-Support Vector Classification (ν-SVC) (Schölkopf et al., 2000), ν-SVR introduces an additional linear term into its objective function. Thus, directly applying the accurate on-line ν-SVC algorithm (AONSVM) to ν-SVR will not generate an effective initial solution. It is the main challenge to design an incremental ν-SVR learning algorithm. To overcome this challenge, we propose a special procedure called initial adjustments in this paper. This procedure adjusts the weights of ν-SVC based on the Karush-Kuhn-Tucker (KKT) conditions to prepare an initial solution for the incremental learning. Combining the initial adjustments with the two steps of AONSVM produces an exact and effective incremental ν-SVR learning algorithm (INSVR). Theoretical analysis has proven the existence of the three key inverse matrices, which are the cornerstones of the three steps of INSVR (including the initial adjustments), respectively. The experiments on benchmark datasets demonstrate that INSVR can avoid the infeasible updating paths as far as possible, and successfully converges to the optimal solution. The results also show that INSVR is faster than batch ν-SVR algorithms with both cold and warm starts. Copyright © 2015 Elsevier Ltd. All rights reserved.
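
    To make the "additional linear term" concrete, the two dual problems can be placed side by side. The forms below follow the standard formulation used in common SVM solvers, not necessarily the notation of the paper; the symbols are assumptions of this sketch: ℓ training points, kernel K, all-ones vector e, and Q_ij = y_i y_j K(x_i, x_j) for ν-SVC but Q_ij = K(x_i, x_j) for ν-SVR.

```latex
% nu-SVC dual: the objective is purely quadratic in the multipliers
\min_{\alpha}\ \tfrac{1}{2}\,\alpha^{\top} Q\,\alpha
\quad \text{s.t.}\quad y^{\top}\alpha = 0,\quad e^{\top}\alpha \ge \nu,\quad 0 \le \alpha_i \le 1/\ell

% nu-SVR dual: same quadratic structure plus a term that is linear in the multipliers
\min_{\alpha,\alpha^{*}}\ \tfrac{1}{2}\,(\alpha-\alpha^{*})^{\top} Q\,(\alpha-\alpha^{*}) + y^{\top}(\alpha-\alpha^{*})
\quad \text{s.t.}\quad e^{\top}(\alpha-\alpha^{*}) = 0,\quad e^{\top}(\alpha+\alpha^{*}) \le C\nu,\quad 0 \le \alpha_i,\ \alpha_i^{*} \le C/\ell
```

    The term y^T(α − α*) has no counterpart in the ν-SVC objective, which is presumably why an incremental ν-SVC procedure cannot be reused for ν-SVR without an adjustment step.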

  3. Multiple-output support vector machine regression with feature selection for arousal/valence space emotion assessment.

    Science.gov (United States)

    Torres-Valencia, Cristian A; Álvarez, Mauricio A; Orozco-Gutiérrez, Alvaro A

    2014-01-01

    Human emotion recognition (HER) allows the assessment of an affective state of a subject. Until recently, such emotional states were described in terms of discrete emotions, like happiness or contempt. In order to cover a wide range of emotions, researchers in the field have introduced different dimensional spaces for emotion description that allow the characterization of affective states in terms of several variables or dimensions that measure distinct aspects of the emotion. One of the most common of such dimensional spaces is the bidimensional Arousal/Valence space. To the best of our knowledge, all HER systems so far have modelled the dimensions in these dimensional spaces independently. In this paper, we study the effect of modelling the output dimensions simultaneously and show experimentally the advantages of modelling them in this way. We consider a multimodal approach by including features from the Electroencephalogram and a few physiological signals. For modelling the multiple outputs, we employ a multiple output regressor based on support vector machines. We also include a stage of feature selection that is developed within an embedded approach known as Recursive Feature Elimination (RFE), proposed initially for SVM. The results show that several features can be eliminated using the multiple output support vector regressor with RFE without affecting the performance of the regressor. From the analysis of the features selected in smaller subsets via RFE, it can be observed that the signals that are most informative for discrimination in the arousal and valence space are the EEG, Electrooculogram/Electromyogram (EOG/EMG) and the Galvanic Skin Response (GSR).

  4. Estimation of algal colonization growth on mortar surface using a hybridization of machine learning and metaheuristic optimization

    Indian Academy of Sciences (India)

    THU-HIEN TRAN; NHAT-DUC HOANG

    2017-06-01

    Estimation of the algal colonization growth on façade structures can provide useful information for the task of building maintenance. This research proposes a machine learning method based on the least squares support vector regression (LS-SVR) for modelling the growth time of the green alga Klebsormidium flaccidum on mortar surfaces. Furthermore, to identify an appropriate set of the LS-SVR hyper-parameters, the flower pollination algorithm (FPA) is employed as an optimization technique. The characteristics of the mortar samples, including surface roughness, porosity, surface pH, carbonated condition and type of cement, are employed as input factors for the analysis. This study relies on a dataset that records 539 laboratory experiments to establish a hybrid model of the LS-SVR and the FPA. The cross-validation process reveals that the proposed method can successfully capture the functional relationship between the algal colonization growth and its influencing factors with a satisfactory outcome (the coefficient of determination R² = 0.94 and the root mean square error RMSE = 4.55). These facts demonstrate that the hybrid model is a promising tool for assisting the decision-making process in building maintenance planning.

  5. Combining Self-organizing Feature Map with Support Vector Regression Based on Expert System

    Institute of Scientific and Technical Information of China (English)

    王玲; 穆志纯; 郭辉

    2005-01-01

    A new approach is proposed to model nonlinear dynamic systems by combining a self-organizing feature map (SOM) with support vector regression (SVR) based on an expert system. The whole system has a two-stage neural network architecture. In the first stage, SOM is used as a clustering algorithm to partition the whole input space into several disjoint regions. A hierarchical architecture is adopted in the partition to avoid the problem of predetermining the number of partitioned regions. Then, in the second stage, multiple SVRs, also called SVR experts, are built to best fit each partitioned region by combining different SVR kernel functions, which also eases the configuration and tuning of SVR. Finally, the new approach is applied to time-series prediction problems based on the Mackey-Glass differential equation and Santa Fe data; the results show that the SVR experts effectively improve generalization performance in comparison with the single SVR model.

  6. A case study using support vector machines, neural networks and logistic regression in a GIS to identify wells contaminated with nitrate-N

    Science.gov (United States)

    Dixon, Barnali

    2009-09-01

    Accurate and inexpensive identification of potentially contaminated wells is critical for water resources protection and management. The objectives of this study are to 1) assess the suitability of approximation tools such as neural networks (NN) and support vector machines (SVM) integrated in a geographic information system (GIS) for identifying contaminated wells and 2) use logistic regression and feature selection methods to identify significant variables for transporting contaminants in and through the soil profile to the groundwater. Fourteen GIS derived soil hydrogeologic and landuse parameters were used as initial inputs in this study. Well water quality data (nitrate-N) from 6,917 wells provided by Florida Department of Environmental Protection (USA) were used as an output target class. The use of the logistic regression and feature selection methods reduced the number of input variables to nine. Receiver operating characteristics (ROC) curves were used for evaluation of these approximation tools. Results showed superior performance with the NN as compared to SVM especially on training data while testing results were comparable. Feature selection did not improve accuracy; however, it helped increase the sensitivity or true positive rate (TPR). Thus, a higher TPR was obtainable with fewer variables.

  7. Forecasting monthly groundwater level fluctuations in coastal aquifers using hybrid Wavelet packet–Support vector regression

    Directory of Open Access Journals (Sweden)

    N. Sujay Raghavendra

    2015-12-01

    Full Text Available This research demonstrates the state-of-the-art capability of Wavelet packet analysis in improving the forecasting efficiency of Support vector regression (SVR) through the development of a novel hybrid Wavelet packet–Support vector regression (WP–SVR) model for forecasting monthly groundwater level fluctuations observed in three shallow unconfined coastal aquifers. The Sequential Minimal Optimization Algorithm-based SVR model is also employed for a comparative study with the WP–SVR model. The input variables used for modeling were monthly time series of total rainfall, average temperature, mean tide level, and past groundwater level observations recorded during the period 1996–2006 at three observation wells located near Mangalore, India. The Radial Basis function is employed as the kernel function during SVR modeling. Model parameters are calibrated using the first seven years of data, and the remaining three years of data are used for model validation using various input combinations. The performance of both the SVR and WP–SVR models is assessed using different statistical indices. From the comparative analysis of the developed models, it can be seen that the WP–SVR model outperforms the classic SVR model in predicting groundwater levels at all three well locations (e.g. NRMSE(WP–SVR) = 7.14, NRMSE(SVR) = 12.27; NSE(WP–SVR) = 0.91, NSE(SVR) = 0.8 during the test phase with respect to the well location at Surathkal). Therefore, the WP–SVR model is highly acceptable for modeling and forecasting groundwater level fluctuations.
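
    The two indices quoted in this abstract, NSE (Nash-Sutcliffe efficiency) and NRMSE, can be computed as in the following minimal sketch. The abstract does not say how NRMSE was normalized; dividing the RMSE by the observed range and reporting a percentage is an assumption made here, and the function names are illustrative only.

```python
import math


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def nrmse_percent(observed, predicted):
    """RMSE divided by the observed range, in percent (normalization is assumed)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (max(observed) - min(observed))
```

    A model that always predicts the observed mean scores NSE = 0, so the reported values of 0.91 and 0.8 both indicate forecasts well above that baseline.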

  8. Target Localization in Wireless Sensor Networks Using Online Semi-Supervised Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Jaehyun Yoo

    2015-05-01

    Full Text Available Machine learning has been successfully used for target localization in wireless sensor networks (WSNs) due to its accurate and robust estimation against highly nonlinear and noisy sensor measurement. For efficient and adaptive learning, this paper introduces online semi-supervised support vector regression (OSS-SVR). The first advantage of the proposed algorithm is that, based on a semi-supervised learning framework, it can reduce the amount of labeled training data required while maintaining accurate estimation. Second, with an extension to online learning, the proposed OSS-SVR automatically tracks changes of the system to be learned, such as varied noise characteristics. We compare the proposed algorithm with semi-supervised manifold learning, an online Gaussian process and online semi-supervised colocalization. The algorithms are evaluated for estimating the unknown location of a mobile robot in a WSN. The experimental results show that the proposed algorithm is more accurate with a smaller amount of labeled training data and is robust to varying noise. Moreover, the suggested algorithm is computationally fast while maintaining the best localization performance in comparison with the other methods.

  10. High Resolution Mapping of Soil Properties Using Remote Sensing Variables in South-Western Burkina Faso: A Comparison of Machine Learning and Multiple Linear Regression Models.

    Science.gov (United States)

    Forkuor, Gerald; Hounkpatin, Ozias K L; Welp, Gerhard; Thiel, Michael

    2017-01-01

    Accurate and detailed spatial soil information is essential for environmental modelling, risk assessment and decision making. The use of Remote Sensing data as secondary sources of information in digital soil mapping has been found to be cost effective and less time consuming compared to traditional soil mapping approaches. But the potentials of Remote Sensing data in improving knowledge of local scale soil information in West Africa have not been fully explored. This study investigated the use of high spatial resolution satellite data (RapidEye and Landsat), terrain/climatic data and laboratory analysed soil samples to map the spatial distribution of six soil properties, namely sand, silt, clay, cation exchange capacity (CEC), soil organic carbon (SOC) and nitrogen, in a 580 km² agricultural watershed in south-western Burkina Faso. Four statistical prediction models, namely multiple linear regression (MLR), random forest regression (RFR), support vector machine (SVM) and stochastic gradient boosting (SGB), were tested and compared. Internal validation was conducted by cross validation while the predictions were validated against an independent set of soil samples considering the modelling area and an extrapolation area. Model performance statistics revealed that the machine learning techniques performed marginally better than the MLR, with the RFR providing in most cases the highest accuracy. The inability of MLR to handle non-linear relationships between dependent and independent variables was found to be a limitation in accurately predicting soil properties at unsampled locations. Satellite data acquired during ploughing or early crop development stages (e.g. May, June) were found to be the most important spectral predictors while elevation, temperature and precipitation came up as prominent terrain/climatic variables in predicting soil properties. The results further showed that shortwave infrared and near infrared channels of Landsat8 as well as soil specific indices of redness

  11. A Combination of Geographically Weighted Regression, Particle Swarm Optimization and Support Vector Machine for Landslide Susceptibility Mapping: A Case Study at Wanzhou in the Three Gorges Area, China.

    Science.gov (United States)

    Yu, Xianyu; Wang, Yi; Niu, Ruiqing; Hu, Youjian

    2016-05-11

    In this study, a novel coupling model for landslide susceptibility mapping is presented. In practice, environmental factors may have different impacts at a local scale in study areas. To provide better predictions, a geographically weighted regression (GWR) technique is firstly used in our method to segment study areas into a series of prediction regions with appropriate sizes. Meanwhile, a support vector machine (SVM) classifier is exploited in each prediction region for landslide susceptibility mapping. To further improve the prediction performance, the particle swarm optimization (PSO) algorithm is used in the prediction regions to obtain optimal parameters for the SVM classifier. To evaluate the prediction performance of our model, several SVM-based prediction models are utilized for comparison on a study area of the Wanzhou district in the Three Gorges Reservoir. Experimental results, based on three objective quantitative measures and visual qualitative evaluation, indicate that our model can achieve better prediction accuracies and is more effective for landslide susceptibility mapping. For instance, our model can achieve an overall prediction accuracy of 91.10%, which is 7.8%-19.1% higher than the traditional SVM-based models. In addition, the obtained landslide susceptibility map by our model can demonstrate an intensive correlation between the classified very high-susceptibility zone and the previously investigated landslides.

  12. Ranking chemical structures for drug discovery: a new machine learning approach.

    Science.gov (United States)

    Agarwal, Shivani; Dugar, Deepak; Sengupta, Shiladitya

    2010-05-24

    With chemical libraries increasingly containing millions of compounds or more, there is a fast-growing need for computational methods that can rank or prioritize compounds for screening. Machine learning methods have shown considerable promise for this task; indeed, classification methods such as support vector machines (SVMs), together with their variants, have been used in virtual screening to distinguish active compounds from inactive ones, while regression methods such as partial least-squares (PLS) and support vector regression (SVR) have been used in quantitative structure-activity relationship (QSAR) analysis for predicting biological activities of compounds. Recently, a new class of machine learning methods - namely, ranking methods, which are designed to directly optimize ranking performance - have been developed for ranking tasks such as web search that arise in information retrieval (IR) and other applications. Here we report the application of these new ranking methods in machine learning to the task of ranking chemical structures. Our experiments show that the new ranking methods give better ranking performance than both classification based methods in virtual screening and regression methods in QSAR analysis. We also make some interesting connections between ranking performance measures used in cheminformatics and those used in IR studies.

  13. Face hallucination based on SVR

    Institute of Scientific and Technical Information of China (English)

    王宇; 吴炜; 严斌宇; 张莹莹

    2013-01-01

    Most learning-based super-resolution algorithms suffer from "quantization" error caused by their use of classification algorithms. In this paper, a learning-based face image super-resolution algorithm based on SVR is proposed. The algorithm first extracts the high-frequency information of high-resolution image patches and the middle-frequency information of low-resolution image patches in the training set (difference-of-Gaussian, DoG, features). Then, according to their relationship, and taking the particularities of faces into account, a regression model is built using SVR. During recovery, the middle-frequency information of the low-resolution image is extracted and fed into the built model to obtain the required high-frequency information. The experimental results show that our method achieves very good results on the IMDB face database (mainly Asians) and the Yale face database (mainly Europeans and Americans). Overall, the results of our method have better visual effects and a higher peak signal-to-noise ratio.

  14. Support vector machine regression (LS-SVM)--an alternative to artificial neural networks (ANNs) for the analysis of quantum chemistry data?

    Science.gov (United States)

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-06-28

    A multilayer feed-forward artificial neural network (MLP-ANN) with a single, hidden layer that contains a finite number of neurons can be regarded as a universal non-linear approximator. Today, the ANN method and linear regression (MLR) model are widely used for quantum chemistry (QC) data analysis (e.g., thermochemistry) to improve their accuracy (e.g., Gaussian G2-G4, B3LYP/B3-LYP, X1, or W1 theoretical methods). In this study, an alternative approach based on support vector machines (SVMs) is used, the least squares support vector machine (LS-SVM) regression. It has been applied to ab initio (first principle) and density functional theory (DFT) quantum chemistry data. So, the QC + SVM methodology is an alternative to the QC + ANN one. The task of the study was to estimate the Møller-Plesset (MPn) or DFT (B3LYP, BLYP, BMK) energies calculated with large basis sets (e.g., 6-311G(3df,3pd)) using smaller ones (6-311G, 6-311G*, 6-311G**) plus molecular descriptors. A molecular set (BRM-208) containing a total of 208 organic molecules was constructed and used for the LS-SVM training, cross-validation, and testing. MP2, MP3, MP4(DQ), MP4(SDQ), and MP4/MP4(SDTQ) ab initio methods were tested. Hartree-Fock (HF/SCF) results were also reported for comparison. Furthermore, constitutional (CD: total number of atoms and mole fractions of different atoms) and quantum-chemical (QD: HOMO-LUMO gap, dipole moment, average polarizability, and quadrupole moment) molecular descriptors were used for the building of the LS-SVM calibration model. Prediction accuracies (MADs) of 1.62 ± 0.51 and 0.85 ± 0.24 kcal mol⁻¹ (1 kcal mol⁻¹ = 4.184 kJ mol⁻¹) were reached for SVM-based approximations of ab initio and DFT energies, respectively. The LS-SVM model was more accurate than the MLR model. A comparison with the artificial neural network approach shows that the accuracy of the LS-SVM method is similar to the accuracy of ANN. The extrapolation and interpolation results show that LS-SVM is

  15. Measurement of food colour in L*a*b* units from RGB digital image using least squares support vector machine regression

    Directory of Open Access Journals (Sweden)

    Roberto Romaniello

    2015-12-01

    Full Text Available The aim of this work is to evaluate the potential of least squares support vector machine (LS-SVM) regression to develop an efficient method to measure the colour of food materials in L*a*b* units by means of a computer vision system (CVS). A laboratory CVS, based on a colour digital camera (CDC), was implemented and three LS-SVM models were trained and validated, one for each output variable (L*, a*, and b*) required by this problem, using the RGB signals generated by the CDC as input variables to these models. The colour target-based approach was used for camera characterization, and a standard reference target of 242 colour samples was acquired using the CVS and a colorimeter. This data set was split into two sets of equal size, for training and validating the LS-SVM models. An effective two-stage grid search on the parameter space was performed in MATLAB to tune the regularization parameters γ and the kernel parameters σ² of the three LS-SVM models. A 3-8-3 multilayer feed-forward neural network (MFNN), following the research conducted by León et al. (2006), was also trained in order to compare its performance with that of the LS-SVM models. The LS-SVM models developed in this research showed better generalization capability than the MFNN and yielded high correlations between the L*a*b* data acquired using the colorimeter and the corresponding data obtained by transformation of the RGB data acquired by the CVS. In particular, for the validation set, R² values equal to 0.9989, 0.9987, and 0.9994 for the L*, a* and b* parameters were obtained. The root mean square error values were 0.6443, 0.3226, and 0.2702 for L*, a*, and b* respectively, and the average colour difference ΔEab was 0.8232±0.5033 units. Thus, LS-SVM regression seems to be a useful tool for measuring food colour using a low-cost CVS.
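
    The colour difference reported above can be illustrated with a short sketch. The abstract does not state which ΔE formula was used; the Euclidean CIE76 definition in L*a*b* space is assumed here, and the function names are hypothetical.

```python
import math


def delta_e_ab(lab_ref, lab_meas):
    """CIE76 colour difference between two (L*, a*, b*) triples."""
    return math.sqrt(sum((r - m) ** 2 for r, m in zip(lab_ref, lab_meas)))


def mean_delta_e(reference, measured):
    """Average colour difference over paired lists of (L*, a*, b*) triples."""
    diffs = [delta_e_ab(r, m) for r, m in zip(reference, measured)]
    return sum(diffs) / len(diffs)
```

    In a setting like the one described, `reference` would hold the colorimeter readings and `measured` the values predicted from the camera RGB signals.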

  16. Application of Multi-task Sparse Group Lasso Feature Extraction and Support Vector Machine Regression in the Stellar Atmospheric Parametrization

    Institute of Scientific and Technical Information of China (English)

    高伟; 李乡儒

    2016-01-01

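    ([Fe/H]). The same feature extraction and stellar atmospheric parameter estimation tests were also carried out on theoretical spectra computed from Kurucz's New Opacity Distribution Function (NEWODF) models; the corresponding mean absolute errors were 0.0006 (lg (Teff/K)), 0.0098 (lg (g/(cm · s⁻²))), and 0.0082 dex ([Fe/H]). Comparison with similar studies in the literature shows that the scheme combining multi-task Sparse Group Lasso feature extraction with support vector machine regression (SVR) achieves high accuracy in estimating stellar atmospheric parameters.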

  17. Quantitative analysis of multi-component complex oil spills based on the least-squares support vector regression

    Science.gov (United States)

    Tan, Ailing; Zhao, Yong; Wang, Siyuan

    2016-10-01

    Quantitative analysis of simulated complex oil spills was studied using the PSO-LS-SVR method. Forty simulated mixed oil spill samples were made with different concentration proportions of gasoline, diesel and kerosene, and their near infrared spectra were collected. The parameters of the least squares support vector machine were optimized by the particle swarm optimization algorithm. The optimal quantitative concentration models of the three-component oil spills were established. The best regularization parameter C and kernel parameter σ of the gasoline, diesel and kerosene models were 48.1418 and 0.1067, 53.2820 and 0.1095, and 59.1689 and 0.1000 respectively. The coefficients of determination R² of the prediction models were 0.9983, 0.9907 and 0.9942 respectively. RMSEP values were 0.0753, 0.1539 and 0.0789 respectively. For the gasoline, diesel and kerosene models, the mean and variance of the absolute prediction error were -0.0176±0.0636 μL/mL, -0.0084±0.1941 μL/mL, and 0.00338±0.0726 μL/mL respectively. The results showed that the concentration of each component of the oil spill samples could be detected by NIR technology combined with the PSO-LS-SVR regression method and that the predictions were accurate and reliable; thus, this method can provide an effective means for the quantitative detection and analysis of complex marine oil spills.

  18. Predicting Chinese Abbreviations from Definitions: An Empirical Learning Approach Using Support Vector Regression

    Institute of Scientific and Technical Information of China (English)

    Xu Sun; Hou-Feng Wang; Bo Wang

    2008-01-01

    In Chinese, phrases and named entities play a central role in information retrieval. Abbreviations, however, make keyword-based approaches less effective. This paper presents an empirical learning approach to Chinese abbreviation prediction. In this study, each abbreviation is taken as a reduced form of the corresponding definition (expanded form), and the abbreviation prediction is formalized as a scoring and ranking problem among abbreviation candidates, which are automatically generated from the corresponding definition. By employing Support Vector Regression (SVR) for scoring, we can obtain multiple abbreviation candidates together with their SVR values, which are used for candidate ranking. Experimental results show that the SVR method performs better than the popular heuristic rule of abbreviation prediction. In addition, in abbreviation prediction, the SVR method outperforms the hidden Markov model (HMM).

  19. Detection of Buried Objects by Means of a SAP Technique: Comparing MUSIC- and SVR-Based Approaches

    Science.gov (United States)

    Meschino, S.; Pajewski, L.; Pastorino, M.; Randazzo, A.; Schettini, G.

    2012-04-01

    This work is focused on the application of a Sub-Array Processing (SAP) technique to the detection of metallic cylindrical objects embedded in a dielectric half-space. The identification of buried cables, pipes, conduits, and other cylindrical utilities, is an important problem that has been extensively studied in the last years. Most commonly used approaches are based on the use of electromagnetic sensing: a set of antennas illuminates the ground and the collected echo is analyzed in order to extract information about the scenario and to localize the sought objects [1]. In a SAP approach, algorithms for the estimation of Directions of Arrival (DOAs) are employed [2]: they assume that the sources (in this paper, currents induced on buried targets) are in the far-field region of the receiving array, so that the received wavefront can be considered as planar, and the main angular direction of the field can be estimated. However, in electromagnetic sensing of buried objects, the scatterers are usually quite near to the antennas. Nevertheless, by dividing the whole receiving array in a suitable number of sub-arrays, and by finding a dominant DOA for each one, it is possible to localize objects that are in the far-field of the sub-array, although being in the near-field of the array. The DOAs found by the sub-arrays can be triangulated, obtaining a set of crossings with intersections condensed around object locations. In this work, the performances of two different DOA algorithms are compared. In particular, a MUltiple SIgnal Classification (MUSIC)-type method [3] and Support Vector Regression (SVR) based approach [4] are employed. The results of a Cylindrical-Wave Approach forward solver are used as input data of the detection procedure [5]. To process the crossing pattern, the region of interest is divided in small windows, and a Poisson model is adopted for the statistical distribution of intersections in the windows. 
Hypothesis testing procedures are used (imposing
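
    The triangulation step described in this abstract, in which the DOAs found by two sub-arrays are crossed to locate a target, can be sketched in two dimensions as follows. The coordinate convention (sub-array centres as points, angles in radians measured from the x axis) and the function name are assumptions for illustration; the abstract does not specify them.

```python
import math


def doa_crossing(p1, theta1, p2, theta2, tol=1e-12):
    """Intersect two bearing lines; return None when they are parallel.

    p1, p2 are (x, y) sub-array centres; theta1, theta2 are the estimated
    directions of arrival in radians, measured from the x axis.
    """
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    denom = d1[0] * d2[1] - d1[1] * d2[0]  # 2-D cross product d1 x d2
    if abs(denom) < tol:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Solve p1 + t1*d1 = p2 + t2*d2 for t1 by crossing both sides with d2.
    t1 = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])
```

    Note that this treats each bearing as an infinite line, so a crossing behind a sub-array is also returned. Collecting the crossings of all sub-array pairs gives the pattern of intersections that the abstract then analyses window by window.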

  20. Laplacian embedded regression for scalable manifold regularization.

    Science.gov (United States)

    Chen, Lin; Tsang, Ivor W; Xu, Dong

    2012-06-01

    Semi-supervised learning (SSL), as a powerful tool to learn from a limited number of labeled data and a large number of unlabeled data, has been attracting increasing attention in the machine learning community. In particular, the manifold regularization framework has laid solid theoretical foundations for a large family of SSL algorithms, such as Laplacian support vector machine (LapSVM) and Laplacian regularized least squares (LapRLS). However, most of these algorithms are limited to small scale problems due to the high computational cost of the matrix inversion operation involved in the optimization problem. In this paper, we propose a novel framework called Laplacian embedded regression by introducing an intermediate decision variable into the manifold regularization framework. By using ε-insensitive loss, we obtain the Laplacian embedded support vector regression (LapESVR) algorithm, which inherits the sparse solution from SVR. Also, we derive Laplacian embedded RLS (LapERLS) corresponding to RLS under the proposed framework. Both LapESVR and LapERLS possess a simpler form of a transformed kernel, which is the summation of the original kernel and a graph kernel that captures the manifold structure. The benefits of the transformed kernel are two-fold: (1) we can deal with the original kernel matrix and the graph Laplacian matrix in the graph kernel separately and (2) if the graph Laplacian matrix is sparse, we only need to perform the inverse operation for a sparse matrix, which is much more efficient when compared with that for a dense one. Inspired by kernel principal component analysis, we further propose to project the introduced decision variable into a subspace spanned by a few eigenvectors of the graph Laplacian matrix in order to better reflect the data manifold, as well as accelerate the calculation of the graph kernel, allowing our methods to efficiently and effectively cope with large scale SSL problems. Extensive experiments on both toy and real
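
    The difference between the two losses mentioned in this abstract can be seen in a few lines. This is a generic sketch of the ε-insensitive loss used by SVR and the squared loss used by RLS, not code from the paper; the default `eps` value is arbitrary.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """SVR loss: residuals inside the eps-tube cost nothing."""
    return max(0.0, abs(y_true - y_pred) - eps)


def squared_loss(y_true, y_pred):
    """RLS loss: every non-zero residual is penalized."""
    return (y_true - y_pred) ** 2
```

    Points whose residual falls inside the tube contribute zero loss and do not become support vectors, which is the source of the sparse solution that LapESVR is said to inherit from SVR; under the squared loss every training point generally contributes.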

  1. Monthly prediction of air temperature in Australia and New Zealand with machine learning algorithms

    Science.gov (United States)

    Salcedo-Sanz, S.; Deo, R. C.; Carro-Calvo, L.; Saavedra-Moreno, B.

    2016-07-01

    Long-term air temperature prediction is of major importance in a large number of applications, including climate-related studies, energy, agricultural, or medical. This paper examines the performance of two Machine Learning algorithms (Support Vector Regression (SVR) and Multi-layer Perceptron (MLP)) in a problem of monthly mean air temperature prediction, from the previous measured values in observational stations of Australia and New Zealand, and climate indices of importance in the region. The performance of the two considered algorithms is discussed in the paper and compared to alternative approaches. The results indicate that the SVR algorithm is able to obtain the best prediction performance among all the algorithms compared in the paper. Moreover, the results obtained have shown that the mean absolute error made by the two algorithms considered is significantly larger for the last 20 years than in the previous decades, in what can be interpreted as a change in the relationship among the prediction variables involved in the training of the algorithms.

  2. Soft-sensor modeling of silicon content in hot metal based on sparse robust LS-SVR and multi-objective optimization

    Institute of Scientific and Technical Information of China (English)

    郭东伟; 周平

    2016-01-01

    To solve the problem that the silicon content ([Si]) of hot metal is difficult to measure directly and that manual analysis involves a large time delay, a sparse and robust least squares support vector regression (R-S-LS-SVR) method is proposed to establish a dynamic model of [Si], with the model parameters tuned by multi-objective genetic optimization. First, because the Lagrange multipliers of the standard least squares support vector regression (LS-SVR) are directly proportional to the error terms, so that the solution lacks sparsity, the maximal independent set of the sample data in the feature-space mapping set is extracted to sparsify the training sample set and reduce the computational complexity of modeling. Next, in view of the problem that the standard least squares support vector machine has no regularization term, the IGGIII weighting function is introduced into the resulting sparse least squares support vector regression (S-LS-SVR) model to improve modeling robustness. Last, a multi-objective evaluation index that combines the modeling residual and the estimated trend is presented to compensate for the deficiency of the single root mean square error (RMSE) index. On this basis, an on-line soft sensor model of hot metal [Si] with optimal parameters is obtained by using the multi-objective genetic algorithm with non-dominated sorting and an elitist strategy (NSGA-II). Industrial verification and analysis show the effectiveness and superiority of the proposed method.

  3. Regression Methods for Virtual Metrology of Layer Thickness in Chemical Vapor Deposition

    DEFF Research Database (Denmark)

    Purwins, Hendrik; Barak, Bernd; Nagi, Ahmed

    2014-01-01

    The quality of wafer production in semiconductor manufacturing cannot always be monitored by a costly physical measurement. Instead of measuring a quantity directly, it can be predicted by a regression method (Virtual Metrology). In this paper, a survey on regression methods is given to predict ... predictive variable alone, the 3 most predictive variables, an expert selection, and full set. The following regression methods are compared: Simple Linear Regression, Multiple Linear Regression, Partial Least Square Regression, and Ridge Linear Regression utilizing the Partial Least Square Estimate ... algorithm, and Support Vector Regression (SVR). On a test set, SVR outperforms the other methods by a large margin, being more robust towards changes in the production conditions. The method performs better on high-dimensional multivariate input data than on the most predictive variables alone. Process ...

  4. PREDICTORS OF SUSTAINED VIROLOGICAL RESPONSE (SVR) TO PEGYLATED INTERFERON ALPHA (PEG-IFN α) AND RIBAVIRIN (RBV) IN PATIENTS WITH CHRONIC HEPATITIS C INFECTED WITH GENOTYPE 1.

    Directory of Open Access Journals (Sweden)

    Krasimir Antonov

    2011-12-01

    Objective: Combined PEG-IFN alpha and RBV therapy achieves SVR in 40-50% of patients infected with HCV genotype 1. Identification of virological and host parameters predicting SVR would be useful for tailoring therapy. Methods: 71 patients with chronic HCV genotype 1 infection were treated with PEG-IFN alpha2a and RBV for 12 months. Predictors of SVR were analyzed using a nonparametric correlation test. Results: SVR was achieved in 57/71 subjects (80.3%). No significant differences in baseline HCV RNA level, sex, age, baseline ALT or presence of liver cirrhosis were found between patients with and without SVR, and no correlation between SVR and any of these factors was demonstrated when they were analyzed separately. A high correlation was found between serum HCV RNA levels at the end of the third month of therapy (Early Virological Response) and SVR (r = 0.759; p = 0.011). Conclusion: The viral response during the first 3 months of PEG-IFN alpha and RBV therapy is the strongest independent predictor of SVR among all baseline viral and host predictive factors.

  5. Stream-flow forecasting using extreme learning machines: A case study in a semi-arid region in Iraq

    Science.gov (United States)

    Yaseen, Zaher Mundher; Jaafar, Othman; Deo, Ravinesh C.; Kisi, Ozgur; Adamowski, Jan; Quilty, John; El-Shafie, Ahmed

    2016-11-01

    Monthly stream-flow forecasting can yield important information for hydrological applications including sustainable design of rural and urban water management systems, optimization of water resource allocations, water use, pricing and water quality assessment, and agriculture and irrigation operations. The exploration and development of expert predictive models is an ongoing endeavor for hydrological applications. In this study, the potential of a relatively new data-driven method, namely the extreme learning machine (ELM) method, was explored for forecasting monthly stream-flow discharge rates in the Tigris River, Iraq. The ELM algorithm trains a single-layer feedforward neural network (SLFN) by randomly selecting the input weights and hidden-layer biases and analytically determining the output weights. Based on the partial autocorrelation functions of historical stream-flow data, a set of five input combinations with lagged stream-flow values is employed to establish the best forecasting model. A comparative investigation is conducted to evaluate the performance of the ELM against other data-driven models: support vector regression (SVR) and generalized regression neural network (GRNN). The forecasting metrics, defined as the correlation coefficient (r), Nash-Sutcliffe efficiency (ENS), Willmott's Index (WI), root-mean-square error (RMSE) and mean absolute error (MAE) computed between the observed and forecasted stream-flow data, are employed to assess the ELM model's effectiveness. The results revealed that the ELM model outperformed the SVR and the GRNN models across a number of statistical measures. In quantitative terms, the superiority of ELM over the SVR and GRNN models was shown by ENS = 0.578, 0.378 and 0.144, r = 0.799, 0.761 and 0.468, and WI = 0.853, 0.802 and 0.689 for ELM, SVR and GRNN, respectively, and the ELM model attained a lower RMSE by approximately 21.3% (relative to SVR) and by approximately 44.7% (relative to GRNN). Based on the findings of this
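
    The five forecasting metrics named in this abstract can be sketched as follows. This is a minimal illustration using the common textbook definitions, which may differ in detail from those used in the paper; the function name `forecast_metrics` and the sample values are hypothetical.

```python
import math

def forecast_metrics(observed, forecast):
    """Return r, Nash-Sutcliffe efficiency, Willmott's index, RMSE and MAE."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_f = sum(forecast) / n
    errors = [f - o for o, f in zip(observed, forecast)]
    sq_err = sum(e * e for e in errors)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, forecast))
    # Willmott's index divides the squared error by the "potential error".
    potential = sum((abs(f - mean_o) + abs(o - mean_o)) ** 2
                    for o, f in zip(observed, forecast))
    return {
        "r": cov / math.sqrt(var_o * var_f),
        "ENS": 1.0 - sq_err / var_o,
        "WI": 1.0 - sq_err / potential,
        "RMSE": math.sqrt(sq_err / n),
        "MAE": sum(abs(e) for e in errors) / n,
    }

# Hypothetical observed and forecasted flows, for illustration only.
obs = [10.0, 12.0, 15.0, 11.0, 9.0]
sim = [11.0, 12.0, 14.0, 10.0, 9.0]
m = forecast_metrics(obs, sim)
```

    Under these definitions, ENS and WI equal 1 for a perfect forecast, and ENS drops below 0 when the forecast is worse than simply predicting the observed mean.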

  6. Short-Term Distribution System State Forecast Based on Optimal Synchrophasor Sensor Placement and Extreme Learning Machine

    Energy Technology Data Exchange (ETDEWEB)

    Jiang, Huaiguang; Zhang, Yingchen

    2016-11-14

    This paper proposes an approach for distribution system state forecasting, which aims to provide accurate and high-speed state forecasting with an optimal synchrophasor sensor placement (OSSP) based state estimator and an extreme learning machine (ELM) based forecaster. Specifically, considering the sensor installation cost and measurement error, an OSSP algorithm is proposed to reduce the number of synchrophasor sensors while keeping the whole distribution system numerically and topologically observable. Then, the weighted least square (WLS) based system state estimator is used to produce the training data for the proposed forecaster. Traditionally, the artificial neural network (ANN) and support vector regression (SVR) are widely used in forecasting due to their nonlinear modeling capabilities. However, the ANN carries a heavy computational load and the best parameters for SVR are difficult to obtain. In this paper, the ELM, which overcomes these drawbacks, is used to forecast the future system states from the historical system states. Testing results show that the proposed approach is effective and accurate.
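
    As a rough illustration of why an ELM is cheap to train, the sketch below fixes random hidden-layer weights and obtains the output weights from one regularised least-squares solve. It is a generic toy in pure standard-library Python, not the forecaster from the paper; the class name `TinyELM`, the tanh activation, the ridge term and the sine test series are all assumptions made for the example.

```python
import math
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

class TinyELM:
    def __init__(self, n_inputs, n_hidden, ridge=1e-6, seed=0):
        rng = random.Random(seed)
        # Input weights and hidden biases are drawn at random and never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.ridge = ridge
        self.beta = [0.0] * n_hidden

    def _hidden(self, x):
        return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, ys):
        h = [self._hidden(x) for x in xs]
        k = len(self.b)
        # Output weights solve (H^T H + ridge * I) beta = H^T y.
        gram = [[sum(row[i] * row[j] for row in h) + (self.ridge if i == j else 0.0)
                 for j in range(k)] for i in range(k)]
        rhs = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(k)]
        self.beta = solve_linear(gram, rhs)
        return self

    def predict(self, x):
        return sum(bi * hi for bi, hi in zip(self.beta, self._hidden(x)))

# Toy one-step-ahead forecast of a sine series from its two previous values.
series = [math.sin(0.3 * t) for t in range(120)]
xs = [[series[t - 2], series[t - 1]] for t in range(2, 100)]
ys = [series[t] for t in range(2, 100)]
model = TinyELM(n_inputs=2, n_hidden=10).fit(xs, ys)
test_err = max(abs(model.predict([series[t - 2], series[t - 1]]) - series[t])
               for t in range(100, 120))
```

    Because only the output weights are fitted, training reduces to a single linear solve, which is the property the abstract contrasts with iterative ANN training and SVR parameter search.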

  7. Noise model based ν-support vector regression with its application to short-term wind speed forecasting.

    Science.gov (United States)

    Hu, Qinghua; Zhang, Shiguang; Xie, Zongxia; Mi, Jusheng; Wan, Jie

    2014-09-01

    Support vector regression (SVR) techniques are aimed at discovering a linear or nonlinear structure hidden in sample data. Most existing regression techniques assume that the error distribution is Gaussian. However, it has been observed that the noise in some real-world applications, such as wind power forecasting and direction-of-arrival estimation, does not follow a Gaussian distribution but rather a beta distribution, a Laplacian distribution, or other models. In these cases the current regression techniques are not optimal. Following the Bayesian approach, we derive a general loss function and develop a uniform model of ν-support vector regression for the general noise model (N-SVR). The Augmented Lagrange Multiplier method is introduced to solve N-SVR. Numerical experiments on artificial data sets, UCI data and short-term wind speed prediction are conducted. The results show the effectiveness of the proposed technique.
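
    The link between a noise model and a loss function mentioned in this abstract can be illustrated with the standard maximum-likelihood argument. The notation below (residual ξ, density p, scale parameters σ and b) is assumed for the illustration and is not taken from the paper, whose general loss may be stated differently.

```latex
% Assumed notation: residual \xi = y - f(x), noise density p(\xi).
\begin{align}
  c(\xi) &= -\log p(\xi)
    && \text{(loss induced by the noise density)} \\
  p(\xi) \propto e^{-\xi^{2}/(2\sigma^{2})}
    &\;\Rightarrow\; c(\xi) = \frac{\xi^{2}}{2\sigma^{2}} + \text{const}
    && \text{(Gaussian noise: squared loss)} \\
  p(\xi) \propto e^{-|\xi|/b}
    &\;\Rightarrow\; c(\xi) = \frac{|\xi|}{b} + \text{const}
    && \text{(Laplacian noise: absolute loss)}
\end{align}
```

    Under this view, a non-Gaussian noise model such as a beta distribution simply induces a different loss, which is why a squared-error fit is not optimal in those cases.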

  8. Support vector regression model for predicting the sorption capacity of lead (II)

    Directory of Open Access Journals (Sweden)

    Nusrat Parveen

    2016-09-01

    Biosorption is considered an economical process for the treatment of wastewater containing heavy metals such as lead (II). In this research paper, support vector regression (SVR) is used to predict the sorption capacity of lead (II) ions, with the independent input parameters being initial lead ion concentration, pH, temperature and contact time. Tree fern, an agricultural by-product, is employed as a low-cost biosorbent. Multiple linear regression (MLR) and SVR-based models are compared using statistical parameters. The SVR model is found to be more accurate and to generalize better for prediction of the sorption capacity of lead (II) ions.

  9. Quantitative Recognizing Dissolved Hydrocarbons with Genetic Algorithm-Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Qu Zhou

    2013-09-01

    Online monitoring of dissolved fault-characteristic hydrocarbon gases, such as methane, ethane, ethylene and acetylene, in power transformer oil is important for transformer condition assessment. Recently, semiconductor tin oxide based gas sensor arrays have been widely applied in online monitoring apparatus, but cross sensitivity of the gas sensor array is inevitable because the four hydrocarbon gases have the same composition and similar structures. Based on support vector regression (SVR) with a genetic algorithm (GA), a new pattern recognition method is proposed to reduce the cross sensitivity of the gas sensor array and to quantitatively recognize the concentration of dissolved hydrocarbon gases. Experimental data from an online monitoring device in China are used to illustrate the performance of the proposed GA-SVR model. Experimental results indicate that the GA-SVR method effectively decreases the cross sensitivity and that the regressed data are much closer to the real values.

  10. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    Directory of Open Access Journals (Sweden)

    Santana Isabel

    2011-08-01

    Background: Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non-parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results: Press' Q test showed that all classifiers performed better than chance alone (p … Conclusions: When taking into account sensitivity, specificity and overall classification accuracy, Random Forests and Linear Discriminant Analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of dementia predictions from neuropsychological testing.

  11. Use of multivariate linear regression and support vector regression to predict functional outcome after surgery for cervical spondylotic myelopathy.

    Science.gov (United States)

    Hoffman, Haydn; Lee, Sunghoon I; Garst, Jordan H; Lu, Derek S; Li, Charles H; Nagasawa, Daniel T; Ghalehsari, Nima; Jahanforouz, Nima; Razaghy, Mehrdad; Espinal, Marie; Ghavamrezaii, Amir; Paak, Brian H; Wu, Irene; Sarrafzadeh, Majid; Lu, Daniel C

    2015-09-01

    This study introduces the use of multivariate linear regression (MLR) and support vector regression (SVR) models to predict postoperative outcomes in a cohort of patients who underwent surgery for cervical spondylotic myelopathy (CSM). Currently, predicting outcomes after surgery for CSM remains a challenge. We recruited patients who had a diagnosis of CSM and required decompressive surgery with or without fusion. Fine motor function was tested preoperatively and postoperatively with a handgrip-based tracking device that has been previously validated, yielding mean absolute accuracy (MAA) results for two tracking tasks (sinusoidal and step). All patients completed Oswestry disability index (ODI) and modified Japanese Orthopaedic Association questionnaires preoperatively and postoperatively. Preoperative data were utilized in MLR and SVR models to predict postoperative ODI. Predictions were compared to the actual ODI scores with the coefficient of determination (R²) and mean absolute difference (MAD). Twenty patients met the inclusion criteria and completed follow-up at least 3 months after surgery. With the MLR model, a combination of the preoperative ODI score, preoperative MAA (step function), and symptom duration yielded the best prediction of postoperative ODI (R² = 0.452; MAD = 0.0887; p = 1.17 × 10⁻³). With the SVR model, a combination of preoperative ODI score, preoperative MAA (sinusoidal function), and symptom duration yielded the best prediction of postoperative ODI (R² = 0.932; MAD = 0.0283; p = 5.73 × 10⁻¹²). The SVR model was more accurate than the MLR model. The SVR can be used preoperatively in risk/benefit analysis and the decision to operate.

  12. Priori Information Based Support Vector Regression and Its Applications

    Directory of Open Access Journals (Sweden)

    Litao Ma

    2015-01-01

    In order to extract the priori information (PI) provided by real monitored values of peak particle velocity (PPV) and increase the prediction accuracy of PPV, PI-based support vector regression (SVR) is established. Firstly, to extract the PI provided by monitored data mathematically, the probability density of PPV is estimated with ε-SVR. Secondly, in order to make full use of the PI about the fluctuation of PPV between its maximal and minimal values in a certain period of time, the probability density estimated with ε-SVR is incorporated into the training data, which increases the dimensionality of the training data. Thirdly, using the higher-dimensional training data, a method of predicting PPV called PI-ε-SVR is proposed. Finally, with the collected values of PPV induced by underwater blasting at Dajin Island at the Taishan nuclear power station in China, contrastive experiments are carried out to show the effectiveness of the proposed method.

  13. Fault Isolation for Nonlinear Systems Using Flexible Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Yufang Liu

    2014-01-01

    While support vector regression is widely used as both a function approximation tool and a residual generator for nonlinear system fault isolation, a drawback of this method is the freedom in selecting model parameters. Moreover, for samples with discordant distribution complexities, selecting reasonable parameters may even be impossible. To alleviate this problem we introduce the method of flexible support vector regression (F-SVR), which is especially suited for modelling complicated sample distributions, as it is free from parameter selection. Reasonable parameters for F-SVR are automatically generated given a sample distribution. Lastly, we apply this method to the fault isolation of high frequency power supplies, where satisfactory results have been obtained.

  14. Wall parameters estimation based on support vector regression for through wall radar sensing

    Science.gov (United States)

    Chen, Xi; Chen, Weidong

    2015-12-01

    In through wall radar sensing, the wall parameters estimation (WPE) problem has attracted a lot of attention, since the wall parameters, i.e., the permittivity and the thickness, are of crucial importance for locating the targets and producing a well-focused image, but they are usually unknown in practice. To solve this problem, in this paper, support vector regression (SVR), a powerful tool for regression analysis, is introduced, and its performance on WPE when used in the regular way is investigated. Unfortunately, it is shown that the regular use of SVR cannot yield satisfactory estimation results, since the sample data used in SVR, namely the received echoes from the walls, suffer serious interference from the echoes of targets located near the walls. In view of this limitation, a novel SVR-based WPE approach that consists of three stages is proposed. In the first stage, three regression functions are trained by SVR, one of which will output the estimate of the permittivity in the second stage, while the others are designed to output two instrumental variables for estimating the thickness. In the third stage, the estimate of the thickness is obtained by minimizing a predefined cost function that involves the estimated permittivity and the output instrumental variables. The better robustness and higher estimation accuracy of the proposed approach compared to the regular use of SVR are validated by numerical experiments using finite-difference time-domain simulations.

  15. Optimized support vector regression for drilling rate of penetration estimation

    Science.gov (United States)

    Bodaghi, Asadollah; Ansari, Hamid Reza; Gholami, Mahsa

    2015-12-01

    In the petroleum industry, drilling optimization involves the selection of operating conditions for achieving the desired depth with the minimum expenditure while requirements of personal safety, environment protection, adequate information of penetrated formations and productivity are fulfilled. Since drilling optimization is highly dependent on the rate of penetration (ROP), estimation of this parameter is of great importance during well planning. In this research, a novel approach called 'optimized support vector regression' is employed to formulate the relationship between input variables and ROP. The algorithms used for optimizing the support vector regression are the genetic algorithm (GA) and the cuckoo search algorithm (CS). Optimization improved the support vector regression performance by selecting proper values for its parameters. In order to evaluate the ability of the optimization algorithms to enhance SVR performance, their results were compared to the hybrid of pattern search and grid search (HPG), which is conventionally employed for optimizing SVR. The results demonstrated that the CS algorithm improved the prediction accuracy of SVR more than either the GA or HPG. Moreover, the predictive model derived from a back propagation neural network (BPNN), which is the traditional approach for estimating ROP, was selected for comparison with CSSVR. The comparative results revealed the superiority of CSSVR. This study concludes that CSSVR is a viable option for precise estimation of ROP.

  16. Cutting Parameters Multi-object Optimization of Titanium Alloy Milling Process Based on Support Vector Regression and NSGA-II

    Institute of Scientific and Technical Information of China (English)

    向国齐

    2016-01-01

    In this paper, the milling of titanium alloy Ti6Al4V is analyzed by the finite element method and, combined with design of experiments, a milling force prediction model is established based on support vector regression (SVR). Taking material removal rate and tool life as the optimization objectives, an optimization method based on SVR and the elitist non-dominated sorting genetic algorithm (NSGA-II) is proposed for the cutting parameters of the titanium alloy milling process. The results show that this method performs well in finding satisfactory Pareto solutions, and thus can be used for machining process parameter optimization and in other material processing fields.
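
    The notion of a Pareto solution set for two objectives such as material removal rate and tool life can be sketched as follows. This shows only a non-dominated filter, the basic building block of NSGA-II's sorting step, and not the full algorithm; the function names and the candidate values are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one (maximisation)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points not dominated by any other point, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (material removal rate, tool life) pairs, both to be maximised.
candidates = [(10.0, 50.0), (12.0, 40.0), (8.0, 45.0), (12.0, 35.0), (15.0, 20.0)]
front = pareto_front(candidates)
```

    Each point kept in `front` trades one objective against the other, so no single candidate is best in both; choosing among them is left to the process engineer.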

  17. Deriving statistical significance maps for support vector regression using medical imaging data.

    Science.gov (United States)

    Gaonkar, Bilwaj; Sotiras, Aristeidis; Davatzikos, Christos

    2013-01-01

    Regression analysis involves predicting a continuous variable using imaging data. The Support Vector Regression (SVR) algorithm has previously been used in addressing regression analysis in neuroimaging. However, identifying the regions of the image that the SVR uses to model the dependence of a target variable remains an open problem. It is an important issue when one wants to biologically interpret the meaning of a pattern that predicts the variable(s) of interest, and therefore to understand normal or pathological process. One possible approach to the identification of these regions is the use of permutation testing. Permutation testing involves 1) generation of a large set of 'null SVR models' using randomly permuted sets of target variables, and 2) comparison of the SVR model trained using the original labels to the set of null models. These permutation tests often require prohibitively long computational time. Recent work in support vector classification shows that it is possible to analytically approximate the results of permutation testing in medical image analysis. We propose an analogous approach to approximate permutation testing based analysis for support vector regression with medical imaging data. In this paper we present 1) the theory behind our approximation, and 2) experimental results using two real datasets.
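
    The two-step permutation procedure described in this abstract can be sketched generically as follows. For brevity the sketch swaps the SVR model for a simple statistic, the absolute Pearson correlation between one feature and the target, so it illustrates the testing logic only and not the analytic approximation the authors propose; all names and data are hypothetical.

```python
import random

def abs_correlation(xs, ys):
    """Absolute Pearson correlation, standing in for an SVR-derived statistic."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / (vx * vy) ** 0.5

def permutation_p_value(xs, ys, statistic, n_permutations=999, seed=0):
    rng = random.Random(seed)
    observed = statistic(xs, ys)
    shuffled = list(ys)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # 1) null model: permuted targets break the link
        if statistic(xs, shuffled) >= observed:
            exceed += 1        # 2) compare the null statistic with the original
    # The +1 terms count the observed labelling as one of the permutations.
    return (exceed + 1) / (n_permutations + 1)

# Hypothetical feature that is strongly related to the target.
feature = [float(i) for i in range(20)]
target = [2.0 * x + ((-1) ** i) for i, x in enumerate(feature)]
p = permutation_p_value(feature, target, abs_correlation)
```

    With an actual SVR, each of the permutations would require retraining the model, which is the computational cost that motivates an analytic approximation.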

  18. Material grain size characterization method based on energy attenuation coefficient spectrum and support vector regression.

    Science.gov (United States)

    Li, Min; Zhou, Tong; Song, Yanan

    2016-07-01

    A grain size characterization method based on energy attenuation coefficient spectrum and support vector regression (SVR) is proposed. First, the spectra of the first and second back-wall echoes are cut into several frequency bands to calculate the energy attenuation coefficient spectrum. Second, the frequency band that is sensitive to grain size variation is determined. Finally, a statistical model between the energy attenuation coefficient in the sensitive frequency band and average grain size is established through SVR. Experimental verification is conducted on austenitic stainless steel. The average relative error of the predicted grain size is 5.65%, which is better than that of conventional methods.

  19. DOA Finding with Support Vector Regression Based Forward–Backward Linear Prediction

    Directory of Open Access Journals (Sweden)

    Jingjing Pan

    2017-05-01

    Direction-of-arrival (DOA) estimation has drawn considerable attention in array signal processing, particularly with coherent signals and a limited number of snapshots. Forward–backward linear prediction (FBLP) is able to directly deal with coherent signals. Support vector regression (SVR) is robust with small samples. This paper proposes the combination of the advantages of FBLP and SVR in the estimation of DOAs of coherent incoming signals with low snapshots. The performance of the proposed method is validated with numerical simulations in coherent scenarios, in terms of different angle separations, numbers of snapshots, and signal-to-noise ratios (SNRs). Simulation results show the effectiveness of the proposed method.

  20. Research of Chinese Stock Index Futures Regression Prediction Based on Support Vector Machines

    Institute of Scientific and Technical Information of China (English)

    赛英; 张凤廷; 张涛

    2013-01-01

    According to the characteristics of stock index futures prediction, indicators that strongly influence the trend of stock index futures are selected, and support vector machines (SVM) are applied for the first time to the regression prediction of stock index futures. A genetic algorithm (GA) and a particle swarm optimization (PSO) algorithm are each used to optimize SVMs with four different kernel functions, giving eight different regression prediction schemes for Chinese stock index futures. By comparing the accuracy and the time complexity of the eight schemes, the empirical study shows that the linear-kernel SVM optimized by PSO is the best model for regression prediction of Chinese stock index futures.

  1. Quantile regression

    CERN Document Server

    Hao, Lingxin

    2007-01-01

    Quantile Regression, the first book of Hao and Naiman's two-book series, establishes the seldom recognized link between inequality studies and quantile regression models. Though separate methodological literature exists for each subject, the authors seek to explore the natural connections between this increasingly sought-after tool and research topics in the social sciences. Quantile regression as a method does not rely on assumptions as restrictive as those for the classical linear regression; though more traditional models such as least squares linear regression are more widely utilized, Hao

  2. Estimation of Electrically-Evoked Knee Torque from Mechanomyography Using Support Vector Regression.

    Science.gov (United States)

    Ibitoye, Morufu Olusola; Hamzaid, Nur Azah; Abdul Wahab, Ahmad Khairi; Hasnan, Nazirah; Olatunji, Sunday Olusanya; Davis, Glen M

    2016-07-19

    The difficulty of real-time muscle force or joint torque estimation during neuromuscular electrical stimulation (NMES) in physical therapy and exercise science has motivated recent research interest in torque estimation from other muscle characteristics. This study investigated the accuracy of a computational intelligence technique for estimating NMES-evoked knee extension torque based on the mechanomyographic (MMG) signals of contracting muscles that were recorded from eight healthy males. The knee torque was modelled via Support Vector Regression (SVR) due to its good generalization ability in related fields. Inputs to the proposed model were MMG amplitude characteristics, the level of electrical stimulation or contraction intensity, and knee angle. A Gaussian kernel function and its optimal parameters, identified as giving the best performance measure, were applied as the SVR kernel function to build an effective knee torque estimation model. To train and test the model, the data were partitioned into training (70%) and testing (30%) subsets, respectively. The SVR estimation accuracy, based on the coefficient of determination (R²) between the actual and the estimated torque values, was up to 94% and 89% in the training and testing cases, with root mean square errors (RMSE) of 9.48 and 12.95, respectively. The knee torque estimations obtained using SVR modelling agreed well with the experimental data from an isokinetic dynamometer. These findings support the realization of a closed-loop NMES system for functional tasks using MMG as the feedback signal source and an SVR algorithm for joint torque estimation.

  3. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests.

    Science.gov (United States)

    Maroco, João; Silva, Dina; Rodrigues, Ana; Guerreiro, Manuela; Santana, Isabel; de Mendonça, Alexandre

    2011-08-17

    Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Press' Q test showed that all classifiers performed better than chance alone (p Machines showed the larger overall classification accuracy (Median (Me) = 0.76) an area under the ROC (Me = 0.90). However this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73) specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72) specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a

  4. Pre-processing data using wavelet transform and PCA based on support vector regression and gene expression programming for river flow simulation

    Indian Academy of Sciences (India)

    Abazar Solgi; Amir Pourhaghi; Ramin Bahmani; Heidar Zarei

    2017-07-01

    Accurate estimation of flow using different models is an open issue for water resource researchers. In this study, support vector regression (SVR) and gene expression programming (GEP) models at daily and monthly scales were used to simulate Gamasiyab River flow in Nahavand, Iran. The results showed that the performance of the models at the daily scale was acceptable, with the SVR model slightly better, and that performance at the daily scale was considerably better than at the monthly scale. Therefore, a wavelet transform was applied and the main signal of every input was decomposed. Then, using principal component analysis, the important sub-signals were identified and used as inputs for the SVR and GEP models to produce wavelet-support vector regression (WSVR) and wavelet-gene expression programming models. The results showed that WSVR performed better than SVR: combining SVR with the wavelet transform improved the determination coefficient of the model by up to 3% and 18% for the daily and monthly scales, respectively. Overall, the combination of wavelet transform with SVR is a suitable tool for the prediction of Gamasiyab River flow at both daily and monthly scales.
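
    The decomposition step described above can be sketched as follows. This is an illustration only: it assumes the Haar wavelet, which the abstract does not specify, uses made-up flow values, and the function names are hypothetical. Each sub-signal returned here is the kind of candidate input that a later PCA step would screen.

```python
import math


def haar_step(signal):
    """One level of the Haar discrete wavelet transform (even-length input)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Return [detail_1, ..., detail_L, approx_L] for an L-level decomposition."""
    sub_signals = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        sub_signals.append(detail)
    sub_signals.append(current)
    return sub_signals


# Made-up flow series (length must be divisible by 2**levels).
flow = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
parts = haar_decompose(flow, levels=2)
```

    Because the Haar transform is orthonormal, the sub-signals together carry the same energy as the original series, so no information is lost before the selection step.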

  5. Fruit fly optimization based least square support vector regression for blind image restoration

    Science.gov (United States)

    Zhang, Jiao; Wang, Rui; Li, Junshan; Yang, Yawei

    2014-11-01

    The goal of image restoration is to reconstruct the original scene from a degraded observation. It is a critical and challenging task in image processing. Classical restoration methods require explicit knowledge of the point spread function (PSF) and a description of the noise as priors. However, this is not practical for many real image processing tasks, where recovery has to be treated as a blind image restoration problem. Since blind deconvolution is an ill-posed problem, many blind restoration methods need to make additional assumptions to construct restrictions. Due to differences in PSF and noise energy, blurred images can be quite different, and it is difficult to achieve a good balance between proper assumptions and high restoration quality in blind deconvolution. Recently, machine learning techniques have been applied to blind image restoration. Least squares support vector regression (LSSVR) has been shown to offer strong potential in estimation and forecasting problems. Therefore, this paper proposes an LSSVR-based image restoration method. However, selecting the optimal parameters for the support vector machine is essential to the training result. As a novel meta-heuristic algorithm, the fruit fly optimization algorithm (FOA) can be used to handle optimization problems and has the advantage of fast convergence to the global optimal solution. In the proposed method, the training samples are created by mapping a neighborhood in the degraded image to the central pixel in the original image. The mapping between the degraded image and the original image is learned by training LSSVR. The two parameters of LSSVR are optimized through FOA. The fitness function of FOA is calculated from the restoration error function. With the acquired mapping, the degraded image can be recovered. Experimental results show that the proposed method can obtain a satisfactory restoration effect. Compared with BP neural network regression, the SVR method and the Lucy-Richardson algorithm, it speeds up the restoration rate and

  6. Phone Duration Modeling of Affective Speech Using Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Alexandros Lazaridis

    2012-07-01

    In speech synthesis, accurate modeling of prosody is important for producing high-quality synthetic speech. One of the main aspects of prosody is phone duration. Robust phone duration modeling is a prerequisite for synthesizing natural-sounding emotional speech. In this work ten phone duration models are evaluated. These models belong to well-known and widely used categories of algorithms, such as decision trees, linear regression, lazy-learning algorithms and meta-learning algorithms. Furthermore, we investigate the effectiveness of Support Vector Regression (SVR) in phone duration modeling in the context of emotional speech. The evaluation of the eleven models is performed on a Modern Greek emotional speech database which consists of four categories of emotional speech (anger, fear, joy, sadness) plus neutral speech. The experimental results demonstrated that the SVR-based modeling outperforms the other ten models across all four emotion categories. Specifically, the SVR model achieved an average relative reduction of 8% in terms of root mean square error (RMSE) across all emotional categories.

  7. A machine learning pipeline for quantitative phenotype prediction from genotype data

    Directory of Open Access Journals (Sweden)

    Jurman Giuseppe

    2010-10-01

    Background: Quantitative phenotypes emerge everywhere in systems biology and biomedicine due to a direct interest in quantitative traits, or to high individual variability that makes it hard or impossible to classify samples into distinct categories, as is often the case with complex common diseases. Machine learning approaches to genotype-phenotype mapping may significantly improve Genome-Wide Association Studies (GWAS) results by explicitly focusing on predictivity and optimal feature selection in a multivariate setting. It is however essential that stringent and well documented Data Analysis Protocols (DAP) are used to control sources of variability and ensure reproducibility of results. We present a genome-to-phenotype pipeline of machine learning modules for quantitative phenotype prediction. The pipeline can be applied for the direct use of whole-genome information in functional studies. As a realistic example, the problem of fitting complex phenotypic traits in heterogeneous stock mice from single nucleotide polymorphisms (SNPs) is considered here. Methods: The core element in the pipeline is the L1L2 regularization method based on the naïve elastic net. The method gives at the same time a regression model and a dimensionality reduction procedure suitable for correlated features. Model and SNP markers are selected through a DAP originally developed in the MAQC-II collaborative initiative of the U.S. FDA for the identification of clinical biomarkers from microarray data. The L1L2 approach is compared with standard Support Vector Regression (SVR) and with Recursive Jump Monte Carlo Markov Chain (MCMC). Algebraic indicators of stability of partial lists are used for model selection; the final panel of markers is obtained by a procedure at the chromosome scale, termed 'saturation', to recover SNPs in Linkage Disequilibrium with those selected. Results: With respect to both MCMC and SVR, comparable accuracies are obtained by the L1L2 pipeline

  8. Modeling personalized head-related impulse response using support vector regression

    Institute of Scientific and Technical Information of China (English)

    HUANG Qing-hua; FANG Yong

    2009-01-01

    A new customization approach based on support vector regression (SVR) is proposed to obtain individual head-related impulse responses (HRIRs) without complex measurement and special equipment. Principal component analysis (PCA) is first applied to obtain a few principal components and the corresponding weight vectors correlated with individual anthropometric parameters. The weight vectors then act as the output of the nonlinear regression model. Some measured anthropometric parameters are selected as input of the model according to the correlation coefficients between the parameters and the weight vectors. After the regression model is learned from the training data, the individual HRIR can be predicted from the measured anthropometric parameters. Compared with a back-propagation neural network (BPNN) for nonlinear regression, the proposed PCA-SVR algorithm achieves better generalization and prediction performance for small training samples.

  9. Regression Basics

    CERN Document Server

    Kahane, Leo H

    2007-01-01

    Using a friendly, nontechnical approach, the Second Edition of Regression Basics introduces readers to the fundamentals of regression. Accessible to anyone with an introductory statistics background, this book builds from a simple two-variable model to a model of greater complexity. Author Leo H. Kahane weaves four engaging examples throughout the text to illustrate not only the techniques of regression but also how this empirical tool can be applied in creative ways to consider a broad array of topics. New to the Second Edition Offers greater coverage of simple panel-data estimation:

  10. Semi-supervised Machine Learning for Analysis of Hydrogeochemical Data and Models

    Science.gov (United States)

    Vesselinov, Velimir; O'Malley, Daniel; Alexandrov, Boian; Moore, Bryan

    2017-04-01

    Data- and model-based analyses such as uncertainty quantification, sensitivity analysis, and decision support using complex physics models with numerous model parameters typically require a huge number of model evaluations (on the order of 10^6). Furthermore, model simulations of complex physics may require substantial computational time. For example, accounting for simultaneously occurring physical processes such as fluid flow and biogeochemical reactions in a heterogeneous porous medium may require several hours of wall-clock computational time. To address these issues, we have developed a novel methodology for semi-supervised machine learning based on Non-negative Matrix Factorization (NMF) coupled with customized k-means clustering. The algorithm allows for automated, robust Blind Source Separation (BSS) of groundwater types (contamination sources) based on model-free analyses of observed hydrogeochemical data. We have also developed reduced order modeling tools, which couple support vector regression (SVR), genetic algorithms (GA) and artificial and convolutional neural networks (ANN/CNN). SVR is applied to predict the model behavior within prior uncertainty ranges associated with the model parameters. ANN and CNN procedures are applied to upscale heterogeneity of the porous medium. In the upscaling process, fine-scale high-resolution models of heterogeneity are applied to inform coarse-resolution models which have improved computational efficiency while capturing the impact of fine-scale effects at the coarse scale of interest. These techniques are tested independently on a series of synthetic problems. We also present a decision analysis related to contaminant remediation where the developed reduced order models are applied to reproduce groundwater flow and contaminant transport in a synthetic heterogeneous aquifer. The tools are coded in Julia and are a part of the MADS high-performance computational framework (https://github.com/madsjulia/Mads.jl).

  11. Intelligent Evaluation for Aerial Warfare Efficiency of Fighter-plane Based on Rough Set and Support Vector Machine

    Institute of Scientific and Technical Information of China (English)

    龚胜科; 徐浩军; 林敏

    2012-01-01

    According to the characteristics of modern aerial warfare, an index set for evaluating the aerial warfare efficiency of fighter planes is selected in this paper. The index system is reduced using rough set theory to extract the characteristic parameters that crucially affect aerial warfare efficiency, which removes redundant information and reduces the dimension of the support vectors. The support vector machine (SVM) has the advantages of simple structure, global optimality and strong generalization ability. With the extracted characteristic parameters, an intelligent evaluation model for the aerial warfare efficiency of fighter planes is established using Support Vector Regression (SVR). A case study compares the SVR results with those of the index method and the BP neural network method, verifying the feasibility and validity of the model.

  12. Constrained Sparse Galerkin Regression

    CERN Document Server

    Loiseau, Jean-Christophe

    2016-01-01

    In this work, we demonstrate the use of sparse regression techniques from machine learning to identify nonlinear low-order models of a fluid system purely from measurement data. In particular, we extend the sparse identification of nonlinear dynamics (SINDy) algorithm to enforce physical constraints in the regression, leading to energy conservation. The resulting models are closely related to Galerkin projection models, but the present method does not require the use of a full-order or high-fidelity Navier-Stokes solver to project onto basis modes. Instead, the most parsimonious nonlinear model is determined that is consistent with observed measurement data and satisfies necessary constraints. The constrained Galerkin regression algorithm is implemented on the fluid flow past a circular cylinder, demonstrating the ability to accurately construct models from data.

  13. Support vector echo-state machine for chaotic time-series prediction.

    Science.gov (United States)

    Shi, Zhiwei; Han, Min

    2007-03-01

    A novel chaotic time-series prediction method based on support vector machines (SVMs) and echo-state mechanisms is proposed. The basic idea is replacing "kernel trick" with "reservoir trick" in dealing with nonlinearity, that is, performing linear support vector regression (SVR) in the high-dimension "reservoir" state space, and the solution benefits from the advantages from structural risk minimization principle, and we call it support vector echo-state machines (SVESMs). SVESMs belong to a special kind of recurrent neural networks (RNNs) with convex objective function, and their solution is global, optimal, and unique. SVESMs are especially efficient in dealing with real life nonlinear time series, and its generalization ability and robustness are obtained by regularization operator and robust loss function. The method is tested on the benchmark prediction problem of Mackey-Glass time series and applied to some real life time series such as monthly sunspots time series and runoff time series of the Yellow River, and the prediction results are promising.

  14. A planning quality evaluation tool for prostate adaptive IMRT based on machine learning

    Energy Technology Data Exchange (ETDEWEB)

    Zhu Xiaofeng; Ge Yaorong; Li Taoran; Thongphiew, Danthai; Yin Fangfang; Wu, Q Jackie [Department of Radiation Oncology, Duke University Medical Center, Durham, North Carolina 27708 (United States); Department of Biomedical Engineering, Wake Forest University Health Sciences, Medical Center Boulevard, Winston-Salem, North Carolina 27106 (United States); Department of Radiation Oncology, Duke University Medical Center, Durham, North Carolina 27708 (United States); Department of Radiation Oncology, Brody School of Medicine, East Carolina University, Greenville, North Carolina 27834 (United States); Department of Radiation Oncology, Duke University Medical Center, Durham, North Carolina 27708 (United States)

    2011-02-15

    Purpose: To ensure plan quality for adaptive IMRT of the prostate, we developed a quantitative evaluation tool using a machine learning approach. This tool generates dose volume histograms (DVHs) of organs-at-risk (OARs) based on prior plans as a reference, to be compared with the adaptive plan derived from fluence map deformation. Methods: Under the same configuration using seven-field 15 MV photon beams, DVHs of OARs (bladder and rectum) were estimated based on anatomical information of the patient and a model learned from a database of high quality prior plans. In this study, the anatomical information was characterized by the organ volumes and distance-to-target histogram (DTH). The database consists of 198 high quality prostate plans and was validated with 14 cases outside the training pool. Principal component analysis (PCA) was applied to DVHs and DTHs to quantify their salient features. Then, support vector regression (SVR) was implemented to establish the correlation between the features of the DVH and the anatomical information. Results: DVH/DTH curves could be characterized sufficiently using only two or three truncated principal components; thus, patient anatomical information was quantified with a reduced number of variables. The evaluation of the model using the test data set demonstrated an accuracy of approximately 80% in prediction and its effectiveness in improving ART planning quality. Conclusions: An adaptive IMRT plan quality evaluation tool based on machine learning has been developed, which estimates OAR sparing and provides a reference for evaluating ART.

  15. Prediction of pore-water pressure response to rainfall using support vector regression

    Science.gov (United States)

    Babangida, Nuraddeen Muhammad; Mustafa, Muhammad Raza Ul; Yusuf, Khamaruzaman Wan; Isa, Mohamed Hasnain

    2016-11-01

    Nonlinear complex behavior of pore-water pressure responses to rainfall was modelled using support vector regression (SVR). Pore-water pressure can rise to disturbing levels that may result in slope failure during or after rainfall. Traditionally, monitoring slope pore-water pressure responses to rainfall is tedious and expensive, in that the slope must be instrumented with necessary monitors. Data on rainfall and corresponding responses of pore-water pressure were collected from such a monitoring program at a slope site in Malaysia and used to develop SVR models to predict pore-water pressure fluctuations. Three models, based on their different input configurations, were developed. SVR optimum meta-parameters were obtained using k-fold cross validation and a grid search. Model type 3 was adjudged the best among the models and was used to predict three other points on the slope. For each point, lag intervals of 30 min, 1 h and 2 h were used to make the predictions. The SVR model predictions were compared with predictions made by an artificial neural network model; overall, the SVR model showed slightly better results. Uncertainty quantification analysis was also performed for further model assessment. The uncertainty components were found to be low and tolerable, with d-factor of 0.14 and 74 % of observed data falling within the 95 % confidence bound. The study demonstrated that the SVR model is effective in providing an accurate and quick means of obtaining pore-water pressure response, which may be vital in systems where response information is urgently needed.
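
    The two uncertainty figures quoted above (the d-factor and the share of observations inside the 95 % bound) can be sketched as below. This assumes one common definition, in which the d-factor is the mean band width divided by the standard deviation of the observations; the study may define it differently. The data are made up and the function names are hypothetical.

```python
import statistics


def p_factor(observed, lower, upper):
    """Fraction of observations that fall inside the [lower, upper] band."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


def d_factor(observed, lower, upper):
    """Mean band width relative to the spread of the observations.

    Assumption: the population standard deviation is used as the spread.
    """
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(observed)
    return mean_width / statistics.pstdev(observed)


# Made-up pore-water pressure observations and a made-up prediction band.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
lower = [0.8, 1.9, 3.2, 3.7, 4.9]
upper = [1.2, 2.3, 3.6, 4.1, 5.3]

coverage = p_factor(observed, lower, upper)
relative_width = d_factor(observed, lower, upper)
```

    A narrow band (small d-factor) that still brackets most observations (large coverage) is the desirable combination; the two numbers are usually read together.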

  16. Prediction of pore-water pressure response to rainfall using support vector regression

    Science.gov (United States)

    Babangida, Nuraddeen Muhammad; Mustafa, Muhammad Raza Ul; Yusuf, Khamaruzaman Wan; Isa, Mohamed Hasnain

    2016-05-01

    Nonlinear complex behavior of pore-water pressure responses to rainfall was modelled using support vector regression (SVR). Pore-water pressure can rise to disturbing levels that may result in slope failure during or after rainfall. Traditionally, monitoring slope pore-water pressure responses to rainfall is tedious and expensive, in that the slope must be instrumented with necessary monitors. Data on rainfall and corresponding responses of pore-water pressure were collected from such a monitoring program at a slope site in Malaysia and used to develop SVR models to predict pore-water pressure fluctuations. Three models, based on their different input configurations, were developed. SVR optimum meta-parameters were obtained using k-fold cross validation and a grid search. Model type 3 was adjudged the best among the models and was used to predict three other points on the slope. For each point, lag intervals of 30 min, 1 h and 2 h were used to make the predictions. The SVR model predictions were compared with predictions made by an artificial neural network model; overall, the SVR model showed slightly better results. Uncertainty quantification analysis was also performed for further model assessment. The uncertainty components were found to be low and tolerable, with d-factor of 0.14 and 74 % of observed data falling within the 95 % confidence bound. The study demonstrated that the SVR model is effective in providing an accurate and quick means of obtaining pore-water pressure response, which may be vital in systems where response information is urgently needed.

  17. Study on Flow Characteristic of Gear Pump Based on Support Vector Machines Regression

    Institute of Scientific and Technical Information of China (English)

    曾德堂; 赵威力; 王曦

    2012-01-01

    To address the problem that the actual flow rate cannot be obtained from the gear pump flow formula, a support vector machine regression (SVMR) model that uses training results to calculate the flow rate was built. The model was trained with gear pump experimental data as learning samples, and the computed values were compared with the test results, showing good agreement. The results show that support vector machine regression can be effectively applied to study the flow characteristics of external gear pumps.

  18. A Bayesian least squares support vector machines based framework for fault diagnosis and failure prognosis

    Science.gov (United States)

    Khawaja, Taimoor Saleem

    and any abnormal or novel data during real-time operation. The results of the scheme are interpreted as a posterior probability of health (1 - probability of fault). As shown through two case studies in Chapter 3, the scheme is well suited for diagnosing imminent faults in dynamical non-linear systems. Finally, the failure prognosis scheme is based on an incremental weighted Bayesian LS-SVR machine. It is particularly suited for online deployment given the incremental nature of the algorithm and the quick optimization problem solved in the LS-SVR algorithm. By way of kernelization and a Gaussian Mixture Modeling (GMM) scheme, the algorithm can estimate "possibly" non-Gaussian posterior distributions for complex non-linear systems. An efficient regression scheme associated with the more rigorous core algorithm allows for long-term predictions, fault growth estimation with confidence bounds and remaining useful life (RUL) estimation after a fault is detected. The leading contributions of this thesis are (a) the development of a novel Bayesian Anomaly Detector for efficient and reliable Fault Detection and Identification (FDI) based on Least Squares Support Vector Machines, (b) the development of a data-driven real-time architecture for long-term Failure Prognosis using Least Squares Support Vector Machines, (c) Uncertainty representation and management using Bayesian Inference for posterior distribution estimation and hyper-parameter tuning, and finally (d) the statistical characterization of the performance of diagnosis and prognosis algorithms in order to relate the efficiency and reliability of the proposed schemes.

  19. A Support Vector Regression Approach for Investigating Multianticipative Driving Behavior

    Directory of Open Access Journals (Sweden)

    Bin Lu

    2015-01-01

    This paper presents a Support Vector Regression (SVR) approach that can be applied to predict multianticipative driving behavior using vehicle trajectory data. Building upon the SVR approach, a multianticipative car-following model is developed and enhanced in learning speed and prediction accuracy. The model training and validation are conducted using field trajectory data extracted from the Next Generation Simulation (NGSIM) project. During the model training and validation tests, the estimation results show that the SVR model performs as well as the IDM model with respect to prediction accuracy. In addition, this paper performs a relative importance analysis to quantify the multianticipation in terms of the different stimuli to which drivers react in platoon car following. The analysis results confirm that drivers respond to the behavior of not only the immediate leading vehicle but also the second, third, and even fourth leading vehicles. Specifically, in congested traffic conditions, drivers are observed to be more sensitive to the relative speed than to the gap. These findings provide insight into multianticipative driving behavior and illustrate the necessity of accounting for multianticipative car-following models in microscopic traffic simulation.

  20. Predicting Future Hourly Residential Electrical Consumption: A Machine Learning Case Study

    Energy Technology Data Exchange (ETDEWEB)

    Edwards, Richard E [ORNL; New, Joshua Ryan [ORNL; Parker, Lynne Edwards [ORNL

    2012-01-01

    Whole building input models for energy simulation programs are frequently created in order to evaluate specific energy savings potentials. They are also often utilized to maximize cost-effective retrofits for existing buildings as well as to estimate the impact of policy changes toward meeting energy savings goals. Traditional energy modeling suffers from several factors, including the large number of inputs required to characterize the building, the specificity required to accurately model building materials and components, simplifying assumptions made by underlying simulation algorithms, and the gap between the as-designed and as-built building. Prior works have attempted to mitigate these concerns by using sensor-based machine learning approaches to model energy consumption. However, a majority of these prior works focus only on commercial buildings. The works that focus on modeling residential buildings primarily predict monthly electrical consumption, while commercial models predict hourly consumption. This means there is not a clear indicator of which techniques best model residential consumption, since these methods are only evaluated using low-resolution data. We address this issue by testing seven different machine learning algorithms on a unique residential data set, which contains 140 different sensor measurements, collected every 15 minutes. In addition, we validate each learner's correctness on the ASHRAE Great Energy Prediction Shootout, using the original competition metrics. Our validation results confirm existing conclusions that Neural Network-based methods perform best on commercial buildings. However, the results from testing our residential data set show that Feed Forward Neural Networks, Support Vector Regression (SVR), and Linear Regression methods perform poorly, and that Hierarchical Mixture of Experts (HME) with Least Squares Support Vector Machines (LS-SVM) performs best - a technique not previously applied to this domain.

  1. Tuning a PD Controller Based on an SVR for the Control of a Biped Robot Subject to External Forces and Slope Variation

    Directory of Open Access Journals (Sweden)

    João P. Ferreira

    2014-03-01

    The ZMP is calculated by reading four force sensors placed under each of the robot's feet. The gait implemented in this biped is similar to a human gait, which is acquired and adapted to the robot's size. The main contribution of this paper is the fine-tuning of the ZMP controller based on the SVR. To implement and test this, the biped robot was subjected to external forces and slope variation. Some experiments are presented and the results show that the implemented gait combined with the correct tuning of the SVR controller is appropriate for use with this biped robot. The SVR controller runs at 0.2 ms, which is about 50 times faster than a corresponding first-order TSK neural-fuzzy network.

  2. Application of the HCR-20 and SVR-20 violence recidivism instruments in two groups of Colombian offenders

    Directory of Open Access Journals (Sweden)

    Ángela Tapias Saldaña

    2011-06-01

    This exploratory study, with a non-experimental, cross-sectional design, aimed to determine whether the forensic psychological assessment instruments HCR-20 and SVR-20 discriminate between a group of repeat offenders in crimes of violent carnal access and a group of subjects prosecuted for minor offenses. Indicators from both the HCR-20 and the SVR-20 were present in the groups. Significant differences between the groups' scores were found for the SVR-20, but not for the HCR-20. Finally, new risk factors were observed that could be included in forensic instruments.

  3. Autistic Regression

    Science.gov (United States)

    Matson, Johnny L.; Kozlowski, Alison M.

    2010-01-01

    Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…

  4. Logistic regression.

    Science.gov (United States)

    Nick, Todd G; Campbell, Kathleen M

    2007-01-01

    The Medical Subject Headings (MeSH) thesaurus used by the National Library of Medicine defines logistic regression models as "statistical models which describe the relationship between a qualitative dependent variable (that is, one which can take only certain discrete values, such as the presence or absence of a disease) and an independent variable." Logistic regression models are used to study effects of predictor variables on categorical outcomes and normally the outcome is binary, such as presence or absence of disease (e.g., non-Hodgkin's lymphoma), in which case the model is called a binary logistic model. When there are multiple predictors (e.g., risk factors and treatments) the model is referred to as a multiple or multivariable logistic regression model and is one of the most frequently used statistical model in medical journals. In this chapter, we examine both simple and multiple binary logistic regression models and present related issues, including interaction, categorical predictor variables, continuous predictor variables, and goodness of fit.
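
    As a small illustration of the binary logistic model described above, the sketch below turns a linear combination of predictors into a predicted probability and a coefficient into an odds ratio. The coefficient values are made up and the function names are hypothetical; no fitting is shown.

```python
import math


def predicted_probability(intercept, coefficients, predictors):
    """P(outcome = 1) under a binary logistic model with the given coefficients."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-linear))


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(coefficient)


# Made-up model: intercept -1.0, two predictors with coefficients 0.5 and 1.0.
p = predicted_probability(-1.0, [0.5, 1.0], [2.0, 0.0])
ratio_for_null_effect = odds_ratio(0.0)
```

    With a single predictor this is the simple logistic model; passing several coefficients gives the multiple (multivariable) form mentioned in the abstract.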

  5. The Neural Support Vector Machine

    NARCIS (Netherlands)

    Wiering, Marco; van der Ree, Michiel; Embrechts, Mark; Stollenga, Marijn; Meijster, Arnold; Nolte, A; Schomaker, Lambertus

    2013-01-01

    This paper describes a new machine learning algorithm for regression and dimensionality reduction tasks. The Neural Support Vector Machine (NSVM) is a hybrid learning algorithm consisting of neural networks and support vector machines (SVMs). The output of the NSVM is given by SVMs that take a

  6. The Neural Support Vector Machine

    NARCIS (Netherlands)

    Wiering, Marco; van der Ree, Michiel; Embrechts, Mark; Stollenga, Marijn; Meijster, Arnold; Nolte, A; Schomaker, Lambertus

    2013-01-01

    This paper describes a new machine learning algorithm for regression and dimensionality reduction tasks. The Neural Support Vector Machine (NSVM) is a hybrid learning algorithm consisting of neural networks and support vector machines (SVMs). The output of the NSVM is given by SVMs that take a centr

  7. Multivariate Time Series Forecasting of Crude Palm Oil Price Using Machine Learning Techniques

    Science.gov (United States)

    Kanchymalay, Kasturi; Salim, N.; Sukprasert, Anupong; Krishnan, Ramesh; Raba'ah Hashim, Ummi

    2017-08-01

    The aim of this paper was to study the correlation between crude palm oil (CPO) price, selected vegetable oil prices (such as soybean oil, coconut oil, olive oil, rapeseed oil and sunflower oil), crude oil price and the monthly exchange rate. Comparative analysis was then performed on CPO price forecasting results using machine learning techniques. Monthly CPO prices, selected vegetable oil prices, crude oil prices and monthly exchange rate data from January 1987 to February 2017 were utilized. Preliminary analysis showed a high positive correlation between the CPO price and soybean oil price and also between CPO price and crude oil price. Experiments were conducted using multi-layer perceptron, support vector regression and Holt-Winters exponential smoothing techniques. The results were assessed using root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE) and directional accuracy (DA). Among these three techniques, support vector regression (SVR) with the sequential minimal optimization (SMO) algorithm showed relatively better results than the multi-layer perceptron and the Holt-Winters exponential smoothing method.
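
    The four assessment criteria named in this abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and the definition of directional accuracy used here (predicted move from the previous actual value has the same sign as the actual move) is one common convention that the abstract does not confirm.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual value is zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def directional_accuracy(actual, predicted):
    """Percentage of steps where the forecast moves in the same direction as the series.

    Assumed convention: both moves are measured from the previous actual value.
    """
    hits = 0
    for t in range(1, len(actual)):
        actual_move = actual[t] - actual[t - 1]
        predicted_move = predicted[t] - actual[t - 1]
        if actual_move * predicted_move > 0:
            hits += 1
    return 100.0 * hits / (len(actual) - 1)

# Made-up monthly prices, purely for demonstration.
prices = [100.0, 110.0, 105.0, 120.0]
forecast = [98.0, 108.0, 109.0, 115.0]
print(rmse(prices, forecast), mae(prices, forecast))
print(mape(prices, forecast), directional_accuracy(prices, forecast))
```

    RMSE penalizes large misses more heavily than MAE, MAPE makes errors comparable across price levels, and DA ignores magnitude entirely, which is why forecasting studies often report them together.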

  8. A method for separating seismo-ionospheric TEC outliers from heliogeomagnetic disturbances by using nu-SVR

    Energy Technology Data Exchange (ETDEWEB)

    Pattisahusiwa, Asis [Bandung Institute of Technology (Indonesia); Liong, The Houw; Purqon, Acep [Earth physics and complex systems research group, Bandung Institute of Technology (Indonesia)

    2015-09-30

    Seismo-ionospheric research is the study of ionospheric disturbances associated with seismic activity. Many previous studies have reported that heliogeomagnetic activity or strong earthquakes can cause disturbances in the ionosphere. However, it is difficult to separate these disturbances according to their sources. In this research, we propose a method to separate these disturbances/outliers by using nu-SVR with worldwide GPS data. TEC data related to the 26 December 2004 Sumatra and the 11 March 2011 Honshu earthquakes were analyzed. After analyzing TEC data at several locations around the earthquake epicenters and comparing them with geomagnetic data, the method shows good results on average in detecting the source of these outliers. The method is promising for use in future research.

  9. Linear regression

    CERN Document Server

    Olive, David J

    2017-01-01

    This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...

  10. Predicting the metabolizable energy content of corn for ducks: a comparison of support vector regression with other methods

    Directory of Open Access Journals (Sweden)

    A. Faridi

    2013-11-01

    Support vector regression (SVR) is used in this study to develop models to estimate apparent metabolizable energy (AME), AME corrected for nitrogen (AMEn), true metabolizable energy (TME), and TME corrected for nitrogen (TMEn) contents of corn fed to ducks based on its chemical composition. Performance of the SVR models was assessed by comparing their results with those of artificial neural network (ANN) and multiple linear regression (MLR) models. The input variables to estimate metabolizable energy content (MJ kg⁻¹) of corn were crude protein, ether extract, crude fibre, and ash (g kg⁻¹). Goodness of fit of the models was examined using R², mean square error, and bias. Based on these indices, the predictive performance of the SVR, ANN, and MLR models was acceptable. Comparison of models indicated that performance of SVR (in terms of R²) on the full data set (0.937 for AME, 0.954 for AMEn, 0.860 for TME, and 0.937 for TMEn) was better than that of ANN (0.907 for AME, 0.922 for AMEn, 0.744 for TME, and 0.920 for TMEn) and MLR (0.887 for AME, 0.903 for AMEn, 0.704 for TME, and 0.902 for TMEn). Similar findings were observed with the calibration and testing data sets. These results suggest SVR models are a promising tool for modelling the relationship between chemical composition and metabolizable energy of feedstuffs for poultry. Although from the present results the application of SVR models seems encouraging, the use of such models in other areas of animal nutrition needs to be evaluated.
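
    The three goodness-of-fit indices mentioned in this abstract (R², mean square error, and bias) can be computed as in the sketch below. The exact definitions used in the study are not given in the abstract, so the formulas here are assumptions: R² is taken as one minus the ratio of residual to total sum of squares, and bias as the mean of predicted minus observed values. Function names and the sample numbers are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total (assumed definition)."""
    observed_mean = mean(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - observed_mean) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total

def mean_square_error(observed, predicted):
    return mean([(o - p) ** 2 for o, p in zip(observed, predicted)])

def bias(observed, predicted):
    """Mean of (predicted - observed); positive values indicate overprediction."""
    return mean([p - o for o, p in zip(observed, predicted)])

# Made-up energy values (MJ/kg), purely for demonstration.
observed_energy = [10.0, 12.0, 14.0, 16.0]
predicted_energy = [11.0, 12.0, 13.0, 16.0]
print(r_squared(observed_energy, predicted_energy))
print(mean_square_error(observed_energy, predicted_energy))
print(bias(observed_energy, predicted_energy))
```

    Reporting bias alongside R² and mean square error is useful because a model can have a high R² while still systematically over- or underpredicting.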

  11. Reference Function Based Spatiotemporal Fuzzy Logic Control Design Using Support Vector Regression Learning

    Directory of Open Access Journals (Sweden)

    Xian-Xia Zhang

    2013-01-01

    This paper presents a reference function based 3D FLC design methodology using support vector regression (SVR) learning. The concept of reference function is introduced to 3D FLC for the generation of 3D membership functions (MF), which enhance the capability of the 3D FLC to cope with more kinds of MFs. The nonlinear mathematical expression of the reference function based 3D FLC is derived, and spatial fuzzy basis functions are defined. Via relating spatial fuzzy basis functions of a 3D FLC to kernel functions of an SVR, an equivalence relationship between a 3D FLC and an SVR is established. Therefore, a 3D FLC can be constructed using the learned results of an SVR. Furthermore, the universal approximation capability of the proposed 3D fuzzy system is proven in terms of the finite covering theorem. Finally, the proposed method is applied to a catalytic packed-bed reactor and simulation results have verified its effectiveness.

  12. Predictive based monitoring of nuclear plant component degradation using support vector regression

    Energy Technology Data Exchange (ETDEWEB)

    Agarwal, Vivek [Idaho National Lab. (INL), Idaho Falls, ID (United States). Dept. of Human Factors, Controls, Statistics; Alamaniotis, Miltiadis [Purdue Univ., West Lafayette, IN (United States). School of Nuclear Engineering; Tsoukalas, Lefteri H. [Purdue Univ., West Lafayette, IN (United States). School of Nuclear Engineering

    2015-02-01

    Nuclear power plants (NPPs) are large installations comprised of many active and passive assets. Degradation monitoring of all these assets is an expensive (labor cost) and highly demanding task. In this paper, a framework based on Support Vector Regression (SVR) for online surveillance of critical parameter degradation of NPP components is proposed. In this case, timely replacement or maintenance of components will prevent potential plant malfunctions and reduce the overall operational cost. In the current work, we apply SVR equipped with a Gaussian kernel function to monitor components. Monitoring includes the one-step-ahead prediction of the component’s respective operational quantity using the SVR model, while the SVR model is trained using a set of previously recorded degradation histories of similar components. Predictive capability of the model is evaluated upon arrival of a sensor measurement, which is compared to the component failure threshold. A maintenance decision is based on a fuzzy inference system that utilizes three parameters: (i) prediction evaluation in the previous steps, (ii) predicted value of the current step, and (iii) difference between the current predicted value and the component’s failure threshold. The proposed framework will be tested on turbine blade degradation data.

  13. Support vector machine prediction model based on improved genetic algorithm

    Institute of Scientific and Technical Information of China (English)

    陈锦青; 韩延杰

    2013-01-01

    As a new machine learning method, the support vector machine (SVM) has no unified mode or standard for parameter selection. To overcome this shortcoming, the authors improve the genetic algorithm and propose a chaos cloud-based adaptive simulated annealing genetic algorithm (CCASAGA) to optimize the parameters of the support vector regression machine. The method combines chaos optimization, an adaptive control mechanism based on the cloud model, and the Metropolis criterion of simulated annealing, and adopts an elitist retention strategy to accelerate convergence. The improved CCASAGA-SVR prediction model is then used to predict the ATM cash demand of a joint-stock bank, and the GA-SVR model and the BP neural network model are introduced for comparison, confirming that the proposed model has higher prediction accuracy.

  14. When Machines Design Machines!

    DEFF Research Database (Denmark)

    2011-01-01

    Until recently we were the sole designers, alone in the driving seat making all the decisions. But, we have created a world of complexity way beyond human ability to understand, control, and govern. Machines now do more trades than humans on stock markets, they control our power, water, gas...... and food supplies, manage our elevators, microclimates, automobiles and transport systems, and manufacture almost everything. It should come as no surprise that machines are now designing machines. The chips that power our computers and mobile phones, the robots and commercial processing plants on which we...... depend, all are now largely designed by machines. So what of us - will be totally usurped, or are we looking at a new symbiosis with human and artificial intelligences combined to realise the best outcomes possible. In most respects we have no choice! Human abilities alone cannot solve any of the major...

  15. Estimation of Electrically-Evoked Knee Torque from Mechanomyography Using Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Morufu Olusola Ibitoye

    2016-07-01

    The difficulty of real-time muscle force or joint torque estimation during neuromuscular electrical stimulation (NMES) in physical therapy and exercise science has motivated recent research interest in torque estimation from other muscle characteristics. This study investigated the accuracy of a computational intelligence technique for estimating NMES-evoked knee extension torque based on the mechanomyographic (MMG) signals of contracting muscles that were recorded from eight healthy males. Simulation of the knee torque was modelled via Support Vector Regression (SVR) due to its good generalization ability in related fields. Inputs to the proposed model were MMG amplitude characteristics, the level of electrical stimulation or contraction intensity, and knee angle. The Gaussian kernel function and its optimal parameters were identified as giving the best performance measure and were applied as the SVR kernel function to build an effective knee torque estimation model. To train and test the model, the data were partitioned into training (70%) and testing (30%) subsets, respectively. The SVR estimation accuracy, based on the coefficient of determination (R²) between the actual and the estimated torque values, was up to 94% and 89% during the training and testing cases, with root mean square errors (RMSE) of 9.48 and 12.95, respectively. The knee torque estimations obtained using SVR modelling agreed well with the experimental data from an isokinetic dynamometer. These findings support the realization of a closed-loop NMES system for functional tasks using MMG as the feedback signal source and an SVR algorithm for joint torque estimation.

  16. Maximum likelihood optimal and robust Support Vector Regression with lncosh loss function.

    Science.gov (United States)

    Karal, Omer

    2017-10-01

    In this paper, a novel and continuously differentiable convex loss function based on the natural logarithm of the hyperbolic cosine function, namely lncosh loss, is introduced to obtain Support Vector Regression (SVR) models which are optimal in the maximum likelihood sense for the hyper-secant error distributions. Most of the current regression models assume that the distribution of error is Gaussian, which corresponds to the squared loss function and has helpful analytical properties such as easy computation and analysis. However, in many real world applications, most observations are subject to unknown noise distributions, so the Gaussian distribution may not be a useful choice. The developed SVR model with the parameterized lncosh loss provides a possibility of learning a loss function leading to a regression model which is maximum likelihood optimal for specific input-output data. The SVR models obtained with different parameter choices of lncosh loss with the ε-insensitiveness feature possess most of the desirable characteristics of well-known loss functions such as Vapnik's loss, the squared loss, and Huber's loss function as special cases. In other words, it is observed in the extensive simulations that the mentioned lncosh loss function is entirely controlled by a single adjustable λ parameter and, as a result, it allows switching between different losses depending on the choice of λ. The effectiveness and feasibility of the lncosh loss function are validated through a number of synthetic and real world benchmark data sets for various types of additive noise distributions. Copyright © 2017 Elsevier Ltd. All rights reserved.
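
    To make the role of the λ parameter concrete, the sketch below evaluates one plausible form of an ε-insensitive lncosh loss. The abstract does not state the exact parameterization, so the form (1/λ)·ln cosh(λ·r), applied to the part of the residual that lies outside the ε tube, is an assumption, and the function name is hypothetical.

```python
import math

def lncosh_loss(residual, lam=1.0, epsilon=0.0):
    """Assumed form: (1/lam) * ln(cosh(lam * excess)), where excess is the
    part of |residual| lying outside the epsilon-insensitive tube."""
    excess = max(0.0, abs(residual) - epsilon)
    z = lam * excess
    # Numerically stable identity for z >= 0:
    #   ln(cosh(z)) = z + ln(1 + exp(-2z)) - ln(2)
    return (z + math.log1p(math.exp(-2.0 * z)) - math.log(2.0)) / lam

# Compare the loss for the same residual under different lam values.
for lam in (0.001, 1.0, 50.0):
    print(lam, lncosh_loss(2.0, lam=lam))
```

    Under this assumed form, a small λ gives a loss close to (λ/2)·r², which is proportional to the squared loss, while a large λ gives a loss close to |r| − ln(2)/λ, which approaches the absolute (Vapnik-type) loss. Intermediate values behave like a smooth Huber-style compromise, which is consistent with the switching behaviour the abstract describes.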

  17. Multiple time series autoregressive method based on support vector regression

    Institute of Scientific and Technical Information of China (English)

    张伟; 柳先辉; 丁毅; 史德明

    2012-01-01

    Energy consumption time series involve a variety of energy sources, and the relationships between them are complicated. Most existing methods make predictions through multiple independent single time series, which ignores the dependencies between the series. To take full advantage of the associations between multiple time series and improve prediction accuracy, a vector-valued autoregressive method and a multi-task autoregressive method based on support vector regression (SVR) were proposed for multiple time series forecasting, following vector-valued function learning and multi-task learning theory. Experimental results on the energy consumption of a coking process show that the multiple time series autoregressive models built with the proposed methods achieve better prediction performance than multiple independent single time series models.

  18. ORDINAL REGRESSION FOR INFORMATION RETRIEVAL

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    This letter presents a new discriminative model for Information Retrieval (IR), referred to as the Ordinal Regression Model (ORM). ORM differs from most existing models in that it views IR as an ordinal regression problem (i.e., a ranking problem) instead of binary classification. Since the task of IR is to rank documents according to the user's information need, IR can be viewed as an ordinal regression problem. Two parameter learning algorithms for ORM are presented. One is a perceptron-based algorithm. The other is the ranking Support Vector Machine (SVM). The effectiveness of the proposed approach has been evaluated on the task of ad hoc retrieval using three English Text REtrieval Conference (TREC) sets and two Chinese TREC sets. Results show that ORM significantly outperforms the state-of-the-art language model approaches and the OKAPI system on all test sets, and that it is more appropriate to view IR as ordinal regression rather than binary classification.

  19. [Prediction model of net photosynthetic rate of ginseng under forest based on optimized parameters support vector machine].

    Science.gov (United States)

    Wu, Hai-wei; Yu, Hai-ye; Zhang, Lei

    2011-05-01

    Using K-fold cross validation method and two support vector machine functions, four kernel functions, grid-search, genetic algorithm and particle swarm optimization, the authors constructed the support vector machine model of the best penalty parameter c and the best correlation coefficient. Using information granulation technology, the authors constructed P particle and epsilon particle about those factors affecting net photosynthetic rate, and reduced these dimensions of the determinant. P particle includes the percent of visible spectrum ingredients. Epsilon particle includes leaf temperature, scattering radiation, air temperature, and so on. It is possible to obtain the best correlation coefficient among photosynthetic effective radiation, visible spectrum and individual net photosynthetic rate by this technology. The authors constructed the training set and the forecasting set including photosynthetic effective radiation, P particle and epsilon particle. The result shows that epsilon-SVR-RBF-genetic algorithm model, nu-SVR-linear-grid-search model and nu-SVR-RBF-genetic algorithm model obtain the correlation coefficient of up to 97% about the forecasting set including photosynthetic effective radiation and P particle. The penalty parameter c of nu-SVR-linear-grid-search model is the minimum, so the model's generalization ability is the best. The authors forecasted the forecasting set including photosynthetic effective radiation, P particle and epsilon particle by the model, and the correlation coefficient is up to 96%.

  20. Interpolation and extrapolation problems of multivariate regression in analytical chemistry: benchmarking the robustness on near-infrared (NIR) spectroscopy data.

    Science.gov (United States)

    Balabin, Roman M; Smirnov, Sergey V

    2012-04-07

    Modern analytical chemistry of industrial products is in need of rapid, robust, and cheap analytical methods to continuously monitor product quality parameters. For this reason, spectroscopic methods are often used to control the quality of industrial products in an on-line/in-line regime. Vibrational spectroscopy, including mid-infrared (MIR), Raman, and near-infrared (NIR), is one of the best ways to obtain information about the chemical structures and the quality coefficients of multicomponent mixtures. Together with chemometric algorithms and multivariate data analysis (MDA) methods, which were especially created for the analysis of complicated, noisy, and overlapping signals, NIR spectroscopy shows great results in terms of its accuracy, including classical prediction error, RMSEP. However, it is unclear whether the combined NIR + MDA methods are capable of dealing with much more complex interpolation or extrapolation problems that are inevitably present in real-world applications. In the current study, we try to make a rather general comparison of linear, such as partial least squares or projection to latent structures (PLS); "quasi-nonlinear", such as the polynomial version of PLS (Poly-PLS); and intrinsically non-linear, such as artificial neural networks (ANNs), support vector regression (SVR), and least-squares support vector machines (LS-SVM/LSSVM), regression methods in terms of their robustness. As a measure of robustness, we will try to estimate their accuracy when solving interpolation and extrapolation problems. Petroleum and biofuel (biodiesel) systems were chosen as representative examples of real-world samples. Six very different chemical systems that differed in complexity, composition, structure, and properties were studied; these systems were gasoline, ethanol-gasoline biofuel, diesel fuel, aromatic solutions of petroleum macromolecules, petroleum resins in benzene, and biodiesel. Eighteen different sample sets were used in total. General

  1. A Short-Term and High-Resolution System Load Forecasting Approach Using Support Vector Regression with Hybrid Parameters Optimization

    Energy Technology Data Exchange (ETDEWEB)

    Jiang, Huaiguang [National Renewable Energy Laboratory (NREL), Golden, CO (United States)

    2017-08-25

    This work proposes an approach for distribution system load forecasting, which aims to provide highly accurate short-term load forecasting with high resolution utilizing a support vector regression (SVR) based forecaster and a two-step hybrid parameter optimization method. Specifically, because the load profiles in distribution systems contain abrupt deviations, a data normalization is designed as the pretreatment for the collected historical load data. Then an SVR model is trained on the load data to forecast the future load. For better performance of SVR, a two-step hybrid optimization algorithm is proposed to determine the best parameters. In the first step of the hybrid optimization algorithm, a designed grid traverse algorithm (GTA) is used to narrow the parameter search area from a global to a local space. In the second step, based on the result of the GTA, particle swarm optimization (PSO) is used to determine the best parameters in the local parameter space. After the best parameters are determined, the SVR model is used to forecast the short-term load deviation in the distribution system.
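
    The two-step idea described in this abstract (a coarse grid pass to narrow the search region, followed by particle swarm optimization inside that region) can be sketched as follows. This is not the authors' GTA or PSO implementation: the objective `validation_error` is a hypothetical stand-in for a cross-validated SVR forecasting error over two log-scaled parameters, and all function names, bounds, and swarm settings are assumptions chosen for illustration.

```python
import random

def validation_error(log_c, log_gamma):
    # Hypothetical stand-in for cross-validated SVR error; minimum at (1.3, -2.1).
    return (log_c - 1.3) ** 2 + (log_gamma + 2.1) ** 2

def coarse_grid_search(objective, bounds, points_per_axis=7):
    """Step 1: evaluate a coarse grid and return the best node plus a local box around it."""
    (c_lo, c_hi), (g_lo, g_hi) = bounds
    c_step = (c_hi - c_lo) / (points_per_axis - 1)
    g_step = (g_hi - g_lo) / (points_per_axis - 1)
    nodes = [(c_lo + i * c_step, g_lo + j * g_step)
             for i in range(points_per_axis) for j in range(points_per_axis)]
    best = min(nodes, key=lambda node: objective(*node))
    # Local box: one grid step on each side of the best node, clipped to the global bounds.
    local_bounds = [(max(c_lo, best[0] - c_step), min(c_hi, best[0] + c_step)),
                    (max(g_lo, best[1] - g_step), min(g_hi, best[1] + g_step))]
    return best, local_bounds

def particle_swarm(objective, bounds, start, n_particles=20, n_iterations=60,
                   inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Step 2: basic PSO inside the local box; one particle starts at the grid optimum."""
    rng = random.Random(seed)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    positions[0] = list(start)
    velocities = [[0.0] * len(bounds) for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_score = [objective(*p) for p in positions]
    leader = min(range(n_particles), key=personal_score.__getitem__)
    global_best, global_score = personal_best[leader][:], personal_score[leader]
    for _ in range(n_iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (inertia * velocities[i][d]
                                    + cognitive * r1 * (personal_best[i][d] - positions[i][d])
                                    + social * r2 * (global_best[d] - positions[i][d]))
                # Move the particle and keep it inside the local box.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            score = objective(*positions[i])
            if score < personal_score[i]:
                personal_best[i], personal_score[i] = positions[i][:], score
                if score < global_score:
                    global_best, global_score = positions[i][:], score
    return global_best, global_score

global_bounds = [(-2.0, 4.0), (-5.0, 1.0)]  # assumed ranges for log10(C) and log10(gamma)
coarse_best, local_bounds = coarse_grid_search(validation_error, global_bounds)
refined_best, refined_score = particle_swarm(validation_error, local_bounds, start=coarse_best)
print(coarse_best, local_bounds)
print(refined_best, refined_score)
```

    Seeding one particle at the grid optimum is a design choice made in this sketch so that the refined result can never be worse than the coarse one; the coarse pass keeps the number of expensive model fits small, and the swarm then spends its budget only where good parameters are likely to lie.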

  2. Machine Translation

    Institute of Scientific and Technical Information of China (English)

    张严心

    2015-01-01

    As an ancillary translation tool, machine translation has long received increasing attention and has been studied in various ways by many researchers and scholars. Understanding the definition of machine translation and analysing its benefits and problems helps translators make good use of it, and helps to develop and improve machine translation systems in the future.

  3. Sustainable machining

    CERN Document Server

    2017-01-01

    This book provides an overview of current sustainable machining. Its chapters cover the concept in economic, social and environmental dimensions. It provides the reader with proper ways to handle several pollutants produced during the machining process. The book is useful at both undergraduate and postgraduate levels and is of interest to all those working with manufacturing and machining technology.

  4. Seasonal river discharge forecast in alpine catchments using snow map time series and support vector regression approach

    OpenAIRE

    Callegari, Mattia; Mazzoli, Paolo; Gregorio, Ludovica de; Notarnicola, Claudia; PETITTA Marcello; Pasolli, Luca; Seppi, Roberto; Pistocchi, Alberto

    2014-01-01

    The prediction of monthly mean discharge is critical for water resources management. Statistical methods applied on discharge time series are traditionally used for predicting this kind of slow response hydrological events. With this paper we present a Support Vector Regression (SVR) system able to predict monthly mean discharge considering discharge and snow cover extent (250 meters resolution obtained by MODIS images) time series as input. Additional meteorological and climatic variables ar...

  5. Hybrid support vector regression and autoregressive integrated moving average models improved by particle swarm optimization for property crime rates forecasting with economic indicators.

    Science.gov (United States)

    Alwee, Razana; Shamsuddin, Siti Mariyam Hj; Sallehuddin, Roselina

    2013-01-01

    Crime forecasting is an important area in the field of criminology. Linear models, such as regression and econometric models, are commonly applied in crime forecasting. However, real crime data commonly consist of both linear and nonlinear components. A single model may not be sufficient to identify all the characteristics of the data. The purpose of this study is to introduce a hybrid model that combines support vector regression (SVR) and autoregressive integrated moving average (ARIMA) to be applied in crime rate forecasting. SVR is very robust with small training data and high-dimensional problems. Meanwhile, ARIMA has the ability to model several types of time series. However, the accuracy of the SVR model depends on the values of its parameters, while ARIMA is not robust when applied to small data sets. Therefore, to overcome this problem, particle swarm optimization is used to estimate the parameters of the SVR and ARIMA models. The proposed hybrid model is used to forecast the property crime rates of the United States based on economic indicators. The experimental results show that the proposed hybrid model is able to produce more accurate forecasting results than the individual models.

  6. Hybrid Support Vector Regression and Autoregressive Integrated Moving Average Models Improved by Particle Swarm Optimization for Property Crime Rates Forecasting with Economic Indicators

    Directory of Open Access Journals (Sweden)

    Razana Alwee

    2013-01-01

    Crime forecasting is an important area in the field of criminology. Linear models, such as regression and econometric models, are commonly applied in crime forecasting. However, real crime data commonly consist of both linear and nonlinear components. A single model may not be sufficient to identify all the characteristics of the data. The purpose of this study is to introduce a hybrid model that combines support vector regression (SVR) and autoregressive integrated moving average (ARIMA) to be applied in crime rate forecasting. SVR is very robust with small training data and high-dimensional problems. Meanwhile, ARIMA has the ability to model several types of time series. However, the accuracy of the SVR model depends on the values of its parameters, while ARIMA is not robust when applied to small data sets. Therefore, to overcome this problem, particle swarm optimization is used to estimate the parameters of the SVR and ARIMA models. The proposed hybrid model is used to forecast the property crime rates of the United States based on economic indicators. The experimental results show that the proposed hybrid model is able to produce more accurate forecasting results than the individual models.

  7. Support vector regression and artificial neural network models for stability indicating analysis of mebeverine hydrochloride and sulpiride mixtures in pharmaceutical preparation: a comparative study.

    Science.gov (United States)

    Naguib, Ibrahim A; Darwish, Hany W

    2012-02-01

    A comparison between support vector regression (SVR) and Artificial Neural Networks (ANNs) multivariate regression methods is established showing the underlying algorithm for each and making a comparison between them to indicate the inherent advantages and limitations. In this paper we compare SVR to ANN with and without variable selection procedure (genetic algorithm (GA)). To project the comparison in a sensible way, the methods are used for the stability indicating quantitative analysis of mixtures of mebeverine hydrochloride and sulpiride in binary mixtures as a case study in presence of their reported impurities and degradation products (summing up to 6 components) in raw materials and pharmaceutical dosage form via handling the UV spectral data. For proper analysis, a 6 factor 5 level experimental design was established resulting in a training set of 25 mixtures containing different ratios of the interfering species. An independent test set consisting of 5 mixtures was used to validate the prediction ability of the suggested models. The proposed methods (linear SVR (without GA) and linear GA-ANN) were successfully applied to the analysis of pharmaceutical tablets containing mebeverine hydrochloride and sulpiride mixtures. The results manifest the problem of nonlinearity and how models like the SVR and ANN can handle it. The methods indicate the ability of the mentioned multivariate calibration models to deconvolute the highly overlapped UV spectra of the 6 components' mixtures, yet using cheap and easy to handle instruments like the UV spectrophotometer.

  8. A regression approach to the mapping of bio-physical characteristics of surface sediment using in situ and airborne hyperspectral acquisitions

    Science.gov (United States)

    Ibrahim, Elsy; Kim, Wonkook; Crawford, Melba; Monbaliu, Jaak

    2017-01-01

    Remote sensing has been successfully utilized to distinguish and quantify sediment properties in the intertidal environment. Classification approaches of imagery are popular and powerful yet can lead to site- and case-specific results. Such specificity creates challenges for temporal studies. Thus, this paper investigates the use of regression models to quantify sediment properties instead of classifying them. Two regression approaches, namely multiple regression (MR) and support vector regression (SVR), are used in this study for the retrieval of bio-physical variables of intertidal surface sediment of the IJzermonding, a Belgian nature reserve. In the regression analysis, mud content, chlorophyll a concentration, organic matter content, and soil moisture are estimated using radiometric variables of two airborne sensors, namely airborne hyperspectral sensor (AHS) and airborne prism experiment (APEX), and using field hyperspectral acquisitions by analytical spectral device (ASD). The performance of the two regression approaches is best for the estimation of moisture content. SVR attains the highest accuracy without feature reduction while MR achieves good results when feature reduction is carried out. Sediment property maps are successfully obtained using the models and hyperspectral imagery where SVR used with all bands achieves the best performance. The study also involves the extraction of weights identifying the contribution of each band of the images in the quantification of each sediment property when MR and principal component analysis are used.

  9. A regression approach to the mapping of bio-physical characteristics of surface sediment using in situ and airborne hyperspectral acquisitions

    Science.gov (United States)

    Ibrahim, Elsy; Kim, Wonkook; Crawford, Melba; Monbaliu, Jaak

    2017-02-01

    Remote sensing has been successfully utilized to distinguish and quantify sediment properties in the intertidal environment. Classification approaches of imagery are popular and powerful yet can lead to site- and case-specific results. Such specificity creates challenges for temporal studies. Thus, this paper investigates the use of regression models to quantify sediment properties instead of classifying them. Two regression approaches, namely multiple regression (MR) and support vector regression (SVR), are used in this study for the retrieval of bio-physical variables of intertidal surface sediment of the IJzermonding, a Belgian nature reserve. In the regression analysis, mud content, chlorophyll a concentration, organic matter content, and soil moisture are estimated using radiometric variables of two airborne sensors, namely airborne hyperspectral sensor (AHS) and airborne prism experiment (APEX), and using field hyperspectral acquisitions by analytical spectral device (ASD). The performance of the two regression approaches is best for the estimation of moisture content. SVR attains the highest accuracy without feature reduction while MR achieves good results when feature reduction is carried out. Sediment property maps are successfully obtained using the models and hyperspectral imagery where SVR used with all bands achieves the best performance. The study also involves the extraction of weights identifying the contribution of each band of the images in the quantification of each sediment property when MR and principal component analysis are used.

  10. Regression: A Bibliography.

    Science.gov (United States)

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  12. Determination of Quality Properties of Soy Sauce by Support Vector Regression Coupled with SW-NIR Spectroscopy

    Institute of Scientific and Technical Information of China (English)

    LIU Tong; BAO Chun-fang; REN Yu-lin

    2011-01-01

    Modern near-infrared (NIR) spectroscopy is a simple, efficient and nondestructive technique that has been used for chemical analysis in diverse fields. Shortwave NIR spectroscopy is also a rapid, flexible, and cost-effective method for controlling product quality in the food industry. Support vector regression coupled with shortwave NIR spectroscopy was explored for the nondestructive quantitative analysis of important quality parameters of soy sauce, including amino nitrogen content, total acid content, salt content and color ratio. In this study, support vector regression (SVR) models based on subtractive spectra and positive spectra were built and compared; the results show that the subtractive spectra performed better than the positive spectra. Meanwhile, R and RSE were determined for the original spectra and for pretreated spectra [standard normal variate (SNV), first derivative and second derivative], and the corresponding models were successfully established. The best prediction was achieved by a support vector regression model on the first-derivative-transformed dataset. In addition, the results obtained by the proposed method were compared with those of partial least squares (PLS), which showed that the generalization performance of the SVR-based model was much better than that of PLS. The results demonstrate that shortwave NIR spectroscopy combined with SVR is promising for the quality control of soy sauce.

  13. Performance Comparison Between Support Vector Regression and Artificial Neural Network for Prediction of Oil Palm Production

    Directory of Open Access Journals (Sweden)

    Mustakim Mustakim

    2016-02-01

    The largest oil palm producing region in Indonesia plays an important role in improving the welfare of society and the economy. Oil palm production in Riau Province has increased significantly in every period. To determine how production will develop over the next few years, production was predicted from time series data covering the last 8 years (2005-2013). The prediction was carried out by comparing the performance of the Support Vector Regression (SVR) method and an Artificial Neural Network (ANN). In the experiment, SVR produced a better model than ANN. This is indicated by a correlation coefficient of 95% and an MSE of 6% with the Radial Basis Function (RBF) kernel, whereas ANN produced only 74% for R2 and 9% for MSE in the 8th experiment, with 20 hidden neurons and a learning rate of 0.1. The SVR model generates predictions for the next 3 years that increase by between 3% and 6% relative to the actual data and the RBF model predictions.

  14. Forecast daily indices of solar activity, F10.7, using support vector regression method

    Institute of Scientific and Technical Information of China (English)

    Cong Huang; Dan-Dan Liu; Jing-Song Wang

    2009-01-01

    The 10.7 cm solar radio flux (F10.7), the value of the solar radio emission flux density at a wavelength of 10.7 cm, is a useful index of solar activity as a proxy for solar extreme ultraviolet radiation. It is meaningful and important to predict F10.7 values accurately for both long-term (months-years) and short-term (days) forecasting, which are often used as inputs in space weather models. This study applies a novel neural network technique, support vector regression (SVR), to forecasting daily values of F10.7. The aim of this study is to examine the feasibility of SVR in short-term F10.7 forecasting. The approach, based on SVR, reduces the dimension of feature space in the training process by using a kernel-based learning algorithm. Thus, the complexity of the calculation becomes lower and a small amount of training data will be sufficient. The time series of F10.7 from 2002 to 2006 is employed as the data set. The performance of the approach is estimated by calculating the normalized mean square error and mean absolute percentage error. It is shown that our approach can perform well using fewer training data points than the traditional neural network.

  15. Applying different independent component analysis algorithms and support vector regression for IT chain store sales forecasting.

    Science.gov (United States)

    Dai, Wensheng; Wu, Jui-Yu; Lu, Chi-Jie

    2014-01-01

    Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating a feature extraction method with a prediction tool, such as support vector regression (SVR), is a useful approach for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to various forecasting problems. Until now, however, only the basic ICA method (i.e., the temporal ICA model) has been applied to the sales forecasting problem. In this paper, we utilize three different ICA methods, namely spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA), to extract features from the sales data and compare their performance in sales forecasting for an IT chain store. Experimental results from real sales data show that the sales forecasting scheme integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data, and the extracted features can improve the prediction performance of SVR for sales forecasting.

  16. Estimation of the laser cutting operating cost by support vector regression methodology

    Science.gov (United States)

    Jović, Srđan; Radović, Aleksandar; Šarkoćević, Živče; Petković, Dalibor; Alizamir, Meysam

    2016-09-01

    Laser cutting is a popular manufacturing process utilized to cut various types of materials economically. The operating cost is affected by laser power, cutting speed, assist gas pressure, nozzle diameter and focus point position as well as the workpiece material. In this article, the process factors investigated were: laser power, cutting speed, air pressure and focal point position. The aim of this work is to relate the operating cost to the process parameters mentioned above. CO2 laser cutting of stainless steel of medical grade AISI316L has been investigated. The main goal was to analyze the operating cost through the laser power, cutting speed, air pressure, focal point position and material thickness. Since estimating the laser operating cost is a complex, non-linear task, soft computing optimization algorithms can be used. The intelligent soft computing scheme support vector regression (SVR) was implemented. The performance of the proposed estimator was confirmed with the simulation results. The SVR results are then compared with artificial neural network and genetic programming. According to the results, a greater improvement in estimation accuracy can be achieved through the SVR compared to other soft computing methodologies. The new optimization methods benefit from the soft computing capabilities of global optimization and multiobjective optimization rather than choosing a starting point by trial and error and combining multiple criteria into a single criterion.

  17. Applying Different Independent Component Analysis Algorithms and Support Vector Regression for IT Chain Store Sales Forecasting

    Directory of Open Access Journals (Sweden)

    Wensheng Dai

    2014-01-01

    Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating a feature extraction method with a prediction tool, such as support vector regression (SVR), is a useful approach for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to various forecasting problems. Until now, however, only the basic ICA method (i.e., the temporal ICA model) has been applied to the sales forecasting problem. In this paper, we utilize three different ICA methods, namely spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA), to extract features from the sales data and compare their performance in sales forecasting for an IT chain store. Experimental results from real sales data show that the sales forecasting scheme integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data, and the extracted features can improve the prediction performance of SVR for sales forecasting.

  18. Determination of glucose in plasma by dry film-based Fourier transformed-infrared spectroscopy coupled with boosting support vector regression.

    Science.gov (United States)

    Zhou, Yan-Ping; Xu, Lu; Tang, Li-Juan; Jiang, Jian-Hui; Shen, Guo-Li; Yu, Ru-Qin; Ozaki, Yukihiro

    2007-07-01

    In the present study, a dry film-based Fourier transformed-infrared (FT-IR) spectroscopic technique, coupled with boosting support vector regression (BSVR), was employed for a blood glucose assay. Potassium thiocyanate (KSCN) was used in the dry-film method as an internal standard to compensate for any film thickness variation. This technique circumvents interference from water absorption, and requires only 5 μl of sample. Moving window partial least-squares regression (MWPLSR) was used for wavenumber interval selection before multivariate modeling. By using the BSVR modeling technique, glucose in plasma could be determined over a 0.4-20 mmol/l concentration range with satisfactory accuracy. The performance of the BSVR methodology was compared with that of conventional support vector regression (SVR) as well as partial least squares (PLS). The results demonstrated that BSVR is an effective multivariate calibration tool, providing better performance than conventional PLS and SVR.

  19. Simple machines

    CERN Document Server

    Graybill, George

    2007-01-01

    Just how simple are simple machines? With our ready-to-use resource, they are simple to teach and easy to learn! Chock-full of information and activities, we begin with a look at force, motion and work, and examples of simple machines in daily life are given. With this background, we move on to different kinds of simple machines including: Levers, Inclined Planes, Wedges, Screws, Pulleys, and Wheels and Axles. An exploration of some compound machines follows, such as the can opener. Our resource is a real time-saver as all the reading passages and student activities are provided. Presented in s

  20. Prediction on the softening point of bitumen in producing by using SVR

    Institute of Scientific and Technical Information of China (English)

    蔡从中; 王桂莲; 裴军芳; 朱星键

    2011-01-01

    Based on an experimental dataset of the softening points of 30 bitumen samples under different resistances and temperatures, a support vector regression (SVR) approach with particle swarm optimization (PSO) for parameter tuning is proposed, combined with leave-one-out cross validation (LOOCV), for modeling and predicting the softening point of bitumen, and its prediction results are compared with those of multivariate linear regression (MLR). The maximum error of 2.1 ℃ predicted by SVR is much smaller than the 7.9 ℃ calculated by the MLR model. The statistical results reveal that the root mean square error (RMSE = 0.75 ℃), mean absolute error (MAE = 0.32 ℃) and mean absolute percentage error (MAPE = 0.28%) achieved by SVR-LOOCV are all smaller than those (RMSE = 3.3 ℃, MAE = 2.6 ℃ and MAPE = 2.34%) calculated via the MLR model. This study suggests that the softening point of bitumen can be forecast in a timely manner by SVR, providing accurate guidance for the production of high-quality bitumen.
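
    The error measures and the leave-one-out procedure named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the data are made up, and an ordinary least-squares line stands in for the PSO-tuned SVR model, which the abstract does not specify in detail.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept.

    Assumption: a simple stand-in for the SVR model of the abstract.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def loocv_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    y = [45.2, 47.1, 48.8, 51.3, 52.9, 55.4]
    p = loocv_predictions(x, y)
    print(rmse(y, p), mae(y, p), mape(y, p))
```

    Because every prediction comes from a model that never saw the sample being predicted, the three error figures estimate out-of-sample accuracy, which matters for a dataset as small as the 30 samples described.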

  1. Identification of impact location by using FBG based on wavelet packet feature extraction and SVR

    Institute of Scientific and Technical Information of China (English)

    芦吉云; 王帮峰; 梁大开

    2012-01-01

    A real-time monitoring system for impact loads on composites was constructed with a Fiber Bragg Grating (FBG) sensor network, and wavelet packet feature extraction and Support Vector Regression (SVR) were used to identify the impact location. For impact response signals at the same position measured by different FBG sensors, wavelet packet energy spectrum analysis shows that some specific frequency bands of the sensor signals are sensitive to the impact. The relation between impact location and wavelet energy was studied, and the sixth decomposition level wavelet packet energy was chosen as the characteristic vector of the impact location. An SVR whose tuning parameters had been optimized was used to establish the sample regression model and predict the impact location. The results show that the network testing error of the SVR is 4.81%. The research provides a practical reference for evaluating the impact performance of structures made from carbon fiber reinforced plastics.

  2. Regression analysis by example

    National Research Council Canada - National Science Library

    Chatterjee, Samprit; Hadi, Ali S

    2012-01-01

    .... The emphasis continues to be on exploratory data analysis rather than statistical theory. The coverage offers in-depth treatment of regression diagnostics, transformation, multicollinearity, logistic regression, and robust regression...

  3. Support vector regression-guided unravelling: antioxidant capacity and quantitative structure-activity relationship predict reduction and promotion effects of flavonoids on acrylamide formation

    Science.gov (United States)

    Huang, Mengmeng; Wei, Yan; Wang, Jun; Zhang, Yu

    2016-09-01

    We used the support vector regression (SVR) approach to predict and unravel the reduction/promotion effect of characteristic flavonoids on acrylamide formation under a low-moisture Maillard reaction system. Results demonstrated the reduction/promotion effects of flavonoids at addition levels of 1-10000 μmol/L. The maximal inhibition rates (51.7%, 68.8% and 26.1%) and promotion rates (57.7%, 178.8% and 27.5%) caused by flavones, flavonols and isoflavones were observed at addition levels of 100 μmol/L and 10000 μmol/L, respectively. The reduction/promotion effects were closely related to the change of trolox equivalent antioxidant capacity (ΔTEAC) and well predicted by triple ΔTEAC measurements via SVR models (R: 0.633-0.900). Flavonols exhibit stronger effects on acrylamide formation than flavones and isoflavones as well as their O-glycoside derivatives, which may be attributed to the number and position of phenolic and 3-enolic hydroxyls. The reduction/promotion effects were well predicted by using optimized quantitative structure-activity relationship (QSAR) descriptors and SVR models (R: 0.926-0.994). Compared to artificial neural network and multi-linear regression models, SVR models exhibited better fitting performance for both TEAC-dependent and QSAR descriptor-dependent prediction. These observations demonstrate that SVR models are well suited to prediction and can improve our understanding of the future use of natural antioxidants for decreasing acrylamide formation.

  4. Electric machine

    Science.gov (United States)

    El-Refaie, Ayman Mohamed Fawzi [Niskayuna, NY]; Reddy, Patel Bhageerath [Madison, WI]

    2012-07-17

    An interior permanent magnet electric machine is disclosed. The interior permanent magnet electric machine comprises a rotor comprising a plurality of radially placed magnets each having a proximal end and a distal end, wherein each magnet comprises a plurality of magnetic segments and at least one magnetic segment towards the distal end comprises a high resistivity magnetic material.

  5. A Wireless Electronic Nose System Using a Fe2O3 Gas Sensing Array and Least Squares Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Yingguo Cheng

    2011-01-01

    This paper describes the design and implementation of a wireless electronic nose (WEN) system which can detect the combustible gases methane and hydrogen (CH4/H2) online and estimate their concentrations, either singly or in mixtures. The system is composed of two wireless sensor nodes: a slave node and a master node. The former comprises a Fe2O3 gas sensing array for combustible gas detection, a digital signal processor (DSP) system for real-time sampling and processing of the sensor array data, and a wireless transceiver unit (WTU) by which the detection results can be transmitted to the master node, which is connected to a computer. A type of Fe2O3 gas sensor insensitive to humidity is developed for resistance to environmental influences. A threshold-based least squares support vector regression (LS-SVR) estimator is implemented on a DSP for classification and concentration measurements. Experimental results confirm that LS-SVR produces higher accuracy than artificial neural networks (ANNs) and a faster convergence rate than standard support vector regression (SVR). The designed WEN system effectively achieves gas mixture analysis in a real-time process.

  6. A Hybrid Sales Forecasting Scheme by Combining Independent Component Analysis with K-Means Clustering and Support Vector Regression

    Science.gov (United States)

    2014-01-01

    Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue in operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses ICA to extract hidden information from the observed sales data. The extracted features are then passed to the K-means algorithm to cluster the sales data into several disjoint clusters. Finally, SVR forecasting models are applied to each group to generate the final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting. PMID:25045738

  7. A Hybrid Sales Forecasting Scheme by Combining Independent Component Analysis with K-Means Clustering and Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Chi-Jie Lu

    2014-01-01

    Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue in operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses ICA to extract hidden information from the observed sales data. The extracted features are then passed to the K-means algorithm to cluster the sales data into several disjoint clusters. Finally, SVR forecasting models are applied to each group to generate the final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting.

  8. Estimation of residual stress in welding of dissimilar metals at nuclear power plants using cascaded support vector regression

    Energy Technology Data Exchange (ETDEWEB)

    Koo, Young Do; Yoo, Kwae Hwan; Na, Man Gyun [Dept. of Nuclear Engineering, Chosun University, Gwangju (Korea, Republic of)

    2017-06-15

    Residual stress is a critical element in determining the integrity of parts and the lifetime of welded structures. It is necessary to estimate the residual stress of a welding zone because residual stress is a major reason for the generation of primary water stress corrosion cracking in nuclear power plants. That is, it is necessary to estimate the distribution of the residual stress in welding of dissimilar metals under manifold welding conditions. In this study, a cascaded support vector regression (CSVR) model was presented to estimate the residual stress of a welding zone. The CSVR model was serially and consecutively structured in terms of SVR modules. Using numerical data obtained from finite element analysis by a subtractive clustering method, learning data that explained the characteristic behavior of the residual stress of a welding zone were selected to optimize the proposed model. The results suggest that the CSVR model yielded a better estimation performance when compared with a classic SVR model.

  9. Combining support vector regression and cellular genetic algorithm for multi-objective optimization of coal-fired utility boilers

    Energy Technology Data Exchange (ETDEWEB)

    Feng Wu; Hao Zhou; Tao Ren; Ligang Zheng; Kefa Cen [Zhejiang University, Hangzhou (China). State Key Laboratory of Clean Energy Utilization

    2009-10-15

    Support vector regression (SVR) was employed to establish mathematical models for the NOx emissions and carbon burnout of a 300 MW coal-fired utility boiler. Combined with the SVR models, the cellular genetic algorithm for multi-objective optimization (MOCell) was used for multi-objective optimization of the boiler combustion. A comparison between MOCell and the improved non-dominated sorting genetic algorithm (NSGA-II) shows that MOCell outperforms NSGA-II on this problem. Field experiments were carried out to verify the accuracy of the results obtained by MOCell, and the results were in good agreement with the measurement data. The proposed approach provides an effective tool for multi-objective optimization of coal combustion performance, and its feasibility and validity are experimentally validated. An optimization run required less than 4 s on a PC, which is suitable for online application. 19 refs., 8 figs., 2 tabs.

  10. Quantification of animal fat biodiesel in soybean biodiesel and B20 diesel blends using near infrared spectroscopy and synergy interval support vector regression.

    Science.gov (United States)

    Filgueiras, Paulo Roberto; Alves, Júlio Cesar L; Poppi, Ronei Jesus

    2014-02-01

    In this work, multivariate calibration based on partial least squares (PLS) and support vector regression (SVR) using the whole spectrum and variable selection by synergy interval (siPLS and siSVR) were applied to NIR spectra for the determination of animal fat biodiesel content in soybean biodiesel and B20 diesel blends. For all models, prediction errors, a bias test for systematic errors and a permutation test for trends in the residuals were calculated. The siSVR produced significantly lower prediction errors compared to the full spectrum methods and siPLS, with a root mean square error of prediction (RMSEP) of 0.18% (w/w) (concentration range: 0.00%-69.00% (w/w)) in the soybean biodiesel blend and 0.10% (w/w) in the B20 diesel (concentration range: 0.00%-13.80% (w/w)). Additionally, in the models for the determination of animal fat biodiesel in blends with soybean diesel, PLS and SVR showed evidence of systematic errors, and PLS/siPLS presented trends in residuals based on the permutation test. For the B20 diesel, PLS presented evidence of systematic errors, and siPLS presented trends in the residuals.

  11. Reduced Rank Regression

    DEFF Research Database (Denmark)

    Johansen, Søren

    2008-01-01

    The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating e...

  12. The Machine within the Machine

    CERN Multimedia

    Katarina Anthony

    2014-01-01

    Although Virtual Machines are widespread across CERN, you probably won't have heard of them unless you work for an experiment. Virtual machines - known as VMs - allow you to create a separate machine within your own, allowing you to run Linux on your Mac, or Windows on your Linux - whatever combination you need.   Using a CERN Virtual Machine, a Linux analysis software runs on a Macbook. When it comes to LHC data, one of the primary issues collaborations face is the diversity of computing environments among collaborators spread across the world. What if an institute cannot run the analysis software because they use different operating systems? "That's where the CernVM project comes in," says Gerardo Ganis, PH-SFT staff member and leader of the CernVM project. "We were able to respond to experimentalists' concerns by providing a virtual machine package that could be used to run experiment software. This way, no matter what hardware they have ...

  13. Machine learning in virtual screening.

    Science.gov (United States)

    Melville, James L; Burke, Edmund K; Hirst, Jonathan D

    2009-05-01

    In this review, we highlight recent applications of machine learning to virtual screening, focusing on the use of supervised techniques to train statistical learning algorithms to prioritize databases of molecules as active against a particular protein target. Both ligand-based similarity searching and structure-based docking have benefited from machine learning algorithms, including naïve Bayesian classifiers, support vector machines, neural networks, and decision trees, as well as more traditional regression techniques. Effective application of these methodologies requires an appreciation of data preparation, validation, optimization, and search methodologies, and we also survey developments in these areas.

  14. SVR-Miner: A Security Validation Rule Mining and Defect Detection Tool for Large Software (in English)

    Institute of Scientific and Technical Information of China (English)

    2011-01-01

    For various reasons, many of the security programming rules applicable to specific software have not been recorded in official documents, and hence can hardly be employed by static analysis tools for detection. In this paper, we propose a new approach, named SVR-Miner (Security Validation Rules Miner), which uses a frequent sequence mining technique [1-4] to automatically infer implicit security validation rules from large software code written in the C programming language. Different from the past works in this area, SVR...

  15. Machine Learning

    CERN Document Server

    CERN. Geneva

    2017-01-01

    Machine learning, which builds on ideas in computer science, statistics, and optimization, focuses on developing algorithms to identify patterns and regularities in data, and using these learned patterns to make predictions on new observations. Boosted by its industrial and commercial applications, the field of machine learning is quickly evolving and expanding. Recent advances have seen great success in the realms of computer vision, natural language processing, and broadly in data science. Many of these techniques have already been applied in particle physics, for instance for particle identification, detector monitoring, and the optimization of computer resources. Modern machine learning approaches, such as deep learning, are only just beginning to be applied to the analysis of High Energy Physics data to approach more and more complex problems. These classes will review the framework behind machine learning and discuss recent developments in the field.

  16. Regression models for near-infrared measurement of subcutaneous adipose tissue thickness.

    Science.gov (United States)

    Wang, Yu; Hao, Dongmei; Shi, Jingbin; Yang, Zeqiang; Jin, Liu; Zhang, Song; Yang, Yimin; Bin, Guangyu; Zeng, Yanjun; Zheng, Dingchang

    2016-07-01

    Obesity is often associated with the risks of diabetes and cardiovascular disease, and there is a need to measure subcutaneous adipose tissue (SAT) thickness for acquiring the distribution of body fat. The present study aimed to develop and evaluate different model-based methods for SAT thickness measurement using an SATmeter developed in our laboratory. Near-infrared signals backscattered from the body surfaces from 40 subjects at 20 body sites each were recorded. Linear regression (LR) and support vector regression (SVR) models were established to predict SAT thickness on different body sites. The measurement accuracy was evaluated by ultrasound, and compared with results from a mechanical skinfold caliper (MSC) and a body composition balance monitor (BCBM). The results showed that both LR- and SVR-based measurement produced better accuracy than MSC and BCBM. It was also concluded that by using regression models specifically designed for certain parts of human body, higher measurement accuracy could be achieved than using a general model for the whole body. Our results demonstrated that the SATmeter is a feasible method, which can be applied at home and in the community due to its portability and convenience.

  17. Downscaling of MODIS One Kilometer Evapotranspiration Using Landsat-8 Data and Machine Learning Approaches

    Directory of Open Access Journals (Sweden)

    Yinghai Ke

    2016-03-01

    This study presented a MODIS 8-day 1 km evapotranspiration (ET) downscaling method based on Landsat 8 data (30 m) and machine learning approaches. Eleven indicators including albedo, land surface temperature (LST), and vegetation indices (VIs) derived from Landsat 8 data were first upscaled to 1 km resolution. Machine learning algorithms including Support Vector Regression (SVR), Cubist, and Random Forest (RF) were used to model the relationship between the Landsat indicators and MODIS 8-day 1 km ET. The models were then used to predict 30 m ET based on Landsat 8 indicators. A total of thirty-two pairs of Landsat 8 images/MODIS ET data were evaluated at four study sites, including two in the United States and two in South Korea. Among the three models, RF produced the lowest error, with relative Root Mean Square Error (rRMSE) less than 20%. Vegetation greenness related indicators such as Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI), Soil Adjusted Vegetation Index (SAVI), and vegetation moisture related indicators such as Normalized Difference Infrared Index - Landsat 8 OLI band 7 (NDIIb7) and Normalized Difference Water Index (NDWI) were the five most important features used in the RF model. Temperature-based indicators were less important than vegetation greenness and moisture-related indicators because LST could have considerable variation during each 8-day period. The predicted Landsat downscaled ET had good overall agreement with MODIS ET (average rRMSE = 22%) and showed a similar temporal trend to MODIS ET. Compared to the MODIS ET product, the downscaled product demonstrated more spatial details, and had better agreement with in situ ET observations (R2 = 0.56). However, we found that the accuracy of MODIS ET was the main control factor of the accuracy of the downscaled product. Improved coarse-resolution ET estimation would result in better finer-resolution estimation. This study proved the potential of using machine learning

  18. Online monitoring and control of particle size in the grinding process using least square support vector regression and resilient back propagation neural network.

    Science.gov (United States)

    Pani, Ajaya Kumar; Mohanta, Hare Krishna

    2015-05-01

    Particle size soft sensing in cement mills is very helpful in maintaining the desired cement fineness, or Blaine. Despite the growing use of vertical roller mills (VRM) for clinker grinding, very little research is available on VRM modeling. This article reports the design of three types of feed forward neural network models and a least square support vector regression (LS-SVR) model of a VRM for online monitoring of cement fineness, based on mill data collected from a cement plant. In the data pre-processing step, a comparative study of various outlier detection algorithms has been performed. Subsequently, for model development, the advantage of algorithm-based data splitting over random selection is presented. The training data set obtained by use of the Kennard-Stone maximal intra distance criterion (CADEX algorithm) was used for development of LS-SVR, back propagation neural network, radial basis function neural network and generalized regression neural network models. Simulation results show that the resilient back propagation model performs better than the RBF network, the regression network and the LS-SVR model. Model implementation has been done in the SIMULINK platform, showing the online detection of abnormal data and real time estimation of cement Blaine from knowledge of the input variables. Finally, a closed loop study shows how the model can be effectively utilized for maintaining cement fineness at the desired value.
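
    The algorithm-based data splitting named in this abstract can be illustrated with a minimal sketch of Kennard-Stone selection. It assumes squared Euclidean distance and plain Python lists; the function name `kennard_stone` and the toy data are illustrative only and are not taken from the article.

```python
def kennard_stone(points, n_select):
    """Return indices of n_select points chosen by Kennard-Stone selection.

    Assumes len(points) >= 2 and n_select >= 2 (illustrative sketch only).
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    n = len(points)
    # Seed the selection with the two points that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: sq_dist(points[ij[0]], points[ij[1]]),
    )
    selected = [first, second]
    remaining = set(range(n)) - {first, second}

    while len(selected) < n_select and remaining:
        # Add the candidate whose nearest already-selected point is farthest away.
        best = max(
            sorted(remaining),
            key=lambda i: min(sq_dist(points[i], points[s]) for s in selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


samples = [[0.0], [1.0], [2.0], [10.0]]
training_indices = kennard_stone(samples, 3)
```

    The selected indices would form the training set and the rest the test set. Unlike random selection, the choice is deterministic and spreads the training samples across the input space.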

  19. Regression analysis by example

    CERN Document Server

    Chatterjee, Samprit

    2012-01-01

    Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

  20. Unitary Response Regression Models

    Science.gov (United States)

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  1. Flexible survival regression modelling

    DEFF Research Database (Denmark)

    Cortese, Giuliana; Scheike, Thomas H; Martinussen, Torben

    2009-01-01

    Regression analysis of survival data, and more generally event history data, is typically based on Cox's regression model. We here review some recent methodology, focusing on the limitations of Cox's regression model. The key limitation is that the model is not well suited to represent time-varyi...

  2. Quantile Regression Methods

    DEFF Research Database (Denmark)

    Fitzenberger, Bernd; Wilke, Ralf Andreas

    2015-01-01

    Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights by m...... treatment of the topic is based on the perspective of applied researchers using quantile regression in their empirical work....

  3. New approach to training support vector machine

    Institute of Scientific and Technical Information of China (English)

    Tang Faming; Chen Mianyun; Wang Zhongdong

    2006-01-01

    The support vector machine has become an increasingly popular tool for machine learning tasks involving classification, regression or novelty detection. Training a support vector machine requires the solution of a very large quadratic programming problem. Traditional optimization methods cannot be directly applied due to memory restrictions. Up to now, several approaches exist for circumventing the above shortcomings and work well. Another learning algorithm, particle swarm optimization, is introduced for training SVMs. The method is tested on UCI datasets.

  4. Combining support vector regression and ant colony optimization to reduce NOx emissions in coal-fired utility boilers

    Energy Technology Data Exchange (ETDEWEB)

    Ligang Zheng; Hao Zhou; Chunlin Wang; Kefa Cen [Zhejiang University, Hangzhou (China). State Key Laboratory of Clean Energy Utilization

    2008-03-15

    Combustion optimization has recently demonstrated its potential to reduce NOx emissions in high capacity coal-fired utility boilers. In the present study, support vector regression (SVR), as well as artificial neural networks (ANN), was proposed to model the relationship between NOx emissions and operating parameters of a 300 MW coal-fired utility boiler. The predicted NOx emissions from the SVR model, by comparing with that of the ANN-based model, showed better agreement with the values obtained in the experimental tests on this boiler operated at different loads and various other operating parameters. The mean modeling error and the correlation factor were 1.58% and 0.94, respectively. Then, the combination of the SVR model with ant colony optimization (ACO) to reduce NOx emissions was presented in detail. The experimental results showed that the proposed approach can effectively reduce NOx emissions from the coal-fired utility boiler by about 18.69% (65 ppm). A time period of less than 6 min was required for NOx emissions modeling, and 2 min was required for a run of optimization under a PC system. The computing times are suitable for the online application of the proposed method to actual power plants. 37 refs., 8 figs., 3 tabs.

  5. A robust static decoupling algorithm for 3-axis force sensors based on coupling error model and ε-SVR.

    Science.gov (United States)

    Ma, Junqing; Song, Aiguo; Xiao, Jing

    2012-10-29

    Coupling errors are major threats to the accuracy of 3-axis force sensors. Design of decoupling algorithms is a challenging topic due to the uncertainty of coupling errors. The conventional nonlinear decoupling algorithms by a standard Neural Network (NN) are sometimes unstable due to overfitting. In order to avoid overfitting and minimize the negative effect of random noises and gross errors in calibration data, we propose a novel nonlinear static decoupling algorithm based on the establishment of a coupling error model. Instead of regarding the whole system as a black box in conventional algorithm, the coupling error model is designed by the principle of coupling errors, in which the nonlinear relationships between forces and coupling errors in each dimension are calculated separately. Six separate Support Vector Regressions (SVRs) are employed for their ability to perform adaptive, nonlinear data fitting. The decoupling performance of the proposed algorithm is compared with the conventional method by utilizing obtained data from the static calibration experiment of a 3-axis force sensor. Experimental results show that the proposed decoupling algorithm gives more robust performance with high efficiency and decoupling accuracy, and can thus be potentially applied to the decoupling application of 3-axis force sensors.

  6. Regression for economics

    CERN Document Server

    Naghshpour, Shahdad

    2012-01-01

    Regression analysis is the most commonly used statistical method in the world. Although few would characterize this technique as simple, regression is in fact both simple and elegant. The complexity that many attribute to regression analysis is often a reflection of their lack of familiarity with the language of mathematics. But regression analysis can be understood even without a mastery of sophisticated mathematical concepts. This book provides the foundation and will help demystify regression analysis using examples from economics and with real data to show the applications of the method. T

  7. Machine Learning

    Energy Technology Data Exchange (ETDEWEB)

    Chikkagoudar, Satish; Chatterjee, Samrat; Thomas, Dennis G.; Carroll, Thomas E.; Muller, George

    2017-04-21

    The absence of a robust and unified theory of cyber dynamics presents challenges and opportunities for using machine learning based data-driven approaches to further the understanding of the behavior of such complex systems. Analysts can also use machine learning approaches to gain operational insights. In order to be operationally beneficial, cybersecurity machine learning based models need to have the ability to: (1) represent a real-world system, (2) infer system properties, and (3) learn and adapt based on expert knowledge and observations. Probabilistic models and Probabilistic graphical models provide these necessary properties and are further explored in this chapter. Bayesian Networks and Hidden Markov Models are introduced as an example of a widely used data driven classification/modeling strategy.

  8. The Improved Relevance Voxel Machine

    DEFF Research Database (Denmark)

    Ganz, Melanie; Sabuncu, Mert; Van Leemput, Koen

    The concept of sparse Bayesian learning has received much attention in the machine learning literature as a means of achieving parsimonious representations of features used in regression and classification. It is an important family of algorithms for sparse signal recovery and compressed sensing...

  9. Machine testing

    DEFF Research Database (Denmark)

    De Chiffre, Leonardo

    This document is used in connection with a laboratory exercise of 3 hours duration as a part of the course GEOMETRICAL METROLOGY AND MACHINE TESTING. The exercise includes a series of tests carried out by the student on a conventional and a numerically controlled lathe, respectively. This document...

  10. Representational Machines

    DEFF Research Database (Denmark)

    Petersson, Dag; Dahlgren, Anna; Vestberg, Nina Lager

    to the enterprises of the medium. This is the subject of Representational Machines: How photography enlists the workings of institutional technologies in search of establishing new iconic and social spaces. Together, the contributions to this edited volume span historical epochs, social environments, technological...

  11. Prognostics of Lithium-Ion Batteries Based on Battery Performance Analysis and Flexible Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Shuai Wang

    2014-10-01

    Accurate prediction of the remaining useful life (RUL) of lithium-ion batteries is important for battery management systems. Traditional empirical data-driven approaches for RUL prediction usually require multidimensional physical characteristics including the current, voltage, usage duration, battery temperature, and ambient temperature. From a capacity fading analysis of lithium-ion batteries, it is found that the energy efficiency and battery working temperature are closely related to the capacity degradation, which account for all performance metrics of lithium-ion batteries with regard to the RUL and the relationships between some performance metrics. Thus, we devise a non-iterative prediction model based on flexible support vector regression (F-SVR) and an iterative multi-step prediction model based on support vector regression (SVR) using the energy efficiency and battery working temperature as input physical characteristics. The experimental results show that the proposed prognostic models have high prediction accuracy by using fewer dimensions for the input data than the traditional empirical models.

  12. Autistic epileptiform regression.

    Science.gov (United States)

    Canitano, Roberto; Zappella, Michele

    2006-01-01

    Autistic regression is a well-known condition that occurs in one third of children with pervasive developmental disorders, who, after normal development in the first year of life, undergo a global regression during the second year that encompasses language, social skills and play. In a portion of these subjects, epileptiform abnormalities are present with or without seizures, resembling, in some respects, other epileptiform regressions of language and behaviour such as Landau-Kleffner syndrome. In these cases, for a more accurate definition of the clinical entity, the term autistic epileptiform regression has been suggested. As in other epileptic syndromes with regression, the relationships between EEG abnormalities, language and behaviour in autism are still unclear. We describe two cases of autistic epileptiform regression selected from a larger group of children with autistic spectrum disorders, with the aim of discussing the clinical features of the condition, the therapeutic approach and the outcome.

  13. Scaled Sparse Linear Regression

    CERN Document Server

    Sun, Tingni

    2011-01-01

    Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. It chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual squares and scaling the penalty in proportion to the estimated noise level. The iterative algorithm costs nearly nothing beyond the computation of a path of the sparse regression estimator for penalty levels above a threshold. For the scaled Lasso, the algorithm is a gradient descent in a convex minimization of a penalized joint loss function for the regression coefficients and noise level. Under mild regularity conditions, we prove that the method yields simultaneously an estimator for the noise level and an estimated coefficient vector in the Lasso path satisfying certain oracle inequalities for the estimation of the noise level, prediction, and the estimation of regression coefficients. These oracle inequalities provide sufficient conditions for the consistency and asymptotic...
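
    The alternating scheme described in this abstract (estimate the noise level from the mean squared residual, then rescale the penalty in proportion to it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the inner Lasso solver is plain coordinate descent, and the names `lasso_cd`, `scaled_lasso`, `lam0` and the toy data are hypothetical.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(X, y, lam, beta, sweeps=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = list(beta)
    residual = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes beta[j].
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta, residual


def scaled_lasso(X, y, lam0, max_iter=50, tol=1e-8):
    """Alternate between a Lasso fit at penalty sigma * lam0 and a noise update."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    sigma = math.sqrt(sum(v * v for v in y) / n)  # start from the null model
    for _ in range(max_iter):
        beta, residual = lasso_cd(X, y, sigma * lam0, beta)
        new_sigma = math.sqrt(sum(r * r for r in residual) / n)
        converged = abs(new_sigma - sigma) < tol
        sigma = new_sigma
        if converged:
            break
    return beta, sigma
```

    Warm-starting each Lasso fit from the previous coefficients keeps the extra cost of the noise-level iteration small, which is in the spirit of the abstract's remark that the iteration costs little beyond computing the Lasso path.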

  14. Prediction of Dynamical Systems by Symbolic Regression

    CERN Document Server

    Quade, Markus; Shafi, Kamran; Niven, Robert K; Noack, Bernd R

    2016-01-01

    We study the modeling and prediction of dynamical systems based on conventional models derived from measurements. Such algorithms are highly desirable in situations where the underlying dynamics are hard to model from physical principles or simplified models need to be found. We focus on symbolic regression methods as a part of machine learning. These algorithms are capable of learning an analytically tractable model from data, a highly valuable property. Symbolic regression methods can be considered as generalized regression methods. We investigate two particular algorithms, the so-called fast function extraction which is a generalized linear regression algorithm, and genetic programming which is a very general method. Both are able to combine functions in a certain way such that a good model for the prediction of the temporal evolution of a dynamical system can be identified. We illustrate the algorithms by finding a prediction for the evolution of a harmonic oscillator based on measurements, by detecting a...

  15. Rolling Regressions with Stata

    OpenAIRE

    Kit Baum

    2004-01-01

    This talk will describe some work underway to add a "rolling regression" capability to Stata's suite of time series features. Although commands such as "statsby" permit analysis of non-overlapping subsamples in the time domain, they are not suited to the analysis of overlapping (e.g. "moving window") samples. Both moving-window and widening-window techniques are often used to judge the stability of time series regression relationships. We will present an implementation of a rolling regression...
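
    The difference between moving-window and widening-window estimation can be illustrated with a short sketch. It is written in Python rather than Stata, uses a single regressor, and does not reproduce the implementation described in the talk; the function names are hypothetical.

```python
def ols_fit(x, y):
    """Return (slope, intercept) of a simple least-squares line.

    Raises ZeroDivisionError if every x in the sample is identical.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rolling_regression(x, y, window):
    """Fit one regression per overlapping window of fixed length."""
    return [
        ols_fit(x[start:start + window], y[start:start + window])
        for start in range(len(x) - window + 1)
    ]


def widening_regression(x, y, min_window):
    """Fit one regression per sample that grows from the first observation."""
    return [ols_fit(x[:end], y[:end]) for end in range(min_window, len(x) + 1)]


t = [0.0, 1.0, 2.0, 3.0, 4.0]
series = [1.0, 3.0, 5.0, 7.0, 9.0]
moving = rolling_regression(t, series, window=3)
widening = widening_regression(t, series, min_window=3)
```

    Plotting the sequence of slopes from either function against time is one common way to judge whether a regression relationship is stable over the sample.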

  16. Unbiased Quasi-regression

    Institute of Scientific and Technical Information of China (English)

    Guijun YANG; Lu LIN; Runchu ZHANG

    2007-01-01

    Quasi-regression, motivated by problems arising in computer experiments, focuses mainly on speeding up evaluation. However, its theoretical properties have not been explored systematically. This paper shows that quasi-regression is unbiased, strongly convergent and asymptotically normal for parameter estimation, but biased for curve fitting. Furthermore, a new method called unbiased quasi-regression is proposed. In addition to retaining the above asymptotic behaviors for parameter estimation, unbiased quasi-regression is unbiased for curve fitting.

  17. Introduction to regression graphics

    CERN Document Server

    Cook, R Dennis

    2009-01-01

    Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava

  18. Applied linear regression

    CERN Document Server

    Weisberg, Sanford

    2005-01-01

    Master linear regression techniques with a new edition of a classic text Reviews of the Second Edition: "I found it enjoyable reading and so full of interesting material that even the well-informed reader will probably find something new . . . a necessity for all of those who do linear regression." -Technometrics, February 1987 "Overall, I feel that the book is a valuable addition to the now considerable list of texts on applied linear regression. It should be a strong contender as the leading text for a first serious course in regression analysis." -American Scientist, May-June 1987

  19. Application of Robust Support Vector Regression in Financial Time Sequence Prediction

    Institute of Scientific and Technical Information of China (English)

    王快妮; 钟萍; 赵耀红

    2011-01-01

    To address the sensitivity of the standard Support Vector Machine (SVM) to noise and outliers, this paper presents a robust Support Vector Regression (SVR) model based on an asymmetric ramp loss function, which sets an upper bound on the loss caused by noise and outliers. The concave-convex procedure is employed to transform the associated non-convex optimization problem into a convex one, and a Newton method is introduced to solve the robust model. Numerical experiments on the closing prices of the Shanghai Stock Index and Hong Kong's Hang Seng Index show that the model can reduce the influence of noise and outliers to a certain extent, thereby increasing prediction accuracy and reducing downside risk.

  20. Adding machine and calculating machine

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    In 1642 the French mathematician Blaise Pascal (1623-1662) invented a machine that could add and subtract. It had wheels that each had 1 to 10 marked off along its circumference. When the wheel at the right, representing units, made one complete circle, it engaged the wheel to its left, representing tens, and moved it forward one notch.
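
    The carry mechanism described here can be mimicked in a few lines. The sketch below is an illustration only: it assumes each wheel shows a digit from 0 to 9 and that a carry off the leftmost wheel is simply lost; the function name `add_on_wheels` is hypothetical.

```python
def add_on_wheels(wheels, amount):
    """Add a non-negative integer to a register of decimal wheels.

    wheels[0] is the units wheel, wheels[1] the tens wheel, and so on.
    Returns a new list; a carry past the last wheel is discarded.
    """
    result = list(wheels)
    position = 0
    carry = amount
    while carry and position < len(result):
        total = result[position] + carry
        result[position] = total % 10  # digit where this wheel comes to rest
        carry = total // 10            # complete turns passed to the wheel on the left
        position += 1
    return result


register = add_on_wheels([7, 9, 0], 5)  # 97 + 5, units wheel listed first
```

    Each complete turn of one wheel advances its left neighbour by one notch, which is exactly the integer division and remainder in the loop.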

  1. Research on COP Prediction Model of Centrifugal Chiller Based on SVR

    Institute of Scientific and Technical Information of China (English)

    蔡盼盼; 周璇; 李利文

    2015-01-01

    Establishing a COP model of a centrifugal chiller is of great significance to its energy efficiency analysis and optimal control. Because the structure of a centrifugal chiller is complex and its operating energy efficiency is affected by many factors, it is difficult to build a mechanistic model, whereas support vector regression handles nonlinear, high-dimensional problems well. In this paper, a prediction model of centrifugal chiller operating energy efficiency is proposed based on Support Vector Regression, whose model parameters are optimized by a Particle Swarm Optimization (PSO) algorithm. The model was verified on a centrifugal chiller installed in a shopping mall, with mean relative error (MRE) and root mean squared error (RMSE) adopted to evaluate prediction accuracy. The results showed that, relative to a BP neural network model, the PSO-optimized SVR model improved MRE by 37.93% and RMSE by 28.81%. This model can provide a theoretical basis for centrifugal chiller energy efficiency analysis, fault detection and diagnosis, and optimal control.
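
    The two accuracy measures named in this abstract, mean relative error (MRE) and root mean squared error (RMSE), can be computed as in the sketch below. The formulas are the standard definitions and are assumed here, since the abstract does not spell them out; the sample values are invented for illustration and are not the paper's data.

```python
import math


def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / |actual|; actual values must be non-zero."""
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared prediction error."""
    return math.sqrt(
        sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Hypothetical COP values, used only to exercise the functions.
measured_cop = [4.0, 5.0]
predicted_cop = [5.0, 4.0]
mre = mean_relative_error(measured_cop, predicted_cop)
rmse = root_mean_squared_error(measured_cop, predicted_cop)
```

    Both measures are errors, so lower is better; a model "improves" on another when its MRE or RMSE is smaller.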

  2. Morse–Smale Regression

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Samuel [Univ. of Utah, Salt Lake City, UT (United States); Rubel, Oliver [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Bremer, Peer -Timo [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Whitaker, Ross T. [Univ. of Utah, Salt Lake City, UT (United States)

    2012-01-19

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.

  3. Forecasting hysteresis behaviours of magnetorheological elastomer base isolator utilizing a hybrid model based on support vector regression and improved particle swarm optimization

    Science.gov (United States)

    Yu, Yang; Li, Yancheng; Li, Jianchun

    2015-03-01

    Due to its inherent hysteretic characteristics, the main challenge for the application of a magnetorheological elastomer- (MRE) based isolator is the exploitation of the accurate model, which could fully describe its unique behaviour. This paper proposes a nonparametric model for a MRE-based isolator based on support vector regression (SVR). The trained identification model is to forecast the shear force of the MRE-based isolator online; thus, the dynamic response from the MRE-based isolator can be well captured. In order to improve the forecast capacity of the model, a type of improved particle swarm optimization (IPSO) is employed to optimize the parameters in SVR. Eventually, the trained model is applied to the MRE-based isolator modelling with testing data. The results indicate that the proposed hybrid model has a better generalization capacity and better recognition accuracy than other conventional models, and it is an effective and suitable approach for forecasting the behaviours of a MRE-based isolator.

  4. Seasonal River Discharge Forecasting Using Support Vector Regression: A Case Study in the Italian Alps

    Directory of Open Access Journals (Sweden)

    Mattia Callegari

    2015-05-01

    In this contribution we analyze the performance of a monthly river discharge forecasting model with a Support Vector Regression (SVR) technique in a European alpine area. We considered as predictors the discharges of the antecedent months, snow-covered area (SCA), and meteorological and climatic variables for 14 catchments in South Tyrol (Northern Italy), as well as the long-term average discharge of the month of prediction, also regarded as a benchmark. Forecasts at a six-month lead time tend to perform no better than the benchmark, with an average 33% relative root mean square error (RMSE%) on test samples. However, at one month lead time, RMSE% was 22%, a non-negligible improvement over the benchmark; moreover, the SVR model reduces the frequency of higher errors associated with anomalous months. Predictions with a lead time of three months show an intermediate performance between those at one and six months lead time. Among the considered predictors, SCA alone reduces RMSE% to 6% and 5% compared to using monthly discharges only, for a lead time equal to one and three months, respectively, whereas meteorological parameters bring only minor improvements. The model also outperformed a simpler linear autoregressive model, and yielded the lowest volume error in forecasting with one month lead time, while at longer lead times the differences compared to the benchmarks are negligible. Our results suggest that although an SVR model may deliver better forecasts than its simpler linear alternatives, long lead-time hydrological forecasting in Alpine catchments remains a challenge. Catchment state variables may play a bigger role than catchment input variables; hence a focus on characterizing seasonal catchment storage, rather than seasonal weather forecasting, could be key for improving our predictive capacity.

  5. Regression to Causality

    DEFF Research Database (Denmark)

    Bordacconi, Mats Joe; Larsen, Martin Vinæs

    2014-01-01

    Humans are fundamentally primed for making causal attributions based on correlations. This implies that researchers must be careful to present their results in a manner that inhibits unwarranted causal attribution. In this paper, we present the results of an experiment that suggests regression...... models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results...... of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...

  6. Genesis machines

    CERN Document Server

    Amos, Martyn

    2014-01-01

    Silicon chips are out. Today's scientists are using real, wet, squishy, living biology to build the next generation of computers. Cells, gels and DNA strands are the 'wetware' of the twenty-first century. Much smaller and more intelligent, these organic computers open up revolutionary possibilities. Tracing the history of computing and revealing a brave new world to come, Genesis Machines describes how this new technology will change the way we think not just about computers - but about life itself.

  7. Boosted beta regression.

    Directory of Open Access Journals (Sweden)

    Matthias Schmid

    Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.

  8. Applied logistic regression

    CERN Document Server

    Hosmer, David W; Sturdivant, Rodney X

    2013-01-01

     A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-

  9. Applied linear regression

    CERN Document Server

    Weisberg, Sanford

    2013-01-01

    Praise for the Third Edition "...this is an excellent book which could easily be used as a course text..." -International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus

  10. Vehicle Travel Time Predication based on Multiple Kernel Regression

    Directory of Open Access Journals (Sweden)

    Wenjing Xu

    2014-07-01

    With the rapid development of transportation and the logistics economy, vehicle travel time prediction and planning have become an important topic in logistics. Travel time prediction, which is indispensable for traffic guidance, has become a key issue for researchers in this field. At present, travel time prediction is mainly short-term, and the prediction methods include artificial neural networks, the Kalman filter and support vector regression (SVR). However, these algorithms still have shortcomings, such as high computational complexity and slow convergence. This paper exploits the nonlinear prediction ability of multiple kernel learning regression (MKLR) and applies MKLR to vehicle travel time prediction for logistics planning. The method for vehicle travel time prediction includes the following steps: (1) preprocessing historical data; (2) selecting an appropriate kernel function, training on the historical data and performing analysis; (3) predicting the vehicle travel time based on the trained model. The experimental results show that, in a comparison of different prediction methods, the vehicle travel time prediction method proposed in this paper achieves higher accuracy than the other methods. They also illustrate the feasibility and effectiveness of the proposed prediction method.

  11. Simulating Turing machines on Maurer machines

    NARCIS (Netherlands)

    Bergstra, J.A.; Middelburg, C.A.

    2008-01-01

    In a previous paper, we used Maurer machines to model and analyse micro-architectures. In the current paper, we investigate the connections between Turing machines and Maurer machines with the purpose to gain an insight into computability issues relating to Maurer machines. We introduce ways to

  12. Environmentally Friendly Machining

    CERN Document Server

    Dixit, U S; Davim, J Paulo

    2012-01-01

    Environment-Friendly Machining provides an in-depth overview of environmentally-friendly machining processes, covering numerous different types of machining in order to identify which practice is the most environmentally sustainable. The book discusses three systems at length: machining with minimal cutting fluid, air-cooled machining and dry machining. Also covered is a way to conserve energy during machining processes, along with useful data and detailed descriptions for developing and utilizing the most efficient modern machining tools. Researchers and engineers looking for sustainable machining solutions will find Environment-Friendly Machining to be a useful volume.

  13. Transductive Ordinal Regression

    CERN Document Server

    Seah, Chun-Wei; Ong, Yew-Soon

    2011-01-01

    Ordinal regression is commonly formulated as a multi-class problem with ordinal constraints. The challenge of designing accurate classifiers for ordinal regression generally increases with the number of classes involved, due to the large number of labeled patterns that are needed. Ordinal class labels, however, are often costly to calibrate or difficult to obtain. Unlabeled patterns, on the other hand, often exist in much greater abundance and are freely available. To benefit from the abundance of unlabeled patterns, we present a novel transductive learning paradigm for ordinal regression in this paper, namely Transductive Ordinal Regression (TOR). The key challenge of the present study lies in the precise and simultaneous estimation of both the ordinal class labels of the unlabeled data and the decision functions of the ordinal classes. The core elements of the proposed TOR include an objective function that caters to several commonly used loss functions cast in a transductive setting...

  14. Nonparametric Predictive Regression

    OpenAIRE

    Ioannis Kasparis; Elena Andreou; Phillips, Peter C.B.

    2012-01-01

    A unifying framework for inference is developed in predictive regressions where the predictor has unknown integration properties and may be stationary or nonstationary. Two easily implemented nonparametric F-tests are proposed. The test statistics are related to those of Kasparis and Phillips (2012) and are obtained by kernel regression. The limit distribution of these predictive tests holds for a wide range of predictors including stationary as well as non-stationary fractional and near unit...

  15. Machine Transliteration

    CERN Document Server

    Knight, K; Knight, Kevin; Graehl, Jonathan

    1997-01-01

    It is challenging to translate names and technical terms across languages with different alphabets and sound inventories. These items are commonly transliterated, i.e., replaced with approximate phonetic equivalents. For example, "computer" in English comes out as "konpyuutaa" in Japanese. Translating such items from Japanese back to English is even more challenging, and of practical interest, as transliterated items make up the bulk of text phrases not found in bilingual dictionaries. We describe and evaluate a method for performing backwards transliterations by machine. This method uses a generative model, incorporating several distinct stages in the transliteration process.

  16. Regression Verification Using Impact Summaries

    Science.gov (United States)

    Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana

    2013-01-01

    evaluation of our regression verification technique shows that our approach is capable of leveraging similarities between program versions to reduce the size of the queries and the time required to check for logical equivalence. The main contributions of this work are: - A regression verification technique to generate impact summaries that can be checked for functional equivalence using an off-the-shelf decision procedure. - A proof that our approach is sound and complete with respect to the depth bound of symbolic execution. - An implementation of our technique using the LLVM compiler infrastructure, the KLEE Symbolic Virtual Machine [4], and a variety of Satisfiability Modulo Theory (SMT) solvers, e.g., STP [7] and Z3 [6]. - An empirical evaluation on a set of C artifacts which shows that the use of impact summaries can reduce the cost of regression verification.

  17. Machine Protection

    CERN Document Server

    Schmidt, R

    2014-01-01

    The protection of accelerator equipment is as old as accelerator technology and was for many years related to high-power equipment. Examples are the protection of powering equipment from overheating (magnets, power converters, high-current cables), of superconducting magnets from damage after a quench and of klystrons. The protection of equipment from beam accidents is more recent. It is related to the increasing beam power of high-power proton accelerators such as ISIS, SNS, ESS and the PSI cyclotron, to the emission of synchrotron light by electron–positron accelerators and FELs, and to the increase of energy stored in the beam (in particular for hadron colliders such as LHC). Designing a machine protection system requires an excellent understanding of accelerator physics and operation to anticipate possible failures that could lead to damage. Machine protection includes beam and equipment monitoring, a system to safely stop beam operation (e.g. dumping the beam or stopping the beam at low energy) and an ...

  18. Pricing Jump-Diffusion Currency Options with Support Vector Regression

    Institute of Scientific and Technical Information of China (English)

    王平; 王垣苏; 黄运成

    2011-01-01

    The currency option market has the highest degree of liquidity in comparison with other markets. It is important to use a reasonable option pricing approach in order to use currency options properly. Existing literature has adopted parametric and nonparametric option pricing approaches to help improve the return of currency option investment. A new model based on the merits of both parametric and nonparametric methods in currency option pricing practice is constructed to provide insight into foreign currency option pricing. The first part reviews the jump-diffusion model, a parametric method. Sudden changes in the financial market can lead to volatility in currency exchange rates and disrupt the continuity of the geometric diffusion process. A few jump-diffusion models are discussed. We adopt Hanson and Westman's log-uniform model because it enables us to modify the infinite domain and the tails of the exponent. Obtaining a closed-form option pricing formula is not viable because of the complexity of jump-diffusion models. Quasi-maximum likelihood is used to estimate the parameters of the log-uniform model, and the Monte Carlo algorithm is used to compute European option prices. The second part introduces support vector machine (SVM) theory. SVM is developed on the basis of statistical learning theory and is used for classification (SVC) and regression (SVR). SVR overcomes a weakness of neural networks, which can get trapped in local minima. The third part proposes a new currency option pricing model. The model-reference architecture is a commonly used framework in neural network applications. We further extend the pricing framework in the model-reference architecture to the analysis of forecasting activities. SVR is employed to capture the nonlinear residuals between the actual option prices and the predicted prices given by parametric models. The forecast system is supported by the conventional methods coded in MATLAB. 
The
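
    The Monte Carlo step mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: it prices a generic European call under a jump-diffusion whose log-jump sizes are uniform on [a, b], uses a single risk-free rate r (the foreign interest rate of a currency option is omitted for brevity), and all parameter values are hypothetical. With the jump rate set to zero, the estimate should approach the Black-Scholes price, which gives a simple sanity check.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson variate by counting unit-rate exponential arrivals before `mean`."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def mc_call_price(s0, strike, r, sigma, t, jump_rate, a, b, n_paths=20000, seed=0):
    """Monte Carlo price of a European call with log-uniform jumps (requires a < b)."""
    rng = random.Random(seed)
    # Mean relative jump size E[e^Q - 1] for Q ~ Uniform(a, b).
    k_bar = (math.exp(b) - math.exp(a)) / (b - a) - 1.0
    # Risk-neutral drift of the log price, compensated for the jumps.
    drift = (r - 0.5 * sigma ** 2 - jump_rate * k_bar) * t
    payoff_sum = 0.0
    for _ in range(n_paths):
        n_jumps = poisson(rng, jump_rate * t)
        jump_sum = sum(rng.uniform(a, b) for _ in range(n_jumps))
        log_return = drift + sigma * math.sqrt(t) * rng.gauss(0.0, 1.0) + jump_sum
        payoff_sum += max(s0 * math.exp(log_return) - strike, 0.0)
    return math.exp(-r * t) * payoff_sum / n_paths

def bs_call_price(s0, strike, r, sigma, t):
    """Black-Scholes call price, used only as a no-jump reference."""
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-r * t) * cdf(d2)

no_jump = mc_call_price(100.0, 100.0, 0.05, 0.2, 1.0, 0.0, -0.10, 0.05)
with_jumps = mc_call_price(100.0, 100.0, 0.05, 0.2, 1.0, 2.0, -0.10, 0.05)
reference = bs_call_price(100.0, 100.0, 0.05, 0.2, 1.0)
```

    In the hybrid scheme the abstract describes, prices from a parametric model like this one would be compared with observed prices, and SVR would be fitted to the remaining residuals.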

  19. Support Vector Regression Based Indoor Location in IEEE 802.11 Environments

    Directory of Open Access Journals (Sweden)

    Ke Shi

    2015-01-01

    The wide spread of 802.11-based wireless technology brings a good opportunity for indoor positioning systems. In this paper, we present a new 802.11-based indoor positioning method using support vector regression (SVR), which consists of an offline training stage and an online location stage. The model that describes the relation between the position and the received signal strength (RSS) of the mobile device is established at the offline training stage by SVR, and at the online location stage the position is determined by this model. Due to the complex indoor environment, RSS is vulnerable and changeable. To address this issue, data filtering rules obtained through statistical analysis are applied at the offline training stage to improve the quality of the training samples and thus the quality of the prediction model. At the online location stage, k-times continuous measurement is used to obtain high-quality RSS input, which guarantees consistency with the training samples and improves the positioning accuracy of mobile devices. Performance evaluation shows that the proposed method has higher positioning accuracy than the probability and neural network methods, while its demands on storage capacity and computing power are low.
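
    The idea of combining k consecutive RSS measurements into one cleaner input can be sketched as below. The abstract does not state the paper's filtering rules, so the rule used here (drop readings further than a fixed number of dB from the median, then average) and the threshold value are assumptions made purely for illustration.

```python
import statistics

def smooth_rss(readings, max_dev_db=6.0):
    """Average k consecutive RSS readings (dBm) after dropping outliers.

    The median-distance rule and the 6 dB threshold are hypothetical choices.
    """
    center = statistics.median(readings)
    kept = [r for r in readings if abs(r - center) <= max_dev_db]
    if not kept:  # fall back to all readings if the rule rejects everything
        kept = list(readings)
    return sum(kept) / len(kept)

# Five consecutive readings from one access point; -90 dBm is a transient dropout.
rss_input = smooth_rss([-61.0, -63.0, -62.0, -90.0, -60.0])
```

    The smoothed value, one per access point, would then form the feature vector passed to the trained regression model.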

  20. Single face image reconstruction for super resolution using support vector regression

    Science.gov (United States)

    Lin, Haijie; Yuan, Qiping; Chen, Zhihong; Yang, Xiaoping

    2016-10-01

    In recent years, we have witnessed the prosperity of face image super-resolution (SR) reconstruction, especially learning-based techniques. In this paper, a novel single-image super-resolution face reconstruction framework based on support vector regression (SVR) is presented. Given some input data, SVR can precisely predict output class labels. We regard the SR problem as the estimation of pixel labels in the high-resolution version of the image. In our work, it is effective to put local binary pattern (LBP) codes and partial pixels into the input vectors when training the models, which are learnt from a set of high- and low-resolution face images. By optimizing the vector pairs used for learning the model, the final reconstruction results were improved. Notably, we can recover more high-frequency information by exploiting cyclical scan actions in both training and prediction. A large number of experiments and visual observations have shown that our method outperforms bicubic interpolation and some state-of-the-art super-resolution algorithms.

  1. Black box modeling of PIDs implemented in PLCs without structural information: a support vector regression approach.

    Science.gov (United States)

    Salat, Robert; Awtoniuk, Michal

    In this report, the identification of the parameters of a proportional-integral-derivative (PID) algorithm implemented in a programmable logic controller (PLC) using support vector regression (SVR) is presented. This report focuses on a black box model of the PID with additional functions and modifications provided by the manufacturers and without information on the exact structure. The process of feature selection and its impact on the training and testing abilities are emphasized. The method was tested on real PLCs (Siemens and General Electric) with the implemented PID. The results show that the SVR maps the function of the PID algorithms and the modifications introduced by the manufacturer of the PLC with high accuracy. With this approach, the simulation results can be directly used to tune the PID algorithms in the PLC. The method is sufficiently universal in that it can be applied to any PI or PID algorithm implemented in the PLC with additional functions and modifications that were previously considered to be trade secrets. This method can also be an alternative for engineers who need to tune the PID, do not have any such information on the structure, and cannot use the default settings for the known structures.

  2. Analysis of machining and machine tools

    CERN Document Server

    Liang, Steven Y

    2016-01-01

    This book delivers the fundamental science and mechanics of machining and machine tools by presenting systematic and quantitative knowledge in the form of process mechanics and physics. It gives readers a solid command of machining science and engineering, and familiarizes them with the geometry and functionality requirements of creating parts and components in today’s markets. The authors address traditional machining topics, such as: single and multiple point cutting processes; grinding; components accuracy and metrology; shear stress in cutting; cutting temperature and analysis; chatter. They also address non-traditional machining, such as: electrical discharge machining; electrochemical machining; laser and electron beam machining. A chapter on biomedical machining is also included. This book is appropriate for advanced undergraduate and graduate mechanical engineering students, manufacturing engineers, and researchers. Each chapter contains examples, exercises and their solutions, and homework problems that re...

  3. SNOW DEPTH ESTIMATION USING TIME SERIES PASSIVE MICROWAVE IMAGERY VIA GENETICALLY SUPPORT VECTOR REGRESSION (CASE STUDY URMIA LAKE BASIN)

    Directory of Open Access Journals (Sweden)

    N. Zahir

    2015-12-01

    Lake Urmia is one of the country's most important ecosystems and is on the verge of elimination. Many factors contribute to this crisis; among them, precipitation plays an important role. Precipitation takes many forms, one of which is snow. The snow on Sahand Mountain is one of the main sources of Lake Urmia’s water. Snow depth (SD) is a vital parameter for estimating the water balance for future years. In this regard, this study focuses on the SD parameter using Special Sensor Microwave/Imager (SSM/I) instruments on board the Defence Meteorological Satellite Program (DMSP) F16. The usual statistical methods for retrieving SD include linear and non-linear ones, which use a least squares procedure to estimate the SD model. Recently, kernel-based methods have been widely used for statistical modelling. Among these methods, support vector regression (SVR) has achieved high performance. Examination of the obtained data shows the existence of outliers. To remove these outliers, a wavelet denoising method is applied. After the outliers are removed, the optimum bands and parameters for SVR must be selected. Feature selection methods have shown a direct effect on improving regression performance, so we used a genetic algorithm (GA) to select suitable features of the SSM/I bands for estimating the SD model. The results for the training and testing data on Sahand Mountain are [R²_TEST=0.9049 and RMSE= 6.9654], which shows the high performance of SVR.
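
    For readers unfamiliar with the two scores reported in this abstract, the sketch below computes RMSE and R² using their common textbook definitions. The abstract does not say which variant of R² the authors used (for example, the coefficient of determination or a squared correlation), so the definition here is an assumption, and the data are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical snow depths in cm: observed versus model output.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
error = rmse(observed, predicted)      # sqrt(18 / 4)
fit = r_squared(observed, predicted)   # 1 - 18 / 500
```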

  4. [Understanding logistic regression].

    Science.gov (United States)

    El Sanharawi, M; Naudet, F

    2013-10-01

    Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors susceptible to influence it (explicative variables). The choice of explicative variables that should be included in the logistic regression model is based on prior knowledge of the disease physiopathology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps for the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.
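
    The link between the odds ratio and the logistic model can be made concrete with a small worked example. For a single binary exposure, the logistic regression coefficient equals the logarithm of the odds ratio from the 2x2 table, and the intercept equals the log-odds among the unexposed. The counts below are hypothetical.

```python
import math

def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio of the event for exposed versus unexposed subjects."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)

def logistic(z):
    """Inverse of the logit link: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 2x2 table: 30/100 exposed and 10/100 unexposed subjects had the event.
ratio = odds_ratio(30, 70, 10, 90)   # (30 * 90) / (70 * 10)
beta0 = math.log(10 / 90)            # intercept: log-odds in the unexposed group
beta1 = math.log(ratio)              # coefficient: log of the odds ratio
p_unexposed = logistic(beta0)
p_exposed = logistic(beta0 + beta1)
```

    With these coefficients the model reproduces the observed risks in both groups, which is why exponentiating a fitted coefficient is read as an odds ratio.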

  5. Study of Identification and Countermeasure of Financial Frauds Based on Triangle Theory of Frauds-Empirical Analysis of the Coupling of Machine Learning and Logistic Regression

    Institute of Scientific and Technical Information of China (English)

    金花妍; 刘永泽

    2014-01-01

    Based on the fraud triangle theory, this paper applies the t-test to fraud risk factors, selects the significant features, builds fraud identification models with the support vector machine algorithm and logistic regression, and compares them. The results show that the likelihood of financial fraud increases with the number of non-standard audit opinions previously received, and decreases with financial stability and with the supervisory bodies' enthusiasm for supervision. To prevent and detect fraud, the following measures should be taken. First, the pressure caused by conflicts over compensation contracts can be reduced by designing a reasonable incentive mechanism. Second, the opportunities for fraud can be reduced by improving internal controls and the evaluation of risk management. Finally, internal and external supervision mechanisms can be strengthened to eliminate the excuses for fraud.

  6. Practical Session: Logistic Regression

    Science.gov (United States)

    Clausel, M.; Grégoire, G.

    2014-12-01

    An exercise is proposed to illustrate the logistic regression. One investigates the different risk factors in the apparition of coronary heart disease. It has been proposed in Chapter 5 of the book of D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.

  7. Minimax Regression Quantiles

    DEFF Research Database (Denmark)

    Bache, Stefan Holst

    A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is, however, a different and therefore new estimator. It allows for both linear and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question.
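
    As background for the "usual quantile regression estimator" that this abstract uses as its benchmark, the sketch below shows the standard check (pinball) loss and verifies on a tiny hypothetical sample that the constant minimising it is a sample quantile. It does not implement the minimax deviance estimator proposed in the paper.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def best_constant(y, tau):
    """Sample value minimising the total check loss, i.e. a tau-quantile of y."""
    return min(y, key=lambda c: sum(check_loss(v - c, tau) for v in y))

sample = [1.0, 2.0, 3.0, 4.0, 100.0]
median = best_constant(sample, 0.5)   # robust to the outlier at 100
upper = best_constant(sample, 0.9)
```

    Replacing the constant with a linear function of covariates and minimising the same loss gives the usual linear quantile regression.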

  8. Treatment Extension of Pegylated Interferon Alpha and Ribavirin Does Not Improve SVR in Patients with Genotypes 2/3 without Rapid Virological Response (OPTEX Trial): A Prospective, Randomized, Two-Arm, Multicentre Phase IV Clinical Trial.

    Directory of Open Access Journals (Sweden)

    Benjamin Heidrich

    Although sofosbuvir has been approved for patients with genotypes 2/3 (G2/3), many parts of the world still consider pegylated interferon alpha (P) and ribavirin (R) the standard of care for G2/3. Patients with rapid virological response (RVR) show response rates >80%. However, sustained virological response (SVR) in non-RVR patients is not satisfactory. Longer treatment duration may be required, but evidence from prospective trials is lacking. A total of 1006 chronic HCV genotype 2/3 patients treated with P/R were recruited into a German HepNet multicenter screening registry. Of those, only 226 patients were still HCV RNA positive at week 4 (non-RVR). Non-RVR patients with ongoing response after 24 weeks of P-2b/R qualified for OPTEX, a randomized trial investigating treatment extension of an additional 24 weeks (total 48 weeks, Group A) or an additional 12 weeks (total 36 weeks, Group B) of 1.5 μg/kg P-2b and 800-1400 mg R. Due to the low number of patients without RVR, the anticipated number of 150 study patients was not met, and only 99 non-RVR patients (n=50 Group A, n=49 Group B) could be enrolled into the OPTEX trial. Baseline factors did not differ between groups. Sixteen patients had G2 and 83 patients G3. Based on the intention-to-treat (ITT) analysis, 68% [55%; 81%] in Group A and 57% [43%; 71%] in Group B achieved SVR (p=0.31). The primary endpoint of better SVR rates in Group A compared to a historical control group (SVR 70%) was not met. In conclusion, approximately 23% of G2/3 patients did not achieve RVR in a real-world setting. However, subsequent recruitment into a treatment-extension study was difficult. Prolonged therapy beyond 24 weeks did not result in higher SVR compared to a historical control group. ClinicalTrials.gov NCT00803309.

  9. [Screen potential CYP450 2E1 inhibitors from Chinese herbal medicine based on support vector regression and molecular docking method].

    Science.gov (United States)

    Chen, Xi; Lu, Fang; Jiang, Lu-di; Cai, Yi-Lian; Li, Gong-Yu; Zhang, Yan-Ling

    2016-07-01

    Inhibition of cytochrome P450 (CYP450) enzymes is the most common reason for drug interactions, so early prediction of CYP inhibitors can help to decrease the incidence of adverse reactions caused by drug interactions. CYP450 2E1 (CYP2E1), which plays a key role in drug metabolism, has a broad spectrum of substrates. In this study, 32 CYP2E1 inhibitors were collected for the construction of a support vector regression (SVR) model. The test set data were used to verify the CYP2E1 quantitative models and obtain the optimal prediction model for CYP2E1 inhibitors. Meanwhile, one molecular docking program, CDOCKER, was used to analyze the interaction pattern between positive compounds and the active pocket to establish the optimal screening model for CYP2E1 inhibitors. The SVR model and the molecular docking prediction model were combined to screen the traditional Chinese medicine database (TCMD), which could improve calculation efficiency and prediction accuracy. The SVR model predicted 6,376 traditional Chinese medicine (TCM) compounds, and after further verification with the molecular docking model, 247 TCM compounds with potential inhibitory activity against CYP2E1 were finally retained. Some of them have been verified by experiments. The results demonstrate that this study could provide guidance for the virtual screening of CYP450 inhibitors and the prediction of CYP-mediated DDIs, and also provide references for rational clinical drug use. Copyright © by the Chinese Pharmaceutical Association.

  10. Nonlinear Regression with R

    CERN Document Server

    Ritz, Christian; Parmigiani, Giovanni

    2009-01-01

    R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. This book provides a coherent treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology.

  11. Multiple linear regression analysis

    Science.gov (United States)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  12. Adaptive metric kernel regression

    DEFF Research Database (Denmark)

    Goutte, Cyril; Larsen, Jan

    2000-01-01

    regression by minimising a cross-validation estimate of the generalisation error. This allows to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...

  13. Software Regression Verification

    Science.gov (United States)

    2013-12-11

    of recursive procedures. Acta Informatica, 45(6):403–439, 2008. [GS11] Benny Godlin and Ofer Strichman. Regression verification. Technical Report...functions. Therefore, we need to redefine m-term. – Mutual termination. If either function f or function f′ (or both) is non-deterministic, then their

  14. Linear Regression Analysis

    CERN Document Server

    Seber, George A F

    2012-01-01

    Concise, mathematically clear, and comprehensive treatment of the subject. Expanded coverage of diagnostics and methods of model fitting. Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models. More than 200 problems throughout the book plus outline solutions for the exercises. This revision has been extensively class-tested.

  15. Ultrasonic image restoration based on support vector machine for surfacing interface testing

    Institute of Scientific and Technical Information of China (English)

    Gao Shuangsheng; Gang Tie; Chi Dazhao

    2007-01-01

    In order to restore degraded ultrasonic C-scan images used for testing surfacing interfaces, a method based on a support vector regression (SVR) network is proposed. The network is trained on the image of a simulated defect, establishing a mapping between the degraded and restored images. The degraded C-scan image of a Cu-steel surfacing interface is then processed by the trained network and an improved image is obtained. The result shows that the method can effectively suppress noise and deblur the defect edges in the image, and can provide technical support for quality and reliability evaluation of the surfacing weld.

  16. Simultaneous determination of multiple food colorants content in beverages based on KICA-SVR model of simulated UV data

    Institute of Scientific and Technical Information of China (English)

    王国庆; 弓丽华; 王素方; 孙晓丽; 董春红; 符德学

    2014-01-01

    Independent components (ICs) were extracted by kernel independent component analysis (KICA) from the ultraviolet-visible (UV) spectra of simulated mixtures of food colorants dissolved in water and in beverages, and a UV-KICA-SVR model was established by support vector regression (SVR) on the IC coefficients. The model can be used to determine multiple colorants simultaneously, namely lemon yellow, sunset yellow, allura red, amaranth, carmine, and brilliant blue. The relative standard deviations (RSDs) are 1.5%, 2.2%, 2.0%, 2.5%, 2.6%, and 1.2%, respectively, and the limits of detection (LODs) are 0.5 mg·L⁻¹, 0.5 mg·L⁻¹, 0.5 mg·L⁻¹, 1.0 mg·L⁻¹, 1.0 mg·L⁻¹, and 0.5 mg·L⁻¹, respectively.

  17. Stacked Extreme Learning Machines.

    Science.gov (United States)

    Zhou, Hongming; Huang, Guang-Bin; Lin, Zhiping; Wang, Han; Soh, Yeng Chai

    2015-09-01

    The extreme learning machine (ELM) has recently attracted many researchers' interest due to its very fast learning speed, good generalization ability, and ease of implementation. It provides a unified solution that can be used directly to solve regression, binary, and multiclass classification problems. In this paper, we propose stacked ELMs (S-ELMs), which are specially designed for solving large and complex data problems. The S-ELMs divide a single large ELM network into multiple stacked small ELMs which are serially connected. The S-ELMs can approximate a very large ELM network with a small memory requirement. To further improve the testing accuracy on big data problems, the ELM autoencoder can be implemented during each iteration of the S-ELMs algorithm. The simulation results show that the S-ELMs, even with random hidden nodes, can achieve testing accuracy similar to that of the support vector machine (SVM) while having low memory requirements. With the help of the ELM autoencoder, the S-ELMs can achieve much better testing accuracy than SVM and slightly better accuracy than a deep belief network (DBN), with much faster training speed.

  18. Automation of printing machine

    OpenAIRE

    Sušil, David

    2016-01-01

    Bachelor thesis is focused on the automation of the printing machine and comparing the two types of printing machines. The first chapter deals with the history of printing, typesettings, printing techniques and various kinds of bookbinding. The second chapter describes the difference between sheet-fed printing machines and offset printing machines, the difference between two representatives of rotary machines, technological process of the products on these machines, the description of the mac...

  19. Low rank Multivariate regression

    CERN Document Server

    Giraud, Christophe

    2010-01-01

    We consider in this paper the multivariate regression problem, when the target regression matrix $A$ is close to a low rank matrix. Our primary interest is in the practical case where the variance of the noise is unknown. Our main contribution is to propose in this setting a criterion to select among a family of low rank estimators and prove a non-asymptotic oracle inequality for the resulting estimator. We also investigate the easier case where the variance of the noise is known and outline that the penalties appearing in our criteria are minimal (in some sense). These penalties involve the expected value of the Ky-Fan quasi-norm of some random matrices. These quantities can be evaluated easily in practice and upper bounds can be derived from recent results in random matrix theory.

  20. Subset selection in regression

    CERN Document Server

    Miller, Alan

    2002-01-01

    Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition: a separate chapter on Bayesian methods; complete revision of the chapter on estimation; a major example from the field of near infrared spectroscopy; more emphasis on cross-validation; greater focus on bootstrapping; stochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible; software available on the Internet for implementing many of the algorithms presented; more examples. Subset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...

  1. Machine musicianship

    Science.gov (United States)

    Rowe, Robert

    2002-05-01

    The training of musicians begins by teaching basic musical concepts, a collection of knowledge commonly known as musicianship. Computer programs designed to implement musical skills (e.g., to make sense of what they hear, perform music expressively, or compose convincing pieces) can similarly benefit from access to a fundamental level of musicianship. Recent research in music cognition, artificial intelligence, and music theory has produced a repertoire of techniques that can make the behavior of computer programs more musical. Many of these were presented in a recently published book/CD-ROM entitled Machine Musicianship. For use in interactive music systems, we are interested in those which are fast enough to run in real time and that need only make reference to the material as it appears in sequence. This talk will review several applications that are able to identify the tonal center of musical material during performance. Beyond this specific task, the design of real-time algorithmic listening through the concurrent operation of several connected analyzers is examined. The presentation includes discussion of a library of C++ objects that can be combined to perform interactive listening and a demonstration of their capability.

  2. Classification and regression trees

    CERN Document Server

    Breiman, Leo; Olshen, Richard A; Stone, Charles J

    1984-01-01

    The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.

  3. Aid and growth regressions

    DEFF Research Database (Denmark)

    Hansen, Henrik; Tarp, Finn

    2001-01-01

    . There are, however, decreasing returns to aid, and the estimated effectiveness of aid is highly sensitive to the choice of estimator and the set of control variables. When investment and human capital are controlled for, no positive effect of aid is found. Yet, aid continues to impact on growth via...... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regression is used for policy purposes....

  4. Robust Nonstationary Regression

    OpenAIRE

    1993-01-01

    This paper provides a robust statistical approach to nonstationary time series regression and inference. Fully modified extensions of traditional robust statistical procedures are developed which allow for endogeneities in the nonstationary regressors and serial dependence in the shocks that drive the regressors and the errors that appear in the equation being estimated. The suggested estimators involve semiparametric corrections to accommodate these possibilities and they belong to the same ...

  5. Determination of Quality Properties of Soy Sauce by Support Vector Regression Coupled with SW-NIR Spectroscopy

    Institute of Scientific and Technical Information of China (English)

    2011-01-01

    Modern near-infrared (NIR) spectroscopy is a simple, efficient and nondestructive technique that has been used for chemical analysis in diverse fields. Shortwave NIR spectroscopy is also a rapid, flexible, and cost-effective method for controlling product quality in the food industry. Support vector regression coupled with shortwave NIR spectroscopy was explored for the nondestructive quantitative analysis of important quality parameters of soy sauce, including amino nitrogen content, total acid content, salt content and color ratio. In this study, support vector regression (SVR) models based on subtractive spectra and positive spectra were built and compared; the results show that the subtractive spectrum performed better than the positive spectrum. Meanwhile, R and RSE were determined for the original spectra and the pretreated spectra [standard normal variate (SNV), first-derivative and second-derivative], and the corresponding models were successfully established. The best prediction was achieved by a support vector regression model built on the first-derivative transformed dataset. In addition, the result obtained by the proposed method was compared with that of Partial Least Squares (PLS), which showed that the generalization performance of the model based on SVR was much better than that of PLS. The results demonstrate that shortwave NIR spectroscopy combined with SVR is promising for the quality control of soy sauce.

  6. TWO REGRESSION CREDIBILITY MODELS

    Directory of Open Access Journals (Sweden)

    Constanţa-Nicoleta BODEA

    2010-03-01

    Full Text Available In this communication we will discuss two regression credibility models from Non-Life Insurance Mathematics that can be solved by means of matrix theory. In the first regression credibility model, starting from a well-known representation formula of the inverse for a special class of matrices, a risk premium will be calculated for a contract with risk parameter θ. In the next regression credibility model, we will obtain a credibility solution in the form of a linear combination of the individual estimate (based on the data of a particular state) and the collective estimate (based on aggregate USA data). To illustrate the solution with the properties mentioned above, we shall need the well-known representation theorem for a special class of matrices, the properties of the trace for a square matrix, the scalar product of two vectors, the norm with respect to a positive definite matrix given in advance and the complicated mathematical properties of conditional expectations and of conditional covariances.

  7. Electrical machines mathematical fundamentals of machine topologies

    CERN Document Server

    Gerling, Dieter

    2015-01-01

    Electrical Machines and Drives play a powerful role in industry with an ever increasing importance. This fact requires the understanding of machine and drive principles by engineers of many different disciplines. Therefore, this book is intended to give a comprehensive deduction of these principles. Special attention is given to the precise mathematical derivation of the necessary formulae to calculate machines and drives and to the discussion of simplifications (if applied) with the associated limits. The book shows how the different machine topologies can be deduced from general fundamentals, and how they are linked together. This book addresses graduate students, researchers, and developers of Electrical Machines and Drives, who are interested in getting knowledge about the principles of machine and drive operation and in detecting the mathematical and engineering specialties of the different machine and drive topologies together with their mutual links. The detailed - but nevertheless compact - mat...

  8. Laser machining of advanced materials

    CERN Document Server

    Dahotre, Narendra B

    2011-01-01

    Advanced materials: Introduction; Applications; Structural ceramics; Biomaterials; Composites; Intermetallics. Machining of advanced materials: Introduction; Fabrication techniques; Mechanical machining; Chemical Machining (CM); Electrical machining; Radiation machining; Hybrid machining. Laser machining: Introduction; Absorption of laser energy and multiple reflections; Thermal effects; Laser machining of structural ceramics; Introdu

  9. Quantitative Analysis of Laser-Induced Breakdown Spectroscopy of Heavy Metals in Water Based on Support-Vector-Machine Regression

    Institute of Scientific and Technical Information of China (English)

    王春龙; 刘建国; 赵南京; 马明俊; 王寅; 胡丽; 张大海; 余洋; 孟德硕

    2013-01-01

    A quantitative analysis model for laser-induced breakdown spectroscopy based on support vector machine regression with an adaptive kernel is established. Graphite enrichment, Lorentz fitting, and carbon internal-standard normalization are used to enhance the plasma signal intensity and to reduce the effect of ambient noise and energy jitter on the measured concentration of heavy metals in water. Quantitative analysis of laser-induced breakdown spectroscopy based on the support vector machine regression algorithm is achieved. The average relative standard deviations of lead and copper are 6.4361% and 6.9291%, and the maximum relative standard deviations are 9.1009% and 8.9280%. The average relative errors of lead and copper are 1.6765% and 1.2478%, and the maximum relative errors are 5.5759% and 4.2604%. The correlation coefficients of lead and copper are 0.9979 and 0.9997. The study provides methods and reference data for further work on the fast quantitative analysis of trace metal elements in water by laser-induced breakdown spectroscopy.

  10. The deleuzian abstract machines

    DEFF Research Database (Denmark)

    Werner Petersen, Erik

    2005-01-01

    production. In Kafka: Toward a Minor Literature, Deleuze and Guattari gave the most comprehensive explanation of the abstract machine in the work of art. Like the war-machines of Virilio, the Kafka-machine operates in three gears or speeds. Furthermore, the machine is connected to spatial diagrams...

  11. Regression Segmentation for M³ Spinal Images.

    Science.gov (United States)

    Wang, Zhijie; Zhen, Xiantong; Tay, KengYeow; Osman, Said; Romano, Walter; Li, Shuo

    2015-08-01

    Clinical routine often requires analyzing spinal images of multiple anatomic structures in multiple anatomic planes from multiple imaging modalities (M³). Unfortunately, existing methods for segmenting spinal images are still limited to one specific structure, in one specific plane or from one specific modality (S³). In this paper, we propose a novel approach, Regression Segmentation, that is for the first time able to segment M³ spinal images in one single unified framework. This approach formulates the segmentation task as a boundary regression problem: modeling a highly nonlinear mapping function from substantially diverse M³ images directly to desired object boundaries. Leveraging the advancement of sparse kernel machines, regression segmentation is fulfilled by a multi-dimensional support vector regressor (MSVR) which operates in an implicit, high dimensional feature space where M³ diversity and specificity can be systematically categorized, extracted, and handled. The proposed regression segmentation approach was thoroughly tested on images from 113 clinical subjects including both disc and vertebral structures, in both sagittal and axial planes, and from both MRI and CT modalities. The overall result reaches a high dice similarity index (DSI) of 0.912 and a low boundary distance (BD) of 0.928 mm. With our unified and expendable framework, an efficient clinical tool for M³ spinal image segmentation can be easily achieved, and will substantially benefit the diagnosis and treatment of spinal diseases.
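
    The dice similarity index (DSI) reported above is a standard overlap measure, 2|A∩B| / (|A| + |B|). The following is a minimal sketch of that formula only; the function name, the toy pixel sets, and the convention for two empty masks are illustrative assumptions and are not taken from the paper.

```python
def dice_similarity(pred, truth):
    """Dice similarity index between two sets of pixel coordinates."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


# Hypothetical toy masks: 3 shared pixels, 4 pixels in each mask.
predicted = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference = {(0, 1), (1, 1), (2, 1), (1, 0)}
print(dice_similarity(predicted, reference))  # 0.75
```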

  12. Prediction of dynamical systems by symbolic regression

    Science.gov (United States)

    Quade, Markus; Abel, Markus; Shafi, Kamran; Niven, Robert K.; Noack, Bernd R.

    2016-07-01

    We study the modeling and prediction of dynamical systems based on conventional models derived from measurements. Such algorithms are highly desirable in situations where the underlying dynamics are hard to model from physical principles or simplified models need to be found. We focus on symbolic regression methods as a part of machine learning. These algorithms are capable of learning an analytically tractable model from data, a highly valuable property. Symbolic regression methods can be considered as generalized regression methods. We investigate two particular algorithms, the so-called fast function extraction which is a generalized linear regression algorithm, and genetic programming which is a very general method. Both are able to combine functions in a certain way such that a good model for the prediction of the temporal evolution of a dynamical system can be identified. We illustrate the algorithms by finding a prediction for the evolution of a harmonic oscillator based on measurements, by detecting an arriving front in an excitable system, and as a real-world application, the prediction of solar power production based on energy production observations at a given site together with the weather forecast.

  13. Employing machine learning for reliable miRNA target identification in plants

    Directory of Open Access Journals (Sweden)

    Jha Ashwani

    2011-12-01

    Full Text Available Abstract. Background: miRNAs are ~21 nucleotide long small noncoding RNA molecules, formed endogenously in most eukaryotes, which mainly control their target genes post-transcriptionally by interacting with and silencing them. While many tools have been developed for animal miRNA target systems, plant miRNA target identification has witnessed limited development. Most existing tools are centered around exact complementarity match. Very few of them consider other factors like multiple target sites and the role of flanking regions. Result: In the present work, a Support Vector Regression (SVR) approach has been implemented for plant miRNA target identification, utilizing position specific dinucleotide density variation information around the target sites, to yield highly reliable results. It has been named p-TAREF (plant-Target Refiner). The performance of p-TAREF was rigorously compared with other prediction tools for plants, and p-TAREF was found to perform better in several aspects. Further, p-TAREF was run over the experimentally validated miRNA targets from species like Arabidopsis, Medicago, Rice and Tomato, and detected them accurately, suggesting that p-TAREF is broadly usable for plant species. Using p-TAREF, target identification was done for the complete Rice transcriptome, supported by expression and degradome based data. miR156 was found to be an important component of the Rice regulatory system, where control of genes associated with growth and transcription looked predominant. The entire methodology has been implemented in a multi-threaded parallel architecture in Java, to enable fast processing for the web-server version as well as the standalone version. This also allows it to run even on a simple desktop computer in concurrent mode. Its web-server version also provides a facility to gather experimental support for the predictions made, through on the spot expression data analysis. Conclusion: A machine learning

  14. Bayesian nonlinear regression for large small problems

    KAUST Repository

    Chakraborty, Sounak

    2012-07-01

    Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p, small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of the support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM, which relies on type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use a Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
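
    For readers unfamiliar with Vapnik's ε-insensitive loss named in this abstract, here is a minimal sketch of the loss itself, max(0, |y − f(x)| − ε). It shows only the standard definition, not the Bayesian model of the paper; the function name and the default ε are assumptions chosen for illustration.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Vapnik's epsilon-insensitive loss for a single prediction.

    Errors smaller than eps cost nothing; larger errors grow linearly.
    The default eps=0.1 is an arbitrary illustrative value.
    """
    return max(0.0, abs(y_true - y_pred) - eps)


# An error of 0.05 falls inside the eps-tube and is not penalised.
print(eps_insensitive_loss(1.0, 1.05))
# An error of 2.0 with eps=0.5 is penalised by 2.0 - 0.5.
print(eps_insensitive_loss(3.0, 1.0, eps=0.5))
```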

  15. Massive-training support vector regression and Gaussian process for false-positive reduction in computer-aided detection of polyps in CT colonography.

    Science.gov (United States)

    Xu, Jian-Wu; Suzuki, Kenji

    2011-04-01

    A massive-training artificial neural network (MTANN) has been developed for the reduction of false positives (FPs) in computer-aided detection (CADe) of polyps in CT colonography (CTC). A major limitation of the MTANN is the long training time. To address this issue, the authors investigated the feasibility of two state-of-the-art regression models, namely, support vector regression (SVR) and Gaussian process regression (GPR) models, in the massive-training framework and developed massive-training SVR (MTSVR) and massive-training GPR (MTGPR) for the reduction of FPs in CADe of polyps. The authors applied SVR and GPR as volume-processing techniques in the distinction of polyps from FP detections in a CTC CADe scheme. Unlike artificial neural networks (ANNs), both SVR and GPR are memory-based methods that store a part of or the entire training data for testing. Therefore, their training is generally fast and they are able to improve the efficiency of the massive-training methodology. Rooted in a maximum margin property, SVR offers excellent generalization ability and robustness to outliers. On the other hand, GPR approaches nonlinear regression from a Bayesian perspective, which produces both the optimal estimated function and the covariance associated with the estimation. Therefore, both SVR and GPR, as the state-of-the-art nonlinear regression models, are able to offer a performance comparable or potentially superior to that of ANN, with highly efficient training. Both MTSVR and MTGPR were trained directly with voxel values from CTC images. A 3D scoring method based on a 3D Gaussian weighting function was applied to the outputs of MTSVR and MTGPR for distinction between polyps and nonpolyps. To test the performance of the proposed models, the authors compared them to the original MTANN in the distinction between actual polyps and various types of FPs in terms of training time reduction and FP reduction performance. The authors' CTC database consisted of 240 CTC

  16. Modified Regression Correlation Coefficient for Poisson Regression Model

    Science.gov (United States)

    Kaengthong, Nattacha; Domthong, Uthumporn

    2017-09-01

    This study examines indicators of the predictive power of the Generalized Linear Model (GLM), which are widely used but often subject to some restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables with multicollinearity among them. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient in terms of bias and the Root Mean Square Error (RMSE).
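
    As a rough illustration of the traditional regression correlation coefficient described above (the correlation between Y and the fitted E(Y|X)), the sketch below fits a one-covariate Poisson regression by Newton-Raphson and then computes that correlation. The modified coefficient proposed in the study is not specified in the abstract and is not reproduced here; the function names and the toy counts are assumptions.

```python
import math


def fit_poisson(x, y, iters=50):
    """Fit log E[Y|x] = b0 + b1*x by Newton-Raphson (single covariate)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Entries of the 2x2 Fisher information matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) @ score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)


# Hypothetical toy data: counts that grow with the covariate.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1, 2, 2, 4, 6, 9]
b0, b1 = fit_poisson(x, y)
mu_hat = [math.exp(b0 + b1 * xi) for xi in x]
r = pearson(y, mu_hat)  # traditional regression correlation coefficient
print(round(r, 3))
```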

  17. Caudal Regression Syndrome

    Directory of Open Access Journals (Sweden)

    Karim Hardani

    2012-05-01

    Full Text Available A 10-month-old baby presented with developmental delay. He had flaccid paralysis on physical examination. An MRI of the spine revealed malformation of the ninth and tenth thoracic vertebral bodies with complete agenesis of the rest of the spine below that level. The thoracic spinal cord ends at the level of the fifth thoracic vertebra, with agenesis of the posterior arches of the eighth, ninth and tenth thoracic vertebral bodies. The roots of the cauda equina appear tethered downward and backward and end in subdermal fibrous fatty tissue at the level of the ninth and tenth thoracic vertebral bodies (closed meningocele). These findings are consistent with caudal regression syndrome.

  18. Multiple sources and multiple measures based traffic flow prediction using the chaos theory and support vector regression method

    Science.gov (United States)

    Cheng, Anyu; Jiang, Xiao; Li, Yongfu; Zhang, Chao; Zhu, Hao

    2017-01-01

    This study proposes a multiple-source, multiple-measure traffic flow prediction algorithm using chaos theory and the support vector regression method. First, the chaotic characteristics of traffic flow associated with speed, occupancy, and flow are identified using the maximum Lyapunov exponent. Then, the phase spaces of the chaotic time series of the multiple measures are reconstructed based on phase space reconstruction theory and fused into the same multi-dimensional phase space using Bayesian estimation theory. In addition, a support vector regression (SVR) model is designed to predict the traffic flow. Numerical experiments are performed using data from multiple sources. The results show that, compared with a single measure, the proposed method performs better for short-term traffic flow prediction in terms of accuracy and timeliness.
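
    Phase space reconstruction of a scalar time series is commonly done by time-delay embedding. The sketch below shows that generic step only, under the assumption that the embedding dimension and delay are already chosen; it does not reproduce the Bayesian fusion or the SVR model of the study, and the function name and sample series are hypothetical.

```python
def delay_embed(series, dim, tau):
    """Time-delay embedding: row i is (s[i], s[i+tau], ..., s[i+(dim-1)*tau])."""
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[i + k * tau] for k in range(dim)] for i in range(n_vectors)]


# Hypothetical flow readings embedded in 3 dimensions with delay 2.
flow = [1, 2, 3, 4, 5, 6]
print(delay_embed(flow, dim=3, tau=2))  # [[1, 3, 5], [2, 4, 6]]
```

    Each embedded vector could then serve as the input of a regressor that predicts the next value of the series.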

  19. Addressing uncertainty in atomistic machine learning

    DEFF Research Database (Denmark)

    Peterson, Andrew A.; Christensen, Rune; Khorshidi, Alireza

    2017-01-01

    Machine-learning regression has been demonstrated to precisely emulate the potential energy and forces that are output from more expensive electronic-structure calculations. However, to predict new regions of the potential energy surface, an assessment must be made of the credibility of the predictions. In this perspective, we address the types of errors that might arise in atomistic machine learning, the unique aspects of atomistic simulations that make machine-learning challenging, and highlight how uncertainty analysis can be used to assess the validity of machine-learning predictions. We suggest this will allow researchers to more fully use machine learning for the routine acceleration of large, high-accuracy, or extended-time simulations. In our demonstrations, we use a bootstrap ensemble of neural network-based calculators, and show that the width of the ensemble can provide an estimate...
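
    The idea of reading uncertainty from the width of a bootstrap ensemble can be sketched with a much simpler model than the neural network calculators used by the authors. The example below swaps in ordinary least-squares lines as ensemble members, purely for illustration; all names, the synthetic data, and the ensemble size are assumptions.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def ensemble_predict(xs, ys, x_new, n_models=200, seed=0):
    """Mean and spread of predictions from models fit on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        bx, by = [xs[i] for i in idx], [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: a line cannot be fit
        slope, intercept = fit_line(bx, by)
        preds.append(slope * x_new + intercept)
    return statistics.fmean(preds), statistics.stdev(preds)


# Synthetic noisy training data on the interval [0, 10).
data_rng = random.Random(1)
xs = [i / 3 for i in range(30)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0, 0.5) for x in xs]

_, width_inside = ensemble_predict(xs, ys, x_new=5.0)    # within training range
_, width_outside = ensemble_predict(xs, ys, x_new=50.0)  # far extrapolation
print(width_inside < width_outside)
```

    The ensemble spread is small inside the training range and grows under extrapolation, which is the qualitative behaviour that an uncertainty estimate of this kind is meant to capture.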

  20. Differentially Private Support Vector Machines

    CERN Document Server

    Sarwate, Anand; Monteleoni, Claire

    2009-01-01

    This paper addresses the problem of practical privacy-preserving machine learning: how to detect patterns in massive, real-world databases of sensitive personal information, while maintaining the privacy of individuals. Chaudhuri and Monteleoni (2008) recently provided privacy-preserving techniques for learning linear separators via regularized logistic regression. With the goal of handling large databases that may not be linearly separable, we provide privacy-preserving support vector machine algorithms. We address general challenges left open by past work, such as how to release a kernel classifier without releasing any of the training data, and how to tune algorithm parameters in a privacy-preserving manner. We provide general, efficient algorithms for linear and nonlinear kernel SVMs, which guarantee $\epsilon$-differential privacy, a very strong privacy definition due to Dwork et al. (2006). We also provide learning generalization guarantees. Empirical evaluations reveal promising performance on real and...

  1. Predicting residue-wise contact orders in proteins by support vector regression

    Directory of Open Access Journals (Sweden)

    Burrage Kevin

    2006-10-01

    Full Text Available Abstract. Background: The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and a root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. Moreover, by incorporating global features such as molecular weight and

  2. An Integrated Approach to Battery Health Monitoring using Bayesian Regression, Classification and State Estimation

    Data.gov (United States)

    National Aeronautics and Space Administration — The application of the Bayesian theory of managing uncertainty and complexity to regression and classification in the form of Relevance Vector Machine (RVM), and to...

  3. Unravelling effects of flavanols and their derivatives on acrylamide formation via support vector machine modelling.

    Science.gov (United States)

    Huang, Mengmeng; Wang, Qiao; Chen, Xinyu; Zhang, Yu

    2017-04-15

    This study investigated the effect of flavanols and their derivatives on acrylamide formation under low-moisture conditions via prediction using the support vector regression (SVR) approach. Acrylamide was generated in a potato-based equimolar asparagine-reducing sugar model system through oven heating. Both positive and negative effects were observed when the flavonoid treatment ranged from 1 to 10,000 μmol/L. Flavanols and derivatives (100 μmol/L) suppressed acrylamide formation within a range of 59.9-78.2%, while their maximal promotion effects ranged from 2.15-fold to 2.84-fold relative to the control at a concentration of 10,000 μmol/L. The correlations between inhibition rates and changes in Trolox-equivalent antioxidant capacity (ΔTEAC) (RTEAC-DPPH=0.878, RTEAC-ABTS=0.882, RTEAC-FRAP=0.871) were better than those for promotion rates (RTEAC-DPPH=0.815, RTEAC-ABTS=0.749, RTEAC-FRAP=0.841). Using ΔTEAC as variables, an optimized SVR model could robustly serve as a new predictive tool for estimating the effect (R: 0.783-0.880), and its fitting performance was slightly better than that of a multiple linear regression model (R: 0.754-0.880). Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. Machine tool structures

    CERN Document Server

    Koenigsberger, F

    1970-01-01

    Machine Tool Structures, Volume 1 deals with fundamental theories and calculation methods for machine tool structures. Experimental investigations into stiffness are discussed, along with the application of the results to the design of machine tool structures. Topics covered range from static and dynamic stiffness to chatter in metal cutting, stability in machine tools, and deformations of machine tool structures. This volume is divided into three sections and opens with a discussion on stiffness specifications and the effect of stiffness on the behavior of the machine under forced vibration c

  5. Broiler chickens can benefit from machine learning: support vector machine analysis of observational epidemiological data.

    Science.gov (United States)

    Hepworth, Philip J; Nefedov, Alexey V; Muchnik, Ilya B; Morgan, Kenton L

    2012-08-07

    Machine-learning algorithms pervade our daily lives. In epidemiology, supervised machine learning has the potential for classification, diagnosis and risk factor identification. Here, we report the use of support vector machine learning to identify the features associated with hock burn on commercial broiler farms, using routinely collected farm management data. These data lend themselves to analysis using machine-learning techniques. Hock burn, dermatitis of the skin over the hock, is an important indicator of broiler health and welfare. Remarkably, this classifier can predict the occurrence of high hock burn prevalence with accuracy of 0.78 on unseen data, as measured by the area under the receiver operating characteristic curve. We also compare the results with those obtained by standard multi-variable logistic regression and suggest that this technique provides new insights into the data. This novel application of a machine-learning algorithm, embedded in poultry management systems could offer significant improvements in broiler health and welfare worldwide.

  6. Density prediction of selective laser sintering parts based on support vector regression

    Institute of Scientific and Technical Information of China (English)

    蔡从中; 裴军芳; 温玉锋; 朱星键; 肖婷婷

    2009-01-01

    Based on a measured dataset of the density of selective laser sintering parts under different processing parameters (layer thickness, hatch spacing, laser power, scanning speed, ambient temperature, interval time between layers, and scanning mode), the support vector regression (SVR) approach, with particle swarm optimization used for parameter selection, is applied to establish a model for predicting part density from the processing parameters, and the model is compared with a BP neural network model. The comparison shows that, under identical training and test samples, the internal fitting capacity and prediction accuracy of the SVR model are superior to those of the BP neural network; the generalization ability of the SVR model can be improved by increasing the number of training samples; and the SVR model based on leave-one-out cross validation gives the smallest prediction error. These results suggest that SVR is an effective tool for estimating the density of selective laser sintering parts.

  7. Adaptive weighted least square support vector machine regression with gross error detection and its application to estimate kinetic parameters for industrial oxidation of p-xylene

    Institute of Scientific and Technical Information of China (English)

    陶莉莉; 钟伟民; 罗娜; 钱锋

    2012-01-01

    The presence of gross errors in the data used for soft-sensor modeling can corrupt a model's performance and give undesirable results. A novel weighted least square support vector machine regression (WLS-SVM) is proposed, which combines gross error detection with adaptive weights for the training samples. First, the 3σ principle is applied to detect and remove significant errors in the samples. Second, the weights are adjusted adaptively according to the fitting error of each sample, so that the influence of non-significant errors on model performance is greatly reduced. In addition, because the regularization parameter and the kernel width of the least square support vector machine strongly affect fitting accuracy and generalization ability, and estimating them by experience and trial and error is time-consuming and inaccurate, the parameter choice is treated as an optimization problem and an adaptive immune algorithm (AIGA) is applied to obtain the optimal parameters of the WLS-SVM. To illustrate the performance of the WLS-SVM, a simulation experiment is designed to produce the training samples. The results show that the predictive performance of AIGA-WLS-SVM is the best. Furthermore, the AIGA-WLS-SVM method was applied to estimate the rate constants of an industrial p-xylene oxidation model, and a satisfactory result was obtained.
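
    The gross-error detection step described above can be illustrated with the classic three-sigma rule: a sample is flagged when its residual lies more than three standard deviations from the mean residual. This is a minimal sketch of that rule alone; the adaptive weighting and the AIGA parameter search are not reproduced, and the function name and residual values are hypothetical.

```python
import statistics


def three_sigma_keep(residuals):
    """Return one flag per sample: True to keep it, False for a gross error."""
    mean = statistics.fmean(residuals)
    sd = statistics.pstdev(residuals)
    return [abs(r - mean) <= 3 * sd for r in residuals]


# Twenty small residuals and one gross error (hypothetical values).
residuals = [0.1, -0.1] * 10 + [5.0]
keep = three_sigma_keep(residuals)
cleaned = [r for r, k in zip(residuals, keep) if k]
print(len(cleaned))  # 20
```

    Note that the rule needs a reasonably large sample: with very few points, a single outlier inflates the standard deviation so much that it can never exceed three of them.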

  8. Estimation of Curie temperature of manganite-based materials for magnetic refrigeration application using hybrid gravitational based support vector regression

    Science.gov (United States)

    Owolabi, Taoreed O.; Akande, Kabiru O.; Olatunji, Sunday O.; Alqahtani, Abdullah; Aldhafferi, Nahier

    2016-10-01

    Magnetic refrigeration (MR) technology stands a good chance of replacing the conventional gas compression system (CGCS) of refrigeration due to its unique features such as high efficiency, low cost as well as being environmentally friendly. Its operation involves the use of magnetocaloric effect (MCE) of a magnetic material caused by application of magnetic field. Manganite-based material demonstrates maximum MCE at its magnetic ordering temperature known as Curie temperature (TC). Consequently, manganite-based material with TC around room temperature is essentially desired for effective utilization of this technology. The TC of manganite-based materials can be adequately altered to a desired value through doping with appropriate foreign materials. In order to determine a manganite with TC around room temperature and to circumvent experimental challenges therein, this work proposes a model that can effectively estimate the TC of manganite-based material doped with different materials with the aid of support vector regression (SVR) hybridized with gravitational search algorithm (GSA). Implementation of GSA algorithm ensures optimum selection of SVR hyper-parameters for improved performance of the developed model using lattice distortions as the descriptors. The result of the developed model is promising and agrees excellently with the experimental results. The outstanding estimates of the proposed model suggest its potential in promoting room temperature magnetic refrigeration through quick estimation of the effect of dopants on TC so as to obtain manganite that works well around the room temperature.

  9. Estimation of Curie temperature of manganite-based materials for magnetic refrigeration application using hybrid gravitational based support vector regression

    Directory of Open Access Journals (Sweden)

    Taoreed O. Owolabi

    2016-10-01

    Full Text Available Magnetic refrigeration (MR) technology stands a good chance of replacing the conventional gas compression system (CGCS) of refrigeration due to its unique features such as high efficiency, low cost as well as being environmentally friendly. Its operation involves the use of magnetocaloric effect (MCE) of a magnetic material caused by application of magnetic field. Manganite-based material demonstrates maximum MCE at its magnetic ordering temperature known as Curie temperature (TC). Consequently, manganite-based material with TC around room temperature is essentially desired for effective utilization of this technology. The TC of manganite-based materials can be adequately altered to a desired value through doping with appropriate foreign materials. In order to determine a manganite with TC around room temperature and to circumvent experimental challenges therein, this work proposes a model that can effectively estimate the TC of manganite-based material doped with different materials with the aid of support vector regression (SVR) hybridized with gravitational search algorithm (GSA). Implementation of GSA algorithm ensures optimum selection of SVR hyper-parameters for improved performance of the developed model using lattice distortions as the descriptors. The result of the developed model is promising and agrees excellently with the experimental results. The outstanding estimates of the proposed model suggest its potential in promoting room temperature magnetic refrigeration through quick estimation of the effect of dopants on TC so as to obtain manganite that works well around the room temperature.

  10. Recursive Algorithm For Linear Regression

    Science.gov (United States)

    Varanasi, S. V.

    1988-01-01

    Order of model determined easily. Linear-regression algorithm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactorily.

  11. Design of Demining Machines

    CERN Document Server

    Mikulic, Dinko

    2013-01-01

    In a constant effort to eliminate mine danger, the international mine action community has been improving the safety, efficiency and cost-effectiveness of clearance methods. Demining machines have become necessary in humanitarian demining, where mechanization provides greater safety and productivity. Design of Demining Machines describes the development and testing of modern demining machines in humanitarian demining. Relevant data for the design of demining machines are included to explain the machinery implemented and some innovative and inspiring development solutions. Development technologies, companies and projects are discussed to provide a comprehensive estimate of the effects of various design factors and to support proper selection of optimal parameters for designing the demining machines. Covering the dynamic processes occurring in machine assemblies and their components to a broader understanding of demining machine as a whole, Design of Demining Machines is primarily tailored as a tex...

  12. Applied machining technology

    CERN Document Server

    Tschätsch, Heinz

    2010-01-01

    Machining and cutting technologies are still crucial for many manufacturing processes. This reference presents all important machining processes in a comprehensive and coherent way. It includes many examples of concrete calculations, problems and solutions.

  13. Machining with abrasives

    CERN Document Server

    Jackson, Mark J

    2011-01-01

    Abrasive machining is key to obtaining the desired geometry and surface quality in manufacturing. This book discusses the fundamentals and advances in the abrasive machining processes. It provides a complete overview of developing areas in the field.

  14. Women, Men, and Machines.

    Science.gov (United States)

    Form, William; McMillen, David Byron

    1983-01-01

    Data from the first national study of technological change show that proportionately more women than men operate machines, are more exposed to machines that have alienating effects, and suffer more from the negative effects of technological change. (Author/SSH)

  15. Machine medical ethics

    CERN Document Server

    Pontier, Matthijs

    2015-01-01

    The essays in this book, written by researchers from both humanities and sciences, describe various theoretical and experimental approaches to adding medical ethics to a machine in medical settings. Medical machines are in close proximity with human beings, and getting closer: with patients who are in vulnerable states of health, who have disabilities of various kinds, with the very young or very old, and with medical professionals. In such contexts, machines are undertaking important medical tasks that require emotional sensitivity, knowledge of medical codes, human dignity, and privacy. As machine technology advances, ethical concerns become more urgent: should medical machines be programmed to follow a code of medical ethics? What theory or theories should constrain medical machine conduct? What design features are required? Should machines share responsibility with humans for the ethical consequences of medical actions? How ought clinical relationships involving machines to be modeled? Is a capacity for e...

  16. Brain versus Machine Control.

    Directory of Open Access Journals (Sweden)

    Jose M Carmena

    2004-12-01

    Full Text Available Dr. Octopus, the villain of the movie "Spiderman 2", is a fusion of man and machine. Neuroscientist Jose Carmena examines the facts behind this fictional account of a brain-machine interface.

  17. Leukemia prediction using sparse logistic regression.

    Directory of Open Access Journals (Sweden)

    Tapio Manninen

    Full Text Available We describe a supervised prediction method for diagnosis of acute myeloid leukemia (AML) from patient samples based on flow cytometry measurements. We use a data-driven approach with machine learning methods to train a computational model that takes in flow cytometry measurements from a single patient and gives a confidence score of the patient being AML-positive. Our solution is based on an [Formula: see text] regularized logistic regression model that aggregates AML test statistics calculated from individual test tubes with different cell populations and fluorescent markers. The model construction is entirely data driven and no prior biological knowledge is used. The described solution scored a 100% classification accuracy in the DREAM6/FlowCAP2 Molecular Classification of Acute Myeloid Leukaemia Challenge against a gold standard consisting of 20 AML-positive and 160 healthy patients. Here we perform a more extensive validation of the prediction model performance and further improve and simplify our original method, showing that statistically equal results can be obtained by using simple average marker intensities as features in the logistic regression model. In addition to the logistic regression based model, we also present other classification models and compare their performance quantitatively. The key benefit of our prediction method compared to other solutions with similar performance is that our model only uses a small fraction of the flow cytometry measurements, making our solution highly economical.

  18. Discharge estimation based on machine learning

    Institute of Scientific and Technical Information of China (English)

    Zhu JIANG; Hui-yan WANG; Wen-wu SONG

    2013-01-01

    To overcome the limitations of the traditional stage-discharge models in describing the dynamic characteristics of a river, a machine learning method of non-parametric regression, the locally weighted regression method, was used to estimate discharge. With the purpose of improving the precision and efficiency of river discharge estimation, a novel machine learning method is proposed: the clustering-tree weighted regression method. First, the training instances are clustered. Second, the k-nearest neighbor method is used to assign new stage samples to the best-fit cluster. Finally, the daily discharge is estimated. In the estimation process, the interference of irrelevant information can be avoided, so that the precision and efficiency of daily discharge estimation are improved. Observed data from the Luding Hydrological Station were used for testing. The simulation results demonstrate that the precision of this method is high. This provides a new effective method for discharge estimation.
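
    As an illustration of the locally weighted regression step mentioned in this abstract, the following is a minimal one-dimensional sketch. It assumes a Gaussian kernel and a local straight-line fit; the function name, the `bandwidth` parameter and the toy stage/discharge numbers are illustrative assumptions, and the clustering and k-nearest-neighbor stages of the proposed method are not reproduced.

```python
import math


def lwr_predict(x_train, y_train, x_query, bandwidth=1.0):
    """Locally weighted linear regression at a single query point.

    Each training point is weighted by a Gaussian kernel centred on the
    query, then a weighted least-squares line is fitted and evaluated.
    """
    w = [math.exp(-((x - x_query) ** 2) / (2.0 * bandwidth ** 2)) for x in x_train]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x_train)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y_train)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x_train))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x_train, y_train))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x_query - mx)


# Toy stage (x) and discharge (y) values, made up for illustration only.
stage = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
discharge = [2.0 * s + 1.0 for s in stage]
estimate = lwr_predict(stage, discharge, 2.5)
```

    A smaller `bandwidth` makes the fit follow local behaviour more closely at the cost of higher variance; restricting the fit to one cluster of similar samples, as the abstract describes, serves a similar purpose.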

  19. A Universal Reactive Machine

    DEFF Research Database (Denmark)

    Andersen, Henrik Reif; Mørk, Simon; Sørensen, Morten U.

    1997-01-01

    Turing showed the existence of a model universal for the set of Turing machines in the sense that given an encoding of any Turing machine as input the universal Turing machine simulates it. We introduce the concept of universality for reactive systems and construct a CCS process universal...

  20. Regression in autistic spectrum disorders.

    Science.gov (United States)

    Stefanatos, Gerry A

    2008-12-01

    A significant proportion of children diagnosed with Autistic Spectrum Disorder experience a developmental regression characterized by a loss of previously-acquired skills. This may involve a loss of speech or social responsivity, but often entails both. This paper critically reviews the phenomenon of regression in autistic spectrum disorders, highlighting the characteristics of regression, age of onset, temporal course, and long-term outcome. Important considerations for diagnosis are discussed and multiple etiological factors currently hypothesized to underlie the phenomenon are reviewed. It is argued that regressive autistic spectrum disorders can be conceptualized on a spectrum with other regressive disorders that may share common pathophysiological features. The implications of this viewpoint are discussed.

  1. Combining Alphas via Bounded Regression

    Directory of Open Access Journals (Sweden)

    Zura Kakushadze

    2015-11-01

    Full Text Available We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted) regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.

  2. Linear regression in astronomy. I

    Science.gov (United States)

    Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh

    1990-01-01

    Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
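
    The five slope estimators named in this abstract can be written compactly in terms of the two OLS slopes. The sketch below uses the standard closed forms for these lines; the function name is an assumption, and the uncertainty formulas that the paper also provides are not reproduced here.

```python
import math


def five_regression_lines(x, y):
    """Return {method: (slope, intercept)} for five bivariate linear fits."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sign = math.copysign(1.0, sxy)

    b1 = sxy / sxx                      # OLS of Y on X
    b2 = syy / sxy                      # OLS of X on Y, expressed as dy/dx
    b3 = (b1 * b2 - 1.0 + math.sqrt((1.0 + b1 ** 2) * (1.0 + b2 ** 2))) / (b1 + b2)
    d = b2 - 1.0 / b1
    b4 = 0.5 * (d + sign * math.sqrt(4.0 + d ** 2))
    b5 = sign * math.sqrt(b1 * b2)

    slopes = {
        "ols_y_on_x": b1,
        "ols_x_on_y": b2,
        "ols_bisector": b3,
        "orthogonal": b4,
        "reduced_major_axis": b5,
    }
    # Every line passes through the centroid (mx, my).
    return {name: (b, my - b * mx) for name, b in slopes.items()}


# Small made-up data set with scatter.
lines = five_regression_lines([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 1.0, 3.0])
```

    On perfectly correlated data all five lines coincide; with scatter, the two OLS slopes bracket the symmetric estimators. The sketch assumes a nonzero covariance between the variables.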

  3. Airport Noise Prediction Model Research Based on SVR Selective Ensemble

    Institute of Scientific and Technical Information of China (English)

    谢华; 陈海燕; 袁立罡

    2016-01-01

    Airport noise prediction plays an important role in airport planning, flight scheduling and noise control. This paper predicts the noise of individual flight events at the monitoring points around an airport. Because airport noise data are complex, prediction with a single SVR model tends to yield a locally optimal result and cannot reach the desired accuracy. To address this problem, an airport noise prediction method based on SVR selective ensemble is proposed. The Adaboost method is used to sample the airport noise data and train multiple SVR prediction models; a sorting method is then used to select and combine the models into the final airport noise prediction, which achieves good prediction results.

  4. Asynchronized synchronous machines

    CERN Document Server

    Botvinnik, M M

    1964-01-01

    Asynchronized Synchronous Machines focuses on the theoretical research on asynchronized synchronous (AS) machines, which are "hybrids" of synchronous and induction machines that can operate with slip. Topics covered in this book include the initial equations; vector diagram of an AS machine; regulation in cases of deviation from the law of full compensation; parameters of the excitation system; and schematic diagram of an excitation regulator. The possible applications of AS machines and its calculations in certain cases are also discussed. This publication is beneficial for students and indiv

  5. Quantum machine learning.

    Science.gov (United States)

    Biamonte, Jacob; Wittek, Peter; Pancotti, Nicola; Rebentrost, Patrick; Wiebe, Nathan; Lloyd, Seth

    2017-09-13

    Fuelled by increasing computer power and algorithmic advances, machine learning techniques have become powerful tools for finding patterns in data. Quantum systems produce atypical patterns that classical systems are thought not to produce efficiently, so it is reasonable to postulate that quantum computers may outperform classical computers on machine learning tasks. The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers. Recent work has produced quantum algorithms that could act as the building blocks of machine learning programs, but the hardware and software challenges are still considerable.

  6. Precision machine design

    CERN Document Server

    Slocum, Alexander H

    1992-01-01

    This book is a comprehensive engineering exploration of all the aspects of precision machine design - both component and system design considerations for precision machines. It addresses both theoretical analysis and practical implementation, providing many real-world design case studies as well as numerous examples of existing components and their characteristics. Fast becoming a classic, this book includes examples of analysis techniques, along with the philosophy of the solution method. It explores the physics of errors in machines, how such knowledge can be used to build an error budget for a machine, and how error budgets can be used to design more accurate machines.

  7. Statistical learning from a regression perspective

    CERN Document Server

    Berk, Richard A

    2016-01-01

    This textbook considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. This fully revised new edition includes important developments over the past 8 years. Consistent with modern data analytics, it emphasizes that a proper statistical learning data analysis derives from sound data collection, intelligent data management, appropriate statistical procedures, and an accessible interpretation of results. A continued emphasis on the implications for practice runs through the text. Among the statistical learning procedures examined are bagging, random forests, boosting, support vector machines and neural networks. Response variables may be quantitative or categorical. As in the first edition, a unifying theme is supervised learning that can be trea...

  8. Robust Nonlinear Regression: A Greedy Approach Employing Kernels With Application to Image Denoising

    Science.gov (United States)

    Papageorgiou, George; Bouboulis, Pantelis; Theodoridis, Sergios

    2017-08-01

    We consider the task of robust non-linear regression in the presence of both inlier noise and outliers. Assuming that the unknown non-linear function belongs to a Reproducing Kernel Hilbert Space (RKHS), our goal is to estimate the set of the associated unknown parameters. Due to the presence of outliers, common techniques such as the Kernel Ridge Regression (KRR) or the Support Vector Regression (SVR) turn out to be inadequate. Instead, we employ sparse modeling arguments to explicitly model and estimate the outliers, adopting a greedy approach. The proposed robust scheme, i.e., Kernel Greedy Algorithm for Robust Denoising (KGARD), is inspired by the classical Orthogonal Matching Pursuit (OMP) algorithm. Specifically, the proposed method alternates between a KRR task and an OMP-like selection step. Theoretical results concerning the identification of the outliers are provided. Moreover, KGARD is compared against other cutting edge methods, where its performance is evaluated via a set of experiments with various types of noise. Finally, the proposed robust estimation framework is applied to the task of image denoising, and its enhanced performance in the presence of outliers is demonstrated.

  9. Enterprise credit scoring model based on RS-SVR

    Institute of Scientific and Technical Information of China (English)

    陈云; 杨晓雪; 石松

    2016-01-01

    This paper studies the use of credit scoring models to improve banks' decision-making capacity. It applies the support vector regression model to enterprise credit scoring and proposes a support vector regression ensemble model based on random subsets. First, random subset sampling is used to obtain a large number of training data sets. Second, the different training subsets are used to obtain diverse support vector regression models. Finally, the predictions of the different models are combined by simple averaging. Experimental results on enterprise credit scoring data demonstrate the effectiveness of the model.
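
    The three steps in this abstract (draw random training subsets, fit one model per subset, average the predictions) can be sketched as follows. To keep the example self-contained, a one-variable least-squares line stands in for the SVR base learner, which is a deliberate substitution; the function names, subset size and seed are illustrative assumptions, and subsets are drawn over samples rather than features.

```python
import random


def fit_line(xs, ys):
    """Least-squares line; a stand-in base learner, not an SVR."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, my - slope * mx


def random_subset_ensemble(xs, ys, n_models=25, subset_size=5, seed=0):
    """Train one base model per random subset (sampled without replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = rng.sample(range(len(xs)), subset_size)
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def ensemble_predict(models, x):
    """Combine the base models by simple averaging."""
    return sum(slope * x + intercept for slope, intercept in models) / len(models)


# Made-up data: a score that depends linearly on one indicator.
xs = [float(i) for i in range(10)]
ys = [3.0 * x - 2.0 for x in xs]
models = random_subset_ensemble(xs, ys)
score = ensemble_predict(models, 4.0)
```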

  10. A two-stage support-vector-regression optimization model for municipal solid waste management - a case study of Beijing, China.

    Science.gov (United States)

    Dai, C; Li, Y P; Huang, G H

    2011-12-01

    In this study, a two-stage support-vector-regression optimization model (TSOM) is developed for the planning of municipal solid waste (MSW) management in the urban districts of Beijing, China. It represents a new effort to enhance the analysis accuracy in optimizing the MSW management system through coupling the support-vector-regression (SVR) model with an interval-parameter mixed integer linear programming (IMILP). The developed TSOM can not only predict the city's future waste generation amount, but also reflect dynamic, interactive, and uncertain characteristics of the MSW management system. Four kernel functions (linear kernel, polynomial kernel, radial basis function, and multi-layer perceptron kernel) are compared using three quantitative simulation performance criteria [i.e. prediction accuracy (PA), fitting accuracy (FA) and overall accuracy (OA)]. The SVR with polynomial kernel has accurate prediction performance for MSW generation rate, with all of the three quantitative simulation performance criteria being over 96%. Two cases are considered based on different waste management policies. The results are valuable for supporting the adjustment of the existing waste-allocation patterns to raise the city's waste diversion rate, as well as the capacity planning of waste management system to satisfy the city's increasing waste treatment/disposal demands.

  11. Perspex machine: VII. The universal perspex machine

    Science.gov (United States)

    Anderson, James A. D. W.

    2006-01-01

    The perspex machine arose from the unification of projective geometry with the Turing machine. It uses a total arithmetic, called transreal arithmetic, that contains real arithmetic and allows division by zero. Transreal arithmetic is redefined here. The new arithmetic has both a positive and a negative infinity which lie at the extremes of the number line, and a number nullity that lies off the number line. We prove that nullity, 0/0, is a number. Hence a number may have one of four signs: negative, zero, positive, or nullity. It is, therefore, impossible to encode the sign of a number in one bit, as floating-point arithmetic attempts to do, resulting in the difficulty of having both positive and negative zeros and NaNs. Transrational arithmetic is consistent with Cantor arithmetic. In an extension to real arithmetic, the product of zero, an infinity, or nullity with its reciprocal is nullity, not unity. This avoids the usual contradictions that follow from allowing division by zero. Transreal arithmetic has a fixed algebraic structure and does not admit options as IEEE, floating-point arithmetic does. Most significantly, nullity has a simple semantics that is related to zero. Zero means "no value" and nullity means "no information." We argue that nullity is as useful to a manufactured computer as zero is to a human computer. The perspex machine is intended to offer one solution to the mind-body problem by showing how the computable aspects of mind and, perhaps, the whole of mind relates to the geometrical aspects of body and, perhaps, the whole of body. We review some of Turing's writings and show that he held the view that his machine has spatial properties. In particular, that it has the property of being a 7D lattice of compact spaces. Thus, we read Turing as believing that his machine relates computation to geometrical bodies. We simplify the perspex machine by substituting an augmented Euclidean geometry for projective geometry. This leads to a general
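
    The division rule described in this abstract (a nonzero number divided by zero gives a signed infinity, and 0/0 gives nullity) can be illustrated with a small sketch. It is restricted to finite floating-point operands, and the `Nullity` sentinel and function name are assumptions made for illustration; the sentinel is deliberately not a floating-point NaN, since the abstract distinguishes the two.

```python
import math


class Nullity:
    """Sentinel for the transreal number nullity (0/0)."""

    def __repr__(self):
        return "nullity"


NULLITY = Nullity()


def transreal_div(a, b):
    """Divide finite floats a by b without raising on b == 0.

    Only the cases stated in the abstract are covered: ordinary division,
    positive or negative infinity for nonzero/0, and nullity for 0/0.
    """
    if b != 0:
        return a / b
    if a > 0:
        return math.inf
    if a < 0:
        return -math.inf
    return NULLITY


results = [transreal_div(1.0, 0.0), transreal_div(-2.0, 0.0), transreal_div(0.0, 0.0)]
```

    Python treats `-0.0 == 0` as true, so a negative zero divisor follows the same branch as zero here, which matches the abstract's point that the arithmetic has a single zero.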

  12. Time-adaptive quantile regression

    DEFF Research Database (Denmark)

    Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik

    2008-01-01

    An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method...... and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power...... production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered....
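
    Quantile regression, whether solved by the simplex method as in this abstract or otherwise, minimizes the asymmetric "check" (pinball) loss. The sketch below shows only that loss and a brute-force constant-quantile fit on toy data; it does not reproduce the time-adaptive simplex updating scheme, and the function names are illustrative.

```python
def pinball_loss(y, q, tau):
    """Check loss for quantile level tau: under-prediction costs tau per unit,
    over-prediction costs (1 - tau) per unit."""
    diff = y - q
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def best_constant_quantile(ys, tau):
    """Brute force: the observed value that minimises the total pinball loss."""
    return min(ys, key=lambda q: sum(pinball_loss(y, q, tau) for y in ys))


# Toy observations, for illustration only.
observations = [1, 2, 3, 4, 5, 6, 7, 8, 9]
median = best_constant_quantile(observations, 0.5)
upper = best_constant_quantile(observations, 0.9)
```

    Because the loss is piecewise linear in the fitted value, the full regression problem can be posed as a linear program, which is what makes a simplex-based solver applicable.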

  13. Linear regression in astronomy. II

    Science.gov (United States)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.

  14. Polynomial Regression on Riemannian Manifolds

    CERN Document Server

    Hinkle, Jacob; Fletcher, P Thomas; Joshi, Sarang

    2012-01-01

    In this paper we develop the theory of parametric polynomial regression in Riemannian manifolds and Lie groups. We show application of Riemannian polynomial regression to shape analysis in Kendall shape space. Results are presented, showing the power of polynomial regression on the classic rat skull growth data of Bookstein as well as the analysis of the shape changes associated with aging of the corpus callosum from the OASIS Alzheimer's study.

  15. Addressing uncertainty in atomistic machine learning.

    Science.gov (United States)

    Peterson, Andrew A; Christensen, Rune; Khorshidi, Alireza

    2017-05-10

    Machine-learning regression has been demonstrated to precisely emulate the potential energy and forces that are output from more expensive electronic-structure calculations. However, to predict new regions of the potential energy surface, an assessment must be made of the credibility of the predictions. In this perspective, we address the types of errors that might arise in atomistic machine learning, the unique aspects of atomistic simulations that make machine-learning challenging, and highlight how uncertainty analysis can be used to assess the validity of machine-learning predictions. We suggest this will allow researchers to more fully use machine learning for the routine acceleration of large, high-accuracy, or extended-time simulations. In our demonstrations, we use a bootstrap ensemble of neural network-based calculators, and show that the width of the ensemble can provide an estimate of the uncertainty when the width is comparable to that in the training data. Intriguingly, we also show that the uncertainty can be localized to specific atoms in the simulation, which may offer hints for the generation of training data to strategically improve the machine-learned representation.
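
    The idea of reading uncertainty from the width of a bootstrap ensemble can be sketched in a few lines. The abstract uses neural-network calculators; here a least-squares line stands in as the base model purely to keep the example self-contained, and the function names, ensemble size and toy data are assumptions.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares line; a stand-in for the neural-network calculators."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, my - slope * mx


def bootstrap_ensemble(xs, ys, n_models=200, seed=0):
    """Fit one model per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_with_spread(models, x):
    """Ensemble mean and standard deviation; the latter is the uncertainty proxy."""
    preds = [slope * x + intercept for slope, intercept in models]
    return statistics.fmean(preds), statistics.pstdev(preds)


# Made-up noisy training data on the interval [0, 9].
xs = [float(i) for i in range(10)]
ys = [2.0 * x + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]
models = bootstrap_ensemble(xs, ys)
_, spread_inside = predict_with_spread(models, 4.5)
_, spread_outside = predict_with_spread(models, 50.0)
```

    The spread is expected to be larger for a query far outside the training interval than for one inside it, which is the qualitative signal the abstract proposes to use when deciding whether a prediction is credible.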

  16. The Fuzzy Cluster Analysis in Identification of Key Temperatures in Machine Tool

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The thermally induced error is a very important source of machining errors of machine tools. To compensate for the thermally induced machining errors, a model of the relationship between the thermal field and the deformations is needed. The relationship can be deduced by means of FEM (Finite Element Method), ANN (Artificial Neural Network) or MRA (Multiple Regression Analysis). MRA is based on a total understanding of the temperature distribution of the machine tool. Although the more the temperatures measu...

  17. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    Science.gov (United States)

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  18. Fuzzy and Regression Modelling of Hard Milling Process

    Directory of Open Access Journals (Sweden)

    A. Tamilarasan

    2014-04-01

    Full Text Available The present study highlights the application of a Box-Behnken design coupled with fuzzy and regression modeling to build an expert system for the hard milling process, with the aim of improving process performance while systematically reducing production cost. The input factors considered were workpiece hardness, nose radius, feed per tooth, radial depth of cut and axial depth of cut. The cutting forces, work surface temperature and sound pressure level were identified as the key machining outputs. The results indicate that the fuzzy logic and regression modeling techniques can be used effectively to predict the desired responses with low average error. Predicted results were verified by experiments and showed the good potential of the developed system for an automated machining environment.

  19. Quantile regression theory and applications

    CERN Document Server

    Davino, Cristina; Vistocco, Domenico

    2013-01-01

    A guide to the implementation and interpretation of Quantile Regression models This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensive description of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and

  20. Business applications of multiple regression

    CERN Document Server

    Richardson, Ronny

    2015-01-01

    This second edition of Business Applications of Multiple Regression describes the use of the statistical procedure called multiple regression in business situations, including forecasting and understanding the relationships between variables. The book assumes a basic understanding of statistics but reviews correlation analysis and simple regression to prepare the reader to understand and use multiple regression. The techniques described in the book are illustrated using both Microsoft Excel and a professional statistical program. Along the way, several real-world data sets are analyzed in deta

  1. Analysis of the effect of ultrasonic vibrations on the performance of micro-electrical discharge machining of A2 tool steel

    DEFF Research Database (Denmark)

    Puthumana, Govindan

    2016-01-01

    The application of ultrasonic vibrations to a workpiece or tool is a novel hybrid approach in micro-electrical discharge machining. The advantages of this method include effective flushing out of debris, higher machining efficiency and fewer short circuits during machining. This paper presents...... effective at higher machining depths for achieving stable machining conditions. Regression equations were developed for MRR and TWR with capacitance, ultrasonic vibration factor, feed rate and machining time....

  2. Pileup Subtraction and Jet Energy Prediction Using Machine Learning

    CERN Document Server

    Kong, Vein S; Zhang, Yujia

    2015-01-01

    In the Large Hadron Collider (LHC), multiple proton-proton collisions cause pileup in reconstructing energy information for a single primary collision (jet). This project aims to select the most important features and create a model to accurately estimate jet energy. Different machine learning methods were explored, including linear regression, support vector regression and decision trees. The best result is obtained by linear regression with predictive features, and the performance is improved significantly over the baseline method.

  3. Credit Scoring Model Hybridizing Artificial Intelligence with Logistic Regression

    Directory of Open Access Journals (Sweden)

    Han Lu

    2013-01-01

    Today the most commonly used techniques for credit scoring are artificial intelligence and statistics. In this paper, we propose a new way of combining these two kinds of models. Logistic regression is used to filter out highly correlated variables, which reduces the complexity of the artificial intelligence models and accelerates their convergence; the resulting hybrid models are also easier to interpret in terms of statistical significance, which improves the effectiveness of the artificial intelligence models. In experiments on the German data set, we find an interesting phenomenon, which we call 'dimensional interference', with the support vector machine, and cross-validation shows that the new method substantially helps credit scoring.

  4. Supporting Regularized Logistic Regression Privately and Efficiently.

    Directory of Open Access Journals (Sweden)

    Wenfa Li

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data on human subjects that are subject to strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nevertheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.

  5. Supporting Regularized Logistic Regression Privately and Efficiently.

    Science.gov (United States)

    Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei

    2016-01-01

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data on human subjects that are subject to strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nevertheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.

  6. Supporting Regularized Logistic Regression Privately and Efficiently

    Science.gov (United States)

    Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei

    2016-01-01

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data on human subjects that are subject to strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nevertheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc. PMID:27271738

  7. The Application of GA-SVR Method in Financial Time Series Prediction

    Institute of Scientific and Technical Information of China (English)

    焦帅; 颜七笙

    2012-01-01

    When the support vector machine method is applied to financial time series forecasting, improperly selected model parameters often lead to low prediction accuracy and related problems. To address this, a financial time series forecasting model is established in which a genetic algorithm optimizes the SVM parameters. The model is applied to time series prediction of China's Shanghai Composite Index, and the experimental results show that the method better captures the regularities of financial time series and improves the prediction accuracy of the model.
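
    The idea of letting a genetic algorithm pick SVM parameters can be sketched roughly as follows. This is only an illustration, not the model from the record above: the function validation_error is a hypothetical stand-in for the cross-validation error of an SVR trained with C = 10**log_c and gamma = 10**log_gamma, and the population size, selection scheme and mutation width are all assumptions.

```python
import random


def validation_error(log_c, log_gamma):
    """Hypothetical stand-in for an SVR validation-error surface.

    A real study would train an SVR and return its held-out error; this
    bowl-shaped function with its minimum at (1.0, -2.0) is an assumption.
    """
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


def genetic_search(fitness, bounds, pop_size=30, generations=40, seed=0):
    """Minimise `fitness` over a box with a small real-coded genetic algorithm."""
    rng = random.Random(seed)
    # Random initial population inside the search box.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(*ind))
        parents = pop[: pop_size // 2]  # truncation selection; the best survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = []
            for (lo, hi), x, y in zip(bounds, a, b):
                w = rng.random()
                # Arithmetic crossover plus Gaussian mutation, clipped to bounds.
                gene = w * x + (1 - w) * y + rng.gauss(0.0, 0.1)
                child.append(min(hi, max(lo, gene)))
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda ind: fitness(*ind))


# Search log10(C) in [-3, 3] and log10(gamma) in [-5, 1].
best = genetic_search(validation_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)])
print(best, validation_error(*best))
```

    Searching in log space is a common choice because useful values of C and gamma typically span several orders of magnitude; swapping in a real cross-validated error would only require replacing validation_error.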

  8. Machinability of advanced materials

    CERN Document Server

    Davim, J Paulo

    2014-01-01

    Machinability of Advanced Materials addresses the level of difficulty involved in machining a material, or multiple materials, with the appropriate tooling and cutting parameters.  A variety of factors determine a material's machinability, including tool life rate, cutting forces and power consumption, surface integrity, limiting rate of metal removal, and chip shape. These topics, among others, and multiple examples comprise this research resource for engineering students, academics, and practitioners.

  9. Pattern recognition & machine learning

    CERN Document Server

    Anzai, Y

    1992-01-01

    This is the first text to provide a unified and self-contained introduction to visual pattern recognition and machine learning. It is useful as a general introduction to artificial intelligence and knowledge engineering, and no previous knowledge of pattern recognition or machine learning is necessary. Basic for various pattern recognition and machine learning methods. Translated from Japanese, the book also features chapter exercises, keywords, and summaries.

  10. Support vector machines applications

    CERN Document Server

    Guo, Guodong

    2014-01-01

    Support vector machines (SVM) have both a solid mathematical background and good performance in practical applications. This book focuses on the recent advances and applications of the SVM in different areas, such as image processing, medical practice, computer vision, pattern recognition, machine learning, applied statistics, business intelligence, and artificial intelligence. The aim of this book is to create a comprehensive source on support vector machine applications, especially some recent advances.

  11. Machining of titanium alloys

    CERN Document Server

    2014-01-01

    This book presents a collection of examples illustrating the recent research advances in the machining of titanium alloys. These materials have excellent strength and fracture toughness as well as low density and good corrosion resistance; however, machinability is still poor due to their low thermal conductivity and high chemical reactivity with cutting tool materials. This book presents solutions to enhance machinability in titanium-based alloys and serves as a useful reference to professionals and researchers in aerospace, automotive and biomedical fields.

  12. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin

    2017-01-19

    In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100

  13. Logistic Regression: Concept and Application

    Science.gov (United States)

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  14. Rotating electrical machines

    CERN Document Server

    Le Doeuff, René

    2013-01-01

    In this book a general matrix-based approach to modeling electrical machines is promulgated. The model uses instantaneous quantities for key variables and enables the user to easily take into account associations between rotating machines and static converters (such as in variable speed drives).   General equations of electromechanical energy conversion are established early in the treatment of the topic and then applied to synchronous, induction and DC machines. The primary characteristics of these machines are established for steady state behavior as well as for variable speed scenarios. I

  15. Chaotic Boltzmann machines.

    Science.gov (United States)

    Suzuki, Hideyuki; Imura, Jun-ichi; Horio, Yoshihiko; Aihara, Kazuyuki

    2013-01-01

    The chaotic Boltzmann machine proposed in this paper is a chaotic pseudo-billiard system that works as a Boltzmann machine. Chaotic Boltzmann machines are shown numerically to have computing abilities comparable to conventional (stochastic) Boltzmann machines. Since no randomness is required, efficient hardware implementation is expected. Moreover, the ferromagnetic phase transition of the Ising model is shown to be characterised by the largest Lyapunov exponent of the proposed system. In general, a method to relate probabilistic models to nonlinear dynamics by derandomising Gibbs sampling is presented.

  16. Tribology in machine design

    CERN Document Server

    Stolarski, Tadeusz

    1999-01-01

    "Tribology in Machine Design is strongly recommended for machine designers, and engineers and scientists interested in tribology. It should be in the engineering library of companies producing mechanical equipment." Applied Mechanics Review. Tribology in Machine Design explains the role of tribology in the design of machine elements. It shows how algorithms developed from the basic principles of tribology can be used in a range of practical applications within mechanical devices and systems. The computer offers today's designer the possibility of greater stringen

  17. Debugging the virtual machine

    Energy Technology Data Exchange (ETDEWEB)

    Miller, P.; Pizzi, R.

    1994-09-02

    A computer program is really nothing more than a virtual machine built to perform a task. The program's source code expresses abstract constructs using low level language features. When a virtual machine breaks, it can be very difficult to debug because typical debuggers provide only low level machine implementation information to the software engineer. We believe that the debugging task can be simplified by introducing aspects of the abstract design into the source code. We introduce OODIE, an object-oriented language extension that allows programmers to specify a virtual debugging environment which includes the design and abstract data types of the virtual machine.

  18. Electrical machines & drives

    CERN Document Server

    Hammond, P

    1985-01-01

    Containing approximately 200 problems (100 worked), the text covers a wide range of topics concerning electrical machines, placing particular emphasis upon electrical-machine drive applications. The theory is concisely reviewed and focuses on features common to all machine types. The problems are arranged in order of increasing levels of complexity and discussions of the solutions are included where appropriate to illustrate the engineering implications. This second edition includes an important new chapter on mathematical and computer simulation of machine systems and revised discussions o

  19. Machine learning with R

    CERN Document Server

    Lantz, Brett

    2013-01-01

    Written as a tutorial to explore and understand the power of R for machine learning, this practical guide covers all of the need-to-know topics in a very systematic way. For each machine learning approach, each step in the process is detailed, from preparing the data for analysis to evaluating the results. These steps will build the knowledge you need to apply them to your own data science tasks. Intended for those who want to learn how to use R's machine learning capabilities and gain insight from your data. Perhaps you already know a bit about machine learning, but have never used R; or

  20. Induction machine handbook

    CERN Document Server

    Boldea, Ion

    2002-01-01

    Often called the workhorse of industry, the advent of power electronics and advances in digital control are transforming the induction motor into the racehorse of industrial motion control. Now, the classic texts on induction machines are nearly three decades old, while more recent books on electric motors lack the necessary depth and detail on induction machines. The Induction Machine Handbook fills industry's long-standing need for a comprehensive treatise embracing the many intricate facets of induction machine analysis and design. Moving gradually from simple to complex and from standard to

  1. Fungible weights in logistic regression.

    Science.gov (United States)

    Jones, Jeff A; Waller, Niels G

    2016-06-01

    In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record

  2. The prediction of paper properties from spectrometric data with machine learning

    OpenAIRE

    Jakovac, Alen

    2011-01-01

    In this thesis we present a solution for the problem of predicting the chemical and physical properties of paper from spectrometric data. We used a data set that consists of over 1000 samples of paper. For each sample 15 chemical and physical properties and its near-infrared spectra were measured. We used the following machine learning methods to predict the properties of paper: linear regression, pace regression, a nearest neighbor-based model, regression trees, a support vector machine, ...

  3. Regression Testing Cost Reduction Suite

    Directory of Open Access Journals (Sweden)

    Mohamed Alaa El-Din

    2014-08-01

    The estimated cost of software maintenance exceeds 70 percent of total software costs [1], and a large portion of this maintenance expense is devoted to regression testing. Regression testing is an expensive and frequently executed maintenance activity used to revalidate the modified software. Any reduction in the cost of regression testing would help to reduce the software maintenance cost. Test suites, once developed, are reused and updated frequently as the software evolves. As a result, some test cases in the test suite may become redundant when the software is modified over time, since the requirements covered by them are also covered by other test cases. Due to the resource and time constraints for re-executing large test suites, it is important to develop techniques to minimize available test suites by removing redundant test cases. In general, the test suite minimization problem is NP-complete. This paper focuses on proposing an effective approach for reducing the cost of the regression testing process. The proposed approach is applied to a real-time case study. It was found that the reduction in the cost of each regression testing cycle is greatest for programs containing a high number of selected statements, which in turn maximizes the benefits of using the approach in regression testing of complex software systems. The reduction in the regression test suite size will reduce the effort and time required by the testing teams to execute the regression test suite. Since regression testing is done more frequently in the software maintenance phase, the overall software maintenance cost can be reduced considerably by applying the proposed approach.
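
    Because test suite minimization is NP-complete in general, reduced suites are usually found heuristically. The sketch below shows the standard greedy set-cover heuristic as one illustration of removing redundant test cases; it is not the approach proposed in the record above, and the test names and requirement labels are made up for the example.

```python
def minimize_suite(coverage):
    """Greedy set-cover heuristic for test-suite minimization.

    `coverage` maps a test name to the set of requirements it covers.
    Returns a list of tests that together cover every requirement.
    """
    uncovered = set().union(*coverage.values())
    selected = []
    while uncovered:
        # Pick the test covering the most still-uncovered requirements;
        # iterating in sorted order keeps tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    return selected


# Hypothetical suite: t1 and t2 are redundant once t4 is selected.
coverage = {
    "t1": {"r1", "r2"},
    "t2": {"r2"},
    "t3": {"r3", "r4"},
    "t4": {"r1", "r2", "r3"},
}
reduced = minimize_suite(coverage)
print(reduced)  # -> ['t4', 't3']
```

    The greedy choice does not guarantee the smallest possible suite, but it always preserves full requirement coverage, which is the property a regression suite must keep.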

  4. Virtual machine vs Real Machine: Security Systems

    Directory of Open Access Journals (Sweden)

    Dr. C. Suresh Gnana Das

    2009-08-01

    This paper argues that the operating system and applications currently running on a real machine should relocate into a virtual machine. This structure enables services to be added below the operating system and to do so without trusting or modifying the operating system or applications. To demonstrate the usefulness of this structure, we describe three services that take advantage of it: secure logging, intrusion prevention and detection, and environment migration. In particular, we can provide services below the guest operating system without trusting or modifying it. We believe providing services at this layer is especially useful for enhancing security and mobility. This position paper describes the general benefits and challenges that arise from running most applications in a virtual machine, and then describes some example services and alternative ways to provide those services.

  5. Rank regression: an alternative regression approach for data with outliers.

    Science.gov (United States)

    Chen, Tian; Tang, Wan; Lu, Ying; Tu, Xin

    2014-10-01

    Linear regression models are widely used in mental health and related health services research. However, the classic linear regression analysis assumes that the data are normally distributed, an assumption that is not met by the data obtained in many studies. One method of dealing with this problem is to use semi-parametric models, which do not require that the data be normally distributed. But semi-parametric models are quite sensitive to outlying observations, so the generated estimates are unreliable when study data includes outliers. In this situation, some researchers trim the extreme values prior to conducting the analysis, but the ad-hoc rules used for data trimming are based on subjective criteria so different methods of adjustment can yield different results. Rank regression provides a more objective approach to dealing with non-normal data that includes outliers. This paper uses simulated and real data to illustrate this useful regression approach for dealing with outliers and compares it to the results generated using classical regression models and semi-parametric regression models.

  6. Stirling machine operating experience

    Energy Technology Data Exchange (ETDEWEB)

    Ross, B. [Stirling Technology Co., Richland, WA (United States); Dudenhoefer, J.E. [Lewis Research Center, Cleveland, OH (United States)

    1994-09-01

    Numerous Stirling machines have been built and operated, but the operating experience of these machines is not well known. It is important to examine this operating experience in detail, because it largely substantiates the claim that stirling machines are capable of reliable and lengthy operating lives. The amount of data that exists is impressive, considering that many of the machines that have been built are developmental machines intended to show proof of concept, and are not expected to operate for lengthy periods of time. Some Stirling machines (typically free-piston machines) achieve long life through non-contact bearings, while other Stirling machines (typically kinematic) have achieved long operating lives through regular seal and bearing replacements. In addition to engine and system testing, life testing of critical components is also considered. The record in this paper is not complete, due to the reluctance of some organizations to release operational data and because several organizations were not contacted. The authors intend to repeat this assessment in three years, hoping for even greater participation.

  7. Perpetual Motion Machine

    Directory of Open Access Journals (Sweden)

    D. Tsaousis

    2008-01-01

    Ever since the first century A.D. there have been descriptions of devices built in the attempt to create perpetual motion machines. Although physics, through the two laws of thermodynamics, has led to the conclusion that a perpetual motion machine cannot be built, inventors of every age and educational level continue to claim that they have invented something «entirely new» or improved somebody else’s invention, which «will function henceforth perpetually»! However, the failure so far to build a perpetual motion machine does not make the countless historical accounts of these fictional machines uninteresting. Discussing each version of a perpetual motion machine gives the chance, on the one hand, to understand the level of knowledge and the way of thinking of the inventors of each period and, on the other hand, to locate the points where the «perpetual motion machine» clashes with the laws of nature, which is why it could not have been built or have functioned. The presentation of a new «perpetual motion machine» aroused our interest in locating its weak points. According to its designer, the machine functions with the work produced by the buoyant force

  8. Machine Intelligence and Explication

    NARCIS (Netherlands)

    Wieringa, Roelf J.

    1987-01-01

    This report is an MA ("doctoraal") thesis submitted to the department of philosophy, University of Amsterdam. It attempts to answer the question whether machines can think by conceptual analysis. Ideally, a conceptual analysis should give plausible explications of the concepts of "machine" and "inte

  9. Microsoft Azure machine learning

    CERN Document Server

    Mund, Sumit

    2015-01-01

    The book is intended for those who want to learn how to use Azure Machine Learning. Perhaps you already know a bit about Machine Learning, but have never used ML Studio in Azure; or perhaps you are an absolute newbie. In either case, this book will get you up-and-running quickly.

  10. Reactive Turing machines

    NARCIS (Netherlands)

    Baeten, J.C.M.; Luttik, B.; Tilburg, P.J.A. van

    2013-01-01

    We propose reactive Turing machines (RTMs), extending classical Turing machines with a process-theoretical notion of interaction, and use it to define a notion of executable transition system. We show that every computable transition system with a bounded branching degree is simulated modulo diverge

  11. Machine Intelligence and Explication

    NARCIS (Netherlands)

    Wieringa, Roel

    1987-01-01

    This report is an MA ("doctoraal") thesis submitted to the department of philosophy, University of Amsterdam. It attempts to answer the question whether machines can think by conceptual analysis. Ideally, a conceptual analysis should give plausible explications of the concepts of "machine" and "inte

  12. Coordinate measuring machines

    DEFF Research Database (Denmark)

    De Chiffre, Leonardo

    This document is used in connection with three exercises of 2 hours duration as a part of the course GEOMETRICAL METROLOGY AND MACHINE TESTING. The exercises concern three aspects of coordinate measuring: 1) Measuring and verification of tolerances on coordinate measuring machines, 2) Traceability...

  13. Simple Machine Junk Cars

    Science.gov (United States)

    Herald, Christine

    2010-01-01

    During the month of May, the author's eighth-grade physical science students study the six simple machines through hands-on activities, reading assignments, videos, and notes. At the end of the month, they can easily identify the six types of simple machine: inclined plane, wheel and axle, pulley, screw, wedge, and lever. To conclude this unit,…

  14. Human Machine Learning Symbiosis

    Science.gov (United States)

    Walsh, Kenneth R.; Hoque, Md Tamjidul; Williams, Kim H.

    2017-01-01

    Human Machine Learning Symbiosis is a cooperative system where both the human learner and the machine learner learn from each other to create an effective and efficient learning environment adapted to the needs of the human learner. Such a system can be used in online learning modules so that the modules adapt to each learner's learning state both…

  15. Machine learning with R

    CERN Document Server

    Lantz, Brett

    2015-01-01

    Perhaps you already know a bit about machine learning but have never used R, or perhaps you know a little R but are new to machine learning. In either case, this book will get you up and running quickly. It would be helpful to have a bit of familiarity with basic programming concepts, but no prior experience is required.

  16. Support Vector Regression-Based Adaptive Divided Difference Filter for Nonlinear State Estimation Problems

    Directory of Open Access Journals (Sweden)

    Hongjian Wang

    2014-01-01

    We present a support vector regression-based adaptive divided difference filter (SVRADDF) algorithm for improving the low state estimation accuracy of nonlinear systems, which are typically affected by large initial estimation errors and imprecise prior knowledge of process and measurement noises. The derivative-free SVRADDF algorithm is significantly simpler to compute than other methods and is implemented using only functional evaluations. The SVRADDF algorithm involves the use of the theoretical and actual covariance of the innovation sequence. Support vector regression (SVR) is employed to generate the adaptive factor to tune the noise covariance at each sampling instant when the measurement update step executes, which improves the algorithm’s robustness. The performance of the proposed algorithm is evaluated by estimating states for (i) an underwater nonmaneuvering target bearing-only tracking system and (ii) maneuvering target bearing-only tracking in an air-traffic control system. The simulation results show that the proposed SVRADDF algorithm exhibits better performance when compared with a traditional DDF algorithm.

  17. 15 CFR 700.31 - Metalworking machines.

    Science.gov (United States)

    2010-01-01

    ... Drilling and tapping machines Electrical discharge, ultrasonic and chemical erosion machines Forging..., power driven Machining centers and way-type machines Manual presses Mechanical presses, power...

  18. LHC Report: machine development

    CERN Multimedia

    Rogelio Tomás García for the LHC team

    2015-01-01

    Machine development weeks are carefully planned in the LHC operation schedule to optimise and further study the performance of the machine. The first machine development session of Run 2 ended on Saturday, 25 July. Despite various hiccoughs, it allowed the operators to make great strides towards improving the long-term performance of the LHC.   The main goals of this first machine development (MD) week were to determine the minimum beam-spot size at the interaction points given existing optics and collimation constraints; to test new beam instrumentation; to evaluate the effectiveness of performing part of the beam-squeezing process during the energy ramp; and to explore the limits on the number of protons per bunch arising from the electromagnetic interactions with the accelerator environment and the other beam. Unfortunately, a series of events reduced the machine availability for studies to about 50%. The most critical issue was the recurrent trip of a sextupolar corrector circuit –...

  19. Micro-machining.

    Science.gov (United States)

    Brinksmeier, Ekkard; Preuss, Werner

    2012-08-28

    Manipulating bulk material at the atomic level is considered to be the domain of physics, chemistry and nanotechnology. However, precision engineering, especially micro-machining, has become a powerful tool for controlling the surface properties and sub-surface integrity of optical, electronic and mechanical functional parts in a regime where continuum mechanics is left behind and the quantum nature of matter comes into play. The surprising subtlety of micro-machining results from the extraordinary precision of tools, machines and controls, which now extends into the nanometre range, a hundred times more precise than the wavelength of light. In this paper, we outline the development of precision engineering, highlight modern achievements of ultra-precision machining and discuss the necessity of a deeper physical understanding of micro-machining.

  20. Introduction to machine learning.

    Science.gov (United States)

    Baştanlar, Yalin; Ozuysal, Mustafa

    2014-01-01

    The machine learning field, which can be briefly defined as enabling computers to make successful predictions using past experiences, has developed impressively in recent years with the help of the rapid increase in the storage capacity and processing power of computers. Together with many other disciplines, machine learning methods have been widely employed in bioinformatics. The difficulties and cost of biological analyses have led to the development of sophisticated machine learning approaches for this application area. In this chapter, we first review the fundamental concepts of machine learning such as feature assessment, unsupervised versus supervised learning and types of classification. Then, we point out the main issues of designing machine learning experiments and their performance evaluation. Finally, we introduce some supervised learning methods.

  1. Twin support vector machines models, extensions and applications

    CERN Document Server

    Jayadeva; Chandra, Suresh

    2017-01-01

    This book provides a systematic and focused study of the various aspects of twin support vector machines (TWSVM) and related developments for classification and regression. In addition to presenting most of the basic models of TWSVM and twin support vector regression (TWSVR) available in the literature, it also discusses the important and challenging applications of this new machine learning methodology. A chapter on “Additional Topics” has been included to discuss kernel optimization and support tensor machine topics, which are comparatively new but have great potential in applications. It is primarily written for graduate students and researchers in the area of machine learning and related topics in computer science, mathematics, electrical engineering, management science and finance.

  2. Multiple Regression and Its Discontents

    Science.gov (United States)

    Snell, Joel C.; Marsh, Mitchell

    2012-01-01

    Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.

  4. Regression methods for medical research

    CERN Document Server

    Tai, Bee Choo

    2013-01-01

    Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demand an understanding of more complex and sophisticated analytic procedures. The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the

  5. Forecasting with Dynamic Regression Models

    CERN Document Server

    Pankratz, Alan

    2012-01-01

    Single-equation regression models, one of the most widely used tools in statistical forecasting, are examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the autocorrelation patterns of regression disturbances. It also includes six case studies.

  6. Wrong Signs in Regression Coefficients

    Science.gov (United States)

    McGee, Holly

    1999-01-01

    When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
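
    One of the causes listed above, a missing variable, can be illustrated with a small sketch. The data and the "true" model below are hypothetical and are not taken from the paper; the sketch only shows how omitting a correlated predictor can flip the sign of the remaining coefficient.

```python
def centered_sums(u, v):
    """Return the sum of cross-products of u and v about their means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def simple_slope(x, y):
    """OLS slope of y on a single predictor x (intercept included)."""
    return centered_sums(x, y) / centered_sums(x, x)


def two_predictor_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 (intercept included), via the normal equations."""
    s11, s22 = centered_sums(x1, x1), centered_sums(x2, x2)
    s12 = centered_sums(x1, x2)
    s1y, s2y = centered_sums(x1, y), centered_sums(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Hypothetical data: x2 tracks x1 closely, and the true model is y = 2*x1 - 3*x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.5, 1.5, 3.0, 4.5, 4.5]
y = [2 * a - 3 * b for a, b in zip(x1, x2)]

# With x2 omitted, the slope on x1 absorbs part of x2's negative effect.
slope_without_x2 = simple_slope(x1, y)
# With x2 included, the coefficient on x1 recovers its positive sign.
b1_full, b2_full = two_predictor_slopes(x1, x2, y)
```

    Comparing the coefficient before and after adding a suspected missing variable is one simple diagnostic in this spirit; the paper's own techniques are not reproduced here.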

  7. From Rasch scores to regression

    DEFF Research Database (Denmark)

    Christensen, Karl Bang

    2006-01-01

    Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale propertie....... This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score....

  8. Analysis on Operator Functional State of Human-Machine System Based on Approach of LSSVM Regressive Model

    Institute of Scientific and Technical Information of China (English)

    秦攀攀; 张建华

    2012-01-01

    Objective: To construct an optimum mathematical model for estimating Operator Functional State (OFS) in a human-machine system. Methods: The Least Squares Support Vector Machine (LSSVM) approach was adopted to construct OFS models from multiple physiological and performance data. The model parameters were optimized with grid-search and 10-fold cross-validation techniques. The modeling results of the LSSVM approach were compared with those of a Genetic-Algorithms-based Mamdani (GA-Mamdani) type fuzzy modeling method. Results: The LSSVM model was shown to be capable of capturing the actual fluctuations of the OFS over time. In general, the overall modeling error (indicated by the RMSE index) of the LSSVM model was acceptable and smaller than that of the GA-Mamdani model. Conclusion: The data-driven LSSVM modeling approach is effective for OFS estimation thanks to its superior generalization performance.
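
    The combination of LSSVM regression, grid search and 10-fold cross-validation described above can be sketched as follows. This is a minimal illustration under stated assumptions: a Gaussian (RBF) kernel, a single scalar input, a synthetic signal in place of the physiological data, and an arbitrary parameter grid. None of these choices is taken from the paper.

```python
import math


def rbf(a, b, sigma):
    """Gaussian (RBF) kernel between two scalar inputs (assumed kernel choice)."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][c] * z[c] for c in range(i + 1, n))) / M[i][i]
    return z


def lssvm_fit(xs, ys, gamma, sigma):
    """Solve the LSSVM system [[0, 1^T], [1, K + I/gamma]] [b, alpha] = [0, y]."""
    n = len(xs)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(xs[i], xs[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma
        A.append(row)
    z = solve(A, [0.0] + list(ys))
    return z[0], z[1:]  # bias b, dual weights alpha


def lssvm_predict(x, xs, b, alpha, sigma):
    """Evaluate f(x) = b + sum_i alpha_i * k(x, x_i)."""
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, xs))


def cv_rmse(xs, ys, gamma, sigma, k=10):
    """k-fold cross-validated RMSE; fold membership is the sample index modulo k."""
    sq_err = 0.0
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        b, alpha = lssvm_fit(tx, ty, gamma, sigma)
        sq_err += sum((lssvm_predict(xs[i], tx, b, alpha, sigma) - ys[i]) ** 2 for i in test)
    return math.sqrt(sq_err / len(xs))


# Hypothetical data: 30 samples of a smooth signal standing in for an OFS measure.
xs = [0.2 * i for i in range(30)]
ys = [math.sin(x) for x in xs]

# Grid search: keep the (gamma, sigma) pair with the lowest cross-validated RMSE.
grid = [(g, s) for g in (1.0, 10.0, 100.0) for s in (0.5, 1.0, 2.0)]
scores = {params: cv_rmse(xs, ys, *params) for params in grid}
best_gamma, best_sigma = min(scores, key=scores.get)
best_rmse = scores[(best_gamma, best_sigma)]
```

    Unlike standard SVR, the LSSVM replaces the quadratic program with a single linear system, which is why a plain linear solver is enough for this sketch.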

  9. Machine learning in sedimentation modelling.

    Science.gov (United States)

    Bhattacharya, B; Solomatine, D P

    2006-03-01

    The paper presents machine learning (ML) models that predict sedimentation in the harbour basin of the Port of Rotterdam. The important factors affecting the sedimentation process such as waves, wind, tides, surge, river discharge, etc. are studied, the corresponding time series data is analysed, missing values are estimated and the most important variables behind the process are chosen as the inputs. Two ML methods are used: MLP ANN and M5 model tree. The latter is a collection of piece-wise linear regression models, each being an expert for a particular region of the input space. The models are trained on the data collected during 1992-1998 and tested by the data of 1999-2000. The predictive accuracy of the models is found to be adequate for the potential use in the operational decision making.

  10. Support vector machines optimization based theory, algorithms, and extensions

    CERN Document Server

    Deng, Naiyang; Zhang, Chunhua

    2013-01-01

    Support Vector Machines: Optimization Based Theory, Algorithms, and Extensions presents an accessible treatment of the two main components of support vector machines (SVMs): classification problems and regression problems. The book emphasizes the close connection between optimization theory and SVMs since optimization is one of the pillars on which SVMs are built. The authors share insight on many of their research achievements. They give a precise interpretation of statistical learning theory for C-support vector classification. They also discuss regularized twi

  11. Volumetric verification of multiaxis machine tool using laser tracker.

    Science.gov (United States)

    Aguado, Sergio; Samper, David; Santolaria, Jorge; Aguilar, Juan José

    2014-01-01

    This paper presents a method of volumetric verification for machine tools with linear and rotary axes using a laser tracker. Beyond a method for a particular machine, it presents a methodology that can be used for any machine type. The paper describes the schema and kinematic model of a machine with three axes of movement (two linear and one rotational), including the measurement system and the nominal rotation matrix of the rotational axis. Using this model, the machine tool volumetric error is obtained and nonlinear optimization techniques are employed to improve the accuracy of the machine tool. The verification provides a mathematical, not physical, compensation in less time than other verification methods, by means of the indirect measurement of the geometric errors of the machine's linear and rotary axes. This paper presents an extensive study of the appropriateness and drawbacks of the regression function employed, depending on the types of movement of the axes of any machine. In the same way, the strengths and weaknesses of measurement methods and optimization techniques, depending on the space available to place the measurement system, are presented. These studies provide the most appropriate strategies to verify each machine tool, taking into consideration its configuration and its available work space.

  12. Machine Learning and Radiology

    Science.gov (United States)

    Wang, Shijun; Summers, Ronald M.

    2012-01-01

    In this paper, we give a short introduction to machine learning and survey its applications in radiology. We focus on six categories of applications in radiology: medical image segmentation, registration, computer aided detection and diagnosis, brain function or activity analysis and neurological disease diagnosis from fMR images, content-based image retrieval systems for CT or MRI images, and text analysis of radiology reports using natural language processing (NLP) and natural language understanding (NLU). This survey shows that machine learning plays a key role in many radiology applications. Machine learning identifies complex patterns automatically and helps radiologists make intelligent decisions on radiology data such as conventional radiographs, CT, MRI, and PET images and radiology reports. In many applications, the performance of machine learning-based automatic detection and diagnosis systems has been shown to be comparable to that of a well-trained and experienced radiologist. Technology development in machine learning and radiology will benefit from each other in the long run. Key contributions and common characteristics of machine learning techniques in radiology are discussed. We also discuss the problem of translating machine learning applications to the radiology clinical setting, including advantages and potential barriers. PMID:22465077

  13. The basic anaesthesia machine.

    Science.gov (United States)

    Gurudatt, Cl

    2013-09-01

    After WTG Morton's first public demonstration in 1846 of the use of ether as an anaesthetic agent, for many years anaesthesiologists did not require a machine to deliver anaesthesia to patients. After the introduction of oxygen and nitrous oxide in the form of compressed gases in cylinders, it became necessary to mount these cylinders on a metal frame. This stimulated many people to attempt to construct an anaesthesia machine. In 1917 HEG Boyle modified Gwathmey's machine, and this became popular as the Boyle anaesthesia machine. Although many changes have been made to the original Boyle machine, the basic structure remains the same. All the subsequent changes have been made mainly to improve patient safety. Knowing the details of the basic machine will help the trainee understand the additional improvements. It is also important for every practising anaesthesiologist to have a thorough knowledge of the basic anaesthesia machine for the safe conduct of anaesthesia.

  14. The basic anaesthesia machine

    Directory of Open Access Journals (Sweden)

    C L Gurudatt

    2013-01-01

    After WTG Morton's first public demonstration in 1846 of the use of ether as an anaesthetic agent, for many years anaesthesiologists did not require a machine to deliver anaesthesia to patients. After the introduction of oxygen and nitrous oxide in the form of compressed gases in cylinders, it became necessary to mount these cylinders on a metal frame. This stimulated many people to attempt to construct an anaesthesia machine. In 1917 HEG Boyle modified Gwathmey's machine, and this became popular as the Boyle anaesthesia machine. Although many changes have been made to the original Boyle machine, the basic structure remains the same. All the subsequent changes have been made mainly to improve patient safety. Knowing the details of the basic machine will help the trainee understand the additional improvements. It is also important for every practising anaesthesiologist to have a thorough knowledge of the basic anaesthesia machine for the safe conduct of anaesthesia.

  15. A Matlab program for stepwise regression

    Directory of Open Access Journals (Sweden)

    Yanhong Qi

    2016-03-01

    Stepwise linear regression is a multi-variable regression method for identifying the statistically significant variables in a linear regression equation. In the present study, we present a Matlab program for stepwise regression.

  16. A Comparison Study of Extreme Learning Machine and Least Squares Support Vector Machine for Structural Impact Localization

    OpenAIRE

    Qingsong Xu

    2014-01-01

    Extreme learning machine (ELM) is a learning algorithm for single-hidden-layer feedforward neural networks that is designed for extremely fast learning. However, the performance of ELM in structural impact localization is not yet known. In this paper, a comparison study of ELM with least squares support vector machine (LSSVM) is presented for the application of impact localization on a plate structure with surface-mounted piezoelectric sensors. Both basic and kernel-based ELM regression models have b...

  17. A hybrid model of support vector regression with genetic algorithm for forecasting adsorption of malachite green onto multi-walled carbon nanotubes: central composite design optimization.

    Science.gov (United States)

    Ghaedi, M; Dashtian, K; Ghaedi, A M; Dehghanian, N

    2016-05-11

    The aim of this work is to study the predictive ability of a hybrid model of support vector regression with genetic algorithm optimization (GA-SVR) for the adsorption of malachite green (MG) onto multi-walled carbon nanotubes (MWCNTs). Various factors were investigated by central composite design, and the optimum conditions were set as: pH 8, 0.018 g MWCNTs, and 8 mg L⁻¹ dye mixed thoroughly with 50 mL solution for 10 min. The Langmuir, Freundlich, Temkin and D-R isotherm models were applied to fit the experimental data, which were well explained by the Langmuir model with a maximum adsorption capacity of 62.11-80.64 mg g⁻¹ in a short time at 25 °C. Kinetic studies at various adsorbent dosages and initial MG concentrations show that maximum MG removal was achieved within 10 min of the start of every experiment under most conditions. The adsorption obeys the pseudo-second-order rate equation in addition to the intraparticle diffusion model. The optimal parameters for the SVR model (C of 0.2509, σ² of 0.1288 and ε of 0.2018) were obtained using the GA. For the testing data set, an MSE of 0.0034 and a coefficient of determination (R²) of 0.9195 were achieved.
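
    To make the roles of the tuned SVR parameters and the reported error measures concrete, the sketch below defines the ε-insensitive loss, a Gaussian kernel, MSE and R². It does not reproduce the genetic algorithm or the authors' model. The kernel scaling exp(-d²/(2σ²)) is an assumed convention, since the abstract does not state how the kernel width is parameterized.

```python
import math

# Parameter values reported in the abstract for the GA-tuned SVR.
C, SIGMA_SQ, EPSILON = 0.2509, 0.1288, 0.2018


def eps_insensitive_loss(y_true, y_pred, eps=EPSILON):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def rbf_kernel(u, v, sigma_sq=SIGMA_SQ):
    """Gaussian kernel; the 2*sigma^2 scaling is an assumed convention."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-dist_sq / (2.0 * sigma_sq))


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

    In the usual SVR formulation, C weights the summed ε-insensitive losses against the flatness of the fitted function, so a GA searching over (C, σ², ε) is trading off tube width, kernel smoothness and regularization at once.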

  18. Part Machinability Evaluation System

    Institute of Scientific and Technical Information of China (English)

    1999-01-01

    In the early design period, estimating the machinability of a part or of the whole product helps the designer consider the functional and process requirements of the product at the same time, so as to globally optimize the design decision. This paper presents a part machinability evaluation system, discusses the general restrictions on part machinability, and checks these restrictions using the relation between the tool scan space and the part model. During system development, extensibility and understandability were considered, and an independent restriction algorithm library and a general function library were set up. Additionally, the system has an interpreter and a knowledge manager.

  19. Fundamentals of machine design

    CERN Document Server

    Karaszewski, Waldemar

    2011-01-01

    A forum of researchers, educators and engineers involved in various aspects of Machine Design provided the inspiration for this collection of peer-reviewed papers. The resultant dissemination of the latest research results, and the exchange of views concerning the future research directions to be taken in this field will make the work of immense value to all those having an interest in the topics covered. The book reflects the cooperative efforts made in seeking out the best strategies for effecting improvements in the quality and the reliability of machines and machine parts and for extending

  20. Machine Tool Software

    Science.gov (United States)

    1988-01-01

    A NASA-developed software package has played a part in the technical education of students who major in Mechanical Engineering Technology at William Rainey Harper College. Professor Hack has been using Automatically Programmed Tool (APT) software since 1969 in his Computer Aided Design and Manufacturing (CAD/CAM) curriculum. Professor Hack teaches the use of APT programming languages for control of metal cutting machines. Machine tool instructions are geometry definitions written in the APT language to constitute a "part program." The part program is processed by the machine tool. CAD/CAM students go from writing a program to cutting steel in the course of a semester.

  1. Analysis of synchronous machines

    CERN Document Server

    Lipo, TA

    2012-01-01

    Analysis of Synchronous Machines, Second Edition is a thoroughly modern treatment of an old subject. Courses generally teach about synchronous machines by introducing the steady-state per phase equivalent circuit without a clear, thorough presentation of the source of this circuit representation, which is a crucial aspect. Taking a different approach, this book provides a deeper understanding of complex electromechanical drives. Focusing on the terminal rather than on the internal characteristics of machines, the book begins with the general concept of winding functions, describing the placeme

  2. Database machine performance

    Energy Technology Data Exchange (ETDEWEB)

    Cesarini, F.; Salza, S.

    1987-01-01

    This book is devoted to the important problem of database machine performance evaluation. The book presents several methodological proposals and case studies, that have been developed within an international project supported by the European Economic Community on Database Machine Evaluation Techniques and Tools in the Context of the Real Time Processing. The book gives an overall view of the modeling methodologies and the evaluation strategies that can be adopted to analyze the performance of the database machine. Moreover, it includes interesting case studies and an extensive bibliography.

  3. Virtual Machine Introspection

    Directory of Open Access Journals (Sweden)

    S C Rachana

    2014-06-01

    Cloud computing is an Internet-based computing solution which provides resources in an effective manner. A very serious issue in cloud computing is security, which is a major obstacle to the adoption of the cloud. The most important threats to cloud computing are multitenancy, availability, loss of control, loss of data, outside attacks, DoS attacks, malicious insiders, etc. Among the many security issues in the cloud, virtual machine security is one of the most serious. Thus, monitoring of virtual machines is essential. The paper proposes a Virtual Network Introspection [VMI] System to secure the virtual machines from Distributed Denial of Service [DDOS] and Zombie attacks.

  4. Virtual Machine Introspection

    Directory of Open Access Journals (Sweden)

    S C Rachana

    2015-11-01

    Cloud computing is an Internet-based computing solution which provides resources in an effective manner. A very serious issue in cloud computing is security, which is a major obstacle to the adoption of the cloud. The most important threats to cloud computing are multitenancy, availability, loss of control, loss of data, outside attacks, DoS attacks, malicious insiders, etc. Among the many security issues in the cloud, virtual machine security is one of the most serious. Thus, monitoring of virtual machines is essential. The paper proposes a Virtual Network Introspection [VMI] System to secure the virtual machines from Distributed Denial of Service [DDOS] and Zombie attacks.

  5. Machine Learning for Hackers

    CERN Document Server

    Conway, Drew

    2012-01-01

    If you're an experienced programmer interested in crunching data, this book will get you started with machine learning, a toolkit of algorithms that enables computers to train themselves to automate useful tasks. Authors Drew Conway and John Myles White help you understand machine learning and statistics tools through a series of hands-on case studies, instead of a traditional math-heavy presentation. Each chapter focuses on a specific problem in machine learning, such as classification, prediction, optimization, and recommendation. Using the R programming language, you'll learn how to analyz

  6. Modelling tick abundance using machine learning techniques and satellite imagery

    DEFF Research Database (Denmark)

    Kjær, Lene Jung; Korslund, L.; Kjelland, V.

    satellite images to run Boosted Regression Tree machine learning algorithms to predict overall distribution (presence/absence of ticks) and relative tick abundance of nymphs and larvae in southern Scandinavia. For nymphs, the predicted abundance had a positive correlation with observed abundance...... the predicted distribution of larvae was mostly even throughout Denmark, it was primarily around the coastlines in Norway and Sweden. Abundance was fairly low overall except in some fragmented patches corresponding to forested habitats in the region. Machine learning techniques allow us to predict for larger...... the collected ticks for pathogens and using the same machine learning techniques to develop prevalence maps of the ScandTick region....

  7. XRA image segmentation using regression

    Science.gov (United States)

    Jin, Jesse S.

    1996-04-01

    Segmentation is an important step in image analysis, and thresholding is one of the most important approaches to it. Segmentation poses several difficulties, such as automatic threshold selection, dealing with intensity distortion and noise removal. We have developed an adaptive segmentation scheme by applying the Central Limit Theorem in regression. A Gaussian regression is used to separate the distribution of the background from that of the foreground in a single-peak histogram. The separation helps to determine the threshold automatically. A small 3 by 3 window is applied and the mode of the local histogram is used to overcome noise. Thresholding is based on local weighting, where regression is used again for parameter estimation. A connectivity test is applied to the final results to remove impulse noise. We have applied the algorithm to x-ray angiogram images to extract brain arteries. The algorithm works well for single-peak distributions where there is no valley in the histogram. The regression provides a method to apply knowledge in clustering. Extending regression to multiple-level segmentation needs further investigation.

  8. Modeling Approach of Regression Orthogonal Experiment Design for Thermal Error Compensation of CNC Turning Center

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Thermally induced errors can account for as much as 70% of the dimensional errors on a workpiece. Accurate modeling of errors is an essential part of error compensation. Based on an analysis of the existing approaches to thermal error modeling for machine tools, a new approach of regression orthogonal design is proposed, which combines statistical theory with machine structures, surrounding conditions, engineering judgements, and experience in modeling. A whole computation and analysis procedure is given. ...

  9. WekaPyScript: Classification, Regression, and Filter Schemes for WEKA Implemented in Python

    OpenAIRE

    Christopher Beckham; Mark Hall; Eibe Frank

    2016-01-01

    WekaPyScript is a package for the machine learning software WEKA that allows learning algorithms and preprocessing methods for classification and regression to be written in Python, as opposed to WEKA's implementation language, Java. This opens up WEKA to Python's machine learning and scientific computing ecosystem. Furthermore, due to Python's minimalist syntax, learning algorithms and preprocessing methods can be prototyped easily and utilised from within WEKA. WekaPyScript works by running a lo...

  10. Biplots in Reduced-Rank Regression

    NARCIS (Netherlands)

    Braak, ter C.J.F.; Looman, C.W.N.

    1994-01-01

    Regression problems with a number of related response variables are typically analyzed by separate multiple regressions. This paper shows how these regressions can be visualized jointly in a biplot based on reduced-rank regression. Reduced-rank regression combines multiple regression and principal c

  11. Interpretation of Standardized Regression Coefficients in Multiple Regression.

    Science.gov (United States)

    Thayer, Jerome D.

    The extent to which standardized regression coefficients (beta values) can be used to determine the importance of a variable in an equation was explored. The beta value and the part correlation coefficient--also called the semi-partial correlation coefficient and reported in squared form as the incremental "r squared"--were compared for…
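
    For the two-predictor case, the relationship between the beta value and the part correlation can be worked through directly from the three pairwise correlations. The sketch below uses hypothetical scores (not data from the paper) and the standard two-predictor formulas; it also checks that the squared part correlation equals the increment in R² when the predictor is added.

```python
import math


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def beta_and_part(x1, x2, y):
    """Standardized coefficient and part correlation of x1, controlling for x2."""
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    beta1 = (r_y1 - r_y2 * r_12) / (1.0 - r_12 ** 2)
    part1 = (r_y1 - r_y2 * r_12) / math.sqrt(1.0 - r_12 ** 2)
    # R^2 of the full two-predictor model, and of the model with x2 alone.
    r2_full = (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)
    r2_without_x1 = r_y2 ** 2
    return beta1, part1, r2_full - r2_without_x1


# Hypothetical scores for two correlated predictors and one outcome.
x1 = [2.0, 4.0, 5.0, 7.0, 8.0, 10.0]
x2 = [1.0, 3.0, 6.0, 6.0, 9.0, 9.0]
y = [3.0, 5.0, 6.0, 9.0, 9.0, 13.0]

beta1, part1, delta_r2 = beta_and_part(x1, x2, y)
```

    Because the two quantities differ only by the factor sqrt(1 - r12²), the part correlation is never larger in magnitude than beta, and the gap widens as the predictors become more correlated.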

  12. An Online Wind Turbine Condition Assessment Method Based on SCADA and Support Vector Regression

    Institute of Scientific and Technical Information of China (English)

    梁颖; 方瑞明

    2013-01-01

    In order to improve the real-time reliability of grid-connected wind turbines, optimize the maintenance strategy, and reduce the cost of wind power generation, it is necessary to consider the interaction and coupling between the components or subsystems of a wind turbine. An online assessment model for the operating condition of the whole wind turbine is established using data mining technology. First, after analyzing the shortcomings of the supervisory control and data acquisition (SCADA) warning system of wind turbines, a more robust online assessment scheme is proposed in which a regression prediction model works together with the SCADA warning system. Second, the regression prediction model is described in detail: it is based on the support vector regression (SVR) algorithm, takes a subset of the items monitored by the SCADA system as inputs, and outputs the active power of the wind turbine. Finally, measured data from a wind farm are used to verify the proposed model, and the results demonstrate the feasibility of the method.

  13. Inferential Models for Linear Regression

    Directory of Open Access Journals (Sweden)

    Zuoyi Zhang

    2011-09-01

    Linear regression is arguably one of the most widely used statistical methods in applications. However, important problems, especially variable selection, remain a challenge for classical modes of inference. This paper develops a recently proposed framework of inferential models (IMs) in the linear regression context. In general, an IM is able to produce meaningful probabilistic summaries of the statistical evidence for and against assertions about the unknown parameter of interest and, moreover, these summaries are shown to be properly calibrated in a frequentist sense. Here we demonstrate, using simple examples, that the IM framework is promising for linear regression analysis (including model checking, variable selection, and prediction) and for uncertain inference in general.

  14. [Is regression of atherosclerosis possible?].

    Science.gov (United States)

    Thomas, D; Richard, J L; Emmerich, J; Bruckert, E; Delahaye, F

    1992-10-01

    Experimental studies have shown the regression of atherosclerosis in animals given a cholesterol-rich diet and then given a normal diet or hypolipidemic therapy. Despite favourable results of clinical trials of primary prevention modifying the lipid profile, the concept of atherosclerosis regression in man remains very controversial. The methodological approach is difficult: it is based on angiographic data and requires strict standardisation of angiographic views and reliable quantitative techniques of analysis, which are available with image processing. Several methodologically acceptable clinical coronary studies have shown not only stabilisation but also regression of atherosclerotic lesions with reductions of about 25% in total cholesterol levels and of about 40% in LDL cholesterol levels. These reductions were obtained either by drugs as in CLAS (Cholesterol Lowering Atherosclerosis Study), FATS (Familial Atherosclerosis Treatment Study) and SCOR (Specialized Center of Research Intervention Trial), by profound modifications in dietary habits as in the Lifestyle Heart Trial, or by surgery (ileo-caecal bypass) as in POSCH (Program On the Surgical Control of the Hyperlipidemias). On the other hand, trials with non-lipid lowering drugs such as the calcium antagonists (INTACT, MHIS) have not shown significant regression of existing atherosclerotic lesions but only a decrease in the number of new lesions. The clinical benefits of these regression studies are difficult to demonstrate given the limited period of observation, relatively small population numbers and the fact that in some cases the subjects were asymptomatic. The decrease in the number of cardiovascular events therefore seems relatively modest and concerns essentially subjects who were symptomatic initially. The clinical repercussion of studies of prevention involving a single lipid factor is probably partially due to the reduction in progression and anatomical regression of the atherosclerotic plaque

  15. Real-time harmonic detection based on combined algorithm of improved recursive DFT and SVR

    Institute of Scientific and Technical Information of China (English)

    刘尚伟; 孙雅明

    2009-01-01

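    In power systems, where detecting integer-order harmonics is the main objective, the recursive discrete Fourier transform (DFT) and fast Fourier transform (FFT) algorithms are analyzed and compared, and an improved recursive DFT algorithm is proposed: an FFT is applied to the first set of samples before the recursive DFT starts, and its result is used as the initial value of the recursive DFT to further reduce the amount of computation. To improve detection accuracy at the same time, a new integer-order harmonic detection method is proposed that combines the improved recursive DFT with support vector regression (SVR) trained by an improved sequential minimal optimization (SMO) algorithm. Simulation examples, compared against the recursive DFT and the improved recursive DFT, show that the combined method achieves high detection accuracy and is well suited to real-time detection and analysis of integer-order harmonics.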
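    The idea of seeding a recursive DFT with an FFT of the first window can be sketched as follows. This is a minimal illustration of a generic sliding DFT, not the authors' implementation; the function name `sliding_dft` and the signal parameters (50 Hz fundamental, 1600 Hz sampling, 32-sample window) are assumptions chosen for the example, and the SVR stage of the record is not reproduced.

```python
import numpy as np

def sliding_dft(x, N):
    """Yield the N-point DFT of every length-N window of x.

    The first window is transformed with an FFT (the initial value);
    each later spectrum is derived from the previous one in O(N) work.
    """
    x = np.asarray(x, dtype=complex)
    twiddle = np.exp(2j * np.pi * np.arange(N) / N)  # per-bin phase rotation
    X = np.fft.fft(x[:N])
    yield X.copy()
    for n in range(len(x) - N):
        # Remove the oldest sample, add the newest, then rotate by one sample.
        X = (X - x[n] + x[n + N]) * twiddle
        yield X.copy()

# Hypothetical test signal: fundamental plus a 20% third harmonic.
fs, f0, N = 1600.0, 50.0, 32          # 32 samples per fundamental cycle
t = np.arange(4 * N) / fs
signal = np.sin(2 * np.pi * f0 * t) + 0.2 * np.sin(2 * np.pi * 3 * f0 * t)

spectra = list(sliding_dft(signal, N))
amplitudes = 2 * np.abs(spectra[-1]) / N   # bin k corresponds to harmonic k
```

    Because the window spans exactly one fundamental cycle in this example, bin k of each spectrum holds the k-th integer harmonic, so `amplitudes[1]` and `amplitudes[3]` recover the two components.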

  16. Some relations between quantum Turing machines and Turing machines

    CERN Document Server

    Sicard, Andrés; Vélez, Mario

    1999-01-01

    For quantum Turing machines we present three elements: their components, their time evolution operator and their local transition function. The components are related to deterministic Turing machines, the time evolution operator is related to reversible Turing machines and the local transition function is related to probabilistic and reversible Turing machines.

  17. Machining of hard-to-machine materials

    OpenAIRE

    2016-01-01

    This bachelor thesis studies the machining of hard-to-machine materials. The first part classifies hard-to-machine materials and then analyzes them. The next part focuses on the machinability of individual alloys. The final part is devoted to an experiment, its statistical processing and its subsequent evaluation.

  18. Nonparametric regression with filtered data

    CERN Document Server

    Linton, Oliver; Nielsen, Jens Perch; Van Keilegom, Ingrid; 10.3150/10-BEJ260

    2011-01-01

    We present a general principle for estimating a regression function nonparametrically, allowing for a wide variety of data filtering, for example, repeated left truncation and right censoring. Both the mean and the median regression cases are considered. The method works by first estimating the conditional hazard function or conditional survivor function and then integrating. We also investigate improved methods that take account of model structure such as independent errors and show that such methods can improve performance when the model structure is true. We establish the pointwise asymptotic normality of our estimators.

  19. Logistic regression for circular data

    Science.gov (United States)

    Al-Daffaie, Kadhem; Khan, Shahjahan

    2017-05-01

    This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
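    A common way to let a binary response depend on an angle is to enter the angle through its cosine and sine and fit the coefficients by Newton-Raphson. The sketch below illustrates that approach on simulated data; the exact model form, the function name `fit_circular_logistic`, and the simulated coefficients are assumptions for illustration and are not taken from the paper (which used R and Toowoomba weather records).

```python
import numpy as np

def fit_circular_logistic(theta, y, n_iter=25, tol=1e-10):
    """Fit logit(p) = b0 + b1*cos(theta) + b2*sin(theta) by Newton-Raphson."""
    X = np.column_stack([np.ones_like(theta), np.cos(theta), np.sin(theta)])
    beta = np.zeros(3)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                        # score vector
        hess = X.T @ (X * (p * (1 - p))[:, None])   # observed information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated example with assumed "true" coefficients.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=500)
beta_true = np.array([-0.5, 1.0, 0.5])
p_true = 1.0 / (1.0 + np.exp(-(beta_true[0]
                               + beta_true[1] * np.cos(theta)
                               + beta_true[2] * np.sin(theta))))
y = (rng.random(500) < p_true).astype(float)

beta_hat = fit_circular_logistic(theta, y)
```

    Using both the cosine and the sine keeps the fitted probability periodic in the angle, so 0 and 2π give the same prediction.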

  20. Quasi-least squares regression

    CERN Document Server

    Shults, Justine

    2014-01-01

    Drawing on the authors' substantial expertise in modeling longitudinal and clustered data, Quasi-Least Squares Regression provides a thorough treatment of quasi-least squares (QLS) regression, a computational approach for the estimation of correlation parameters within the framework of generalized estimating equations (GEEs). The authors present a detailed evaluation of QLS methodology, demonstrating the advantages of QLS in comparison with alternative methods. They describe how QLS can be used to extend the application of the traditional GEE approach to the analysis of unequally spaced longitu

  1. Machine (bulk) harvest

    Data.gov (United States)

    US Fish and Wildlife Service, Department of the Interior — This is a summary of machine harvesting activities on Neal Smith National Wildlife Refuge between 1991 and 2008. Information is provided for each year about...

  2. Machine Vision Handbook

    CERN Document Server

    2012-01-01

    The automation of visual inspection is becoming more and more important in modern industry as a consistent, reliable means of judging the quality of raw materials and manufactured goods. The Machine Vision Handbook equips the reader with the practical details required to engineer integrated mechanical-optical-electronic-software systems. Machine vision is first set in the context of basic information on light, natural vision, colour sensing and optics. The physical apparatus required for mechanized image capture – lenses, cameras, scanners and light sources – is discussed, followed by detailed treatment of various image-processing methods including an introduction to the QT image processing system. QT is unique to this book, and provides an example of a practical machine vision system along with extensive libraries of useful commands, functions and images which can be implemented by the reader. The main text of the book is completed by studies of a wide variety of applications of machine vision in insp...

  3. SVM for Solving Forward Problems of EIT.

    Science.gov (United States)

    Wu, Youxi; Li, Ying; Guo, Lei; Yan, Weili; Shen, Xueqin; Fu, Kun

    2005-01-01

    The Support Vector Machine (SVM) can be seen as a new machine learning approach based on the idea of VC dimension and the principle of structural risk minimization rather than empirical risk minimization. SVMs can be used for classification and regression. Support Vector Regression (SVR) is a very important branch of SVM. Partial Differential Equations (PDEs) have been successfully treated by using SVR in previous works. The forward problems of EIT are the basis of EIT inverse problems, and the essence of the forward problem is to solve PDEs. The method has been successfully tested on the forward problems of EIT and has yielded accurate results.

  4. Tests of Machine Intelligence

    CERN Document Server

    Legg, Shane

    2007-01-01

    Although the definition and measurement of intelligence is clearly of fundamental importance to the field of artificial intelligence, no general survey of definitions and tests of machine intelligence exists. Indeed few researchers are even aware of alternatives to the Turing test and its many derivatives. In this paper we fill this gap by providing a short survey of the many tests of machine intelligence that have been proposed.

  5. Metalworking and machining fluids

    Science.gov (United States)

    Erdemir, Ali; Sykora, Frank; Dorbeck, Mark

    2010-10-12

    Improved boron-based metal working and machining fluids. Boric acid and boron-based additives that, when mixed with certain carrier fluids, such as water, cellulose and/or cellulose derivatives, polyhydric alcohol, polyalkylene glycol, polyvinyl alcohol, starch, dextrin, in solid and/or solvated forms result in improved metalworking and machining of metallic work pieces. Fluids manufactured with boric acid or boron-based additives effectively reduce friction, prevent galling and severe wear problems on cutting and forming tools.

  6. mlpy: Machine Learning Python

    CERN Document Server

    Albanese, Davide; Merler, Stefano; Riccadonna, Samantha; Jurman, Giuseppe; Furlanello, Cesare

    2012-01-01

    mlpy is a Python Open Source Machine Learning library built on top of NumPy/SciPy and the GNU Scientific Libraries. mlpy provides a wide range of state-of-the-art machine learning methods for supervised and unsupervised problems and it is aimed at finding a reasonable compromise among modularity, maintainability, reproducibility, usability and efficiency. mlpy is multiplatform, it works with Python 2 and 3 and it is distributed under GPL3 at the website http://mlpy.fbk.eu.

  7. Human-machine interactions

    Science.gov (United States)

    Forsythe, J. Chris; Xavier, Patrick G.; Abbott, Robert G.; Brannon, Nathan G.; Bernard, Michael L.; Speed, Ann E.

    2009-04-28

    Digital technology utilizing a cognitive model based on human naturalistic decision-making processes, including pattern recognition and episodic memory, can reduce the dependency of human-machine interactions on the abilities of a human user and can enable a machine to more closely emulate human-like responses. Such a cognitive model can enable digital technology to use cognitive capacities fundamental to human-like communication and cooperation to interact with humans.

  8. Machine Learning with Distances

    Science.gov (United States)

    2015-02-16

    and demonstrated their usefulness in experiments. 1 Introduction The goal of machine learning is to find useful knowledge behind data. Many machine ... 212, 172]. However, direct divergence approximators still suffer from the curse of dimensionality. A possible cure for this problem is to combine them ... obtain the global optimal solution or even a good local solution without any prior knowledge. For this reason, we decided to introduce the unit-norm

  9. mlpy: Machine Learning Python

    OpenAIRE

    Albanese, Davide; Visintainer, Roberto; Merler, Stefano; Riccadonna, Samantha; Jurman, Giuseppe; Furlanello, Cesare

    2012-01-01

    mlpy is a Python Open Source Machine Learning library built on top of NumPy/SciPy and the GNU Scientific Libraries. mlpy provides a wide range of state-of-the-art machine learning methods for supervised and unsupervised problems and it is aimed at finding a reasonable compromise among modularity, maintainability, reproducibility, usability and efficiency. mlpy is multiplatform, it works with Python 2 and 3 and it is distributed under GPL3 at the website http://mlpy.fbk.eu.

  10. Applications of Support Vector Machines in Astronomy

    Science.gov (United States)

    Zhang, Y.; Zhao, Y.

    2014-05-01

    We review Support Vector Machines (SVMs) as applied in astronomy. SVMs are mainly used for solving classification and regression problems. Examples of classification include selecting cataclysmic variables from large spectroscopic surveys, detecting quasar candidates from multiwavelength photometric data, identifying blue horizontal branch stars from photometric data, classifying galactic spectra, and searching for supernovae; examples of regression include photometric redshift estimation of galaxies and quasars and physical parameter measurement (metallicity, gravity, effective temperature) of stars. Comparatively, SVMs show better performance in classification than in regression. Nevertheless, SVMs have a disadvantage: training requires a large computational cost. To address this problem, CUDA-accelerated SVMs have been put forward. As for accuracy, combining SVMs with other algorithms, as in SVM-KNN, yields further improvement.

  11. Gravitational Wave Emulation Using Gaussian Process Regression

    Science.gov (United States)

    Doctor, Zoheyr; Farr, Ben; Holz, Daniel

    2017-01-01

    Parameter estimation (PE) for gravitational wave signals from compact binary coalescences (CBCs) requires reliable template waveforms which span the parameter space. Waveforms from numerical relativity are accurate but computationally expensive, so approximate templates are typically used for PE. These "approximants", while quick to compute, can introduce systematic errors and bias PE results. We describe a machine learning method for generating CBC waveforms and uncertainties using existing accurate waveforms as a training set. Coefficients of a reduced order waveform model are computed and each treated as arising from a Gaussian process. These coefficients and their uncertainties are then interpolated using Gaussian process regression (GPR). As a proof of concept, we construct a training set of approximant waveforms (rather than NR waveforms) in the two-dimensional space of chirp mass and mass ratio and interpolate new waveforms with GPR. We demonstrate that the mismatch between interpolated waveforms and approximants is below the 1% level for an appropriate choice of training set and GPR kernel hyperparameters.
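    The interpolation step can be illustrated with a small Gaussian process regression sketch that returns both a predicted value and its uncertainty. This is a one-dimensional toy under stated assumptions, not the authors' pipeline: the squared-exponential kernel, the unit prior variance, the `noise_var` jitter, and the sine-shaped "coefficient" are all chosen here for illustration, whereas the record interpolates reduced-order-model coefficients over chirp mass and mass ratio.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel with unit prior variance."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gpr_predict(x_train, y_train, x_new, length_scale=1.0, noise_var=1e-6):
    """Posterior mean and variance of a zero-mean GP at the points x_new."""
    K = rbf_kernel(x_train, x_train, length_scale)
    K += noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_new, x_train, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = 1.0 - np.sum(v ** 2, axis=0)   # prior variance k(x, x) is 1
    return mean, var

# Hypothetical training set: one model coefficient sampled at 8 parameter values.
x_train = np.linspace(0.0, 5.0, 8)
y_train = np.sin(x_train)
x_new = np.array([2.5, 50.0])            # inside and far outside the training range
mean, var = gpr_predict(x_train, y_train, x_new)
```

    The predictive variance stays small near training points and returns to the prior variance far from them, which is what lets the interpolated waveform carry its own error estimate.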

  12. Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression

    Science.gov (United States)

    Verdoolaege, G.; Shabbir, A.; Hornung, G.

    2016-11-01

    Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression that is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of application of GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standard least squares.

  13. Regression of lumbar disk herniation

    Directory of Open Access Journals (Sweden)

    G. Yu Evzikov

    2015-01-01

    Full Text Available Compression of the spinal nerve root, giving rise to pain and sensory and motor disorders in the area of its innervation, is the most vivid manifestation of a herniated intervertebral disk. Different treatment modalities for these conditions, including neurosurgery, are discussed. There has been recent evidence that disk herniation can regress spontaneously. The paper describes a female patient with a large lateralized disc extrusion that caused compression of the S1 nerve root, leading to obvious myotonic and radicular syndrome. Magnetic resonance imaging showed that the clinical manifestations of discogenic radiculopathy, as well as the myotonic syndrome and morphological changes, had completely regressed 8 months later. The likely mechanism is inflammation-induced resorption of a large herniated disk fragment, which agrees with the data available in the literature. A decision to perform neurosurgery, for which the patient had indications, was made during her first consultation. After regression of the discogenic radiculopathy, there was only moderate pain caused by musculoskeletal diseases (facet syndrome, piriformis syndrome) that were successfully eliminated by minimally invasive techniques.

  14. Heteroscedasticity checks for regression models

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    For checking on heteroscedasticity in regression models, a unified approach is proposed to constructing test statistics in parametric and nonparametric regression models. For nonparametric regression, the test is not affected sensitively by the choice of smoothing parameters which are involved in estimation of the nonparametric regression function. The limiting null distribution of the test statistic remains the same in a wide range of the smoothing parameters. When the covariate is one-dimensional, the tests are, under some conditions, asymptotically distribution-free. In the high-dimensional cases, the validity of bootstrap approximations is investigated. It is shown that a variant of the wild bootstrap is consistent while the classical bootstrap is not in the general case, but is applicable if some extra assumption on conditional variance of the squared error is imposed. A simulation study is performed to provide evidence of how the tests work and compare with tests that have appeared in the literature. The approach may readily be extended to handle partial linear, and linear autoregressive models.

  15. Cactus: An Introduction to Regression

    Science.gov (United States)

    Hyde, Hartley

    2008-01-01

    When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…

  16. Growth Regression and Economic Theory

    NARCIS (Netherlands)

    Elbers, Chris; Gunning, Jan Willem

    2002-01-01

    In this note we show that the standard, loglinear growth regression specification is consistent with one and only one model in the class of stochastic Ramsey models. This model is highly restrictive: it requires a Cobb-Douglas technology and a 100% depreciation rate and it implies that risk does not af

  17. Correlation Weights in Multiple Regression

    Science.gov (United States)

    Waller, Niels G.; Jones, Jeff A.

    2010-01-01

    A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps in developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting…

  18. Ridge Regression for Interactive Models.

    Science.gov (United States)

    Tate, Richard L.

    1988-01-01

    An exploratory study of the value of ridge regression for interactive models is reported. Assuming that the linear terms in a simple interactive model are centered to eliminate non-essential multicollinearity, a variety of common models, representing both ordinal and disordinal interactions, are shown to have "orientations" that are favorable to…
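    The effect of centering the linear terms can be seen in a short simulation. The sketch below is only an illustration with assumed predictor means and sample size; it compares the correlation between a predictor and the product term before and after centering, and does not reproduce the ridge analysis of the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(loc=10.0, scale=1.0, size=n)   # assumed predictor with a large mean
z = rng.normal(loc=5.0, scale=1.0, size=n)    # assumed moderator, independent of x

def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]

# Correlation between a linear term and the raw product term.
raw = corr(x, x * z)

# Center both predictors, then form the product from the centered values.
xc, zc = x - x.mean(), z - z.mean()
centered = corr(xc, xc * zc)
```

    With uncentered predictors the product term inherits much of its variation from the linear terms, so `raw` is clearly positive, while `centered` is close to zero. The collinearity removed this way is the "non-essential" part, which arises only from the location of the predictors.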

  19. Soft sensor modeling of sewage disposal process based on multi-scale wavelet least square support vector regression

    Institute of Scientific and Technical Information of China (English)

    王鲜芳; 朱晓霞; 吴瑞红; 郑延斌

    2012-01-01

    To address the difficulty of measuring some parameters online during wastewater treatment, this paper presents a soft-sensor modeling method based on a multi-scale wavelet least squares support vector machine. The Mexican-hat wavelet function is used as the support vector kernel function, and a Multi-scale Wavelet Least Squares Support Vector Regression (MW-LSSVR) algorithm is then developed. A hybrid soft-sensor model is built from this SVR and the characteristics of the effluent quality parameters to predict the BOD and COD of the treated effluent online. Application to a real sewage treatment process shows that the modeling method achieves high prediction accuracy and fast model learning, predicts BOD accurately, can to some extent replace certain expensive online measuring instruments, gives sewage treatment plant operators a basis for control operations, and has practical value.
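    A least squares SVR with a Mexican-hat wavelet kernel reduces to solving one linear system, which the following sketch illustrates. It is a single-scale toy under stated assumptions, not the MW-LSSVR of the paper: the product form of the kernel, the parameter names `scale` and `gamma`, and the synthetic sine data (standing in for water-quality measurements) are all choices made here for illustration.

```python
import numpy as np

def mexican_hat_kernel(A, B, scale=1.0):
    """K(a, b) = prod_d h((a_d - b_d) / scale), with h(u) = (1 - u^2) exp(-u^2 / 2)."""
    U = (A[:, None, :] - B[None, :, :]) / scale
    return np.prod((1.0 - U ** 2) * np.exp(-0.5 * U ** 2), axis=2)

def lssvr_fit(X, y, gamma=100.0, scale=1.0):
    """Solve the LS-SVR linear system for the bias b and the multipliers alpha."""
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0                      # constraint: sum(alpha) = 0
    M[1:, 0] = 1.0
    M[1:, 1:] = mexican_hat_kernel(X, X, scale) + np.eye(n) / gamma
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]

def lssvr_predict(X_train, alpha, b, X_new, scale=1.0):
    """Evaluate the fitted model at the rows of X_new."""
    return mexican_hat_kernel(X_new, X_train, scale) @ alpha + b

# Synthetic one-dimensional data (assumed; not sewage-plant measurements).
X = np.linspace(0.0, 2.0 * np.pi, 20)[:, None]
y = np.sin(X[:, 0])
gamma = 100.0
b, alpha = lssvr_fit(X, y, gamma=gamma)
pred = lssvr_predict(X, alpha, b, X)
```

    Unlike the standard SVR, which needs a quadratic program, the least squares variant replaces the inequality constraints with equalities, so training is a single call to a linear solver; `gamma` trades off fit against smoothness.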

  20. Logistic regression: a brief primer.

    Science.gov (United States)

    Stoltzfus, Jill C

    2011-10-01

    Regression techniques are versatile in their application to medical research because they can measure associations, predict outcomes, and control for confounding variable effects. As one such technique, logistic regression is an efficient and powerful way to analyze the effect of a group of independent variables on a binary outcome by quantifying each independent variable's unique contribution. Using components of linear regression reflected in the logit scale, logistic regression iteratively identifies the strongest linear combination of variables with the greatest probability of detecting the observed outcome. Important considerations when conducting logistic regression include selecting independent variables, ensuring that relevant assumptions are met, and choosing an appropriate model building strategy. For independent variable selection, one should be guided by such factors as accepted theory, previous empirical investigations, clinical considerations, and univariate statistical analyses, with acknowledgement of potential confounding variables that should be accounted for. Basic assumptions that must be met for logistic regression include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers. Additionally, there should be an adequate number of events per independent variable to avoid an overfit model, with commonly recommended minimum "rules of thumb" ranging from 10 to 20 events per covariate. Regarding model building strategies, the three general types are direct/standard, sequential/hierarchical, and stepwise/statistical, with each having a different emphasis and purpose. Before reaching definitive conclusions from the results of any of these methods, one should formally quantify the model's internal validity (i.e., replicability within the same data set) and external validity (i.e., generalizability beyond the current sample). The resulting logistic regression model
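    The events-per-covariate rule of thumb is simple arithmetic, sketched below. The function name `max_covariates` and the example counts are hypothetical; counting the smaller of the two outcome groups as the "events" is a common convention assumed here rather than something the abstract states.

```python
def max_covariates(n_events, n_nonevents, events_per_variable=10):
    """Largest number of covariates allowed by an events-per-variable rule.

    The limiting sample size is taken as the smaller outcome group
    (an assumed convention).
    """
    limiting = min(n_events, n_nonevents)
    return limiting // events_per_variable

# Hypothetical study: 400 patients, 60 of whom experienced the outcome.
lenient = max_covariates(60, 340, events_per_variable=10)   # 6 covariates
strict = max_covariates(60, 340, events_per_variable=20)    # 3 covariates
```

    The total sample size of 400 does not enter the calculation: the 60 events alone determine how many covariates the model can support under this rule.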

  1. EVALUATION OF MACHINE TOOL QUALITY

    Directory of Open Access Journals (Sweden)

    Ivan Kuric

    2011-12-01

    Full Text Available The paper deals with aspects of the quality and accuracy of machine tools. As the accuracy of machine tools is a key factor for product quality, it is important to know the methods for evaluating their quality and accuracy. Several aspects of machine tool diagnostics are described, such as aspects of reliability.

  2. An HTS machine laboratory prototype

    DEFF Research Database (Denmark)

    Mijatovic, Nenad; Jensen, Bogi Bech; Træholt, Chresten

    2012-01-01

    This paper describes the Superwind HTS machine laboratory setup, a small-scale HTS machine designed and built as part of the efforts to identify and tackle some of the challenges that HTS machine design may face. One of the challenges of HTS machines is a Torque Transfer Element (TTE) which...

  3. Machining of fiber reinforced composites

    Science.gov (United States)

    Komanduri, Ranga; Zhang, Bi; Vissa, Chandra M.

    Factors involved in machining of fiber-reinforced composites are reviewed. Consideration is given to properties of composites reinforced with boron filaments, glass fibers, aramid fibers, carbon fibers, and silicon carbide fibers and to polymer (organic) matrix composites, metal matrix composites, and ceramic matrix composites, as well as to the processes used in conventional machining of boron-titanium composites and of composites reinforced by each of these fibers. Particular attention is given to the methods of nonconventional machining, such as laser machining, water jet cutting, electrical discharge machining, and ultrasonic assisted machining. Also discussed are safety precautions which must be taken during machining of fiber-containing composites.

  4. Machining of Metal Matrix Composites

    CERN Document Server

    2012-01-01

    Machining of Metal Matrix Composites provides the fundamentals and recent advances in the study of machining of metal matrix composites (MMCs). Each chapter is written by an international expert in this important field of research. Machining of Metal Matrix Composites gives the reader information on machining of MMCs with a special emphasis on aluminium matrix composites. Chapter 1 provides the mechanics and modelling of chip formation for traditional machining processes. Chapter 2 is dedicated to surface integrity when machining MMCs. Chapter 3 describes the machinability aspects of MMCs. Chapter 4 contains information on traditional machining processes and Chapter 5 is dedicated to the grinding of MMCs. Chapter 6 describes the dry cutting of MMCs with SiC particulate reinforcement. Finally, Chapter 7 is dedicated to computational methods and optimization in the machining of MMCs. Machining of Metal Matrix Composites can serve as a useful reference for academics, manufacturing and materials researchers, manu...

  5. Machine learning in geosciences and remote sensing

    Institute of Scientific and Technical Information of China (English)

    David J. Lary; Amir H. Alavi; Amir H. Gandomi; Annette L. Walker

    2016-01-01

    Learning incorporates a broad range of complex procedures. Machine learning (ML) is a subdivision of artificial intelligence based on the biological learning process. The ML approach deals with the design of algorithms to learn from machine readable data. ML covers main domains such as data mining, difficult-to-program applications, and software applications. It is a collection of a variety of algorithms (e.g. neural networks, support vector machines, self-organizing map, decision trees, random forests, case-based reasoning, genetic programming, etc.) that can provide multivariate, nonlinear, nonparametric regression or classification. The modeling capabilities of the ML-based methods have resulted in their extensive applications in science and engineering. Herein, the role of ML as an effective approach for solving problems in geosciences and remote sensing will be highlighted. The unique features of some of the ML techniques will be outlined with a specific attention to genetic programming paradigm. Furthermore, nonparametric regression and classification illustrative examples are presented to demonstrate the efficiency of ML for tackling the geosciences and remote sensing problems.

  6. Machine learning in geosciences and remote sensing

    Directory of Open Access Journals (Sweden)

    David J. Lary

    2016-01-01

    Full Text Available Learning incorporates a broad range of complex procedures. Machine learning (ML) is a subdivision of artificial intelligence based on the biological learning process. The ML approach deals with the design of algorithms to learn from machine readable data. ML covers main domains such as data mining, difficult-to-program applications, and software applications. It is a collection of a variety of algorithms (e.g. neural networks, support vector machines, self-organizing map, decision trees, random forests, case-based reasoning, genetic programming, etc.) that can provide multivariate, nonlinear, nonparametric regression or classification. The modeling capabilities of the ML-based methods have resulted in their extensive applications in science and engineering. Herein, the role of ML as an effective approach for solving problems in geosciences and remote sensing will be highlighted. The unique features of some of the ML techniques will be outlined with a specific attention to genetic programming paradigm. Furthermore, nonparametric regression and classification illustrative examples are presented to demonstrate the efficiency of ML for tackling the geosciences and remote sensing problems.

  7. Non-conventional electrical machines

    CERN Document Server

    Rezzoug, Abderrezak

    2013-01-01

    The developments of electrical machines are due to the convergence of material progress, improved calculation tools, and new feeding sources. Among the many recent machines, the authors have chosen, in this first book, to relate the progress in slow speed machines, high speed machines, and superconducting machines. The first part of the book is dedicated to materials and an overview of magnetism, mechanics, and heat transfer.

  8. A Multiple Regression Approach to Normalization of Spatiotemporal Gait Features.

    Science.gov (United States)

    Wahid, Ferdous; Begg, Rezaul; Lythgo, Noel; Hass, Chris J; Halgamuge, Saman; Ackland, David C

    2016-04-01

    Normalization of gait data is performed to reduce the effects of intersubject variations due to physical characteristics. This study reports a multiple regression normalization approach for spatiotemporal gait data that takes into account intersubject variations in self-selected walking speed and physical properties including age, height, body mass, and sex. Spatiotemporal gait data including stride length, cadence, stance time, double support time, and stride time were obtained from healthy subjects including 782 children, 71 adults, 29 elderly subjects, and 28 elderly Parkinson's disease (PD) patients. Data were normalized using standard dimensionless equations, a detrending method, and a multiple regression approach. After normalization using dimensionless equations and the detrending method, weak to moderate correlations between walking speed, physical properties, and spatiotemporal gait features were observed (0.01 normalization using the multiple regression method reduced these correlations to weak values (|r| normalization using dimensionless equations and detrending resulted in significant differences in stride length and double support time of PD patients; however the multiple regression approach revealed significant differences in these features as well as in cadence, stance time, and stride time. The proposed multiple regression normalization may be useful in machine learning, gait classification, and clinical evaluation of pathological gait patterns.

  9. Machinability evaluation of machinable ceramics with fuzzy theory

    Institute of Scientific and Technical Information of China (English)

    YU Ai-bing; ZHONG Li-jun; TAN Ye-fa

    2005-01-01

    Property parameters and machining output parameters were selected for the machinability evaluation of machinable ceramics. Based on fuzzy evaluation theory, a two-stage fuzzy evaluation approach was applied to consider these parameters, and a two-stage fuzzy comprehensive evaluation model was proposed to evaluate the machinability of machinable ceramic materials. Ce-ZrO2/CePO4 composites were fabricated and machined for the evaluation. Material removal rates and specific normal grinding forces were measured. The parameters concerned with machinability were selected as the alternative set. Five grades were chosen for the machinability evaluation of machinable ceramics, and the machinability grades were determined through fuzzy operation. Ductile marks are observed on the Ce-ZrO2/CePO4 machined surface. The five prepared Ce-ZrO2/CePO4 composites are classified into three machinability grades according to the fuzzy comprehensive evaluation results. The machinability grades of Ce-ZrO2/CePO4 composites are related to CePO4 content.

  10. Polynomial Regressions and Nonsense Inference

    Directory of Open Access Journals (Sweden)

    Daniel Ventosa-Santaulària

    2013-11-01

    Full Text Available Polynomial specifications are widely used, not only in applied economics, but also in epidemiology, physics, political analysis and psychology, just to mention a few examples. In many cases, the data employed to estimate such specifications are time series that may exhibit stochastic nonstationary behavior. We extend Phillips’ results (Phillips, P. Understanding spurious regressions in econometrics. J. Econom. 1986, 33, 311–340. by proving that an inference drawn from polynomial specifications, under stochastic nonstationarity, is misleading unless the variables cointegrate. We use a generalized polynomial specification as a vehicle to study its asymptotic and finite-sample properties. Our results, therefore, lead to a call to be cautious whenever practitioners estimate polynomial regressions.

  11. Producing The New Regressive Left

    DEFF Research Database (Denmark)

    Crone, Christine

    to be a committed artist, and how that translates into supporting al-Assad’s rule in Syria; the Ramadan programme Harrir Aqlak’s attempt to relaunch an intellectual renaissance and to promote religious pluralism; and finally, al-Mayadeen’s cooperation with the pan-Latin American TV station TeleSur and its ambitions ... becomes clear from the analytical chapters is the emergence of the new cross-ideological alliance of The New Regressive Left. This emerging coalition between Shia Muslims, religious minorities, parts of the Arab Left, secular cultural producers, and the remnants of the political, strategic resistance ... coalition (Iran, Hizbollah, Syria), capitalises on a series of factors that bring them together in spite of their otherwise diverse worldviews and agendas. The New Regressive Left is united by resistance against the growing influence of Saudi Arabia in the religious, cultural, political, economic...

  12. Quantile Regression With Measurement Error

    KAUST Repository

    Wei, Ying

    2009-08-27

    Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. © 2009 American Statistical Association.

  13. Heteroscedasticity checks for regression models

    Institute of Scientific and Technical Information of China (English)

    Zhu, Lixing

    2001-01-01

  14. Clustered regression with unknown clusters

    CERN Document Server

    Barman, Kishor

    2011-01-01

    We consider a collection of prediction experiments, which are clustered in the sense that groups of experiments exhibit similar relationship between the predictor and response variables. The experiment clusters as well as the regression relationships are unknown. The regression relationships define the experiment clusters, and in general, the predictor and response variables may not exhibit any clustering. We call this prediction problem clustered regression with unknown clusters (CRUC) and in this paper we focus on linear regression. We study and compare several methods for CRUC, demonstrate their applicability to the Yahoo Learning-to-rank Challenge (YLRC) dataset, and investigate an associated mathematical model. CRUC is at the crossroads of many prior works and we study several prediction algorithms with diverse origins: an adaptation of the expectation-maximization algorithm, an approach inspired by K-means clustering, the singular value thresholding approach to matrix rank minimization u...
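
One way to picture the K-means-inspired approach mentioned in this abstract is an alternating scheme: fit one linear model per cluster, then move each experiment to the model that fits it best. The sketch below is a generic illustration under that assumption, not the authors' algorithm; the function name `clustered_regression`, the random initial assignment, and the rule that an empty cluster keeps its previous coefficients are all hypothetical choices.

```python
import numpy as np

def clustered_regression(experiments, n_clusters, n_iters=20, seed=0):
    """Alternate least-squares refits and reassignment of whole experiments.

    experiments: list of (X, y) pairs, X of shape (n_e, d), y of shape (n_e,).
    Returns the cluster label per experiment, one coefficient vector per
    cluster, and the total squared error recorded after each reassignment.
    """
    rng = np.random.default_rng(seed)
    dim = experiments[0][0].shape[1]
    labels = rng.integers(n_clusters, size=len(experiments))
    coefs = np.zeros((n_clusters, dim))
    history = []
    for _ in range(n_iters):
        # Refit step: pooled least squares over the experiments in each cluster.
        for k in range(n_clusters):
            members = [e for e, lab in zip(experiments, labels) if lab == k]
            if members:  # an empty cluster keeps its previous coefficients
                X_pool = np.vstack([Xe for Xe, _ in members])
                y_pool = np.concatenate([ye for _, ye in members])
                coefs[k] = np.linalg.lstsq(X_pool, y_pool, rcond=None)[0]
        # Assignment step: each experiment joins the best-fitting model.
        costs = np.array([[np.sum((ye - Xe @ w) ** 2) for w in coefs]
                          for Xe, ye in experiments])
        labels = costs.argmin(axis=1)
        history.append(costs.min(axis=1).sum())
    return labels, coefs, history
```

As with K-means, neither step can increase the total squared error, so the recorded error never rises, but the scheme can stop at a local optimum that depends on the initial assignment.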

  15. MACHINE MOTION EQUATIONS

    Directory of Open Access Journals (Sweden)

    Florian Ion Tiberiu Petrescu

    2015-09-01

    This paper presents original dynamic equations of motion for machines. The equation of motion that gives the angular speed of the shaft (which varies with position and rotation speed) is deduced from the conservation of the machine's kinetic energy. An additional variation of angular speed is introduced by multiplying by the dynamic coefficient D (generated by forces outside the mechanism and/or by forces arising from the elasticity of the system). Kinetic energy conservation accounts for the variation of the shaft's angular speed with the inertial masses, while the dynamic coefficient introduces the variation of ω with the forces acting in the mechanism. Differentiating the first equation of motion of the machine yields the second, dynamic equation of motion, from which the angular acceleration of the shaft is determined. The paper shows the distribution of the forces on the mechanism for internal combustion heat engines. Dynamically, the velocities can be distributed in the same way as the forces; in practice, in dynamic regimes, the velocities have the same timing as the forces. Calculations should be made for an engine with a single cylinder. The method is illustrated first for a classic distribution mechanism, and then for the module B distribution mechanism of an Otto-type engine.

  16. Robust nonlinear regression in applications

    OpenAIRE

    Lim, Changwon; Sen, Pranab K.; Peddada, Shyamal D.

    2013-01-01

    Robust statistical methods, such as M-estimators, are needed for nonlinear regression models because of the presence of outliers/influential observations and heteroscedasticity. Outliers and influential observations are commonly observed in many applications, especially in toxicology and agricultural experiments. For example, dose response studies, which are routinely conducted in toxicology and agriculture, sometimes result in potential outliers, especially in the high dose gr...

  17. Astronomical Methods for Nonparametric Regression

    Science.gov (United States)

    Steinhardt, Charles L.; Jermyn, Adam

    2017-01-01

    I will discuss commonly used techniques for nonparametric regression in astronomy. We find that several of them, particularly running averages and running medians, are generically biased, asymmetric between dependent and independent variables, and perform poorly in recovering the underlying function, even when errors are present only in one variable. We then examine less-commonly used techniques such as Multivariate Adaptive Regression Splines and Boosted Trees and find them superior in bias, asymmetry, and variance both theoretically and in practice under a wide range of numerical benchmarks. In this context the chief advantage of the common techniques is runtime, which even for large datasets is now measured in microseconds compared with milliseconds for the more statistically robust techniques. This points to a tradeoff between bias, variance, and computational resources which in recent years has shifted heavily in favor of the more advanced methods, primarily driven by Moore's Law. Along these lines, we also propose a new algorithm which has better overall statistical properties than all techniques examined thus far, at the cost of significantly worse runtime, in addition to providing guidance on choosing the nonparametric regression technique most suitable to any specific problem. We then examine the more general problem of errors in both variables and provide a new algorithm which performs well in most cases and lacks the clear asymmetry of existing non-parametric methods, which fail to account for errors in both variables.

  18. Standard Precipitation Index Drought Forecasting Using Neural Networks, Wavelet Neural Networks, and Support Vector Regression

    Directory of Open Access Journals (Sweden)

    A. Belayneh

    2012-01-01

    Drought forecasts can be an effective tool for mitigating some of the more adverse consequences of drought. Data-driven models are suitable forecasting tools due to their rapid development times, as well as minimal information requirements compared to the information required for physically based models. This study compares the effectiveness of three data-driven models for forecasting drought conditions in the Awash River Basin of Ethiopia. The Standard Precipitation Index (SPI) is forecast and compared using artificial neural networks (ANNs), support vector regression (SVR), and wavelet neural networks (WN). SPI 3 and SPI 12 were the SPI values that were forecast, over lead times of 1 and 6 months. The performance of all the models was compared using RMSE, MAE, and R². The forecast results indicate that the coupled wavelet neural network (WN) models were the best models for forecasting SPI values over multiple lead times in the Awash River Basin in Ethiopia.
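
For readers unfamiliar with the three scores named in this abstract, the following minimal sketch computes them for a pair of observed and forecast series. The function name `forecast_scores` is hypothetical, and the R2 shown is the coefficient of determination; the study may instead have used the squared correlation coefficient, which the abstract does not specify.

```python
import numpy as np

def forecast_scores(observed, predicted):
    """Return RMSE, MAE and the coefficient of determination for a forecast."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = observed - predicted
    rmse = float(np.sqrt(np.mean(err ** 2)))   # penalises large errors more
    mae = float(np.mean(np.abs(err)))          # average absolute miss
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((observed - observed.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot                 # 1.0 means a perfect forecast
    return {"rmse": rmse, "mae": mae, "r2": r2}
```

Lower RMSE and MAE and a higher R2 indicate a better forecast, which is how models such as ANN, SVR and WN would be ranked at each lead time.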

  19. Estimating stellar atmospheric parameters based on LASSO and support-vector regression

    CERN Document Server

    Lu, Yu

    2015-01-01

    A scheme for estimating the atmospheric parameters $T_{\mathrm{eff}}$, $\log g$, and [Fe/H] is proposed on the basis of the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm and the Haar wavelet. The proposed scheme consists of three processes. First, a spectrum is decomposed using the Haar wavelet transform, and the low-frequency components at the fourth level are taken as candidate features. Then, spectral features for estimating the atmospheric parameters are detected among the candidate features using the LASSO algorithm. Finally, the atmospheric parameters are estimated from the extracted spectral features using the support-vector regression (SVR) method. The proposed scheme was evaluated using three sets of stellar spectra from the Sloan Digital Sky Survey (SDSS), the Large Sky Area Multi-object Fiber Spectroscopic Telescope (LAMOST), and Kurucz's model, respectively. The mean absolute errors are as follows: for 40 000 SDSS spectra, 0.0062 dex for $\log T_{\mathrm{eff}}$ (85.83 K for $T_{\mathrm{eff}}$), 0.2035 dex for $\log g$ and 0.1512...
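
The first process in this scheme, taking the low-frequency Haar components at the fourth level, can be sketched in a few lines. This is only an illustration of that one step: the orthonormal normalisation, the handling of odd lengths, and the function name `haar_approximation` are assumptions, and the LASSO selection and SVR estimation steps are not shown.

```python
import numpy as np

def haar_approximation(signal, level=4):
    """Low-frequency (approximation) Haar coefficients after `level` steps.

    Each step replaces neighbouring pairs by their sum divided by sqrt(2),
    halving the length, so level 4 shortens the input by a factor of 16.
    """
    approx = np.asarray(signal, dtype=float)
    for _ in range(level):
        if approx.size % 2:
            # Assumption: drop a trailing sample when the length is odd.
            approx = approx[:-1]
        approx = (approx[0::2] + approx[1::2]) / np.sqrt(2.0)
    return approx
```

Applied to a spectrum sampled at many wavelengths, the result is a much shorter vector of smoothed fluxes, which is what makes a subsequent sparse feature selection tractable.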

  20. A Novel Method for Flatness Pattern Recognition via Least Squares Support Vector Regression

    Institute of Scientific and Technical Information of China (English)

    2012-01-01

    To adapt to the new requirements of the developing flatness control theory and technology, cubic patterns were introduced on the basis of the traditional linear, quadratic and quartic flatness basic patterns. Linear, quadratic, cubic and quartic Legendre orthogonal polynomials were adopted to express the flatness basic patterns. In order to overcome the defects of the existing recognition methods based on fuzzy, neural network and support vector regression (SVR) theory, a novel flatness pattern recognition method based on least squares support vector regression (LS-SVR) was proposed. On this basis, for the purpose of determining the hyper-parameters of LS-SVR effectively and enhancing the recognition accuracy and generalization performance of the model, a particle swarm optimization algorithm with the leave-one-out (LOO) error as fitness function was adopted. To overcome the high computational complexity of the naive cross-validation algorithm, a novel fast cross-validation algorithm was introduced to calculate the LOO error of LS-SVR. Results of experiments on theoretically calculated flatness data and on flatness signals measured on a 900HC cold-rolling mill demonstrate that the proposed approach can distinguish the types and define the magnitudes of the flatness defects effectively with high accuracy, high speed and strong generalization ability.
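
To see why the leave-one-out error of an LS-SVR need not be computed by refitting the model once per sample, consider the sketch below. It uses a well-known closed-form identity for least-squares kernel models (the LOO residual of sample i equals its dual coefficient divided by the i-th diagonal entry of the inverse system matrix); whether this matches the fast algorithm of the paper is an assumption, and the RBF kernel, the parameter names `gamma` and `sigma`, and all function names are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq_dist = ((A ** 2).sum(axis=1)[:, None]
               + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))

def lssvr_fit(K, y, gamma):
    """Solve the LS-SVR linear system; return (alpha, bias, system matrix)."""
    n = len(y)
    C = np.zeros((n + 1, n + 1))
    C[:n, :n] = K + np.eye(n) / gamma   # kernel matrix plus ridge term
    C[:n, n] = 1.0                      # bias column
    C[n, :n] = 1.0                      # constraint: sum(alpha) = 0
    solution = np.linalg.solve(C, np.append(y, 0.0))
    return solution[:n], solution[n], C

def loo_residuals_fast(K, y, gamma):
    """All LOO residuals from a single fit: alpha_i / (C^-1)_ii."""
    alpha, _, C = lssvr_fit(K, y, gamma)
    diag_inv = np.diag(np.linalg.inv(C))[: len(y)]
    return alpha / diag_inv

def loo_residuals_naive(K, y, gamma):
    """Reference implementation: refit n times, each time omitting one sample."""
    n = len(y)
    residuals = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        alpha, bias, _ = lssvr_fit(K[keep][:, keep], y[keep], gamma)
        residuals[i] = y[i] - (K[i, keep] @ alpha + bias)
    return residuals
```

The mean of the squared fast residuals could then serve as the fitness value that a hyper-parameter search, such as particle swarm optimization, tries to minimise over `gamma` and `sigma`; the search itself is not shown here.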