Directory of Open Access Journals (Sweden)
Hailun Wang
2017-01-01
Support vector regression is widely used in fault diagnosis of rolling bearings. A new model parameter selection method for support vector regression, based on adaptive fusion of a mixed kernel function, is proposed in this paper. We choose a mixed kernel function as the kernel of the support vector regression, and combine its fusion coefficients, kernel parameters, and regression parameters into a single state vector, so that the model selection problem is transformed into a nonlinear state estimation problem. A 5th-degree cubature Kalman filter is used to estimate these parameters, realizing adaptive selection of the mixed kernel weighting coefficients, the kernel parameters, and the regression parameters. Compared with single kernel functions, unscented Kalman filter (UKF) support vector regression algorithms, and genetic algorithms, the regression function obtained by the proposed method has better generalization ability and higher prediction accuracy.
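The kernel fusion described above can be sketched generically: a convex combination of an RBF kernel and a polynomial kernel passed to an off-the-shelf SVR. This is an illustrative sketch using scikit-learn, not the paper's implementation; the weight `w`, `gamma`, and `degree` below are placeholders for the quantities the paper estimates with the cubature Kalman filter, and the data are synthetic.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics.pairwise import rbf_kernel, polynomial_kernel

def mixed_kernel(X, Y, w=0.5, gamma=0.5, degree=2):
    # Weighted fusion of a local (RBF) and a global (polynomial) kernel;
    # a convex combination of PSD kernels is itself a valid PSD kernel.
    return w * rbf_kernel(X, Y, gamma=gamma) + (1 - w) * polynomial_kernel(X, Y, degree=degree)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(80, 1))
y = np.sin(3 * X).ravel() + 0.05 * rng.standard_normal(80)

# scikit-learn accepts a callable kernel returning the Gram matrix
model = SVR(kernel=lambda A, B: mixed_kernel(A, B, w=0.7), C=10.0, epsilon=0.01)
model.fit(X, y)
pred = model.predict(X)
```

In the paper the fusion weight and kernel parameters are part of the estimated state vector rather than fixed up front as they are here.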
Support Vector Regression Method for Wind Speed Prediction Incorporating Probability Prior Knowledge
Directory of Open Access Journals (Sweden)
Jiqiang Chen
2014-01-01
Prior knowledge, such as the wind speed probability distribution estimated from historical data and the wind speed fluctuation between the maximal and minimal values in a certain period of time, provides much more information about the wind speed, so it is worth incorporating into wind speed prediction. First, a method of estimating the wind speed probability distribution from historical data is proposed based on Bernoulli's law of large numbers. Second, in order to describe the wind speed fluctuation between the maximal and minimal values in a certain period of time, the probability distribution estimated by the proposed method is incorporated into the training and testing data. Third, a support vector regression model for wind speed prediction is proposed based on standard support vector regression. Finally, experiments predicting the wind speed at a certain wind farm show that the proposed method is feasible and effective, and that the model's running time and prediction errors meet the needs of wind speed prediction.
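As a rough illustration of the idea (not the paper's exact construction), the distribution can be estimated by relative frequencies over bins, which Bernoulli's law of large numbers justifies as the historical sample grows, and each sample's bin probability appended as an extra input feature. The Weibull-shaped data and the bin count below are made-up placeholders.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
# Synthetic "historical" wind speeds; a Weibull shape is a common wind model
hist = rng.weibull(2.0, 5000) * 8.0

# Relative-frequency estimate of the speed distribution over 20 bins
bins = np.linspace(0.0, hist.max() + 1e-9, 21)
freq, _ = np.histogram(hist, bins=bins)
prob = freq / freq.sum()

def prob_feature(v):
    """Estimated probability of the bin each speed falls into."""
    idx = np.clip(np.digitize(v, bins) - 1, 0, len(prob) - 1)
    return prob[idx]

# Toy next-step prediction: lagged speed plus its bin probability as inputs
series = rng.weibull(2.0, 300) * 8.0
X = np.column_stack([series[:-1], prob_feature(series[:-1])])
y = series[1:]
model = SVR(kernel="rbf", C=10.0).fit(X, y)
```

The extra column gives the regressor direct access to how typical each observed speed is under the historical distribution.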
Wu, Peilin; Zhang, Qunying; Fei, Chunjiao; Fang, Guangyou
2017-04-01
Aeromagnetic gradients are typically measured by optically pumped magnetometers mounted on an aircraft. Any aircraft, particularly a helicopter, produces significant levels of magnetic interference, so aeromagnetic compensation is essential; least squares (LS) is the conventional method for reducing interference levels. However, the LS approach to solving the aeromagnetic interference model has several difficulties, one of which is handling multicollinearity. We therefore propose an aeromagnetic gradient compensation method, targeted at helicopters but applicable to any airborne platform, based on the ɛ-support vector regression algorithm. The structural risk minimization criterion intrinsic to the method avoids multicollinearity altogether. By constructing an appropriate loss function and kernel function, local aeromagnetic anomalies are retained while platform-generated fields are simultaneously suppressed. The method was tested using an unmanned helicopter and achieved improvement ratios of 12.7 and 3.5 in the vertical and horizontal gradient data, respectively, both better than those obtained with the conventional method. These experimental results demonstrate the validity of the proposed method.
Castelletti, Davide; Demir, Begüm; Bruzzone, Lorenzo
2014-10-01
This paper presents a novel semisupervised learning (SSL) technique defined in the context of ɛ-insensitive support vector regression (SVR) to estimate biophysical parameters from remotely sensed images. The proposed SSL method aims to mitigate the problems of small, biased training sets without collecting any additional samples with reference measures. This is achieved in two consecutive steps. The first step injects additional prior information into the learning phase of the SVR in order to adapt the importance of each training sample according to the distribution of the unlabeled samples. To this end, a weight is initially associated with each training sample based on a novel strategy that assigns higher weights to samples located in high-density regions of the feature space and reduced weights to those falling in low-density regions. Then, in order to exploit different weights for training samples in the learning phase of the SVR, we introduce a weighted SVR (WSVR) algorithm. The second step jointly exploits labeled and informative unlabeled samples to further improve the definition of the WSVR learning function. To this end, the most informative unlabeled samples, those with expected accurate target values, are initially selected according to a novel strategy that relies on the distribution of the unlabeled samples in the feature space and on the WSVR function estimated in the first step. Then, we introduce a restructured WSVR algorithm that jointly uses labeled and unlabeled samples in the learning phase and tunes their importance with different values of the regularization parameters. Experimental results obtained for the estimation of single-tree stem volume show the effectiveness of the proposed SSL method.
On Weighted Support Vector Regression
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF-weights). This procedure directly...
Active set support vector regression.
Musicant, David R; Feinberg, Alexander
2004-03-01
This paper presents active set support vector regression (ASVR), a new active set strategy to solve a straightforward reformulation of the standard support vector regression problem. This new algorithm is based on the successful ASVM algorithm for classification problems, and consists of solving a finite number of linear equations with a typically large dimensionality equal to the number of points to be approximated. However, by making use of the Sherman-Morrison-Woodbury formula, a much smaller matrix of the order of the original input space is inverted at each step. The algorithm requires no specialized quadratic or linear programming code, but merely a linear equation solver which is publicly available. ASVR is extremely fast, produces comparable generalization error to other popular algorithms, and is available on the web for download.
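The Sherman-Morrison-Woodbury step can be illustrated in isolation: when the large n x n matrix to invert is a cheap-to-invert part plus a low-rank term, only a small k x k system needs to be solved. This is a numerical sketch of the identity, not the ASVR algorithm itself; the diagonal-plus-low-rank matrix below is made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 5                      # many data points, low-dimensional input space
A_diag = rng.uniform(1.0, 2.0, n)  # easily invertible diagonal part
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))
b = rng.standard_normal(n)

# Woodbury identity:
# (A + U V^T)^{-1} b = A^{-1}b - A^{-1}U (I_k + V^T A^{-1} U)^{-1} V^T A^{-1} b
Ainv_b = b / A_diag
Ainv_U = U / A_diag[:, None]
small = np.eye(k) + V.T @ Ainv_U                 # only a k x k system is solved
x = Ainv_b - Ainv_U @ np.linalg.solve(small, V.T @ Ainv_b)

# Check against the direct (expensive) n x n solve
x_direct = np.linalg.solve(np.diag(A_diag) + U @ V.T, b)
```

The per-step cost drops from O(n^3) for the direct solve to O(n k^2 + k^3), which is the source of ASVR's speed when k (the input dimension) is small.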
Directory of Open Access Journals (Sweden)
Mehmet Das
2018-01-01
In this study, an air-heated solar collector (AHSC) dryer was designed to determine the drying characteristics of the pear. Flat pear slices of 10 mm thickness were used in the experiments. The pears were dried both in the AHSC dryer and under the sun. Panel glass temperature, panel floor temperature, panel inlet temperature, panel outlet temperature, drying cabinet inlet temperature, drying cabinet outlet temperature, drying cabinet temperature, drying cabinet moisture, solar radiation, pear internal temperature, air velocity, and mass loss of the pear were measured at 30 min intervals. Experiments were carried out during June 2017 in Elazig, Turkey. The experiments started at 8:00 a.m. and continued until 18:00, and were repeated until the weight of the pear slices stopped changing. Wet-basis moisture content (MCw), dry-basis moisture content (MCd), adjustable moisture ratio (MR), drying rate (DR), and convective heat transfer coefficient (hc) were calculated from both the AHSC dryer and the open sun drying experiment data. The values of hc in both drying systems ranged between 12.4 and 20.8 W/m2 °C. Three different kernel models were used in support vector machine (SVM) regression to construct a predictive model of the calculated hc values for both systems. Mean absolute error (MAE), root mean squared error (RMSE), relative absolute error (RAE), and root relative absolute error (RRAE) analyses were performed to assess the predictive model's accuracy. As a result, the drying rate of the pear was examined for both systems, and the pear was observed to dry earlier in the AHSC drying system. A predictive model of the calculated hc values for the pear in the AHSC drying system was obtained using SVM regression, with the normalized polynomial kernel determined to be the best kernel model for estimating the hc values.
Directory of Open Access Journals (Sweden)
A. Faridi
2013-11-01
Support vector regression (SVR) is used in this study to develop models that estimate the apparent metabolizable energy (AME), AME corrected for nitrogen (AMEn), true metabolizable energy (TME), and TME corrected for nitrogen (TMEn) contents of corn fed to ducks based on its chemical composition. Performance of the SVR models was assessed by comparing their results with those of artificial neural network (ANN) and multiple linear regression (MLR) models. The input variables for estimating metabolizable energy content (MJ kg-1) of corn were crude protein, ether extract, crude fibre, and ash (g kg-1). Goodness of fit of the models was examined using R2, mean square error, and bias. Based on these indices, the predictive performance of the SVR, ANN, and MLR models was acceptable. Comparison of the models indicated that the performance of SVR in terms of R2 on the full data set (0.937 for AME, 0.954 for AMEn, 0.860 for TME, and 0.937 for TMEn) was better than that of ANN (0.907, 0.922, 0.744, and 0.920, respectively) and MLR (0.887, 0.903, 0.704, and 0.902, respectively). Similar findings were observed with the calibration and testing data sets. These results suggest that SVR models are a promising tool for modelling the relationship between chemical composition and metabolizable energy of feedstuffs for poultry. Although the present results are encouraging, the use of such models in other areas of animal nutrition still needs to be evaluated.
Qin, Li-Tang; Liu, Shu-Shen; Liu, Hai-Ling; Zhang, Yong-Hong
2010-01-01
Accurate description of hormetic dose-response curves (DRC) is a key step in determining the efficacy and hazards of pollutants exhibiting the hormetic phenomenon. This study uses support vector regression (SVR) and least squares support vector regression (LS-SVR) to address the curve-fitting problem in hormesis. SVR and LS-SVR, which are entirely different from the non-linear fitting methods used to describe hormetic effects based on large samples, are at present the only suitable methods for the small samples often encountered in experimental toxicology. The tuning parameters (C and p1 for SVR, gam and sig2 for LS-SVR) determining the SVR and LS-SVR models were obtained by both internal and external validation of the models. Internal validation was performed using leave-one-out (LOO) cross-validation, and external validation by splitting the whole data set (12 data points) into training and test sets of equal size (six data points each). The results show that SVR and LS-SVR can accurately describe not only the hormetic J-shaped DRC of seven water-soluble organic solvents (acetonitrile, methanol, ethanol, acetone, ether, tetrahydrofuran, and isopropanol) but also the classical sigmoid DRC of six pesticides (simetryn, prometon, bromacil, velpar, diquat-dibromide monohydrate, and dichlorvos).
Balabin, Roman M; Lomakina, Ekaterina I
2011-04-21
In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest: Nonlinear Regression Analysis and Its Applications, Douglas M. Bates and Donald G. Watts. "...an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models...highly recommend[ed]...for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics. This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s...
Directory of Open Access Journals (Sweden)
Ying-Hsin Chang
2013-01-01
Human estrogen receptor (ER) isoforms, ERα and ERβ, have long been an important focus in the field of biology. To better understand the structural features associated with the binding of ERα ligands to ERα and to modulate their function, several QSAR models, including CoMFA, CoMSIA, SVR, and LR methods, have been employed to predict the inhibitory activity of 68 raloxifene derivatives. In the SVR and LR modeling, 11 descriptors were selected through feature ranking and sequential feature addition/deletion to generate equations predicting the inhibitory activity toward ERα. Among the four descriptors that constantly appear in the generated equations, two agree with the CoMFA and CoMSIA steric fields, and the other two can be correlated with a calculated electrostatic potential of ERα.
Fault Isolation for Nonlinear Systems Using Flexible Support Vector Regression
Directory of Open Access Journals (Sweden)
Yufang Liu
2014-01-01
While support vector regression is widely used both as a function approximation tool and as a residual generator for nonlinear system fault isolation, a drawback of this method is the freedom in selecting model parameters; for samples with discordant distribution complexities, selecting reasonable parameters may even be impossible. To alleviate this problem we introduce flexible support vector regression (F-SVR), which is especially suited to modelling complicated sample distributions, as it is free from parameter selection: reasonable parameters for F-SVR are generated automatically from the sample distribution. Finally, we apply this method to fault isolation of high-frequency power supplies, where satisfactory results have been obtained.
Cardiovascular Response Identification Based on Nonlinear Support Vector Regression
Wang, Lu; Su, Steven W.; Chan, Gregory S. H.; Celler, Branko G.; Cheng, Teddy M.; Savkin, Andrey V.
This study experimentally investigates the relationships between central cardiovascular variables and oxygen uptake based on nonlinear analysis and modeling. Ten healthy subjects were studied using cycle-ergometry exercise tests with constant workloads ranging from 25 W to 125 W. Breath-by-breath gas exchange, heart rate, cardiac output, stroke volume, and blood pressure were measured at each stage. The modeling results showed that the nonlinear modeling method (support vector regression) outperforms the traditional regression method (reducing estimation error by 59% to 80% and testing error by 53% to 72%) and is an ideal approach for modeling physiological data, especially with small training data sets.
Deep Support Vector Machines for Regression Problems
Wiering, Marco; Schutten, Marten; Millea, Adrian; Meijster, Arnold; Schomaker, Lambertus
2013-01-01
In this paper we describe a novel extension of the support vector machine, called the deep support vector machine (DSVM). The original SVM has a single layer with kernel functions and is therefore a shallow model. The DSVM can use an arbitrary number of layers, in which lower-level layers contain
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights...
Directory of Open Access Journals (Sweden)
Hong-Juan Li
2013-04-01
Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by the strong non-linear learning capability of support vector regression (SVR), this paper presents an SVR model hybridized with the empirical mode decomposition (EMD) method and autoregression (AR) for electric load forecasting. Electric load data from the New South Wales (Australia) market are employed to compare the forecasting performance of different models. The results confirm that the proposed model can simultaneously provide forecasts with good accuracy and interpretability.
Directory of Open Access Journals (Sweden)
Zhan-bo Chen
2014-01-01
In order to improve the accuracy of performance prediction for hydraulic excavators, the regression least squares support vector machine is applied. First, the mathematical model of the regression least squares support vector machine is studied, and then its algorithm is designed. Finally, a performance prediction simulation of a hydraulic excavator based on the regression least squares support vector machine is carried out; the simulation results show that this method correctly predicts the performance trends of the hydraulic excavator.
Regression methods for medical research
Tai, Bee Choo
2013-01-01
Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demand an understanding of more complex and sophisticated analytic procedures. The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
Vector wave propagation method.
Fertig, M; Brenner, K-H
2010-04-01
In this paper, we extend the scalar wave propagation method (WPM) to vector fields. The WPM [Appl. Opt. 32, 4984 (1993)] was introduced in order to overcome the major limitations of the beam propagation method (BPM). With the WPM, the range of application can be extended from the simulation of waveguides to other optical elements such as lenses, prisms, and gratings. In that reference it was demonstrated that the wave propagation scheme provides valid results for propagation angles up to 85 degrees and that it is not limited to small index variations along the axis of propagation. Here, we extend the WPM to three-dimensional vectorial fields (the VWPM) by considering the polarization-dependent Fresnel coefficients for transmission in each propagation step. The continuity of the electric field is maintained in all three dimensions by an enhanced propagation vector and the transfer matrix. We verify the validity of the method by transmission through a prism and by comparison with the focal distribution from vectorial Debye theory. Furthermore, a two-dimensional grating is simulated and compared with results from three-dimensional RCWA. Especially for 3D problems, the runtime of the VWPM offers a particular advantage over the RCWA.
Theory of net analyte signal vectors in inverse regression
DEFF Research Database (Denmark)
Bro, R.; Andersen, Charlotte Møller
2003-01-01
The net analyte signal and the net analyte signal vector are useful measures in building and optimizing multivariate calibration models. In this paper a theory for their use in inverse regression is developed. The theory of net analyte signal was originally derived from classical least squares...
Mixed kernel function support vector regression for global sensitivity analysis
Cheng, Kai; Lu, Zhenzhou; Wei, Yuhao; Shi, Yan; Zhou, Yicheng
2017-11-01
Global sensitivity analysis (GSA) plays an important role in exploring the respective effects of input variables on an assigned output response. Amongst the many sensitivity analyses in the literature, the Sobol indices have attracted much attention since they provide accurate information for most models. In this paper, a mixed kernel function (MKF) based support vector regression (SVR) model is employed to evaluate the Sobol indices at low computational cost. With the proposed derivation, the Sobol indices can be obtained by post-processing the coefficients of the SVR meta-model. The MKF is constituted by an orthogonal-polynomial kernel function and a Gaussian radial basis kernel function, so that it possesses both the global characteristic advantage of the polynomial kernel and the local characteristic advantage of the Gaussian radial basis kernel. The proposed approach is suitable for high-dimensional and non-linear problems. Its performance is validated on various analytical functions and compared with the popular polynomial chaos expansion (PCE). Results demonstrate that the proposed approach is an efficient method for global sensitivity analysis.
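To make the surrogate-based workflow concrete, the sketch below trains an ordinary RBF-kernel SVR surrogate and estimates a first-order Sobol index with a generic pick-freeze Monte Carlo estimator. The paper's actual contribution, reading the indices directly off the meta-model coefficients, is not reproduced here, and the additive test function is made up.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

def f(X):
    # Additive test function on U(-1, 1)^2; exact S1 = (1/3) / (1/3 + 1/45) ~ 0.94
    return X[:, 0] + 0.5 * X[:, 1] ** 2

# Train the SVR meta-model on a modest design
X_train = rng.uniform(-1.0, 1.0, (200, 2))
surrogate = SVR(kernel="rbf", C=100.0, epsilon=0.001).fit(X_train, f(X_train))

# Generic pick-freeze Monte Carlo estimate of the first-order index S1,
# evaluated on the cheap surrogate instead of the original model
N = 4000
A = rng.uniform(-1.0, 1.0, (N, 2))
B = rng.uniform(-1.0, 1.0, (N, 2))
AB1 = A.copy()
AB1[:, 0] = B[:, 0]                     # A with variable 1 taken from B
yA, yB, yAB1 = (surrogate.predict(Z) for Z in (A, B, AB1))
S1 = np.mean(yB * (yAB1 - yA)) / np.var(np.concatenate([yA, yB]))
```

All Monte Carlo evaluations hit the surrogate, which is the usual way a meta-model cuts the cost of Sobol estimation; the coefficient post-processing in the paper removes even this sampling step.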
Knowledge-Based Green's Kernel for Support Vector Regression
Directory of Open Access Journals (Sweden)
Tahir Farooq
2010-01-01
This paper presents a novel prior knowledge-based Green's kernel for support vector regression (SVR). After reviewing the correspondence between support vector kernels used in support vector machines (SVMs) and regularization operators used in regularization networks, and the use of the Green's function of the corresponding regularization operator to construct support vector kernels, a mathematical framework is presented that obtains domain knowledge about the magnitude of the Fourier transform of the function to be predicted and designs a prior knowledge-based Green's kernel exhibiting optimal regularization properties, using the concept of matched filters. The matched-filter behavior of the proposed kernel function makes it suitable for signals corrupted with noise, which includes many real-world systems. We conduct several experiments, mostly on benchmark datasets, comparing the performance of the proposed technique with results already published in the literature for other existing support vector kernels over a variety of settings, including different noise levels, noise models, loss functions, and SVM variations. Experimental results indicate that the knowledge-based Green's kernel is a good choice among the candidate kernel functions.
A Simpler Approach to Coefficient Regularized Support Vector Machines Regression
Directory of Open Access Journals (Sweden)
Hongzhi Tong
2014-01-01
We consider a class of support vector machine regression (SVMR) algorithms associated with lq (1 ≤ q < ∞) coefficient-based regularization and a data-dependent hypothesis space. Compared with the prior literature, we provide a simpler convergence analysis for these algorithms. The novelty of our analysis lies in the estimation of the hypothesis error, which is implemented by setting a stepping stone between the coefficient-regularized SVMR and the classical SVMR. An explicit learning rate is then derived under very mild conditions.
Clifford support vector machines for classification, regression, and recurrence.
Bayro-Corrochano, Eduardo Jose; Arana-Daniel, Nancy
2010-11-01
This paper introduces the Clifford support vector machines (CSVM) as a generalization of the real and complex-valued support vector machines using the Clifford geometric algebra. In this framework, we handle the design of kernels involving the Clifford or geometric product. In this approach, one redefines the optimization variables as multivectors. This allows us to have a multivector as output. Therefore, we can represent multiple classes according to the dimension of the geometric algebra in which we work. We show that one can apply CSVM for classification and regression and also to build a recurrent CSVM. The CSVM is an attractive approach for the multiple input multiple output processing of high-dimensional geometric entities. We carried out comparisons between CSVM and the current approaches to solve multiclass classification and regression. We also study the performance of the recurrent CSVM with experiments involving time series. The authors believe that this paper can be of great use for researchers and practitioners interested in multiclass hypercomplex computing, particularly for applications in complex and quaternion signal and image processing, satellite control, neurocomputation, pattern recognition, computer vision, augmented virtual reality, robotics, and humanoids.
Jeffrey T. Walton
2008-01-01
Three machine learning subpixel estimation methods (Cubist, Random Forests, and support vector regression) were applied to estimate urban cover. Urban forest canopy cover and impervious surface cover were estimated from Landsat-7 ETM+ imagery using a higher resolution cover map resampled to 30 m as training and reference data. Three different band combinations (...
Comparison of ν-support vector regression and logistic equation for ...
African Journals Online (AJOL)
Due to the complexity and high non-linearity of bioprocesses, most simple mathematical models fail to describe the exact behavior of biochemical systems. As a novel type of learning method, support vector regression (SVR) has a powerful capability to characterize problems with small samples, nonlinearity, high dimension ...
Electricity Load Forecasting Using Support Vector Regression with Memetic Algorithms
Hu, Zhongyi; Xiong, Tao
2013-01-01
Electricity load forecasting is an important issue that is widely explored and examined in the power systems operation literature as well as in the literature on commercial transactions in electricity markets. Among the existing forecasting models, support vector regression (SVR) has gained much attention. Because the performance of SVR depends strongly on its parameters, this study proposes a firefly algorithm (FA) based memetic algorithm (FA-MA) to appropriately determine the parameters of the SVR forecasting model. In the proposed FA-MA algorithm, the FA is applied to explore the solution space, and pattern search is used to conduct individual learning and thus enhance the exploitation of the FA. Experimental results confirm that the proposed FA-MA based SVR model not only yields more accurate forecasts than four other evolutionary-algorithm-based SVR models and three well-known forecasting models but also outperforms the hybrid algorithms in the related literature. PMID:24459425
Support Vector Regression Model for Direct Methanol Fuel Cell
Tang, J. L.; Cai, C. Z.; Xiao, T. T.; Huang, S. J.
2012-07-01
The purpose of this paper is to establish a direct methanol fuel cell (DMFC) prediction model using the support vector regression (SVR) approach combined with the particle swarm optimization (PSO) algorithm for parameter selection. Two variables, cell temperature and cell current density, were employed as input variables, and the cell voltage of the DMFC was the output variable. Using a leave-one-out cross-validation (LOOCV) test on 21 samples, the maximum absolute percentage error (APE) is 5.66%, the mean absolute percentage error (MAPE) is only 0.93%, and the correlation coefficient (R2) is as high as 0.995. Compared with the results of an artificial neural network (ANN) approach, the modeling ability of SVR surpasses that of ANN. These results suggest that the SVR prediction model can be a good predictor of the cell voltage of a DMFC system.
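The LOOCV-based parameter selection can be sketched with a plain grid search standing in for PSO, which the paper actually uses; the 21 synthetic (temperature, current density) → voltage samples below are placeholders, not the paper's data.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, LeaveOneOut
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Hypothetical stand-in data: (cell temperature, current density) -> cell voltage
X = np.column_stack([rng.uniform(40, 80, 21), rng.uniform(50, 250, 21)])
y = 0.8 + 0.003 * X[:, 0] - 0.002 * X[:, 1] + 0.01 * rng.standard_normal(21)

# Exhaustive search over (C, gamma, epsilon), scored by leave-one-out error
grid = GridSearchCV(
    SVR(kernel="rbf"),
    {"C": [1.0, 10.0, 100.0], "gamma": [1e-4, 1e-3, 1e-2], "epsilon": [1e-3, 1e-2]},
    cv=LeaveOneOut(),
    scoring="neg_mean_absolute_error",
)
grid.fit(X, y)
```

PSO explores the same parameter space with a swarm instead of a fixed lattice, which matters more as the number of tuned parameters grows.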
Multivariate Lesion-Symptom Mapping Using Support Vector Regression
Zhang, Yongsheng; Kimberg, Daniel Y.; Coslett, H. Branch; Schwartz, Myrna F.; Wang, Ze
2014-01-01
Lesion analysis is a classic approach to studying brain function. Because brain function results from the coherent activation of a collection of functionally related voxels, lesion-symptom relations are generally contributed to by multiple voxels simultaneously. Although voxel-based lesion-symptom mapping (VLSM) has made substantial contributions to the understanding of brain-behavior relationships, a better understanding of the relationships contributed by multiple brain regions requires a multivariate lesion-symptom mapping (MLSM). The purpose of this paper was to develop an MLSM using a machine learning-based multivariate regression algorithm: support vector regression (SVR). In the proposed SVR-LSM, the symptom relation to the entire lesion map, as opposed to each isolated voxel, is modeled using a non-linear function, so inter-voxel correlations are intrinsically considered, resulting in a potentially more sensitive way to examine lesion-symptom relationships. To explore the relative merits of VLSM and SVR-LSM we used both approaches in the analysis of a synthetic dataset. SVR-LSM showed much higher sensitivity and specificity for detecting the synthetic lesion-behavior relations than VLSM. When applied to lesion data and language measures from patients with brain damage, SVR-LSM reproduced the essential pattern of previous findings identified by VLSM and showed higher sensitivity than VLSM for identifying lesion-behavior relations. Our data also show the possibility of using lesion data to predict continuous behavior scores. PMID:25044213
Support vector regression for real-time flood stage forecasting
Yu, Pao-Shan; Chen, Shien-Tsung; Chang, I.-Fan
2006-09-01
Flood forecasting is an important non-structural approach to flood mitigation. The flood stage is chosen as the variable to be forecasted because it is practically useful in flood forecasting. The support vector machine, a novel artificial intelligence-based method developed from statistical learning theory, is adopted herein to establish a real-time stage forecasting model. The lags associated with the input variables are determined by applying the hydrological concept of the time of response, and a two-step grid search method is applied to find the optimal parameters, thus overcoming the difficulties in constructing the learning machine. Two model structures for performing multiple-hour-ahead stage forecasts are developed. Validation results from flood events in the Lan-Yang River, Taiwan, reveal that the proposed models can effectively forecast the flood stage one to six hours ahead. Moreover, a sensitivity analysis was conducted on the lags associated with the input variables.
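The two-step grid search described above, a coarse log-spaced sweep followed by a refined sweep around the coarse optimum, can be sketched as follows (the objective here is a stand-in for the real cross-validated forecast error of the learning machine, and all ranges are illustrative):

```python
import itertools
import numpy as np

def validation_error(C, eps):
    # Stand-in for the real objective (e.g. the cross-validated forecast
    # error of an SVR model trained with parameters C and eps).
    return (np.log10(C) - 1.0) ** 2 + (np.log10(eps) + 2.0) ** 2

def grid_search(c_grid, e_grid):
    return min(itertools.product(c_grid, e_grid),
               key=lambda p: validation_error(*p))

# Step 1: coarse search over wide, log-spaced parameter ranges.
C0, e0 = grid_search(np.logspace(-3, 5, 9), np.logspace(-5, 1, 7))
# Step 2: fine search in a narrow window around the coarse optimum.
C1, e1 = grid_search(C0 * np.logspace(-1, 1, 21), e0 * np.logspace(-1, 1, 21))
```

The coarse step keeps the number of expensive model trainings small; the fine step recovers the resolution a single dense grid would need.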
A Support Vector Regression Approach for Investigating Multianticipative Driving Behavior
Directory of Open Access Journals (Sweden)
Bin Lu
2015-01-01
Full Text Available This paper presents a Support Vector Regression (SVR) approach for predicting multianticipative driving behavior from vehicle trajectory data. Building upon the SVR approach, a multianticipative car-following model is developed with improved learning speed and prediction accuracy. Model training and validation are conducted using field trajectory data extracted from the Next Generation Simulation (NGSIM) project. During the training and validation tests, the estimation results show that the SVR model performs as well as the IDM model with respect to prediction accuracy. In addition, this paper performs a relative importance analysis to quantify multianticipation in terms of the different stimuli to which drivers react in platoon car following. The analysis results confirm that drivers respond to the behavior of not only the immediate leading vehicle in front but also the second, third, and even fourth leading vehicles. Specifically, in congested traffic conditions, drivers are observed to be more sensitive to relative speed than to the gap. These findings provide insight into multianticipative driving behavior and illustrate the necessity of accounting for multianticipation in car-following models for microscopic traffic simulation.
Maroco, João; Silva, Dina; Rodrigues, Ana; Guerreiro, Manuela; Santana, Isabel; de Mendonça, Alexandre
2011-08-17
Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures for Mild Cognitive Impairment (MCI), but it presently has limited value in predicting progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, like Neural Networks, Support Vector Machines and Random Forests, can improve the accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non-parametric classifiers derived from data mining methods (Multilayer Perceptron Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees, and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and an area under the ROC of Me = 0.90. However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most
Energy Technology Data Exchange (ETDEWEB)
Riaz, Nadeem; Wiersma, Rodney; Mao Weihua; Xing Lei [Department of Radiation Oncology, Stanford University, 875 Blake Wilbur Drive, Stanford, CA 94305-5847 (United States); Shanker, Piyush; Gudmundsson, Olafur; Widrow, Bernard [Department of Electrical Engineering, Stanford University, Stanford, CA 94305 (United States)], E-mail: nriaz@stanford.edu
2009-10-07
Intra-fraction tumor tracking methods can improve radiation delivery during radiotherapy sessions. Image acquisition for tumor tracking and subsequent adjustment of the treatment beam with gating or beam tracking introduce time latency and necessitate predicting the future position of the tumor. This study evaluates the use of multi-dimensional linear adaptive filters and support vector regression to predict the motion of lung tumors tracked at 30 Hz. We expand on the prior work of other groups who have looked at adaptive filters by using a general framework of a multiple-input single-output (MISO) adaptive system that uses multiple correlated signals to predict the motion of a tumor. We compare the performance of these two novel methods to conventional methods like linear regression and single-input, single-output adaptive filters. At 400 ms latency, the average root-mean-square errors (RMSEs) for the 14 treatment sessions studied using no prediction, linear regression, single-output adaptive filter, MISO and support vector regression are 2.58, 1.60, 1.58, 1.71 and 1.26 mm, respectively. At 1 s, the RMSEs are 4.40, 2.61, 3.34, 2.66 and 1.93 mm, respectively. We find that support vector regression most accurately predicts the future tumor position of the methods studied and can provide an RMSE of less than 2 mm at 1 s latency. Also, a multi-dimensional adaptive filter framework provides improved performance over single-dimension adaptive filters. Work is underway to combine these two frameworks to improve performance.
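A toy sketch illustrates why latency compensation matters: on a synthetic breathing-like trace, even a simple linear extrapolator (a crude stand-in for the adaptive filters and SVR studied in the paper) beats using the uncompensated current position. All signal parameters below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0, 60, 1 / 30)                 # 60 s of motion sampled at 30 Hz
pos = 10 * np.sin(2 * np.pi * t / 4)         # ~4 s breathing cycle, 10 mm amplitude
pos += rng.normal(0, 0.2, t.size)            # measurement noise

lag, k = 12, 8                               # 400 ms latency; history window length

no_pred, lin_pred, actual = [], [], []
for i in range(k, t.size - lag):
    actual.append(pos[i + lag])
    no_pred.append(pos[i])                   # use current position, no prediction
    # fit a least-squares line to the last k samples and extrapolate it
    a, b = np.polyfit(np.arange(k), pos[i - k + 1:i + 1], 1)
    lin_pred.append(a * (k - 1 + lag) + b)

def rmse(p):
    return float(np.sqrt(np.mean((np.asarray(p) - np.asarray(actual)) ** 2)))

print(rmse(no_pred), rmse(lin_pred))
```

The no-prediction error is dominated by the phase shift over the latency; any predictor that tracks the local slope removes most of it, which mirrors the ranking reported above.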
Zhou, Lim Yi; Shan, Fam Pei; Shimizu, Kunio; Imoto, Tomoaki; Lateh, Habibah; Peng, Koay Swee
2017-08-01
A comparative study of logistic regression, support vector machine (SVM) and least squares support vector machine (LSSVM) models was conducted to predict slope failure (landslide) along the East-West Highway (Gerik-Jeli). The effects of the two monsoon seasons (southwest and northeast) that occur in Malaysia are considered in this study. Two factors related to the occurrence of slope failure are included: rainfall and underground water. For each method, two predictive models are constructed, namely the SOUTHWEST and NORTHEAST models. Based on the results obtained from the logistic regression models, both factors (rainfall and underground water level) contribute to the occurrence of slope failure. The accuracies of the three statistical models for the two monsoon seasons are verified using relative operating characteristic curves. The validation results show that all models produce predictions of high accuracy. For SVM and LSSVM, the models using the RBF kernel show better prediction than those using the linear kernel. The comparative results show that, for the SOUTHWEST models, the three statistical models have relatively similar performance. For the NORTHEAST models, logistic regression has the best predictive efficiency, whereas the SVM model has the second best.
A Novel Empirical Mode Decomposition With Support Vector Regression for Wind Speed Forecasting.
Ren, Ye; Suganthan, Ponnuthurai Nagaratnam; Srikanth, Narasimalu
2016-08-01
Wind energy is a clean and abundant renewable energy source. Accurate wind speed forecasting is essential for power dispatch planning, unit commitment decisions, maintenance scheduling, and regulation. However, wind is intermittent and wind speed is difficult to predict. This brief proposes a novel wind speed forecasting method integrating empirical mode decomposition (EMD) and support vector regression (SVR). The EMD is used to decompose the wind speed time series into several intrinsic mode functions (IMFs) and a residue. Subsequently, a vector combining one historical data point from each IMF and the residue is generated to train the SVR. The proposed EMD-SVR model is evaluated with a wind speed data set and outperforms several recently reported methods with respect to accuracy or computational complexity.
An Adaptive Support Vector Regression Machine for the State Prognosis of Mechanical Systems
Directory of Open Access Journals (Sweden)
Qing Zhang
2015-01-01
Full Text Available Due to the unsteady state evolution of mechanical systems, the time series of state indicators exhibits volatile behavior and staged characteristics. To model hidden trends and predict deterioration failure from volatile state indicators, an adaptive support vector regression (ASVR) machine is proposed. In ASVR, the width of the error-insensitive tube, which is a constant in traditional support vector regression, is set as a variable determined by the transient distribution boundary of local regions in the training time series. The localized regions are obtained using a sliding time window, and their boundaries are defined by a robust measure known as the truncated range. Utilizing an adaptive error-insensitive tube, a stable tolerance level for noise is achieved whether the time series occurs in low-volatility or high-volatility regions. The proposed method is evaluated on vibration data measured on descaling pumps. The results show that ASVR is capable of capturing the local trends of the volatile time series of state indicators and is superior to standard support vector regression for state prediction.
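The adaptive tube idea can be approximated by computing a truncated range over a sliding window; this sketch uses percentile truncation and illustrative parameter values (the paper's exact truncation rule may differ):

```python
import numpy as np

def adaptive_epsilon(series, window=20, lo=10, hi=90, scale=0.5):
    """Per-sample tube width from the truncated range (hi-th minus lo-th
    percentile) of a sliding window over the training series."""
    series = np.asarray(series, dtype=float)
    eps = np.empty(series.size)
    for i in range(series.size):
        w = series[max(0, i - window + 1):i + 1]
        eps[i] = scale * (np.percentile(w, hi) - np.percentile(w, lo))
    return eps

calm = 0.1 * np.sin(np.linspace(0, 3, 200))       # low-volatility segment
volatile = 2.0 * np.sin(np.linspace(0, 30, 200))  # high-volatility segment
eps = adaptive_epsilon(np.concatenate([calm, volatile]))
```

The percentile truncation makes the width robust to isolated outliers inside a window, so the tube widens only where the series is genuinely volatile.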
Estimating transmitted waves of floating breakwater using support vector regression model
Digital Repository Service at National Institute of Oceanography (India)
Mandal, S.; Hegde, A.V.; Kumar, V.; Patil, S.G.
to diameter of pipes (S/D). The radial basis function performed better than the polynomial function as the kernel function of the regressive support vector machine for the given set of data. The support vector regression model gives the correlation coefficients...
DOA Finding with Support Vector Regression Based Forward-Backward Linear Prediction.
Pan, Jingjing; Wang, Yide; Le Bastard, Cédric; Wang, Tianzhen
2017-05-27
Direction-of-arrival (DOA) estimation has drawn considerable attention in array signal processing, particularly with coherent signals and a limited number of snapshots. Forward-backward linear prediction (FBLP) is able to deal directly with coherent signals. Support vector regression (SVR) is robust with small samples. This paper proposes combining the advantages of FBLP and SVR in the estimation of the DOAs of coherent incoming signals with few snapshots. The performance of the proposed method is validated with numerical simulations in coherent scenarios, in terms of different angle separations, numbers of snapshots, and signal-to-noise ratios (SNRs). Simulation results show the effectiveness of the proposed method.
DOA Finding with Support Vector Regression Based Forward–Backward Linear Prediction
Directory of Open Access Journals (Sweden)
Jingjing Pan
2017-05-01
Full Text Available Direction-of-arrival (DOA) estimation has drawn considerable attention in array signal processing, particularly with coherent signals and a limited number of snapshots. Forward–backward linear prediction (FBLP) is able to deal directly with coherent signals. Support vector regression (SVR) is robust with small samples. This paper proposes combining the advantages of FBLP and SVR in the estimation of the DOAs of coherent incoming signals with few snapshots. The performance of the proposed method is validated with numerical simulations in coherent scenarios, in terms of different angle separations, numbers of snapshots, and signal-to-noise ratios (SNRs). Simulation results show the effectiveness of the proposed method.
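A minimal noise-free FBLP example (without the SVR robustification proposed in the paper) shows how forward-backward prediction coefficients recover a DOA from the roots of the prediction polynomial; the array geometry and all values are illustrative:

```python
import numpy as np

M, p = 10, 3                                  # sensors, prediction order
theta = np.deg2rad(20.0)                      # true direction of arrival
m = np.arange(M)
x = np.exp(1j * np.pi * np.sin(theta) * m)    # half-wavelength ULA response

# Forward equations: x[n] ~ sum_k a[k] x[n-k]; backward rows use conjugates.
A_f = np.array([x[n - 1::-1][:p] for n in range(p, M)])
b_f = x[p:M]
A_b = np.conj(np.array([x[n + 1:n + 1 + p] for n in range(M - p)]))
b_b = np.conj(x[:M - p])
A = np.vstack([A_f, A_b])
b = np.concatenate([b_f, b_b])

a, *_ = np.linalg.lstsq(A, b, rcond=None)     # minimum-norm FBLP coefficients
roots = np.roots(np.concatenate([[1.0], -a])) # zeros of 1 - sum_k a_k z^-k
z = roots[np.argmin(np.abs(np.abs(roots) - 1.0))]  # signal root near unit circle
doa = np.rad2deg(np.arcsin(np.angle(z) / np.pi))
```

With the minimum-norm solution the extraneous roots fall inside the unit circle, so the root nearest the circle carries the arrival angle; the paper's contribution is replacing the least-squares solve with SVR to keep this robust at low SNR.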
Linear and support vector regressions based on geometrical correlation of data
Directory of Open Access Journals (Sweden)
Kaijun Wang
2007-10-01
Full Text Available Linear regression (LR) and support vector regression (SVR) are widely used in data analysis. Geometrical correlation learning (GcLearn) was proposed recently to improve the predictive ability of LR and SVR by mining and using correlations between the data of a variable (inner correlation). This paper theoretically analyzes the prediction performance of the GcLearn method and proves that GcLearn LR and SVR have better prediction performance than traditional LR and SVR when good inner correlations are obtained and the predictions of traditional LR and SVR are far from their neighboring training data under inner correlation. This gives the applicability condition of the GcLearn method.
Single Image Super-Resolution by Non-Linear Sparse Representation and Support Vector Regression
Directory of Open Access Journals (Sweden)
Yungang Zhang
2017-02-01
Full Text Available Sparse representations are widely used tools in image super-resolution (SR) tasks. In sparsity-based SR methods, linear sparse representations are often used for image description. However, the non-linear data distributions in images might not be well represented by linear sparse models. Moreover, many sparsity-based SR methods require the image patch self-similarity assumption; however, this assumption may not always hold. In this paper, we propose a novel method for single image super-resolution (SISR). Unlike most prior sparsity-based SR methods, the proposed method uses a non-linear sparse representation to enhance the description of the non-linear information in images, and the proposed framework does not need to assume the self-similarity of image patches. Based on the minimum reconstruction errors, support vector regression (SVR) is applied to predict the SR image. The proposed method was evaluated on various benchmark images, and promising results were obtained.
SNPs selection using support vector regression and genetic algorithms in GWAS.
de Oliveira, Fabrízzio Condé; Borges, Carlos Cristiano Hasenclever; Almeida, Fernanda Nascimento; e Silva, Fabyano Fonseca; da Silva Verneque, Rui; da Silva, Marcos Vinicius G B; Arbex, Wagner
2014-01-01
This paper proposes a new methodology to simultaneously select the most relevant SNP markers for the characterization of any measurable phenotype described by a continuous variable, using support vector regression with the Pearson Universal kernel (PUK) as the fitness function of a binary genetic algorithm. The proposed methodology is multi-attribute, considering several markers simultaneously to explain the phenotype, and is based jointly on statistical tools, machine learning and computational intelligence. The suggested method showed potential on simulated database 1, with additive effects only, and on the real database. In this simulated database of 1,000 markers, 7 with a major effect on the phenotype and the other 993 representing noise, the method identified 21 markers; of these, 5 are among the 7 relevant SNPs, while 16 are false positives. On the real database, initially with 50,752 SNPs, we reduced the set to 3,073 markers, increasing the accuracy of the model. On simulated database 2, with additive effects and interactions (epistasis), the proposed method matched the methodology most commonly used in GWAS. The method suggested in this paper demonstrates its effectiveness in explaining the real phenotype (PTA for milk): by applying the genetic-algorithm-based wrapper with PUK support vector regression, many redundant markers were eliminated, increasing the prediction accuracy of the model on the real database without quality control filters. The PUK demonstrated that it can replicate the performance of linear and RBF kernels.
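The wrapper idea, a binary genetic algorithm whose fitness scores candidate marker subsets, can be sketched with a simple regression-fit fitness (ordinary least squares with a size penalty stands in for the SVR-PUK fitness used in the paper; all data below are synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 80, 20
X = rng.integers(0, 3, size=(n, d)).astype(float)  # SNP genotypes coded 0/1/2
y = X[:, 2] + X[:, 7] + rng.normal(0, 0.3, n)      # two causal markers + noise

def fitness(mask):
    """Goodness of a marker subset: fit quality minus a size penalty."""
    if not mask.any():
        return -1.0
    Xs = np.column_stack([np.ones(n), X[:, mask]])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    r2 = 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    return r2 - 0.02 * mask.sum()

pop = rng.random((40, d)) < 0.25                   # random initial marker masks
for _ in range(60):
    scores = np.array([fitness(m) for m in pop])
    i, j = rng.integers(0, len(pop), (2, len(pop)))
    parents = pop[np.where(scores[i] > scores[j], i, j)]   # tournament selection
    cross = rng.random((len(pop), d)) < 0.5
    children = np.where(cross, parents, np.roll(parents, 1, axis=0))
    children ^= rng.random((len(pop), d)) < 0.03           # bit-flip mutation
    children[0] = pop[np.argmax(scores)]                   # elitism
    pop = children

best = pop[np.argmax([fitness(m) for m in pop])]
selected = np.flatnonzero(best)
```

The size penalty plays the role of the filter against redundant markers: a subset only survives if each extra marker buys more fit quality than it costs.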
Support Vector Regression and Genetic Algorithm for HVAC Optimal Operation
Directory of Open Access Journals (Sweden)
Ching-Wei Chen
2016-01-01
Full Text Available This study covers records of various parameters affecting the power consumption of air-conditioning systems. Using the Support Vector Machine (SVM), models were established for chiller power consumption, secondary chilled water pump power consumption, air handling unit fan power consumption, and air handling unit load. The R2 values of the models all reached 0.998, and the training time was far shorter than that of a neural network. A genetic algorithm was then used to search for the combination of operating parameters with the least power consumption, and the air handling unit load matching the air-conditioning cooling load was predicted. The experimental results show that, for the least-power combination of operating parameters matching the cooling load obtained through the genetic algorithm search, the power consumption of the air-conditioning systems was reduced by 22% compared to fixed operating parameters, indicating significant energy savings.
Directory of Open Access Journals (Sweden)
Changhao Fan
2017-01-01
Full Text Available In modeling, usually only information from the deviation between the output of the support vector regression (SVR) model and the training sample is considered, whereas other prior information about the training sample, such as probability distribution information, is ignored. Probability distribution information describes the overall distribution of the sample data in a training sample that contains different degrees of noise and potential outliers, and it helps develop a high-accuracy model. To mine and use the probability distribution information of a training sample, a new support vector regression model that incorporates probability distribution information as weights, PDISVR, is proposed. In the PDISVR model, the probability distribution of each sample is considered as its weight and is introduced into the error coefficient and slack variables of SVR. Thus, both the deviation and the probability distribution information of the training sample are used in the PDISVR model to eliminate the influence of noise and outliers and to improve predictive performance. Furthermore, examples with different degrees of noise were employed to demonstrate the performance of PDISVR, which was compared with that of three SVR-based methods. The results show that PDISVR performs better than the other three methods.
Directory of Open Access Journals (Sweden)
Hongjian Wang
2014-01-01
Full Text Available We present a support vector regression-based adaptive divided difference filter (SVRADDF) algorithm for improving the state estimation accuracy of nonlinear systems, which are typically affected by large initial estimation errors and imprecise prior knowledge of process and measurement noises. The derivative-free SVRADDF algorithm is significantly simpler to compute than other methods and is implemented using only functional evaluations. The SVRADDF algorithm uses the theoretical and actual covariance of the innovation sequence. Support vector regression (SVR) is employed to generate an adaptive factor that tunes the noise covariance at each sampling instant when the measurement update step executes, which improves the algorithm's robustness. The performance of the proposed algorithm is evaluated by estimating states for (i) an underwater nonmaneuvering target bearing-only tracking system and (ii) maneuvering target bearing-only tracking in an air-traffic control system. The simulation results show that the proposed SVRADDF algorithm exhibits better performance than a traditional DDF algorithm.
Eisavi, Vahid; Homayouni, Saeid
2016-10-01
Information on land use and land cover changes is a foremost requirement for monitoring environmental change. Developing change detection methodology is an active research topic in the remote sensing community. However, to the best of our knowledge, no research has been conducted so far on the application of random forest regression (RFR) and support vector regression (SVR) for natural hazard change detection from high-resolution optical remote sensing observations. Hence, the objective of this study is to examine the use of RFR and SVR to discriminate between changed and unchanged areas after a tsunami. RFR and SVR were applied to two different pilot coastlines in Indonesia and Japan. Two remotely sensed data sets acquired by the Quickbird and Ikonos sensors were used for evaluation of the proposed methodology. The results demonstrated better performance of SVR compared to random forest (RF), with an overall accuracy higher by 3% to 4% and a kappa coefficient higher by 0.05 to 0.07. Using McNemar's test, statistically significant differences (Z ≥ 1.96, at the 5% significance level) between the confusion matrices of the RF classifier and the support vector classifier were observed in both study areas. The high change detection accuracy obtained in this study confirms that these methods have the potential to be used for detecting changes due to natural hazards.
Study on Parameter Optimization for Support Vector Regression in Solving the Inverse ECG Problem
Directory of Open Access Journals (Sweden)
Mingfeng Jiang
2013-01-01
Full Text Available The typical inverse ECG problem is to noninvasively reconstruct the transmembrane potentials (TMPs) from body surface potentials (BSPs). In this study, the inverse ECG problem is treated as a regression problem with multiple inputs (body surface potentials) and multiple outputs (transmembrane potentials), which can be solved by the support vector regression (SVR) method. In order to obtain an effective SVR model with optimal regression accuracy and generalization performance, the hyperparameters of SVR must be set carefully. Three different optimization methods, namely the genetic algorithm (GA), the differential evolution (DE) algorithm, and particle swarm optimization (PSO), are proposed to determine the optimal hyperparameters of the SVR model. In this paper, we investigate which one is most effective in reconstructing the cardiac TMPs from BSPs, and a full comparison of their performances is provided. The experimental results show that all three optimization methods perform well in finding proper parameters of SVR and can yield good generalization performance in solving the inverse ECG problem. Moreover, compared with DE and GA, the PSO algorithm is more efficient in parameter optimization and performs better in solving the inverse ECG problem, leading to a more accurate reconstruction of the TMPs.
Study on parameter optimization for support vector regression in solving the inverse ECG problem.
Jiang, Mingfeng; Jiang, Shanshan; Zhu, Lingyan; Wang, Yaming; Huang, Wenqing; Zhang, Heng
2013-01-01
The typical inverse ECG problem is to noninvasively reconstruct the transmembrane potentials (TMPs) from body surface potentials (BSPs). In this study, the inverse ECG problem is treated as a regression problem with multiple inputs (body surface potentials) and multiple outputs (transmembrane potentials), which can be solved by the support vector regression (SVR) method. In order to obtain an effective SVR model with optimal regression accuracy and generalization performance, the hyperparameters of SVR must be set carefully. Three different optimization methods, namely the genetic algorithm (GA), the differential evolution (DE) algorithm, and particle swarm optimization (PSO), are proposed to determine the optimal hyperparameters of the SVR model. In this paper, we investigate which one is most effective in reconstructing the cardiac TMPs from BSPs, and a full comparison of their performances is provided. The experimental results show that all three optimization methods perform well in finding proper parameters of SVR and can yield good generalization performance in solving the inverse ECG problem. Moreover, compared with DE and GA, the PSO algorithm is more efficient in parameter optimization and performs better in solving the inverse ECG problem, leading to a more accurate reconstruction of the TMPs.
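Of the three optimizers compared above, PSO is the simplest to implement; a minimal sketch that minimizes a stand-in for the cross-validated SVR error surface over (C, sigma) (the objective, ranges and parameter names are invented for illustration):

```python
import numpy as np

def pso(f, bounds, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over a box with basic particle swarm optimization."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    x = rng.uniform(lo, hi, (n_particles, lo.size))   # particle positions
    v = np.zeros_like(x)                              # particle velocities
    pbest, pval = x.copy(), np.apply_along_axis(f, 1, x)
    g = pbest[np.argmin(pval)]                        # global best position
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, lo.size))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        val = np.apply_along_axis(f, 1, x)
        better = val < pval
        pbest[better], pval[better] = x[better], val[better]
        g = pbest[np.argmin(pval)]
    return g, float(pval.min())

# Stand-in for the cross-validated SVR error surface over (C, sigma):
err = lambda p: (np.log10(p[0]) - 2) ** 2 + (np.log10(p[1]) + 1) ** 2
(best_C, best_sigma), e = pso(err, [(1e-3, 1e4), (1e-4, 1e2)])
```

In the real workflow `err` would train an SVR and return its cross-validated reconstruction error, which is why evaluation count, not arithmetic, dominates the optimizers' running time.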
Hu, Qinghua; Zhang, Shiguang; Xie, Zongxia; Mi, Jusheng; Wan, Jie
2014-09-01
Support vector regression (SVR) techniques aim to discover a linear or nonlinear structure hidden in sample data. Most existing regression techniques assume that the error distribution is Gaussian. However, it has been observed that the noise in some real-world applications, such as wind power forecasting and the direction-of-arrival estimation problem, does not follow a Gaussian distribution but rather a beta distribution, Laplacian distribution, or other model. In these cases the current regression techniques are not optimal. Following the Bayesian approach, we derive a general loss function and develop a unified model of ν-support vector regression for the general noise model (N-SVR). The augmented Lagrange multiplier method is introduced to solve N-SVR. Numerical experiments on artificial data sets, UCI data and short-term wind speed prediction are conducted. The results show the effectiveness of the proposed technique. Copyright © 2014 Elsevier Ltd. All rights reserved.
Macroeconomic Forecasting Using Penalized Regression Methods
Smeekes, Stephan; Wijler, Etiënne
2016-01-01
We study the suitability of lasso-type penalized regression techniques when applied to macroeconomic forecasting with high-dimensional datasets. We consider the performance of the lasso-type methods when the true DGP is a factor model, contradicting the sparsity assumption underlying penalized
Directory of Open Access Journals (Sweden)
Zhang Sheng Bo
2016-01-01
Full Text Available A novel quality prediction method with a moving time window is proposed for small-batch production processes, based on weighted least squares support vector regression (LS-SVR). The design steps and learning algorithm are also addressed. In the method, weighted LS-SVR is taken as the intelligent kernel, which handles small-batch learning well and assigns larger weights to more recent samples and smaller weights to older ones in the historical data. A typical machining process, cutting a bearing outer race, is carried out, and the real measured data are used in a comparative experiment. The experimental results demonstrate that the prediction error of the weighted LS-SVR-based model is only 20%-30% of that of the standard LS-SVR-based one under the same conditions. It provides a better candidate for quality prediction in small-batch production processes.
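The weighted LS-SVR kernel reduces to a single linear solve: the usual LS-SVR saddle system with the regularization term scaled per sample by its weight. A minimal sketch with recency-decaying weights (all parameter values are illustrative):

```python
import numpy as np

def rbf(a, b, s=1.0):
    """Gaussian kernel matrix between two sets of 1-D inputs."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * s ** 2))

def wlssvr_fit(x, y, weights, gamma=100.0, s=1.0):
    """Weighted LS-SVR: the dual problem reduces to one linear solve
    of [[0, 1^T], [1, K + diag(1/(gamma*w))]] [b; alpha] = [0; y]."""
    n = x.size
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(x, x, s) + np.diag(1.0 / (gamma * weights))
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    b, alpha = sol[0], sol[1:]
    return lambda q: rbf(q, x, s) @ alpha + b

x = np.linspace(0, 6, 40)
y = np.sin(x)
w = 0.9 ** np.arange(40)[::-1]       # newer samples weighted more heavily
f = wlssvr_fit(x, y, w)
```

Because larger weights shrink a sample's effective regularization, recent measurements constrain the fit tightly while old ones are allowed larger residuals, which is the mechanism behind the moving-window behavior described above.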
Gas detonation cell width prediction model based on support vector regression
Directory of Open Access Journals (Sweden)
Jiyang Yu
2017-10-01
Full Text Available Detonation cell width is an important parameter in hydrogen explosion assessments. The experimental data on gas detonation are statistically analyzed to establish a universal method to numerically predict detonation cell widths. It is commonly understood that the detonation cell width, λ, is highly correlated with the characteristic reaction zone width, δ. Classical parametric regression methods were widely applied in earlier research to build an explicit semiempirical correlation for the ratio λ/δ. The obtained correlations formulate the dependency of the ratio λ/δ on a dimensionless effective chemical activation energy and a dimensionless temperature of the gas mixture. In this paper, support vector regression (SVR), which is based on nonparametric machine learning, is applied to achieve functions with a better fit to experimental data and more accurate predictions. Furthermore, a third parameter, dimensionless pressure, is considered as an additional independent variable. It is found that the three-parameter SVR can significantly improve the performance of the fitting function. Meanwhile, SVR also provides better adaptability, and the model functions can easily be renewed when the experimental database is updated or new regression parameters are considered.
Application of Hybrid Quantum Tabu Search with Support Vector Regression (SVR) for Load Forecasting
Directory of Open Access Journals (Sweden)
Cheng-Wen Lee
2016-10-01
Full Text Available Hybridizing chaotic evolutionary algorithms with support vector regression (SVR) to improve forecasting accuracy is a hot topic in electricity load forecasting. Trapping at local optima and premature convergence are critical shortcomings of the tabu search (TS) algorithm. This paper investigates potential improvements to the TS algorithm by applying quantum computing mechanics to enhance its information-sharing mechanism (tabu memory) and improve the forecasting accuracy. It presents an SVR-based load forecasting model that integrates quantum behaviors and the TS algorithm with the support vector regression model (namely, SVRQTS) to obtain a more satisfactory forecasting accuracy. Numerical examples demonstrate that the proposed model outperforms the alternatives.
Directory of Open Access Journals (Sweden)
Jun Shuai
2017-01-01
Full Text Available Numerous studies on fault diagnosis have been conducted in recent years, because the timely and correct detection of machine faults effectively minimizes the damage resulting from the unexpected breakdown of machinery. Mathematical morphological analysis has been used to denoise raw signals; however, an improper choice of the length of the structure element (SE) substantially influences the effectiveness of fault feature extraction. Moreover, classifying the fault type is a significant step in intelligent fault diagnosis, and many techniques have already been developed, such as the support vector machine (SVM). This study proposes an intelligent fault diagnosis strategy that combines morphological feature extraction with a support vector regression (SVR) classifier. The vibration signal is first processed using various scales of morphological analysis, where the length of the SE is determined adaptively. Thereafter, nine statistical features are extracted from the processed signal. Lastly, an SVR classifier is used to identify the health condition of the machinery. The effectiveness of the proposed scheme is validated using a data set from a bearing test rig. Results show the high accuracy of the proposed method despite the influence of noise.
Weng, Shizhuang; Yuan, Baohong; Zhu, Zede; Huang, Linsheng; Zhang, Dongyan; Zheng, Ling
2016-03-01
As a novel and ultrasensitive detection technology with the advantages of a fingerprint effect, high speed and low cost, surface-enhanced Raman scattering (SERS) was used in this paper to develop regression models for the fast quantitative detection of thiram by support vector machine regression (SVR). Three parameter optimization methods, grid search (GS), genetic algorithm (GA) and particle swarm optimization (PSO), were employed to optimize the internal parameters of SVR. Furthermore, the influence of the number of spectra, the spectral wavenumber range and principal component analysis (PCA) on the quantitative detection was also discussed. The experiments demonstrate that the proposed method can realize fast, quantitative detection of thiram; the best result is obtained by GS-SVR with spectra from the characteristic-peak range processed by PCA. The three methods have similar effects on parameter optimization, but their analysis times differ greatly, with GS being the fastest. Considering analysis accuracy and time simultaneously, the number of spectra per concentration should be set to 50. Developing the quantitative model on the characteristic-peak range reduces analysis time while ensuring detection accuracy. Additionally, PCA can further reduce the detection error by retaining the main information of the spectral data and eliminating noise.
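The PCA step described above, retaining the main spectral information while discarding noise, can be sketched with an SVD-based projection (the synthetic "spectra" below are purely illustrative):

```python
import numpy as np

def pca_reduce(spectra, n_components):
    """Project spectra onto their leading principal components."""
    Xc = spectra - spectra.mean(axis=0)          # center each wavenumber channel
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T            # low-dimensional features
    explained = float((S ** 2)[:n_components].sum() / (S ** 2).sum())
    return scores, explained

rng = np.random.default_rng(0)
base = np.sin(np.linspace(0, 4, 600))            # one dominant spectral shape
conc = rng.random(50)[:, None]                   # 50 samples, varying "concentration"
spectra = conc * base + rng.normal(0, 0.01, (50, 600))
scores, explained = pca_reduce(spectra, 3)
```

Feeding the low-dimensional scores to the regressor instead of the 600-channel spectra is what shortens the analysis time while preserving the concentration-related variance.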
Stochastic development regression using method of moments
DEFF Research Database (Denmark)
Kühnel, Line; Sommer, Stefan Horst
2017-01-01
This paper considers the estimation problem arising when inferring parameters in the stochastic development regression model for manifold-valued non-linear data. Stochastic development regression captures the relation between manifold-valued response and Euclidean covariate variables using the stochastic development construction. It is thereby able to incorporate several covariate variables and random effects. The model is intrinsically defined using the connection of the manifold, and the use of stochastic development avoids linearizing the geometry. We propose to infer parameters using the Method of Moments procedure that matches known constraints on moments of the observations conditional on the latent variables. The performance of the model is investigated in a simulation example using data on finite-dimensional landmark manifolds.
Goo, Yeong-Jia James; Shen, Zone-De
2014-01-01
As fraudulent financial statements of enterprises become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models for comparison. The study adopts both financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies in which fraudulent or nonfraudulent financial statements occurred between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree achieves the best classification accuracy, at 85.71%. PMID:25302338
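The three-way model comparison described above can be sketched with scikit-learn; the synthetic data below stands in for the study's pre-screened financial and nonfinancial variables (scikit-learn's C4.5-style `DecisionTreeClassifier` is used here as a stand-in for C5.0):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for firms' indicator variables after stepwise screening.
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "decision tree": DecisionTreeClassifier(random_state=0),
}
results = {}
for name, model in models.items():
    results[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: cross-validated accuracy = {results[name]:.3f}")
```

Which family wins depends on the data; on the study's real fraud data the decision tree came out ahead.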
Directory of Open Access Journals (Sweden)
Suduan Chen
2014-01-01
Dai, Wensheng; Wu, Jui-Yu; Lu, Chi-Jie
2014-01-01
Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales, since an IT chain store has many branches. Integrating a feature extraction method with a prediction tool such as support vector regression (SVR) is a useful way to construct an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique that has been widely applied to various forecasting problems, but until now only the basic ICA method (i.e., the temporal ICA model) had been applied to the sales forecasting problem. In this paper, we utilize three different ICA methods, spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA), to extract features from the sales data and compare their performance in sales forecasting for an IT chain store. Experimental results on real sales data show that the forecasting scheme integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data, and the extracted features can improve the prediction performance of SVR for sales forecasting.
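The ICA-then-SVR idea can be sketched with scikit-learn's `FastICA` (a temporal-ICA-style stand-in; the paper's spatial and spatiotemporal variants differ in how the data matrix is oriented). The branch-sales data below is synthetic:

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.svm import SVR

rng = np.random.default_rng(2)

# Synthetic branch sales: each of 5 branches mixes a trend source
# and a seasonal source, plus noise.
t = np.arange(200)
sources = np.c_[0.01 * t, np.sin(2 * np.pi * t / 12)]
mixing = rng.uniform(0.5, 1.5, (5, 2))
sales = sources @ mixing.T + rng.normal(0, 0.05, (200, 5))

# Feature extraction: ICA recovers the underlying source signals.
ica = FastICA(n_components=2, random_state=0)
features = ica.fit_transform(sales)            # shape (200, 2)

# Prediction: SVR forecasts branch 0's next value from today's features.
X, y = features[:-1], sales[1:, 0]
model = SVR(kernel="rbf", C=10.0).fit(X, y)
print("in-sample R^2:", round(model.score(X, y), 3))
```

The point of the decomposition is that the SVR learns from a few de-mixed sources rather than from many correlated branch series.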
PMID:25165740
Feature Vector Construction Method for IRIS Recognition
Odinokikh, G.; Fartukov, A.; Korobkin, M.; Yoo, J.
2017-05-01
One of the basic stages of an iris recognition pipeline is the iris feature vector construction procedure, which extracts the iris texture information relevant to subsequent comparison. Thorough investigation of feature vectors obtained from irises has shown that not all vector elements are equally relevant. Two characteristics determine the utility of a vector element: fragility and discriminability. Conventional iris feature extraction methods treat fragility simply as feature vector instability, without regard to the origin of that instability. This work separates the sources of instability into natural and encoding-induced, which allows each source to be investigated independently. Based on this separation, a novel approach to iris feature vector construction is proposed. The approach consists of two steps: iris feature extraction using Gabor filtering with optimal parameters, and quantization with separately pre-optimized fragility thresholds. The proposed method has been tested on two different datasets of iris images captured under changing environmental conditions. The results show that the proposed method surpasses, in recognition accuracy, all prior-art methods considered, on both datasets.
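The Gabor-filter-then-quantize step can be sketched in one dimension. Everything below (filter parameters, the white-noise "texture" row, the two-bit phase quantization) is an illustrative assumption in the spirit of Gabor-based iris coding, not the authors' optimized pipeline:

```python
import numpy as np

rng = np.random.default_rng(3)

def gabor_encode(row, wavelength=16, sigma=6):
    """Encode one row of a normalized (unwrapped) iris texture into
    phase bits: complex Gabor filtering, then sign quantization."""
    x = np.arange(-24, 25)
    carrier = np.exp(1j * 2 * np.pi * x / wavelength)
    envelope = np.exp(-x ** 2 / (2 * sigma ** 2))
    response = np.convolve(row, carrier * envelope, mode="same")
    # Two bits per position: signs of the real and imaginary parts.
    return np.c_[response.real > 0, response.imag > 0].astype(np.uint8)

row = rng.normal(size=256)                    # stand-in for iris texture
code = gabor_encode(row)
noisy = gabor_encode(row + rng.normal(scale=0.1, size=256))

# Fragile bits are those that flip under small perturbations; a fragility
# threshold (as in the paper) would mask them out before matching.
hamming = (code != noisy).mean()
print("code shape:", code.shape,
      "fractional Hamming distance:", round(hamming, 3))
```

Bits whose filter response lies near zero are exactly the ones that flip under noise, which motivates thresholding on response magnitude.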
Kropotov, D. A.
2011-08-01
Problems of classification and regression estimation in which objects are represented by multidimensional arrays of features are considered. Many practical statements can be reduced to such problems, for example, the popular approach to the description of images as a set of patches and a set of descriptors in each patch or the description of an object in the form of a set of distances from it to certain support objects selected based on a set of features. For solving problems concerning the objects thus described, a generalization of the relevance vector model is proposed. In this generalization, specific regularization coefficients are defined for each dimension of the multidimensional array of the object description; the resultant regularization coefficient for a given element in the multidimensional array is determined as a combination of the regularization coefficients for all the dimensions. The models with the sum and product used for such combinations are examined. Algorithms based on the variational approach are proposed for learning in these models. These algorithms enable one to find the so-called "sparse" solutions, that is, exclude from the consideration the irrelevant dimensions in the multidimensional array of the object description. Compared with the classical relevance vector model, the proposed approach makes it possible to reduce the number of adjustable parameters because a sum of all the dimensions is considered instead of their product. As a result, the method becomes more robust under overfitting in the case of small samples. This property and the sparseness of the resulting solutions in the proposed models are demonstrated experimentally, in particular, in the case of the known face identification database called Labeled Faces in the Wild.
Hybrid ARIMA and Support Vector Regression in Short‑term Electricity Price Forecasting
Directory of Open Access Journals (Sweden)
Jindřich Pokora
2017-01-01
Full Text Available The literature suggests that, in short-term electricity-price forecasting, a combination of ARIMA and support vector regression (SVR) yields performance improvements over separate use of each method. The objective of this research is to investigate the circumstances under which these hybrid models are superior for day-ahead hourly price forecasting. Analysis of the Nord Pool market, with 16 interconnected areas and 6 investigated monthly periods, allows not only a considerable level of generalizability but also assessment of the effect of transmission congestion, since congestion causes price differences between the Nord Pool areas. The paper finds that SVR, SVR-ARIMA, and ARIMA-SVR provide similar performance; at the same time, the hybrid methods outperform single models in terms of RMSE in 98 % of the investigated time series. Furthermore, the higher flexibility of the hybrid models appears to improve the modeling of price spikes at a slight cost in precision during steady periods. Lastly, the superiority of the hybrid models is pronounced under transmission congestion, measured via the first and second moments of the electricity price.
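The ARIMA-SVR cascade (linear model first, SVR on its residuals) can be sketched as follows. To keep the sketch dependency-free, the ARIMA stage is replaced by a plain AR(2) least-squares fit, and the "price" series is synthetic, with a deliberate nonlinearity that the linear stage cannot capture:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(4)

# Synthetic hourly price: linear AR structure plus a nonlinear component.
n = 500
price = np.zeros(n)
for t in range(2, n):
    price[t] = (0.6 * price[t - 1] - 0.2 * price[t - 2]
                + 0.5 * np.sin(price[t - 1]) + rng.normal(0, 0.1))

# Stage 1: linear AR(2) fit (stand-in for the ARIMA component).
X = np.c_[price[1:-1], price[:-2]]
y = price[2:]
A = np.c_[X, np.ones(len(X))]
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
linear_pred = A @ coef

# Stage 2: SVR models the residual nonlinearity (the ARIMA-SVR idea).
resid = y - linear_pred
svr = SVR(kernel="rbf", C=10.0).fit(X, resid)
hybrid_pred = linear_pred + svr.predict(X)

def rmse(p):
    return np.sqrt(np.mean((y - p) ** 2))

print("AR-only RMSE:", round(rmse(linear_pred), 4))
print("hybrid RMSE: ", round(rmse(hybrid_pred), 4))
```

The SVR-ARIMA ordering simply swaps the stages: SVR fits the series first and the linear model mops up its residuals.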
Directory of Open Access Journals (Sweden)
Jianzhou Wang
2015-01-01
Full Text Available This paper develops an effective intelligent model to forecast short-term wind speed series. A hybrid forecasting technique is proposed based on recurrence plots (RP) and optimized support vector regression (SVR). Wind, caused by the interaction of meteorological systems, is extremely unsteady and difficult to forecast. To understand the wind system, the wind speed series is analyzed using RP. Then, the SVR model is employed to forecast wind speed; its input variables are selected by RP, and two crucial parameters, the penalty factor and the gamma of the RBF kernel function, are optimized by various optimization algorithms: the genetic algorithm (GA), particle swarm optimization (PSO), and the cuckoo optimization algorithm (COA). Finally, the optimized SVR models, COA-SVR, PSO-SVR, and GA-SVR, are evaluated based on several criteria and a hypothesis test. The experimental results show that (1) analysis of the RP reveals that wind speed is predictable on a short-term time scale, (2) the performance of the COA-SVR model is superior to that of the PSO-SVR and GA-SVR methods, especially for jumping samplings, and (3) the COA-SVR method is statistically robust in multi-step-ahead prediction and can be applied to practical wind farms.
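Optimizing the SVR's penalty factor C and RBF gamma with a swarm-style search can be sketched with a minimal particle swarm, used here as an illustrative stand-in for the paper's GA/PSO/COA comparison. The wind series is synthetic, and lagged values replace the RP-selected inputs:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)

# Synthetic wind-speed series; lags serve as stand-ins for RP-selected inputs.
t = np.arange(300)
wind = 8 + 2 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.3, 300)
X = np.c_[wind[:-3], wind[1:-2], wind[2:-1]]
y = wind[3:]

def fitness(log_c, log_g):
    """Cross-validated score of an RBF-SVR with penalty C and kernel gamma."""
    model = SVR(kernel="rbf", C=10 ** log_c, gamma=10 ** log_g)
    return cross_val_score(model, X, y, cv=3).mean()

# Minimal particle swarm over (log10 C, log10 gamma).
pos = rng.uniform([-1, -3], [3, 0], (10, 2))
vel = np.zeros_like(pos)
best_pos = pos.copy()
best_fit = np.array([fitness(*p) for p in pos])
g = best_pos[best_fit.argmax()]
for _ in range(10):
    r1, r2 = rng.random((2, 10, 2))
    vel = 0.7 * vel + 1.5 * r1 * (best_pos - pos) + 1.5 * r2 * (g - pos)
    pos = np.clip(pos + vel, [-1, -3], [3, 0])
    fit = np.array([fitness(*p) for p in pos])
    improved = fit > best_fit
    best_pos[improved], best_fit[improved] = pos[improved], fit[improved]
    g = best_pos[best_fit.argmax()]
print("best (log10 C, log10 gamma):", g.round(2),
      "CV score:", round(best_fit.max(), 3))
```

GA and COA differ only in how candidate (C, gamma) points are proposed; the cross-validated score plays the same fitness role in all three.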
Directory of Open Access Journals (Sweden)
Shahrbanoo Goli
2016-01-01
Full Text Available The Support Vector Regression (SVR) model has been broadly used for response prediction, but few researchers have used SVR for survival analysis. In this study, a new SVR model is proposed, and SVR models with different kernels and the traditional Cox model are trained. The models are compared based on different performance measures. We also select the best subset of features using three feature selection methods: a combination of SVR and statistical tests, univariate feature selection based on the concordance index, and recursive feature elimination. The evaluations are performed using available medical datasets and a Breast Cancer (BC) dataset consisting of 573 patients who visited the Oncology Clinic of Hamadan Province in Iran. Results show that, for the BC dataset, survival time can be predicted more accurately by linear SVR than nonlinear SVR. Based on the three feature selection methods, metastasis status, progesterone receptor status, and human epidermal growth factor receptor 2 status are the features most strongly associated with survival. According to the obtained results, the performance of the linear and nonlinear kernels is comparable. The proposed SVR model performs similarly to, or slightly better than, the other models, and SVR performs similarly to or better than Cox when all features are included in the model.
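One of the three selection strategies, recursive feature elimination driven by a linear-kernel SVR, can be sketched directly with scikit-learn. The data below is a synthetic survival-like stand-in (three informative covariates among ten), not the breast cancer dataset:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.feature_selection import RFE

rng = np.random.default_rng(6)

# Synthetic data: only features 0, 3 and 7 drive the outcome, playing the
# role of e.g. metastasis and receptor status in the real study.
X = rng.normal(size=(200, 10))
survival_time = (5 + 2 * X[:, 0] - 1.5 * X[:, 3] + X[:, 7]
                 + rng.normal(0, 0.3, 200))

# RFE repeatedly fits the linear SVR and drops the weakest coefficient.
selector = RFE(SVR(kernel="linear"), n_features_to_select=3)
selector.fit(X, survival_time)
print("selected feature indices:", np.flatnonzero(selector.support_))
```

RFE needs an estimator exposing coefficients, which is why the linear kernel is used for the selection stage even when the final model is nonlinear.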
Directory of Open Access Journals (Sweden)
Quoc-Huy Phan
2013-01-01
Full Text Available Multipath mitigation is a long-standing problem in global positioning system (GPS) research and is essential for improving the accuracy and precision of positioning solutions. In this work, we treat multipath error estimation as a regression problem and propose a unified framework for both code and carrier-phase multipath mitigation for ground-fixed GPS stations. We use a kernel support vector machine to predict multipath errors, since it is known to potentially offer better performance than traditional models such as neural networks. The predicted multipath error is then used to correct GPS measurements. We empirically show that the proposed method can reduce the standard deviation of the code multipath error by up to 79% on average, significantly outperforming other approaches in the literature. A comparative analysis of the reduction of double-differential carrier-phase multipath error shows that a 57% reduction is also achieved. Furthermore, we show by simulation that this method is robust to coexisting signals of phenomena (e.g., seismic signals) we wish to preserve.
MANCOVA for one way classification with homogeneity of regression coefficient vectors
Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.
2017-11-01
The MANOVA and MANCOVA are extensions of the univariate ANOVA and ANCOVA techniques to multidimensional, or vector-valued, observations. The assumption of a Gaussian distribution is replaced with a multivariate Gaussian distribution for the data vectors and the residual terms in the statistical models of these techniques. The objective of MANCOVA is to determine whether statistically reliable mean differences between groups remain after adjusting for the covariates. When randomized assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting the dependent variables as if all subjects had scored the same on the covariates. In this research article, the MANCOVA technique is extended to a larger number of covariates, and the homogeneity of the regression coefficient vectors is also tested.
Directory of Open Access Journals (Sweden)
S.K. Lahiri
2009-09-01
Full Text Available Soft sensors have been widely used in industrial process control to improve product quality and assure safety in production. The core of a soft sensor is its soft sensing model. This paper introduces support vector regression (SVR), a powerful machine learning method based on statistical learning theory (SLT), into soft sensor modeling, and proposes a new soft sensing modeling method based on SVR. It presents an artificial-intelligence-based hybrid soft sensor modeling and optimization strategy, namely support vector regression - genetic algorithm (SVR-GA), for modeling and optimizing the mono ethylene glycol (MEG) quality variable in a commercial glycol plant. In the SVR-GA approach, a support vector regression model is constructed to correlate the process data comprising values of operating and performance variables. Next, the model inputs describing the process operating variables are optimized using a genetic algorithm with a view to maximizing process performance. The major advantage of the strategy is that modeling and optimization can be conducted exclusively from historic process data, so detailed knowledge of the process phenomenology (reaction mechanism, kinetics, etc.) is not required. Using the SVR-GA strategy, a number of sets of optimized operating conditions were found. The optimized solutions, when verified in an actual plant, resulted in a significant improvement in quality.
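The two-step SVR-GA idea (fit an SVR process model from historic data, then let a genetic algorithm search its input space) can be sketched with a bare-bones GA. The two "operating variables", the quality surface, and all GA settings below are invented for illustration, not the glycol plant's data:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(7)

# Synthetic historic data: a quality variable peaking at an interior
# optimum of two normalized operating variables.
ops = rng.uniform(0, 1, (150, 2))
quality = (1 - (ops[:, 0] - 0.7) ** 2 - (ops[:, 1] - 0.3) ** 2
           + rng.normal(0, 0.02, 150))

# Step 1: fit the SVR process model from the historic records.
model = SVR(kernel="rbf", C=100.0, epsilon=0.01).fit(ops, quality)

# Step 2: a minimal genetic algorithm maximizes the model's prediction.
pop = rng.uniform(0, 1, (30, 2))
for _ in range(40):
    fit = model.predict(pop)
    parents = pop[np.argsort(fit)[-10:]]                      # selection
    children = parents[rng.integers(0, 10, (30, 2)), [0, 1]]  # crossover
    pop = np.clip(children + rng.normal(0, 0.05, (30, 2)), 0, 1)  # mutation
best = pop[model.predict(pop).argmax()]
print("optimized operating point:", best.round(2))
```

Because the GA only ever queries the fitted model, no mechanistic process knowledge is needed, which is exactly the advantage the abstract claims.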
Directory of Open Access Journals (Sweden)
Xian-Xia Zhang
2013-01-01
Full Text Available This paper presents a reference-function-based 3D fuzzy logic controller (FLC) design methodology using support vector regression (SVR) learning. The concept of the reference function is introduced into the 3D FLC for the generation of 3D membership functions (MFs), which enhances the capability of the 3D FLC to cope with more kinds of MFs. The nonlinear mathematical expression of the reference-function-based 3D FLC is derived, and spatial fuzzy basis functions are defined. By relating the spatial fuzzy basis functions of a 3D FLC to the kernel functions of an SVR, an equivalence relationship between a 3D FLC and an SVR is established; a 3D FLC can therefore be constructed from the learned results of an SVR. Furthermore, the universal approximation capability of the proposed 3D fuzzy system is proven in terms of the finite covering theorem. Finally, the proposed method is applied to a catalytic packed-bed reactor, and simulation results verify its effectiveness.
Directory of Open Access Journals (Sweden)
Jaehyun Yoo
2015-05-01
Full Text Available Machine learning has been successfully used for target localization in wireless sensor networks (WSNs) due to its accurate and robust estimation against highly nonlinear and noisy sensor measurements. For efficient and adaptive learning, this paper introduces online semi-supervised support vector regression (OSS-SVR). The first advantage of the proposed algorithm is that, being based on a semi-supervised learning framework, it reduces the required amount of labeled training data while maintaining accurate estimation. Second, with its extension to online learning, the proposed OSS-SVR automatically tracks changes in the system to be learned, such as varying noise characteristics. We compare the proposed algorithm with semi-supervised manifold learning, an online Gaussian process, and online semi-supervised colocalization. The algorithms are evaluated on estimating the unknown location of a mobile robot in a WSN. The experimental results show that the proposed algorithm is more accurate with smaller amounts of labeled training data and is robust to varying noise. Moreover, the suggested algorithm is computationally fast while maintaining the best localization performance in comparison with the other methods.
Estimation of the laser cutting operating cost by support vector regression methodology
Jović, Srđan; Radović, Aleksandar; Šarkoćević, Živče; Petković, Dalibor; Alizamir, Meysam
2016-09-01
Laser cutting is a popular manufacturing process used to cut various types of materials economically. The operating cost is affected by laser power, cutting speed, assist gas pressure, nozzle diameter and focus point position, as well as the workpiece material. In this article, the process factors investigated were laser power, cutting speed, air pressure and focal point position. The aim of this work is to relate the operating cost to the process parameters mentioned above. CO2 laser cutting of stainless steel of medical grade AISI316L has been investigated, with the main goal of analyzing the operating cost as a function of laser power, cutting speed, air pressure, focal point position and material thickness. Since estimating the laser operating cost is a complex, non-linear task, soft computing optimization algorithms can be used. An intelligent soft computing scheme, support vector regression (SVR), was implemented, and the performance of the proposed estimator was confirmed by simulation results. The SVR results were then compared with artificial neural networks and genetic programming. According to the results, a greater improvement in estimation accuracy can be achieved with SVR than with the other soft computing methodologies. The new optimization methods benefit from the soft computing capabilities of global optimization and multiobjective optimization, rather than choosing a starting point by trial and error and combining multiple criteria into a single criterion.
Directory of Open Access Journals (Sweden)
Mustakim Mustakim
2016-02-01
Full Text Available The largest oil-palm-producing region in Indonesia plays an important role in improving the welfare of society and the economy. Oil palm production in Riau Province has increased significantly in every period; to project its development over the next few years, predictions were made from time series data for the years 2005-2013. In the implementation, the performance of the Support Vector Regression (SVR) method was compared with that of an Artificial Neural Network (ANN). In the experiments, SVR produced the better model, indicated by a correlation coefficient of 95% and an MSE of 6% with the Radial Basis Function (RBF) kernel, whereas ANN produced only 74% for R2 and 9% for MSE in its 8th experiment, with 20 hidden neurons and a learning rate of 0.1. The RBF-kernel SVR model generates predictions for the next 3 years that increase by 3% - 6% over the actual data.
Zhou, Pei-pei; Shan, Jin-feng; Jiang, Jian-lan
2015-12-01
To optimize the microwave-assisted extraction of curcuminoids from Curcuma longa, the ethanol concentration, the liquid-to-solid ratio, and the microwave time were selected for further optimization on the basis of single-factor experiments. Support Vector Regression (SVR) and the Central Composite Design-Response Surface Methodology (CCD) algorithm were used to design and establish models, respectively, while Particle Swarm Optimization (PSO) was introduced to optimize the parameters of the SVR models and to search for the models' optimal points. The sum of curcumin, demethoxycurcumin, and bisdemethoxycurcumin, determined by HPLC, was used as the evaluation indicator. The optimal microwave-assisted extraction parameters were as follows: an ethanol concentration of 69%, a liquid-to-solid ratio of 21 : 1, and a microwave time of 55 s. Under these conditions, the sum of the three curcuminoids was 28.97 mg/g (per gram of rhizome powder). Both the CCD model and the SVR model were credible, since they predicted similar process conditions and the deviation of the yields was less than 1.2%.
Directory of Open Access Journals (Sweden)
Zhongwei Li
Full Text Available Welan gum is a novel microbial polysaccharide produced during microbial growth and metabolism under different external conditions. Welan gum can be used as a thickener, suspending agent, emulsifier, stabilizer, lubricant, film-forming agent, and adhesive in agriculture. In recent years, finding the optimal experimental conditions to maximize production has attracted growing attention. In this work, a hybrid computational method is proposed to optimize the experimental conditions for producing Welan gum, using data collected from experimental records. Support Vector Regression (SVR) is used to model the relationship between Welan gum production and the experimental conditions, and then an adaptive Genetic Algorithm (AGA) is applied to search for optimized experimental conditions. As a result, a mathematical model predicting Welan gum production from experimental conditions is obtained, achieving an accuracy rate of 88.36%, and a set of optimized experimental conditions is predicted to yield 31.65 g/L of Welan gum. Compared with the best result obtained in chemical experiments, 30.63 g/L, the predicted production improves on it by 3.3%. The results provide potentially optimal experimental conditions for improving the production of Welan gum.
Li, Zhongwei; Yuan, Xiang; Cui, Xuerong; Liu, Xin; Wang, Leiquan; Zhang, Weishan; Lu, Qinghua; Zhu, Hu
2017-01-01
Implicit Social Trust and Support Vector Regression for a News Recommendation System
Directory of Open Access Journals (Sweden)
Melita Widya Ningrum
2018-01-01
Full Text Available News sites are among the most frequently accessed sites because of their ability to present up-to-date information on a variety of topics such as sports, business, politics, technology, health, and entertainment. Users can search for and view popular news from around the world. On the other hand, the abundance of available news articles can make it difficult for users to find articles that match their interests. Selecting which news articles appear on a user's main page is therefore important, as it can increase the user's interest in reading articles from the site; appropriate article selection can also minimize a flood of irrelevant information. Selecting news articles requires a recommender system with knowledge of each user's interest in, or the relevance to them of, particular news topics. In this study, the researchers build a news article recommender system for the New York Times based on implicit social trust. Social trust is derived from interactions between users and their friends, and from the trust weights of those friends, on the social medium Twitter. The data collected consist of Twitter users, their friends, and the number of interactions between users in the form of retweets. The system uses the Support Vector Regression algorithm to estimate a user's rating of a given topic. Processing the data with Support Vector Regression yields an accuracy with a MAPE of 0.8243075902233644%. Keywords: Twitter, News Recommendation, Social Trust, Support Vector Regression
Method for transforming a feature vector
Veldhuis, Raymond N.J.; Chen, C.; Kevenaar, Thomas A.M.; Akkermans, Antonius H.M.
2007-01-01
The present invention relates to a method for transforming a feature vector comprising a first and a second feature represented by a first and a second feature value, respectively, into a feature code using an encoder, said feature code usable in an algorithm and having a predetermined number of
Santos, Frédéric; Guyomarc'h, Pierre; Bruzek, Jaroslav
2014-12-01
The accuracy of identification tools in forensic anthropology relies primarily upon the variation inherent in the data upon which they are built. Sex determination methods based on craniometrics are widely used and known to be sensitive to several factors (e.g. sample distribution, population, age, secular trends, measurement technique, etc.). The goal of this study is to discuss the potential variation linked to the statistical treatment of the data. Traditional craniometrics of four samples extracted from documented osteological collections (from Portugal, France, the U.S.A., and Thailand) were used to test three different classification methods: linear discriminant analysis (LDA), logistic regression (LR), and support vector machines (SVM). The Portuguese sample was set as a training model to which the other samples were applied in order to assess the validity and reliability of the different models. The tests were performed using different parameters: some included selection of the best predictors; some included a strict decision threshold (sex assessed only if the related posterior probability was high, introducing the notion of an indeterminate result); and some used an unbalanced sex ratio. Results indicated that LR tends to perform slightly better than the other techniques and offers a better selection of predictors. Also, the use of a decision threshold (i.e. p>0.95) is essential to ensure acceptable reliability of sex determination methods based on craniometrics. Although the Portuguese, French, and American samples share a similar sexual dimorphism, application of Western models to the Thai sample (which displayed a lower degree of dimorphism) was unsuccessful. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Modeling of Soil Aggregate Stability using Support Vector Machines and Multiple Linear Regression
Directory of Open Access Journals (Sweden)
Ali Asghar Besalatpour
2016-02-01
Full Text Available Introduction: Soil aggregate stability is a key factor in soil resistivity to mechanical stresses, including the impacts of rainfall and surface runoff, and thus to water erosion (Canasveras et al., 2010). Various indicators have been proposed to characterize and quantify soil aggregate stability, for example the percentage of water-stable aggregates (WSA), the mean weight diameter (MWD) and geometric mean diameter (GMD) of aggregates, and the water-dispersible clay (WDC) content (Calero et al., 2008). Unfortunately, the experimental methods available to determine these indicators are laborious, time-consuming and difficult to standardize (Canasveras et al., 2010). Therefore, it would be advantageous if aggregate stability could be predicted indirectly from more easily available data (Besalatpour et al., 2014). The main objective of this study is to investigate the potential use of the support vector machines (SVMs) method for estimating soil aggregate stability (as quantified by GMD) as compared to the multiple linear regression approach. Materials and Methods: The study area was part of the Bazoft watershed (31° 37′ to 32° 39′ N and 49° 34′ to 50° 32′ E), which is located in the northern part of the Karun river basin in central Iran. A total of 160 soil samples were collected from the top 5 cm of the soil surface. Some easily available characteristics, including topographic, vegetation, and soil properties, were used as inputs. Soil organic matter (SOM) content was determined by the Walkley-Black method (Nelson & Sommers, 1986). Particle size distributions in the soil samples (clay, silt, sand, fine sand, and very fine sand) were measured using the procedure described by Gee & Bauder (1986), and the calcium carbonate equivalent (CCE) content was determined by the back-titration method (Nelson, 1982). The modified Kemper & Rosenau (1986) method was used to determine wet-aggregate stability (GMD). The topographic attributes of elevation, slope, and aspect were characterized using a 20-m
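The SVM-versus-multiple-linear-regression comparison at the heart of this study can be sketched as follows. The data below is a synthetic stand-in for the 160 Bazoft samples, with invented coefficients linking GMD nonlinearly to organic matter, clay and slope:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)

# Synthetic soil data: GMD depends nonlinearly on SOM, clay and slope
# (all coefficients are illustrative, not fitted to the Bazoft data).
som = rng.uniform(0.5, 4.0, 160)
clay = rng.uniform(10, 50, 160)
slope = rng.uniform(0, 30, 160)
gmd = (1.0 + 0.3 * np.sqrt(som) + 0.01 * clay - 0.0005 * clay ** 2
       - 0.005 * slope + rng.normal(0, 0.02, 160))
X = np.c_[som, clay, slope]

models = {
    "MLR": LinearRegression(),
    "SVM": make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.01)),
}
scores = {name: cross_val_score(m, X, gmd, cv=5).mean()
          for name, m in models.items()}
for name, r2 in scores.items():
    print(f"{name}: cross-validated R^2 = {r2:.3f}")
```

When the predictor-response relationship is curved, as it typically is for aggregate stability, the kernel method captures the curvature that the linear model misses.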
Directory of Open Access Journals (Sweden)
Shokri Saeid
2015-01-01
Full Text Available An accurate prediction of sulfur content is very important for proper operation and product quality control in the hydrodesulfurization (HDS) process. For this purpose, a reliable data-driven soft sensor utilizing Support Vector Regression (SVR) was developed, and the effects of integrating Vector Quantization (VQ) and Principal Component Analysis (PCA) on the performance of this soft sensor were studied. First, in the pre-processing step, the PCA and VQ techniques were used to reduce the dimensions of the original input datasets. Then, the compressed datasets were used as input variables for the SVR model. Experimental data from the HDS setup were employed to validate the proposed integrated model. The integration of the VQ/PCA techniques with the SVR model was able to increase the prediction accuracy of SVR. The obtained results show that the integrated VQ-SVR technique was better than PCA-SVR in prediction accuracy. Also, VQ decreased the sum of the training and test times of the SVR model in comparison with PCA. For further evaluation, the performance of the VQ-SVR model was also compared with that of plain SVR. The obtained results indicated that the VQ-SVR model delivered the most satisfactory prediction performance (AARE = 0.0668 and R² = 0.995) in comparison with the investigated models.
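The vector quantization step described above can be sketched as a small k-means codebook that replaces each raw input value with its nearest centroid before the data reach the regressor. This is a generic, minimal illustration of the compression idea, not the authors' implementation; the data values are made up.

```python
# Sketch of vector quantization (VQ) as an input-compression step: a small
# k-means codebook replaces each raw input value with its nearest centroid
# before the compressed data are fed to a regressor such as SVR.

def kmeans_codebook(points, k, iters=20):
    """Tiny 1-D k-means: returns k codebook centroids."""
    centroids = sorted(points)[:: max(1, len(points) // k)][:k]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            buckets[nearest].append(p)
        centroids = [sum(b) / len(b) if b else centroids[i]
                     for i, b in enumerate(buckets)]
    return centroids

def quantize(points, codebook):
    """Replace each value with its nearest codebook centroid."""
    return [min(codebook, key=lambda c: abs(p - c)) for p in points]

data = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
codebook = kmeans_codebook(data, k=2)
compressed = quantize(data, codebook)
```

In the paper's multivariate setting the same idea applies to input vectors rather than scalars, with Euclidean distance to the codebook entries.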
Wang, Xiaolei
2014-12-12
Background: A quantitative understanding of interactions between transcription factors (TFs) and their DNA binding sites is key to the rational design of gene regulatory networks. Recent advances in high-throughput technologies have enabled high-resolution measurements of protein-DNA binding affinity. Importantly, such experiments revealed the complex nature of TF-DNA interactions, whereby the effects of nucleotide changes on the binding affinity were observed to be context dependent. A systematic method to give high-quality estimates of such complex affinity landscapes is, thus, essential to the control of gene expression and the advance of synthetic biology. Results: Here, we propose a two-round prediction method that is based on support vector regression (SVR) with weighted degree (WD) kernels. In the first round, a WD kernel with shifts and mismatches is used with SVR to detect the importance of subsequences with different lengths at different positions. The subsequences identified as important in the first round are then fed into a second WD kernel to fit the experimentally measured affinities. To our knowledge, this is the first attempt to increase the accuracy of the affinity prediction by applying two rounds of string kernels and by identifying a small number of crucial k-mers. The proposed method was tested by predicting the binding affinity landscape of Gcn4p in Saccharomyces cerevisiae using datasets from HiTS-FLIP. Our method explicitly identified important subsequences and showed significant performance improvements when compared with other state-of-the-art methods. Based on the identified important subsequences, we discovered two surprisingly stable 10-mers and one sensitive 10-mer which were not reported before. Further test on four other TFs in S. cerevisiae demonstrated the generality of our method. Conclusion: We proposed in this paper a two-round method to quantitatively model the DNA binding affinity landscape. Since the ability to modify
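The weighted degree kernel at the core of the two-round method can be sketched as follows: two equal-length sequences are scored by counting matching k-mers at the same positions, with a standard weighting that favors shorter k-mers. The shift and mismatch extensions used in the paper's first round are omitted here for brevity, so this is only a minimal illustration of the base kernel.

```python
# Minimal weighted degree (WD) string kernel: score two equal-length
# sequences by counting position-aligned matching k-mers for k = 1..D,
# using the standard weights beta_d = 2(D - d + 1) / (D(D + 1)).
# Shifts and mismatches from the paper's first-round kernel are omitted.

def wd_kernel(s, t, D=2):
    assert len(s) == len(t)
    total = 0.0
    for d in range(1, D + 1):
        beta = 2.0 * (D - d + 1) / (D * (D + 1))
        # count positions where the length-d substrings of s and t agree
        matches = sum(1 for i in range(len(s) - d + 1)
                      if s[i:i + d] == t[i:i + d])
        total += beta * matches
    return total
```

For example, two identical 4-mers with D = 2 score 4·(2/3) + 3·(1/3) = 11/3, while fully mismatched sequences score 0.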
Method for nonlinear exponential regression analysis
Junkin, B. G.
1972-01-01
Two computer programs, developed according to two general types of exponential models, for conducting nonlinear exponential regression analysis are described. A least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. The programs are written in FORTRAN 5 for the Univac 1108 computer.
Directory of Open Access Journals (Sweden)
Abustan Abustan
2009-06-01
Full Text Available Vector Auto Regression (VAR) is a statistical method that can be used to forecast time series variables and to analyze the dynamic impact of disturbances in a system of variables. In addition, VAR analysis is very useful for assessing the interrelationship between economic variables. This research proceeds through the following test phases: unit root test, hypothesis test, Granger causality test, and the formation of a vector autoregression (VAR) model. The data used in this research are the GDP data and budget data of South Sulawesi for the period 1985-2004. The research aims to analyze the interrelationship between public expenditure and economic growth in South Sulawesi. The results show that economic growth (PDRB) has a statistically significant influence on public expenditure (APBD), but not vice versa. Moreover, for APBD prediction, a model with lag 4 was optimal based on the causal relationship to PDRB.
Mehdizadeh, Saeid; Behmanesh, Javad; Khalili, Keivan
2017-07-01
Soil temperature (Ts) and its thermal regime are the most important factors in plant growth, biological activities, and water movement in soil. Due to the scarcity of Ts data, estimation of soil temperature is an important issue in different fields of science. The main objective of the present study is to investigate the accuracy of multivariate adaptive regression splines (MARS) and support vector machine (SVM) methods for estimating Ts. For this aim, the monthly mean data of Ts (at depths of 5, 10, 50, and 100 cm) and meteorological parameters of 30 synoptic stations in Iran were utilized. To develop the MARS and SVM models, various combinations of minimum, maximum, and mean air temperatures (Tmin, Tmax, T); actual and maximum possible sunshine duration and sunshine duration ratio (n, N, n/N); actual, net, and extraterrestrial solar radiation data (Rs, Rn, Ra); precipitation (P); relative humidity (RH); wind speed at 2 m height (u2); and water vapor pressure (Vp) were used as input variables. Three error statistics, including root mean square error (RMSE), mean absolute error (MAE), and determination coefficient (R²), were used to check the performance of the MARS and SVM models. The results indicated that the MARS was superior to the SVM at different depths. In the test and validation phases, the most accurate estimations for the MARS were obtained at the depth of 10 cm for the Tmax, Tmin, T inputs (RMSE = 0.71 °C, MAE = 0.54 °C, and R² = 0.995) and for the RH, Vp, P, and u2 inputs (RMSE = 0.80 °C, MAE = 0.61 °C, and R² = 0.996), respectively.
Directory of Open Access Journals (Sweden)
Cheng-Wen Lee
2017-11-01
Full Text Available Accurate electricity forecasting is still a critical issue in many energy management fields. The application of novel hybrid algorithms with support vector regression (SVR) models to overcome the premature convergence problem and improve forecasting accuracy also deserves to be widely explored. This paper applies chaotic function and quantum computing concepts to address the drawbacks embedded in the crossover and mutation operations of genetic algorithms (GA). It then proposes a novel electricity load forecasting model by hybridizing the chaotic function and quantum computing with a GA in an SVR model (named SVRCQGA) to achieve more satisfactory forecasting accuracy. Experimental examples demonstrate that the proposed SVRCQGA model is superior to other competitive models.
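One common way "chaotic function" concepts enter hybrid GA/SVR forecasting models is to generate candidate parameters with a logistic map rather than a plain random number generator, so the search covers the parameter space ergodically and avoids premature convergence. The sketch below illustrates that generic idea only; it is not the SVRCQGA scheme itself, and the parameter range shown is an assumed example.

```python
# Generic sketch: chaotic candidate generation with the logistic map,
# a common ingredient of chaos-enhanced genetic algorithms.
# Not the paper's SVRCQGA procedure; the range [0.1, 100] for an SVR
# penalty parameter C is an illustrative assumption.

def logistic_map(x0, n, r=4.0):
    """Iterate x_{k+1} = r * x_k * (1 - x_k); r = 4 gives chaotic behaviour."""
    xs, x = [], x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs

def chaotic_candidates(x0, n, low, high):
    """Map chaotic values in [0, 1] onto a parameter range [low, high]."""
    return [low + (high - low) * x for x in logistic_map(x0, n)]

# e.g. initial candidates for an SVR penalty parameter C
cands = chaotic_candidates(0.3, 10, 0.1, 100.0)
```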
A method for nonlinear exponential regression analysis
Junkin, B. G.
1971-01-01
A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
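The procedure described above, a linear fit for nominal initial estimates followed by iterative Taylor-series corrections, can be sketched for the single-exponential model y = a·exp(b·t). This is a generic Gauss-Newton illustration under that model assumption; the variable names and data are not from the original programs.

```python
import math

# Sketch of the technique described above: fit y = a * exp(b * t) to
# decay-type data by (1) a log-linear fit for the initial nominal estimates
# and (2) Gauss-Newton correction steps from the Taylor-series linearization.

def fit_exponential(ts, ys, iters=5):
    # Step 1: linear least squares on ln(y) = ln(a) + b*t for nominal estimates.
    n = len(ts)
    logs = [math.log(y) for y in ys]
    tbar, lbar = sum(ts) / n, sum(logs) / n
    b = sum((t - tbar) * (l - lbar) for t, l in zip(ts, logs)) \
        / sum((t - tbar) ** 2 for t in ts)
    a = math.exp(lbar - b * tbar)
    # Step 2: Gauss-Newton refinement using the Jacobian of the model.
    for _ in range(iters):
        r = [y - a * math.exp(b * t) for t, y in zip(ts, ys)]   # residuals
        J = [(math.exp(b * t), a * t * math.exp(b * t)) for t in ts]
        # normal equations (J^T J) delta = J^T r, solved by Cramer's rule
        s11 = sum(j[0] * j[0] for j in J)
        s12 = sum(j[0] * j[1] for j in J)
        s22 = sum(j[1] * j[1] for j in J)
        g1 = sum(j[0] * ri for j, ri in zip(J, r))
        g2 = sum(j[1] * ri for j, ri in zip(J, r))
        det = s11 * s22 - s12 * s12
        a += (s22 * g1 - s12 * g2) / det
        b += (s11 * g2 - s12 * g1) / det
    return a, b

ts = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(-0.5 * t) for t in ts]  # noiseless decay data
a, b = fit_exponential(ts, ys)
```

On noiseless data the log-linear step already recovers the parameters and the correction steps leave them unchanged; with noisy data the Gauss-Newton loop refines the estimates until a convergence criterion is met.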
Support vector regression model for predicting the sorption capacity of lead (II)
Directory of Open Access Journals (Sweden)
Nusrat Parveen
2016-09-01
Full Text Available Biosorption is considered an economical process for the treatment of wastewater containing heavy metals such as lead (II). In this research paper, support vector regression (SVR) has been used to predict the sorption capacity of lead (II) ions, with the independent input parameters being initial lead ion concentration, pH, temperature, and contact time. Tree fern, an agricultural by-product, has been employed as a low-cost biosorbent. A comparison between the multiple linear regression (MLR) and SVR-based models has been made using statistical parameters. It has been found that the SVR model is more accurate and generalizes better for prediction of the sorption capacity of lead (II) ions.
Yang, Chien-Chun; Nagarajan, Mahesh B.; Huber, Markus B.; Carballido-Gamio, Julio; Bauer, Jan S.; Baum, Thomas; Eckstein, Felix; Lochmüller, Eva-Maria; Link, Thomas M.; Wismüller, Axel
2014-03-01
Regional trabecular bone quality estimation for purposes of femoral bone strength prediction is important for improving the clinical assessment of osteoporotic fracture risk. In this study, we explore the ability of 3D Minkowski Functionals derived from multi-detector computed tomography (MDCT) images of proximal femur specimens to predict their corresponding biomechanical strength. MDCT scans were acquired for 50 proximal femur specimens harvested from human cadavers. An automated volume of interest (VOI)-fitting algorithm was used to define a consistent volume in the femoral head of each specimen. In these VOIs, the trabecular bone micro-architecture was characterized by statistical moments of its BMD distribution and by topological features derived from Minkowski Functionals. A linear multiregression analysis and a support vector regression (SVR) algorithm with a linear kernel were used to predict the failure load (FL) from the feature sets; the predicted FL was compared to the true FL determined through biomechanical testing. The prediction performance was measured by the root mean square error (RMSE) for each feature set. The best prediction result was obtained from the Minkowski Functional surface used in combination with SVR, which had the lowest prediction error (RMSE = 0.939 ± 0.345), significantly lower than that of mean BMD (RMSE = 1.075 ± 0.279). These results suggest that the biomechanical strength of proximal femur specimens can be predicted with Minkowski Functionals extracted from MDCT images used in conjunction with support vector regression.
Maximum likelihood optimal and robust Support Vector Regression with lncosh loss function.
Karal, Omer
2017-10-01
In this paper, a novel and continuously differentiable convex loss function based on the natural logarithm of the hyperbolic cosine function, namely the lncosh loss, is introduced to obtain Support Vector Regression (SVR) models which are optimal in the maximum likelihood sense for hyper-secant error distributions. Most current regression models assume that the distribution of error is Gaussian, which corresponds to the squared loss function and has helpful analytical properties such as easy computation and analysis. However, in many real world applications, most observations are subject to unknown noise distributions, so the Gaussian distribution may not be a useful choice. The developed SVR model with the parameterized lncosh loss provides a possibility of learning a loss function leading to a regression model which is maximum likelihood optimal for specific input-output data. The SVR models obtained with different parameter choices of the lncosh loss with the ε-insensitiveness feature possess most of the desirable characteristics of well-known loss functions, such as Vapnik's loss, the squared loss, and Huber's loss function, as special cases. In other words, it is observed in extensive simulations that the mentioned lncosh loss function is entirely controlled by a single adjustable λ parameter and, as a result, it allows switching between different losses depending on the choice of λ. The effectiveness and feasibility of the lncosh loss function are validated through a number of synthetic and real world benchmark data sets for various types of additive noise distributions.
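The λ-controlled switching behaviour described above can be seen directly from the loss itself, L(e) = (1/λ)·ln(cosh(λe)): for small λ it is nearly quadratic, for large λ it approaches |e|. A minimal numerical sketch (the ε-insensitive variant discussed in the paper is omitted):

```python
import math

# The lncosh loss: L(e) = (1/lambda) * ln(cosh(lambda * e)).
# Small lambda  -> approximately (lambda/2) * e^2   (squared-loss regime)
# Large lambda  -> approximately |e| - ln(2)/lambda (absolute-loss regime)

def lncosh_loss(e, lam):
    return math.log(math.cosh(lam * e)) / lam

small = lncosh_loss(0.5, 0.01)   # quadratic regime
large = lncosh_loss(0.5, 100.0)  # absolute regime
```

Both limits follow from ln(cosh(x)) ≈ x²/2 for small x and ln(cosh(x)) ≈ |x| − ln 2 for large x, which is why a single λ interpolates between squared-like and Laplace-like behaviour.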
Linking Simple Economic Theory Models and the Cointegrated Vector AutoRegressive Model
DEFF Research Database (Denmark)
Møller, Niels Framroze
This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector Auto-Regressive (CVAR) model. By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail how the theoretical model and its implications can be formulated within the CVAR framework. It is also demonstrated how other controversial hypotheses, such as Rational Expectations, can be formulated directly as restrictions on the CVAR parameters. A simple example of a "Neoclassical synthetic" AS-AD model is also formulated. Finally, the partial vs. general equilibrium distinction is related to the CVAR as well. Further fundamental extensions and advances to more sophisticated theory models, such as those related to dynamics and expectations (in the structural relations), are left for future papers.
Applying support vector regression analysis on grip force level-related corticomuscular coherence
DEFF Research Database (Denmark)
Rong, Yao; Han, Xixuan; Hao, Dongmei
2014-01-01
Voluntary motor performance is the result of cortical commands driving muscle actions. Corticomuscular coherence can be used to examine the functional coupling or communication between the human brain and muscles. To investigate the effects of grip force level on corticomuscular coherence in an accessory muscle, this study proposed an expanded support vector regression (ESVR) algorithm to quantify the coherence between the electroencephalogram (EEG) from the sensorimotor cortex and the surface electromyogram (EMG) from the brachioradialis in the upper limb. A measure called coherence proportion was introduced and found to be more sensitive to grip force level than coherence area, with significantly higher corticomuscular coherence occurring in the alpha band.
Supplier Short Term Load Forecasting Using Support Vector Regression and Exogenous Input
Matijaš, Marin; Vukićcević, Milan; Krajcar, Slavko
2011-09-01
In power systems, the task of load forecasting is important for keeping the equilibrium between production and consumption. With the liberalization of electricity markets, the task of load forecasting changed because each market participant has to forecast its own load. Consumption of end-consumers is stochastic in nature. Due to competition, suppliers are not in a position to transfer their costs to end-consumers; therefore it is essential to keep the forecasting error as low as possible. Numerous papers investigate load forecasting from the perspective of the grid or production planning. We research forecasting models from the perspective of a supplier. In this paper, we investigate different combinations of exogenous inputs on simulated supplier loads and show that using points of delivery as a feature for Support Vector Regression leads to lower forecasting error, while adding customer number in different datasets does the opposite.
Optimization of Filter by using Support Vector Regression Machine with Cuckoo Search Algorithm
Directory of Open Access Journals (Sweden)
M. İlarslan
2014-09-01
Full Text Available Herein, a new methodology using 3D Electromagnetic (EM) simulator-based Support Vector Regression Machine (SVRM) models of base elements is presented for band-pass filter (BPF) design. SVRM models of elements, which are as fast as analytical equations and as accurate as a 3D EM simulator, are employed in a simple and efficient Cuckoo Search Algorithm (CSA) to optimize an ultra-wideband (UWB) microstrip BPF. CSA performance is verified by comparing it with other meta-heuristics such as the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). As an example of the proposed design methodology, a UWB BPF that operates between the frequencies of 3.1 GHz and 10.6 GHz is designed, fabricated and measured. The simulation and measurement results indicate the superior performance of this optimization methodology in terms of improved filter response characteristics such as return loss, insertion loss, harmonic suppression and group delay.
A Vector Approach to Regression Analysis and Its Implications to Heavy-Duty Diesel Emissions
Energy Technology Data Exchange (ETDEWEB)
McAdams, H.T.
2001-02-14
An alternative approach is presented for the regression of response data on predictor variables that are not logically or physically separable. The methodology is demonstrated by its application to a data set of heavy-duty diesel emissions. Because of the covariance of fuel properties, it is found advantageous to redefine the predictor variables as vectors, in which the original fuel properties are components, rather than as scalars each involving only a single fuel property. The fuel property vectors are defined in such a way that they are mathematically independent and statistically uncorrelated. Because the available data set does not allow definitive separation of vehicle and fuel effects, and because test fuels used in several of the studies may be unrealistically contrived to break the association of fuel variables, the data set is not considered adequate for development of a full-fledged emission model. Nevertheless, the data clearly show that only a few basic patterns of fuel-property variation affect emissions and that the number of these patterns is considerably less than the number of variables initially thought to be involved. These basic patterns, referred to as ''eigenfuels,'' may reflect blending practice in accordance with their relative weighting in specific circumstances. The methodology is believed to be widely applicable in a variety of contexts. It promises an end to the threat of collinearity and the frustration of attempting, often unrealistically, to separate variables that are inseparable.
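The "eigenfuel" construction above amounts to projecting correlated fuel properties onto the eigenvectors of their covariance matrix, which yields mathematically independent, statistically uncorrelated predictors. A toy two-property sketch (the property values below are made up for illustration; the actual study used more properties and real fuel data):

```python
import math

# Sketch of the "eigenfuel" idea: two correlated fuel properties are
# re-expressed along the eigenvectors of their 2x2 covariance matrix,
# giving uncorrelated predictor directions. Illustrative data only.

cetane  = [40.0, 42.0, 44.0, 46.0, 48.0]
density = [0.86, 0.85, 0.845, 0.84, 0.83]   # correlated with cetane

def covariance(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# 2x2 covariance matrix [[sxx, sxy], [sxy, syy]]; for a symmetric 2x2
# matrix the principal-axis rotation angle is 0.5 * atan2(2*sxy, sxx - syy).
sxx = covariance(cetane, cetane)
syy = covariance(density, density)
sxy = covariance(cetane, density)
theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
c, s = math.cos(theta), math.sin(theta)

# project each sample onto the two eigen-directions ("eigenfuels")
e1 = [c * x + s * y for x, y in zip(cetane, density)]
e2 = [-s * x + c * y for x, y in zip(cetane, density)]
```

Regressing emissions on e1 and e2 instead of the raw properties removes the collinearity problem the paragraph describes, since the new predictors are uncorrelated by construction.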
Directory of Open Access Journals (Sweden)
N. Sujay Raghavendra
2015-12-01
Full Text Available This research demonstrates the state-of-the-art capability of wavelet packet analysis in improving the forecasting efficiency of support vector regression (SVR) through the development of a novel hybrid wavelet packet–support vector regression (WP–SVR) model for forecasting monthly groundwater level fluctuations observed in three shallow unconfined coastal aquifers. The Sequential Minimal Optimization Algorithm-based SVR model is also employed for a comparative study with the WP–SVR model. The input variables used for modeling were monthly time series of total rainfall, average temperature, mean tide level, and past groundwater level observations recorded during the period 1996–2006 at three observation wells located near Mangalore, India. The radial basis function is employed as the kernel function during SVR modeling. Model parameters are calibrated using the first seven years of data, and the remaining three years of data are used for model validation using various input combinations. The performance of both the SVR and WP–SVR models is assessed using different statistical indices. From the comparative result analysis of the developed models, it can be seen that the WP–SVR model outperforms the classic SVR model in predicting groundwater levels at all three well locations (e.g. NRMSE(WP–SVR) = 7.14, NRMSE(SVR) = 12.27; NSE(WP–SVR) = 0.91, NSE(SVR) = 0.8 during the test phase with respect to the well location at Surathkal). Therefore, use of the WP–SVR model is highly acceptable for modeling and forecasting groundwater level fluctuations.
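The wavelet-decomposition idea behind such hybrids can be sketched with a single Haar level: the signal is split into a smooth approximation series and a detail series, which can then be modeled separately before recombining. The paper uses full wavelet packet analysis; this shows only one Haar level for illustration.

```python
# One level of a Haar transform: pairwise averages (approximation) and
# half-differences (detail). A minimal stand-in for the wavelet packet
# decomposition used in the WP-SVR hybrid; assumes an even-length signal.

def haar_level(signal):
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Exact reconstruction of the original signal."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out

levels = haar_level([4.0, 6.0, 10.0, 12.0])
```

Because the transform is exactly invertible, forecasts made on the approximation and detail series can be recombined into a forecast of the original series.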
Directory of Open Access Journals (Sweden)
Young Do Koo
2017-06-01
Full Text Available Residual stress is a critical element in determining the integrity of parts and the lifetime of welded structures. It is necessary to estimate the residual stress of a welding zone because residual stress is a major reason for the generation of primary water stress corrosion cracking in nuclear power plants. That is, it is necessary to estimate the distribution of the residual stress in welding of dissimilar metals under manifold welding conditions. In this study, a cascaded support vector regression (CSVR model was presented to estimate the residual stress of a welding zone. The CSVR model was serially and consecutively structured in terms of SVR modules. Using numerical data obtained from finite element analysis by a subtractive clustering method, learning data that explained the characteristic behavior of the residual stress of a welding zone were selected to optimize the proposed model. The results suggest that the CSVR model yielded a better estimation performance when compared with a classic SVR model.
Directory of Open Access Journals (Sweden)
Jatin Alreja
2015-06-01
Full Text Available This paper uses Multivariate Adaptive Regression Splines (MARS) and Least Squares Support Vector Machines (LSSVMs) to predict the hysteretic energy demand in steel moment-resisting frames. These models are used to establish a relation between the hysteretic energy demand and several effective parameters such as earthquake intensity, number of stories, soil type, period, strength index, and the energy imparted to the structure. A total of 27 datasets (input–output pairs) are used, 23 of which are used to train the models and 4 to test them. The datasets used in this study are derived from experimental results. The performance and validity of the models are further tested on different steel moment-resisting structures. The developed models have been compared with the genetic-based simulated annealing method (GSA), and the accurate results portray the strong potential of MARS and LSSVM as reliable tools to predict the hysteretic energy demand.
Directory of Open Access Journals (Sweden)
Liyang Wang
2016-01-01
Full Text Available Time-varying external disturbances cause instability of humanoid robots or even tip robots over. In this work, a trapezoidal fuzzy least squares support vector regression (TF-LSSVR) based control system is proposed to learn the external disturbances and increase the zero-moment-point (ZMP) stability margin of humanoid robots. First, the humanoid states and the corresponding control torques of the joints for training the controller are collected by implementing simulation experiments. Secondly, a TF-LSSVR with a time-related trapezoidal fuzzy membership function (TFMF) is proposed to train the controller using the simulated data. Thirdly, the parameters of the proposed TF-LSSVR are updated using a cubature Kalman filter (CKF). Simulation results are provided. The proposed method is shown to be effective in learning and adapting to occasional external disturbances and ensuring the stability margin of the robot.
Directory of Open Access Journals (Sweden)
Peek Andrew S
2007-06-01
Full Text Available Abstract Background RNA interference (RNAi) is a naturally occurring phenomenon that results in the suppression of a target RNA sequence utilizing a variety of possible methods and pathways. To dissect the factors that result in effective siRNA sequences, a regression kernel Support Vector Machine (SVM) approach was used to quantitatively model RNA interference activities. Results Eight overall feature mapping methods were compared in their abilities to build SVM regression models that predict published siRNA activities. The primary factors in predictive SVM models are position-specific nucleotide compositions. The secondary factors are position-independent sequence motifs (N-grams) and guide strand to passenger strand sequence thermodynamics. Finally, the factors that are least contributory but are still predictive of efficacy are measures of intramolecular guide strand secondary structure and target strand secondary structure. Of these, the site of the 5' most base of the guide strand is the most informative. Conclusion The capacity of specific feature mapping methods and their ability to build predictive models of RNAi activity suggests a relative biological importance of these features. Some feature mapping methods are more informative in building predictive models, and overall t-test filtering provides a method to remove some noisy features or make comparisons among datasets. Together, these features can yield predictive SVM regression models with increased predictive accuracy between predicted and observed activities, both within datasets by cross-validation and between independently collected RNAi activity datasets. Feature filtering to remove features should be approached carefully, in that it is possible to reduce feature set size without substantially reducing predictive models, but the features retained in the candidate models become increasingly distinct. Software to perform feature prediction and SVM training and testing on nucleic acid
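The "position-specific nucleotide composition" feature mapping named above as the primary factor can be sketched as a per-position one-hot encoding: each position of an siRNA sequence contributes four indicator features, so a regression SVM can weight every nucleotide at every position independently. This is a generic illustration of that mapping, not the paper's exact feature pipeline.

```python
# Position-specific nucleotide composition as a one-hot feature map:
# a sequence of length L becomes a vector of 4*L indicator features
# (one per nucleotide per position). Generic sketch.

ALPHABET = "ACGU"

def position_specific_features(seq):
    """One-hot encode seq position by position -> len(seq) * 4 features."""
    features = []
    for base in seq:
        features.extend(1.0 if base == a else 0.0 for a in ALPHABET)
    return features

x = position_specific_features("ACGU")
```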
Balfer, Jenny; Bajorath, Jürgen
2015-01-01
Support vector machines are a popular machine learning method for many classification tasks in biology and chemistry. In addition, the support vector regression (SVR) variant is widely used for numerical property predictions. In chemoinformatics and pharmaceutical research, SVR has become the probably most popular approach for modeling of non-linear structure-activity relationships (SARs) and predicting compound potency values. Herein, we have systematically generated and analyzed SVR prediction models for a variety of compound data sets with different SAR characteristics. Although these SVR models were accurate on the basis of global prediction statistics and not prone to overfitting, they were found to consistently mispredict highly potent compounds. Hence, in regions of local SAR discontinuity, SVR prediction models displayed clear limitations. Compared to observed activity landscapes of compound data sets, landscapes generated on the basis of SVR potency predictions were partly flattened and activity cliff information was lost. Taken together, these findings have implications for practical SVR applications. In particular, prospective SVR-based potency predictions should be considered with caution because artificially low predictions are very likely for highly potent candidate compounds, the most important prediction targets.
Directory of Open Access Journals (Sweden)
Kabiru O. Akande
2016-01-01
Full Text Available Hybrid computational intelligence is defined as a combination of multiple intelligent algorithms such that the resulting model has performance superior to that of the individual algorithms. Therefore, the importance of fusing two or more intelligent algorithms to achieve better performance cannot be overemphasized. In this work, a novel homogeneous hybridization scheme is proposed for improving the generalization and predictive ability of support vector machines regression (SVR). The proposed hybrid SVR (HSVR) works by treating the initial SVR prediction as a feature extraction process and then employing the SVR output, which is the extracted feature, as its sole descriptor. The developed hybrid model is applied to the prediction of reservoir permeability, and the predicted permeability is compared with core permeability, which is regarded as the standard in the petroleum industry. The results show that the proposed hybrid scheme (HSVR) performed better than the existing SVR in both generalization and predictive ability. The outcome of this research will assist petroleum engineers in effectively predicting the permeability of carbonate reservoirs with a higher degree of accuracy and will invariably lead to better reservoir management. Furthermore, the encouraging performance of this hybrid will serve as an impetus for further exploring homogeneous hybrid systems.
Support vector regression methodology for estimating global solar radiation in Algeria
Guermoui, Mawloud; Rabehi, Abdelaziz; Gairaa, Kacem; Benkaciali, Said
2018-01-01
Accurate estimation of Daily Global Solar Radiation (DGSR) has been a major goal for solar energy applications. In this paper we show the possibility of developing a simple model based on Support Vector Regression (SVM-R) which could be used to estimate DGSR on a horizontal surface in Algeria using only the sunshine ratio as input. The SVM-R model has been developed and tested using a data set recorded over three years (2005-2007). The data were collected at the Applied Research Unit for Renewable Energies (URAER) in Ghardaïa city. The data collected in 2005-2006 are used to train the model, while the 2007 data are used to test its performance. The measured and estimated values of DGSR were compared statistically during the testing phase using the Root Mean Square Error (RMSE), relative Root Mean Square Error (rRMSE), and correlation coefficient (r²), which amount to 1.59 MJ/m², 8.46% and 97.4%, respectively. The obtained results show that the SVM-R is highly qualified for DGSR estimation using only the sunshine ratio.
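The test-phase statistics quoted above can be written out explicitly. Here rRMSE is taken as RMSE relative to the mean observation, expressed in percent, which is a common convention; the paper may define it slightly differently.

```python
import math

# The three evaluation statistics: RMSE, relative RMSE (as % of the mean
# observation, one common convention), and the coefficient of determination.

def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def rrmse(obs, pred):
    return 100.0 * rmse(obs, pred) / (sum(obs) / len(obs))

def r_squared(obs, pred):
    mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

obs, pred = [1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 3.0, 3.0]
```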
Spatial Support Vector Regression to Detect Silent Errors in the Exascale Era
Energy Technology Data Exchange (ETDEWEB)
Subasi, Omer; Di, Sheng; Bautista-Gomez, Leonardo; Balaprakash, Prasanna; Unsal, Osman; Labarta, Jesus; Cristal, Adrian; Cappello, Franck
2016-01-01
As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs), or silent errors, are one of the major sources that corrupt the execution results of HPC applications without being detected. In this work, we explore a low-memory-overhead SDC detector, by leveraging epsilon-insensitive support vector machine regression, to detect SDCs occurring in HPC applications that can be characterized by an impact error bound. The key contributions are threefold. (1) Our design takes spatial features (i.e., neighbouring data values for each data point in a snapshot) into the training data, such that little memory overhead (less than 1%) is introduced. (2) We provide an in-depth study of the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that our detector can achieve a detection sensitivity (i.e., recall) of up to 99% while suffering a false positive rate of less than 1% in most cases. Our detector incurs low performance overhead, 5% on average, for all benchmarks studied in the paper. Compared with other state-of-the-art techniques, our detector exhibits the best tradeoff considering detection ability and overheads.
Predictive based monitoring of nuclear plant component degradation using support vector regression
Energy Technology Data Exchange (ETDEWEB)
Agarwal, Vivek [Idaho National Lab. (INL), Idaho Falls, ID (United States). Dept. of Human Factors, Controls, Statistics; Alamaniotis, Miltiadis [Purdue Univ., West Lafayette, IN (United States). School of Nuclear Engineering; Tsoukalas, Lefteri H. [Purdue Univ., West Lafayette, IN (United States). School of Nuclear Engineering
2015-02-01
Nuclear power plants (NPPs) are large installations comprising many active and passive assets. Degradation monitoring of all these assets is expensive (labor cost) and a highly demanding task. In this paper, a framework based on Support Vector Regression (SVR) for online surveillance of critical parameter degradation of NPP components is proposed. In this context, on-time replacement or maintenance of components will prevent potential plant malfunctions and reduce the overall operational cost. In the current work, we apply SVR equipped with a Gaussian kernel function to monitor components. Monitoring includes the one-step-ahead prediction of the component's respective operational quantity using the SVR model, where the SVR model is trained on a set of previously recorded degradation histories of similar components. The predictive capability of the model is evaluated upon arrival of a sensor measurement, which is compared to the component failure threshold. A maintenance decision is based on a fuzzy inference system that utilizes three parameters: (i) the prediction evaluation in the previous steps, (ii) the predicted value of the current step, and (iii) the difference between the current predicted value and the component's failure threshold. The proposed framework will be tested on turbine blade degradation data.
Brown, Joshua D; Summers, Michael F; Johnson, Bruce A
2015-09-01
The Biological Magnetic Resonance Data Bank (BMRB) contains NMR chemical shift depositions for over 200 RNAs and RNA-containing complexes. We have analyzed the (1)H and (13)C NMR chemical shifts reported for non-exchangeable protons of 187 of these RNAs. Software was developed that downloads BMRB datasets and corresponding PDB structure files, and then generates residue-specific attributes based on the calculated secondary structure. Attributes represent properties present in each sequential stretch of five adjacent residues and include variables such as nucleotide type, base-pair presence and type, and tetraloop types. Attributes and (1)H and (13)C NMR chemical shifts of the central nucleotide are then used as input to train a predictive model using support vector regression. These models can then be used to predict shifts for new sequences. The new software tools, available as stand-alone scripts or integrated into the NMR visualization and analysis program NMRViewJ, should facilitate NMR assignment and/or validation of RNA (1)H and (13)C chemical shifts. In addition, our findings enabled the re-calibration of a ring-current shift model using published NMR chemical shifts and high-resolution X-ray structural data as guides.
Estimation of Electrically-Evoked Knee Torque from Mechanomyography Using Support Vector Regression.
Ibitoye, Morufu Olusola; Hamzaid, Nur Azah; Abdul Wahab, Ahmad Khairi; Hasnan, Nazirah; Olatunji, Sunday Olusanya; Davis, Glen M
2016-07-19
The difficulty of real-time muscle force or joint torque estimation during neuromuscular electrical stimulation (NMES) in physical therapy and exercise science has motivated recent research interest in torque estimation from other muscle characteristics. This study investigated the accuracy of a computational intelligence technique for estimating NMES-evoked knee extension torque based on the mechanomyographic signals (MMG) of contracting muscles that were recorded from eight healthy males. Simulation of the knee torque was modelled via Support Vector Regression (SVR) due to its good generalization ability in related fields. Inputs to the proposed model were MMG amplitude characteristics, the level of electrical stimulation or contraction intensity, and knee angle. A Gaussian kernel function, together with its optimal parameters identified by the best performance measure, was applied as the SVR kernel function to build an effective knee torque estimation model. To train and test the model, the data were partitioned into training (70%) and testing (30%) subsets, respectively. The SVR estimation accuracy, based on the coefficient of determination (R²) between the actual and the estimated torque values, was up to 94% and 89% during the training and testing cases, with root mean square errors (RMSE) of 9.48 and 12.95, respectively. The knee torque estimations obtained using SVR modelling agreed well with the experimental data from an isokinetic dynamometer. These findings support the realization of a closed-loop NMES system for functional tasks using MMG as the feedback signal source and an SVR algorithm for joint torque estimation.
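A rough sketch of this modelling workflow (synthetic stand-ins for the MMG, intensity, and angle inputs; the kernel parameters are selected by cross-validated grid search, which is one plausible reading of "identified with the best performance measure"):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(2)
n = 300
# Hypothetical stand-ins for the study's inputs (values illustrative only):
# MMG amplitude, stimulation intensity, and knee angle scaled to [0, 1]
mmg = rng.uniform(0.0, 1.0, n)
intensity = rng.uniform(0.0, 1.0, n)
angle = rng.uniform(0.0, 1.0, n)
torque = 40 * mmg + 20 * intensity + 9 * angle + rng.normal(0.0, 2.0, n)

X = np.column_stack([mmg, intensity, angle])
X_tr, X_te, y_tr, y_te = train_test_split(X, torque, train_size=0.7,
                                          random_state=0)

# Identify the Gaussian (RBF) kernel parameters by cross-validated grid search
search = GridSearchCV(SVR(kernel="rbf"),
                      {"C": [1, 10, 100], "gamma": ["scale", 0.1, 1.0]},
                      cv=5).fit(X_tr, y_tr)

r2_test = r2_score(y_te, search.predict(X_te))
rmse_test = mean_squared_error(y_te, search.predict(X_te)) ** 0.5
```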
Chen, Hai-Feng
2009-08-01
The oil/water partition coefficient (log P) is one of the key properties for a lead compound to become a drug. In silico log P models based solely on chemical structures have become an important part of modern drug discovery. Here, we report support vector machine, radial basis function neural network, and multiple linear regression methods to investigate the correlation between partition coefficients and physico-chemical descriptors for a large data set of compounds. The correlation coefficient r² between experimental and predicted log P for the training and test sets by support vector machines, radial basis function neural networks, and multiple linear regression is 0.92, 0.90, and 0.88, respectively. The results show that non-linear support vector machines derive statistical models with better prediction ability than those of the radial basis function neural network and multiple linear regression methods. This indicates that support vector machines can be used as an alternative modeling tool for quantitative structure-property/activity relationship studies.
DEFF Research Database (Denmark)
Graversen, C; Frokjaer, J B; Brock, Christina
2012-01-01
patients were discriminated from the HV by a support vector machine (SVM) applied in regression mode. For the optimal DWT, the discriminative features were extracted and the SVM regression value representing the overall alteration of the EP was correlated to the clinical scores. A classification...... approach to study central mechanisms in diabetes mellitus, and may provide a future application for a clinical tool to optimize treatment in individual patients....
Directory of Open Access Journals (Sweden)
N. Zahir
2015-12-01
Full Text Available Lake Urmia is one of the most important ecosystems of the country and is on the verge of disappearing. Many factors contribute to this crisis; among them, precipitation plays an important role. Precipitation takes many forms, one of which is snow. The snow on Sahand Mountain is one of the main and most important sources of Lake Urmia's water. Snow depth (SD) is a vital parameter for estimating the water balance for future years. In this regard, this study focuses on the SD parameter using the Special Sensor Microwave/Imager (SSM/I) instrument on board the Defense Meteorological Satellite Program (DMSP) F16 satellite. The usual statistical methods for retrieving SD, both linear and non-linear, use a least squares procedure to estimate the SD model. Recently, kernel-based methods have been widely used for statistical modelling, and among them support vector regression (SVR) has achieved high performance. Examination of the obtained data shows the existence of outliers, which are removed by a wavelet denoising method. After the omission of the outliers, the optimum bands and parameters for SVR must be selected; feature selection methods have shown a direct effect on improving regression performance. We used a genetic algorithm (GA) to select suitable features from the SSM/I bands in order to estimate the SD model. The results for the training and testing data in the Sahand mountains (R² = 0.9049, RMSE = 6.9654) show the high performance of SVR.
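A greatly simplified sketch of GA-driven band selection for SVR (toy data with a known informative subset; the GA operators and channel count are illustrative, not those of the study):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n, n_bands = 200, 7  # hypothetical channel count, not the real SSM/I set
X = rng.normal(size=(n, n_bands))
# Snow depth depends only on bands 0 and 3 in this toy setup
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(0, 0.2, n)

def fitness(mask):
    """Cross-validated R^2 of an SVR restricted to the selected bands."""
    if not mask.any():
        return -np.inf
    return cross_val_score(SVR(kernel="rbf", C=10.0), X[:, mask], y, cv=3).mean()

# Minimal genetic algorithm over band subsets (a sketch, not the paper's GA)
pop = rng.integers(0, 2, (20, n_bands)).astype(bool)
for _ in range(15):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[::-1][:10]]       # selection (elitist)
    cut = rng.integers(1, n_bands, 10)
    kids = np.array([np.concatenate([parents[i][:c], parents[(i + 1) % 10][c:]])
                     for i, c in enumerate(cut)])       # one-point crossover
    flip = rng.random(kids.shape) < 0.05                # mutation
    pop = np.vstack([parents, kids ^ flip])

best = pop[np.argmax([fitness(m) for m in pop])]
```

Because the parents are carried over unchanged each generation, the best subset found so far is never lost, and the informative bands survive selection.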
Ridge regression estimator: combining unbiased and ordinary ridge regression methods of estimation
Directory of Open Access Journals (Sweden)
Sharad Damodar Gore
2009-10-01
Full Text Available Statistical literature has several methods for coping with multicollinearity. This paper introduces a new shrinkage estimator, called modified unbiased ridge (MUR). This estimator is obtained from unbiased ridge regression (URR) in the same way that ordinary ridge regression (ORR) is obtained from ordinary least squares (OLS). Properties of MUR are derived. Results on its matrix mean squared error (MMSE) are obtained. MUR is compared with ORR and URR in terms of MMSE. These results are illustrated with an example based on data generated by Hoerl and Kennard (1975).
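MUR itself is derived in the paper; as a minimal numerical illustration of why ridge-type shrinkage helps under multicollinearity, the following sketch compares OLS with ordinary ridge regression on illustrative data (not the Hoerl-Kennard example):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60
# Two nearly collinear regressors: the setting ridge methods are built for
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(0.0, 0.01, n)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(0.0, 0.1, n)

# Closed forms: OLS vs ordinary ridge (ORR) with shrinkage parameter k
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
k = 0.1
beta_orr = np.linalg.solve(X.T @ X + k * np.eye(2), X.T @ y)
```

Under near-collinearity the OLS coefficients are highly unstable along the ill-conditioned direction, while ridge shrinks the coefficient vector toward the stable solution (here, both coefficients near 1, summing to roughly 2).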
Carbon Nanotube Growth Rate Regression using Support Vector Machines and Artificial Neural Networks
2014-03-27
The chiral vector is made up of the unit vectors a1 and a2, and the angle θ determines the tube type: zigzag, chiral, or armchair (recreated from [4, 39]; a related figure is reprinted from [45] with permission from the Nature Publishing Group).
Directory of Open Access Journals (Sweden)
Mattia Callegari
2015-05-01
Full Text Available In this contribution we analyze the performance of a monthly river discharge forecasting model based on a Support Vector Regression (SVR) technique in a European alpine area. We considered as predictors the discharges of the antecedent months, snow-covered area (SCA), and meteorological and climatic variables for 14 catchments in South Tyrol (Northern Italy), as well as the long-term average discharge of the month of prediction, which also serves as a benchmark. Forecasts at a six-month lead time tend to perform no better than the benchmark, with an average 33% relative root mean square error (RMSE%) on test samples. However, at one-month lead time, RMSE% was 22%, a non-negligible improvement over the benchmark; moreover, the SVR model reduces the frequency of the higher errors associated with anomalous months. Predictions with a lead time of three months show an intermediate performance between those at one- and six-month lead times. Among the considered predictors, SCA alone reduces RMSE% by 6% and 5% compared to using monthly discharges only, for lead times equal to one and three months, respectively, whereas meteorological parameters bring only minor improvements. The model also outperformed a simpler linear autoregressive model, and yielded the lowest volume error in forecasting with one-month lead time, while at longer lead times the differences compared to the benchmarks are negligible. Our results suggest that although an SVR model may deliver better forecasts than its simpler linear alternatives, long lead-time hydrological forecasting in Alpine catchments remains a challenge. Catchment state variables may play a bigger role than catchment input variables; hence, a focus on characterizing seasonal catchment storage, rather than seasonal weather forecasting, could be key for improving our predictive capacity.
Shen, Wanxiang; Xiao, Tao; Chen, Shangying; Liu, Feng; Chen, Yu Zong; Jiang, Yuyang
2017-11-01
The enzymatic hydrolysis of chemicals, which is important for in vitro drug metabolism assays, is an important indicator of drug stability profiles during drug discovery and development. Herein, we employed a stepwise feature elimination (SFE) method with nonlinear support vector machine regression (SVR) models to predict the in vitro half-lives in human plasma/blood of various esters. The SVR model was developed using public databases and literature-reported data on the half-lives of esters in human plasma/blood. In particular, the SFE method was developed to prevent overfitting and underfitting in the nonlinear model, and it provided a novel and efficient way of realizing feature combination and selection to enhance prediction accuracy. Our final model with 24 features effectively predicted an external validation set constructed by the time-split method, presenting a reasonably good R² value (0.6), and also predicted two completely independent validation datasets with R² values of 0.62 and 0.54; thus, this model performed much better than other prediction models. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
BOX-COX REGRESSION METHOD IN TIME SCALING
Directory of Open Access Journals (Sweden)
ATİLLA GÖKTAŞ
2013-06-01
Full Text Available The Box-Cox regression method with power transformations λj, for j = 1, 2, ..., k, can be used when the dependent variable and the error term of the linear regression model do not satisfy the continuity and normality assumptions. The case of obtaining the smallest mean squared error with the optimum power transformation λj of Y, for j = 1, 2, ..., k, is discussed. The Box-Cox regression method is especially appropriate for adjusting for skewness or heteroscedasticity of the error terms in a nonlinear functional relationship between the dependent and explanatory variables. In this study, the advantages and disadvantages of the Box-Cox regression method are discussed in the context of differentiation and differential analysis of the time scale concept.
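A quick sketch of the Box-Cox transformation itself, using SciPy's maximum-likelihood choice of λ on synthetic right-skewed data (not the paper's time-scaling application):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Right-skewed positive response; Box-Cox requires y > 0
y = rng.lognormal(mean=0.0, sigma=0.8, size=500)

# boxcox picks the power lambda by maximum likelihood and returns
# the transformed data together with the chosen lambda
y_t, lam = stats.boxcox(y)

skew_before = stats.skew(y)
skew_after = stats.skew(y_t)
```

For log-normal data the fitted λ is close to 0 (i.e., a log transform), and the transformed response is nearly symmetric, which is exactly the normality repair the abstract describes.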
General method of boundary correction in kernel regression estimation
African Journals Online (AJOL)
Kernel estimators of both density and regression functions are not consistent near the finite end points of their supports. In other words, boundary effects seriously affect the performance of these estimators. In this paper, we combine the transformation and the reflection methods in order to introduce a new general method of ...
An Enhanced MEMS Error Modeling Approach Based on Nu-Support Vector Regression
Directory of Open Access Journals (Sweden)
Deepak Bhatt
2012-07-01
Full Text Available Micro Electro Mechanical System (MEMS)-based inertial sensors have made possible the development of civilian land vehicle navigation systems by offering a low-cost solution. However, accurate modeling of the MEMS sensor errors is one of the most challenging tasks in the design of low-cost navigation systems. These sensors exhibit significant errors, such as biases, drift, and noise, which are negligible in higher-grade units. Different conventional techniques utilizing the Gauss-Markov model and neural network methods have previously been used to model the errors. However, the Gauss-Markov model works unsatisfactorily in the case of MEMS units due to the presence of high inherent sensor errors. On the other hand, modeling the random drift with a Neural Network (NN) is time consuming, thereby affecting its real-time implementation. We overcome these existing drawbacks by developing an enhanced Support Vector Machine (SVM)-based error model. Unlike NNs, SVMs do not suffer from local minimisation or over-fitting problems and deliver a reliable global solution. Experimental results proved that the proposed SVM approach reduced the noise standard deviation by 10-35% for gyroscopes and 61-76% for accelerometers. Further, positional error drifts under static conditions improved by 41% and 80% in comparison to the NN and GM approaches.
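A hedged sketch of Nu-SVR smoothing of a noisy drift signal (a synthetic stand-in for a MEMS gyroscope record; all parameters and signal shapes are illustrative):

```python
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(6)
t = np.linspace(0.0, 1.0, 400)
# Hypothetical MEMS gyroscope record: slow drift plus wideband noise
drift = 0.5 * np.sin(2 * np.pi * t) + 0.3 * t
raw = drift + rng.normal(0.0, 0.1, t.size)

# Nu-SVR learns the smooth drift component from the noisy record
model = NuSVR(nu=0.5, C=10.0, kernel="rbf", gamma="scale")
model.fit(t.reshape(-1, 1), raw)
est = model.predict(t.reshape(-1, 1))

rmse_raw = np.sqrt(np.mean((raw - drift) ** 2))  # roughly the noise std
rmse_est = np.sqrt(np.mean((est - drift) ** 2))
```

The estimated drift tracks the true drift far more closely than the raw record does, mirroring the noise-standard-deviation reductions the abstract reports.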
Chao, Cheng-Min; Yu, Ya-Wen; Cheng, Bor-Wen; Kuo, Yao-Lung
2014-10-01
The aim of this paper is to use data mining technology to establish a classification of breast cancer survival patterns and to offer a treatment decision-making reference regarding the survival ability of women diagnosed with breast cancer in Taiwan. We studied patients with breast cancer at a specific hospital in Central Taiwan and obtained 1,340 data sets. We employed a support vector machine (SVM), logistic regression, and a C5.0 decision tree to construct classification models of breast cancer patients' survival rates, and used a 10-fold cross-validation approach to validate the models. The results show that all of the established classification models yielded an average accuracy rate of more than 90%, and the SVM provided the best method for constructing the three-category classification system for survival mode. The results of the experiment show that the three methods used to create the classification system achieved high accuracy, predicted the survival ability of women diagnosed with breast cancer more accurately, and could be used as a reference when creating a medical decision-making framework.
Directory of Open Access Journals (Sweden)
Fereshteh Shiri
2010-08-01
Full Text Available In the present work, support vector machine (SVM) and multiple linear regression (MLR) techniques were used for quantitative structure-property relationship (QSPR) studies of the retention time (tR) in standardized liquid chromatography-UV-mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins) based on molecular descriptors calculated from the optimized 3D structures. By applying missing value, zero and multicollinearity tests with a cutoff value of 0.95, and a genetic algorithm method of variable selection, the most relevant descriptors were selected to build the QSPR models. MLR and SVM methods were employed to build the QSPR models. The robustness of the QSPR models was characterized by statistical validation and the applicability domain (AD). The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability, measured by r² and q², are 0.931 and 0.932, respectively, for SVM, and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using William's plot. The effects of different descriptors on the retention times are described.
Chen, Jing; Qiu, Xiaojie; Yin, Cunyi; Jiang, Hao
2018-02-01
An efficient method to design the broadband gain-flattened Raman fiber amplifier with multiple pumps is proposed based on least squares support vector regression (LS-SVR). A multi-input multi-output LS-SVR model is introduced to replace the complicated solving process of the nonlinear coupled Raman amplification equation. The proposed approach contains two stages: offline training stage and online optimization stage. During the offline stage, the LS-SVR model is trained. Owing to the good generalization capability of LS-SVR, the net gain spectrum can be directly and accurately obtained when inputting any combination of the pump wavelength and power to the well-trained model. During the online stage, we incorporate the LS-SVR model into the particle swarm optimization algorithm to find the optimal pump configuration. The design results demonstrate that the proposed method greatly shortens the computation time and enhances the efficiency of the pump parameter optimization for Raman fiber amplifier design.
Modern methods in topological vector spaces
Wilansky, Albert
2013-01-01
Designed for a one-year course in topological vector spaces, this text is geared toward advanced undergraduates and beginning graduate students of mathematics. The subjects involve properties employed by researchers in classical analysis, differential and integral equations, distributions, summability, and classical Banach and Fréchet spaces. Optional problems with hints and references introduce non-locally convex spaces, Köthe-Toeplitz spaces, Banach algebras, sequentially barrelled spaces, and norming subspaces. Extensive introductory chapters cover metric ideas, Banach space, and topological vector spaces.
Energy Technology Data Exchange (ETDEWEB)
Jiang, Huaiguang [National Renewable Energy Laboratory (NREL), Golden, CO (United States)
2017-08-25
This work proposes an approach for distribution system load forecasting, which aims to provide highly accurate short-term load forecasting with high resolution utilizing a support vector regression (SVR) based forecaster and a two-step hybrid parameters optimization method. Specifically, because the load profiles in distribution systems contain abrupt deviations, a data normalization is designed as the pretreatment for the collected historical load data. Then an SVR model is trained by the load data to forecast the future load. For better performance of SVR, a two-step hybrid optimization algorithm is proposed to determine the best parameters. In the first step of the hybrid optimization algorithm, a designed grid traverse algorithm (GTA) is used to narrow the parameters searching area from a global to local space. In the second step, based on the result of the GTA, particle swarm optimization (PSO) is used to determine the best parameters in the local parameter space. After the best parameters are determined, the SVR model is used to forecast the short-term load deviation in the distribution system.
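The two-step idea can be sketched as a coarse grid traverse followed by a local refinement (a simple random search stands in for PSO here; the load series, lag structure, and parameter grids are all illustrative):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
# Toy hourly load profile: daily cycle plus noise (not real feeder data)
h = np.arange(24 * 14)
load = 100 + 20 * np.sin(2 * np.pi * h / 24) + rng.normal(0.0, 2.0, h.size)

# Predict the next hour from the previous 24 hours
lags = 24
X = np.column_stack([load[i:-(lags - i)] for i in range(lags)])
y = load[lags:]

def score(C, gamma):
    """Cross-validated R^2 of an RBF SVR with the given parameters."""
    return cross_val_score(SVR(kernel="rbf", C=C, gamma=gamma), X, y, cv=3).mean()

# Step 1: coarse grid traverse to locate a promising parameter region
coarse = [(C, g) for C in (1.0, 10.0, 100.0) for g in (1e-4, 1e-3, 1e-2)]
best_C, best_g = max(coarse, key=lambda p: score(*p))
best = score(best_C, best_g)

# Step 2: local refinement around the coarse optimum (a simple random-search
# stand-in for the particle swarm optimization used in the report)
for _ in range(10):
    C = best_C * 10 ** rng.uniform(-0.5, 0.5)
    g = best_g * 10 ** rng.uniform(-0.5, 0.5)
    s = score(C, g)
    if s > best:
        best, best_C, best_g = s, C, g
```

The coarse pass keeps the expensive search tractable; the local pass only has to explore a narrow log-scale neighbourhood of the grid winner.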
Development of orientation method with constraint conditions using vector data
Fuse, Takashi; Kamiya, Keita
2015-05-01
Recently, various kinds of vector data have been widely used. Images as raster data have also become popular, and applications using vector data and images simultaneously have attracted growing interest. Such applications require registration of these data in the same coordinate system. This paper proposes an orientation method that combines vector data with images based on bundle adjustment. Since the vector data can be regarded as constraint conditions, the bundle adjustment is extended to a constrained non-linear optimization method. The constraint conditions are the coincidence between lines extracted from images and the corresponding lines of the vector data. For the formulation, a representative point is set as the midpoint of a projected line of the vector data on the image. Using the representative points, the coincidence condition is expressed as the distance between the point and the lines extracted from the image. Under these conditions, the proposed method is formulated via Lagrange's method of undetermined multipliers. The proposed method is applied to synthetic and real data (compared with laser scanner data). The experiments with both synthetic and real data show that the proposed method is more robust to errors caused by low accuracy of feature-point coordinates than a method without constraint conditions.
Comparison of ν-support vector regression and logistic equation for ...
African Journals Online (AJOL)
Jane
2011-07-04
Prediction of key state variables using support vector machines in bioprocesses. Chem. Eng. Technol. 29: 313-319. Lin, W.Z., Xiao, X., and Chou, K.C., 2009. GPCR-GIA: a web-server for identifying G-protein coupled receptors and their families with grey incidence analysis. Protein Eng. Des. Sel. 22: 699-705.
Water demand prediction using artificial neural networks and support vector regression
CSIR Research Space (South Africa)
Msiza, IS
2008-11-01
Full Text Available The two techniques compared are Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs). In this study it was observed that ANNs perform significantly better than SVMs. This performance is measured against the generalization ability of the two techniques in water...
A Vector AutoRegressive (VAR) Approach to the Credit Channel for ...
African Journals Online (AJOL)
This paper is an attempt to determine the presence and empirical significance of monetary policy and the bank lending view of the credit channel for Mauritius, which is particularly relevant at these times. A vector autoregressive (VAR) model of order three is used to examine the monetary transmission mechanism using ...
Hamidi, Omid; Tapak, Leili; Abbasi, Hamed; Maryanaji, Zohreh
2017-10-01
We have conducted a case study to investigate the performance of support vector machine, multivariate adaptive regression splines, and random forest time series methods in snowfall modeling. These models were applied to a data set of monthly snowfall collected during six cold months at the Hamadan Airport sample station located in the Zagros Mountain Range in Iran. We considered monthly snowfall data from 1981 to 2008 during the period from October/November to April/May as the training set and the data from 2009 to 2015 as the testing set. The root mean square error (RMSE), mean absolute error (MAE), coefficient of determination (R²), coefficient of efficiency (E%), and intra-class correlation coefficient (ICC) statistics were used as evaluation criteria. Our results indicated that the random forest time series model outperformed the support vector machine and multivariate adaptive regression splines models in predicting monthly snowfall in terms of several criteria. The RMSE, MAE, R², E, and ICC for the testing set were 7.84, 5.52, 0.92, 0.89, and 0.93, respectively. The overall results indicated that the random forest time series model could be successfully used to estimate monthly snowfall values. Moreover, the support vector machine model showed substantial performance as well, suggesting it may also be applied to forecast snowfall in this area.
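A small sketch of a random-forest time series setup of this kind (toy monthly snowfall with lagged predictors and a month-of-season index; not the Hamadan data):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(9)
years = 35
# Toy monthly snowfall for six cold months per year (cm): seasonal + noise
season = np.array([5.0, 15.0, 30.0, 25.0, 10.0, 3.0])
snow = np.clip(np.tile(season, years) + rng.normal(0.0, 3.0, 6 * years),
               0.0, None)

# Features: the previous two months' snowfall and a month-of-season index
lag1, lag2 = snow[1:-1], snow[:-2]
month = np.tile(np.arange(6), years)[2:]
X = np.column_stack([lag1, lag2, month])
y = snow[2:]

# Time-ordered split: first 28 "years" train, last 7 test
split = 6 * 28
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X[:split], y[:split])
mae = mean_absolute_error(y[split:], rf.predict(X[split:]))
```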
Model reduction methods for vector autoregressive processes
Brüggemann, Ralf
2004-01-01
1.1 Objective of the Study Vector autoregressive (VAR) models have become one of the dominant research tools in the analysis of macroeconomic time series during the last two decades. The great success of this modeling class started with Sims' (1980) critique of the traditional simultaneous equation models (SEM). Sims criticized the use of 'too many incredible restrictions' based on 'supposed a priori knowledge' in large scale macroeconometric models which were popular at that time. Therefore, he advocated largely unrestricted reduced form multivariate time series models, unrestricted VAR models in particular. Ever since his influential paper these models have been employed extensively to characterize the underlying dynamics in systems of time series. In particular, tools to summarize the dynamic interaction between the system variables, such as impulse response analysis or forecast error variance decompositions, have been developed over the years. The econometrics of VAR models and related quantities i...
Analysis of Finite Element Methods for Vector Laplacians on Surfaces
Hansbo, Peter; Larson, Mats G.; Larsson, Karl
2016-01-01
We develop a finite element method for the vector Laplacian based on the covariant derivative of tangential vector fields on surfaces embedded in $\\mathbb{R}^3$. Closely related operators arise in models of flow on surfaces as well as elastic membranes and shells. The method is based on standard continuous parametric Lagrange elements with one order higher polynomial degree for the mapping. The tangent condition is weakly enforced using a penalization term. We derive error estimates that take...
Comparing parametric and nonparametric regression methods for panel data
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb-Douglas and Translog, are unsuitable for analysing the optimal firm size. We show that the Translog functional form implies an implausible linear relationship between the (logarithmic) firm size and the elasticity of scale, where the slope is artificially related to the substitutability between the inputs. The practical applicability of the parametric and non-parametric regression methods is scrutinised and compared by an empirical example: we analyse the production technology and investigate the optimal size of Polish crop farms based on a firm-level balanced panel data set. A nonparametric specification test...
Estimation of Students’ Graduation Using Multiple Linear Regression Method
Directory of Open Access Journals (Sweden)
Bintang Dewi Fajar Kurniatullah
2017-04-01
Full Text Available Utilization of students' academic data produces information used by management to monitor students' study periods in the Information Systems Department. The multiple linear regression method produces a multiple linear regression equation used to estimate students' graduation, equipped with a prototype. According to the analysis carried out using nine variables (SKS1, SKS2, SKS3, SKS4, IPS1, IPS2, IPS3, IPS4, and the number of repeated courses) for 2008 to 2012, the multiple linear regression equation is Y = 13.49 + 0.099 X1 - 0.068 X2 + 0.025 X3 - 0.059 X4 - 0.585 X5 - 0.443 X6 - 0.155 X7 - 0.368 X8 - 0.082 X9. The equation's MSE and RMSE errors are 0.1168 and 0.3418, respectively. The prototype is a PHP-based program built using Sublime Text and XAMPP. The prototype for monitoring students' study time in this research is very helpful if supported by management. Keywords: Data mining, multiple linear regression, estimation, monitoring, study time
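The estimation step can be sketched with ordinary least squares on synthetic academic records (the SKS/IPS value ranges and coefficients below are hypothetical, not the department's data):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 120
# Hypothetical per-semester credits (SKS1..SKS4), GPAs (IPS1..IPS4),
# and number of repeated courses
X = np.column_stack([rng.uniform(15, 24, (n, 4)),    # SKS1..SKS4
                     rng.uniform(2.0, 4.0, (n, 4)),  # IPS1..IPS4
                     rng.integers(0, 5, (n, 1))])    # repeated courses
coef_true = np.array([0.1, -0.07, 0.03, -0.06, -0.6, -0.45, -0.15, -0.35, -0.08])
study_time = 13.5 + X @ coef_true + rng.normal(0.0, 0.3, n)

# Ordinary least squares fit of the multiple linear regression model
A = np.column_stack([np.ones(n), X])  # prepend the intercept column
beta, *_ = np.linalg.lstsq(A, study_time, rcond=None)

pred = A @ beta
mse = np.mean((study_time - pred) ** 2)
rmse = np.sqrt(mse)
```

`beta[0]` plays the role of the 13.49 intercept and `beta[1:]` the nine variable coefficients; MSE and RMSE are computed exactly as in the abstract.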
Directory of Open Access Journals (Sweden)
Ibrahim A. Naguib
2011-12-01
Full Text Available Partial least squares regression (PLSR), spectral residual augmented classical least squares (SRACLS) and support vector regression (SVR) are three different chemometric models. These models are subjected to a comparative study that highlights their inherent characteristics by applying them to the analysis of bisacodyl in the presence of its reported degradation products, monoacetyl bisacodyl (I) and desacetyl bisacodyl (II), in raw material. For proper analysis, a 3-factor, 3-level experimental design was established, resulting in a training set of 9 mixtures containing different ratios of the interfering species. A linear test set consisting of 6 mixtures was used to validate the prediction ability of the suggested models. To test the generalisation ability of the models, some extra mixtures were prepared outside the concentration space of the training set. To test the ability of the models to handle nonlinearity in spectral response, another set of nonlinear samples was prepared. The paper highlights model transfer to other labs under other conditions as well. This paper aims to manifest the advantages of SRACLS and SVR over the PLSR model: SRACLS can tackle future changes without the need for tedious recalibration, while SVR is a more robust and general model with a high ability to model nonlinearity in spectral response, though, like PLSR, it needs recalibration. The results presented indicate the ability of the three models to analyse bisacodyl in the presence of its degradation products in raw material with high accuracy and precision, with SVR giving the best results under all tested conditions compared to the other models.
Analysis of regression methods for solar activity forecasting
Lundquist, C. A.; Vaughan, W. W.
1979-01-01
The paper deals with the potential use of the most recent solar data to project trends in the next few years. Assuming that a mode of solar influence on weather can be identified, advantageous use of that knowledge presumably depends on estimating future solar activity. A frequently used technique for solar cycle predictions is a linear regression procedure along the lines formulated by McNish and Lincoln (1949). The paper presents a sensitivity analysis of the behavior of such regression methods relative to the following aspects: cycle minimum, time into cycle, composition of the historical data base, and unnormalized vs. normalized solar cycle data. Comparative solar cycle forecasts for several past cycles are presented with respect to these aspects of the input data. Implications for the current cycle, No. 21, are also given.
An Asymmetrical Space Vector Method for Single Phase Induction Motor
DEFF Research Database (Denmark)
Cui, Yuanhai; Blaabjerg, Frede; Andersen, Gert Karmisholt
2002-01-01
the motor torque performance is not good enough. This paper addresses a new control method, an asymmetrical space vector method with PWM modulation, in which a three-phase inverter is used for the main winding and the auxiliary winding. This method with PWM modulation is implemented to control the motor speed...
Unsupervised parsing of gaze data with a beta-process vector auto-regressive hidden Markov model.
Houpt, Joseph W; Frame, Mary E; Blaha, Leslie M
2017-10-26
The first stage of analyzing eye-tracking data is commonly to code the data into sequences of fixations and saccades. This process is usually automated using simple, predetermined rules for classifying ranges of the time series into events, such as "if the dispersion of gaze samples is lower than a particular threshold, then code as a fixation; otherwise code as a saccade." More recent approaches incorporate additional eye-movement categories in automated parsing algorithms by using time-varying, data-driven thresholds. We describe an alternative approach using the beta-process vector auto-regressive hidden Markov model (BP-AR-HMM). The BP-AR-HMM offers two main advantages over existing frameworks. First, it provides a statistical model for eye-movement classification rather than a single estimate. Second, the BP-AR-HMM uses a latent process to model the number and nature of the types of eye movements and hence is not constrained to predetermined categories. We applied the BP-AR-HMM both to high-sampling-rate gaze data from Andersson et al. (Behavior Research Methods 49(2), 1-22, 2016) and to low-sampling-rate data from the DIEM project (Mital et al., Cognitive Computation 3(1), 5-24, 2011). Driven by the data properties, the BP-AR-HMM identified over five categories of movements, some of which clearly mapped onto fixations and saccades, while others potentially captured post-saccadic oscillations, smooth pursuit, and various recording errors. The BP-AR-HMM serves as an effective algorithm for data-driven event parsing alone or as an initial step in exploring the characteristics of gaze data sets.
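The simple dispersion rule quoted in this abstract can be sketched directly; this is the predetermined baseline rule the paper improves upon, not the BP-AR-HMM, and the window size and threshold are illustrative choices:

```python
def code_events(xs, ys, window=5, max_dispersion=1.0):
    """Label each gaze sample 'fixation' or 'saccade' with a sliding-window
    dispersion rule (x-range plus y-range within the window)."""
    labels = []
    for i in range(len(xs)):
        lo, hi = max(0, i - window // 2), min(len(xs), i + window // 2 + 1)
        dispersion = (max(xs[lo:hi]) - min(xs[lo:hi])) + (max(ys[lo:hi]) - min(ys[lo:hi]))
        labels.append("fixation" if dispersion < max_dispersion else "saccade")
    return labels

# A stationary gaze followed by a jump to a new location:
labels = code_events([0.0] * 10 + [10.0] * 10, [0.0] * 20)
```

Samples whose window straddles the jump are coded as a saccade; the stable stretches on either side come out as fixations.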
Application of support vector regression (SVR) for stream flow prediction on the Amazon basin
CSIR Research Space (South Africa)
Du Toit, Melise
2016-10-01
Full Text Available regression technique is used in this study to analyse historical stream flow occurrences and predict stream flow values for the Amazon basin. Up to twelve month predictions are made and the coefficient of determination and root-mean-square error are used...
Directory of Open Access Journals (Sweden)
Rachid Darnag
2017-02-01
Full Text Available Support vector machines (SVM) represent one of the most promising machine learning (ML) tools that can be applied to develop predictive quantitative structure–activity relationship (QSAR) models using molecular descriptors. Multiple linear regression (MLR) and artificial neural networks (ANNs) were also utilized to construct quantitative linear and nonlinear models to compare with the results obtained by SVM. The prediction results are in good agreement with the experimental values of HIV activity; the results also reveal the superiority of SVM over the MLR and ANN models. The contribution of each descriptor to the structure–activity relationships was evaluated.
Directory of Open Access Journals (Sweden)
ANDRÉS M. ÁLVAREZ MEZA
2012-01-01
Full Text Available ABSTRACT: In this work, a methodology is proposed for the automatic selection of the free parameters of the least squares support vector machine (LS-SVM) regression technique, based on a multidimensional generalized cross-validation analysis of the LS-SVM set of linear equations. The developed technique does not require prior knowledge from the user about the influence of the free parameters on the results. Experiments are carried out on two artificial datasets and two real-world datasets. According to the results obtained, it is concluded that the developed algorithm computes appropriate regressions with competitive relative errors.
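LS-SVM regression reduces fitting to a single set of linear equations, which is what makes the cross-validation analysis described above tractable. A minimal sketch of that linear system with an RBF kernel follows; γ and σ are the free parameters the paper's method would select automatically, fixed by hand here:

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Pairwise squared distances, then the Gaussian kernel.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the LS-SVM system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[1:], sol[0]          # alpha, b

def lssvm_predict(Xtr, alpha, b, Xte, sigma):
    return rbf_kernel(Xte, Xtr, sigma) @ alpha + b

# Smoke test: fit a noiseless sine curve and check the training-set error.
X = np.linspace(0, 6, 30).reshape(-1, 1)
y = np.sin(X).ravel()
alpha, b = lssvm_fit(X, y, gamma=100.0, sigma=1.0)
fit_error = np.max(np.abs(lssvm_predict(X, alpha, b, X, sigma=1.0) - y))
```

Because the whole model is one linear solve, leave-one-out style cross-validation quantities can be derived algebraically from this system, which is the property the paper's selection method exploits.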
Directory of Open Access Journals (Sweden)
Naradasu Kumar Ravi
2013-01-01
Full Text Available Diesel engine designers are constantly on the look-out for performance enhancement through efficient control of operating parameters. In this paper, the concept of an intelligent engine control system is proposed that seeks to ensure optimized performance under varying operating conditions. The concept is based on arriving at the optimum engine operating parameters to ensure the desired output in terms of efficiency. In addition, a Support Vector Machines based prediction model has been developed to predict the engine performance under varying operating conditions. Experiments were carried out at varying loads, compression ratios and amounts of exhaust gas recirculation using a variable compression ratio diesel engine for data acquisition. It was observed that the SVM model was able to predict the engine performance accurately.
Robust Logistic and Probit Methods for Binary and Multinomial Regression.
Tabatabai, M A; Li, H; Eby, W M; Kengwoung-Keumo, J J; Manne, U; Bae, S; Fouad, M; Singh, K P
In this paper we introduce new robust estimators for logistic and probit regression for binary, multinomial, nominal and ordinal data and apply these models to estimate the parameters when outliers or influential observations are present. Maximum likelihood estimates do not behave well when outliers or influential observations are present. One remedy is to remove influential observations from the data and then apply the maximum likelihood technique to the deleted data. Another approach is to employ a robust technique that can handle outliers and influential observations without removing any observations from the data sets. The robustness of the method is tested using real and simulated data sets.
DEFF Research Database (Denmark)
Boeriis, Morten; van Leeuwen, Theo
2017-01-01
This article revisits the concept of vectors, which, in Kress and van Leeuwen’s Reading Images (2006), plays a crucial role in distinguishing between ‘narrative’, action-oriented processes and ‘conceptual’, state-oriented processes. The use of this concept in image analysis has usually focused on the most salient vectors, and this works well, but many images contain a plethora of vectors, which makes their structure quite different from the linguistic transitivity structures with which Kress and van Leeuwen have compared ‘narrative’ images. It can also be asked whether facial expression vectors should be taken into account in discussing ‘reactions’, which Kress and van Leeuwen link only to eyeline vectors. Finally, the question can be raised as to whether actions are always realized by vectors. Drawing on a re-reading of Rudolf Arnheim’s account of vectors, these issues are outlined...
The Matrix Element Method and Vector-Like Quark Searches
Morrison, Benjamin
2016-01-01
In my time at the CERN summer student program, I worked on applying the matrix element method to vector-like quark identification. I worked in the ATLAS University of Geneva group under Dr. Olaf Nackenhorst. I developed automated plotting tools with ROOT, a script for implementing and optimizing generated matrix element calculation code, and kinematic transforms for the matrix element method.
Directory of Open Access Journals (Sweden)
Yingguo Cheng
2011-01-01
Full Text Available This paper describes the design and implementation of a wireless electronic nose (WEN) system which can online detect the combustible gases methane and hydrogen (CH4/H2) and estimate their concentrations, either singly or in mixtures. The system is composed of two wireless sensor nodes: a slave node and a master node. The former comprises a Fe2O3 gas sensing array for combustible gas detection, a digital signal processor (DSP) system for real-time sampling and processing of the sensor array data, and a wireless transceiver unit (WTU) by which the detection results can be transmitted to the master node connected to a computer. A type of Fe2O3 gas sensor insensitive to humidity is developed for resistance to environmental influences. A threshold-based least squares support vector regression (LS-SVR) estimator is implemented on a DSP for classification and concentration measurements. Experimental results confirm that LS-SVR produces higher accuracy compared with artificial neural networks (ANNs) and a faster convergence rate than the standard support vector regression (SVR). The designed WEN system effectively achieves gas mixture analysis in a real-time process.
Song, Kai; Wang, Qi; Liu, Qi; Zhang, Hongquan; Cheng, Yingguo
2011-01-01
This paper describes the design and implementation of a wireless electronic nose (WEN) system which can online detect the combustible gases methane and hydrogen (CH(4)/H(2)) and estimate their concentrations, either singly or in mixtures. The system is composed of two wireless sensor nodes: a slave node and a master node. The former comprises a Fe(2)O(3) gas sensing array for the combustible gas detection, a digital signal processor (DSP) system for real-time sampling and processing the sensor array data and a wireless transceiver unit (WTU) by which the detection results can be transmitted to the master node connected with a computer. A type of Fe(2)O(3) gas sensor insensitive to humidity is developed for resistance to environmental influences. A threshold-based least square support vector regression (LS-SVR) estimator is implemented on a DSP for classification and concentration measurements. Experimental results confirm that LS-SVR produces higher accuracy compared with artificial neural networks (ANNs) and a faster convergence rate than the standard support vector regression (SVR). The designed WEN system effectively achieves gas mixture analysis in a real-time process.
DEFF Research Database (Denmark)
Sharifzadeh, Sara; Skytte, Jacob Lercke; Nielsen, Otto Højager Attermann
2012-01-01
Statistical solutions find widespread use in food and medicine quality control. We investigate the effect of different regression and sparse regression methods for a viscosity estimation problem using the spectro-temporal features from a new Sub-Surface Laser Scattering (SLS) vision system. From ... with the sparse LAR, lasso and Elastic Net (EN) sparse regression methods. Due to the inconsistent measurement conditions, Locally Weighted Scatterplot Smoothing (Loess) has been employed to alleviate the undesired variation in the estimated viscosity. The experimental results of applying the different methods show that the sparse regression lasso outperforms the other methods. In addition, the use of local smoothing has improved the results considerably for all regression methods. Due to the sparsity of lasso, this result would assist in designing a simpler vision system with fewer spectral bands...
Polygraph Test Results Assessment by Regression Analysis Methods
Directory of Open Access Journals (Sweden)
K. A. Leontiev
2014-01-01
Full Text Available The paper considers the problem of determining the importance of the questions asked of an examinee under judicial psychophysiological polygraph examination, using methods of mathematical statistics. It offers a classification algorithm based on logistic regression as an optimal Bayesian classifier, with weight coefficients of information for the polygraph-recorded physiological parameters and no requirement that the measured features be independent. Binary classification is performed on the results of polygraph examination after preliminary normalization and standardization of the primary results, a check of the hypothesis that the obtained data are normally distributed, and calculation of the linear regression coefficients between input values and responses by the method of maximum likelihood. The logistic curve then divides the features into two classes: "significant" and "insignificant". The efficiency of the model is estimated by means of ROC (Receiver Operating Characteristic) analysis. It is shown that the necessary minimum sample has to contain the results of at least 45 measurements. This approach ensures a reliable result provided that the expert polygraphologist possesses sufficient qualification and follows the testing techniques.
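The pipeline described above, maximum-likelihood logistic regression followed by ROC analysis, can be sketched with synthetic data; the three "channels" below are random stand-ins, not polygraph recordings:

```python
import numpy as np

rng = np.random.default_rng(1)
# 45 synthetic "measurements" (the minimum sample size cited above) with
# three standardized channels; label 1 marks a "significant" question.
n = 45
X = rng.normal(size=(n, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = (1 / (1 + np.exp(-(X @ w_true))) > rng.uniform(size=n)).astype(float)

# Maximum-likelihood logistic regression via gradient ascent on the log-likelihood.
w = np.zeros(3)
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w)))
    w += 0.1 * X.T @ (y - p) / n

# ROC AUC as the probability that a random positive outscores a random negative.
scores = X @ w
pos, neg = scores[y == 1], scores[y == 0]
auc = (pos[:, None] > neg[None, :]).mean()
```

An AUC near 1.0 means the fitted score separates the two classes well, which is exactly the quantity a full ROC analysis would report.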
Directory of Open Access Journals (Sweden)
Giuliano de Oliveira Freitas
2013-10-01
Full Text Available PURPOSE: To determine linear regression models between Alpins descriptive indices and Thibos astigmatic power vectors (APV), assessing the validity and strength of such correlations. METHODS: This case series prospectively assessed 62 eyes of 31 consecutive cataract patients with preoperative corneal astigmatism between 0.75 and 2.50 diopters in both eyes. Patients were randomly assorted between two phacoemulsification groups: one assigned to receive an AcrySof® Toric intraocular lens (IOL) in both eyes and another assigned to have an AcrySof Natural IOL associated with limbal relaxing incisions, also in both eyes. All patients were reevaluated postoperatively at 6 months, when refractive astigmatism analysis was performed using both the Alpins and Thibos methods. The ratio between Thibos postoperative APV and preoperative APV (APVratio) and its linear regression to the Alpins percentage of success of astigmatic surgery, percentage of astigmatism corrected and percentage of astigmatism reduction at the intended axis were assessed. RESULTS: A significant negative correlation between the ratio of post- and preoperative Thibos APV (APVratio) and the Alpins percentage of success (%Success) was found (Spearman's ρ = -0.93); the linear regression is given by the following equation: %Success = (-APVratio + 1.00) × 100. CONCLUSION: The linear regression we found between the APVratio and %Success permits a validated mathematical inference concerning the overall success of astigmatic surgery.
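The Thibos power-vector decomposition and the reported %Success relation can be written out directly; the M, J0 and J45 formulas are the standard power-vector definitions, and the %Success equation is the regression reported in this record:

```python
import math

def power_vector(sphere, cyl, axis_deg):
    """Standard Thibos decomposition of a sphero-cylindrical refraction
    into (M, J0, J45) components."""
    a = math.radians(axis_deg)
    M = sphere + cyl / 2
    J0 = -(cyl / 2) * math.cos(2 * a)
    J45 = -(cyl / 2) * math.sin(2 * a)
    return M, J0, J45

def apv(cyl, axis_deg):
    """Magnitude of the astigmatic part of the power vector."""
    _, J0, J45 = power_vector(0.0, cyl, axis_deg)
    return math.hypot(J0, J45)

def percent_success(apv_post, apv_pre):
    """Linear relation reported above: %Success = (-APVratio + 1.00) x 100."""
    return (-apv_post / apv_pre + 1.0) * 100.0
```

Note that the astigmatic magnitude works out to |cyl|/2 regardless of axis, so full correction (postoperative APV of zero) gives 100% success and an unchanged APV gives 0%, matching the regression's endpoints.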
A vector matching method for analysing logic Petri nets
Du, YuYue; Qi, Liang; Zhou, MengChu
2011-11-01
Batch processing function and passing value indeterminacy in cooperative systems can be described and analysed by logic Petri nets (LPNs). To directly analyse the properties of LPNs, the concept of transition enabling vector sets is presented and a vector matching method used to judge the enabled transitions is proposed in this article. The incidence matrix of LPNs is defined; an equation describing the marking change due to a transition's firing is given; and a reachable tree is constructed. The state space explosion is mitigated to a certain extent by directly analysing LPNs. Finally, the validity and reliability of the proposed method are illustrated by an example in electronic commerce.
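The state equation behind this kind of reachability analysis, where a new marking equals the old marking plus the incidence-matrix column of the fired transition, can be sketched on an ordinary toy net; logic Petri nets add enabling conditions over logic expressions that this sketch does not model:

```python
import numpy as np

# Incidence matrix C (rows: places p0..p2, columns: transitions t0, t1):
# t0 moves a token p0 -> p1, t1 moves it p1 -> p2.
C = np.array([[-1,  0],
              [ 1, -1],
              [ 0,  1]])
M0 = np.array([1, 0, 0])

def fire(M, C, t):
    """State equation M' = M + C[:, t]. The nonnegativity check stands in
    for the enabling test (adequate here because the net has no self-loops)."""
    M_next = M + C[:, t]
    if (M_next < 0).any():
        raise ValueError("transition not enabled")
    return M_next

M1 = fire(M0, C, 0)   # token moves to p1
M2 = fire(M1, C, 1)   # token moves to p2
```

A reachability tree is built by applying `fire` to every enabled transition at each marking and recursing on the new markings.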
Dimension Reduction and Discretization in Stochastic Problems by Regression Method
DEFF Research Database (Denmark)
Ditlevsen, Ove Dalager
1996-01-01
The chapter mainly deals with dimension reduction and field discretizations based directly on the concept of linear regression. Several examples of interesting applications in stochastic mechanics are also given. Keywords: Random fields discretization, Linear regression, Stochastic interpolation...
Spatial modelling of population concentration using geographically weighted regression method
Directory of Open Access Journals (Sweden)
Bajat Branislav
2011-01-01
Full Text Available This paper presents possibilities of applying the geographically weighted regression method in mapping population change index. During the last decade, this contemporary spatial modeling method has been increasingly used in geographical analyses. On the example of the researched region of Timočka Krajina (defined for the needs of elaborating the Regional Spatial Plan, the possibilities for applying this method in disaggregation of traditional models of population density, which are created using the choropleth maps at the level of statistical spatial units, are shown. The applied method is based on the use of ancillary spatial predictors which are in correlation with a targeted variable, the population change index. For this purpose, spatial databases have been used such as digital terrain model, distances from the network of I and II category state roads, as well as soil sealing databases. Spatial model has been developed in the GIS software environment using commercial GIS applications, as well as open source GIS software. Population change indexes for the period 1961-2002 have been mapped based on population census data, while the data on planned population forecast have been used for the period 2002-2027.
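At its core, GWR fits a separate weighted least-squares regression at each location, with weights decaying with distance. A minimal sketch follows (Gaussian kernel, one predictor); the bandwidth and the drifting-slope toy data are illustrative, not the Timočka Krajina data:

```python
import numpy as np

def gwr_predict(coords, x, y, coords0, x0, bandwidth=1.0):
    """Fit a weighted least-squares line at location coords0 (Gaussian
    kernel weights by distance) and predict the response at x0."""
    d = np.linalg.norm(coords - coords0, axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    A = np.column_stack([np.ones(len(x)), x])
    beta = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * y))
    return beta[0] + beta[1] * x0

# Toy data: the local slope of y on x drifts from ~1.0 in the "west"
# to ~2.8 in the "east", so the fitted relation varies over space.
coords = np.array([[float(i), 0.0] for i in range(10)])
x = np.arange(10, dtype=float)
y = x + x**2 / 10
west = gwr_predict(coords, x, y, np.array([0.0, 0.0]), 1.0)
east = gwr_predict(coords, x, y, np.array([9.0, 0.0]), 9.0)
```

A global regression would average the two regimes away; the locally weighted fits recover the spatially varying relationship, which is what makes GWR useful for disaggregating choropleth-level models.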
Assessment of School Merit with Multiple Regression: Methods and Critique.
Tate, Richard L.
1986-01-01
Regression-based adjustment of student outcomes for the assessment of the merit of schools is considered. First, the basics of causal modeling and multiple regression are briefly reviewed. Then, two common regression-based adjustment procedures are described, pointing out that the validity of the final assessments depends on: (1) the degree to…
A novel stepwise support vector machine (SVM) method based on ...
African Journals Online (AJOL)
ajl yemi
2011-11-23
began to use computational approaches, particularly machine learning methods, to identify pre-miRNAs (Xue et al., 2005; Huang et al., 2007; Jiang et al., 2007). Xue et al. (2005) presented a support vector machine (SVM)-based classifier called triplet-SVM, which classifies human pre-miRNAs from pseudo ...
A New Method for Estimation of Velocity Vectors
DEFF Research Database (Denmark)
Jensen, Jørgen Arendt; Munk, Peter
1998-01-01
The paper describes a new method for determining the velocity vector of a remotely sensed object using either sound or electromagnetic radiation. The movement of the object is determined from a field with spatial oscillations in both the axial direction of the transducer and in one or two...
Accurate motion parameter estimation for colonoscopy tracking using a regression method
Liu, Jianfei; Subramanian, Kalpathi R.; Yoo, Terry S.
2010-03-01
Co-located optical and virtual colonoscopy images have the potential to provide important clinical information during routine colonoscopy procedures. In our earlier work, we presented an optical flow based algorithm to compute egomotion from live colonoscopy video, permitting navigation and visualization of the corresponding patient anatomy. In the original algorithm, motion parameters were estimated using the traditional Least Sum of Squares (LS) procedure, which can be unstable in the context of optical flow vectors with large errors. In the improved algorithm, we use the Least Median of Squares (LMS) method, a robust regression method, for motion parameter estimation. Using the LMS method, we iteratively analyze and converge toward the main distribution of the flow vectors, while disregarding outliers. We show through three experiments the improvement in tracking results obtained using the LMS method, in comparison to the LS estimator. The first experiment demonstrates better spatial accuracy in positioning the virtual camera in the sigmoid colon. The second and third experiments demonstrate the robustness of this estimator, resulting in longer tracked sequences: from 300 to 1310 in the ascending colon, and 410 to 1316 in the transverse colon.
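The LMS idea, sample candidate fits and keep the one minimizing the median residual so that outliers cannot dominate, can be sketched for a 2D line; the actual tracker estimates egomotion parameters from flow vectors, which this toy example does not reproduce:

```python
import random

def lms_line(points, trials=500, seed=0):
    """Least-median-of-squares line: try random 2-point candidate lines and
    keep the one with the smallest median squared residual."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical candidate, skip
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        res = sorted((y - (slope * x + intercept)) ** 2 for x, y in points)
        med = res[len(res) // 2]
        if best is None or med < best[0]:
            best = (med, slope, intercept)
    return best[1], best[2]

# 20 inliers on y = 2x + 1 plus three gross outliers:
pts = [(float(i), 2.0 * i + 1.0) for i in range(20)] + [(5.0, 100.0), (10.0, -50.0), (15.0, 80.0)]
a, b = lms_line(pts)
```

Unlike least squares, whose estimate is dragged arbitrarily far by a single bad flow vector, the median residual is unaffected as long as fewer than half the points are outliers.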
Yan, Jun; Huang, Jian-Hua; He, Min; Lu, Hong-Bing; Yang, Rui; Kong, Bo; Xu, Qing-Song; Liang, Yi-Zeng
2013-08-01
Retention indices for frequently reported compounds of plant essential oils on three different stationary phases were investigated. Multivariate linear regression, partial least squares, and support vector machine combined with a new variable selection approach called random-frog, recently proposed by our group, were employed to model quantitative structure-retention relationships. Internal and external validations were performed to ensure the stability and predictive ability. All three methods could obtain an acceptable model, with the optimal results given by the support vector machine based on a small number of informative descriptors, with squared cross-validation correlation coefficients of 0.9726, 0.9759, and 0.9331 on the dimethylsilicone stationary phase, the dimethylsilicone phase with 5% phenyl groups, and the PEG stationary phase, respectively. The performances of two variable selection approaches, random-frog and genetic algorithm, are compared. The importance of the variables was found to be consistent when estimated from correlation coefficients in multivariate linear regression equations and from selection probability in model spaces. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Solgi, Abazar; Pourhaghi, Amir; Bahmani, Ramin; Zarei, Heidar
2017-07-01
An accurate estimation of flow using different models is an issue for water resource researchers. In this study, support vector regression (SVR) and gene expression programming (GEP) models at daily and monthly scales were used to simulate the Gamasiyab River flow in Nahavand, Iran. The results showed that although the performance of the models at the daily scale was acceptable and the result of the SVR model was slightly better, their performance at the daily scale was clearly better than at the monthly scale. Therefore, wavelet transform was used and the main signal of every input was decomposed. Then, by using the principal component analysis method, important sub-signals were recognized and used as inputs for the SVR and GEP models to produce wavelet-support vector regression (WSVR) and wavelet-gene expression programming. The results showed that the performance of WSVR was better than that of SVR, in such a way that combining SVR with the wavelet could improve the determination coefficient of the model by up to 3% and 18% for the daily and monthly scales, respectively. Overall, it can be said that the combination of the wavelet with SVR is a suitable tool for the prediction of Gamasiyab River flow at both daily and monthly scales.
2014-01-01
Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue of operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses the ICA to extract hidden information from the observed sales data. The extracted features are then applied to the K-means algorithm for clustering the sales data into several disjoint clusters. Finally, the SVR forecasting models are applied to each group to generate final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting. PMID:25045738
Directory of Open Access Journals (Sweden)
Chi-Jie Lu
2014-01-01
Full Text Available Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue of operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses the ICA to extract hidden information from the observed sales data. The extracted features are then applied to the K-means algorithm for clustering the sales data into several disjoint clusters. Finally, the SVR forecasting models are applied to each group to generate the final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting.
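The clustering stage of this hybrid scheme can be sketched with a plain k-means loop; the ICA feature extraction and the per-cluster SVR fits are omitted, and the deterministic spread-out initialization is a simplification of the usual random seeding:

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain k-means with a deterministic spread-out initialization."""
    idx = np.linspace(0, len(X) - 1, k).astype(int)
    centers = X[idx].astype(float)
    for _ in range(iters):
        # Assign each point to its nearest center, then recompute centers.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Two well-separated synthetic "sales pattern" blobs:
rng = np.random.default_rng(4)
X = np.vstack([np.zeros((10, 2)), 10.0 * np.ones((10, 2))]) + 0.1 * rng.normal(size=(20, 2))
labels, centers = kmeans(X, 2)
```

In the full scheme, each resulting cluster would get its own SVR model, so that structurally similar sales series are forecast together.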
Energy Technology Data Exchange (ETDEWEB)
Feng Wu; Hao Zhou; Tao Ren; Ligang Zheng; Kefa Cen [Zhejiang University, Hangzhou (China). State Key Laboratory of Clean Energy Utilization
2009-10-15
Support vector regression (SVR) was employed to establish mathematical models for the NOx emissions and carbon burnout of a 300 MW coal-fired utility boiler. Combined with the SVR models, the cellular genetic algorithm for multi-objective optimization (MOCell) was used for multi-objective optimization of the boiler combustion. A comparison between MOCell and the improved non-dominated sorting genetic algorithm (NSGA-II) shows that MOCell has superior performance to NSGA-II on this problem. Field experiments were carried out to verify the accuracy of the results obtained by MOCell, and the results were in good agreement with the measurement data. The proposed approach provides an effective tool for multi-objective optimization of coal combustion performance, whose feasibility and validity are experimentally validated. A time period of less than 4 s was required for a run of optimization on a PC system, which is suitable for online application. 19 refs., 8 figs., 2 tabs.
Oguntunde, Philip G; Lischeid, Gunnar; Dietrich, Ottfried
2017-10-14
This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used were for a period of 36 years between 1980 and 2015. Similar to the observed decrease (P < 0.05) in rice yield, pan evaporation, solar radiation, and wind speed declined significantly. Eight principal components exhibited an eigenvalue > 1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of the PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide more in-depth regional-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations as well as providing information that may help optimized planting dates for improved radiation use efficiency in the study area.
Oguntunde, Philip G.; Lischeid, Gunnar; Dietrich, Ottfried
2017-10-01
This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used were for a period of 36 years between 1980 and 2015. Similar to the observed decrease (P < 0.05) in rice yield, pan evaporation, solar radiation, and wind speed declined significantly. Eight principal components exhibited an eigenvalue > 1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of the PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide more in-depth regional-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations as well as providing information that may help optimized planting dates for improved radiation use efficiency in the study area.
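The PCA-plus-regression analysis used above can be sketched with synthetic data, a shared latent "climate signal" driving both the predictors and the yield; the numbers are placeholders, not the Nigerian records:

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic stand-in for 36 years of standardized climate predictors
# (rainfall, solar radiation, wind speed, ...); not the study's data.
n, p = 36, 6
latent = rng.normal(size=n)                          # shared climate signal
X = np.outer(latent, rng.normal(size=p)) + 0.3 * rng.normal(size=(n, p))
yield_t = 2.0 * latent + 0.2 * rng.normal(size=n)    # yield tracks the signal

# PCA via SVD on the centered predictor matrix.
Xc = X - X.mean(0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / (S**2).sum()   # variance share of each component
pc1 = Xc @ Vt[0]                  # scores of the first principal component

# Simple regression of yield on the PC1 scores, with its R^2.
beta = np.polyfit(pc1, yield_t, 1)
r2 = np.corrcoef(pc1, yield_t)[0, 1] ** 2
```

Because the predictors share one dominant signal, PC1 captures most of their variance, and regressing yield on the PC1 scores recovers the coupling, the same structure the study reports between its first component and rice yield.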
Zongyuan Cai; Zhi Wang; Aixia Yan
2008-01-01
QSAR (Quantitative Structure-Activity Relationship) models for the prediction of human intestinal absorption (HIA) were built with molecular descriptors calculated by ADRIANA.Code, Cerius2 and a combination of them. A dataset of 552 compounds covering a wide range of current drugs with experimental HIA values was investigated. A genetic algorithm feature selection method was applied to select proper descriptors. A Kohonen self-organizing neural network (KohNN) map was used to split the who...
Wang, Guochen; Wang, Qiuying; Zhao, Bo; Wang, Zhenpeng
2016-02-10
Aiming to improve the bias stability of the fiber optic gyroscope (FOG) in an environment with changing ambient temperature, a temperature-compensation method based on the relevance vector machine (RVM) under a Bayesian framework is proposed and applied. Compared with other temperature models, such as quadratic polynomial regression, neural networks, and the support vector machine, the proposed RVM method possesses higher accuracy in explaining the temperature dependence of the FOG gyro bias. Experimental results indicate that, with the proposed RVM method, the bias drift of an FOG can be markedly reduced over the whole temperature range from -40°C to 60°C. Therefore, the proposed method can effectively improve the adaptability of the FOG in a changing temperature environment.
Methods of treating Parkinson's disease using viral vectors
Energy Technology Data Exchange (ETDEWEB)
Bankiewicz, Krys; Cunningham, Janet
2012-11-13
Methods of delivering viral vectors, particularly recombinant AAV virions, to the central nervous system (CNS) are provided for the treatment of CNS disorders, particularly those disorders which involve the neurotransmitter dopamine. The methods entail providing rAAV virions that comprise a transgene encoding aromatic amino acid decarboxylase (AADC) and administering the virions to the brain of a mammal using a non-manual pump.
Methods of Detecting Outliers in A Regression Analysis Model ...
African Journals Online (AJOL)
PROF. O. E. OSUAGWU
2013-06-01
This study detects outliers in univariate and bivariate data by using both Rosner's and Grubbs' tests in a regression analysis model. The study shows how an observation that causes the least squares point estimate of a regression model to be substantially different from what it would be if the observation were removed from the data ...
Zhu, Dazhou; Ji, Baoping; Meng, Chaoying; Shi, Bolin; Tu, Zhenhua; Qing, Zhaoshen
2007-08-29
The nu-support vector regression (nu-SVR) was used to construct a calibration model between the soluble solids content (SSC) of apples and acousto-optic tunable filter near-infrared (AOTF-NIR) spectra. The performance of nu-SVR was compared with partial least squares regression (PLSR) and back-propagation artificial neural networks (BP-ANN). The influence of the SVR parameters on the predictive ability of the model was investigated. The results indicated that the parameter nu had a rather wide optimal region (between 0.35 and 1 for the apple data). Therefore, the value of nu can be fixed beforehand, allowing the selection effort to focus on the other SVR parameters. For analyzing the SSC of apples, nu-SVR was superior to PLSR and BP-ANN, especially with fewer samples and when treating noise-polluted spectra. Proper spectral pretreatment methods, such as scaling, mean centering, standard normal variate (SNV), and wavelength selection methods (stepwise multiple linear regression and a genetic algorithm with PLS as its objective function), could greatly improve the quality of the nu-SVR model.
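The reported insensitivity to nu can be illustrated with scikit-learn's `NuSVR`. The sketch below is hypothetical: random data stands in for the AOTF-NIR spectra and SSC values, and the model and parameter choices are illustrative, not the paper's calibration.

```python
import numpy as np
from sklearn.svm import NuSVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# Synthetic stand-in for NIR spectra: 200 samples x 10 wavelengths,
# with SSC a function of a few latent absorption bands plus noise.
X = rng.normal(size=(200, 10))
ssc = 12.0 + 2.0 * X[:, 2] - 1.5 * X[:, 5] + 0.5 * X[:, 8] \
      + rng.normal(scale=0.2, size=200)

X_tr, X_te = X[:150], X[150:]
y_tr, y_te = ssc[:150], ssc[150:]

# The abstract reports a wide optimal region for nu (~0.35 to 1),
# so nu can be fixed in advance while C and gamma are tuned.
scores = {}
for nu in (0.35, 0.5, 0.75, 1.0):
    model = make_pipeline(StandardScaler(), NuSVR(nu=nu, C=10.0, gamma="scale"))
    model.fit(X_tr, y_tr)
    scores[nu] = model.score(X_te, y_te)   # test-set R^2
```

On data like this, the test R^2 stays in a similar range across the whole nu interval, consistent with the paper's observation.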
Natural interpretations in Tobit regression models using marginal estimation methods.
Wang, Wei; Griswold, Michael E
2015-09-01
The Tobit model, also known as a censored regression model, accounts for left- and/or right-censoring in the dependent variable and has been used in many areas of application, including dental health, medical research, and economics. The reported Tobit model coefficient allows estimation and inference of an exposure effect on the latent dependent variable. However, this model does not directly provide overall exposure effect estimates on the original outcome scale. We propose a direct-marginalization approach using a reparameterized link function to model exposure and covariate effects directly on the truncated dependent variable mean. We also discuss an alternative average-predicted-value, post-estimation approach which uses model-predicted values for each person in a designated reference group under different exposure statuses to estimate covariate-adjusted overall exposure effects. Simulation studies were conducted to show the unbiasedness and robustness properties of both approaches under various scenarios. Robustness appears to diminish when covariates with substantial effects are imbalanced between exposure groups; we outline an approach for model choice based on information-criterion fit statistics. The methods are applied to the Genetic Epidemiology Network of Arteriopathy (GENOA) cohort study to assess associations between obesity and cognitive function in the non-Hispanic white participants.
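The gap between the latent-scale coefficient and the original-scale mean can be made concrete with the closed-form marginal mean of a left-censored-at-zero Tobit outcome, E[max(y*, 0)] = Φ(μ/σ)μ + σφ(μ/σ). A minimal numpy/scipy sketch with hypothetical parameter values (not the paper's estimator):

```python
import numpy as np
from scipy.stats import norm

def tobit_marginal_mean(xb, sigma):
    """E[max(y*, 0)] for latent y* ~ N(xb, sigma^2): the mean of a
    left-censored-at-zero Tobit outcome on the original scale."""
    z = xb / sigma
    return norm.cdf(z) * xb + sigma * norm.pdf(z)

# Check the closed form against simulation for one covariate pattern.
rng = np.random.default_rng(1)
xb, sigma = 0.7, 1.3               # hypothetical linear predictor and scale
draws = np.maximum(rng.normal(xb, sigma, size=200_000), 0.0)
analytic = tobit_marginal_mean(xb, sigma)
```

Note that the original-scale mean exceeds the latent mean xb whenever censoring is active, which is exactly why the latent-scale coefficient alone does not give the overall exposure effect.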
Naguib, Ibrahim A.; Darwish, Hany W.
2012-02-01
A comparison between support vector regression (SVR) and artificial neural network (ANN) multivariate regression methods is established, showing the underlying algorithm for each and comparing them to indicate their inherent advantages and limitations. In this paper we compare SVR to ANN with and without a variable selection procedure (genetic algorithm (GA)). To project the comparison in a sensible way, the methods are used for the stability-indicating quantitative analysis of mixtures of mebeverine hydrochloride and sulpiride in binary mixtures as a case study, in the presence of their reported impurities and degradation products (summing up to six components) in raw materials and pharmaceutical dosage form, via handling the UV spectral data. For proper analysis, a six-factor, five-level experimental design was established, resulting in a training set of 25 mixtures containing different ratios of the interfering species. An independent test set consisting of five mixtures was used to validate the prediction ability of the suggested models. The proposed methods (linear SVR (without GA) and linear GA-ANN) were successfully applied to the analysis of pharmaceutical tablets containing mebeverine hydrochloride and sulpiride mixtures. The results manifest the problem of nonlinearity and how models like SVR and ANN can handle it. The methods demonstrate the ability of the mentioned multivariate calibration models to deconvolute the highly overlapped UV spectra of the six-component mixtures, while using cheap and easy-to-handle instruments like the UV spectrophotometer.
Neural cell image segmentation method based on support vector machine
Niu, Shiwei; Ren, Kan
2015-10-01
In the analysis of neural cell images acquired by optical microscope, accurate and rapid segmentation is the foundation of a nerve cell detection system. In this paper, a modified image segmentation method based on the Support Vector Machine (SVM) is proposed to reduce the adverse effects of low contrast between objects and background and the interference of adherent, clustered cells. Firstly, morphological filtering and Otsu's method are applied to preprocess the images and extract the neural cells roughly. Secondly, the Stellate Vector, circularity, and Histogram of Oriented Gradients (HOG) features are computed to train the SVM model. Finally, the incremental-learning SVM classifier is used to classify the preprocessed images, and the initial recognition areas identified by the SVM classifier are added to the library as positive samples for training the SVM model. Experimental results show that the proposed algorithm achieves much better segmentation results than classic segmentation algorithms.
Classification Method in Integrated Information Network Using Vector Image Comparison
Directory of Open Access Journals (Sweden)
Zhou Yuan
2014-05-01
Full Text Available A Wireless Integrated Information Network (WMN) gathers data from its surroundings, such as images and voice. Transmitting this information requires substantial resources, which decreases the service time of the network. In this paper we present a Classification Approach based on Vector Image Comparison (VIC) for WMNs that improves the service time of the network. Methods for sub-region selection and conversion are also proposed.
Seshan, Hari; Goyal, Manish K; Falk, Michael W; Wuertz, Stefan
2014-04-15
The relationship between microbial community structure and function has been examined in detail in natural and engineered environments, but little work has been done on using microbial community information to predict function. We processed microbial community and operational data from controlled experiments with bench-scale bioreactor systems to predict reactor process performance. Four membrane-operated sequencing batch reactors treating synthetic wastewater were operated in two experiments to test the effects of (i) the toxic compound 3-chloroaniline (3-CA) and (ii) bioaugmentation targeting 3-CA degradation, on the sludge microbial community in the reactors. In the first experiment, two reactors were treated with 3-CA and two reactors were operated as controls without 3-CA input. In the second experiment, all four reactors were additionally bioaugmented with a Pseudomonas putida strain carrying a plasmid with a portion of the pathway for 3-CA degradation. Molecular data were generated from terminal restriction fragment length polymorphism (T-RFLP) analysis targeting the 16S rRNA and amoA genes from the sludge community. The resulting terminal restriction fragment (T-RF) electropherograms were used to calculate diversity indices - community richness, dynamics and evenness - for the domain Bacteria as well as for ammonia-oxidizing bacteria in each reactor over time. These diversity indices were then used to train and test a support vector regression (SVR) model to predict reactor performance based on input microbial community indices and operational data. Considering the diversity indices over time and across replicate reactors as discrete values, it was found that, although bioaugmentation with a bacterial strain harboring a subset of genes involved in the degradation of 3-CA did not bring about 3-CA degradation, it significantly affected the community as measured through all three diversity indices in both the general bacterial community and the ammonia-oxidizer community.
Parallel/Vector Integration Methods for Dynamical Astronomy
Fukushima, Toshio
1999-01-01
This paper reviews three recent works on numerical methods to integrate ordinary differential equations (ODE), which are specially designed for parallel, vector, and/or multi-processor-unit (PU) computers. The first is the Picard-Chebyshev method (Fukushima, 1997a). It obtains a global solution of the ODE in the form of a Chebyshev polynomial of large (> 1000) degree by applying the Picard iteration repeatedly. The iteration converges for smooth problems and/or perturbed dynamics. The method runs around 100-1000 times faster in the vector mode than in the scalar mode of a certain computer with vector processors (Fukushima, 1997b). The second is a parallelization of a symplectic integrator (Saha et al., 1997). It regards the implicit midpoint rules covering thousands of timesteps as large-scale nonlinear equations and solves them by fixed-point iteration. The method is applicable to Hamiltonian systems and is expected to yield an acceleration factor of around 50 on parallel computers with more than 1000 PUs. The last is a parallelization of the extrapolation method (Ito and Fukushima, 1997). It performs trial integrations in parallel, and the trial integrations are further accelerated by balancing the computational load among PUs by the technique of folding. The method is all-purpose and achieves an acceleration factor of around 3.5 using several PUs. Finally, we give a perspective on the parallelization of some implicit integrators which require multiple corrections in solving implicit formulas, like the implicit Hermite integrators (Makino and Aarseth, 1992; Hut et al., 1995) or the implicit symmetric multistep methods (Fukushima, 1998; Fukushima, 1999).
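The Picard iteration at the heart of the first method can be sketched in a few lines. This simplified version integrates on a uniform grid with trapezoidal quadrature instead of Chebyshev nodes and polynomials, so it only illustrates the fixed-point structure y_{k+1}(t) = y0 + ∫ f(s, y_k) ds, not the actual Picard-Chebyshev scheme:

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

# Solve y' = -y, y(0) = 1 on [0, 1] by Picard iteration:
# y_{k+1}(t) = y(0) + integral_0^t f(s, y_k(s)) ds, with f(t, y) = -y.
t = np.linspace(0.0, 1.0, 201)
y = np.ones_like(t)                 # initial guess y_0(t) = y(0)
for _ in range(30):                 # the iteration converges for smooth problems
    y = 1.0 + cumulative_trapezoid(-y, t, initial=0.0)

exact = np.exp(-t)
max_err = np.max(np.abs(y - exact))
```

After convergence the remaining error is just the quadrature error of the grid; in the actual method the Chebyshev representation makes each iteration a cheap, highly vectorizable polynomial operation.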
Analysis of some methods for reduced rank Gaussian process regression
DEFF Research Database (Denmark)
Quinonero-Candela, J.; Rasmussen, Carl Edward
2005-01-01
While there is strong motivation for using Gaussian Processes (GPs) due to their excellent performance in regression and classification problems, their computational complexity makes them impractical when the size of the training set exceeds a few thousand cases. This has motivated the recent proliferation of a number of cost-effective approximations to GPs, both for classification and for regression. In this paper we analyze one popular approximation to GPs for regression: the reduced rank approximation. While generally GPs are equivalent to infinite linear models, we show that Reduced Rank Gaussian Processes (RRGPs) are equivalent to finite sparse linear models. We also introduce the concept of degenerate GPs and show that they correspond to inappropriate priors. We show how to modify the RRGP to prevent it from being degenerate at test time. Training RRGPs consists both in learning ...
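The equivalence of a reduced-rank GP to a finite linear model can be sketched with the classic subset-of-regressors approximation, where the predictive mean is a linear combination of m basis functions k(., x_j) anchored at m << n support points. The data, kernel parameters, and grid of support points below are illustrative assumptions:

```python
import numpy as np

def rbf(A, B, ell=0.7):
    """Squared-exponential kernel matrix between row-point sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=400)

# Subset-of-regressors RRGP: a finite linear model in the m basis
# functions k(., x_j), anchored at m << n support points (a grid here).
m, noise = 20, 0.1
support = np.linspace(-3, 3, m)[:, None]
Kmn = rbf(support, X)                        # (m, n)
Kmm = rbf(support, support)                  # (m, m)
alpha = np.linalg.solve(Kmn @ Kmn.T + noise ** 2 * Kmm, Kmn @ y)

Xs = np.linspace(-3, 3, 200)[:, None]
f_pred = rbf(Xs, support) @ alpha            # predictive mean
rmse = np.sqrt(np.mean((f_pred - np.sin(Xs[:, 0])) ** 2))
```

The weights alpha play the role of the finite linear model's coefficients; the degeneracy the paper discusses shows up in this form as a prior that forces predictions back to zero away from the support points.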
Biosensor method and system based on feature vector extraction
Greenbaum, Elias [Knoxville, TN; Rodriguez, Jr., Miguel; Qi, Hairong [Knoxville, TN; Wang, Xiaoling [San Jose, CA
2012-04-17
A method of biosensor-based detection of toxins comprises the steps of providing at least one time-dependent control signal generated by a biosensor in a gas or liquid medium, and obtaining a time-dependent biosensor signal from the biosensor in the gas or liquid medium to be monitored or analyzed for the presence of one or more toxins selected from chemical, biological or radiological agents. The time-dependent biosensor signal is processed to obtain a plurality of feature vectors using at least one of amplitude statistics and a time-frequency analysis. At least one parameter relating to toxicity of the gas or liquid medium is then determined from the feature vectors based on reference to the control signal.
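A minimal sketch of the amplitude-statistics half of the feature-vector extraction, using hypothetical control and exposed signals (the patent also uses time-frequency analysis, which is omitted here):

```python
import numpy as np
from scipy.stats import skew, kurtosis

def amplitude_features(signal):
    """Hypothetical amplitude-statistics feature vector for a
    time-dependent biosensor signal."""
    s = np.asarray(signal, dtype=float)
    return np.array([s.mean(), s.std(), skew(s), kurtosis(s), np.ptp(s)])

rng = np.random.default_rng(7)
t = np.linspace(0, 10, 2000)
# Control signal vs. a damped signal standing in for a toxin response.
control = np.sin(2 * np.pi * 0.5 * t) + 0.05 * rng.normal(size=t.size)
exposed = 0.4 * np.sin(2 * np.pi * 0.5 * t) + 0.05 * rng.normal(size=t.size)

fv_control = amplitude_features(control)
fv_exposed = amplitude_features(exposed)
```

Comparing the feature vectors of the monitored signal against the control signal's features is then what drives the toxicity decision.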
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regression, for binary outcomes, and multiple linear regression for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method, developed by Breiman in 1984, is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups according to a certain criterion. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few studies using this methodology specifically in haemophilia have been published to date. Two previously published examples using CART analysis in this field are explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, which facilitates medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable.
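A minimal sketch of fitting a classification tree and a regression tree with scikit-learn on a hypothetical haemophilia-like cohort; the variable names, thresholds, and outcome rules below are invented for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.default_rng(3)
# Hypothetical cohort: age and factor level predicting a binary outcome
# (e.g. a bleeding event) and a continuous one (e.g. a joint score).
n = 600
age = rng.uniform(10, 70, n)
factor_level = rng.uniform(0, 40, n)
X = np.column_stack([age, factor_level])
bleed = (factor_level < 5) | ((age > 50) & (factor_level < 15))
joint_score = 2.0 * bleed + 0.03 * age + rng.normal(scale=0.3, size=n)

# CT for the categorical outcome, RT for the continuous one; shallow
# trees keep the repeated binary partitions easy to read and interpret.
ct = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[:500], bleed[:500])
rt = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X[:500], joint_score[:500])
ct_acc = ct.score(X[500:], bleed[500:])     # held-out accuracy
rt_r2 = rt.score(X[500:], joint_score[500:])  # held-out R^2
```

The fitted splits (e.g. "factor level < 5, then age > 50") read directly as clinical decision rules, which is the interpretability advantage the abstract emphasizes.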
Zhou, Yang; Fu, Xiaping; Ying, Yibin; Fang, Zhenhuan
2015-06-23
A fiber-optic probe system was developed to estimate the optical properties of turbid media based on spatially resolved diffuse reflectance. Because of the limitations of numerical calculation of the radiative transfer equation (RTE), the diffusion approximation (DA), and Monte Carlo (MC) simulations, support vector regression (SVR) was introduced to model the relationship between diffuse reflectance values and optical properties. The SVR models of four collection fibers were trained with phantoms in a calibration set spanning a wide range of optical properties representative of different applications; the optical properties of phantoms in the prediction set were then predicted after an optimal search over the SVR models. The results indicated that the SVR model was capable of describing the relationship with little deviation in forward validation. The correlation coefficients (R) of the reduced scattering coefficient μ's and the absorption coefficient μa in the prediction set were 0.9907 and 0.9980, respectively. The root mean square errors of prediction (RMSEP) of μ's and μa in inverse validation were 0.411 cm(-1) and 0.338 cm(-1), respectively. These results indicate that the integrated fiber-optic probe system combined with the SVR model is suitable for fast and accurate estimation of the optical properties of turbid media based on spatially resolved diffuse reflectance.
Wang, Chih-Ping; Kim, Hee-Jeong; Yue, Chao; Weygand, James M.; Hsu, Tung-Shin; Chu, Xiangning
2017-04-01
To investigate whether ultralow-frequency (ULF) fluctuations from 0.5 to 8.3 mHz in the solar wind and interplanetary magnetic field (IMF) can affect the plasma sheet electron temperature (Te) near geosynchronous distances, we use a support vector regression machine technique to decouple the effects of different solar wind parameters and their ULF fluctuation power. Te in this region varies from 0.1 to 10 keV with a median of 1.3 keV. We find that when the solar wind ULF power is weak, Te increases with increasing southward IMF Bz and solar wind speed, while it varies weakly with solar wind density. As the ULF power becomes stronger during weak (near-zero) or northward IMF Bz, Te becomes significantly enhanced, by a factor of up to 10. We also find that mesoscale disturbances on a time scale of a few to tens of minutes, as indicated by AE during substorm expansion and recovery phases, are more enhanced when the ULF power is stronger. The effect of ULF power may be explained by stronger inward radial diffusion resulting from stronger mesoscale disturbances under higher ULF power, which can bring high-energy plasma sheet electrons further toward geosynchronous distance. This effect of ULF power is particularly important during weak southward or northward IMF, when convection electric drift is weak.
Directory of Open Access Journals (Sweden)
Jakub Langhammer
2016-11-01
Full Text Available This paper analyzes the potential of a nu-support vector regression (nu-SVR) model for the reconstruction of missing data in hydrological time series from a sensor network. Sensor networks are currently experiencing rapid growth of applications in experimental research and monitoring and provide an opportunity to study the dynamics of hydrological processes in previously ungauged or remote areas. Due to physical vulnerability or limited maintenance, networks are prone to data outages, which can devalue the unique data sources. This paper analyzes the potential of a nu-SVR model to simulate water levels in a network of sensors in four nested experimental catchments in a mid-latitude montane environment. The model was applied to a range of typical runoff situations, including a single-event storm, a multi-peak flood event, snowmelt, rain on snow, and a low-flow period. The simulations based on daily values proved the high efficiency of the nu-SVR modeling approach for simulating the hydrological processes in a network of monitoring stations. The model proved its ability to reliably reconstruct and simulate typical runoff situations, including complex events such as rain on snow or flooding from recurrent regional rain. The worst model performance was observed for low-flow periods and for single peak flows, especially in the high-altitude catchments.
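The gap-reconstruction idea can be sketched with scikit-learn's `NuSVR`: a simulated outage at one station is filled from neighbouring stations. The network data below are synthetic stand-ins, not the catchment observations:

```python
import numpy as np
from sklearn.svm import NuSVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(8)
# Hypothetical nested-catchment network: the target station's water level
# is a smooth function of a shared wetness signal seen by three neighbours.
days = 500
kernel = np.ones(30) / 30
wet = 3.0 * np.convolve(rng.normal(size=days + 29), kernel, mode="valid")
neighbours = wet[:, None] * np.array([0.8, 1.0, 1.2]) \
             + rng.normal(scale=0.05, size=(days, 3))
target = 0.5 * wet + 0.2 * np.tanh(wet) + rng.normal(scale=0.03, size=days)

# Simulate a sensor outage as a contiguous 60-day gap, train on the rest,
# and reconstruct the gap from the neighbouring stations.
mask = np.ones(days, dtype=bool)
mask[300:360] = False

model = make_pipeline(StandardScaler(), NuSVR(nu=0.5, C=10.0))
model.fit(neighbours[mask], target[mask])
reconstructed = model.predict(neighbours[~mask])
rmse = np.sqrt(np.mean((reconstructed - target[~mask]) ** 2))
```

As in the paper, reconstruction works because the nested stations share most of their hydrological signal; performance would degrade for flow regimes poorly represented in the training period.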
Tang, J. L.; Cai, C. Z.; Xiao, T. T.; Huang, S. J.
2012-07-01
The electrical conductivity of the solid oxide fuel cell (SOFC) cathode is one of the most important indices affecting the efficiency of an SOFC. In order to improve the performance of the fuel cell system, it is advantageous to have an accurate model with which one can predict the electrical conductivity. In this paper, a model utilizing the support vector regression (SVR) approach, combined with the particle swarm optimization (PSO) algorithm for parameter optimization, was established to model and predict the electrical conductivity of the Ba0.5Sr0.5Co0.8Fe0.2O3-δ-xSm0.5Sr0.5CoO3-δ (BSCF-xSSC) composite cathode under two influencing factors: operating temperature (T) and SSC content (x) in the BSCF-xSSC composite cathode. The leave-one-out cross-validation (LOOCV) test strongly supports the conclusion that the generalization ability of the SVR model is high. The absolute percentage error (APE) of 27 samples does not exceed 0.05%. The mean absolute percentage error (MAPE) of all 30 samples is only 0.09%, and the correlation coefficient (R2) is as high as 0.999. This investigation suggests that the hybrid PSO-SVR approach may be not only a promising and practical methodology for simulating the properties of a fuel cell system, but also a powerful tool for the optimal design or control of the operating process of an SOFC system.
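The PSO-SVR coupling can be sketched as a small global-best PSO searching over (log10 C, log10 gamma) with cross-validated R^2 as the fitness. The BSCF-xSSC data are replaced by a hypothetical smooth conductivity surface in T and x, and the PSO constants are common textbook choices, not the paper's settings:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(4)
# Hypothetical stand-in for the cathode data: conductivity as a smooth
# function of temperature T and SSC content x, with measurement noise.
T = rng.uniform(400, 800, 80)
x = rng.uniform(0.0, 0.5, 80)
X = np.column_stack([(T - 600) / 200, (x - 0.25) / 0.25])   # scaled inputs
y = 50 + 0.05 * T + 30 * x - 0.02 * (T - 600) * x + rng.normal(scale=1.0, size=80)

def fitness(p):
    C, gamma = 10.0 ** p             # particles live in log10 space
    model = SVR(C=C, gamma=gamma, epsilon=0.1)
    return cross_val_score(model, X, y, cv=5, scoring="r2").mean()

# Minimal global-best PSO over (log10 C, log10 gamma).
n_particles, iters = 12, 15
low, high = np.array([-1.0, -3.0]), np.array([3.0, 1.0])
pos = rng.uniform(low, high, size=(n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmax()].copy()
for _ in range(iters):
    r1, r2 = rng.random((2, n_particles, 1))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, low, high)
    vals = np.array([fitness(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmax()].copy()

best_r2 = pbest_val.max()
```

The same loop works for any scalar fitness, which is why PSO is a convenient wrapper around SVR hyperparameter selection when gradients are unavailable.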
Directory of Open Access Journals (Sweden)
Ping Jiang
2015-01-01
Full Text Available Wind speed/power has received increasing attention worldwide due to its renewable nature and environmental friendliness. With the global installed wind power capacity rapidly increasing, the wind industry is growing into a large-scale business. Reliable short-term wind speed forecasts play a practical and crucial role in wind energy conversion systems, such as the dynamic control of wind turbines and power system scheduling. In this paper, an intelligent hybrid model for short-term wind speed prediction is examined; the model is based on cross correlation (CC) analysis and a support vector regression (SVR) model that is coupled with brainstorm optimization (BSO) and cuckoo search (CS) algorithms, which are successfully utilized for parameter determination. The proposed hybrid models were used to forecast short-term wind speeds collected from four wind turbines located on a wind farm in China. The forecasting results demonstrate that the intelligent hybrid models outperform single models for short-term wind speed forecasting, which mainly results from the superiority of BSO and CS for parameter optimization.
Lattice Boltzmann method for one-dimensional vector radiative transfer.
Zhang, Yong; Yi, Hongliang; Tan, Heping
2016-02-08
A one-dimensional vector radiative transfer (VRT) model based on the lattice Boltzmann method (LBM), which considers polarization using the four Stokes parameters, is developed. The angular space is discretized by the discrete-ordinates approach, and the spatial discretization is performed by the LBM. The LBM has such attractive properties as a simple calculation procedure, straightforward and efficient handling of boundary conditions, and the capability of stable and accurate simulation. To validate the performance of the LBM for vector radiative transfer, four test problems are examined. The first case investigates a non-scattering thermally emitting atmosphere with no external collimated solar radiation. For the other three cases, external collimated solar radiation and three different scattering types are considered. In particular, the LBM is extended to solve VRT in an atmospheric aerosol system where the scattering function contains singularities, and the hemispheric space distributions of the Stokes vector are presented and discussed. The accuracy and computational efficiency of the algorithm are discussed. Numerical results show that the LBM is accurate, flexible, and effective for solving one-dimensional polarized radiative transfer problems.
Regression Methods for Ophthalmic Glucose Sensing Using Metamaterials
Directory of Open Access Journals (Sweden)
Philipp Rapp
2011-01-01
Full Text Available We present a novel concept for in vivo sensing of glucose using metamaterials in combination with automatic learning systems. In detail, we use the plasmonic analogue of electromagnetically induced transparency (EIT) as the sensor and evaluate the acquired data with support vector machines. The metamaterial can be integrated into a contact lens. This sensor changes its optical properties, such as reflectivity, with the ambient glucose concentration, which allows for in situ measurements in the eye. We demonstrate that estimation errors below 2% at physiological concentrations are possible using simulations of the optical properties of the metamaterial in combination with an appropriate electrical circuitry and signal processing scheme. In the future, functionalization of our sensor with hydrogel will allow for glucose-specific detection which is insensitive to other tear-liquid substances, providing both excellent selectivity and sensitivity.
Abu Awad, Yara; Koutrakis, Petros; Coull, Brent A; Schwartz, Joel
2017-11-01
Fine ambient particulate matter has been widely associated with multiple health effects. Mitigation hinges on understanding which sources contribute to its toxicity. Black carbon (BC), an indicator of particles generated from traffic sources, has been associated with a number of health effects; however, due to its high spatial variability, its concentration is difficult to estimate. We previously fit a model estimating BC concentrations in the greater Boston area; however, this model was built using limited monitoring data and could not capture the complex spatio-temporal patterns of ambient BC. In order to improve our predictive ability, we obtained more data, for a total of 24,301 measurements from 368 monitors over a 12-year period in Massachusetts, Rhode Island, and New Hampshire. We also used nu-support vector regression (nu-SVR), a machine learning technique which incorporates nonlinear terms and higher-order interactions, with appropriate regularization of parameter estimates. We then used a generalized additive model to refit the residuals from the nu-SVR and added the residual predictions to our earlier estimates. Both spatial and temporal predictors were included in the model, which allowed us to capture the change in spatial patterns of BC over time. The 10-fold cross-validated (CV) R2 of the model was good in both the cold (10-fold CV R2 = 0.87) and warm seasons (CV R2 = 0.79). We have successfully built a model that can be used to estimate short- and long-term exposures to BC and will be useful for studies examining various health outcomes in MA, RI, and southern NH.
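The two-stage structure (nu-SVR first, then a second model refit on its residuals) can be sketched as follows; a k-nearest-neighbours smoother stands in for the paper's generalized additive model, and the data are hypothetical:

```python
import numpy as np
from sklearn.svm import NuSVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(5)
# Hypothetical BC data: three generic spatio-temporal predictors and a
# nonlinear concentration surface with noise.
n = 1000
Xf = rng.normal(size=(n, 3))
bc = np.exp(0.5 * Xf[:, 0]) + 0.8 * np.sin(2 * Xf[:, 1]) + 0.3 * Xf[:, 2] \
     + rng.normal(scale=0.2, size=n)

tr, te = slice(0, 800), slice(800, None)
stage1 = make_pipeline(StandardScaler(), NuSVR(nu=0.5, C=10.0))
stage1.fit(Xf[tr], bc[tr])

# Stage 2: refit the stage-1 residuals on the same predictors; the paper
# uses a generalized additive model, a k-NN smoother stands in here.
resid = bc[tr] - stage1.predict(Xf[tr])
stage2 = KNeighborsRegressor(n_neighbors=25).fit(Xf[tr], resid)

pred = stage1.predict(Xf[te]) + stage2.predict(Xf[te])
ss_tot = np.sum((bc[te] - bc[te].mean()) ** 2)
r2_one = 1 - np.sum((bc[te] - stage1.predict(Xf[te])) ** 2) / ss_tot
r2_two = 1 - np.sum((bc[te] - pred) ** 2) / ss_tot
```

The residual stage can only help where the first stage leaves systematic structure behind; on data the nu-SVR already fits well the two scores will be close.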
Highly Sensitive Method for Titration of Adenovirus Vectors
sprotocols
2015-01-01
Authors: Hildegund Ertl, ZhiQuan Xiang, Yan Li, Dongming Zhou, Xiangyang Zhou, Wynetta Giles-Davis & Yi-lin E. Liu ### Abstract Clinical development of vaccines based on adenovirus (Ad) vectors requires accurate techniques to determine vector doses, including the content of infectious particles. For vectors derived from the Ad virus of human serotype 5, the content of infectious particles can readily be determined by plaque assays. Vaccine vectors based on alternative Ad serotypes such as thos...
Permissible performance limits of regression analyses in method comparisons.
Haeckel, Rainer; Wosniok, Werner; Al Shareef, Nadera
2011-11-01
Method comparisons are indispensable tools for the extensive validation of analytic procedures. Laboratories often only want to know whether an established procedure (x-method) can be replaced by another one (y-method) without interfering with diagnostic purposes. Then split patients' samples are analyzed more or less simultaneously with both procedures designed to measure the same quantity. The measured values are usually presented graphically as scatter or difference plots. The two methods are considered equivalent (comparable) if the data pairs scatter around the line of equality (x=y line) within permissible equivalence lines. It is proposed to derive these equivalence limits from permissible imprecision limits based on false-positive error rates. If all data pairs are within the limits, both methods lead to comparable error rates. If one or more data pairs are outside the permissible equivalence limits, the x-method cannot simply be replaced by the y-method and further studies are required. The discordance may be caused either by aberrant values (outliers), non-linearity, bias, or a higher variation of, e.g., the y-values. The spread around the line of best fit can reveal possible interferences if more than 1% of the data pairs are outside permissible spread lines in a scatter plot. Because bias between methods and imprecision can be inter-related, both require specific examinations for their identification.
Qin, Zijian; Wang, Maolin; Yan, Aixia
2017-07-01
In this study, quantitative structure-activity relationship (QSAR) models using various descriptor sets and training/test set selection methods were explored to predict the bioactivity of hepatitis C virus (HCV) NS3/4A protease inhibitors, using a multiple linear regression (MLR) and a support vector machine (SVM) method. 512 HCV NS3/4A protease inhibitors and their IC50 values, all determined by the same FRET assay, were collected from the literature to build a dataset. All the inhibitors were represented by nine selected global descriptors and 12 2D property-weighted autocorrelation descriptors calculated with the program CORINA Symphony. The dataset was divided into a training set and a test set by a random method and by a Kohonen's self-organizing map (SOM) method. The correlation coefficients (r2) of the training and test sets were 0.75 and 0.72 for the best MLR model, and 0.87 and 0.85 for the best SVM model, respectively. In addition, a series of sub-dataset models were also developed. All the best sub-dataset models performed better than the corresponding whole-dataset models. We believe that the combination of the best sub- and whole-dataset SVM models can serve as a reliable lead-design tool for new NS3/4A protease inhibitor scaffolds in a drug discovery pipeline.
Gao, Wei; Li, Xiang-ru
2017-07-01
Multi-task learning analyzes multiple tasks jointly so as to exploit the correlations among them and thereby improve the accuracy of the results. Such methods have been widely applied in machine learning, pattern recognition, computer vision, and related fields. This paper investigates the application of multi-task learning to estimating the stellar atmospheric parameters, including the effective temperature (Teff), surface gravitational acceleration (lg g), and chemical abundance ([Fe/H]). First, the spectral features of the three stellar atmospheric parameters are extracted by using the multi-task sparse group Lasso algorithm; then the support vector machine is used to estimate the atmospheric physical parameters. The proposed scheme is evaluated on both the Sloan stellar spectra and the theoretical spectra computed from Kurucz's New Opacity Distribution Function (NEWODF) model. The mean absolute errors (MAEs) on the Sloan spectra are 0.0064 for lg (Teff/K), 0.1622 for lg (g/(cm · s-2)), and 0.1221 dex for [Fe/H]; the MAEs on the synthetic spectra are 0.0006 for lg (Teff/K), 0.0098 for lg (g/(cm · s-2)), and 0.0082 dex for [Fe/H]. Experimental results show that the proposed scheme achieves rather high accuracy in the estimation of stellar atmospheric parameters.
DEFF Research Database (Denmark)
Kirkeby, Carsten Thure; Hisham Beshara Halasa, Tariq; Gussmann, Maya Katrin
2017-01-01
Precise estimates of disease transmission rates are critical for epidemiological simulation models. Most often these rates must be estimated from longitudinal field data, which are costly and time-consuming to conduct. Consequently, measures to reduce cost like increased sampling intervals...... the transmission rate. We use data from the two simulation models and vary the sampling intervals and the size of the population sampled. We devise two new methods to determine transmission rate, and compare these to the frequently used Poisson regression method in both epidemic and endemic situations. For most...
Third-Order Newton-Type Methods Combined with Vector Extrapolation for Solving Nonlinear Systems
Directory of Open Access Journals (Sweden)
Wen Zhou
2014-01-01
Full Text Available We present a third-order method for solving systems of nonlinear equations. The method is a Newton-type scheme combined with vector extrapolation. We establish the local and semilocal convergence of this method. Numerical results show that the composite method is more robust and efficient than a number of Newton-type methods combined with other vector extrapolations.
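The base Newton step for a nonlinear system F(x) = 0 can be sketched as follows; the third-order vector-extrapolation acceleration described in the paper is not reproduced here:

```python
import numpy as np

def newton_system(F, J, x0, tol=1e-10, max_iter=50):
    # Classical Newton iteration: x_{k+1} = x_k - J(x_k)^{-1} F(x_k).
    # The paper's scheme adds vector extrapolation on top of steps like this.
    x = np.asarray(x0, float)
    for _ in range(max_iter):
        dx = np.linalg.solve(J(x), F(x))
        x = x - dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# Example system: x^2 + y^2 = 4, x*y = 1
F = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
J = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])
root = newton_system(F, J, [2.0, 0.5])
print(np.round(F(root), 12))
```

Starting from (2, 0.5), the iteration converges quadratically to the root near (1.932, 0.518); third-order variants aim to improve this convergence order per Jacobian evaluation.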
Yu, Xianyu; Wang, Yi; Niu, Ruiqing; Hu, Youjian
2016-05-11
In this study, a novel coupling model for landslide susceptibility mapping is presented. In practice, environmental factors may have different impacts at a local scale in study areas. To provide better predictions, a geographically weighted regression (GWR) technique is first used in our method to segment study areas into a series of prediction regions with appropriate sizes. Meanwhile, a support vector machine (SVM) classifier is exploited in each prediction region for landslide susceptibility mapping. To further improve the prediction performance, the particle swarm optimization (PSO) algorithm is used in the prediction regions to obtain optimal parameters for the SVM classifier. To evaluate the prediction performance of our model, several SVM-based prediction models are utilized for comparison on a study area of the Wanzhou district in the Three Gorges Reservoir. Experimental results, based on three objective quantitative measures and visual qualitative evaluation, indicate that our model can achieve better prediction accuracies and is more effective for landslide susceptibility mapping. For instance, our model achieves an overall prediction accuracy of 91.10%, which is 7.8%-19.1% higher than that of the traditional SVM-based models. In addition, the landslide susceptibility map obtained by our model demonstrates a strong correlation between the classified very-high-susceptibility zone and the previously investigated landslides.
Directory of Open Access Journals (Sweden)
Ibrahim A. Naguib
2017-12-01
Full Text Available In the presented study, orthogonal projection to latent structures (OPLS) is introduced as a data preprocessing method that handles nonlinear data prior to modelling with two well-established nonlinear multivariate models, namely support vector regression (SVR) and artificial neural networks (ANN). The proposed preprocessing proved to significantly improve prediction ability through the removal of uncorrelated data. The study was based on a case study of nonlinear spectrofluorimetric data of agomelatine (AGM) and its hydrolysis degradation products (Deg I and Deg II), where a 3-factor, 4-level experimental design was used to provide a training set of 16 mixtures with different proportions of the studied components. An independent test set of 9 mixtures was established to confirm the prediction ability of the introduced models. The excitation wavelength was 227 nm, and the working range for the emission spectra was 320-440 nm. The couplings OPLS-SVR and OPLS-ANN provided better accuracy for prediction of the independent nonlinear test set. The root mean square error of prediction (RMSEP) for the test set mixtures was used as the major comparison parameter; the RMSEP results for OPLS-SVR and OPLS-ANN are 2.19 and 1.50, respectively. Keywords: Agomelatine, SVR, ANN, OPLS, Spectrofluorimetry, Nonlinear
Figaro: a novel statistical method for vector sequence removal
White, James Robert; Roberts, Michael; Yorke, James A.; Pop, Mihai
2009-01-01
Motivation Sequences produced by automated Sanger sequencing machines frequently contain fragments of the cloning vector on their ends. Software tools currently available for identifying and removing the vector sequence require knowledge of the vector sequence, specific splice sites and any adapter sequences used in the experiment—information often omitted from public databases. Furthermore, the clipping coordinates themselves are often missing or incorrectly reported. As an example, within the ~1.24 billion shotgun sequences deposited in the NCBI Trace Archive, as many as ~735 million (~60%) lack vector clipping information. Correct clipping information is essential to scientists attempting to validate, improve and even finish the increasingly large number of genomes released at a ‘draft’ quality level. Results We present here Figaro, a novel software tool for identifying and removing the vector from raw sequence data without prior knowledge of the vector sequence. The vector sequence is automatically inferred by analyzing the frequency of occurrence of short oligonucleotides using Poisson statistics. We show that Figaro achieves 99.98% sensitivity when tested on ~1.5 million shotgun reads from Drosophila pseudoobscura. We further explore the impact of accurate vector trimming on the quality of whole-genome assemblies by re-assembling two bacterial genomes from shotgun sequences deposited in the Trace Archive. Designed as a module in large computational pipelines, Figaro is fast, lightweight and flexible. Availability Figaro is released under an open-source license through the AMOS package (http://amos.sourceforge.net/Figaro). PMID:18202027
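The core statistical idea — vector-derived oligonucleotides appear far more often than a Poisson background predicts — can be sketched as follows. The k-mer length, threshold, and toy reads are illustrative assumptions, not Figaro's actual parameters:

```python
from collections import Counter
from math import exp, factorial

def poisson_sf(k, lam):
    # P(X >= k) for X ~ Poisson(lam), computed by direct summation
    cdf = sum(exp(-lam) * lam**i / factorial(i) for i in range(k))
    return 1.0 - cdf

def overrepresented_kmers(reads, k=5, alpha=1e-6):
    """Flag k-mers occurring far more often than a uniform Poisson
    background would allow -- the idea behind detecting vector
    contamination without knowing the vector sequence."""
    counts, total = Counter(), 0
    for r in reads:
        for i in range(len(r) - k + 1):
            counts[r[i:i + k]] += 1
            total += 1
    lam = total / 4**k            # expected count under a uniform background
    return {m for m, c in counts.items() if poisson_sf(c, lam) < alpha}

# 50 reads sharing a "vector" prefix, plus a few unrelated fragments
reads = ["ACGTACGTAC"] * 50 + ["TTGCA", "GGCCA", "ATATC"]
hits = overrepresented_kmers(reads, k=5)
print(sorted(hits))
```

Only k-mers from the repeated "vector" sequence survive the Poisson test; singleton k-mers from ordinary reads do not.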
Regression Methods for Virtual Metrology of Layer Thickness in Chemical Vapor Deposition
DEFF Research Database (Denmark)
Purwins, Hendrik; Barak, Bernd; Nagi, Ahmed
2014-01-01
The quality of wafer production in semiconductor manufacturing cannot always be monitored by a costly physical measurement. Instead of measuring a quantity directly, it can be predicted by a regression method (Virtual Metrology). In this paper, a survey on regression methods is given to predict a...
Afifah, Rawyanil; Andriyana, Yudhie; Jaya, I. G. N. Mindra
2017-03-01
Geographically Weighted Regression (GWR) is a development of Ordinary Least Squares (OLS) regression that is quite effective for estimating spatially non-stationary data. In GWR models, regression parameters are estimated locally; each observation has its own regression coefficients. Parameter estimation in GWR uses Weighted Least Squares (WLS). But when there are outliers in the data, parameter estimation with WLS produces inefficient estimators. Hence, this study uses a robust method called Least Absolute Deviation (LAD) to estimate the parameters of the GWR model in a case study of poverty on Java Island. This study concludes that the GWR model with the LAD method has a better performance.
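The robust LAD criterion — minimizing the sum of absolute residuals rather than squared ones — can be sketched globally, without the geographic weighting the paper applies on top of it:

```python
import numpy as np
from scipy.optimize import minimize

def lad_fit(X, y):
    """Least Absolute Deviation fit: minimize sum |y - Xb|.
    A global (non-geographically-weighted) sketch of the robust
    criterion plugged into GWR in the paper."""
    X1 = np.column_stack([np.ones(len(X)), X])
    b0 = np.linalg.lstsq(X1, y, rcond=None)[0]          # OLS starting point
    obj = lambda b: np.abs(y - X1 @ b).sum()
    return minimize(obj, b0, method="Nelder-Mead").x

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 40)
y = 2.0 + 0.5 * x + rng.normal(0, 0.1, 40)
y[3] += 20.0                                            # one gross outlier
b_lad = lad_fit(x, y)
print(np.round(b_lad, 2))  # intercept and slope, barely moved by the outlier
```

Unlike WLS, the absolute-deviation objective gives the outlier a bounded influence, so the fitted line stays close to the uncontaminated trend.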
Melo-Cardenas, J; Urquiza, M; Kipps, T J; Castro, J E
2012-05-01
Ad-ISF35 is an adenovirus (Ad) vector that encodes a mouse-human chimeric CD154. Ad-ISF35 induces activation of chronic lymphocytic leukemia (CLL) cells, converting them into cells capable of promoting immune recognition and anti-leukemia T-cell activation. Clinical trials in humans treated with Ad-ISF35-transduced leukemia cells or intranodal injection of Ad-ISF35 have shown objective clinical responses. To better understand the biology of Ad-ISF35 and to contribute to its clinical development, we performed studies to evaluate the biodistribution, persistence and toxicity of repeat-dose intratumoral administration of Ad-ISF35 in a mouse model. Ad-ISF35 intratumoral administration induced tumor regression in more than 80% of mice bearing A20 tumors. There were no abnormalities in the serum chemistry. Mice receiving Ad-ISF35 presented severe extramedullary hematopoiesis and follicular hyperplasia in the spleen, and extramedullary hematopoiesis with lymphoid hyperplasia in the lymph nodes. After Ad-ISF35 injection, the vector was found primarily in the injected tumors, with a biodistribution pattern showing rapid clearance and no evidence of Ad-ISF35 accumulation or persistence in the injected tumor or peripheral organs. Furthermore, pre-existing antibodies against Ad-5 did not abrogate Ad-ISF35 anti-tumor activity. In conclusion, intratumoral administration of Ad-ISF35 induced tumor regression in A20 tumor-bearing mice without toxicities and with no evidence of vector accumulation or persistence.
Deng, Zhaohong; Choi, Kup-Sze; Jiang, Yizhang; Wang, Shitong
2014-12-01
Inductive transfer learning has attracted increasing attention for the training of effective model in the target domain by leveraging the information in the source domain. However, most transfer learning methods are developed for a specific model, such as the commonly used support vector machine, which makes the methods applicable only to the adopted models. In this regard, the generalized hidden-mapping ridge regression (GHRR) method is introduced in order to train various types of classical intelligence models, including neural networks, fuzzy logical systems and kernel methods. Furthermore, the knowledge-leverage based transfer learning mechanism is integrated with GHRR to realize the inductive transfer learning method called transfer GHRR (TGHRR). Since the information from the induced knowledge is much clearer and more concise than that from the data in the source domain, it is more convenient to control and balance the similarity and difference of data distributions between the source and target domains. The proposed GHRR and TGHRR algorithms have been evaluated experimentally by performing regression and classification on synthetic and real world datasets. The results demonstrate that the performance of TGHRR is competitive with or even superior to existing state-of-the-art inductive transfer learning algorithms.
Directory of Open Access Journals (Sweden)
Roberto Romaniello
2015-12-01
Full Text Available The aim of this work is to evaluate the potential of least squares support vector machine (LS-SVM) regression for developing an efficient method to measure the colour of food materials in L*a*b* units by means of a computer vision system (CVS). A laboratory CVS based on a colour digital camera (CDC) was implemented, and three LS-SVM models were trained and validated, one for each of the output variables (L*, a*, and b*) required by this problem, using the RGB signals generated by the CDC as input variables to these models. The colour target-based approach was used for camera characterization, and a standard reference target of 242 colour samples was acquired using the CVS and a colorimeter. This data set was split into two sets of equal size for training and validating the LS-SVM models. An effective two-stage grid search over the parameter space was performed in MATLAB to tune the regularization parameter γ and the kernel parameter σ2 of each of the three LS-SVM models. A 3-8-3 multilayer feed-forward neural network (MFNN), following the research conducted by León et al. (2006), was also trained in order to compare its performance with those of the LS-SVM models. The LS-SVM models developed in this research showed better generalization capability than the MFNN and allowed high correlations to be obtained between the L*a*b* data acquired using the colorimeter and the corresponding data obtained by transformation of the RGB data acquired by the CVS. In particular, for the validation set, R2 values equal to 0.9989, 0.9987, and 0.9994 were obtained for the L*, a*, and b* parameters. The root mean square error values were 0.6443, 0.3226, and 0.2702 for L*, a*, and b* respectively, and the average of the colour differences ΔEab was 0.8232±0.5033 units. Thus, LS-SVM regression seems to be a useful tool for measuring food colour using a low-cost CVS.
Empirical evaluation of gradient methods for matrix learning vector quantization
LeKander, M.; Biehl, M.; Vries, H. de
2017-01-01
Generalized Matrix Learning Vector Quantization (GMLVQ) critically relies on the use of an optimization algorithm to train its model parameters. We test various schemes for automated control of learning rates in gradient-based training. We evaluate these algorithms in terms of their achieved
Energy Technology Data Exchange (ETDEWEB)
Boucher, Thomas F., E-mail: boucher@cs.umass.edu [School of Computer Science, University of Massachusetts Amherst, 140 Governor's Drive, Amherst, MA 01003 (United States); Ozanne, Marie V. [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Carmosino, Marco L. [School of Computer Science, University of Massachusetts Amherst, 140 Governor's Drive, Amherst, MA 01003 (United States); Dyar, M. Darby [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Mahadevan, Sridhar [School of Computer Science, University of Massachusetts Amherst, 140 Governor's Drive, Amherst, MA 01003 (United States); Breves, Elly A.; Lepore, Kate H. [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Clegg, Samuel M. [Los Alamos National Laboratory, P.O. Box 1663, MS J565, Los Alamos, NM 87545 (United States)
2015-05-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other type of LIBS data is calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high
Regression calibration method for correcting measurement-error bias in nutritional epidemiology.
Spiegelman, D; McDermott, A; Rosner, B
1997-04-01
Regression calibration is a statistical method for adjusting point and interval estimates of effect obtained from regression models commonly used in epidemiology for bias due to measurement error in assessing nutrients or other variables. Previous work developed regression calibration for use in estimating odds ratios from logistic regression. We extend this here to estimating incidence rate ratios from Cox proportional hazards models and regression slopes from linear-regression models. Regression calibration is appropriate when a gold standard is available in a validation study and a linear measurement error with constant variance applies or when replicate measurements are available in a reliability study and linear random within-person error can be assumed. In this paper, the method is illustrated by correction of rate ratios describing the relations between the incidence of breast cancer and dietary intakes of vitamin A, alcohol, and total energy in the Nurses' Health Study. An example using linear regression is based on estimation of the relation between ultradistal radius bone density and dietary intakes of caffeine, calcium, and total energy in the Massachusetts Women's Health Study. Software implementing these methods uses SAS macros.
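In the simplest linear case, the regression calibration correction amounts to rescaling the naive slope by the slope from regressing the gold standard on the error-prone surrogate. A sketch with simulated data (not the Cox or logistic extensions discussed in the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
t = rng.normal(50, 10, n)             # true nutrient intake (gold standard)
x = t + rng.normal(0, 10, n)          # error-prone dietary instrument
y = 0.04 * t + rng.normal(0, 1, n)    # outcome on a linear scale

# The naive slope of y on the mismeasured x is attenuated toward zero
naive = np.polyfit(x, y, 1)[0]

# Validation-study step: regress truth on the surrogate,
# then rescale the naive slope (simple linear regression calibration)
lam = np.polyfit(x, t, 1)[0]
corrected = naive / lam

print(round(naive, 3), round(corrected, 3))
```

With equal true and error variances the attenuation factor is about 0.5, so the corrected slope roughly doubles the naive one, recovering the true effect of 0.04.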
The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard
This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...... to avoid this problem. The main objective is to investigate the applicability of the nonparametric kernel regression method in applied production analysis. The focus of the empirical analyses included in this thesis is the agricultural sector in Poland. Data on Polish farms are used to investigate...... practically and politically relevant problems and to illustrate how nonparametric regression methods can be used in applied microeconomic production analysis both in panel data and cross-section data settings. The thesis consists of four papers. The first paper addresses problems of parametric...
Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung
2014-01-01
The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…
Evaluation of regression methods when immunological measurements are constrained by detection limits
Directory of Open Access Journals (Sweden)
Yazdanbakhsh Maria
2008-10-01
Full Text Available Abstract Background The statistical analysis of immunological data may be complicated because precise quantitative levels cannot always be determined. Values below a given detection limit may not be observed (nondetects), and data with nondetects are called left-censored. Since nondetects cannot be considered missing at random, a statistician faced with data containing nondetects must decide how to combine them with detects. Until now, the common practice has been to impute each nondetect with a single value, such as half the detection limit, and to conduct ordinary regression analysis. The first aim of this paper is to give an overview of methods for analyzing censored data, and to provide new methods for handling such data other than (ordinary) linear regression. The second aim is to compare these methods by simulation studies based on real data. Results We compared six new and existing methods: deletion of nondetects, single substitution, extrapolation by regression on order statistics, multiple imputation using maximum likelihood estimation, tobit regression, and logistic regression. The deletion and extrapolation-by-regression-on-order-statistics methods gave biased parameter estimates. The single substitution method underestimated variances, and logistic regression suffered a loss of power. Based on the simulation studies, we found that tobit regression performed well when the proportion of nondetects was less than 30%, and that, taken together, the multiple imputation method performed best. Conclusion Based on simulation studies, the newly developed multiple imputation method performed consistently well under different scenarios of proportions of nondetects and sample sizes, and even in the presence of heteroscedastic errors.
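Of the compared approaches, tobit regression is straightforward to sketch: nondetects contribute the normal CDF at the detection limit to the likelihood, detects the density. A minimal one-covariate illustration (not the authors' implementation):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def tobit_fit(x, y, dl):
    """Left-censored (tobit) MLE for y = a + b*x + N(0, s^2),
    with values at or below the detection limit dl treated as censored."""
    cens = y <= dl
    def nll(p):
        a, b, s = p[0], p[1], abs(p[2]) + 1e-9
        mu = a + b * x
        ll_det = norm.logpdf(y[~cens], mu[~cens], s)   # observed values
        ll_cen = norm.logcdf(dl, mu[cens], s)          # nondetects
        return -(ll_det.sum() + ll_cen.sum())
    start = [y.mean(), 0.0, y.std()]
    return minimize(nll, start, method="Nelder-Mead").x

rng = np.random.default_rng(2)
x = rng.normal(0, 1, 400)
y_true = 1.0 + 0.8 * x + rng.normal(0, 0.5, 400)
dl = 0.5
y = np.maximum(y_true, dl)       # nondetects reported at the detection limit
a, b, s = tobit_fit(x, y, dl)
print(round(b, 2))               # slope estimate, near the true value 0.8
```

Substituting half the detection limit for the nondetects and running OLS would bias this slope; the censored likelihood avoids the substitution entirely.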
An NCME Instructional Module on Data Mining Methods for Classification and Regression
Sinharay, Sandip
2016-01-01
Data mining methods for classification and regression are becoming increasingly popular in various scientific fields. However, these methods have not been explored much in educational measurement. This module first provides a review, which should be accessible to a wide audience in educational measurement, of some of these methods. The module then…
The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard
This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...... function. However, the a priori specification of a functional form involves the risk of choosing one that is not similar to the “true” but unknown relationship between the regressors and the dependent variable. This problem, known as parametric misspecification, can result in biased parameter estimates...... and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences...
A New Hybrid Method Logistic Regression and Feedforward Neural Network for Lung Cancer Data
Directory of Open Access Journals (Sweden)
Taner Tunç
2012-01-01
Full Text Available Logistic regression (LR) is a conventional statistical technique used for data classification problems. Logistic regression is a model-based method, and it uses a nonlinear model structure. Another technique used for classification is feedforward artificial neural networks. A feedforward artificial neural network is a data-based method which can model nonlinear relationships through its activation function. In this study, a hybrid approach of the model-based logistic regression technique and the data-based artificial neural network was proposed for classification purposes. The proposed approach was applied to lung cancer data, and the obtained results were compared. The proposed hybrid approach was superior to logistic regression and feedforward artificial neural networks with respect to many criteria.
Directory of Open Access Journals (Sweden)
ELİF BULUT
2013-06-01
Full Text Available Partial Least Squares Regression (PLSR) is a multivariate statistical method that combines partial least squares and multiple linear regression analysis. Explanatory variables X that exhibit multicollinearity are reduced to a few components which explain a large amount of the covariance between the explanatory and response variables and are free of multicollinearity. Multiple linear regression analysis is then applied to those components to model the response variable Y. There are various PLSR algorithms. In this study the NIPALS and PLS-Kernel algorithms are studied and illustrated on a real data set.
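A compact sketch of the NIPALS component extraction and deflation for a univariate response (PLS1) illustrates the procedure described above; it is an illustrative implementation, not the one used in the study:

```python
import numpy as np

def pls1_nipals(X, y, n_comp):
    """PLS1 via NIPALS: extract weight/score/loading per component,
    deflate X and y, and return regression coefficients for centered data."""
    X = X - X.mean(0); y = y - y.mean()
    W, P, Q = [], [], []
    for _ in range(n_comp):
        w = X.T @ y                     # weight vector from covariance with y
        w /= np.linalg.norm(w)
        t = X @ w                       # scores
        tt = t @ t
        p = X.T @ t / tt                # X loadings
        q = (y @ t) / tt                # y loading
        X = X - np.outer(t, p)          # deflate X
        y = y - q * t                   # deflate y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    return W @ np.linalg.solve(P.T @ W, Q)   # coefficients in original X space

rng = np.random.default_rng(3)
z = rng.normal(size=(60, 2))
# first two columns nearly collinear -- the situation PLSR is built for
X = np.column_stack([z[:, 0], z[:, 0] + 1e-3 * rng.normal(size=60), z[:, 1]])
y = X @ np.array([1.0, 1.0, -2.0]) + 0.01 * rng.normal(size=60)
b = pls1_nipals(X, y, n_comp=2)
Xc, yc = X - X.mean(0), y - y.mean()
print(round(float(np.corrcoef(Xc @ b, yc)[0, 1]), 3))
```

Two components suffice here because the collinear pair collapses into a single latent direction, which is exactly why the components are free of the multicollinearity present in X.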
Sine Rotation Vector Method for Attitude Estimation of an Underwater Robot
Directory of Open Access Journals (Sweden)
Nak Yong Ko
2016-08-01
Full Text Available This paper describes a method for estimating the attitude of an underwater robot. The method employs a new concept of sine rotation vector and uses both an attitude and heading reference system (AHRS) and a Doppler velocity log (DVL) for the purpose of measurement. First, the acceleration and magnetic-field measurements are transformed into sine rotation vectors and combined. The combined sine rotation vector is then transformed into the differences between the Euler angles of the measured attitude and the predicted attitude; the differences are used to correct the predicted attitude. The method was evaluated using field-test data and simulation data and compared to existing methods that calculate angular differences directly, without a preceding sine rotation vector transformation. The comparison verifies that the proposed method improves the attitude estimation performance.
Whole-genome regression and prediction methods applied to plant and animal breeding.
de Los Campos, Gustavo; Hickey, John M; Pong-Wong, Ricardo; Daetwyler, Hans D; Calus, Mario P L
2013-02-01
Genomic-enabled prediction is becoming increasingly important in animal and plant breeding and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models, where phenotypes are regressed on thousands of markers concurrently. Methods exist that allow implementing these large-p, small-n regressions, and genome-enabled selection (GS) is being implemented in several plant and animal breeding programs. The list of available methods is long, and the relationships between them have not been fully addressed. In this article we provide an overview of available methods for implementing parametric WGR models, discuss selected topics that emerge in applications, and present a general discussion of lessons learned from simulation and empirical data analysis in the last decade.
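A toy large-p, small-n whole-genome regression can be sketched with ridge regression, one simple member of the parametric WGR family the review covers. The marker genotypes, effect sizes, and penalty below are simulated assumptions:

```python
import numpy as np
from sklearn.linear_model import Ridge

# 300 individuals, 500 markers, phenotype driven by 40 of them
rng = np.random.default_rng(4)
n, p, q = 300, 500, 40
M = rng.binomial(2, 0.3, size=(n, p)).astype(float)   # genotypes coded 0/1/2
beta = np.zeros(p)
beta[:q] = rng.normal(0, 0.5, q)                      # sparse true effects
y = M @ beta + rng.normal(0, 1.0, n)                  # phenotypes

# Fit on a training set, predict unphenotyped individuals
model = Ridge(alpha=50.0).fit(M[:200], y[:200])
r = np.corrcoef(model.predict(M[200:]), y[200:])[0, 1]
print(round(r, 2))   # predictive correlation on held-out individuals
```

Shrinking all marker effects toward zero is what makes the p > n regression estimable at all; Bayesian WGR methods differ mainly in the prior placed on the marker effects.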
Zhang, B; Liang, X L; Gao, H Y; Ye, L S; Wang, Y G
2016-05-13
We evaluated the application of three machine learning algorithms, namely logistic regression, support vector machine and back-propagation neural network, for diagnosing congenital heart disease and colorectal cancer. By inspecting related serum tumor marker levels in colorectal cancer patients and healthy subjects, early diagnosis models for colorectal cancer were built using the three machine learning algorithms to assess their corresponding diagnostic values. Except for serum alpha-fetoprotein, the levels of 11 other serum markers in the colorectal cancer group were higher than those in the benign colorectal disease group (P < 0.05). Diagnosis models, including a back-propagation neural network model, were built with diagnostic accuracies of 82 and 75%, sensitivities of 85 and 80%, and specificities of 80 and 70%, respectively. Colorectal cancer diagnosis models based on the three machine learning algorithms showed high diagnostic value and can help obtain evidence for the early diagnosis of colorectal cancer.
Zanariah Satari, Siti; Di, Nur Faraidah Muhammad; Zakaria, Roslinazairimah
2017-09-01
Two agglomerative hierarchical clustering algorithms for identifying multiple outliers in the circular regression model have been developed in this study. An agglomerative hierarchical clustering algorithm starts with every data point in its own cluster and continues to merge the closest pair of clusters according to some similarity criterion until all the data are grouped in one cluster. The single-linkage method is one of the simplest agglomerative hierarchical methods commonly used to detect outliers. In this study, we compared the performance of the single-linkage method with another agglomerative hierarchical method, namely average linkage, for detecting outliers in the circular regression model. The performance of both methods was examined via simulation studies by measuring their “success” probability, masking effect, and swamping effect with different sample sizes and levels of contamination. The results show that the single-linkage method performs very well in detecting multiple outliers, with lower masking and swamping effects.
Leone, Robert Matthew
A search for vector-like quarks (VLQs) decaying to a Z boson using multi-stage machine learning was compared to a search using a standard square cuts search strategy. VLQs are predicted by several new theories beyond the Standard Model. The searches used 20.3 inverse femtobarns of proton-proton collisions at a center-of-mass energy of 8 TeV collected with the ATLAS detector in 2012 at the CERN Large Hadron Collider. CLs upper limits on production cross sections of vector-like top and bottom quarks were computed for VLQs produced singly or in pairs, Tsingle, Bsingle, Tpair, and Bpair. The two stage machine learning classification search strategy did not provide any improvement over the standard square cuts strategy, but for Tpair, Bpair, and Tsingle, a third stage of machine learning regression was able to lower the upper limits of high signal masses by as much as 50%. Additionally, new test statistics were developed for use in the Neyman construction of confidence regions in order to address deficiencies in c...
Methods and applications of linear models regression and the analysis of variance
Hocking, Ronald R
2013-01-01
Praise for the Second Edition: "An essential desktop reference book . . . it should definitely be on your bookshelf." -Technometrics. A thoroughly updated book, Methods and Applications of Linear Models: Regression and the Analysis of Variance, Third Edition features innovative approaches to understanding and working with models and theory of linear regression. The Third Edition provides readers with the necessary theoretical concepts, which are presented using intuitive ideas rather than complicated proofs, to describe the inference that is appropriate for the methods being discussed. The book
An empirical likelihood method for semiparametric linear regression with right censored data.
Fang, Kai-Tai; Li, Gang; Lu, Xuyang; Qin, Hong
2013-01-01
This paper develops a new empirical likelihood method for semiparametric linear regression with a completely unknown error distribution and right censored survival data. The method is based on the Buckley-James (1979) estimating equation. It inherits some appealing properties of the complete data empirical likelihood method. For example, it does not require variance estimation which is problematic for the Buckley-James estimator. We also extend our method to incorporate auxiliary information. We compare our method with the synthetic data empirical likelihood of Li and Wang (2003) using simulations. We also illustrate our method using Stanford heart transplantation data.
Bayesian Regression and Neuro-Fuzzy Methods Reliability Assessment for Estimating Streamflow
Directory of Open Access Journals (Sweden)
Yaseen A. Hamaamin
2016-07-01
Full Text Available Accurate and efficient estimation of streamflow in a watershed’s tributaries is a prerequisite for viable water resources management. This study couples process-driven and data-driven methods of streamflow forecasting as a more efficient and cost-effective approach to water resources planning and management. Two data-driven methods, Bayesian regression and the adaptive neuro-fuzzy inference system (ANFIS), were tested separately as a faster alternative to a calibrated and validated Soil and Water Assessment Tool (SWAT) model to predict streamflow in the Saginaw River Watershed of Michigan. For the data-driven modeling process, four structures were assumed and tested: general, temporal, spatial, and spatiotemporal. Results showed that both Bayesian regression and ANFIS can replicate global (watershed) and local (subbasin) results similar to a calibrated SWAT model. At the global level, Bayesian regression and ANFIS model performance were satisfactory, with Nash-Sutcliffe efficiencies of 0.99 and 0.97, respectively. At the subbasin level, the Bayesian regression and ANFIS models were satisfactory for 155 and 151 of the 155 subbasins, respectively. Overall, the most accurate method was a spatiotemporal Bayesian regression model that outperformed the other models at both global and local scales. However, all ANFIS models also performed satisfactorily at both scales.
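The Nash-Sutcliffe efficiency used to judge the models above has a simple closed form, NSE = 1 - Σ(Oᵢ - Sᵢ)² / Σ(Oᵢ - Ō)². A minimal sketch with hypothetical streamflow values (not the study's data):

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - SSE / variance of the observations; 1 is a perfect fit,
    values <= 0 mean the model is no better than predicting the mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

# Hypothetical monthly streamflows (m^3/s) and model output:
obs = [12.0, 15.0, 9.0, 20.0, 18.0, 11.0]
sim = [11.5, 14.0, 10.0, 19.0, 18.5, 12.0]
print(nash_sutcliffe(obs, sim))
```

An NSE near 0.95 on this toy data would count as "satisfactory" by the thresholds commonly used in hydrology.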
Erener, Arzu; Sivas, A. Abdullah; Selcuk-Kestel, A. Sevtap; Düzgün, H. Sebnem
2017-07-01
All quantitative landslide susceptibility mapping (QLSM) methods require two basic data types, namely, a landslide inventory and factors that influence landslide occurrence (landslide influencing factors, LIF). Depending on the type of landslides, the nature of the triggers, and the LIF, the accuracy of QLSM methods differs. Moreover, how to balance the number of 0's (non-occurrence) and 1's (occurrence) in the training set obtained from the landslide inventory, and how to select which of the 1's and 0's to include in QLSM models, play a critical role in the accuracy of QLSM. Although the performance of various QLSM methods has been investigated extensively in the literature, the challenge of training set construction has not been adequately addressed. To tackle this challenge, this study uses three different training set selection strategies, along with the original data set, to test the performance of three different regression methods: Logistic Regression (LR), Bayesian Logistic Regression (BLR), and Fuzzy Logistic Regression (FLR). The first sampling strategy is proportional random sampling (PRS), which takes into account a weighted selection of landslide occurrences in the sample set. The second method, non-selective nearby sampling (NNS), includes randomly selected sites and their surrounding neighboring points at certain preselected distances to capture the impact of clustering. Selective nearby sampling (SNS) is the third method, which concentrates on the group of 1's and their surrounding neighborhood; a randomly selected group of landslide sites and their neighborhood are considered in the analyses, with parameters similar to NNS. It is found that the LR-PRS, FLR-PRS, and BLR-whole-data set-ups, in that order, yield the best fits among the alternatives. The results indicate that in QLSM based on regression models, avoiding spatial correlation in the data set is critical for model performance.
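Proportional random sampling as described above amounts to drawing the 1's and 0's separately so that the training set carries a chosen occurrence weight. A minimal sketch with a hypothetical inventory (the function name, numbers, and 50/50 weighting are illustrative, not from the paper):

```python
import random

def proportional_random_sample(inventory, n_samples, pos_fraction=0.5, seed=7):
    """Draw a training set with a chosen weight of occurrences (label 1)
    vs. non-occurrences (label 0). `inventory` holds (features, label) pairs."""
    rng = random.Random(seed)
    positives = [rec for rec in inventory if rec[1] == 1]
    negatives = [rec for rec in inventory if rec[1] == 0]
    n_pos = min(int(round(n_samples * pos_fraction)), len(positives))
    sample = rng.sample(positives, n_pos)
    sample += rng.sample(negatives, min(n_samples - n_pos, len(negatives)))
    rng.shuffle(sample)
    return sample

# Hypothetical inventory: 20 landslide occurrences among 200 grid cells.
inventory = [((i,), 1) for i in range(20)] + [((i,), 0) for i in range(180)]
train = proportional_random_sample(inventory, 40, pos_fraction=0.5)
print(len(train), sum(label for _, label in train))
```

The balanced set can then be fed to any of the LR, BLR, or FLR models the study compares.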
A different approach to estimate nonlinear regression model using numerical methods
Mahaboob, B.; Venkateswarlu, B.; Mokeshrayalu, G.; Balasiddamuni, P.
2017-11-01
This research paper concerns computational methods, namely the Gauss-Newton method and gradient algorithm methods (the Newton-Raphson method, the Steepest Descent/Steepest Ascent algorithm, the Method of Scoring, and the Method of Quadratic Hill-Climbing), based on numerical analysis, for estimating the parameters of a nonlinear regression model in a different way. Principles of matrix calculus are used to discuss the gradient algorithm methods. Yonathan Bard [1] discussed a comparison of gradient methods for the solution of nonlinear parameter estimation problems; this article instead takes an analytical approach to the gradient algorithm methods. The paper describes a new iterative technique, a Gauss-Newton method, which differs from the iterative technique proposed by Gordon K. Smyth [2]. Hans Georg Bock et al. [10] proposed numerical methods for parameter estimation in DAEs (differential algebraic equations). Isabel Reis Dos Santos et al. [11] introduced a weighted least squares procedure for estimating the unknown parameters of a nonlinear regression metamodel. For large-scale nonsmooth convex minimization, the Hager-Zhang (HZ) conjugate gradient method and the modified HZ (MHZ) method were presented by Gonglin Yuan et al. [12].
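As a concrete illustration of the Gauss-Newton iterations discussed above, the sketch below fits a hypothetical exponential model y = a·exp(b·x); each step linearizes the residual and solves the resulting least squares problem. This is the generic textbook formulation, not the authors' exact derivation:

```python
import numpy as np

def gauss_newton(x, y, beta0, n_iter=50):
    """Gauss-Newton for the model y = a * exp(b * x): each iteration
    linearizes the residual r = y - f and solves J^T J delta = J^T r
    (here via least squares on J delta = r)."""
    a, b = beta0
    for _ in range(n_iter):
        f = a * np.exp(b * x)
        r = y - f                                     # current residuals
        J = np.column_stack([np.exp(b * x),           # df/da
                             a * x * np.exp(b * x)])  # df/db
        delta, *_ = np.linalg.lstsq(J, r, rcond=None)
        a, b = a + delta[0], b + delta[1]
    return a, b

x = np.linspace(0.0, 1.0, 30)
y = 2.0 * np.exp(1.5 * x)        # noise-free data from a = 2, b = 1.5
a_hat, b_hat = gauss_newton(x, y, beta0=(1.0, 1.0))
print(a_hat, b_hat)
```

With noise-free data the residuals at the solution vanish, so Gauss-Newton converges essentially quadratically near the optimum.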
Anwar Fitrianto; Lee Ceng Yik
2014-01-01
When the independent variables in a multiple linear regression model are highly linearly correlated, an analysis based on the common Ordinary Least Squares (OLS) method can be misleading. In this situation, the ridge regression estimator is recommended. We conduct a simulation study to compare the performance of the ridge regression estimator with that of OLS. We found that the Hoerl and Kennard ridge regression estimation method has better performan...
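The contrast between OLS and the Hoerl-Kennard ridge estimator follows directly from their closed forms, (XᵀX)⁻¹Xᵀy versus (XᵀX + kI)⁻¹Xᵀy: with a nearly collinear design the ridge system stays well conditioned. The data and the ridge parameter k below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)          # nearly collinear with x1
X = np.column_stack([x1, x2])
y = X @ np.array([1.0, 1.0]) + 0.1 * rng.normal(size=n)

# OLS: (X'X)^-1 X'y -- the system is ill conditioned here.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
# Ridge: (X'X + kI)^-1 X'y -- k stabilizes the weak direction.
k = 1.0
beta_ridge = np.linalg.solve(X.T @ X + k * np.eye(2), X.T @ y)
print(beta_ols, beta_ridge)
```

The individual OLS coefficients can split almost arbitrarily between the two collinear columns, while the ridge coefficients stay close together; the sum of the coefficients is well determined in both cases.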
Directory of Open Access Journals (Sweden)
Yi-Ming Kuo
2011-06-01
Full Text Available Fine airborne particulate matter (PM2.5) has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August, 2005. Due to the popularity of geographic information systems (GIS), the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on the other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME) method. The resulting epistemic framework can assimilate knowledge bases including: (a) empirical-based spatial trends of PM concentration based on landuse regression, (b) the spatio-temporal dependence among PM observation information, and (c) site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan) from 2005-2007.
Yu, Hwa-Lung; Wang, Chih-Hsih; Liu, Ming-Che; Kuo, Yi-Ming
2011-06-01
Fine airborne particulate matter (PM2.5) has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August, 2005. Due to the popularity of geographic information systems (GIS), the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on the other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME) method. The resulting epistemic framework can assimilate knowledge bases including: (a) empirical-based spatial trends of PM concentration based on landuse regression, (b) the spatio-temporal dependence among PM observation information, and (c) site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan) from 2005-2007.
A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants
Cooper, Paul D.
2010-01-01
A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…
Cox regression with missing covariate data using a modified partial likelihood method
DEFF Research Database (Denmark)
Martinussen, Torben; Holst, Klaus K.; Scheike, Thomas H.
2016-01-01
Missing covariate values are a common problem in survival analysis. In this paper we propose a novel method for the Cox regression model that is close to maximum likelihood but avoids the use of the EM-algorithm. It exploits that the observed hazard function is multiplicative in the baseline hazard...
Simulation of Experimental Parameters of RC Beams by Employing the Polynomial Regression Method
Sayin, B.; Sevgen, S.; Samli, R.
2016-07-01
A numerical model based on the polynomial regression method is developed to simulate the mechanical behavior of reinforced concrete beams strengthened with a carbon-fiber-reinforced polymer and subjected to four-point bending. The results obtained are in good agreement with data from laboratory tests.
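A polynomial regression of the kind used above can be sketched in a few lines; the load-deflection numbers below are hypothetical, not the paper's test data:

```python
import numpy as np

# Hypothetical four-point-bending data: midspan deflection (mm) vs. load (kN).
load = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
deflection = np.array([0.0, 1.1, 2.5, 4.6, 7.4])

# Least-squares fit of a degree-2 polynomial, then evaluation of the model.
coeffs = np.polyfit(load, deflection, deg=2)
model = np.poly1d(coeffs)
print(model(25.0))   # interpolated deflection at 25 kN
```

The fitted polynomial can then be evaluated at loads between the measured points, which is how a regression model "simulates" the beam response.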
Shieh, Gwowen
2017-12-01
Covariate-dependent reference limits have been extensively applied in biology and medicine for determining the substantial magnitude and relative importance of quantitative measurements. Confidence interval and sample size procedures are available for studying regression-based reference limits. However, the existing popular methods employ different technical simplifications and are applicable only in certain limited situations. This paper describes exact confidence intervals of regression-based reference limits and compares the exact approach with the approximate methods under a wide range of model configurations. Using the ratio between the widths of confidence interval and reference interval as the relative precision index, optimal sample size procedures are presented for precise interval estimation under expected ratio and tolerance probability considerations. Simulation results show that the approximate interval methods using normal distribution have inaccurate confidence limits. The exact confidence intervals dominate the approximate procedures in one- and two-sided coverage performance. Unlike the current simplifications, the proposed sample size procedures integrate all key factors including covariate features in the optimization process and are suitable for various regression-based reference limit studies with potentially diverse configurations. The exact interval estimation has theoretical and practical advantages over the approximate methods. The corresponding sample size procedures and computing algorithms are also presented to facilitate the data analysis and research design of regression-based reference limits. Copyright © 2017 Elsevier Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
Bing-Chun Liu
Full Text Available Today, China is facing a very serious air pollution issue due to its dreadful impact on human health as well as the environment. The urban cities in China are the most affected due to their rapid industrial and economic growth. Therefore, it is of extreme importance to come up with new, better, and more reliable forecasting models to accurately predict air quality. This paper selected Beijing, Tianjin, and Shijiazhuang, three cities from the Jingjinji Region, for a study to develop a new collaborative forecasting model using Support Vector Regression (SVR) for Urban Air Quality Index (AQI) prediction in China. The present study aims to improve the forecasting results by minimizing the prediction error of present machine learning algorithms by taking multiple-city, multi-dimensional air quality information and weather conditions as input. The results show a decrease in MAPE in the case of multiple-city multi-dimensional regression when there is a strong interaction and correlation of the air quality characteristic attributes with AQI. Also, the geographical location is found to play a significant role in Beijing, Tianjin, and Shijiazhuang AQI prediction.
Liu, Bing-Chun; Binaykia, Arihant; Chang, Pei-Chann; Tiwari, Manoj Kumar; Tsao, Cheng-Chin
2017-01-01
Today, China is facing a very serious air pollution issue due to its dreadful impact on human health as well as the environment. The urban cities in China are the most affected due to their rapid industrial and economic growth. Therefore, it is of extreme importance to come up with new, better, and more reliable forecasting models to accurately predict air quality. This paper selected Beijing, Tianjin, and Shijiazhuang, three cities from the Jingjinji Region, for a study to develop a new collaborative forecasting model using Support Vector Regression (SVR) for Urban Air Quality Index (AQI) prediction in China. The present study aims to improve the forecasting results by minimizing the prediction error of present machine learning algorithms by taking multiple-city, multi-dimensional air quality information and weather conditions as input. The results show a decrease in MAPE in the case of multiple-city multi-dimensional regression when there is a strong interaction and correlation of the air quality characteristic attributes with AQI. Also, the geographical location is found to play a significant role in Beijing, Tianjin, and Shijiazhuang AQI prediction.
Kiala, Zolo; Odindi, John; Mutanga, Onisimo; Peerbhay, Kabir
2016-07-01
Leaf area index (LAI) is a key biophysical parameter commonly used to determine vegetation status, productivity, and health in tropical grasslands. Accurate LAI estimates are useful in supporting sustainable rangeland management by providing information related to grassland condition and associated goods and services. The performance of support vector regression (SVR) was compared to partial least square regression (PLSR) on selected optimal hyperspectral bands to detect LAI in heterogeneous grassland. Results show that PLSR performed better than SVR at the beginning and end of summer. At the peak of the growing season (mid-summer), during reflectance saturation, SVR models yielded higher accuracies (R² = 0.902 and RMSE = 0.371 m² m⁻²) than PLSR models (R² = 0.886 and RMSE = 0.379 m² m⁻²). For the combined dataset (all of summer), SVR models were slightly more accurate (R² = 0.74 and RMSE = 0.578 m² m⁻²) than PLSR models (R² = 0.732 and RMSE = 0.58 m² m⁻²). Variable importance in the projection scores shows that most of the bands were located in the near-infrared and shortwave regions of the electromagnetic spectrum, thus providing a basis to investigate the potential of sensors on aerial and satellite platforms for large-scale grassland LAI prediction.
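The R² and RMSE figures quoted above follow from their usual definitions; a minimal sketch with hypothetical LAI values (not the study's measurements):

```python
import math

def r2_rmse(y_true, y_pred):
    """Coefficient of determination (R^2) and root-mean-square error, the two
    accuracy measures used to compare the SVR and PLSR models above."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r2, rmse

lai_true = [1.2, 2.5, 3.1, 4.0, 2.2]   # hypothetical LAI values (m^2 m^-2)
lai_pred = [1.0, 2.7, 3.0, 3.8, 2.4]
r2, rmse = r2_rmse(lai_true, lai_pred)
print(r2, rmse)
```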
Extreme learning machines for regression based on V-matrix method.
Yang, Zhiyong; Zhang, Taohong; Lu, Jingcheng; Su, Yuan; Zhang, Dezheng; Duan, Yaowu
2017-10-01
This paper studies the joint effect of the V-matrix, a recently proposed framework for statistical inference, and the extreme learning machine (ELM) on regression problems. First, a novel algorithm is proposed to efficiently evaluate the V-matrix. Second, a novel weighted ELM algorithm called V-ELM is proposed based on the explicit kernel mapping of ELM and the V-matrix method. Though the V-matrix method can capture the geometrical structure of the training data, it tends to assign higher weights to instances with smaller input values. To avoid this bias, a method called VI-ELM is proposed that minimizes both the regression error and the V-matrix weighted error simultaneously. Finally, experimental results on 12 real-world benchmark datasets show the effectiveness of the proposed methods.
Ghaedi, M; Rahimi, Mahmoud Reza; Ghaedi, A M; Tyagi, Inderjeet; Agarwal, Shilpi; Gupta, Vinod Kumar
2016-01-01
Two novel and eco-friendly adsorbents, namely tin oxide nanoparticles loaded on activated carbon (SnO2-NP-AC) and activated carbon prepared from the wood of the tree Pistacia atlantica (AC-PAW), were used for the rapid removal and fast adsorption of methyl orange (MO) from the aqueous phase. The dependency of MO removal on various adsorption parameters was modeled and optimized using multiple linear regression (MLR) and least squares support vector regression (LSSVR). The optimal parameters for the LSSVR model were a γ value of 0.76 and a σ² of 0.15. For the test data set, a mean square error (MSE) of 0.0010 and a coefficient of determination (R²) of 0.976 were obtained for the LSSVR model, and an MSE of 0.0037 and an R² of 0.897 for the MLR model. The adsorption equilibrium and kinetic data were well fitted by the Langmuir isotherm model and by the second-order and intra-particle diffusion models, respectively. Small amounts of the proposed SnO2-NP-AC and AC-PAW (0.015 g and 0.08 g, respectively) sufficed for successful rapid removal of methyl orange (>95%). The maximum adsorption capacities for SnO2-NP-AC and AC-PAW were 250 mg g⁻¹ and 125 mg g⁻¹, respectively. Copyright © 2015 Elsevier Inc. All rights reserved.
A note on the multiple-recursive matrix method for generating pseudorandom vectors
Bishoi, Susil Kumar; Haran, Himanshu Kumar; Hasan, Sartaj Ul
2016-01-01
The multiple-recursive matrix method for generating pseudorandom vectors was introduced by Niederreiter (Linear Algebra Appl. 192 (1993), 301-328). We propose an algorithm for finding an efficient primitive multiple-recursive matrix method. Moreover, for improving the linear complexity, we introduce a tweak on the contents of the primitive multiple-recursive matrix method.
Vector-based plane-wave spectrum method for the propagation of cylindrical electromagnetic fields.
Shi, S; Prather, D W
1999-11-01
We present a vector-based plane-wave spectrum (VPWS) method for efficient propagation of cylindrical electromagnetic fields. In comparison with electromagnetic propagation integrals, the VPWS method significantly reduces time of propagation. Numerical results that illustrate the utility of this method are presented.
Track Circuit Fault Diagnosis Method based on Least Squares Support Vector
Cao, Yan; Sun, Fengru
2018-01-01
In order to improve the troubleshooting efficiency and accuracy of the track circuit, a track circuit fault diagnosis method was researched. First, the least squares support vector machine was applied to design a multi-fault classifier for the track circuit, and measured track data were then used as training samples to verify the feasibility of the method. Finally, the results of BP neural network fault diagnosis methods and of the method used in this paper were compared. The results show that the track fault classifier based on the least squares support vector machine can effectively diagnose the five track circuit fault types with less computing time.
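A least squares SVM replaces the quadratic program of a standard SVM with a single linear system in the dual variables, which is where the speed advantage comes from. The sketch below trains a binary LS-SVM classifier on toy two-class data standing in for fault/no-fault feature vectors; the RBF kernel and the parameters are illustrative, not the paper's:

```python
import numpy as np

def rbf(X1, X2, sigma=1.0):
    """Gaussian RBF kernel matrix between two sets of row vectors."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """LS-SVM dual: solve [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    where Omega_ij = y_i y_j K(x_i, x_j). One linear solve, no QP."""
    n = len(y)
    Omega = np.outer(y, y) * rbf(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate([[0.0], np.ones(n)])
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]          # bias b, dual weights alpha

def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma=1.0):
    K = rbf(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)

# Toy two-class data standing in for fault / no-fault feature vectors.
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.2, 0.9]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
b, alpha = lssvm_train(X, y)
pred = lssvm_predict(X, y, b, alpha, X)
print(pred)
```

For a multi-fault classifier as in the paper, one would combine several such binary machines (e.g. one-vs-rest).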
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are capable of forecasting a full probability distribution. To estimate the corresponding regression coefficients, CRPS minimization has been performed in many meteorological post-processing studies over the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield more calibrated forecasts. Theoretically, both scoring rules, when used as optimization criteria, should locate a similar, unknown optimum; discrepancies might result from a wrong distributional assumption about the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation under different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield similar regression coefficients, with the log-likelihood estimator being slightly more efficient. A real-world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
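For a Gaussian predictive distribution, the CRPS being minimized above has a closed form: CRPS(N(μ, σ²), y) = σ[z(2Φ(z) - 1) + 2φ(z) - 1/√π] with z = (y - μ)/σ. A minimal sketch (the forecast values are illustrative):

```python
import math

def crps_gaussian(y, mu, sigma):
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2) for outcome y.
    Lower is better; the score rewards both calibration and sharpness."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)   # phi(z)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))            # Phi(z)
    return sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi))

# A sharper forecast with the right mean scores better (lower CRPS):
print(crps_gaussian(0.3, mu=0.0, sigma=1.0))
print(crps_gaussian(0.3, mu=0.0, sigma=0.5))
```

Minimum-CRPS estimation sums this expression over all forecast cases and minimizes it with respect to the regression coefficients that drive μ and σ.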
Non-overlapped P- and S-wave Poynting vectors and its solution on Grid Method
Lu, Yong Ming
2017-12-12
The Poynting vector represents the local directional energy flux density of seismic waves in geophysics. It is widely used in elastic reverse time migration (RTM) to analyze source illumination, suppress low-wavenumber noise, correct for image polarity and extract angle-domain common imaging gathers (ADCIG). However, the P and S waves are mixed together during wavefield propagation, such that the P and S energy fluxes are not clean everywhere, especially at the overlapped points. In this paper, we use a modified elastic wave equation in which the P and S vector wavefields are naturally separated. Then, we develop an efficient method to evaluate the separable P and S Poynting vectors, based on the view that the group velocity and phase velocity have the same direction in isotropic elastic media. We furthermore formulate our method using an unstructured-mesh-based modeling method named the grid method. Finally, we verify our method using two numerical examples.
A Fast Gradient Method for Nonnegative Sparse Regression With Self-Dictionary
Gillis, Nicolas; Luce, Robert
2018-01-01
A nonnegative matrix factorization (NMF) can be computed efficiently under the separability assumption, which asserts that all the columns of the given input data matrix belong to the cone generated by a (small) subset of them. The provably most robust methods to identify these conic basis columns are based on nonnegative sparse regression and self dictionaries, and require the solution of large-scale convex optimization problems. In this paper we study a particular nonnegative sparse regression model with self dictionary. As opposed to previously proposed models, this model yields a smooth optimization problem where the sparsity is enforced through linear constraints. We show that the Euclidean projection on the polyhedron defined by these constraints can be computed efficiently, and propose a fast gradient method to solve our model. We compare our algorithm with several state-of-the-art methods on synthetic data sets and real-world hyperspectral images.
Dhanya, S; Kumari Roshni, V S
2016-01-01
Textures play an important role in image classification. This paper proposes a high performance texture classification method using a combination of multiresolution analysis tool and linear regression modelling by channel elimination. The correlation between different frequency regions has been validated as a sort of effective texture characteristic. This method is motivated by the observation that there exists a distinctive correlation between the image samples belonging to the same kind of texture, at different frequency regions obtained by a wavelet transform. Experimentally, it is observed that this correlation differs across textures. The linear regression modelling is employed to analyze this correlation and extract texture features that characterize the samples. Our method considers not only the frequency regions but also the correlation between these regions. This paper primarily focuses on applying the Dual Tree Complex Wavelet Packet Transform and the Linear Regression model for classification of the obtained texture features. Additionally the paper also presents a comparative assessment of the classification results obtained from the above method with two more types of wavelet transform methods namely the Discrete Wavelet Transform and the Discrete Wavelet Packet Transform.
A geometric Newton method for Oja's vector field.
Absil, P A; Ishteva, M; De Lathauwer, L; Van Huffel, S
2009-05-01
Newton's method for solving the matrix equation F(X) ≡ AX - XXᵀAX = 0 runs up against the fact that its zeros are not isolated. This is due to a symmetry of F by the action of the orthogonal group. We show how differential-geometric techniques can be exploited to remove this symmetry and obtain a "geometric" Newton algorithm that finds the zeros of F. The geometric Newton method does not suffer from the degeneracy issue that stands in the way of the original Newton method.
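Away from the degeneracy the paper addresses, the zeros of F can also be approached by a plain fixed-point iteration on F itself. The sketch below (step size and matrix are illustrative) converges to a unit dominant eigenvector of A; the sign ambiguity of the limit illustrates the non-isolated zeros that motivate the geometric Newton method:

```python
import numpy as np

# For a single column X = x, the zeros of F(x) = Ax - x x^T A x are the unit
# eigenvectors of A, and every sign flip of a zero is again a zero.
A = np.diag([3.0, 1.0, 0.5])
rng = np.random.default_rng(1)
x = rng.normal(size=3)
x /= np.linalg.norm(x)           # start on the unit sphere
for _ in range(500):
    x = x + 0.1 * (A @ x - x * (x @ A @ x))   # Euler step along Oja's field
print(x)                          # approximately +/- e1, the top eigenvector
```

This slow first-order iteration is exactly what a (geometric) Newton method is meant to replace with locally quadratic convergence.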
Al-Najami, Issam; Drue, Henrik C; Steele, Robert; Baatrup, Gunnar
2017-12-01
The measurement of tumor regression after neoadjuvant oncological treatment has gained increasing interest because it has a prognostic value and because it may influence the method of treatment in rectal cancer. The assessment of tumor regression remains difficult and inaccurate with existing methods. Dual Energy Computed Tomography (DECT) enables qualitative tissue differentiation by simultaneous scanning with different levels of energy. We aimed to assess the feasibility of DECT in quantifying tumor response to neoadjuvant therapy in loco-advanced rectal cancer. We enrolled 11 patients with histological and MRI verified loco-advanced rectal adenocarcinoma and followed up on them prospectively. All patients had one DECT scanning before neoadjuvant treatment and one 12 weeks after using the spectral imaging scan mode. DECT analyzing tools were used to determine the average quantitative parameters; effective-Z, water- and iodine-concentration, Dual Energy Index (DEI), and Dual Energy Ratio (DER). These parameters were compared to the regression in the resection specimen as measured by the pathologist. Changes in the quantitative parameters differed significantly after treatment in comparison with pre-treatment, and the results were different in patients with different CRT response rates. DECT might be helpful in the assessment of rectal cancer regression grade after neoadjuvant treatment. © 2017 Wiley Periodicals, Inc.
Ricles, James M.
1990-01-01
The development and preliminary assessment of a method for dynamic structural analysis based on load-dependent Ritz vectors are presented. The vector basis is orthogonalized with respect to the mass and structural stiffness so that the equations of motion can be uncoupled and efficient analysis of large space structures performed. A series of computer programs was developed based on the algorithm for generating the orthogonal load-dependent Ritz vectors. Transient dynamic analysis performed on the Space Station Freedom using the software was found to provide solutions that require a smaller number of vectors than the modal analysis method. Error norms based on the participation of the structure's mass distribution and the spatial distribution of the structural loading, respectively, were developed to provide an indication of vector truncation. These norms are computed before the transient analysis is performed. An assessment of these norms through a convergence study of the structural response was performed. The results from this assessment indicate that the error norms can provide a means of judging the quality of the vector basis and the accuracy of the transient dynamic solution.
Adaptive Vector Finite Element Methods for the Maxwell Equations
Harutyunyan, D.
2007-01-01
The increasing demand to understand the behaviour of electromagnetic waves in many real life problems requires solution of the Maxwell equations. In most cases the exact solution of the Maxwell equations is not available, hence numerical methods are indispensable tool to solve them numerically using
Permanent Magnet Flux Online Estimation Based on Zero-Voltage Vector Injection Method
DEFF Research Database (Denmark)
Xie, Ge; Lu, Kaiyuan; Kumar, Dwivedi Sanjeet
2015-01-01
In this paper, a simple signal injection method is proposed for sensorless control of PMSM at low speed, which ideally requires one voltage vector only for position estimation. The proposed method is easy to implement resulting in low computation burden. No filters are needed for extracting the h...
Ultrasonic 3-D Vector Flow Method for Quantitative In Vivo Peak Velocity and Flow Rate Estimation
DEFF Research Database (Denmark)
Holbek, Simon; Ewertsen, Caroline; Bouzari, Hamed
2017-01-01
Current clinical ultrasound (US) systems are limited to show blood flow movement in either 1-D or 2-D. In this paper, a method for estimating 3-D vector velocities in a plane using the transverse oscillation method, a 32×32 element matrix array, and the experimental US scanner SARUS is presented....
An Information Retrieval Model Based on Vector Space Method by Supervised Learning.
Tai, Xiaoying; Ren, Fuji; Kita, Kenji
2002-01-01
Proposes a method to improve retrieval performance of the vector space model by using users' relevance feedback. Discusses the use of singular value decomposition and the latent semantic indexing model, and reports the results of two experiments that show the effectiveness of the proposed method. (Author/LRW)
Methods of treating Parkinson's disease using viral vectors
Energy Technology Data Exchange (ETDEWEB)
Bankiewicz, Krystof; Cunningham, Janet
2016-11-15
Methods of delivering viral vectors, particularly recombinant adeno-associated virus (rAAV) virions, to the central nervous system (CNS) using convection enhanced delivery (CED) are provided. The rAAV virions include a nucleic acid sequence encoding a therapeutic polypeptide. The methods can be used for treating CNS disorders such as for treating Parkinson's Disease.
Huang, Mengmeng; Wei, Yan; Wang, Jun; Zhang, Yu
2016-09-01
We used the support vector regression (SVR) approach to predict and unravel the reduction/promotion effects of characteristic flavonoids on acrylamide formation under a low-moisture Maillard reaction system. Results demonstrated reduction/promotion effects by flavonoids at addition levels of 1-10000 μmol/L. The maximal inhibition rates (51.7%, 68.8%, and 26.1%) and promotion rates (57.7%, 178.8%, and 27.5%) caused by flavones, flavonols, and isoflavones were observed at addition levels of 100 μmol/L and 10000 μmol/L, respectively. The reduction/promotion effects were closely related to the change in trolox equivalent antioxidant capacity (ΔTEAC) and were well predicted from triple ΔTEAC measurements via SVR models (R: 0.633-0.900). Flavonols exhibit stronger effects on acrylamide formation than flavones and isoflavones, as well as their O-glycoside derivatives, which may be attributed to the number and position of the phenolic and 3-enolic hydroxyls. The reduction/promotion effects were also well predicted using optimized quantitative structure-activity relationship (QSAR) descriptors and SVR models (R: 0.926-0.994). Compared to artificial neural network and multi-linear regression models, SVR models exhibited better fitting performance for both TEAC-dependent and QSAR-descriptor-dependent prediction. These observations demonstrate that SVR models are competent tools for predicting the effects of natural antioxidants on acrylamide formation and support their future use for decreasing it.
Directory of Open Access Journals (Sweden)
Xueyong Liu
2014-01-01
Full Text Available Infrasound is a type of low-frequency signal that occurs in nature and results from man-made events, typically ranging in frequency from 0.01 Hz to 20 Hz. In this paper, a classification method based on the Hilbert-Huang transform (HHT) and support vector machine (SVM) is proposed to discriminate between three different natural events. The frequency spectrum characteristics of infrasound signals produced by different events, such as volcanoes, are unique, which lays the foundation for infrasound signal classification. First, the HHT method was used to extract the feature vectors of several kinds of infrasound events from the Hilbert marginal spectrum. Then, the feature vectors were classified by the SVM method. Finally, the classification and identification accuracies are presented. The simulation results show that the recognition rate is above 97.7%, and that the approach is effective for classifying event types even with small samples.
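Once feature vectors have been extracted, the classification stage reduces to a standard multi-class SVM. The sketch below assumes the HHT feature extraction has already happened and replaces it with synthetic Gaussian feature clusters for three event classes; the dimensions and class counts are illustrative.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Synthetic stand-in for Hilbert-marginal-spectrum feature vectors:
# three event classes, each a Gaussian cluster in an 8-D feature space.
rng = np.random.default_rng(1)
n, d = 60, 8
centers = rng.normal(size=(3, d)) * 3
X = np.vstack([c + rng.normal(size=(n, d)) for c in centers])
y = np.repeat([0, 1, 2], n)

# Hold out 30% of the samples, then classify with an RBF-kernel SVM.
Xtr, Xte, ytr, yte = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)
clf = SVC(kernel="rbf", gamma="scale").fit(Xtr, ytr)
acc = clf.score(Xte, yte)
print(round(acc, 3))
```

The recognition rates reported in the abstract refer to real HHT features, where class overlap is determined by the physics of the events rather than by a chosen cluster separation.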
Correcting for cryptic relatedness by a regression-based genomic control method
Directory of Open Access Journals (Sweden)
Yang Yaning
2009-12-01
Full Text Available Abstract Background The genomic control (GC) method is a useful tool for correcting for cryptic relatedness in population-based association studies. It was originally proposed for correcting for the variance inflation of the Cochran-Armitage additive trend test by using information from unlinked null markers, and was later generalized to be applicable to other tests with the additional requirement that the null markers be matched with the candidate marker in allele frequencies. However, matching allele frequencies limits the number of available null markers and thus limits the applicability of the GC method. On the other hand, errors in genotype/allele frequencies may cause further bias and variance inflation and thereby aggravate the effect of GC correction. Results In this paper, we propose a regression-based GC method using null markers that are not necessarily matched in allele frequencies with the candidate marker. Variation of the allele frequencies of the null markers is adjusted by a regression method. Conclusion The proposed method can be readily applied to Cochran-Armitage trend tests other than the additive trend test, to Pearson's chi-square test, and to other robust efficiency tests. Simulation results show that the proposed method is effective in controlling type I error in the presence of population substructure.
A subagging regression method for estimating the qualitative and quantitative state of groundwater
Jeong, Jina; Park, Eungyu; Han, Weon Shik; Kim, Kue-Young
2017-08-01
A subsample aggregating (subagging) regression (SBR) method for the analysis of groundwater data pertaining to trend-estimation-associated uncertainty is proposed. The SBR method is validated against synthetic data competitively with other conventional robust and non-robust methods. The results verify that the estimation accuracies of the SBR method are consistent and superior to those of the other methods, and the uncertainties are reasonably estimated, whereas the other methods offer no uncertainty analysis option. For further validation, actual groundwater data are employed and analyzed comparatively with Gaussian process regression (GPR). For all cases, the trend and the associated uncertainties are reasonably estimated by both SBR and GPR, regardless of Gaussian or non-Gaussian skewed data. However, GPR is expected to be limited in applications to data severely corrupted by outliers owing to its non-robustness. From the implementations, it is determined that the SBR method has the potential to be further developed as an effective tool for anomaly detection or outlier identification in groundwater state data such as the groundwater level and contaminant concentration.
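The subagging idea can be sketched in a few lines: fit the same estimator on many random subsamples and aggregate the results, which yields both a robust point estimate and an uncertainty band. The groundwater series, outlier pattern, and linear-trend estimator below are synthetic illustrations under assumed parameters, not the paper's data or its exact SBR estimator.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 10, 120)                          # time (years)
level = 5.0 - 0.15 * t + rng.normal(0, 0.4, t.size)  # synthetic groundwater level (m)
level[::17] += 2.5                                   # a few outlier spikes

# Subagging: fit a linear trend on many random half-size subsamples
# (drawn without replacement) and collect the fitted slopes.
slopes = []
for _ in range(200):
    idx = rng.choice(t.size, size=t.size // 2, replace=False)
    slopes.append(np.polyfit(t[idx], level[idx], 1)[0])
slopes = np.array(slopes)

trend = np.median(slopes)                    # robust point estimate of the slope
lo, hi = np.percentile(slopes, [2.5, 97.5])  # associated uncertainty band
print(round(trend, 3), round(lo, 3), round(hi, 3))
```

The spread of the aggregated estimates is what gives subagging its built-in uncertainty quantification, the feature the abstract contrasts with conventional single-fit methods.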
Directory of Open Access Journals (Sweden)
Zhang Jing
2016-01-01
Full Text Available To help physicians quickly find the required 3D model from a large collection of medical models, we propose a novel retrieval method, called DRFVT, which combines the characteristics of the dimensionality reduction (DR) and feature vector transformation (FVT) methods. The DR method reduces the dimensionality of the feature vector; only the top M low-frequency Discrete Fourier Transform coefficients are retained. The FVT method transforms the original feature vector and generates a new feature vector to solve the problem of noise sensitivity. The experimental results demonstrate that the DRFVT method achieves more effective and efficient retrieval results than other proposed methods.
Jing, Zhang; Sheng, Kang Bao
2015-01-01
To help physicians quickly find the required 3D model from a large collection of medical models, we propose a novel retrieval method, called DRFVT, which combines the characteristics of the dimensionality reduction (DR) and feature vector transformation (FVT) methods. The DR method reduces the dimensionality of the feature vector; only the top M low-frequency Discrete Fourier Transform coefficients are retained. The FVT method transforms the original feature vector and generates a new feature vector to solve the problem of noise sensitivity. The experimental results demonstrate that the DRFVT method achieves more effective and efficient retrieval results than other proposed methods.
Owolabi, Taoreed O.; Akande, Kabiru O.; Olatunji, Sunday O.; Aldhafferi, Nahier; Alqahtani, Abdullah
2017-11-01
Titanium dioxide (TiO2) is a semiconductor characterized by a wide band gap that attracts significant attention for several applications, including solar cell carrier transportation and photo-catalysis. The tunable band gap of this semiconductor, coupled with its low cost, chemical stability and non-toxicity, makes it indispensable for these applications. Structural distortion always accompanies TiO2 band gap tuning through doping, and the present work utilizes the resulting structural lattice distortion to estimate the band gap of doped TiO2 using support vector regression (SVR) coupled with a novel gravitational search algorithm (GSA) for hyper-parameter optimization. In order to fully capture the non-linear relationship between lattice distortion and band gap, two SVR models were homogeneously hybridized and subsequently optimized using GSA. The GSA-HSVR (hybridized SVR) model performs better than the GSA-SVR model, with a performance improvement of 57.2% on the basis of root mean square error reduction on the testing dataset. The effect of Co doping and nitrogen-iodine co-doping on the band gap of the TiO2 semiconductor was modeled and simulated. The obtained band gap estimates show excellent agreement with the values reported from experiment. By implementing the models, the band gap of doped TiO2 can be estimated with a high level of precision, and the absorption of the semiconductor can be extended to the visible region of the spectrum for improved properties and efficiency.
Ghaedi, M; Dashtian, K; Ghaedi, A M; Dehghanian, N
2016-05-11
The aim of this work is to study the predictive ability of a hybrid model of support vector regression with genetic algorithm optimization (GA-SVR) for the adsorption of malachite green (MG) onto multi-walled carbon nanotubes (MWCNTs). Various factors were investigated by central composite design, and the optimum conditions were set as: pH 8, 0.018 g MWCNTs, 8 mg L(-1) dye mixed with 50 mL solution thoroughly for 10 min. The Langmuir, Freundlich, Temkin and D-R isothermal models were applied to fit the experimental data, and the data were well explained by the Langmuir model with a maximum adsorption capacity of 62.11-80.64 mg g(-1) in a short time at 25 °C. Kinetic studies at various adsorbent dosages and initial MG concentrations show that maximum MG removal was achieved within 10 min of the start of every experiment under most conditions. The adsorption obeys the pseudo-second-order rate equation in addition to the intraparticle diffusion model. The optimal parameters (C of 0.2509, σ(2) of 0.1288 and ε of 0.2018) for the SVR model were obtained based on the GA. For the testing data set, an MSE value of 0.0034 and a coefficient of determination (R(2)) of 0.9195 were achieved.
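The hyper-parameter search behind models like this (finding C, kernel width, and epsilon for an SVR) can be illustrated with a simple randomized search standing in for the genetic algorithm; the data, search ranges, and two-input response surface below are all illustrative assumptions, not the adsorption dataset.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.svm import SVR
from sklearn.model_selection import RandomizedSearchCV

# Synthetic two-factor response surface (e.g. dose and contact time
# as hypothetical inputs) with a smooth non-linear dependence.
rng = np.random.default_rng(11)
X = rng.uniform(size=(60, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] + 0.05 * rng.normal(size=60)

# Randomized search over C, gamma (kernel width) and epsilon, playing
# the role the genetic algorithm plays in the paper.
search = RandomizedSearchCV(
    SVR(kernel="rbf"),
    {"C": loguniform(1e-2, 1e2),
     "gamma": loguniform(1e-2, 1e1),
     "epsilon": loguniform(1e-3, 1e0)},
    n_iter=30, cv=5, random_state=0,
).fit(X, y)
print(search.best_params_)
```

A genetic algorithm explores the same parameter space, but with crossover and mutation of candidate solutions rather than independent random draws.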
Comparison of regression methods for modeling intensive care length of stay.
Directory of Open Access Journals (Sweden)
Ilona W M Verburg
Full Text Available Intensive care units (ICUs) are increasingly interested in assessing and improving their performance. ICU Length of Stay (LoS) could be seen as an indicator of efficiency of care. However, little consensus exists on which prognostic method should be used to adjust ICU LoS for case-mix factors. This study compared the performance of different regression models when predicting ICU LoS. We included data from 32,667 unplanned ICU admissions to ICUs participating in the Dutch National Intensive Care Evaluation (NICE) in the year 2011. We predicted ICU LoS using eight regression models: ordinary least squares regression on untransformed ICU LoS, on LoS truncated at 30 days and on log-transformed LoS; a generalized linear model with a Gaussian distribution and a logarithmic link function; Poisson regression; negative binomial regression; Gamma regression with a logarithmic link function; and the original and recalibrated APACHE IV models, for all patients together and for survivors and non-survivors separately. We assessed the predictive performance of the models using bootstrapping and the squared Pearson correlation coefficient (R2), root mean squared prediction error (RMSPE), mean absolute prediction error (MAPE) and bias. The distribution of ICU LoS was skewed to the right, with a median of 1.7 days (interquartile range 0.8 to 4.0) and a mean of 4.2 days (standard deviation 7.9). The predictive performance of the models was between 0.09 and 0.20 for R2, between 7.28 and 8.74 days for RMSPE, between 3.00 and 4.42 days for MAPE and between -2.99 and 1.64 days for bias. The predictive performance was slightly better for survivors than for non-survivors. We were disappointed in the predictive performance of the regression models and conclude that it is difficult to predict the LoS of unplanned ICU admissions using patient characteristics at admission time only.
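The comparison machinery (fit several models to a right-skewed outcome and score each with R2, RMSPE, MAPE and bias) can be sketched on synthetic data. The block below contrasts just two of the eight models (OLS on raw LoS versus OLS on log-transformed LoS with back-transformed predictions); the data-generating model and single covariate are assumptions for illustration.

```python
import numpy as np

# Synthetic right-skewed length-of-stay data driven by one case-mix covariate.
rng = np.random.default_rng(3)
n = 500
x = rng.normal(size=n)
los = np.exp(0.5 + 0.6 * x + rng.normal(0, 0.8, n))  # LoS in days, log-normal

# Model A: OLS on untransformed LoS.
# Model B: OLS on log(LoS), predictions back-transformed (no smearing correction).
bA = np.polyfit(x, los, 1)
predA = np.polyval(bA, x)
bB = np.polyfit(x, np.log(los), 1)
predB = np.exp(np.polyval(bB, x))

def metrics(pred, obs):
    err = pred - obs
    r2 = np.corrcoef(pred, obs)[0, 1] ** 2   # squared Pearson correlation
    rmspe = np.sqrt(np.mean(err ** 2))       # root mean squared prediction error
    mape = np.mean(np.abs(err))              # mean absolute prediction error
    bias = np.mean(err)
    return r2, rmspe, mape, bias

for name, pred in [("raw", predA), ("log", predB)]:
    print(name, [round(v, 2) for v in metrics(pred, los)])
```

In the study, the metrics were additionally bootstrapped to obtain stable estimates rather than computed once on the training data as here.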
Gui, J; Li, H
2005-01-01
An important area of research in pharmacogenomics is relating high-dimensional genetic or genomic data to various clinical phenotypes of patients. Due to large variability in the time to a certain clinical event among patients, studying possibly censored survival phenotypes can be more informative than treating the phenotypes as categorical variables. In this paper, we develop a threshold gradient descent (TGD) method for the Cox model to select genes that are relevant to patients' survival and to build a predictive model for the risk of a future patient. The computational difficulty associated with estimation in high-dimensional, low-sample-size settings can be efficiently solved by the gradient descent iterations. Results from application to a real data set on predicting survival after chemotherapy for patients with diffuse large B-cell lymphoma demonstrate that the proposed method can be used to identify important genes related to time to death due to cancer and to build a parsimonious model for predicting the survival of future patients. The TGD-based Cox regression gives better predictive performance than the L2 penalized regression and can select more relevant genes than the L1 penalized regression.
Multiple linear regression with some correlated errors: classical and robust methods.
Pires, Ana M; Rodrigues, Isabel M
2007-07-10
In this paper we consider classical and robust methods of estimation and diagnostics for the multiple linear regression model when some of the errors are correlated. This work was motivated by the analysis of a medical data set from an observational study aimed at identifying factors affecting the outcome of a surgical method for the correction of scoliosis (abnormal lateral spinal curvature). There are 392 observations, but some of them are on the same patient (double curves). It seems adequate to consider a multiple linear regression model but, since it is not desirable to discard the double curves, the assumption of non-correlated errors is clearly violated, and this is indeed confirmed by related diagnostics on the residuals (Durbin-Watson test). A more appropriate model retains the linear structure but allows for non-null correlation between the errors on the same patient. We propose two different procedures for the estimation of the parameters of the linear model and the correlation parameters: maximum likelihood assuming normal errors, and a robustified version obtained by plugging in results from robust linear regression. The latter procedure is designed to be resistant to outlying observations or error distributions with heavy tails, and produced the most satisfactory results for the analysed data set. Copyright 2006 John Wiley & Sons, Ltd.
A robust and efficient stepwise regression method for building sparse polynomial chaos expansions
Energy Technology Data Exchange (ETDEWEB)
Abraham, Simon, E-mail: Simon.Abraham@ulb.ac.be [Vrije Universiteit Brussel (VUB), Department of Mechanical Engineering, Research Group Fluid Mechanics and Thermodynamics, Pleinlaan 2, 1050 Brussels (Belgium); Raisee, Mehrdad [School of Mechanical Engineering, College of Engineering, University of Tehran, P.O. Box: 11155-4563, Tehran (Iran, Islamic Republic of); Ghorbaniasl, Ghader; Contino, Francesco; Lacor, Chris [Vrije Universiteit Brussel (VUB), Department of Mechanical Engineering, Research Group Fluid Mechanics and Thermodynamics, Pleinlaan 2, 1050 Brussels (Belgium)
2017-03-01
Polynomial Chaos (PC) expansions are widely used in various engineering fields for quantifying uncertainties arising from uncertain parameters. The computational cost of classical PC solution schemes is unaffordable, as the number of deterministic simulations to be calculated grows dramatically with the number of stochastic dimensions. This considerably restricts the practical use of PC at the industrial level. A common approach to address such problems is to make use of sparse PC expansions. This paper presents a non-intrusive regression-based method for building sparse PC expansions. The most important PC contributions are detected sequentially through an automatic search procedure. The variable selection criterion is based on efficient tools relevant to probabilistic methods. Two benchmark analytical functions are used to validate the proposed algorithm. The computational efficiency of the method is then illustrated by a more realistic CFD application, consisting of the non-deterministic flow around a transonic airfoil subject to geometrical uncertainties. To assess the performance of the developed methodology, a detailed comparison is made with the well-established LAR-based selection technique. The results show that the developed sparse regression technique is able to identify the most significant PC contributions describing the problem. Moreover, the most important stochastic features are captured at a reduced computational cost compared to the LAR method. The results also demonstrate the superior robustness of the method, shown by repeating the analyses using random experimental designs.
Khazaei, Ardeshir; Sarmasti, Negin; Seyf, Jaber Yousefi
2016-03-01
Quantitative structure-activity relationships (QSAR) were used to study a series of curcumin-related compounds with inhibitory effects on prostate cancer PC-3 cells, pancreas cancer Panc-1 cells, and colon cancer HT-29 cells. The sphere exclusion method was used to split the data set into training and test sets. Multiple linear regression, principal component regression and partial least squares were used as the regression methods. To investigate the effect of feature selection methods, stepwise selection, genetic algorithms, and simulated annealing were used. In two cases (PC-3 cells and Panc-1 cells), the best models were generated by a combination of multiple linear regression and stepwise selection (PC-3 cells: r2 = 0.86, q2 = 0.82, pred_r2 = 0.93, and r2m (test) = 0.43; Panc-1 cells: r2 = 0.85, q2 = 0.80, pred_r2 = 0.71, and r2m (test) = 0.68). For the HT-29 cells, principal component regression with stepwise selection (r2 = 0.69, q2 = 0.62, pred_r2 = 0.54, and r2m (test) = 0.41) is the best method. The QSAR study reveals the descriptors that have a crucial role in the inhibitory properties of curcumin-like compounds. 6ChainCount, T_C_C_1, and T_O_O_7 are the most important descriptors with the greatest effect. To design and optimize novel efficient curcumin-related compounds, it is useful to introduce heteroatoms such as nitrogen, oxygen, and sulfur atoms into the chemical structure (reducing the contribution of the T_C_C_1 descriptor) and to increase the contributions of the 6ChainCount and T_O_O_7 descriptors. The models can be useful in the better design of novel curcumin-related compounds for use in the treatment of prostate, pancreas, and colon cancers.
Lu, Miao; Zhou, Jianhui; Naylor, Caitlin; Kirkpatrick, Beth D; Haque, Rashidul; Petri, William A; Ma, Jennie Z
2017-01-01
Environmental Enteropathy (EE) is a subclinical condition caused by constant fecal-oral contamination and resulting in blunting of intestinal villi and intestinal inflammation. Of primary interest in the clinical research is evaluating the association between non-invasive EE biomarkers and malnutrition in a cohort of Bangladeshi children. The challenges are that the number of biomarkers/covariates is relatively large, and some of them are highly correlated. Many variable selection methods are available in the literature, but which are most appropriate for EE biomarker selection remains unclear. In this study, different variable selection approaches were applied, and the performance of these methods was assessed numerically through simulation studies, assuming the correlations among covariates were similar to those in the Bangladesh cohort. The methods suggested by the simulations were applied to the Bangladesh cohort to select the most relevant biomarkers for the growth response, and bootstrapping methods were used to evaluate the consistency of the selection results. Through the simulation studies, SCAD (Smoothly Clipped Absolute Deviation), adaptive LASSO (Least Absolute Shrinkage and Selection Operator) and MCP (Minimax Concave Penalty) are the suggested variable selection methods, compared to the traditional stepwise regression method. In the Bangladesh data, predictors such as mother's weight, height-for-age z-score (HAZ) at week 18, and inflammation markers (myeloperoxidase (MPO) at week 12 and soluble CD14 at week 18) are informative biomarkers associated with children's growth. Penalized linear regression methods are plausible alternatives to traditional variable selection methods, and the suggested methods are applicable to other biomedical studies. The selected early-stage biomarkers offer a potential explanation for the burden of malnutrition problems in low-income countries, allow early identification of infants at risk, and suggest pathways for intervention.
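Penalized variable selection on correlated covariates can be sketched with the LASSO (SCAD and MCP follow the same pattern with different penalties, but are not in scikit-learn). The biomarker panel below is synthetic: only the first two columns truly drive the response, and the correlation structure and penalty strength are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic biomarker panel with correlated covariates: only columns
# 0 and 1 truly drive the growth response.
rng = np.random.default_rng(4)
n, p = 150, 12
base = rng.normal(size=(n, p))
X = base + 0.5 * base[:, [0]]     # every covariate correlates with column 0
y = 1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0, 0.5, n)

# Standardize, then fit the LASSO; nonzero coefficients are "selected".
Xs = StandardScaler().fit_transform(X)
lasso = Lasso(alpha=0.1).fit(Xs, y)
selected = np.flatnonzero(lasso.coef_ != 0)
print(selected)
```

In practice the penalty strength alpha is chosen by cross-validation (e.g. `LassoCV`), and the bootstrap check in the abstract repeats the selection on resampled data to see which covariates are chosen consistently.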
Celikel, Oguz
2011-03-01
This paper presents the application of the vector modulation method (VMM) to an open-loop interferometric fiber optic gyroscope, called the north finder capability gyroscope (NFCG), designed and assembled at TUBITAK UME (National Metrology Institute of Turkey). The method contains a secondary modulation/demodulation circuit with an AD630 chip, depending on the periodic variation of the orientation of the sensing-coil sensitive-surface vector with respect to geographic north at the laboratory latitude, and the collection of dc voltage at the secondary demodulation circuit output in the time domain. The resultant dc voltage, proportional to the first-order Bessel function of the first kind of the Sagnac phase shift, is obtained as a result of vector modulation together with the Earth's rotation. A new model function is developed and introduced to evaluate the angular errors of the NFCG with VMM in finding geographic north.
Increasing the computational efficiency of digital cross correlation by a vectorization method
Chang, Ching-Yuan; Ma, Chien-Ching
2017-08-01
This study presents a vectorization method for use in MATLAB programming aimed at increasing the computational efficiency of digital cross correlation in sound and images, resulting in a speedup of 6.387 and 36.044 times compared with performance values obtained from looped expression. This work bridges the gap between matrix operations and loop iteration, preserving flexibility and efficiency in program testing. This paper uses numerical simulation to verify the speedup of the proposed vectorization method as well as experiments to measure the quantitative transient displacement response subjected to dynamic impact loading. The experiment involved the use of a high speed camera as well as a fiber optic system to measure the transient displacement in a cantilever beam under impact from a steel ball. Experimental measurement data obtained from the two methods are in excellent agreement in both the time and frequency domain, with discrepancies of only 0.68%. Numerical and experiment results demonstrate the efficacy of the proposed vectorization method with regard to computational speed in signal processing and high precision in the correlation algorithm. We also present the source code with which to build MATLAB-executable functions on Windows as well as Linux platforms, and provide a series of examples to demonstrate the application of the proposed vectorization method.
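The loop-versus-vectorization contrast at the heart of this work can be shown in a few lines. The paper's implementation is in MATLAB; the sketch below uses NumPy, and the signal and template are synthetic stand-ins for the sound/image data.

```python
import numpy as np

# Synthetic signal with a known template embedded at sample 100.
rng = np.random.default_rng(5)
sig = rng.normal(size=256)
tmpl = sig[100:140]

# Looped cross correlation: one dot product per lag (the slow form).
n = sig.size - tmpl.size + 1
looped = np.empty(n)
for i in range(n):
    looped[i] = np.dot(sig[i:i + tmpl.size], tmpl)

# Vectorized equivalent in a single library call.
vectorized = np.correlate(sig, tmpl, mode="valid")

assert np.allclose(looped, vectorized)   # identical results, far fewer Python steps
print(int(np.argmax(vectorized)))        # correlation peak at the template location
```

The speedups quoted in the abstract (6.387x and 36.044x) come from replacing interpreted loop iterations with compiled matrix operations in exactly this way, on much larger sound and image arrays.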
Liu, Ke; Chen, Xiaojing; Li, Limin; Chen, Huiling; Ruan, Xiukai; Liu, Wenbin
2015-02-09
The successive projections algorithm (SPA) is widely used to select variables for multiple linear regression (MLR) modeling. However, SPA used only once may not obtain all the useful information of the full spectra, because the number of selected variables cannot exceed the number of calibration samples in the SPA algorithm. Therefore, the SPA-MLR method risks the loss of useful information. To make full use of the useful information in the spectra, a new method named "consensus SPA-MLR" (C-SPA-MLR) is proposed herein. This method is the combination of the consensus strategy and the SPA-MLR method. In the C-SPA-MLR method, SPA-MLR is used to construct member models with different subsets of variables, which are selected from the remaining variables iteratively. A consensus prediction is obtained by combining the predictions of the member models. The proposed method is evaluated by analyzing the near-infrared (NIR) spectra of corn and diesel. The C-SPA-MLR method showed better prediction performance than the SPA-MLR and full-spectra PLS methods. Moreover, these results could serve as a reference for combining the consensus strategy with other variable selection methods when analyzing NIR spectra and other spectroscopic data. Copyright © 2014 Elsevier B.V. All rights reserved.
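The consensus step itself is simple: build several MLR member models on different variable subsets and average their predictions. The sketch below replaces SPA's projection-based subset selection with random subsets and uses synthetic "spectra", so it illustrates only the consensus idea, not the full C-SPA-MLR algorithm.

```python
import numpy as np

# Synthetic "spectra": 20 variables, of which the first 5 carry signal.
rng = np.random.default_rng(6)
n, p = 100, 20
X = rng.normal(size=(n, p))
y = X[:, :5] @ np.array([1.0, -0.5, 0.8, 0.3, -0.2]) + rng.normal(0, 0.3, n)

def fit_predict(cols):
    """Ordinary least squares on a chosen variable subset, with intercept."""
    A = np.column_stack([X[:, cols], np.ones(n)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

# Member MLR models on different random subsets; SPA would instead pick
# each subset by successive projections.
subsets = [rng.choice(p, size=8, replace=False) for _ in range(10)]
members = np.column_stack([fit_predict(s) for s in subsets])
consensus = members.mean(axis=1)            # consensus prediction
rmse = np.sqrt(np.mean((consensus - y) ** 2))
print(round(rmse, 3))
```

Averaging the member predictions reduces the variance of any single subset model, which is why the consensus can outperform one SPA-MLR fit even though each member sees only part of the spectrum.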
Samira-VP: A simple protein alignment method with rechecking the alphabet vector positions.
Fotoohifiroozabadi, Samira; Mohamad, Mohd Saberi; Deris, Safaai
2017-04-01
Protein structure alignments and comparisons that are based on an alphabetical representation of protein structure are simpler to run and faster to evaluate, but their accuracy is not as reliable as that of three-dimensional (3D) tools. As a 1D method, TS-AMIR used an alphabetic representation of the secondary-structure elements (SSE) of proteins and compared the letters assigned to each SSE using the [Formula: see text]-gram method. Although the results were comparable to those obtained via geometrical methods, the SSE length and the accuracy of adjacency between SSEs were not considered in the comparison process. Therefore, to obtain further information on the accuracy of adjacency between SSE vectors, a new approach of assigning text to vectors according to the spherical coordinate system was adopted in the present study. Moreover, dynamic programming was applied in order to account for the length of SSE vectors. Five common datasets were selected for method evaluation. The first three datasets were small but difficult to align, and the remaining two datasets were used to compare the capability of the proposed method with that of other methods on a large protein dataset. The results showed that the proposed method, as a text-based alignment approach, obtained results comparable to both 1D and 3D methods. It outperformed 1D methods in terms of accuracy and 3D methods in terms of runtime.
Directory of Open Access Journals (Sweden)
Gholam Reza Sheykhzadeh
2017-02-01
Full Text Available Introduction: Penetration resistance is one of the criteria for evaluating soil compaction. It correlates with several soil properties such as vehicle trafficability, resistance to root penetration, seedling emergence, and soil compaction by farm machinery. Direct measurement of penetration resistance is time consuming and difficult because of high temporal and spatial variability. Therefore, many different regression and artificial neural network pedotransfer functions have been proposed to estimate penetration resistance from readily available soil variables such as particle size distribution, bulk density (Db) and gravimetric water content (θm). The lands of Ardabil Province are one of the main potato production regions of Iran; thus, obtaining the soil penetration resistance in these regions helps with the management of potato production. The objective of this research was to derive pedotransfer functions, using regression and artificial neural networks, to predict penetration resistance from some soil variables in the agricultural soils of the Ardabil plain, and to compare the performance of artificial neural networks with regression models. Materials and methods: Disturbed and undisturbed soil samples (n = 105) were systematically taken from the 0-10 cm soil depth at nearly 3000 m intervals in the agricultural lands of the Ardabil plain (lat 38°15' to 38°40' N, long 48°16' to 48°61' E). The contents of sand, silt and clay (hydrometer method), CaCO3 (titration method), bulk density (cylinder method), particle density (Dp; pycnometer method), organic carbon (wet oxidation method), total porosity (calculated from Db and Dp), and saturated (θs) and field (θf) soil water contents (gravimetric method) were measured in the laboratory. The mean geometric diameter (dg) and standard deviation (σg) of soil particles were computed using the percentages of sand, silt and clay. Penetration resistance was measured in situ using a cone penetrometer (analog model) at 10
A deformation analysis method of stepwise regression for bridge deflection prediction
Shen, Yueqian; Zeng, Ying; Zhu, Lei; Huang, Teng
2015-12-01
Large-scale bridges are among the most important infrastructures, whose safe condition concerns people's daily activities and life safety. Monitoring of large-scale bridges is crucial, since deformation may occur. How to obtain the deformation information and then judge the safety condition are the key and difficult problems in the bridge deformation monitoring field. Deflection is an important index for the evaluation of bridge safety. This paper proposes a forecasting model based on stepwise regression analysis. Based on the deflection monitoring data of the Yangtze River Bridge, the main factors influencing deflection are studied. The authors use the monitoring data to forecast the bridge deflection at different times from the perspective of the non-bridge structure, and compare the results with forecasts from grey relational analysis based on linear regression. The results show that the accuracy and reliability of stepwise regression analysis are high, which provides a scientific basis for bridge operation management. Above all, the ideas of this research provide an effective method for bridge deformation analysis.
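Stepwise regression, the core of this forecasting model, greedily adds the predictor that most reduces the residual sum of squares and stops when the gain becomes negligible. The sketch below implements a forward-selection variant on synthetic data where only two of six candidate factors truly drive the deflection; the stopping threshold is an illustrative choice.

```python
import numpy as np

# Synthetic deflection series driven by factors 0 and 3 only
# (stand-ins for, e.g., temperature and traffic load).
rng = np.random.default_rng(7)
n, p = 200, 6
X = rng.normal(size=(n, p))
deflection = 2.0 * X[:, 0] + 1.0 * X[:, 3] + rng.normal(0, 0.5, n)

def sse(cols):
    """Residual sum of squares of an OLS fit on the given columns."""
    A = np.column_stack([X[:, cols], np.ones(n)])
    coef, *_ = np.linalg.lstsq(A, deflection, rcond=None)
    r = deflection - A @ coef
    return r @ r

chosen, remaining = [], list(range(p))
current = np.sum((deflection - deflection.mean()) ** 2)
while remaining:
    best = min(remaining, key=lambda j: sse(chosen + [j]))
    new = sse(chosen + [best])
    if current - new < 0.05 * current:   # stop when the gain is negligible
        break
    chosen.append(best)
    remaining.remove(best)
    current = new
print(sorted(chosen))
```

Classical stepwise procedures use an F-test or AIC rather than a fixed relative-improvement threshold, but the add-check-stop structure is the same.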
Comparison of methods for calculating serum osmolality: multivariate linear regression analysis.
Rasouli, Mehdi; Kalantari, Kiarash Rezaei
2005-01-01
There are several methods for calculating serum osmolality, and their accordance with measured osmolality is the subject of controversy. The concentrations of sodium, potassium, glucose and blood urea nitrogen (BUN) and the osmolalities of 210 serum samples were measured. Two empirical equations were deduced for the calculation of serum osmolality by regression analysis of the data. To choose the best equation, the chemical concentrations were also used to calculate osmolalities according to our formulas and 16 different equations taken from the literature, which were compared with the measured osmolalities. Correlation and linear regression analyses were performed using Excel and SPSS software. Multiple linear regression analysis showed that the serum concentration of sodium was the main determinant of osmolality (beta = 0.778). The formula presented by Dorwart and Chalmers gave inferior results to those obtained with our formulas. Our data suggest use of the Worthley et al. formula, Osm = 2[Na+] + glucose + BUN, for rapid mental calculation, and the formulas of Bhagat et al. or ours for calculation of serum osmolality by equipment linked to a computer.
Landslide susceptibility mapping on a global scale using the method of logistic regression
Lin, Le; Lin, Qigen; Wang, Ying
2017-08-01
This paper proposes a statistical model for mapping global landslide susceptibility based on logistic regression. After investigating explanatory factors for landslides in the existing literature, five factors were selected to model landslide susceptibility: relative relief, extreme precipitation, lithology, ground motion and soil moisture. When building the model, 70 % of the landslide and non-landslide points were randomly selected for logistic regression, and the others were used for model validation. To evaluate the accuracy of the predictive models, this paper adopts several criteria, including the receiver operating characteristic (ROC) curve method. The logistic regression experiments found all five factors to be significant in explaining landslide occurrence on a global scale. During the modeling process, the percentage correct in the confusion matrix of landslide classification was approximately 80 % and the area under the curve (AUC) was nearly 0.87. During the validation process, the above statistics were about 81 % and 0.88, respectively. Such a result indicates that the model has strong robustness and stable performance. The model found that, at a global scale, soil moisture can be dominant in the occurrence of landslides, while topography may be secondary.
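The modeling pipeline (70/30 split, logistic regression on a handful of factors, AUC-based validation) can be sketched directly. The five covariates below are synthetic stand-ins for relief, precipitation, lithology, ground motion and soil moisture, with assumed effect sizes; only the workflow mirrors the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the five explanatory factors; landslide
# occurrence is Bernoulli with a logistic dependence on three of them.
rng = np.random.default_rng(8)
n = 1000
X = rng.normal(size=(n, 5))
logit = 1.2 * X[:, 4] + 0.8 * X[:, 0] + 0.5 * X[:, 1] - 0.5
y = rng.uniform(size=n) < 1 / (1 + np.exp(-logit))

# 70/30 split mirrors the paper's calibration/validation design.
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression().fit(Xtr, ytr)

# Area under the ROC curve on the held-out 30%.
auc = roc_auc_score(yte, clf.predict_proba(Xte)[:, 1])
print(round(auc, 3))
```

The fitted coefficient magnitudes play the role of the paper's factor-importance finding: the covariate given the largest true weight here dominates the predicted susceptibility.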
Estimation of the maximum flow-mediated brachial artery response using local regression methods.
Andrew, M E; Li, S; Fekedulegn, D; Dorn, J; Joseph, P N; Violanti, J; Burchfiel, C M
2007-10-01
We consider methods for estimating the maximum from a sequence of measurements of flow-mediated diameter of the brachial artery. Flow-mediated vasodilation (FMD) is represented using the maximum change from a baseline diameter measurement after the release of a blood pressure cuff that has been inflated to reduce flow in the brachial artery. The influence of the measurement error on the maximum diameter from raw data can lead to overestimation of the average maximum change from the baseline for a sample of individuals. Nonparametric regression models provide a potential means for dealing with this problem. When using this approach, it is necessary to make a judicious choice of regression methods and smoothing parameters to avoid overestimation or underestimation of FMD. This study presents results from simulation studies using kernel-based local linear regression methods that characterize the relationship between the measurement error, smoothing and bias in estimates of FMD. Comparisons between fixed or constant smoothing and automated smoothing parameter selection using the generalized cross validation (GCV) statistic are made, and it is shown that GCV-optimized smoothing may over-smooth or under-smooth depending on the heart rate, measurement error and measurement frequency. We also present an example using measured data from the Buffalo Cardio-Metabolic Occupational Police Stress (BCOPS) pilot study. In this example, smoothing resulted in lower estimates of FMD and there was no clear evidence of an optimal smoothing level. The choice to use smoothing and the appropriate smoothing level to use may depend on the application.
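The bias mechanism and the smoothing remedy can be demonstrated concretely: taking the raw maximum of noisy diameter measurements overestimates the true peak, while the maximum of a kernel-smoothed curve is less biased. The diameter curve, noise level, and fixed bandwidth below are illustrative assumptions, not the study's measurement protocol.

```python
import numpy as np

# Synthetic brachial-artery diameter trace: baseline 4 mm with a
# 0.3 mm dilation peak at t = 60 s, plus measurement error.
rng = np.random.default_rng(9)
t = np.linspace(0, 120, 121)
true = 4.0 + 0.3 * np.exp(-((t - 60) ** 2) / (2 * 20 ** 2))
obs = true + rng.normal(0, 0.05, t.size)

def local_linear(t0, x, y, h):
    """Kernel-weighted (Gaussian) linear fit evaluated at t0."""
    w = np.exp(-0.5 * ((x - t0) / h) ** 2)
    b = np.polyfit(x - t0, y, 1, w=np.sqrt(w))  # weighted least squares
    return b[1]                                  # intercept = fitted value at t0

h = 10.0                                         # fixed smoothing bandwidth (s)
smooth = np.array([local_linear(t0, t, obs, h) for t0 in t])

# FMD as maximum change from a 4 mm baseline: raw max vs smoothed max.
print(round(obs.max() - 4.0, 3), round(smooth.max() - 4.0, 3))
```

Raising the bandwidth trades the noise-driven overestimation of the raw maximum against attenuation of the true peak, which is exactly the over-/under-smoothing tension the abstract attributes to GCV-selected bandwidths.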
The method of principal vectors for the synthesis of shaking moment balanced linkages
van der Wijk, V.; Herder, Justus Laurens; Viadero, Fernando; Ceccarelli, Marco
2013-01-01
The design of shaking-moment-balanced linkages still is challenging. Considering moment balance in the very beginning of the design process of mechanisms is important for finding applicable solutions. For this purpose, the method of principal vectors is investigated, showing a compact notation of
From toe to head: use of robust regression methods in stature estimation based on foot remains.
Pablos, Adrián; Gómez-Olivencia, Asier; García-Pérez, Alfonso; Martínez, Ignacio; Lorenzo, Carlos; Arsuaga, Juan Luis
2013-03-10
Stature estimation is a standard procedure in the fields of forensic and biological anthropology, bio-archaeology and paleoanthropology, in order to gain biological insights into the individuals/populations studied. The most accurate stature estimation method is based on anatomical reconstruction (i.e., the Fully method), followed by type I regression equations (e.g., ordinary least squares - OLS) based on long bones, preferably from the lower limb. In some cases, due to the fragmentary nature of the osseous material recovered, stature estimates have to rely on other elements, such as foot remains. In this study, we explore stature estimation based on different foot bones: the talus, calcaneus, and metatarsals 1-4 in Afro- and Euroamericans of both sexes. The approach undertaken in this study is novel for two reasons. First, individual estimates for each bone are provided, and tarsals and metatarsals are combined in order to obtain more accurate estimates. Second, robust statistical methods based on type I regression equations are used, namely least trimmed squares (LTS). Our results show that the best individual bones for estimating stature are the first and second metatarsal and both the talus and the calcaneus. The combination of a tarsal and a metatarsal bone slightly improves the accuracy of the stature estimate. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
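Least trimmed squares, the robust type I regression the abstract names, can be sketched in a few lines (a minimal random-starts implementation with concentration steps, not the authors' code; the bone/stature numbers are invented leverage-outlier data):

```python
# Minimal least-trimmed-squares (LTS) sketch: fit OLS on a subset of size h,
# keep the h smallest squared residuals, and iterate ("concentration" steps).
# With h = 75 % of the data, a few gross outliers barely move the slope,
# unlike ordinary least squares.
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(20, 35, 60))                # mm-scale stand-in for bone length
y = 3.0 * x + 80 + rng.normal(scale=2, size=60)     # hypothetical stature relation
y[-6:] -= 80                                        # leverage outliers at large x

def lts_fit(x, y, h, n_starts=50, rng=rng):
    best, best_sse = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(x.size, size=h, replace=False)
        for _ in range(20):                         # concentration steps
            b, a = np.polyfit(x[idx], y[idx], 1)
            r2 = (y - (b * x + a)) ** 2
            idx = np.argsort(r2)[:h]
        sse = np.sort(r2)[:h].sum()
        if sse < best_sse:
            best, best_sse = (b, a), sse
    return best

b_lts, a_lts = lts_fit(x, y, h=45)
b_ols, a_ols = np.polyfit(x, y, 1)
print(b_lts, b_ols)   # LTS stays near the true slope 3; OLS is dragged off
```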
Directory of Open Access Journals (Sweden)
Bangyong Sun
2014-01-01
Full Text Available The polynomial regression method is employed to calculate the relationship between device color space and CIE color space for color characterization, and the performance of different expressions with specific parameters is evaluated. First, the polynomial equation for color conversion is established and the computation of the polynomial coefficients is analysed. Different forms of polynomial equations are then used to calculate the CIE color values of RGB and CMYK samples, and the corresponding color errors are compared. Finally, an optimal polynomial expression is obtained by analysing several parameters involved in the color conversion, including the number of polynomial terms, the degree of the polynomial terms, the choice of CIE visual space, and the linearization.
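The basic construction, expanding device values into polynomial terms and solving for coefficients by least squares, can be sketched as follows (synthetic data, not a measured printer or display; the 10-term expansion is one common choice, not necessarily the paper's optimum):

```python
# Device-to-CIE characterization by polynomial regression on synthetic data:
# RGB triples are expanded into polynomial terms and mapped to XYZ-like
# values by least squares.
import numpy as np

rng = np.random.default_rng(3)
rgb = rng.random((200, 3))          # normalized device values for training patches

def poly_terms(rgb):
    r, g, b = rgb.T
    # 10-term polynomial: constant, linear, cross and square terms
    return np.column_stack([np.ones_like(r), r, g, b,
                            r * g, r * b, g * b, r * r, g * g, b * b])

# A hypothetical nonlinear device response standing in for measured XYZ
xyz = np.column_stack([0.4 * rgb[:, 0] ** 2 + 0.3 * rgb[:, 1],
                       0.2 * rgb[:, 0] * rgb[:, 1] + 0.5 * rgb[:, 1],
                       0.6 * rgb[:, 2] ** 2 + 0.1])

coef, *_ = np.linalg.lstsq(poly_terms(rgb), xyz, rcond=None)
err = np.abs(poly_terms(rgb) @ coef - xyz).max()
print(err)   # ~0 here because this toy target is exactly representable
```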
Design of vaccination and fumigation on Host-Vector Model by input-output linearization method
Nugraha, Edwin Setiawan; Naiborhu, Janson; Nuraini, Nuning
2017-03-01
Here, we analyze a host-vector model and propose a design of vaccination and fumigation to control the infectious population using feedback control, specifically the input-output linearization method. The host population is divided into three compartments: susceptible, infectious and recovered. The vector population is divided into two compartments: susceptible and infectious. In this system, vaccination and fumigation are treated as inputs and the infectious population as the output. The objective of the design is to stabilize the system so that the output asymptotically tends to zero. We also present examples to illustrate the design.
Wei, W. B.; Tan, L.; Jia, M. Q.; Pan, Z. K.
2017-01-01
The variational level set method is one of the main methods of image segmentation. Because signed distance functions used as level sets must preserve their defining properties during evolution, through numerical remedies or additional machinery, the approach is not very efficient. In this paper, a normal vector projection method for image segmentation using the Chan-Vese model is proposed. An equivalent formulation of the Chan-Vese model is obtained by exploiting a property of binary level set functions and combining it with the concept of convex relaxation. A thresholding method and a projection formula are applied in the implementation. This avoids the above problems and yields a globally optimal solution. Experimental results on both synthetic and real images validate the effectiveness of the proposed normal vector projection method and show advantages over traditional algorithms in terms of computational efficiency.
Minimum-Voltage Vector Injection Method for Sensorless Control of PMSM for Low-Speed Operations
DEFF Research Database (Denmark)
Xie, Ge; Lu, Kaiyuan; Kumar, Dwivedi Sanjeet
2016-01-01
In this paper, a simple signal injection method is proposed for sensorless control of PMSM at low speed, which ideally requires only one voltage vector for position estimation. The proposed method is easy to implement, resulting in a low computation burden. No filters are needed for extracting the high frequency current signals for position estimation, and the use of Low-Pass Filters (LPFs) in the current control loop to filter out the fundamental current component is not necessary; therefore, the control bandwidth of the inner current control loop need not be sacrificed. The proposed method may also be further developed to inject two opposite voltage vectors to reduce the effects of inverter voltage error on the position estimation accuracy. The effectiveness of the proposed method is demonstrated by comparison with another sensorless control method, supported by theoretical analysis and experimental results.
Ichii, Kazuhito; Ueyama, Masahito; Kondo, Masayuki; Saigusa, Nobuko; Kim, Joon; Alberto, Ma. Carmelita; Ardö, Jonas; Euskirchen, Eugénie S.; Kang, Minseok; Hirano, Takashi; Joiner, Joanna; Kobayashi, Hideki; Marchesini, Luca Belelli; Merbold, Lutz; Miyata, Akira; Saitoh, Taku M.; Takagi, Kentaro; Varlagin, Andrej; Bret-Harte, M. Syndonia; Kitamura, Kenzo; Kosugi, Yoshiko; Kotani, Ayumi; Kumar, Kireet; Li, Sheng-Gong; Machimura, Takashi; Matsuura, Yojiro; Mizoguchi, Yasuko; Ohta, Takeshi; Mukherjee, Sandipan; Yanagi, Yuji; Yasuda, Yukio; Zhang, Yiping; Zhao, Fenghua
2017-04-01
The lack of a standardized database of eddy covariance observations has been an obstacle for data-driven estimation of terrestrial CO2 fluxes in Asia. In this study, we developed such a standardized database from 54 sites in various databases by applying consistent postprocessing for data-driven estimation of gross primary productivity (GPP) and net ecosystem CO2 exchange (NEE). Data-driven estimation was conducted using a machine learning algorithm, support vector regression (SVR), with remote sensing data for the 2000-2015 period. Site-level evaluation of the estimated CO2 fluxes shows that although performance varies across vegetation and climate classifications, 8-day GPP and NEE are reproduced (r2 = 0.73 and 0.42, respectively). Evaluation of spatially estimated GPP against Global Ozone Monitoring Experiment 2 sensor-based Sun-induced chlorophyll fluorescence shows that monthly GPP variations at the subcontinental scale were reproduced by SVR (r2 = 1.00, 0.94, 0.91, and 0.89 for Siberia, East Asia, South Asia, and Southeast Asia, respectively). Evaluation of spatially estimated NEE against net atmosphere-land CO2 fluxes from the Greenhouse Gases Observing Satellite (GOSAT) Level 4A product shows that monthly variations were consistent in Siberia and East Asia, whereas inconsistencies were found in South Asia and Southeast Asia. Furthermore, differences between the land CO2 fluxes from SVR-NEE and GOSAT Level 4A were partially explained by differences in the definition of land CO2 fluxes. These data-driven estimates provide a new opportunity to assess CO2 fluxes in Asia and to evaluate and constrain terrestrial ecosystem models.
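A minimal SVR sketch in the spirit of this data-driven flux estimation (all data synthetic; the predictor names are placeholders, not the study's remote sensing inputs):

```python
# RBF-kernel support vector regression mapping remote-sensing-like
# predictors to a GPP-like target, with a simple train/test split.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import r2_score

rng = np.random.default_rng(4)
n = 400
X = rng.random((n, 3))   # e.g. NDVI, land-surface temperature, radiation (made up)
gpp = 8 * X[:, 0] * X[:, 2] + 2 * X[:, 1] + rng.normal(scale=0.3, size=n)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X[:300], gpp[:300])
r2 = r2_score(gpp[300:], model.predict(X[300:]))
print(round(r2, 2))
```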
Wong, Jacklyn; Bayoh, Nabie; Olang, George; Killeen, Gerry F; Hamel, Mary J; Vulule, John M; Gimnig, John E
2013-04-30
Operational vector sampling methods lack standardization, making quantitative comparisons of malaria transmission across different settings difficult. Human landing catch (HLC) is considered the research gold standard for measuring human-mosquito contact, but is unsuitable for large-scale sampling. This study assessed mosquito catch rates of the CDC light trap (CDC-LT), Ifakara tent trap (ITT), window exit trap (WET), pot resting trap (PRT), and box resting trap (BRT) relative to HLC in western Kenya to 1) identify appropriate methods for operational sampling in this region, and 2) contribute to a larger, overarching project comparing standardized evaluations of vector trapping methods across multiple countries. Mosquitoes were collected from June to July 2009 in four districts: Rarieda, Kisumu West, Nyando, and Rachuonyo. In each district, all trapping methods were rotated 10 times through three houses in a 3 × 3 Latin Square design. Anophelines were identified by morphology and females classified as fed or non-fed. Anopheles gambiae s.l. were further identified as Anopheles gambiae s.s. or Anopheles arabiensis by PCR. Relative catch rates were estimated by negative binomial regression. When data were pooled across all four districts, catch rates (relative to HLC indoor) for An. gambiae s.l (95.6% An. arabiensis, 4.4% An. gambiae s.s) were high for HLC outdoor (RR = 1.01), CDC-LT (RR = 1.18), and ITT (RR = 1.39); moderate for WET (RR = 0.52) and PRT outdoor (RR = 0.32); and low for all remaining types of resting traps (PRT indoor, BRT indoor, and BRT outdoor). Relative performance of each trap type varied from district to district. ITT, CDC-LT, and WET appear to be effective methods for large-scale vector sampling in western Kenya. Ultimately, choice of collection method for operational surveillance should be driven by trap efficacy and scalability, rather than fine-scale precision with respect to HLC. When compared with recent, similar trap evaluations in Tanzania and Zambia, these data suggest
Kew, William; Mitchell, John B O
2015-09-01
The application of Machine Learning to cheminformatics is a large and active field of research, but there exist few papers which discuss whether ensembles of different Machine Learning methods can improve upon the performance of their component methodologies. Here we investigated a variety of methods, including kernel-based, tree, linear, neural networks, and both greedy and linear ensemble methods. These were all tested against a standardised methodology for regression with data relevant to the pharmaceutical development process. This investigation focused on QSPR problems within drug-like chemical space. We aimed to investigate which methods perform best, and how the 'wisdom of crowds' principle can be applied to ensemble predictors. It was found that no single method performs best for all problems, but that a dynamic, well-structured ensemble predictor would perform very well across the board, usually providing an improvement in performance over the best single method. Its use of weighting factors allows the greedy ensemble to acquire a bigger contribution from the better performing models, and this helps the greedy ensemble generally to outperform the simpler linear ensemble. Choice of data preprocessing methodology was found to be crucial to performance of each method too. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
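The greedy ensemble idea the abstract credits for the performance gain can be sketched with a Caruana-style forward selection (not the authors' exact scheme; models and data here are arbitrary stand-ins):

```python
# Greedy ensemble sketch: models are added with replacement whenever they
# lower validation MSE, so better models accumulate larger weights and the
# ensemble can never validate worse than the best single model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(5)
X = rng.random((300, 4))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)
X_tr, y_tr, X_va, y_va = X[:200], y[:200], X[200:], y[200:]

models = [LinearRegression(),
          DecisionTreeRegressor(max_depth=4, random_state=0),
          KNeighborsRegressor(5)]
preds = np.array([m.fit(X_tr, y_tr).predict(X_va) for m in models])
mse = lambda p: np.mean((p - y_va) ** 2)

picks = []
for _ in range(20):            # greedy rounds, selection with replacement
    cand = [(mse(np.mean(preds[picks + [i]], axis=0)), i) for i in range(len(models))]
    best_mse, best_i = min(cand)
    if picks and best_mse >= mse(np.mean(preds[picks], axis=0)):
        break                  # no candidate improves the ensemble; stop
    picks.append(best_i)

weights = np.bincount(picks, minlength=len(models)) / len(picks)
ens_mse = mse(np.mean(preds[picks], axis=0))
print(weights, ens_mse, [mse(p) for p in preds])
```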
Shi, Yinghuan; Gao, Yaozong; Liao, Shu; Zhang, Daoqiang; Gao, Yang; Shen, Dinggang
2016-01-15
In recent years, there has been great interest in prostate segmentation, an important and challenging task for CT image guided radiotherapy. In this paper, a learning-based segmentation method via joint transductive feature selection and transductive regression is presented, which incorporates the physician's simple manual specification (taking only a few seconds) to aid accurate segmentation, especially for cases with large irregular prostate motion. More specifically, for the current treatment image, an experienced physician is first asked to manually assign labels for a small subset of prostate and non-prostate voxels, especially in the first and last slices of the prostate region. The proposed method then proceeds in two steps: in the prostate-likelihood estimation step, two novel algorithms, tLasso and wLapRLS, are sequentially employed for transductive feature selection and transductive regression, respectively, to generate the prostate-likelihood map. In the multi-atlas-based label fusion step, the final segmentation result is obtained from the corresponding prostate-likelihood map and the previous images of the same patient. The proposed method has been extensively evaluated on a real prostate CT dataset of 24 patients with 330 CT images and compared with several state-of-the-art methods. Experimental results show that the proposed method outperforms the state of the art in terms of higher Dice ratio, higher true positive fraction, and lower centroid distance. The results also demonstrate that simple manual specification can help improve segmentation performance, which is clinically feasible in real practice.
Yun, Yuqi; Zevin, Michael; Sampson, Laura; Kalogera, Vassiliki
2017-01-01
With more observations from LIGO in the upcoming years, we will be able to construct an observed mass distribution of black holes to compare with binary evolution simulations. This will allow us to investigate the physics of binary evolution, such as the effects of common envelope efficiency and wind strength, or the properties of the population, such as the initial mass function. However, binary evolution codes become computationally expensive when running large populations of binaries over a multi-dimensional grid of input parameters, and may simulate accurately only for a limited combination of input parameter values. Therefore we developed a fast machine-learning method that utilizes a Gaussian Mixture Model (GMM) and Gaussian Process (GP) regression, which together can predict distributions over the entire parameter space based on a limited number of simulated models. Furthermore, Gaussian Process regression naturally provides interpolation errors in addition to interpolation means, which could provide a means of targeting the most uncertain regions of parameter space for running further simulations. We also present a case study on applying this new method to predicting chirp mass distributions for binary black hole systems (BBHs) in Milky-Way-like galaxies of different metallicities.
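The key property exploited here, that GP regression reports its own interpolation uncertainty, is easy to demonstrate (a generic scikit-learn stand-in for the paper's emulator; grid and target are invented, and the kernel is fixed rather than optimized to keep the example deterministic):

```python
# GP regression returns a standard deviation with every prediction; the
# uncertainty is small on the simulated grid and large away from it, which
# is what lets one target the most uncertain regions for new simulations.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

x_train = np.array([[0.0], [0.5], [1.0], [1.5], [2.0]])   # e.g. a metallicity grid
y_train = np.sin(x_train).ravel()                         # stand-in summary statistic

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5),
                              alpha=1e-6, optimizer=None)
gp.fit(x_train, y_train)

mean, std = gp.predict(np.array([[1.0], [5.0]]), return_std=True)
print(std)   # tiny at x=1.0 (on the grid), large at x=5.0 (extrapolation)
```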
Directory of Open Access Journals (Sweden)
Adi Syahputra
2014-03-01
Full Text Available A quantitative structure-activity relationship (QSAR) study of 21 phthalamide insecticides containing hydrazone (PCH) was carried out using multiple linear regression (MLR), principal component regression (PCR) and an artificial neural network (ANN). Five descriptors were included in the model for the MLR and ANN analyses, and five latent variables obtained from principal component analysis (PCA) were used in the PCR analysis. Descriptors were calculated using the semi-empirical PM6 method. The ANN analysis proved to be the superior statistical technique and gave a good correlation between descriptors and activity (r2 = 0.84). Based on the obtained model, we successfully designed some new insecticides with higher predicted activity than previously synthesized compounds, e.g. 2-(decalinecarbamoyl)-5-chloro-N'-((5-methylthiophen-2-yl)methylene)benzohydrazide, 2-(decalinecarbamoyl)-5-chloro-N'-((thiophen-2-yl)methylene)benzohydrazide and 2-(decalinecarbamoyl)-N'-(4-fluorobenzylidene)-5-chlorobenzohydrazide, with predicted log LC50 of 1.640, 1.672, and 1.769 respectively.
Nonparametric Methods in Astronomy: Think, Regress, Observe—Pick Any Three
Steinhardt, Charles L.; Jermyn, Adam S.
2018-02-01
Telescopes are much more expensive than astronomers, so it is essential to minimize required sample sizes by using the most data-efficient statistical methods possible. However, the most commonly used model-independent techniques for finding the relationship between two variables in astronomy are flawed. In the worst case they can lead without warning to subtly yet catastrophically wrong results, and even in the best case they require more data than necessary. Unfortunately, there is no single best technique for nonparametric regression. Instead, we provide a guide for how astronomers can choose the best method for their specific problem and provide a python library with both wrappers for the most useful existing algorithms and implementations of two new algorithms developed here.
Shirk, Andrew J; Landguth, Erin L; Cushman, Samuel A
2018-01-01
Anthropogenic migration barriers fragment many populations and limit the ability of species to respond to climate-induced biome shifts. Conservation actions designed to conserve habitat connectivity and mitigate barriers are needed to unite fragmented populations into larger, more viable metapopulations, and to allow species to track their climate envelope over time. Landscape genetic analysis provides an empirical means to infer landscape factors influencing gene flow and thereby inform such conservation actions. However, there are currently many methods available for model selection in landscape genetics, and considerable uncertainty as to which provide the greatest accuracy in identifying the true landscape model influencing gene flow among competing alternative hypotheses. In this study, we used population genetic simulations to evaluate the performance of seven regression-based model selection methods on a broad array of landscapes that varied by the number and type of variables contributing to resistance, the magnitude and cohesion of resistance, as well as the functional relationship between variables and resistance. We also assessed the effect of transformations designed to linearize the relationship between genetic and landscape distances. We found that linear mixed effects models had the highest accuracy in every way we evaluated model performance; however, other methods also performed well in many circumstances, particularly when landscape resistance was high and the correlation among competing hypotheses was limited. Our results provide guidance for which regression-based model selection methods provide the most accurate inferences in landscape genetic analysis and thereby best inform connectivity conservation actions. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
Wolstenholme, E Œ
1978-01-01
Elementary Vectors, Third Edition serves as an introductory course in vector analysis and is intended to present the theoretical and application aspects of vectors. The book covers topics that rigorously explain and provide definitions, principles, equations, and methods in vector analysis. Applications of vector methods to simple kinematical and dynamical problems, central forces and orbits, and solutions to geometrical problems are discussed as well. This edition of the text also provides an appendix, intended for students, with which the author hopes to bridge the gap between theory and application.
Zhao, Na; Yue, Tianxiang; Zhou, Xun; Zhao, Mingwei; Liu, Yu; Du, Zhengping; Zhang, Lili
2017-07-01
Downscaling precipitation is required in local-scale climate impact studies. In this paper, a statistical downscaling scheme is presented that combines a geographically weighted regression (GWR) model with a recently developed method, the high accuracy surface modeling method (HASM). The proposed method was compared with another downscaling method using the Coupled Model Intercomparison Project Phase 5 (CMIP5) database and ground-based data from 732 stations across China for the period 1976-2005. The residual produced by GWR was modified by comparing different interpolators, including HASM, kriging, inverse distance weighting (IDW), and splines. Spatial downscaling from 1° to 1-km grids for the period 1976-2005 and for future scenarios was achieved using the proposed method. Prediction accuracy was assessed for two separate validation sets, across China and in Jiangxi Province, on both annual and seasonal scales, using the root mean square error (RMSE), mean relative error (MRE), and mean absolute error (MAE). The results indicate that the developed model outperforms the method that builds the transfer function using gauge values. There is a large improvement in the results when the residual is corrected with meteorological station observations. In comparison with the three classical interpolators, HASM performs better in modifying the residual produced by the local regression method. The success of the developed technique lies in the effective use of the datasets and the modification of the residual using HASM. The results for the future climate scenarios show that precipitation exhibits an overall increasing trend from T1 (2011-2040) to T2 (2041-2070) and from T2 to T3 (2071-2100) in the RCP2.6, RCP4.5, and RCP8.5 emission scenarios. The most significant increase occurs in RCP8.5 from T2 to T3, and the smallest in RCP2.6 from T2 to T3, with increases of 47.11 and 2.12 mm, respectively.
Passaro, Antony D; Vettel, Jean M; McDaniel, Jonathan; Lawhern, Vernon; Franaszczuk, Piotr J; Gordon, Stephen M
2017-03-01
During an experimental session, behavioral performance fluctuates, yet most neuroimaging analyses of functional connectivity derive a single connectivity pattern. These conventional connectivity approaches assume that since the underlying behavior of the task remains constant, the connectivity pattern is also constant. We introduce a novel method, behavior-regressed connectivity (BRC), to directly examine behavioral fluctuations within an experimental session and capture their relationship to changes in functional connectivity. This method employs the weighted phase lag index (WPLI) applied to a window of trials with a weighting function. Using two datasets, the BRC results are compared to conventional connectivity results during two time windows: the one second before stimulus onset to identify predictive relationships, and the one second after onset to capture task-dependent relationships. In both tasks, we replicate the expected results for the conventional connectivity analysis, and extend our understanding of the brain-behavior relationship using the BRC analysis, demonstrating subject-specific BRC maps that correspond to both positive and negative relationships with behavior. Comparison with Existing Method(s): Conventional connectivity analyses assume a consistent relationship between behaviors and functional connectivity, but the BRC method examines performance variability within an experimental session to understand dynamic connectivity and transient behavior. The BRC approach examines connectivity as it covaries with behavior to complement the knowledge of underlying neural activity derived from conventional connectivity analyses. Within this framework, BRC may be implemented for the purpose of understanding performance variability both within and between participants. Published by Elsevier B.V.
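The connectivity measure underlying this method, the weighted phase lag index, can be computed in a few lines of numpy (a single-frequency sketch on simulated trials; the BRC method's trial windowing and behavioral weighting are omitted):

```python
# WPLI at one frequency: |E[Im S_xy]| / E[|Im S_xy|] across trials, which is
# 1 for a consistent nonzero phase lag and near 0 for random lags.
import numpy as np

rng = np.random.default_rng(6)
fs, f, n_trials = 256, 10.0, 200
t = np.arange(fs) / fs                                   # 1 s of data per trial

def wpli(x, y, k):
    # per-trial cross-spectra at FFT bin k (rows of x and y are trials)
    sxy = np.fft.rfft(x, axis=1)[:, k] * np.conj(np.fft.rfft(y, axis=1)[:, k])
    return abs(np.imag(sxy).mean()) / np.abs(np.imag(sxy)).mean()

phase = rng.uniform(0, 2 * np.pi, (n_trials, 1))
x = np.sin(2 * np.pi * f * t + phase)
y_lag = np.sin(2 * np.pi * f * t + phase + np.pi / 4)    # consistent 45-degree lag
y_rand = np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi, (n_trials, 1)))

k = int(f)                    # bin 10 corresponds to 10 Hz for a 1 s window
w_lag, w_rand = wpli(x, y_lag, k), wpli(x, y_rand, k)
print(w_lag, w_rand)          # near 1 for the consistent lag, near 0 otherwise
```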
Robust Methods for Moderation Analysis with a Two-Level Regression Model.
Yang, Miao; Yuan, Ke-Hai
2016-01-01
Moderation analysis has many applications in social sciences. Most widely used estimation methods for moderation analysis assume that errors are normally distributed and homoscedastic. When these assumptions are not met, the results from a classical moderation analysis can be misleading. For more reliable moderation analysis, this article proposes two robust methods with a two-level regression model when the predictors do not contain measurement error. One method is based on maximum likelihood with Student's t distribution and the other is based on M-estimators with Huber-type weights. An algorithm for obtaining the robust estimators is developed. Consistent estimates of standard errors of the robust estimators are provided. The robust approaches are compared against normal-distribution-based maximum likelihood (NML) with respect to power and accuracy of parameter estimates through a simulation study. Results show that the robust approaches outperform NML under various distributional conditions. Application of the robust methods is illustrated through a real data example. An R program is developed and documented to facilitate the application of the robust methods.
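The Huber-type M-estimation the abstract describes can be sketched with plain-numpy iteratively reweighted least squares (for a single-level regression only; the paper's two-level moderation model is more involved, and this data is simulated):

```python
# M-estimator with Huber-type weights via IRLS: residuals beyond c*scale get
# down-weighted, so heavy-tailed errors cannot dominate the fit as they do
# under ordinary least squares.
import numpy as np

rng = np.random.default_rng(7)
n = 200
x = rng.normal(size=n)
e = rng.standard_t(df=1.5, size=n)              # heavy-tailed, non-normal errors
y = 1.0 + 2.0 * x + e

def huber_fit(x, y, c=1.345, iters=50):
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # OLS start
    for _ in range(iters):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745                # robust (MAD) scale
        w = np.minimum(1.0, c * s / (np.abs(r) + 1e-12)) # Huber weights
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

b_huber = huber_fit(x, y)
b_ols = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), y, rcond=None)[0]
print(b_huber, b_ols)
```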
Understanding MCP-MOD dose finding as a method based on linear regression.
Thomas, Neal
2017-11-30
MCP-MOD is a testing and model selection approach for clinical dose finding studies. During testing, contrasts of dose group means are derived from candidate dose response models. A multiple-comparison procedure is applied that controls the alpha level for the family of null hypotheses associated with the contrasts. Provided at least one contrast is significant, a corresponding set of "good" candidate models is identified. The model generating the most significant contrast is typically selected. There have been numerous publications on the method. It was endorsed by the European Medicines Agency. The MCP-MOD procedure can be alternatively represented as a method based on simple linear regression, where "simple" refers to the inclusion of an intercept and a single predictor variable, which is a transformation of dose. It is shown that the contrasts are equal to least squares linear regression slope estimates after a rescaling of the predictor variables. The test for each contrast is the usual t statistic for a null slope parameter, except that a variance estimate with fewer degrees of freedom is used in the standard error. Selecting the model corresponding to the most significant contrast P value is equivalent to selecting the predictor variable yielding the smallest residual sum of squares. This criterion orders the models like a common goodness-of-fit test, but it does not assure a good fit. Common inferential methods applied to the selected model are subject to distortions that are often present following data-based model selection. Copyright © 2017 John Wiley & Sons, Ltd.
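The central equivalence, contrast of group means equals rescaled regression slope, can be verified numerically on toy data (equal group sizes assumed; dose levels and the Emax-shaped transform are invented for illustration):

```python
# With equal group sizes and contrast coefficients equal to the centered
# model-predicted means, the MCP-MOD contrast of group means equals the
# simple linear regression slope on the transformed dose, up to a fixed
# rescaling by the contrast's sum of squares.
import numpy as np

rng = np.random.default_rng(8)
groups = np.array([0.0, 0.5, 1.0, 2.0, 4.0])
dose = np.repeat(groups, 20)                         # equal group sizes
y = 1.0 + dose / (dose + 1.0) + rng.normal(scale=0.5, size=dose.size)

x = dose / (dose + 1.0)              # candidate (Emax-shaped) transform of dose
slope = np.polyfit(x, y, 1)[0]       # simple linear regression slope

f = groups / (groups + 1.0)
c = f - f.mean()                     # centered contrast coefficients
ybar = np.array([y[dose == d].mean() for d in groups])
contrast = c @ ybar                  # MCP-MOD-style contrast of group means

print(slope, contrast / (c @ c))     # identical up to floating point
```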
A simple method of equine limb force vector analysis and its potential applications
Directory of Open Access Journals (Sweden)
Sarah Jane Hobbs
2018-02-01
Full Text Available Background Ground reaction forces (GRF) measured during equine gait analysis are typically evaluated by analyzing discrete values obtained from continuous force-time data for the vertical, longitudinal and transverse GRF components. This paper describes a simple, temporo-spatial method of displaying and analyzing sagittal plane GRF vectors. In addition, the application of statistical parametric mapping (SPM) is introduced to analyse differences between contra-lateral fore and hindlimb force-time curves throughout the stance phase. The overall aim of the study was to demonstrate alternative methods of evaluating functional (a)symmetry within horses. Methods GRF and kinematic data were collected from 10 horses trotting over a series of four force plates (120 Hz). The kinematic data were used to determine clean hoof contacts. The stance phase of each hoof was determined using a 50 N threshold. Vertical and longitudinal GRF for each stance phase were plotted both as force-time curves and as force vector diagrams in which vectors originating at the centre of pressure on the force plate were drawn at intervals of 8.3 ms for the duration of stance. Visual evaluation was facilitated by overlay of the vector diagrams for different limbs. Summary vectors representing the magnitude (VecMag) and direction (VecAng) of the mean force over the entire stance phase were superimposed on the force vector diagram. Typical measurements extracted from the force-time curves (peak forces, impulses) were compared with VecMag and VecAng using partial correlation (controlling for speed). Paired samples t-tests (left v. right diagonal pair comparison and high v. low vertical force diagonal pair comparison) were performed on discrete and vector variables using traditional methods, and Hotelling's T2 tests on normalized stance phase data using SPM. Results Evidence from traditional statistical tests suggested that VecMag is more influenced by the vertical force and impulse, whereas
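The summary-vector computation described above reduces to a mean force vector expressed as a magnitude and an angle (toy stance-phase curves below, not the study's measurements; the sign convention for the angle is an assumption):

```python
# Reduce a sagittal-plane GRF stance phase to a summary vector:
# VecMag = magnitude of the mean force vector, VecAng = its direction
# relative to vertical.
import numpy as np

t = np.linspace(0, 1, 40)                           # normalized stance phase
fz = 5000 * np.sin(np.pi * t)                       # vertical GRF, N (toy shape)
fx = 400 * np.sin(2 * np.pi * t) - 50               # longitudinal GRF, N (toy shape)

mean_fx, mean_fz = fx.mean(), fz.mean()
vec_mag = np.hypot(mean_fx, mean_fz)                # N
vec_ang = np.degrees(np.arctan2(mean_fx, mean_fz))  # degrees from vertical
print(vec_mag, vec_ang)
```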
Azil, Aishah H; Bruce, David; Williams, Craig R
2014-06-01
We investigated spatial autocorrelation of female Aedes aegypti L. mosquito abundance from BG-Sentinel trap and sticky ovitrap collections in Cairns, north Queensland, Australia. BG-Sentinel trap collections in 2010 show a significant spatial autocorrelation across the study site and over a smaller spatial extent, while sticky ovitrap collections indicate only a non-significant, weak spatial autocorrelation. The BG-Sentinel trap collections were suitable for spatial interpolation using ordinary kriging and cokriging techniques. The use of the Premise Condition Index and potential breeding container data helped improve our prediction of vector abundance. Semivariograms and prediction maps indicate that the spatial autocorrelation of mosquito abundance determined by BG-Sentinel traps extends farther than that of sticky ovitrap collections. Based on our data, fewer BG-Sentinel traps are required to represent vector abundance at a series of houses compared with sticky ovitraps. A lack of spatial structure was observed following vector control treatment in the area. This finding has implications for the design and costs of dengue vector surveillance programs. © 2014 The Society for Vector Ecology.
Directory of Open Access Journals (Sweden)
Wen-Gang Zhou
2015-06-01
Full Text Available As research in genomics and proteomics deepens, the number of new protein sequences has expanded rapidly. Given the obvious shortcomings of traditional experimental methods, namely high cost and low efficiency, computational methods for protein localization prediction have attracted much attention for their convenience and low cost. Among machine learning techniques, neural networks and the support vector machine (SVM) are often used as learning tools, and SVM has been widely applied owing to its complete theoretical framework. In this paper, we improve the existing support vector machine algorithm by combining it with Bayesian methods, yielding a new algorithm that improves calculation efficiency and eliminates defects of the original algorithm. Verification shows the method to be valid; at the same time, it reduces calculation time and improves prediction efficiency.
Mandal, Nilrudra; Doloi, Biswanath; Mondal, Biswanath
2016-01-01
In the present study, an attempt has been made to apply the Taguchi parameter design method and regression analysis to optimize the cutting conditions for surface finish when machining AISI 4340 steel with newly developed yttria-based Zirconia Toughened Alumina (ZTA) inserts. These inserts are prepared through a wet chemical co-precipitation route followed by a powder metallurgy process. Experiments have been carried out based on an L9 orthogonal array with three parameters (cutting speed, depth of cut and feed rate) at three levels (low, medium and high). Based on the mean response and signal-to-noise ratio (SNR) under the smaller-the-better criterion, the best cutting condition was found to be A3B1C1, i.e. a cutting speed of 420 m/min, a depth of cut of 0.5 mm and a feed rate of 0.12 m/min. Analysis of Variance (ANOVA) is applied to find the significance and percentage contribution of each parameter. A mathematical model of surface roughness has been developed using regression analysis as a function of the above-mentioned independent variables. The predicted values from the developed model and the experimental values are found to be very close to each other, justifying the significance of the model. A confirmation run has been carried out at the 95 % confidence level to verify the optimized result, and the values obtained are within the prescribed limits.
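The smaller-the-better SNR used in this kind of Taguchi analysis is SNR = -10·log10(mean(y²)); the level with the highest SNR is preferred. A sketch on made-up roughness replicates (not the study's measurements):

```python
# Smaller-the-better Taguchi signal-to-noise ratio for one factor:
# compute SNR per level and pick the level with the highest SNR.
import numpy as np

# hypothetical Ra replicates (um) at three cutting-speed levels
ra = {"low": [1.8, 1.9, 2.0], "medium": [1.2, 1.3, 1.1], "high": [0.8, 0.9, 0.7]}

def snr_smaller_better(y):
    y = np.asarray(y, dtype=float)
    return -10 * np.log10(np.mean(y ** 2))

snrs = {level: snr_smaller_better(vals) for level, vals in ra.items()}
best = max(snrs, key=snrs.get)
print(snrs, best)   # the "high" speed level wins here (smallest roughness)
```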
A New Global Regression Analysis Method for the Prediction of Wind Tunnel Model Weight Corrections
Ulbrich, Norbert Manfred; Bridge, Thomas M.; Amaya, Max A.
2014-01-01
A new global regression analysis method is discussed that predicts wind tunnel model weight corrections for strain-gage balance loads during a wind tunnel test. The method determines corrections by combining "wind-on" model attitude measurements with least squares estimates of the model weight and center of gravity coordinates that are obtained from "wind-off" data points. The method treats the least squares fit of the model weight separately from the fit of the center of gravity coordinates. Therefore, it performs two fits of "wind-off" data points and uses the least squares estimator of the model weight as an input for the fit of the center of gravity coordinates. Explicit equations for the least squares estimators of the weight and center of gravity coordinates are derived that simplify the implementation of the method in the data system software of a wind tunnel. In addition, recommendations for sets of "wind-off" data points are made that take typical model support system constraints into account. Explicit equations for the confidence intervals on the model weight and center of gravity coordinates and two different error analyses of the model weight prediction are also discussed in the appendices of the paper.
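The paper's estimators account for the full attitude geometry; as a heavily simplified illustration, a model weight can be recovered by least squares from synthetic "wind-off" readings if one assumes the normal-force reading varies as -W·cos(θ) with pitch angle θ (an assumption made for this sketch, not the paper's actual equations):

```python
import numpy as np

# Synthetic "wind-off" points: normal-force readings N_i at pitch angles
# theta_i; this sketch assumes N_i ≈ -W * cos(theta_i) for a model of weight W.
theta = np.deg2rad([-10.0, -5.0, 0.0, 5.0, 10.0])
W_true = 250.0                                  # assumed weight, N
rng = np.random.default_rng(0)
N = -W_true * np.cos(theta) + rng.normal(0.0, 0.5, theta.size)

# Least-squares estimator of the weight: regress N on -cos(theta)
X = -np.cos(theta)[:, None]
W_hat, *_ = np.linalg.lstsq(X, N, rcond=None)
assert abs(W_hat[0] - W_true) < 5.0             # recovered to within the noise
```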
Wang, Molin; Kuchiba, Aya; Ogino, Shuji
2015-01-01
In interdisciplinary biomedical, epidemiologic, and population research, it is increasingly necessary to consider pathogenesis and inherent heterogeneity of any given health condition and outcome. As the unique disease principle implies, no single biomarker can perfectly define disease subtypes. The complex nature of molecular pathology and biology necessitates biostatistical methodologies to simultaneously analyze multiple biomarkers and subtypes. To analyze and test for heterogeneity hypotheses across subtypes defined by multiple categorical and/or ordinal markers, we developed a meta-regression method that can utilize existing statistical software for mixed-model analysis. This method can be used to assess whether the exposure-subtype associations are different across subtypes defined by 1 marker while controlling for other markers and to evaluate whether the difference in exposure-subtype association across subtypes defined by 1 marker depends on any other markers. To illustrate this method in molecular pathological epidemiology research, we examined the associations between smoking status and colorectal cancer subtypes defined by 3 correlated tumor molecular characteristics (CpG island methylator phenotype, microsatellite instability, and the B-Raf protooncogene, serine/threonine kinase (BRAF), mutation) in the Nurses' Health Study (1980–2010) and the Health Professionals Follow-up Study (1986–2010). This method can be widely useful as molecular diagnostics and genomic technologies become routine in clinical medicine and public health. PMID:26116215
Directory of Open Access Journals (Sweden)
Wei-Chih Hsu
2012-04-01
Full Text Available Support vector machines (SVM) are a powerful tool for building good spam-filtering models. However, the performance of the model depends on parameter selection, which seriously affects classification performance during the training process. In this study, we combine the Taguchi method and the Staelin method to optimize the SVM-based e-mail spam-filtering model and improve spam-filtering accuracy. We compare the approach with other parameter optimization methods, such as grid search. Six real-world mail data sets are selected to demonstrate the effectiveness and feasibility of the method. The results show that the proposed methods can find an effective model with high classification accuracy.
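Staelin's approach refines a coarse parameter grid around the best point instead of exhaustively searching a fine grid. A sketch of that idea on a synthetic objective standing in for cross-validated SVM accuracy (the search ranges and refinement schedule are illustrative assumptions):

```python
def refine_search(score, c_range, g_range, levels=4, grid=5):
    # Coarse-to-fine search over (log2 C, log2 gamma): evaluate a small grid,
    # then re-centre a finer grid on the best point and repeat.
    (c_lo, c_hi), (g_lo, g_hi) = c_range, g_range
    best = None
    for _ in range(levels):
        cs = [c_lo + i * (c_hi - c_lo) / (grid - 1) for i in range(grid)]
        gs = [g_lo + i * (g_hi - g_lo) / (grid - 1) for i in range(grid)]
        best = max((score(c, g), c, g) for c in cs for g in gs)
        _, c0, g0 = best
        step_c = (c_hi - c_lo) / (grid - 1)
        step_g = (g_hi - g_lo) / (grid - 1)
        c_lo, c_hi = c0 - step_c, c0 + step_c
        g_lo, g_hi = g0 - step_g, g0 + step_g
    return best

# Synthetic single-peak "accuracy" surface with its optimum at (3, -5);
# in practice `score` would be the cross-validated accuracy of an SVM.
obj = lambda c, g: -((c - 3.0) ** 2 + (g + 5.0) ** 2)
_, c_best, g_best = refine_search(obj, (-5.0, 15.0), (-15.0, 3.0))
assert abs(c_best - 3.0) < 0.5 and abs(g_best + 5.0) < 0.5
```

Compared with a single fine grid, this evaluates far fewer (C, γ) pairs at the cost of assuming the score surface is reasonably well behaved.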
Fienen, Michael N.; Selbig, William R.
2012-01-01
A new sample collection system was developed to improve the representation of sediment entrained in urban storm water by integrating water quality samples from the entire water column. The depth-integrated sampler arm (DISA) was able to mitigate sediment stratification bias in storm water, thereby improving the characterization of suspended-sediment concentration and particle size distribution at three independent study locations. Use of the DISA decreased variability, which improved statistical regression to predict particle size distribution using surrogate environmental parameters, such as precipitation depth and intensity. The performance of this statistical modeling technique was compared to results using traditional fixed-point sampling methods and was found to perform better. When environmental parameters can be used to predict particle size distributions, environmental managers have more options when characterizing concentrations, loads, and particle size distributions in urban runoff.
Variable selection methods in PLS regression - a comparison study on metabolomics data
DEFF Research Database (Denmark)
Karaman, İbrahim; Hedemann, Mette Skou; Knudsen, Knud Erik Bach
different strategies for variable selection on PLSR method were considered and compared with respect to selected subset of variables and the possibility for biological validation. Sparse PLSR [1] as well as PLSR with Jack-knifing [2] was applied to data in order to achieve variable selection prior...... to comparison. Sparse PLSR is based on penalization of the loading weights (by elastic net, soft/hard thresholding etc.) on a PLSR model. In PLSR with Jack-knifing, significance of variables are calculated by uncertainty test. The data set used in this study is LC-MS data from an animal intervention study...... Integrating Omics data. Statistical Applications in Genetics and Molecular Biology, 7:Article 35, 2008. 2. Martens H and Martens M. Modifed Jack-knife estimation of parameter uncertainty in bilinear modelling by partial least squares regression (PLSR). Food Quality and Preference, 11:5-16, 2000....
Intelligent Emergency Stop Algorithm for a Manipulator Using a New Regression Method
Directory of Open Access Journals (Sweden)
Mignon Park
2012-05-01
Full Text Available In working environments with large manipulators, accidental collisions can cause severe personal injuries and can seriously damage manipulators, necessitating the development of an emergency stop algorithm to prevent such occurrences. In this paper, we propose an emergency stop system for the efficient and safe operation of a manipulator by applying an intelligent emergency stop algorithm. Our proposed intelligent algorithm considers the direction of motion of the manipulator. In addition, using a new regression method, the algorithm includes a decision step that determines whether a detected object is a collision-causing obstacle or a part of the manipulator. We apply our emergency stop system to a two-link manipulator and assess the performance of our intelligent emergency stop algorithm as compared with other models.
Gong, Ang; Zhao, Xiubin; Pang, Chunlei; Duan, Rong; Wang, Yong
2015-12-02
For Global Navigation Satellite System (GNSS) single-frequency, single-epoch attitude determination, this paper proposes a new reliable method with a baseline vector constraint. First, prior knowledge of baseline length, heading, and pitch obtained from other navigation equipment or sensors is used to rigorously reconstruct the objective function. Then, the search strategy is improved: a gradually enlarged ellipsoidal search space is substituted for the non-ellipsoidal search space to ensure the correct ambiguity candidates lie within it, so that the search can be carried out directly by the least squares ambiguity decorrelation algorithm (LAMBDA) method. Some of the vector candidates are further eliminated by a derived approximate inequality, which accelerates the search. Experimental results show that, compared to the traditional method with only a baseline length constraint, the new method can use a priori three-dimensional baseline knowledge to fix ambiguities reliably and achieve a high success rate. The tests also verify that it is not very sensitive to baseline vector error and performs robustly when the angular error is not large.
An Improved Endmember Selection Method Based on Vector Length for MODIS Reflectance Channels
Directory of Open Access Journals (Sweden)
Yuanliu Xu
2015-05-01
Full Text Available Endmember selection is the basis for sub-pixel land cover classification using multiple endmember spectral mixture analysis (MESMA), which adopts variant endmember matrices for each pixel to mitigate errors caused by endmember variability in SMA. A spectral library covering a large number of endmembers can account for endmember variability, but it also lowers computational efficiency. Therefore, an efficient endmember selection scheme to optimize the library is crucial for implementing MESMA. In this study, we present an endmember selection method based on vector length. The spectra of a land cover class are divided into subsets using vector length intervals, and the representative endmembers are derived from these subsets. Compared with the available endmember average RMSE (EAR) method, our approach improves the computational efficiency of endmember selection. The method's accuracy was further evaluated using spectral libraries derived from ground reference polygons and Moderate Resolution Imaging Spectroradiometer (MODIS) imagery, respectively. Results using the different spectral libraries indicate that MESMA combined with the new approach performs slightly better than the EAR method, with the Kappa coefficient improving from 0.75 to 0.78. A MODIS image was used to test the fraction mapping; the representative spectra based on vector length successfully modeled more than 90% of the MODIS pixel spectra with 2-endmember models.
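The core idea, partitioning a spectral library by vector length and keeping one representative per interval, can be sketched as follows (the interval count and the closest-to-mean representative rule are illustrative assumptions, not the paper's exact recipe):

```python
import numpy as np

def representatives_by_vector_length(spectra, n_bins=4):
    # Split the class's spectra into equal-width vector-length intervals and
    # keep, per interval, the spectrum closest to the interval's mean spectrum.
    lengths = np.linalg.norm(spectra, axis=1)
    edges = np.linspace(lengths.min(), lengths.max(), n_bins + 1)
    bins = np.digitize(lengths, edges[1:-1])     # interval index 0..n_bins-1
    reps = []
    for b in range(n_bins):
        members = spectra[bins == b]
        if members.size == 0:
            continue
        mean = members.mean(axis=0)
        reps.append(members[np.argmin(np.linalg.norm(members - mean, axis=1))])
    return np.array(reps)

rng = np.random.default_rng(1)
library = rng.uniform(0.0, 1.0, size=(200, 7))   # 200 synthetic 7-band spectra
reps = representatives_by_vector_length(library)
assert reps.shape[0] <= 4 and reps.shape[1] == 7 # library reduced to a few endmembers
```

Unlike EAR, this requires only norms and one pass over each class, which is where the computational saving comes from.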
Liu, Zhong-bao; Gao, Yan-yun; Wang, Jian-zhen
2015-01-01
Support vector machine (SVM), with good learning ability and generalization, is widely used in star spectra data classification. But when the scale of the data becomes larger, the shortcomings of SVM appear: the computational load is quite large and the classification speed is too slow. In order to solve these problems, the twin support vector machine (TWSVM) was proposed by Jayadeva; its advantage is that its time cost is reduced to 1/4 of that of SVM. However, all the methods mentioned above focus only on global characteristics and neglect local characteristics. In view of this, an automatic classification method for star spectra data based on manifold fuzzy twin support vector machine (MF-TSVM) is proposed in this paper. In MF-TSVM, manifold-based discriminant analysis (MDA) is used to obtain the global and local characteristics of the input data, and fuzzy membership is introduced to reduce the influence of noise and singular data on the classification results. Comparative experiments with current classification methods, such as C-SVM and KNN, on the SDSS star spectra datasets verify the effectiveness of the proposed method.
Cormanich, Rodrigo A; Goodarzi, Mohammad; Freitas, Matheus P
2009-02-01
Inhibition of tyrosine kinase enzyme WEE1 is an important step for the treatment of cancer. The bioactivities of a series of WEE1 inhibitors have been previously modeled through comparative molecular field analyses (CoMFA and CoMSIA), but a two-dimensional image-based quantitative structure-activity relationship approach has shown to be highly predictive for other compound classes. This method, called multivariate image analysis applied to quantitative structure-activity relationship, was applied here to derive quantitative structure-activity relationship models. Whilst the well-known bilinear and multilinear partial least squares regressions (PLS and N-PLS, respectively) correlated multivariate image analysis descriptors with the corresponding dependent variables only reasonably well, the use of wavelet and principal component ranking as variable selection methods, together with least-squares support vector machine, improved significantly the prediction statistics. These recently implemented mathematical tools, particularly novel in quantitative structure-activity relationship studies, represent an important advance for the development of more predictive quantitative structure-activity relationship models and, consequently, new drugs.
Widyaningsih, Purnami; Retno Sari Saputro, Dewi; Nugrahani Putri, Aulia
2017-06-01
The GWOLR model combines the geographically weighted regression (GWR) and ordinal logistic regression (OLR) models. Its parameter estimation employs maximum likelihood estimation. Such parameter estimation, however, yields a difficult-to-solve system of nonlinear equations, so a numerical approximation approach is required. The iterative approximation approach generally uses the Newton-Raphson (NR) method. The NR method has a disadvantage: its Hessian matrix of second derivatives must be recomputed at each iteration, and the iteration does not always converge. With regard to this matter, the NR method is modified by replacing its Hessian matrix with the Fisher information matrix, a variant termed Fisher scoring (FS). The present research seeks to determine the GWOLR model parameter estimates using the Fisher scoring method and to apply the estimation to data on the level of vulnerability to Dengue Hemorrhagic Fever (DHF) in Semarang. The research concludes that health facilities make the greatest contribution to the probability of the number of DHF sufferers in both villages. Based on the number of sufferers, the IR category of DHF in both villages can be determined.
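For an ordinary (non-geographically-weighted) binary logistic model, Fisher scoring replaces the Newton-Raphson Hessian with the Fisher information X'WX, W = diag(p(1-p)). A simplified stand-in for the GWOLR estimation described above, on synthetic data:

```python
import numpy as np

def fisher_scoring_logistic(X, y, iters=25):
    # Fisher scoring for a binary logistic model: beta <- beta + I^{-1} U,
    # where U = X'(y - p) is the score vector and I = X'WX, W = diag(p(1-p)),
    # is the Fisher information replacing the Newton-Raphson Hessian.
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        U = X.T @ (y - p)
        I = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(I, U)
    return beta

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
true_beta = np.array([-0.5, 1.2])
y = (rng.uniform(size=500) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)
beta_hat = fisher_scoring_logistic(X, y)
assert np.allclose(beta_hat, true_beta, atol=0.5)   # recovered to sampling error
```

For the logistic link the expected and observed information coincide, but the expected-information update is the form that generalizes cleanly to the ordinal and geographically weighted cases.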
Ultrasonic 3-D Vector Flow Method for Quantitative In Vivo Peak Velocity and Flow Rate Estimation.
Holbek, Simon; Ewertsen, Caroline; Bouzari, Hamed; Pihl, Michael Johannes; Hansen, Kristoffer Lindskov; Stuart, Matthias Bo; Thomsen, Carsten; Nielsen, Michael Bachmann; Jensen, Jorgen Arendt
2017-03-01
Current clinical ultrasound (US) systems are limited to show blood flow movement in either 1-D or 2-D. In this paper, a method for estimating 3-D vector velocities in a plane using the transverse oscillation method, a 32×32 element matrix array, and the experimental US scanner SARUS is presented. The aim of this paper is to estimate precise flow rates and peak velocities derived from 3-D vector flow estimates. The emission sequence provides 3-D vector flow estimates at up to 1.145 frames/s in a plane, and was used to estimate 3-D vector flow in a cross-sectional image plane. The method is validated in two phantom studies, where flow rates are measured in a flow-rig, providing a constant parabolic flow, and in a straight-vessel phantom (∅ = 8 mm) connected to a flow pump capable of generating time varying waveforms. Flow rates are estimated to be 82.1 ± 2.8 L/min in the flow-rig compared with the expected 79.8 L/min, and to 2.68 ± 0.04 mL/stroke in the pulsating environment compared with the expected 2.57 ± 0.08 mL/stroke. Flow rates estimated in the common carotid artery of a healthy volunteer are compared with magnetic resonance imaging (MRI) measured flow rates using a 1-D through-plane velocity sequence. Mean flow rates were 333 ± 31 mL/min for the presented method and 346 ± 2 mL/min for the MRI measurements.
Reflexion on linear regression trip production modelling method for ensuring good model quality
Suprayitno, Hitapriya; Ratnasari, Vita
2017-11-01
Transport modelling is important. For certain cases the conventional model still has to be used, for which a good trip production model is essential. A good model can only be obtained from a good sample. Two basic principles of good sampling are that the sample must be capable of representing the population characteristics and of producing an acceptable error at a certain confidence level. These principles do not yet seem to be well understood and applied in trip production modelling. It is therefore necessary to investigate trip production modelling practice in Indonesia and to formulate a better modelling method that ensures model quality. The results are as follows. Statistics provides a method for calculating the span of predicted values at a certain confidence level for linear regression, called the confidence interval of the predicted value. Common modelling practice uses R2 as the principal quality measure, while sampling practice varies and does not always conform to sampling principles. An experiment indicates that a small sample can already give an excellent R2 value and that sample composition can significantly change the model. Hence, a good R2 value does not always mean good model quality. This leads to three basic ideas for ensuring good model quality: reformulating the quality measure, the calculation procedure, and the sampling method. The quality measure is defined as having both a good R2 value and a good confidence interval of the predicted value. The calculation procedure must incorporate statistical calculation methods and the appropriate statistical tests. A good sampling method must use random, well-distributed, stratified sampling with a certain minimum number of samples. These three ideas need to be further developed and tested.
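The confidence interval of the predicted value has a standard closed form for simple linear regression. A sketch with hypothetical trip-production data (household size vs. trips per day; the numbers are invented for illustration):

```python
import math

def prediction_interval(x, y, x0, t_crit):
    # Confidence interval of the predicted mean response at x0:
    #   y_hat +/- t * s * sqrt(1/n + (x0 - x_bar)^2 / Sxx)
    # (add 1 under the square root for a new single observation);
    # t_crit is the t quantile for n-2 degrees of freedom.
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    sxx = sum((xi - xb) ** 2 for xi in x)
    b1 = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) / sxx
    b0 = yb - b1 * xb
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    half = t_crit * s * math.sqrt(1 / n + (x0 - xb) ** 2 / sxx)
    y0 = b0 + b1 * x0
    return y0 - half, y0 + half

# Hypothetical data: household size vs. trips per day
x = [1, 2, 2, 3, 3, 4, 4, 5, 6, 6]
y = [1.1, 2.3, 1.9, 3.2, 2.8, 4.1, 3.9, 5.2, 5.8, 6.3]
lo, hi = prediction_interval(x, y, x0=4.0, t_crit=2.306)  # t(0.975, df=8)
assert lo < hi and hi - lo < 2.0   # the interval widens away from the mean of x
```

Reporting this interval alongside R2 exposes models whose fit looks excellent but whose predictions are imprecise.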
Directory of Open Access Journals (Sweden)
Nina L. Timofeeva
2014-01-01
Full Text Available The article presents the methodological and technical bases for creating regression models that adequately reflect reality. The focus is on methods for removing residual autocorrelation in models. Algorithms for eliminating heteroscedasticity and autocorrelation of the regression residuals are given: the reweighted least squares method and the Cochrane-Orcutt procedure. A "pure" regression model is built, and, in order to compare the effects of the different explanatory variables on the dependent variable when the latter are expressed in different units, a standardized form of the regression equation is used. A scheme of techniques for abating heteroscedasticity and autocorrelation in regression models specific to the social and cultural sphere is developed.
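The Cochrane-Orcutt procedure referred to above can be sketched as follows: estimate the AR(1) coefficient ρ from the OLS residuals, quasi-difference the data, re-fit, and iterate (the data below are synthetic):

```python
import numpy as np

def cochrane_orcutt(X, y, iters=10):
    # Estimate rho from OLS residuals, quasi-difference (y_t - rho*y_{t-1}),
    # re-fit by OLS on the transformed data, and iterate until rho stabilises.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rho = 0.0
    for _ in range(iters):
        e = y - X @ beta
        rho = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])
        beta, *_ = np.linalg.lstsq(X[1:] - rho * X[:-1],
                                   y[1:] - rho * y[:-1], rcond=None)
    return beta, rho

# Synthetic trend data with AR(1) errors (rho = 0.7)
rng = np.random.default_rng(3)
n = 400
t = np.arange(n, dtype=float)
u = np.zeros(n)
for i in range(1, n):
    u[i] = 0.7 * u[i - 1] + rng.normal()
y = 2.0 + 0.5 * t + u
X = np.column_stack([np.ones(n), t])
beta, rho = cochrane_orcutt(X, y)
assert abs(rho - 0.7) < 0.15 and abs(beta[1] - 0.5) < 0.05
```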
Chen, Y. M.; Lin, P.; He, J. Q.; He, Y.; Li, X. L.
2016-01-01
This study was carried out for rapid and noninvasive determination of the class of sorghum species by using manifold dimensionality reduction (MDR) methods and the nonlinear regression method of least squares support vector machines (LS-SVM) combined with mid-infrared spectroscopy (MIRS) techniques. The Durbin and run tests of the augmented partial residual plot (APaRP) were performed to diagnose the nonlinearity of the raw spectral data. The nonlinear MDR methods of isometric feature mapping (ISOMAP), locally linear embedding, Laplacian eigenmaps and local tangent space alignment, as well as the linear MDR methods of principal component analysis and metric multidimensional scaling, were employed to extract the feature variables. The extracted characteristic variables were used as the inputs of LS-SVM to establish the relationship between the spectra and the target attributes. The mean average precision (MAP) scores and prediction accuracy were respectively used to evaluate the performance of the models. The prediction results showed that the ISOMAP-LS-SVM model obtained the best classification performance, with MAP scores and prediction accuracy of 0.947 and 92.86%, respectively. It can be concluded that the ISOMAP-LS-SVM model combined with the MIRS technique has the potential to classify the species of sorghum with reasonable accuracy.
Seifert, Veronica Aili
Lyme disease is the most prevalent tick-borne disease in North America and presents challenges to clinicians, researchers and the public in diagnosis, treatment and prevention. Lyme disease is caused by the spirochete, Borrelia burgdorferi, which is a zoonotic pathogen obligate upon hematophagous arthropod vectors and propagates in small mammal reservoir hosts. Identifying factors governing zoonotic diseases within regions of high-risk provides local health and agricultural agencies with necessary information to formulate public policy and implement treatment protocols to abate the rise and expansion of infectious disease outbreaks. In the United States, the documented primary reservoir host of Lyme disease is the white-footed mouse, Peromyscus leucopus, and the arthropod vector is the deer tick, Ixodes scapularis. Reducing the impact of Lyme disease will need novel methods for identifying both the reservoir host and the tick vector. The reservoir host, Peromyscus leucopus is difficult to distinguish from the virtually identical Peromyscus maniculatus that also is present in Northern Minnesota, a region where Lyme disease is endemic. Collection of the Ixodes tick, the Lyme disease vector, is difficult as this is season dependent and differs from year to year. This study develops new strategies to assess the extent of Borrelia burgdorferi in the local environment of Northern Minnesota. A selective and precise method to identify Peromyscus species was developed. This assay provides a reliable and definitive method to identify the reservoir host, Peromyscus leucopus from a physically identical and sympatric Peromyscus species, Peromyscus maniculatus. A new strategy to collect ticks for measuring the disbursement of Borrelia was employed. Students from local high schools were recruited to collect ticks. This strategy increased the available manpower to cover greater terrain, provided students with valuable experience in research methodology, and highlighted the
Delwiche, Stephen R; Reeves, James B
2010-01-01
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or to accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocessing functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of overreliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R(2)) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocessing functions and various
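Savitzky-Golay convolution weights arise from a least-squares polynomial fit over a centred window; scipy.signal.savgol_filter implements this, but the construction itself is short. A sketch that ignores edge handling:

```python
import math
import numpy as np

def savgol_weights(window, polyorder, deriv=0):
    # Fit a polynomial of the given order over a centred window by least
    # squares; the filter weights are the matching row of the Vandermonde
    # pseudo-inverse (scaled by deriv! for derivative filters).
    m = window // 2
    A = np.vander(np.arange(-m, m + 1, dtype=float), polyorder + 1,
                  increasing=True)
    return np.linalg.pinv(A)[deriv] * math.factorial(deriv)

# The 5-point quadratic smoothing case reproduces the classic weights
w = savgol_weights(5, 2)
assert np.allclose(w * 35.0, [-3.0, 12.0, 17.0, 12.0, -3.0])

# Applied to a noisy synthetic spectrum (interior points only)
rng = np.random.default_rng(4)
spectrum = np.sin(np.linspace(0.0, 3.0, 100)) + 0.01 * rng.normal(size=100)
smoothed = np.convolve(spectrum, w, mode="valid")
```

Widening the window or raising the derivative order changes how aggressively band shapes are altered, which is exactly the preprocessing choice the study warns against making on R(2) alone.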
Qian, Lin-Feng; Shi, Guo-Dong; Huang, Yong; Xing, Yu-Ming
2017-10-01
In vector radiative transfer, backward ray tracing is seldom used. We present a backward and forward Monte Carlo method to simulate vector radiative transfer in a two-dimensional graded index medium, which is new and different from the conventional Monte Carlo method. The backward and forward Monte Carlo method divides the ray tracing into two processes: backward tracing and forward tracing. In multidimensional graded index media, the trajectory of a ray is usually a three-dimensional curve. During the transport of a polarization ellipse, the curved ray trajectory induces geometrical effects and causes the Stokes parameters to change continuously. The solution processes for a non-scattering medium and an anisotropic scattering medium are analysed. We also analyse some parameters that influence the Stokes vector in two-dimensional graded index media. The research shows that the Q component of the Stokes vector cannot be ignored, whereas the U and V components are very small.
Geometrical Modification of Learning Vector Quantization Method for Solving Classification Problems
Directory of Open Access Journals (Sweden)
Korhan GÜNEL
2016-09-01
Full Text Available In this paper, a geometrical scheme is presented to show how to overcome a problem encountered when using the generalized delta learning rule within the competitive learning model. A theoretical methodology is introduced for describing the quantization of data via rotating prototype vectors on hyperspheres. The proposed learning algorithm is tested and verified on different multidimensional datasets, including a binary-class dataset and two multiclass datasets from the UCI repository, as well as a multiclass dataset constructed by us. The proposed method is compared with some baseline learning vector quantization variants from the literature for all domains. A large number of experiments verify the performance of our proposed algorithm, with acceptable accuracy and macro F1 scores.
Selecting minimum dataset soil variables using PLSR as a regressive multivariate method
Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.
2017-04-01
Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management, especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step in the calculation of SQIs. Current studies show the effectiveness of statistical variable selection methods in extracting relevant information from multivariate datasets. Principal component analysis (PCA) has mainly been used; however, supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least squares regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data derive from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water-extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean-centered and variance-scaled data of the predictor (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. In addition, variable importance for projection (VIP
Investigating the Accuracy of Three Estimation Methods for Regression Discontinuity Design
Sun, Shuyan; Pan, Wei
2013-01-01
Regression discontinuity design is an alternative to randomized experiments to make causal inference when random assignment is not possible. This article first presents the formal identification and estimation of regression discontinuity treatment effects in the framework of Rubin's causal model, followed by a thorough literature review of…
Energy Technology Data Exchange (ETDEWEB)
Urbanski, P.; Kowalska, E.
1997-12-31
The principle of the bootstrap methodology applied to the assessment of parameters and prediction ability of linear regression models was presented. Application of this method was shown using the example of the calibration of a radioisotope sulphuric acid concentration gauge. The bootstrap method allows one to determine not only the numerical values of the regression coefficients, but also to investigate their distributions. (author). 11 refs, 12 figs, 3 tabs.
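The case-resampling bootstrap for regression coefficients described here can be sketched as follows (the calibration-style data are synthetic, not the gauge data from the report):

```python
import numpy as np

def bootstrap_regression(X, y, n_boot=2000, seed=0):
    # Case-resampling bootstrap: refit OLS on resampled (X, y) pairs; the
    # spread of the draws approximates the coefficients' sampling distribution.
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, n)
        draws[b], *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    return draws

# Synthetic calibration-style data: response linear in concentration
rng = np.random.default_rng(5)
conc = rng.uniform(0.0, 10.0, 60)
X = np.column_stack([np.ones(60), conc])
y = 1.0 + 2.0 * conc + rng.normal(0.0, 0.3, 60)
draws = bootstrap_regression(X, y)
lo, hi = np.percentile(draws[:, 1], [2.5, 97.5])   # slope's bootstrap interval
assert hi - lo < 0.2 and abs(draws[:, 1].mean() - 2.0) < 0.1
```

Histogramming the columns of `draws` gives the empirical coefficient distributions the report refers to, not just point estimates.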
EPMLR: sequence-based linear B-cell epitope prediction method using multiple linear regression.
Lian, Yao; Ge, Meng; Pan, Xian-Ming
2014-12-19
B-cell epitopes have been studied extensively due to their immunological applications, such as peptide-based vaccine development, antibody production, and disease diagnosis and therapy. Despite several decades of research, the accurate prediction of linear B-cell epitopes has remained a challenging task. In this work, based on the antigen's primary sequence information, a novel linear B-cell epitope prediction model was developed using multiple linear regression (MLR). A 10-fold cross-validation test on a large non-redundant dataset was performed to evaluate the performance of our model. To alleviate the problem caused by the noise of the negative dataset, 300 experiments utilizing 300 sub-datasets were performed. We achieved an overall sensitivity of 81.8%, precision of 64.1% and area under the receiver operating characteristic curve (AUC) of 0.728. We have presented a reliable method for the identification of linear B-cell epitopes using the antigen's primary sequence information. Moreover, a web server, EPMLR, has been developed for linear B-cell epitope prediction: http://www.bioinfo.tsinghua.edu.cn/epitope/EPMLR/.
An enhanced method for sequence walking and paralog mining: TOPO® Vector-Ligation PCR
Directory of Open Access Journals (Sweden)
Davis Thomas M
2010-03-01
Full Text Available Abstract Background Although technological advances allow for the economical acquisition of whole genome sequences, many organisms' genomes remain unsequenced, and fully sequenced genomes may contain gaps. Researchers reliant upon partial genomic or heterologous sequence information require methods for obtaining unknown sequences from loci of interest. Various PCR-based techniques are available for sequence walking - i.e., the acquisition of unknown DNA sequence adjacent to known sequence. Many such methods require rigid, elaborate protocols and/or impose narrowly confined options in the choice of restriction enzymes for the necessary genomic digests. We describe a new method, TOPO® Vector-Ligation PCR (or TVL-PCR), that innovatively integrates available tools and familiar concepts to offer advantages as a means of both targeted sequence walking and paralog mining. Findings TVL-PCR exploits the ligation efficiency of the pCR®4-TOPO® (Invitrogen, Carlsbad, California) vector system to capture fragments of unknown sequence by creating chimeric molecules containing defined priming sites at both ends. Initially, restriction enzyme-digested genomic DNA is end-repaired to create 3' adenosine overhangs and is then ligated to pCR4-TOPO vectors. The ligation product pool is used directly as a template for nested PCR, using specific primers to target orthologous sequences, or degenerate primers to enable capture of paralogous gene family members. We demonstrated the efficacy of this method by capturing entire coding and partial promoter sequences of several strawberry Superman-like genes. Conclusions TVL-PCR is a convenient and efficient method for DNA sequence walking and paralog mining that is applicable to any organism for which relevant DNA sequence is available as a basis for primer design.
Roychoudhury, Aryadeep; Basu, Supratim; Sengupta, Dibyendu N
2009-10-01
The efficiencies of different transformation methods of the E. coli DH5α strain, induced by several cations such as Mg2+, Mn2+, Rb+ and especially Ca2+, with or without polyethylene glycol (PEG) and dimethyl sulfoxide (DMSO), were compared using the two commonly used plasmid vectors pCAMBIA1201 and pBI121. The widely used calcium chloride (CaCl2) method appeared to be the most efficient procedure, while the rubidium chloride (RbCl) method was the least effective. Several improvements to the classical CaCl2 method were found to further augment the transformation efficiency (TRE) for both vectors: repeated alternate cycles of heat shock followed by immediate cold, at least up to the third cycle; replacement of the heat-shock step by a single microwave pulse, and even more so by a double microwave treatment; and administration of combined heat shock-microwave treatments. The pre-treatment of CaCl2-competent cells with 5% (v/v) ethanol, accompanied by a single heat shock, also triggered the TRE, which was further enhanced when a combined heat shock-microwave treatment was applied. The minor alterations or improved approaches to the CaCl2 method suggested in the present study may thus find use in more efficient E. coli transformation.
Ilhan, Ilhan; Tezel, Gülay
2013-04-01
SNPs (single nucleotide polymorphisms) comprise millions of variations in the human genome and are therefore promising tools for disease-gene association studies. However, such studies are constrained by the high expense of genotyping millions of SNPs, so it is necessary to obtain a suitable subset of SNPs that accurately represents the rest. For this purpose, many methods have been developed to select a convenient subset of tag SNPs, but they provide only low prediction accuracy. In the present study, a new method, GA-SVM with parameter optimization, is developed and introduced. This method benefits from the support vector machine (SVM) and the genetic algorithm (GA) to predict SNPs and to select tag SNPs, respectively. Furthermore, it uses the particle swarm optimization (PSO) algorithm to optimize the C and γ parameters of the support vector machine. It is experimentally tested on a wide range of datasets, and the obtained results demonstrate that this method provides better prediction accuracy in identifying tag SNPs compared to other current methods. Copyright © 2012 Elsevier Inc. All rights reserved.
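The PSO step described above can be sketched as follows. This is a minimal particle swarm search over the SVM's C and γ on a generic synthetic dataset, not the authors' GA-SVM implementation; the swarm size, inertia, acceleration constants, and search bounds are all assumptions.

```python
# Sketch: PSO over SVM hyperparameters (C, gamma), searched in log10 space.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

def fitness(params):
    C, gamma = 10.0 ** params          # decode from log10 space
    return cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()

n_particles, n_iter = 8, 10
pos = rng.uniform(-3, 3, size=(n_particles, 2))   # log10(C), log10(gamma)
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(n_iter):
    r1, r2 = rng.random((2, n_particles, 1))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, -3, 3)
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

C, gamma = 10.0 ** gbest
print(f"best C={C:.3g}, gamma={gamma:.3g}, CV accuracy={pbest_fit.max():.3f}")
```

Each particle's position encodes one (C, γ) pair; the cross-validated accuracy serves as the fitness that the swarm maximizes.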
A primer on regression methods for decoding cis-regulatory logic
Energy Technology Data Exchange (ETDEWEB)
Das, Debopriya; Pellegrini, Matteo; Gray, Joe W.
2009-03-03
The rapidly emerging field of systems biology is helping us to understand the molecular determinants of phenotype on a genomic scale [1]. Cis-regulatory elements are major sequence-based determinants of biological processes in cells and tissues [2]. For instance, during transcriptional regulation, transcription factors (TFs) bind to very specific regions on the promoter DNA [2,3] and recruit the basal transcriptional machinery, which ultimately initiates mRNA transcription (Figure 1A). Learning cis-Regulatory Elements from Omics Data A vast amount of work over the past decade has shown that omics data can be used to learn cis-regulatory logic on a genome-wide scale [4-6]--in particular, by integrating sequence data with mRNA expression profiles. The most popular approach has been to identify over-represented motifs in promoters of genes that are coexpressed [4,7,8]. Though widely used, such an approach can be limiting for a variety of reasons. First, the combinatorial nature of gene regulation is difficult to explicitly model in this framework. Moreover, in many applications of this approach, expression data from multiple conditions are necessary to obtain reliable predictions. This can potentially limit the use of this method to only large data sets [9]. Although these methods can be adapted to analyze mRNA expression data from a pair of biological conditions, such comparisons are often confounded by the fact that primary and secondary response genes are clustered together--whereas only the primary response genes are expected to contain the functional motifs [10]. A set of approaches based on regression has been developed to overcome the above limitations [11-32]. These approaches have their foundations in certain biophysical aspects of gene regulation [26,33-35]. That is, the models are motivated by the expected transcriptional response of genes due to the binding of TFs to their promoters. While such methods have gathered popularity in the computational domain
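The regression idea outlined above, modeling a gene's expression as a function of the cis-regulatory motifs in its promoter, can be illustrated with a minimal linear model. The motif counts, coefficients, and noise level below are invented for illustration; real analyses would use counts from promoter scans and expression from omics profiles.

```python
# Sketch: regress gene expression on promoter motif counts; large fitted
# coefficients flag candidate regulatory motifs (and hence TFs).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n_genes, n_motifs = 500, 8
motif_counts = rng.poisson(2.0, size=(n_genes, n_motifs))

# Assume two motifs truly drive expression (an activator and a repressor).
true_beta = np.zeros(n_motifs)
true_beta[0], true_beta[1] = 1.5, -0.8
expression = motif_counts @ true_beta + rng.normal(0, 0.5, n_genes)

model = LinearRegression().fit(motif_counts, expression)
print(np.round(model.coef_, 2))
```

In this framework combinatorial regulation can be modeled explicitly, for instance by adding interaction terms between motif counts, which is the key advantage over over-representation analysis noted above.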
Stefanello, C; Vieira, S L; Xue, P; Ajuwon, K M; Adeola, O
2016-07-01
A study was conducted to determine the ileal digestible energy (IDE), ME, and MEn contents of bakery meal using the regression method and to evaluate whether the energy values are age-dependent in broiler chickens from zero to 21 d post hatching. Seven hundred and eighty male Ross 708 chicks were fed 3 experimental diets in which bakery meal was incorporated into a corn-soybean meal-based reference diet at zero, 100, or 200 g/kg by replacing the energy-yielding ingredients. A 3 × 3 factorial arrangement of 3 ages (1, 2, or 3 wk) and 3 dietary bakery meal levels was used. Birds were fed the same experimental diets at these 3 ages. Birds were grouped by weight into 10 replicates per treatment in a randomized complete block design. Apparent ileal digestibility and total tract retention of DM, N, and energy were calculated. Expression of mucin (MUC2), sodium-dependent phosphate transporter (NaPi-IIb), solute carrier family 7 (cationic amino acid transporter, Y(+) system, SLC7A2), glucose (GLUT2), and sodium-glucose linked transporter (SGLT1) genes was measured at each age in the jejunum by real-time PCR. Addition of bakery meal to the reference diet resulted in a linear decrease in retention of DM, N, and energy, and a quadratic reduction (P bakery meal did not affect jejunal gene expression. Expression of genes encoding MUC2, NaPi-IIb, and SLC7A2 linearly increased (P bakery meal linearly increased (P bakery meal was included and increased with age of broiler chickens. © 2016 Poultry Science Association Inc.
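The regression method used above rests on a simple idea: the test ingredient replaces energy-yielding ingredients at graded levels, and the slope of ingredient-associated digestible energy intake against ingredient intake estimates the ingredient's energy value. A minimal sketch, with illustrative numbers that are not the study's data:

```python
# Sketch of the regression method for ingredient energy values: the slope of
# DE intake attributable to the test ingredient vs. ingredient intake gives
# its digestible energy content (kcal/kg of ingredient).
import numpy as np

inclusion = np.array([0.0, 100.0, 200.0])       # g/kg of diet
ingredient_intake = inclusion / 1000.0          # kg ingredient per kg feed
de_intake = np.array([0.0, 310.0, 640.0])       # hypothetical kcal per kg feed

slope, intercept = np.polyfit(ingredient_intake, de_intake, 1)
print(f"estimated IDE ≈ {slope:.0f} kcal/kg of ingredient")
```

Repeating the fit within each age group is what allows the age dependence of the energy values to be tested.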
Directory of Open Access Journals (Sweden)
Lin Du
2016-06-01
Full Text Available Nitrogen is an essential nutrient element in crop photosynthesis and yield improvement. Thus, it is urgent and important to accurately estimate the leaf nitrogen content (LNC) of crops for precision nitrogen management. Based on the correlation between LNC and reflectance spectra, the hyperspectral LiDAR (HSL) system can determine three-dimensional structural parameters and biochemical changes of crops. Hence, HSL technology has been widely used to monitor the LNC of crops at leaf and canopy levels. In addition, the laser-induced fluorescence (LIF) of chlorophyll, related to the histological structure and physiological conditions of green plants, can also be utilized to detect nutrient stress in crops. In this study, four regression algorithms, support vector machines (SVMs), partial least squares (PLS) and two artificial neural networks (ANNs), back-propagation NNs (BP-NNs) and radial basis function NNs (RBF-NNs), were selected to estimate rice LNC in the booting and heading stages based on reflectance and LIF spectra. These four regression algorithms were used with 36 input variables, including the reflectance spectral variables at 32 wavelengths and four peaks of the LIF spectra. A feature weight algorithm was proposed to select different band combinations for the LNC retrieval models. The determination coefficient (R2) and the root mean square error (RMSE) of the retrieval models were utilized to compare their abilities to estimate the rice LNC. The experimental results demonstrate that (I) these four regression methods are useful for estimating rice LNC in the order of RBF-NNs > SVMs > BP-NNs > PLS; (II) the LIF data in two forms, peaks and indices, display potential in rice LNC retrieval, especially when using the PLS regression (PLSR) model for the relationship of rice LNC with spectral variables. The feature weighting algorithm is an effective and necessary method to determine appropriate band combinations for rice LNC estimation.
A Numerical Comparison of Rule Ensemble Methods and Support Vector Machines
Energy Technology Data Exchange (ETDEWEB)
Meza, Juan C.; Woods, Mark
2009-12-18
Machine or statistical learning is a growing field that encompasses many scientific problems, including estimating parameters from data, identifying risk factors in health studies, image recognition, and finding clusters within datasets, to name just a few examples. Statistical learning can be described as 'learning from data', with the goal of making a prediction of some outcome of interest. This prediction is usually made on the basis of a computer model that is built using data where the outcomes and a set of features have been previously matched. The computer model is called a learner, hence the name machine learning. In this paper, we present two such algorithms, a support vector machine method and a rule ensemble method. We compared their predictive power on three type Ia supernova data sets provided by the Nearby Supernova Factory and found that while both methods give accuracies of approximately 95%, the rule ensemble method gives much lower false negative rates.
Tomigashi, Yoshio; Ueyama, Kenji
A method for directly estimating the axis corresponding to the current phase in maximum torque per ampere (MTPA) control is proposed for sensorless vector control. This axis is called the maximum torque control (MTC) axis. In past studies concerning such methods, the behavior of the axis has been considered only for the case of MTPA control and not for an arbitrary current vector (with flux weakening control) that has a phase different from that for MTPA control. This paper enhances the definition of the MTC axis for an arbitrary current vector, describes an extended EMF model based on the MTC axis, and presents a method that can directly estimate the axis for an arbitrary current vector with flux weakening control. The effectiveness of the proposed method is confirmed by numerical analysis results.
Applying the Support Vector Machine Method to Matching IRAS and SDSS Catalogues
Directory of Open Access Journals (Sweden)
Chen Cao
2007-10-01
Full Text Available This paper presents results of applying a machine learning technique, the Support Vector Machine (SVM), to the astronomical problem of matching the Infra-Red Astronomical Satellite (IRAS) and Sloan Digital Sky Survey (SDSS) object catalogues. In this study, the IRAS catalogue has much larger positional uncertainties than those of the SDSS. A model was constructed by applying the supervised learning algorithm (SVM) to a set of training data. Validation of the model shows a good identification performance (∼90% correct), better than that derived from classical cross-matching algorithms, such as the likelihood-ratio method used in previous studies.
Directory of Open Access Journals (Sweden)
Yukai Yao
2015-01-01
Full Text Available We propose an optimized support vector machine classifier, named PMSVM, in which system normalization and PCA are used for data preprocessing and a multilevel grid search is used for parameter optimization. The main goals of this study are to improve the classification efficiency and accuracy of SVM. Sensitivity, specificity, precision, the ROC curve, and other metrics are adopted to appraise the performance of PMSVM. Experimental results show that PMSVM achieves better accuracy and remarkably higher efficiency compared with traditional SVM algorithms.
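The preprocessing-plus-tuning scheme described above can be sketched as a scikit-learn pipeline: standardization, PCA, then a grid search over SVM hyperparameters. A single-level grid stands in for the paper's multilevel search, and the dataset is a generic benchmark, not the authors' data.

```python
# Sketch: normalization -> PCA -> SVM, with hyperparameters tuned by
# cross-validated grid search over C and gamma.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),      # normalization
    ("pca", PCA(n_components=10)),    # dimensionality reduction
    ("svm", SVC()),
])
grid = GridSearchCV(pipe, {"svm__C": [0.1, 1, 10],
                           "svm__gamma": ["scale", 0.01, 0.1]}, cv=5)
grid.fit(X_tr, y_tr)
print(f"test accuracy: {grid.score(X_te, y_te):.3f}")
```

A multilevel variant would re-run the search on a finer grid centred on the best coarse-grid point.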
A Shellcode Detection Method Based on Full Native API Sequence and Support Vector Machine
Cheng, Yixuan; Fan, Wenqing; Huang, Wei; An, Jing
2017-09-01
Dynamically monitoring the behavior of a program is widely used to discriminate between benign programs and malware, usually on the basis of dynamic characteristics of the program such as the API call sequence or API call frequency. The key innovation of this paper is to consider the full Native API sequence and use a support vector machine to detect shellcode. We also use a Markov chain to extract and digitize Native API sequence features. Our experimental results show that the method proposed in this paper achieves high accuracy and a low false-positive rate.
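The Markov-chain feature extraction can be sketched as follows: the empirical transition matrix of the API sequence is flattened into a numeric feature vector that a classifier such as an SVM could consume. The API names and trace below are illustrative, not taken from the paper.

```python
# Sketch: turn an API call sequence into Markov transition features.
import numpy as np

def transition_features(sequence, vocabulary):
    """Row-normalized transition matrix of the sequence, flattened."""
    index = {api: i for i, api in enumerate(vocabulary)}
    counts = np.zeros((len(vocabulary), len(vocabulary)))
    for a, b in zip(sequence, sequence[1:]):
        counts[index[a], index[b]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    probs = np.divide(counts, row_sums, out=np.zeros_like(counts),
                      where=row_sums > 0)     # rows with no outgoing calls -> 0
    return probs.ravel()

vocab = ["NtOpenFile", "NtReadFile", "NtWriteFile", "NtClose"]
trace = ["NtOpenFile", "NtReadFile", "NtReadFile", "NtWriteFile", "NtClose"]
features = transition_features(trace, vocab)
print(features.reshape(4, 4))
```

Each trace then becomes one fixed-length vector regardless of its original length, which is what makes SVM training straightforward.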
Functional regression method for whole genome eQTL epistasis analysis with sequencing data.
Xu, Kelin; Jin, Li; Xiong, Momiao
2017-05-18
Epistasis plays an essential role in understanding regulatory mechanisms and is a key component of the genetic architecture of gene expression. However, interaction analysis of gene expression remains fundamentally unexplored due to great computational challenges and limited data availability. Due to variation in splicing, transcription start sites, polyadenylation sites, post-transcriptional RNA editing across the entire gene, and transcription rates of the cells, RNA-seq measurements generate large expression variability and collectively create the observed position-level read count curves. A single number for measuring gene expression, which is widely used for microarray-measured gene expression analysis, is highly unlikely to sufficiently account for large expression variation across the gene. Simultaneously analyzing epistatic architecture using RNA-seq and whole genome sequencing (WGS) data poses enormous challenges. We develop a nonlinear functional regression model (FRGM) with functional responses, where the position-level read counts within a gene are taken as a function of genomic position, and functional predictors, where genotype profiles are viewed as a function of genomic position, for epistasis analysis with RNA-seq data. Instead of testing the interaction of all possible pairwise SNPs, the FRGM takes a gene as the basic unit for epistasis analysis: it tests for the interaction of all possible pairs of genes and uses all the accessible information to collectively test interaction between all possible pairs of SNPs within two genome regions. By large-scale simulations, we demonstrate that the proposed FRGM for epistasis analysis can achieve the correct type I error rate and has higher power to detect the interactions between genes than the existing methods. The proposed methods are applied to the RNA-seq and WGS data from the 1000 Genomes Project. The numbers of pairs of significantly interacting genes after Bonferroni correction
Energy Technology Data Exchange (ETDEWEB)
Pang, Hongfeng [Academy of Equipment, Beijing 101416 (China); College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha 410073 (China); Zhu, XueJun, E-mail: zhuxuejun1990@126.com [College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha 410073 (China); Pan, Mengchun; Zhang, Qi; Wan, Chengbiao; Luo, Shitu; Chen, Dixiang; Chen, Jinfei; Li, Ji; Lv, Yunxiao [College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha 410073 (China)
2016-12-01
Misalignment error is one key factor influencing the measurement accuracy of a geomagnetic vector measurement system; its calibration is difficult because the sensors measure different physical quantities and their coordinate frames are not directly observable. A new misalignment calibration method based on rotating a parallelepiped frame is proposed. Simulation and experimental results show the effectiveness of the calibration method. The experimental system mainly contains a DM-050 three-axis fluxgate magnetometer, an INS (inertial navigation system), an aluminium parallelepiped frame, and an aluminium plane base. Misalignment angles are calculated from the data measured by the magnetometer and INS after rotating the aluminium parallelepiped frame on the aluminium plane base. After calibration, the RMS errors of the geomagnetic north, vertical and east components are reduced from 349.441 nT, 392.530 nT and 562.316 nT to 40.130 nT, 91.586 nT and 141.989 nT respectively. - Highlights: • A new misalignment calibration method by rotating a parallelepiped frame is proposed. • It does not need to know sensor attitude information or local dip angle. • The calibration system attitude change angle is not strictly required. • It can be widely used when sensors measure different physical information. • Geomagnetic vector measurement error is reduced evidently.
Dynamic analysis of suspension cable based on vector form intrinsic finite element method
Qin, Jian; Qiao, Liang; Wan, Jiancheng; Jiang, Ming; Xia, Yongjun
2017-10-01
A vector finite element method is presented for the dynamic analysis of cable structures based on the vector form intrinsic finite element (VFIFE) method and the mechanical properties of suspension cables. First, the suspension cable is discretized into elements defined by space points, and the mass and external forces of the cable are lumped at these points. The structural form of the cable is described by the positions of the space points at different times. The equations of motion for the space points are established according to Newton's second law. Then, the element internal forces between the space points are derived from the flexible truss structure. Finally, the motion equations of the space points are solved by the central difference method with a reasonable time-integration step. The tangential tension of the bearing rope in a test ropeway with moving concentrated loads is calculated and compared with the experimental data. The results show that the tangential tension of the suspension cable with moving loads is consistent with the experimental data. The method has high computational precision and meets the requirements of engineering application.
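The central-difference update used to advance the space-point equations of motion can be sketched on the simplest possible system, a single point mass on a spring. This is only the time-integration kernel, not the full VFIFE cable formulation; the mass, stiffness, and step size are illustrative.

```python
# Sketch: explicit central-difference integration of m*x'' = -k*x,
# the same update applied per space point in the method above.
import numpy as np

m, k = 1.0, 100.0                 # mass and stiffness
dt = 0.001                        # step must resolve the natural period
steps = int(1.0 / dt)

x = 1.0                                        # initial displacement
x_prev = x + 0.5 * dt**2 * (-k * x) / m        # consistent start for v0 = 0
history = []
for _ in range(steps):
    force = -k * x                                   # internal (element) force
    x_next = 2 * x - x_prev + dt**2 * force / m      # central difference
    x_prev, x = x, x_next
    history.append(x)

omega = np.sqrt(k / m)            # analytic angular frequency for comparison
print(f"x(1.0) ≈ {history[-1]:.4f}, analytic {np.cos(omega * 1.0):.4f}")
```

Being explicit, the scheme needs no global stiffness matrix, which is the practical appeal of the VFIFE approach, but it is only conditionally stable, hence the remark above about a reasonable time-integration step.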
Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.
Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin
2017-04-01
As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only cure to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their classification ability with high accuracy rates. The accuracy of classification algorithms depends on the use of correct feature selection algorithms to reduce the dimension of datasets. In this study, the Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. To diagnose the disease, two essential types of feature selection methods, namely wrapper and filter approaches, were chosen to reduce the dimension of the Chronic Kidney Disease dataset. In the wrapper approach, the classifier subset evaluator with the greedy stepwise search engine and the wrapper subset evaluator with the Best First search engine were used. In the filter approach, the correlation feature selection subset evaluator with the greedy stepwise search engine and the filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier using the filtered subset evaluator with the Best First search engine feature selection method has a higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease compared to the other selected methods.
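The filter idea, scoring features independently of the classifier before training the SVM, can be sketched as follows. A univariate F-test filter (SelectKBest) stands in for the correlation-based filter evaluators named above, and the dataset is synthetic with 24 features (the CKD dataset's feature count), not the actual CKD data.

```python
# Sketch: filter-style feature selection before SVM classification,
# compared against using all features.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=24,
                           n_informative=6, random_state=0)

full = make_pipeline(StandardScaler(), SVC())
filtered = make_pipeline(StandardScaler(),
                         SelectKBest(f_classif, k=10),   # keep top-10 features
                         SVC())

for name, model in [("all features", full), ("filter k=10", filtered)]:
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: CV accuracy = {acc:.3f}")
```

A wrapper approach would instead search feature subsets by repeatedly training the SVM itself, which is costlier but tailored to the classifier.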
Directory of Open Access Journals (Sweden)
Sergei Vladimirovich Varaksin
2017-06-01
Full Text Available Purpose. Construction of a mathematical model of the dynamics of childbearing change in the Altai region in 2000–2016, and analysis of the dynamics of changes in birth rates for multiple age categories of women of childbearing age. Methodology. An auxiliary element of the analysis is the construction of linear mathematical models of the dynamics of childbearing using the fuzzy linear regression method based on fuzzy numbers. Fuzzy linear regression is considered as an alternative to standard statistical linear regression for short time series with an unknown distribution law. The parameters of the fuzzy linear and standard statistical regressions for the childbearing time series were estimated using algorithms implemented in MatLab. The fuzzy linear regression method has not previously been used in sociological research. Results. Conclusions are drawn about the socio-demographic changes in society, the high efficiency of the demographic policy of the leadership of the region and the country, and the applicability of the fuzzy linear regression method for sociological analysis.
Community effectiveness of pyriproxyfen as a dengue vector control method: A systematic review.
Maoz, Dorit; Ward, Tara; Samuel, Moody; Müller, Pie; Runge-Ranzinger, Silvia; Toledo, Joao; Boyce, Ross; Velayudhan, Raman; Horstick, Olaf
2017-07-01
Vector control is the only widely utilised method for primary prevention and control of dengue. The use of pyriproxyfen may be promising, and the autodissemination approach may reach hard-to-reach breeding places. It offers a unique mode of action (juvenile hormone mimic) and serves as an additional tool for the management of insecticide resistance among Aedes vectors. However, evidence of efficacy and community effectiveness (CE) remains limited. The aim of this systematic review is to compile and analyse the existing literature for evidence on the CE of pyriproxyfen as a vector control method for reducing Ae. aegypti and Ae. albopictus populations and thereby human dengue transmission. A systematic search of PubMed, Embase, Lilacs, the Cochrane library, WHOLIS, Web of Science and Google Scholar, as well as the reference lists of all identified studies, was conducted. Removal of duplicates, screening of abstracts and assessment for eligibility of the remaining studies followed. Relevant data were extracted, and a quality assessment conducted. Results were classified into four main categories of how pyriproxyfen was applied: 1) container treatment, 2) fumigation, 3) auto-dissemination or 4) combination treatments, and analysed with a view to their public health implications. Out of 745 studies, 17 were identified that fulfilled all eligibility criteria. The results show that pyriproxyfen can be effective in reducing the numbers of Aedes spp. immatures with different methods of application when targeting their main breeding sites. However, the combination of pyriproxyfen with a second product increases the efficacy and/or persistence of the intervention and may also slow down the development of insecticide resistance. Open questions concern the concentration and frequency of application in the various treatments. Area-wide ultra-low volume treatment with pyriproxyfen currently lacks evidence and cannot be recommended. Community participation and acceptance has not consistently been successful and needs to
Al-Ghraibah, Amani
Solar flares release stored magnetic energy in the form of radiation and can have significant detrimental effects on Earth, including damage to technological infrastructure. Recent work has considered methods to predict future flare activity on the basis of quantitative measures of the solar magnetic field. Accurate advanced warning of solar flare occurrence is an area of increasing concern and much research is ongoing in this area. Our previous work [111] utilized standard pattern recognition and classification techniques to determine (classify) whether a region is expected to flare within a predictive time window, using a Relevance Vector Machine (RVM) classification method. We extracted 38 features describing the complexity of the photospheric magnetic field; the resulting classification metrics provide the baseline against which we compare our new work. We find a true positive rate (TPR) of 0.8, a true negative rate (TNR) of 0.7, and a true skill score (TSS) of 0.49. This dissertation proposes three basic topics. The first topic is an extension to our previous work [111], where we consider a feature selection method to determine an appropriate feature subset with cross-validation classification based on a histogram analysis of selected features. Classification using the top five features resulting from this analysis yields better classification accuracies across a large unbalanced dataset. In particular, the feature subsets provide better discrimination of the many regions that flare, where we find a TPR of 0.85, a TNR of 0.65, slightly lower than our previous work, and a TSS of 0.5, an improvement over our previous work. In the second topic, we study the prediction of solar flare size and time-to-flare using support vector regression (SVR). When we consider flaring regions only, we find an average error in estimating flare size of approximately half a GOES class. When we additionally consider non-flaring regions, we find an increased average
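The skill metrics quoted above are related by TSS = TPR + TNR - 1, which can be checked with a short computation from a hypothetical confusion matrix (the counts below are chosen only to reproduce the quoted rates):

```python
# Sketch: sensitivity, specificity, and true skill statistic from
# confusion-matrix counts (tp, fn, tn, fp are illustrative).
def skill_scores(tp, fn, tn, fp):
    tpr = tp / (tp + fn)          # sensitivity / true positive rate
    tnr = tn / (tn + fp)          # specificity / true negative rate
    return tpr, tnr, tpr + tnr - 1.0

tpr, tnr, tss = skill_scores(tp=85, fn=15, tn=65, fp=35)
print(f"TPR={tpr:.2f}, TNR={tnr:.2f}, TSS={tss:.2f}")
```

TSS is popular for unbalanced flare datasets because, unlike raw accuracy, it is insensitive to the ratio of flaring to non-flaring regions.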
Chen, Quan
2018-01-01
The impact of underground excavation on slope stability is controlled by many parameters, including the shape of the slope, the mechanical properties of soil and rock, the relative position of the excavation zone and the slip surface, and so on. The factor of safety (FOS) based on the limit equilibrium method (LEM) or the strength reduction method (SRM) is not suitable for evaluating this impact. The vector sum method (VSM) and orthogonal experiments are used to evaluate the impact through parameter sensitivity analysis. The results show that the VSM can be used in this research field, and that the slope gradient, the relative position between the excavation area and the slope, and the cohesion are the three factors that most significantly affect stability.
An Improved Array Steering Vector Estimation Method and Its Application in Speech Enhancement
Directory of Open Access Journals (Sweden)
Meng Hwa Er
2005-11-01
Full Text Available We propose a robust microphone array for speech enhancement and noise suppression. To overcome the target signal cancellation problem of conventional beamformers caused by array imperfections or reverberation effects, the proposed method adopts an arbitrary transfer function relating each microphone to the target speech signal as the array channel model. This is achieved in two ways. First, we propose a method to estimate the array steering vector (ASV) by exploiting the nonstationarity of the speech signal to combat stationary noise and interference. Next, with the estimated ASV, a robust matched-filter (MF) array-based generalized sidelobe canceller (MF-GSC) is constructed to enhance the speech signal and suppress noise/interference. In addition, it also has the capability to reduce the reverberation effects of the acoustic enclosure. Numerical results show that the proposed method achieves high performance even in adverse environments.
A Vector Flow Imaging Method for Portable Ultrasound Using Synthetic Aperture Sequential Beamforming
DEFF Research Database (Denmark)
di Ianni, Tommaso; Villagómez Hoyos, Carlos Armando; Ewertsen, Caroline
2017-01-01
This paper presents a vector flow imaging method for the integration of quantitative blood flow imaging in portable ultrasound systems. The method combines directional transverse oscillation (TO) and synthetic aperture sequential beamforming to yield continuous velocity estimation along the lateral and axial directions using a phase-shift estimator. The performance of the method was investigated with constant flow measurements in a flow rig system using the SARUS scanner and a 4.1-MHz linear array. A sequence was designed with interleaved B-mode and flow emissions to obtain continuous data acquisition. A parametric study was carried out to evaluate the effect of critical parameters. The vessel was placed at depths from 20 to 40 mm, with beam-to-flow angles of 65°, 75°, and 90°. For the lateral velocities at 20 mm, a bias between -5% and -6.2% was obtained.
Analysis of EEG signals by combining eigenvector methods and multiclass support vector machines.
Derya Ubeyli, Elif
2008-01-01
A new approach based on the implementation of a multiclass support vector machine (SVM) with error-correcting output codes (ECOC) is presented for the classification of electroencephalogram (EEG) signals. In practical applications of pattern recognition, there are often diverse features extracted from raw data that need to be recognized. Decision making was performed in two stages: feature extraction by eigenvector methods and classification using classifiers trained on the extracted features. The aim of the study is the classification of EEG signals by the combination of eigenvector methods and a multiclass SVM. The purpose is to determine an optimum classification scheme for this problem and also to infer clues about the extracted features. The present research demonstrated that the eigenvector methods produce features that represent the EEG signals well, and the multiclass SVM trained on these features achieved high classification accuracies.
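The multiclass SVM with ECOC can be sketched with scikit-learn's OutputCodeClassifier, which implements exactly this scheme: each class is assigned a binary codeword, one binary SVM is trained per code bit, and prediction picks the class whose codeword is nearest to the vector of bit outputs. A generic multiclass dataset stands in for the EEG features.

```python
# Sketch: multiclass SVM via error-correcting output codes (ECOC).
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OutputCodeClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# code_size controls codeword length relative to the number of classes;
# longer codes add redundancy (the "error-correcting" part).
ecoc_svm = OutputCodeClassifier(SVC(), code_size=2.0, random_state=0)
acc = cross_val_score(ecoc_svm, X, y, cv=5).mean()
print(f"5-fold CV accuracy: {acc:.3f}")
```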
Mercedes Berterretche; Andrew T. Hudak; Warren B. Cohen; Thomas K. Maiersperger; Stith T. Gower; Jennifer Dungan
2005-01-01
This study compared aspatial and spatial methods of using remote sensing and field data to predict maximum growing season leaf area index (LAI) maps in a boreal forest in Manitoba, Canada. The methods tested were orthogonal regression analysis (reduced major axis, RMA) and two geostatistical techniques: kriging with an external drift (KED) and sequential Gaussian...
The development of vector based 2.5D print methods for a painting machine
Parraman, Carinna
2013-02-01
Through recent trends in the application of digitally printed decorative finishes to products, CAD, 3D additive layer manufacturing and research in material perception [1, 2], there is a growing interest in the accurate rendering of materials and tangible displays. Although current advances in colour management and inkjet printing have meant that users can take for granted high-quality colour and resolution in their printed images, digital methods for transferring a photographic coloured image from screen to paper are constrained by pixel count, file size, colorimetric conversion between colour spaces and the gamut limits of input and output devices. This paper considers new approaches to applying alternative colour palettes using a vector-based approach through the application of paint mixtures, towards what could be described as a 2.5D printing method. The objective is not to apply an image to a textured surface, but to make texture and colour integral to the mark that, like a brush, delineates the contours in the image. The paper describes the difference between the way inks and paints are mixed and applied. When transcribing the fluid appearance of a brush stroke, there is a difference between a halftone printed mark and a painted mark. The issue of surface quality is significant to subjective qualities when studying the appearance of ink or paint on paper. The paper provides examples of a range of vector marks that are then transcribed into brush strokes by the painting machine.
Regression methods for spatially correlated data: an example using beetle attacks in a seed orchard
Preisler Haiganoush; Nancy G. Rappaport; David L. Wood
1997-01-01
We present a statistical procedure for studying the simultaneous effects of observed covariates and unmeasured spatial variables on responses of interest. The procedure uses regression type analyses that can be used with existing statistical software packages. An example using the rate of twig beetle attacks on Douglas-fir trees in a seed orchard illustrates the...
Liebezeit, J.R.; Smith, P.A.; Lanctot, R.B.; Schekkerman, H.; Tulp, I.Y.M.; Kendall, S.J.; Tracy, D.M.; Rodrigues, R.J.; Meltofte, H.; Robinson, J.A.; Gratto-Trevor, C.; Mccaffery, B.J.; Morse, J.; Zack, S.W.
2007-01-01
We modeled the relationship between egg flotation and age of a developing embryo for 24 species of shorebirds. For 21 species, we used regression analyses to estimate hatching date by modeling egg angle and float height, measured as continuous variables, against embryo age. For eggs early in
Sample Size Determination for Regression Models Using Monte Carlo Methods in R
Beaujean, A. Alexander
2014-01-01
A common question asked by researchers using regression models is, What sample size is needed for my study? While there are formulae to estimate sample sizes, their assumptions are often not met in the collected data. A more realistic approach to sample size determination requires more information such as the model of interest, strength of the…
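Beaujean's approach is implemented in R; the same Monte Carlo logic can be sketched in plain Python. All values below (effect size, noise level, the normal-approximation cutoff standing in for an exact t-test) are illustrative assumptions, not values from the article:

```python
import math
import random

def simulate_power(n, beta=0.3, sigma=1.0, crit=1.96, reps=1000, seed=7):
    """Monte Carlo power estimate for the slope in y = beta*x + noise.

    A normal-approximation cutoff (crit) stands in for the exact t-test
    a full implementation would use.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta * xi + rng.gauss(0, sigma) for xi in x]
        mx, my = sum(x) / n, sum(y) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
        resid = [yi - my - b * (xi - mx) for xi, yi in zip(x, y)]
        se = math.sqrt(sum(r * r for r in resid) / (n - 2) / sxx)
        hits += abs(b / se) > crit
    return hits / reps

# Increase n until the simulated power clears the target, e.g. 0.80.
print(simulate_power(20), simulate_power(200))
```

The simulated power grows with n, so sample size determination amounts to searching for the smallest n whose simulated power reaches the target.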
Directory of Open Access Journals (Sweden)
Fabio Faria da Mota
Full Text Available BACKGROUND: Chagas disease is a trypanosomiasis whose agent is the protozoan parasite Trypanosoma cruzi, which is transmitted to humans by hematophagous bugs known as triatomines. Even though insecticide treatments allow effective control of these bugs in most Latin American countries where Chagas disease is endemic, the disease still affects a large proportion of the population of South America. The features of the disease in humans have been extensively studied, and the genome of the parasite has been sequenced, but no effective drug is yet available to treat Chagas disease. The digestive tract of the insect vectors in which T. cruzi develops has been much less well investigated than blood from its human hosts and constitutes a dynamic environment with very different conditions. Thus, we investigated the composition of the predominant bacterial species of the microbiota in insect vectors from the Rhodnius, Triatoma, Panstrongylus and Dipetalogaster genera. METHODOLOGY/PRINCIPAL FINDINGS: Microbiota of triatomine guts were investigated using cultivation-independent methods, i.e., phylogenetic analysis of 16S rDNA using denaturing gradient gel electrophoresis (DGGE) and clone-based sequencing. The Chao index showed that the diversity of bacterial species in triatomine guts is low, comprising fewer than 20 predominant species, and that these species vary between insect species. The analyses showed that Serratia predominates in Rhodnius, Arsenophonus predominates in Triatoma and Panstrongylus, while Candidatus Rohrkolberia predominates in Dipetalogaster. CONCLUSIONS/SIGNIFICANCE: The microbiota of triatomine guts represents one of the factors that may interfere with T. cruzi transmission and virulence in humans. The knowledge of its composition according to insect species is important for designing measures of biological control for T. cruzi. We found that the predominant species of the bacterial microbiota in triatomines form a group of low
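The Chao richness estimator cited above has a closed form; a minimal sketch of the bias-corrected Chao1 formula, with hypothetical clone counts rather than data from the study:

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-species counts.

    S_chao1 = S_obs + F1*(F1 - 1) / (2*(F2 + 1)), where F1 and F2 are
    the numbers of species seen exactly once and exactly twice.
    """
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)
    f1 = sum(1 for c in observed if c == 1)
    f2 = sum(1 for c in observed if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# A sample dominated by a few abundant genera with few singletons
# yields an estimate close to the observed richness (low diversity).
clone_counts = [120, 40, 33, 2, 2, 1]
print(chao1(clone_counts))
```

Many singletons inflate the estimate above the observed count, which is how the index flags unseen diversity.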
A star tracker on-orbit calibration method based on vector pattern match
Li, Jian; Xiong, Kun; Wei, Xinguo; Zhang, Guangjun
2017-04-01
On-orbit calibration revises the star tracker's measurement model parameters to maintain its attitude accuracy, and the performance of existing calibration methods is quite poor. Among all the model parameters, estimating the principal point location is especially challenging because of its vulnerability to measurement errors; yet, as the only parameter depicting the optical axis' projected position on the image plane, it is of great significance, and its estimation error adds a fixed bias to the output attitudes. Based on the criterion of vector pattern match, an on-orbit calibration method is proposed. The principal point location is first estimated according to this criterion; the other model parameters are then updated by the maximum likelihood method, and optimization over multiple succeeding frames and star density weighting are adopted to guarantee the robustness of the estimation. Simulation and night sky observation results prove the validity of the proposed method. In a simulation with a poor initial guess of the principal point location, the novel method's results are better than those of the least squares method and Samaan's method.
Ji, Yanju; Huang, Wanyu; Yu, Mingmei; Guan, Shanshan; Wang, Yuan; Zhu, Yu
2017-01-01
This article studies a full-waveform associated identification method for airborne time-domain electromagnetic method (ATEM) 3-D anomalies based on multiple linear regression analysis. Using a convolution algorithm, full-waveform theoretical responses are computed to derive a sample library including switch-off-time period responses and off-time period responses. Full-waveform attributes are extracted from the theoretical responses to derive linear regression equations, which are used to identify the geological parameters. To further improve the precision, we optimize the identification method by separating the sample library into different groups and identifying each parameter respectively. Performance of the full-waveform associated identification method on field data from wire-loop test experiments with an ATEM system in Daedao, Changchun, proves that the method is practically feasible.
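The regression step of such an identification method reduces to ordinary least squares; a self-contained sketch using the normal equations (the ATEM attribute extraction itself is not shown, and the data are synthetic):

```python
def fit_linear_regression(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal
    equations, solved with Gaussian elimination. An illustrative
    stand-in for the regression step of the identification method."""
    rows = [[1.0] + list(r) for r in X]          # add intercept column
    p = len(rows[0])
    # Build A = X^T X and b = X^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    # Gaussian elimination with partial pivoting
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * p
    for i in reversed(range(p)):
        coef[i] = (b[i] - sum(A[i][j] * coef[j]
                              for j in range(i + 1, p))) / A[i][i]
    return coef  # [b0, b1, ...]

# Exact recovery on noise-free synthetic data: y = 2 + 3*x1 - x2
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [2 + 3 * x1 - x2 for x1, x2 in X]
print(fit_linear_regression(X, y))
```

On noise-free data the coefficients are recovered exactly; with noisy attributes the same solve yields the least-squares estimate.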
Directory of Open Access Journals (Sweden)
Igor K. Kochanenko
2013-01-01
Full Text Available Procedures for constructing a regression curve by the criterion of least fractals, i.e., the greatest probability of the sums of powers of the least deviations of the measured intensities from their model values, are justified. The exponent is defined as the fractal dimension of the time series. The difference between the results of the justified method and those of the method of least squares is quantitatively estimated.
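The criterion can be read as fitting a line by minimizing the sum of absolute deviations raised to a power p; a brute-force sketch under that reading (the link between p and fractal dimension is not implemented here, p is simply an input, and the data are hypothetical):

```python
def lp_cost(a, b, xs, ys, p):
    """Sum of absolute deviations raised to the power p
    (p = 2 gives ordinary least squares)."""
    return sum(abs(y - (a + b * x)) ** p for x, y in zip(xs, ys))

def fit_lp(xs, ys, p, a_grid, b_grid):
    """Brute-force search for the line minimizing the L^p criterion."""
    return min(((a, b) for a in a_grid for b in b_grid),
               key=lambda ab: lp_cost(ab[0], ab[1], xs, ys, p))

xs = [0, 1, 2, 3, 4, 5]
ys = [0.0, 1.0, 2.0, 3.1, 4.0, 12.0]        # one gross outlier
grid = [i / 10 for i in range(-20, 41)]
a2, b2 = fit_lp(xs, ys, 2.0, grid, grid)    # least squares
a1, b1 = fit_lp(xs, ys, 1.0, grid, grid)    # least absolute deviations
# The p = 1 fit tracks the bulk of the data; p = 2 is pulled by the outlier.
print((b1, b2))
```

Exponents below 2 down-weight large deviations, which is why the choice of p changes how strongly outliers distort the fitted curve.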
Directory of Open Access Journals (Sweden)
Yuan Chuanlai
2014-06-01
Full Text Available This paper takes a software approach to temperature compensation for the drift problem of a piezoresistive waist force sensor. The compensation algorithm is modeled using the PLS method, and the compensation module is designed based on the multiple regression method. According to the simulation results, the designed system meets the basic requirements for drift correction. Finally, actual data are used to validate the algorithm.
Community effectiveness of indoor spraying as a dengue vector control method: A systematic review.
Directory of Open Access Journals (Sweden)
Moody Samuel
2017-08-01
Full Text Available The prevention and control of dengue rely mainly on vector control methods, including indoor residual spraying (IRS) and indoor space spraying (ISS). This study aimed to systematically review the available evidence on community effectiveness of indoor spraying. A systematic review was conducted using seven databases (PubMed, EMBASE, LILACS, Web of Science, WHOLIS, Cochrane, and Google Scholar) and a manual search of the reference lists of the identified studies. Data from included studies were extracted, analysed and reported. The review generated only seven studies, three IRS and four ISS (two and three controlled studies, respectively). Two IRS studies measuring human transmission showed a decline. One IRS and all four ISS studies measuring adult mosquitoes showed a very good effect, up to 100%, but not sustained. Two IRS studies and one ISS study measuring immature mosquitoes showed mixed results. It is evident that IRS and also ISS are effective adulticidal interventions against Aedes mosquitoes. However, evidence to suggest effectiveness of IRS as a larvicidal intervention and to reduce human dengue cases is limited, and even more so for ISS. Overall, there is a paucity of studies available on these two interventions that may be promising for dengue vector control, particularly for IRS with its residual effect.
Community effectiveness of indoor spraying as a dengue vector control method: A systematic review.
Samuel, Moody; Maoz, Dorit; Manrique, Pablo; Ward, Tara; Runge-Ranzinger, Silvia; Toledo, Joao; Boyce, Ross; Horstick, Olaf
2017-08-01
The prevention and control of dengue rely mainly on vector control methods, including indoor residual spraying (IRS) and indoor space spraying (ISS). This study aimed to systematically review the available evidence on community effectiveness of indoor spraying. A systematic review was conducted using seven databases (PubMed, EMBASE, LILACS, Web of Science, WHOLIS, Cochrane, and Google Scholar) and a manual search of the reference lists of the identified studies. Data from included studies were extracted, analysed and reported. The review generated seven studies only, three IRS and four ISS (two/three controlled studies respectively). Two IRS studies measuring human transmission showed a decline. One IRS and all four ISS studies measuring adult mosquitoes showed a very good effect, up to 100%, but not sustained. Two IRS studies and one ISS measuring immature mosquitoes, showed mixed results. It is evident that IRS and also ISS are effective adulticidal interventions against Aedes mosquitoes. However, evidence to suggest effectiveness of IRS as a larvicidal intervention and to reduce human dengue cases is limited-and even more so for ISS. Overall, there is a paucity of studies available on these two interventions that may be promising for dengue vector control, particularly for IRS with its residual effect.
Ragab, Marwa A A; Youssef, Rasha M
2013-11-01
A new hybrid chemometric method has been applied to emission response data. It deals with convolution of the emission data using 8-point sin x_i polynomials (discrete Fourier functions) after derivative treatment of these data. This new application was used for the simultaneous determination of Fexofenadine and Montelukast in bulk and pharmaceutical preparation. It was found beneficial in resolving the partially overlapping emission spectra of this mixture and in eliminating different types of interference common in spectrofluorimetry, such as overlapping emission spectra and self-quenching. Not only was this chemometric approach applied to the emission data, but the obtained data were also subjected to non-parametric linear regression analysis (Theil's method). The presented work compares the application of Theil's method in handling the response data with the least-squares parametric regression method, which is considered the de facto standard method used for regression. This work thus combines the advantages of derivative and convolution using discrete Fourier functions with the reliability and efficacy of non-parametric data analysis. Theil's method was found to be superior to the method of least squares as it could effectively circumvent any outlier data points.
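Theil's nonparametric regression has a compact definition: the slope is the median of all pairwise slopes and the intercept the median of the residual offsets. A sketch with hypothetical calibration data showing its robustness to an outlier:

```python
from statistics import median

def theil_sen(xs, ys):
    """Theil's nonparametric regression: slope = median of all pairwise
    slopes, intercept = median of (y_i - slope * x_i). Robust to
    outliers that would distort a least-squares calibration line."""
    slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
              for i in range(len(xs)) for j in range(i + 1, len(xs))
              if xs[j] != xs[i]]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept

# Hypothetical calibration data with one gross outlier at the end
conc   = [1, 2, 3, 4, 5, 6]
signal = [2.0, 4.1, 6.0, 8.0, 9.9, 30.0]
print(theil_sen(conc, signal))   # slope stays close to 2
```

A least-squares line through the same points would be dragged upward by the last point; the median of pairwise slopes ignores it.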
Korany, Mohamed A; Maher, Hadir M; Galal, Shereen M; Ragab, Marwa A A
2013-05-01
This manuscript discusses the application and comparison of three statistical regression methods for handling data: parametric, nonparametric, and weighted regression (WR). The data were obtained from different chemometric methods applied to high-performance liquid chromatography response data using the internal standard method. This was performed on the model drug Acyclovir, which was analyzed in human plasma with ganciclovir as internal standard; an in vivo study was also performed. Derivative treatment of the chromatographic response ratio data was followed by convolution of the resulting derivative curves using 8-point sin x_i polynomials (discrete Fourier functions). This work studies and compares the application of the WR method and Theil's method, a nonparametric regression (NPR) method, with the least-squares parametric regression (LSPR) method, which is considered the de facto standard method used for regression. When the assumption of homoscedasticity is not met for analytical data, a simple and effective way to counteract the great influence of the high concentrations on the fitted regression line is to use the WR method. WR was found to be superior to LSPR as the former assumes that the y-direction error in the calibration curve increases as x increases. Theil's NPR method was also found to be superior to LSPR as the former assumes that errors could occur in both the x- and y-directions and might not be normally distributed. Most of the results showed a significant improvement in precision and accuracy on applying the WR and NPR methods relative to LSPR.
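The WR idea can be sketched with closed-form weighted least squares for a straight line; the 1/x² weights below are a common choice for calibration data whose error grows with concentration, assumed here for illustration:

```python
def weighted_line(xs, ys, ws):
    """Weighted least-squares line: down-weights the noisy high
    concentrations so they do not dominate the fit (a sketch of the
    WR idea; weights are supplied by the caller, e.g. w = 1/x**2)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b   # intercept, slope

# Hypothetical calibration data whose scatter grows with x
xs = [1, 2, 5, 10, 20]
ys = [2.1, 3.9, 10.0, 20.4, 43.0]
intercept, slope = weighted_line(xs, ys, [1 / x**2 for x in xs])
print(intercept, slope)
```

With unit weights the same formulas reduce to ordinary least squares, so the weight vector is the only thing the WR step changes.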
Determination of benzo(a)pyrene content in PM10 using regression methods
Jacek Gębicki; Tomasz Ludkiewicz; Jacek Namieśnik
2015-01-01
The paper presents an attempt to apply multidimensional linear regression to the estimation of an empirical model describing the factors influencing B(a)P content in suspended dust PM10 in the Olsztyn and Elbląg city regions between 2010 and 2013. During this period the annual average concentration of B(a)P in PM10 exceeded the admissible level 1.5-3 times. The conducted investigations confirm that the reasons for the increase in B(a)P concentration are low-efficiency individual ...
OGAARD, B; TENBOSCH, JJ
This article describes a new nondestructive optical method for evaluation of lesion regression in vivo. White spot caries lesions were induced with orthodontic bands in two vital premolars of seven patients. The teeth were banded for 4 weeks with special orthodontic bands that allowed plaque
Eekhout, I.; Wiel, M.A. van de; Heymans, M.W.
2017-01-01
Background. Multiple imputation is a recommended method to handle missing data. For significance testing after multiple imputation, Rubin’s Rules (RR) are easily applied to pool parameter estimates. In a logistic regression model, to consider whether a categorical covariate with more than two levels
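Rubin's Rules themselves are a short computation; a sketch pooling a single coefficient over m imputed datasets (the estimates below are hypothetical log-odds, not values from the study):

```python
def pool_rubins_rules(estimates, variances):
    """Pool a parameter over m imputed datasets with Rubin's Rules:
    qbar = mean estimate; total variance T = Ubar + (1 + 1/m) * B,
    where Ubar is the mean within-imputation variance and B the
    between-imputation variance of the estimates."""
    m = len(estimates)
    qbar = sum(estimates) / m
    ubar = sum(variances) / m
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)
    t = ubar + (1 + 1 / m) * b
    return qbar, t

# Hypothetical log-odds estimates from m = 5 imputations
est = [0.52, 0.48, 0.55, 0.50, 0.45]
var = [0.040, 0.042, 0.039, 0.041, 0.043]
qbar, t = pool_rubins_rules(est, var)
print(qbar, t)
```

The total variance always exceeds the average within-imputation variance, reflecting the extra uncertainty due to the missing data.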
Braak, ter C.J.F.; Juggins, S.
1993-01-01
Weighted averaging regression and calibration form a simple, yet powerful method for reconstructing environmental variables from species assemblages. Based on the concepts of niche-space partitioning and ecological optima of species (indicator values), it performs well with noisy, species-rich data
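Weighted averaging regression and calibration can be stated in a few lines: species optima are abundance-weighted means of the environmental variable, and reconstruction is the abundance-weighted mean of the optima. A sketch with a toy two-species training set (all values illustrative):

```python
def wa_optima(abundance, env):
    """Weighted averaging regression: each species' optimum is the
    abundance-weighted mean of the environmental variable over sites.
    abundance[k][i] = abundance of species k at site i."""
    return [sum(a * e for a, e in zip(sp, env)) / sum(sp)
            for sp in abundance]

def wa_calibrate(sample, optima):
    """WA calibration: infer the environment of a new assemblage as
    the abundance-weighted mean of the species optima."""
    return sum(a * u for a, u in zip(sample, optima)) / sum(sample)

env = [4.0, 5.0, 6.0, 7.0]          # e.g. pH at 4 training sites
abundance = [[8, 4, 1, 0],           # species preferring low pH
             [0, 2, 5, 9]]           # species preferring high pH
optima = wa_optima(abundance, env)
print(wa_calibrate([5, 5], optima))  # assemblage mixing both species
```

A mixed assemblage is reconstructed between the two optima, which is the niche-space partitioning idea in its simplest form.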
Mofavvaz, Shirin; Sohrabi, Mahmoud Reza; Nezamzadeh-Ejhieh, Alireza
2017-07-05
In the present study, artificial neural networks (ANNs) and least squares support vector machines (LS-SVM), as intelligent methods based on absorption spectra in the range of 230-300 nm, have been used for determination of antihistamine decongestant contents. In the first step, one type of network (feed-forward back-propagation) from the artificial neural network family with two different training algorithms, Levenberg-Marquardt (LM) and gradient descent with momentum and adaptive learning rate back-propagation (GDX), was employed and its performance was evaluated; the LM algorithm performed better than the GDX algorithm. In the second step, the radial basis network was utilized and the results were compared with those of the previous network. In the last step, another intelligent method, the least squares support vector machine, was proposed to construct the antihistamine decongestant prediction model and the results were compared with those of the two aforementioned networks. The statistical parameters mean square error (MSE), regression coefficient (R2), and correlation coefficient (r), as well as mean recovery (%) and relative standard deviation (RSD), were used for selecting the best model among these methods. Moreover, the proposed methods were compared to high-performance liquid chromatography (HPLC) as a reference method. A one-way analysis of variance (ANOVA) test at the 95% confidence level applied to the comparison of the suggested and reference methods showed no significant differences between them.
Diagnostic Method of Diabetes Based on Support Vector Machine and Tongue Images
Directory of Open Access Journals (Sweden)
Jianfeng Zhang
2017-01-01
Full Text Available Objective. The purpose of this research is to develop a diagnostic method for diabetes based on standardized tongue images using a support vector machine (SVM). Methods. Tongue images of 296 diabetic subjects and 531 nondiabetic subjects were collected by the TDA-1 digital tongue instrument. Tongue body and tongue coating were separated by the division-merging method and the chrominance-threshold method. With extracted color and texture features of the tongue image as input variables, the diagnostic model of diabetes with SVM was trained. After optimizing the combination of SVM kernel parameters and input variables, the influences of the combinations on the model were analyzed. Results. After normalizing parameters of tongue images, the accuracy rate of diabetes prediction was increased from 77.83% to 78.77%. The accuracy rate and area under curve (AUC) were not reduced after reducing the dimensions of tongue features with principal component analysis (PCA), while substantially saving training time. During training for selecting SVM parameters by genetic algorithm (GA), the accuracy rate of cross-validation rose from about 72% to 83.06%. Finally, we compared with several state-of-the-art algorithms, and the experimental results show that our algorithm has the best predictive accuracy. Conclusions. The diagnostic method for diabetes on the basis of tongue images in Traditional Chinese Medicine (TCM) is of great value, indicating the feasibility of digitalized tongue diagnosis.
Determination of benzo(a)pyrene content in PM10 using regression methods
Directory of Open Access Journals (Sweden)
Jacek Gębicki
2015-12-01
Full Text Available The paper presents an attempt to apply multidimensional linear regression to the estimation of an empirical model describing the factors influencing B(a)P content in suspended dust PM10 in the Olsztyn and Elbląg city regions between 2010 and 2013. During this period the annual average concentration of B(a)P in PM10 exceeded the admissible level 1.5-3 times. The conducted investigations confirm that the reasons for the increase in B(a)P concentration are low-efficiency individual home heating stations or low-temperature heat sources, which are responsible for the so-called low emission during the heating period. Dependences between the following quantities were analysed: concentration of PM10 dust in air, air temperature, wind velocity, and air humidity. A measure of model fit to the actual B(a)P concentration in PM10 was the coefficient of determination of the model. Application of multidimensional linear regression yielded equations characterized by high values of the coefficient of determination, especially during the heating season; this parameter ranged from 0.54 to 0.80 over the analyzed period.
An improved wave-vector frequency-domain method for nonlinear wave modeling.
Jing, Yun; Tao, Molei; Cannata, Jonathan
2014-03-01
In this paper, a recently developed wave-vector frequency-domain method for nonlinear wave modeling is improved and verified by numerical simulations and underwater experiments. Higher order numeric schemes are proposed that significantly increase the modeling accuracy, thereby allowing for a larger step size and shorter computation time. The improved algorithms replace the left-point Riemann sum in the original algorithm by the trapezoidal or Simpson's integration. Plane waves and a phased array were first studied to numerically validate the model. It is shown that the left-point Riemann sum, trapezoidal, and Simpson's integration have first-, second-, and third-order global accuracy, respectively. A highly focused therapeutic transducer was then used for experimental verifications. Short high-intensity pulses were generated. 2-D scans were conducted at a prefocal plane, which were later used as the input to the numerical model to predict the acoustic field at other planes. Good agreement is observed between simulations and experiments.
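The accuracy claim for the three quadrature rules can be checked on a scalar integral; a sketch comparing the left-point Riemann sum, trapezoidal, and Simpson's rules on the integral of exp over [0, 1] (a stand-in integrand, not the nonlinear-acoustics kernel):

```python
import math

def left_riemann(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))

def trapezoid(f, a, b, n):
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

def simpson(f, a, b, n):           # n must be even
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))
    even = sum(f(a + i * h) for i in range(2, n, 2))
    return h / 3 * (f(a) + f(b) + 4 * odd + 2 * even)

exact = math.e - 1.0               # integral of exp on [0, 1]
for n in (8, 16):
    errs = [abs(rule(math.exp, 0.0, 1.0, n) - exact)
            for rule in (left_riemann, trapezoid, simpson)]
    print(n, errs)                 # errors drop as O(h), O(h^2), O(h^4)
```

Halving the step roughly halves the Riemann error, quarters the trapezoidal error, and divides the Simpson error by sixteen, matching the first-, second-, and third-order global accuracy reported in the abstract.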
Fachrurrozi, Muhammad; Saparudin; Erwin
2017-04-01
A real-time monitoring and early detection system which measures the waste quality standard in the Musi River, Palembang, Indonesia, is a system for determining air and water pollution levels. The system was designed to create an integrated monitoring facility and provide readable real-time information. It is designed to measure acidity and water turbidity caused by industrial waste, and to show and provide conditional data integrated in one system. The system consists of inputting and processing the data, and giving output based on the processed data. Turbidity, substance, and pH sensors are used as detectors that produce an analog direct-current (DC) voltage. The early detection system works by determining threshold values for ammonia, acidity, and the turbidity level of the water in the Musi River. The results are then assigned to pollution-level groups by the support vector machine classification method.
Tracking Methods to Study the Surface Regression of the Solid-Propellant Grain
Directory of Open Access Journals (Sweden)
Yao Hsin Hwang
2014-10-01
Full Text Available In this work, we have developed practical surface tracking methods to calculate the erosive volume and the associated burning areas, which are the important parameters in solving nonlinear, pressurization-rate-dependent combustion ballistics. Three methodologies, namely the front tracking, emanating ray and least distance methods, are proposed. The front tracking method is based on the Lagrangian point of view, while both the emanating ray and the least distance methods are formulated from the Eulerian viewpoint. Two two-dimensional test problems have been examined to compare the programming complexity, simulation accuracy and required CPU time of the proposed methods. The least distance method is found to perform better than the other two methods in numerical respects. It is then implemented with tetrahedron grids to track the outward propagation of a three-dimensional cube. Comparison between the predicted erosive volume and the corresponding theoretical result yields satisfactory agreement.
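The least distance idea can be sketched on a 2-D grid: a cell counts as burned once its least distance to the initial surface falls below the burn depth. A minimal version for a circular port in a unit-square grain (an illustrative toy, not the paper's tetrahedron-grid implementation):

```python
import math

def burned_fraction(nx, ny, port_radius, web):
    """Least-distance surface tracking on a unit-square grain with a
    central circular port: a cell is burned once its least distance to
    the initial surface is below the burn depth (web)."""
    burned = total = 0
    for i in range(nx):
        for j in range(ny):
            x = (i + 0.5) / nx - 0.5
            y = (j + 0.5) / ny - 0.5
            d = math.hypot(x, y) - port_radius   # signed least distance
            if d < 0:
                continue                          # inside the port
            total += 1
            if d < web:
                burned += 1
    return burned / total

# The burned fraction grows monotonically with burn depth.
print([round(burned_fraction(160, 160, 0.1, w), 3)
       for w in (0.05, 0.1, 0.2)])
```

Differencing the burned volume over successive burn depths gives the burning area, the quantity fed into the ballistics solver.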
2017-12-01
The window method calls one a hit and the other a false alarm, so double-responding would not inflate the HR estimate. The regression method, however, does not have special handling of double responses, and they could inflate the HR estimate. Based on these data, we cannot know if these responses...
Efectivity of Additive Spline for Partial Least Square Method in Regression Model Estimation
Directory of Open Access Journals (Sweden)
Ahmad Bilfarsah
2005-04-01
Full Text Available The Additive Spline Partial Least Squares (ASPLS) method is a generalization of the Partial Least Squares (PLS) method. ASPLS can accommodate nonlinearity and multicollinearity among predictor variables. In principle, the ASPLS approach is characterized by two ideas: the first is to use parametric transformations of the predictors by spline functions; the second is to make the ASPLS components mutually uncorrelated, preserving the properties of the linear PLS components. The performance of ASPLS compared with other PLS methods is illustrated with a fisheries economics application, especially tuna fish production.
Cao, Jin; Zhang, Li; Wang, Bangjun; Li, Fanzhang; Yang, Jiwen
2015-02-01
For cancer classification problems based on gene expression, the data usually contain only a few dozen samples but thousands to tens of thousands of genes, which may include a large number of irrelevant genes. A robust feature selection algorithm is required to remove irrelevant genes and choose the informative ones. Support vector data description (SVDD) has been applied to gene selection for many years. However, SVDD cannot address problems with multiple classes since it only considers the target class. In addition, applying SVDD to gene selection is time-consuming. This paper proposes a novel fast feature selection method based on multiple SVDD and applies it to multi-class microarray data. A recursive feature elimination (RFE) scheme is introduced to iteratively remove irrelevant features, so the proposed method is called multiple SVDD-RFE (MSVDD-RFE). To make full use of all classes for a given task, MSVDD-RFE independently selects a relevant gene subset for each class. The final selected gene subset is the union of these relevant gene subsets. The effectiveness and accuracy of MSVDD-RFE are validated by experiments on five publicly available microarray datasets. Our proposed method is faster and more effective than other methods.
A Novel Method for Vertical Acceleration Noise Suppression of a Thrust-Vectored VTOL UAV
Directory of Open Access Journals (Sweden)
Huanyu Li
2016-12-01
Full Text Available Acceleration is of great importance in motion control for unmanned aerial vehicles (UAVs), especially during the takeoff and landing stages. However, the measured acceleration is inevitably polluted by severe noise. Therefore, a proper noise suppression procedure is required. This paper presents a novel method to reduce the noise in the measured vertical acceleration for a thrust-vectored tail-sitter vertical takeoff and landing (VTOL) UAV. In the new procedure, a Kalman filter is first applied to estimate the UAV mass by using the information in the vertical thrust and measured acceleration. The UAV mass is then used to compute an estimate of UAV vertical acceleration. The estimated acceleration is finally fused with the measured acceleration to obtain the minimum variance estimate of vertical acceleration. By doing this, the new approach incorporates the thrust information into the acceleration estimate. The method is applied to the data measured in a VTOL UAV takeoff experiment. Two other denoising approaches developed by former researchers are also tested for comparison. The results demonstrate that the new method is able to suppress the acceleration noise substantially. It also maintains the real-time performance in the final estimated acceleration, which is not seen in the former denoising approaches. The acceleration treated with the new method can be readily used in the motion control applications for UAVs to achieve improved accuracy.
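The mass-estimation step can be sketched as a scalar Kalman filter on k = 1/m, since the measurement a = T·k + noise is then linear in the state. All tuning values and signals below are assumptions for illustration, not the paper's:

```python
import random

def estimate_inverse_mass(thrusts, accels, r=0.25, k0=0.1, p0=1.0):
    """Scalar Kalman filter for k = 1/mass with measurement
    a = T * k + v. Mass is modeled as constant (no process noise);
    r, k0 and p0 are assumed tuning values, not taken from the paper."""
    k, p = k0, p0
    for T, a in zip(thrusts, accels):
        s = T * p * T + r            # innovation variance (H = T)
        gain = p * T / s
        k += gain * (a - T * k)      # measurement update
        p *= (1.0 - gain * T)
    return 1.0 / k

# Synthetic takeoff data: noisy thrust commands and noisy accelerometer
rng = random.Random(1)
true_mass = 2.5
thrusts = [20.0 + rng.uniform(-2, 2) for _ in range(300)]
accels = [T / true_mass + rng.gauss(0, 0.5) for T in thrusts]
print(estimate_inverse_mass(thrusts, accels))   # converges near 2.5
```

With the mass in hand, T/m gives a thrust-based acceleration estimate that can then be fused with the raw accelerometer reading, which is the variance-weighted fusion step the abstract describes.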
A Novel Method for Vertical Acceleration Noise Suppression of a Thrust-Vectored VTOL UAV.
Li, Huanyu; Wu, Linfeng; Li, Yingjie; Li, Chunwen; Li, Hangyu
2016-12-02
Acceleration is of great importance in motion control for unmanned aerial vehicles (UAVs), especially during the takeoff and landing stages. However, the measured acceleration is inevitably polluted by severe noise. Therefore, a proper noise suppression procedure is required. This paper presents a novel method to reduce the noise in the measured vertical acceleration for a thrust-vectored tail-sitter vertical takeoff and landing (VTOL) UAV. In the new procedure, a Kalman filter is first applied to estimate the UAV mass by using the information in the vertical thrust and measured acceleration. The UAV mass is then used to compute an estimate of UAV vertical acceleration. The estimated acceleration is finally fused with the measured acceleration to obtain the minimum variance estimate of vertical acceleration. By doing this, the new approach incorporates the thrust information into the acceleration estimate. The method is applied to the data measured in a VTOL UAV takeoff experiment. Two other denoising approaches developed by former researchers are also tested for comparison. The results demonstrate that the new method is able to suppress the acceleration noise substantially. It also maintains the real-time performance in the final estimated acceleration, which is not seen in the former denoising approaches. The acceleration treated with the new method can be readily used in the motion control applications for UAVs to achieve improved accuracy.
Gilstrap, Donald L.
2013-01-01
In addition to qualitative methods presented in chaos and complexity theories in educational research, this article addresses quantitative methods that may show potential for future research studies. Although much in the social and behavioral sciences literature has focused on computer simulations, this article explores current chaos and…
Paul C. Van Deusen; Linda S. Heath
2010-01-01
Weighted estimation methods for analysis of mapped plot forest inventory data are discussed. The appropriate weighting scheme can vary depending on the type of analysis and graphical display. Both statistical issues and user expectations need to be considered in these methods. A weighting scheme is proposed that balances statistical considerations and the logical...
MAPPING LOCAL CLIMATE ZONES WITH A VECTOR-BASED GIS METHOD
Directory of Open Access Journals (Sweden)
E. Lelovics
2013-03-01
Full Text Available In this study we determined Local Climate Zones (LCZs) in a South Hungarian city using vector-based and raster-based databases. We calculated seven of the originally proposed ten physical (geometric, surface cover and radiative) properties for areas based on the mobile temperature measurement campaigns carried out earlier in this city. As input data we applied a 3D building database (created earlier with photogrammetric methods), a 2D road database, a topographic map, aerial photographs, remotely sensed reflectance information from a RapidEye satellite image, and our local knowledge of the area. The values of the properties were calculated by GIS methods developed for this purpose. For the examined areas we derived, and applied for classification, the sky view factor, mean building height, terrain roughness class, building surface fraction, pervious surface fraction, impervious surface fraction and albedo. Six built and one land cover LCZ classes could be detected with this method in our study area. From each class one circular area was selected as representative of that class, and the thermal reactions of these areas were examined using the mobile temperature measurement dataset. The comparison was made for cases when the weather was clear and calm and the surface was dry. We found that compact built-up types show a larger temperature surplus than open ones, and midrise types more than lowrise ones. According to our preliminary results, these categories provide a useful opportunity for intra- and inter-urban comparisons.
A Fast Classification Method of Faults in Power Electronic Circuits Based on Support Vector Machines
Directory of Open Access Journals (Sweden)
Cui Jiang
2017-12-01
Full Text Available Fault detection and location are important front-end tasks in assuring the reliability of power electronic circuits. In essence, both tasks can be considered as classification problems. This paper presents a fast fault classification method for power electronic circuits that uses the support vector machine (SVM) as a classifier and the wavelet transform as a feature extraction technique. One-against-rest SVM and one-against-one SVM are two general approaches to fault classification in power electronic circuits. However, these methods have a high computational complexity; therefore, in this design we employ a directed acyclic graph (DAG) SVM to implement the fault classification. The DAG SVM is close to the one-against-one SVM in classification performance, but it is much faster. Moreover, in the presented approach, the DAG SVM is improved by introducing the method of K-nearest neighbours to reduce some computations, so that the classification time can be further reduced. A rectifier and an inverter are demonstrated to prove the effectiveness of the presented design.
Directory of Open Access Journals (Sweden)
Hongtao Xue
2014-01-01
Full Text Available This paper proposes an intelligent diagnosis method for a centrifugal pump system using a statistical filter, support vector machine (SVM), possibility theory, and Dempster-Shafer theory (DST) on the basis of the vibration signals, to diagnose frequent faults in the centrifugal pump at an early stage, such as cavitation, impeller unbalance, and shaft misalignment. Firstly, the statistical filter is used to extract the feature signals of pump faults from the measured vibration signals across an optimum frequency region, and nondimensional symptom parameters (NSPs) are defined to represent the feature signals for distinguishing fault types. Secondly, the optimal classification hyperplane for distinguishing two states is obtained by SVM and the NSPs, and its function is defined as the synthetic symptom parameter (SSP) in order to increase the diagnostic sensitivity. Finally, the possibility functions of the SSP are used to construct a sequential fuzzy diagnosis for fault detection and fault-type identification by possibility theory and DST. The proposed method has been applied to detect faults of the centrifugal pump, and its efficiency has been verified using practical examples.
High order vector mode coupling mechanism based on mode matching method
Zhang, Zhishen; Gan, Jiulin; Heng, Xiaobo; Li, Muqiao; Li, Jiong; Xu, Shanhui; Yang, Zhongmin
2017-06-01
The high order vector mode (HOVM) coupling mechanism is investigated based on the mode matching method (MMM). In the case of strong HOVM coupling, where the weakly guiding approximation fails, conventional coupling analysis methods become invalid due to the asynchronous coupling of the horizontal and vertical polarization components of the HOVM. The MMM, which uses the interference of the local eigenmodes instead of assumed modes to simulate light propagation, is adopted as a more efficient method for investigating HOVM coupling processes, especially in strong coupling situations. The rules for the optimal coupling length, coupling efficiency, and mode purity in a microfiber directional coupler are quantitatively analyzed and summarized for the first time. Beyond the specific input modes, some special new modes can be excited at the output through the strong HOVM coupling process. The analysis of the HOVM coupling mechanism based on the MMM can provide precise and accurate design guidance for HOVM directional couplers and mode converters, which are believed to be fundamental devices for multi-mode communication applications.
Hopkins, Dale A.
1998-01-01
A key challenge in designing the new High Speed Civil Transport (HSCT) aircraft is determining a good match between the airframe and engine. Multidisciplinary design optimization can be used to solve the problem by adjusting parameters of both the engine and the airframe. Earlier, an example problem was presented of an HSCT aircraft with four mixed-flow turbofan engines and a baseline mission to carry 305 passengers 5000 nautical miles at a cruise speed of Mach 2.4. The problem was solved by coupling NASA Lewis Research Center's design optimization testbed (COMETBOARDS) with NASA Langley Research Center's Flight Optimization System (FLOPS). The computing time expended in solving the problem was substantial, and the instability of the FLOPS analyzer at certain design points caused difficulties. In an attempt to alleviate both of these limitations, we explored the use of two approximation concepts in the design optimization process. The two concepts, which are based on neural network and linear regression approximation, provide the reanalysis capability and design sensitivity analysis information required for the optimization process. The HSCT aircraft optimization problem was solved by using three alternate approaches; that is, the original FLOPS analyzer and two approximate (derived) analyzers. The approximate analyzers were calibrated and used in three different ranges of the design variables; narrow (interpolated), standard, and wide (extrapolated).
Huang, Lei
2015-09-30
To solve the problem that conventional ARMA modeling methods for gyro random noise require a large number of samples and converge slowly, an ARMA modeling method using robust Kalman filtering is developed. The ARMA model parameters are employed as state arguments. Unknown time-varying estimators of the observation noise are used to obtain the estimated mean and variance of the observation noise. Using robust Kalman filtering, the ARMA model parameters are estimated accurately. The developed ARMA modeling method has the advantages of rapid convergence and high accuracy, so the required sample size is reduced. It can be applied to gyro random noise modeling applications in which a fast and accurate ARMA modeling method is required.
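The recursion described here, treating the model parameters themselves as the state of a Kalman filter and updating them from each observation, can be sketched for the simplest case of a single AR(1) coefficient. All values (true coefficient, noise level) are illustrative, and the paper's robust time-varying noise estimation step is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) "gyro noise" series: y[t] = phi * y[t-1] + e[t]
phi_true, n, sigma = 0.8, 500, 0.1
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + rng.normal(scale=sigma)

# Kalman filter with the AR coefficient phi as a static state.
# Observation model: y[t] = H[t] * phi + e[t], with H[t] = y[t-1].
phi_hat = 0.0        # state estimate
P = 1.0              # state covariance
R = sigma ** 2       # observation-noise variance (assumed known here)
for t in range(1, n):
    H = y[t - 1]
    S = H * P * H + R            # innovation variance
    K = P * H / S                # Kalman gain
    phi_hat += K * (y[t] - H * phi_hat)
    P = (1.0 - K * H) * P
```

With a few hundred samples the filtered estimate settles close to the true coefficient, which is the fast-convergence property the abstract claims for the full method.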
Derevtsov, E. Yu; Louis, A. K.; Maltseva, S. V.; Polyakova, A. P.; Svetov, I. E.
2017-12-01
A problem of reconstruction of 2D vector or symmetric 2-tensor fields from their known ray transforms is considered. Two numerical approaches based on the method of approximate inverse are suggested for solving the problem. The first method recovers the components of a vector or tensor field, and the second reconstructs its potentials in the sense of feature reconstruction, where the observation operator assigns to a field its potential. Numerical simulations show good reconstruction of the sought-for fields, or of their solenoidal or potential parts, from their ray transforms.
Sentürk, Damla; Dalrymple, Lorien S; Mu, Yi; Nguyen, Danh V
2014-11-10
We propose a new weighted hurdle regression method for modeling count data, with particular interest in modeling cardiovascular events in patients on dialysis. Cardiovascular disease remains one of the leading causes of hospitalization and death in this population. Our aim is to jointly model the relationship/association between covariates and (i) the probability of cardiovascular events, a binary process, and (ii) the rate of events once the realization is positive-when the 'hurdle' is crossed-using a zero-truncated Poisson distribution. When the observation period or follow-up time, from the start of dialysis, varies among individuals, the estimated probability of positive cardiovascular events during the study period will be biased. Furthermore, when the model contains covariates, then the estimated relationship between the covariates and the probability of cardiovascular events will also be biased. These challenges are addressed with the proposed weighted hurdle regression method. Estimation for the weighted hurdle regression model is a weighted likelihood approach, where standard maximum likelihood estimation can be utilized. The method is illustrated with data from the United States Renal Data System. Simulation studies show the ability of the proposed method to successfully adjust for differential follow-up times and incorporate the effects of covariates in the weighting. Copyright © 2014 John Wiley & Sons, Ltd.
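The "hurdle-crossed" part of such a model is a zero-truncated Poisson likelihood with per-subject weights. A minimal sketch of the weighted log-likelihood and a crude grid-search MLE follows; the counts and weights are invented for illustration, and the paper's covariate model and specific weighting scheme are not reproduced.

```python
import math

def zt_poisson_logpmf(y, lam):
    """log P(Y = y | Y > 0) for a zero-truncated Poisson, y >= 1."""
    return (y * math.log(lam) - lam - math.lgamma(y + 1)
            - math.log(1.0 - math.exp(-lam)))

def weighted_loglik(counts, weights, lam):
    """Weighted log-likelihood over the positive (hurdle-crossed) counts."""
    return sum(w * zt_poisson_logpmf(y, lam)
               for y, w in zip(counts, weights))

counts = [1, 2, 1, 3, 2]              # events among subjects past the hurdle
weights = [1.0, 0.5, 1.0, 0.8, 1.0]   # e.g. follow-up-time weights (invented)

# Crude grid search for the weighted MLE of the event rate
grid = [0.1 * i for i in range(1, 100)]
lam_hat = max(grid, key=lambda lam: weighted_loglik(counts, weights, lam))
```

In the actual method the rate would be a regression function of covariates; the sketch only shows how the weights enter the likelihood.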
Radial basis function regression methods for predicting quantitative traits using SNP markers.
Long, Nanye; Gianola, Daniel; Rosa, Guilherme J M; Weigel, Kent A; Kranis, Andreas; González-Recio, Oscar
2010-06-01
A challenge when predicting total genetic values for complex quantitative traits is that an unknown number of quantitative trait loci may affect phenotypes via cryptic interactions. If markers are available, assuming that their effects on phenotypes are additive may lead to poor predictive ability. Non-parametric radial basis function (RBF) regression, which does not assume a particular form of the genotype-phenotype relationship, was investigated here by simulation and analysis of body weight and food conversion rate data in broilers. The simulation included a toy example in which an arbitrary non-linear genotype-phenotype relationship was assumed, and five different scenarios representing different broad sense heritability levels (0.1, 0.25, 0.5, 0.75 and 0.9) were created. In addition, a whole genome simulation was carried out, in which three different gene action modes (pure additive, additive+dominance and pure epistasis) were considered. In all analyses, a training set was used to fit the model and a testing set was used to evaluate predictive performance. The latter was measured by correlation and predictive mean-squared error (PMSE) on the testing data. For comparison, a linear additive model known as Bayes A was used as benchmark. Two RBF models with single nucleotide polymorphism (SNP)-specific (RBF I) and common (RBF II) weights were examined. Results indicated that, in the presence of complex genotype-phenotype relationships (i.e. non-linearity and non-additivity), RBF outperformed Bayes A in predicting total genetic values using SNP markers. Extension of Bayes A to include all additive, dominance and epistatic effects could improve its prediction accuracy. RBF I was generally better than RBF II, and was able to identify relevant SNPs in the toy example.
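The core of RBF regression can be illustrated with kernel ridge regression using a Gaussian kernel with a single common bandwidth (analogous to the RBF II variant with a common weight). The toy genotype matrix, the epistatic phenotype map, and the bandwidth and ridge penalty below are arbitrary illustrative choices, not the study's settings.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 100 individuals x 20 SNPs coded 0/1/2, with a non-additive
# (epistatic) genotype-phenotype map that an additive model would miss.
X = rng.integers(0, 3, size=(100, 20)).astype(float)
y = np.sin(X[:, 0] * X[:, 1]) + 0.1 * rng.normal(size=100)

def rbf_kernel(A, B, bandwidth):
    """Gaussian RBF kernel with one common bandwidth for all SNPs."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

# Kernel ridge regression: alpha = (K + lam*I)^-1 y on a training set,
# then predict the held-out testing set.
Xtr, Xte, ytr, yte = X[:80], X[80:], y[:80], y[80:]
K = rbf_kernel(Xtr, Xtr, bandwidth=3.0)
alpha = np.linalg.solve(K + 0.1 * np.eye(len(Xtr)), ytr)
y_pred = rbf_kernel(Xte, Xtr, bandwidth=3.0) @ alpha

pmse = float(np.mean((y_pred - yte) ** 2))   # predictive mean-squared error
```

Because no functional form is imposed on the genotype-phenotype map, the kernel can track non-linear and non-additive signal, which is the advantage the study reports over the linear benchmark.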
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Support vector machine-based facial-expression recognition method combining shape and appearance
Han, Eun Jung; Kang, Byung Jun; Park, Kang Ryoung; Lee, Sangyoun
2010-11-01
Facial expression recognition can be widely used for various applications, such as emotion-based human-machine interaction, intelligent robot interfaces, face recognition robust to expression variation, etc. Previous studies have been classified as either shape- or appearance-based recognition. The shape-based method has the disadvantage that the individual variance of facial feature points exists irrespective of similar expressions, which can cause a reduction of the recognition accuracy. The appearance-based method has a limitation in that the textural information of the face is very sensitive to variations in illumination. To overcome these problems, a new facial-expression recognition method is proposed, which combines both shape and appearance information, based on the support vector machine (SVM). This research is novel in the following three ways as compared to previous works. First, the facial feature points are automatically detected by using an active appearance model. From these, the shape-based recognition is performed by using the ratios between the facial feature points based on the facial-action coding system. Second, the SVM, which is trained to recognize the same and different expression classes, is proposed to combine two matching scores obtained from the shape- and appearance-based recognitions. Finally, a single SVM is trained to discriminate four different expressions, such as neutral, a smile, anger, and a scream. By determining the expression of the input facial image whose SVM output is at a minimum, the accuracy of the expression recognition is much enhanced. The experimental results showed that the recognition accuracy of the proposed method was better than previous researches and other fusion methods.
DEFF Research Database (Denmark)
Shirali, Mahmoud; Nielsen, Vivi Hunnicke; Møller, Steen Henrik
2014-01-01
The aim of this study was to determine the genetic background of longitudinal residual feed intake (RFI) and body weight (BW) growth in farmed mink using random regression methods considering heterogeneous residual variances. Eight BW measures for each mink were recorded every three weeks from 63 to 210 days of age for 2139 male mink and the same number of females. Cumulative feed intake was calculated six times at three-week intervals based on daily feed consumption between weighings from 105 to 210 days of age. Heritability estimates for RFI increased by age from 0.18 (0.03, standard deviation... ...be obtained by only considering the RFI estimate and BW at pelting; however, genetic correlations lower than unity indicate that extra genetic gain can be obtained by including estimates of these traits during the growing period. This study suggests random regression methods are suitable for analysing feed efficiency...
Directory of Open Access Journals (Sweden)
Weston Anderson
Full Text Available Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies.
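A regression tree partitions covariate space by splits that reduce squared error; a depth-1 tree (a "stump") shows the principle the ensemble methods above build on. The covariate and density values are invented for illustration.

```python
def fit_stump(x, y):
    """Depth-1 regression tree: choose the split threshold minimizing SSE."""
    best = None
    for thr in sorted(set(x)):
        left = [yi for xi, yi in zip(x, y) if xi <= thr]
        right = [yi for xi, yi in zip(x, y) if xi > thr]
        if not left or not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((v - ml) ** 2 for v in left)
               + sum((v - mr) ** 2 for v in right))
        if best is None or sse < best[0]:
            best = (sse, thr, ml, mr)
    return best[1:]   # (threshold, left-leaf mean, right-leaf mean)

# Invented covariate (e.g. a settlement indicator) vs. population density
x = [1, 2, 3, 10, 11, 12]
y = [5, 6, 5, 50, 52, 49]
thr, ml, mr = fit_stump(x, y)
```

Random Forest and Bayesian Additive Regression Trees aggregate many such trees (deeper and randomized), which is what lets them capture non-linear covariate effects without a parametric model.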
Boundary integral equation method calculations of surface regression effects in flame spreading
Altenkirch, R. A.; Rezayat, M.; Eichhorn, R.; Rizzo, F. J.
1982-01-01
A solid-phase conduction problem that is a modified version of one that has been treated previously in the literature and is applicable to flame spreading over a pyrolyzing fuel is solved using a boundary integral equation (BIE) method. Results are compared to surface temperature measurements that can be found in the literature. In addition, the heat conducted through the solid forward of the flame, the heat transfer responsible for sustaining the flame, is also computed in terms of the Peclet number based on a heated layer depth using the BIE method and approximate methods based on asymptotic expansions. Agreement between computed and experimental results is quite good as is agreement between the BIE and the approximate results.
Liou, Jyun-you; Smith, Elliot H.; Bateman, Lisa M.; McKhann, Guy M., II; Goodman, Robert R.; Greger, Bradley; Davis, Tyler S.; Kellis, Spencer S.; House, Paul A.; Schevon, Catherine A.
2017-08-01
Objective. Epileptiform discharges, an electrophysiological hallmark of seizures, can propagate across cortical tissue in a manner similar to traveling waves. Recent work has focused attention on the origination and propagation patterns of these discharges, yielding important clues to their source location and mechanism of travel. However, systematic studies of methods for measuring propagation are lacking. Approach. We analyzed epileptiform discharges in microelectrode array recordings of human seizures. The array records multiunit activity and local field potentials at 400 micron spatial resolution, from a small cortical site free of obstructions. We evaluated several computationally efficient statistical methods for calculating traveling wave velocity, benchmarking them to analyses of associated neuronal burst firing. Main results. Over 90% of discharges met statistical criteria for propagation across the sampled cortical territory. Detection rate, direction and speed estimates derived from a multiunit estimator were compared to four field potential-based estimators: negative peak, maximum descent, high gamma power, and cross-correlation. Interestingly, the methods that were computationally simplest and most efficient (negative peak and maximal descent) offer non-inferior results in predicting neuronal traveling wave velocities compared to the other two, more complex methods. Moreover, the negative peak and maximal descent methods proved to be more robust against reduced spatial sampling challenges. Using least absolute deviation in place of least squares error minimized the impact of outliers, and reduced the discrepancies between local field potential-based and multiunit estimators. Significance. Our findings suggest that ictal epileptiform discharges typically take the form of exceptionally strong, rapidly traveling waves, with propagation detectable across millimeter distances. The sequential activation of neurons in space can be inferred from clinically
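The negative-peak estimator benchmarked above can be sketched on synthetic data: take each channel's discharge time as the time of its local field potential minimum, then regress arrival time on electrode position so that the slope gives the inverse speed. The array geometry, pulse shape, and wave speed below are illustrative, not the study's values.

```python
import numpy as np

# Synthetic traveling discharge: each channel's LFP is a negative-going
# pulse that arrives later at electrodes farther along the array.
positions = np.arange(0.0, 4.0, 0.4)     # electrode positions in mm (invented)
speed_true = 0.25                        # wave speed in mm/ms (invented)
t = np.arange(0.0, 100.0, 1.0)           # time axis in ms, 1 kHz sampling
lfp = np.array([-np.exp(-0.5 * ((t - (20.0 + p / speed_true)) / 3.0) ** 2)
                for p in positions])

# Negative-peak method: discharge time per channel = time of the LFP minimum
peak_times = t[lfp.argmin(axis=1)]

# Regress arrival time on position; the slope is the inverse speed
slope = np.polyfit(positions, peak_times, 1)[0]
speed_est = 1.0 / slope                  # mm/ms
```

The study's robustness point maps onto this sketch directly: a least-absolute-deviation fit in place of `polyfit`'s least squares would damp the influence of channels with outlying peak times.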
DEFF Research Database (Denmark)
Minarik, David; Senneby, Martin; Wollmer, Per
2015-01-01
Background: The interpretation of myocardial perfusion scintigraphy (MPS) largely relies on visual assessment by the physician of the localization and extent of a perfusion defect. The aim of this study was to introduce the concept of the perfusion vector as a new objective quantitative method for further assisting the visual interpretation, and to test the concept using simulated MPS images as well as patients. Methods: The perfusion vector is based on calculating the difference between the anatomical centroid and the perfusion center of gravity of the left ventricle. Simulated MPS images were... ...0.001) but not for patients with infarction. The correlation between the defect size and stress vector magnitude was also found to be significant (p... ...assisting the visual interpretation in MPS studies. Further...
Directory of Open Access Journals (Sweden)
Tamer Khatib
2014-01-01
Full Text Available In this research an improved approach for sizing a standalone PV system (SAPV) is presented. This work improves a method developed previously by the authors. The previous work is based on an analytical method that faced some concerns regarding the difficulty of finding the model's coefficients. Therefore, the proposed approach in this research is based on a combination of an analytical method and a machine learning approach, a general regression neural network (GRNN). The GRNN helps predict the optimal size of a PV system using the geographical coordinates of the targeted site instead of mathematical formulas. Employing the GRNN facilitates the use of the previously developed method and avoids some of its drawbacks. The approach has been tested using data from five Malaysian sites. According to the results, the proposed method can be efficiently used for SAPV sizing, and the proposed GRNN-based model predicts the sizing curves of the PV system accurately with a prediction error of 0.6%. Moreover, hourly meteorological and load demand data are used in this research in order to account for the uncertainty of the solar energy and the load demand.
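A GRNN prediction is essentially a Gaussian-kernel weighted average of the training targets. A minimal sketch follows; the coordinate/target pairs and the bandwidth are invented placeholders, not the paper's Malaysian-site data or sizing model.

```python
import math

def grnn_predict(x, train_x, train_y, sigma=1.0):
    """GRNN prediction: Gaussian-kernel weighted average of training targets."""
    weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi))
                        / (2.0 * sigma ** 2))
               for xi in train_x]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)

# Invented (latitude, longitude) -> sizing-curve coefficient samples
train_x = [(3.1, 101.7), (5.4, 100.3), (1.5, 103.8), (4.6, 101.1)]
train_y = [1.20, 1.35, 1.10, 1.28]

# Query an unseen site near the first training site
pred = grnn_predict((3.0, 101.5), train_x, train_y, sigma=1.0)
```

Because the output is a convex combination of the training targets, the prediction always stays within the observed range and is dominated by the nearest sites, which is why geographical coordinates alone can drive the sizing estimate.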
Comparison of Sparse and Jack-knife partial least squares regression methods for variable selection
DEFF Research Database (Denmark)
Karaman, Ibrahim; Qannari, El Mostafa; Martens, Harald
2013-01-01
The objective of this study was to compare two different techniques of variable selection, Sparse PLSR and Jack-knife PLSR, with respect to their predictive ability and their ability to identify relevant variables. Sparse PLSR is a method that is frequently used in genomics, whereas Jack-knife PL...
Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei
2013-01-01
Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
Directory of Open Access Journals (Sweden)
Man Zhu
2017-03-01
Full Text Available Determination of ship maneuvering models is a tough task in ship maneuverability prediction. Among the prime approaches to estimating ship maneuvering models, system identification combined with full-scale or free-running model tests is preferred. In this contribution, real-time system identification programs using recursive identification methods, such as the recursive least squares method (RLS), are employed for on-line identification of ship maneuvering models. However, this method depends strongly on the objects of study and on the initial values of the identified parameters. To overcome this, an intelligent technique, support vector machines (SVM), is first used to estimate the initial values of the identified parameters from finite samples. As real measured motion data of the Mariner class ship always involve noise from sensors and external disturbances, the zigzag simulation test data include a substantial quantity of Gaussian white noise. The wavelet method and empirical mode decomposition (EMD) are used, respectively, to filter the data corrupted by noise. The choice of the sample number for the SVM used to decide initial values of the identified parameters is extensively discussed and analyzed. With de-noised motion data as input-output training samples, the parameters of the ship maneuvering models are estimated using RLS and SVM-RLS, respectively. The comparison between the identification results and the true values of the parameters demonstrates that the ship maneuvering models identified by both RLS and SVM-RLS agree reasonably with the simulated motions of the ship, and that increasing the sample size for the SVM positively affects the identification results. Furthermore, SVM-RLS using data de-noised by EMD shows the highest accuracy and the best convergence.
Directory of Open Access Journals (Sweden)
Foad Rahimidehgolan
2017-11-01
Full Text Available Damage models, particularly the Gurson–Tvergaard–Needleman (GTN) model, are widely used in numerical simulation of material deformations. Each damage model has some constants which must be identified for each material. Direct identification methods are costly and time consuming. In the current work, a combination of experiments, numerical simulation, and optimization was used to determine the constants. Quasi-static and dynamic tests were carried out on notched specimens. The experimental profiles of the specimens were used to determine the constants. The constants of the GTN damage model were identified through the proposed method using the results of the quasi-static tests. Numerical simulation of the dynamic test was then performed utilizing the constants obtained from the quasi-static experiments. The results showed high precision in predicting the specimen's profile in the dynamic test. A sensitivity analysis was performed on the constants of the GTN model to validate the proposed method. Finally, the experiments were simulated using the Johnson–Cook (J–C) damage model and the results were compared to those obtained from the GTN damage model.
Stojić, Andreja; Maletić, Dimitrije; Stanišić Stojić, Svetlana; Mijić, Zoran; Šoštarić, Andrej
2015-07-15
In this study, advanced multivariate methods were applied for VOC source apportionment and the subsequent short-term forecast of industrial- and vehicle-exhaust-related contributions in the Belgrade urban area (Serbia). The VOC concentrations were measured using PTR-MS, together with inorganic gaseous pollutants (NOx, NO, NO2, SO2, and CO), PM10, and meteorological parameters. US EPA Positive Matrix Factorization and Unmix receptor models were applied to the obtained dataset, each resolving six source profiles. For the purpose of forecasting industrial- and vehicle-exhaust-related source contributions, different multivariate methods were employed in two separate cases, relying on meteorological data alone and on meteorological data together with concentrations of inorganic gaseous pollutants, respectively. The results indicate that Boosted Decision Trees and Multi-Layer Perceptrons were the best performing methods. Forecasting accuracy was high (lowest relative error of only 6%), in particular when the forecast was based on both meteorological parameters and concentrations of inorganic gaseous pollutants. Copyright © 2015. Published by Elsevier B.V.
Outlier Detection Method in Linear Regression Based on Sum of Arithmetic Progression
Directory of Open Access Journals (Sweden)
K. K. L. B. Adikaram
2014-01-01
Full Text Available We introduce a new nonparametric outlier detection method for linear series which requires no imputation of missing or removed data. For an arithmetic progression (a series without outliers) with n elements, the ratio (R) of the sum of the minimum and maximum elements to the sum of all elements is always 2/n ∈ (0, 1]. R ≠ 2/n always implies the existence of outliers; usually, R < 2/n implies that the minimum is an outlier and R > 2/n implies that the maximum is an outlier. Based upon this, we derived a new method for identifying significant and nonsignificant outliers separately. Two different techniques were used to manage missing data and removed outliers: (1) recalculate the terms after (or before) the removed or missing element while maintaining the initial angle in relation to a certain point, or (2) transform the data into a constant value that is not affected by missing or removed elements. With a reference element that was not an outlier, the method detected all outliers in data sets with 6 to 1000 elements containing 50% outliers which deviated by a factor of ±1.0e-2 to ±1.0e+2 from the correct value.
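The 2/n property is easy to verify numerically: for an arithmetic progression the ratio R = (min + max)/sum equals 2/n exactly, and inflating the maximum pushes R above 2/n. The series values below are arbitrary.

```python
def progression_ratio(series):
    """R = (min + max) / sum; equals 2/n for an arithmetic progression."""
    return (min(series) + max(series)) / sum(series)

n = 10
clean = [5 + 3 * i for i in range(n)]   # arithmetic progression, no outliers
r_clean = progression_ratio(clean)      # equals 2/n

spiked = clean[:]
spiked[-1] = 1000                       # inflate the maximum
r_spiked = progression_ratio(spiked)    # now R > 2/n: maximum is an outlier
```

For the clean series, min + max = 2a + (n-1)d and the sum is n(2a + (n-1)d)/2, so the ratio cancels to 2/n regardless of a and d; any deviation of R from 2/n therefore signals an outlier.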
Predicting metabolic syndrome using decision tree and support vector machine methods
Karimi-Alavijeh, Farzaneh; Jalili, Saeed; Sadeghi, Masoumeh
2016-01-01
BACKGROUND Metabolic syndrome, which underlies the increased prevalence of cardiovascular disease and Type 2 diabetes, is considered a group of metabolic abnormalities including central obesity, hypertriglyceridemia, glucose intolerance, hypertension, and dyslipidemia. Recently, artificial-intelligence-based health-care systems have been highly regarded because of their success in diagnosis, prediction, and choice of treatment. This study employs machine learning techniques to predict metabolic syndrome. METHODS This study aims to employ decision tree and support vector machine (SVM) methods to predict the 7-year incidence of metabolic syndrome. This is an applied study in which data from 2107 participants of the Isfahan Cohort Study were utilized. Subjects without metabolic syndrome according to the ATPIII criteria were selected. The features used in this data set include: gender, age, weight, body mass index, waist circumference, waist-to-hip ratio, hip circumference, physical activity, smoking, hypertension, antihypertensive medication use, systolic blood pressure (BP), diastolic BP, fasting blood sugar, 2-hour blood glucose, triglycerides (TGs), total cholesterol, low-density lipoprotein, high-density lipoprotein cholesterol, mean corpuscular volume, and mean corpuscular hemoglobin. Metabolic syndrome was diagnosed based on the ATPIII criteria, and the two methods of decision tree and SVM were selected to predict it. The criteria of sensitivity, specificity, and accuracy were used for validation. RESULTS The SVM and decision tree methods were examined according to the criteria of sensitivity, specificity, and accuracy. Sensitivity, specificity, and accuracy were 0.774 (0.758), 0.74 (0.72), and 0.757 (0.739) for the SVM (decision tree) method. CONCLUSION The results show that the SVM method is more efficient than the decision tree in terms of sensitivity, specificity, and accuracy. The results of the decision tree method show that the TG is the most important feature in
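The validation criteria used in the study, sensitivity, specificity and accuracy, follow directly from the confusion matrix. A minimal sketch with invented labels (not the cohort data):

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)        # true-positive rate
    specificity = tn / (tn + fp)        # true-negative rate
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy

# Invented labels: 1 = developed metabolic syndrome, 0 = did not
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
sens, spec, acc = confusion_metrics(y_true, y_pred)
```

These are exactly the three numbers the study reports for SVM (0.774, 0.74, 0.757) and the decision tree (0.758, 0.72, 0.739).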
Directory of Open Access Journals (Sweden)
Hukharnsusatrue, A.
2005-11-01
Full Text Available The objective of this research is to compare methods of estimating multiple regression coefficients when multicollinearity exists among the independent variables. The estimation methods are the Ordinary Least Squares method (OLS), the Restricted Least Squares method (RLS), the Restricted Ridge Regression method (RRR), and the Restricted Liu method (RL), considered both when the restrictions are true and when they are not true. The study used the Monte Carlo simulation method, with the experiment repeated 1,000 times under each situation. The results are as follows. CASE 1: The restrictions are true. In all cases, the RRR and RL methods have a smaller Average Mean Square Error (AMSE) than the OLS and RLS methods, respectively. The RRR method provides the smallest AMSE when the level of correlation is high, and also for all levels of correlation and all sample sizes when the standard deviation equals 5. However, the RL method provides the smallest AMSE when the level of correlation is low or middle, except that for a standard deviation of 3 and small sample sizes the RRR method provides the smallest AMSE. The AMSE increases with (in decreasing order of influence) the level of correlation, the standard deviation, and the number of independent variables, but decreases with sample size. CASE 2: The restrictions are not true. In all cases, the RRR method provides the smallest AMSE, except that for a standard deviation of 1 and a restriction error of 5%, the OLS method provides the smallest AMSE when the level of correlation is low or medium and the sample size is large, while for small sample sizes the RL method provides the smallest AMSE. In addition, when the error of the restrictions is increased, the OLS method provides the smallest AMSE for all levels of correlation and all sample sizes, except when the level of correlation is high and the sample size is small. Moreover, in the cases where the OLS method provides the smallest AMSE, the RLS method mostly has a smaller AMSE than
Directory of Open Access Journals (Sweden)
J. Alm
2007-11-01
Full Text Available Closed (non-steady state) chambers are widely used for quantifying carbon dioxide (CO2) fluxes between soils or low-stature canopies and the atmosphere. It is well recognised that covering a soil or vegetation with a closed chamber inherently disturbs the natural CO2 fluxes by altering the concentration gradients between the soil, the vegetation and the overlying air. Thus, the driving factors of CO2 fluxes are not constant during a closed chamber experiment, and no linear increase or decrease of CO2 concentration over time within the chamber headspace can be expected. Nevertheless, linear regression has been applied for calculating CO2 fluxes in many recent, partly influential, studies. This approach has been justified by keeping the closure time short and assuming the concentration change over time to be in the linear range. Here, we test whether the application of linear regression is really appropriate for estimating CO2 fluxes using closed chambers over short closure times, and whether the application of nonlinear regression is necessary. We developed a nonlinear exponential regression model from diffusion and photosynthesis theory. This exponential model was tested with four different datasets of CO2 flux measurements (total number: 1764) conducted at three peatland sites in Finland and a tundra site in Siberia. Thorough analyses of residuals demonstrated that linear regression was frequently not appropriate for the determination of CO2 fluxes by closed-chamber methods, even if closure times were kept short. The developed exponential model was well suited for nonlinear regression of the concentration-over-time evolution c(t) in the chamber headspace and for estimation of the initial CO2 fluxes at closure time for the majority of experiments. However, a rather large percentage of the exponential regression functions showed curvatures not consistent with the theoretical model, which is considered to be caused by violations of the underlying model assumptions
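The contrast between linear and exponential flux estimation can be illustrated on noise-free synthetic data generated from an exponential headspace model of the form c(t) = c_eq - (c_eq - c0)·exp(-k·t). All parameter values are illustrative, and the fit below assumes c_eq is known so the model can be log-linearized; the paper fits the full nonlinear model.

```python
import numpy as np

# Exponential headspace model: c(t) = c_eq - (c_eq - c0) * exp(-k * t)
c0, c_eq, k = 400.0, 600.0, 0.01         # illustrative values (ppm, ppm, 1/s)
t = np.arange(0.0, 181.0, 10.0)          # 3-minute closure, 10 s sampling
c = c_eq - (c_eq - c0) * np.exp(-k * t)

true_slope0 = k * (c_eq - c0)            # dc/dt at closure time t = 0

# Linear regression over the whole closure underestimates the initial slope
slope_lin = np.polyfit(t, c, 1)[0]

# Exponential fit (log-linearized with c_eq known) recovers it exactly
k_hat = -np.polyfit(t, np.log(c_eq - c), 1)[0]
slope_exp = k_hat * (c_eq - c0)
```

Even with a short closure, the linear slope averages over a curve that is already flattening, which is the systematic underestimation of the initial flux the residual analyses in the paper detect.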
A computer program for uncertainty analysis integrating regression and Bayesian methods
Lu, Dan; Ye, Ming; Hill, Mary C.; Poeter, Eileen P.; Curtis, Gary
2014-01-01
This work develops a new functionality in UCODE_2014 to evaluate Bayesian credible intervals using the Markov Chain Monte Carlo (MCMC) method. The MCMC capability in UCODE_2014 is based on the FORTRAN version of the differential evolution adaptive Metropolis (DREAM) algorithm of Vrugt et al. (2009), which estimates the posterior probability density function of model parameters in high-dimensional and multimodal sampling problems. The UCODE MCMC capability provides eleven prior probability distributions and three ways to initialize the sampling process. It evaluates parametric and predictive uncertainties and it has parallel computing capability based on multiple chains to accelerate the sampling process. This paper tests and demonstrates the MCMC capability using a 10-dimensional multimodal mathematical function, a 100-dimensional Gaussian function, and a groundwater reactive transport model. The use of the MCMC capability is made straightforward and flexible by adopting the JUPITER API protocol. With the new MCMC capability, UCODE_2014 can be used to calculate three types of uncertainty intervals, which all can account for prior information: (1) linear confidence intervals which require linearity and Gaussian error assumptions and typically 10s–100s of highly parallelizable model runs after optimization, (2) nonlinear confidence intervals which require a smooth objective function surface and Gaussian observation error assumptions and typically 100s–1,000s of partially parallelizable model runs after optimization, and (3) MCMC Bayesian credible intervals which require few assumptions and commonly 10,000s–100,000s or more partially parallelizable model runs. Ready access allows users to select methods best suited to their work, and to compare methods in many circumstances.
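The Bayesian credible intervals described above can be illustrated with a deliberately simplified sampler. The following is a sketch only: it uses plain random-walk Metropolis rather than the DREAM algorithm in UCODE_2014, and the percentile-based interval is one common (not the only) way to summarize the chain.

```python
import numpy as np

def metropolis(logpost, x0, n_steps, step, rng):
    """Random-walk Metropolis sampler: a minimal stand-in for the
    differential evolution adaptive Metropolis (DREAM) sampler."""
    chain = np.empty(n_steps)
    x, lp = x0, logpost(x0)
    for i in range(n_steps):
        prop = x + step * rng.standard_normal()
        lp_prop = logpost(prop)
        if np.log(rng.random()) < lp_prop - lp:  # accept uphill or by chance
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

def credible_interval(chain, level=0.95):
    """Equal-tailed credible interval from posterior samples."""
    lo = (1.0 - level) / 2.0
    return np.quantile(chain, [lo, 1.0 - lo])
```

Unlike the linear confidence intervals, nothing here assumes Gaussian errors or model linearity; the cost is the large number of model evaluations the abstract quantifies.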
Consistency analysis of subspace identification methods based on a linear regression approach
DEFF Research Database (Denmark)
Knudsen, Torben
2001-01-01
In the literature, results can be found which claim consistency for the subspace method under certain quite weak assumptions. Unfortunately, a new result gives a counter example showing inconsistency under these assumptions, and then gives new, more strict sufficient assumptions which, however, do not include important model structures such as e.g. Box-Jenkins. Based on a simple least squares approach, this paper shows the possible inconsistency under the weak assumptions and develops only slightly stricter assumptions which are sufficient for consistency and which include any model structure.
Bitter, Christopher; Mulligan, Gordon F.; Dall'Erba, Sandy
2007-04-01
Hedonic house price models typically impose a constant price structure on housing characteristics throughout an entire market area. However, there is increasing evidence that the marginal prices of many important attributes vary over space, especially within large markets. In this paper, we compare two approaches to examine spatial heterogeneity in housing attribute prices within the Tucson, Arizona housing market: the spatial expansion method and geographically weighted regression (GWR). Our results provide strong evidence that the marginal price of key housing characteristics varies over space. GWR outperforms the spatial expansion method in terms of explanatory power and predictive accuracy.
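The core computation of geographically weighted regression can be sketched in a few lines: at each target location, an ordinary least-squares fit is replaced by a weighted fit in which nearby observations count more. This is an illustrative sketch, not the authors' implementation; the Gaussian distance kernel is one common GWR weighting choice.

```python
import numpy as np

def local_fit(coords, X, y, target, bandwidth):
    """GWR building block: weighted least squares at one target location,
    with Gaussian kernel weights on distance to that location."""
    d = np.linalg.norm(coords - target, axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    Xd = np.column_stack([np.ones(len(y)), X])  # intercept + attributes
    XtW = Xd.T * w                              # weight each observation
    return np.linalg.solve(XtW @ Xd, XtW @ y)   # (X'WX)^-1 X'Wy
```

Repeating `local_fit` over a grid of target locations yields a surface of marginal attribute prices, which is exactly the kind of spatial heterogeneity the Tucson study detects.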
A regression-based method for estimating risks and relative risks in case-base studies.
Chui, Tina Tsz-Ting; Lee, Wen-Chung
2013-01-01
Both the absolute risk and the relative risk (RR) have a crucial role to play in epidemiology. RR is often approximated by the odds ratio (OR) under the rare-disease assumption in a conventional case-control study; however, such a study design does not provide an estimate of absolute risk. The case-base study is an alternative approach which readily produces RR estimates without resorting to the rare-disease assumption. However, previous researchers only considered a single dichotomous exposure and did not elaborate how absolute risks can be estimated in a case-base study. In this paper, the authors propose a logistic model for the case-base study. The model is flexible enough to admit multiple exposures on any measurement scale: binary, categorical, or continuous. It can be easily fitted using common statistical packages. With one additional step of simple calculations on the model parameters, one readily obtains relative and absolute risk estimates as well as their confidence intervals. Monte Carlo simulations show that the proposed method can produce unbiased estimates and adequate-coverage confidence intervals for ORs, RRs, and absolute risks. The case-base study, with all its desirable properties and its methods of analysis fully developed in this paper, may become a mainstay in epidemiology.
Predicting metabolic syndrome using decision tree and support vector machine methods.
Karimi-Alavijeh, Farzaneh; Jalili, Saeed; Sadeghi, Masoumeh
2016-05-01
Metabolic syndrome, which underlies the increased prevalence of cardiovascular disease and Type 2 diabetes, is considered a group of metabolic abnormalities including central obesity, hypertriglyceridemia, glucose intolerance, hypertension, and dyslipidemia. Recently, artificial intelligence based health-care systems have been highly regarded because of their success in diagnosis, prediction, and choice of treatment. This study employs machine learning techniques to predict metabolic syndrome; specifically, it aims to employ decision tree and support vector machine (SVM) methods to predict the 7-year incidence of metabolic syndrome. This is an applied study in which data from 2107 participants of the Isfahan Cohort Study have been utilized. The subjects without metabolic syndrome according to the ATPIII criteria were selected. The features used in this data set include: gender, age, weight, body mass index, waist circumference, waist-to-hip ratio, hip circumference, physical activity, smoking, hypertension, antihypertensive medication use, systolic blood pressure (BP), diastolic BP, fasting blood sugar, 2-hour blood glucose, triglycerides (TGs), total cholesterol, low-density lipoprotein, high-density lipoprotein cholesterol, mean corpuscular volume, and mean corpuscular hemoglobin. Metabolic syndrome was diagnosed based on ATPIII criteria, and the decision tree and SVM methods were selected to predict it. The criteria of sensitivity, specificity, and accuracy were used for validation. Sensitivity, specificity, and accuracy were 0.774 (0.758), 0.74 (0.72), and 0.757 (0.739) for the SVM (decision tree) method, respectively. The results show that the SVM method is more efficient than the decision tree in terms of sensitivity, specificity, and accuracy. The results of the decision tree method show that TG is the most important feature in predicting metabolic syndrome. According
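The three validation criteria used to compare the two classifiers are straightforward to compute from a confusion table. A minimal sketch (the function name and interface are illustrative, not from the study):

```python
import numpy as np

def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy -- the validation criteria
    used to compare the SVM and decision-tree predictions."""
    y_true = np.asarray(y_true, bool)
    y_pred = np.asarray(y_pred, bool)
    tp = np.sum(y_true & y_pred)    # true positives
    tn = np.sum(~y_true & ~y_pred)  # true negatives
    fp = np.sum(~y_true & y_pred)   # false positives
    fn = np.sum(y_true & ~y_pred)   # false negatives
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy
```

Reporting all three (rather than accuracy alone) matters here because incident metabolic syndrome is a minority outcome, and accuracy alone can hide poor sensitivity.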
van der Wijk, V.; Herder, Justus Laurens; Koetsier, T.; Ceccarelli, M.
2012-01-01
This article gives an overview of the distinctive work of Otto Fischer (1861-1916) on the motion of the human musculoskeletal system. In order to be able to derive the individual muscle forces for human in motion, he invented the method of principal vectors to describe the motion of the
Support vector machine as an alternative method for lithology classification of crystalline rocks
Deng, Chengxiang; Pan, Heping; Fang, Sinan; Amara Konaté, Ahmed; Qin, Ruidong
2017-03-01
With the expansion of machine learning algorithms, automatic lithology classification using well logging data is becoming significant in formation evaluation and reservoir characterization. The complicated composition and structural variations of metamorphic rocks result in more nonlinear features in well logging data and raise the requirements on classification algorithms. Here, we report the application of the support vector machine (SVM) to classifying crystalline rocks from Chinese Continental Scientific Drilling Main Hole (CCSD-MH) data. We found that the SVM performs poorly on the lithology classification of crystalline rocks when training samples are imbalanced. In practice, training samples are generally limited and imbalanced, since core recovery is never complete and the recovered lithologies are rarely balanced. In this paper, we introduced the synthetic minority over-sampling technique (SMOTE) and Borderline-SMOTE to deal with the imbalanced data. After experiments generating different quantities of training samples by SMOTE and Borderline-SMOTE, the most suitable classifier was selected to overcome this disadvantage of the SVM. Then, the popular supervised classifier back-propagation neural network (BPNN), which had proved competent for lithology classification of crystalline rocks in previous studies, was compared to evaluate the performance of the SVM. Results show that Borderline-SMOTE can improve the SVM with substantially increased accuracy, even for minority classes, in a reasonable manner, while the SVM outperforms BPNN in lithology prediction and generalization on CCSD-MH data. We demonstrate the potential of the SVM as an alternative to current methods for lithology identification of crystalline rocks.
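The SMOTE idea used above can be sketched compactly: synthetic minority samples are created by interpolating between a minority point and one of its nearest minority neighbours. This is a minimal reimplementation for illustration, not the imbalanced-learn library code the study would typically rely on, and it omits the "borderline" refinement.

```python
import numpy as np

def smote(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE sketch: each synthetic sample lies on the segment
    between a random minority point and one of its k nearest minority
    neighbours, at a random interpolation fraction."""
    if rng is None:
        rng = np.random.default_rng(0)
    X_min = np.asarray(X_min, float)
    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        j = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[j], axis=1)
        neighbors = np.argsort(d)[1:k + 1]       # skip the point itself
        nb = X_min[rng.choice(neighbors)]
        synthetic[i] = X_min[j] + rng.random() * (nb - X_min[j])
    return synthetic
```

Because every synthetic point is a convex combination of two real minority samples, the oversampled class stays inside the region the minority data already occupies, which is why SMOTE tends to help the SVM place its decision boundary more fairly.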
Evaluation of a wave-vector-frequency-domain method for nonlinear wave propagation
Jing, Yun; Tao, Molei; Clement, Greg T.
2011-01-01
A wave-vector-frequency-domain method is presented to describe one-directional forward or backward acoustic wave propagation in a nonlinear homogeneous medium. Starting from a frequency-domain representation of the second-order nonlinear acoustic wave equation, an implicit solution for the nonlinear term is proposed by employing the Green’s function. Its approximation, which is more suitable for numerical implementation, is used. An error study is carried out to test the efficiency of the model by comparing the results with the Fubini solution. It is shown that the error grows as the propagation distance and step-size increase. However, for the specific case tested, even at a step size as large as one wavelength, sufficient accuracy for plane-wave propagation is observed. A two-dimensional steered transducer problem is explored to verify the nonlinear acoustic field directional independence of the model. A three-dimensional single-element transducer problem is solved to verify the forward model by comparing it with an existing nonlinear wave propagation code. Finally, backward-projection behavior is examined. The sound field over a plane in an absorptive medium is backward projected to the source and compared with the initial field, where good agreement is observed. PMID:21302985
Dai, Huanping; Micheyl, Christophe
2012-11-01
Psychophysical "reverse-correlation" methods allow researchers to gain insight into the perceptual representations and decision weighting strategies of individual subjects in perceptual tasks. Although these methods have gained momentum, until recently their development was limited to experiments involving only two response categories. Recently, two approaches for estimating decision weights in m-alternative experiments have been put forward. One approach extends the two-category correlation method to m > 2 alternatives; the second uses multinomial logistic regression (MLR). In this article, the relative merits of the two methods are discussed, and the issues of convergence and statistical efficiency of the methods are evaluated quantitatively using Monte Carlo simulations. The results indicate that, for a range of values of the number of trials, the estimated weighting patterns are closer to their asymptotic values for the correlation method than for the MLR method. Moreover, for the MLR method, weight estimates for different stimulus components can exhibit strong correlations, making the analysis and interpretation of measured weighting patterns less straightforward than for the correlation method. These and other advantages of the correlation method, which include computational simplicity and a close relationship to other well-established psychophysical reverse-correlation methods, make it an attractive tool to uncover decision strategies in m-alternative experiments.
Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat
2015-01-01
Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p), often used in physiologically based pharmacokinetics. Therefore, this work has assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. The decision tree-based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and from flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical Abstract: Decision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.
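The mean fold error quoted above can be computed as follows. The definition used here, 10 raised to the mean absolute log10 prediction error, is the one commonly used in pharmacokinetic modelling; the abstract does not state its exact formula, so treat this as an assumption.

```python
import numpy as np

def mean_fold_error(pred, obs):
    """Mean fold error as commonly defined for Vss predictions:
    10 ** mean(|log10(pred/obs)|). A value of 1 means perfect prediction;
    2.33 means predictions are off by a factor of ~2.3 on average."""
    pred = np.asarray(pred, float)
    obs = np.asarray(obs, float)
    return 10.0 ** np.mean(np.abs(np.log10(pred / obs)))
```

Working on the log scale is essential here because Vss spans orders of magnitude across compounds, so over- and under-predictions by the same factor are penalized equally.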
A framework for in-silico formulation design using multivariate latent variable regression methods.
Polizzi, Mark A; García-Muñoz, Salvador
2011-10-14
A comprehensive Quality by Design development paradigm should consider the impact of raw materials and formulation on the final drug product. This work proposes a quantitative approach to simultaneously predict particle, powder, and compact mechanical properties of a pharmaceutical blend, based on that of the raw materials. A new, two-step, multivariate modeling method, referred to as the weighted scores PLS, was developed to address the challenge of predicting the properties of a powder blend while enabling process understanding. The model validation exercise is shown along with selected practical applications. It is shown how the proposed in-silico model exhibits sufficient predictive power to be an important tool in the pharmaceutical development decision making process while requiring minimal experimentation and material usage. Copyright © 2011 Elsevier B.V. All rights reserved.
Mingzhu Tang; Chunhua Yang; Kang Zhang; Qiyue Xie
2014-01-01
The cost-sensitive support vector machine is one of the most popular tools for dealing with class-imbalanced problems such as fault diagnosis. However, such data often come with a huge number of examples as well as features. Aiming at the class-imbalance problem on big data, a cost-sensitive support vector machine using a randomized dual coordinate descent method (CSVM-RDCD) is proposed in this paper. The solution of the concerned subproblem at each iteration is derived in closed form, and the computational cost is...
Directory of Open Access Journals (Sweden)
Hongying Du
Full Text Available The epidermal growth factor receptor (EGFR) protein tyrosine kinase (PTK) is an important protein target for anti-tumor drug discovery. To identify potential EGFR inhibitors, we conducted a quantitative structure-activity relationship (QSAR) study on the inhibitory activity of a series of quinazoline derivatives against EGFR tyrosine kinase. Two 2D-QSAR models were developed based on the best multi-linear regression (BMLR) and grid-search assisted projection pursuit regression (GS-PPR) methods. The results demonstrate that the inhibitory activity of quinazoline derivatives is strongly correlated with their polarizability, activation energy, mass distribution, connectivity, and branching information. Although the present investigation focused on EGFR, the approach provides a general avenue in the structure-based drug development of different protein receptor inhibitors.
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
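The stepwise procedure the program implements can be sketched as forward selection. This is an illustrative Python sketch, not the FORTRAN IV code: the original steps variables in and out using formal significance tests, whereas this simplified version uses a relative reduction in residual sum of squares as its stand-in stopping rule.

```python
import numpy as np

def forward_stepwise(X, y, min_gain=0.05):
    """Greedy forward selection: repeatedly add the predictor giving the
    largest drop in residual sum of squares; stop when no candidate
    improves the fit by at least `min_gain` (relative), a simplified
    stand-in for an F-test-based stepping criterion."""
    n, p = X.shape
    selected = []
    resid = y - np.mean(y)
    sse = float(resid @ resid)
    while len(selected) < p:
        gains = {}
        for j in set(range(p)) - set(selected):
            cols = np.column_stack([np.ones(n), X[:, selected + [j]]])
            beta, *_ = np.linalg.lstsq(cols, y, rcond=None)
            r = y - cols @ beta
            gains[j] = sse - float(r @ r)
        j_best = max(gains, key=gains.get)
        if gains[j_best] < min_gain * sse:
            break  # no statistically meaningful improvement left
        selected.append(j_best)
        sse -= gains[j_best]
    return selected
```

The final model thus contains only the predictors that earn their place, mirroring the program's goal of a minimal set of statistically significant coefficients.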
López-López, José Antonio; Van den Noortgate, Wim; Tanner-Smith, Emily E; Wilson, Sandra Jo; Lipsey, Mark W
2017-12-01
Dependent effect sizes are ubiquitous in meta-analysis. Using Monte Carlo simulation, we compared the performance of 2 methods for meta-regression with dependent effect sizes-robust variance estimation (RVE) and 3-level modeling-with the standard meta-analytic method for independent effect sizes. We further compared bias-reduced linearization and jackknife estimators as small-sample adjustments for RVE and Wald-type and likelihood ratio tests for 3-level models. The bias in the slope estimates, width of the confidence intervals around those estimates, and empirical type I error and statistical power rates of the hypothesis tests from these different methods were compared for mixed-effects meta-regression analysis with one moderator either at the study or at the effect size level. All methods yielded nearly unbiased slope estimates under most scenarios, but as expected, the standard method ignoring dependency provided inflated type I error rates when testing the significance of the moderators. Robust variance estimation methods yielded not only the best results in terms of type I error rate but also the widest confidence intervals and the lowest power rates, especially when using the jackknife adjustments. Three-level models showed a promising performance with a moderate to large number of studies, especially with the likelihood ratio test, and yielded narrower confidence intervals around the slope and higher power rates than those obtained with the RVE approach. All methods performed better when the moderator was at the effect size level, the number of studies was moderate to large, and the between-studies variance was small. Our results can help meta-analysts deal with dependency in their data. © 2017 Crown copyright. Research Synthesis Methods © 2017 John Wiley and Sons, Ltd. This article is published with the permission of the Controller of HMSO and the Queen's Printer for Scotland.
Frank, Sander B; Schulz, Veronique V; Miranti, Cindy K
2017-02-28
Short hairpin RNA (shRNA) is an established and effective tool for stable knock down of gene expression. Lentiviral vectors can be used to deliver shRNAs, thereby providing the ability to infect most mammalian cell types with high efficiency, regardless of proliferation state. Furthermore, the use of inducible promoters to drive shRNA expression allows for more thorough investigations into the specific timing of gene function in a variety of cellular processes. Moreover, inducible knockdown allows the investigation of genes that would be lethal or otherwise poorly tolerated if constitutively knocked down. Lentiviral inducible shRNA vectors are readily available, but unfortunately the process of cloning, screening, and testing shRNAs can be time-consuming and expensive. Therefore, we sought to refine a popular vector (Tet-pLKO-Puro) and streamline the cloning process with efficient protocols so that researchers can more efficiently utilize this powerful tool. METHODS: First, we modified the Tet-pLKO-Puro vector to make it easy ("EZ") for molecular cloning (EZ-Tet-pLKO-Puro). Our primary modification was to shrink the stuffer region, which allows vector purification via polyethylene glycol precipitation thereby avoiding the need to purify DNA through agarose. In addition, we generated EZ-Tet-pLKO vectors with hygromycin or blasticidin resistance to provide greater flexibility in cell line engineering. Furthermore, we provide a detailed guide for utilizing these vectors, including shRNA design strategy and simplified screening methods. Notably, we emphasize the importance of loop sequence design and demonstrate that the addition of a single mismatch in the loop stem can greatly improve shRNA efficiency. Lastly, we display the robustness of the system with a doxycycline titration and recovery time course and provide a cost/benefit analysis comparing our system with purchasing pre-designed shRNA vectors. Our aim was twofold: first, to take a very useful shRNA vector
Di Ianni, Tommaso; Villagomez Hoyos, Carlos Armando; Ewertsen, Caroline; Kjeldsen, Thomas Kim; Mosegaard, Jesper; Nielsen, Michael Bachmann; Jensen, Jorgen Arendt
2017-11-01
This paper presents a vector flow imaging method for the integration of quantitative blood flow imaging in portable ultrasound systems. The method combines directional transverse oscillation (TO) and synthetic aperture sequential beamforming to yield continuous velocity estimation in the whole imaging region. Six focused emissions are used to create a high-resolution image (HRI), and a dual-stage beamforming approach is used to lower the data throughput between the probe and the processing unit. The transmit/receive focal points are laterally separated to obtain a TO in the HRI that allows for the velocity estimation along the lateral and axial directions using a phase-shift estimator. The performance of the method was investigated with constant flow measurements in a flow rig system using the SARUS scanner and a 4.1-MHz linear array. A sequence was designed with interleaved B-mode and flow emissions to obtain continuous data acquisition. A parametric study was carried out to evaluate the effect of critical parameters. The vessel was placed at depths from 20 to 40 mm, with beam-to-flow angles of 65°, 75°, and 90°. For the lateral velocities at 20 mm, a bias between -5% and -6.2% was obtained, and the standard deviation (SD) was between 6% and 9.6%. The axial bias was lower than 1% with an SD around 2%. The mean estimated angles were 66.70° ± 2.86°, 72.65° ± 2.48°, and 89.13° ± 0.79° for the three cases. A proof-of-concept demonstration of the real-time processing and wireless transmission was tested in a commercial tablet obtaining a frame rate of 27 frames/s and a data rate of 14 MB/s. An in vivo measurement of a common carotid artery of a healthy volunteer was finally performed to show the potential of the method in a realistic setting. The relative SD averaged over a cardiac cycle was 4.33%.
Directory of Open Access Journals (Sweden)
Amin Moori Roozali
2014-08-01
Full Text Available Correct estimation of water inflow into underground excavations can decrease safety risks and associated costs. Researchers have proposed different methods to assess this value. It has been proved that the water transmissivity of a rock joint is a function of factors such as normal stress, joint roughness, joint size, and water pressure; therefore, a laboratory setup was proposed to quantitatively measure the flow as a function of the mentioned parameters. Among these, normal stress has proved to be the most influential parameter. With increasing joint roughness and rock sample size, water flow has decreased, while increasing water pressure has a direct increasing effect on the flow. To simulate the complex interaction of these parameters, neural networks and fuzzy methods, together with regression analysis, have been utilized. Correlation factors between laboratory results and the obtained numerical ones show good agreement, which proves the usefulness of these methods for the assessment of water inflow.
Eekhout, Iris; van de Wiel, Mark A; Heymans, Martijn W
2017-08-22
Multiple imputation is a recommended method to handle missing data. For significance testing after multiple imputation, Rubin's Rules (RR) are easily applied to pool parameter estimates. In a logistic regression model, to consider whether a categorical covariate with more than two levels significantly contributes to the model, different methods are available: for example, pooling chi-square tests with multiple degrees of freedom, pooling likelihood ratio test statistics, and pooling based on the covariance matrix of the regression model. These methods are more complex than RR and are not available in all mainstream statistical software packages. In addition, they do not always attain optimal power levels. We argue that the median of the p-values from the overall significance tests from the analyses on the imputed datasets can be used as an alternative pooling rule for categorical variables. The aim of the current study is to compare different methods of testing a categorical variable for significance after multiple imputation with respect to applicability and power. In a large simulation study, we demonstrated the control of the type I error and the power levels of different pooling methods for categorical variables. This simulation study showed that for non-significant categorical covariates the type I error is controlled, and the statistical power of the median pooling rule was at least equal to that of current multiple parameter tests. An empirical data example showed similar results. It can therefore be concluded that using the median of the p-values from the imputed data analyses is an attractive and easy to use alternative method for significance testing of categorical variables.
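The median pooling rule proposed above is deliberately simple to apply. A minimal sketch (the interface is illustrative; in practice the p-values would come from the overall significance test of the categorical covariate in each imputed dataset):

```python
import statistics

def median_p_rule(p_values, alpha=0.05):
    """Median-p pooling: judge the significance of a categorical covariate
    after multiple imputation by the median of the per-imputation
    overall-test p-values. Returns (pooled p, significant at alpha)."""
    p_med = statistics.median(p_values)
    return p_med, p_med < alpha
```

Unlike pooling covariance matrices or likelihood ratio statistics, this rule needs nothing beyond the per-dataset p-values, which is exactly why the authors advocate it for mainstream software workflows.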
A novel vector-based method for exclusive overexpression of star-form microRNAs.
Directory of Open Access Journals (Sweden)
Bo Qu
Full Text Available The roles of microRNAs (miRNAs) as important regulators of gene expression have been studied intensively. Although most of these investigations have involved the highly expressed form of the two mature miRNA species, increasing evidence points to essential roles for star-form microRNAs (miRNA*), which are usually expressed at much lower levels. Owing to the nature of miRNA biogenesis, it is challenging to use plasmids containing miRNA coding sequences for gain-of-function experiments concerning the roles of microRNA* species. Synthetic microRNA mimics could introduce specific miRNA* species into cells, but this transient overexpression system has many shortcomings. Here, we report that specific miRNA* species can be overexpressed by introducing artificially designed stem-loop sequences into short hairpin RNA (shRNA) overexpression vectors. Using our prototypic plasmid, designed to overexpress hsa-miR-146b-3p, we successfully expressed high levels of hsa-miR-146b-3p without detectable change of hsa-miR-146b-5p. Functional analysis involving luciferase reporter assays showed that, like natural miRNAs, the overexpressed hsa-miR-146b-3p inhibited target gene expression by 3'UTR seed pairing. Our demonstration that this method could overexpress two other miRNAs suggests that the approach should be broadly applicable. Our novel strategy opens the way for exclusive stable overexpression of miRNA* species and analysis of their unique functions both in vitro and in vivo.
Arimura, Hidetaka; Anai, Shigeo; Yoshidome, Satoshi; Nakamura, Katsumasa; Shioyama, Yoshiyuki; Nomoto, Satoshi; Honda, Hiroshi; Onizuka, Yoshihiko; Terashima, Hiromi
2007-03-01
The purpose of this study was to develop a computerized method for measurement of displacement vectors of the target position on electronic portal imaging device (EPID) cine images in a treatment without implanted markers. Our proposed method was based on a template matching technique using the cross-correlation coefficient between a reference portal (RP) image and each consecutive portal (CP) image acquired by the EPID. EPID images with 512×384 pixels (pixel size: 0.56 mm) were acquired in a cine mode at a sampling rate of 0.5 frames/sec using an energy of 4, 6, or 10 MV on linear accelerators. The displacement vector of the target on each cine image was determined from the position at which the cross-correlation between the RP image and each CP image was maximal. We applied our method to EPID cine images of a lung phantom with a tumor model simulating respiratory motion, to 5 cases of non-small cell lung cancer, and to one case of metastasis. For validation of our proposed method, displacement vectors of the target position calculated by our method were compared with those determined manually by two radiation oncologists. As a result, for lung phantom images, target displacements by our method correlated well with those by the oncologists (r=0.972 - 0.994). Correlation values for 4 cases ranged from 0.854 to 0.991, but the values for the other two cases were 0.609 and 0.644. This preliminary result suggested that our method may be useful for monitoring displacement vectors of target positions without implanted markers in stereotactic radiotherapy.
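The cross-correlation template matching at the heart of the method can be sketched as follows. This is an illustrative exhaustive-search implementation on small arrays, not the clinical code; real EPID images would normally use a restricted search window around the expected target position.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation coefficient between two equal-size patches."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def match_template(image, template):
    """Exhaustive template matching: return the (row, col) offset at which
    the template best correlates with the image, plus that peak score.
    The displacement vector is this position minus the reference position."""
    ih, iw = image.shape
    th, tw = template.shape
    best, best_pos = -2.0, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = ncc(image[r:r + th, c:c + tw], template)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos, best
```

Multiplying the pixel offset by the 0.56 mm pixel size converts the matched position into a physical displacement vector, which is what gets compared against the oncologists' manual readings.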
DEFF Research Database (Denmark)
Le, T.H.A.; Pham, D. T.; Canh, Nam Nguyen
2010-01-01
Both the efficient and weakly efficient sets of an affine fractional vector optimization problem are, in general, neither convex nor given explicitly. Optimization problems over one of these sets are thus nonconvex. We propose two methods for optimizing a real-valued function over the efficient and weakly efficient sets of an affine fractional vector optimization problem. The first method is a local one. By using a regularization function, we reformulate the problem into a standard smooth mathematical programming problem that allows applying available methods for smooth programming. In the case where the objective function is linear, we have investigated a global algorithm based upon a branch-and-bound procedure. The algorithm uses a Lagrangian bound coupled with a simplicial bisection in the criteria space. Preliminary computational results show that the global algorithm is promising.
Strong, Mark; Oakley, Jeremy E; Brennan, Alan; Breeze, Penny
2015-07-01
Health economic decision-analytic models are used to estimate the expected net benefits of competing decision options. The true values of the input parameters of such models are rarely known with certainty, and it is often useful to quantify the value to the decision maker of reducing uncertainty through collecting new data. In the context of a particular decision problem, the value of a proposed research design can be quantified by its expected value of sample information (EVSI). EVSI is commonly estimated via a 2-level Monte Carlo procedure in which plausible data sets are generated in an outer loop, and then, conditional on these, the parameters of the decision model are updated via Bayes rule and sampled in an inner loop. At each iteration of the inner loop, the decision model is evaluated. This is computationally demanding and may be difficult if the posterior distribution of the model parameters conditional on sampled data is hard to sample from. We describe a fast nonparametric regression-based method for estimating per-patient EVSI that requires only the probabilistic sensitivity analysis sample (i.e., the set of samples drawn from the joint distribution of the parameters and the corresponding net benefits). The method avoids the need to sample from the posterior distributions of the parameters and avoids the need to rerun the model. The only requirement is that sample data sets can be generated. The method is applicable with a model of any complexity and with any specification of model parameter distribution. We demonstrate in a case study the superior efficiency of the regression method over the 2-level Monte Carlo method.
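The regression shortcut described above can be sketched on a toy decision problem. This is an illustration of the idea only, not the authors' implementation: the two-option decision model, the normal priors, the data-summary statistic, and the cubic polynomial (standing in for a nonparametric smoother) are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
S = 20000            # probabilistic sensitivity analysis (PSA) sample size
n = 50               # proposed study sample size (assumption)

# PSA sample: uncertain effect parameter and the net benefit of each option
theta = rng.normal(0.0, 1.0, S)
nb_adopt = 1000.0 * theta            # toy decision model
nb_reject = np.zeros(S)

# one plausible data summary per PSA draw (sampling sd of 2 assumed)
xbar = rng.normal(theta, 2.0 / np.sqrt(n))

# regress net benefit on the data summary; the fitted values estimate the
# expected net benefit conditional on the future data, with no inner loop
fitted_adopt = np.polyval(np.polyfit(xbar, nb_adopt, 3), xbar)
fitted_reject = nb_reject            # identically zero, nothing to fit

current_value = max(nb_adopt.mean(), nb_reject.mean())
evsi = np.mean(np.maximum(fitted_adopt, fitted_reject)) - current_value
print(round(float(evsi), 1))
```

For this conjugate toy model the preposterior calculation can be done analytically (EVSI ≈ 384 here), which is a useful check on the regression estimate.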
In-vivo Examples of Flow Patterns With The Fast Vector Velocity Ultrasound Method
DEFF Research Database (Denmark)
Hansen, Kristoffer Lindskov; Udesen, Jesper; Gran, Fredrik
2009-01-01
and using a 100-CPU Linux cluster for post-processing, PWE can achieve a frame rate of 100 Hz, where one vector velocity sequence of approximately 3 sec takes 10 h to store and 48 h to process. In this paper a case study is presented of in-vivo vector velocity estimates in different complex vessel geometries. Results: The flow patterns of six bifurcations and two veins were investigated. It was shown: 1. that a stable vortex was present in the carotid bulb, as opposed to the other examined bifurcations; 2. that retrograde flow was present in the superficial branch of the femoral artery during diastole; 3. …
Directory of Open Access Journals (Sweden)
RSC Teixeira
2007-06-01
Full Text Available This study aimed at evaluating the alternative methods of zinc oxide and fasting to induce molt in Japanese quails. A total of 190 48-week-old quails were used. They were at the end of their laying cycle and presented low egg production. Quails molted by zinc oxide (Z) were fed a diet containing 25,000 ppm of zinc oxide and received water ad libitum. Quails treated by fasting (F) received no feed and one day of water restriction. The treatment period was determined by the experimental level of body weight loss (BWL). Birds were submitted to different levels of BWL in order to analyze reproductive system regression (ovary + oviduct) and livability. The following groups were established according to their BWL: Control (untreated quails); F25 (25% BWL by F); F35 (35% BWL by F); Z25 (25% BWL by Z); and Z35 (35% BWL by Z). Z25, Z35, and F35 presented no significant differences in reproductive system weights after molting; however, their weights were lower than those of F25. Z25, Z35, and F35 presented livabilities of 97.5, 72.5, and 90%, respectively. Japanese quails treated by the alternative method of zinc oxide, with a body weight loss of 25%, showed a low mortality rate and adequate regression of the reproductive organs.
Directory of Open Access Journals (Sweden)
Mohd Faris Dziauddin
2017-07-01
Full Text Available This study estimates the effect of locational attributes on residential property values in Kuala Lumpur, Malaysia. Geographically weighted regression (GWR) enables local rather than global parameters to be estimated, with the results presented in map form. The results of this study reveal that residential property values are mainly determined by the property’s physical (structural) attributes, but proximity to locational attributes also contributes marginally. The use of GWR in this study is considered a better approach than other methods for examining the effect of locational attributes on residential property values. GWR has the capability to produce meaningful results in which different locational attributes have differential spatial effects on residential property values across a geographical area. This method can determine the factors on which premiums depend, and in turn it can assist the government in taxation matters.
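A geographically weighted regression of the kind described above amounts to solving a distance-weighted least-squares problem at every observation location, so each attribute gets a mappable local coefficient. The sketch below, with a toy housing data set and a Gaussian kernel, is a minimal illustration under assumed data, not the study's specification:

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """One weighted least-squares fit per location (Gaussian kernel weights)."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])      # intercept + attributes
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-(d / bandwidth) ** 2)      # nearby observations weigh more
        XtW = Xd.T * w                         # == Xd.T @ diag(w)
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas

# toy data: the price effect of floor area grows from west to east
rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, (200, 2))
area = rng.uniform(50, 200, 200)               # structural attribute
local_slope = 1.0 + 0.2 * coords[:, 0]
price = 30 + local_slope * area + rng.normal(0, 5, 200)

betas = gwr_coefficients(coords, area.reshape(-1, 1), price, bandwidth=2.0)
# the estimated local slope should rise with the x-coordinate
print(round(float(np.corrcoef(coords[:, 0], betas[:, 1])[0, 1]), 2))
```

The bandwidth controls the locality of each fit; in practice it is chosen by cross-validation or an information criterion rather than fixed as here.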
Chen, Yue; Li, Zeng; Chen, Hai-Feng
2010-03-01
CCR5 is the key receptor for HIV-1 virus entry into host cells, and it has become an attractive target for antiretroviral drug design. To date, six types of CCR5 antagonists have been synthesized and evaluated. To search for more potent bioactive compounds, a non-linear support vector machine was used to construct relationship models for 103 oximino-piperidino-piperidine CCR5 antagonists. Then, comparative molecular field analysis and comparative molecular similarity indices analysis models were constructed after alignment on their common substructure. Twenty-one structurally diverse compounds, which were not included in the support vector machine, comparative molecular field analysis, or comparative molecular similarity indices analysis models, were used to validate these models. The results show that these models possess good predictive ability. When comparing the support vector machine and 3D-quantitative structure-activity relationship models, the results obtained from the two methods are consistent. However, the 3D-quantitative structure-activity relationship model is significantly better than the support vector machine model and a previously reported pharmacophore model. These models can help us make quantitative predictions of bio-activities before the in vitro and in vivo stages.
Convergence of vector spherical wave expansion method applied to near-field radiative transfer.
Sasihithlu, Karthik; Narayanaswamy, Arvind
2011-07-04
Near-field radiative transfer between two objects can be computed using Rytov's theory of fluctuational electrodynamics, in which the strength of electromagnetic sources is related to temperature through the fluctuation-dissipation theorem and the resultant energy transfer is described using the dyadic Green's function of the vector Helmholtz equation. When the two objects are spheres, the dyadic Green's function can be expanded in a series of vector spherical waves. Based on comparison with the convergence criterion for the case of radiative transfer between two parallel surfaces, we derive a relation for the number of vector spherical waves required for convergence in the case of radiative transfer between two spheres. We show that when electromagnetic surface waves are active at a frequency, the number of vector spherical waves required for convergence is proportional to Rmax/d as d/Rmax → 0, where Rmax is the radius of the larger sphere and d is the smallest gap between the two spheres. This criterion for convergence applies equally well to other near-field electromagnetic scattering problems.
Novel sulI binary vectors enable an inexpensive foliar selection method in Arabidopsis
Directory of Open Access Journals (Sweden)
Smith Jamison
2011-03-01
Full Text Available Background: Sulfonamide resistance is conferred by the sulI gene found on many Enterobacteriaceae R plasmids and Tn21 type transposons. The sulI gene encodes a sulfonamide-insensitive dihydropteroate synthase enzyme required for folate biosynthesis. Transformation of tobacco, potato or Arabidopsis using sulI as a selectable marker generates sulfadiazine-resistant plants. Typically sulI-based selection of transgenic plants is performed on tissue culture media under sterile conditions. Findings: A set of novel binary vectors containing a sulI selectable marker expression cassette were constructed and used to generate transgenic Arabidopsis. We demonstrate that the sulI selectable marker can be utilized for direct selection of plants grown in soil with a simple foliar spray application procedure. A highly effective and inexpensive high-throughput screening strategy to identify transgenic Arabidopsis without use of tissue culture was developed. Conclusion: Novel sulI-containing Agrobacterium binary vectors designed to over-express a gene of interest or to characterize a test promoter in transgenic plants have been constructed. These new vector tools, combined with the various beneficial attributes of sulfonamide selection and the simple foliar screening strategy, provide an advantageous alternative for plant biotechnology researchers. The set of binary vectors is freely available upon request.
A method for real-time three-dimensional vector velocity imaging
DEFF Research Database (Denmark)
Jensen, Jørgen Arendt; Nikolov, Svetoslav
2003-01-01
The paper presents an approach for making real-time three-dimensional vector flow imaging. Synthetic aperture data acquisition is used, and the data are beamformed along the flow direction to yield signals usable for flow estimation. The signals are cross-correlated to determine the shift in position...
Directory of Open Access Journals (Sweden)
Fuqiang Sun
2017-01-01
Full Text Available Rapid and accurate lifetime prediction of critical components in a system is important to maintaining the system’s reliable operation. To this end, many lifetime prediction methods have been developed to handle various failure-related data collected in different situations. Among these methods, machine learning and Bayesian updating are the most popular ones. In this article, a Bayesian least-squares support vector machine method that combines the least-squares support vector machine with Bayesian inference is developed for predicting the remaining useful life of a microwave component. A degradation model describing the change in the component’s power gain over time is developed, and the point and interval remaining useful life estimates are obtained considering a predefined failure threshold. In our case study, the radial basis function neural network approach is also implemented for comparison purposes. The results indicate that the Bayesian least-squares support vector machine method is more precise and stable in predicting the remaining useful life of this type of component.
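A least-squares support vector machine of the kind used above replaces the standard SVM quadratic program with a single linear system in the dual variables. The sketch below fits a toy power-gain degradation curve; the RBF kernel, hyperparameters, and data are assumptions, and the Bayesian inference layer of the article is omitted:

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """LS-SVM regression: solve [[0, 1^T], [1, K + I/gamma]] [b; a] = [0; y]."""
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    b, alpha = sol[0], sol[1:]
    return lambda Xq: rbf_kernel(Xq, X, sigma) @ alpha + b

# toy degradation data: power gain decaying slowly over time
t = np.linspace(0, 10, 60).reshape(-1, 1)
gain = 30.0 - 0.8 * t.ravel() + 0.1 * np.sin(3 * t.ravel())
model = lssvm_fit(t, gain, gamma=100.0, sigma=1.0)
pred = model(t)
print(round(float(np.abs(pred - gain).max()), 3))
```

Because the dual problem is a linear system, training is a single solve; the price is that every training point becomes a support vector.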
Robinson, Gilbert de B
2011-01-01
This brief undergraduate-level text by a prominent Cambridge-educated mathematician explores the relationship between algebra and geometry. An elementary course in plane geometry is the sole requirement for Gilbert de B. Robinson's text, which is the result of several years of teaching and learning the most effective methods from discussions with students. Topics include lines and planes, determinants and linear equations, matrices, groups and linear transformations, and vectors and vector spaces. Additional subjects range from conics and quadrics to homogeneous coordinates and projective geometry.
Pollett, Simon; Althouse, Benjamin M; Forshey, Brett; Rutherford, George W; Jarman, Richard G
2017-11-01
Internet-based surveillance methods for vector-borne diseases (VBDs) using "big data" sources such as Google, Twitter, and internet newswire scraping have recently been developed, yet reviews of such "digital disease detection" methods have focused on respiratory pathogens, particularly in high-income regions. Here, we present a narrative review of the literature that has examined the performance of internet-based biosurveillance for diseases caused by vector-borne viruses, parasites, and other pathogens, including Zika, dengue, other arthropod-borne viruses, malaria, leishmaniasis, and Lyme disease, across a range of settings, including low- and middle-income countries. The fundamental features, advantages, and drawbacks of each internet big data source are presented for readers with varying familiarity with "digital epidemiology." We conclude with some of the challenges and future directions in using internet-based biosurveillance for the surveillance and control of VBDs.
A Novel Method for Vector Control of Three-Phase Induction Motor under Open-Phase Fault
Directory of Open Access Journals (Sweden)
Mohammad Jannati
2015-01-01
Full Text Available The majority of electrical machines, such as induction motors, can be modeled by an equivalent two-phase machine model (d-q model). A three-phase induction motor with one of the stator phases opened (a faulty three-phase induction motor) can also be modeled by an equivalent two-phase machine. If a conventional vector control method for balanced three-phase induction motors is used for this faulty machine, significant oscillations in speed and torque result. In this paper, a novel technique for vector control of faulty three-phase induction motors based on rotor-field-oriented control (RFOC) is presented. The performance of the proposed method was evaluated using MATLAB software. The results show that it achieves significant improvements in reducing the oscillations of the speed and torque responses.
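The d-q modeling step underlying the abstract above can be illustrated with the amplitude-invariant Clarke and Park transforms: balanced three-phase currents map to constant d-q values, while opening one phase makes them oscillate, which is exactly the disturbance a fault-tolerant scheme must handle. A minimal sketch (toy 50 Hz currents assumed):

```python
import numpy as np

def abc_to_dq(ia, ib, ic, theta):
    """Amplitude-invariant Clarke + Park transforms: stator phase currents
    to the rotating d-q frame used in field-oriented control."""
    i_alpha = (2 * ia - ib - ic) / 3.0           # Clarke (alpha-beta frame)
    i_beta = (ib - ic) / np.sqrt(3.0)
    i_d = i_alpha * np.cos(theta) + i_beta * np.sin(theta)   # Park rotation
    i_q = -i_alpha * np.sin(theta) + i_beta * np.cos(theta)
    return i_d, i_q

t = np.linspace(0, 0.04, 400)
w = 2 * np.pi * 50                               # assumed 50 Hz supply
ia = np.cos(w * t)
ib = np.cos(w * t - 2 * np.pi / 3)
ic = np.cos(w * t + 2 * np.pi / 3)

# balanced machine: d-q currents are constant (i_d = 1, i_q = 0)
i_d, i_q = abc_to_dq(ia, ib, ic, w * t)
# open-phase fault (ic = 0): the d-q currents oscillate
i_d_f, _ = abc_to_dq(ia, ib, np.zeros_like(ic), w * t)
print(round(float(i_d.std()), 6), round(float(i_d_f.std()), 3))
```

The nonzero standard deviation in the faulted case is the oscillation that a conventional controller would pass through to torque and speed.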
Wong, Jacklyn; Bayoh, Nabie; Olang, George; Killeen, Gerry; Hamel, Mary J; Vulule, John M.; Gimnig, John E.
2013-01-01
Background: Operational vector sampling methods lack standardization, making quantitative comparisons of malaria transmission across different settings difficult. Human landing catch (HLC) is considered the research gold standard for measuring human-mosquito contact, but is unsuitable for large-scale sampling. This study assessed mosquito catch rates of CDC light trap (CDC-LT), Ifakara tent trap (ITT), window exit trap (WET), pot resting trap (PRT), and box resting trap (BRT) relative to HL...
van Dodewaard, Caitlin A M; Richards, Stephanie L; Harris, Jonathan W
2016-01-01
Commercially available blood can be used as an alternative to live animals to maintain mosquito colonies and deliver infectious bloodmeals during research studies. We analyzed the extent to which two methods for blood coagulate removal (defibrination or addition of sodium citrate) affected life table characteristics (i.e., fecundity, fertility, hatch rate, and adult survival) and vector competence (infection, dissemination, and transmission) of Aedes albopictus (Skuse) for dengue virus (DENV). Two types of bovine blood were tested at two extrinsic incubation temperatures (27 or 30°C) for DENV-infected and uninfected mosquitoes. Fully engorged mosquitoes were transferred to individual cages containing an oviposition cup and a substrate. Eggs (fecundity) and hatched larvae (fertility) were counted. At 14 and 21 d post feeding on a DENV-infected bloodmeal, 15 mosquitoes were sampled from each group, and vector competence was analyzed (bodies [infection], legs [dissemination], and saliva [transmission]). Differences in life table characteristics and vector competence were analyzed for mosquitoes fed blood processed using different methods for removal of coagulates. The method for removal of coagulates significantly impacted fecundity, fertility, and hatch time in the uninfected group, but not DENV-infected group. Infected mosquitoes showed significantly higher fecundity and faster hatch time than uninfected mosquitoes. We show no significant differences in infection or dissemination rates between groups; however, horizontal transmission rate was significantly higher in mosquitoes fed DENV-infected citrated compared with defibrinated blood. We expect the findings of this study to inform research using artificial blood delivery methods to assess vector competence.
Absolute Geostrophic Velocity Inverted from World Ocean Atlas 2013 (WOAV13) with the P-Vector Method
2015-11-01
The absolute geostrophic velocity, representing the large-scale ocean circulation, is calculated from the WOA13 (T, S) data using the P-vector inverse method (Chu …
Li, Y; Graubard, B I; Huang, P; Gastwirth, J L
2015-02-20
Determining the extent of a disparity, if any, between groups of people, for example, race or gender, is of interest in many fields, including public health for medical treatment and prevention of disease. An observed difference in the mean outcome between an advantaged group (AG) and disadvantaged group (DG) can be due to differences in the distribution of relevant covariates. The Peters-Belson (PB) method fits a regression model with covariates to the AG to predict, for each DG member, their outcome measure as if they had been from the AG. The difference between the mean predicted and the mean observed outcomes of DG members is the (unexplained) disparity of interest. We focus on applying the PB method to estimate the disparity based on binary/multinomial/proportional odds logistic regression models using data collected from complex surveys with more than one DG. Estimators of the unexplained disparity, an analytic variance-covariance estimator that is based on the Taylor linearization variance-covariance estimation method, as well as a Wald test for testing a joint null hypothesis of zero for unexplained disparities between two or more minority groups and a majority group, are provided. Simulation studies with data selected from simple random sampling and cluster sampling, as well as the analyses of disparity in body mass index in the National Health and Nutrition Examination Survey 1999-2004, are conducted. Empirical results indicate that the Taylor linearization variance-covariance estimation is accurate and that the proposed Wald test maintains the nominal level.
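The Peters-Belson idea described above can be sketched in a few lines: fit a model to the advantaged group, predict outcomes for the disadvantaged group, and take the mean predicted-minus-observed difference as the unexplained disparity. The sketch below uses ordinary linear regression and simulated data rather than the survey-weighted logistic models of the paper; all numbers are assumptions:

```python
import numpy as np

def peters_belson_disparity(x_ag, y_ag, x_dg, y_dg):
    """Fit a linear model on the advantaged group (AG), predict disadvantaged
    group (DG) outcomes, and return mean(predicted) - mean(observed)."""
    Xa = np.column_stack([np.ones(len(y_ag)), x_ag])
    beta, *_ = np.linalg.lstsq(Xa, y_ag, rcond=None)
    Xd = np.column_stack([np.ones(len(y_dg)), x_dg])
    predicted = Xd @ beta
    return float(predicted.mean() - y_dg.mean())

rng = np.random.default_rng(2)
x_ag = rng.normal(50, 10, 500)                       # covariate, e.g. age
y_ag = 20 + 0.1 * x_ag + rng.normal(0, 1, 500)
x_dg = rng.normal(45, 10, 500)                       # different covariate mix
y_dg = 20 + 0.1 * x_dg - 0.8 + rng.normal(0, 1, 500)  # 0.8 unexplained gap

gap = peters_belson_disparity(x_ag, y_ag, x_dg, y_dg)
print(round(gap, 2))
```

The covariate shift between groups (means 50 vs. 45) is absorbed by the prediction step, so the estimate recovers only the unexplained component of the gap.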
Directory of Open Access Journals (Sweden)
Corrado Dimauro
2010-01-01
Full Text Available Two methods of SNP pre-selection based on single-marker regression for the estimation of genomic breeding values (G-EBVs) were compared using simulated data provided by the XII QTL-MAS workshop: (i) Bonferroni correction of the significance threshold and (ii) a permutation test to obtain the reference distribution of the null hypothesis and identify significant markers at the P<0.01 and P<0.001 significance thresholds. From the set of markers significant at P<0.001, random subsets of 50% and 25% of the markers were extracted to evaluate the effect of further reducing the number of significant SNPs on G-EBV predictions. The Bonferroni correction method allowed the identification of 595 significant SNPs, which gave the best G-EBV accuracies in the prediction generations (82.80%). The permutation methods gave slightly lower G-EBV accuracies even though a larger number of SNPs resulted significant (2,053 and 1,352 for the 0.01 and 0.001 significance thresholds, respectively). Interestingly, halving or dividing by four the number of SNPs significant at P<0.001 resulted in only a slight decrease in G-EBV accuracies. The genetic structure of the simulated population, with few QTL carrying large effects, might have favoured the Bonferroni method.
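The two pre-selection strategies compared above can be sketched on simulated genotypes: single-marker regression reduced to correlation z-scores, thresholded by (i) a Bonferroni-corrected p-value and (ii) a permutation-derived family-wise cutoff. The effect sizes, marker counts, and normal approximation below are assumptions for illustration:

```python
import numpy as np
from math import erfc, sqrt

def snp_zscores(G, y):
    """Single-marker regression reduced to Fisher-z correlation scores."""
    Gs = (G - G.mean(axis=0)) / G.std(axis=0)
    ys = (y - y.mean()) / y.std()
    r = Gs.T @ ys / len(y)
    return np.arctanh(r) * np.sqrt(len(y) - 3)

rng = np.random.default_rng(3)
n, m = 400, 500
G = rng.integers(0, 3, (n, m)).astype(float)        # genotypes coded 0/1/2
y = 0.6 * G[:, 0] + 0.5 * G[:, 1] + rng.normal(0, 1, n)   # two true QTL

z = snp_zscores(G, y)
p = np.array([erfc(abs(v) / sqrt(2)) for v in z])   # two-sided normal p-value

# (i) Bonferroni-corrected threshold on per-marker p-values
bonf_hits = np.flatnonzero(p < 0.05 / m)

# (ii) permutation test: null distribution of the maximum |z| across markers
B = 200
max_null = np.array([np.abs(snp_zscores(G, rng.permutation(y))).max()
                     for _ in range(B)])
perm_thresh = np.quantile(max_null, 0.99)           # family-wise P < 0.01
perm_hits = np.flatnonzero(np.abs(z) > perm_thresh)
print(sorted(bonf_hits.tolist()), sorted(perm_hits.tolist()))
```

Both filters should recover the two simulated QTL; as in the abstract, the permutation cutoff is data-driven while Bonferroni only divides a fixed alpha by the marker count.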
Yadav, Manish; Singh, Nitin Kumar
2017-12-01
A comparison of the linear and non-linear regression methods for selecting the optimum isotherm among the three most commonly used adsorption isotherms (Langmuir, Freundlich, and Redlich-Peterson) was made for the experimental data of fluoride (F) sorption onto Bio-F at a solution temperature of 30 ± 1 °C. The coefficient of determination (r2) was used to select the best theoretical isotherm among the investigated ones. A total of four linear Langmuir equations were discussed, out of which the linear forms of the most popular, Langmuir-1 and Langmuir-2, showed higher coefficients of determination (0.976 and 0.989) than the other linear Langmuir equations. The Freundlich and Redlich-Peterson isotherms showed a better fit to the experimental data with the linear least-squares method, while with the non-linear method the Redlich-Peterson isotherm equation showed the best fit to the tested data set. The present study shows that the non-linear method can be a better way to obtain the isotherm parameters and represent the most suitable isotherm. The Redlich-Peterson isotherm was found to be the best representative (r2 = 0.999) for this sorption system. It is also observed that the values of β are not close to unity, which means the isotherm is approaching the Freundlich rather than the Langmuir isotherm.
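The linear-versus-nonlinear comparison above can be illustrated with the Langmuir model: the linear route regresses transformed data (here Langmuir-1, Ce/qe against Ce), while the nonlinear route minimizes the untransformed sum of squared errors directly (here by a simple grid search standing in for a dedicated optimizer). The synthetic data and parameter values are assumptions:

```python
import numpy as np

def langmuir(Ce, qm, KL):
    return qm * KL * Ce / (1.0 + KL * Ce)

# synthetic equilibrium data (assumed qm = 25, KL = 0.3, 3% noise)
rng = np.random.default_rng(4)
Ce = np.array([0.5, 1, 2, 4, 8, 16, 32], dtype=float)
qe = langmuir(Ce, qm=25.0, KL=0.3) * rng.normal(1.0, 0.03, Ce.size)

# linear method (Langmuir-1): Ce/qe = Ce/qm + 1/(KL*qm)
slope, intercept = np.polyfit(Ce, Ce / qe, 1)
qm_lin, KL_lin = 1.0 / slope, slope / intercept

# nonlinear method: least squares on the untransformed model via grid search
qms = np.linspace(5, 60, 400)
KLs = np.linspace(0.01, 2.0, 400)
sse = ((qe[None, None, :]
        - langmuir(Ce, qms[:, None, None], KLs[None, :, None])) ** 2).sum(-1)
i, j = np.unravel_index(int(sse.argmin()), sse.shape)
qm_nl, KL_nl = float(qms[i]), float(KLs[j])
print(round(qm_lin, 1), round(KL_lin, 2), round(qm_nl, 1), round(KL_nl, 2))
```

The linearization reweights the errors (high-Ce points dominate the Langmuir-1 fit), which is why the two routes can disagree on the "best" isotherm even on the same data.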
Zhang, H Steve; Kim, Eunmi; Lee, Slgirim; Ahn, Ik-Sung; Jang, Jae-Hyung
2012-01-01
Recombinant adeno-associated virus (AAV) vectors can be engineered to carry genetic material encoding therapeutic gene products that have demonstrated significant clinical promise. These viral vectors are typically produced in mammalian cells by the transient transfection of two or three plasmids encoding the AAV rep and cap genes, the adenovirus helper gene, and a gene of interest. Although this method can produce high-quality AAV vectors when used with multiple purification protocols, one critical limitation is the difficulty in scaling up manufacturing, which poses a significant hurdle to the broad clinical utilization of AAV vectors. To address this challenge, recombinant herpes simplex virus type I (rHSV-1)- and recombinant baculovirus (rBac)-based methods have been established recently. These methods are more amenable to large-scale production of AAV vectors than methods using the transient transfection of mammalian cells. To investigate potential applications of AAV vectors produced by rHSV-1- or rBac-based platforms, the in vivo transduction of rHSV-1- or rBac-produced AAV serotype 2 (AAV2) vectors within the rat brain was examined by comparing them with vectors generated by the conventional transfection method. Injection of rHSV-1- or rBac-produced AAV vectors into rat striatum and cortex tissues revealed no differences in cellular tropism (i.e., predominantly neuronal targeting) or anteroposterior spread compared with AAV2 vectors produced by transient transfection. This report represents a step towards validating AAV vectors produced by the rHSV-1- and rBac-based systems as promising tools, especially for delivering therapeutic molecules to the central nervous system.
Akbari, Somaye; Zebardast, Tannaz; Zarghi, Afshin; Hajimahdi, Zahra
2017-01-01
COX-2 inhibitory activities of some 1,4-dihydropyridine and 5-oxo-1,4,5,6,7,8-hexahydroquinoline derivatives were modeled by quantitative structure-activity relationship (QSAR) analysis using the stepwise multiple linear regression (SW-MLR) method. The built model was robust and predictive, with correlation coefficients (R2) of 0.972 and 0.531 for the training and test groups, respectively. The quality of the model was evaluated by leave-one-out (LOO) cross-validation (LOO correlation coefficient (Q2) of 0.943) and Y-randomization. We also employed a leverage approach to define the applicability domain of the model. Based on the QSAR model results, the COX-2 inhibitory activity of the selected data set correlated with the BEHm6 (highest eigenvalue n. 6 of the Burden matrix/weighted by atomic masses), Mor03u (signal 03/unweighted), and IVDE (mean information content on the vertex degree equality) descriptors derived from their structures.
Carlson, Jenny S; Giannitti, Federico; Valkiūnas, Gediminas; Tell, Lisa A; Snipes, Joy; Wright, Stan; Cornel, Anthony J
2016-03-11
Avian malaria vector competence studies are needed to understand more succinctly complex avian parasite-vector relations. The lack of vector competence trials may be attributed to the difficulty of obtaining gametocytes for the majority of Plasmodium species and lineages. To conduct avian malaria infectivity assays for those Plasmodium spp. and lineages that are refractory to in vitro cultivation, it is necessary to obtain and preserve for short periods sufficient viable merozoites to infect naïve donor birds to be used as gametocyte donors to infect mosquitoes. Currently, there is only one described method for long-term storage of Plasmodium spp.-infected wild avian blood, and it is reliable at a parasitaemia of at least 1%. However, most naturally infected wild-caught birds have a parasitaemia of much less than 1%. To address this problem, a method for short-term storage of infected wild avian blood with low parasitaemia (even ≤ 0.0005%) has been explored and validated. To obtain viable infective merozoites, blood was collected from wild birds using a syringe containing the anticoagulant and red blood cell preservative citrate phosphate dextrose adenine solution (CPDA). Each blood sample was stored at 4 °C for up to 48 h, providing sufficient time to determine the species and parasitaemia of Plasmodium spp. in the blood by morphological examination before injection into donor canaries. Plasmodium spp.-infected blood was inoculated intravenously into canaries and, once infection was established, Culex stigmatosoma, Cx. pipiens and Cx. quinquefasciatus mosquitoes were allowed to feed on the infected canaries to validate the efficacy of this method for mosquito vector competence assays. Storage of Plasmodium spp.-infected donor blood at 4 °C yielded viable parasites for 48 h. All five experimentally infected canaries developed clinical signs and were infectious. Pathologic examination of three canaries that later died revealed splenic lesions typical of …
Directory of Open Access Journals (Sweden)
James C. K. Ng
2013-04-01
Full Text Available Successful vector-mediated plant virus transmission entails an intricate but poorly understood interplay of interactions among virus, vector, and plant. The complexity of these interactions requires continually improving/evaluating tools and methods for investigating the determinants that are central to mediating virus transmission. A recent study using an organic fluorophore (Alexa Fluor)-based immunofluorescent localization assay demonstrated that specific retention of Lettuce infectious yellows virus (LIYV) virions in the anterior foregut or cibarium of its whitefly vector is required for virus transmission. Continuous exposure of an organic fluorophore to high excitation light intensity can result in diminished or lost signals, potentially confounding the identification of important interactions associated with virus transmission. This limitation can be circumvented by incorporation of photostable fluorescent nanocrystals, such as quantum dots (QDs), into the assay. We have developed and evaluated a QD immunofluorescent labeling method for the in vitro and in situ localization of LIYV virions based on the recognition specificity of streptavidin-conjugated QD605 (S-QD605) for biotin-conjugated anti-LIYV IgG (B-αIgG). IgG biotinylation was verified in a blot overlay assay by probing SDS-PAGE-separated B-αIgG with S-QD605. Immunoblot analyses of LIYV using B-αIgG and S-QD605 resulted in a virus detection limit comparable to that of DAS-ELISA. In membrane feeding experiments, QD signals were observed in the anterior foregut or cibarium of virion-fed whitefly vectors but were absent in those of virion-fed whitefly non-vectors. Specific virion retention in whitefly vectors corresponded with successful virus transmission. A fluorescence photobleaching assay of viruliferous whiteflies fed B-αIgG and S-QD605 vs. those fed anti-LIYV IgG and Alexa Fluor 488-conjugated IgG revealed that the QD signal was stable and deteriorated ∼7- to 8-fold more slowly than that of Alexa Fluor 488.
Zhang, Jiyang; Zhang, Daibing; Zhang, Wei; Xie, Hongwei
2012-09-01
Online reversed-phase liquid chromatography (RPLC) contributes substantially to large-scale mass-spectrometry-based protein identification in proteomics. Retention time (RT) is important evidence that can be used to distinguish false-positive from true-positive peptide identifications. Because of the nonlinear concentration curve of the organic phase over the whole run time and the interactions among peptides, sequence-based RT prediction of peptides has low accuracy and is difficult to generalize in practice, and thus is less effective for the validation of peptide identifications. A serial and parallel support vector machine (SP-SVM) method was proposed to characterize the nonlinear effect of organic phase concentration and the interactions among peptides. The SP-SVM contains a support vector regression (SVR) model used only for training (named p-SVR) and four models (named C-SVM, 1-SVR, s-SVR, and n-SVR) for RT prediction. After the peptide chromatographic behavior is classified by C-SVM, 1-SVR and s-SVR are used to predict the peptide RT specifically to improve accuracy. The peptide RT is then normalized by n-SVR to characterize the peptide interactions. Prediction accuracy was improved significantly by applying this method to the processing of a complex sample dataset. The coefficient of determination between predicted and experimental RTs reaches 0.95; the prediction error was less than 20% of the total LC run time in more than 95% of cases, and less than 10% of the total LC run time in more than 70% of cases. The performance of this model is the best reported so far. More importantly, the SP-SVM method provides a framework for taking into account the interactions among peptides in chromatographic separation, and its performance can be improved further by introducing new data processing and experimental strategies.
Zhang, Linna; Li, Gang; Sun, Meixiu; Li, Hongxiao; Wang, Zhennan; Li, Yingxin; Lin, Ling
2017-11-01
Identifying whole blood as either human or nonhuman is an important responsibility for import-export ports and inspection and quarantine departments. Analytical methods and DNA testing are usually destructive. Previous studies demonstrated that visible diffuse reflectance spectroscopy can achieve noncontact discrimination of human and nonhuman blood. An appropriate method for calibration-set selection is very important for a robust quantitative model. In this paper, the Random Selection (RS) method and the Kennard-Stone (KS) method were applied to select samples for the calibration set. Moreover, a proper chemometric method can greatly improve the performance of a classification or quantification model. Partial Least Squares Discriminant Analysis (PLSDA) is commonly used to identify blood species with spectroscopic methods, and Least Squares Support Vector Machine (LSSVM) has proved well suited to discrimination analysis. In this research, both PLSDA and LSSVM were used for human blood discrimination. Compared with PLSDA, LSSVM enhanced the performance of the identification models. The overall results showed that LSSVM was more feasible for identifying human and animal blood species, and demonstrated that LSSVM is a reliable, robust, and more accurate method for human blood identification.
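The Kennard-Stone selection step can be sketched directly. This is the standard KS rule (seed with the two mutually farthest samples, then greedily add the sample whose minimum distance to the selected set is largest); the data are synthetic stand-ins for spectra:

```python
# Sketch of Kennard-Stone calibration-set selection.
import numpy as np

def kennard_stone(X, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    X = np.asarray(X, dtype=float)
    # Full pairwise Euclidean distance matrix
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Seed with the two mutually farthest samples
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_select:
        # Distance of each remaining sample to its nearest selected sample
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        selected.append(remaining.pop(int(np.argmax(min_d))))
    return selected

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 6))          # e.g. 50 spectra reduced to 6 features
cal = kennard_stone(X, 10)            # indices for the calibration set
```

Selecting by maximal coverage of the feature space is what makes KS calibration sets more representative than plain random selection.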
Yang, Fengping; Xiao, Fangfei
2017-03-01
Existing control methods for the inherent neutral-point voltage imbalance of the three-level NPC inverter fall into hardware control and software control; hardware control is rarely used because of its high cost. This paper presents a new compound control method based on the virtual space vector method and traditional hysteresis control of the neutral-point voltage. It compensates for the lack of neutral-point voltage feedback in virtual space vector control and for the blind area of hysteresis control, and it controls both the deviation and the ripple of the neutral-point voltage. The accuracy of the method is demonstrated by simulation.
Percutaneous Vaccination as an Effective Method of Delivery of MVA and MVA-Vectored Vaccines.
Directory of Open Access Journals (Sweden)
Clement A Meseda
Full Text Available The robustness of immune responses to an antigen could be dictated by the route of vaccine inoculation. Traditional smallpox vaccines, essentially vaccinia virus strains, that were used in the eradication of smallpox were administered by percutaneous inoculation (skin scarification). Modified vaccinia virus Ankara (MVA) is licensed as a smallpox vaccine in Europe and Canada and is currently undergoing clinical development in the United States. MVA is also being investigated as a vector for the delivery of heterologous genes for prophylactic or therapeutic immunization. Since MVA is replication-deficient, MVA and MVA-vectored vaccines are often inoculated through the intramuscular, intradermal or subcutaneous routes. Vaccine inoculation via these routes requires the use of injection needles, and an estimated 10 to 20% of the population of the United States has needle phobia. Following an observation in our laboratory that a replication-deficient recombinant vaccinia virus derived from the New York City Board of Health strain elicited protective immune responses in a mouse model upon inoculation by tail scarification, we investigated whether MVA and MVA recombinants can elicit protective responses following percutaneous administration in mouse models. Our data suggest that MVA administered by percutaneous inoculation elicited vaccinia-specific antibody responses and protected mice from lethal vaccinia virus challenge at levels comparable to or better than subcutaneous or intramuscular inoculation. High titers of specific neutralizing antibodies were elicited in mice inoculated with a recombinant MVA expressing the herpes simplex type 2 glycoprotein D after scarification. Similarly, a recombinant MVA expressing the hemagglutinin of attenuated influenza virus rgA/Viet Nam/1203/2004 (H5N1) elicited protective immune responses when administered at low doses by scarification. Taken together, our data suggest that
Xu, A; Zhang, Y; Ran, T; Liu, H; Lu, S; Xu, J; Xiong, X; Jiang, Y; Lu, T; Chen, Y
2015-01-01
Bruton's tyrosine kinase (BTK) plays a crucial role in B-cell activation and development, and has emerged as a new molecular target for the treatment of autoimmune diseases and B-cell malignancies. In this study, two- and three-dimensional quantitative structure-activity relationship (2D- and 3D-QSAR) analyses were performed on a series of pyridine- and pyrimidine-based BTK inhibitors by means of genetic algorithm-optimized multivariate adaptive regression splines (GA-MARS) and comparative molecular similarity index analysis (CoMSIA). Here, we propose a modified MARS algorithm to develop 2D-QSAR models. The top-ranked models showed satisfactory statistics (2D-QSAR: Q² = 0.884, r² = 0.929, r²pred = 0.878; 3D-QSAR: q² = 0.616, r² = 0.987, r²pred = 0.905). Key descriptors selected by 2D-QSAR were in good agreement with the conclusions of 3D-QSAR, and the 3D-CoMSIA contour maps facilitated interpretation of the structure-activity relationship. A new molecular database was generated by molecular fragment replacement (MFR) and further evaluated with GA-MARS and CoMSIA prediction. Twenty-five pyridine and pyrimidine derivatives were finally selected as novel potential BTK inhibitors for further study. These results also demonstrate that our method can be a very efficient tool for the discovery of novel potent BTK inhibitors.
Arsenault, Louis-François; Neuberg, Richard; Hannah, Lauren A.; Millis, Andrew J.
2017-11-01
We present a supervised machine learning approach to the inversion of Fredholm integrals of the first kind as they arise, for example, in the analytic continuation problem of quantum many-body physics. The approach provides a natural regularization for the ill-conditioned inverse of the Fredholm kernel, as well as an efficient and stable treatment of constraints. The key observation is that the stability of the forward problem permits the construction of a large database of outputs for physically meaningful inputs. Applying machine learning to this database generates a regression function of controlled complexity, which returns approximate solutions for previously unseen inputs; the approximate solutions are then projected onto the subspace of functions satisfying relevant constraints. Under standard error metrics the method performs as well or better than the Maximum Entropy method for low input noise and is substantially more robust to increased input noise. We suggest that the methodology will be similarly effective for other problems involving a formally ill-conditioned inversion of an integral operator, provided that the forward problem can be efficiently solved.
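The database-plus-regression idea can be sketched in miniature, assuming a Gaussian blur as a stand-in Fredholm kernel and ridge regression as the learned inverse map (the paper's kernels, constraints, and regression family differ; a clipping step stands in for the constraint projection):

```python
# Sketch: learn a regularized inverse of an ill-conditioned forward map
# from a database of (input, output) pairs generated by the stable forward problem.
import numpy as np
from sklearn.linear_model import Ridge

n = 60
t = np.linspace(0, 1, n)
# Stand-in Fredholm kernel: a row-normalized Gaussian blur (hard to invert directly)
K = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * 0.05 ** 2))
K /= K.sum(axis=1, keepdims=True)

def random_input(rng):
    """Physically plausible input: a nonnegative sum of a few Gaussian peaks."""
    x = np.zeros(n)
    for _ in range(int(rng.integers(1, 4))):
        c, w, a = rng.uniform(0.1, 0.9), rng.uniform(0.02, 0.1), rng.uniform(0.5, 2.0)
        x += a * np.exp(-((t - c) ** 2) / (2 * w ** 2))
    return x

rng = np.random.default_rng(2)
X_true = np.array([random_input(rng) for _ in range(2000)])    # database of inputs
Y = X_true @ K.T + rng.normal(0, 1e-3, (2000, n))              # noisy forward outputs

inverse = Ridge(alpha=1e-4).fit(Y, X_true)   # learned, implicitly regularized inverse

errs = []
for _ in range(50):                          # in-distribution test inputs
    xt = random_input(rng)
    yt = K @ xt + rng.normal(0, 1e-3, n)
    xr = np.clip(inverse.predict(yt[None, :])[0], 0.0, None)   # constraint projection
    errs.append(np.linalg.norm(xr - xt) / np.linalg.norm(xt))
mean_rel_err = float(np.mean(errs))
```

Because the regression is trained only on physically meaningful inputs, it recovers structure that a naive pseudo-inverse of K would drown in amplified noise.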
Hattori, Yusuke; Otsuka, Makoto
2017-05-30
In the pharmaceutical industry, the implementation of continuous manufacturing has been widely promoted in lieu of the traditional batch manufacturing approach. More specifically, in recent years, the innovative concept of feed-forward control has been introduced in relation to process analytical technology. In the present study, we successfully developed a feed-forward control model for the tablet compression process by integrating data obtained from near-infrared (NIR) spectra and the physical properties of granules. In batch manufacturing, granules with the desired properties are routinely prepared through manual control of process parameters; continuous manufacturing, by contrast, demands the automatic determination of these parameters. Here, we proposed the development of a control model using the partial least squares regression (PLSR) method. The most significant feature of this method is the use of a dataset integrating both the NIR spectra and the physical properties of the granules. Using our model, we determined that product properties such as tablet weight and thickness need to be included as independent variables in the PLSR analysis in order to predict unknown process parameters. Copyright © 2017 Elsevier B.V. All rights reserved.
Tang, Chunxiao; Sun, Wenfei; He, Hayi; Li, Hongqiang; Li, Enbang
2017-07-01
Spurious vectors (also called "outliers") in particle image velocimetry (PIV) experiments can be classified into two categories according to their spatial distribution: scattered and clustered outliers. Most currently used validation and correction methods treat these two kinds of outliers together, without discrimination. In this paper, we propose a new technique based on a penalized least-squares (PLS) method, which allows automatic classification of flows with different types of outliers. PIV vector fields containing scattered outliers are detected and corrected using higher-order differentials, while lower-order differentials are used for flows with clustered outliers. The order of the differentials is determined adaptively by generalized cross-validation and outlier classification. A simple method for calculating the eigenvalues of the different orders is also developed to speed up computation. The performance of the proposed method is demonstrated on four different velocity fields, and the results show that it works better than conventional methods, especially when the number of outliers is large.
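A 1-D sketch of the penalized least-squares idea: smooth with a Whittaker-style difference-penalty smoother, then flag points whose residual exceeds a robust threshold. The paper works on 2-D vector fields and chooses the difference order adaptively; the signal, penalty weight, and threshold here are illustrative:

```python
# Penalized least-squares outlier flagging on a 1-D "velocity" profile.
import numpy as np

def whittaker_smooth(y, lam=100.0, order=2):
    """Solve (I + lam * D^T D) z = y, with D the order-th difference matrix."""
    n = len(y)
    D = np.diff(np.eye(n), order, axis=0)
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

rng = np.random.default_rng(4)
t = np.linspace(0, 2 * np.pi, 200)
u = np.sin(t) + rng.normal(0, 0.05, t.size)   # smooth profile plus measurement noise
bad = [30, 31, 120]                           # two clustered + one scattered outlier
u[bad] += 2.0

z = whittaker_smooth(u, lam=100.0, order=2)
resid = u - z
mad = np.median(np.abs(resid - np.median(resid)))          # robust residual scale
outliers = np.flatnonzero(np.abs(resid - np.median(resid)) > 6 * mad)
```

In practice the flagged points (which may include some neighbors of a spike) would then be replaced by the smoothed values and the validation repeated.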
Kahane, Leo H
2007-01-01
Using a friendly, nontechnical approach, the Second Edition of Regression Basics introduces readers to the fundamentals of regression. Accessible to anyone with an introductory statistics background, this book builds from a simple two-variable model to a model of greater complexity. Author Leo H. Kahane weaves four engaging examples throughout the text to illustrate not only the techniques of regression but also how this empirical tool can be applied in creative ways to consider a broad array of topics. New to the Second Edition Offers greater coverage of simple panel-data estimation:
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association. Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
Clement, Dominic; Gruber, Nicolas
2017-04-01
Major progress has been made by the international community (e.g., GO-SHIP, IOCCP, IMBER/SOLAS carbon working groups) in recent years by collecting and providing homogenized datasets for carbon and other biogeochemical variables in the surface ocean (SOCAT) and interior ocean (GLODAPv2). Together with previous efforts, this has enabled the community to develop methods to assess changes in the ocean carbon cycle through time. Of particular interest is the determination of the decadal change in the anthropogenic CO2 inventory solely based on in-situ measurements from at least two time periods in the interior ocean. However, all such methods face the difficulty of a scarce dataset in both space and time, making the use of appropriate interpolation techniques in time and space a crucial element of any method. Here we present a new method based on the parameter C*, whose variations reflect the total change in dissolved inorganic carbon (DIC) driven by the exchange of CO2 across the air-sea interface. We apply the extended Multiple Linear Regression method (Friis et al., 2005) on C* in order (1) to calculate the change in anthropogenic CO2 from the original DIC/C* measurements, and (2) to interpolate the result onto a spatial grid using other biogeochemical variables (T,S,AOU, etc.). These calculations are made on isopycnal slabs across whole ocean basins. In combination with the transient steady state assumption (Tanhua et al., 2007) providing a temporal correction factor, we address the spatial and temporal interpolation challenges. Using synthetic data from a hindcast simulation with a global ocean biogeochemistry model (NCAR-CCSM with BEC), we tested the method for robustness and accuracy in determining ΔCant. We will present data-based results for all ocean basins, with the most recent estimate of a global uptake of 32±6 Pg C between 1994 and 2007, indicating an uptake rate of 2.5±0.5 Pg C yr-1 for this time period. These results are compared with regional and
Chen, Xiaol; Guo, Bei; Tuo, Jinliang; Zhou, Ruixin; Lu, Yang
2017-08-01
Nowadays, increasing attention is being paid to noise reduction in household refrigerator compressors. This paper establishes a sound field bounded by the compressor shell and ISO 3744 standard field points. The acoustic transfer vectors (ATV) in the sound field radiated by a refrigerator compressor shell were calculated and agree well with test results. The compressor shell surface was then divided into several parts. Based on the acoustic transfer vector approach, the sound pressure contribution of each part to the field points and its sound power contribution to the sound field were calculated. To characterize the noise radiation in the sound field, sound pressure cloud charts were analyzed and the contribution curves of each part at different frequencies were obtained. Meanwhile, the sound power contribution of each part at different frequencies was analyzed to identify the parts contributing the most sound power. Through this acoustic contribution analysis, the parts of the compressor shell radiating the most noise were determined. This paper provides a credible and effective approach to the structural optimization of refrigerator compressor shells, which is meaningful for noise and vibration reduction.
Ismail, B; Anil, Manjula
2014-01-01
With modernization, rapid urbanization and industrialization, the price that society is paying is a tremendous load of "non-communicable" diseases, referred to as "lifestyle diseases". Coronary artery disease (CAD), one of the lifestyle diseases that manifests at a younger age, can have devastating consequences for the individual, the family and society. Prevention of these diseases can be pursued by studying the risk factors and analyzing and interpreting them using various statistical methods. The aims were to determine, using logistic regression, the relative contribution of independent variables according to the intensity of their influence (proven by statistical significance) upon the dependent cardiovascular risk scores, and to assess whether nonparametric smoothing of the cardiovascular risk scores can serve as a better statistical method than the existing ones. The study includes 498 students in the age group of 18-29 years. Prevalence of overweight (BMI 23-25 kg/m²) and obesity (BMI > 25 kg/m²) was found among individuals aged 22 years and above. Nonsmokers had decreased odds (OR = 0.041, CI = 0.015-0.107), while increases in LDL cholesterol (OR = 1.05, CI = 1.021-1.055) and BMI (OR = 1.42, CI = 1.244-1.631) contributed significantly to the risk of CVD. Local students had decreased odds of developing CVD in the next 10 years (OR = 0.27, CI = 0.092-0.799) compared with students residing in hostels or as paying guests. Copyright © 2014 Cardiological Society of India. Published by Elsevier B.V. All rights reserved.
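Odds ratios like those above come from exponentiating logistic-regression coefficients. A sketch on synthetic risk-factor data (the effect sizes and variable set are invented, not the study's):

```python
# Sketch: odds ratios from a logistic regression of a binary outcome on risk factors.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 2000
smoker = rng.integers(0, 2, n)
ldl = rng.normal(120, 25, n)          # mg/dL
bmi = rng.normal(24, 3, n)            # kg/m^2

# Assumed true model: smoking, LDL and BMI all raise risk
logit = -9 + 1.2 * smoker + 0.03 * ldl + 0.15 * bmi
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([smoker, ldl, bmi])
model = LogisticRegression(C=1e6, max_iter=2000).fit(X, y)   # ~unpenalized fit
odds_ratios = np.exp(model.coef_[0])   # OR per unit change of each predictor
```

An OR above 1 marks a factor that raises the odds of the outcome; an OR below 1 (as for nonsmoking in the study) marks a protective factor.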
Directory of Open Access Journals (Sweden)
ZUO Xiang
2017-08-01
Full Text Available Existing near-field methods for locating underwater noise sources usually assume that the measurement surface is flat, which makes underwater noise testing difficult for targets with a cylindrical distribution. At the same time, conventional near-field focused beamforming has low spatial resolution when used to locate an underwater noise source with a cylindrical distribution, and near-field location methods based on sound-pressure arrays suffer from left-right ambiguity. To solve these problems, a near-field measurement model taking the cylindrical distribution as the measurement surface is established and, by combining the unilateral directivity of the vector hydrophone with the high-resolution characteristics of the MUSIC algorithm, a near-field, high-resolution location method for cylindrical distributions based on vector sound pressure is proposed and verified by computer simulation. The results show that the method can locate an underwater noise source with a smaller array aperture, enabling it to be used to locate and recognize the noise sources of complex, large-scale cylindrical systems.
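The subspace idea behind MUSIC can be sketched in a deliberately simplified setting: a far-field, scalar-sensor uniform line array with one source (the paper's method is near-field, cylindrical, and uses vector hydrophones, but the eigendecomposition-and-pseudospectrum machinery is the same):

```python
# Simplified narrowband MUSIC bearing estimation on a uniform line array.
import numpy as np

rng = np.random.default_rng(6)
M, d = 10, 0.5                      # number of sensors, spacing in wavelengths
true_doa = 20.0                     # source bearing, degrees
snapshots = 400

def steering(theta_deg):
    """Far-field steering vector of the uniform line array."""
    k = 2 * np.pi * d * np.sin(np.deg2rad(theta_deg))
    return np.exp(1j * k * np.arange(M))

s = rng.normal(size=snapshots) + 1j * rng.normal(size=snapshots)       # source signal
X = np.outer(steering(true_doa), s)
X += 0.1 * (rng.normal(size=X.shape) + 1j * rng.normal(size=X.shape))  # sensor noise

R = X @ X.conj().T / snapshots      # sample covariance matrix
w, V = np.linalg.eigh(R)            # eigenvalues in ascending order
En = V[:, :-1]                      # noise subspace (one source assumed)

grid = np.arange(-90.0, 90.0, 0.25)
P = np.array([1.0 / np.linalg.norm(En.conj().T @ steering(th)) ** 2 for th in grid])
est = grid[np.argmax(P)]            # pseudospectrum peak = bearing estimate
```

The pseudospectrum peaks where the steering vector is nearly orthogonal to the noise subspace, which is what gives MUSIC its resolution advantage over focused beamforming.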
2014-01-01
Background Meta-regression is becoming increasingly used to model study level covariate effects. However this type of statistical analysis presents many difficulties and challenges. Here two methods for calculating confidence intervals for the magnitude of the residual between-study variance in random effects meta-regression models are developed. A further suggestion for calculating credible intervals using informative prior distributions for the residual between-study variance is presented. Methods Two recently proposed and, under the assumptions of the random effects model, exact methods for constructing confidence intervals for the between-study variance in random effects meta-analyses are extended to the meta-regression setting. The use of Generalised Cochran heterogeneity statistics is extended to the meta-regression setting and a Newton-Raphson procedure is developed to implement the Q profile method for meta-analysis and meta-regression. WinBUGS is used to implement informative priors for the residual between-study variance in the context of Bayesian meta-regressions. Results Results are obtained for two contrasting examples, where the first example involves a binary covariate and the second involves a continuous covariate. Intervals for the residual between-study variance are wide for both examples. Conclusions Statistical methods, and R computer software, are available to compute exact confidence intervals for the residual between-study variance under the random effects model for meta-regression. These frequentist methods are almost as easily implemented as their established counterparts for meta-analysis. Bayesian meta-regressions are also easily performed by analysts who are comfortable using WinBUGS. Estimates of the residual between-study variance in random effects meta-regressions should be routinely reported and accompanied by some measure of their uncertainty. Confidence and/or credible intervals are well-suited to this purpose. PMID:25196829
Directory of Open Access Journals (Sweden)
Liu Yi
2012-10-01
Full Text Available Abstract Background Gene targeting is a powerful method that can be used for examining the functions of genes. Traditionally, the construction of knockout (KO) vectors requires an amplification step to obtain two large, homologous fragments of genomic DNA. Restriction enzymes that cut at unique recognition sites and numerous cloning steps are then required, which is often a time-consuming and frustrating process. Results We have developed a one-step cloning method for inserting two arms into a KO vector using exonuclease III. We modified an adeno-associated virus KO shuttle vector (pTK-LoxP-NEO-AAV) to yield pAAV-LIC, which contained two cassettes at the two multiple-cloning sites. The vector was digested with EcoRV to give two fragments. The two homologous arms, which had an overlap of 16 bases with the ends of the vector fragments, were amplified by polymerase chain reaction. After purification, the four fragments were mixed and treated with exonuclease III, then transformed into Escherichia coli to obtain the desired clones. Using this method, we constructed SirT1 and HDAC2 KO vectors, which were used to establish SirT1 KO cells from the colorectal cancer cell line HCT116 and HDAC2 KO cells from the colorectal cancer cell line DLD1. Conclusions Our method is a fast, simple, and efficient cloning technique with great potential for the high-throughput construction of KO vectors.
Directory of Open Access Journals (Sweden)
Ousmane Coulibaly
2016-01-01
Full Text Available We use the multiple linear regression method to analyse meteorological data for eight cities in Burkina Faso. A correlation between the monthly mean daily global solar radiation on a horizontal surface and five meteorological and geographical parameters (the mean daily extraterrestrial solar radiation intensity, the average daily ratio of sunshine duration, the mean daily relative humidity, the mean daily maximum air temperature, and the sine of the solar declination angle) was examined. A second correlation is established for the entire country using the monthly mean global solar radiation on a horizontal surface and the following climatic variables: the average daily ratio of sunshine duration, the latitude, and the longitude. The results show that the correlation coefficients vary between 0.96 and 0.99 depending on the station, while the relative errors range from −3.16% (Pô) to 3.65% (Dédougou). The maximum RMSD value, 312.36 kJ/m², is obtained at Dori, which receives the strongest radiation. Across all cities, the MBD values are found to be within the acceptable margin.
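The country-wide correlation can be sketched as an ordinary multiple linear regression of radiation on sunshine fraction, latitude, and longitude; the data and coefficients below are synthetic placeholders, not the paper's fit:

```python
# Sketch: regress monthly-mean global radiation H on sunshine fraction,
# latitude and longitude, then report the correlation coefficient.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
n = 96                                     # e.g. 8 stations x 12 months
sunshine = rng.uniform(0.4, 0.9, n)        # S/S0, ratio of sunshine duration
lat = rng.uniform(9.5, 15.0, n)            # degrees N (Burkina Faso range)
lon = rng.uniform(-5.5, 2.5, n)            # degrees E

# Hypothetical linear relation with noise (coefficients are invented)
H = 10 + 12 * sunshine + 0.4 * lat - 0.1 * lon + rng.normal(0, 0.3, n)

X = np.column_stack([sunshine, lat, lon])
fit = LinearRegression().fit(X, H)
r = np.corrcoef(fit.predict(X), H)[0, 1]   # correlation coefficient, as in the paper
```

A correlation coefficient near 1, together with small relative errors and RMSD/MBD, is what the abstract uses to judge the fit station by station.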
Energy Technology Data Exchange (ETDEWEB)
Lee, Sang Dae; Lohumi, Santosh; Cho, Byoung Kwan [Dept. of Biosystems Machinery Engineering, Chungnam National University, Daejeon (Korea, Republic of); Kim, Moon Sung [United States Department of Agriculture Agricultural Research Service, Washington (United States); Lee, Soo Hee [Life and Technology Co.,Ltd., Hwasung (Korea, Republic of)
2014-08-15
This study was conducted to develop a non-destructive method for detecting adulterated powder products using Raman spectroscopy and partial least squares regression (PLSR). Garlic and ginger powder, which are used as natural seasonings and in health supplement foods, were selected for this experiment. Samples were adulterated with corn starch in concentrations of 5-35%. PLSR models for the adulterated garlic and ginger powders were developed and their performance evaluated using cross-validation. The R²c and SEC of the optimal PLSR models were 0.99 and 2.16 for the garlic powder samples, and 0.99 and 0.84 for the ginger samples, respectively. The variable importance in projection (VIP) score is a useful and simple tool for evaluating the importance of each variable in a PLSR model. After pre-selection using VIP scores, the Raman spectral data were reduced by one-third. New PLSR models, based on the reduced set of wavelengths selected by the VIP-score technique, gave good predictions for the adulterated garlic and ginger powder samples.
Error Concealment Method Based on Motion Vector Prediction Using Particle Filters
Directory of Open Access Journals (Sweden)
B. Hrusovsky
2011-09-01
Full Text Available Video transmitted over an unreliable environment, such as a wireless channel or, in general, any network with an unreliable transport protocol, faces the loss of video packets due to network congestion and various kinds of noise. The problem becomes more important with highly efficient video codecs, since visual quality degradation can propagate into subsequent frames due to the redundancy elimination used to obtain high compression ratios. Because real-time video stream transmission is limited by the transmission channel delay, it is not possible to retransmit all faulty or lost packets, so these defects must be concealed. To reduce the undesirable effects of information loss, the lost data is usually estimated from the received data, which is generally known as the error concealment problem. This paper discusses packet loss modeling in order to simulate losses during video transmission, packet loss analysis, and the impact of losses on motion vectors.
Directory of Open Access Journals (Sweden)
Giuseppe Mercurio
2014-01-01
Full Text Available We present an analysis method for normal-incidence x-ray standing wave (NIXSW) data that allows detailed adsorption geometries of complex molecules to be retrieved. This method (Fourier vector analysis) is based on comparing both the coherence and phase of NIXSW data to NIXSW simulations of different molecular geometries as the relevant internal degrees of freedom are tuned. We introduce this analysis method using the prototypical molecular switch azobenzene (AB) adsorbed on the Ag(111) surface as a model system. The application of the Fourier vector analysis to AB/Ag(111) provides, on the one hand, detailed adsorption geometries including dihedral angles, and on the other hand, insights into the dynamics of the molecules and their bonding to the metal substrate. This analysis scheme is generally applicable to any adsorbate, is necessary for molecules with potentially large distortions, and will be particularly valuable for molecules whose distortion on adsorption can be mapped onto a limited number of internal degrees of freedom.
Directory of Open Access Journals (Sweden)
Muhammad Saiedullah
2015-01-01
Full Text Available Background: Friedewald's formula (FF) is used worldwide to calculate low-density lipoprotein cholesterol (LDL-chol), but it has several shortcomings: overestimation at lower triglyceride (TG) concentrations and underestimation at higher concentrations. In FF, the TG to very low-density lipoprotein cholesterol (VLDL-chol) ratio (TG/VLDL-chol) is considered constant, but in practice it is not a fixed value. Recently, by analyzing lipid profiles in a large population, continuously adjustable values of TG/VLDL-chol were used to derive a novel method (NM) for the calculation of LDL-chol. Objective: The aim of this study was to evaluate the performance of the novel method compared with direct measurement and a regression equation (RE) developed for the Bangladeshi population. Materials and Methods: In this cross-sectional comparative study we used the lipid profiles of 955 adult Bangladeshi subjects. Total cholesterol (TC), TG, HDL-chol and LDL-chol were measured by direct methods using automation. LDL-chol was also calculated by NM and RE. LDL-chol calculated by NM and RE was compared with measured LDL-chol by two-tailed paired t test, Pearson's correlation test, bias against measured LDL-chol by the Bland-Altman test, accuracy within ±5% and ±12% of measured LDL-chol, and inter-rater agreement with measured LDL-chol at different cut-off values. Results: The mean values of LDL-chol were 110.7 ± 32.0 mg/dL for direct measurement, 111.9 ± 34.8 mg/dL for NM and 113.2 ± 31.7 mg/dL for RE. Mean values of calculated LDL-chol by both NM and RE differed from that of measured LDL-chol (p130 mg/dL were 0.816 vs 0.815, 0.637 vs 0.649 and 0.791 vs 0.791 for NM and RE respectively. Conclusion: This study reveals that the NM and the RE developed for the Bangladeshi population have similar performance and can be used for the calculation of LDL-chol.
Shinnaka, Shinji; I, Daisuke
This paper shows the applicability of the generalized integral-type PLL method to sensorless vector control of permanent magnet synchronous motors using a full-order state observer. Previous adaptive identification algorithms have identified rotor speed as a system parameter, separately from the rotor phase treated as a system state. The PLL method instead exploits the integral-derivative relation between phase and speed, and consequently allows a simpler realization of sensorless vector control.
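The phase/speed integral relation the PLL exploits can be sketched with a toy tracker driven by a PI law (the gains and sinusoidal phase detector are assumptions; this is not the paper's full-order observer):

```python
# Toy integral-type PLL: jointly estimate phase and speed using
# d(theta)/dt = omega, with a PI law driven by the phase error.
import numpy as np

dt = 1e-4                                   # integration step, s
omega_true = 2 * np.pi * 50.0               # true electrical speed, rad/s
kp, ki = 400.0, 4.0e4                       # assumed PI gains (well damped)

theta = 0.0                                 # true phase
theta_hat, omega_hat = 0.0, 0.0             # estimates, deliberately wrong at start
for _ in range(20000):                      # 2 s of simulated time
    theta += omega_true * dt
    err = np.sin(theta - theta_hat)         # sinusoidal phase detector
    omega_hat += ki * err * dt              # integral path yields the speed estimate
    theta_hat += (omega_hat + kp * err) * dt  # phase integrates the estimated speed

speed_err = abs(omega_hat - omega_true) / omega_true
```

Because speed is obtained as the integral state of the loop rather than as a separately identified parameter, no additional adaptive identification layer is needed.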
Vozinaki, Anthi Eirini K.; Karatzas, George P.; Sibetheros, Ioannis A.; Varouchakis, Emmanouil A.
2014-05-01
Damage curves are the most significant component of flood loss estimation models, and their development is quite complex. Two types of damage curves exist: historical and synthetic. Historical curves are developed from loss data from actual flood events; however, due to the scarcity of historical data, synthetic damage curves can alternatively be developed, relying on the analysis of expected damage under certain hypothetical flooding conditions. A synthetic approach was developed and presented in this work for the development of damage curves, which are subsequently used as the basic input to a flood loss estimation model. A questionnaire-based survey took place among practicing and research agronomists in order to generate rural loss data based on the respondents' loss estimates for several flood-condition scenarios. In addition, a similar questionnaire-based survey took place among building experts, i.e. civil engineers and architects, in order to generate loss data for the urban sector. By answering the questionnaire, the experts were in essence expressing their opinion on how damage to various crop types or building types is related to a range of values of flood inundation parameters, such as floodwater depth and velocity. However, the loss data compiled from the completed questionnaires were not sufficient for the construction of workable damage curves; to overcome this problem, a Weighted Monte Carlo method was implemented in order to generate extra synthetic datasets with statistical properties identical to those of the questionnaire-based data. The data generated by the Weighted Monte Carlo method were processed via Logistic Regression techniques in order to develop accurate logistic damage curves for the rural and the urban sectors. A Python-based code was developed, which combines the Weighted Monte Carlo method and the Logistic Regression analysis into a single code (WMCLR Python code). Each WMCLR code execution
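The WMCLR combination can be sketched in a few lines: resample the expert responses in proportion to weights (the weighted Monte Carlo step), then fit a logistic damage curve on the enlarged dataset. All depths, labels, and weights below are invented for illustration, not survey data:

```python
# Sketch: weighted Monte Carlo augmentation + logistic damage curve.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
# Hypothetical expert responses: (floodwater depth in m, damaged yes/no), plus weights
depth = np.array([0.1, 0.3, 0.5, 0.8, 1.0, 1.5, 2.0, 2.5])
damaged = np.array([0, 0, 0, 1, 1, 1, 1, 1])
weight = np.array([3, 2, 3, 1, 2, 3, 2, 2], dtype=float)   # e.g. expert agreement

# Weighted Monte Carlo: draw synthetic records in proportion to the weights,
# with small jitter so the enlarged set keeps similar statistical properties
idx = rng.choice(len(depth), size=500, p=weight / weight.sum())
d_mc = depth[idx] + rng.normal(0, 0.05, 500)
y_mc = damaged[idx]

curve = LogisticRegression().fit(d_mc[:, None], y_mc)      # logistic damage curve
p_at_1m = curve.predict_proba([[1.0]])[0, 1]               # damage probability at 1 m
p_at_02m = curve.predict_proba([[0.2]])[0, 1]
```

Evaluating the fitted curve over a grid of depths yields the damage curve fed into the loss estimation model.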
Varying-coefficient functional linear regression
Wu, Yichao; Fan, Jianqing; Müller, Hans-Georg
2010-01-01
Functional linear regression analysis aims to model regression relations which include a functional predictor. The analog of the regression parameter vector or matrix in conventional multivariate or multiple-response linear regression models is a regression parameter function in one or two arguments. If, in addition, one has scalar predictors, as is often the case in applications to longitudinal studies, the question arises how to incorporate these into a functional regression model. We study...
Directory of Open Access Journals (Sweden)
Long Jiao
2015-05-01
Full Text Available The quantitative structure-property relationship (QSPR) for the boiling point (Tb) of polychlorinated dibenzo-p-dioxins and polychlorinated dibenzofurans (PCDD/Fs) was investigated. The molecular distance-edge vector (MDEV) index was used as the structural descriptor, and the quantitative relationship between the MDEV index and Tb was modeled using multivariate linear regression (MLR) and an artificial neural network (ANN), respectively. Leave-one-out cross-validation and external validation were carried out to assess the prediction performance of the developed models. For the MLR method, the prediction root mean square relative error (RMSRE) of leave-one-out cross-validation and external validation was 1.77 and 1.23, respectively; for the ANN method, it was 1.65 and 1.16. A quantitative relationship between the MDEV index and the Tb of PCDD/Fs was demonstrated, and both MLR and ANN are practicable for modeling it. The developed MLR and ANN models can be used to predict the Tb of PCDD/Fs, and the Tb of each PCDD/F was predicted accordingly.
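The leave-one-out RMSRE used to validate the MLR model can be sketched as follows, with synthetic stand-ins for the MDEV descriptors and boiling points:

```python
# Sketch: leave-one-out cross-validated RMSRE (%) for a multivariate linear model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(10)
n = 40
X = rng.normal(size=(n, 2))                 # placeholder MDEV-style descriptors
tb = 600 + 40 * X[:, 0] + 25 * X[:, 1] + rng.normal(0, 5, n)   # boiling points, K

rel_err = []
for train, test in LeaveOneOut().split(X):
    m = LinearRegression().fit(X[train], tb[train])   # fit on all but one compound
    pred = m.predict(X[test])[0]
    rel_err.append((pred - tb[test][0]) / tb[test][0])

rmsre_pct = 100 * float(np.sqrt(np.mean(np.square(rel_err))))
```

Leave-one-out is a natural choice here because QSPR datasets of congener families are typically small.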
Silhavy, Radek; Silhavy, Petr; Prokopova, Zdenka
2017-01-01
This study investigates the significance of use case points (UCP) variables and the influence of the complexity of multiple linear regression models on software size estimation and accuracy. Stepwise multiple linear regression models and residual analysis were used to analyse the impact of model complexity. The impact of each variable was studied using correlation analysis. The estimated size of software depends mainly on the values of the weights of unadjusted UCP, which represent a number o...
Ferragina, A; de los Campos, G; Vazquez, A I; Cecchinato, A; Bittante, G
2015-11-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict "difficult-to-predict" dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm−1 were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. Accuracy increased in moving from
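As a rough illustration of the shrinkage idea behind Bayesian ridge regression (Bayes RR), the sketch below computes the ridge/MAP solution on synthetic spectra-like data. This is not BGLR: the full Bayesian model would also estimate the variance hyper-parameters, which are held fixed here, and the spectra are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "spectra": 60 samples x 200 wavenumbers; the trait depends on a
# few bands plus noise (a stand-in for FTIR milk spectra, which we don't have).
n, p = 60, 200
X = rng.normal(size=(n, p))
true_w = np.zeros(p)
true_w[[10, 50, 120]] = [1.5, -2.0, 1.0]
y = X @ true_w + rng.normal(0.0, 0.5, size=n)

def ridge_map(X, y, lam):
    """MAP estimate under a Gaussian coefficient prior (Bayes RR with the
    variance hyper-parameters held fixed): w = (X'X + lam*I)^-1 X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# A stronger prior (larger lam) shrinks the coefficient vector harder,
# which is what stabilizes calibration when bands outnumber samples.
w_light = ridge_map(X, y, lam=1.0)
w_heavy = ridge_map(X, y, lam=100.0)
```

Bayes A and Bayes B replace the common Gaussian prior with per-coefficient variances (and, for Bayes B, a point mass at zero), which is what enables variable selection.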
Choi, Giehae; Bell, Michelle L.; Lee, Jong-Tae
2017-04-01
The land-use regression (LUR) approach to estimating the levels of ambient air pollutants is becoming popular due to its high validity in predicting small-area variations. However, only a few studies have been conducted in Asian countries, and much less research has been conducted on comparing the performances and applied estimates of different exposure assessments, including LUR. The main objectives of the current study were to conduct nitrogen dioxide (NO2) exposure assessment with four methods including LUR in the Republic of Korea, to compare the model performances, and to estimate the empirical NO2 exposures of a cohort. The study population was defined as the year 2010 participants of a government-supported cohort established for bio-monitoring in Ulsan, Republic of Korea. The annual ambient NO2 exposures of the 969 study participants were estimated with LUR, nearest station, inverse distance weighting, and ordinary kriging. Modeling was based on the annual NO2 average, traffic-related data, land-use data, and altitude of the 13 regularly monitored stations. The final LUR model indicated that area of transportation, distance to residential area, and area of wetland were important predictors of NO2. The LUR model explained 85.8% of the variation observed in the 13 monitoring stations of the year 2009. The LUR model outperformed the others based on leave-one-out cross-validation comparing the correlations and root-mean square error. All NO2 estimates ranged from 11.3 to 18.0 ppb, with that of LUR having the widest range. The NO2 exposure levels of the residents differed by demographics. However, the average was below the national annual guidelines of the Republic of Korea (30 ppb). The LUR models showed high performances in an industrial city in the Republic of Korea, despite the small sample size and limited data. Our findings suggest that the LUR method may be useful in similar settings in Asian countries where the target region is small and availability of data is
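Of the four exposure assessments compared, inverse distance weighting is the simplest to state explicitly. A minimal sketch, with invented station coordinates and annual NO2 means standing in for the 13 Ulsan stations:

```python
import numpy as np

def idw(station_xy, station_vals, target_xy, power=2.0):
    """Inverse distance weighting: the estimate at a target point is the
    distance-weighted average of the stations' annual means."""
    d = np.linalg.norm(station_xy - target_xy, axis=1)
    if np.any(d == 0.0):                 # target coincides with a station
        return float(station_vals[d == 0.0][0])
    w = 1.0 / d ** power
    return float(np.sum(w * station_vals) / np.sum(w))

# Invented station coordinates (km) and annual NO2 means (ppb); the study
# itself used 13 regulatory monitoring stations in Ulsan.
xy = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0], [5.0, 4.0]])
no2 = np.array([12.0, 16.0, 14.0, 18.0])
estimate = idw(xy, no2, np.array([1.0, 1.0]))
```

The nearest-station method is the `power -> infinity` limit of the same formula, which is one way to see why IDW produces smoother exposure surfaces.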
Bayesian nonlinear regression for large p small n problems
Chakraborty, Sounak
2012-07-01
Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of the support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use a Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
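The ε-insensitive loss at the heart of the model can be illustrated with a linear, kernel-free SVR fitted by subgradient descent. This is a deliberate simplification of the paper's full Bayesian RKHS treatment, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic regression data (a stand-in for the NIR spectroscopy problem).
X = rng.uniform(-1.0, 1.0, size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(0.0, 0.05, size=100)

def fit_linear_svr(X, y, eps=0.1, C=10.0, lr=0.01, steps=2000):
    """Linear SVR by subgradient descent on
    0.5*||w||^2 + C * mean(max(0, |y - Xw - b| - eps)),
    i.e. Vapnik's eps-insensitive loss (kernels/RKHS omitted for brevity)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        r = y - X @ w - b                      # residuals
        # Subgradient of the loss w.r.t. the prediction: -sign(r) outside
        # the eps-tube, zero inside it.
        s = np.where(r > eps, -1.0, np.where(r < -eps, 1.0, 0.0))
        w -= lr * (w + C * X.T @ s / len(y))   # subgradient w.r.t. w
        b -= lr * C * s.mean()                 # subgradient w.r.t. b
    return w, b

w, b = fit_linear_svr(X, y)
```

Points inside the tube contribute nothing to the fit, which is the sparsity mechanism the Bayesian formulation makes probabilistic.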
Directory of Open Access Journals (Sweden)
Mustafa Serter Uzer
2013-01-01
Full Text Available This paper offers a hybrid approach that uses the artificial bee colony (ABC) algorithm for feature selection and support vector machines (SVM) for classification. The purpose of this paper is to test the effect of eliminating the unimportant and obsolete features of the datasets on the success of the classification, using the SVM classifier. The approach is applied to the diagnosis of liver diseases and diabetes, which are commonly observed and reduce the quality of life. For the diagnosis of these diseases, the hepatitis, liver disorders, and diabetes datasets from the UCI database were used, and the proposed system reached classification accuracies of 94.92%, 74.81%, and 79.29%, respectively. For these datasets, the classification accuracies were obtained with the help of the 10-fold cross-validation method. The results show that the performance of the method is highly successful compared to other results attained and seems very promising for pattern recognition applications.
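A heavily simplified sketch of wrapper feature selection in the spirit of this approach: a bit-flip search over feature masks scored by a lightweight classifier. The real ABC algorithm maintains a colony of candidate solutions with employed, onlooker, and scout bees, and the fitness there is an SVM's cross-validated accuracy; here both are replaced by toy stand-ins on synthetic data.

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy two-class data: only the first 3 of 10 features are informative,
# mimicking the obsolete-feature situation the approach addresses.
n = 200
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 10))
X[:, :3] += y[:, None] * 2.0

def fitness(mask):
    """Classification accuracy of a nearest-centroid classifier on the
    selected features (a lightweight stand-in for the SVM fitness)."""
    if not mask.any():
        return 0.0
    Z = X[:, mask]
    c0, c1 = Z[y == 0].mean(axis=0), Z[y == 1].mean(axis=0)
    pred = (np.linalg.norm(Z - c1, axis=1) <
            np.linalg.norm(Z - c0, axis=1)).astype(int)
    return float((pred == y).mean())

# Greatly simplified search: flip one feature bit at a time and keep any
# candidate that does not reduce the fitness.
best = rng.integers(0, 2, 10).astype(bool)
for _ in range(300):
    cand = best.copy()
    cand[rng.integers(10)] ^= True
    if fitness(cand) >= fitness(best):
        best = cand
```

Irrelevant features tend to be flipped off because removing them never hurts the centroid classifier, while removing an informative feature drops the accuracy and is rejected.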
Directory of Open Access Journals (Sweden)
Semra Boran
2007-09-01
Full Text Available The Taguchi method and regression analysis have widespread applications in statistical research. The Taguchi method is one of the most frequently used methods, especially in optimization problems, but its applications are not common in the food industry. In this study, optimal operating parameters were determined for an industrial-size fluidized bed dryer by using the Taguchi method. Then the effects of the operating parameters on the activity value (the quality characteristic of this problem) were calculated by regression analysis. Finally, the results of the two methods were compared. To summarise, an average activity value of 660 and an average drying time of 26 minutes were found for the 400 kg loading by using the factors and levels taken from the application of the Taguchi method, whereas in normal conditions (with 600 kg loading) the average activity value was found to be 630 and the drying time 28 minutes. The Taguchi method application produced a 15% rise in activity value.
Directory of Open Access Journals (Sweden)
A. Yu. Bykov
2015-01-01
Full Text Available Modern practical techniques for designing information security systems in automated systems of various purposes involve solving optimization tasks when choosing the elements of a security system. Mathematical programming formulations are often used, but in practical tasks it is not always possible to set the target function and (or) the restrictions analytically in an explicit form. Sometimes, calculating the target function value or checking the restrictions for a candidate solution can be reduced to running experiments on a simulation model of the system. Such tasks are considered within the optimization-simulation approach and require ad hoc optimization methods that take into account the potentially high computational cost of simulation. The article offers a modified recession vector method, used in discrete optimization, to solve such problems. The method is applied to minimizing the cost of the selected information security tools under a restriction on the maximum possible damage. The cost index is a linear function of the Boolean variables that specify the selected security tools, with the restriction set via an "example simulator"; restrictions can thus be set implicitly, and the validity of a candidate solution is checked using a simulation model of the system. The proposed algorithm takes the features of the objective into account. Its main advantage is that it requires at most m+1 steps, where m is the dimensionality of the required vector of Boolean variables. The algorithm finds a local minimum in the Hamming metric on the discrete space, with a neighborhood radius equal to 1; these statements are proved. The paper presents the results of selecting security tools for the specified input data.
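A minimal sketch of the descent idea, with invented costs and risks and a toy stand-in for the simulation model (the real method checks feasibility by running the system simulation, and explores the full radius-1 neighborhood rather than only dropping tools):

```python
import numpy as np

# Invented costs of candidate security tools and the damage each one
# mitigates; a real application would obtain damage from a simulation run.
cost = np.array([5.0, 8.0, 3.0, 6.0, 4.0])
risk = np.array([9.0, 4.0, 7.0, 2.0, 5.0])
MAX_DAMAGE = 15.0

def simulate_damage(x):
    """Toy stand-in for the simulation: damage from unmitigated risks."""
    return float(risk[x == 0].sum())

def recession_descent(x):
    """Hamming-radius-1 descent: repeatedly drop the single tool whose
    removal saves the most cost while the selection stays feasible;
    terminates in at most m+1 passes for m Boolean variables."""
    x = x.copy()
    for _ in range(len(x) + 1):
        best, best_cost = None, cost[x == 1].sum()
        for i in np.flatnonzero(x):
            cand = x.copy()
            cand[i] = 0
            c = cost[cand == 1].sum()
            if simulate_damage(cand) <= MAX_DAMAGE and c < best_cost:
                best, best_cost = cand, c
        if best is None:
            return x        # local minimum in the Hamming-1 neighborhood
        x = best
    return x

solution = recession_descent(np.ones(5, dtype=int))
```

Starting from the all-tools selection guarantees feasibility at every step, so the search only ever trades cost against the damage budget.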
Li, Wutao; Huang, Zhigang; Lang, Rongling; Qin, Honglei; Zhou, Kai; Cao, Yongbin
2016-03-04
Interferences can severely degrade the performance of Global Navigation Satellite System (GNSS) receivers. As the first step of any GNSS anti-interference measure, interference monitoring is essential. Since interference monitoring can be considered a classification problem, a real-time interference monitoring technique based on the Twin Support Vector Machine (TWSVM) is proposed in this paper. A TWSVM model is established and solved by the Least Squares Twin Support Vector Machine (LSTWSVM) algorithm. The interference monitoring indicators are analyzed to extract features from the interfered GNSS signals. The experimental results show that the chosen observations can be used as interference monitoring indicators. The interference monitoring performance of the proposed method is verified using the GPS L1 C/A code signal and compared with that of the standard SVM. The experimental results indicate that TWSVM-based interference monitoring is much faster than the conventional SVM. Furthermore, the training time of TWSVM is on the millisecond (ms) level and the monitoring time is on the microsecond (μs) level, which makes the proposed approach usable in practical interference monitoring applications.
Dabrowska, Dorota M.
1997-01-01
Nonparametric regression was shown by Beran and McKeague and Utikal to provide a flexible method for analysis of censored failure times and more general counting processes models in the presence of covariates. We discuss application of kernel smoothing towards estimation in a generalized Cox regression model with baseline intensity dependent on a covariate. Under regularity conditions we show that estimates of the regression parameters are asymptotically normal at rate root-n, and we also dis...
Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele
2015-04-01
first phase of the work addressed the identification of the spatial relationships between the landslide locations and the 13 related factors by using the Frequency Ratio bivariate statistical method. The analysis was then carried out by adopting a multivariate statistical approach, according to the Logistic Regression technique and the Random Forests technique, which gave the best results in terms of AUC. The models were performed and evaluated with different sample sizes, also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are the relevant influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.
Kramberger, Petra; Urbas, Lidija; Štrancar, Aleš
2015-01-01
Downstream processing of nanoplexes (viruses, virus-like particles, bacteriophages) is characterized by the complexity of the starting material, the number of purification methods to choose from, the regulations that set the frame for the final product, and the analytical methods for upstream and downstream monitoring. This review gives an overview of the nanoplex downstream challenges and of chromatography-based analytical methods for efficient monitoring of nanoplex production. PMID:25751122
Perry, Thomas
2017-01-01
Value-added (VA) measures are currently the predominant approach used to compare the effectiveness of schools. Recent educational effectiveness research, however, has developed alternative approaches including the regression discontinuity (RD) design, which also allows estimation of absolute school effects. Initial research suggests RD is a viable…
Oranje, Andreas; Li, Deping; Kandathil, Mathew
2009-01-01
Several complex sample standard error estimators based on linearization and resampling for the latent regression model of the National Assessment of Educational Progress (NAEP) are studied with respect to design choices such as number of items, number of regressors, and the efficiency of the sample. This paper provides an evaluation of the extent…
Energy Technology Data Exchange (ETDEWEB)
Rieben, Robert N. [Univ. of California, Davis, CA (United States)
2004-01-01
The goal of this dissertation is two-fold. The first part concerns the development of a numerical method for solving Maxwell's equations on unstructured hexahedral grids that employs both high order spatial and high order temporal discretizations. The second part involves the use of this method as a computational tool to perform high fidelity simulations of various electromagnetic devices such as optical transmission lines and photonic crystal structures to yield a level of accuracy that has previously been computationally cost prohibitive. This work is based on the initial research of Daniel White who developed a provably stable, charge and energy conserving method for solving Maxwell's equations in the time domain that is second order accurate in both space and time. The research presented here has involved the generalization of this procedure to higher order methods. High order methods are capable of yielding far more accurate numerical results for certain problems when compared to corresponding h-refined first-order methods, and oftentimes at a significant reduction in total computational cost. The first half of this dissertation presents the method as well as the necessary mathematics required for its derivation. The second half addresses the implementation of the method in a parallel computational environment, its validation using benchmark problems, and finally its use in large scale numerical simulations of electromagnetic transmission devices.
Modal loss mechanism of micro-structured VCSELs studied using full vector FDTD method.
Jo, Du-Ho; Vu, Ngoc Hai; Kim, Jin-Tae; Hwang, In-Kag
2011-09-12
Modal properties of vertical cavity surface-emitting lasers (VCSELs) with holey structures are studied using a finite difference time domain (FDTD) method. We investigate loss behavior with respect to the variation of structural parameters, and explain the loss mechanism of VCSELs. We also propose an effective method to estimate the modal loss based on mode profiles obtained using FDTD simulation. Our results could provide an important guideline for optimization of the microstructures of high-power single-mode VCSELs.
Hua, S; Sun, Z
2001-04-27
We have introduced a new method of protein secondary structure prediction which is based on the theory of the support vector machine (SVM). SVM represents a new approach to supervised pattern classification which has been successfully applied to a wide range of pattern recognition problems, including object recognition, speaker identification, gene function prediction with microarray expression profiles, etc. In these cases, the performance of SVM either matches or is significantly better than that of traditional machine learning approaches, including neural networks. The first use of the SVM approach to predict protein secondary structure is described here. Unlike previous studies, we first constructed several binary classifiers, then assembled a tertiary classifier for the three secondary structure states (helix, sheet, and coil) based on these binary classifiers. The SVM method achieved a good performance of segment overlap accuracy SOV = 76.2% through sevenfold cross-validation on a database of 513 non-homologous protein chains with multiple sequence alignments, which outperforms existing methods. Meanwhile, the three-state overall per-residue accuracy Q(3) reached 73.5%, which is at least comparable to existing single prediction methods. Furthermore, a useful "reliability index" for the predictions was developed. In addition, SVM has many attractive features, including effective avoidance of overfitting, the ability to handle large feature spaces, and information condensing of the given data set. The SVM method is conveniently applied to many other pattern classification tasks in biology. Copyright 2001 Academic Press.
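The assembly of binary classifiers into a tertiary one can be sketched as a one-vs-rest scheme. The sketch below substitutes a linear hinge-loss classifier for the paper's kernel SVMs and uses synthetic features, so it illustrates only the assembly step:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic per-residue feature vectors with 3 classes (helix/sheet/coil);
# real inputs would be windows of multiple-sequence-alignment profiles.
X = rng.normal(size=(300, 10))
W_true = rng.normal(size=(10, 3))
labels = np.argmax(X @ W_true, axis=1)

def train_binary(X, y_pm, steps=500, lr=0.1, lam=0.01):
    """One linear hinge-loss classifier (a stand-in for one binary SVM)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        margin = y_pm * (X @ w)
        active = margin < 1.0              # samples violating the margin
        if active.any():
            w -= lr * (lam * w - (y_pm[active, None] * X[active]).mean(axis=0))
        else:
            w -= lr * lam * w
    return w

# Step 1: one binary classifier per secondary-structure state (one-vs-rest).
W = np.column_stack([train_binary(X, np.where(labels == k, 1.0, -1.0))
                     for k in range(3)])
# Step 2: assemble the tertiary classifier: take the largest decision value.
pred = np.argmax(X @ W, axis=1)
accuracy = float((pred == labels).mean())
```

The margin of the winning decision value over the runner-up is one natural basis for a per-residue reliability index of the kind the paper describes.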
Directory of Open Access Journals (Sweden)
Yukun Bao
2012-01-01
Full Text Available With regard to the nonlinearity and irregularity, along with the implicit seasonality and trend, in the context of air passenger traffic forecasting, this study proposes an ensemble empirical mode decomposition (EEMD) based support vector machines (SVMs) modeling framework incorporating a slope-based method to restrain the end-effect issue occurring during the sifting process of EEMD, abbreviated as EEMD-Slope-SVMs. Real monthly air passenger traffic series from six selected airlines in the USA and UK were collected to test the effectiveness of the proposed approach. Empirical results demonstrate that the proposed decomposition and ensemble modeling framework outperforms selected counterparts such as single SVMs (straightforward application of SVMs), Holt-Winters, and ARIMA in terms of RMSE, MAPE, GMRAE, and DS. Additional evidence is also shown to highlight the improved performance when compared with an EEMD-SVM model not restraining the end effect.
Hoffmann, Banesh
1975-01-01
From his unusual beginning in "Defining a vector" to his final comments on "What then is a vector?" author Banesh Hoffmann has written a book that is provocative and unconventional. In his emphasis on the unresolved issue of defining a vector, Hoffmann mixes pure and applied mathematics without using calculus. The result is a treatment that can serve as a supplement and corrective to textbooks, as well as collateral reading in all courses that deal with vectors. Major topics include vectors and the parallelogram law; algebraic notation and basic ideas; vector algebra; scalars and scalar p
Newell, Homer E
2006-01-01
When employed with skill and understanding, vector analysis can be a practical and powerful tool. This text develops the algebra and calculus of vectors in a manner useful to physicists and engineers. Numerous exercises (with answers) not only provide practice in manipulation but also help establish students' physical and geometric intuition in regard to vectors and vector concepts.Part I, the basic portion of the text, consists of a thorough treatment of vector algebra and the vector calculus. Part II presents the illustrative matter, demonstrating applications to kinematics, mechanics, and e
Directory of Open Access Journals (Sweden)
Mustafa Özuysal
2012-01-01
Full Text Available Passenger flow estimation for transit systems is essential for decisions about additional facilities and feeder lines. To increase the efficiency of an existing transit line, stations which are insufficient for trip production and attraction should be examined first. Such investigation supports decisions on feeder line projects, which may prove necessary or futile according to the findings. In this study, passenger flow of a light rail transit (LRT) system in Izmir, Turkey is estimated by using multiple regression and the feed-forward back-propagation type of artificial neural network (ANN). The number of alighting passengers at each station is estimated as a function of boarding passengers from other stations. It is found that the ANN approach produced significantly better estimations, specifically for the stations with low passenger attraction. In addition, ANN is found to be more capable of determining the trip-attractive parts of LRT lines. Keywords: light rail transit, multiple regression, artificial neural networks, public transportation
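The multiple-regression part of the comparison can be sketched directly: alighting counts at one station regressed on boarding counts elsewhere, with invented data in place of the Izmir LRT counts.

```python
import numpy as np

rng = np.random.default_rng(4)

# Invented daily counts: boardings at 4 other stations and alightings at
# the station of interest (the real study used Izmir LRT data).
days = 30
boardings = rng.poisson(lam=[400, 250, 600, 150],
                        size=(days, 4)).astype(float)
alightings = (boardings @ np.array([0.20, 0.10, 0.15, 0.05])
              + rng.normal(0.0, 3.0, size=days))

# Multiple linear regression: alightings as a function of boardings
# elsewhere (intercept plus one coefficient per upstream station).
A = np.column_stack([np.ones(days), boardings])
coef, *_ = np.linalg.lstsq(A, alightings, rcond=None)
pred = A @ coef
r2 = 1.0 - (np.sum((alightings - pred) ** 2)
            / np.sum((alightings - alightings.mean()) ** 2))
```

Fitting one such model per station yields the station-level comparison against the ANN; the stations where this linear fit degrades are the ones the paper found the ANN to handle better.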
Mohd Faris Dziauddin; Zulkefli Idris
2017-01-01
This study estimates the effect of locational attributes on residential property values in Kuala Lumpur, Malaysia. Geographically weighted regression (GWR) enables the use of the local parameter rather than the global parameter to be estimated, with the results presented in map form. The results of this study reveal that residential property values are mainly determined by the property’s physical (structural) attributes, but proximity to locational attributes also contributes marginally. The ...
Barzin, Razieh; Shirvani, Amin; Lotfi, Hossein
2017-01-01
Downward shortwave radiation is a key quantity in the land-atmosphere interaction. Since the moderate resolution imaging spectroradiometer data has a coarse temporal resolution, which is not suitable for estimating daily average radiation, many efforts have been undertaken to estimate instantaneous solar radiation using moderate resolution imaging spectroradiometer data. In this study, the principal components analysis technique was applied to capture the information of moderate resolution imaging spectroradiometer bands, extraterrestrial radiation, aerosol optical depth, and atmospheric water vapour. A regression model based on the principal components was used to estimate daily average shortwave radiation for ten synoptic stations in the Fars province, Iran, for the period 2009-2012. The Durbin-Watson statistic and autocorrelation function of the residuals of the fitted principal components regression model indicated that the residuals were serially independent. The results indicated that the fitted principal components regression models accounted for about 86-96% of total variance of the observed shortwave radiation values and the root mean square error was about 0.9-2.04 MJ m-2 d-1. Also, the results indicated that the model accuracy decreased as the aerosol optical depth increased and extraterrestrial radiation was the most important predictor variable among all.
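The principal components regression itself can be sketched in NumPy. The predictors below are synthetic stand-ins for the MODIS bands, extraterrestrial radiation, aerosol optical depth, and water vapour, deliberately made collinear, since handling collinearity is the reason PCR is used:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic predictors; two columns are near-copies of others, mimicking
# the strong correlation among spectral bands. All values are invented.
n = 200
base = rng.normal(size=(n, 3))
X = np.column_stack([
    base,
    base[:, 0] + 0.1 * rng.normal(size=n),
    base[:, 1] + 0.1 * rng.normal(size=n),
])
y = 20.0 + base @ np.array([3.0, -2.0, 1.0]) + rng.normal(0.0, 0.5, size=n)

# Principal components regression: standardize, project onto the leading
# principal components, then run ordinary least squares on the scores.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
_, _, Vt = np.linalg.svd(Xs, full_matrices=False)
k = 3                                  # retain the first k components
scores = Xs @ Vt[:k].T
A = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
rmse = float(np.sqrt(np.mean((A @ coef - y) ** 2)))
```

Because the PC scores are mutually orthogonal, the regression coefficients are stable even though the raw predictors are nearly collinear; residual diagnostics such as the Durbin-Watson statistic are then applied to `y - A @ coef`.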
Vehicle Travel Time Prediction based on Multiple Kernel Regression
Wenjing Xu
2014-01-01
With the rapid development of transportation and the logistics economy, vehicle travel time prediction and planning have become an important topic in logistics. Travel time prediction, which is indispensable for traffic guidance, has become a key issue for researchers in this field. At present, the prediction of travel time is mainly short-term prediction, and the prediction methods include artificial neural networks, the Kalman filter, and the support vector regression (SVR) method, etc. However, these algo...