Directory of Open Access Journals (Sweden)
Mok Tik
2014-06-01
Full Text Available This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as, polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to rep- resent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
Directory of Open Access Journals (Sweden)
Hailun Wang
2017-01-01
Full Text Available Support vector regression algorithm is widely used in fault diagnosis of rolling bearing. A new model parameter selection method for support vector regression based on adaptive fusion of the mixed kernel function is proposed in this paper. We choose the mixed kernel function as the kernel function of support vector regression. The mixed kernel function of the fusion coefficients, kernel function parameters, and regression parameters are combined together as the parameters of the state vector. Thus, the model selection problem is transformed into a nonlinear system state estimation problem. We use a 5th-degree cubature Kalman filter to estimate the parameters. In this way, we realize the adaptive selection of mixed kernel function weighted coefficients and the kernel parameters, the regression parameters. Compared with a single kernel function, unscented Kalman filter (UKF support vector regression algorithms, and genetic algorithms, the decision regression function obtained by the proposed method has better generalization ability and higher prediction accuracy.
Van Belle, Vanya; Pelckmans, Kristiaan; Van Huffel, Sabine; Suykens, Johan A K
2011-10-01
To compare and evaluate ranking, regression and combined machine learning approaches for the analysis of survival data. The literature describes two approaches based on support vector machines to deal with censored observations. In the first approach the key idea is to rephrase the task as a ranking problem via the concordance index, a problem which can be solved efficiently in a context of structural risk minimization and convex optimization techniques. In a second approach, one uses a regression approach, dealing with censoring by means of inequality constraints. The goal of this paper is then twofold: (i) introducing a new model combining the ranking and regression strategy, which retains the link with existing survival models such as the proportional hazards model via transformation models; and (ii) comparison of the three techniques on 6 clinical and 3 high-dimensional datasets and discussing the relevance of these techniques over classical approaches fur survival data. We compare svm-based survival models based on ranking constraints, based on regression constraints and models based on both ranking and regression constraints. The performance of the models is compared by means of three different measures: (i) the concordance index, measuring the model's discriminating ability; (ii) the logrank test statistic, indicating whether patients with a prognostic index lower than the median prognostic index have a significant different survival than patients with a prognostic index higher than the median; and (iii) the hazard ratio after normalization to restrict the prognostic index between 0 and 1. Our results indicate a significantly better performance for models including regression constraints above models only based on ranking constraints. This work gives empirical evidence that svm-based models using regression constraints perform significantly better than svm-based models based on ranking constraints. Our experiments show a comparable performance for methods
Directory of Open Access Journals (Sweden)
Guan Lian
2018-01-01
Full Text Available Accurate prediction of taxi-out time is significant precondition for improving the operationality of the departure process at an airport, as well as reducing the long taxi-out time, congestion, and excessive emission of greenhouse gases. Unfortunately, several of the traditional methods of predicting taxi-out time perform unsatisfactorily at congested airports. This paper describes and tests three of those conventional methods which include Generalized Linear Model, Softmax Regression Model, and Artificial Neural Network method and two improved Support Vector Regression (SVR approaches based on swarm intelligence algorithm optimization, which include Particle Swarm Optimization (PSO and Firefly Algorithm. In order to improve the global searching ability of Firefly Algorithm, adaptive step factor and Lévy flight are implemented simultaneously when updating the location function. Six factors are analysed, of which delay is identified as one significant factor in congested airports. Through a series of specific dynamic analyses, a case study of Beijing International Airport (PEK is tested with historical data. The performance measures show that the proposed two SVR approaches, especially the Improved Firefly Algorithm (IFA optimization-based SVR method, not only perform as the best modelling measures and accuracy rate compared with the representative forecast models, but also can achieve a better predictive performance when dealing with abnormal taxi-out time states.
A dynamic particle filter-support vector regression method for reliability prediction
International Nuclear Information System (INIS)
Wei, Zhao; Tao, Tao; ZhuoShu, Ding; Zio, Enrico
2013-01-01
Support vector regression (SVR) has been applied to time series prediction and some works have demonstrated the feasibility of its use to forecast system reliability. For accuracy of reliability forecasting, the selection of SVR's parameters is important. The existing research works on SVR's parameters selection divide the example dataset into training and test subsets, and tune the parameters on the training data. However, these fixed parameters can lead to poor prediction capabilities if the data of the test subset differ significantly from those of training. Differently, the novel method proposed in this paper uses particle filtering to estimate the SVR model parameters according to the whole measurement sequence up to the last observation instance. By treating the SVR training model as the observation equation of a particle filter, our method allows updating the SVR model parameters dynamically when a new observation comes. Because of the adaptability of the parameters to dynamic data pattern, the new PF–SVR method has superior prediction performance over that of standard SVR. Four application results show that PF–SVR is more robust than SVR to the decrease of the number of training data and the change of initial SVR parameter values. Also, even if there are trends in the test data different from those in the training data, the method can capture the changes, correct the SVR parameters and obtain good predictions. -- Highlights: •A dynamic PF–SVR method is proposed to predict the system reliability. •The method can adjust the SVR parameters according to the change of data. •The method is robust to the size of training data and initial parameter values. •Some cases based on both artificial and real data are studied. •PF–SVR shows superior prediction performance over standard SVR
International Nuclear Information System (INIS)
Sun Zhong-Hua; Jiang Fan
2010-01-01
In this paper a new continuous variable called core-ratio is defined to describe the probability for a residue to be in a binding site, thereby replacing the previous binary description of the interface residue using 0 and 1. So we can use the support vector machine regression method to fit the core-ratio value and predict the protein binding sites. We also design a new group of physical and chemical descriptors to characterize the binding sites. The new descriptors are more effective, with an averaging procedure used. Our test shows that much better prediction results can be obtained by the support vector regression (SVR) method than by the support vector classification method. (rapid communication)
Forecast daily indices of solar activity, F10.7, using support vector regression method
International Nuclear Information System (INIS)
Huang Cong; Liu Dandan; Wang Jingsong
2009-01-01
The 10.7 cm solar radio flux (F10.7), the value of the solar radio emission flux density at a wavelength of 10.7 cm, is a useful index of solar activity as a proxy for solar extreme ultraviolet radiation. It is meaningful and important to predict F10.7 values accurately for both long-term (months-years) and short-term (days) forecasting, which are often used as inputs in space weather models. This study applies a novel neural network technique, support vector regression (SVR), to forecasting daily values of F10.7. The aim of this study is to examine the feasibility of SVR in short-term F10.7 forecasting. The approach, based on SVR, reduces the dimension of feature space in the training process by using a kernel-based learning algorithm. Thus, the complexity of the calculation becomes lower and a small amount of training data will be sufficient. The time series of F10.7 from 2002 to 2006 are employed as the data sets. The performance of the approach is estimated by calculating the norm mean square error and mean absolute percentage error. It is shown that our approach can perform well by using fewer training data points than the traditional neural network. (research paper)
On Weighted Support Vector Regression
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... shrinks the coefficient of each observation in the estimated functions; thus, it is widely used for minimizing influence of outliers. We propose to additionally add weights to the slack variables in the constraints (CF‐weights) and call the combination of weights the doubly weighted SVR. We illustrate...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...
International Nuclear Information System (INIS)
Yang, Jianhong; Yi, Cancan; Xu, Jinwu; Ma, Xianghong
2015-01-01
A new LIBS quantitative analysis method based on analytical line adaptive selection and Relevance Vector Machine (RVM) regression model is proposed. First, a scheme of adaptively selecting analytical line is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines which will be used as input variables of regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentration analysis results will be given with a form of confidence interval of probabilistic distribution, which is helpful for evaluating the uncertainness contained in the measured spectra. Chromium concentration analysis experiments of 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experiment results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine. - Highlights: • Both training and testing samples are considered for analytical lines selection. • The analytical lines are auto-selected based on the built-in characteristics of spectral lines. • The new method can achieve better prediction accuracy and modeling robustness. • Model predictions are given with confidence interval of probabilistic distribution
Sun, L.G.; De Visser, C.C.; Chu, Q.P.; Mulder, J.A.
2012-01-01
The optimality of the kernel number and kernel centers plays a significant role in determining the approximation power of nearly all kernel methods. However, the process of choosing optimal kernels is always formulated as a global optimization task, which is hard to accomplish. Recently, an
CSIR Research Space (South Africa)
Gregor, Luke
2017-12-01
Full Text Available understanding with spatially integrated air–sea flux estimates (Fay and McKinley, 2014). Conversely, ocean biogeochemical process models are good tools for mechanis- tic understanding, but fail to represent the seasonality of CO2 fluxes in the Southern Ocean... of including coordinate variables as proxies of 1pCO2 in the empirical methods. In the inter- comparison study by Rödenbeck et al. (2015) proxies typi- cally include, but are not limited to, sea surface temperature (SST), chlorophyll a (Chl a), mixed layer...
Image superresolution using support vector regression.
Ni, Karl S; Nguyen, Truong Q
2007-06-01
A thorough investigation of the application of support vector regression (SVR) to the superresolution problem is conducted through various frameworks. Prior to the study, the SVR problem is enhanced by finding the optimal kernel. This is done by formulating the kernel learning problem in SVR form as a convex optimization problem, specifically a semi-definite programming (SDP) problem. An additional constraint is added to reduce the SDP to a quadratically constrained quadratic programming (QCQP) problem. After this optimization, investigation of the relevancy of SVR to superresolution proceeds with the possibility of using a single and general support vector regression for all image content, and the results are impressive for small training sets. This idea is improved upon by observing structural properties in the discrete cosine transform (DCT) domain to aid in learning the regression. Further improvement involves a combination of classification and SVR-based techniques, extending works in resolution synthesis. This method, termed kernel resolution synthesis, uses specific regressors for isolated image content to describe the domain through a partitioned look of the vector space, thereby yielding good results.
Fast multi-output relevance vector regression
Ha, Youngmin
2017-01-01
This paper aims to decrease the time complexity of multi-output relevance vector regression from O(VM^3) to O(V^3+M^3), where V is the number of output dimensions, M is the number of basis functions, and V
Directory of Open Access Journals (Sweden)
Mehmet Das
2018-01-01
Full Text Available In this study, an air heated solar collector (AHSC dryer was designed to determine the drying characteristics of the pear. Flat pear slices of 10 mm thickness were used in the experiments. The pears were dried both in the AHSC dryer and under the sun. Panel glass temperature, panel floor temperature, panel inlet temperature, panel outlet temperature, drying cabinet inlet temperature, drying cabinet outlet temperature, drying cabinet temperature, drying cabinet moisture, solar radiation, pear internal temperature, air velocity and mass loss of pear were measured at 30 min intervals. Experiments were carried out during the periods of June 2017 in Elazig, Turkey. The experiments started at 8:00 a.m. and continued till 18:00. The experiments were continued until the weight changes in the pear slices stopped. Wet basis moisture content (MCw, dry basis moisture content (MCd, adjustable moisture ratio (MR, drying rate (DR, and convective heat transfer coefficient (hc were calculated with both in the AHSC dryer and the open sun drying experiment data. It was found that the values of hc in both drying systems with a range 12.4 and 20.8 W/m2 °C. Three different kernel models were used in the support vector machine (SVM regression to construct the predictive model of the calculated hc values for both systems. The mean absolute error (MAE, root mean squared error (RMSE, relative absolute error (RAE and root relative absolute error (RRAE analysis were performed to indicate the predictive model’s accuracy. As a result, the rate of drying of the pear was examined for both systems and it was observed that the pear had dried earlier in the AHSC drying system. A predictive model was obtained using the SVM regression for the calculated hc values for the pear in the AHSC drying system. The normalized polynomial kernel was determined as the best kernel model in SVM for estimating the hc values.
Balabin, Roman M; Lomakina, Ekaterina I
2011-04-21
In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts ".an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models.highly recommend[ed].for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
Directory of Open Access Journals (Sweden)
Ying-Hsin Chang
2013-01-01
Full Text Available Human estrogen receptor (ER isoforms, ERα and ERβ, have long been an important focus in the field of biology. To better understand the structural features associated with the binding of ERα ligands to ERα and modulate their function, several QSAR models, including CoMFA, CoMSIA, SVR, and LR methods, have been employed to predict the inhibitory activity of 68 raloxifene derivatives. In the SVR and LR modeling, 11 descriptors were selected through feature ranking and sequential feature addition/deletion to generate equations to predict the inhibitory activity toward ERα. Among four descriptors that constantly appear in various generated equations, two agree with CoMFA and CoMSIA steric fields and another two can be correlated to a calculated electrostatic potential of ERα.
DNBR Prediction Using a Support Vector Regression
International Nuclear Information System (INIS)
Yang, Heon Young; Na, Man Gyun
2008-01-01
PWRs (Pressurized Water Reactors) generally operate in the nucleate boiling state. However, the conversion of nucleate boiling into film boiling with conspicuously reduced heat transfer induces a boiling crisis that may cause the fuel clad melting in the long run. This type of boiling crisis is called Departure from Nucleate Boiling (DNB) phenomena. Because the prediction of minimum DNBR in a reactor core is very important to prevent the boiling crisis such as clad melting, a lot of research has been conducted to predict DNBR values. The object of this research is to predict minimum DNBR applying support vector regression (SVR) by using the measured signals of a reactor coolant system (RCS). The SVR has extensively and successfully been applied to nonlinear function approximation like the proposed problem for estimating DNBR values that will be a function of various input variables such as reactor power, reactor pressure, core mass flowrate, control rod positions and so on. The minimum DNBR in a reactor core is predicted using these various operating condition data as the inputs to the SVR. The minimum DBNR values predicted by the SVR confirm its correctness compared with COLSS values
Modeling and prediction of flotation performance using support vector regression
Directory of Open Access Journals (Sweden)
Despotović Vladimir
2017-01-01
Full Text Available Continuous efforts have been made in recent year to improve the process of paper recycling, as it is of critical importance for saving the wood, water and energy resources. Flotation deinking is considered to be one of the key methods for separation of ink particles from the cellulose fibres. Attempts to model the flotation deinking process have often resulted in complex models that are difficult to implement and use. In this paper a model for prediction of flotation performance based on Support Vector Regression (SVR, is presented. Representative data samples were created in laboratory, under a variety of practical control variables for the flotation deinking process, including different reagents, pH values and flotation residence time. Predictive model was created that was trained on these data samples, and the flotation performance was assessed showing that Support Vector Regression is a promising method even when dataset used for training the model is limited.
Deep Support Vector Machines for Regression Problems
Wiering, Marco; Schutten, Marten; Millea, Adrian; Meijster, Arnold; Schomaker, Lambertus
2013-01-01
In this paper we describe a novel extension of the support vector machine, called the deep support vector machine (DSVM). The original SVM has a single layer with kernel functions and is therefore a shallow model. The DSVM can use an arbitrary number of layers, in which lower-level layers contain
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based......Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights...... by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even...
Fault trend prediction of device based on support vector regression
International Nuclear Information System (INIS)
Song Meicun; Cai Qi
2011-01-01
The research condition of fault trend prediction and the basic theory of support vector regression (SVR) were introduced. SVR was applied to the fault trend prediction of roller bearing, and compared with other methods (BP neural network, gray model, and gray-AR model). The results show that BP network tends to overlearn and gets into local minimum so that the predictive result is unstable. It also shows that the predictive result of SVR is stabilization, and SVR is superior to BP neural network, gray model and gray-AR model in predictive precision. SVR is a kind of effective method of fault trend prediction. (authors)
Directory of Open Access Journals (Sweden)
Hong-Juan Li
2013-04-01
Full Text Available Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by strong non-linear learning capability of support vector regression (SVR, this paper presents a SVR model hybridized with the empirical mode decomposition (EMD method and auto regression (AR for electric load forecasting. The electric load data of the New South Wales (Australia market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.
Vectors, a tool in statistical regression theory
Corsten, L.C.A.
1958-01-01
Using linear algebra this thesis developed linear regression analysis including analysis of variance, covariance analysis, special experimental designs, linear and fertility adjustments, analysis of experiments at different places and times. The determination of the orthogonal projection, yielding
Ochoa Gutierrez, L. H.; Vargas Jimenez, C. A.; Niño Vasquez, L. F.
2011-12-01
The "Sabana de Bogota" (Bogota Savannah) is the most important social and economical center of Colombia. Almost the third of population is concentrated in this region and generates about the 40% of Colombia's Internal Brute Product (IBP). According to this, the zone presents an elevated vulnerability in case that a high destructive seismic event occurs. Historical evidences show that high magnitude events took place in the past with a huge damage caused to the city and indicate that is probable that such events can occur in the next years. This is the reason why we are working in an early warning generation system, using the first few seconds of a seismic signal registered by three components and wide band seismometers. Such system can be implemented using Computational Intelligence tools, designed and calibrated to the particular Geological, Structural and environmental conditions present in the region. The methods developed are expected to work on real time, thus suitable software and electronic tools need to be developed. We used Support Vector Machines Regression (SVMR) methods trained and tested with historic seismic events registered by "EL ROSAL" Station, located near Bogotá, calculating descriptors or attributes as the input of the model, from the first 6 seconds of signal. With this algorithm, we obtained less than 10% of mean absolute error and correlation coefficients greater than 85% in hypocentral distance and Magnitude estimation. With this results we consider that we can improve the method trying to have better accuracy with less signal time and that this can be a very useful model to be implemented directly in the seismological stations to generate a fast characterization of the event, broadcasting not only raw signal but pre-processed information that can be very useful for accurate Early Warning Generation.
Image Jacobian Matrix Estimation Based on Online Support Vector Regression
Directory of Open Access Journals (Sweden)
Shangqin Mao
2012-10-01
Full Text Available Research into robotics visual servoing is an important area in the field of robotics. It has proven difficult to achieve successful results for machine vision and robotics in unstructured environments without using any a priori camera or kinematic models. In uncalibrated visual servoing, image Jacobian matrix estimation methods can be divided into two groups: the online method and the offline method. The offline method is not appropriate for most natural environments. The online method is robust but rough. Moreover, if the images feature configuration changes, it needs to restart the approximating procedure. A novel approach based on an online support vector regression (OL-SVR algorithm is proposed which overcomes the drawbacks and combines the virtues just mentioned.
Intelligent Quality Prediction Using Weighted Least Square Support Vector Regression
Yu, Yaojun
A novel quality prediction method with mobile time window is proposed for small-batch producing process based on weighted least squares support vector regression (LS-SVR). The design steps and learning algorithm are also addressed. In the method, weighted LS-SVR is taken as the intelligent kernel, with which the small-batch learning is solved well and the nearer sample is set a larger weight, while the farther is set the smaller weight in the history data. A typical machining process of cutting bearing outer race is carried out and the real measured data are used to contrast experiment. The experimental results demonstrate that the prediction accuracy of the weighted LS-SVR based model is only 20%-30% that of the standard LS-SVR based one in the same condition. It provides a better candidate for quality prediction of small-batch producing process.
Phase Space Prediction of Chaotic Time Series with Nu-Support Vector Machine Regression
International Nuclear Information System (INIS)
Ye Meiying; Wang Xiaodong
2005-01-01
A new class of support vector machine, nu-support vector machine, is discussed which can handle both classification and regression. We focus on nu-support vector machine regression and use it for phase space prediction of chaotic time series. The effectiveness of the method is demonstrated by applying it to the Henon map. This study also compares nu-support vector machine with back propagation (BP) networks in order to better evaluate the performance of the proposed methods. The experimental results show that the nu-support vector machine regression obtains lower root mean squared error than the BP networks and provides an accurate chaotic time series prediction. These results can be attributable to the fact that nu-support vector machine implements the structural risk minimization principle and this leads to better generalization than the BP networks.
Theory of net analyte signal vectors in inverse regression
DEFF Research Database (Denmark)
Bro, R.; Andersen, Charlotte Møller
2003-01-01
The. net analyte signal and the net analyte signal vector are useful measures in building and optimizing multivariate calibration models. In this paper a theory for their use in inverse regression is developed. The theory of net analyte signal was originally derived from classical least squares...
Linearity and Misspecification Tests for Vector Smooth Transition Regression Models
DEFF Research Database (Denmark)
Teräsvirta, Timo; Yang, Yukai
The purpose of the paper is to derive Lagrange multiplier and Lagrange multiplier type specification and misspecification tests for vector smooth transition regression models. We report results from simulation studies in which the size and power properties of the proposed asymptotic tests in small...
Directory of Open Access Journals (Sweden)
Guillaume Wattelez
2017-09-01
Full Text Available Particle transport by erosion from ultramafic lands in pristine tropical lagoons is a crucial problem, especially for the benthic and pelagic biodiversity associated with coral reefs. Satellite imagery is useful for assessing particle transport from land to sea. However, in the oligotrophic and shallow waters of tropical lagoons, the bottom reflection of downwelling light usually hampers the use of classical optical algorithms. In order to address this issue, a Support Vector Regression (SVR model was developed and tested. The proposed application concerns the lagoon of New Caledonia—the second longest continuous coral reef in the world—which is frequently exposed to river plumes from ultramafic watersheds. The SVR model is based on a large training sample of in-situ turbidity values representative of the annual variability in the Voh-Koné-Pouembout lagoon (Western Coast of New Caledonia during the 2014–2015 period and on coincident satellite reflectance values from MODerate Resolution Imaging Spectroradiometer (MODIS. It was trained with reflectance and two other explanatory parameters—bathymetry and bottom colour. This approach significantly improved the model’s capacity for retrieving the in-situ turbidity range from MODIS images, as compared with algorithms dedicated to deep oligotrophic or turbid waters, which were shown to be inadequate. This SVR model is applicable to the whole shallow lagoon waters from the Western Coast of New Caledonia and it is now ready to be tested over other oligotrophic shallow lagoon waters worldwide.
Chen, Xi; Lu, Fang; Jiang, Lu-di; Cai, Yi-Lian; Li, Gong-Yu; Zhang, Yan-Ling
2016-07-01
Inhibition of cytochrome P450 (CYP450) enzymes is the most common reasons for drug interactions, so the study on early prediction of CYPs inhibitors can help to decrease the incidence of adverse reactions caused by drug interactions.CYP450 2E1(CYP2E1), as a key role in drug metabolism process, has broad spectrum of drug metabolism substrate. In this study, 32 CYP2E1 inhibitors were collected for the construction of support vector regression (SVR) model. The test set data were used to verify CYP2E1 quantitative models and obtain the optimal prediction model of CYP2E1 inhibitor. Meanwhile, one molecular docking program, CDOCKER, was utilized to analyze the interaction pattern between positive compounds and active pocket to establish the optimal screening model of CYP2E1 inhibitors.SVR model and molecular docking prediction model were combined to screen traditional Chinese medicine database (TCMD), which could improve the calculation efficiency and prediction accuracy. 6 376 traditional Chinese medicine (TCM) compounds predicted by SVR model were obtained, and in further verification by using molecular docking model, 247 TCM compounds with potential inhibitory activities against CYP2E1 were finally retained. Some of them have been verified by experiments. The results demonstrated that this study could provide guidance for the virtual screening of CYP450 inhibitors and the prediction of CYPs-mediated DDIs, and also provide references for clinical rational drug use. Copyright© by the Chinese Pharmaceutical Association.
Regression methods for medical research
Tai, Bee Choo
2013-01-01
Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demands an understanding of more complex and sophisticated analytic procedures.The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
General Dimensional Multiple-Output Support Vector Regressions and Their Multiple Kernel Learning.
Chung, Wooyong; Kim, Jisu; Lee, Heejin; Kim, Euntai
2015-11-01
Support vector regression has been considered as one of the most important regression or function approximation methodologies in a variety of fields. In this paper, two new general dimensional multiple output support vector regressions (MSVRs) named SOCPL1 and SOCPL2 are proposed. The proposed methods are formulated in the dual space and their relationship with the previous works is clearly investigated. Further, the proposed MSVRs are extended into the multiple kernel learning and their training is implemented by the off-the-shelf convex optimization tools. The proposed MSVRs are applied to benchmark problems and their performances are compared with those of the previous methods in the experimental section.
Optimized support vector regression for drilling rate of penetration estimation
Bodaghi, Asadollah; Ansari, Hamid Reza; Gholami, Mahsa
2015-12-01
In the petroleum industry, drilling optimization involves the selection of operating conditions for achieving the desired depth with the minimum expenditure while requirements of personal safety, environment protection, adequate information of penetrated formations and productivity are fulfilled. Since drilling optimization is highly dependent on the rate of penetration (ROP), estimation of this parameter is of great importance during well planning. In this research, a novel approach called `optimized support vector regression' is employed for making a formulation between input variables and ROP. Algorithms used for optimizing the support vector regression are the genetic algorithm (GA) and the cuckoo search algorithm (CS). Optimization implementation improved the support vector regression performance by virtue of selecting proper values for its parameters. In order to evaluate the ability of optimization algorithms in enhancing SVR performance, their results were compared to the hybrid of pattern search and grid search (HPG) which is conventionally employed for optimizing SVR. The results demonstrated that the CS algorithm achieved further improvement on prediction accuracy of SVR compared to the GA and HPG as well. Moreover, the predictive model derived from back propagation neural network (BPNN), which is the traditional approach for estimating ROP, is selected for comparisons with CSSVR. The comparative results revealed the superiority of CSSVR. This study inferred that CSSVR is a viable option for precise estimation of ROP.
Mixed kernel function support vector regression for global sensitivity analysis
Cheng, Kai; Lu, Zhenzhou; Wei, Yuhao; Shi, Yan; Zhou, Yicheng
2017-11-01
Global sensitivity analysis (GSA) plays an important role in exploring the respective effects of input variables on an assigned output response. Amongst the wide sensitivity analyses in literature, the Sobol indices have attracted much attention since they can provide accurate information for most models. In this paper, a mixed kernel function (MKF) based support vector regression (SVR) model is employed to evaluate the Sobol indices at low computational cost. By the proposed derivation, the estimation of the Sobol indices can be obtained by post-processing the coefficients of the SVR meta-model. The MKF is constituted by the orthogonal polynomials kernel function and Gaussian radial basis kernel function, thus the MKF possesses both the global characteristic advantage of the polynomials kernel function and the local characteristic advantage of the Gaussian radial basis kernel function. The proposed approach is suitable for high-dimensional and non-linear problems. Performance of the proposed approach is validated by various analytical functions and compared with the popular polynomial chaos expansion (PCE). Results demonstrate that the proposed approach is an efficient method for global sensitivity analysis.
Directory of Open Access Journals (Sweden)
Santana Isabel
2011-08-01
Full Text Available Abstract Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI, but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing.
Wang, Xiaolei; Kuwahara, Hiroyuki; Gao, Xin
2014-01-01
high-quality estimates of such complex affinity landscapes is, thus, essential to the control of gene expression and the advance of synthetic biology. Results: Here, we propose a two-round prediction method that is based on support vector regression
Comparison of ν-support vector regression and logistic equation for ...
African Journals Online (AJOL)
Due to the complexity and high non-linearity of bioprocess, most simple mathematical models fail to describe the exact behavior of biochemistry systems. As a novel type of learning method, support vector regression (SVR) owns the powerful capability to characterize problems via small sample, nonlinearity, high dimension ...
Estimating Frequency by Interpolation Using Least Squares Support Vector Regression
Directory of Open Access Journals (Sweden)
Changwei Ma
2015-01-01
Full Text Available Discrete Fourier transform- (DFT- based maximum likelihood (ML algorithm is an important part of single sinusoid frequency estimation. As signal to noise ratio (SNR increases and is above the threshold value, it will lie very close to Cramer-Rao lower bound (CRLB, which is dependent on the number of DFT points. However, its mean square error (MSE performance is directly proportional to its calculation cost. As a modified version of support vector regression (SVR, least squares SVR (LS-SVR can not only still keep excellent capabilities for generalizing and fitting but also exhibit lower computational complexity. In this paper, therefore, LS-SVR is employed to interpolate on Fourier coefficients of received signals and attain high frequency estimation accuracy. Our results show that the proposed algorithm can make a good compromise between calculation cost and MSE performance under the assumption that the sample size, number of DFT points, and resampling points are already known.
Electricity Load Forecasting Using Support Vector Regression with Memetic Algorithms
Directory of Open Access Journals (Sweden)
Zhongyi Hu
2013-01-01
Full Text Available Electricity load forecasting is an important issue that is widely explored and examined in power systems operation literature and commercial transactions in electricity markets literature as well. Among the existing forecasting models, support vector regression (SVR has gained much attention. Considering the performance of SVR highly depends on its parameters; this study proposed a firefly algorithm (FA based memetic algorithm (FA-MA to appropriately determine the parameters of SVR forecasting model. In the proposed FA-MA algorithm, the FA algorithm is applied to explore the solution space, and the pattern search is used to conduct individual learning and thus enhance the exploitation of FA. Experimental results confirm that the proposed FA-MA based SVR model can not only yield more accurate forecasting results than the other four evolutionary algorithms based SVR models and three well-known forecasting models but also outperform the hybrid algorithms in the related existing literature.
Support vector regression model based predictive control of water level of U-tube steam generators
Energy Technology Data Exchange (ETDEWEB)
Kavaklioglu, Kadir, E-mail: kadir.kavaklioglu@pau.edu.tr
2014-10-15
Highlights: • Water level of U-tube steam generators was controlled in a model predictive fashion. • Models for steam generator water level were built using support vector regression. • Cost function minimization for future optimal controls was performed by using the steepest descent method. • The results indicated the feasibility of the proposed method. - Abstract: A predictive control algorithm using support vector regression based models was proposed for controlling the water level of U-tube steam generators of pressurized water reactors. Steam generator data were obtained using a transfer function model of U-tube steam generators. Support vector regression based models were built using a time series type model structure for five different operating powers. Feedwater flow controls were calculated by minimizing a cost function that includes the level error, the feedwater change and the mismatch between feedwater and steam flow rates. Proposed algorithm was applied for a scenario consisting of a level setpoint change and a steam flow disturbance. The results showed that steam generator level can be controlled at all powers effectively by the proposed method.
International Nuclear Information System (INIS)
Riaz, Nadeem; Wiersma, Rodney; Mao Weihua; Xing Lei; Shanker, Piyush; Gudmundsson, Olafur; Widrow, Bernard
2009-01-01
Intra-fraction tumor tracking methods can improve radiation delivery during radiotherapy sessions. Image acquisition for tumor tracking and subsequent adjustment of the treatment beam with gating or beam tracking introduces time latency and necessitates predicting the future position of the tumor. This study evaluates the use of multi-dimensional linear adaptive filters and support vector regression to predict the motion of lung tumors tracked at 30 Hz. We expand on the prior work of other groups who have looked at adaptive filters by using a general framework of a multiple-input single-output (MISO) adaptive system that uses multiple correlated signals to predict the motion of a tumor. We compare the performance of these two novel methods to conventional methods like linear regression and single-input, single-output adaptive filters. At 400 ms latency the average root-mean-square-errors (RMSEs) for the 14 treatment sessions studied using no prediction, linear regression, single-output adaptive filter, MISO and support vector regression are 2.58, 1.60, 1.58, 1.71 and 1.26 mm, respectively. At 1 s, the RMSEs are 4.40, 2.61, 3.34, 2.66 and 1.93 mm, respectively. We find that support vector regression most accurately predicts the future tumor position of the methods studied and can provide a RMSE of less than 2 mm at 1 s latency. Also, a multi-dimensional adaptive filter framework provides improved performance over single-dimension adaptive filters. Work is underway to combine these two frameworks to improve performance.
Modeling and prediction of Turkey's electricity consumption using Support Vector Regression
International Nuclear Information System (INIS)
Kavaklioglu, Kadir
2011-01-01
Support Vector Regression (SVR) methodology is used to model and predict Turkey's electricity consumption. Among various SVR formalisms, ε-SVR method was used since the training pattern set was relatively small. Electricity consumption is modeled as a function of socio-economic indicators such as population, Gross National Product, imports and exports. In order to facilitate future predictions of electricity consumption, a separate SVR model was created for each of the input variables using their current and past values; and these models were combined to yield consumption prediction values. A grid search for the model parameters was performed to find the best ε-SVR model for each variable based on Root Mean Square Error. Electricity consumption of Turkey is predicted until 2026 using data from 1975 to 2006. The results show that electricity consumption can be modeled using Support Vector Regression and the models can be used to predict future electricity consumption. (author)
A Novel Empirical Mode Decomposition With Support Vector Regression for Wind Speed Forecasting.
Ren, Ye; Suganthan, Ponnuthurai Nagaratnam; Srikanth, Narasimalu
2016-08-01
Wind energy is a clean and an abundant renewable energy source. Accurate wind speed forecasting is essential for power dispatch planning, unit commitment decision, maintenance scheduling, and regulation. However, wind is intermittent and wind speed is difficult to predict. This brief proposes a novel wind speed forecasting method by integrating empirical mode decomposition (EMD) and support vector regression (SVR) methods. The EMD is used to decompose the wind speed time series into several intrinsic mode functions (IMFs) and a residue. Subsequently, a vector combining one historical data from each IMF and the residue is generated to train the SVR. The proposed EMD-SVR model is evaluated with a wind speed data set. The proposed EMD-SVR model outperforms several recently reported methods with respect to accuracy or computational complexity.
Prediction of hourly PM2.5 using a space-time support vector regression model
Yang, Wentao; Deng, Min; Xu, Feng; Wang, Hang
2018-05-01
Real-time air quality prediction has been an active field of research in atmospheric environmental science. The existing methods of machine learning are widely used to predict pollutant concentrations because of their enhanced ability to handle complex non-linear relationships. However, because pollutant concentration data, as typical geospatial data, also exhibit spatial heterogeneity and spatial dependence, they may violate the assumptions of independent and identically distributed random variables in most of the machine learning methods. As a result, a space-time support vector regression model is proposed to predict hourly PM2.5 concentrations. First, to address spatial heterogeneity, spatial clustering is executed to divide the study area into several homogeneous or quasi-homogeneous subareas. To handle spatial dependence, a Gauss vector weight function is then developed to determine spatial autocorrelation variables as part of the input features. Finally, a local support vector regression model with spatial autocorrelation variables is established for each subarea. Experimental data on PM2.5 concentrations in Beijing are used to verify whether the results of the proposed model are superior to those of other methods.
Dougherty, Andrew W.
Metal oxides are a staple of the sensor industry. The combination of their sensitivity to a number of gases, and the electrical nature of their sensing mechanism, make the particularly attractive in solid state devices. The high temperature stability of the ceramic material also make them ideal for detecting combustion byproducts where exhaust temperatures can be high. However, problems do exist with metal oxide sensors. They are not very selective as they all tend to be sensitive to a number of reduction and oxidation reactions on the oxide's surface. This makes sensors with large numbers of sensors interesting to study as a method for introducing orthogonality to the system. Also, the sensors tend to suffer from long term drift for a number of reasons. In this thesis I will develop a system for intelligently modeling metal oxide sensors and determining their suitability for use in large arrays designed to analyze exhaust gas streams. It will introduce prior knowledge of the metal oxide sensors' response mechanisms in order to produce a response function for each sensor from sparse training data. The system will use the same technique to model and remove any long term drift from the sensor response. It will also provide an efficient means for determining the orthogonality of the sensor to determine whether they are useful in gas sensing arrays. The system is based on least squares support vector regression using the reciprocal kernel. The reciprocal kernel is introduced along with a method of optimizing the free parameters of the reciprocal kernel support vector machine. The reciprocal kernel is shown to be simpler and to perform better than an earlier kernel, the modified reciprocal kernel. Least squares support vector regression is chosen as it uses all of the training points and an emphasis was placed throughout this research for extracting the maximum information from very sparse data. The reciprocal kernel is shown to be effective in modeling the sensor
Support vector regression for porosity prediction in a heterogeneous reservoir: A comparative study
Al-Anazi, A. F.; Gates, I. D.
2010-12-01
In wells with limited log and core data, porosity, a fundamental and essential property to characterize reservoirs, is challenging to estimate by conventional statistical methods from offset well log and core data in heterogeneous formations. Beyond simple regression, neural networks have been used to develop more accurate porosity correlations. Unfortunately, neural network-based correlations have limited generalization ability and global correlations for a field are usually less accurate compared to local correlations for a sub-region of the reservoir. In this paper, support vector machines are explored as an intelligent technique to correlate porosity to well log data. Recently, support vector regression (SVR), based on the statistical learning theory, have been proposed as a new intelligence technique for both prediction and classification tasks. The underlying formulation of support vector machines embodies the structural risk minimization (SRM) principle which has been shown to be superior to the traditional empirical risk minimization (ERM) principle employed by conventional neural networks and classical statistical methods. This new formulation uses margin-based loss functions to control model complexity independently of the dimensionality of the input space, and kernel functions to project the estimation problem to a higher dimensional space, which enables the solution of more complex nonlinear problem optimization methods to exist for a globally optimal solution. SRM minimizes an upper bound on the expected risk using a margin-based loss function ( ɛ-insensitivity loss function for regression) in contrast to ERM which minimizes the error on the training data. Unlike classical learning methods, SRM, indexed by margin-based loss function, can also control model complexity independent of dimensionality. The SRM inductive principle is designed for statistical estimation with finite data where the ERM inductive principle provides the optimal solution (the
Linear and support vector regressions based on geometrical correlation of data
Directory of Open Access Journals (Sweden)
Kaijun Wang
2007-10-01
Full Text Available Linear regression (LR and support vector regression (SVR are widely used in data analysis. Geometrical correlation learning (GcLearn was proposed recently to improve the predictive ability of LR and SVR through mining and using correlations between data of a variable (inner correlation. This paper theoretically analyzes prediction performance of the GcLearn method and proves that GcLearn LR and SVR will have better prediction performance than traditional LR and SVR for prediction tasks when good inner correlations are obtained and predictions by traditional LR and SVR are far away from their neighbor training data under inner correlation. This gives the applicable condition of GcLearn method.
Development of precursors recognition methods in vector signals
Kapralov, V. G.; Elagin, V. V.; Kaveeva, E. G.; Stankevich, L. A.; Dremin, M. M.; Krylov, S. V.; Borovov, A. E.; Harfush, H. A.; Sedov, K. S.
2017-10-01
Precursor recognition methods in vector signals of plasma diagnostics are presented. Their requirements and possible options for their development are considered. In particular, the variants of using symbolic regression for building a plasma disruption prediction system are discussed. The initial data preparation using correlation analysis and symbolic regression is discussed. Special attention is paid to the possibility of using algorithms in real time.
A multi-scale relevance vector regression approach for daily urban water demand forecasting
Bai, Yun; Wang, Pu; Li, Chuan; Xie, Jingjing; Wang, Yin
2014-09-01
Water is one of the most important resources for economic and social developments. Daily water demand forecasting is an effective measure for scheduling urban water facilities. This work proposes a multi-scale relevance vector regression (MSRVR) approach to forecast daily urban water demand. The approach uses the stationary wavelet transform to decompose historical time series of daily water supplies into different scales. At each scale, the wavelet coefficients are used to train a machine-learning model using the relevance vector regression (RVR) method. The estimated coefficients of the RVR outputs for all of the scales are employed to reconstruct the forecasting result through the inverse wavelet transform. To better facilitate the MSRVR forecasting, the chaos features of the daily water supply series are analyzed to determine the input variables of the RVR model. In addition, an adaptive chaos particle swarm optimization algorithm is used to find the optimal combination of the RVR model parameters. The MSRVR approach is evaluated using real data collected from two waterworks and is compared with recently reported methods. The results show that the proposed MSRVR method can forecast daily urban water demand much more precisely in terms of the normalized root-mean-square error, correlation coefficient, and mean absolute percentage error criteria.
Support Vector Regression and Genetic Algorithm for HVAC Optimal Operation
Directory of Open Access Journals (Sweden)
Ching-Wei Chen
2016-01-01
Full Text Available This study covers records of various parameters affecting the power consumption of air-conditioning systems. Using the Support Vector Machine (SVM, the chiller power consumption model, secondary chilled water pump power consumption model, air handling unit fan power consumption model, and air handling unit load model were established. In addition, it was found that R2 of the models all reached 0.998, and the training time was far shorter than that of the neural network. Through genetic programming, a combination of operating parameters with the least power consumption of air conditioning operation was searched. Moreover, the air handling unit load in line with the air conditioning cooling load was predicted. The experimental results show that for the combination of operating parameters with the least power consumption in line with the cooling load obtained through genetic algorithm search, the power consumption of the air conditioning systems under said combination of operating parameters was reduced by 22% compared to the fixed operating parameters, thus indicating significant energy efficiency.
Directory of Open Access Journals (Sweden)
Changhao Fan
2017-01-01
Full Text Available In modeling, only information from the deviation between the output of the support vector regression (SVR model and the training sample is considered, whereas the other prior information of the training sample, such as probability distribution information, is ignored. Probabilistic distribution information describes the overall distribution of sample data in a training sample that contains different degrees of noise and potential outliers, as well as helping develop a high-accuracy model. To mine and use the probability distribution information of a training sample, a new support vector regression model that incorporates probability distribution information weight SVR (PDISVR is proposed. In the PDISVR model, the probability distribution of each sample is considered as the weight and is then introduced into the error coefficient and slack variables of SVR. Thus, the deviation and probability distribution information of the training sample are both used in the PDISVR model to eliminate the influence of noise and outliers in the training sample and to improve predictive performance. Furthermore, examples with different degrees of noise were employed to demonstrate the performance of PDISVR, which was then compared with those of three SVR-based methods. The results showed that PDISVR performs better than the three other methods.
Prediction of Spirometric Forced Expiratory Volume (FEV1) Data Using Support Vector Regression
Kavitha, A.; Sujatha, C. M.; Ramakrishnan, S.
2010-01-01
In this work, prediction of forced expiratory volume in 1 second (FEV1) in pulmonary function test is carried out using the spirometer and support vector regression analysis. Pulmonary function data are measured with flow volume spirometer from volunteers (N=175) using a standard data acquisition protocol. The acquired data are then used to predict FEV1. Support vector machines with polynomial kernel function with four different orders were employed to predict the values of FEV1. The performance is evaluated by computing the average prediction accuracy for normal and abnormal cases. Results show that support vector machines are capable of predicting FEV1 in both normal and abnormal cases and the average prediction accuracy for normal subjects was higher than that of abnormal subjects. Accuracy in prediction was found to be high for a regularization constant of C=10. Since FEV1 is the most significant parameter in the analysis of spirometric data, it appears that this method of assessment is useful in diagnosing the pulmonary abnormalities with incomplete data and data with poor recording.
Directory of Open Access Journals (Sweden)
Hongjian Wang
2014-01-01
Full Text Available We present a support vector regression-based adaptive divided difference filter (SVRADDF algorithm for improving the low state estimation accuracy of nonlinear systems, which are typically affected by large initial estimation errors and imprecise prior knowledge of process and measurement noises. The derivative-free SVRADDF algorithm is significantly simpler to compute than other methods and is implemented using only functional evaluations. The SVRADDF algorithm involves the use of the theoretical and actual covariance of the innovation sequence. Support vector regression (SVR is employed to generate the adaptive factor to tune the noise covariance at each sampling instant when the measurement update step executes, which improves the algorithm’s robustness. The performance of the proposed algorithm is evaluated by estimating states for (i an underwater nonmaneuvering target bearing-only tracking system and (ii maneuvering target bearing-only tracking in an air-traffic control system. The simulation results show that the proposed SVRADDF algorithm exhibits better performance when compared with a traditional DDF algorithm.
Noise reduction by support vector regression with a Ricker wavelet kernel
International Nuclear Information System (INIS)
Deng, Xiaoying; Yang, Dinghui; Xie, Jing
2009-01-01
We propose a noise filtering technology based on the least-squares support vector regression (LS-SVR), to improve the signal-to-noise ratio (SNR) of seismic data. We modified it by using an admissible support vector (SV) kernel, namely the Ricker wavelet kernel, to replace the conventional radial basis function (RBF) kernel in seismic data processing. We investigated the selection of the regularization parameter for the LS-SVR and derived a concise selecting formula directly from the noisy data. We used the proposed method for choosing the regularization parameter which not only had the advantage of high speed but could also obtain almost the same effectiveness as an optimal parameter method. We conducted experiments using synthetic data corrupted by the random noise of different types and levels, and found that our method was superior to the wavelet transform-based approach and the Wiener filtering. We also applied the method to two field seismic data sets and concluded that it was able to effectively suppress the random noise and improve the data quality in terms of SNR
Noise reduction by support vector regression with a Ricker wavelet kernel
Deng, Xiaoying; Yang, Dinghui; Xie, Jing
2009-06-01
We propose a noise filtering technology based on the least-squares support vector regression (LS-SVR), to improve the signal-to-noise ratio (SNR) of seismic data. We modified it by using an admissible support vector (SV) kernel, namely the Ricker wavelet kernel, to replace the conventional radial basis function (RBF) kernel in seismic data processing. We investigated the selection of the regularization parameter for the LS-SVR and derived a concise selecting formula directly from the noisy data. We used the proposed method for choosing the regularization parameter which not only had the advantage of high speed but could also obtain almost the same effectiveness as an optimal parameter method. We conducted experiments using synthetic data corrupted by the random noise of different types and levels, and found that our method was superior to the wavelet transform-based approach and the Wiener filtering. We also applied the method to two field seismic data sets and concluded that it was able to effectively suppress the random noise and improve the data quality in terms of SNR.
Fruit fly optimization based least square support vector regression for blind image restoration
Zhang, Jiao; Wang, Rui; Li, Junshan; Yang, Yawei
2014-11-01
The goal of image restoration is to reconstruct the original scene from a degraded observation. It is a critical and challenging task in image processing. Classical restorations require explicit knowledge of the point spread function and a description of the noise as priors. However, it is not practical for many real image processing. The recovery processing needs to be a blind image restoration scenario. Since blind deconvolution is an ill-posed problem, many blind restoration methods need to make additional assumptions to construct restrictions. Due to the differences of PSF and noise energy, blurring images can be quite different. It is difficult to achieve a good balance between proper assumption and high restoration quality in blind deconvolution. Recently, machine learning techniques have been applied to blind image restoration. The least square support vector regression (LSSVR) has been proven to offer strong potential in estimating and forecasting issues. Therefore, this paper proposes a LSSVR-based image restoration method. However, selecting the optimal parameters for support vector machine is essential to the training result. As a novel meta-heuristic algorithm, the fruit fly optimization algorithm (FOA) can be used to handle optimization problems, and has the advantages of fast convergence to the global optimal solution. In the proposed method, the training samples are created from a neighborhood in the degraded image to the central pixel in the original image. The mapping between the degraded image and the original image is learned by training LSSVR. The two parameters of LSSVR are optimized though FOA. The fitness function of FOA is calculated by the restoration error function. With the acquired mapping, the degraded image can be recovered. Experimental results show the proposed method can obtain satisfactory restoration effect. Compared with BP neural network regression, SVR method and Lucy-Richardson algorithm, it speeds up the restoration rate and
Hu, Qinghua; Zhang, Shiguang; Xie, Zongxia; Mi, Jusheng; Wan, Jie
2014-09-01
Support vector regression (SVR) techniques are aimed at discovering a linear or nonlinear structure hidden in sample data. Most existing regression techniques take the assumption that the error distribution is Gaussian. However, it was observed that the noise in some real-world applications, such as wind power forecasting and direction of the arrival estimation problem, does not satisfy Gaussian distribution, but a beta distribution, Laplacian distribution, or other models. In these cases the current regression techniques are not optimal. According to the Bayesian approach, we derive a general loss function and develop a technique of the uniform model of ν-support vector regression for the general noise model (N-SVR). The Augmented Lagrange Multiplier method is introduced to solve N-SVR. Numerical experiments on artificial data sets, UCI data and short-term wind speed prediction are conducted. The results show the effectiveness of the proposed technique. Copyright © 2014 Elsevier Ltd. All rights reserved.
Estimating transmitted waves of floating breakwater using support vector regression model
Digital Repository Service at National Institute of Oceanography (India)
Mandal, S.; Hegde, A.V.; Kumar, V.; Patil, S.G.
is first mapped onto an m-dimensional feature space using some fixed (nonlinear) mapping, and then a linear model is constructed in this feature space (Ivanciuc Ovidiu 2007). Using mathematical notation, the linear model in the feature space f(x, w... regressive vector machines, Ocean Engineering Journal, Vol – 36, pp 339 – 347, 2009. 3. Ivanciuc Ovidiu, Applications of support vector machines in chemistry, Review in Computational Chemistry, Eds K. B. Lipkouitz and T. R. Cundari, Vol – 23...
Regression modeling methods, theory, and computation with SAS
Panik, Michael
2009-01-01
Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs.The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,
Li, Tao
2018-06-01
The complexity of aluminum electrolysis process leads the temperature for aluminum reduction cells hard to measure directly. However, temperature is the control center of aluminum production. To solve this problem, combining some aluminum plant's practice data, this paper presents a Soft-sensing model of temperature for aluminum electrolysis process on Improved Twin Support Vector Regression (ITSVR). ITSVR eliminates the slow learning speed of Support Vector Regression (SVR) and the over-fit risk of Twin Support Vector Regression (TSVR) by introducing a regularization term into the objective function of TSVR, which ensures the structural risk minimization principle and lower computational complexity. Finally, the model with some other parameters as auxiliary variable, predicts the temperature by ITSVR. The simulation result shows Soft-sensing model based on ITSVR has short time-consuming and better generalization.
Wavelength detection in FBG sensor networks using least squares support vector regression
Chen, Jing; Jiang, Hao; Liu, Tundong; Fu, Xiaoli
2014-04-01
A wavelength detection method for a wavelength division multiplexing (WDM) fiber Bragg grating (FBG) sensor network is proposed based on least squares support vector regression (LS-SVR). As a kind of promising machine learning technique, LS-SVR is employed to approximate the inverse function of the reflection spectrum. The LS-SVR detection model is established from the training samples, and then the Bragg wavelength of each FBG can be directly identified by inputting the measured spectrum into the well-trained model. We also discuss the impact of the sample size and the preprocess of the input spectrum on the performance of the training effectiveness. The results demonstrate that our approach is effective in improving the accuracy for sensor networks with a large number of FBGs.
International Nuclear Information System (INIS)
Seo, In Yong; Ha, Bok Nam; Lee, Sung Woo; Shin, Chang Hoon; Kim, Seong Jun
2010-01-01
In nuclear power plants (NPPs), periodic sensor calibrations are required to assure that sensors are operating correctly. By checking the sensor's operating status at every fuel outage, faulty sensors may remain undetected for periods of up to 24 months. Moreover, typically, only a few faulty sensors are found to be calibrated. For the safe operation of NPP and the reduction of unnecessary calibration, on-line instrument calibration monitoring is needed. In this study, principal component based auto-associative support vector regression (PCSVR) using response surface methodology (RSM) is proposed for the sensor signal validation of NPPs. This paper describes the design of a PCSVR-based sensor validation system for a power generation system. RSM is employed to determine the optimal values of SVR hyperparameters and is compared to the genetic algorithm (GA). The proposed PCSVR model is confirmed with the actual plant data of Kori Nuclear Power Plant Unit 3 and is compared with the Auto-Associative support vector regression (AASVR) and the auto-associative neural network (AANN) model. The auto-sensitivity of AASVR is improved by around six times by using a PCA, resulting in good detection of sensor drift. Compared to AANN, accuracy and cross-sensitivity are better while the auto-sensitivity is almost the same. Meanwhile, the proposed RSM for the optimization of the PCSVR algorithm performs even better in terms of accuracy, auto-sensitivity, and averaged maximum error, except in averaged RMS error, and this method is much more time efficient compared to the conventional GA method
Application of Hybrid Quantum Tabu Search with Support Vector Regression (SVR for Load Forecasting
Directory of Open Access Journals (Sweden)
Cheng-Wen Lee
2016-10-01
Full Text Available Hybridizing chaotic evolutionary algorithms with support vector regression (SVR to improve forecasting accuracy is a hot topic in electricity load forecasting. Trapping at local optima and premature convergence are critical shortcomings of the tabu search (TS algorithm. This paper investigates potential improvements of the TS algorithm by applying quantum computing mechanics to enhance the search information sharing mechanism (tabu memory to improve the forecasting accuracy. This article presents an SVR-based load forecasting model that integrates quantum behaviors and the TS algorithm with the support vector regression model (namely SVRQTS to obtain a more satisfactory forecasting accuracy. Numerical examples demonstrate that the proposed model outperforms the alternatives.
Using support vector regression to predict PM10 and PM2.5
International Nuclear Information System (INIS)
Weizhen, Hou; Zhengqiang, Li; Yuhuan, Zhang; Hua, Xu; Ying, Zhang; Kaitao, Li; Donghui, Li; Peng, Wei; Yan, Ma
2014-01-01
Support vector machine (SVM), as a novel and powerful machine learning tool, can be used for the prediction of PM 10 and PM 2.5 (particulate matter less or equal than 10 and 2.5 micrometer) in the atmosphere. This paper describes the development of a successive over relaxation support vector regress (SOR-SVR) model for the PM 10 and PM 2.5 prediction, based on the daily average aerosol optical depth (AOD) and meteorological parameters (atmospheric pressure, relative humidity, air temperature, wind speed), which were all measured in Beijing during the year of 2010–2012. The Gaussian kernel function, as well as the k-fold crosses validation and grid search method, are used in SVR model to obtain the optimal parameters to get a better generalization capability. The result shows that predicted values by the SOR-SVR model agree well with the actual data and have a good generalization ability to predict PM 10 and PM 2.5 . In addition, AOD plays an important role in predicting particulate matter with SVR model, which should be included in the prediction model. If only considering the meteorological parameters and eliminating AOD from the SVR model, the prediction results of predict particulate matter will be not satisfying
Directory of Open Access Journals (Sweden)
Suduan Chen
2014-01-01
Full Text Available As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%.
Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De
2014-01-01
As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%.
Failure and reliability prediction by support vector machines regression of time series data
International Nuclear Information System (INIS)
Chagas Moura, Marcio das; Zio, Enrico; Lins, Isis Didier; Droguett, Enrique
2011-01-01
Support Vector Machines (SVMs) are kernel-based learning methods, which have been successfully adopted for regression problems. However, their use in reliability applications has not been widely explored. In this paper, a comparative analysis is presented in order to evaluate the SVM effectiveness in forecasting time-to-failure and reliability of engineered components based on time series data. The performance on literature case studies of SVM regression is measured against other advanced learning methods such as the Radial Basis Function, the traditional MultiLayer Perceptron model, Box-Jenkins autoregressive-integrated-moving average and the Infinite Impulse Response Locally Recurrent Neural Networks. The comparison shows that in the analyzed cases, SVM outperforms or is comparable to other techniques. - Highlights: → Realistic modeling of reliability demands complex mathematical formulations. → SVM is proper when the relation input/output is unknown or very costly to be obtained. → Results indicate the potential of SVM for reliability time series prediction. → Reliability estimates support the establishment of adequate maintenance strategies.
Support vector regression to predict porosity and permeability: Effect of sample size
Al-Anazi, A. F.; Gates, I. D.
2012-02-01
Porosity and permeability are key petrophysical parameters obtained from laboratory core analysis. Cores, obtained from drilled wells, are often few in number for most oil and gas fields. Porosity and permeability correlations based on conventional techniques such as linear regression or neural networks trained with core and geophysical logs suffer poor generalization to wells with only geophysical logs. The generalization problem of correlation models often becomes pronounced when the training sample size is small. This is attributed to the underlying assumption that conventional techniques employing the empirical risk minimization (ERM) inductive principle converge asymptotically to the true risk values as the number of samples increases. In small sample size estimation problems, the available training samples must span the complexity of the parameter space so that the model is able both to match the available training samples reasonably well and to generalize to new data. This is achieved using the structural risk minimization (SRM) inductive principle by matching the capability of the model to the available training data. One method that uses SRM is support vector regression (SVR) network. In this research, the capability of SVR to predict porosity and permeability in a heterogeneous sandstone reservoir under the effect of small sample size is evaluated. Particularly, the impact of Vapnik's ɛ-insensitivity loss function and least-modulus loss function on generalization performance was empirically investigated. The results are compared to the multilayer perception (MLP) neural network, a widely used regression method, which operates under the ERM principle. The mean square error and correlation coefficients were used to measure the quality of predictions. The results demonstrate that SVR yields consistently better predictions of the porosity and permeability with small sample size than the MLP method. Also, the performance of SVR depends on both kernel function
Analyzing Big Data with the Hybrid Interval Regression Methods
Directory of Open Access Journals (Sweden)
Chia-Hui Huang
2014-01-01
Full Text Available Big data is a new trend at present, forcing the significant impacts on information technologies. In big data applications, one of the most concerned issues is dealing with large-scale data sets that often require computation resources provided by public cloud services. How to analyze big data efficiently becomes a big challenge. In this paper, we collaborate interval regression with the smooth support vector machine (SSVM to analyze big data. Recently, the smooth support vector machine (SSVM was proposed as an alternative of the standard SVM that has been proved more efficient than the traditional SVM in processing large-scale data. In addition the soft margin method is proposed to modify the excursion of separation margin and to be effective in the gray zone that the distribution of data becomes hard to be described and the separation margin between classes.
Stochastic development regression using method of moments
DEFF Research Database (Denmark)
Kühnel, Line; Sommer, Stefan Horst
2017-01-01
This paper considers the estimation problem arising when inferring parameters in the stochastic development regression model for manifold valued non-linear data. Stochastic development regression captures the relation between manifold-valued response and Euclidean covariate variables using...... the stochastic development construction. It is thereby able to incorporate several covariate variables and random effects. The model is intrinsically defined using the connection of the manifold, and the use of stochastic development avoids linearizing the geometry. We propose to infer parameters using...... the Method of Moments procedure that matches known constraints on moments of the observations conditional on the latent variables. The performance of the model is investigated in a simulation example using data on finite dimensional landmark manifolds....
Dai, Wensheng
2014-01-01
Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating feature extraction method and prediction tool, such as support vector regression (SVR), is a useful method for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to deal with various forecasting problems. But, up to now, only the basic ICA method (i.e., temporal ICA model) was applied to sale forecasting problem. In this paper, we utilize three different ICA methods including spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA) to extract features from the sales data and compare their performance in sales forecasting of IT chain store. Experimental results from a real sales data show that the sales forecasting scheme by integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data and the extracted features can improve the prediction performance of SVR for sales forecasting. PMID:25165740
Dai, Wensheng; Wu, Jui-Yu; Lu, Chi-Jie
2014-01-01
Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating feature extraction method and prediction tool, such as support vector regression (SVR), is a useful method for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to deal with various forecasting problems. But, up to now, only the basic ICA method (i.e., temporal ICA model) was applied to sale forecasting problem. In this paper, we utilize three different ICA methods including spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA) to extract features from the sales data and compare their performance in sales forecasting of IT chain store. Experimental results from a real sales data show that the sales forecasting scheme by integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data and the extracted features can improve the prediction performance of SVR for sales forecasting.
Directory of Open Access Journals (Sweden)
Wensheng Dai
2014-01-01
Full Text Available Sales forecasting is one of the most important issues in managing information technology (IT chain store sales since an IT chain store has many branches. Integrating feature extraction method and prediction tool, such as support vector regression (SVR, is a useful method for constructing an effective sales forecasting scheme. Independent component analysis (ICA is a novel feature extraction technique and has been widely applied to deal with various forecasting problems. But, up to now, only the basic ICA method (i.e., temporal ICA model was applied to sale forecasting problem. In this paper, we utilize three different ICA methods including spatial ICA (sICA, temporal ICA (tICA, and spatiotemporal ICA (stICA to extract features from the sales data and compare their performance in sales forecasting of IT chain store. Experimental results from a real sales data show that the sales forecasting scheme by integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data and the extracted features can improve the prediction performance of SVR for sales forecasting.
International Nuclear Information System (INIS)
Che Jinxing; Wang Jianzhou
2010-01-01
In this paper, we present the use of different mathematical models to forecast electricity price under deregulated power. A successful prediction tool of electricity price can help both power producers and consumers plan their bidding strategies. Inspired by that the support vector regression (SVR) model, with the ε-insensitive loss function, admits of the residual within the boundary values of ε-tube, we propose a hybrid model that combines both SVR and Auto-regressive integrated moving average (ARIMA) models to take advantage of the unique strength of SVR and ARIMA models in nonlinear and linear modeling, which is called SVRARIMA. A nonlinear analysis of the time-series indicates the convenience of nonlinear modeling, the SVR is applied to capture the nonlinear patterns. ARIMA models have been successfully applied in solving the residuals regression estimation problems. The experimental results demonstrate that the model proposed outperforms the existing neural-network approaches, the traditional ARIMA models and other hybrid models based on the root mean square error and mean absolute percentage error.
International Nuclear Information System (INIS)
Jiang, B.T.; Zhao, F.Y.
2013-01-01
Highlights: ► CHF data are collected from the published literature. ► Less training data are used to train the LSSVR model. ► PSO is adopted to optimize the key parameters to improve the model precision. ► The reliability of LSSVR is proved through parametric trends analysis. - Abstract: In view of practical importance of critical heat flux (CHF) for design and safety of nuclear reactors, accurate prediction of CHF is of utmost significance. This paper presents a novel approach using least squares support vector regression (LSSVR) and particle swarm optimization (PSO) to predict CHF. Two available published datasets are used to train and test the proposed algorithm, in which PSO is employed to search for the best parameters involved in LSSVR model. The CHF values obtained by the LSSVR model are compared with the corresponding experimental values and those of a previous method, adaptive neuro fuzzy inference system (ANFIS). This comparison is also carried out in the investigation of parametric trends of CHF. It is found that the proposed method can achieve the desired performance and yields a more satisfactory fit with experimental results than ANFIS. Therefore, LSSVR method is likely to be suitable for other parameters processing such as CHF
Feature Vector Construction Method for IRIS Recognition
Odinokikh, G.; Fartukov, A.; Korobkin, M.; Yoo, J.
2017-05-01
One of the basic stages of iris recognition pipeline is iris feature vector construction procedure. The procedure represents the extraction of iris texture information relevant to its subsequent comparison. Thorough investigation of feature vectors obtained from iris showed that not all the vector elements are equally relevant. There are two characteristics which determine the vector element utility: fragility and discriminability. Conventional iris feature extraction methods consider the concept of fragility as the feature vector instability without respect to the nature of such instability appearance. This work separates sources of the instability into natural and encodinginduced which helps deeply investigate each source of instability independently. According to the separation concept, a novel approach of iris feature vector construction is proposed. The approach consists of two steps: iris feature extraction using Gabor filtering with optimal parameters and quantization with separated preliminary optimized fragility thresholds. The proposed method has been tested on two different datasets of iris images captured under changing environmental conditions. The testing results show that the proposed method surpasses all the methods considered as a prior art by recognition accuracy on both datasets.
Directory of Open Access Journals (Sweden)
Shahrbanoo Goli
2016-01-01
Full Text Available The Support Vector Regression (SVR model has been broadly used for response prediction. However, few researchers have used SVR for survival analysis. In this study, a new SVR model is proposed and SVR with different kernels and the traditional Cox model are trained. The models are compared based on different performance measures. We also select the best subset of features using three feature selection methods: combination of SVR and statistical tests, univariate feature selection based on concordance index, and recursive feature elimination. The evaluations are performed using available medical datasets and also a Breast Cancer (BC dataset consisting of 573 patients who visited the Oncology Clinic of Hamadan province in Iran. Results show that, for the BC dataset, survival time can be predicted more accurately by linear SVR than nonlinear SVR. Based on the three feature selection methods, metastasis status, progesterone receptor status, and human epidermal growth factor receptor 2 status are the best features associated to survival. Also, according to the obtained results, performance of linear and nonlinear kernels is comparable. The proposed SVR model performs similar to or slightly better than other models. Also, SVR performs similar to or better than Cox when all features are included in model.
Directory of Open Access Journals (Sweden)
Jianzhou Wang
2015-01-01
Full Text Available This paper develops an effectively intelligent model to forecast short-term wind speed series. A hybrid forecasting technique is proposed based on recurrence plot (RP and optimized support vector regression (SVR. Wind caused by the interaction of meteorological systems makes itself extremely unsteady and difficult to forecast. To understand the wind system, the wind speed series is analyzed using RP. Then, the SVR model is employed to forecast wind speed, in which the input variables are selected by RP, and two crucial parameters, including the penalties factor and gamma of the kernel function RBF, are optimized by various optimization algorithms. Those optimized algorithms are genetic algorithm (GA, particle swarm optimization algorithm (PSO, and cuckoo optimization algorithm (COA. Finally, the optimized SVR models, including COA-SVR, PSO-SVR, and GA-SVR, are evaluated based on some criteria and a hypothesis test. The experimental results show that (1 analysis of RP reveals that wind speed has short-term predictability on a short-term time scale, (2 the performance of the COA-SVR model is superior to that of the PSO-SVR and GA-SVR methods, especially for the jumping samplings, and (3 the COA-SVR method is statistically robust in multi-step-ahead prediction and can be applied to practical wind farm applications.
Directory of Open Access Journals (Sweden)
Meiping Wang
2016-01-01
Full Text Available We developed an effective intelligent model to predict the dynamic heat supply of heat source. A hybrid forecasting method was proposed based on support vector regression (SVR model-optimized particle swarm optimization (PSO algorithms. Due to the interaction of meteorological conditions and the heating parameters of heating system, it is extremely difficult to forecast dynamic heat supply. Firstly, the correlations among heat supply and related influencing factors in the heating system were analyzed through the correlation analysis of statistical theory. Then, the SVR model was employed to forecast dynamic heat supply. In the model, the input variables were selected based on the correlation analysis and three crucial parameters, including the penalties factor, gamma of the kernel RBF, and insensitive loss function, were optimized by PSO algorithms. The optimized SVR model was compared with the basic SVR, optimized genetic algorithm-SVR (GA-SVR, and artificial neural network (ANN through six groups of experiment data from two heat sources. The results of the correlation coefficient analysis revealed the relationship between the influencing factors and the forecasted heat supply and determined the input variables. The performance of the PSO-SVR model is superior to those of the other three models. The PSO-SVR method is statistically robust and can be applied to practical heating system.
MANCOVA for one way classification with homogeneity of regression coefficient vectors
Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.
2017-11-01
The MANOVA and MANCOVA are the extensions of the univariate ANOVA and ANCOVA techniques to multidimensional or vector valued observations. The assumption of a Gaussian distribution has been replaced with the Multivariate Gaussian distribution for the vectors data and residual term variables in the statistical models of these techniques. The objective of MANCOVA is to determine if there are statistically reliable mean differences that can be demonstrated between groups later modifying the newly created variable. When randomization assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting dependent variables as if all subjects scored the same on the covariates. In this research article, an extension has been made to the MANCOVA technique with more number of covariates and homogeneity of regression coefficient vectors is also tested.
Water demand prediction using artificial neural networks and support vector regression
CSIR Research Space (South Africa)
Msiza, IS
2008-11-01
Full Text Available Neural Networks and Support Vector Regression Ishmael S. Msiza1, Fulufhelo V. Nelwamondo1,2, Tshilidzi Marwala3 . 1Modelling and Digital Science, CSIR, Johannesburg,SOUTH AFRICA 2Graduate School of Arts and Sciences, Harvard University, Cambridge..., Massachusetts, USA 3School of Electrical and Information Engineering, University of the Witwatersrand, Johannesburg, SOUTH AFRICA Email: imsiza@csir.co.za, nelwamon@fas.harvard.edu, tshilidzi.marwala@wits.ac.za Abstract— Computational Intelligence techniques...
Implicit Social Trust Dan Support Vector Regression Untuk Sistem Rekomendasi Berita
Directory of Open Access Journals (Sweden)
Melita Widya Ningrum
2018-01-01
Full Text Available Situs berita merupakan salah satu situs yang sering diakses masyarakat karena kemampuannya dalam menyajikan informasi terkini dari berbagai topik seperti olahraga, bisnis, politik, teknologi, kesehatan dan hiburan. Masyarakat dapat mencari dan melihat berita yang sedang populer dari seluruh dunia. Di sisi lain, melimpahnya artikel berita yang tersedia dapat menyulitkan pengguna dalam menemukan artikel berita yang sesuai dengan ketertarikannya. Pemilihan artikel berita yang ditampilkan ke halaman utama pengguna menjadi penting karena dapat meningkatkan minat pengguna untuk membaca artikel berita dari situs tersebut. Selain itu, pemilihan artikel berita yang sesuai dapat meminimalisir terjadinya banjir informasi yang tidak relevan. Dalam pemilihan artikel berita dibutuhkan sistem rekomendasi yang memiliki pengetahuan mengenai ketertarikan atau relevansi pengguna akan topik berita tertentu. Pada penelitian ini, peneliti membuat sistem rekomendasi artikel berita pada New York Times berbasis implicit social trust. Social trust dihasilkan dari interaksi antara pengguna dengan teman-temannya dan bobot kepercayaan teman pengguna pada media sosial Twitter. Data yang diambil merupakan data pengguna Twitter, teman dan jumlah interaksi antar pengguna berupa retweet. Sistem ini memanfaatkan algoritma Support Vector Regression untuk memberikan estimasi penilaian pengguna terhadap suatu topik tertentu. Hasil pengolahan data dengan Support Vector Regression menunjukkan tingkat akurasi dengan MAPE sebesar 0,8243075902233644%. Keywords : Twitter, Rekomendasi Berita, Social Trust, Support Vector Regression
Online Support Vector Regression with Varying Parameters for Time-Dependent Data
International Nuclear Information System (INIS)
Omitaomu, Olufemi A.; Jeong, Myong K.; Badiru, Adedeji B.
2011-01-01
Support vector regression (SVR) is a machine learning technique that continues to receive interest in several domains including manufacturing, engineering, and medicine. In order to extend its application to problems in which datasets arrive constantly and in which batch processing of the datasets is infeasible or expensive, an accurate online support vector regression (AOSVR) technique was proposed. The AOSVR technique efficiently updates a trained SVR function whenever a sample is added to or removed from the training set without retraining the entire training data. However, the AOSVR technique assumes that the new samples and the training samples are of the same characteristics; hence, the same value of SVR parameters is used for training and prediction. This assumption is not applicable to data samples that are inherently noisy and non-stationary such as sensor data. As a result, we propose Accurate On-line Support Vector Regression with Varying Parameters (AOSVR-VP) that uses varying SVR parameters rather than fixed SVR parameters, and hence accounts for the variability that may exist in the samples. To accomplish this objective, we also propose a generalized weight function to automatically update the weights of SVR parameters in on-line monitoring applications. The proposed function allows for lower and upper bounds for SVR parameters. We tested our proposed approach and compared results with the conventional AOSVR approach using two benchmark time series data and sensor data from nuclear power plant. The results show that using varying SVR parameters is more applicable to time dependent data.
Directory of Open Access Journals (Sweden)
S.K. Lahiri
2009-09-01
Full Text Available Soft sensors have been widely used in the industrial process control to improve the quality of the product and assure safety in the production. The core of a soft sensor is to construct a soft sensing model. This paper introduces support vector regression (SVR, a new powerful machine learning methodbased on a statistical learning theory (SLT into soft sensor modeling and proposes a new soft sensing modeling method based on SVR. This paper presents an artificial intelligence based hybrid soft sensormodeling and optimization strategies, namely support vector regression – genetic algorithm (SVR-GA for modeling and optimization of mono ethylene glycol (MEG quality variable in a commercial glycol plant. In the SVR-GA approach, a support vector regression model is constructed for correlating the process data comprising values of operating and performance variables. Next, model inputs describing the process operating variables are optimized using genetic algorithm with a view to maximize the process performance. The SVR-GA is a new strategy for soft sensor modeling and optimization. The major advantage of the strategies is that modeling and optimization can be conducted exclusively from the historic process data wherein the detailed knowledge of process phenomenology (reaction mechanism, kinetics etc. is not required. Using SVR-GA strategy, a number of sets of optimized operating conditions were found. The optimized solutions, when verified in an actual plant, resulted in a significant improvement in the quality.
Neutron Buildup Factors Calculation for Support Vector Regression Application in Shielding Analysis
International Nuclear Information System (INIS)
Duckic, P.; Matijevic, M.; Grgic, D.
2016-01-01
In this paper initial set of data for neutron buildup factors determination using Support Vector Regression (SVR) method is prepared. The performance of SVR technique strongly depends on the quality of information used for model training. Thus it is very important to provide representable data to the SVR. SVR is a supervised type of learning so it demands data in the input/output form. In the case of neutron buildup factors estimation, the input parameters are the incident neutron energy, shielding thickness and shielding material and the output parameter is the neutron buildup factor value. So far the initial sets of data for different shielding configurations have been obtained using SCALE4.4 sequence SAS3. However, this results were obtained using group constants, thus the incident neutron energy was determined as the average value for each energy group. Obtained this way, the data provided to the SVR are fewer and therefore insufficient. More valuable information is obtained using SCALE6.2beta5 sequence MAVRIC which can perform calculations for the explicit incident neutron energy, which leads to greater maneuvering possibilities when active learning measures are employed, and consequently improves the quality of the developed SVR model.(author).
Estimation of the laser cutting operating cost by support vector regression methodology
Jović, Srđan; Radović, Aleksandar; Šarkoćević, Živče; Petković, Dalibor; Alizamir, Meysam
2016-09-01
Laser cutting is a popular manufacturing process utilized to cut various types of materials economically. The operating cost is affected by laser power, cutting speed, assist gas pressure, nozzle diameter and focus point position as well as the workpiece material. In this article, the process factors investigated were: laser power, cutting speed, air pressure and focal point position. The aim of this work is to relate the operating cost to the process parameters mentioned above. CO2 laser cutting of stainless steel of medical grade AISI316L has been investigated. The main goal was to analyze the operating cost through the laser power, cutting speed, air pressure, focal point position and material thickness. Since the laser operating cost is a complex, non-linear task, soft computing optimization algorithms can be used. Intelligent soft computing scheme support vector regression (SVR) was implemented. The performance of the proposed estimator was confirmed with the simulation results. The SVR results are then compared with artificial neural network and genetic programing. According to the results, a greater improvement in estimation accuracy can be achieved through the SVR compared to other soft computing methodologies. The new optimization methods benefit from the soft computing capabilities of global optimization and multiobjective optimization rather than choosing a starting point by trial and error and combining multiple criteria into a single criterion.
Directory of Open Access Journals (Sweden)
Xian-Xia Zhang
2013-01-01
Full Text Available This paper presents a reference function based 3D FLC design methodology using support vector regression (SVR learning. The concept of reference function is introduced to 3D FLC for the generation of 3D membership functions (MF, which enhance the capability of the 3D FLC to cope with more kinds of MFs. The nonlinear mathematical expression of the reference function based 3D FLC is derived, and spatial fuzzy basis functions are defined. Via relating spatial fuzzy basis functions of a 3D FLC to kernel functions of an SVR, an equivalence relationship between a 3D FLC and an SVR is established. Therefore, a 3D FLC can be constructed using the learned results of an SVR. Furthermore, the universal approximation capability of the proposed 3D fuzzy system is proven in terms of the finite covering theorem. Finally, the proposed method is applied to a catalytic packed-bed reactor and simulation results have verified its effectiveness.
Reservoir rock permeability prediction using support vector regression in an Iranian oil field
International Nuclear Information System (INIS)
Saffarzadeh, Sadegh; Shadizadeh, Seyed Reza
2012-01-01
Reservoir permeability is a critical parameter for the evaluation of hydrocarbon reservoirs. It is often measured in the laboratory from reservoir core samples or evaluated from well test data. The prediction of reservoir rock permeability utilizing well log data is important because the core analysis and well test data are usually only available from a few wells in a field and have high coring and laboratory analysis costs. Since most wells are logged, the common practice is to estimate permeability from logs using correlation equations developed from limited core data; however, these correlation formulae are not universally applicable. Recently, support vector machines (SVMs) have been proposed as a new intelligence technique for both regression and classification tasks. The theory has a strong mathematical foundation for dependence estimation and predictive learning from finite data sets. The ultimate test for any technique that bears the claim of permeability prediction from well log data is the accurate and verifiable prediction of permeability for wells where only the well log data are available. The main goal of this paper is to develop the SVM method to obtain reservoir rock permeability based on well log data. (paper)
International Nuclear Information System (INIS)
Lins, Isis Didier; Droguett, Enrique López; Moura, Márcio das Chagas; Zio, Enrico; Jacinto, Carlos Magno
2015-01-01
Data-driven learning methods for predicting the evolution of the degradation processes affecting equipment are becoming increasingly attractive in reliability and prognostics applications. Among these, we consider here Support Vector Regression (SVR), which has provided promising results in various applications. Nevertheless, the predictions provided by SVR are point estimates whereas in order to take better informed decisions, an uncertainty assessment should be also carried out. For this, we apply bootstrap to SVR so as to obtain confidence and prediction intervals, without having to make any assumption about probability distributions and with good performance even when only a small data set is available. The bootstrapped SVR is first verified on Monte Carlo experiments and then is applied to a real case study concerning the prediction of degradation of a component from the offshore oil industry. The results obtained indicate that the bootstrapped SVR is a promising tool for providing reliable point and interval estimates, which can inform maintenance-related decisions on degrading components. - Highlights: • Bootstrap (pairs/residuals) and SVR are used as an uncertainty analysis framework. • Numerical experiments are performed to assess accuracy and coverage properties. • More bootstrap replications does not significantly improve performance. • Degradation of equipment of offshore oil wells is estimated by bootstrapped SVR. • Estimates about the scale growth rate can support maintenance-related decisions
Directory of Open Access Journals (Sweden)
Mustakim Mustakim
2016-02-01
Full Text Available The largest region that produces oil palm in Indonesia has an important role in improving the welfare of society and economy. Oil palm has increased significantly in Riau Province in every period, to determine the production development for the next few years with the functions and benefits of oil palm carried prediction production results that were seen from time series data last 8 years (2005-2013. In its prediction implementation, it was done by comparing the performance of Support Vector Regression (SVR method and Artificial Neural Network (ANN. From the experiment, SVR produced the best model compared with ANN. It is indicated by the correlation coefficient of 95% and 6% for MSE in the kernel Radial Basis Function (RBF, whereas ANN produced only 74% for R2 and 9% for MSE on the 8th experiment with hiden neuron 20 and learning rate 0,1. SVR model generates predictions for next 3 years which increased between 3% - 6% from actual data and RBF model predictions.
Zhou, Pei-pei; Shan, Jin-feng; Jiang, Jian-lan
2015-12-01
To optimize the optimal microwave-assisted extraction method of curcuminoids from Curcuma longa. On the base of single factor experiment, the ethanol concentration, the ratio of liquid to solid and the microwave time were selected for further optimization. Support Vector Regression (SVR) and Central Composite Design-Response Surface Methodology (CCD) algorithm were utilized to design and establish models respectively, while Particle Swarm Optimization (PSO) was introduced to optimize the parameters of SVR models and to search optimal points of models. The evaluation indicator, the sum of curcumin, demethoxycurcumin and bisdemethoxycurcumin by HPLC, were used. The optimal parameters of microwave-assisted extraction were as follows: ethanol concentration of 69%, ratio of liquid to solid of 21 : 1, microwave time of 55 s. On those conditions, the sum of three curcuminoids was 28.97 mg/g (per gram of rhizomes powder). Both the CCD model and the SVR model were credible, for they have predicted the similar process condition and the deviation of yield were less than 1.2%.
Sherafati, Sh. A.; Saradjian, M. R.; Niazmardi, S.
2013-09-01
Numerous investigations on Urban Heat Island (UHI) show that land cover change is the main factor of increasing Land Surface Temperature (LST) in urban areas. Therefore, to achieve a model which is able to simulate UHI growth, urban expansion should be concerned first. Considerable researches on urban expansion modeling have been done based on cellular automata. Accordingly the objective of this paper is to implement CA method for trend detection of Tehran UHI spatiotemporal growth based on urban sprawl parameters (such as Distance to nearest road, Digital Elevation Model (DEM), Slope and Aspect ratios). It should be mentioned that UHI growth modeling may have more complexities in comparison with urban expansion, since the amount of each pixel's temperature should be investigated instead of its state (urban and non-urban areas). The most challenging part of CA model is the definition of Transfer Rules. Here, two methods have used to find appropriate transfer Rules which are Artificial Neural Networks (ANN) and Support Vector Regression (SVR). The reason of choosing these approaches is that artificial neural networks and support vector regression have significant abilities to handle the complications of such a spatial analysis in comparison with other methods like Genetic or Swarm intelligence. In this paper, UHI change trend has discussed between 1984 and 2007. For this purpose, urban sprawl parameters in 1984 have calculated and added to the retrieved LST of this year. In order to achieve LST, Thematic Mapper (TM) and Enhanced Thematic Mapper (ETM+) night-time images have exploited. The reason of implementing night-time images is that UHI phenomenon is more obvious during night hours. After that multilayer feed-forward neural networks and support vector regression have used separately to find the relationship between this data and the retrieved LST in 2007. Since the transfer rules might not be the same in different regions, the satellite image of the city has
International Nuclear Information System (INIS)
Liu, Jie
2015-01-01
This Ph. D. work is motivated by the possibility of monitoring the conditions of components of energy systems for their extended and safe use, under proper practice of operation and adequate policies of maintenance. The aim is to develop a Support Vector Regression (SVR)-based framework for predicting time series data under stationary/nonstationary environmental and operational conditions. Single SVR and SVR-based ensemble approaches are developed to tackle the prediction problem based on both small and large datasets. Strategies are proposed for adaptively updating the single SVR and SVR-based ensemble models in the existence of pattern drifts. Comparisons with other online learning approaches for kernel-based modelling are provided with reference to time series data from a critical component in Nuclear Power Plants (NPPs) provided by Electricite de France (EDF). The results show that the proposed approaches achieve comparable prediction results, considering the Mean Squared Error (MSE) and Mean Relative Error (MRE), in much less computation time. Furthermore, by analyzing the geometrical meaning of the Feature Vector Selection (FVS) method proposed in the literature, a novel geometrically interpretable kernel method, named Reduced Rank Kernel Ridge Regression-II (RRKRR-II), is proposed to describe the linear relations between a predicted value and the predicted values of the Feature Vectors (FVs) selected by FVS. Comparisons with several kernel methods on a number of public datasets prove the good prediction accuracy and the easy-of-tuning of the hyper-parameters of RRKRR-II. (author)
International Nuclear Information System (INIS)
Comesanna Garcia, Yumirka; Dago Morales, Angel; Talavera Bustamante, Isneri
2010-01-01
The recently introduction of the least squares support vector machines method for regression purposes in the field of Chemometrics has provided several advantages to linear and nonlinear multivariate calibration methods. The objective of the paper was to propose the use of the least squares support vector machine as an alternative multivariate calibration method for the prediction of the percentage of crystallinity of fluidized catalytic cracking catalysts, by means of Fourier transform mid-infrared spectroscopy. A linear kernel was used in the calculations of the regression model. The optimization of its gamma parameter was carried out using the leave-one-out cross-validation procedure. The root mean square error of prediction was used to measure the performance of the model. The accuracy of the results obtained with the application of the method is in accordance with the uncertainty of the X-ray powder diffraction reference method. To compare the generalization capability of the developed method, a comparison study was carried out, taking into account the results achieved with the new model and those reached through the application of linear calibration methods. The developed method can be easily implemented in refinery laboratories
Advanced signal processing based on support vector regression for lidar applications
Gelfusa, M.; Murari, A.; Malizia, A.; Lungaroni, M.; Peluso, E.; Parracino, S.; Talebzadeh, S.; Vega, J.; Gaudio, P.
2015-10-01
The LIDAR technique has recently found many applications in atmospheric physics and remote sensing. One of the main issues, in the deployment of systems based on LIDAR, is the filtering of the backscattered signal to alleviate the problems generated by noise. Improvement in the signal to noise ratio is typically achieved by averaging a quite large number (of the order of hundreds) of successive laser pulses. This approach can be effective but presents significant limitations. First of all, it implies a great stress on the laser source, particularly in the case of systems for automatic monitoring of large areas for long periods. Secondly, this solution can become difficult to implement in applications characterised by rapid variations of the atmosphere, for example in the case of pollutant emissions, or by abrupt changes in the noise. In this contribution, a new method for the software filtering and denoising of LIDAR signals is presented. The technique is based on support vector regression. The proposed new method is insensitive to the statistics of the noise and is therefore fully general and quite robust. The developed numerical tool has been systematically compared with the most powerful techniques available, using both synthetic and experimental data. Its performances have been tested for various statistical distributions of the noise and also for other disturbances of the acquired signal such as outliers. The competitive advantages of the proposed method are fully documented. The potential of the proposed approach to widen the capability of the LIDAR technique, particularly in the detection of widespread smoke, is discussed in detail.
Pair- ${v}$ -SVR: A Novel and Efficient Pairing nu-Support Vector Regression Algorithm.
Hao, Pei-Yi
This paper proposes a novel and efficient pairing nu-support vector regression (pair--SVR) algorithm that combines successfully the superior advantages of twin support vector regression (TSVR) and classical -SVR algorithms. In spirit of TSVR, the proposed pair--SVR solves two quadratic programming problems (QPPs) of smaller size rather than a single larger QPP, and thus has faster learning speed than classical -SVR. The significant advantage of our pair--SVR over TSVR is the improvement in the prediction speed and generalization ability by introducing the concepts of the insensitive zone and the regularization term that embodies the essence of statistical learning theory. Moreover, pair--SVR has additional advantage of using parameter for controlling the bounds on fractions of SVs and errors. Furthermore, the upper bound and lower bound functions of the regression model estimated by pair--SVR capture well the characteristics of data distributions, thus facilitating automatic estimation of the conditional mean and predictive variance simultaneously. This may be useful in many cases, especially when the noise is heteroscedastic and depends strongly on the input values. The experimental results validate the superiority of our pair--SVR in both training/prediction speed and generalization ability.This paper proposes a novel and efficient pairing nu-support vector regression (pair--SVR) algorithm that combines successfully the superior advantages of twin support vector regression (TSVR) and classical -SVR algorithms. In spirit of TSVR, the proposed pair--SVR solves two quadratic programming problems (QPPs) of smaller size rather than a single larger QPP, and thus has faster learning speed than classical -SVR. The significant advantage of our pair--SVR over TSVR is the improvement in the prediction speed and generalization ability by introducing the concepts of the insensitive zone and the regularization term that embodies the essence of statistical learning theory
Directory of Open Access Journals (Sweden)
Shokri Saeid
2015-01-01
Full Text Available An accurate prediction of sulfur content is very important for the proper operation and product quality control in hydrodesulfurization (HDS process. For this purpose, a reliable data- driven soft sensors utilizing Support Vector Regression (SVR was developed and the effects of integrating Vector Quantization (VQ with Principle Component Analysis (PCA were studied on the assessment of this soft sensor. First, in pre-processing step the PCA and VQ techniques were used to reduce dimensions of the original input datasets. Then, the compressed datasets were used as input variables for the SVR model. Experimental data from the HDS setup were employed to validate the proposed integrated model. The integration of VQ/PCA techniques with SVR model was able to increase the prediction accuracy of SVR. The obtained results show that integrated technique (VQ-SVR was better than (PCA-SVR in prediction accuracy. Also, VQ decreased the sum of the training and test time of SVR model in comparison with PCA. For further evaluation, the performance of VQ-SVR model was also compared to that of SVR. The obtained results indicated that VQ-SVR model delivered the best satisfactory predicting performance (AARE= 0.0668 and R2= 0.995 in comparison with investigated models.
Valizadeh, Maryam; Sohrabi, Mahmoud Reza
2018-03-01
In the present study, artificial neural networks (ANNs) and support vector regression (SVR) as intelligent methods coupled with UV spectroscopy for simultaneous quantitative determination of Dorzolamide (DOR) and Timolol (TIM) in eye drop. Several synthetic mixtures were analyzed for validating the proposed methods. At first, neural network time series, which one type of network from the artificial neural network was employed and its efficiency was evaluated. Afterwards, the radial basis network was applied as another neural network. Results showed that the performance of this method is suitable for predicting. Finally, support vector regression was proposed to construct the Zilomole prediction model. Also, root mean square error (RMSE) and mean recovery (%) were calculated for SVR method. Moreover, the proposed methods were compared to the high-performance liquid chromatography (HPLC) as a reference method. One way analysis of variance (ANOVA) test at the 95% confidence level applied to the comparison results of suggested and reference methods that there were no significant differences between them. Also, the effect of interferences was investigated in spike solutions.
International Nuclear Information System (INIS)
Chen, Kuilin; Yu, Jie
2014-01-01
Highlights: • A novel hybrid modeling method is proposed for short-term wind speed forecasting. • Support vector regression model is constructed to formulate nonlinear state-space framework. • Unscented Kalman filter is adopted to recursively update states under random uncertainty. • The new SVR–UKF approach is compared to several conventional methods for short-term wind speed prediction. • The proposed method demonstrates higher prediction accuracy and reliability. - Abstract: Accurate wind speed forecasting is becoming increasingly important to improve and optimize renewable wind power generation. Particularly, reliable short-term wind speed prediction can enable model predictive control of wind turbines and real-time optimization of wind farm operation. However, this task remains challenging due to the strong stochastic nature and dynamic uncertainty of wind speed. In this study, unscented Kalman filter (UKF) is integrated with support vector regression (SVR) based state-space model in order to precisely update the short-term estimation of wind speed sequence. In the proposed SVR–UKF approach, support vector regression is first employed to formulate a nonlinear state-space model and then unscented Kalman filter is adopted to perform dynamic state estimation recursively on wind sequence with stochastic uncertainty. The novel SVR–UKF method is compared with artificial neural networks (ANNs), SVR, autoregressive (AR) and autoregressive integrated with Kalman filter (AR-Kalman) approaches for predicting short-term wind speed sequences collected from three sites in Massachusetts, USA. The forecasting results indicate that the proposed method has much better performance in both one-step-ahead and multi-step-ahead wind speed predictions than the other approaches across all the locations
Modeling of Soil Aggregate Stability using Support Vector Machines and Multiple Linear Regression
Directory of Open Access Journals (Sweden)
Ali Asghar Besalatpour
2016-02-01
Full Text Available Introduction: Soil aggregate stability is a key factor in soil resistivity to mechanical stresses, including the impacts of rainfall and surface runoff, and thus to water erosion (Canasveras et al., 2010. Various indicators have been proposed to characterize and quantify soil aggregate stability, for example percentage of water-stable aggregates (WSA, mean weight diameter (MWD, geometric mean diameter (GMD of aggregates, and water-dispersible clay (WDC content (Calero et al., 2008. Unfortunately, the experimental methods available to determine these indicators are laborious, time-consuming and difficult to standardize (Canasveras et al., 2010. Therefore, it would be advantageous if aggregate stability could be predicted indirectly from more easily available data (Besalatpour et al., 2014. The main objective of this study is to investigate the potential use of support vector machines (SVMs method for estimating soil aggregate stability (as quantified by GMD as compared to multiple linear regression approach. Materials and Methods: The study area was part of the Bazoft watershed (31° 37′ to 32° 39′ N and 49° 34′ to 50° 32′ E, which is located in the Northern part of the Karun river basin in central Iran. A total of 160 soil samples were collected from the top 5 cm of soil surface. Some easily available characteristics including topographic, vegetation, and soil properties were used as inputs. Soil organic matter (SOM content was determined by the Walkley-Black method (Nelson & Sommers, 1986. Particle size distribution in the soil samples (clay, silt, sand, fine sand, and very fine sand were measured using the procedure described by Gee & Bauder (1986 and calcium carbonate equivalent (CCE content was determined by the back-titration method (Nelson, 1982. The modified Kemper & Rosenau (1986 method was used to determine wet-aggregate stability (GMD. The topographic attributes of elevation, slope, and aspect were characterized using a 20-m
Wang, Xiaolei
2014-12-12
Background: A quantitative understanding of interactions between transcription factors (TFs) and their DNA binding sites is key to the rational design of gene regulatory networks. Recent advances in high-throughput technologies have enabled high-resolution measurements of protein-DNA binding affinity. Importantly, such experiments revealed the complex nature of TF-DNA interactions, whereby the effects of nucleotide changes on the binding affinity were observed to be context dependent. A systematic method to give high-quality estimates of such complex affinity landscapes is, thus, essential to the control of gene expression and the advance of synthetic biology. Results: Here, we propose a two-round prediction method that is based on support vector regression (SVR) with weighted degree (WD) kernels. In the first round, a WD kernel with shifts and mismatches is used with SVR to detect the importance of subsequences with different lengths at different positions. The subsequences identified as important in the first round are then fed into a second WD kernel to fit the experimentally measured affinities. To our knowledge, this is the first attempt to increase the accuracy of the affinity prediction by applying two rounds of string kernels and by identifying a small number of crucial k-mers. The proposed method was tested by predicting the binding affinity landscape of Gcn4p in Saccharomyces cerevisiae using datasets from HiTS-FLIP. Our method explicitly identified important subsequences and showed significant performance improvements when compared with other state-of-the-art methods. Based on the identified important subsequences, we discovered two surprisingly stable 10-mers and one sensitive 10-mer which were not reported before. Further test on four other TFs in S. cerevisiae demonstrated the generality of our method. Conclusion: We proposed in this paper a two-round method to quantitatively model the DNA binding affinity landscape. Since the ability to modify
International Nuclear Information System (INIS)
Wang, Lu; Su, Steven W; Celler, Branko G; Chan, Gregory S H; Cheng, Teddy M; Savkin, Andrey V
2009-01-01
This study aims to quantitatively describe the steady-state relationships among percentage changes in key central cardiovascular variables (i.e. stroke volume, heart rate (HR), total peripheral resistance and cardiac output), measured using non-invasive means, in response to moderate exercise, and the oxygen uptake rate, using a new nonlinear regression approach—support vector regression. Ten untrained normal males exercised in an upright position on an electronically braked cycle ergometer with constant workloads ranging from 25 W to 125 W. Throughout the experiment, .VO 2 was determined breath by breath and the HR was monitored beat by beat. During the last minute of each exercise session, the cardiac output was measured beat by beat using a novel non-invasive ultrasound-based device and blood pressure was measured using a tonometric measurement device. Based on the analysis of experimental data, nonlinear steady-state relationships between key central cardiovascular variables and .VO 2 were qualitatively observed except for the HR which increased linearly as a function of increasing .VO 2 . Quantitative descriptions of these complex nonlinear behaviour were provided by nonparametric models which were obtained by using support vector regression
Directory of Open Access Journals (Sweden)
Cheng-Wen Lee
2017-11-01
Full Text Available Accurate electricity forecasting is still the critical issue in many energy management fields. The applications of hybrid novel algorithms with support vector regression (SVR models to overcome the premature convergence problem and improve forecasting accuracy levels also deserve to be widely explored. This paper applies chaotic function and quantum computing concepts to address the embedded drawbacks including crossover and mutation operations of genetic algorithms. Then, this paper proposes a novel electricity load forecasting model by hybridizing chaotic function and quantum computing with GA in an SVR model (named SVRCQGA to achieve more satisfactory forecasting accuracy levels. Experimental examples demonstrate that the proposed SVRCQGA model is superior to other competitive models.
Method for nonlinear exponential regression analysis
Junkin, B. G.
1972-01-01
Two computer programs developed according to two general types of exponential models for conducting nonlinear exponential regression analysis are described. Least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. Program is written in FORTRAN 5 for the Univac 1108 computer.
Fei, Cheng-Wei; Bai, Guang-Chen
2014-12-01
To improve the computational precision and efficiency of probabilistic design for mechanical dynamic assembly like the blade-tip radial running clearance (BTRRC) of gas turbine, a distribution collaborative probabilistic design method-based support vector machine of regression (SR)(called as DCSRM) is proposed by integrating distribution collaborative response surface method and support vector machine regression model. The mathematical model of DCSRM is established and the probabilistic design idea of DCSRM is introduced. The dynamic assembly probabilistic design of aeroengine high-pressure turbine (HPT) BTRRC is accomplished to verify the proposed DCSRM. The analysis results reveal that the optimal static blade-tip clearance of HPT is gained for designing BTRRC, and improving the performance and reliability of aeroengine. The comparison of methods shows that the DCSRM has high computational accuracy and high computational efficiency in BTRRC probabilistic analysis. The present research offers an effective way for the reliability design of mechanical dynamic assembly and enriches mechanical reliability theory and method.
International Nuclear Information System (INIS)
Hong, W.-C.
2009-01-01
Accurate forecasting of electric load has always been the most important issues in the electricity industry, particularly for developing countries. Due to the various influences, electric load forecasting reveals highly nonlinear characteristics. Recently, support vector regression (SVR), with nonlinear mapping capabilities of forecasting, has been successfully employed to solve nonlinear regression and time series problems. However, it is still lack of systematic approaches to determine appropriate parameter combination for a SVR model. This investigation elucidates the feasibility of applying chaotic particle swarm optimization (CPSO) algorithm to choose the suitable parameter combination for a SVR model. The empirical results reveal that the proposed model outperforms the other two models applying other algorithms, genetic algorithm (GA) and simulated annealing algorithm (SA). Finally, it also provides the theoretical exploration of the electric load forecasting support system (ELFSS)
Directory of Open Access Journals (Sweden)
Chan-Yun Yang
2013-01-01
Full Text Available Austempered ductile iron has emerged as a notable material in several engineering fields, including marine applications. The initial austenite carbon content after austenization transform but before austempering process for generating bainite matrix proved critical in controlling the resulted microstructure and thus mechanical properties. In this paper, support vector regression is employed in order to establish a relationship between the initial carbon concentration in the austenite with austenization temperature and alloy contents, thereby exercising improved control in the mechanical properties of the austempered ductile irons. Particularly, the paper emphasizes a methodology tailored to deal with a limited amount of available data with intrinsically contracted and skewed distribution. The collected information from a variety of data sources presents another challenge of highly uncertain variance. The authors present a hybrid model consisting of a procedure of a histogram equalizer and a procedure of a support-vector-machine (SVM- based regression to gain a more robust relationship to respond to the challenges. The results show greatly improved accuracy of the proposed model in comparison to two former established methodologies. The sum squared error of the present model is less than one fifth of that of the two previous models.
International Nuclear Information System (INIS)
Hong, Wei-Chiang
2011-01-01
Support vector regression (SVR), with hybrid chaotic sequence and evolutionary algorithms to determine suitable values of its three parameters, not only can effectively avoid converging prematurely (i.e., trapping into a local optimum), but also reveals its superior forecasting performance. Electric load sometimes demonstrates a seasonal (cyclic) tendency due to economic activities or climate cyclic nature. The applications of SVR models to deal with seasonal (cyclic) electric load forecasting have not been widely explored. In addition, the concept of recurrent neural networks (RNNs), focused on using past information to capture detailed information, is helpful to be combined into an SVR model. This investigation presents an electric load forecasting model which combines the seasonal recurrent support vector regression model with chaotic artificial bee colony algorithm (namely SRSVRCABC) to improve the forecasting performance. The proposed SRSVRCABC employs the chaotic behavior of honey bees which is with better performance in function optimization to overcome premature local optimum. A numerical example from an existed reference is used to elucidate the forecasting performance of the proposed SRSVRCABC model. The forecasting results indicate that the proposed model yields more accurate forecasting results than ARIMA and TF-ε-SVR-SA models. Therefore, the SRSVRCABC model is a promising alternative for electric load forecasting. -- Highlights: → Hybridizing the seasonal adjustment and the recurrent mechanism into an SVR model. → Employing chaotic sequence to improve the premature convergence of artificial bee colony algorithm. → Successfully providing significant accurate monthly load demand forecasting.
Linking Simple Economic Theory Models and the Cointegrated Vector AutoRegressive Model
DEFF Research Database (Denmark)
Møller, Niels Framroze
This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector-Auto-Regressive model (CVAR). By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail, how the theoretical model and its stru....... Further fundamental extensions and advances to more sophisticated theory models, such as those related to dynamics and expectations (in the structural relations) are left for future papers......This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector-Auto-Regressive model (CVAR). By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail, how the theoretical model and its......, it is demonstrated how other controversial hypotheses such as Rational Expectations can be formulated directly as restrictions on the CVAR-parameters. A simple example of a "Neoclassical synthetic" AS-AD model is also formulated. Finally, the partial- general equilibrium distinction is related to the CVAR as well...
International Nuclear Information System (INIS)
Baser, Furkan; Demirhan, Haydar
2017-01-01
Accurate estimation of the amount of horizontal global solar radiation for a particular field is an important input for decision processes in solar radiation investments. In this article, we focus on the estimation of yearly mean daily horizontal global solar radiation by using an approach that utilizes fuzzy regression functions with support vector machine (FRF-SVM). This approach is not seriously affected by outlier observations and does not suffer from the over-fitting problem. To demonstrate the utility of the FRF-SVM approach in the estimation of horizontal global solar radiation, we conduct an empirical study over a dataset collected in Turkey and applied the FRF-SVM approach with several kernel functions. Then, we compare the estimation accuracy of the FRF-SVM approach to an adaptive neuro-fuzzy system and a coplot supported-genetic programming approach. We observe that the FRF-SVM approach with a Gaussian kernel function is not affected by both outliers and over-fitting problem and gives the most accurate estimates of horizontal global solar radiation among the applied approaches. Consequently, the use of hybrid fuzzy functions and support vector machine approaches is found beneficial in long-term forecasting of horizontal global solar radiation over a region with complex climatic and terrestrial characteristics. - Highlights: • A fuzzy regression functions with support vector machines approach is proposed. • The approach is robust against outlier observations and over-fitting problem. • Estimation accuracy of the model is superior to several existent alternatives. • A new solar radiation estimation model is proposed for the region of Turkey. • The model is useful under complex terrestrial and climatic conditions.
A method for nonlinear exponential regression analysis
Junkin, B. G.
1971-01-01
A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
Marco, F. J.; Martínez, M. J.; López, J. A.
2015-04-01
The high quality of Hipparcos data in position, proper motion, and parallax has allowed for studies about stellar kinematics with the aim of achieving a better physical understanding of our galaxy, based on accurate calculus of the Ogorodnikov-Milne model (OMM) parameters. The use of discrete least squares is the most common adjustment method, but it may lead to errors mainly because of the inhomogeneous spatial distribution of the data. We present an example of the instability of this method using the case of a function given by a linear combination of Legendre polynomials. These polynomials are basic in the use of vector spherical harmonics, which have been used to compute the OMM parameters by several authors, such as Makarov & Murphy, Mignard & Klioner, and Vityazev & Tsvetkov. To overcome the former problem, we propose the use of a mixed method (see Marco et al.) that includes the extension of the functions of residuals to any point on the celestial sphere. The goal is to be able to work with continuous variables in the calculation of the coefficients of the vector spherical harmonic developments with stability and efficiency. We apply this mixed procedure to the study of the kinematics of the stars in our Galaxy, employing the Hipparcos velocity field data to obtain the OMM parameters. Previously, we tested the method by perturbing the Vectorial Spherical Harmonics model as well as the velocity vector field.
Seasonal prediction of winter extreme precipitation over Canada by support vector regression
Directory of Open Access Journals (Sweden)
Z. Zeng
2011-01-01
Full Text Available For forecasting the maximum 5-day accumulated precipitation over the winter season at lead times of 3, 6, 9 and 12 months over Canada from 1950 to 2007, two nonlinear and two linear regression models were used, where the models were support vector regression (SVR (nonlinear and linear versions, nonlinear Bayesian neural network (BNN and multiple linear regression (MLR. The 118 stations were grouped into six geographic regions by K-means clustering. For each region, the leading principal components of the winter maximum 5-d accumulated precipitation anomalies were the predictands. Potential predictors included quasi-global sea surface temperature anomalies and 500 hPa geopotential height anomalies over the Northern Hemisphere, as well as six climate indices (the Niño-3.4 region sea surface temperature, the North Atlantic Oscillation, the Pacific-North American teleconnection, the Pacific Decadal Oscillation, the Scandinavia pattern, and the East Atlantic pattern. The results showed that in general the two robust SVR models tended to have better forecast skills than the two non-robust models (MLR and BNN, and the nonlinear SVR model tended to forecast slightly better than the linear SVR model. Among the six regions, the Prairies region displayed the highest forecast skills, and the Arctic region the second highest. The strongest nonlinearity was manifested over the Prairies and the weakest nonlinearity over the Arctic.
Investigation of Optimal Integrated Circuit Raster Image Vectorization Method
Directory of Open Access Journals (Sweden)
Leonas Jasevičius
2011-03-01
Full Text Available Visual analysis of integrated circuit layer requires raster image vectorization stage to extract layer topology data to CAD tools. In this paper vectorization problems of raster IC layer images are presented. Various line extraction from raster images algorithms and their properties are discussed. Optimal raster image vectorization method was developed which allows utilization of common vectorization algorithms to achieve the best possible extracted vector data match with perfect manual vectorization results. To develop the optimal method, vectorized data quality dependence on initial raster image skeleton filter selection was assessed.Article in Lithuanian
Traditional and robust vector selection methods for use with similarity based models
International Nuclear Information System (INIS)
Hines, J. W.; Garvey, D. R.
2006-01-01
Vector selection, or instance selection as it is often called in the data mining literature, performs a critical task in the development of nonparametric, similarity based models. Nonparametric, similarity based modeling (SBM) is a form of 'lazy learning' which constructs a local model 'on the fly' by comparing a query vector to historical, training vectors. For large training sets the creation of local models may become cumbersome, since each training vector must be compared to the query vector. To alleviate this computational burden, varying forms of training vector sampling may be employed with the goal of selecting a subset of the training data such that the samples are representative of the underlying process. This paper describes one such SBM, namely auto-associative kernel regression (AAKR), and presents five traditional vector selection methods and one robust vector selection method that may be used to select prototype vectors from a larger data set in model training. The five traditional vector selection methods considered are min-max, vector ordering, combination min-max and vector ordering, fuzzy c-means clustering, and Adeli-Hung clustering. Each method is described in detail and compared using artificially generated data and data collected from the steam system of an operating nuclear power plant. (authors)
Supplier Short Term Load Forecasting Using Support Vector Regression and Exogenous Input
Matijaš, Marin; Vukićcević, Milan; Krajcar, Slavko
2011-09-01
In power systems, task of load forecasting is important for keeping equilibrium between production and consumption. With liberalization of electricity markets, task of load forecasting changed because each market participant has to forecast their own load. Consumption of end-consumers is stochastic in nature. Due to competition, suppliers are not in a position to transfer their costs to end-consumers; therefore it is essential to keep forecasting error as low as possible. Numerous papers are investigating load forecasting from the perspective of the grid or production planning. We research forecasting models from the perspective of a supplier. In this paper, we investigate different combinations of exogenous input on the simulated supplier loads and show that using points of delivery as a feature for Support Vector Regression leads to lower forecasting error, while adding customer number in different datasets does the opposite.
International Nuclear Information System (INIS)
Bae, In Ho; Naa, Man Gyun; Lee, Yoon Joon; Park, Goon Cherl
2009-01-01
The monitoring of detailed 3-dimensional (3D) reactor core power distribution is a prerequisite in the operation of nuclear power reactors to ensure that various safety limits imposed on the LPD and DNBR, are not violated during nuclear power reactor operation. The LPD and DNBR should be calculated in order to perform the two major functions of the core protection calculator system (CPCS) and the core operation limit supervisory system (COLSS). The LPD at the hottest part of a hot fuel rod, which is related to the power peaking factor (PPF, F q ), is more important than the LPD at any other position in a reactor core. The LPD needs to be estimated accurately to prevent nuclear fuel rods from melting. In this study, support vector regression (SVR) and uncertainty analysis have been applied to estimation of reactor core power peaking factor
Directory of Open Access Journals (Sweden)
Esperanza García-Gonzalo
2016-01-01
Full Text Available Predicting paper properties based on a limited number of measured variables can be an important tool for the industry. Mathematical models were developed to predict mechanical and optical properties from the corresponding paper density for some softwood papers using support vector machine regression with the Radial Basis Function Kernel. A dataset of different properties of paper handsheets produced from pulps of pine (Pinus pinaster and P. sylvestris and cypress species (Cupressus lusitanica, C. sempervirens, and C. arizonica beaten at 1000, 4000, and 7000 revolutions was used. The results show that it is possible to obtain good models (with high coefficient of determination with two variables: the numerical variable density and the categorical variable species.
Forecasting systems reliability based on support vector regression with genetic algorithms
International Nuclear Information System (INIS)
Chen, K.-Y.
2007-01-01
This study applies a novel neural-network technique, support vector regression (SVR), to forecast reliability in engine systems. The aim of this study is to examine the feasibility of SVR in systems reliability prediction by comparing it with the existing neural-network approaches and the autoregressive integrated moving average (ARIMA) model. To build an effective SVR model, SVR's parameters must be set carefully. This study proposes a novel approach, known as GA-SVR, which searches for SVR's optimal parameters using real-value genetic algorithms, and then adopts the optimal parameters to construct the SVR models. A real reliability data for 40 suits of turbochargers were employed as the data set. The experimental results demonstrate that SVR outperforms the existing neural-network approaches and the traditional ARIMA models based on the normalized root mean square error and mean absolute percentage error
Febrian Umbara, Rian; Tarwidi, Dede; Budi Setiawan, Erwin
2018-03-01
The paper discusses the prediction of Jakarta Composite Index (JCI) in Indonesia Stock Exchange. The study is based on JCI historical data for 1286 days to predict the value of JCI one day ahead. This paper proposes predictions done in two stages., The first stage using Fuzzy Time Series (FTS) to predict values of ten technical indicators, and the second stage using Support Vector Regression (SVR) to predict the value of JCI one day ahead, resulting in a hybrid prediction model FTS-SVR. The performance of this combined prediction model is compared with the performance of the single stage prediction model using SVR only. Ten technical indicators are used as input for each model.
International Nuclear Information System (INIS)
Seo, Inyong; Ha, Bokam; Lee, Sungwoo; Shin, Changhoon; Lee, Jaeyong; Kim, Seongjun
2011-01-01
In a nuclear power plant (NPP), periodic sensor calibrations are required to assure sensors are operating correctly. However, only a few faulty sensors are found to be rectified. For the safe operation of an NPP and the reduction of unnecessary calibration, on-line calibration monitoring is needed. In this study, an on-line calibration monitoring called KPCSVR using k-means clustering and principal component based Auto-Associative support vector regression (PCSVR) is proposed for nuclear power plant. To reduce the training time of the model, k-means clustering method was used. Response surface methodology is employed to efficiently determine the optimal values of support vector regression hyperparameters. The proposed KPCSVR model was confirmed with actual plant data of Kori Nuclear Power Plant Unit 3 which were measured from the primary and secondary systems of the plant, and compared with the PCSVR model. By using data clustering, the average accuracy of PCSVR improved from 1.228×10 -4 to 0.472×10 -4 and the average sensitivity of PCSVR from 0.0930 to 0.0909, which results in good detection of sensor drift. Moreover, the training time is greatly reduced from 123.5 to 31.5 sec. (author)
Support vector regression model for the estimation of γ-ray buildup factors for multi-layer shields
International Nuclear Information System (INIS)
Trontl, Kresimir; Smuc, Tomislav; Pevec, Dubravko
2007-01-01
The accuracy of the point-kernel method, which is a widely used practical tool for γ-ray shielding calculations, strongly depends on the quality and accuracy of buildup factors used in the calculations. Although, buildup factors for single-layer shields comprised of a single material are well known, calculation of buildup factors for stratified shields, each layer comprised of different material or a combination of materials, represent a complex physical problem. Recently, a new compact mathematical model for multi-layer shield buildup factor representation has been suggested for embedding into point-kernel codes thus replacing traditionally generated complex mathematical expressions. The new regression model is based on support vector machines learning technique, which is an extension of Statistical Learning Theory. The paper gives complete description of the novel methodology with results pertaining to realistic engineering multi-layer shielding geometries. The results based on support vector regression machine learning confirm that this approach provides a framework for general, accurate and computationally acceptable multi-layer buildup factor model
A Vector Approach to Regression Analysis and Its Implications to Heavy-Duty Diesel Emissions
Energy Technology Data Exchange (ETDEWEB)
McAdams, H.T.
2001-02-14
An alternative approach is presented for the regression of response data on predictor variables that are not logically or physically separable. The methodology is demonstrated by its application to a data set of heavy-duty diesel emissions. Because of the covariance of fuel properties, it is found advantageous to redefine the predictor variables as vectors, in which the original fuel properties are components, rather than as scalars each involving only a single fuel property. The fuel property vectors are defined in such a way that they are mathematically independent and statistically uncorrelated. Because the available data set does not allow definitive separation of vehicle and fuel effects, and because test fuels used in several of the studies may be unrealistically contrived to break the association of fuel variables, the data set is not considered adequate for development of a full-fledged emission model. Nevertheless, the data clearly show that only a few basic patterns of fuel-property variation affect emissions and that the number of these patterns is considerably less than the number of variables initially thought to be involved. These basic patterns, referred to as ''eigenfuels,'' may reflect blending practice in accordance with their relative weighting in specific circumstances. The methodology is believed to be widely applicable in a variety of contexts. It promises an end to the threat of collinearity and the frustration of attempting, often unrealistically, to separate variables that are inseparable.
Directory of Open Access Journals (Sweden)
N. Sujay Raghavendra
2015-12-01
Full Text Available This research demonstrates the state-of-the-art capability of Wavelet packet analysis in improving the forecasting efficiency of Support vector regression (SVR through the development of a novel hybrid Wavelet packet–Support vector regression (WP–SVR model for forecasting monthly groundwater level fluctuations observed in three shallow unconfined coastal aquifers. The Sequential Minimal Optimization Algorithm-based SVR model is also employed for comparative study with WP–SVR model. The input variables used for modeling were monthly time series of total rainfall, average temperature, mean tide level, and past groundwater level observations recorded during the period 1996–2006 at three observation wells located near Mangalore, India. The Radial Basis function is employed as a kernel function during SVR modeling. Model parameters are calibrated using the first seven years of data, and the remaining three years data are used for model validation using various input combinations. The performance of both the SVR and WP–SVR models is assessed using different statistical indices. From the comparative result analysis of the developed models, it can be seen that WP–SVR model outperforms the classic SVR model in predicting groundwater levels at all the three well locations (e.g. NRMSE(WP–SVR = 7.14, NRMSE(SVR = 12.27; NSE(WP–SVR = 0.91, NSE(SVR = 0.8 during the test phase with respect to well location at Surathkal. Therefore, using the WP–SVR model is highly acceptable for modeling and forecasting of groundwater level fluctuations.
Fusion rule estimation using vector space methods
International Nuclear Information System (INIS)
Rao, N.S.V.
1997-01-01
In a system of N sensors, the sensor S j , j = 1, 2 .... N, outputs Y (j) element-of Re, according to an unknown probability distribution P (Y(j) /X) , corresponding to input X element-of [0, 1]. A training n-sample (X 1 , Y 1 ), (X 2 , Y 2 ), ..., (X n , Y n ) is given where Y i = (Y i (1) , Y i (2) , . . . , Y i N ) such that Y i (j) is the output of S j in response to input X i . The problem is to estimate a fusion rule f : Re N → [0, 1], based on the sample, such that the expected square error is minimized over a family of functions Y that constitute a vector space. The function f* that minimizes the expected error cannot be computed since the underlying densities are unknown, and only an approximation f to f* is feasible. We estimate the sample size sufficient to ensure that f provides a close approximation to f* with a high probability. The advantages of vector space methods are two-fold: (a) the sample size estimate is a simple function of the dimensionality of F, and (b) the estimate f can be easily computed by well-known least square methods in polynomial time. The results are applicable to the classical potential function methods and also (to a recently proposed) special class of sigmoidal feedforward neural networks
Blood glucose level prediction based on support vector regression using mobile platforms.
Reymann, Maximilian P; Dorschky, Eva; Groh, Benjamin H; Martindale, Christine; Blank, Peter; Eskofier, Bjoern M
2016-08-01
The correct treatment of diabetes is vital to a patient's health: Staying within defined blood glucose levels prevents dangerous short- and long-term effects on the body. Mobile devices informing patients about their future blood glucose levels could enable them to take counter-measures to prevent hypo or hyper periods. Previous work addressed this challenge by predicting the blood glucose levels using regression models. However, these approaches required a physiological model, representing the human body's response to insulin and glucose intake, or are not directly applicable to mobile platforms (smart phones, tablets). In this paper, we propose an algorithm for mobile platforms to predict blood glucose levels without the need for a physiological model. Using an online software simulator program, we trained a Support Vector Regression (SVR) model and exported the parameter settings to our mobile platform. The prediction accuracy of our mobile platform was evaluated with pre-recorded data of a type 1 diabetes patient. The blood glucose level was predicted with an error of 19 % compared to the true value. Considering the permitted error of commercially used devices of 15 %, our algorithm is the basis for further development of mobile prediction algorithms.
Directory of Open Access Journals (Sweden)
Weize Li
2018-01-01
Full Text Available Research on stealthiness has become an important topic in the field of data integrity (DI attacks. To construct stealthy DI attacks, a common assumption in most related studies is that attackers have prior model knowledge of physical systems. In this paper, such assumption is relaxed and a covert agent is proposed based on the least squares support vector regression (LSSVR. By estimating a plant model from control and sensory data, the LSSVR-based covert agent can closely imitate the behavior of the physical plant. Then, the covert agent is used to construct a covert loop, which can keep the controller’s input and output both stealthy over a finite time window. Experiments have been carried out to show the effectiveness of the proposed method.
International Nuclear Information System (INIS)
Wang, L.F.; Bai, L.Y.
2013-01-01
To improve the precision of quantitative structure-activity relationship (QSAR) modeling for aromatic carboxylic acid derivatives insect repellent, a novel nonlinear combination forecast model was proposed integrating support vector regression (SVR) and K-nearest neighbor (KNN): Firstly, search optimal kernel function and nonlinearly select molecular descriptors by the rule of minimum MSE value using SVR. Secondly, illuminate the effects of all descriptors on biological activity by multi-round enforcement resistance-selection. Thirdly, construct the sub-models with predicted values of different KNN. Then, get the optimal kernel and corresponding retained sub-models through subtle selection. Finally, make prediction with leave-one-out (LOO) method in the basis of reserved sub-models. Compared with previous widely used models, our work shows significant improvement in modeling performance, which demonstrates the superiority of the present combination forecast model. (author)
Directory of Open Access Journals (Sweden)
Liyang Wang
2016-01-01
Full Text Available Time-varying external disturbances cause instability of humanoid robots or even tip robots over. In this work, a trapezoidal fuzzy least squares support vector regression- (TF-LSSVR- based control system is proposed to learn the external disturbances and increase the zero-moment-point (ZMP stability margin of humanoid robots. First, the humanoid states and the corresponding control torques of the joints for training the controller are collected by implementing simulation experiments. Secondly, a TF-LSSVR with a time-related trapezoidal fuzzy membership function (TFMF is proposed to train the controller using the simulated data. Thirdly, the parameters of the proposed TF-LSSVR are updated using a cubature Kalman filter (CKF. Simulation results are provided. The proposed method is shown to be effective in learning and adapting occasional external disturbances and ensuring the stability margin of the robot.
Method of dynamic fuzzy symptom vector in intelligent diagnosis
International Nuclear Information System (INIS)
Sun Hongyan; Jiang Xuefeng
2010-01-01
Aiming at the requirement of diagnostic symptom real-time updating brought from diagnostic knowledge accumulation and great gap in unit and value of diagnostic symptom in multi parameters intelligent diagnosis, the method of dynamic fuzzy symptom vector is proposed. The concept of dynamic fuzzy symptom vector is defined. Ontology is used to specify the vector elements, and the vector transmission method based on ontology is built. The changing law of symptom value is analyzed and fuzzy normalization method based on fuzzy membership functions is built. An instance proved method of dynamic fussy symptom vector is efficient to solve the problems of symptom updating and unify of symptom value and unit. (authors)
Directory of Open Access Journals (Sweden)
Peek Andrew S
2007-06-01
Full Text Available Abstract Background RNA interference (RNAi is a naturally occurring phenomenon that results in the suppression of a target RNA sequence utilizing a variety of possible methods and pathways. To dissect the factors that result in effective siRNA sequences a regression kernel Support Vector Machine (SVM approach was used to quantitatively model RNA interference activities. Results Eight overall feature mapping methods were compared in their abilities to build SVM regression models that predict published siRNA activities. The primary factors in predictive SVM models are position specific nucleotide compositions. The secondary factors are position independent sequence motifs (N-grams and guide strand to passenger strand sequence thermodynamics. Finally, the factors that are least contributory but are still predictive of efficacy are measures of intramolecular guide strand secondary structure and target strand secondary structure. Of these, the site of the 5' most base of the guide strand is the most informative. Conclusion The capacity of specific feature mapping methods and their ability to build predictive models of RNAi activity suggests a relative biological importance of these features. Some feature mapping methods are more informative in building predictive models and overall t-test filtering provides a method to remove some noisy features or make comparisons among datasets. Together, these features can yield predictive SVM regression models with increased predictive accuracy between predicted and observed activities both within datasets by cross validation, and between independently collected RNAi activity datasets. Feature filtering to remove features should be approached carefully in that it is possible to reduce feature set size without substantially reducing predictive models, but the features retained in the candidate models become increasingly distinct. Software to perform feature prediction and SVM training and testing on nucleic acid
Baydaroğlu, Özlem; Koçak, Kasım; Duran, Kemal
2018-06-01
Prediction of water amount that will enter the reservoirs in the following month is of vital importance especially for semi-arid countries like Turkey. Climate projections emphasize that water scarcity will be one of the serious problems in the future. This study presents a methodology for predicting river flow for the subsequent month based on the time series of observed monthly river flow with hybrid models of support vector regression (SVR). Monthly river flow over the period 1940-2012 observed for the Kızılırmak River in Turkey has been used for training the method, which then has been applied for predictions over a period of 3 years. SVR is a specific implementation of support vector machines (SVMs), which transforms the observed input data time series into a high-dimensional feature space (input matrix) by way of a kernel function and performs a linear regression in this space. SVR requires a special input matrix. The input matrix was produced by wavelet transforms (WT), singular spectrum analysis (SSA), and a chaotic approach (CA) applied to the input time series. WT convolutes the original time series into a series of wavelets, and SSA decomposes the time series into a trend, an oscillatory and a noise component by singular value decomposition. CA uses a phase space formed by trajectories, which represent the dynamics producing the time series. These three methods for producing the input matrix for the SVR proved successful, while the SVR-WT combination resulted in the highest coefficient of determination and the lowest mean absolute error.
Predictive based monitoring of nuclear plant component degradation using support vector regression
International Nuclear Information System (INIS)
Agarwal, Vivek; Alamaniotis, Miltiadis; Tsoukalas, Lefteri H.
2015-01-01
Nuclear power plants (NPPs) are large installations comprised of many active and passive assets. Degradation monitoring of all these assets is expensive (labor cost) and highly demanding task. In this paper a framework based on Support Vector Regression (SVR) for online surveillance of critical parameter degradation of NPP components is proposed. In this case, on time replacement or maintenance of components will prevent potential plant malfunctions, and reduce the overall operational cost. In the current work, we apply SVR equipped with a Gaussian kernel function to monitor components. Monitoring includes the one-step-ahead prediction of the component's respective operational quantity using the SVR model, while the SVR model is trained using a set of previous recorded degradation histories of similar components. Predictive capability of the model is evaluated upon arrival of a sensor measurement, which is compared to the component failure threshold. A maintenance decision is based on a fuzzy inference system that utilizes three parameters: (i) prediction evaluation in the previous steps, (ii) predicted value of the current step, (iii) and difference of current predicted value with components failure thresholds. The proposed framework will be tested on turbine blade degradation data.
Support vector regression methodology for estimating global solar radiation in Algeria
Guermoui, Mawloud; Rabehi, Abdelaziz; Gairaa, Kacem; Benkaciali, Said
2018-01-01
Accurate estimation of Daily Global Solar Radiation (DGSR) has been a major goal for solar energy applications. In this paper we show the possibility of developing a simple model based on the Support Vector Regression (SVM-R), which could be used to estimate DGSR on the horizontal surface in Algeria based only on sunshine ratio as input. The SVM model has been developed and tested using a data set recorded over three years (2005-2007). The data was collected at the Applied Research Unit for Renewable Energies (URAER) in Ghardaïa city. The data collected between 2005-2006 are used to train the model while the 2007 data are used to test the performance of the selected model. The measured and the estimated values of DGSR were compared during the testing phase statistically using the Root Mean Square Error (RMSE), Relative Square Error (rRMSE), and correlation coefficient (r2), which amount to 1.59(MJ/m2), 8.46 and 97,4%, respectively. The obtained results show that the SVM-R is highly qualified for DGSR estimation using only sunshine ratio.
Generation of daily global solar irradiation with support vector machines for regression
International Nuclear Information System (INIS)
Antonanzas-Torres, F.; Urraca, R.; Antonanzas, J.; Fernandez-Ceniceros, J.; Martinez-de-Pison, F.J.
2015-01-01
Highlights: • New methodology for estimation of daily solar irradiation with SVR. • Automatic procedure for training models and selecting meteorological features. • This methodology outperforms other well-known parametric and numeric techniques. - Abstract: Solar global irradiation is barely recorded in isolated rural areas around the world. Traditionally, solar resource estimation has been performed using parametric-empirical models based on the relationship of solar irradiation with other atmospheric and commonly measured variables, such as temperatures, rainfall, and sunshine duration, achieving a relatively high level of certainty. Considerable improvement in soft-computing techniques, which have been applied extensively in many research fields, has lead to improvements in solar global irradiation modeling, although most of these techniques lack spatial generalization. This new methodology proposes support vector machines for regression with optimized variable selection via genetic algorithms to generate non-locally dependent and accurate models. A case of study in Spain has demonstrated the value of this methodology. It achieved a striking reduction in the mean absolute error (MAE) – 41.4% and 19.9% – as compared to classic parametric models; Bristow & Campbell and Antonanzas-Torres et al., respectively
Energy Technology Data Exchange (ETDEWEB)
Brown, Joshua D.; Summers, Michael F. [University of Maryland Baltimore County, Howard Hughes Medical Institute (United States); Johnson, Bruce A., E-mail: bruce.johnson@asrc.cuny.edu [University of Maryland Baltimore County, Department of Chemistry and Biochemistry (United States)
2015-09-15
The Biological Magnetic Resonance Data Bank (BMRB) contains NMR chemical shift depositions for over 200 RNAs and RNA-containing complexes. We have analyzed the {sup 1}H NMR and {sup 13}C chemical shifts reported for non-exchangeable protons of 187 of these RNAs. Software was developed that downloads BMRB datasets and corresponding PDB structure files, and then generates residue-specific attributes based on the calculated secondary structure. Attributes represent properties present in each sequential stretch of five adjacent residues and include variables such as nucleotide type, base-pair presence and type, and tetraloop types. Attributes and {sup 1}H and {sup 13}C NMR chemical shifts of the central nucleotide are then used as input to train a predictive model using support vector regression. These models can then be used to predict shifts for new sequences. The new software tools, available as stand-alone scripts or integrated into the NMR visualization and analysis program NMRViewJ, should facilitate NMR assignment and/or validation of RNA {sup 1}H and {sup 13}C chemical shifts. In addition, our findings enabled the re-calibration a ring-current shift model using published NMR chemical shifts and high-resolution X-ray structural data as guides.
Tian, Y.; Xu, Y. P.
2017-12-01
In this paper, the Support Vector Regression (SVR) model incorporating climate indices and drought indices are developed to predict agriculture drought in Xiangjiang River basin, Central China. The agriculture droughts are presented with the Precipitation-Evapotranspiration Index (SPEI). According to the analysis of the relationship between SPEI with different time scales and soil moisture, it is found that SPEI of six months time scales (SPEI-6) could reflect the soil moisture better than that of three and one month time scale from the drought features including drought duration, severity and peak. Climate forcing like El Niño Southern Oscillation and western Pacific subtropical high (WPSH) are represented by climate indices such as MEI and series indices of WPSH. Ridge Point of WPSH is found to be the key factor that influences the agriculture drought mainly through the control of temperature. Based on the climate indices analysis, the predictions of SPEI-6 are conducted using the SVR model. The results show that the SVR model incorperating climate indices, especially ridge point of WPSH, could improve the prediction accuracy compared to that using drought index only. The improvement was more significant for the prediction of one month lead time than that of three months lead time. However, it needs to be cautious in selection of the input parameters, since adding more useless information could have a counter effect in attaining a better prediction.
Directory of Open Access Journals (Sweden)
N. Zahir
2015-12-01
Full Text Available Lake Urmia is one of the most important ecosystems of the country which is on the verge of elimination. Many factors contribute to this crisis among them is the precipitation, paly important roll. Precipitation has many forms one of them is in the form of snow. The snow on Sahand Mountain is one of the main and important sources of the Lake Urmia’s water. Snow Depth (SD is vital parameters for estimating water balance for future year. In this regards, this study is focused on SD parameter using Special Sensor Microwave/Imager (SSM/I instruments on board the Defence Meteorological Satellite Program (DMSP F16. The usual statistical methods for retrieving SD include linear and non-linear ones. These methods used least square procedure to estimate SD model. Recently, kernel base methods widely used for modelling statistical problem. From these methods, the support vector regression (SVR is achieved the high performance for modelling the statistical problem. Examination of the obtained data shows the existence of outlier in them. For omitting these outliers, wavelet denoising method is applied. After the omission of the outliers it is needed to select the optimum bands and parameters for SVR. To overcome these issues, feature selection methods have shown a direct effect on improving the regression performance. We used genetic algorithm (GA for selecting suitable features of the SSMI bands in order to estimate SD model. The results for the training and testing data in Sahand mountain is [R²_TEST=0.9049 and RMSE= 6.9654] that show the high SVR performance.
Directory of Open Access Journals (Sweden)
Gwo-Fong Lin
2016-01-01
Full Text Available This study describes the development of a reservoir inflow forecasting model for typhoon events to improve short lead-time flood forecasting performance. To strengthen the forecasting ability of the original support vector machines (SVMs model, the self-organizing map (SOM is adopted to group inputs into different clusters in advance of the proposed SOM-SVM model. Two different input methods are proposed for the SVM-based forecasting method, namely, SOM-SVM1 and SOM-SVM2. The methods are applied to an actual reservoir watershed to determine the 1 to 3 h ahead inflow forecasts. For 1, 2, and 3 h ahead forecasts, improvements in mean coefficient of efficiency (MCE due to the clusters obtained from SOM-SVM1 are 21.5%, 18.5%, and 23.0%, respectively. Furthermore, improvement in MCE for SOM-SVM2 is 20.9%, 21.2%, and 35.4%, respectively. Another SOM-SVM2 model increases the SOM-SVM1 model for 1, 2, and 3 h ahead forecasts obtained improvement increases of 0.33%, 2.25%, and 10.08%, respectively. These results show that the performance of the proposed model can provide improved forecasts of hourly inflow, especially in the proposed SOM-SVM2 model. In conclusion, the proposed model, which considers limit and higher related inputs instead of all inputs, can generate better forecasts in different clusters than are generated from the SOM process. The SOM-SVM2 model is recommended as an alternative to the original SVR (Support Vector Regression model because of its accuracy and robustness.
Near Real-Time Dust Aerosol Detection with Support Vector Machines for Regression
Rivas-Perea, P.; Rivas-Perea, P. E.; Cota-Ruiz, J.; Aragon Franco, R. A.
2015-12-01
Remote sensing instruments operating in the near-infrared spectrum usually provide the necessary information for further dust aerosol spectral analysis using statistical or machine learning algorithms. Such algorithms have proven to be effective in analyzing very specific case studies or dust events. However, very few make the analysis open to the public on a regular basis, fewer are designed specifically to operate in near real-time to higher resolutions, and almost none give a global daily coverage. In this research we investigated a large-scale approach to a machine learning algorithm called "support vector regression". The algorithm uses four near-infrared spectral bands from NASA MODIS instrument: B20 (3.66-3.84μm), B29 (8.40-8.70μm), B31 (10.78-11.28μm), and B32 (11.77-12.27μm). The algorithm is presented with ground truth from more than 30 distinct reported dust events, from different geographical regions, at different seasons, both over land and sea cover, in the presence of clouds and clear sky, and in the presence of fires. The purpose of our algorithm is to learn to distinguish the dust aerosols spectral signature from other spectral signatures, providing as output an estimate of the probability of a data point being consistent with dust aerosol signatures. During modeling with ground truth, our algorithm achieved more than 90% of accuracy, and the current live performance of the algorithm is remarkable. Moreover, our algorithm is currently operating in near real-time using NASA's Land, Atmosphere Near real-time Capability for EOS (LANCE) servers, providing a high resolution global overview including 64, 32, 16, 8, 4, 2, and 1km. The near real-time analysis of our algorithm is now available to the general public at http://dust.reev.us and archives of the results starting from 2012 are available upon request.
Estimation of Electrically-Evoked Knee Torque from Mechanomyography Using Support Vector Regression
Directory of Open Access Journals (Sweden)
Morufu Olusola Ibitoye
2016-07-01
Full Text Available The difficulty of real-time muscle force or joint torque estimation during neuromuscular electrical stimulation (NMES in physical therapy and exercise science has motivated recent research interest in torque estimation from other muscle characteristics. This study investigated the accuracy of a computational intelligence technique for estimating NMES-evoked knee extension torque based on the Mechanomyographic signals (MMG of contracting muscles that were recorded from eight healthy males. Simulation of the knee torque was modelled via Support Vector Regression (SVR due to its good generalization ability in related fields. Inputs to the proposed model were MMG amplitude characteristics, the level of electrical stimulation or contraction intensity, and knee angle. Gaussian kernel function, as well as its optimal parameters were identified with the best performance measure and were applied as the SVR kernel function to build an effective knee torque estimation model. To train and test the model, the data were partitioned into training (70% and testing (30% subsets, respectively. The SVR estimation accuracy, based on the coefficient of determination (R2 between the actual and the estimated torque values was up to 94% and 89% during the training and testing cases, with root mean square errors (RMSE of 9.48 and 12.95, respectively. The knee torque estimations obtained using SVR modelling agreed well with the experimental data from an isokinetic dynamometer. These findings support the realization of a closed-loop NMES system for functional tasks using MMG as the feedback signal source and an SVR algorithm for joint torque estimation.
Tian, Ye; Xu, Yue-Ping; Wang, Guoqing
2018-05-01
Drought can have a substantial impact on the ecosystem and agriculture of the affected region and does harm to local economy. This study aims to analyze the relation between soil moisture and drought and predict agricultural drought in Xiangjiang River basin. The agriculture droughts are presented with the Precipitation-Evapotranspiration Index (SPEI). The Support Vector Regression (SVR) model incorporating climate indices is developed to predict the agricultural droughts. Analysis of climate forcing including El Niño Southern Oscillation and western Pacific subtropical high (WPSH) are carried out to select climate indices. The results show that SPEI of six months time scales (SPEI-6) represents the soil moisture better than that of three and one month time scale on drought duration, severity and peaks. The key factor that influences the agriculture drought is the Ridge Point of WPSH, which mainly controls regional temperature. The SVR model incorporating climate indices, especially ridge point of WPSH, could improve the prediction accuracy compared to that solely using drought index by 4.4% in training and 5.1% in testing measured by Nash Sutcliffe efficiency coefficient (NSE) for three month lead time. The improvement is more significant for the prediction with one month lead (15.8% in training and 27.0% in testing) than that with three months lead time. However, it needs to be cautious in selection of the input parameters, since adding redundant information could have a counter effect in attaining a better prediction. Copyright © 2017 Elsevier B.V. All rights reserved.
Ridge regression estimator: combining unbiased and ordinary ridge regression methods of estimation
Directory of Open Access Journals (Sweden)
Sharad Damodar Gore
2009-10-01
Full Text Available Statistical literature has several methods for coping with multicollinearity. This paper introduces a new shrinkage estimator, called modified unbiased ridge (MUR. This estimator is obtained from unbiased ridge regression (URR in the same way that ordinary ridge regression (ORR is obtained from ordinary least squares (OLS. Properties of MUR are derived. Results on its matrix mean squared error (MMSE are obtained. MUR is compared with ORR and URR in terms of MMSE. These results are illustrated with an example based on data generated by Hoerl and Kennard (1975.
Wang, Guo-xiang; Wang, Hai-yan; Wang, Hu; Zhang, Zheng-yong; Liu, Jun
2016-03-01
It is an important and difficult research point to recognize the age of Chinese liquor rapidly and exactly in the field of liquor analyzing, which is also of great significance to the healthy development of the liquor industry and protection of the legitimate rights and interests of consumers. Spectroscopy together with the pattern recognition technology is a preferred method of achieving rapid identification of wine quality, in which the Raman Spectroscopy is promising because of its little affection of water and little or free of sample pretreatment. So, in this paper, Raman spectra and support vector regression (SVR) are used to recognize different ages and different storing time of the liquor of the same age. The innovation of this paper is mainly reflected in the following three aspects. First, the application of Raman in the area of liquor analysis is rarely reported till now. Second, the concentration of studying the recognition of wine age, while most studies focus on studying specific components of liquor and studies together with the pattern recognition method focus more on the identification of brands or different types of base wine. The third one is the application of regression analysis framework, which cannot be only used to identify different years of liquor, but also can be used to analyze different storing time, which has theoretical and practical significance to the research and quality control of liquor. Three kinds of experiments are conducted in this paper. Firstly, SVR is used to recognize different ages of 5, 8, 16 and 26 years of the Gujing Liquor; secondly, SVR is also used to classify the storing time of the 8-years liquor; thirdly, certain group of train data is deleted form the train set and put into the test set to simulate the actual situation of liquor age recognition. Results show that the SVR model has good train and predict performance in these experiments, and it has better performance than other non-liner regression method such
A multiple regression method for genomewide association studies ...
Indian Academy of Sciences (India)
Bujun Mei
2018-06-07
Jun 7, 2018 ... Similar to the typical genomewide association tests using LD ... new approach performed validly when the multiple regression based on linkage method was employed. .... the model, two groups of scenarios were simulated.
Directory of Open Access Journals (Sweden)
Shahram Mollaiy-Berneti
2018-02-01
Full Text Available Successful design of a carbon dioxide (CO2 flooding in enhanced oil recovery projects mostly depends on accurate determination of CO2-crude oil minimum miscibility pressure (MMP. Due to the high expensive and time-consuming of experimental determination of MMP, developing a fast and robust method to predict MMP is necessary. In this study, a new method based on ε-insensitive smooth support vector regression (ε-SSVR is introduced to predict MMP for both pure and impure CO2 gas injection cases. The proposed ε-SSVR is developed using dataset of reservoir temperature, crude oil composition and composition of injected CO2. To serve better understanding of the proposed, feed-forward neural network and radial basis function network applied to denoted dataset. The results show that the suggested ε-SSVR has acceptable reliability and robustness in comparison with two other models. Thus, the proposed method can be considered as an alternative way to monitor the MMP in miscible flooding process.
Regression Methods for Virtual Metrology of Layer Thickness in Chemical Vapor Deposition
DEFF Research Database (Denmark)
Purwins, Hendrik; Barak, Bernd; Nagi, Ahmed
2014-01-01
The quality of wafer production in semiconductor manufacturing cannot always be monitored by a costly physical measurement. Instead of measuring a quantity directly, it can be predicted by a regression method (Virtual Metrology). In this paper, a survey on regression methods is given to predict...... average Silicon Nitride cap layer thickness for the Plasma Enhanced Chemical Vapor Deposition (PECVD) dual-layer metal passivation stack process. Process and production equipment Fault Detection and Classification (FDC) data are used as predictor variables. Various variable sets are compared: one most...... algorithm, and Support Vector Regression (SVR). On a test set, SVR outperforms the other methods by a large margin, being more robust towards changes in the production conditions. The method performs better on high-dimensional multivariate input data than on the most predictive variables alone. Process...
BOX-COX REGRESSION METHOD IN TIME SCALING
Directory of Open Access Journals (Sweden)
ATİLLA GÖKTAŞ
2013-06-01
Full Text Available Box-Cox regression method with λj, for j = 1, 2, ..., k, power transformation can be used when dependent variable and error term of the linear regression model do not satisfy the continuity and normality assumptions. The situation obtaining the smallest mean square error when optimum power λj, transformation for j = 1, 2, ..., k, of Y has been discussed. Box-Cox regression method is especially appropriate to adjust existence skewness or heteroscedasticity of error terms for a nonlinear functional relationship between dependent and explanatory variables. In this study, the advantage and disadvantage use of Box-Cox regression method have been discussed in differentiation and differantial analysis of time scale concept.
Sidik, S. M.
1975-01-01
Ridge, Marquardt's generalized inverse, shrunken, and principal components estimators are discussed in terms of the objectives of point estimation of parameters, estimation of the predictive regression function, and hypothesis testing. It is found that as the normal equations approach singularity, more consideration must be given to estimable functions of the parameters as opposed to estimation of the full parameter vector; that biased estimators all introduce constraints on the parameter space; that adoption of mean squared error as a criterion of goodness should be independent of the degree of singularity; and that ordinary least-squares subset regression is the best overall method.
Chiogna, Gabriele; Marcolini, Giorgia; Liu, Wanying; Pérez Ciria, Teresa; Tuo, Ye
2018-08-15
Water management in the alpine region has an important impact on streamflow. In particular, hydropower production is known to cause hydropeaking i.e., sudden fluctuations in river stage caused by the release or storage of water in artificial reservoirs. Modeling hydropeaking with hydrological models, such as the Soil Water Assessment Tool (SWAT), requires knowledge of reservoir management rules. These data are often not available since they are sensitive information belonging to hydropower production companies. In this short communication, we propose to couple the results of a calibrated hydrological model with a machine learning method to reproduce hydropeaking without requiring the knowledge of the actual reservoir management operation. We trained a support vector machine (SVM) with SWAT model outputs, the day of the week and the energy price. We tested the model for the Upper Adige river basin in North-East Italy. A wavelet analysis showed that energy price has a significant influence on river discharge, and a wavelet coherence analysis demonstrated the improved performance of the SVM model in comparison to the SWAT model alone. The SVM model was also able to capture the fluctuations in streamflow caused by hydropeaking when both energy price and river discharge displayed a complex temporal dynamic. Copyright © 2018 Elsevier B.V. All rights reserved.
Chen, Jing; Qiu, Xiaojie; Yin, Cunyi; Jiang, Hao
2018-02-01
An efficient method to design the broadband gain-flattened Raman fiber amplifier with multiple pumps is proposed based on least squares support vector regression (LS-SVR). A multi-input multi-output LS-SVR model is introduced to replace the complicated solving process of the nonlinear coupled Raman amplification equation. The proposed approach contains two stages: offline training stage and online optimization stage. During the offline stage, the LS-SVR model is trained. Owing to the good generalization capability of LS-SVR, the net gain spectrum can be directly and accurately obtained when inputting any combination of the pump wavelength and power to the well-trained model. During the online stage, we incorporate the LS-SVR model into the particle swarm optimization algorithm to find the optimal pump configuration. The design results demonstrate that the proposed method greatly shortens the computation time and enhances the efficiency of the pump parameter optimization for Raman fiber amplifier design.
Li, Yongxin; Li, Yuanqian; Zheng, Bo; Qu, Lingli; Li, Can
2009-06-08
A rapid and sensitive method based on microchip capillary electrophoresis with condition optimization of genetic algorithm-support vector regression (GA-SVR) was developed and applied to simultaneous analysis of multiplex PCR products of four foodborne pathogenic bacteria. Four pairs of oligonucleotide primers were designed to exclusively amplify the targeted gene of Vibrio parahemolyticus, Salmonella, Escherichia coli (E. coli) O157:H7, Shigella and the quadruplex PCR parameters were optimized. At the same time, GA-SVR was employed to optimize the separation conditions of DNA fragments in microchip capillary electrophoresis. The proposed method was applied to simultaneously detect the multiplex PCR products of four foodborne pathogenic bacteria under the optimal conditions within 8 min. The levels of detection were as low as 1.2 x 10(2) CFU mL(-1) of Vibrio parahemolyticus, 2.9 x 10(2) CFU mL(-1) of Salmonella, 8.7 x 10(1) CFU mL(-1) of E. coli O157:H7 and 5.2 x 10(1) CFU mL(-1) of Shigella, respectively. The relative standard deviation of migration time was in the range of 0.74-2.09%. The results demonstrated that the good resolution and less analytical time were achieved due to the application of the multivariate strategy. This study offers an efficient alternative to routine foodborne pathogenic bacteria detection in a fast, reliable, and sensitive way.
Directory of Open Access Journals (Sweden)
Fereshteh Shiri
2010-08-01
Full Text Available In the present work, support vector machines (SVMs and multiple linear regression (MLR techniques were used for quantitative structure–property relationship (QSPR studies of retention time (tR in standardized liquid chromatography–UV–mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins based on molecular descriptors calculated from the optimized 3D structures. By applying missing value, zero and multicollinearity tests with a cutoff value of 0.95, and genetic algorithm method of variable selection, the most relevant descriptors were selected to build QSPR models. MLRand SVMs methods were employed to build QSPR models. The robustness of the QSPR models was characterized by the statistical validation and applicability domain (AD. The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability measure by r2 and q2 are 0.931 and 0.932, repectively, for SVM and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using William’s plot. The effects of different descriptors on the retention times are described.
On two flexible methods of 2-dimensional regression analysis
Czech Academy of Sciences Publication Activity Database
Volf, Petr
2012-01-01
Roč. 18, č. 4 (2012), s. 154-164 ISSN 1803-9782 Grant - others:GA ČR(CZ) GAP209/10/2045 Institutional support: RVO:67985556 Keywords : regression analysis * Gordon surface * prediction error * projection pursuit Subject RIV: BB - Applied Statistics, Operational Research http://library.utia.cas.cz/separaty/2013/SI/volf-on two flexible methods of 2-dimensional regression analysis.pdf
Energy Technology Data Exchange (ETDEWEB)
Jiang, Huaiguang [National Renewable Energy Laboratory (NREL), Golden, CO (United States)
2017-08-25
This work proposes an approach for distribution system load forecasting, which aims to provide highly accurate short-term load forecasting with high resolution utilizing a support vector regression (SVR) based forecaster and a two-step hybrid parameters optimization method. Specifically, because the load profiles in distribution systems contain abrupt deviations, a data normalization is designed as the pretreatment for the collected historical load data. Then an SVR model is trained by the load data to forecast the future load. For better performance of SVR, a two-step hybrid optimization algorithm is proposed to determine the best parameters. In the first step of the hybrid optimization algorithm, a designed grid traverse algorithm (GTA) is used to narrow the parameters searching area from a global to local space. In the second step, based on the result of the GTA, particle swarm optimization (PSO) is used to determine the best parameters in the local parameter space. After the best parameters are determined, the SVR model is used to forecast the short-term load deviation in the distribution system.
Thermal Efficiency Degradation Diagnosis Method Using Regression Model
International Nuclear Information System (INIS)
Jee, Chang Hyun; Heo, Gyun Young; Jang, Seok Won; Lee, In Cheol
2011-01-01
This paper proposes an idea for thermal efficiency degradation diagnosis in turbine cycles, which is based on turbine cycle simulation under abnormal conditions and a linear regression model. The correlation between the inputs for representing degradation conditions (normally unmeasured but intrinsic states) and the simulation outputs (normally measured but superficial states) was analyzed with the linear regression model. The regression models can inversely response an associated intrinsic state for a superficial state observed from a power plant. The diagnosis method proposed herein is classified into three processes, 1) simulations for degradation conditions to get measured states (referred as what-if method), 2) development of the linear model correlating intrinsic and superficial states, and 3) determination of an intrinsic state using the superficial states of current plant and the linear regression model (referred as inverse what-if method). The what-if method is to generate the outputs for the inputs including various root causes and/or boundary conditions whereas the inverse what-if method is the process of calculating the inverse matrix with the given superficial states, that is, component degradation modes. The method suggested in this paper was validated using the turbine cycle model for an operating power plant
Qiu, Shibin; Lane, Terran
2009-01-01
The cell defense mechanism of RNA interference has applications in gene function analysis and promising potentials in human disease therapy. To effectively silence a target gene, it is desirable to select appropriate initiator siRNA molecules having satisfactory silencing capabilities. Computational prediction for silencing efficacy of siRNAs can assist this screening process before using them in biological experiments. String kernel functions, which operate directly on the string objects representing siRNAs and target mRNAs, have been applied to support vector regression for the prediction and improved accuracy over numerical kernels in multidimensional vector spaces constructed from descriptors of siRNA design rules. To fully utilize information provided by string and numerical data, we propose to unify the two in a kernel feature space by devising a multiple kernel regression framework where a linear combination of the kernels is used. We formulate the multiple kernel learning into a quadratically constrained quadratic programming (QCQP) problem, which although yields global optimal solution, is computationally demanding and requires a commercial solver package. We further propose three heuristics based on the principle of kernel-target alignment and predictive accuracy. Empirical results demonstrate that multiple kernel regression can improve accuracy, decrease model complexity by reducing the number of support vectors, and speed up computational performance dramatically. In addition, multiple kernel regression evaluates the importance of constituent kernels, which for the siRNA efficacy prediction problem, compares the relative significance of the design rules. Finally, we give insights into the multiple kernel regression mechanism and point out possible extensions.
A Vector AutoRegressive (VAR) Approach to the Credit Channel for ...
African Journals Online (AJOL)
This paper is an attempt to determine the presence and empirical significance of monetary policy and the bank lending view of the credit channel for Mauritius, which is particularly relevant at these times. A vector autoregressive (VAR) model of order three is used to examine the monetary transmission mechanism using ...
Modern methods in topological vector spaces
Wilansky, Albert
2013-01-01
Designed for a one-year course in topological vector spaces, this text is geared toward advanced undergraduates and beginning graduate students of mathematics. The subjects involve properties employed by researchers in classical analysis, differential and integral equations, distributions, summability, and classical Banach and Frechét spaces. Optional problems with hints and references introduce non-locally convex spaces, Köthe-Toeplitz spaces, Banach algebra, sequentially barrelled spaces, and norming subspaces.Extensive introductory chapters cover metric ideas, Banach space, topological vect
Linear regression methods a ccording to objective functions
Yasemin Sisman; Sebahattin Bektas
2012-01-01
The aim of the study is to explain the parameter estimation methods and the regression analysis. The simple linear regressionmethods grouped according to the objective function are introduced. The numerical solution is achieved for the simple linear regressionmethods according to objective function of Least Squares and theLeast Absolute Value adjustment methods. The success of the appliedmethods is analyzed using their objective function values.
Georga, Eleni I; Protopappas, Vasilios C; Ardigò, Diego; Polyzos, Demosthenes; Fotiadis, Dimitrios I
2013-08-01
The prevention of hypoglycemic events is of paramount importance in the daily management of insulin-treated diabetes. The use of short-term prediction algorithms of the subcutaneous (s.c.) glucose concentration may contribute significantly toward this direction. The literature suggests that, although the recent glucose profile is a prominent predictor of hypoglycemia, the overall patient's context greatly impacts its accurate estimation. The objective of this study is to evaluate the performance of a support vector for regression (SVR) s.c. glucose method on hypoglycemia prediction. We extend our SVR model to predict separately the nocturnal events during sleep and the non-nocturnal (i.e., diurnal) ones over 30-min and 60-min horizons using information on recent glucose profile, meals, insulin intake, and physical activities for a hypoglycemic threshold of 70 mg/dL. We also introduce herein additional variables accounting for recurrent nocturnal hypoglycemia due to antecedent hypoglycemia, exercise, and sleep. SVR predictions are compared with those from two other machine learning techniques. The method is assessed on a dataset of 15 patients with type 1 diabetes under free-living conditions. Nocturnal hypoglycemic events are predicted with 94% sensitivity for both horizons and with time lags of 5.43 min and 4.57 min, respectively. As concerns the diurnal events, when physical activities are not considered, the sensitivity is 92% and 96% for a 30-min and 60-min horizon, respectively, with both time lags being less than 5 min. However, when such information is introduced, the diurnal sensitivity decreases by 8% and 3%, respectively. Both nocturnal and diurnal predictions show a high (>90%) precision. Results suggest that hypoglycemia prediction using SVR can be accurate and performs better in most diurnal and nocturnal cases compared with other techniques. It is advised that the problem of hypoglycemia prediction should be handled differently for nocturnal
Model reduction methods for vector autoregressive processes
Brüggemann, Ralf
2004-01-01
1. 1 Objective of the Study Vector autoregressive (VAR) models have become one of the dominant research tools in the analysis of macroeconomic time series during the last two decades. The great success of this modeling class started with Sims' (1980) critique of the traditional simultaneous equation models (SEM). Sims criticized the use of 'too many incredible restrictions' based on 'supposed a priori knowledge' in large scale macroeconometric models which were popular at that time. Therefore, he advo cated largely unrestricted reduced form multivariate time series models, unrestricted VAR models in particular. Ever since his influential paper these models have been employed extensively to characterize the underlying dynamics in systems of time series. In particular, tools to summarize the dynamic interaction between the system variables, such as impulse response analysis or forecast error variance decompo sitions, have been developed over the years. The econometrics of VAR models and related quantities i...
Vector-Parallel processing of the successive overrelaxation method
International Nuclear Information System (INIS)
Yokokawa, Mitsuo
1988-02-01
Successive overrelaxation method, called SOR method, is one of iterative methods for solving linear system of equations, and it has been calculated in serial with a natural ordering in many nuclear codes. After the appearance of vector processors, this natural SOR method has been changed for the parallel algorithm such as hyperplane or red-black method, in which the calculation order is modified. These methods are suitable for vector processors, and more high-speed calculation can be obtained compared with the natural SOR method on vector processors. In this report, a new scheme named 4-colors SOR method is proposed. We find that the 4-colors SOR method can be executed on vector-parallel processors and it gives the most high-speed calculation among all SOR methods according to results of the vector-parallel execution on the Alliant FX/8 multiprocessor system. It is also shown that the theoretical optimal acceleration parameters are equal among five different ordering SOR methods, and the difference between convergence rates of these SOR methods are examined. (author)
Delbari, Masoomeh; Sharifazari, Salman; Mohammadi, Ehsan
2018-02-01
The knowledge of soil temperature at different depths is important for agricultural industry and for understanding climate change. The aim of this study is to evaluate the performance of a support vector regression (SVR)-based model in estimating daily soil temperature at 10, 30 and 100 cm depth at different climate conditions over Iran. The obtained results were compared to those obtained from a more classical multiple linear regression (MLR) model. The correlation sensitivity for the input combinations and periodicity effect were also investigated. Climatic data used as inputs to the models were minimum and maximum air temperature, solar radiation, relative humidity, dew point, and the atmospheric pressure (reduced to see level), collected from five synoptic stations Kerman, Ahvaz, Tabriz, Saghez, and Rasht located respectively in the hyper-arid, arid, semi-arid, Mediterranean, and hyper-humid climate conditions. According to the results, the performance of both MLR and SVR models was quite well at surface layer, i.e., 10-cm depth. However, SVR performed better than MLR in estimating soil temperature at deeper layers especially 100 cm depth. Moreover, both models performed better in humid climate condition than arid and hyper-arid areas. Further, adding a periodicity component into the modeling process considerably improved the models' performance especially in the case of SVR.
Comparing parametric and nonparametric regression methods for panel data
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb......-Douglas and Translog, are unsuitable for analysing the optimal firm size. We show that the Translog functional form implies an implausible linear relationship between the (logarithmic) firm size and the elasticity of scale, where the slope is artificially related to the substitutability between the inputs....... The practical applicability of the parametric and non-parametric regression methods is scrutinised and compared by an empirical example: we analyse the production technology and investigate the optimal size of Polish crop farms based on a firm-level balanced panel data set. A nonparametric specification test...
Application of support vector regression (SVR) for stream flow prediction on the Amazon basin
CSIR Research Space (South Africa)
Du Toit, Melise
2016-10-01
Full Text Available regression technique is used in this study to analyse historical stream flow occurrences and predict stream flow values for the Amazon basin. Up to twelve month predictions are made and the coefficient of determination and root-mean-square error are used...
Fast computation of the characteristics method on vector computers
International Nuclear Information System (INIS)
Kugo, Teruhiko
2001-11-01
Fast computation of the characteristics method to solve the neutron transport equation in a heterogeneous geometry has been studied. Two vector computation algorithms; an odd-even sweep (OES) method and an independent sequential sweep (ISS) method have been developed and their efficiency to a typical fuel assembly calculation has been investigated. For both methods, a vector computation is 15 times faster than a scalar computation. From a viewpoint of comparison between the OES and ISS methods, the followings are found: 1) there is a small difference in a computation speed, 2) the ISS method shows a faster convergence and 3) the ISS method saves about 80% of computer memory size compared with the OES method. It is, therefore, concluded that the ISS method is superior to the OES method as a vectorization method. In the vector computation, a table-look-up method to reduce computation time of an exponential function saves only 20% of a whole computation time. Both the coarse mesh rebalance method and the Aitken acceleration method are effective as acceleration methods for the characteristics method, a combination of them saves 70-80% of outer iterations compared with a free iteration. (author)
Astuti, Yuniar Andi
2011-01-01
This study examines techniques Support Vector Regression and Decision Tree C4.5 has been used in studies in various fields, in order to know the advantages and disadvantages of both techniques that appear in Data Mining. From the ten studies that use both techniques, the results of the analysis showed that the accuracy of the SVR technique for 59,64% and C4.5 for 76,97% So in this study obtained a statement that C4.5 is better than SVR 097038020
Directory of Open Access Journals (Sweden)
Rachid Darnag
2017-02-01
Full Text Available Support vector machines (SVM represent one of the most promising Machine Learning (ML tools that can be applied to develop a predictive quantitative structure–activity relationship (QSAR models using molecular descriptors. Multiple linear regression (MLR and artificial neural networks (ANNs were also utilized to construct quantitative linear and non linear models to compare with the results obtained by SVM. The prediction results are in good agreement with the experimental value of HIV activity; also, the results reveal the superiority of the SVM over MLR and ANN model. The contribution of each descriptor to the structure–activity relationships was evaluated.
Xian, Guangming
2018-03-01
In this paper, the vibration flow field parameters of polymer melts in a visual slit die are optimized by using intelligent algorithm. Experimental small angle light scattering (SALS) patterns are shown to characterize the processing process. In order to capture the scattered light, a polarizer and an analyzer are placed before and after the polymer melts. The results reported in this study are obtained using high-density polyethylene (HDPE) with rotation speed at 28 rpm. In addition, support vector regression (SVR) analytical method is introduced for optimization the parameters of vibration flow field. This work establishes the general applicability of SVR for predicting the optimal parameters of vibration flow field.
Directory of Open Access Journals (Sweden)
ANDRÉS M. ÁLVAREZ MEZA
2012-01-01
Full Text Available RESUMEN: En este trabajo, se propone una metodología para la selección automática de los parámetros libres de la técnica de regresión basada en mínimos cuadrados máquinas de vectores de soporte (LS-SVM, a partir de un análisis de validación cruzada generalizada multidimensional sobre el conjunto de ecuaciones lineales de LS-SVM. La técnica desarrollada no requiere de un conocimiento a priori por parte del usuario acerca de la influencia de los parámetros libres en los resultados. Se realizan experimentos sobre dos bases de datos artificiales y dos bases de datos reales. De acuerdo a los resultados obtenidos, se concluye que el algoritmo desarrollado calcula regresiones apropiadas con errores relativos competentes.
Directory of Open Access Journals (Sweden)
Naradasu Kumar Ravi
2013-01-01
Full Text Available Diesel engine designers are constantly on the look-out for performance enhancement through efficient control of operating parameters. In this paper, the concept of an intelligent engine control system is proposed that seeks to ensure optimized performance under varying operating conditions. The concept is based on arriving at the optimum engine operating parameters to ensure the desired output in terms of efficiency. In addition, a Support Vector Machines based prediction model has been developed to predict the engine performance under varying operating conditions. Experiments were carried out at varying loads, compression ratios and amounts of exhaust gas recirculation using a variable compression ratio diesel engine for data acquisition. It was observed that the SVM model was able to predict the engine performance accurately.
Face Hallucination with Linear Regression Model in Semi-Orthogonal Multilinear PCA Method
Asavaskulkiet, Krissada
2018-04-01
In this paper, we propose a new face hallucination technique, face images reconstruction in HSV color space with a semi-orthogonal multilinear principal component analysis method. This novel hallucination technique can perform directly from tensors via tensor-to-vector projection by imposing the orthogonality constraint in only one mode. In our experiments, we use facial images from FERET database to test our hallucination approach which is demonstrated by extensive experiments with high-quality hallucinated color faces. The experimental results assure clearly demonstrated that we can generate photorealistic color face images by using the SO-MPCA subspace with a linear regression model.
Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D; Hood, Darryl B; Skelton, Tyler
2013-02-01
The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminate power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire.
Zhu, Xiaofeng; Suk, Heung-Il; Wang, Li; Lee, Seong-Whan; Shen, Dinggang
2017-05-01
In this paper, we focus on joint regression and classification for Alzheimer's disease diagnosis and propose a new feature selection method by embedding the relational information inherent in the observations into a sparse multi-task learning framework. Specifically, the relational information includes three kinds of relationships (such as feature-feature relation, response-response relation, and sample-sample relation), for preserving three kinds of the similarity, such as for the features, the response variables, and the samples, respectively. To conduct feature selection, we first formulate the objective function by imposing these three relational characteristics along with an ℓ 2,1 -norm regularization term, and further propose a computationally efficient algorithm to optimize the proposed objective function. With the dimension-reduced data, we train two support vector regression models to predict the clinical scores of ADAS-Cog and MMSE, respectively, and also a support vector classification model to determine the clinical label. We conducted extensive experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset to validate the effectiveness of the proposed method. Our experimental results showed the efficacy of the proposed method in enhancing the performances of both clinical scores prediction and disease status identification, compared to the state-of-the-art methods. Copyright © 2015 Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
Danilo A. López-Sarmiento
2013-11-01
Full Text Available In this paper is compared the performance of a multi-class least squares support vector machine (LSSVM mc versus a multi-class logistic regression classifier to problem of recognizing the numeric digits (0-9 handwritten. To develop the comparison was used a data set consisting of 5000 images of handwritten numeric digits (500 images for each number from 0-9, each image of 20 x 20 pixels. The inputs to each of the systems were vectors of 400 dimensions corresponding to each image (not done feature extraction. Both classifiers used OneVsAll strategy to enable multi-classification and a random cross-validation function for the process of minimizing the cost function. The metrics of comparison were precision and training time under the same computational conditions. Both techniques evaluated showed a precision above 95 %, with LS-SVM slightly more accurate. However the computational cost if we found a marked difference: LS-SVM training requires time 16.42 % less than that required by the logistic regression model based on the same low computational conditions.
FATAL, General Experiment Fitting Program by Nonlinear Regression Method
International Nuclear Information System (INIS)
Salmon, L.; Budd, T.; Marshall, M.
1982-01-01
1 - Description of problem or function: A generalized fitting program with a free-format keyword interface to the user. It permits experimental data to be fitted by non-linear regression methods to any function describable by the user. The user requires the minimum of computer experience but needs to provide a subroutine to define his function. Some statistical output is included as well as 'best' estimates of the function's parameters. 2 - Method of solution: The regression method used is based on a minimization technique devised by Powell (Harwell Subroutine Library VA05A, 1972) which does not require the use of analytical derivatives. The method employs a quasi-Newton procedure balanced with a steepest descent correction. Experience shows this to be efficient for a very wide range of application. 3 - Restrictions on the complexity of the problem: The current version of the program permits functions to be defined with up to 20 parameters. The function may be fitted to a maximum of 400 points, preferably with estimated values of weight given
Mapping urban environmental noise: a land use regression method.
Xie, Dan; Liu, Yi; Chen, Jining
2011-09-01
Forecasting and preventing urban noise pollution are major challenges in urban environmental management. Most existing efforts, including experiment-based models, statistical models, and noise mapping, however, have limited capacity to explain the association between urban growth and corresponding noise change. Therefore, these conventional methods can hardly forecast urban noise at a given outlook of development layout. This paper, for the first time, introduces a land use regression method, which has been applied for simulating urban air quality for a decade, to construct an urban noise model (LUNOS) in Dalian Municipality, Northwest China. The LUNOS model describes noise as a dependent variable of surrounding various land areas via a regressive function. The results suggest that a linear model performs better in fitting monitoring data, and there is no significant difference of the LUNOS's outputs when applied to different spatial scales. As the LUNOS facilitates a better understanding of the association between land use and urban environmental noise in comparison to conventional methods, it can be regarded as a promising tool for noise prediction for planning purposes and aid smart decision-making.
DEFF Research Database (Denmark)
Boeriis, Morten; van Leeuwen, Theo
2017-01-01
should be taken into account in discussing ‘reactions’, which Kress and van Leeuwen link only to eyeline vectors. Finally, the question can be raised as to whether actions are always realized by vectors. Drawing on a re-reading of Rudolf Arnheim’s account of vectors, these issues are outlined......This article revisits the concept of vectors, which, in Kress and van Leeuwen’s Reading Images (2006), plays a crucial role in distinguishing between ‘narrative’, action-oriented processes and ‘conceptual’, state-oriented processes. The use of this concept in image analysis has usually focused...
Directory of Open Access Journals (Sweden)
Giuliano de Oliveira Freitas
2013-10-01
Full Text Available PURPOSE: To determine linear regression models between Alpins descriptive indices and Thibos astigmatic power vectors (APV, assessing the validity and strength of such correlations. METHODS: This case series prospectively assessed 62 eyes of 31 consecutive cataract patients with preoperative corneal astigmatism between 0.75 and 2.50 diopters in both eyes. Patients were randomly assorted among two phacoemulsification groups: one assigned to receive AcrySof®Toric intraocular lens (IOL in both eyes and another assigned to have AcrySof Natural IOL associated with limbal relaxing incisions, also in both eyes. All patients were reevaluated postoperatively at 6 months, when refractive astigmatism analysis was performed using both Alpins and Thibos methods. The ratio between Thibos postoperative APV and preoperative APV (APVratio and its linear regression to Alpins percentage of success of astigmatic surgery, percentage of astigmatism corrected and percentage of astigmatism reduction at the intended axis were assessed. RESULTS: Significant negative correlation between the ratio of post- and preoperative Thibos APVratio and Alpins percentage of success (%Success was found (Spearman's ρ=-0.93; linear regression is given by the following equation: %Success = (-APVratio + 1.00x100. CONCLUSION: The linear regression we found between APVratio and %Success permits a validated mathematical inference concerning the overall success of astigmatic surgery.
A vector matching method for analysing logic Petri nets
Du, YuYue; Qi, Liang; Zhou, MengChu
2011-11-01
Batch processing function and passing value indeterminacy in cooperative systems can be described and analysed by logic Petri nets (LPNs). To directly analyse the properties of LPNs, the concept of transition enabling vector sets is presented and a vector matching method used to judge the enabling transitions is proposed in this article. The incidence matrix of LPNs is defined; an equation about marking change due to a transition's firing is given; and a reachable tree is constructed. The state space explosion is mitigated to a certain extent from directly analysing LPNs. Finally, the validity and reliability of the proposed method are illustrated by an example in electronic commerce.
Levine, Matthew E; Albers, David J; Hripcsak, George
2016-01-01
Time series analysis methods have been shown to reveal clinical and biological associations in data collected in the electronic health record. We wish to develop reliable high-throughput methods for identifying adverse drug effects that are easy to implement and produce readily interpretable results. To move toward this goal, we used univariate and multivariate lagged regression models to investigate associations between twenty pairs of drug orders and laboratory measurements. Multivariate lagged regression models exhibited higher sensitivity and specificity than univariate lagged regression in the 20 examples, and incorporating autoregressive terms for labs and drugs produced more robust signals in cases of known associations among the 20 example pairings. Moreover, including inpatient admission terms in the model attenuated the signals for some cases of unlikely associations, demonstrating how multivariate lagged regression models' explicit handling of context-based variables can provide a simple way to probe for health-care processes that confound analyses of EHR data.
Lu, Chi-Jie; Chang, Chi-Chang
2014-01-01
Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue of operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses the ICA to extract hidden information from the observed sales data. The extracted features are then applied to K-means algorithm for clustering the sales data into several disjoined clusters. Finally, the SVR forecasting models are applied to each group to generate final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting.
Yan, Jun; Huang, Jian-Hua; He, Min; Lu, Hong-Bing; Yang, Rui; Kong, Bo; Xu, Qing-Song; Liang, Yi-Zeng
2013-08-01
Retention indices for frequently reported compounds of plant essential oils on three different stationary phases were investigated. Multivariate linear regression, partial least squares, and support vector machine combined with a new variable selection approach called random-frog recently proposed by our group, were employed to model quantitative structure-retention relationships. Internal and external validations were performed to ensure the stability and predictive ability. All the three methods could obtain an acceptable model, and the optimal results by support vector machine based on a small number of informative descriptors with the square of correlation coefficient for cross validation, values of 0.9726, 0.9759, and 0.9331 on the dimethylsilicone stationary phase, the dimethylsilicone phase with 5% phenyl groups, and the PEG stationary phase, respectively. The performances of two variable selection approaches, random-frog and genetic algorithm, are compared. The importance of the variables was found to be consistent when estimated from correlation coefficients in multivariate linear regression equations and selection probability in model spaces. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Dimension Reduction and Discretization in Stochastic Problems by Regression Method
DEFF Research Database (Denmark)
Ditlevsen, Ove Dalager
1996-01-01
The chapter mainly deals with dimension reduction and field discretizations based directly on the concept of linear regression. Several examples of interesting applications in stochastic mechanics are also given.Keywords: Random fields discretization, Linear regression, Stochastic interpolation, ...
Oguntunde, Philip G.; Lischeid, Gunnar; Dietrich, Ottfried
2018-03-01
This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used was for a period of 36 years between 1980 and 2015. Similar to the observed decrease ( P 1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of the PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide more in-depth regional-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations as well as providing information that may help optimized planting dates for improved radiation use efficiency in the study area.
Oguntunde, Philip G; Lischeid, Gunnar; Dietrich, Ottfried
2018-03-01
This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used was for a period of 36 years between 1980 and 2015. Similar to the observed decrease (P 1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of the PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide more in-depth regional-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations as well as providing information that may help optimized planting dates for improved radiation use efficiency in the study area.
DEFF Research Database (Denmark)
Sharifzadeh, Sara; Skytte, Jacob Lercke; Nielsen, Otto Højager Attermann
2012-01-01
Statistical solutions find wide spread use in food and medicine quality control. We investigate the effect of different regression and sparse regression methods for a viscosity estimation problem using the spectro-temporal features from new Sub-Surface Laser Scattering (SLS) vision system. From...... with sparse LAR, lasso and Elastic Net (EN) sparse regression methods. Due to the inconsistent measurement condition, Locally Weighted Scatter plot Smoothing (Loess) has been employed to alleviate the undesired variation in the estimated viscosity. The experimental results of applying different methods show...
International Nuclear Information System (INIS)
Zheng, Xiujuan; Fang, Huajing
2015-01-01
The gradual decreasing capacity of lithium-ion batteries can serve as a health indicator for tracking the degradation of lithium-ion batteries. It is important to predict the capacity of a lithium-ion battery for future cycles to assess its health condition and remaining useful life (RUL). In this paper, a novel method is developed using unscented Kalman filter (UKF) with relevance vector regression (RVR) and applied to RUL and short-term capacity prediction of batteries. A RVR model is employed as a nonlinear time-series prediction model to predict the UKF future residuals which otherwise remain zero during the prediction period. Taking the prediction step into account, the predictive value through the RVR method and the latest real residual value constitute the future evolution of the residuals with a time-varying weighting scheme. Next, the future residuals are utilized by UKF to recursively estimate the battery parameters for predicting RUL and short-term capacity. Finally, the performance of the proposed method is validated and compared to other predictors with the experimental data. According to the experimental and analysis results, the proposed approach has high reliability and prediction accuracy, which can be applied to battery monitoring and prognostics, as well as generalized to other prognostic applications. - Highlights: • An integrated method is proposed for RUL prediction as well as short-term capacity prediction. • Relevance vector regression model is employed as a nonlinear time-series prediction model. • Unscented Kalman filter is used to recursively update the states for battery model parameters during the prediction. • A time-varying weighting scheme is utilized to improve the accuracy of the RUL prediction. • The proposed method demonstrates high reliability and prediction accuracy.
Improvement of vector compensation method for vehicle magnetic distortion field
Energy Technology Data Exchange (ETDEWEB)
Pang, Hongfeng, E-mail: panghongfeng@126.com; Zhang, Qi; Li, Ji; Luo, Shitu; Chen, Dixiang; Pan, Mengchun; Luo, Feilu
2014-03-15
Magnetic distortions such as eddy-current field and low frequency magnetic field have not been considered in vector compensation methods. A new compensation method is proposed to suppress these magnetic distortions and improve compensation performance, in which the magnetic distortions related to measurement vectors and time are considered. The experimental system mainly consists of a three-axis fluxgate magnetometer (DM-050), an underwater vehicle and a proton magnetometer, in which the scalar value of magnetic field is obtained with the proton magnetometer and considered to be the true value. Comparing with traditional compensation methods, experimental results show that the magnetic distortions can be further reduced by two times. After compensation, error intensity and RMS error are reduced from 11684.013 nT and 7794.604 nT to 16.219 nT and 5.907 nT respectively. It suggests an effective way to improve the compensation performance of magnetic distortions. - Highlights: • A new vector compensation method is proposed for vehicle magnetic distortion. • The proposed model not only includes magnetometer error but also considers magnetic distortion. • Compensation parameters are computed directly by solving nonlinear equations. • Compared with traditional methods, the proposed method is not related with rotation angle rate. • Error intensity and RMS error can be reduced to 1/2 of the error with traditional methods.
Improvement of vector compensation method for vehicle magnetic distortion field
International Nuclear Information System (INIS)
Pang, Hongfeng; Zhang, Qi; Li, Ji; Luo, Shitu; Chen, Dixiang; Pan, Mengchun; Luo, Feilu
2014-01-01
Magnetic distortions such as eddy-current field and low frequency magnetic field have not been considered in vector compensation methods. A new compensation method is proposed to suppress these magnetic distortions and improve compensation performance, in which the magnetic distortions related to measurement vectors and time are considered. The experimental system mainly consists of a three-axis fluxgate magnetometer (DM-050), an underwater vehicle and a proton magnetometer, in which the scalar value of magnetic field is obtained with the proton magnetometer and considered to be the true value. Comparing with traditional compensation methods, experimental results show that the magnetic distortions can be further reduced by two times. After compensation, error intensity and RMS error are reduced from 11684.013 nT and 7794.604 nT to 16.219 nT and 5.907 nT respectively. It suggests an effective way to improve the compensation performance of magnetic distortions. - Highlights: • A new vector compensation method is proposed for vehicle magnetic distortion. • The proposed model not only includes magnetometer error but also considers magnetic distortion. • Compensation parameters are computed directly by solving nonlinear equations. • Compared with traditional methods, the proposed method is not related with rotation angle rate. • Error intensity and RMS error can be reduced to 1/2 of the error with traditional methods
Naguib, Ibrahim A.; Darwish, Hany W.
2012-02-01
A comparison between support vector regression (SVR) and Artificial Neural Networks (ANNs) multivariate regression methods is established showing the underlying algorithm for each and making a comparison between them to indicate the inherent advantages and limitations. In this paper we compare SVR to ANN with and without variable selection procedure (genetic algorithm (GA)). To project the comparison in a sensible way, the methods are used for the stability indicating quantitative analysis of mixtures of mebeverine hydrochloride and sulpiride in binary mixtures as a case study in presence of their reported impurities and degradation products (summing up to 6 components) in raw materials and pharmaceutical dosage form via handling the UV spectral data. For proper analysis, a 6 factor 5 level experimental design was established resulting in a training set of 25 mixtures containing different ratios of the interfering species. An independent test set consisting of 5 mixtures was used to validate the prediction ability of the suggested models. The proposed methods (linear SVR (without GA) and linear GA-ANN) were successfully applied to the analysis of pharmaceutical tablets containing mebeverine hydrochloride and sulpiride mixtures. The results manifest the problem of nonlinearity and how models like the SVR and ANN can handle it. The methods indicate the ability of the mentioned multivariate calibration models to deconvolute the highly overlapped UV spectra of the 6 components' mixtures, yet using cheap and easy to handle instruments like the UV spectrophotometer.
Seshan, Hari; Goyal, Manish K; Falk, Michael W; Wuertz, Stefan
2014-04-15
The relationship between microbial community structure and function has been examined in detail in natural and engineered environments, but little work has been done on using microbial community information to predict function. We processed microbial community and operational data from controlled experiments with bench-scale bioreactor systems to predict reactor process performance. Four membrane-operated sequencing batch reactors treating synthetic wastewater were operated in two experiments to test the effects of (i) the toxic compound 3-chloroaniline (3-CA) and (ii) bioaugmentation targeting 3-CA degradation, on the sludge microbial community in the reactors. In the first experiment, two reactors were treated with 3-CA and two reactors were operated as controls without 3-CA input. In the second experiment, all four reactors were additionally bioaugmented with a Pseudomonas putida strain carrying a plasmid with a portion of the pathway for 3-CA degradation. Molecular data were generated from terminal restriction fragment length polymorphism (T-RFLP) analysis targeting the 16S rRNA and amoA genes from the sludge community. The electropherograms resulting from these T-RFs were used to calculate diversity indices - community richness, dynamics and evenness - for the domain Bacteria as well as for ammonia-oxidizing bacteria in each reactor over time. These diversity indices were then used to train and test a support vector regression (SVR) model to predict reactor performance based on input microbial community indices and operational data. Considering the diversity indices over time and across replicate reactors as discrete values, it was found that, although bioaugmentation with a bacterial strain harboring a subset of genes involved in the degradation of 3-CA did not bring about 3-CA degradation, it significantly affected the community as measured through all three diversity indices in both the general bacterial community and the ammonia-oxidizer community (
Impact of regression methods on improved effects of soil structure on soil water retention estimates
Nguyen, Phuong Minh; De Pue, Jan; Le, Khoa Van; Cornelis, Wim
2015-06-01
Increasing the accuracy of pedotransfer functions (PTFs), an indirect method for predicting non-readily available soil features such as soil water retention characteristics (SWRC), is of crucial importance for large scale agro-hydrological modeling. Adding significant predictors (i.e., soil structure), and implementing more flexible regression algorithms are among the main strategies of PTFs improvement. The aim of this study was to investigate whether the improved effect of categorical soil structure information on estimating soil-water content at various matric potentials, which has been reported in literature, could be enduringly captured by regression techniques other than the usually applied linear regression. Two data mining techniques, i.e., Support Vector Machines (SVM), and k-Nearest Neighbors (kNN), which have been recently introduced as promising tools for PTF development, were utilized to test if the incorporation of soil structure will improve PTF's accuracy under a context of rather limited training data. The results show that incorporating descriptive soil structure information, i.e., massive, structured and structureless, as grouping criterion can improve the accuracy of PTFs derived by SVM approach in the range of matric potential of -6 to -33 kPa (average RMSE decreased up to 0.005 m3 m-3 after grouping, depending on matric potentials). The improvement was primarily attributed to the outperformance of SVM-PTFs calibrated on structureless soils. No improvement was obtained with kNN technique, at least not in our study in which the data set became limited in size after grouping. Since there is an impact of regression techniques on the improved effect of incorporating qualitative soil structure information, selecting a proper technique will help to maximize the combined influence of flexible regression algorithms and soil structure information on PTF accuracy.
Filgueiras, Paulo R; Terra, Luciana A; Castro, Eustáquio V R; Oliveira, Lize M S L; Dias, Júlio C M; Poppi, Ronei J
2015-09-01
This paper aims to estimate the temperature equivalent to 10% (T10%), 50% (T50%) and 90% (T90%) of distilled volume in crude oils using (1)H NMR and support vector regression (SVR). Confidence intervals for the predicted values were calculated using a boosting-type ensemble method in a procedure called ensemble support vector regression (eSVR). The estimated confidence intervals obtained by eSVR were compared with previously accepted calculations from partial least squares (PLS) models and a boosting-type ensemble applied in the PLS method (ePLS). By using the proposed boosting strategy, it was possible to identify outliers in the T10% property dataset. The eSVR procedure improved the accuracy of the distillation temperature predictions in relation to standard PLS, ePLS and SVR. For T10%, a root mean square error of prediction (RMSEP) of 11.6°C was obtained in comparison with 15.6°C for PLS, 15.1°C for ePLS and 28.4°C for SVR. The RMSEPs for T50% were 24.2°C, 23.4°C, 22.8°C and 14.4°C for PLS, ePLS, SVR and eSVR, respectively. For T90%, the values of RMSEP were 39.0°C, 39.9°C and 39.9°C for PLS, ePLS, SVR and eSVR, respectively. The confidence intervals calculated by the proposed boosting methodology presented acceptable values for the three properties analyzed; however, they were lower than those calculated by the standard methodology for PLS. Copyright © 2015 Elsevier B.V. All rights reserved.
Statistical learning method in regression analysis of simulated positron spectral data
International Nuclear Information System (INIS)
Avdic, S. Dz.
2005-01-01
Positron lifetime spectroscopy is a non-destructive tool for detection of radiation induced defects in nuclear reactor materials. This work concerns the applicability of the support vector machines method for the input data compression in the neural network analysis of positron lifetime spectra. It has been demonstrated that the SVM technique can be successfully applied to regression analysis of positron spectra. A substantial data compression of about 50 % and 8 % of the whole training set with two and three spectral components respectively has been achieved including a high accuracy of the spectra approximation. However, some parameters in the SVM approach such as the insensitivity zone e and the penalty parameter C have to be chosen carefully to obtain a good performance. (author)
Methods of Detecting Outliers in A Regression Analysis Model ...
African Journals Online (AJOL)
PROF. O. E. OSUAGWU
2013-06-01
Jun 1, 2013 ... especially true in observational studies .... Simple linear regression and multiple ... The simple linear ..... Grubbs,F.E (1950): Sample Criteria for Testing Outlying observations: Annals of ... In experimental design, the Relative.
Classification Method in Integrated Information Network Using Vector Image Comparison
Directory of Open Access Journals (Sweden)
Zhou Yuan
2014-05-01
Full Text Available Wireless Integrated Information Network (WMN consists of integrated information that can get data from its surrounding, such as image, voice. To transmit information, large resource is required which decreases the service time of the network. In this paper we present a Classification Approach based on Vector Image Comparison (VIC for WMN that improve the service time of the network. The available methods for sub-region selection and conversion are also proposed.
Directory of Open Access Journals (Sweden)
Yongquan Dong
2018-04-01
Full Text Available Providing accurate electric load forecasting results plays a crucial role in daily energy management of the power supply system. Due to superior forecasting performance, the hybridizing support vector regression (SVR model with evolutionary algorithms has received attention and deserves to continue being explored widely. The cuckoo search (CS algorithm has the potential to contribute more satisfactory electric load forecasting results. However, the original CS algorithm suffers from its inherent drawbacks, such as parameters that require accurate setting, loss of population diversity, and easy trapping in local optima (i.e., premature convergence. Therefore, proposing some critical improvement mechanisms and employing an improved CS algorithm to determine suitable parameter combinations for an SVR model is essential. This paper proposes the SVR with chaotic cuckoo search (SVRCCS model based on using a tent chaotic mapping function to enrich the cuckoo search space and diversify the population to avoid trapping in local optima. In addition, to deal with the cyclic nature of electric loads, a seasonal mechanism is combined with the SVRCCS model, namely giving a seasonal SVR with chaotic cuckoo search (SSVRCCS model, to produce more accurate forecasting performances. The numerical results, tested by using the datasets from the National Electricity Market (NEM, Queensland, Australia and the New York Independent System Operator (NYISO, NY, USA, show that the proposed SSVRCCS model outperforms other alternative models.
Langut, Yael; Talhami, Alaa; Mamidi, Samarasimhareddy; Shir, Alexei; Zigler, Maya; Joubran, Salim; Sagalov, Anna; Flashner-Abramson, Efrat; Edinger, Nufar; Klein, Shoshana; Levitzki, Alexander
2017-12-26
There is an urgent need for an effective treatment for metastatic prostate cancer (PC). Prostate tumors invariably overexpress prostate surface membrane antigen (PSMA). We designed a nonviral vector, PEI-PEG-DUPA (PPD), comprising polyethylenimine-polyethyleneglycol (PEI-PEG) tethered to the PSMA ligand, 2-[3-(1, 3-dicarboxy propyl)ureido] pentanedioic acid (DUPA), to treat PC. The purpose of PEI is to bind polyinosinic/polycytosinic acid (polyIC) and allow endosomal release, while DUPA targets PC cells. PolyIC activates multiple pathways that lead to tumor cell death and to the activation of bystander effects that harness the immune system against the tumor, attacking nontargeted neighboring tumor cells and reducing the probability of acquired resistance and disease recurrence. Targeting polyIC directly to tumor cells avoids the toxicity associated with systemic delivery. PPD selectively delivered polyIC into PSMA-overexpressing PC cells, inducing apoptosis, cytokine secretion, and the recruitment of human peripheral blood mononuclear cells (PBMCs). PSMA-overexpressing tumors in nonobese diabetic/severe combined immunodeficiency (NOD/SCID) mice with partially reconstituted immune systems were significantly shrunken following PPD/polyIC treatment, in all cases. Half of the tumors showed complete regression. PPD/polyIC invokes antitumor immunity, but unlike many immunotherapies does not need to be personalized for each patient. The potent antitumor effects of PPD/polyIC should spur its development for clinical use.
Li, Liwei; Wang, Bo; Meroueh, Samy O
2011-09-26
The community structure-activity resource (CSAR) data sets are used to develop and test a support vector machine-based scoring function in regression mode (SVR). Two scoring functions (SVR-KB and SVR-EP) are derived with the objective of reproducing the trend of the experimental binding affinities provided within the two CSAR data sets. The features used to train SVR-KB are knowledge-based pairwise potentials, while SVR-EP is based on physicochemical properties. SVR-KB and SVR-EP were compared to seven other widely used scoring functions, including Glide, X-score, GoldScore, ChemScore, Vina, Dock, and PMF. Results showed that SVR-KB trained with features obtained from three-dimensional complexes of the PDBbind data set outperformed all other scoring functions, including best performing X-score, by nearly 0.1 using three correlation coefficients, namely Pearson, Spearman, and Kendall. It was interesting that higher performance in rank ordering did not translate into greater enrichment in virtual screening assessed using the 40 targets of the Directory of Useful Decoys (DUD). To remedy this situation, a variant of SVR-KB (SVR-KBD) was developed by following a target-specific tailoring strategy that we had previously employed to derive SVM-SP. SVR-KBD showed a much higher enrichment, outperforming all other scoring functions tested, and was comparable in performance to our previously derived scoring function SVM-SP.
Petković, Dalibor; Shamshirband, Shahaboddin; Saboohi, Hadi; Ang, Tan Fong; Anuar, Nor Badrul; Rahman, Zulkanain Abdul; Pavlović, Nenad T.
2014-07-01
The quantitative assessment of image quality is an important consideration in any type of imaging system. The modulation transfer function (MTF) is a graphical description of the sharpness and contrast of an imaging system or of its individual components. The MTF is also known and spatial frequency response. The MTF curve has different meanings according to the corresponding frequency. The MTF of an optical system specifies the contrast transmitted by the system as a function of image size, and is determined by the inherent optical properties of the system. In this study, the polynomial and radial basis function (RBF) are applied as the kernel function of Support Vector Regression (SVR) to estimate and predict estimate MTF value of the actual optical system according to experimental tests. Instead of minimizing the observed training error, SVR_poly and SVR_rbf attempt to minimize the generalization error bound so as to achieve generalized performance. The experimental results show that an improvement in predictive accuracy and capability of generalization can be achieved by the SVR_rbf approach in compare to SVR_poly soft computing methodology.
Heddam, Salim; Kisi, Ozgur
2018-04-01
In the present study, three types of artificial intelligence techniques, least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5T) are applied for modeling daily dissolved oxygen (DO) concentration using several water quality variables as inputs. The DO concentration and water quality variables data from three stations operated by the United States Geological Survey (USGS) were used for developing the three models. The water quality data selected consisted of daily measured of water temperature (TE, °C), pH (std. unit), specific conductance (SC, μS/cm) and discharge (DI cfs), are used as inputs to the LSSVM, MARS and M5T models. The three models were applied for each station separately and compared to each other. According to the results obtained, it was found that: (i) the DO concentration could be successfully estimated using the three models and (ii) the best model among all others differs from one station to another.
Tang, J. L.; Cai, C. Z.; Xiao, T. T.; Huang, S. J.
2012-07-01
The electrical conductivity of solid oxide fuel cell (SOFC) cathode is one of the most important indices affecting the efficiency of SOFC. In order to improve the performance of fuel cell system, it is advantageous to have accurate model with which one can predict the electrical conductivity. In this paper, a model utilizing support vector regression (SVR) approach combined with particle swarm optimization (PSO) algorithm for its parameter optimization was established to modeling and predicting the electrical conductivity of Ba0.5Sr0.5Co0.8Fe0.2 O3-δ-xSm0.5Sr0.5CoO3-δ (BSCF-xSSC) composite cathode under two influence factors, including operating temperature (T) and SSC content (x) in BSCF-xSSC composite cathode. The leave-one-out cross validation (LOOCV) test result by SVR strongly supports that the generalization ability of SVR model is high enough. The absolute percentage error (APE) of 27 samples does not exceed 0.05%. The mean absolute percentage error (MAPE) of all 30 samples is only 0.09% and the correlation coefficient (R2) as high as 0.999. This investigation suggests that the hybrid PSO-SVR approach may be not only a promising and practical methodology to simulate the properties of fuel cell system, but also a powerful tool to be used for optimal designing or controlling the operating process of a SOFC system.
Directory of Open Access Journals (Sweden)
Ping Jiang
2015-01-01
Full Text Available Wind speed/power has received increasing attention around the earth due to its renewable nature as well as environmental friendliness. With the global installed wind power capacity rapidly increasing, wind industry is growing into a large-scale business. Reliable short-term wind speed forecasts play a practical and crucial role in wind energy conversion systems, such as the dynamic control of wind turbines and power system scheduling. In this paper, an intelligent hybrid model for short-term wind speed prediction is examined; the model is based on cross correlation (CC analysis and a support vector regression (SVR model that is coupled with brainstorm optimization (BSO and cuckoo search (CS algorithms, which are successfully utilized for parameter determination. The proposed hybrid models were used to forecast short-term wind speeds collected from four wind turbines located on a wind farm in China. The forecasting results demonstrate that the intelligent hybrid models outperform single models for short-term wind speed forecasting, which mainly results from the superiority of BSO and CS for parameter optimization.
Analysis of some methods for reduced rank Gaussian process regression
DEFF Research Database (Denmark)
Quinonero-Candela, J.; Rasmussen, Carl Edward
2005-01-01
While there is strong motivation for using Gaussian Processes (GPs) due to their excellent performance in regression and classification problems, their computational complexity makes them impractical when the size of the training set exceeds a few thousand cases. This has motivated the recent...... proliferation of a number of cost-effective approximations to GPs, both for classification and for regression. In this paper we analyze one popular approximation to GPs for regression: the reduced rank approximation. While generally GPs are equivalent to infinite linear models, we show that Reduced Rank...... Gaussian Processes (RRGPs) are equivalent to finite sparse linear models. We also introduce the concept of degenerate GPs and show that they correspond to inappropriate priors. We show how to modify the RRGP to prevent it from being degenerate at test time. Training RRGPs consists both in learning...
A method for generating double-ring-shaped vector beams
Huan, Chen; Xiao-Hui, Ling; Zhi-Hong, Chen; Qian-Guang, Li; Hao, Lv; Hua-Qing, Yu; Xu-Nong, Yi
2016-07-01
We propose a method for generating double-ring-shaped vector beams. A step phase introduced by a spatial light modulator (SLM) first makes the incident laser beam have a nodal cycle. This phase is dynamic in nature because it depends on the optical length. Then a Pancharatnam-Berry phase (PBP) optical element is used to manipulate the local polarization of the optical field by modulating the geometric phase. The experimental results show that this scheme can effectively create double-ring-shaped vector beams. It provides much greater flexibility to manipulate the phase and polarization by simultaneously modulating the dynamic and the geometric phases. Project supported by the National Natural Science Foundation of China (Grant No. 11547017), the Hubei Engineering University Research Foundation, China (Grant No. z2014001), and the Natural Science Foundation of Hubei Province, China (Grant No. 2014CFB578).
Biosensor method and system based on feature vector extraction
Greenbaum, Elias [Knoxville, TN; Rodriguez, Jr., Miguel; Qi, Hairong [Knoxville, TN; Wang, Xiaoling [San Jose, CA
2012-04-17
A method of biosensor-based detection of toxins comprises the steps of providing at least one time-dependent control signal generated by a biosensor in a gas or liquid medium, and obtaining a time-dependent biosensor signal from the biosensor in the gas or liquid medium to be monitored or analyzed for the presence of one or more toxins selected from chemical, biological or radiological agents. The time-dependent biosensor signal is processed to obtain a plurality of feature vectors using at least one of amplitude statistics and a time-frequency analysis. At least one parameter relating to toxicity of the gas or liquid medium is then determined from the feature vectors based on reference to the control signal.
A New Method for Estimation of Velocity Vectors
DEFF Research Database (Denmark)
Jensen, Jørgen Arendt; Munk, Peter
1998-01-01
The paper describes a new method for determining the velocity vector of a remotely sensed object using either sound or electromagnetic radiation. The movement of the object is determined from a field with spatial oscillations in both the axial direction of the transducer and in one or two...... directions transverse to the axial direction. By using a number of pulse emissions, the inter-pulse movement can be estimated and the velocity found from the estimated movement and the time between pulses. The method is based on the principle of using transverse spatial modulation for making the received...
Kernel method for clustering based on optimal target vector
International Nuclear Information System (INIS)
Angelini, Leonardo; Marinazzo, Daniele; Pellicoro, Mario; Stramaglia, Sebastiano
2006-01-01
We introduce Ising models, suitable for dichotomic clustering, with couplings that are (i) both ferro- and anti-ferromagnetic (ii) depending on the whole data-set and not only on pairs of samples. Couplings are determined exploiting the notion of optimal target vector, here introduced, a link between kernel supervised and unsupervised learning. The effectiveness of the method is shown in the case of the well-known iris data-set and in benchmarks of gene expression levels, where it works better than existing methods for dichotomic clustering
Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula
2011-01-01
Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets comes the inherent challenges of new methods of statistical analysis and modeling. Considering a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.
DEFF Research Database (Denmark)
Kirkeby, Carsten Thure; Hisham Beshara Halasa, Tariq; Gussmann, Maya Katrin
2017-01-01
the transmission rate. We use data from the two simulation models and vary the sampling intervals and the size of the population sampled. We devise two new methods to determine transmission rate, and compare these to the frequently used Poisson regression method in both epidemic and endemic situations. For most...... tested scenarios these new methods perform similar or better than Poisson regression, especially in the case of long sampling intervals. We conclude that transmission rate estimates are easily biased, which is important to take into account when using these rates in simulation models....
Regression Methods for Ophthalmic Glucose Sensing Using Metamaterials
Directory of Open Access Journals (Sweden)
Philipp Rapp
2011-01-01
Full Text Available We present a novel concept for in vivo sensing of glucose using metamaterials in combination with automatic learning systems. In detail, we use the plasmonic analogue of electromagnetically induced transparency (EIT as sensor and evaluate the acquired data with support vector machines. The metamaterial can be integrated into a contact lens. This sensor changes its optical properties such as reflectivity upon the ambient glucose concentration, which allows for in situ measurements in the eye. We demonstrate that estimation errors below 2% at physiological concentrations are possible using simulations of the optical properties of the metamaterial in combination with an appropriate electrical circuitry and signal processing scheme. In the future, functionalization of our sensor with hydrogel will allow for a glucose-specific detection which is insensitive to other tear liquid substances providing both excellent selectivity and sensitivity.
Helmreich, James E.; Krog, K. Peter
2018-01-01
We present a short, inquiry-based learning course on concepts and methods underlying ordinary least squares (OLS), least absolute deviation (LAD), and quantile regression (QR). Students investigate squared, absolute, and weighted absolute distance functions (metrics) as location measures. Using differential calculus and properties of convex…
Hong, S-M; Bukhari, W
2014-07-07
The motion of thoracic and abdominal tumours induced by respiratory motion often exceeds 20 mm, and can significantly compromise dose conformality. Motion-adaptive radiotherapy aims to deliver a conformal dose distribution to the tumour with minimal normal tissue exposure by compensating for the tumour motion. This adaptive radiotherapy, however, requires the prediction of the tumour movement that can occur over the system latency period. In general, motion prediction approaches can be classified into two groups: model-based and model-free. Model-based approaches utilize a motion model in predicting respiratory motion. These approaches are computationally efficient and responsive to irregular changes in respiratory motion. Model-free approaches do not assume an explicit model of motion dynamics, and predict future positions by learning from previous observations. Artificial neural networks (ANNs) and support vector regression (SVR) are examples of model-free approaches. In this article, we present a prediction algorithm that combines a model-based and a model-free approach in a cascade structure. The algorithm, which we call EKF-SVR, first employs a model-based algorithm (named LCM-EKF) to predict the respiratory motion, and then uses a model-free SVR algorithm to estimate and correct the error of the LCM-EKF prediction. Extensive numerical experiments based on a large database of 304 respiratory motion traces are performed. The experimental results demonstrate that the EKF-SVR algorithm successfully reduces the prediction error of the LCM-EKF, and outperforms the model-free ANN and SVR algorithms in terms of prediction accuracy across lookahead lengths of 192, 384, and 576 ms.
International Nuclear Information System (INIS)
Hong, S-M; Bukhari, W
2014-01-01
The motion of thoracic and abdominal tumours induced by respiratory motion often exceeds 20 mm, and can significantly compromise dose conformality. Motion-adaptive radiotherapy aims to deliver a conformal dose distribution to the tumour with minimal normal tissue exposure by compensating for the tumour motion. This adaptive radiotherapy, however, requires the prediction of the tumour movement that can occur over the system latency period. In general, motion prediction approaches can be classified into two groups: model-based and model-free. Model-based approaches utilize a motion model in predicting respiratory motion. These approaches are computationally efficient and responsive to irregular changes in respiratory motion. Model-free approaches do not assume an explicit model of motion dynamics, and predict future positions by learning from previous observations. Artificial neural networks (ANNs) and support vector regression (SVR) are examples of model-free approaches. In this article, we present a prediction algorithm that combines a model-based and a model-free approach in a cascade structure. The algorithm, which we call EKF–SVR, first employs a model-based algorithm (named LCM–EKF) to predict the respiratory motion, and then uses a model-free SVR algorithm to estimate and correct the error of the LCM–EKF prediction. Extensive numerical experiments based on a large database of 304 respiratory motion traces are performed. The experimental results demonstrate that the EKF–SVR algorithm successfully reduces the prediction error of the LCM–EKF, and outperforms the model-free ANN and SVR algorithms in terms of prediction accuracy across lookahead lengths of 192, 384, and 576 ms. (paper)
Qin, Zijian; Wang, Maolin; Yan, Aixia
2017-07-01
In this study, quantitative structure-activity relationship (QSAR) models using various descriptor sets and training/test set selection methods were explored to predict the bioactivity of hepatitis C virus (HCV) NS3/4A protease inhibitors by using a multiple linear regression (MLR) and a support vector machine (SVM) method. 512 HCV NS3/4A protease inhibitors and their IC 50 values which were determined by the same FRET assay were collected from the reported literature to build a dataset. All the inhibitors were represented with selected nine global and 12 2D property-weighted autocorrelation descriptors calculated from the program CORINA Symphony. The dataset was divided into a training set and a test set by a random and a Kohonen's self-organizing map (SOM) method. The correlation coefficients (r 2 ) of training sets and test sets were 0.75 and 0.72 for the best MLR model, 0.87 and 0.85 for the best SVM model, respectively. In addition, a series of sub-dataset models were also developed. The performances of all the best sub-dataset models were better than those of the whole dataset models. We believe that the combination of the best sub- and whole dataset SVM models can be used as reliable lead designing tools for new NS3/4A protease inhibitors scaffolds in a drug discovery pipeline. Copyright © 2017 Elsevier Ltd. All rights reserved.
Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry
2018-03-29
The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one-hundred and sixty rainbow trout ( Oncorhynchus mykiss ) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets including Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k -Nearest neighbours ( k -NN). The SVM with radial based kernel provided the best classifier with correct classification rate (CCR) of 82% and Kappa coefficient of 0.65. Although the both LR and RF methods were less accurate than SVM, they achieved good classification with CCR 75% and 70% respectively. The k -NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as the fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, these was a close association between image-based features and fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during the cultivation by evaluating diet's effects on fish skin.
Directory of Open Access Journals (Sweden)
Mohammadmehdi Saberioon
2018-03-01
Full Text Available The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one-hundred and sixty rainbow trout (Oncorhynchus mykiss were fed either a fish-meal based diet (80 fish or a 100% plant-based diet (80 fish and photographed using consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets including Random forest (RF, Support vector machine (SVM, Logistic regression (LR and k-Nearest neighbours (k-NN. The SVM with radial based kernel provided the best classifier with correct classification rate (CCR of 82% and Kappa coefficient of 0.65. Although the both LR and RF methods were less accurate than SVM, they achieved good classification with CCR 75% and 70% respectively. The k-NN was the least accurate (40% classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as the fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, these was a close association between image-based features and fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during the cultivation by evaluating diet’s effects on fish skin.
Multiresolution and Explicit Methods for Vector Field Analysis and Visualization
Nielson, Gregory M.
1997-01-01
This is a request for a second renewal (3d year of funding) of a research project on the topic of multiresolution and explicit methods for vector field analysis and visualization. In this report, we describe the progress made on this research project during the second year and give a statement of the planned research for the third year. There are two aspects to this research project. The first is concerned with the development of techniques for computing tangent curves for use in visualizing flow fields. The second aspect of the research project is concerned with the development of multiresolution methods for curvilinear grids and their use as tools for visualization, analysis and archiving of flow data. We report on our work on the development of numerical methods for tangent curve computation first.
Digital Repository Service at National Institute of Oceanography (India)
Patil, S.G.; Mandal, S.; Hegde, A.V.; Muruganandam, A.
Support Vector Machine (SVM) works on structural risk minimization principle that has greater generalization ability and is superior to the empirical risk minimization principle as adopted in conventional neural network models. However...
International Nuclear Information System (INIS)
Jain, Rishee K.; Smith, Kevin M.; Culligan, Patricia J.; Taylor, John E.
2014-01-01
Highlights: • We develop a building energy forecasting model using support vector regression. • Model is applied to data from a multi-family residential building in New York City. • We extend sensor based energy forecasting to multi-family residential buildings. • We examine the impact temporal and spatial granularity has on model accuracy. • Optimal granularity occurs at the by floor in hourly temporal intervals. - Abstract: Buildings are the dominant source of energy consumption and environmental emissions in urban areas. Therefore, the ability to forecast and characterize building energy consumption is vital to implementing urban energy management and efficiency initiatives required to curb emissions. Advances in smart metering technology have enabled researchers to develop “sensor based” approaches to forecast building energy consumption that necessitate less input data than traditional methods. Sensor-based forecasting utilizes machine learning techniques to infer the complex relationships between consumption and influencing variables (e.g., weather, time of day, previous consumption). While sensor-based forecasting has been studied extensively for commercial buildings, there is a paucity of research applying this data-driven approach to the multi-family residential sector. In this paper, we build a sensor-based forecasting model using Support Vector Regression (SVR), a commonly used machine learning technique, and apply it to an empirical data-set from a multi-family residential building in New York City. We expand our study to examine the impact of temporal (i.e., daily, hourly, 10 min intervals) and spatial (i.e., whole building, by floor, by unit) granularity have on the predictive power of our single-step model. Results indicate that sensor based forecasting models can be extended to multi-family residential buildings and that the optimal monitoring granularity occurs at the by floor level in hourly intervals. In addition to implications for
Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.
2015-05-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels
Naguib, Ibrahim A.; Abdelaleem, Eglal A.; Draz, Mohammed E.; Zaazaa, Hala E.
2014-09-01
Partial least squares regression (PLSR) and support vector regression (SVR) are two popular chemometric models that are being subjected to a comparative study in the presented work. The comparison shows their characteristics via applying them to analyze Hydrochlorothiazide (HCZ) and Benazepril hydrochloride (BZ) in presence of HCZ impurities; Chlorothiazide (CT) and Salamide (DSA) as a case study. The analysis results prove to be valid for analysis of the two active ingredients in raw materials and pharmaceutical dosage form through handling UV spectral data in range (220-350 nm). For proper analysis a 4 factor 4 level experimental design was established resulting in a training set consisting of 16 mixtures containing different ratios of interfering species. An independent test set consisting of 8 mixtures was used to validate the prediction ability of the suggested models. The results presented indicate the ability of mentioned multivariate calibration models to analyze HCZ and BZ in presence of HCZ impurities CT and DSA with high selectivity and accuracy of mean percentage recoveries of (101.01 ± 0.80) and (100.01 ± 0.87) for HCZ and BZ respectively using PLSR model and of (99.78 ± 0.80) and (99.85 ± 1.08) for HCZ and BZ respectively using SVR model. The analysis results of the dosage form were statistically compared to the reference HPLC method with no significant differences regarding accuracy and precision. SVR model gives more accurate results compared to PLSR model and show high generalization ability, however, PLSR still keeps the advantage of being fast to optimize and implement.
Finding-equal regression method and its application in predication of U resources
International Nuclear Information System (INIS)
Cao Huimo
1995-03-01
The commonly adopted deposit model method in mineral resources predication has two main part: one is model data that show up geological mineralization law for deposit, the other is statistics predication method that accords with characters of the data namely pretty regression method. This kind of regression method may be called finding-equal regression, which is made of the linear regression and distribution finding-equal method. Because distribution finding-equal method is a data pretreatment which accords with advanced mathematical precondition for the linear regression namely equal distribution theory, and this kind of data pretreatment is possible of realization. Therefore finding-equal regression not only can overcome nonlinear limitations, that are commonly occurred in traditional linear regression or other regression and always have no solution, but also can distinguish outliers and eliminate its weak influence, which would usually appeared when Robust regression possesses outlier in independent variables. Thus this newly finding-equal regression stands the best status in all kind of regression methods. Finally, two good examples of U resource quantitative predication are provided
Directory of Open Access Journals (Sweden)
Xianyu Yu
2016-05-01
Full Text Available In this study, a novel coupling model for landslide susceptibility mapping is presented. In practice, environmental factors may have different impacts at a local scale in study areas. To provide better predictions, a geographically weighted regression (GWR technique is firstly used in our method to segment study areas into a series of prediction regions with appropriate sizes. Meanwhile, a support vector machine (SVM classifier is exploited in each prediction region for landslide susceptibility mapping. To further improve the prediction performance, the particle swarm optimization (PSO algorithm is used in the prediction regions to obtain optimal parameters for the SVM classifier. To evaluate the prediction performance of our model, several SVM-based prediction models are utilized for comparison on a study area of the Wanzhou district in the Three Gorges Reservoir. Experimental results, based on three objective quantitative measures and visual qualitative evaluation, indicate that our model can achieve better prediction accuracies and is more effective for landslide susceptibility mapping. For instance, our model can achieve an overall prediction accuracy of 91.10%, which is 7.8%–19.1% higher than the traditional SVM-based models. In addition, the obtained landslide susceptibility map by our model can demonstrate an intensive correlation between the classified very high-susceptibility zone and the previously investigated landslides.
Yu, Xianyu; Wang, Yi; Niu, Ruiqing; Hu, Youjian
2016-05-11
In this study, a novel coupling model for landslide susceptibility mapping is presented. In practice, environmental factors may have different impacts at a local scale in study areas. To provide better predictions, a geographically weighted regression (GWR) technique is firstly used in our method to segment study areas into a series of prediction regions with appropriate sizes. Meanwhile, a support vector machine (SVM) classifier is exploited in each prediction region for landslide susceptibility mapping. To further improve the prediction performance, the particle swarm optimization (PSO) algorithm is used in the prediction regions to obtain optimal parameters for the SVM classifier. To evaluate the prediction performance of our model, several SVM-based prediction models are utilized for comparison on a study area of the Wanzhou district in the Three Gorges Reservoir. Experimental results, based on three objective quantitative measures and visual qualitative evaluation, indicate that our model can achieve better prediction accuracies and is more effective for landslide susceptibility mapping. For instance, our model can achieve an overall prediction accuracy of 91.10%, which is 7.8%-19.1% higher than the traditional SVM-based models. In addition, the obtained landslide susceptibility map by our model can demonstrate an intensive correlation between the classified very high-susceptibility zone and the previously investigated landslides.
Duan, Libin; Xiao, Ning-cong; Li, Guangyao; Cheng, Aiguo; Chen, Tao
2017-07-01
Tailor-rolled blank thin-walled (TRB-TH) structures have become important vehicle components owing to their advantages of light weight and crashworthiness. The purpose of this article is to provide an efficient lightweight design for improving the energy-absorbing capability of TRB-TH structures under dynamic loading. A finite element (FE) model for TRB-TH structures is established and validated by performing a dynamic axial crash test. Different material properties for individual parts with different thicknesses are considered in the FE model. Then, a multi-objective crashworthiness design of the TRB-TH structure is constructed based on the ɛ-support vector regression (ɛ-SVR) technique and non-dominated sorting genetic algorithm-II. The key parameters (C, ɛ and σ) are optimized to further improve the predictive accuracy of ɛ-SVR under limited sample points. Finally, the technique for order preference by similarity to the ideal solution method is used to rank the solutions in Pareto-optimal frontiers and find the best compromise optima. The results demonstrate that the light weight and crashworthiness performance of the optimized TRB-TH structures are superior to their uniform thickness counterparts. The proposed approach provides useful guidance for designing TRB-TH energy absorbers for vehicle bodies.
Vectorization of a particle simulation method for hypersonic rarefied flow
Mcdonald, Jeffrey D.; Baganoff, Donald
1988-01-01
An efficient particle simulation technique for hypersonic rarefied flows is presented at an algorithmic and implementation level. The implementation is for a vector computer architecture, specifically the Cray-2. The method models an ideal diatomic Maxwell molecule with three translational and two rotational degrees of freedom. Algorithms are designed specifically for compatibility with fine grain parallelism by reducing the number of data dependencies in the computation. By insisting on this compatibility, the method is capable of performing simulation on a much larger scale than previously possible. A two-dimensional simulation of supersonic flow over a wedge is carried out for the near-continuum limit where the gas is in equilibrium and the ideal solution can be used as a check on the accuracy of the gas model employed in the method. Also, a three-dimensional, Mach 8, rarefied flow about a finite-span flat plate at a 45 degree angle of attack was simulated. It utilized over 10 to the 7th particles carried through 400 discrete time steps in less than one hour of Cray-2 CPU time. This problem was chosen to exhibit the capability of the method in handling a large number of particles and a true three-dimensional geometry.
Vectorization of a particle simulation method for hypersonic rarefied flow
International Nuclear Information System (INIS)
Mcdonald, J.D.; Baganoff, D.
1988-01-01
An efficient particle simulation technique for hypersonic rarefied flows is presented at an algorithmic and implementation level. The implementation is for a vector computer architecture, specifically the Cray-2. The method models an ideal diatomic Maxwell molecule with three translational and two rotational degrees of freedom. Algorithms are designed specifically for compatibility with fine grain parallelism by reducing the number of data dependencies in the computation. By insisting on this compatibility, the method is capable of performing simulation on a much larger scale than previously possible. A two-dimensional simulation of supersonic flow over a wedge is carried out for the near-continuum limit where the gas is in equilibrium and the ideal solution can be used as a check on the accuracy of the gas model employed in the method. Also, a three-dimensional, Mach 8, rarefied flow about a finite-span flat plate at a 45 degree angle of attack was simulated. It utilized over 10 to the 7th particles carried through 400 discrete time steps in less than one hour of Cray-2 CPU time. This problem was chosen to exhibit the capability of the method in handling a large number of particles and a true three-dimensional geometry. 14 references
DNS Tunneling Detection Method Based on Multilabel Support Vector Machine
Directory of Open Access Journals (Sweden)
Ahmed Almusawi
2018-01-01
Full Text Available DNS tunneling is a method used by malicious users who intend to bypass the firewall to send or receive commands and data. This has a significant impact on revealing or releasing classified information. Several researchers have examined the use of machine learning in terms of detecting DNS tunneling. However, these studies have treated the problem of DNS tunneling as a binary classification where the class label is either legitimate or tunnel. In fact, there are different types of DNS tunneling such as FTP-DNS tunneling, HTTP-DNS tunneling, HTTPS-DNS tunneling, and POP3-DNS tunneling. Therefore, there is a vital demand to not only detect the DNS tunneling but rather classify such tunnel. This study aims to propose a multilabel support vector machine in order to detect and classify the DNS tunneling. The proposed method has been evaluated using a benchmark dataset that contains numerous DNS queries and is compared with a multilabel Bayesian classifier based on the number of corrected classified DNS tunneling instances. Experimental results demonstrate the efficacy of the proposed SVM classification method by obtaining an f-measure of 0.80.
International Nuclear Information System (INIS)
Díaz, Santiago; Carta, José A.; Matías, José M.
2017-01-01
Highlights: • Eight measure-correlate-predict (MCP) models used to estimate the wind power densities (WPDs) at a target site are compared. • Support vector regressions are used as the main prediction techniques in the proposed MCPs. • The most precise MCP uses two sub-models which predict wind speed and air density in an unlinked manner. • The most precise model allows to construct a bivariable (wind speed and air density) WPD probability density function. • MCP models trained to minimise wind speed prediction error do not minimise WPD prediction error. - Abstract: The long-term annual mean wind power density (WPD) is an important indicator of wind as a power source which is usually included in regional wind resource maps as useful prior information to identify potentially attractive sites for the installation of wind projects. In this paper, a comparison is made of eight proposed Measure-Correlate-Predict (MCP) models to estimate the WPDs at a target site. Seven of these models use the Support Vector Regression (SVR) and the eighth the Multiple Linear Regression (MLR) technique, which serves as a basis to compare the performance of the other models. In addition, a wrapper technique with 10-fold cross-validation has been used to select the optimal set of input features for the SVR and MLR models. Some of the eight models were trained to directly estimate the mean hourly WPDs at a target site. Others, however, were firstly trained to estimate the parameters on which the WPD depends (i.e. wind speed and air density) and then, using these parameters, the target site mean hourly WPDs. The explanatory features considered are different combinations of the mean hourly wind speeds, wind directions and air densities recorded in 2014 at ten weather stations in the Canary Archipelago (Spain). The conclusions that can be drawn from the study undertaken include the argument that the most accurate method for the long-term estimation of WPDs requires the execution of a
Balabin, Roman M; Lomakina, Ekaterina I
2011-06-28
A multilayer feed-forward artificial neural network (MLP-ANN) with a single, hidden layer that contains a finite number of neurons can be regarded as a universal non-linear approximator. Today, the ANN method and linear regression (MLR) model are widely used for quantum chemistry (QC) data analysis (e.g., thermochemistry) to improve their accuracy (e.g., Gaussian G2-G4, B3LYP/B3-LYP, X1, or W1 theoretical methods). In this study, an alternative approach based on support vector machines (SVMs) is used, the least squares support vector machine (LS-SVM) regression. It has been applied to ab initio (first principle) and density functional theory (DFT) quantum chemistry data. So, QC + SVM methodology is an alternative to QC + ANN one. The task of the study was to estimate the Møller-Plesset (MPn) or DFT (B3LYP, BLYP, BMK) energies calculated with large basis sets (e.g., 6-311G(3df,3pd)) using smaller ones (6-311G, 6-311G*, 6-311G**) plus molecular descriptors. A molecular set (BRM-208) containing a total of 208 organic molecules was constructed and used for the LS-SVM training, cross-validation, and testing. MP2, MP3, MP4(DQ), MP4(SDQ), and MP4/MP4(SDTQ) ab initio methods were tested. Hartree-Fock (HF/SCF) results were also reported for comparison. Furthermore, constitutional (CD: total number of atoms and mole fractions of different atoms) and quantum-chemical (QD: HOMO-LUMO gap, dipole moment, average polarizability, and quadrupole moment) molecular descriptors were used for the building of the LS-SVM calibration model. Prediction accuracies (MADs) of 1.62 ± 0.51 and 0.85 ± 0.24 kcal mol(-1) (1 kcal mol(-1) = 4.184 kJ mol(-1)) were reached for SVM-based approximations of ab initio and DFT energies, respectively. The LS-SVM model was more accurate than the MLR model. A comparison with the artificial neural network approach shows that the accuracy of the LS-SVM method is similar to the accuracy of ANN. The extrapolation and interpolation results show that LS-SVM is
A Novel Self-Calibration Method for Acoustic Vector Sensor
Directory of Open Access Journals (Sweden)
Yao Zhang
2018-01-01
Full Text Available The acoustic vector sensor (AVS can measure the acoustic pressure field’s spatial gradient, so it has directionality. But its channels may have nonideal gain/phase responses, which will severely degrade its performance in finding source direction. To solve this problem, in this study, a self-calibration algorithm based on all-phase FFT spectrum analysis is proposed. This method is “self-calibrated” because prior knowledge of the training signal’s arrival angle is not required. By measuring signals from different directions, the initial phase can be achieved by taking the all-phase FFT transform to each channel. We use the amplitude of the main spectrum peak of every channel in different direction to formulate an equation; the amplitude gain estimates can be achieved by solving this equation. In order to get better estimation accuracy, bearing difference of different training signals should be larger than a threshold, which is related to SNR. Finally, the reference signal’s direction of arrival can be estimated. This method is easy to implement and has advantage in accuracy and antinoise. The efficacy of this proposed scheme is verified with simulation results.
Directory of Open Access Journals (Sweden)
Zhenliang Yin
2017-11-01
Full Text Available This study aims to project future variability of reference evapotranspiration (ET0 using artificial intelligence methods, constructed with an extreme-learning machine (ELM and support vector regression (SVR in a mountainous inland watershed in north-west China. Eight global climate model (GCM outputs retrieved from the Coupled Model Inter-comparison Project Phase 5 (CMIP5 were employed to downscale monthly ET0 for the historical period 1960–2005 as a validation approach and for the future period 2010–2099 as a projection of ET0 under the Representative Concentration Pathway (RCP 4.5 and 8.5 scenarios. The following conclusions can be drawn: the ELM and SVR methods demonstrate a very good performance in estimating Food and Agriculture Organization (FAO-56 Penman–Monteith ET0. Variation in future ET0 mainly occurs in the spring and autumn seasons, while the summer and winter ET0 changes are moderately small. Annually, the ET0 values were shown to increase at a rate of approximately 7.5 mm, 7.5 mm, 0.0 mm (8.2 mm, 15.0 mm, 15.0 mm decade−1, respectively, for the near-term projection (2010–2039, mid-term projection (2040–2069, and long-term projection (2070–2099 under the RCP4.5 (RCP8.5 scenario. Compared to the historical period, the relative changes in ET0 were found to be approximately 2%, 5% and 6% (2%, 7% and 13%, during the near, mid- and long-term periods, respectively, under the RCP4.5 (RCP8.5 warming scenarios. In accordance with the analyses, we aver that the opportunity to downscale monthly ET0 with artificial intelligence is useful in practice for water-management policies.
Spectral element method for vector radiative transfer equation
International Nuclear Information System (INIS)
Zhao, J.M.; Liu, L.H.; Hsu, P.-F.; Tan, J.Y.
2010-01-01
A spectral element method (SEM) is developed to solve polarized radiative transfer in multidimensional participating medium. The angular discretization is based on the discrete-ordinates approach, and the spatial discretization is conducted by spectral element approach. Chebyshev polynomial is used to build basis function on each element. Four various test problems are taken as examples to verify the performance of the SEM. The effectiveness of the SEM is demonstrated. The h and the p convergence characteristics of the SEM are studied. The convergence rate of p-refinement follows the exponential decay trend and is superior to that of h-refinement. The accuracy and efficiency of the higher order approximation in the SEM is well demonstrated for the solution of the VRTE. The predicted angular distribution of brightness temperature and Stokes vector by the SEM agree very well with the benchmark solutions in references. Numerical results show that the SEM is accurate, flexible and effective to solve multidimensional polarized radiative transfer problems.
A multistage motion vector processing method for motion-compensated frame interpolation.
Huang, Ai- Mei; Nguyen, Truong Q
2008-05-01
In this paper, a novel, low-complexity motion vector processing algorithm at the decoder is proposed for motion-compensated frame interpolation or frame rate up-conversion. We address the problems of having broken edges and deformed structures in an interpolated frame by hierarchically refining motion vectors on different block sizes. Our method explicitly considers the reliability of each received motion vector and has the capability of preserving the structure information. This is achieved by analyzing the distribution of residual energies and effectively merging blocks that have unreliable motion vectors. The motion vector reliability information is also used as a prior knowledge in motion vector refinement using a constrained vector median filter to avoid choosing identical unreliable one. We also propose using chrominance information in our method. Experimental results show that the proposed scheme has better visual quality and is also robust, even in video sequences with complex scenes and fast motion.
The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard
and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences...... within a nonparametric panel data regression framework. The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric......This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...
Shastri, Niket; Pathak, Kamlesh
2018-05-01
The water vapor content in atmosphere plays very important role in climate. In this paper the application of GPS signal in meteorology is discussed, which is useful technique that is used to estimate the perceptible water vapor of atmosphere. In this paper various algorithms like artificial neural network, support vector machine and multiple linear regression are use to predict perceptible water vapor. The comparative studies in terms of root mean square error and mean absolute errors are also carried out for all the algorithms.
Directory of Open Access Journals (Sweden)
Roberto Romaniello
2015-12-01
Full Text Available The aim of this work is to evaluate the potential of least squares support vector machine (LS-SVM regression to develop an efficient method to measure the colour of food materials in L*a*b* units by means of a computer vision systems (CVS. A laboratory CVS, based on colour digital camera (CDC, was implemented and three LS-SVM models were trained and validated, one for each output variables (L*, a*, and b* required by this problem, using the RGB signals generated by the CDC as input variables to these models. The colour target-based approach was used to camera characterization and a standard reference target of 242 colour samples was acquired using the CVS and a colorimeter. This data set was split in two sets of equal sizes, for training and validating the LS-SVM models. An effective two-stage grid search process on the parameters space was performed in MATLAB to tune the regularization parameters γ and the kernel parameters σ2 of the three LS-SVM models. A 3-8-3 multilayer feed-forward neural network (MFNN, according to the research conducted by León et al. (2006, was also trained in order to compare its performance with those of LS-SVM models. The LS-SVM models developed in this research have been shown better generalization capability then the MFNN, allowed to obtain high correlations between L*a*b* data acquired using the colorimeter and the corresponding data obtained by transformation of the RGB data acquired by the CVS. In particular, for the validation set, R2 values equal to 0.9989, 0.9987, and 0.9994 for L*, a* and b* parameters were obtained. The root mean square error values were 0.6443, 0.3226, and 0.2702 for L*, a*, and b* respectively, and the average of colour differences ΔEab was 0.8232±0.5033 units. Thus, LS-SVM regression seems to be a useful tool to measurement of food colour using a low cost CVS.
Torres-Valencia, Cristian A; Álvarez, Mauricio A; Orozco-Gutiérrez, Alvaro A
2014-01-01
Human emotion recognition (HER) allows the assessment of an affective state of a subject. Until recently, such emotional states were described in terms of discrete emotions, like happiness or contempt. In order to cover a high range of emotions, researchers in the field have introduced different dimensional spaces for emotion description that allow the characterization of affective states in terms of several variables or dimensions that measure distinct aspects of the emotion. One of the most common of such dimensional spaces is the bidimensional Arousal/Valence space. To the best of our knowledge, all HER systems so far have modelled independently, the dimensions in these dimensional spaces. In this paper, we study the effect of modelling the output dimensions simultaneously and show experimentally the advantages in modeling them in this way. We consider a multimodal approach by including features from the Electroencephalogram and a few physiological signals. For modelling the multiple outputs, we employ a multiple output regressor based on support vector machines. We also include an stage of feature selection that is developed within an embedded approach known as Recursive Feature Elimination (RFE), proposed initially for SVM. The results show that several features can be eliminated using the multiple output support vector regressor with RFE without affecting the performance of the regressor. From the analysis of the features selected in smaller subsets via RFE, it can be observed that the signals that are more informative into the arousal and valence space discrimination are the EEG, Electrooculogram/Electromiogram (EOG/EMG) and the Galvanic Skin Response (GSR).
Easy methods for extracting individual regression slopes: Comparing SPSS, R, and Excel
Directory of Open Access Journals (Sweden)
Roland Pfister
2013-10-01
Full Text Available Three different methods for extracting coefficientsof linear regression analyses are presented. The focus is on automatic and easy-to-use approaches for common statistical packages: SPSS, R, and MS Excel / LibreOffice Calc. Hands-on examples are included for each analysis, followed by a brief description of how a subsequent regression coefficient analysis is performed.
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Modeling vector nonlinear time series using POLYMARS
de Gooijer, J.G.; Ray, B.K.
2003-01-01
A modified multivariate adaptive regression splines method for modeling vector nonlinear time series is investigated. The method results in models that can capture certain types of vector self-exciting threshold autoregressive behavior, as well as provide good predictions for more general vector
Energy Technology Data Exchange (ETDEWEB)
Boucher, Thomas F., E-mail: boucher@cs.umass.edu [School of Computer Science, University of Massachusetts Amherst, 140 Governor' s Drive, Amherst, MA 01003, United States. (United States); Ozanne, Marie V. [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Carmosino, Marco L. [School of Computer Science, University of Massachusetts Amherst, 140 Governor' s Drive, Amherst, MA 01003, United States. (United States); Dyar, M. Darby [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Mahadevan, Sridhar [School of Computer Science, University of Massachusetts Amherst, 140 Governor' s Drive, Amherst, MA 01003, United States. (United States); Breves, Elly A.; Lepore, Kate H. [Department of Astronomy, Mount Holyoke College, South Hadley, MA 01075 (United States); Clegg, Samuel M. [Los Alamos National Laboratory, P.O. Box 1663, MS J565, Los Alamos, NM 87545 (United States)
2015-05-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO{sub 2}, Fe{sub 2}O{sub 3}, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na{sub 2}O, K{sub 2}O, TiO{sub 2}, and P{sub 2}O{sub 5}, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high
International Nuclear Information System (INIS)
Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.
2015-01-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO 2 , Fe 2 O 3 , CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na 2 O, K 2 O, TiO 2 , and P 2 O 5 , the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144
Gusriani, N.; Firdaniza
2018-03-01
The existence of outliers on multiple linear regression analysis causes the Gaussian assumption to be unfulfilled. If the Least Square method is forcedly used on these data, it will produce a model that cannot represent most data. For that, we need a robust regression method against outliers. This paper will compare the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contains outliers. Based on the robust determinant coefficient value, MCD method produces a better model compared to TELBS method.
Energy Technology Data Exchange (ETDEWEB)
Lopez Fontan, J.L.; Costa, J.; Ruso, J.M.; Prieto, G. [Dept. of Applied Physics, Univ. of Santiago de Compostela, Santiago de Compostela (Spain); Sarmiento, F. [Dept. of Mathematics, Faculty of Informatics, Univ. of A Coruna, A Coruna (Spain)
2004-02-01
The application of a statistical method, the local polynomial regression method, (LPRM), based on a nonparametric estimation of the regression function to determine the critical micelle concentration (cmc) is presented. The method is extremely flexible because it does not impose any parametric model on the subjacent structure of the data but rather allows the data to speak for themselves. Good concordance of cmc values with those obtained by other methods was found for systems in which the variation of a measured physical property with concentration showed an abrupt change. When this variation was slow, discrepancies between the values obtained by LPRM and others methods were found. (orig.)
Fuzzy Linear Regression for the Time Series Data which is Fuzzified with SMRGT Method
Directory of Open Access Journals (Sweden)
Seçil YALAZ
2016-10-01
Full Text Available Our work on regression and classification provides a new contribution to the analysis of time series used in many areas for years. Owing to the fact that convergence could not obtained with the methods used in autocorrelation fixing process faced with time series regression application, success is not met or fall into obligation of changing the models’ degree. Changing the models’ degree may not be desirable in every situation. In our study, recommended for these situations, time series data was fuzzified by using the simple membership function and fuzzy rule generation technique (SMRGT and to estimate future an equation has created by applying fuzzy least square regression (FLSR method which is a simple linear regression method to this data. Although SMRGT has success in determining the flow discharge in open channels and can be used confidently for flow discharge modeling in open canals, as well as in pipe flow with some modifications, there is no clue about that this technique is successful in fuzzy linear regression modeling. Therefore, in order to address the luck of such a modeling, a new hybrid model has been described within this study. In conclusion, to demonstrate our methods’ efficiency, classical linear regression for time series data and linear regression for fuzzy time series data were applied to two different data sets, and these two approaches performances were compared by using different measures.
An NCME Instructional Module on Data Mining Methods for Classification and Regression
Sinharay, Sandip
2016-01-01
Data mining methods for classification and regression are becoming increasingly popular in various scientific fields. However, these methods have not been explored much in educational measurement. This module first provides a review, which should be accessible to a wide audience in education measurement, of some of these methods. The module then…
Cohen, Ayala; Nahum-Shani, Inbal; Doveh, Etti
2010-01-01
In their seminal paper, Edwards and Parry (1993) presented the polynomial regression as a better alternative to applying difference score in the study of congruence. Although this method is increasingly applied in congruence research, its complexity relative to other methods for assessing congruence (e.g., difference score methods) was one of the…
Statistical approach for selection of regression model during validation of bioanalytical method
Directory of Open Access Journals (Sweden)
Natalija Nakov
2014-06-01
Full Text Available The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during the bionalytical method validation. Given the wide concentration range, frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit using nonparametric statistical tests: One sample rank test and Wilcoxon signed rank test for two independent groups of samples. The results obtained with One sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because slight differences between the error (presented through the relative residuals were obtained. Estimation of the significance of the differences in the RR was achieved using Wilcoxon signed rank test, where linear and quadratic regression models were treated as two independent groups. The application of this simple non-parametric statistical test provides statistical confirmation of the choice of an adequate regression model.
Timothy G. Wade; James D. Wickham; Maliha S. Nash; Anne C. Neale; Kurt H. Riitters; K. Bruce Jones
2003-01-01
AbstractGIS-based measurements that combine native raster and native vector data are commonly used in environmental assessments. Most of these measurements can be calculated using either raster or vector data formats and processing methods. Raster processes are more commonly used because they can be significantly faster computationally...
Directory of Open Access Journals (Sweden)
ELİF BULUT
2013-06-01
Full Text Available Partial Least Squares Regression (PLSR is a multivariate statistical method that consists of partial least squares and multiple linear regression analysis. Explanatory variables, X, having multicollinearity are reduced to components which explain the great amount of covariance between explanatory and response variable. These components are few in number and they don’t have multicollinearity problem. Then multiple linear regression analysis is applied to those components to model the response variable Y. There are various PLSR algorithms. In this study NIPALS and PLS-Kernel algorithms will be studied and illustrated on a real data set.
The Bland-Altman Method Should Not Be Used in Regression Cross-Validation Studies
O'Connor, Daniel P.; Mahar, Matthew T.; Laughlin, Mitzi S.; Jackson, Andrew S.
2011-01-01
The purpose of this study was to demonstrate the bias in the Bland-Altman (BA) limits of agreement method when it is used to validate regression models. Data from 1,158 men were used to develop three regression equations to estimate maximum oxygen uptake (R[superscript 2] = 0.40, 0.61, and 0.82, respectively). The equations were evaluated in a…
Sparling, D.W.; Barzen, J.A.; Lovvorn, J.R.; Serie, J.R.
1992-01-01
Regression equations that use mensural data to estimate body condition have been developed for several water birds. These equations often have been based on data that represent different sexes, age classes, or seasons, without being adequately tested for intergroup differences. We used proximate carcass analysis of 538 adult and juvenile canvasbacks (Aythya valisineria ) collected during fall migration, winter, and spring migrations in 1975-76 and 1982-85 to test regression methods for estimating body condition.
Leone, Robert Matthew
A search for vector-like quarks (VLQs) decaying to a Z boson using multi-stage machine learning was compared to a search using a standard square cuts search strategy. VLQs are predicted by several new theories beyond the Standard Model. The searches used 20.3 inverse femtobarns of proton-proton collisions at a center-of-mass energy of 8 TeV collected with the ATLAS detector in 2012 at the CERN Large Hadron Collider. CLs upper limits on production cross sections of vector-like top and bottom quarks were computed for VLQs produced singly or in pairs, Tsingle, Bsingle, Tpair, and Bpair. The two stage machine learning classification search strategy did not provide any improvement over the standard square cuts strategy, but for Tpair, Bpair, and Tsingle, a third stage of machine learning regression was able to lower the upper limits of high signal masses by as much as 50%. Additionally, new test statistics were developed for use in the Neyman construction of confidence regions in order to address deficiencies in c...
Treating experimental data of inverse kinetic method by unitary linear regression analysis
International Nuclear Information System (INIS)
Zhao Yusen; Chen Xiaoliang
2009-01-01
The theory of treating experimental data of inverse kinetic method by unitary linear regression analysis was described. Not only the reactivity, but also the effective neutron source intensity could be calculated by this method. Computer code was compiled base on the inverse kinetic method and unitary linear regression analysis. The data of zero power facility BFS-1 in Russia were processed and the results were compared. The results show that the reactivity and the effective neutron source intensity can be obtained correctly by treating experimental data of inverse kinetic method using unitary linear regression analysis and the precision of reactivity measurement is improved. The central element efficiency can be calculated by using the reactivity. The result also shows that the effect to reactivity measurement caused by external neutron source should be considered when the reactor power is low and the intensity of external neutron source is strong. (authors)
Statistical methods in regression and calibration analysis of chromosome aberration data
International Nuclear Information System (INIS)
Merkle, W.
1983-01-01
The method of iteratively reweighted least squares for the regression analysis of Poisson distributed chromosome aberration data is reviewed in the context of other fit procedures used in the cytogenetic literature. As an application of the resulting regression curves methods for calculating confidence intervals on dose from aberration yield are described and compared, and, for the linear quadratic model a confidence interval is given. Emphasis is placed on the rational interpretation and the limitations of various methods from a statistical point of view. (orig./MG)
Thompson, Russel L.
Homoscedasticity is an important assumption of linear regression. This paper explains what it is and why it is important to the researcher. Graphical and mathematical methods for testing the homoscedasticity assumption are demonstrated. Sources of homoscedasticity and types of homoscedasticity are discussed, and methods for correction are…
Calculation of U, Ra, Th and K contents in uranium ore by multiple linear regression method
International Nuclear Information System (INIS)
Lin Chao; Chen Yingqiang; Zhang Qingwen; Tan Fuwen; Peng Guanghui
1991-01-01
A multiple linear regression method was used to compute γ spectra of uranium ore samples and to calculate contents of U, Ra, Th, and K. In comparison with the inverse matrix method, its advantage is that no standard samples of pure U, Ra, Th and K are needed for obtaining response coefficients
Martens, Edwin P; de Boer, Anthonius; Pestman, Wiebe R; Belitser, Svetlana V; Stricker, Bruno H Ch; Klungel, Olaf H
PURPOSE: To compare adjusted effects of drug treatment for hypertension on the risk of stroke from propensity score (PS) methods with a multivariable Cox proportional hazards (Cox PH) regression in an observational study with censored data. METHODS: From two prospective population-based cohort
Active damage detection method based on support vector machine and impulse response
International Nuclear Information System (INIS)
Taniguchi, Ryuta; Mita, Akira
2004-01-01
An active damage detection method was proposed to characterize damage in bolted joints. The purpose of this study is to propose a damage detection method that can obtain the detailed information of the damage by creating feature vectors for pattern recognition. In the proposed method, the wavelet transform is applied to the sensor signals, and the feature vectors are defined by second power average of the amplitude. The feature vectors generated by experiments were successfully used as the training data for Support Vector Machine (SVM). By applying the wavelet transform to time-frequency analysis, the accuracy of pattern recognition was raised in both correlation coefficient and SVM applications. Moreover, the SVM could identify the damage with very strong discernment capability than others. Applicability of the proposed method was successfully demonstrated. (author)
Whole-Genome Regression and Prediction Methods Applied to Plant and Animal Breeding
de los Campos, Gustavo; Hickey, John M.; Pong-Wong, Ricardo; Daetwyler, Hans D.; Calus, Mario P. L.
2013-01-01
Genomic-enabled prediction is becoming increasingly important in animal and plant breeding and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of markers concurrently. Methods exist that allow implementing these large-p with small-n regressions, and genome-enabled selection (GS) is being implemented in several plant and animal breeding programs. The list of available methods is long, and the relationships between them have not been fully addressed. In this article we provide an overview of available methods for implementing parametric WGR models, discuss selected topics that emerge in applications, and present a general discussion of lessons learned from simulation and empirical data analysis in the last decade. PMID:22745228
Alwee, Razana; Shamsuddin, Siti Mariyam Hj; Sallehuddin, Roselina
2013-01-01
Crimes forecasting is an important area in the field of criminology. Linear models, such as regression and econometric models, are commonly applied in crime forecasting. However, in real crimes data, it is common that the data consists of both linear and nonlinear components. A single model may not be sufficient to identify all the characteristics of the data. The purpose of this study is to introduce a hybrid model that combines support vector regression (SVR) and autoregressive integrated moving average (ARIMA) to be applied in crime rates forecasting. SVR is very robust with small training data and high-dimensional problem. Meanwhile, ARIMA has the ability to model several types of time series. However, the accuracy of the SVR model depends on values of its parameters, while ARIMA is not robust to be applied to small data sets. Therefore, to overcome this problem, particle swarm optimization is used to estimate the parameters of the SVR and ARIMA models. The proposed hybrid model is used to forecast the property crime rates of the United State based on economic indicators. The experimental results show that the proposed hybrid model is able to produce more accurate forecasting results as compared to the individual models.
Directory of Open Access Journals (Sweden)
Razana Alwee
2013-01-01
Full Text Available Crimes forecasting is an important area in the field of criminology. Linear models, such as regression and econometric models, are commonly applied in crime forecasting. However, in real crimes data, it is common that the data consists of both linear and nonlinear components. A single model may not be sufficient to identify all the characteristics of the data. The purpose of this study is to introduce a hybrid model that combines support vector regression (SVR and autoregressive integrated moving average (ARIMA to be applied in crime rates forecasting. SVR is very robust with small training data and high-dimensional problem. Meanwhile, ARIMA has the ability to model several types of time series. However, the accuracy of the SVR model depends on values of its parameters, while ARIMA is not robust to be applied to small data sets. Therefore, to overcome this problem, particle swarm optimization is used to estimate the parameters of the SVR and ARIMA models. The proposed hybrid model is used to forecast the property crime rates of the United State based on economic indicators. The experimental results show that the proposed hybrid model is able to produce more accurate forecasting results as compared to the individual models.
Liu, Bing-Chun; Binaykia, Arihant; Chang, Pei-Chann; Tiwari, Manoj Kumar; Tsao, Cheng-Chin
2017-01-01
Today, China is facing a very serious issue of Air Pollution due to its dreadful impact on the human health as well as the environment. The urban cities in China are the most affected due to their rapid industrial and economic growth. Therefore, it is of extreme importance to come up with new, better and more reliable forecasting models to accurately predict the air quality. This paper selected Beijing, Tianjin and Shijiazhuang as three cities from the Jingjinji Region for the study to come up with a new model of collaborative forecasting using Support Vector Regression (SVR) for Urban Air Quality Index (AQI) prediction in China. The present study is aimed to improve the forecasting results by minimizing the prediction error of present machine learning algorithms by taking into account multiple city multi-dimensional air quality information and weather conditions as input. The results show that there is a decrease in MAPE in case of multiple city multi-dimensional regression when there is a strong interaction and correlation of the air quality characteristic attributes with AQI. Also, the geographical location is found to play a significant role in Beijing, Tianjin and Shijiazhuang AQI prediction.
An improved partial least-squares regression method for Raman spectroscopy
Momenpour Tehran Monfared, Ali; Anis, Hanan
2017-10-01
It is known that the performance of partial least-squares (PLS) regression analysis can be improved using the backward variable selection method (BVSPLS). In this paper, we further improve the BVSPLS based on a novel selection mechanism. The proposed method is based on sorting the weighted regression coefficients, and then the importance of each variable of the sorted list is evaluated using root mean square errors of prediction (RMSEP) criterion in each iteration step. Our Improved BVSPLS (IBVSPLS) method has been applied to leukemia and heparin data sets and led to an improvement in limit of detection of Raman biosensing ranged from 10% to 43% compared to PLS. Our IBVSPLS was also compared to the jack-knifing (simpler) and Genetic Algorithm (more complex) methods. Our method was consistently better than the jack-knifing method and showed either a similar or a better performance compared to the genetic algorithm.
Novel method of finding extreme edges in a convex set of N-dimension vectors
Hu, Chia-Lun J.
2001-11-01
As we published in the last few years, for a binary neural network pattern recognition system to learn a given mapping {Um mapped to Vm, m=1 to M} where um is an N- dimension analog (pattern) vector, Vm is a P-bit binary (classification) vector, the if-and-only-if (IFF) condition that this network can learn this mapping is that each i-set in {Ymi, m=1 to M} (where Ymithere existsVmiUm and Vmi=+1 or -1, is the i-th bit of VR-m).)(i=1 to P and there are P sets included here.) Is POSITIVELY, LINEARLY, INDEPENDENT or PLI. We have shown that this PLI condition is MORE GENERAL than the convexity condition applied to a set of N-vectors. In the design of old learning machines, we know that if a set of N-dimension analog vectors form a convex set, and if the machine can learn the boundary vectors (or extreme edges) of this set, then it can definitely learn the inside vectors contained in this POLYHEDRON CONE. This paper reports a new method and new algorithm to find the boundary vectors of a convex set of ND analog vectors.
Wang, Jiangbo; Liu, Junhui; Li, Tiantian; Yin, Shuo; He, Xinhui
2018-01-01
The monthly electricity sales forecasting is a basic work to ensure the safety of the power system. This paper presented a monthly electricity sales forecasting method which comprehensively considers the coupled multi-factors of temperature, economic growth, electric power replacement and business expansion. The mathematical model is constructed by using regression method. The simulation results show that the proposed method is accurate and effective.
Ghaedi, M; Rahimi, Mahmoud Reza; Ghaedi, A M; Tyagi, Inderjeet; Agarwal, Shilpi; Gupta, Vinod Kumar
2016-01-01
Two novel and eco friendly adsorbents namely tin oxide nanoparticles loaded on activated carbon (SnO2-NP-AC) and activated carbon prepared from wood tree Pistacia atlantica (AC-PAW) were used for the rapid removal and fast adsorption of methyl orange (MO) from the aqueous phase. The dependency of MO removal with various adsorption influential parameters was well modeled and optimized using multiple linear regressions (MLR) and least squares support vector regression (LSSVR). The optimal parameters for the LSSVR model were found based on γ value of 0.76 and σ(2) of 0.15. For testing the data set, the mean square error (MSE) values of 0.0010 and the coefficient of determination (R(2)) values of 0.976 were obtained for LSSVR model, and the MSE value of 0.0037 and the R(2) value of 0.897 were obtained for the MLR model. The adsorption equilibrium and kinetic data was found to be well fitted and in good agreement with Langmuir isotherm model and second-order equation and intra-particle diffusion models respectively. The small amount of the proposed SnO2-NP-AC and AC-PAW (0.015 g and 0.08 g) is applicable for successful rapid removal of methyl orange (>95%). The maximum adsorption capacity for SnO2-NP-AC and AC-PAW was 250 mg g(-1) and 125 mg g(-1) respectively. Copyright © 2015 Elsevier Inc. All rights reserved.
Vectorization and parallelization of the finite strip method for dynamic Mindlin plate problems
Chen, Hsin-Chu; He, Ai-Fang
1993-01-01
The finite strip method is a semi-analytical finite element process which allows for a discrete analysis of certain types of physical problems by discretizing the domain of the problem into finite strips. This method decomposes a single large problem into m smaller independent subproblems when m harmonic functions are employed, thus yielding natural parallelism at a very high level. In this paper we address vectorization and parallelization strategies for the dynamic analysis of simply-supported Mindlin plate bending problems and show how to prevent potential conflicts in memory access during the assemblage process. The vector and parallel implementations of this method and the performance results of a test problem under scalar, vector, and vector-concurrent execution modes on the Alliant FX/80 are also presented.
International Nuclear Information System (INIS)
Shuke, Noriyuki
1991-01-01
In hepatobiliary scintigraphy, kinetic model analysis, which provides kinetic parameters like hepatic extraction or excretion rate, have been done for quantitative evaluation of liver function. In this analysis, unknown model parameters are usually determined using nonlinear least square regression method (NLS method) where iterative calculation and initial estimate for unknown parameters are required. As a simple alternative to NLS method, direct integral linear least square regression method (DILS method), which can determine model parameters by a simple calculation without initial estimate, is proposed, and tested the applicability to analysis of hepatobiliary scintigraphy. In order to see whether DILS method could determine model parameters as good as NLS method, or to determine appropriate weight for DILS method, simulated theoretical data based on prefixed parameters were fitted to 1 compartment model using both DILS method with various weightings and NLS method. The parameter values obtained were then compared with prefixed values which were used for data generation. The effect of various weights on the error of parameter estimate was examined, and inverse of time was found to be the best weight to make the error minimum. When using this weight, DILS method could give parameter values close to those obtained by NLS method and both parameter values were very close to prefixed values. With appropriate weighting, the DILS method could provide reliable parameter estimate which is relatively insensitive to the data noise. In conclusion, the DILS method could be used as a simple alternative to NLS method, providing reliable parameter estimate. (author)
A different approach to estimate nonlinear regression model using numerical methods
Mahaboob, B.; Venkateswarlu, B.; Mokeshrayalu, G.; Balasiddamuni, P.
2017-11-01
This research paper concerns with the computational methods namely the Gauss-Newton method, Gradient algorithm methods (Newton-Raphson method, Steepest Descent or Steepest Ascent algorithm method, the Method of Scoring, the Method of Quadratic Hill-Climbing) based on numerical analysis to estimate parameters of nonlinear regression model in a very different way. Principles of matrix calculus have been used to discuss the Gradient-Algorithm methods. Yonathan Bard [1] discussed a comparison of gradient methods for the solution of nonlinear parameter estimation problems. However this article discusses an analytical approach to the gradient algorithm methods in a different way. This paper describes a new iterative technique namely Gauss-Newton method which differs from the iterative technique proposed by Gorden K. Smyth [2]. Hans Georg Bock et.al [10] proposed numerical methods for parameter estimation in DAE’s (Differential algebraic equation). Isabel Reis Dos Santos et al [11], Introduced weighted least squares procedure for estimating the unknown parameters of a nonlinear regression metamodel. For large-scale non smooth convex minimization the Hager and Zhang (HZ) conjugate gradient Method and the modified HZ (MHZ) method were presented by Gonglin Yuan et al [12].
Regression dilution bias: tools for correction methods and sample size calculation.
Berglund, Lars
2012-08-01
Random errors in measurement of a risk factor will introduce downward bias of an estimated association to a disease or a disease marker. This phenomenon is called regression dilution bias. A bias correction may be made with data from a validity study or a reliability study. In this article we give a non-technical description of designs of reliability studies with emphasis on selection of individuals for a repeated measurement, assumptions of measurement error models, and correction methods for the slope in a simple linear regression model where the dependent variable is a continuous variable. Also, we describe situations where correction for regression dilution bias is not appropriate. The methods are illustrated with the association between insulin sensitivity measured with the euglycaemic insulin clamp technique and fasting insulin, where measurement of the latter variable carries noticeable random error. We provide software tools for estimation of a corrected slope in a simple linear regression model assuming data for a continuous dependent variable and a continuous risk factor from a main study and an additional measurement of the risk factor in a reliability study. Also, we supply programs for estimation of the number of individuals needed in the reliability study and for choice of its design. Our conclusion is that correction for regression dilution bias is seldom applied in epidemiological studies. This may cause important effects of risk factors with large measurement errors to be neglected.
Directory of Open Access Journals (Sweden)
Yi-Ming Kuo
2011-06-01
Full Text Available Fine airborne particulate matter (PM2.5 has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August, 2005. Due to the popularity of geographic information systems (GIS, the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME method. The resulting epistemic framework can assimilate knowledge bases including: (a empirical-based spatial trends of PM concentration based on landuse regression, (b the spatio-temporal dependence among PM observation information, and (c site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan from 2005–2007.
Yu, Hwa-Lung; Wang, Chih-Hsih; Liu, Ming-Che; Kuo, Yi-Ming
2011-06-01
Fine airborne particulate matter (PM2.5) has adverse effects on human health. Assessing the long-term effects of PM2.5 exposure on human health and ecology is often limited by a lack of reliable PM2.5 measurements. In Taipei, PM2.5 levels were not systematically measured until August, 2005. Due to the popularity of geographic information systems (GIS), the landuse regression method has been widely used in the spatial estimation of PM concentrations. This method accounts for the potential contributing factors of the local environment, such as traffic volume. Geostatistical methods, on other hand, account for the spatiotemporal dependence among the observations of ambient pollutants. This study assesses the performance of the landuse regression model for the spatiotemporal estimation of PM2.5 in the Taipei area. Specifically, this study integrates the landuse regression model with the geostatistical approach within the framework of the Bayesian maximum entropy (BME) method. The resulting epistemic framework can assimilate knowledge bases including: (a) empirical-based spatial trends of PM concentration based on landuse regression, (b) the spatio-temporal dependence among PM observation information, and (c) site-specific PM observations. The proposed approach performs the spatiotemporal estimation of PM2.5 levels in the Taipei area (Taiwan) from 2005-2007.
A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants
Cooper, Paul D.
2010-01-01
A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…
Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett
2009-01-01
Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....
Cox regression with missing covariate data using a modified partial likelihood method
DEFF Research Database (Denmark)
Martinussen, Torben; Holst, Klaus K.; Scheike, Thomas H.
2016-01-01
Missing covariate values is a common problem in survival analysis. In this paper we propose a novel method for the Cox regression model that is close to maximum likelihood but avoids the use of the EM-algorithm. It exploits that the observed hazard function is multiplicative in the baseline hazard...
Convert a low-cost sensor to a colorimeter using an improved regression method
Wu, Yifeng
2008-01-01
Closed loop color calibration is a process to maintain consistent color reproduction for color printers. To perform closed loop color calibration, a pre-designed color target should be printed, and automatically measured by a color measuring instrument. A low cost sensor has been embedded to the printer to perform the color measurement. A series of sensor calibration and color conversion methods have been developed. The purpose is to get accurate colorimetric measurement from the data measured by the low cost sensor. In order to get high accuracy colorimetric measurement, we need carefully calibrate the sensor, and minimize all possible errors during the color conversion. After comparing several classical color conversion methods, a regression based color conversion method has been selected. The regression is a powerful method to estimate the color conversion functions. But the main difficulty to use this method is to find an appropriate function to describe the relationship between the input and the output data. In this paper, we propose to use 1D pre-linearization tables to improve the linearity between the input sensor measuring data and the output colorimetric data. Using this method, we can increase the accuracy of the regression method, so as to improve the accuracy of the color conversion.
Anderson, Carl A.; McRae, Allan F.; Visscher, Peter M.
2006-01-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using...
USING LEARNING VECTOR QUANTIZATION METHOD FOR AUTOMATED IDENTIFICATION OF MYCOBACTERIUM TUBERCULOSIS
Directory of Open Access Journals (Sweden)
Endah Purwanti
2012-01-01
Full Text Available In this paper, we are developing an automated method for the detection of tubercle bacilli in clinical specimens, principally the sputum. This investigation is the first attempt to automatically identify TB bacilli in sputum using image processing and learning vector quantization (LVQ techniques. The evaluation of the learning vector quantization (LVQ was carried out on Tuberculosis dataset show that average of accuracy is 91,33%.
On the asymptotic form of the recursion method basis vectors for periodic Hamiltonians
International Nuclear Information System (INIS)
O'Reilly, E.P.; Weaire, D.
1984-01-01
The authors present the first detailed study of the recursion method basis vectors for the case of a periodic Hamiltonian. In the examples chosen, the probability density scales linearly with n as n → infinity, whenever the local density of states is bounded. Whenever it is unbounded and the recursion coefficients diverge, different scaling behaviour is found. These findings are explained and a scaling relationship between the asymptotic forms of the recursion coefficients and basis vectors is proposed. (author)
A Comparative Study of Pairwise Learning Methods Based on Kernel Ridge Regression.
Stock, Michiel; Pahikkala, Tapio; Airola, Antti; De Baets, Bernard; Waegeman, Willem
2018-06-12
Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction, or network inference problems. During the past decade, kernel methods have played a dominant role in pairwise learning. They still obtain a state-of-the-art predictive performance, but a theoretical analysis of their behavior has been underexplored in the machine learning literature. In this work we review and unify kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression, and a linear matrix filter arise naturally as a special case of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency, and spectral filtering properties. Our theoretical results provide valuable insights into assessing the advantages and limitations of existing pairwise learning methods.
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are capable to forecast a full probability distribution. In order to estimate the corresponding regression coefficients, CRPS minimization is performed in many meteorological post-processing studies since the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield more calibrated forecasts. Theoretically, both scoring rules used as an optimization score should be able to locate a similar and unknown optimum. Discrepancies might result from a wrong distributional assumption of the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield to similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
Track Circuit Fault Diagnosis Method based on Least Squares Support Vector
Cao, Yan; Sun, Fengru
2018-01-01
In order to improve the troubleshooting efficiency and accuracy of the track circuit, track circuit fault diagnosis method was researched. Firstly, the least squares support vector machine was applied to design the multi-fault classifier of the track circuit, and then the measured track data as training samples was used to verify the feasibility of the methods. Finally, the results based on BP neural network fault diagnosis methods and the methods used in this paper were compared. Results shows that the track fault classifier based on least squares support vector machine can effectively achieve the five track circuit fault diagnosis with less computing time.
Minimum-Voltage Vector Injection Method for Sensorless Control of PMSM for Low-Speed Operations
DEFF Research Database (Denmark)
Xie, Ge; Lu, Kaiyuan; Kumar, Dwivedi Sanjeet
2016-01-01
In this paper, a simple signal injection method is proposed for sensorless control of PMSM at low speed, which ideally requires one voltage vector only for position estimation. The proposed method is easy to implement resulting in low computation burden. No filters are needed for extracting...... may also be further developed to inject two opposite voltage vectors to reduce the effects of inverter voltage error on the position estimation accuracy. The effectiveness of the proposed method is demonstrated by comparing with other sensorless control method. Theoretical analysis and experimental...
Non-overlapped P- and S-wave Poynting vectors and its solution on Grid Method
Lu, Yong Ming; Liu, Qiancheng
2017-01-01
Poynting vector represents the local directional energy flux density of seismic waves in geophysics. It is widely used in elastic reverse time migration (RTM) to analyze source illumination, suppress low-wavenumber noise, correct for image polarity and extract angle-domain common imaging gather (ADCIG). However, the P and S waves are mixed together during wavefield propagation such that the P and S energy fluxes are not clean everywhere, especially at the overlapped points. In this paper, we use a modified elastic wave equation in which the P and S vector wavefields are naturally separated. Then, we develop an efficient method to evaluate the separable P and S poynting vectors, respectively, based on the view that the group velocity and phase velocity have the same direction in isotropic elastic media. We furthermore formulate our method using an unstructured mesh based modeling method named the grid method. Finally, we verify our method using two numerical examples.
Non-overlapped P- and S-wave Poynting vectors and its solution on Grid Method
Lu, Yong Ming
2017-12-12
Poynting vector represents the local directional energy flux density of seismic waves in geophysics. It is widely used in elastic reverse time migration (RTM) to analyze source illumination, suppress low-wavenumber noise, correct for image polarity and extract angle-domain common imaging gather (ADCIG). However, the P and S waves are mixed together during wavefield propagation such that the P and S energy fluxes are not clean everywhere, especially at the overlapped points. In this paper, we use a modified elastic wave equation in which the P and S vector wavefields are naturally separated. Then, we develop an efficient method to evaluate the separable P and S poynting vectors, respectively, based on the view that the group velocity and phase velocity have the same direction in isotropic elastic media. We furthermore formulate our method using an unstructured mesh based modeling method named the grid method. Finally, we verify our method using two numerical examples.
An Asymmetrical Space Vector Method for Single Phase Induction Motor
DEFF Research Database (Denmark)
Cui, Yuanhai; Blaabjerg, Frede; Andersen, Gert Karmisholt
2002-01-01
Single phase induction motors are the workhorses in low-power applications in the world, and also the variable speed is necessary. Normally it is achieved either by the mechanical method or by controlling the capacitor connected with the auxiliary winding. Any above method has some drawback which...
A Fast Gradient Method for Nonnegative Sparse Regression With Self-Dictionary
Gillis, Nicolas; Luce, Robert
2018-01-01
A nonnegative matrix factorization (NMF) can be computed efficiently under the separability assumption, which asserts that all the columns of the given input data matrix belong to the cone generated by a (small) subset of them. The provably most robust methods to identify these conic basis columns are based on nonnegative sparse regression and self dictionaries, and require the solution of large-scale convex optimization problems. In this paper we study a particular nonnegative sparse regression model with self dictionary. As opposed to previously proposed models, this model yields a smooth optimization problem where the sparsity is enforced through linear constraints. We show that the Euclidean projection on the polyhedron defined by these constraints can be computed efficiently, and propose a fast gradient method to solve our model. We compare our algorithm with several state-of-the-art methods on synthetic data sets and real-world hyperspectral images.
Sparse Method for Direction of Arrival Estimation Using Denoised Fourth-Order Cumulants Vector.
Fan, Yangyu; Wang, Jianshu; Du, Rui; Lv, Guoyun
2018-06-04
Fourth-order cumulants (FOCs) vector-based direction of arrival (DOA) estimation methods of non-Gaussian sources may suffer from poor performance for limited snapshots or difficulty in setting parameters. In this paper, a novel FOCs vector-based sparse DOA estimation method is proposed. Firstly, by utilizing the concept of a fourth-order difference co-array (FODCA), an advanced FOCs vector denoising or dimension reduction procedure is presented for arbitrary array geometries. Then, a novel single measurement vector (SMV) model is established by the denoised FOCs vector, and efficiently solved by an off-grid sparse Bayesian inference (OGSBI) method. The estimation errors of FOCs are integrated in the SMV model, and are approximately estimated in a simple way. A necessary condition regarding the number of identifiable sources of our method is presented that, in order to uniquely identify all sources, the number of sources K must fulfill K ≤ ( M 4 - 2 M 3 + 7 M 2 - 6 M ) / 8 . The proposed method suits any geometry, does not need prior knowledge of the number of sources, is insensitive to associated parameters, and has maximum identifiability O ( M 4 ) , where M is the number of sensors in the array. Numerical simulations illustrate the superior performance of the proposed method.
Valente, Giancarlo; De Martino, Federico; Esposito, Fabrizio; Goebel, Rainer; Formisano, Elia
2011-05-15
In this work we illustrate the approach of the Maastricht Brain Imaging Center to the PBAIC 2007 competition, where participants had to predict, based on fMRI measurements of brain activity, subject driven actions and sensory experience in a virtual world. After standard pre-processing (slice scan time correction, motion correction), we generated rating predictions based on linear Relevance Vector Machine (RVM) learning from all brain voxels. Spatial and temporal filtering of the time series was optimized rating by rating. For some of the ratings (e.g. Instructions, Hits, Faces, Velocity), linear RVM regression was accurate and very consistent within and between subjects. For other ratings (e.g. Arousal, Valence) results were less satisfactory. Our approach ranked overall second. To investigate the role of different brain regions in ratings prediction we generated predictive maps, i.e. maps of the weighted contribution of each voxel to the predicted rating. These maps generally included (but were not limited to) "specialized" regions which are consistent with results from conventional neuroimaging studies and known functional neuroanatomy. In conclusion, Sparse Bayesian Learning models, such as RVM, appear to be a valuable approach to the multivariate regression of fMRI time series. The implementation of the Automatic Relevance Determination criterion is particularly suitable and provides a good generalization, despite the limited number of samples which is typically available in fMRI. Predictive maps allow disclosing multi-voxel patterns of brain activity that predict perceptual and behavioral subjective experience. Copyright © 2010 Elsevier Inc. All rights reserved.
Huang, Mengmeng; Wei, Yan; Wang, Jun; Zhang, Yu
2016-09-01
We used the support vector regression (SVR) approach to predict and unravel reduction/promotion effect of characteristic flavonoids on the acrylamide formation under a low-moisture Maillard reaction system. Results demonstrated the reduction/promotion effects by flavonoids at addition levels of 1-10000 μmol/L. The maximal inhibition rates (51.7%, 68.8% and 26.1%) and promote rates (57.7%, 178.8% and 27.5%) caused by flavones, flavonols and isoflavones were observed at addition levels of 100 μmol/L and 10000 μmol/L, respectively. The reduction/promotion effects were closely related to the change of trolox equivalent antioxidant capacity (ΔTEAC) and well predicted by triple ΔTEAC measurements via SVR models (R: 0.633-0.900). Flavonols exhibit stronger effects on the acrylamide formation than flavones and isoflavones as well as their O-glycosides derivatives, which may be attributed to the number and position of phenolic and 3-enolic hydroxyls. The reduction/promotion effects were well predicted by using optimized quantitative structure-activity relationship (QSAR) descriptors and SVR models (R: 0.926-0.994). Compared to artificial neural network and multi-linear regression models, SVR models exhibited better fitting performance for both TEAC-dependent and QSAR descriptor-dependent predicting work. These observations demonstrated that the SVR models are competent for predicting our understanding on the future use of natural antioxidants for decreasing the acrylamide formation.
Using the fuzzy linear regression method to benchmark the energy efficiency of commercial buildings
International Nuclear Information System (INIS)
Chung, William
2012-01-01
Highlights: ► Fuzzy linear regression method is used for developing benchmarking systems. ► The systems can be used to benchmark energy efficiency of commercial buildings. ► The resulting benchmarking model can be used by public users. ► The resulting benchmarking model can capture the fuzzy nature of input–output data. -- Abstract: Benchmarking systems from a sample of reference buildings need to be developed to conduct benchmarking processes for the energy efficiency of commercial buildings. However, not all benchmarking systems can be adopted by public users (i.e., other non-reference building owners) because of the different methods in developing such systems. An approach for benchmarking the energy efficiency of commercial buildings using statistical regression analysis to normalize other factors, such as management performance, was developed in a previous work. However, the field data given by experts can be regarded as a distribution of possibility. Thus, the previous work may not be adequate to handle such fuzzy input–output data. Consequently, a number of fuzzy structures cannot be fully captured by statistical regression analysis. This present paper proposes the use of fuzzy linear regression analysis to develop a benchmarking process, the resulting model of which can be used by public users. An illustrative example is given as well.
Kisi, Ozgur; Parmar, Kulwinder Singh
2016-03-01
This study investigates the accuracy of least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5Tree) in modeling river water pollution. Various combinations of water quality parameters, Free Ammonia (AMM), Total Kjeldahl Nitrogen (TKN), Water Temperature (WT), Total Coliform (TC), Fecal Coliform (FC) and Potential of Hydrogen (pH) monitored at Nizamuddin, Delhi Yamuna River in India were used as inputs to the applied models. Results indicated that the LSSVM and MARS models had almost same accuracy and they performed better than the M5Tree model in modeling monthly chemical oxygen demand (COD). The average root mean square error (RMSE) of the LSSVM and M5Tree models was decreased by 1.47% and 19.1% using MARS model, respectively. Adding TC input to the models did not increase their accuracy in modeling COD while adding FC and pH inputs to the models generally decreased the accuracy. The overall results indicated that the MARS and LSSVM models could be successfully used in estimating monthly river water pollution level by using AMM, TKN and WT parameters as inputs.
Directory of Open Access Journals (Sweden)
Massoud Tabesh
2011-07-01
Full Text Available Optimum operation of water distribution networks is one of the priorities of sustainable development of water resources, considering the issues of increasing efficiency and decreasing the water losses. One of the key subjects in optimum operational management of water distribution systems is preparing rehabilitation and replacement schemes, prediction of pipes break rate and evaluation of their reliability. Several approaches have been presented in recent years regarding prediction of pipe failure rates which each one requires especial data sets. Deterministic models based on age and deterministic multi variables and stochastic group modeling are examples of the solutions which relate pipe break rates to parameters like age, material and diameters. In this paper besides the mentioned parameters, more factors such as pipe depth and hydraulic pressures are considered as well. Then using multi variable regression method, intelligent approaches (Artificial neural network and neuro fuzzy models and Evolutionary polynomial Regression method (EPR pipe burst rate are predicted. To evaluate the results of different approaches, a case study is carried out in a part ofMashhadwater distribution network. The results show the capability and advantages of ANN and EPR methods to predict pipe break rates, in comparison with neuro fuzzy and multi-variable regression methods.
Kaneko, Hiromasa
2018-02-26
To develop a new ensemble learning method and construct highly predictive regression models in chemoinformatics and chemometrics, applicability domains (ADs) are introduced into the ensemble learning process of prediction. When estimating values of an objective variable using subregression models, only the submodels with ADs that cover a query sample, i.e., the sample is inside the model's AD, are used. By constructing submodels and changing a list of selected explanatory variables, the union of the submodels' ADs, which defines the overall AD, becomes large, and the prediction performance is enhanced for diverse compounds. By analyzing a quantitative structure-activity relationship data set and a quantitative structure-property relationship data set, it is confirmed that the ADs can be enlarged and the estimation performance of regression models is improved compared with traditional methods.
Development of Compressive Failure Strength for Composite Laminate Using Regression Analysis Method
Energy Technology Data Exchange (ETDEWEB)
Lee, Myoung Keon [Agency for Defense Development, Daejeon (Korea, Republic of); Lee, Jeong Won; Yoon, Dong Hyun; Kim, Jae Hoon [Chungnam Nat’l Univ., Daejeon (Korea, Republic of)
2016-10-15
This paper provides the compressive failure strength value of composite laminate developed by using regression analysis method. Composite material in this document is a Carbon/Epoxy unidirection(UD) tape prepreg(Cycom G40-800/5276-1) cured at 350°F(177°C). The operating temperature is –60°F~+200°F(-55°C - +95°C). A total of 56 compression tests were conducted on specimens from eight (8) distinct laminates that were laid up by standard angle layers (0°, +45°, –45° and 90°). The ASTM-D-6484 standard was used for test method. The regression analysis was performed with the response variable being the laminate ultimate fracture strength and the regressor variables being two ply orientations (0° and ±45°)
Development of Compressive Failure Strength for Composite Laminate Using Regression Analysis Method
International Nuclear Information System (INIS)
Lee, Myoung Keon; Lee, Jeong Won; Yoon, Dong Hyun; Kim, Jae Hoon
2016-01-01
This paper provides the compressive failure strength value of composite laminate developed by using regression analysis method. Composite material in this document is a Carbon/Epoxy unidirection(UD) tape prepreg(Cycom G40-800/5276-1) cured at 350°F(177°C). The operating temperature is –60°F~+200°F(-55°C - +95°C). A total of 56 compression tests were conducted on specimens from eight (8) distinct laminates that were laid up by standard angle layers (0°, +45°, –45° and 90°). The ASTM-D-6484 standard was used for test method. The regression analysis was performed with the response variable being the laminate ultimate fracture strength and the regressor variables being two ply orientations (0° and ±45°)
James W. Hardin; Henrik Schmeidiche; Raymond J. Carroll
2003-01-01
This paper discusses and illustrates the method of regression calibration. This is a straightforward technique for fitting models with additive measurement error. We present this discussion in terms of generalized linear models (GLMs) following the notation defined in Hardin and Carroll (2003). Discussion will include specified measurement error, measurement error estimated by replicate error-prone proxies, and measurement error estimated by instrumental variables. The discussion focuses on s...
Assessing the performance of variational methods for mixed logistic regression models
Czech Academy of Sciences Publication Activity Database
Rijmen, F.; Vomlel, Jiří
2008-01-01
Roč. 78, č. 8 (2008), s. 765-779 ISSN 0094-9655 R&D Projects: GA MŠk 1M0572 Grant - others:GA MŠk(CZ) 2C06019 Institutional research plan: CEZ:AV0Z10750506 Keywords : Mixed models * Logistic regression * Variational methods * Lower bound approximation Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.353, year: 2008
Comparison of Adaline and Multiple Linear Regression Methods for Rainfall Forecasting
Sutawinaya, IP; Astawa, INGA; Hariyanti, NKD
2018-01-01
Heavy rainfall can cause disaster, therefore need a forecast to predict rainfall intensity. Main factor that cause flooding is there is a high rainfall intensity and it makes the river become overcapacity. This will cause flooding around the area. Rainfall factor is a dynamic factor, so rainfall is very interesting to be studied. In order to support the rainfall forecasting, there are methods that can be used from Artificial Intelligence (AI) to statistic. In this research, we used Adaline for AI method and Regression for statistic method. The more accurate forecast result shows the method that used is good for forecasting the rainfall. Through those methods, we expected which is the best method for rainfall forecasting here.
Wan, Jian; Chen, Yi-Chieh; Morris, A Julian; Thennadil, Suresh N
2017-07-01
Near-infrared (NIR) spectroscopy is being widely used in various fields ranging from pharmaceutics to the food industry for analyzing chemical and physical properties of the substances concerned. Its advantages over other analytical techniques include available physical interpretation of spectral data, nondestructive nature and high speed of measurements, and little or no need for sample preparation. The successful application of NIR spectroscopy relies on three main aspects: pre-processing of spectral data to eliminate nonlinear variations due to temperature, light scattering effects and many others, selection of those wavelengths that contribute useful information, and identification of suitable calibration models using linear/nonlinear regression . Several methods have been developed for each of these three aspects and many comparative studies of different methods exist for an individual aspect or some combinations. However, there is still a lack of comparative studies for the interactions among these three aspects, which can shed light on what role each aspect plays in the calibration and how to combine various methods of each aspect together to obtain the best calibration model. This paper aims to provide such a comparative study based on four benchmark data sets using three typical pre-processing methods, namely, orthogonal signal correction (OSC), extended multiplicative signal correction (EMSC) and optical path-length estimation and correction (OPLEC); two existing wavelength selection methods, namely, stepwise forward selection (SFS) and genetic algorithm optimization combined with partial least squares regression for spectral data (GAPLSSP); four popular regression methods, namely, partial least squares (PLS), least absolute shrinkage and selection operator (LASSO), least squares support vector machine (LS-SVM), and Gaussian process regression (GPR). The comparative study indicates that, in general, pre-processing of spectral data can play a significant
Hassanzadeh, S.; Hosseinibalam, F.; Omidvari, M.
2008-04-01
Data of seven meteorological variables (relative humidity, wet temperature, dry temperature, maximum temperature, minimum temperature, ground temperature and sun radiation time) and ozone values have been used for statistical analysis. Meteorological variables and ozone values were analyzed using both multiple linear regression and principal component methods. Data for the period 1999-2004 are analyzed jointly using both methods. For all periods, temperature dependent variables were highly correlated, but were all negatively correlated with relative humidity. Multiple regression analysis was used to fit the meteorological variables using the meteorological variables as predictors. A variable selection method based on high loading of varimax rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model of the meteorological variables. In 1999, 2001 and 2002 one of the meteorological variables was weakly influenced predominantly by the ozone concentrations. However, the model did not predict that the meteorological variables for the year 2000 were not influenced predominantly by the ozone concentrations that point to variation in sun radiation. This could be due to other factors that were not explicitly considered in this study.
Permanent Magnet Flux Online Estimation Based on Zero-Voltage Vector Injection Method
DEFF Research Database (Denmark)
Xie, Ge; Lu, Kaiyuan; Kumar, Dwivedi Sanjeet
2015-01-01
In this paper, a simple signal injection method is proposed for sensorless control of PMSM at low speed, which ideally requires one voltage vector only for position estimation. The proposed method is easy to implement resulting in low computation burden. No filters are needed for extracting...
Methods of treating Parkinson's disease using viral vectors
Energy Technology Data Exchange (ETDEWEB)
Bankiewicz, Krystof; Cunningham, Janet
2016-11-15
Methods of delivering viral vectors, particularly recombinant adeno-associated virus (rAAV) virions, to the central nervous system (CNS) using convection enhanced delivery (CED) are provided. The rAAV virions include a nucleic acid sequence encoding a therapeutic polypeptide. The methods can be used for treating CNS disorders such as for treating Parkinson's Disease.
Datum Feature Extraction and Deformation Analysis Method Based on Normal Vector of Point Cloud
Sun, W.; Wang, J.; Jin, F.; Liang, Z.; Yang, Y.
2018-04-01
In order to solve the problem lacking applicable analysis method in the application of three-dimensional laser scanning technology to the field of deformation monitoring, an efficient method extracting datum feature and analysing deformation based on normal vector of point cloud was proposed. Firstly, the kd-tree is used to establish the topological relation. Datum points are detected by tracking the normal vector of point cloud determined by the normal vector of local planar. Then, the cubic B-spline curve fitting is performed on the datum points. Finally, datum elevation and the inclination angle of the radial point are calculated according to the fitted curve and then the deformation information was analyzed. The proposed approach was verified on real large-scale tank data set captured with terrestrial laser scanner in a chemical plant. The results show that the method could obtain the entire information of the monitor object quickly and comprehensively, and reflect accurately the datum feature deformation.
Correcting for cryptic relatedness by a regression-based genomic control method
Directory of Open Access Journals (Sweden)
Yang Yaning
2009-12-01
Full Text Available Abstract Background Genomic control (GC method is a useful tool to correct for the cryptic relatedness in population-based association studies. It was originally proposed for correcting for the variance inflation of Cochran-Armitage's additive trend test by using information from unlinked null markers, and was later generalized to be applicable to other tests with the additional requirement that the null markers are matched with the candidate marker in allele frequencies. However, matching allele frequencies limits the number of available null markers and thus limits the applicability of the GC method. On the other hand, errors in genotype/allele frequencies may cause further bias and variance inflation and thereby aggravate the effect of GC correction. Results In this paper, we propose a regression-based GC method using null markers that are not necessarily matched in allele frequencies with the candidate marker. Variation of allele frequencies of the null markers is adjusted by a regression method. Conclusion The proposed method can be readily applied to the Cochran-Armitage's trend tests other than the additive trend test, the Pearson's chi-square test and other robust efficiency tests. Simulation results show that the proposed method is effective in controlling type I error in the presence of population substructure.
A subagging regression method for estimating the qualitative and quantitative state of groundwater
Jeong, Jina; Park, Eungyu; Han, Weon Shik; Kim, Kue-Young
2017-08-01
A subsample aggregating (subagging) regression (SBR) method for the analysis of groundwater data pertaining to trend-estimation-associated uncertainty is proposed. The SBR method is validated against synthetic data competitively with other conventional robust and non-robust methods. From the results, it is verified that the estimation accuracies of the SBR method are consistent and superior to those of other methods, and the uncertainties are reasonably estimated; the others have no uncertainty analysis option. To validate further, actual groundwater data are employed and analyzed comparatively with Gaussian process regression (GPR). For all cases, the trend and the associated uncertainties are reasonably estimated by both SBR and GPR regardless of Gaussian or non-Gaussian skewed data. However, it is expected that GPR has a limitation in applications to severely corrupted data by outliers owing to its non-robustness. From the implementations, it is determined that the SBR method has the potential to be further developed as an effective tool of anomaly detection or outlier identification in groundwater state data such as the groundwater level and contaminant concentration.
Directory of Open Access Journals (Sweden)
Zhang Jing
2016-01-01
Full Text Available To assist physicians to quickly find the required 3D model from the mass medical model, we propose a novel retrieval method, called DRFVT, which combines the characteristics of dimensionality reduction (DR and feature vector transformation (FVT method. The DR method reduces the dimensionality of feature vector; only the top M low frequency Discrete Fourier Transform coefficients are retained. The FVT method does the transformation of the original feature vector and generates a new feature vector to solve the problem of noise sensitivity. The experiment results demonstrate that the DRFVT method achieves more effective and efficient retrieval results than other proposed methods.
A method for fitting regression splines with varying polynomial order in the linear mixed model.
Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W
2006-02-15
The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.
Prastuti, M.; Suhartono; Salehah, NA
2018-04-01
The need for energy supply, especially for electricity in Indonesia has been increasing in the last past years. Furthermore, the high electricity usage by people at different times leads to the occurrence of heteroscedasticity issue. Estimate the electricity supply that could fulfilled the community’s need is very important, but the heteroscedasticity issue often made electricity forecasting hard to be done. An accurate forecast of electricity consumptions is one of the key challenges for energy provider to make better resources and service planning and also take control actions in order to balance the electricity supply and demand for community. In this paper, hybrid ARIMAX Quantile Regression (ARIMAX-QR) approach was proposed to predict the short-term electricity consumption in East Java. This method will also be compared to time series regression using RMSE, MAPE, and MdAPE criteria. The data used in this research was the electricity consumption per half-an-hour data during the period of September 2015 to April 2016. The results show that the proposed approach can be a competitive alternative to forecast short-term electricity in East Java. ARIMAX-QR using lag values and dummy variables as predictors yield more accurate prediction in both in-sample and out-sample data. Moreover, both time series regression and ARIMAX-QR methods with addition of lag values as predictor could capture accurately the patterns in the data. Hence, it produces better predictions compared to the models that not use additional lag variables.
A robust and efficient stepwise regression method for building sparse polynomial chaos expansions
Energy Technology Data Exchange (ETDEWEB)
Abraham, Simon, E-mail: Simon.Abraham@ulb.ac.be [Vrije Universiteit Brussel (VUB), Department of Mechanical Engineering, Research Group Fluid Mechanics and Thermodynamics, Pleinlaan 2, 1050 Brussels (Belgium); Raisee, Mehrdad [School of Mechanical Engineering, College of Engineering, University of Tehran, P.O. Box: 11155-4563, Tehran (Iran, Islamic Republic of); Ghorbaniasl, Ghader; Contino, Francesco; Lacor, Chris [Vrije Universiteit Brussel (VUB), Department of Mechanical Engineering, Research Group Fluid Mechanics and Thermodynamics, Pleinlaan 2, 1050 Brussels (Belgium)
2017-03-01
Polynomial Chaos (PC) expansions are widely used in various engineering fields for quantifying uncertainties arising from uncertain parameters. The computational cost of classical PC solution schemes is unaffordable as the number of deterministic simulations to be calculated grows dramatically with the number of stochastic dimension. This considerably restricts the practical use of PC at the industrial level. A common approach to address such problems is to make use of sparse PC expansions. This paper presents a non-intrusive regression-based method for building sparse PC expansions. The most important PC contributions are detected sequentially through an automatic search procedure. The variable selection criterion is based on efficient tools relevant to probabilistic method. Two benchmark analytical functions are used to validate the proposed algorithm. The computational efficiency of the method is then illustrated by a more realistic CFD application, consisting of the non-deterministic flow around a transonic airfoil subject to geometrical uncertainties. To assess the performance of the developed methodology, a detailed comparison is made with the well established LAR-based selection technique. The results show that the developed sparse regression technique is able to identify the most significant PC contributions describing the problem. Moreover, the most important stochastic features are captured at a reduced computational cost compared to the LAR method. The results also demonstrate the superior robustness of the method by repeating the analyses using random experimental designs.
Kim, Yoonsang; Choi, Young-Ku; Emery, Sherry
2013-08-01
Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods' performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages-SAS GLIMMIX Laplace and SuperMix Gaussian quadrature-perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes.
International Nuclear Information System (INIS)
Celikel, Oguz
2011-01-01
This paper presents the application of the vector modulation method (VMM) to an open-loop interferometric fiber optic gyroscope, called the north finder capability gyroscope (NFCG), designed and assembled in TUBITAK UME (National Metrology Institute of Turkey). The method contains a secondary modulation/demodulation circuit with an AD630 chip, depending on the periodic variation of the orientation of the sensing coil sensitive surface vector with respect to geographic north at a laboratory latitude and collection of dc voltage at the secondary demodulation circuit output in the time domain. The resultant dc voltage proportional to the first-kind Bessel function based on Sagnac phase shift for the first order is obtained as a result of vector modulation together with the Earth's rotation. A new model function is developed and introduced to evaluate the angular errors of the NFCG with VMM in finding geographic north
Discrete-ordinate method with matrix exponential for a pseudo-spherical atmosphere: Vector case
International Nuclear Information System (INIS)
Doicu, A.; Trautmann, T.
2009-01-01
The paper is devoted to the extension of the matrix-exponential formalism for the scalar radiative transfer to the vector case. Using basic results of the theory of matrix-exponential functions we provide a compact and versatile formulation of the vector radiative transfer. As in the scalar case, we operate with the concept of the layer equation incorporating the level values of the Stokes vector. The matrix exponentials which enter in the expression of the layer equation are computed by using the matrix eigenvalue method and the Pade approximation. A discussion of the computational efficiency of the proposed method for both an aerosol-loaded atmosphere as well as a cloudy atmosphere is also provided
Vectorization on the star computer of several numerical methods for a fluid flow problem
Lambiotte, J. J., Jr.; Howser, L. M.
1974-01-01
A reexamination of some numerical methods is considered in light of the new class of computers which use vector streaming to achieve high computation rates. A study has been made of the effect on the relative efficiency of several numerical methods applied to a particular fluid flow problem when they are implemented on a vector computer. The method of Brailovskaya, the alternating direction implicit method, a fully implicit method, and a new method called partial implicitization have been applied to the problem of determining the steady state solution of the two-dimensional flow of a viscous imcompressible fluid in a square cavity driven by a sliding wall. Results are obtained for three mesh sizes and a comparison is made of the methods for serial computation.
Directory of Open Access Journals (Sweden)
Rui Silva
2018-02-01
Full Text Available The performance of a support vector regression (SVR model with a Gaussian radial basis kernel to predict anthocyanin concentration, pH index and sugar content in whole grape berries, using spectroscopic measurements obtained in reflectance mode, was evaluated. Each sample contained a small number of whole berries and the spectrum of each sample was collected during ripening using hyperspectral imaging in the range of 380–1028 nm. Touriga Franca (TF variety samples were collected for the 2012–2015 vintages, and Touriga Nacional (TN and Tinta Barroca (TB variety samples were collected for the 2013 vintage. These TF vintages were independently used to train, validate and test the SVR methodology; different combinations of TF vintages were used to train and test each model to assess the performance differences under wider and more variable datasets; the varieties that were not employed in the model training and validation (TB and TN were used to test the generalization ability of the SVR approach. Each case was tested using an external independent set (with data not included in the model training or validation steps. The best R2 results obtained with varieties and vintages not employed in the model’s training step were 0.89, 0.81 and 0.90, with RMSE values of 35.6 mg·L−1, 0.25 and 3.19 °Brix, for anthocyanin concentration, pH index and sugar content, respectively. The present results indicate a good overall performance for all cases, improving the state-of-the-art results for external test sets, and suggesting that a robust model, with a generalization capacity over different varieties and harvest years may be obtainable without further training, which makes this a very competitive approach when compared to the models from other authors, since it makes the problem significantly simpler and more cost-effective.
Ichii, Kazuhito; Ueyama, Masahito; Kondo, Masayuki; Saigusa, Nobuko; Kim, Joon; Alberto, Ma. Carmelita; Ardö, Jonas; Euskirchen, Eugénie S.; Kang, Minseok; Hirano, Takashi; Joiner, Joanna; Kobayashi, Hideki; Marchesini, Luca Belelli; Merbold, Lutz; Miyata, Akira; Saitoh, Taku M.; Takagi, Kentaro; Varlagin, Andrej; Bret-Harte, M. Syndonia; Kitamura, Kenzo; Kosugi, Yoshiko; Kotani, Ayumi; Kumar, Kireet; Li, Sheng-Gong; Machimura, Takashi; Matsuura, Yojiro; Mizoguchi, Yasuko; Ohta, Takeshi; Mukherjee, Sandipan; Yanagi, Yuji; Yasuda, Yukio; Zhang, Yiping; Zhao, Fenghua
2017-04-01
The lack of a standardized database of eddy covariance observations has been an obstacle for data-driven estimation of terrestrial CO2 fluxes in Asia. In this study, we developed such a standardized database using 54 sites from various databases by applying consistent postprocessing for data-driven estimation of gross primary productivity (GPP) and net ecosystem CO2 exchange (NEE). Data-driven estimation was conducted by using a machine learning algorithm: support vector regression (SVR), with remote sensing data for 2000 to 2015 period. Site-level evaluation of the estimated CO2 fluxes shows that although performance varies in different vegetation and climate classifications, GPP and NEE at 8 days are reproduced (e.g., r2 = 0.73 and 0.42 for 8 day GPP and NEE). Evaluation of spatially estimated GPP with Global Ozone Monitoring Experiment 2 sensor-based Sun-induced chlorophyll fluorescence shows that monthly GPP variations at subcontinental scale were reproduced by SVR (r2 = 1.00, 0.94, 0.91, and 0.89 for Siberia, East Asia, South Asia, and Southeast Asia, respectively). Evaluation of spatially estimated NEE with net atmosphere-land CO2 fluxes of Greenhouse Gases Observing Satellite (GOSAT) Level 4A product shows that monthly variations of these data were consistent in Siberia and East Asia; meanwhile, inconsistency was found in South Asia and Southeast Asia. Furthermore, differences in the land CO2 fluxes from SVR-NEE and GOSAT Level 4A were partially explained by accounting for the differences in the definition of land CO2 fluxes. These data-driven estimates can provide a new opportunity to assess CO2 fluxes in Asia and evaluate and constrain terrestrial ecosystem models.
Kim, Yoonsang; Emery, Sherry
2013-01-01
Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods’ performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages—SAS GLIMMIX Laplace and SuperMix Gaussian quadrature—perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes. PMID:24288415
da Silva, Claudia Pereira; Emídio, Elissandro Soares; de Marchi, Mary Rosa Rodrigues
2015-01-01
This paper describes the validation of a method consisting of solid-phase extraction followed by gas chromatography-tandem mass spectrometry for the analysis of the ultraviolet (UV) filters benzophenone-3, ethylhexyl salicylate, ethylhexyl methoxycinnamate and octocrylene. The method validation criteria included evaluation of selectivity, analytical curve, trueness, precision, limits of detection and limits of quantification. The non-weighted linear regression model has traditionally been used for calibration, but it is not necessarily the optimal model in all cases. Because the assumption of homoscedasticity was not met for the analytical data in this work, a weighted least squares linear regression was used for the calibration method. The evaluated analytical parameters were satisfactory for the analytes and showed recoveries at four fortification levels between 62% and 107%, with relative standard deviations less than 14%. The detection limits ranged from 7.6 to 24.1 ng L(-1). The proposed method was used to determine the amount of UV filters in water samples from water treatment plants in Araraquara and Jau in São Paulo, Brazil. Copyright © 2014 Elsevier B.V. All rights reserved.
Increasing the computational efficient of digital cross correlation by a vectorization method
Chang, Ching-Yuan; Ma, Chien-Ching
2017-08-01
This study presents a vectorization method for use in MATLAB programming aimed at increasing the computational efficiency of digital cross correlation in sound and images, resulting in a speedup of 6.387 and 36.044 times compared with performance values obtained from looped expression. This work bridges the gap between matrix operations and loop iteration, preserving flexibility and efficiency in program testing. This paper uses numerical simulation to verify the speedup of the proposed vectorization method as well as experiments to measure the quantitative transient displacement response subjected to dynamic impact loading. The experiment involved the use of a high speed camera as well as a fiber optic system to measure the transient displacement in a cantilever beam under impact from a steel ball. Experimental measurement data obtained from the two methods are in excellent agreement in both the time and frequency domain, with discrepancies of only 0.68%. Numerical and experiment results demonstrate the efficacy of the proposed vectorization method with regard to computational speed in signal processing and high precision in the correlation algorithm. We also present the source code with which to build MATLAB-executable functions on Windows as well as Linux platforms, and provide a series of examples to demonstrate the application of the proposed vectorization method.
Steganalysis using logistic regression
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-art 686-dimensional SPAM feature set, in three image sets.
Neutron–gamma discrimination based on the support vector machine method
International Nuclear Information System (INIS)
Yu, Xunzhen; Zhu, Jingjun; Lin, ShinTed; Wang, Li; Xing, Haoyang; Zhang, Caixun; Xia, Yuxi; Liu, Shukui; Yue, Qian; Wei, Weiwei; Du, Qiang; Tang, Changjian
2015-01-01
In this study, the combination of the support vector machine (SVM) method with the moment analysis method (MAM) is proposed and utilized to perform neutron/gamma (n/γ) discrimination of the pulses from an organic liquid scintillator (OLS). Neutron and gamma events, which can be firmly separated on the scatter plot drawn by the charge comparison method (CCM), are detected to form the training data set and the test data set for the SVM, and the MAM is used to create the feature vectors for individual events in the data sets. Compared to the traditional methods, such as CCM, the proposed method can not only discriminate the neutron and gamma signals, even at lower energy levels, but also provide the corresponding classification accuracy for each event, which is useful in validating the discrimination. Meanwhile, the proposed method can also offer a predication of the classification for the under-energy-limit events
Directory of Open Access Journals (Sweden)
Gholam Reza Sheykhzadeh
2017-02-01
Full Text Available Introduction: Penetration resistance is one of the criteria for evaluating soil compaction. It correlates with several soil properties such as vehicle trafficability, resistance to root penetration, seedling emergence, and soil compaction by farm machinery. Direct measurement of penetration resistance is time consuming and difficult because of high temporal and spatial variability. Therefore, many different regressions and artificial neural network pedotransfer functions have been proposed to estimate penetration resistance from readily available soil variables such as particle size distribution, bulk density (Db and gravimetric water content (θm. The lands of Ardabil Province are one of the main production regions of potato in Iran, thus, obtaining the soil penetration resistance in these regions help with the management of potato production. The objective of this research was to derive pedotransfer functions by using regression and artificial neural network to predict penetration resistance from some soil variations in the agricultural soils of Ardabil plain and to compare the performance of artificial neural network with regression models. Materials and methods: Disturbed and undisturbed soil samples (n= 105 were systematically taken from 0-10 cm soil depth with nearly 3000 m distance in the agricultural lands of the Ardabil plain ((lat 38°15' to 38°40' N, long 48°16' to 48°61' E. The contents of sand, silt and clay (hydrometer method, CaCO3 (titration method, bulk density (cylinder method, particle density (Dp (pychnometer method, organic carbon (wet oxidation method, total porosity(calculating from Db and Dp, saturated (θs and field soil water (θf using the gravimetric method were measured in the laboratory. Mean geometric diameter (dg and standard deviation (σg of soil particles were computed using the percentages of sand, silt and clay. Penetration resistance was measured in situ using cone penetrometer (analog model at 10
Landslide susceptibility mapping on a global scale using the method of logistic regression
Directory of Open Access Journals (Sweden)
L. Lin
2017-08-01
Full Text Available This paper proposes a statistical model for mapping global landslide susceptibility based on logistic regression. After investigating explanatory factors for landslides in the existing literature, five factors were selected for model landslide susceptibility: relative relief, extreme precipitation, lithology, ground motion and soil moisture. When building the model, 70 % of landslide and nonlandslide points were randomly selected for logistic regression, and the others were used for model validation. To evaluate the accuracy of predictive models, this paper adopts several criteria including a receiver operating characteristic (ROC curve method. Logistic regression experiments found all five factors to be significant in explaining landslide occurrence on a global scale. During the modeling process, percentage correct in confusion matrix of landslide classification was approximately 80 % and the area under the curve (AUC was nearly 0.87. During the validation process, the above statistics were about 81 % and 0.88, respectively. Such a result indicates that the model has strong robustness and stable performance. This model found that at a global scale, soil moisture can be dominant in the occurrence of landslides and topographic factor may be secondary.
Liu, Ke; Chen, Xiaojing; Li, Limin; Chen, Huiling; Ruan, Xiukai; Liu, Wenbin
2015-02-09
The successive projections algorithm (SPA) is widely used to select variables for multiple linear regression (MLR) modeling. However, SPA used only once may not obtain all the useful information of the full spectra, because the number of selected variables cannot exceed the number of calibration samples in the SPA algorithm. Therefore, the SPA-MLR method risks the loss of useful information. To make a full use of the useful information in the spectra, a new method named "consensus SPA-MLR" (C-SPA-MLR) is proposed herein. This method is the combination of consensus strategy and SPA-MLR method. In the C-SPA-MLR method, SPA-MLR is used to construct member models with different subsets of variables, which are selected from the remaining variables iteratively. A consensus prediction is obtained by combining the predictions of the member models. The proposed method is evaluated by analyzing the near infrared (NIR) spectra of corn and diesel. The results of C-SPA-MLR method showed a better prediction performance compared with the SPA-MLR and full-spectra PLS methods. Moreover, these results could serve as a reference for combination the consensus strategy and other variable selection methods when analyzing NIR spectra and other spectroscopic techniques. Copyright © 2014 Elsevier B.V. All rights reserved.
Development of K-Nearest Neighbour Regression Method in Forecasting River Stream Flow
Directory of Open Access Journals (Sweden)
Mohammad Azmi
2012-07-01
Full Text Available Different statistical, non-statistical and black-box methods have been used in forecasting processes. Among statistical methods, K-nearest neighbour non-parametric regression method (K-NN due to its natural simplicity and mathematical base is one of the recommended methods for forecasting processes. In this study, K-NN method is explained completely. Besides, development and improvement approaches such as best neighbour estimation, data transformation functions, distance functions and proposed extrapolation method are described. K-NN method in company with its development approaches is used in streamflow forecasting of Zayandeh-Rud Dam upper basin. Comparing between final results of classic K-NN method and modified K-NN (number of neighbour 5, transformation function of Range Scaling, distance function of Mahanalobis and proposed extrapolation method shows that modified K-NN in criteria of goodness of fit, root mean square error, percentage of volume of error and correlation has had performance improvement 45% , 59% and 17% respectively. These results approve necessity of applying mentioned approaches to derive more accurate forecasts.
Comparing the index-flood and multiple-regression methods using L-moments
Malekinezhad, H.; Nachtnebel, H. P.; Klik, A.
In arid and semi-arid regions, the length of records is usually too short to ensure reliable quantile estimates. Comparing index-flood and multiple-regression analyses based on L-moments was the main objective of this study. Factor analysis was applied to determine main influencing variables on flood magnitude. Ward’s cluster and L-moments approaches were applied to several sites in the Namak-Lake basin in central Iran to delineate homogeneous regions based on site characteristics. Homogeneity test was done using L-moments-based measures. Several distributions were fitted to the regional flood data and index-flood and multiple-regression methods as two regional flood frequency methods were compared. The results of factor analysis showed that length of main waterway, compactness coefficient, mean annual precipitation, and mean annual temperature were the main variables affecting flood magnitude. The study area was divided into three regions based on the Ward’s method of clustering approach. The homogeneity test based on L-moments showed that all three regions were acceptably homogeneous. Five distributions were fitted to the annual peak flood data of three homogeneous regions. Using the L-moment ratios and the Z-statistic criteria, GEV distribution was identified as the most robust distribution among five candidate distributions for all the proposed sub-regions of the study area, and in general, it was concluded that the generalised extreme value distribution was the best-fit distribution for every three regions. The relative root mean square error (RRMSE) measure was applied for evaluating the performance of the index-flood and multiple-regression methods in comparison with the curve fitting (plotting position) method. In general, index-flood method gives more reliable estimations for various flood magnitudes of different recurrence intervals. Therefore, this method should be adopted as regional flood frequency method for the study area and the Namak-Lake basin
International Nuclear Information System (INIS)
Wu, Jie; Wang, Jianzhou; Lu, Haiyan; Dong, Yao; Lu, Xiaoxiao
2013-01-01
Highlights: ► The seasonal and trend items of the data series are forecasted separately. ► Seasonal item in the data series is verified by the Kendall τ correlation testing. ► Different regression models are applied to the trend item forecasting. ► We examine the superiority of the combined models by the quartile value comparison. ► Paired-sample T test is utilized to confirm the superiority of the combined models. - Abstract: For an energy-limited economy system, it is crucial to forecast load demand accurately. This paper devotes to 1-week-ahead daily load forecasting approach in which load demand series are predicted by employing the information of days before being similar to that of the forecast day. As well as in many nonlinear systems, seasonal item and trend item are coexisting in load demand datasets. In this paper, the existing of the seasonal item in the load demand data series is firstly verified according to the Kendall τ correlation testing method. Then in the belief of the separate forecasting to the seasonal item and the trend item would improve the forecasting accuracy, hybrid models by combining seasonal exponential adjustment method (SEAM) with the regression methods are proposed in this paper, where SEAM and the regression models are employed to seasonal and trend items forecasting respectively. Comparisons of the quartile values as well as the mean absolute percentage error values demonstrate this forecasting technique can significantly improve the accuracy though models applied to the trend item forecasting are eleven different ones. This superior performance of this separate forecasting technique is further confirmed by the paired-sample T tests
Parameters Estimation For A Patellofemoral Joint Of A Human Knee Using A Vector Method
Ciszkiewicz, A.; Knapczyk, J.
2015-08-01
Position and displacement analysis of a spherical model of a human knee joint using the vector method was presented. Sensitivity analysis and parameter estimation were performed using the evolutionary algorithm method. Computer simulations for the mechanism with estimated parameters proved the effectiveness of the prepared software. The method itself can be useful when solving problems concerning the displacement and loads analysis in the knee joint.
The influence of deflation vectors at interfaces on the deflated conjugate gradient method
Vermolen, F.J.; Vuik, C.
2001-01-01
We investigate the influence of the value of deflation vectors at interfaces on the rate of convergence of preconditioned conjugate gradient methods. Our set-up is a Laplace problem in two dimensions with continuous or discontinuous coeffcients that vary in several orders of magnitude. In the
An Empirical Method to Fuse Partially Overlapping State Vectors for Distributed State Estimation
Sijs, J.; Hanebeck, U.; Noack, B.
2013-01-01
State fusion is a method for merging multiple estimates of the same state into a single fused estimate. Dealing with multiple estimates is one of the main concerns in distributed state estimation, where an estimated value of the desired state vector is computed in each node of a networked system.
Vector method for strain estimation in phase-sensitive optical coherence elastography
Matveyev, A. L.; Matveev, L. A.; Sovetsky, A. A.; Gelikonov, G. V.; Moiseev, A. A.; Zaitsev, V. Y.
2018-06-01
A noise-tolerant approach to strain estimation in phase-sensitive optical coherence elastography, robust to decorrelation distortions, is discussed. The method is based on evaluation of interframe phase-variation gradient, but its main feature is that the phase is singled out at the very last step of the gradient estimation. All intermediate steps operate with complex-valued optical coherence tomography (OCT) signals represented as vectors in the complex plane (hence, we call this approach the ‘vector’ method). In comparison with such a popular method as least-square fitting of the phase-difference slope over a selected region (even in the improved variant with amplitude weighting for suppressing small-amplitude noisy pixels), the vector approach demonstrates superior tolerance to both additive noise in the receiving system and speckle-decorrelation caused by tissue straining. Another advantage of the vector approach is that it obviates the usual necessity of error-prone phase unwrapping. Here, special attention is paid to modifications of the vector method that make it especially suitable for processing deformations with significant lateral inhomogeneity, which often occur in real situations. The method’s advantages are demonstrated using both simulated and real OCT scans obtained during reshaping of a collagenous tissue sample irradiated by an IR laser beam producing complex spatially inhomogeneous deformations.
Directory of Open Access Journals (Sweden)
Bangyong Sun
2014-01-01
Full Text Available The polynomial regression method is employed to calculate the relationship of device color space and CIE color space for color characterization, and the performance of different expressions with specific parameters is evaluated. Firstly, the polynomial equation for color conversion is established and the computation of polynomial coefficients is analysed. And then different forms of polynomial equations are used to calculate the RGB and CMYK’s CIE color values, while the corresponding color errors are compared. At last, an optimal polynomial expression is obtained by analysing several related parameters during color conversion, including polynomial numbers, the degree of polynomial terms, the selection of CIE visual spaces, and the linearization.
Real-time prediction of respiratory motion based on local regression methods
International Nuclear Information System (INIS)
Ruan, D; Fessler, J A; Balter, J M
2007-01-01
Recent developments in modulation techniques enable conformal delivery of radiation doses to small, localized target volumes. One of the challenges in using these techniques is real-time tracking and predicting target motion, which is necessary to accommodate system latencies. For image-guided-radiotherapy systems, it is also desirable to minimize sampling rates to reduce imaging dose. This study focuses on predicting respiratory motion, which can significantly affect lung tumours. Predicting respiratory motion in real-time is challenging, due to the complexity of breathing patterns and the many sources of variability. We propose a prediction method based on local regression. There are three major ingredients of this approach: (1) forming an augmented state space to capture system dynamics, (2) local regression in the augmented space to train the predictor from previous observation data using semi-periodicity of respiratory motion, (3) local weighting adjustment to incorporate fading temporal correlations. To evaluate prediction accuracy, we computed the root mean square error between predicted tumor motion and its observed location for ten patients. For comparison, we also investigated commonly used predictive methods, namely linear prediction, neural networks and Kalman filtering to the same data. The proposed method reduced the prediction error for all imaging rates and latency lengths, particularly for long prediction lengths
A NEW METHOD OF CHANNEL FRICTION INVERSION BASED ON KALMAN FILTER WITH UNKNOWN PARAMETER VECTOR
Institute of Scientific and Technical Information of China (English)
CHENG Wei-ping; MAO Gen-hai; LIU Guo-hua
2005-01-01
Channel friction is an important parameter in hydraulic analysis.A channel friction parameter inversion method based on Kalman Filter with unknown parameter vector is proposed.Numerical simulations indicate that when the number of monitoring stations exceeds a critical value, the solution is hardly affected.In addition, Kalman Filter with unknown parameter vector is effective only at unsteady state.For the nonlinear equations, computations of sensitivity matrices are time-costly.Two simplified measures can reduce computing time, but not influence the results.One is to reduce sensitivity matrix analysis time, the other is to substitute for sensitivity matrix.
Local regression type methods applied to the study of geophysics and high frequency financial data
Mariani, M. C.; Basu, K.
2014-09-01
In this work we applied locally weighted scatterplot smoothing techniques (Lowess/Loess) to Geophysical and high frequency financial data. We first analyze and apply this technique to the California earthquake geological data. A spatial analysis was performed to show that the estimation of the earthquake magnitude at a fixed location is very accurate up to the relative error of 0.01%. We also applied the same method to a high frequency data set arising in the financial sector and obtained similar satisfactory results. The application of this approach to the two different data sets demonstrates that the overall method is accurate and efficient, and the Lowess approach is much more desirable than the Loess method. The previous works studied the time series analysis; in this paper our local regression models perform a spatial analysis for the geophysics data providing different information. For the high frequency data, our models estimate the curve of best fit where data are dependent on time.
Wolstenholme, E Œ
1978-01-01
Elementary Vectors, Third Edition serves as an introductory course in vector analysis and is intended to present the theoretical and application aspects of vectors. The book covers topics that rigorously explain and provide definitions, principles, equations, and methods in vector analysis. Applications of vector methods to simple kinematical and dynamical problems; central forces and orbits; and solutions to geometrical problems are discussed as well. This edition of the text also provides an appendix, intended for students, which the author hopes to bridge the gap between theory and appl
Wong, Jacklyn; Bayoh, Nabie; Olang, George; Killeen, Gerry F; Hamel, Mary J; Vulule, John M; Gimnig, John E
2013-04-30
Operational vector sampling methods lack standardization, making quantitative comparisons of malaria transmission across different settings difficult. Human landing catch (HLC) is considered the research gold standard for measuring human-mosquito contact, but is unsuitable for large-scale sampling. This study assessed mosquito catch rates of CDC light trap (CDC-LT), Ifakara tent trap (ITT), window exit trap (WET), pot resting trap (PRT), and box resting trap (BRT) relative to HLC in western Kenya to 1) identify appropriate methods for operational sampling in this region, and 2) contribute to a larger, overarching project comparing standardized evaluations of vector trapping methods across multiple countries. Mosquitoes were collected from June to July 2009 in four districts: Rarieda, Kisumu West, Nyando, and Rachuonyo. In each district, all trapping methods were rotated 10 times through three houses in a 3 × 3 Latin Square design. Anophelines were identified by morphology and females classified as fed or non-fed. Anopheles gambiae s.l. were further identified as Anopheles gambiae s.s. or Anopheles arabiensis by PCR. Relative catch rates were estimated by negative binomial regression. When data were pooled across all four districts, catch rates (relative to HLC indoor) for An. gambiae s.l (95.6% An. arabiensis, 4.4% An. gambiae s.s) were high for HLC outdoor (RR = 1.01), CDC-LT (RR = 1.18), and ITT (RR = 1.39); moderate for WET (RR = 0.52) and PRT outdoor (RR = 0.32); and low for all remaining types of resting traps (PRT indoor, BRT indoor, and BRT outdoor; RR < 0.08 for all). For Anopheles funestus, relative catch rates were high for ITT (RR = 1.21); moderate for HLC outdoor (RR = 0.47), CDC-LT (RR = 0.69), and WET (RR = 0.49); and low for all resting traps (RR < 0.02 for all). At finer geographic scales, however, efficacy of each trap type varied from district to district. ITT, CDC-LT, and WET appear to be effective methods for large-scale vector sampling in
A method for attitude measurement of a test vehicle based on the tracking of vectors
International Nuclear Information System (INIS)
Yang, Ning; Yang, Ming; Huo, Ju
2015-01-01
In the vehicle simulation test, in order to improve the measuring precision for the attitude of a test vehicle, a measuring method based on the vectors of light beams is presented, in which light beams are mounted on the test vehicle as the cooperation target, and the attitude of the test vehicle is calculated with the light beams’ vectors in the test vehicle’s coordinate system and the world coordinate system. Meanwhile, in order to expand the measuring range of the attitude parameters, cooperation targets and light beams in each cooperation target are increased. On this basis, the concept of an attitude calculation container is defined, and the selection method for the attitude calculation container that participates in the calculation is given. Simultaneously, the vectors of light beams are tracked so as to ensure the normal calculation of the attitude parameters. The experiments results show that this measuring method based on the tracking of vectors can achieve the high precision and wide range of measurement for the attitude of the test vehicle. (paper)
Geographically weighted regression based methods for merging satellite and gauge precipitation
Chao, Lijun; Zhang, Ke; Li, Zhijia; Zhu, Yuelong; Wang, Jingfeng; Yu, Zhongbo
2018-03-01
Real-time precipitation data with high spatiotemporal resolutions are crucial for accurate hydrological forecasting. To improve the spatial resolution and quality of satellite precipitation, a three-step satellite and gauge precipitation merging method was formulated in this study: (1) bilinear interpolation is first applied to downscale coarser satellite precipitation to a finer resolution (PS); (2) the (mixed) geographically weighted regression methods coupled with a weighting function are then used to estimate biases of PS as functions of gauge observations (PO) and PS; and (3) biases of PS are finally corrected to produce a merged precipitation product. Based on the above framework, eight algorithms, a combination of two geographically weighted regression methods and four weighting functions, are developed to merge CMORPH (CPC MORPHing technique) precipitation with station observations on a daily scale in the Ziwuhe Basin of China. The geographical variables (elevation, slope, aspect, surface roughness, and distance to the coastline) and a meteorological variable (wind speed) were used for merging precipitation to avoid the artificial spatial autocorrelation resulting from traditional interpolation methods. The results show that the combination of the MGWR and BI-square function (MGWR-BI) has the best performance (R = 0.863 and RMSE = 7.273 mm/day) among the eight algorithms. The MGWR-BI algorithm was then applied to produce hourly merged precipitation product. Compared to the original CMORPH product (R = 0.208 and RMSE = 1.208 mm/hr), the quality of the merged data is significantly higher (R = 0.724 and RMSE = 0.706 mm/hr). The developed merging method not only improves the spatial resolution and quality of the satellite product but also is easy to implement, which is valuable for hydrological modeling and other applications.
Directory of Open Access Journals (Sweden)
Adi Syahputra
2014-03-01
Full Text Available Quantitative structure activity relationship (QSAR for 21 insecticides of phthalamides containing hydrazone (PCH was studied using multiple linear regression (MLR, principle component regression (PCR and artificial neural network (ANN. Five descriptors were included in the model for MLR and ANN analysis, and five latent variables obtained from principle component analysis (PCA were used in PCR analysis. Calculation of descriptors was performed using semi-empirical PM6 method. ANN analysis was found to be superior statistical technique compared to the other methods and gave a good correlation between descriptors and activity (r2 = 0.84. Based on the obtained model, we have successfully designed some new insecticides with higher predicted activity than those of previously synthesized compounds, e.g.2-(decalinecarbamoyl-5-chloro-N’-((5-methylthiophen-2-ylmethylene benzohydrazide, 2-(decalinecarbamoyl-5-chloro-N’-((thiophen-2-yl-methylene benzohydrazide and 2-(decaline carbamoyl-N’-(4-fluorobenzylidene-5-chlorobenzohydrazide with predicted log LC50 of 1.640, 1.672, and 1.769 respectively.
Dual linear structured support vector machine tracking method via scale correlation filter
Li, Weisheng; Chen, Yanquan; Xiao, Bin; Feng, Chen
2018-01-01
Adaptive tracking-by-detection methods based on structured support vector machine (SVM) performed well on recent visual tracking benchmarks. However, these methods did not adopt an effective strategy of object scale estimation, which limits the overall tracking performance. We present a tracking method based on a dual linear structured support vector machine (DLSSVM) with a discriminative scale correlation filter. The collaborative tracker comprised of a DLSSVM model and a scale correlation filter obtains good results in tracking target position and scale estimation. The fast Fourier transform is applied for detection. Extensive experiments show that our tracking approach outperforms many popular top-ranking trackers. On a benchmark including 100 challenging video sequences, the average precision of the proposed method is 82.8%.
Nonparametric Methods in Astronomy: Think, Regress, Observe—Pick Any Three
Steinhardt, Charles L.; Jermyn, Adam S.
2018-02-01
Telescopes are much more expensive than astronomers, so it is essential to minimize required sample sizes by using the most data-efficient statistical methods possible. However, the most commonly used model-independent techniques for finding the relationship between two variables in astronomy are flawed. In the worst case they can lead without warning to subtly yet catastrophically wrong results, and even in the best case they require more data than necessary. Unfortunately, there is no single best technique for nonparametric regression. Instead, we provide a guide for how astronomers can choose the best method for their specific problem and provide a python library with both wrappers for the most useful existing algorithms and implementations of two new algorithms developed here.
An Application of Robust Method in Multiple Linear Regression Model toward Credit Card Debt
Amira Azmi, Nur; Saifullah Rusiman, Mohd; Khalid, Kamil; Roslan, Rozaini; Sufahani, Suliadi; Mohamad, Mahathir; Salleh, Rohayu Mohd; Hamzah, Nur Shamsidah Amir
2018-04-01
Credit card is a convenient alternative replaced cash or cheque, and it is essential component for electronic and internet commerce. In this study, the researchers attempt to determine the relationship and significance variables between credit card debt and demographic variables such as age, household income, education level, years with current employer, years at current address, debt to income ratio and other debt. The provided data covers 850 customers information. There are three methods that applied to the credit card debt data which are multiple linear regression (MLR) models, MLR models with least quartile difference (LQD) method and MLR models with mean absolute deviation method. After comparing among three methods, it is found that MLR model with LQD method became the best model with the lowest value of mean square error (MSE). According to the final model, it shows that the years with current employer, years at current address, household income in thousands and debt to income ratio are positively associated with the amount of credit debt. Meanwhile variables for age, level of education and other debt are negatively associated with amount of credit debt. This study may serve as a reference for the bank company by using robust methods, so that they could better understand their options and choice that is best aligned with their goals for inference regarding to the credit card debt.
A temporal subtraction method for thoracic CT images based on generalized gradient vector flow
International Nuclear Information System (INIS)
Miyake, Noriaki; Kim, H.; Maeda, Shinya; Itai, Yoshinori; Tan, J.K.; Ishikawa, Seiji; Katsuragawa, Shigehiko
2010-01-01
A temporal subtraction image, which is obtained by subtraction of a previous image from a current one, can be used for enhancing interval changes (such as formation of new lesions and changes in existing abnormalities) on medical images by removing most of the normal structures. If image registration is incorrect, not only the interval changes but also the normal structures would be appeared as some artifacts on the temporal subtraction image. In a temporal subtraction technique for 2-D X-ray image, the effectiveness is shown through a lot of clinical evaluation experiments, and practical use is advancing. Moreover, the MDCT (Multi-Detector row Computed Tomography) can easily introduced on medical field, the development of a temporal subtraction for thoracic CT Images is expected. In our study, a temporal subtraction technique for thoracic CT Images is developed. As the technique, the vector fields are described by use of GGVF (Generalized Gradient Vector Flow) from the previous and current CT images. Afterwards, VOI (Volume of Interest) are set up on the previous and current CT image pairs. The shift vectors are calculated by using nearest neighbor matching of the vector fields in these VOIs. The search kernel on previous CT image is set up from the obtained shift vector. The previous CT voxel which resemble standard the current voxel is detected by voxel value and vector of the GGVF in the kernel. And, the previous CT image is transformed to the same coordinate of standard voxel. Finally, temporal subtraction image is made by subtraction of a warping image from a current one. To verify the proposal method, the result of application to 7 cases and the effectiveness are described. (author)
Estimating HIES Data through Ratio and Regression Methods for Different Sampling Designs
Directory of Open Access Journals (Sweden)
Faqir Muhammad
2007-01-01
Full Text Available In this study, comparison has been made for different sampling designs, using the HIES data of North West Frontier Province (NWFP for 2001-02 and 1998-99 collected from the Federal Bureau of Statistics, Statistical Division, Government of Pakistan, Islamabad. The performance of the estimators has also been considered using bootstrap and Jacknife. A two-stage stratified random sample design is adopted by HIES. In the first stage, enumeration blocks and villages are treated as the first stage Primary Sampling Units (PSU. The sample PSU’s are selected with probability proportional to size. Secondary Sampling Units (SSU i.e., households are selected by systematic sampling with a random start. They have used a single study variable. We have compared the HIES technique with some other designs, which are: Stratified Simple Random Sampling. Stratified Systematic Sampling. Stratified Ranked Set Sampling. Stratified Two Phase Sampling. Ratio and Regression methods were applied with two study variables, which are: Income (y and Household sizes (x. Jacknife and Bootstrap are used for variance replication. Simple Random Sampling with sample size (462 to 561 gave moderate variances both by Jacknife and Bootstrap. By applying Systematic Sampling, we received moderate variance with sample size (467. In Jacknife with Systematic Sampling, we obtained variance of regression estimator greater than that of ratio estimator for a sample size (467 to 631. At a sample size (952 variance of ratio estimator gets greater than that of regression estimator. The most efficient design comes out to be Ranked set sampling compared with other designs. The Ranked set sampling with jackknife and bootstrap, gives minimum variance even with the smallest sample size (467. Two Phase sampling gave poor performance. Multi-stage sampling applied by HIES gave large variances especially if used with a single study variable.
Robust Methods for Moderation Analysis with a Two-Level Regression Model.
Yang, Miao; Yuan, Ke-Hai
2016-01-01
Moderation analysis has many applications in social sciences. Most widely used estimation methods for moderation analysis assume that errors are normally distributed and homoscedastic. When these assumptions are not met, the results from a classical moderation analysis can be misleading. For more reliable moderation analysis, this article proposes two robust methods with a two-level regression model when the predictors do not contain measurement error. One method is based on maximum likelihood with Student's t distribution and the other is based on M-estimators with Huber-type weights. An algorithm for obtaining the robust estimators is developed. Consistent estimates of standard errors of the robust estimators are provided. The robust approaches are compared against normal-distribution-based maximum likelihood (NML) with respect to power and accuracy of parameter estimates through a simulation study. Results show that the robust approaches outperform NML under various distributional conditions. Application of the robust methods is illustrated through a real data example. An R program is developed and documented to facilitate the application of the robust methods.
Applications of Monte Carlo method to nonlinear regression of rheological data
Kim, Sangmo; Lee, Junghaeng; Kim, Sihyun; Cho, Kwang Soo
2018-02-01
In rheological study, it is often to determine the parameters of rheological models from experimental data. Since both rheological data and values of the parameters vary in logarithmic scale and the number of the parameters is quite large, conventional method of nonlinear regression such as Levenberg-Marquardt (LM) method is usually ineffective. The gradient-based method such as LM is apt to be caught in local minima which give unphysical values of the parameters whenever the initial guess of the parameters is far from the global optimum. Although this problem could be solved by simulated annealing (SA), the Monte Carlo (MC) method needs adjustable parameter which could be determined in ad hoc manner. We suggest a simplified version of SA, a kind of MC methods which results in effective values of the parameters of most complicated rheological models such as the Carreau-Yasuda model of steady shear viscosity, discrete relaxation spectrum and zero-shear viscosity as a function of concentration and molecular weight.
A simple method of equine limb force vector analysis and its potential applications
Directory of Open Access Journals (Sweden)
Sarah Jane Hobbs
2018-02-01
Full Text Available Background Ground reaction forces (GRF measured during equine gait analysis are typically evaluated by analyzing discrete values obtained from continuous force-time data for the vertical, longitudinal and transverse GRF components. This paper describes a simple, temporo-spatial method of displaying and analyzing sagittal plane GRF vectors. In addition, the application of statistical parametric mapping (SPM is introduced to analyse differences between contra-lateral fore and hindlimb force-time curves throughout the stance phase. The overall aim of the study was to demonstrate alternative methods of evaluating functional (asymmetry within horses. Methods GRF and kinematic data were collected from 10 horses trotting over a series of four force plates (120 Hz. The kinematic data were used to determine clean hoof contacts. The stance phase of each hoof was determined using a 50 N threshold. Vertical and longitudinal GRF for each stance phase were plotted both as force-time curves and as force vector diagrams in which vectors originating at the centre of pressure on the force plate were drawn at intervals of 8.3 ms for the duration of stance. Visual evaluation was facilitated by overlay of the vector diagrams for different limbs. Summary vectors representing the magnitude (VecMag and direction (VecAng of the mean force over the entire stance phase were superimposed on the force vector diagram. Typical measurements extracted from the force-time curves (peak forces, impulses were compared with VecMag and VecAng using partial correlation (controlling for speed. Paired samples t-tests (left v. right diagonal pair comparison and high v. low vertical force diagonal pair comparison were performed on discrete and vector variables using traditional methods and Hotelling’s T2 tests on normalized stance phase data using SPM. Results Evidence from traditional statistical tests suggested that VecMag is more influenced by the vertical force and impulse, whereas
Logistic Regression and Path Analysis Method to Analyze Factors influencing Students’ Achievement
Noeryanti, N.; Suryowati, K.; Setyawan, Y.; Aulia, R. R.
2018-04-01
Students' academic achievement cannot be separated from the influence of two factors namely internal and external factors. The first factors of the student (internal factors) consist of intelligence (X1), health (X2), interest (X3), and motivation of students (X4). The external factors consist of family environment (X5), school environment (X6), and society environment (X7). The objects of this research are eighth grade students of the school year 2016/2017 at SMPN 1 Jiwan Madiun sampled by using simple random sampling. Primary data are obtained by distributing questionnaires. The method used in this study is binary logistic regression analysis that aims to identify internal and external factors that affect student’s achievement and how the trends of them. Path Analysis was used to determine the factors that influence directly, indirectly or totally on student’s achievement. Based on the results of binary logistic regression, variables that affect student’s achievement are interest and motivation. And based on the results obtained by path analysis, factors that have a direct impact on student’s achievement are students’ interest (59%) and students’ motivation (27%). While the factors that have indirect influences on students’ achievement, are family environment (97%) and school environment (37).
Wulandari, S. P.; Salamah, M.; Rositawati, A. F. D.
2018-04-01
Food security is the condition where the food fulfilment is managed well for the country till the individual. Indonesia is one of the country which has the commitment to create the food security becomes main priority. However, the food necessity becomes common thing means that it doesn’t care about nutrient standard and the health condition of family member, so in the fulfilment of food necessity also has to consider the disease suffered by the family member, one of them is pulmonary tuberculosa. From that reasons, this research is conducted to know the factors which influence on household food security status which suffered from pulmonary tuberculosis in the coastal area of Surabaya by using binary logistic regression method. The analysis result by using binary logistic regression shows that the variables wife latest education, house density and spacious house ventilation significantly affect on household food security status which suffered from pulmonary tuberculosis in the coastal area of Surabaya, where the wife education level is University/equivalent, the house density is eligible or 8 m2/person and spacious house ventilation 10% of the floor area has the opportunity to become food secure households amounted to 0.911089. While the chance of becoming food insecure households amounted to 0.088911. The model household food security status which suffered from pulmonary tuberculosis in the coastal area of Surabaya has been conformable, and the overall percentages of those classifications are at 71.8%.
International Nuclear Information System (INIS)
Gupta, N
2008-01-01
3013 containers are designed in accordance with the DOE-STD-3013-2004. These containers are qualified to store plutonium (Pu) bearing materials such as PuO2 for 50 years. DOT shipping packages such as the 9975 are used to store the 3013 containers in the K-Area Material Storage (KAMS) facility at Savannah River Site (SRS). DOE-STD-3013-2004 requires that a comprehensive surveillance program be set up to ensure that the 3013 container design parameters are not violated during the long term storage. To ensure structural integrity of the 3013 containers, thermal analyses using finite element models were performed to predict the contents and component temperatures for different but well defined parameters such as storage ambient temperature, PuO 2 density, fill heights, weights, and thermal loading. Interpolation is normally used to calculate temperatures if the actual parameter values are different from the analyzed values. A statistical analysis technique using regression methods is proposed to develop simple polynomial relations to predict temperatures for the actual parameter values found in the containers. The analysis shows that regression analysis is a powerful tool to develop simple relations to assess component temperatures
Directory of Open Access Journals (Sweden)
Wen-Gang Zhou
2015-06-01
Full Text Available With the deep research of genomics and proteomics, the number of new protein sequences has expanded rapidly. With the obvious shortcomings of high cost and low efficiency of the traditional experimental method, the calculation method for protein localization prediction has attracted a lot of attention due to its convenience and low cost. In the machine learning techniques, neural network and support vector machine (SVM are often used as learning tools. Due to its complete theoretical framework, SVM has been widely applied. In this paper, we make an improvement on the existing machine learning algorithm of the support vector machine algorithm, and a new improved algorithm has been developed, combined with Bayesian algorithms. The proposed algorithm can improve calculation efficiency, and defects of the original algorithm are eliminated. According to the verification, the method has proved to be valid. At the same time, it can reduce calculation time and improve prediction efficiency.
Chui, Siu Lit; Lu, Ya Yan
2004-03-01
Wide-angle full-vector beam propagation methods (BPMs) for three-dimensional wave-guiding structures can be derived on the basis of rational approximants of a square root operator or its exponential (i.e., the one-way propagator). While the less accurate BPM based on the slowly varying envelope approximation can be efficiently solved by the alternating direction implicit (ADI) method, the wide-angle variants involve linear systems that are more difficult to handle. We present an efficient solver for these linear systems that is based on a Krylov subspace method with an ADI preconditioner. The resulting wide-angle full-vector BPM is used to simulate the propagation of wave fields in a Y branch and a taper.
Multi-step polynomial regression method to model and forecast malaria incidence.
Directory of Open Access Journals (Sweden)
Chandrajit Chatterjee
Full Text Available Malaria is one of the most severe problems faced by the world even today. Understanding the causative factors such as age, sex, social factors, environmental variability etc. as well as underlying transmission dynamics of the disease is important for epidemiological research on malaria and its eradication. Thus, development of suitable modeling approach and methodology, based on the available data on the incidence of the disease and other related factors is of utmost importance. In this study, we developed a simple non-linear regression methodology in modeling and forecasting malaria incidence in Chennai city, India, and predicted future disease incidence with high confidence level. We considered three types of data to develop the regression methodology: a longer time series data of Slide Positivity Rates (SPR of malaria; a smaller time series data (deaths due to Plasmodium vivax of one year; and spatial data (zonal distribution of P. vivax deaths for the city along with the climatic factors, population and previous incidence of the disease. We performed variable selection by simple correlation study, identification of the initial relationship between variables through non-linear curve fitting and used multi-step methods for induction of variables in the non-linear regression analysis along with applied Gauss-Markov models, and ANOVA for testing the prediction, validity and constructing the confidence intervals. The results execute the applicability of our method for different types of data, the autoregressive nature of forecasting, and show high prediction power for both SPR and P. vivax deaths, where the one-lag SPR values plays an influential role and proves useful for better prediction. Different climatic factors are identified as playing crucial role on shaping the disease curve. Further, disease incidence at zonal level and the effect of causative factors on different zonal clusters indicate the pattern of malaria prevalence in the city
A New Global Regression Analysis Method for the Prediction of Wind Tunnel Model Weight Corrections
Ulbrich, Norbert Manfred; Bridge, Thomas M.; Amaya, Max A.
2014-01-01
A new global regression analysis method is discussed that predicts wind tunnel model weight corrections for strain-gage balance loads during a wind tunnel test. The method determines corrections by combining "wind-on" model attitude measurements with least squares estimates of the model weight and center of gravity coordinates that are obtained from "wind-off" data points. The method treats the least squares fit of the model weight separate from the fit of the center of gravity coordinates. Therefore, it performs two fits of "wind- off" data points and uses the least squares estimator of the model weight as an input for the fit of the center of gravity coordinates. Explicit equations for the least squares estimators of the weight and center of gravity coordinates are derived that simplify the implementation of the method in the data system software of a wind tunnel. In addition, recommendations for sets of "wind-off" data points are made that take typical model support system constraints into account. Explicit equations of the confidence intervals on the model weight and center of gravity coordinates and two different error analyses of the model weight prediction are also discussed in the appendices of the paper.
Hwang, Kyu-Baek; Lee, In-Hee; Park, Jin-Ho; Hambuch, Tina; Choe, Yongjoon; Kim, MinHyeok; Lee, Kyungjoon; Song, Taemin; Neu, Matthew B; Gupta, Neha; Kohane, Isaac S; Green, Robert C; Kong, Sek Won
2014-08-01
As whole genome sequencing (WGS) uncovers variants associated with rare and common diseases, an immediate challenge is to minimize false-positive findings due to sequencing and variant calling errors. False positives can be reduced by combining results from orthogonal sequencing methods, but costly. Here, we present variant filtering approaches using logistic regression (LR) and ensemble genotyping to minimize false positives without sacrificing sensitivity. We evaluated the methods using paired WGS datasets of an extended family prepared using two sequencing platforms and a validated set of variants in NA12878. Using LR or ensemble genotyping based filtering, false-negative rates were significantly reduced by 1.1- to 17.8-fold at the same levels of false discovery rates (5.4% for heterozygous and 4.5% for homozygous single nucleotide variants (SNVs); 30.0% for heterozygous and 18.7% for homozygous insertions; 25.2% for heterozygous and 16.6% for homozygous deletions) compared to the filtering based on genotype quality scores. Moreover, ensemble genotyping excluded > 98% (105,080 of 107,167) of false positives while retaining > 95% (897 of 937) of true positives in de novo mutation (DNM) discovery in NA12878, and performed better than a consensus method using two sequencing platforms. Our proposed methods were effective in prioritizing phenotype-associated variants, and an ensemble genotyping would be essential to minimize false-positive DNM candidates. © 2014 WILEY PERIODICALS, INC.
International Nuclear Information System (INIS)
Sun Li-Sha; Kang Xiao-Yun; Zhang Qiong; Lin Lan-Xin
2011-01-01
Based on symbolic dynamics, a novel computationally efficient algorithm is proposed to estimate the unknown initial vectors of globally coupled map lattices (CMLs). It is proved that not all inverse chaotic mapping functions are satisfied for contraction mapping. It is found that the values in phase space do not always converge on their initial values with respect to sufficient backward iteration of the symbolic vectors in terms of global convergence or divergence (CD). Both CD property and the coupling strength are directly related to the mapping function of the existing CML. Furthermore, the CD properties of Logistic, Bernoulli, and Tent chaotic mapping functions are investigated and compared. Various simulation results and the performances of the initial vector estimation with different signal-to-noise ratios (SNRs) are also provided to confirm the proposed algorithm. Finally, based on the spatiotemporal chaotic characteristics of the CML, the conditions of estimating the initial vectors using symbolic dynamics are discussed. The presented method provides both theoretical and experimental results for better understanding and characterizing the behaviours of spatiotemporal chaotic systems. (general)
Sun, Li-Sha; Kang, Xiao-Yun; Zhang, Qiong; Lin, Lan-Xin
2011-12-01
Based on symbolic dynamics, a novel computationally efficient algorithm is proposed to estimate the unknown initial vectors of globally coupled map lattices (CMLs). It is proved that not all inverse chaotic mapping functions are satisfied for contraction mapping. It is found that the values in phase space do not always converge on their initial values with respect to sufficient backward iteration of the symbolic vectors in terms of global convergence or divergence (CD). Both CD property and the coupling strength are directly related to the mapping function of the existing CML. Furthermore, the CD properties of Logistic, Bernoulli, and Tent chaotic mapping functions are investigated and compared. Various simulation results and the performances of the initial vector estimation with different signal-to-noise ratios (SNRs) are also provided to confirm the proposed algorithm. Finally, based on the spatiotemporal chaotic characteristics of the CML, the conditions of estimating the initial vectors using symbolic dynamics are discussed. The presented method provides both theoretical and experimental results for better understanding and characterizing the behaviours of spatiotemporal chaotic systems.
The crux of the method: assumptions in ordinary least squares and logistic regression.
Long, Rebecca G
2008-10-01
Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.
Collective vector method for calculation of E1 moments in atomic transition arrays
International Nuclear Information System (INIS)
Bloom, S.D.; Goldberg, A.
1985-10-01
The CV (collective vector) method for calculating E1 moments for a transition array is described and applied in two cases, herein denoted Z26A and Z26B, pertaining to two different configurations of iron VI. The basic idea of the method is to create a CV from each of the parent (''initial state'') state-vectors of the transition array by application of the E1 operator. The moments of each of these CV's, referred to the parent energy, are then the rigorous moments for that parent, requiring no state decomposition of the manifold of daughter state-vectors. Since, in cases of practical interest, the daughter manifold can be orders of magnitude larger in size than the parent manifold, this makes possible the calculation of many moments higher than the second in situations hitherto unattainable via standard methods. The combination of the moments of all the parents, with proper statistical weighting, then yields the transition array moments from which the transition strength distribution can be derived by various procedures. We describe two of these procedures: (1) The well-known GC (Gram-Charlier) expansion in terms of Hermite polynomials, (2) The Lanczos algorithm or Stieltjes imaging method, also called herein the delta expansion. Application is made in the cases of Z26A (50 lines) and Z26B (5523 lines) and the relative merits and shortcomings of the two procedures are discussed. 10 refs., 15 figs., 2 tabs
Directory of Open Access Journals (Sweden)
Ang Gong
2015-12-01
Full Text Available For Global Navigation Satellite System (GNSS single frequency, single epoch attitude determination, this paper proposes a new reliable method with baseline vector constraint. First, prior knowledge of baseline length, heading, and pitch obtained from other navigation equipment or sensors are used to reconstruct objective function rigorously. Then, searching strategy is improved. It substitutes gradually Enlarged ellipsoidal search space for non-ellipsoidal search space to ensure correct ambiguity candidates are within it and make the searching process directly be carried out by least squares ambiguity decorrelation algorithm (LAMBDA method. For all vector candidates, some ones are further eliminated by derived approximate inequality, which accelerates the searching process. Experimental results show that compared to traditional method with only baseline length constraint, this new method can utilize a priori baseline three-dimensional knowledge to fix ambiguity reliably and achieve a high success rate. Experimental tests also verify it is not very sensitive to baseline vector error and can perform robustly when angular error is not great.
Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V
2012-01-01
In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
a Method for the Seamlines Network Automatic Selection Based on Building Vector
Li, P.; Dong, Y.; Hu, Y.; Li, X.; Tan, P.
2018-04-01
In order to improve the efficiency of large scale orthophoto production of city, this paper presents a method for automatic selection of seamlines network in large scale orthophoto based on the buildings' vector. Firstly, a simple model of the building is built by combining building's vector, height and DEM, and the imaging area of the building on single DOM is obtained. Then, the initial Voronoi network of the measurement area is automatically generated based on the positions of the bottom of all images. Finally, the final seamlines network is obtained by optimizing all nodes and seamlines in the network automatically based on the imaging areas of the buildings. The experimental results show that the proposed method can not only get around the building seamlines network quickly, but also remain the Voronoi network' characteristics of projection distortion minimum theory, which can solve the problem of automatic selection of orthophoto seamlines network in image mosaicking effectively.
Dinç, Erdal; Ustündağ, Ozgür; Baleanu, Dumitru
2010-08-01
The sole use of pyridoxine hydrochloride during treatment of tuberculosis gives rise to pyridoxine deficiency. Therefore, a combination of pyridoxine hydrochloride and isoniazid is used in pharmaceutical dosage form in tuberculosis treatment to reduce this side effect. In this study, two chemometric methods, partial least squares (PLS) and principal component regression (PCR), were applied to the simultaneous determination of pyridoxine (PYR) and isoniazid (ISO) in their tablets. A concentration training set comprising binary mixtures of PYR and ISO consisting of 20 different combinations were randomly prepared in 0.1 M HCl. Both multivariate calibration models were constructed using the relationships between the concentration data set (concentration data matrix) and absorbance data matrix in the spectral region 200-330 nm. The accuracy and the precision of the proposed chemometric methods were validated by analyzing synthetic mixtures containing the investigated drugs. The recovery results obtained by applying PCR and PLS calibrations to the artificial mixtures were found between 100.0 and 100.7%. Satisfactory results obtained by applying the PLS and PCR methods to both artificial and commercial samples were obtained. The results obtained in this manuscript strongly encourage us to use them for the quality control and the routine analysis of the marketing tablets containing PYR and ISO drugs. Copyright © 2010 John Wiley & Sons, Ltd.
International Nuclear Information System (INIS)
Tsushima, Motoo; Fujii, Shigeki; Yutani, Chikao; Yamamoto, Akira; Naitoh, Hiroaki.
1990-01-01
We evaluated the wall thickening and stenosis rate (ASI), the calcification rate (ACI), and the wall thickening and calcification stenosis rate (SCI) of the lower abdominal aorta calculated by the 12 sector method from simple or enhanced computed tomography. The intra-observer variation of the calculation of ASI was 5.7% and that of ACI was 2.4%. In 9 patients who underwent an autopsy examination, ACI was significantly correlated with the rate of the calcification dimension to the whole objective area of the abdominal aorta (r=0.856, p<0.01). However, there were no correlations between ASI and the surface involvement or the atherosclerotic index obtained by the point-counting method of the autopsy materials. In the analysis of 40 patients with atherosclerotic vascular diseases, ASI and ACI were also highly correlated with the percentage volume of the arterial wall in relation to the whole volume of the observed artery (r=0.852, p<0.0001) and also the percentage calcification volume (r=0.913, p<0.0001) calculated by the computed method, respectively. The percentage of atherosclerotic vascular diseases increased in the group of both high ASI (over 10%) and high ACI (over 20%). We used SCI as a reliable index when the progression and regression of atherosclerosis was considered. Among patients of hypercholesterolemia consisting of 15 with familial hypercholesterolemia (FH) and 6 non-FH patients, the change of SCI (d-SCI) was significantly correlated with the change of total cholesterol concentration (d-TC) after the treatment (r=0.466, p<0.05) and the change of the right Achilles' tendon thickening (d-ATT) was also correlated with d-TC (r=0.634, p<0.005). However, no correlation between d-SCI and d-ATT was observed. In conclusion, CT indices of atherosclerosis were useful as a noninvasive quantitative diagnostic method and we were able to use them to assess the progression and regression of atherosclerosis. (author)
A simple method of equine limb force vector analysis and its potential applications.
Hobbs, Sarah Jane; Robinson, Mark A; Clayton, Hilary M
2018-01-01
Ground reaction forces (GRF) measured during equine gait analysis are typically evaluated by analyzing discrete values obtained from continuous force-time data for the vertical, longitudinal and transverse GRF components. This paper describes a simple, temporo-spatial method of displaying and analyzing sagittal plane GRF vectors. In addition, the application of statistical parametric mapping (SPM) is introduced to analyse differences between contra-lateral fore and hindlimb force-time curves throughout the stance phase. The overall aim of the study was to demonstrate alternative methods of evaluating functional (a)symmetry within horses. GRF and kinematic data were collected from 10 horses trotting over a series of four force plates (120 Hz). The kinematic data were used to determine clean hoof contacts. The stance phase of each hoof was determined using a 50 N threshold. Vertical and longitudinal GRF for each stance phase were plotted both as force-time curves and as force vector diagrams in which vectors originating at the centre of pressure on the force plate were drawn at intervals of 8.3 ms for the duration of stance. Visual evaluation was facilitated by overlay of the vector diagrams for different limbs. Summary vectors representing the magnitude (VecMag) and direction (VecAng) of the mean force over the entire stance phase were superimposed on the force vector diagram. Typical measurements extracted from the force-time curves (peak forces, impulses) were compared with VecMag and VecAng using partial correlation (controlling for speed). Paired samples t -tests (left v. right diagonal pair comparison and high v. low vertical force diagonal pair comparison) were performed on discrete and vector variables using traditional methods and Hotelling's T 2 tests on normalized stance phase data using SPM. Evidence from traditional statistical tests suggested that VecMag is more influenced by the vertical force and impulse, whereas VecAng is more influenced by the
Reflexion on linear regression trip production modelling method for ensuring good model quality
Suprayitno, Hitapriya; Ratnasari, Vita
2017-11-01
Transport Modelling is important. For certain cases, the conventional model still has to be used, in which having a good trip production model is capital. A good model can only be obtained from a good sample. Two of the basic principles of a good sampling is having a sample capable to represent the population characteristics and capable to produce an acceptable error at a certain confidence level. It seems that this principle is not yet quite understood and used in trip production modeling. Therefore, investigating the Trip Production Modelling practice in Indonesia and try to formulate a better modeling method for ensuring the Model Quality is necessary. This research result is presented as follows. Statistics knows a method to calculate span of prediction value at a certain confidence level for linear regression, which is called Confidence Interval of Predicted Value. The common modeling practice uses R2 as the principal quality measure, the sampling practice varies and not always conform to the sampling principles. An experiment indicates that small sample is already capable to give excellent R2 value and sample composition can significantly change the model. Hence, good R2 value, in fact, does not always mean good model quality. These lead to three basic ideas for ensuring good model quality, i.e. reformulating quality measure, calculation procedure, and sampling method. A quality measure is defined as having a good R2 value and a good Confidence Interval of Predicted Value. Calculation procedure must incorporate statistical calculation method and appropriate statistical tests needed. A good sampling method must incorporate random well distributed stratified sampling with a certain minimum number of samples. These three ideas need to be more developed and tested.
Directory of Open Access Journals (Sweden)
Nina L. Timofeeva
2014-01-01
Full Text Available The article presents the methodological and technical bases for the creation of regression models that adequately reflect reality. The focus is on methods of removing residual autocorrelation in models. Algorithms eliminating heteroscedasticity and autocorrelation of the regression model residuals: reweighted least squares method, the method of Cochran-Orkutta are given. A model of "pure" regression is build, as well as to compare the effect on the dependent variable of the different explanatory variables when the latter are expressed in different units, a standardized form of the regression equation. The scheme of abatement techniques of heteroskedasticity and autocorrelation for the creation of regression models specific to the social and cultural sphere is developed.
2017-12-01
Fig. 2 Simulation method; the process for one iteration of the simulation . It was repeated 250 times per combination of HR and FAR. Analysis was...distribution is unlimited. 8 Fig. 2 Simulation method; the process for one iteration of the simulation . It was repeated 250 times per combination of HR...stimuli. Simulations show that this regression method results in an unbiased and accurate estimate of target detection performance. The regression
Isa, Zakiah Mohd; Tawfiq, Omar Farouq; Noor, Norliza Mohd; Shamsudheen, Mohd Iqbal; Rijal, Omar Mohd
2010-03-01
In rehabilitating edentulous patients, selecting appropriately sized teeth in the absence of preextraction records is problematic. The purpose of this study was to investigate the relationships between some facial dimensions and widths of the maxillary anterior teeth to potentially provide a guide for tooth selection. Sixty full dentate Malaysian adults (18-36 years) representing 2 ethnic groups (Malay and Chinese), with well aligned maxillary anterior teeth and minimal attrition, participated in this study. Standardized digital images of the face, viewed frontally, were recorded. Using image analyzing software, the images were used to determine the interpupillary distance (IPD), inner canthal distance (ICD), and interalar width (IA). Widths of the 6 maxillary anterior teeth were measured directly from casts of the subjects using digital calipers. Regression analyses were conducted to measure the strength of the associations between the variables (alpha=.10). The means (standard deviations) of IPD, IA, and ICD of the subjects were 62.28 (2.47), 39.36 (3.12), and 34.36 (2.15) mm, respectively. The mesiodistal diameters of the maxillary central incisors, lateral incisors, and canines were 8.54 (0.50), 7.09 (0.48), and 7.94 (0.40) mm, respectively. The width of the central incisors was highly correlated to the IPD (r=0.99), while the widths of the lateral incisors and canines were highly correlated to a combination of IPD and IA (r=0.99 and 0.94, respectively). Using regression methods, the widths of the anterior teeth within the population tested may be predicted by a combination of the facial dimensions studied. (c) 2010 The Editorial Council of the Journal of Prosthetic Dentistry. Published by Mosby, Inc. All rights reserved.
Delwiche, Stephen R; Reeves, James B
2010-01-01
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an over reliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R(2)) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various
Geometrical Modification of Learning Vector Quantization Method for Solving Classification Problems
Directory of Open Access Journals (Sweden)
Korhan GÜNEL
2016-09-01
Full Text Available In this paper, a geometrical scheme is presented to show how to overcome an encountered problem arising from the use of generalized delta learning rule within competitive learning model. It is introduced a theoretical methodology for describing the quantization of data via rotating prototype vectors on hyper-spheres.The proposed learning algorithm is tested and verified on different multidimensional datasets including a binary class dataset and two multiclass datasets from the UCI repository, and a multiclass dataset constructed by us. The proposed method is compared with some baseline learning vector quantization variants in literature for all domains. Large number of experiments verify the performance of our proposed algorithm with acceptable accuracy and macro f1 scores.
Multi-color incomplete Cholesky conjugate gradient methods for vector computers. Ph.D. Thesis
Poole, E. L.
1986-01-01
In this research, we are concerned with the solution on vector computers of linear systems of equations, Ax = b, where A is a larger, sparse symmetric positive definite matrix. We solve the system using an iterative method, the incomplete Cholesky conjugate gradient method (ICCG). We apply a multi-color strategy to obtain p-color matrices for which a block-oriented ICCG method is implemented on the CYBER 205. (A p-colored matrix is a matrix which can be partitioned into a pXp block matrix where the diagonal blocks are diagonal matrices). This algorithm, which is based on a no-fill strategy, achieves O(N/p) length vector operations in both the decomposition of A and in the forward and back solves necessary at each iteration of the method. We discuss the natural ordering of the unknowns as an ordering that minimizes the number of diagonals in the matrix and define multi-color orderings in terms of disjoint sets of the unknowns. We give necessary and sufficient conditions to determine which multi-color orderings of the unknowns correpond to p-color matrices. A performance model is given which is used both to predict execution time for ICCG methods and also to compare an ICCG method to conjugate gradient without preconditioning or another ICCG method. Results are given from runs on the CYBER 205 at NASA's Langley Research Center for four model problems.
Whole-genome regression and prediction methods applied to plant and animal breeding
Los Campos, De G.; Hickey, J.M.; Pong-Wong, R.; Daetwyler, H.D.; Calus, M.P.L.
2013-01-01
Genomic-enabled prediction is becoming increasingly important in animal and plant breeding, and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of
Modelling infant mortality rate in Central Java, Indonesia use generalized poisson regression method
Prahutama, Alan; Sudarno
2018-05-01
The infant mortality rate is the number of deaths under one year of age occurring among the live births in a given geographical area during a given year, per 1,000 live births occurring among the population of the given geographical area during the same year. This problem needs to be addressed because it is an important element of a country’s economic development. High infant mortality rate will disrupt the stability of a country as it relates to the sustainability of the population in the country. One of regression model that can be used to analyze the relationship between dependent variable Y in the form of discrete data and independent variable X is Poisson regression model. Recently The regression modeling used for data with dependent variable is discrete, among others, poisson regression, negative binomial regression and generalized poisson regression. In this research, generalized poisson regression modeling gives better AIC value than poisson regression. The most significant variable is the Number of health facilities (X1), while the variable that gives the most influence to infant mortality rate is the average breastfeeding (X9).
Selecting minimum dataset soil variables using PLSR as a regressive multivariate method
Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.
2017-04-01
Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information deriving from multivariate datasets. Principal component analysis (PCA) has been mainly used, however supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least square regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean centered and variance scaled data of predictors (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. In addition, variable importance for projection (VIP
A multi-label learning based kernel automatic recommendation method for support vector machine.
Zhang, Xueying; Song, Qinbao
2015-01-01
Choosing an appropriate kernel is very important and critical when classifying a new problem with Support Vector Machine. So far, more attention has been paid on constructing new kernels and choosing suitable parameter values for a specific kernel function, but less on kernel selection. Furthermore, most of current kernel selection methods focus on seeking a best kernel with the highest classification accuracy via cross-validation, they are time consuming and ignore the differences among the number of support vectors and the CPU time of SVM with different kernels. Considering the tradeoff between classification success ratio and CPU time, there may be multiple kernel functions performing equally well on the same classification problem. Aiming to automatically select those appropriate kernel functions for a given data set, we propose a multi-label learning based kernel recommendation method built on the data characteristics. For each data set, the meta-knowledge data base is first created by extracting the feature vector of data characteristics and identifying the corresponding applicable kernel set. Then the kernel recommendation model is constructed on the generated meta-knowledge data base with the multi-label classification method. Finally, the appropriate kernel functions are recommended to a new data set by the recommendation model according to the characteristics of the new data set. Extensive experiments over 132 UCI benchmark data sets, with five different types of data set characteristics, eleven typical kernels (Linear, Polynomial, Radial Basis Function, Sigmoidal function, Laplace, Multiquadric, Rational Quadratic, Spherical, Spline, Wave and Circular), and five multi-label classification methods demonstrate that, compared with the existing kernel selection methods and the most widely used RBF kernel function, SVM with the kernel function recommended by our proposed method achieved the highest classification performance.
Directory of Open Access Journals (Sweden)
Cheng-Wei Fei
2014-01-01
Full Text Available To improve the diagnosis capacity of rotor vibration fault in stochastic process, an effective fault diagnosis method (named Process Power Spectrum Entropy (PPSE and Support Vector Machine (SVM (PPSE-SVM, for short method was proposed. The fault diagnosis model of PPSE-SVM was established by fusing PPSE method and SVM theory. Based on the simulation experiment of rotor vibration fault, process data for four typical vibration faults (rotor imbalance, shaft misalignment, rotor-stator rubbing, and pedestal looseness were collected under multipoint (multiple channels and multispeed. By using PPSE method, the PPSE values of these data were extracted as fault feature vectors to establish the SVM model of rotor vibration fault diagnosis. From rotor vibration fault diagnosis, the results demonstrate that the proposed method possesses high precision, good learning ability, good generalization ability, and strong fault-tolerant ability (robustness in four aspects of distinguishing fault types, fault severity, fault location, and noise immunity of rotor stochastic vibration. This paper presents a novel method (PPSE-SVM for rotor vibration fault diagnosis and real-time vibration monitoring. The presented effort is promising to improve the fault diagnosis precision of rotating machinery like gas turbine.
EPMLR: sequence-based linear B-cell epitope prediction method using multiple linear regression.
Lian, Yao; Ge, Meng; Pan, Xian-Ming
2014-12-19
B-cell epitopes have been studied extensively due to their immunological applications, such as peptide-based vaccine development, antibody production, and disease diagnosis and therapy. Despite several decades of research, the accurate prediction of linear B-cell epitopes has remained a challenging task. In this work, based on the antigen's primary sequence information, a novel linear B-cell epitope prediction model was developed using the multiple linear regression (MLR). A 10-fold cross-validation test on a large non-redundant dataset was performed to evaluate the performance of our model. To alleviate the problem caused by the noise of negative dataset, 300 experiments utilizing 300 sub-datasets were performed. We achieved overall sensitivity of 81.8%, precision of 64.1% and area under the receiver operating characteristic curve (AUC) of 0.728. We have presented a reliable method for the identification of linear B cell epitope using antigen's primary sequence information. Moreover, a web server EPMLR has been developed for linear B-cell epitope prediction: http://www.bioinfo.tsinghua.edu.cn/epitope/EPMLR/ .
An enhanced method for sequence walking and paralog mining: TOPO® Vector-Ligation PCR
Directory of Open Access Journals (Sweden)
Davis Thomas M
2010-03-01
Full Text Available Abstract Background Although technological advances allow for the economical acquisition of whole genome sequences, many organisms' genomes remain unsequenced, and fully sequenced genomes may contain gaps. Researchers reliant upon partial genomic or heterologous sequence information require methods for obtaining unknown sequences from loci of interest. Various PCR based techniques are available for sequence walking - i.e., the acquisition of unknown DNA sequence adjacent to known sequence. Many such methods require rigid, elaborate protocols and/or impose narrowly confined options in the choice of restriction enzymes for necessary genomic digests. We describe a new method, TOPO® Vector-Ligation PCR (or TVL-PCR that innovatively integrates available tools and familiar concepts to offer advantages as a means of both targeted sequence walking and paralog mining. Findings TVL-PCR exploits the ligation efficiency of the pCR®4-TOPO® (Invitrogen, Carlsbad, California vector system to capture fragments of unknown sequence by creating chimeric molecules containing defined priming sites at both ends. Initially, restriction enzyme-digested genomic DNA is end-repaired to create 3' adenosine overhangs and is then ligated to pCR4-TOPO vectors. The ligation product pool is used directly as a template for nested PCR, using specific primers to target orthologous sequences, or degenerate primers to enable capture of paralogous gene family members. We demonstrated the efficacy of this method by capturing entire coding and partial promoter sequences of several strawberry Superman-like genes. Conclusions TVL-PCR is a convenient and efficient method for DNA sequence walking and paralog mining that is applicable to any organism for which relevant DNA sequence is available as a basis for primer design.
Modulation transfer function (MTF) measurement method based on support vector machine (SVM)
Zhang, Zheng; Chen, Yueting; Feng, Huajun; Xu, Zhihai; Li, Qi
2016-03-01
An imaging system's spatial quality can be expressed by the system's modulation spread function (MTF) as a function of spatial frequency in terms of the linear response theory. Methods have been proposed to assess the MTF of an imaging system using point, slit or edge techniques. The edge method is widely used for the low requirement of targets. However, the traditional edge methods are limited by the edge angle. Besides, image noise will impair the measurement accuracy, making the measurement result unstable. In this paper, a novel measurement method based on the support vector machine (SVM) is proposed. Image patches with different edge angles and MTF levels are generated as the training set. Parameters related with MTF and image structure are extracted from the edge images. Trained with image parameters and the corresponding MTF, the SVM classifier can assess the MTF of any edge image. The result shows that the proposed method has an excellent performance on measuring accuracy and stability.
International Nuclear Information System (INIS)
Sanghavi, Suniti; Davis, Anthony B.; Eldering, Annmarie
2014-01-01
In this paper, we build up on the scalar model smartMOM to arrive at a formalism for linearized vector radiative transfer based on the matrix operator method (vSmartMOM). Improvements have been made with respect to smartMOM in that a novel method of computing intensities for the exact viewing geometry (direct raytracing) without interpolation between quadrature points has been implemented. Also, the truncation method employed for dealing with highly peaked phase functions has been changed to a vector adaptation of Wiscombe's delta-m method. These changes enable speedier and more accurate radiative transfer computations by eliminating the need for a large number of quadrature points and coefficients for generalized spherical functions. We verify our forward model against the benchmarking results of Kokhanovsky et al. (2010) [22]. All non-zero Stokes vector elements are found to show agreement up to mostly the seventh significant digit for the Rayleigh atmosphere. Intensity computations for aerosol and cloud show an agreement of well below 0.03% and 0.05% at all viewing angles except around the solar zenith angle (60°), where most radiative models demonstrate larger variances due to the strongly forward-peaked phase function. We have for the first time linearized vector radiative transfer based on the matrix operator method with respect to aerosol optical and microphysical parameters. We demonstrate this linearization by computing Jacobian matrices for all Stokes vector elements for a multi-angular and multispectral measurement setup. We use these Jacobians to compare the aerosol information content of measurements using only the total intensity component against those using the idealized measurements of full Stokes vector [I,Q,U,V] as well as the more practical use of only [I,Q,U]. As expected, we find for the considered example that the accuracy of the retrieved parameters improves when the full Stokes vector is used. The information content for the full Stokes
Al-Ghraibah, Amani
Solar flares release stored magnetic energy in the form of radiation and can have significant detrimental effects on earth including damage to technological infrastructure. Recent work has considered methods to predict future flare activity on the basis of quantitative measures of the solar magnetic field. Accurate advanced warning of solar flare occurrence is an area of increasing concern and much research is ongoing in this area. Our previous work 111] utilized standard pattern recognition and classification techniques to determine (classify) whether a region is expected to flare within a predictive time window, using a Relevance Vector Machine (RVM) classification method. We extracted 38 features which describing the complexity of the photospheric magnetic field, the result classification metrics will provide the baseline against which we compare our new work. We find a true positive rate (TPR) of 0.8, true negative rate (TNR) of 0.7, and true skill score (TSS) of 0.49. This dissertation proposes three basic topics; the first topic is an extension to our previous work [111, where we consider a feature selection method to determine an appropriate feature subset with cross validation classification based on a histogram analysis of selected features. Classification using the top five features resulting from this analysis yield better classification accuracies across a large unbalanced dataset. In particular, the feature subsets provide better discrimination of the many regions that flare where we find a TPR of 0.85, a TNR of 0.65 sightly lower than our previous work, and a TSS of 0.5 which has an improvement comparing with our previous work. In the second topic, we study the prediction of solar flare size and time-to-flare using support vector regression (SVR). When we consider flaring regions only, we find an average error in estimating flare size of approximately half a GOES class. When we additionally consider non-flaring regions, we find an increased average
Astawa, INGA; Gusti Ngurah Bagus Caturbawa, I.; Made Sajayasa, I.; Dwi Suta Atmaja, I. Made Ari
2018-01-01
The license plate recognition usually used as part of system such as parking system. License plate detection considered as the most important step in the license plate recognition system. We propose methods that can be used to detect the vehicle plate on mobile phone. In this paper, we used Sliding Window, Histogram of Oriented Gradient (HOG), and Support Vector Machines (SVM) method to license plate detection so it will increase the detection level even though the image is not in a good quality. The image proceed by Sliding Window method in order to find plate position. Feature extraction in every window movement had been done by HOG and SVM method. Good result had shown in this research, which is 96% of accuracy.
A Numerical Comparison of Rule Ensemble Methods and Support Vector Machines
Energy Technology Data Exchange (ETDEWEB)
Meza, Juan C.; Woods, Mark
2009-12-18
Machine or statistical learning is a growing field that encompasses many scientific problems including estimating parameters from data, identifying risk factors in health studies, image recognition, and finding clusters within datasets, to name just a few examples. Statistical learning can be described as 'learning from data' , with the goal of making a prediction of some outcome of interest. This prediction is usually made on the basis of a computer model that is built using data where the outcomes and a set of features have been previously matched. The computer model is called a learner, hence the name machine learning. In this paper, we present two such algorithms, a support vector machine method and a rule ensemble method. We compared their predictive power on three supernova type 1a data sets provided by the Nearby Supernova Factory and found that while both methods give accuracies of approximately 95%, the rule ensemble method gives much lower false negative rates.
A method for real-time three-dimensional vector velocity imaging
DEFF Research Database (Denmark)
Jensen, Jørgen Arendt; Nikolov, Svetoslav
2003-01-01
The paper presents an approach for making real-time three-dimensional vector flow imaging. Synthetic aperture data acquisition is used, and the data is beamformed along the flow direction to yield signals usable for flow estimation. The signals are cross-related to determine the shift in position...... are done using 16 × 16 = 256 elements at a time and the received signals from the same elements are sampled. Access to the individual elements is done through 16-to-1 multiplexing, so that only a 256 channels transmitting and receiving system are needed. The method has been investigated using Field II...
A Shellcode Detection Method Based on Full Native API Sequence and Support Vector Machine
Cheng, Yixuan; Fan, Wenqing; Huang, Wei; An, Jing
2017-09-01
Dynamic monitoring the behavior of a program is widely used to discriminate between benign program and malware. It is usually based on the dynamic characteristics of a program, such as API call sequence or API call frequency to judge. The key innovation of this paper is to consider the full Native API sequence and use the support vector machine to detect the shellcode. We also use the Markov chain to extract and digitize Native API sequence features. Our experimental results show that the method proposed in this paper has high accuracy and low detection rate.
Using Vector Projection Method to evaluate maintainability of mechanical system in design review
International Nuclear Information System (INIS)
Chen Lu; Cai Jianguo
2003-01-01
Maintainability of a mechanical system is one of the system design parameters that has a great impact in terms of ease of maintenance. In this article, based on the definition of the terms of maintenance and maintainability, an important tool of Design for Maintenance is developed as a way to improve maintainability through design. A set of standard and organized guidelines is provided and maintainability factors in terms of physical design, logistics support and ergonomics are identified. As a specific application of design review, a methodology so called Vector Projection Method is developed to evaluate the maintainability of the mechanical system. Lastly, an example is discussed
Directory of Open Access Journals (Sweden)
Yukai Yao
2015-01-01
Full Text Available We propose an optimized Support Vector Machine classifier, named PMSVM, in which System Normalization, PCA, and Multilevel Grid Search methods are comprehensively considered for data preprocessing and parameters optimization, respectively. The main goals of this study are to improve the classification efficiency and accuracy of SVM. Sensitivity, Specificity, Precision, and ROC curve, and so forth, are adopted to appraise the performances of PMSVM. Experimental results show that PMSVM has relatively better accuracy and remarkable higher efficiency compared with traditional SVM algorithms.
Functional regression method for whole genome eQTL epistasis analysis with sequencing data.
Xu, Kelin; Jin, Li; Xiong, Momiao
2017-05-18
Epistasis plays an essential rule in understanding the regulation mechanisms and is an essential component of the genetic architecture of the gene expressions. However, interaction analysis of gene expressions remains fundamentally unexplored due to great computational challenges and data availability. Due to variation in splicing, transcription start sites, polyadenylation sites, post-transcriptional RNA editing across the entire gene, and transcription rates of the cells, RNA-seq measurements generate large expression variability and collectively create the observed position level read count curves. A single number for measuring gene expression which is widely used for microarray measured gene expression analysis is highly unlikely to sufficiently account for large expression variation across the gene. Simultaneously analyzing epistatic architecture using the RNA-seq and whole genome sequencing (WGS) data poses enormous challenges. We develop a nonlinear functional regression model (FRGM) with functional responses where the position-level read counts within a gene are taken as a function of genomic position, and functional predictors where genotype profiles are viewed as a function of genomic position, for epistasis analysis with RNA-seq data. Instead of testing the interaction of all possible pair-wises SNPs, the FRGM takes a gene as a basic unit for epistasis analysis, which tests for the interaction of all possible pairs of genes and use all the information that can be accessed to collectively test interaction between all possible pairs of SNPs within two genome regions. By large-scale simulations, we demonstrate that the proposed FRGM for epistasis analysis can achieve the correct type 1 error and has higher power to detect the interactions between genes than the existing methods. The proposed methods are applied to the RNA-seq and WGS data from the 1000 Genome Project. The numbers of pairs of significantly interacting genes after Bonferroni correction
International Nuclear Information System (INIS)
Pang, Hongfeng; Zhu, XueJun; Pan, Mengchun; Zhang, Qi; Wan, Chengbiao; Luo, Shitu; Chen, Dixiang; Chen, Jinfei; Li, Ji; Lv, Yunxiao
2016-01-01
Misalignment error is one key factor influencing the measurement accuracy of geomagnetic vector measurement system, which should be calibrated with the difficulties that sensors measure different physical information and coordinates are invisible. A new misalignment calibration method by rotating a parallelepiped frame is proposed. Simulation and experiment result show the effectiveness of calibration method. The experimental system mainly contains DM-050 three-axis fluxgate magnetometer, INS (inertia navigation system), aluminium parallelepiped frame, aluminium plane base. Misalignment angles are calculated by measured data of magnetometer and INS after rotating the aluminium parallelepiped frame on aluminium plane base. After calibration, RMS error of geomagnetic north, vertical and east are reduced from 349.441 nT, 392.530 nT and 562.316 nT to 40.130 nT, 91.586 nT and 141.989 nT respectively. - Highlights: • A new misalignment calibration method by rotating a parallelepiped frame is proposed. • It does not need to know sensor attitude information or local dip angle. • The calibration system attitude change angle is not strictly required. • It can be widely used when sensors measure different physical information. • Geomagnetic vector measurement error is reduced evidently.
Fluxgate magnetometer offset vector determination by the 3D mirror mode method
Plaschke, F.; Goetz, C.; Volwerk, M.; Richter, I.; Frühauff, D.; Narita, Y.; Glassmeier, K.-H.; Dougherty, M. K.
2017-07-01
Fluxgate magnetometers on-board spacecraft need to be regularly calibrated in flight. In low fields, the most important calibration parameters are the three offset vector components, which represent the magnetometer measurements in vanishing ambient magnetic fields. In case of three-axis stabilized spacecraft, a few methods exist to determine offsets: (I) by analysis of Alfvénic fluctuations present in the pristine interplanetary magnetic field, (II) by rolling the spacecraft around at least two axes, (III) by cross-calibration against measurements from electron drift instruments or absolute magnetometers, and (IV) by taking measurements in regions of well-known magnetic fields, e.g. cometary diamagnetic cavities. In this paper, we introduce a fifth option, the 3-dimensional (3D) mirror mode method, by which 3D offset vectors can be determined using magnetic field measurements of highly compressional waves, e.g. mirror modes in the Earth's magnetosheath. We test the method by applying it to magnetic field data measured by the following: the Time History of Events and Macroscale Interactions during Substorms-C spacecraft in the terrestrial magnetosheath, the Cassini spacecraft in the Jovian magnetosheath and the Rosetta spacecraft in the vicinity of comet 67P/Churyumov-Gerasimenko. The tests reveal that the achievable offset accuracies depend on the ambient magnetic field strength (lower strength meaning higher accuracy), on the length of the underlying data interval (more data meaning higher accuracy) and on the stability of the offset that is to be determined.
Energy Technology Data Exchange (ETDEWEB)
Pang, Hongfeng [Academy of Equipment, Beijing 101416 (China); College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha 410073 (China); Zhu, XueJun, E-mail: zhuxuejun1990@126.com [College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha 410073 (China); Pan, Mengchun; Zhang, Qi; Wan, Chengbiao; Luo, Shitu; Chen, Dixiang; Chen, Jinfei; Li, Ji; Lv, Yunxiao [College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha 410073 (China)
2016-12-01
Misalignment error is one key factor influencing the measurement accuracy of geomagnetic vector measurement system, which should be calibrated with the difficulties that sensors measure different physical information and coordinates are invisible. A new misalignment calibration method by rotating a parallelepiped frame is proposed. Simulation and experiment result show the effectiveness of calibration method. The experimental system mainly contains DM-050 three-axis fluxgate magnetometer, INS (inertia navigation system), aluminium parallelepiped frame, aluminium plane base. Misalignment angles are calculated by measured data of magnetometer and INS after rotating the aluminium parallelepiped frame on aluminium plane base. After calibration, RMS error of geomagnetic north, vertical and east are reduced from 349.441 nT, 392.530 nT and 562.316 nT to 40.130 nT, 91.586 nT and 141.989 nT respectively. - Highlights: • A new misalignment calibration method by rotating a parallelepiped frame is proposed. • It does not need to know sensor attitude information or local dip angle. • The calibration system attitude change angle is not strictly required. • It can be widely used when sensors measure different physical information. • Geomagnetic vector measurement error is reduced evidently.
Dynamic analysis of suspension cable based on vector form intrinsic finite element method
Qin, Jian; Qiao, Liang; Wan, Jiancheng; Jiang, Ming; Xia, Yongjun
2017-10-01
A vector finite element method is presented for the dynamic analysis of cable structures based on the vector form intrinsic finite element (VFIFE) and mechanical properties of suspension cable. Firstly, the suspension cable is discretized into different elements by space points, the mass and external forces of suspension cable are transformed into space points. The structural form of cable is described by the space points at different time. The equations of motion for the space points are established according to the Newton’s second law. Then, the element internal forces between the space points are derived from the flexible truss structure. Finally, the motion equations of space points are solved by the central difference method with reasonable time integration step. The tangential tension of the bearing rope in a test ropeway with the moving concentrated loads is calculated and compared with the experimental data. The results show that the tangential tension of suspension cable with moving loads is consistent with the experimental data. This method has high calculated precision and meets the requirements of engineering application.
International Nuclear Information System (INIS)
Sergienko, I.V.; Golodnikov, A.N.
1984-01-01
This article applies the methods of decompositions, which are used to solve continuous linear problems, to integer and partially integer problems. The fall-vector method is used to solve the obtained coordinate problems. An algorithm of the fall-vector is described. The Kornai-Liptak decomposition principle is used to reduce the integer linear programming problem to integer linear programming problems of a smaller dimension and to a discrete coordinate problem with simple constraints
Directory of Open Access Journals (Sweden)
Sergei Vladimirovich Varaksin
2017-06-01
Full Text Available Purpose. Construction of a mathematical model of the dynamics of childbearing change in the Altai region in 2000–2016, analysis of the dynamics of changes in birth rates for multiple age categories of women of childbearing age. Methodology. A auxiliary analysis element is the construction of linear mathematical models of the dynamics of childbearing by using fuzzy linear regression method based on fuzzy numbers. Fuzzy linear regression is considered as an alternative to standard statistical linear regression for short time series and unknown distribution law. The parameters of fuzzy linear and standard statistical regressions for childbearing time series were defined with using the built in language MatLab algorithm. Method of fuzzy linear regression is not used in sociological researches yet. Results. There are made the conclusions about the socio-demographic changes in society, the high efficiency of the demographic policy of the leadership of the region and the country, and the applicability of the method of fuzzy linear regression for sociological analysis.
Community effectiveness of pyriproxyfen as a dengue vector control method: A systematic review.
Directory of Open Access Journals (Sweden)
Dorit Maoz
2017-07-01
Full Text Available Vector control is the only widely utilised method for primary prevention and control of dengue. The use of pyriproxyfen may be promising, and autodissemination approach may reach hard to reach breeding places. It offers a unique mode of action (juvenile hormone mimic and as an additional tool for the management of insecticide resistance among Aedes vectors. However, evidence of efficacy and community effectiveness (CE remains limited.The aim of this systematic review is to compile and analyse the existing literature for evidence on the CE of pyriproxyfen as a vector control method for reducing Ae. aegypti and Ae. albopictus populations and thereby human dengue transmission.Systematic search of PubMed, Embase, Lilacs, Cochrane library, WHOLIS, Web of Science, Google Scholar as well as reference lists of all identified studies. Removal of duplicates, screening of abstracts and assessment for eligibility of the remaining studies followed. Relevant data were extracted, and a quality assessment conducted. Results were classified into four main categories of how pyriproxyfen was applied: - 1 container treatment, 2 fumigation, 3 auto-dissemination or 4 combination treatments,-and analysed with a view to their public health implication.Out of 745 studies 17 studies were identified that fulfilled all eligibility criteria. The results show that pyriproxyfen can be effective in reducing the numbers of Aedes spp. immatures with different methods of application when targeting their main breeding sites. However, the combination of pyriproxyfen with a second product increases efficacy and/or persistence of the intervention and may also slow down the development of insecticide resistance. Open questions concern concentration and frequency of application in the various treatments. Area-wide ultra-low volume treatment with pyriproxyfen currently lacks evidence and cannot be recommended. Community participation and acceptance has not consistently been successful
Community effectiveness of pyriproxyfen as a dengue vector control method: A systematic review.
Maoz, Dorit; Ward, Tara; Samuel, Moody; Müller, Pie; Runge-Ranzinger, Silvia; Toledo, Joao; Boyce, Ross; Velayudhan, Raman; Horstick, Olaf
2017-07-01
Vector control is the only widely utilised method for primary prevention and control of dengue. The use of pyriproxyfen may be promising, and autodissemination approach may reach hard to reach breeding places. It offers a unique mode of action (juvenile hormone mimic) and as an additional tool for the management of insecticide resistance among Aedes vectors. However, evidence of efficacy and community effectiveness (CE) remains limited. The aim of this systematic review is to compile and analyse the existing literature for evidence on the CE of pyriproxyfen as a vector control method for reducing Ae. aegypti and Ae. albopictus populations and thereby human dengue transmission. Systematic search of PubMed, Embase, Lilacs, Cochrane library, WHOLIS, Web of Science, Google Scholar as well as reference lists of all identified studies. Removal of duplicates, screening of abstracts and assessment for eligibility of the remaining studies followed. Relevant data were extracted, and a quality assessment conducted. Results were classified into four main categories of how pyriproxyfen was applied: - 1) container treatment, 2) fumigation, 3) auto-dissemination or 4) combination treatments,-and analysed with a view to their public health implication. Out of 745 studies 17 studies were identified that fulfilled all eligibility criteria. The results show that pyriproxyfen can be effective in reducing the numbers of Aedes spp. immatures with different methods of application when targeting their main breeding sites. However, the combination of pyriproxyfen with a second product increases efficacy and/or persistence of the intervention and may also slow down the development of insecticide resistance. Open questions concern concentration and frequency of application in the various treatments. Area-wide ultra-low volume treatment with pyriproxyfen currently lacks evidence and cannot be recommended. Community participation and acceptance has not consistently been successful and needs to
Hano, Mitsuo; Hotta, Masashi
A new multigrid method based on high-order vector finite elements is proposed in this paper. Low level discretizations in this method are obtained by using low-order vector finite elements for the same mesh. Gauss-Seidel method is used as a smoother, and a linear equation of lowest level is solved by ICCG method. But it is often found that multigrid solutions do not converge into ICCG solutions. An elimination algolithm of constant term using a null space of the coefficient matrix is also described. In three dimensional magnetostatic field analysis, convergence time and number of iteration of this multigrid method are discussed with the convectional ICCG method.
International Nuclear Information System (INIS)
Wang Weida; Xia Junding; Zhou Zhixin; Leung, P.L.
2001-01-01
Thermoluminescence (TL) dating using a regression method of saturating exponential in pre-dose technique was described. 23 porcelain samples from past dynasties of China were dated by this method. The results show that the TL ages are in reasonable agreement with archaeological dates within a standard deviation of 27%. Such error can be accepted in porcelain dating
The analysis of survival data in nephrology: basic concepts and methods of Cox regression
van Dijk, Paul C.; Jager, Kitty J.; Zwinderman, Aeilko H.; Zoccali, Carmine; Dekker, Friedo W.
2008-01-01
How much does the survival of one group differ from the survival of another group? How do differences in age in these two groups affect such a comparison? To obtain a quantity to compare the survival of different patient groups and to account for confounding effects, a multiple regression technique
The development of vector based 2.5D print methods for a painting machine
Parraman, Carinna
2013-02-01
Through recent trends in the application of digitally printed decorative finishes to products, CAD, 3D additive layer manufacturing and research in material perception, [1, 2] there is a growing interest in the accurate rendering of materials and tangible displays. Although current advances in colour management and inkjet printing has meant that users can take for granted high-quality colour and resolution in their printed images, digital methods for transferring a photographic coloured image from screen to paper is constrained by pixel count, file size, colorimetric conversion between colour spaces and the gamut limits of input and output devices. This paper considers new approaches to applying alternative colour palettes by using a vector-based approach through the application of paint mixtures, towards what could be described as a 2.5D printing method. The objective is to not apply an image to a textured surface, but where texture and colour are integral to the mark, that like a brush, delineates the contours in the image. The paper describes the difference between the way inks and paints are mixed and applied. When transcribing the fluid appearance of a brush stroke, there is a difference between a halftone printed mark and a painted mark. The issue of surface quality is significant to subjective qualities when studying the appearance of ink or paint on paper. The paper provides examples of a range of vector marks that are then transcribed into brush stokes by the painting machine.
Faults Classification Of Power Electronic Circuits Based On A Support Vector Data Description Method
Directory of Open Access Journals (Sweden)
Cui Jiang
2015-06-01
Full Text Available Power electronic circuits (PECs are prone to various failures, whose classification is of paramount importance. This paper presents a data-driven based fault diagnosis technique, which employs a support vector data description (SVDD method to perform fault classification of PECs. In the presented method, fault signals (e.g. currents, voltages, etc. are collected from accessible nodes of circuits, and then signal processing techniques (e.g. Fourier analysis, wavelet transform, etc. are adopted to extract feature samples, which are subsequently used to perform offline machine learning. Finally, the SVDD classifier is used to implement fault classification task. However, in some cases, the conventional SVDD cannot achieve good classification performance, because this classifier may generate some so-called refusal areas (RAs, and in our design these RAs are resolved with the one-against-one support vector machine (SVM classifier. The obtained experiment results from simulated and actual circuits demonstrate that the improved SVDD has a classification performance close to the conventional one-against-one SVM, and can be applied to fault classification of PECs in practice.
Estimating traffic volume on Wyoming low volume roads using linear and logistic regression methods
Directory of Open Access Journals (Sweden)
Dick Apronti
2016-12-01
Full Text Available Traffic volume is an important parameter in most transportation planning applications. Low volume roads make up about 69% of road miles in the United States. Estimating traffic on the low volume roads is a cost-effective alternative to taking traffic counts. This is because traditional traffic counts are expensive and impractical for low priority roads. The purpose of this paper is to present the development of two alternative means of cost-effectively estimating traffic volumes for low volume roads in Wyoming and to make recommendations for their implementation. The study methodology involves reviewing existing studies, identifying data sources, and carrying out the model development. The utility of the models developed were then verified by comparing actual traffic volumes to those predicted by the model. The study resulted in two regression models that are inexpensive and easy to implement. The first regression model was a linear regression model that utilized pavement type, access to highways, predominant land use types, and population to estimate traffic volume. In verifying the model, an R2 value of 0.64 and a root mean square error of 73.4% were obtained. The second model was a logistic regression model that identified the level of traffic on roads using five thresholds or levels. The logistic regression model was verified by estimating traffic volume thresholds and determining the percentage of roads that were accurately classified as belonging to the given thresholds. For the five thresholds, the percentage of roads classified correctly ranged from 79% to 88%. In conclusion, the verification of the models indicated both model types to be useful for accurate and cost-effective estimation of traffic volumes for low volume Wyoming roads. The models developed were recommended for use in traffic volume estimations for low volume roads in pavement management and environmental impact assessment studies.
Imani, Moslem; You, Rey-Jer; Kuo, Chung-Yen
2014-10-01
Sea level forecasting at various time intervals is of great importance in water supply management. Evolutionary artificial intelligence (AI) approaches have been accepted as an appropriate tool for modeling complex nonlinear phenomena in water bodies. In the study, we investigated the ability of two AI techniques: support vector machine (SVM), which is mathematically well-founded and provides new insights into function approximation, and gene expression programming (GEP), which is used to forecast Caspian Sea level anomalies using satellite altimetry observations from June 1992 to December 2013. SVM demonstrates the best performance in predicting Caspian Sea level anomalies, given the minimum root mean square error (RMSE = 0.035) and maximum coefficient of determination (R2 = 0.96) during the prediction periods. A comparison between the proposed AI approaches and the cascade correlation neural network (CCNN) model also shows the superiority of the GEP and SVM models over the CCNN.
Directory of Open Access Journals (Sweden)
Fabio Faria da Mota
Full Text Available BACKGROUND: Chagas disease is a trypanosomiasis whose agent is the protozoan parasite Trypanosoma cruzi, which is transmitted to humans by hematophagous bugs known as triatomines. Even though insecticide treatments allow effective control of these bugs in most Latin American countries where Chagas disease is endemic, the disease still affects a large proportion of the population of South America. The features of the disease in humans have been extensively studied, and the genome of the parasite has been sequenced, but no effective drug is yet available to treat Chagas disease. The digestive tract of the insect vectors in which T. cruzi develops has been much less well investigated than blood from its human hosts and constitutes a dynamic environment with very different conditions. Thus, we investigated the composition of the predominant bacterial species of the microbiota in insect vectors from Rhodnius, Triatoma, Panstrongylus and Dipetalogaster genera. METHODOLOGY/PRINCIPAL FINDINGS: Microbiota of triatomine guts were investigated using cultivation-independent methods, i.e., phylogenetic analysis of 16s rDNA using denaturing gradient gel electrophoresis (DGGE and cloned-based sequencing. The Chao index showed that the diversity of bacterial species in triatomine guts is low, comprising fewer than 20 predominant species, and that these species vary between insect species. The analyses showed that Serratia predominates in Rhodnius, Arsenophonus predominates in Triatoma and Panstrongylus, while Candidatus Rohrkolberia predominates in Dipetalogaster. CONCLUSIONS/SIGNIFICANCE: The microbiota of triatomine guts represents one of the factors that may interfere with T. cruzi transmission and virulence in humans. The knowledge of its composition according to insect species is important for designing measures of biological control for T. cruzi. We found that the predominant species of the bacterial microbiota in triatomines form a group of low
da Mota, Fabio Faria; Marinho, Lourena Pinheiro; Moreira, Carlos José de Carvalho; Lima, Marli Maria; Mello, Cícero Brasileiro; Garcia, Eloi Souza; Carels, Nicolas; Azambuja, Patricia
2012-01-01
Chagas disease is a trypanosomiasis whose agent is the protozoan parasite Trypanosoma cruzi, which is transmitted to humans by hematophagous bugs known as triatomines. Even though insecticide treatments allow effective control of these bugs in most Latin American countries where Chagas disease is endemic, the disease still affects a large proportion of the population of South America. The features of the disease in humans have been extensively studied, and the genome of the parasite has been sequenced, but no effective drug is yet available to treat Chagas disease. The digestive tract of the insect vectors in which T. cruzi develops has been much less well investigated than blood from its human hosts and constitutes a dynamic environment with very different conditions. Thus, we investigated the composition of the predominant bacterial species of the microbiota in insect vectors from Rhodnius, Triatoma, Panstrongylus and Dipetalogaster genera. Microbiota of triatomine guts were investigated using cultivation-independent methods, i.e., phylogenetic analysis of 16s rDNA using denaturing gradient gel electrophoresis (DGGE) and cloned-based sequencing. The Chao index showed that the diversity of bacterial species in triatomine guts is low, comprising fewer than 20 predominant species, and that these species vary between insect species. The analyses showed that Serratia predominates in Rhodnius, Arsenophonus predominates in Triatoma and Panstrongylus, while Candidatus Rohrkolberia predominates in Dipetalogaster. The microbiota of triatomine guts represents one of the factors that may interfere with T. cruzi transmission and virulence in humans. The knowledge of its composition according to insect species is important for designing measures of biological control for T. cruzi. We found that the predominant species of the bacterial microbiota in triatomines form a group of low complexity whose structure differs according to the vector genus.
Ultrasonic 3-D Vector Flow Method for Quantitative In Vivo Peak Velocity and Flow Rate Estimation
DEFF Research Database (Denmark)
Holbek, Simon; Ewertsen, Caroline; Bouzari, Hamed
2017-01-01
Current clinical ultrasound (US) systems are limited to show blood flow movement in either 1-D or 2-D. In this paper, a method for estimating 3-D vector velocities in a plane using the transverse oscillation method, a 32×32 element matrix array, and the experimental US scanner SARUS is presented...... is validated in two phantom studies, where flow rates are measured in a flow-rig, providing a constant parabolic flow, and in a straight-vessel phantom ( ∅=8 mm) connected to a flow pump capable of generating time varying waveforms. Flow rates are estimated to be 82.1 ± 2.8 L/min in the flow-rig compared...
Classification of ECG signal with Support Vector Machine Method for Arrhythmia Detection
Turnip, Arjon; Ilham Rizqywan, M.; Kusumandari, Dwi E.; Turnip, Mardi; Sihombing, Poltak
2018-03-01
An electrocardiogram is a potential bioelectric record that occurs as a result of cardiac activity. QRS Detection with zero crossing calculation is one method that can precisely determine peak R of QRS wave as part of arrhythmia detection. In this paper, two experimental scheme (2 minutes duration with different activities: relaxed and, typing) were conducted. From the two experiments it were obtained: accuracy, sensitivity, and positive predictivity about 100% each for the first experiment and about 79%, 93%, 83% for the second experiment, respectively. Furthermore, the feature set of MIT-BIH arrhythmia using the support vector machine (SVM) method on the WEKA software is evaluated. By combining the available attributes on the WEKA algorithm, the result is constant since all classes of SVM goes to the normal class with average 88.49% accuracy.
Nonlinear optimization method of ship floating condition calculation in wave based on vector
Ding, Ning; Yu, Jian-xing
2014-08-01
Ship floating condition in regular waves is calculated. New equations controlling any ship's floating condition are proposed by use of the vector operation. This form is a nonlinear optimization problem which can be solved using the penalty function method with constant coefficients. And the solving process is accelerated by dichotomy. During the solving process, the ship's displacement and buoyant centre have been calculated by the integration of the ship surface according to the waterline. The ship surface is described using an accumulative chord length theory in order to determine the displacement, the buoyancy center and the waterline. The draught forming the waterline at each station can be found out by calculating the intersection of the ship surface and the wave surface. The results of an example indicate that this method is exact and efficient. It can calculate the ship floating condition in regular waves as well as simplify the calculation and improve the computational efficiency and the precision of results.
Energy Technology Data Exchange (ETDEWEB)
Keilacker, H; Becker, G; Ziegler, M; Gottschling, H D [Zentralinstitut fuer Diabetes, Karlsburg (German Democratic Republic)
1980-10-01
In order to handle all types of radioimmunoassay (RIA) calibration curves obtained in the authors' laboratory in the same way, they tried to find a non-linear expression for their regression which allows calibration curves with different degrees of curvature to be fitted. Considering the two boundary cases of the incubation protocol they derived a hyperbolic inverse regression function: x = a/sub 1/y + a/sub 0/ + asub(-1)y/sup -1/, where x is the total concentration of antigen, asub(i) are constants, and y is the specifically bound radioactivity. An RIA evaluation procedure based on this function is described providing a fitted inverse RIA calibration curve and some statistical quality parameters. The latter are of an order which is normal for RIA systems. There is an excellent agreement between fitted and experimentally obtained calibration curves having a different degree of curvature.
A vector field method on the distorted Fourier side and decay for wave equations with potentials
Donninger, Roland
2016-01-01
The authors study the Cauchy problem for the one-dimensional wave equation \\partial_t^2 u(t,x)-\\partial_x^2 u(t,x)+V(x)u(t,x)=0. The potential V is assumed to be smooth with asymptotic behavior V(x)\\sim -\\tfrac14 |x|^{-2}\\mbox{ as } |x|\\to \\infty. They derive dispersive estimates, energy estimates, and estimates involving the scaling vector field t\\partial_t+x\\partial_x, where the latter are obtained by employing a vector field method on the âeoedistortedâe Fourier side. In addition, they prove local energy decay estimates. Their results have immediate applications in the context of geometric evolution problems. The theory developed in this paper is fundamental for the proof of the co-dimension 1 stability of the catenoid under the vanishing mean curvature flow in Minkowski space; see Donninger, Krieger, Szeftel, and Wong, âeoeCodimension one stability of the catenoid under the vanishing mean curvature flow in Minkowski spaceâe, preprint arXiv:1310.5606 (2013).
Simple method for the characterization of intense Laguerre-Gauss vector vortex beams
Allahyari, E.; JJ Nivas, J.; Cardano, F.; Bruzzese, R.; Fittipaldi, R.; Marrucci, L.; Paparo, D.; Rubano, A.; Vecchione, A.; Amoruso, S.
2018-05-01
We report on a method for the characterization of intense, structured optical fields through the analysis of the size and surface structures formed inside the annular ablation crater created on the target surface. In particular, we apply the technique to laser ablation of crystalline silicon induced by femtosecond vector vortex beams. We show that a rapid direct estimate of the beam waist parameter is obtained through a measure of the crater radii. The variation of the internal and external radii of the annular crater as a function of the laser pulse energy, at fixed number of pulses, provides another way to evaluate the beam spot size through numerical fitting of the obtained experimental data points. A reliable estimate of the spot size is of paramount importance to investigate pulsed laser-induced effects on the target material. Our experimental findings offer a facile way to characterize focused, high intensity complex optical vector beams which are more and more applied in laser-matter interaction experiments.
Sub-piexl methods for improving vector quality in echo PIV flow, imaging technology.
Niu, Lili; Wang, Jing; Qian, Ming; Zheng, Hairong
2009-01-01
Developments of many cardiovascular problems have been shown to have a close relationship with arterial flow conditions. An ultrasound-based particle image velocimetry technique(Echo PIV) was recently developed to measure multi-component velocity vectors and local shear rates in arteries and opaque fluid flows by identifying and tracking flow tracers (ultrasound contrast microbubbles) within these flow fields. To improve the measurement accuracy, sub-pixel calculation method was adopted in this paper to maximize the ultrasound RF signal and B mode image correlation accuracy and increase the image spatial resolution. This algorithm is employed in processing both computer-generated particle image patterns and the B-mode images of microbubbles in rotating flows obtained by a high frame rate (up to 1000 frames per second) ultrasound imaging system. The results show the correlation of particle patterns and individual flow vector quality are improved and the overall flow mappings are also improved significantly. This would help the Echo PIV system to provide better multi-component velocity accuracy.
Directory of Open Access Journals (Sweden)
Milan Gocic
2016-01-01
Full Text Available The monthly precipitation data from 29 stations in Serbia during the period of 1946–2012 were considered. Precipitation trends were calculated using linear regression method. Three CLINO periods (1961–1990, 1971–2000, and 1981–2010 in three subregions were analysed. The CLINO 1981–2010 period had a significant increasing trend. Spatial pattern of the precipitation concentration index (PCI was presented. For the purpose of PCI prediction, three Support Vector Machine (SVM models, namely, SVM coupled with the discrete wavelet transform (SVM-Wavelet, the firefly algorithm (SVM-FFA, and using the radial basis function (SVM-RBF, were developed and used. The estimation and prediction results of these models were compared with each other using three statistical indicators, that is, root mean square error, coefficient of determination, and coefficient of efficiency. The experimental results showed that an improvement in predictive accuracy and capability of generalization can be achieved by the SVM-Wavelet approach. Moreover, the results indicated the proposed SVM-Wavelet model can adequately predict the PCI.
Directory of Open Access Journals (Sweden)
Jun Bi
2018-04-01
Full Text Available Battery electric vehicles (BEVs reduce energy consumption and air pollution as compared with conventional vehicles. However, the limited driving range and potential long charging time of BEVs create new problems. Accurate charging time prediction of BEVs helps drivers determine travel plans and alleviate their range anxiety during trips. This study proposed a combined model for charging time prediction based on regression and time-series methods according to the actual data from BEVs operating in Beijing, China. After data analysis, a regression model was established by considering the charged amount for charging time prediction. Furthermore, a time-series method was adopted to calibrate the regression model, which significantly improved the fitting accuracy of the model. The parameters of the model were determined by using the actual data. Verification results confirmed the accuracy of the model and showed that the model errors were small. The proposed model can accurately depict the charging time characteristics of BEVs in Beijing.
Lusiana, Evellin Dewi
2017-12-01
The parameters of binary probit regression model are commonly estimated by using Maximum Likelihood Estimation (MLE) method. However, MLE method has limitation if the binary data contains separation. Separation is the condition where there are one or several independent variables that exactly grouped the categories in binary response. It will result the estimators of MLE method become non-convergent, so that they cannot be used in modeling. One of the effort to resolve the separation is using Firths approach instead. This research has two aims. First, to identify the chance of separation occurrence in binary probit regression model between MLE method and Firths approach. Second, to compare the performance of binary probit regression model estimator that obtained by MLE method and Firths approach using RMSE criteria. Those are performed using simulation method and under different sample size. The results showed that the chance of separation occurrence in MLE method for small sample size is higher than Firths approach. On the other hand, for larger sample size, the probability decreased and relatively identic between MLE method and Firths approach. Meanwhile, Firths estimators have smaller RMSE than MLEs especially for smaller sample sizes. But for larger sample sizes, the RMSEs are not much different. It means that Firths estimators outperformed MLE estimator.
Regression to fuzziness method for estimation of remaining useful life in power plant components
Alamaniotis, Miltiadis; Grelle, Austin; Tsoukalas, Lefteri H.
2014-10-01
Mitigation of severe accidents in power plants requires the reliable operation of all systems and the on-time replacement of mechanical components. Therefore, the continuous surveillance of power systems is a crucial concern for the overall safety, cost control, and on-time maintenance of a power plant. In this paper a methodology called regression to fuzziness is presented that estimates the remaining useful life (RUL) of power plant components. The RUL is defined as the difference between the time that a measurement was taken and the estimated failure time of that component. The methodology aims to compensate for a potential lack of historical data by modeling an expert's operational experience and expertise applied to the system. It initially identifies critical degradation parameters and their associated value range. Once completed, the operator's experience is modeled through fuzzy sets which span the entire parameter range. This model is then synergistically used with linear regression and a component's failure point to estimate the RUL. The proposed methodology is tested on estimating the RUL of a turbine (the basic electrical generating component of a power plant) in three different cases. Results demonstrate the benefits of the methodology for components for which operational data is not readily available and emphasize the significance of the selection of fuzzy sets and the effect of knowledge representation on the predicted output. To verify the effectiveness of the methodology, it was benchmarked against the data-based simple linear regression model used for predictions which was shown to perform equal or worse than the presented methodology. Furthermore, methodology comparison highlighted the improvement in estimation offered by the adoption of appropriate of fuzzy sets for parameter representation.
Amini, Payam; Maroufizadeh, Saman; Samani, Reza Omani; Hamidi, Omid; Sepidarkish, Mahdi
2017-06-01
Preterm birth (PTB) is a leading cause of neonatal death and the second biggest cause of death in children under five years of age. The objective of this study was to determine the prevalence of PTB and its associated factors using logistic regression and decision tree classification methods. This cross-sectional study was conducted on 4,415 pregnant women in Tehran, Iran, from July 6-21, 2015. Data were collected by a researcher-developed questionnaire through interviews with mothers and review of their medical records. To evaluate the accuracy of the logistic regression and decision tree methods, several indices such as sensitivity, specificity, and the area under the curve were used. The PTB rate was 5.5% in this study. The logistic regression outperformed the decision tree for the classification of PTB based on risk factors. Logistic regression showed that multiple pregnancies, mothers with preeclampsia, and those who conceived with assisted reproductive technology had an increased risk for PTB ( p logistic regression model for the classification of risk groups for PTB.
Eekhout, I.; Wiel, M.A. van de; Heymans, M.W.
2017-01-01
Background. Multiple imputation is a recommended method to handle missing data. For significance testing after multiple imputation, Rubin’s Rules (RR) are easily applied to pool parameter estimates. In a logistic regression model, to consider whether a categorical covariate with more than two levels
Mofavvaz, Shirin; Sohrabi, Mahmoud Reza; Nezamzadeh-Ejhieh, Alireza
2017-07-01
In the present study, artificial neural networks (ANNs) and least squares support vector machines (LS-SVM) as intelligent methods based on absorption spectra in the range of 230-300 nm have been used for determination of antihistamine decongestant contents. In the first step, one type of network (feed-forward back-propagation) from the artificial neural network with two different training algorithms, Levenberg-Marquardt (LM) and gradient descent with momentum and adaptive learning rate back-propagation (GDX) algorithm, were employed and their performance was evaluated. The performance of the LM algorithm was better than the GDX algorithm. In the second one, the radial basis network was utilized and results compared with the previous network. In the last one, the other intelligent method named least squares support vector machine was proposed to construct the antihistamine decongestant prediction model and the results were compared with two of the aforementioned networks. The values of the statistical parameters mean square error (MSE), Regression coefficient (R2), correlation coefficient (r) and also mean recovery (%), relative standard deviation (RSD) used for selecting the best model between these methods. Moreover, the proposed methods were compared to the high- performance liquid chromatography (HPLC) as a reference method. One way analysis of variance (ANOVA) test at the 95% confidence level applied to the comparison results of suggested and reference methods that there were no significant differences between them.
Determination of benzo(apyrene content in PM10 using regression methods
Directory of Open Access Journals (Sweden)
Jacek Gębicki
2015-12-01
Full Text Available The paper presents an attempt of application of multidimensional linear regression to estimation of an empirical model describing the factors influencing on B(aP content in suspended dust PM10 in Olsztyn and Elbląg city regions between 2010 and 2013. During this period annual average concentration of B(aP in PM10 exceeded the admissible level 1.5-3 times. Conducted investigations confirm that the reasons of B(aP concentration increase are low-efficiency individual home heat stations or low-temperature heat sources, which are responsible for so-called low emission during heating period. Dependences between the following quantities were analysed: concentration of PM10 dust in air, air temperature, wind velocity, air humidity. A measure of model fitting to actual B(aP concentration in PM10 was the coefficient of determination of the model. Application of multidimensional linear regression yielded the equations characterized by high values of the coefficient of determination of the model, especially during heating season. This parameter ranged from 0.54 to 0.80 during the analyzed period.
A Review of Methods for Detection of Hepatozoon Infection in Carnivores and Arthropod Vectors.
Modrý, David; Beck, Relja; Hrazdilová, Kristýna; Baneth, Gad
2017-01-01
Vector-borne protists of the genus Hepatozoon belong to the apicomplexan suborder Adeleorina. The taxonomy of Hepatozoon is unsettled and different phylogenetic clades probably represent evolutionary units deserving the status of separate genera. Throughout our review, we focus on the monophyletic assemblage of Hepatozoon spp. from carnivores, classified as Hepatozoon sensu stricto that includes important pathogens of domestic and free-ranging canine and feline hosts. We provide an overview of diagnostic methods and approaches from classical detection in biological materials, through serological tests to nucleic acid amplification tests (NAATs). Critical review of used primers for the 18S rDNA is provided, together with information on individual primer pairs. Extension of used NAATs target to cover also mitochondrial genes is suggested as a key step in understanding the diversity and molecular epidemiology of Hepatozoon infections in mammals.
Fachrurrozi, Muhammad; Saparudin; Erwin
2017-04-01
Real-time Monitoring and early detection system which measures the quality standard of waste in Musi River, Palembang, Indonesia is a system for determining air and water pollution level. This system was designed in order to create an integrated monitoring system and provide real time information that can be read. It is designed to measure acidity and water turbidity polluted by industrial waste, as well as to show and provide conditional data integrated in one system. This system consists of inputting and processing the data, and giving output based on processed data. Turbidity, substances, and pH sensor is used as a detector that produce analog electrical direct current voltage (DC). Early detection system works by determining the value of the ammonia threshold, acidity, and turbidity level of water in Musi River. The results is then presented based on the level group pollution by the Support Vector Machine classification method.
Accuracy and Precision of a Plane Wave Vector Flow Imaging Method in the Healthy Carotid Artery
DEFF Research Database (Denmark)
Jensen, Jonas; Villagómez Hoyos, Carlos Armando; Traberg, Marie Sand
2018-01-01
The objective of the study described here was to investigate the accuracy and precision of a plane wave 2-D vector flow imaging (VFI) method in laminar and complex blood flow conditions in the healthy carotid artery. The approach was to study (i) the accuracy for complex flow by comparing...... of laminar flow in vivo. The precision in vivo was calculated as the mean standard deviation (SD) of estimates aligned to the heart cycle and was highest in the center of the common carotid artery (SD = 3.6% for velocity magnitudes and 4.5° for angles) and lowest in the external branch and for vortices (SD...... the velocity field from a computational fluid dynamics (CFD) simulation to VFI estimates obtained from the scan of an anthropomorphic flow phantom and from an in vivo scan; (ii) the accuracy for laminar unidirectional flow in vivo by comparing peak systolic velocities from VFI with magnetic resonance...
Diagnostic Method of Diabetes Based on Support Vector Machine and Tongue Images
Directory of Open Access Journals (Sweden)
Jianfeng Zhang
2017-01-01
Full Text Available Objective. The purpose of this research is to develop a diagnostic method of diabetes based on standardized tongue image using support vector machine (SVM. Methods. Tongue images of 296 diabetic subjects and 531 nondiabetic subjects were collected by the TDA-1 digital tongue instrument. Tongue body and tongue coating were separated by the division-merging method and chrominance-threshold method. With extracted color and texture features of the tongue image as input variables, the diagnostic model of diabetes with SVM was trained. After optimizing the combination of SVM kernel parameters and input variables, the influences of the combinations on the model were analyzed. Results. After normalizing parameters of tongue images, the accuracy rate of diabetes predication was increased from 77.83% to 78.77%. The accuracy rate and area under curve (AUC were not reduced after reducing the dimensions of tongue features with principal component analysis (PCA, while substantially saving the training time. During the training for selecting SVM parameters by genetic algorithm (GA, the accuracy rate of cross-validation was grown from 72% or so to 83.06%. Finally, we compare with several state-of-the-art algorithms, and experimental results show that our algorithm has the best predictive accuracy. Conclusions. The diagnostic method of diabetes on the basis of tongue images in Traditional Chinese Medicine (TCM is of great value, indicating the feasibility of digitalized tongue diagnosis.
Directory of Open Access Journals (Sweden)
Xiangbing Zhou
2018-04-01
Full Text Available Rapidly growing GPS (Global Positioning System trajectories hide much valuable information, such as city road planning, urban travel demand, and population migration. In order to mine the hidden information and to capture better clustering results, a trajectory regression clustering method (an unsupervised trajectory clustering method is proposed to reduce local information loss of the trajectory and to avoid getting stuck in the local optimum. Using this method, we first define our new concept of trajectory clustering and construct a novel partitioning (angle-based partitioning method of line segments; second, the Lagrange-based method and Hausdorff-based K-means++ are integrated in fuzzy C-means (FCM clustering, which are used to maintain the stability and the robustness of the clustering process; finally, least squares regression model is employed to achieve regression clustering of the trajectory. In our experiment, the performance and effectiveness of our method is validated against real-world taxi GPS data. When comparing our clustering algorithm with the partition-based clustering algorithms (K-means, K-median, and FCM, our experimental results demonstrate that the presented method is more effective and generates a more reasonable trajectory.
International Nuclear Information System (INIS)
Zhang Chen; Lu Hong; Hua Ning; Tang Xue-Zheng; Tang Fa-Kuan; Shou Guo-Fa; Xia Ling; Ma Ping
2013-01-01
A cardiac vector model is presented and verified, and then the forward problem for cardiac magnetic fields and electric potential are discussed based on this model and the realistic human torso volume conductor model, including lungs. A torso—cardiac vector model is used for a 12-lead electrocardiographic (ECG) and magneto-cardiogram (MCG) simulation study by using the boundary element method (BEM). Also, we obtain the MCG wave picture using a compound four-channel HT c ·SQUID system in a magnetically shielded room. By comparing the simulated results and experimental results, we verify the cardiac vector model and then do a preliminary study of the forward problem of MCG and ECG. Therefore, the results show that the vector model is reasonable in cardiac electrophysiology. (general)
Application of Numerical Integration and Data Fusion in Unit Vector Method
Zhang, J.
2012-01-01
The Unit Vector Method (UVM) is a series of orbit determination methods which are designed by Purple Mountain Observatory (PMO) and have been applied extensively. It gets the conditional equations for different kinds of data by projecting the basic equation to different unit vectors, and it suits for weighted process for different kinds of data. The high-precision data can play a major role in orbit determination, and accuracy of orbit determination is improved obviously. The improved UVM (PUVM2) promoted the UVM from initial orbit determination to orbit improvement, and unified the initial orbit determination and orbit improvement dynamically. The precision and efficiency are improved further. In this thesis, further research work has been done based on the UVM: Firstly, for the improvement of methods and techniques for observation, the types and decision of the observational data are improved substantially, it is also asked to improve the decision of orbit determination. The analytical perturbation can not meet the requirement. So, the numerical integration for calculating the perturbation has been introduced into the UVM. The accuracy of dynamical model suits for the accuracy of the real data, and the condition equations of UVM are modified accordingly. The accuracy of orbit determination is improved further. Secondly, data fusion method has been introduced into the UVM. The convergence mechanism and the defect of weighted strategy have been made clear in original UVM. The problem has been solved in this method, the calculation of approximate state transition matrix is simplified and the weighted strategy has been improved for the data with different dimension and different precision. Results of orbit determination of simulation and real data show that the work of this thesis is effective: (1) After the numerical integration has been introduced into the UVM, the accuracy of orbit determination is improved obviously, and it suits for the high-accuracy data of
Evaluation of auto-assessment method for C-D analysis based on support vector machine
International Nuclear Information System (INIS)
Takei, Takaaki; Ikeda, Mitsuru; Imai, Kuniharu; Kamihira, Hiroaki; Kishimoto, Tomonari; Goto, Hiroya
2010-01-01
Contrast-Detail (C-D) analysis is one of the visual quality assessment methods in medical imaging, and many auto-assessment methods for C-D analysis have been developed in recent years. However, for the auto-assessment method for C-D analysis, the effects of nonlinear image processing are not clear. So, we have made an auto-assessment method for C-D analysis using a support vector machine (SVM), and have evaluated its performance for the images processed with a noise reduction method. The feature indexes used in the SVM were the normalized cross correlation (NCC) coefficient on each signal between the noise-free and noised image, the contrast to noise ratio (CNR) on each signal, the radius of each signal, and the Student's t-test statistic for the mean difference between the signal and background pixel values. The results showed that the auto-assessment method for C-D analysis by using Student's t-test statistic agreed well with the visual assessment for the non-processed images, but disagreed for the images processed with the noise reduction method. Our results also showed that the auto-assessment method for C-D analysis by the SVM made of NCC and CNR agreed well with the visual assessment for the non-processed and noise-reduced images. Therefore, the auto-assessment method for C-D analysis by the SVM will be expected to have the robustness for the non-linear image processing. (author)
A method to determine the necessity for global signal regression in resting-state fMRI studies.
Chen, Gang; Chen, Guangyu; Xie, Chunming; Ward, B Douglas; Li, Wenjun; Antuono, Piero; Li, Shi-Jiang
2012-12-01
In resting-state functional MRI studies, the global signal (operationally defined as the global average of resting-state functional MRI time courses) is often considered a nuisance effect and commonly removed in preprocessing. This global signal regression method can introduce artifacts, such as false anticorrelated resting-state networks in functional connectivity analyses. Therefore, the efficacy of this technique as a correction tool remains questionable. In this article, we establish that the accuracy of the estimated global signal is determined by the level of global noise (i.e., non-neural noise that has a global effect on the resting-state functional MRI signal). When the global noise level is low, the global signal resembles the resting-state functional MRI time courses of the largest cluster, but not those of the global noise. Using real data, we demonstrate that the global signal is strongly correlated with the default mode network components and has biological significance. These results call into question whether or not global signal regression should be applied. We introduce a method to quantify global noise levels. We show that a criteria for global signal regression can be found based on the method. By using the criteria, one can determine whether to include or exclude the global signal regression in minimizing errors in functional connectivity measures. Copyright © 2012 Wiley Periodicals, Inc.
Kolasa-Wiecek, Alicja
2015-04-01
The energy sector in Poland is the source of 81% of greenhouse gas (GHG) emissions. Poland, among other European Union countries, occupies a leading position with regard to coal consumption. Polish energy sector actively participates in efforts to reduce GHG emissions to the atmosphere, through a gradual decrease of the share of coal in the fuel mix and development of renewable energy sources. All evidence which completes the knowledge about issues related to GHG emissions is a valuable source of information. The article presents the results of modeling of GHG emissions which are generated by the energy sector in Poland. For a better understanding of the quantitative relationship between total consumption of primary energy and greenhouse gas emission, multiple stepwise regression model was applied. The modeling results of CO2 emissions demonstrate a high relationship (0.97) with the hard coal consumption variable. Adjustment coefficient of the model to actual data is high and equal to 95%. The backward step regression model, in the case of CH4 emission, indicated the presence of hard coal (0.66), peat and fuel wood (0.34), solid waste fuels, as well as other sources (-0.64) as the most important variables. The adjusted coefficient is suitable and equals R2=0.90. For N2O emission modeling the obtained coefficient of determination is low and equal to 43%. A significant variable influencing the amount of N2O emission is the peat and wood fuel consumption. Copyright © 2015. Published by Elsevier B.V.
Efectivity of Additive Spline for Partial Least Square Method in Regression Model Estimation
Directory of Open Access Journals (Sweden)
Ahmad Bilfarsah
2005-04-01
Full Text Available Additive Spline of Partial Least Square method (ASPL as one generalization of Partial Least Square (PLS method. ASPLS method can be acommodation to non linear and multicollinearity case of predictor variables. As a principle, The ASPLS method approach is cahracterized by two idea. The first is to used parametric transformations of predictors by spline function; the second is to make ASPLS components mutually uncorrelated, to preserve properties of the linear PLS components. The performance of ASPLS compared with other PLS method is illustrated with the fisher economic application especially the tuna fish production.
International Nuclear Information System (INIS)
Meng Qinghu; Meng Qingfeng; Feng Wuwei
2012-01-01
Fukushima nuclear power plant accident caused huge losses and pollution and it showed that the reactor coolant pump is very important in a nuclear power plant. Therefore, to keep the safety and reliability, the condition of the coolant pump needs to be online condition monitored and fault analyzed. In this paper, condition monitoring and analysis based on support vector machine (SVM) is proposed. This method is just to aim at the small sample studies such as reactor coolant pump. Both experiment data and field data are analyzed. In order to eliminate the noise and useless frequency, these data are disposed through a multi-band FIR filter. After that, a fault feature selection method based on principal component analysis is proposed. The related variable quantity is changed into unrelated variable quantity, and the dimension is descended. Then the SVM method is used to separate different fault characteristics. Firstly, this method is used as a two-kind classifier to separate each two different running conditions. Then the SVM is used as a multiple classifier to separate all of the different condition types. The SVM could separate these conditions successfully. After that, software based on SVM was designed for reactor coolant pump condition analysis. This software is installed on the reactor plant control system of Qinshan nuclear power plant in China. It could monitor the online data and find the pump mechanical fault automatically.
A Novel Method for Vertical Acceleration Noise Suppression of a Thrust-Vectored VTOL UAV.
Li, Huanyu; Wu, Linfeng; Li, Yingjie; Li, Chunwen; Li, Hangyu
2016-12-02
Acceleration is of great importance in motion control for unmanned aerial vehicles (UAVs), especially during the takeoff and landing stages. However, the measured acceleration is inevitably polluted by severe noise. Therefore, a proper noise suppression procedure is required. This paper presents a novel method to reduce the noise in the measured vertical acceleration for a thrust-vectored tail-sitter vertical takeoff and landing (VTOL) UAV. In the new procedure, a Kalman filter is first applied to estimate the UAV mass by using the information in the vertical thrust and measured acceleration. The UAV mass is then used to compute an estimate of UAV vertical acceleration. The estimated acceleration is finally fused with the measured acceleration to obtain the minimum variance estimate of vertical acceleration. By doing this, the new approach incorporates the thrust information into the acceleration estimate. The method is applied to the data measured in a VTOL UAV takeoff experiment. Two other denoising approaches developed by former researchers are also tested for comparison. The results demonstrate that the new method is able to suppress the acceleration noise substantially. It also maintains the real-time performance in the final estimated acceleration, which is not seen in the former denoising approaches. The acceleration treated with the new method can be readily used in the motion control applications for UAVs to achieve improved accuracy.
A Novel Method for Vertical Acceleration Noise Suppression of a Thrust-Vectored VTOL UAV
Directory of Open Access Journals (Sweden)
Huanyu Li
2016-12-01
Full Text Available Acceleration is of great importance in motion control for unmanned aerial vehicles (UAVs, especially during the takeoff and landing stages. However, the measured acceleration is inevitably polluted by severe noise. Therefore, a proper noise suppression procedure is required. This paper presents a novel method to reduce the noise in the measured vertical acceleration for a thrust-vectored tail-sitter vertical takeoff and landing (VTOL UAV. In the new procedure, a Kalman filter is first applied to estimate the UAV mass by using the information in the vertical thrust and measured acceleration. The UAV mass is then used to compute an estimate of UAV vertical acceleration. The estimated acceleration is finally fused with the measured acceleration to obtain the minimum variance estimate of vertical acceleration. By doing this, the new approach incorporates the thrust information into the acceleration estimate. The method is applied to the data measured in a VTOL UAV takeoff experiment. Two other denoising approaches developed by former researchers are also tested for comparison. The results demonstrate that the new method is able to suppress the acceleration noise substantially. It also maintains the real-time performance in the final estimated acceleration, which is not seen in the former denoising approaches. The acceleration treated with the new method can be readily used in the motion control applications for UAVs to achieve improved accuracy.
Spady, Richard; Stouli, Sami
2012-01-01
We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while avoiding the need for repairing the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach introduces a mathematical programming characterization of conditional distribution f...
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Directory of Open Access Journals (Sweden)
Liyun Su
2012-01-01
Full Text Available We introduce the extension of local polynomial fitting to the linear heteroscedastic regression model. Firstly, the local polynomial fitting is applied to estimate heteroscedastic function, then the coefficients of regression model are obtained by using generalized least squares method. One noteworthy feature of our approach is that we avoid the testing for heteroscedasticity by improving the traditional two-stage method. Due to nonparametric technique of local polynomial estimation, we do not need to know the heteroscedastic function. Therefore, we can improve the estimation precision, when the heteroscedastic function is unknown. Furthermore, we focus on comparison of parameters and reach an optimal fitting. Besides, we verify the asymptotic normality of parameters based on numerical simulations. Finally, this approach is applied to a case of economics, and it indicates that our method is surely effective in finite-sample situations.
DEFF Research Database (Denmark)
Wu, Xuan; Huang, Shoudao; Liu, Xiao
2017-01-01
This paper presents a new initial rotor position estimation method for an interior permanent magnet synchronous motor. The proposed method includes two steps: firstly, the minimum voltage vectors are injected to estimate the rotor position. Secondly, in order to identify the magnet polarity...
Xiangfeng, Zhang; Hong, Jiang
2018-03-01
In this paper, the full vector LCD method is proposed to solve the misjudgment problem caused by the change of the working condition. First, the signal from different working condition is decomposed by LCD, to obtain the Intrinsic Scale Component (ISC)whose instantaneous frequency with physical significance. Then, calculate of the cross correlation coefficient between ISC and the original signal, signal denoising based on the principle of mutual information minimum. At last, calculate the sum of absolute Vector mutual information of the sample under different working condition and the denoised ISC as the characteristics to classify by use of Support vector machine (SVM). The wind turbines vibration platform gear box experiment proves that this method can identify fault characteristics under different working conditions. The advantage of this method is that it reduce dependence of man’s subjective experience, identify fault directly from the original data of vibration signal. It will has high engineering value.
Gilstrap, Donald L.
2013-01-01
In addition to qualitative methods presented in chaos and complexity theories in educational research, this article addresses quantitative methods that may show potential for future research studies. Although much in the social and behavioral sciences literature has focused on computer simulations, this article explores current chaos and…
A Fault Alarm and Diagnosis Method Based on Sensitive Parameters and Support Vector Machine
Zhang, Jinjie; Yao, Ziyun; Lv, Zhiquan; Zhu, Qunxiong; Xu, Fengtian; Jiang, Zhinong
2015-08-01
Study on the extraction of fault feature and the diagnostic technique of reciprocating compressor is one of the hot research topics in the field of reciprocating machinery fault diagnosis at present. A large number of feature extraction and classification methods have been widely applied in the related research, but the practical fault alarm and the accuracy of diagnosis have not been effectively improved. Developing feature extraction and classification methods to meet the requirements of typical fault alarm and automatic diagnosis in practical engineering is urgent task. The typical mechanical faults of reciprocating compressor are presented in the paper, and the existing data of online monitoring system is used to extract fault feature parameters within 15 types in total; the inner sensitive connection between faults and the feature parameters has been made clear by using the distance evaluation technique, also sensitive characteristic parameters of different faults have been obtained. On this basis, a method based on fault feature parameters and support vector machine (SVM) is developed, which will be applied to practical fault diagnosis. A better ability of early fault warning has been proved by the experiment and the practical fault cases. Automatic classification by using the SVM to the data of fault alarm has obtained better diagnostic accuracy.
MAPPING LOCAL CLIMATE ZONES WITH A VECTOR-BASED GIS METHOD
Directory of Open Access Journals (Sweden)
E. Lelovics
2013-03-01
Full Text Available In this study we determined Local Climate Zones in a South-Hungarian city, using vector-based and raster-based databases. We calculated seven of the originally proposed ten physical (geometric, surface cover and radiative properties for areas which are based on the mobile temperature measurement campaigns earlier carried out in this city.As input data we applied 3D building database (earlier created with photogrammetric methods, 2D road database, topographic map, aerial photographs, remotely sensed reflectance information from RapidEye satellite image and our local knowledge about the area. The values of the properties were calculated by GIS methods developed for this purpose.We derived for the examined areas and applied for classification sky view factor, mean building height, terrain roughness class, building surface fraction, pervious surface fraction, impervious surface fraction and albedo.Six built and one land cover LCZ classes could be detected with this method on our study area. From each class one circle area was selected, which is representative for that class. Their thermal reactions were examined with the application of mobile temperature measurement dataset. The comparison was made in cases, when the weather was clear and calm and the surface was dry. We found that compact built-in types have more temperature surplus than open ones, and midrise types also have more than lowrise ones. According to our primary results, these categories provide a useful opportunity for intra- and inter-urban comparisons.
A Method of Vector Map Multi-scale Representation Considering User Interest on Subdivision Gird
Directory of Open Access Journals (Sweden)
YU Tong
2016-12-01
Full Text Available Compared with the traditional spatial data model and method, global subdivision grid show a great advantage in the organization and expression of massive spatial data. In view of this, a method of vector map multi-scale representation considering user interest on subdivision gird is proposed. First, the spatial interest field is built using a large number POI data to describe the spatial distribution of the user interest in geographic information. Second, spatial factor is classified and graded, and its representation scale range can be determined. Finally, different levels of subdivision surfaces are divided based on GeoSOT subdivision theory, and the corresponding relation of subdivision level and scale is established. According to the user interest of subdivision surfaces, the spatial feature can be expressed in different degree of detail. It can realize multi-scale representation of spatial data based on user interest. The experimental results show that this method can not only satisfy general-to-detail and important-to-secondary space cognitive demands of users, but also achieve better multi-scale representation effect.
Directory of Open Access Journals (Sweden)
Yaping Huang
Full Text Available The selection of seismic attributes is a key process in reservoir prediction because the prediction accuracy relies on the reliability and credibility of the seismic attributes. However, effective selection method for useful seismic attributes is still a challenge. This paper presents a novel selection method of seismic attributes for reservoir prediction based on the gray relational degree (GRD and support vector machine (SVM. The proposed method has a two-hierarchical structure. In the first hierarchy, the primary selection of seismic attributes is achieved by calculating the GRD between seismic attributes and reservoir parameters, and the GRD between the seismic attributes. The principle of the primary selection is that these seismic attributes with higher GRD to the reservoir parameters will have smaller GRD between themselves as compared to those with lower GRD to the reservoir parameters. Then the SVM is employed in the second hierarchy to perform an interactive error verification using training samples for the purpose of determining the final seismic attributes. A real-world case study was conducted to evaluate the proposed GRD-SVM method. Reliable seismic attributes were selected to predict the coalbed methane (CBM content in southern Qinshui basin, China. In the analysis, the instantaneous amplitude, instantaneous bandwidth, instantaneous frequency, and minimum negative curvature were selected, and the predicted CBM content was fundamentally consistent with the measured CBM content. This real-world case study demonstrates that the proposed method is able to effectively select seismic attributes, and improve the prediction accuracy. Thus, the proposed GRD-SVM method can be used for the selection of seismic attributes in practice.
In-vivo Examples of Flow Patterns With The Fast Vector Velocity Ultrasound Method
DEFF Research Database (Denmark)
Hansen, Kristoffer Lindskov; Udesen, Jesper; Gran, Fredrik
2009-01-01
and using a 100 CPU linux cluster for post processing, PWE can achieve a frame of 100 Hz where one vector velocity sequence of approximately 3 sec, takes 10 h to store and 48 h to process. In this paper a case study is presented of in-vivo vector velocity estimates in different complex vessel geometries...
Guan, Jinge; Ren, Wei; Cheng, Yaoyu
2018-04-01
We demonstrate an efficient polarization-difference imaging system in turbid conditions by using the Stokes vector of light. The interaction of scattered light with the polarizer is analyzed by the Stokes-Mueller formalism. An interpolation method is proposed to replace the mechanical rotation of the polarization axis of the analyzer theoretically, and its performance is verified by the experiment at different turbidity levels. We show that compared with direct imaging, the Stokes vector based imaging method can effectively reduce the effect of light scattering and enhance the image contrast.
Directory of Open Access Journals (Sweden)
Sara Mortaz Hejri
2013-01-01
Full Text Available Background: One of the methods used for standard setting is the borderline regression method (BRM. This study aims to assess the reliability of BRM when the pass-fail standard in an objective structured clinical examination (OSCE was calculated by averaging the BRM standards obtained for each station separately. Materials and Methods: In nine stations of the OSCE with direct observation the examiners gave each student a checklist score and a global score. Using a linear regression model for each station, we calculated the checklist score cut-off on the regression equation for the global scale cut-off set at 2. The OSCE pass-fail standard was defined as the average of all station′s standard. To determine the reliability, the root mean square error (RMSE was calculated. The R2 coefficient and the inter-grade discrimination were calculated to assess the quality of OSCE. Results: The mean total test score was 60.78. The OSCE pass-fail standard and its RMSE were 47.37 and 0.55, respectively. The R2 coefficients ranged from 0.44 to 0.79. The inter-grade discrimination score varied greatly among stations. Conclusion: The RMSE of the standard was very small indicating that BRM is a reliable method of setting standard for OSCE, which has the advantage of providing data for quality assurance.
International Nuclear Information System (INIS)
Ballini, J.-P.; Cazes, P.; Turpin, P.-Y.
1976-01-01
Analysing the histogram of anode pulse amplitudes allows a discussion of the hypothesis that has been proposed to account for the statistical processes of secondary multiplication in a photomultiplier. In an earlier work, good agreement was obtained between experimental and reconstructed spectra, assuming a first dynode distribution including two Poisson distributions of distinct mean values. This first approximation led to a search for a method which could give the weights of several Poisson distributions of distinct mean values. Three methods have been briefly exposed: classical linear regression, constraint regression (d'Esopo's method), and regression on variables subject to error. The use of these methods gives an approach of the frequency function which represents the dispersion of the punctual mean gain around the whole first dynode mean gain value. Comparison between this function and the one employed in Polya distribution allows the statement that the latter is inadequate to describe the statistical process of secondary multiplication. Numerous spectra obtained with two kinds of photomultiplier working under different physical conditions have been analysed. Then two points are discussed: - Does the frequency function represent the dynode structure and the interdynode collection process. - Is the model (the multiplication process of all dynodes but the first one, is Poissonian) valid whatever the photomultiplier and the utilization conditions. (Auth.)
Borodachev, S. M.
2016-06-01
The simple derivation of recursive least squares (RLS) method equations is given as special case of Kalman filter estimation of a constant system state under changing observation conditions. A numerical example illustrates application of RLS to multicollinearity problem.
Huang, Lei
2015-01-01
To solve the problem in which the conventional ARMA modeling methods for gyro random noise require a large number of samples and converge slowly, an ARMA modeling method using a robust Kalman filtering is developed. The ARMA model parameters are employed as state arguments. Unknown time-varying estimators of observation noise are used to achieve the estimated mean and variance of the observation noise. Using the robust Kalman filtering, the ARMA model parameters are estimated accurately. The developed ARMA modeling method has the advantages of a rapid convergence and high accuracy. Thus, the required sample size is reduced. It can be applied to modeling applications for gyro random noise in which a fast and accurate ARMA modeling method is required. PMID:26437409
Support vector machine-based facial-expression recognition method combining shape and appearance
Han, Eun Jung; Kang, Byung Jun; Park, Kang Ryoung; Lee, Sangyoun
2010-11-01
Facial expression recognition can be widely used for various applications, such as emotion-based human-machine interaction, intelligent robot interfaces, face recognition robust to expression variation, etc. Previous studies have been classified as either shape- or appearance-based recognition. The shape-based method has the disadvantage that the individual variance of facial feature points exists irrespective of similar expressions, which can cause a reduction of the recognition accuracy. The appearance-based method has a limitation in that the textural information of the face is very sensitive to variations in illumination. To overcome these problems, a new facial-expression recognition method is proposed, which combines both shape and appearance information, based on the support vector machine (SVM). This research is novel in the following three ways as compared to previous works. First, the facial feature points are automatically detected by using an active appearance model. From these, the shape-based recognition is performed by using the ratios between the facial feature points based on the facial-action coding system. Second, the SVM, which is trained to recognize the same and different expression classes, is proposed to combine two matching scores obtained from the shape- and appearance-based recognitions. Finally, a single SVM is trained to discriminate four different expressions, such as neutral, a smile, anger, and a scream. By determining the expression of the input facial image whose SVM output is at a minimum, the accuracy of the expression recognition is much enhanced. The experimental results showed that the recognition accuracy of the proposed method was better than previous researches and other fusion methods.
Bolarinwa, O A; Adeola, O
2012-12-01
Digestible and metabolizable energy contents of feed ingredients for pigs can be determined by direct or indirect methods. There are situations when only the indirect approach is suitable and the regression method is a robust indirect approach. This study was conducted to compare the direct and regression methods for determining the energy value of wheat for pigs. Twenty-four barrows with an average initial BW of 31 kg were assigned to 4 diets in a randomized complete block design. The 4 diets consisted of 969 g wheat/kg plus minerals and vitamins (sole wheat) for the direct method, corn (Zea mays)-soybean (Glycine max) meal reference diet (RD), RD + 300 g wheat/kg, and RD + 600 g wheat/kg. The 3 corn-soybean meal diets were used for the regression method and wheat replaced the energy-yielding ingredients, corn and soybean meal, so that the same ratio of corn and soybean meal across the experimental diets was maintained. The wheat used was analyzed to contain 883 g DM, 15.2 g N, and 3.94 Mcal GE/kg. Each diet was fed to 6 barrows in individual metabolism crates for a 5-d acclimation followed by a 5-d total but separate collection of feces and urine. The DE and ME for the sole wheat diet were 3.83 and 3.77 Mcal/kg DM, respectively. Because the sole wheat diet contained 969 g wheat/kg, these translate to 3.95 Mcal DE/kg DM and 3.89 Mcal ME/kg DM. The RD used for the regression approach yielded 4.00 Mcal DE and 3.91 Mcal ME/kg DM diet. Increasing levels of wheat in the RD linearly reduced (P direct method (3.95 and 3.89 Mcal/kg DM) did not differ (0.78 < P < 0.89) from those obtained using the regression method (3.96 and 3.88 Mcal/kg DM).
BacHbpred: Support Vector Machine Methods for the Prediction of Bacterial Hemoglobin-Like Proteins
Directory of Open Access Journals (Sweden)
MuthuKrishnan Selvaraj
2016-01-01
Full Text Available The recent upsurge in microbial genome data has revealed that hemoglobin-like (HbL proteins may be widely distributed among bacteria and that some organisms may carry more than one HbL encoding gene. However, the discovery of HbL proteins has been limited to a small number of bacteria only. This study describes the prediction of HbL proteins and their domain classification using a machine learning approach. Support vector machine (SVM models were developed for predicting HbL proteins based upon amino acid composition (AC, dipeptide composition (DC, hybrid method (AC + DC, and position specific scoring matrix (PSSM. In addition, we introduce for the first time a new prediction method based on max to min amino acid residue (MM profiles. The average accuracy, standard deviation (SD, false positive rate (FPR, confusion matrix, and receiver operating characteristic (ROC were analyzed. We also compared the performance of our proposed models in homology detection databases. The performance of the different approaches was estimated using fivefold cross-validation techniques. Prediction accuracy was further investigated through confusion matrix and ROC curve analysis. All experimental results indicate that the proposed BacHbpred can be a perspective predictor for determination of HbL related proteins. BacHbpred, a web tool, has been developed for HbL prediction.
Liou, Jyun-you; Smith, Elliot H.; Bateman, Lisa M.; McKhann, Guy M., II; Goodman, Robert R.; Greger, Bradley; Davis, Tyler S.; Kellis, Spencer S.; House, Paul A.; Schevon, Catherine A.
2017-08-01
Objective. Epileptiform discharges, an electrophysiological hallmark of seizures, can propagate across cortical tissue in a manner similar to traveling waves. Recent work has focused attention on the origination and propagation patterns of these discharges, yielding important clues to their source location and mechanism of travel. However, systematic studies of methods for measuring propagation are lacking. Approach. We analyzed epileptiform discharges in microelectrode array recordings of human seizures. The array records multiunit activity and local field potentials at 400 micron spatial resolution, from a small cortical site free of obstructions. We evaluated several computationally efficient statistical methods for calculating traveling wave velocity, benchmarking them to analyses of associated neuronal burst firing. Main results. Over 90% of discharges met statistical criteria for propagation across the sampled cortical territory. Detection rate, direction and speed estimates derived from a multiunit estimator were compared to four field potential-based estimators: negative peak, maximum descent, high gamma power, and cross-correlation. Interestingly, the methods that were computationally simplest and most efficient (negative peak and maximal descent) offer non-inferior results in predicting neuronal traveling wave velocities compared to the other two, more complex methods. Moreover, the negative peak and maximal descent methods proved to be more robust against reduced spatial sampling challenges. Using least absolute deviation in place of least squares error minimized the impact of outliers, and reduced the discrepancies between local field potential-based and multiunit estimators. Significance. Our findings suggest that ictal epileptiform discharges typically take the form of exceptionally strong, rapidly traveling waves, with propagation detectable across millimeter distances. The sequential activation of neurons in space can be inferred from clinically
Energy Technology Data Exchange (ETDEWEB)
Jabr, R.A. [Electrical, Computer and Communication Engineering Department, Notre Dame University, P.O. Box 72, Zouk Mikhael, Zouk Mosbeh (Lebanon)
2006-02-15
This paper presents an implementation of the least absolute value (LAV) power system state estimator based on obtaining a sequence of solutions to the L{sub 1}-regression problem using an iteratively reweighted least squares (IRLS{sub L1}) method. The proposed implementation avoids reformulating the regression problem into standard linear programming (LP) form and consequently does not require the use of common methods of LP, such as those based on the simplex method or interior-point methods. It is shown that the IRLS{sub L1} method is equivalent to solving a sequence of linear weighted least squares (LS) problems. Thus, its implementation presents little additional effort since the sparse LS solver is common to existing LS state estimators. Studies on the termination criteria of the IRLS{sub L1} method have been carried out to determine a procedure for which the proposed estimator is more computationally efficient than a previously proposed non-linear iteratively reweighted least squares (IRLS) estimator. Indeed, it is revealed that the proposed method is a generalization of the previously reported IRLS estimator, but is based on more rigorous theory. (author)
Directory of Open Access Journals (Sweden)
Tamer Khatib
2014-01-01
Full Text Available In this research an improved approach for sizing standalone PV system (SAPV is presented. This work is an improved work developed previously by the authors. The previous work is based on the analytical method which faced some concerns regarding the difficulty of finding the model’s coefficients. Therefore, the proposed approach in this research is based on a combination of an analytical method and a machine learning approach for a generalized artificial neural network (GRNN. The GRNN assists to predict the optimal size of a PV system using the geographical coordinates of the targeted site instead of using mathematical formulas. Employing the GRNN facilitates the use of a previously developed method by the authors and avoids some of its drawbacks. The approach has been tested using data from five Malaysian sites. According to the results, the proposed method can be efficiently used for SAPV sizing whereas the proposed GRNN based model predicts the sizing curves of the PV system accurately with a prediction error of 0.6%. Moreover, hourly meteorological and load demand data are used in this research in order to consider the uncertainty of the solar energy and the load demand.
Comparison of Sparse and Jack-knife partial least squares regression methods for variable selection
DEFF Research Database (Denmark)
Karaman, Ibrahim; Qannari, El Mostafa; Martens, Harald
2013-01-01
The objective of this study was to compare two different techniques of variable selection, Sparse PLSR and Jack-knife PLSR, with respect to their predictive ability and their ability to identify relevant variables. Sparse PLSR is a method that is frequently used in genomics, whereas Jack-knife PL...
Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei
2013-01-01
Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz
2012-01-01
From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was initially used for the first time to select significant sequence parameters in identification of β-turns due to a re-substitution test procedure. Sequence parameters were consisted of 80 amino acid positional occurrences and 20 amino acid percentages in sequence. Among these parameters, the most significant ones which were selected by binary logistic regression model, were percentages of Gly, Ser and the occurrence of Asn in position i+2, respectively, in sequence. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed by the parameters selected by binary logistic regression to build a hybrid predictor. The networks have been trained and tested on a non-homologous dataset of 565 protein chains. With applying a nine fold cross-validation test on the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with results of the other β-turn prediction methods. In conclusion, this study proves that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks lead to the development of more precise models for identifying β-turns in proteins.
Directory of Open Access Journals (Sweden)
Xiaoyan Yang
2018-04-01
Full Text Available The Advanced Spaceborne Thermal-Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM is important to a wide range of geographical and environmental studies. Its accuracy, to some extent associated with land-use types reflecting topography, vegetation coverage, and human activities, impacts the results and conclusions of these studies. In order to improve the accuracy of ASTER GDEM prior to its application, we investigated ASTER GDEM errors based on individual land-use types and proposed two linear regression calibration methods, one considering only land use-specific errors and the other considering the impact of both land-use and topography. Our calibration methods were tested on the coastal prefectural city of Lianyungang in eastern China. Results indicate that (1 ASTER GDEM is highly accurate for rice, wheat, grass and mining lands but less accurate for scenic, garden, wood and bare lands; (2 despite improvements in ASTER GDEM2 accuracy, multiple linear regression calibration requires more data (topography and a relatively complex calibration process; (3 simple linear regression calibration proves a practicable and simplified means to systematically investigate and improve the impact of land-use on ASTER GDEM accuracy. Our method is applicable to areas with detailed land-use data based on highly accurate field-based point-elevation measurements.
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Directory of Open Access Journals (Sweden)
Man Zhu
2017-03-01
Full Text Available Determination of ship maneuvering models is a tough task of ship maneuverability prediction. Among several prime approaches of estimating ship maneuvering models, system identification combined with the full-scale or free- running model test is preferred. In this contribution, real-time system identification programs using recursive identification method, such as the recursive least square method (RLS, are exerted for on-line identification of ship maneuvering models. However, this method seriously depends on the objects of study and initial values of identified parameters. To overcome this, an intelligent technology, i.e., support vector machines (SVM, is firstly used to estimate initial values of the identified parameters with finite samples. As real measured motion data of the Mariner class ship always involve noise from sensors and external disturbances, the zigzag simulation test data include a substantial quantity of Gaussian white noise. Wavelet method and empirical mode decomposition (EMD are used to filter the data corrupted by noise, respectively. The choice of the sample number for SVM to decide initial values of identified parameters is extensively discussed and analyzed. With de-noised motion data as input-output training samples, parameters of ship maneuvering models are estimated using RLS and SVM-RLS, respectively. The comparison between identification results and true values of parameters demonstrates that both the identified ship maneuvering models from RLS and SVM-RLS have reasonable agreements with simulated motions of the ship, and the increment of the sample for SVM positively affects the identification results. Furthermore, SVM-RLS using data de-noised by EMD shows the highest accuracy and best convergence.
Gross, Samuel M; Tibshirani, Robert
2015-04-01
We consider the scenario where one observes an outcome variable and sets of features from multiple assays, all measured on the same set of samples. One approach that has been proposed for dealing with these type of data is "sparse multiple canonical correlation analysis" (sparse mCCA). All of the current sparse mCCA techniques are biconvex and thus have no guarantees about reaching a global optimum. We propose a method for performing sparse supervised canonical correlation analysis (sparse sCCA), a specific case of sparse mCCA when one of the datasets is a vector. Our proposal for sparse sCCA is convex and thus does not face the same difficulties as the other methods. We derive efficient algorithms for this problem that can be implemented with off the shelf solvers, and illustrate their use on simulated and real data. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Predicting metabolic syndrome using decision tree and support vector machine methods
Directory of Open Access Journals (Sweden)
Farzaneh Karimi-Alavijeh
2016-06-01
Full Text Available BACKGROUND: Metabolic syndrome which underlies the increased prevalence of cardiovascular disease and Type 2 diabetes is considered as a group of metabolic abnormalities including central obesity, hypertriglyceridemia, glucose intolerance, hypertension, and dyslipidemia. Recently, artificial intelligence based health-care systems are highly regarded because of its success in diagnosis, prediction, and choice of treatment. This study employs machine learning technics for predict the metabolic syndrome. METHODS: This study aims to employ decision tree and support vector machine (SVM to predict the 7-year incidence of metabolic syndrome. This research is a practical one in which data from 2107 participants of Isfahan Cohort Study has been utilized. The subjects without metabolic syndrome according to the ATPIII criteria were selected. The features that have been used in this data set include: gender, age, weight, body mass index, waist circumference, waist-to-hip ratio, hip circumference, physical activity, smoking, hypertension, antihypertensive medication use, systolic blood pressure (BP, diastolic BP, fasting blood sugar, 2-hour blood glucose, triglycerides (TGs, total cholesterol, low-density lipoprotein, high density lipoprotein-cholesterol, mean corpuscular volume, and mean corpuscular hemoglobin. Metabolic syndrome was diagnosed based on ATPIII criteria and two methods of decision tree and SVM were selected to predict the metabolic syndrome. The criteria of sensitivity, specificity and accuracy were used for validation. RESULTS: SVM and decision tree methods were examined according to the criteria of sensitivity, specificity and accuracy. Sensitivity, specificity and accuracy were 0.774 (0.758, 0.74 (0.72 and 0.757 (0.739 in SVM (decision tree method. CONCLUSION: The results show that SVM method sensitivity, specificity and accuracy is more efficient than decision tree. The results of decision tree method show that the TG is the most
Directory of Open Access Journals (Sweden)
Hukharnsusatrue, A.
2005-11-01
Full Text Available The objective of this research is to compare multiple regression coefficients estimating methods with existence of multicollinearity among independent variables. The estimation methods are Ordinary Least Squares method (OLS, Restricted Least Squares method (RLS, Restricted Ridge Regression method (RRR and Restricted Liu method (RL when restrictions are true and restrictions are not true. The study used the Monte Carlo Simulation method. The experiment was repeated 1,000 times under each situation. The analyzed results of the data are demonstrated as follows. CASE 1: The restrictions are true. In all cases, RRR and RL methods have a smaller Average Mean Square Error (AMSE than OLS and RLS method, respectively. RRR method provides the smallest AMSE when the level of correlations is high and also provides the smallest AMSE for all level of correlations and all sample sizes when standard deviation is equal to 5. However, RL method provides the smallest AMSE when the level of correlations is low and middle, except in the case of standard deviation equal to 3, small sample sizes, RRR method provides the smallest AMSE.The AMSE varies with, most to least, respectively, level of correlations, standard deviation and number of independent variables but inversely with to sample size.CASE 2: The restrictions are not true.In all cases, RRR method provides the smallest AMSE, except in the case of standard deviation equal to 1 and error of restrictions equal to 5%, OLS method provides the smallest AMSE when the level of correlations is low or median and there is a large sample size, but the small sample sizes, RL method provides the smallest AMSE. In addition, when error of restrictions is increased, OLS method provides the smallest AMSE for all level, of correlations and all sample sizes, except when the level of correlations is high and sample sizes small. Moreover, the case OLS method provides the smallest AMSE, the most RLS method has a smaller AMSE than
A computer program for uncertainty analysis integrating regression and Bayesian methods
Lu, Dan; Ye, Ming; Hill, Mary C.; Poeter, Eileen P.; Curtis, Gary
2014-01-01
This work develops a new functionality in UCODE_2014 to evaluate Bayesian credible intervals using the Markov Chain Monte Carlo (MCMC) method. The MCMC capability in UCODE_2014 is based on the FORTRAN version of the differential evolution adaptive Metropolis (DREAM) algorithm of Vrugt et al. (2009), which estimates the posterior probability density function of model parameters in high-dimensional and multimodal sampling problems. The UCODE MCMC capability provides eleven prior probability distributions and three ways to initialize the sampling process. It evaluates parametric and predictive uncertainties and it has parallel computing capability based on multiple chains to accelerate the sampling process. This paper tests and demonstrates the MCMC capability using a 10-dimensional multimodal mathematical function, a 100-dimensional Gaussian function, and a groundwater reactive transport model. The use of the MCMC capability is made straightforward and flexible by adopting the JUPITER API protocol. With the new MCMC capability, UCODE_2014 can be used to calculate three types of uncertainty intervals, which all can account for prior information: (1) linear confidence intervals which require linearity and Gaussian error assumptions and typically 10s–100s of highly parallelizable model runs after optimization, (2) nonlinear confidence intervals which require a smooth objective function surface and Gaussian observation error assumptions and typically 100s–1,000s of partially parallelizable model runs after optimization, and (3) MCMC Bayesian credible intervals which require few assumptions and commonly 10,000s–100,000s or more partially parallelizable model runs. Ready access allows users to select methods best suited to their work, and to compare methods in many circumstances.
Consistency analysis of subspace identification methods based on a linear regression approach
DEFF Research Database (Denmark)
Knudsen, Torben
2001-01-01
In the literature results can be found which claim consistency for the subspace method under certain quite weak assumptions. Unfortunately, a new result gives a counter example showing inconsistency under these assumptions and then gives new more strict sufficient assumptions which however does n...... not include important model structures as e.g. Box-Jenkins. Based on a simple least squares approach this paper shows the possible inconsistency under the weak assumptions and develops only slightly stricter assumptions sufficient for consistency and which includes any model structure...
International Nuclear Information System (INIS)
Sambou, Soussou
2004-01-01
In flood forecasting modelling, large basins are often considered as hydrological systems with multiple inputs and one output. Inputs are hydrological variables such rainfall, runoff and physical characteristics of basin; output is runoff. Relating inputs to output can be achieved using deterministic, conceptual, or stochastic models. Rainfall runoff models generally lack of accuracy. Physical hydrological processes based models, either deterministic or conceptual are highly data requirement demanding and by the way very complex. Stochastic multiple input-output models, using only historical chronicles of hydrological variables particularly runoff are by the way very popular among the hydrologists for large river basin flood forecasting. Application is made on the Senegal River upstream of Bakel, where the River is formed by the main branch, Bafing, and two tributaries, Bakoye and Faleme; Bafing being regulated by Manantaly Dam. A three inputs and one output model has been used for flood forecasting on Bakel. Influence of the lead forecasting, and of the three inputs taken separately, then associated two by two, and altogether has been verified using a dimensionless variance as criterion of quality. Inadequacies occur generally between model output and observations; to put model in better compliance with current observations, we have compared four parameter updating procedure, recursive least squares, Kalman filtering, stochastic gradient method, iterative method, and an AR errors forecasting model. A combination of these model updating have been used in real time flood forecasting.(Author)
High cycle fatigue test and regression methods of S-N curve
International Nuclear Information System (INIS)
Kim, D. W.; Park, J. Y.; Kim, W. G.; Yoon, J. H.
2011-11-01
The fatigue design curve in the ASME Boiler and Pressure Vessel Code Section III are based on the assumption that fatigue life is infinite after 106 cycles. This is because standard fatigue testing equipment prior to the past decades was limited in speed to less than 200 cycles per second. Traditional servo-hydraulic machines work at frequency of 50 Hz. Servo-hydraulic machines working at 1000 Hz have been developed after 1997. This machines allow high frequency and displacement of up to ±0.1 mm and dynamic load of ±20 kN are guaranteed. The frequency of resonant fatigue test machine is 50-250 Hz. Various forced vibration-based system works at 500 Hz or 1.8 kHz. Rotating bending machines allow testing frequency at 0.1-200 Hz. The main advantage of ultrasonic fatigue testing at 20 kHz is performing Although S-N curve is determined by experiment, the fatigue strength corresponding to a given fatigue life should be determined by statistical method considering the scatter of fatigue properties. In this report, the statistical methods for evaluation of fatigue test data is investigated
Computer-aided event tree analysis by the impact vector method
International Nuclear Information System (INIS)
Lima, J.E.P.
1984-01-01
In the development of the Probabilistic Risk Analysis of Angra I, the ' large event tree/small fault tree' approach was adopted for the analysis of the plant behavior in an emergency situation. In this work, the event tree methodology is presented along with the adaptations which had to be made in order to attain a correct description of the safety system performances according to the selected analysis method. The problems appearing in the application of the methodology and their respective solutions are presented and discussed, with special emphasis to the impact vector technique. A description of the ETAP code ('Event Tree Analysis Program') developed for constructing and quantifying event trees is also given in this work. A preliminary version of the small-break LOCA analysis for Angra 1 is presented as an example of application of the methodology and of the code. It is shown that the use of the ETAP code sigmnificantly contributes to decreasing the time spent in event tree analyses, making it viable the practical application of the analysis approach referred above. (author) [pt
International Nuclear Information System (INIS)
Zappe, D.
1983-01-01
For safety analyses of underground repositories for radioactive wastes various possible release scenarios have to be defind and anticipated consequences to be calculated and compared. Normally only the main exposure pathways (i.e. the critical pathways) of the radionuclides disposed of in the repository are calculated using deterministic methods and varying the parameters. It is proposed to evaluate all the individual pathways including those differing considerably from the critical pathway by forming weighted averages of their consequences. This offers the possibility of including, without any restriction, in the evaluation of the repository the various possible events and processes that influence the function of barriers for the retention of radionuclides. Various states (scenarios) of a repository in a salt formation, which might occur in the course of time have been used as an example. The consequences related to these states and the probabilities of their occurrence or the scenario weights form the components of 'status vectors'. For low- and intermediate-level wastes the overall consequences obtained from these calculations are negligibly small, for high-level wastes they are about 3 x 10 - 5 Sv a - 1 /GW a. These values are reached if at least a part of the barriers is effective. Variations of the weighting factors for the states and their influence on the overall consequences are given. (author)
DEFF Research Database (Denmark)
Minarik, David; Senneby, Martin; Wollmer, Per
2015-01-01
Background The interpretation of myocardial perfusion scintigraphy (MPS) largely relies on visual assessment by the physician of the localization and extent of a perfusion defect. The aim of this study was to introduce the concept of the perfusion vector as a new objective quantitative method...
Predicting metabolic syndrome using decision tree and support vector machine methods.
Karimi-Alavijeh, Farzaneh; Jalili, Saeed; Sadeghi, Masoumeh
2016-05-01
Metabolic syndrome which underlies the increased prevalence of cardiovascular disease and Type 2 diabetes is considered as a group of metabolic abnormalities including central obesity, hypertriglyceridemia, glucose intolerance, hypertension, and dyslipidemia. Recently, artificial intelligence based health-care systems are highly regarded because of its success in diagnosis, prediction, and choice of treatment. This study employs machine learning technics for predict the metabolic syndrome. This study aims to employ decision tree and support vector machine (SVM) to predict the 7-year incidence of metabolic syndrome. This research is a practical one in which data from 2107 participants of Isfahan Cohort Study has been utilized. The subjects without metabolic syndrome according to the ATPIII criteria were selected. The features that have been used in this data set include: gender, age, weight, body mass index, waist circumference, waist-to-hip ratio, hip circumference, physical activity, smoking, hypertension, antihypertensive medication use, systolic blood pressure (BP), diastolic BP, fasting blood sugar, 2-hour blood glucose, triglycerides (TGs), total cholesterol, low-density lipoprotein, high density lipoprotein-cholesterol, mean corpuscular volume, and mean corpuscular hemoglobin. Metabolic syndrome was diagnosed based on ATPIII criteria and two methods of decision tree and SVM were selected to predict the metabolic syndrome. The criteria of sensitivity, specificity and accuracy were used for validation. SVM and decision tree methods were examined according to the criteria of sensitivity, specificity and accuracy. Sensitivity, specificity and accuracy were 0.774 (0.758), 0.74 (0.72) and 0.757 (0.739) in SVM (decision tree) method. The results show that SVM method sensitivity, specificity and accuracy is more efficient than decision tree. The results of decision tree method show that the TG is the most important feature in predicting metabolic syndrome. According
Parallel/vector algorithms for the spherical SN transport theory method
International Nuclear Information System (INIS)
Haghighat, A.; Mattis, R.E.
1990-01-01
This paper discusses vector and parallel processing of a 1-D curvilinear (i.e. spherical) S N transport theory algorithm on the Cornell National SuperComputer Facility (CNSF) IBM 3090/600E. Two different vector algorithms were developed and parallelized based on angular decomposition. It is shown that significant speedups are attainable. For example, for problems with large granularity, using 4 processors, the parallel/vector algorithm achieves speedups (for wall-clock time) of more than 4.5 relative to the old serial/scalar algorithm. Furthermore, this work has demonstrated the existing potential for the development of faster processing vector and parallel algorithms for multidimensional curvilinear geometries. (author)
Variable selection methods in PLS regression - a comparison study on metabolomics data
DEFF Research Database (Denmark)
Karaman, İbrahim; Hedemann, Mette Skou; Knudsen, Knud Erik Bach
. The aim of the metabolomics study was to investigate the metabolic profile in pigs fed various cereal fractions with special attention to the metabolism of lignans using LC-MS based metabolomic approach. References 1. Lê Cao KA, Rossouw D, Robert-Granié C, Besse P: A Sparse PLS for Variable Selection when...... integrated approach. Due to the high number of variables in data sets (both raw data and after peak picking) the selection of important variables in an explorative analysis is difficult, especially when different data sets of metabolomics data need to be related. Variable selection (or removal of irrelevant...... different strategies for variable selection on PLSR method were considered and compared with respect to selected subset of variables and the possibility for biological validation. Sparse PLSR [1] as well as PLSR with Jack-knifing [2] was applied to data in order to achieve variable selection prior...
Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat
2015-01-01
Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p) - often used in physiologically-based pharmacokinetics. Therefore, this work has assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular, the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. Decision tree based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical AbstractDecision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.
Dai, Huanping; Micheyl, Christophe
2012-11-01
Psychophysical "reverse-correlation" methods allow researchers to gain insight into the perceptual representations and decision weighting strategies of individual subjects in perceptual tasks. Although these methods have gained momentum, until recently their development was limited to experiments involving only two response categories. Recently, two approaches for estimating decision weights in m-alternative experiments have been put forward. One approach extends the two-category correlation method to m > 2 alternatives; the second uses multinomial logistic regression (MLR). In this article, the relative merits of the two methods are discussed, and the issues of convergence and statistical efficiency of the methods are evaluated quantitatively using Monte Carlo simulations. The results indicate that, for a range of values of the number of trials, the estimated weighting patterns are closer to their asymptotic values for the correlation method than for the MLR method. Moreover, for the MLR method, weight estimates for different stimulus components can exhibit strong correlations, making the analysis and interpretation of measured weighting patterns less straightforward than for the correlation method. These and other advantages of the correlation method, which include computational simplicity and a close relationship to other well-established psychophysical reverse-correlation methods, make it an attractive tool to uncover decision strategies in m-alternative experiments.
Zhao, Yu Xi; Xie, Ping; Sang, Yan Fang; Wu, Zi Yi
2018-04-01
Hydrological process evaluation is temporal dependent. Hydrological time series including dependence components do not meet the data consistency assumption for hydrological computation. Both of those factors cause great difficulty for water researches. Given the existence of hydrological dependence variability, we proposed a correlationcoefficient-based method for significance evaluation of hydrological dependence based on auto-regression model. By calculating the correlation coefficient between the original series and its dependence component and selecting reasonable thresholds of correlation coefficient, this method divided significance degree of dependence into no variability, weak variability, mid variability, strong variability, and drastic variability. By deducing the relationship between correlation coefficient and auto-correlation coefficient in each order of series, we found that the correlation coefficient was mainly determined by the magnitude of auto-correlation coefficient from the 1 order to p order, which clarified the theoretical basis of this method. With the first-order and second-order auto-regression models as examples, the reasonability of the deduced formula was verified through Monte-Carlo experiments to classify the relationship between correlation coefficient and auto-correlation coefficient. This method was used to analyze three observed hydrological time series. The results indicated the coexistence of stochastic and dependence characteristics in hydrological process.
Coskuntuncel, Orkun
2013-01-01
The purpose of this study is two-fold; the first aim being to show the effect of outliers on the widely used least squares regression estimator in social sciences. The second aim is to compare the classical method of least squares with the robust M-estimator using the "determination of coefficient" (R[superscript 2]). For this purpose,…
International Nuclear Information System (INIS)
Jiang Bin; Zhang Yejing; Wang Yufei; Liu Anjin; Zheng Wanhua
2012-01-01
We present the extended Dirichlet-to-Neumann wave vector eigenvalue equation (DtN-WVEE) method to calculate the equi-frequency contour (EFC) of square lattice photonic crystals (PhCs). With the extended DtN-WVEE method and Snell's law, the effective refractive index of the mode with a circular EFC can be obtained, which is further validated with the refractive index weighted by the electric field or magnetic field. To further verify the EFC calculated by the DtN-WVEE method, the finite-difference time-domain method is also used. Compared with other wave vector eigenvalue equation methods that calculate EFC directly, the size of the eigenmatrix used in the DtN-WVEE method is much smaller, and the computation time is significantly reduced. Since the DtN-WVEE method solves wave vectors for given arbitrary frequencies, it can also find applications in studying the optical properties of a PhC with dispersive, lossy and magnetic materials. (paper)
Krzyżek, Robert
2015-12-01
The paper deals with the problem of surveying buildings in the RTN GNSS mode using modernized indirect methods of measurement. As a result of the classical realtime measurements using indirect methods (intersection of straight lines or a point on a straight line), we obtain a building structure (a building) which is largely deformed. This distortion is due to the inconsistency of the actual dimensions of the building (tie distances) relative to the obtained measurement results. In order to eliminate these discrepancies, and thus to ensure full consistency of the building geometric structure, an innovative solution was applied - the method of vector addition - to modify the linear values (tie distances) of the external face of the building walls. A separate research problem tackled in the article, although not yet fully solved, is the issue of coordinates of corners of a building obtained after the application of the method of vector addition.
Stone, Christopher P.; Alferman, Andrew T.; Niemeyer, Kyle E.
2018-05-01
multithreaded CPU code; however, this was significantly slower than the SIMD versions on the host CPU or the Xeon Phi. The performance difference between the three platforms was attributed to thread divergence caused by the adaptive step-sizes within the ODE integrators. Analysis showed that the wider vector width of the GPU incurs a higher level of divergence than the narrower Sandy Bridge or Xeon Phi. The significant performance improvement provided by the SIMD parallel strategy motivates further research into more ODE solver methods that are both SIMD-friendly and computationally efficient.
Killing vector fields in three dimensions: a method to solve massive gravity field equations
Energy Technology Data Exchange (ETDEWEB)
Guerses, Metin, E-mail: gurses@fen.bilkent.edu.t [Department of Mathematics, Faculty of Sciences, Bilkent University, 06800 Ankara (Turkey)
2010-10-21
Killing vector fields in three dimensions play an important role in the construction of the related spacetime geometry. In this work we show that when a three-dimensional geometry admits a Killing vector field then the Ricci tensor of the geometry is determined in terms of the Killing vector field and its scalars. In this way we can generate all products and covariant derivatives at any order of the Ricci tensor. Using this property we give ways to solve the field equations of topologically massive gravity (TMG) and new massive gravity (NMG) introduced recently. In particular when the scalars of the Killing vector field (timelike, spacelike and null cases) are constants then all three-dimensional symmetric tensors of the geometry, the Ricci and Einstein tensors, their covariant derivatives at all orders, and their products of all orders are completely determined by the Killing vector field and the metric. Hence, the corresponding three-dimensional metrics are strong candidates for solving all higher derivative gravitational field equations in three dimensions.
Secmen, Mustafa
2011-10-01
This paper introduces the performance of an electromagnetic target recognition method in resonance scattering region, which includes pseudo spectrum Multiple Signal Classification (MUSIC) algorithm and principal component analysis (PCA) technique. The aim of this method is to classify an "unknown" target as one of the "known" targets in an aspect-independent manner. The suggested method initially collects the late-time portion of noise-free time-scattered signals obtained from different reference aspect angles of known targets. Afterward, these signals are used to obtain MUSIC spectrums in real frequency domain having super-resolution ability and noise resistant feature. In the final step, PCA technique is applied to these spectrums in order to reduce dimensionality and obtain only one feature vector per known target. In the decision stage, noise-free or noisy scattered signal of an unknown (test) target from an unknown aspect angle is initially obtained. Subsequently, MUSIC algorithm is processed for this test signal and resulting test vector is compared with feature vectors of known targets one by one. Finally, the highest correlation gives the type of test target. The method is applied to wire models of airplane targets, and it is shown that it can tolerate considerable noise levels although it has a few different reference aspect angles. Besides, the runtime of the method for a test target is sufficiently low, which makes the method suitable for real-time applications.
A novel vector-based method for exclusive overexpression of star-form microRNAs.
Directory of Open Access Journals (Sweden)
Bo Qu
Full Text Available The roles of microRNAs (miRNAs as important regulators of gene expression have been studied intensively. Although most of these investigations have involved the highly expressed form of the two mature miRNA species, increasing evidence points to essential roles for star-form microRNAs (miRNA*, which are usually expressed at much lower levels. Owing to the nature of miRNA biogenesis, it is challenging to use plasmids containing miRNA coding sequences for gain-of-function experiments concerning the roles of microRNA* species. Synthetic microRNA mimics could introduce specific miRNA* species into cells, but this transient overexpression system has many shortcomings. Here, we report that specific miRNA* species can be overexpressed by introducing artificially designed stem-loop sequences into short hairpin RNA (shRNA overexpression vectors. By our prototypic plasmid, designed to overexpress hsa-miR-146b-3p, we successfully expressed high levels of hsa-miR-146b-3p without detectable change of hsa-miR-146b-5p. Functional analysis involving luciferase reporter assays showed that, like natural miRNAs, the overexpressed hsa-miR-146b-3p inhibited target gene expression by 3'UTR seed pairing. Our demonstration that this method could overexpress two other miRNAs suggests that the approach should be broadly applicable. Our novel strategy opens the way for exclusively stable overexpression of miRNA* species and analyzing their unique functions both in vitro and in vivo.
Robinson, Gilbert de B
2011-01-01
This brief undergraduate-level text by a prominent Cambridge-educated mathematician explores the relationship between algebra and geometry. An elementary course in plane geometry is the sole requirement for Gilbert de B. Robinson's text, which is the result of several years of teaching and learning the most effective methods from discussions with students. Topics include lines and planes, determinants and linear equations, matrices, groups and linear transformations, and vectors and vector spaces. Additional subjects range from conics and quadrics to homogeneous coordinates and projective geom
Ogawa, Kazuhisa; Kobayashi, Hirokazu; Tomita, Akihisa
2018-02-01
The quantum interference of entangled photons forms a key phenomenon underlying various quantum-optical technologies. It is known that the quantum interference patterns of entangled photon pairs can be reconstructed classically by the time-reversal method; however, the time-reversal method has been applied only to time-frequency-entangled two-photon systems in previous experiments. Here, we apply the time-reversal method to the position-wave-vector-entangled two-photon systems: the two-photon Young interferometer and the two-photon beam focusing system. We experimentally demonstrate that the time-reversed systems classically reconstruct the same interference patterns as the position-wave-vector-entangled two-photon systems.
DEFF Research Database (Denmark)
Le, T.H.A.; Pham, D. T.; Canh, Nam Nguyen
2010-01-01
Both the efficient and weakly efficient sets of an affine fractional vector optimization problem, in general, are neither convex nor given explicitly. Optimization problems over one of these sets are thus nonconvex. We propose two methods for optimizing a real-valued function over the efficient...... and weakly efficient sets of an affine fractional vector optimization problem. The first method is a local one. By using a regularization function, we reformulate the problem into a standard smooth mathematical programming problem that allows applying available methods for smooth programming. In case...... the objective function is linear, we have investigated a global algorithm based upon a branch-and-bound procedure. The algorithm uses Lagrangian bound coupling with a simplicial bisection in the criteria space. Preliminary computational results show that the global algorithm is promising....
Directory of Open Access Journals (Sweden)
Huiling Qiu
Full Text Available DNA vector-encoded Tough Decoy (TuD miRNA inhibitor is attracting increased attention due to its high efficiency in miRNA suppression. The current methods used to construct TuD vectors are based on synthesizing long oligonucleotides (~90 mer, which have been costly and problematic because of mutations during synthesis. In this study, we report a PCR-based method for the generation of double Tough Decoy (dTuD vector in which only two sets of shorter oligonucleotides (< 60 mer were used. Different approaches were employed to test the inhibitory potency of dTuDs. We demonstrated that dTuD is the most efficient method in miRNA inhibition in vitro and in vivo. Using this method, a mini dTuD library against 88 human miRNAs was constructed and used for a high-throughput screening (HTS of AP-1 pathway-related miRNAs. Seven miRNAs (miR-18b-5p, -101-3p, -148b-3p, -130b-3p, -186-3p, -187-3p and -1324 were identified as candidates involved in AP-1 pathway regulation. This novel method allows for an accurate and cost-effective generation of dTuD miRNA inhibitor, providing a powerful tool for efficient miRNA suppression in vitro and in vivo.
Fatekurohman, Mohamat; Nurmala, Nita; Anggraeni, Dian
2018-04-01
Lungs are the most important organ, in the case of respiratory system. Problems related to disorder of the lungs are various, i.e. pneumonia, emphysema, tuberculosis and lung cancer. Comparing all those problems, lung cancer is the most harmful. Considering about that, the aim of this research applies survival analysis and factors affecting the endurance of the lung cancer patient using comparison of exact, Efron and Breslow parameter approach method on hazard ratio and stratified cox regression model. The data applied are based on the medical records of lung cancer patients in Jember Paru-paru hospital on 2016, east java, Indonesia. The factors affecting the endurance of the lung cancer patients can be classified into several criteria, i.e. sex, age, hemoglobin, leukocytes, erythrocytes, sedimentation rate of blood, therapy status, general condition, body weight. The result shows that exact method of stratified cox regression model is better than other. On the other hand, the endurance of the patients is affected by their age and the general conditions.
International Nuclear Information System (INIS)
Asayama, Tai
2003-03-01
For the commercialization of fast breeder reactors, 'System Based Code', a completely new scheme of a code on structural integrity, is being developed. One of the distinguished features of the System Based Code is that it is able to determine a reasonable total margin on a structural of system, by allowing the exchanges of margins between various technical items. Detailed estimation of failure probability of a given combination of technical items and its comparison with a target value is one way to achieve this. However, simpler and easier methods that allow margin exchange without detailed calculation of failure probability are desirable in design. The authors have developed a simplified method such as a 'design factor method' from this viewpoint. This report describes a 'Vector Method', which was been newly developed. Following points are reported: 1) The Vector Method allows margin exchange evaluation on an 'equi-quality assurance plane' using vector calculation. Evaluation is easy and sufficient accuracy is achieved. The equi-quality assurance plane is obtained by a projection of an 'equi-failure probability surface in a n-dimensional space, which is calculated beforehand for typical combinations of design variables. 2) The Vector Method is considered to give the 'Quality Assurance Index Method' a probabilistic interpretation. 3) An algebraic method was proposed for the calculation of failure probabilities, which is necessary to obtain a equi-failure probability surface. This method calculates failure probabilities without using numerical methods such as Monte Carlo simulation or numerical integration. Under limited conditions, this method is quite effective compared to numerical methods. 4) An illustration of the procedure of margin exchange evaluation is given. It may be possible to use this method to optimize ISI plans; even it is not fully implemented in the System Based Code. (author)