Reduction of the number of parameters needed for a polynomial random regression test-day model
Pool, M.H.; Meuwissen, T.H.E.
2000-01-01
Legendre polynomials were used to describe the (co)variance matrix within a random regression test day model. The goodness of fit depended on the polynomial order of fit, i.e., number of parameters to be estimated per animal but is limited by computing capacity. Two aspects: incomplete lactation
Polynomial regression analysis and significance test of the regression function
International Nuclear Information System (INIS)
Gao Zhengming; Zhao Juan; He Shengping
2012-01-01
In order to analyze the decay heating power of a certain radioactive isotope per kilogram with polynomial regression method, the paper firstly demonstrated the broad usage of polynomial function and deduced its parameters with ordinary least squares estimate. Then significance test method of polynomial regression function is derived considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and significance test of the polynomial function are done to the decay heating power of the iso tope per kilogram in accord with the authors' real work. (authors)
Bruno, Delia Evelina; Barca, Emanuele; Goncalves, Rodrigo Mikosz; de Araujo Queiroz, Heithor Alexandre; Berardi, Luigi; Passarella, Giuseppe
2018-01-01
In this paper, the Evolutionary Polynomial Regression data modelling strategy has been applied to study small scale, short-term coastal morphodynamics, given its capability for treating a wide database of known information, non-linearly. Simple linear and multilinear regression models were also applied to achieve a balance between the computational load and reliability of estimations of the three models. In fact, even though it is easy to imagine that the more complex the model, the more the prediction improves, sometimes a "slight" worsening of estimations can be accepted in exchange for the time saved in data organization and computational load. The models' outcomes were validated through a detailed statistical, error analysis, which revealed a slightly better estimation of the polynomial model with respect to the multilinear model, as expected. On the other hand, even though the data organization was identical for the two models, the multilinear one required a simpler simulation setting and a faster run time. Finally, the most reliable evolutionary polynomial regression model was used in order to make some conjecture about the uncertainty increase with the extension of extrapolation time of the estimation. The overlapping rate between the confidence band of the mean of the known coast position and the prediction band of the estimated position can be a good index of the weakness in producing reliable estimations when the extrapolation time increases too much. The proposed models and tests have been applied to a coastal sector located nearby Torre Colimena in the Apulia region, south Italy.
A method for fitting regression splines with varying polynomial order in the linear mixed model.
Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W
2006-02-15
The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.
Modeling Source Water TOC Using Hydroclimate Variables and Local Polynomial Regression.
Samson, Carleigh C; Rajagopalan, Balaji; Summers, R Scott
2016-04-19
To control disinfection byproduct (DBP) formation in drinking water, an understanding of the source water total organic carbon (TOC) concentration variability can be critical. Previously, TOC concentrations in water treatment plant source waters have been modeled using streamflow data. However, the lack of streamflow data or unimpaired flow scenarios makes it difficult to model TOC. In addition, TOC variability under climate change further exacerbates the problem. Here we proposed a modeling approach based on local polynomial regression that uses climate, e.g. temperature, and land surface, e.g., soil moisture, variables as predictors of TOC concentration, obviating the need for streamflow. The local polynomial approach has the ability to capture non-Gaussian and nonlinear features that might be present in the relationships. The utility of the methodology is demonstrated using source water quality and climate data in three case study locations with surface source waters including river and reservoir sources. The models show good predictive skill in general at these locations, with lower skills at locations with the most anthropogenic influences in their streams. Source water TOC predictive models can provide water treatment utilities important information for making treatment decisions for DBP regulation compliance under future climate scenarios.
DEFF Research Database (Denmark)
Xu, Man; Pinson, Pierre; Lu, Zongxiang
2016-01-01
of the lack of time adaptivity. In this paper, a refined local polynomial regression algorithm is proposed to yield an adaptive robust model of the time-varying scattered power curve for forecasting applications. The time adaptivity of the algorithm is considered with a new data-driven bandwidth selection......Wind farm power curve modeling, which characterizes the relationship between meteorological variables and power production, is a crucial procedure for wind power forecasting. In many cases, power curve modeling is more impacted by the limited quality of input data rather than the stochastic nature...... of the energy conversion process. Such nature may be due the varying wind conditions, aging and state of the turbines, etc. And, an equivalent steady-state power curve, estimated under normal operating conditions with the intention to filter abnormal data, is not sufficient to solve the problem because...
transformation of independent variables in polynomial regression ...
African Journals Online (AJOL)
Ada
preferable when possible to work with a simple functional form in transformed variables rather than with a more complicated form in the original variables. In this paper, it is shown that linear transformations applied to independent variables in polynomial regression models affect the t ratio and hence the statistical ...
Directory of Open Access Journals (Sweden)
Liyun Su
2012-01-01
Full Text Available We introduce the extension of local polynomial fitting to the linear heteroscedastic regression model. Firstly, the local polynomial fitting is applied to estimate heteroscedastic function, then the coefficients of regression model are obtained by using generalized least squares method. One noteworthy feature of our approach is that we avoid the testing for heteroscedasticity by improving the traditional two-stage method. Due to nonparametric technique of local polynomial estimation, we do not need to know the heteroscedastic function. Therefore, we can improve the estimation precision, when the heteroscedastic function is unknown. Furthermore, we focus on comparison of parameters and reach an optimal fitting. Besides, we verify the asymptotic normality of parameters based on numerical simulations. Finally, this approach is applied to a case of economics, and it indicates that our method is surely effective in finite-sample situations.
Directory of Open Access Journals (Sweden)
Maria Gabriela Campolina Diniz Peixoto
2014-05-01
Full Text Available The objective of this work was to compare random regression models for the estimation of genetic parameters for Guzerat milk production, using orthogonal Legendre polynomials. Records (20,524 of test-day milk yield (TDMY from 2,816 first-lactation Guzerat cows were used. TDMY grouped into 10-monthly classes were analyzed for additive genetic effect and for environmental and residual permanent effects (random effects, whereas the contemporary group, calving age (linear and quadratic effects and mean lactation curve were analized as fixed effects. Trajectories for the additive genetic and permanent environmental effects were modeled by means of a covariance function employing orthogonal Legendre polynomials ranging from the second to the fifth order. Residual variances were considered in one, four, six, or ten variance classes. The best model had six residual variance classes. The heritability estimates for the TDMY records varied from 0.19 to 0.32. The random regression model that used a second-order Legendre polynomial for the additive genetic effect, and a fifth-order polynomial for the permanent environmental effect is adequate for comparison by the main employed criteria. The model with a second-order Legendre polynomial for the additive genetic effect, and that with a fourth-order for the permanent environmental effect could also be employed in these analyses.
Salleh, Nur Hanim Mohd; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Saad, Ahmad Ramli; Sulaiman, Husna Mahirah; Ahmad, Wan Muhamad Amir W.
2014-07-01
Polynomial regression is used to model a curvilinear relationship between a response variable and one or more predictor variables. It is a form of a least squares linear regression model that predicts a single response variable by decomposing the predictor variables into an nth order polynomial. In a curvilinear relationship, each curve has a number of extreme points equal to the highest order term in the polynomial. A quadratic model will have either a single maximum or minimum, whereas a cubic model has both a relative maximum and a minimum. This study used quadratic modeling techniques to analyze the effects of environmental factors: temperature, relative humidity, and rainfall distribution on the breeding of Aedes albopictus, a type of Aedes mosquito. Data were collected at an urban area in south-west Penang from September 2010 until January 2011. The results indicated that the breeding of Aedes albopictus in the urban area is influenced by all three environmental characteristics. The number of mosquito eggs is estimated to reach a maximum value at a medium temperature, a medium relative humidity and a high rainfall distribution.
Climate Impacts on Chinese Corn Yields: A Fractional Polynomial Regression Model
Kooten, van G.C.; Sun, Baojing
2012-01-01
In this study, we examine the effect of climate on corn yields in northern China using data from ten districts in Inner Mongolia and two in Shaanxi province. A regression model with a flexible functional form is specified, with explanatory variables that include seasonal growing degree days,
Function approximation with polynomial regression slines
International Nuclear Information System (INIS)
Urbanski, P.
1996-01-01
Principles of the polynomial regression splines as well as algorithms and programs for their computation are presented. The programs prepared using software package MATLAB are generally intended for approximation of the X-ray spectra and can be applied in the multivariate calibration of radiometric gauges. (author)
Multi-step polynomial regression method to model and forecast malaria incidence.
Directory of Open Access Journals (Sweden)
Chandrajit Chatterjee
Full Text Available Malaria is one of the most severe problems faced by the world even today. Understanding the causative factors such as age, sex, social factors, environmental variability etc. as well as underlying transmission dynamics of the disease is important for epidemiological research on malaria and its eradication. Thus, development of suitable modeling approach and methodology, based on the available data on the incidence of the disease and other related factors is of utmost importance. In this study, we developed a simple non-linear regression methodology in modeling and forecasting malaria incidence in Chennai city, India, and predicted future disease incidence with high confidence level. We considered three types of data to develop the regression methodology: a longer time series data of Slide Positivity Rates (SPR of malaria; a smaller time series data (deaths due to Plasmodium vivax of one year; and spatial data (zonal distribution of P. vivax deaths for the city along with the climatic factors, population and previous incidence of the disease. We performed variable selection by simple correlation study, identification of the initial relationship between variables through non-linear curve fitting and used multi-step methods for induction of variables in the non-linear regression analysis along with applied Gauss-Markov models, and ANOVA for testing the prediction, validity and constructing the confidence intervals. The results execute the applicability of our method for different types of data, the autoregressive nature of forecasting, and show high prediction power for both SPR and P. vivax deaths, where the one-lag SPR values plays an influential role and proves useful for better prediction. Different climatic factors are identified as playing crucial role on shaping the disease curve. Further, disease incidence at zonal level and the effect of causative factors on different zonal clusters indicate the pattern of malaria prevalence in the city
Multivariate Local Polynomial Regression with Application to Shenzhen Component Index
Directory of Open Access Journals (Sweden)
Liyun Su
2011-01-01
Full Text Available This study attempts to characterize and predict stock index series in Shenzhen stock market using the concepts of multivariate local polynomial regression. Based on nonlinearity and chaos of the stock index time series, multivariate local polynomial prediction methods and univariate local polynomial prediction method, all of which use the concept of phase space reconstruction according to Takens' Theorem, are considered. To fit the stock index series, the single series changes into bivariate series. To evaluate the results, the multivariate predictor for bivariate time series based on multivariate local polynomial model is compared with univariate predictor with the same Shenzhen stock index data. The numerical results obtained by Shenzhen component index show that the prediction mean squared error of the multivariate predictor is much smaller than the univariate one and is much better than the existed three methods. Even if the last half of the training data are used in the multivariate predictor, the prediction mean squared error is smaller than the univariate predictor. Multivariate local polynomial prediction model for nonsingle time series is a useful tool for stock market price prediction.
REGSTEP - stepwise multivariate polynomial regression with singular extensions
International Nuclear Information System (INIS)
Davierwalla, D.M.
1977-09-01
The program REGSTEP determines a polynomial approximation, in the least squares sense, to tabulated data. The polynomial may be univariate or multivariate. The computational method is that of stepwise regression. A variable is inserted into the regression basis if it is significant with respect to an appropriate F-test at a preselected risk level. In addition, should a variable already in the basis, become nonsignificant (again with respect to an appropriate F-test) after the entry of a new variable, it is expelled from the model. Thus only significant variables are retained in the model. Although written expressly to be incorporated into CORCOD, a code for predicting nuclear cross sections for given values of power, temperature, void fractions, Boron content etc. there is nothing to limit the use of REGSTEP to nuclear applications, as the examples demonstrate. A separate version has been incorporated into RSYST for the general user. (Auth.)
Pestaña-Melero, Francisco Luis; Haff, G Gregory; Rojas, Francisco Javier; Pérez-Castilla, Alejandro; García-Ramos, Amador
2017-12-18
This study aimed to compare the between-session reliability of the load-velocity relationship between (1) linear vs. polynomial regression models, (2) concentric-only vs. eccentric-concentric bench press variants, as well as (3) the within-participants vs. the between-participants variability of the velocity attained at each percentage of the one-repetition maximum (%1RM). The load-velocity relationship of 30 men (age: 21.2±3.8 y; height: 1.78±0.07 m, body mass: 72.3±7.3 kg; bench press 1RM: 78.8±13.2 kg) were evaluated by means of linear and polynomial regression models in the concentric-only and eccentric-concentric bench press variants in a Smith Machine. Two sessions were performed with each bench press variant. The main findings were: (1) first-order-polynomials (CV: 4.39%-4.70%) provided the load-velocity relationship with higher reliability than second-order-polynomials (CV: 4.68%-5.04%); (2) the reliability of the load-velocity relationship did not differ between the concentric-only and eccentric-concentric bench press variants; (3) the within-participants variability of the velocity attained at each %1RM was markedly lower than the between-participants variability. Taken together, these results highlight that, regardless of the bench press variant considered, the individual determination of the load-velocity relationship by a linear regression model could be recommended to monitor and prescribe the relative load in the Smith machine bench press exercise.
On the estimation of the degree of regression polynomial
International Nuclear Information System (INIS)
Toeroek, Cs.
1997-01-01
The mathematical functions most commonly used to model curvature in plots are polynomials. Generally, the higher the degree of the polynomial, the more complex is the trend that its graph can represent. We propose a new statistical-graphical approach based on the discrete projective transformation (DPT) to estimating the degree of polynomial that adequately describes the trend in the plot
Czech Academy of Sciences Publication Activity Database
Knížek, J.; Tichý, Petr; Beránek, L.; Šindelář, Jan; Vojtěšek, B.; Bouchal, P.; Nenutil, R.; Dedík, O.
2010-01-01
Roč. 7, č. 10 (2010), s. 48-60 ISSN 0974-5718 Grant - others:GA MZd(CZ) NS9812; GA ČR(CZ) GAP304/10/0868 Institutional research plan: CEZ:AV0Z10300504; CEZ:AV0Z10750506 Keywords : polynomial regression * orthogonalization * numerical methods * markers * biomarkers Subject RIV: BA - General Mathematics
Implementing fuzzy polynomial interpolation (FPI and fuzzy linear regression (LFR
Directory of Open Access Journals (Sweden)
Maria Cristina Floreno
1996-05-01
Full Text Available This paper presents some preliminary results arising within a general framework concerning the development of software tools for fuzzy arithmetic. The program is in a preliminary stage. What has been already implemented consists of a set of routines for elementary operations, optimized functions evaluation, interpolation and regression. Some of these have been applied to real problems.This paper describes a prototype of a library in C++ for polynomial interpolation of fuzzifying functions, a set of routines in FORTRAN for fuzzy linear regression and a program with graphical user interface allowing the use of such routines.
Optimum short-time polynomial regression for signal analysis
Indian Academy of Sciences (India)
A Sreenivasa Murthy
the Proceedings of European Signal Processing Conference. (EUSIPCO) 2008. ... In a seminal paper, Savitzky and Golay [4] showed that short-time polynomial modeling is ...... We next consider a linearly frequency-modulated chirp with an exponentially .... 1 http://www.physionet.org/physiotools/matlab/ECGwaveGen/.
Directory of Open Access Journals (Sweden)
Bangyong Sun
2014-01-01
Full Text Available The polynomial regression method is employed to calculate the relationship of device color space and CIE color space for color characterization, and the performance of different expressions with specific parameters is evaluated. Firstly, the polynomial equation for color conversion is established and the computation of polynomial coefficients is analysed. And then different forms of polynomial equations are used to calculate the RGB and CMYK’s CIE color values, while the corresponding color errors are compared. At last, an optimal polynomial expression is obtained by analysing several related parameters during color conversion, including polynomial numbers, the degree of polynomial terms, the selection of CIE visual spaces, and the linearization.
International Nuclear Information System (INIS)
Alvarez Huerta, A.; Gonzalez Miguelez, R.; Garcia Metola, D.; Noriega Gonzalez, A.
2011-01-01
The modelization is carried out through two different techniques, a conventional polynomial regression and other based on an approach by neural networks artificial. He is a comparison between the quality of the forecast would make different models based on the polynomial regression and neural network with generalization by Bayesian regulation, using the indicators of the root of the mean square error and the coefficient of determination, in view of the results, the neural network generates a prediction more accurate and reliable than the polynomial regression.
Ding, A Adam; Wu, Hulin
2014-10-01
We propose a new method to use a constrained local polynomial regression to estimate the unknown parameters in ordinary differential equation models with a goal of improving the smoothing-based two-stage pseudo-least squares estimate. The equation constraints are derived from the differential equation model and are incorporated into the local polynomial regression in order to estimate the unknown parameters in the differential equation model. We also derive the asymptotic bias and variance of the proposed estimator. Our simulation studies show that our new estimator is clearly better than the pseudo-least squares estimator in estimation accuracy with a small price of computational cost. An application example on immune cell kinetics and trafficking for influenza infection further illustrates the benefits of the proposed new method.
Considering a non-polynomial basis for local kernel regression problem
Silalahi, Divo Dharma; Midi, Habshah
2017-01-01
A common used as solution for local kernel nonparametric regression problem is given using polynomial regression. In this study, we demonstrated the estimator and properties using maximum likelihood estimator for a non-polynomial basis such B-spline to replacing the polynomial basis. This estimator allows for flexibility in the selection of a bandwidth and a knot. The best estimator was selected by finding an optimal bandwidth and knot through minimizing the famous generalized validation function.
Energy Technology Data Exchange (ETDEWEB)
Lopez Fontan, J.L.; Costa, J.; Ruso, J.M.; Prieto, G. [Dept. of Applied Physics, Univ. of Santiago de Compostela, Santiago de Compostela (Spain); Sarmiento, F. [Dept. of Mathematics, Faculty of Informatics, Univ. of A Coruna, A Coruna (Spain)
2004-02-01
The application of a statistical method, the local polynomial regression method, (LPRM), based on a nonparametric estimation of the regression function to determine the critical micelle concentration (cmc) is presented. The method is extremely flexible because it does not impose any parametric model on the subjacent structure of the data but rather allows the data to speak for themselves. Good concordance of cmc values with those obtained by other methods was found for systems in which the variation of a measured physical property with concentration showed an abrupt change. When this variation was slow, discrepancies between the values obtained by LPRM and others methods were found. (orig.)
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-01
From direct observations, facial, vocal, gestural, physiological, and central nervous signals, estimating human affective states through computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural network, have been proposed in the past decade. In these models, linear models are generally lack of precision because of ignoring intrinsic nonlinearities of complex psychophysiological processes; and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify model, we introduce a new computational modeling method named as higher-order multivariable polynomial regression to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method is able to obtain efficient correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide certain indirect evidences that valence and arousal have their brain’s motivational circuit origins. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states.
Neck curve polynomials in neck rupture model
International Nuclear Information System (INIS)
Kurniadi, Rizal; Perkasa, Yudha S.; Waris, Abdul
2012-01-01
The Neck Rupture Model is a model that explains the scission process which has smallest radius in liquid drop at certain position. Old fashion of rupture position is determined randomly so that has been called as Random Neck Rupture Model (RNRM). The neck curve polynomials have been employed in the Neck Rupture Model for calculation the fission yield of neutron induced fission reaction of 280 X 90 with changing of order of polynomials as well as temperature. The neck curve polynomials approximation shows the important effects in shaping of fission yield curve.
Vertex models, TASEP and Grothendieck polynomials
International Nuclear Information System (INIS)
Motegi, Kohei; Sakai, Kazumitsu
2013-01-01
We examine the wavefunctions and their scalar products of a one-parameter family of integrable five-vertex models. At a special point of the parameter, the model investigated is related to an irreversible interacting stochastic particle system—the so-called totally asymmetric simple exclusion process (TASEP). By combining the quantum inverse scattering method with a matrix product representation of the wavefunctions, the on-/off-shell wavefunctions of the five-vertex models are represented as a certain determinant form. Up to some normalization factors, we find that the wavefunctions are given by Grothendieck polynomials, which are a one-parameter deformation of Schur polynomials. Introducing a dual version of the Grothendieck polynomials, and utilizing the determinant representation for the scalar products of the wavefunctions, we derive a generalized Cauchy identity satisfied by the Grothendieck polynomials and their duals. Several representation theoretical formulae for the Grothendieck polynomials are also presented. As a byproduct, the relaxation dynamics such as Green functions for the periodic TASEP are found to be described in terms of the Grothendieck polynomials. (paper)
On weighted and locally polynomial directional quantile regression
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2017-01-01
Roč. 32, č. 3 (2017), s. 929-946 ISSN 0943-4062 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : Quantile regression * Nonparametric regression * Nonparametric regression Subject RIV: IN - Informatics, Computer Science OBOR OECD: Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8) Impact factor: 0.434, year: 2016 http://library.utia.cas.cz/separaty/2017/SI/bocek-0458380.pdf
Quadratic Polynomial Regression using Serial Observation Processing:Implementation within DART
Hodyss, D.; Anderson, J. L.; Collins, N.; Campbell, W. F.; Reinecke, P. A.
2017-12-01
Many Ensemble-Based Kalman ltering (EBKF) algorithms process the observations serially. Serial observation processing views the data assimilation process as an iterative sequence of scalar update equations. What is useful about this data assimilation algorithm is that it has very low memory requirements and does not need complex methods to perform the typical high-dimensional inverse calculation of many other algorithms. Recently, the push has been towards the prediction, and therefore the assimilation of observations, for regions and phenomena for which high-resolution is required and/or highly nonlinear physical processes are operating. For these situations, a basic hypothesis is that the use of the EBKF is sub-optimal and performance gains could be achieved by accounting for aspects of the non-Gaussianty. To this end, we develop here a new component of the Data Assimilation Research Testbed [DART] to allow for a wide-variety of users to test this hypothesis. This new version of DART allows one to run several variants of the EBKF as well as several variants of the quadratic polynomial lter using the same forecast model and observations. Dierences between the results of the two systems will then highlight the degree of non-Gaussianity in the system being examined. We will illustrate in this work the differences between the performance of linear versus quadratic polynomial regression in a hierarchy of models from Lorenz-63 to a simple general circulation model.
Kang, Seung-Wan; Byun, Gukdo; Park, Hun-Joon
2014-12-01
This paper presents empirical research into the relationship between leader-follower value congruence in social responsibility and the level of ethical satisfaction for employees in the workplace. 163 dyads were analyzed, each consisting of a team leader and an employee working at a large manufacturing company in South Korea. Following current methodological recommendations for congruence research, polynomial regression and response surface modeling methodologies were used to determine the effects of value congruence. Results indicate that leader-follower value congruence in social responsibility was positively related to the ethical satisfaction of employees. Furthermore, employees' ethical satisfaction was stronger when aligned with a leader with high social responsibility. The theoretical and practical implications are discussed.
Stability analysis of polynomial fuzzy models via polynomial fuzzy Lyapunov functions
Bernal Reza, Miguel Ángel; Sala, Antonio; JAADARI, ABDELHAFIDH; Guerra, Thierry-Marie
2011-01-01
In this paper, the stability of continuous-time polynomial fuzzy models by means of a polynomial generalization of fuzzy Lyapunov functions is studied. Fuzzy Lyapunov functions have been fruitfully used in the literature for local analysis of Takagi-Sugeno models, a particular class of the polynomial fuzzy ones. Based on a recent Taylor-series approach which allows a polynomial fuzzy model to exactly represent a nonlinear model in a compact set of the state space, it is shown that a refinemen...
Cohen, Ayala; Nahum-Shani, Inbal; Doveh, Etti
2010-01-01
In their seminal paper, Edwards and Parry (1993) presented the polynomial regression as a better alternative to applying difference score in the study of congruence. Although this method is increasingly applied in congruence research, its complexity relative to other methods for assessing congruence (e.g., difference score methods) was one of the…
Dillon, Paul; Phillips, L Alison; Gallagher, Paul; Smith, Susan M; Stewart, Derek; Cousins, Gráinne
2018-02-05
The Necessity-Concerns Framework (NCF) is a multidimensional theory describing the relationship between patients' positive and negative evaluations of their medication which interplay to influence adherence. Most studies evaluating the NCF have failed to account for the multidimensional nature of the theory, placing the separate dimensions of medication "necessity beliefs" and "concerns" onto a single dimension (e.g., the Beliefs about Medicines Questionnaire-difference score model). To assess the multidimensional effect of patient medication beliefs (concerns and necessity beliefs) on medication adherence using polynomial regression with response surface analysis. Community-dwelling older adults >65 years (n = 1,211) presenting their own prescription for antihypertensive medication to 106 community pharmacies in the Republic of Ireland rated their concerns and necessity beliefs to antihypertensive medications at baseline and their adherence to antihypertensive medication at 12 months via structured telephone interview. Confirmatory polynomial regression found the difference-score model to be inaccurate; subsequent exploratory analysis identified a quadratic model to be the best-fitting polynomial model. Adherence was lowest among those with strong medication concerns and weak necessity beliefs, and adherence was greatest for those with weak concerns and strong necessity beliefs (slope β = -0.77, pnecessity beliefs had lower adherence than those with simultaneously low concerns and necessity beliefs (slope β = -0.36, p = .004; curvature β = -0.25, p = .003). The difference-score model fails to account for the potential nonreciprocal effects. Results extend evidence supporting the use of polynomial regression to assess the multidimensional effect of medication beliefs on adherence.
Polynomial fuzzy model-based approach for underactuated surface vessels
DEFF Research Database (Denmark)
Khooban, Mohammad Hassan; Vafamand, Navid; Dragicevic, Tomislav
2018-01-01
The main goal of this study is to introduce a new polynomial fuzzy model-based structure for a class of marine systems with non-linear and polynomial dynamics. The suggested technique relies on a polynomial Takagi–Sugeno (T–S) fuzzy modelling, a polynomial dynamic parallel distributed compensation...... surface vessel (USV). Additionally, in order to overcome the USV control challenges, including the USV un-modelled dynamics, complex nonlinear dynamics, external disturbances and parameter uncertainties, the polynomial fuzzy model representation is adopted. Moreover, the USV-based control structure...... and a sum-of-squares (SOS) decomposition. The new proposed approach is a generalisation of the standard T–S fuzzy models and linear matrix inequality which indicated its effectiveness in decreasing the tracking time and increasing the efficiency of the robust tracking control problem for an underactuated...
Adaptive regression for modeling nonlinear relationships
Knafl, George J
2016-01-01
This book presents methods for investigating whether relationships are linear or nonlinear and for adaptively fitting appropriate models when they are nonlinear. Data analysts will learn how to incorporate nonlinearity in one or more predictor variables into regression models for different types of outcome variables. Such nonlinear dependence is often not considered in applied research, yet nonlinear relationships are common and so need to be addressed. A standard linear analysis can produce misleading conclusions, while a nonlinear analysis can provide novel insights into data, not otherwise possible. A variety of examples of the benefits of modeling nonlinear relationships are presented throughout the book. Methods are covered using what are called fractional polynomials based on real-valued power transformations of primary predictor variables combined with model selection based on likelihood cross-validation. The book covers how to formulate and conduct such adaptive fractional polynomial modeling in the s...
Energy Technology Data Exchange (ETDEWEB)
Alvarez Huerta, A.; Gonzalez Miguelez, R.; Garcia Metola, D.; Noriega Gonzalez, A.
2011-07-01
The modelization is carried out through two different techniques, a conventional polynomial regression and other based on an approach by neural networks artificial. He is a comparison between the quality of the forecast would make different models based on the polynomial regression and neural network with generalization by Bayesian regulation, using the indicators of the root of the mean square error and the coefficient of determination, in view of the results, the neural network generates a prediction more accurate and reliable than the polynomial regression.
International Nuclear Information System (INIS)
Herda, Trent J; Ryan, Eric D; Costa, Pablo B; DeFreitas, Jason M; Walter, Ashley A; Stout, Jeffrey R; Beck, Travis W; Cramer, Joel T; Housh, Terry J; Weir, Joseph P
2009-01-01
The primary purpose of this study was to examine the consistency of ordinary least-squares (OLS) and generalized least-squares (GLS) polynomial regression analyses utilizing linear, quadratic and cubic models on either five or ten data points that characterize the mechanomyographic amplitude (MMG RMS ) versus isometric torque relationship. The secondary purpose was to examine the consistency of OLS and GLS polynomial regression utilizing only linear and quadratic models (excluding cubic responses) on either ten or five data points. Eighteen participants (mean ± SD age = 24 ± 4 yr) completed ten randomly ordered isometric step muscle actions from 5% to 95% of the maximal voluntary contraction (MVC) of the right leg extensors during three separate trials. MMG RMS was recorded from the vastus lateralis during the MVCs and each submaximal muscle action. MMG RMS versus torque relationships were analyzed on a subject-by-subject basis using OLS and GLS polynomial regression. When using ten data points, only 33% and 27% of the subjects were fitted with the same model (utilizing linear, quadratic and cubic models) across all three trials for OLS and GLS, respectively. After eliminating the cubic model, there was an increase to 55% of the subjects being fitted with the same model across all trials for both OLS and GLS regression. Using only five data points (instead of ten data points), 55% of the subjects were fitted with the same model across all trials for OLS and GLS regression. Overall, OLS and GLS polynomial regression models were only able to consistently describe the torque-related patterns of response for MMG RMS in 27–55% of the subjects across three trials. Future studies should examine alternative methods for improving the consistency and reliability of the patterns of response for the MMG RMS versus isometric torque relationship
Automatic Control Systems Modeling by Volterra Polynomials
Directory of Open Access Journals (Sweden)
S. V. Solodusha
2012-01-01
Full Text Available The problem of the existence of the solutions of polynomial Volterra integral equations of the first kind of the second degree is considered. An algorithm of the numerical solution of one class of Volterra nonlinear systems of the first kind is developed. Numerical results for test examples are presented.
A robust and efficient stepwise regression method for building sparse polynomial chaos expansions
Energy Technology Data Exchange (ETDEWEB)
Abraham, Simon, E-mail: Simon.Abraham@ulb.ac.be [Vrije Universiteit Brussel (VUB), Department of Mechanical Engineering, Research Group Fluid Mechanics and Thermodynamics, Pleinlaan 2, 1050 Brussels (Belgium); Raisee, Mehrdad [School of Mechanical Engineering, College of Engineering, University of Tehran, P.O. Box: 11155-4563, Tehran (Iran, Islamic Republic of); Ghorbaniasl, Ghader; Contino, Francesco; Lacor, Chris [Vrije Universiteit Brussel (VUB), Department of Mechanical Engineering, Research Group Fluid Mechanics and Thermodynamics, Pleinlaan 2, 1050 Brussels (Belgium)
2017-03-01
Polynomial Chaos (PC) expansions are widely used in various engineering fields for quantifying uncertainties arising from uncertain parameters. The computational cost of classical PC solution schemes is unaffordable as the number of deterministic simulations to be calculated grows dramatically with the number of stochastic dimension. This considerably restricts the practical use of PC at the industrial level. A common approach to address such problems is to make use of sparse PC expansions. This paper presents a non-intrusive regression-based method for building sparse PC expansions. The most important PC contributions are detected sequentially through an automatic search procedure. The variable selection criterion is based on efficient tools relevant to probabilistic method. Two benchmark analytical functions are used to validate the proposed algorithm. The computational efficiency of the method is then illustrated by a more realistic CFD application, consisting of the non-deterministic flow around a transonic airfoil subject to geometrical uncertainties. To assess the performance of the developed methodology, a detailed comparison is made with the well established LAR-based selection technique. The results show that the developed sparse regression technique is able to identify the most significant PC contributions describing the problem. Moreover, the most important stochastic features are captured at a reduced computational cost compared to the LAR method. The results also demonstrate the superior robustness of the method by repeating the analyses using random experimental designs.
Polynomial algebra of discrete models in systems biology.
Veliz-Cuba, Alan; Jarrah, Abdul Salam; Laubenbacher, Reinhard
2010-07-01
An increasing number of discrete mathematical models are being published in Systems Biology, ranging from Boolean network models to logical models and Petri nets. They are used to model a variety of biochemical networks, such as metabolic networks, gene regulatory networks and signal transduction networks. There is increasing evidence that such models can capture key dynamic features of biological networks and can be used successfully for hypothesis generation. This article provides a unified framework that can aid the mathematical analysis of Boolean network models, logical models and Petri nets. They can be represented as polynomial dynamical systems, which allows the use of a variety of mathematical tools from computer algebra for their analysis. Algorithms are presented for the translation into polynomial dynamical systems. Examples are given of how polynomial algebra can be used for the model analysis. alanavc@vt.edu Supplementary data are available at Bioinformatics online.
Identification of Super Phenix steam generator by a simple polynomial model
International Nuclear Information System (INIS)
Rousseau, I.
1981-01-01
This note suggests a method of identification for the steam generator of the Super-Phenix fast neutron power plant for simple polynomial models. This approach is justified in the selection of the adaptive control. The identification algorithms presented will be applied to multivariable input-output behaviours. The results obtained with the representation in self-regressive form and by simple polynomial models will be compared and the effect of perturbations on the output signal will be tested, in order to select a good identification algorithm for multivariable adaptive regulation [fr
Directory of Open Access Journals (Sweden)
Liang Yan
2018-03-01
Full Text Available In this paper, we examine two widely-used approaches, the polynomial chaos expansion (PCE and Gaussian process (GP regression, for the development of surrogate models. The theoretical differences between the PCE and GP approximations are discussed. A state-of-the-art PCE approach is constructed based on high precision quadrature points; however, the need for truncation may result in potential precision loss; the GP approach performs well on small datasets and allows a fine and precise trade-off between fitting the data and smoothing, but its overall performance depends largely on the training dataset. The reproducing kernel Hilbert space (RKHS and Mercer’s theorem are introduced to form a linkage between the two methods. The theorem has proven that the two surrogates can be embedded in two isomorphic RKHS, by which we propose a novel method named Gaussian process on polynomial chaos basis (GPCB that incorporates the PCE and GP. A theoretical comparison is made between the PCE and GPCB with the help of the Kullback–Leibler divergence. We present that the GPCB is as stable and accurate as the PCE method. Furthermore, the GPCB is a one-step Bayesian method that chooses the best subset of RKHS in which the true function should lie, while the PCE method requires an adaptive procedure. Simulations of 1D and 2D benchmark functions show that GPCB outperforms both the PCE and classical GP methods. In order to solve high dimensional problems, a random sample scheme with a constructive design (i.e., tensor product of quadrature points is proposed to generate a valid training dataset for the GPCB method. This approach utilizes the nature of the high numerical accuracy underlying the quadrature points while ensuring the computational feasibility. Finally, the experimental results show that our sample strategy has a higher accuracy than classical experimental designs; meanwhile, it is suitable for solving high dimensional problems.
Hilbe, Joseph M
2009-01-01
This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect-great clarity.The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … .-Annette J. Dobson, Biometric...
Polynomial constitutive model for shape memory and pseudo elasticity
International Nuclear Information System (INIS)
Savi, M.A.; Kouzak, Z.
1995-01-01
This paper reports an one-dimensional phenomenological constitutive model for shape memory and pseudo elasticity using a polynomial expression for the free energy which is based on the classical Devonshire theory. This study identifies the main characteristics of the classical theory and introduces a simple modification to obtain better results. (author). 9 refs., 6 figs
(Non) linear regression modelling
Cizek, P.; Gentle, J.E.; Hardle, W.K.; Mori, Y.
2012-01-01
We will study causal relationships of a known form between random variables. Given a model, we distinguish one or more dependent (endogenous) variables Y = (Y1,…,Yl), l ∈ N, which are explained by a model, and independent (exogenous, explanatory) variables X = (X1,…,Xp),p ∈ N, which explain or
Polynomial model inversion control: numerical tests and applications
Novara, Carlo
2015-01-01
A novel control design approach for general nonlinear systems is described in this paper. The approach is based on the identification of a polynomial model of the system to control and on the on-line inversion of this model. Extensive simulations are carried out to test the numerical efficiency of the approach. Numerical examples of applicative interest are presented, concerned with control of the Duffing oscillator, control of a robot manipulator and insulin regulation in a type 1 diabetic p...
Panel Smooth Transition Regression Models
DEFF Research Database (Denmark)
González, Andrés; Terasvirta, Timo; Dijk, Dick van
We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...
Sparse estimation of polynomial dynamical models
Toth, R.; Hjalmarsson, H.; Rojas, C.R.; Kinnaert, M.
2012-01-01
In many practical situations, it is highly desirable to estimate an accurate mathematical model of a real system using as few parameters as possible. This can be motivated either from appealing to a parsimony principle (Occam's razor) or from the view point of the utilization complexity in terms of
International Nuclear Information System (INIS)
Jafri, Y.Z.; Kamal, L.
2007-01-01
Various statistical techniques was used on five-year data from 1998-2002 of average humidity, rainfall, maximum and minimum temperatures, respectively. The relationships to regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit, to our polynomial regression analysis time series (PRATS). The correlation to multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rand correlation and Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, respectively which yielded homoscedasticity. We also employed Bartlett's test for homogeneity of variances on a five-year data of rainfall and humidity, respectively which showed that the variances in rainfall data were not homogenous while in case of humidity, were homogenous. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)
New realisation of Preisach model using adaptive polynomial approximation
Liu, Van-Tsai; Lin, Chun-Liang; Wing, Home-Young
2012-09-01
Modelling system with hysteresis has received considerable attention recently due to the increasing accurate requirement in engineering applications. The classical Preisach model (CPM) is the most popular model to demonstrate hysteresis which can be represented by infinite but countable first-order reversal curves (FORCs). The usage of look-up tables is one way to approach the CPM in actual practice. The data in those tables correspond with the samples of a finite number of FORCs. This approach, however, faces two major problems: firstly, it requires a large amount of memory space to obtain an accurate prediction of hysteresis; secondly, it is difficult to derive efficient ways to modify the data table to reflect the timing effect of elements with hysteresis. To overcome, this article proposes the idea of using a set of polynomials to emulate the CPM instead of table look-up. The polynomial approximation requires less memory space for data storage. Furthermore, the polynomial coefficients can be obtained accurately by using the least-square approximation or adaptive identification algorithm, such as the possibility of accurate tracking of hysteresis model parameters.
Zernike polynomial based Rayleigh-Ritz model of a piezoelectric unimorph deformable mirror
CSIR Research Space (South Africa)
Long, CS
2012-04-01
Full Text Available , are routinely and conveniently described using Zernike polynomials. A Rayleigh-Ritz structural model, which uses Zernike polynomials directly to describe the displacements, is proposed in this paper. The proposed formulation produces a numerically inexpensive...
Genetic evaluation of European quails by random regression models
Directory of Open Access Journals (Sweden)
Flaviana Miranda Gonçalves
2012-09-01
Full Text Available The objective of this study was to compare different random regression models, defined from different classes of heterogeneity of variance combined with different Legendre polynomial orders for the estimate of (covariance of quails. The data came from 28,076 observations of 4,507 female meat quails of the LF1 lineage. Quail body weights were determined at birth and 1, 14, 21, 28, 35 and 42 days of age. Six different classes of residual variance were fitted to Legendre polynomial functions (orders ranging from 2 to 6 to determine which model had the best fit to describe the (covariance structures as a function of time. According to the evaluated criteria (AIC, BIC and LRT, the model with six classes of residual variances and of sixth-order Legendre polynomial was the best fit. The estimated additive genetic variance increased from birth to 28 days of age, and dropped slightly from 35 to 42 days. The heritability estimates decreased along the growth curve and changed from 0.51 (1 day to 0.16 (42 days. Animal genetic and permanent environmental correlation estimates between weights and age classes were always high and positive, except for birth weight. The sixth order Legendre polynomial, along with the residual variance divided into six classes was the best fit for the growth rate curve of meat quails; therefore, they should be considered for breeding evaluation processes by random regression models.
Forecasting with Dynamic Regression Models
Pankratz, Alan
2012-01-01
One of the most widely used tools in statistical forecasting, single equation regression models is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the auto correlation patterns of regression disturbance. It also includes six case studies.
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study gives attention to indicators in predictive power of the Generalized Linear Model (GLM) which are widely used; however, often having some restrictions. We are interested in regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, and defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was modifying regression correlation coefficient for Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and having multicollinearity in independent variables. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient based on Bias and the Root Mean Square Error (RMSE).
Energy Technology Data Exchange (ETDEWEB)
Konakli, Katerina, E-mail: konakli@ibk.baug.ethz.ch; Sudret, Bruno
2016-09-15
The growing need for uncertainty analysis of complex computational models has led to an expanding use of meta-models across engineering and sciences. The efficiency of meta-modeling techniques relies on their ability to provide statistically-equivalent analytical representations based on relatively few evaluations of the original model. Polynomial chaos expansions (PCE) have proven a powerful tool for developing meta-models in a wide range of applications; the key idea thereof is to expand the model response onto a basis made of multivariate polynomials obtained as tensor products of appropriate univariate polynomials. The classical PCE approach nevertheless faces the “curse of dimensionality”, namely the exponential increase of the basis size with increasing input dimension. To address this limitation, the sparse PCE technique has been proposed, in which the expansion is carried out on only a few relevant basis terms that are automatically selected by a suitable algorithm. An alternative for developing meta-models with polynomial functions in high-dimensional problems is offered by the newly emerged low-rank approximations (LRA) approach. By exploiting the tensor–product structure of the multivariate basis, LRA can provide polynomial representations in highly compressed formats. Through extensive numerical investigations, we herein first shed light on issues relating to the construction of canonical LRA with a particular greedy algorithm involving a sequential updating of the polynomial coefficients along separate dimensions. Specifically, we examine the selection of optimal rank, stopping criteria in the updating of the polynomial coefficients and error estimation. In the sequel, we confront canonical LRA to sparse PCE in structural-mechanics and heat-conduction applications based on finite-element solutions. Canonical LRA exhibit smaller errors than sparse PCE in cases when the number of available model evaluations is small with respect to the input
International Nuclear Information System (INIS)
Konakli, Katerina; Sudret, Bruno
2016-01-01
The growing need for uncertainty analysis of complex computational models has led to an expanding use of meta-models across engineering and sciences. The efficiency of meta-modeling techniques relies on their ability to provide statistically-equivalent analytical representations based on relatively few evaluations of the original model. Polynomial chaos expansions (PCE) have proven a powerful tool for developing meta-models in a wide range of applications; the key idea thereof is to expand the model response onto a basis made of multivariate polynomials obtained as tensor products of appropriate univariate polynomials. The classical PCE approach nevertheless faces the “curse of dimensionality”, namely the exponential increase of the basis size with increasing input dimension. To address this limitation, the sparse PCE technique has been proposed, in which the expansion is carried out on only a few relevant basis terms that are automatically selected by a suitable algorithm. An alternative for developing meta-models with polynomial functions in high-dimensional problems is offered by the newly emerged low-rank approximations (LRA) approach. By exploiting the tensor–product structure of the multivariate basis, LRA can provide polynomial representations in highly compressed formats. Through extensive numerical investigations, we herein first shed light on issues relating to the construction of canonical LRA with a particular greedy algorithm involving a sequential updating of the polynomial coefficients along separate dimensions. Specifically, we examine the selection of optimal rank, stopping criteria in the updating of the polynomial coefficients and error estimation. In the sequel, we confront canonical LRA to sparse PCE in structural-mechanics and heat-conduction applications based on finite-element solutions. Canonical LRA exhibit smaller errors than sparse PCE in cases when the number of available model evaluations is small with respect to the input
Polynomial Chaos Expansion Approach to Interest Rate Models
Directory of Open Access Journals (Sweden)
Luca Di Persio
2015-01-01
Full Text Available The Polynomial Chaos Expansion (PCE technique allows us to recover a finite second-order random variable exploiting suitable linear combinations of orthogonal polynomials which are functions of a given stochastic quantity ξ, hence acting as a kind of random basis. The PCE methodology has been developed as a mathematically rigorous Uncertainty Quantification (UQ method which aims at providing reliable numerical estimates for some uncertain physical quantities defining the dynamic of certain engineering models and their related simulations. In the present paper, we use the PCE approach in order to analyze some equity and interest rate models. In particular, we take into consideration those models which are based on, for example, the Geometric Brownian Motion, the Vasicek model, and the CIR model. We present theoretical as well as related concrete numerical approximation results considering, without loss of generality, the one-dimensional case. We also provide both an efficiency study and an accuracy study of our approach by comparing its outputs with the ones obtained adopting the Monte Carlo approach, both in its standard and its enhanced version.
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Meloun, Milan; Hill, Martin; Vceláková-Havlíková, Helena
2009-01-01
Pregnenolone sulfate (PregS) is known as a steroid conjugate positively modulating N-methyl-D-aspartate receptors on neuronal membranes. These receptors are responsible for permeability of calcium channels and activation of neuronal function. Neuroactivating effect of PregS is also exerted via non-competitive negative modulation of GABA(A) receptors regulating the chloride influx. Recently, a penetrability of blood-brain barrier for PregS was found in rat, but some experiments in agreement with this finding were reported even earlier. It is known that circulating levels of PregS in human are relatively high depending primarily on age and adrenal activity. Concerning the neuromodulating effect of PregS, we recently evaluated age relationships of PregS in both sexes using polynomial regression models known to bring about the problems of multicollinearity, i.e., strong correlations among independent variables. Several criteria for the selection of suitable bias are demonstrated. Biased estimators based on the generalized principal component regression (GPCR) method avoiding multicollinearity problems are described. Significant differences were found between men and women in the course of the age dependence of PregS. In women, a significant maximum was found around the 30th year followed by a rapid decline, while the maximum in men was achieved almost 10 years earlier and changes were minor up to the 60th year. The investigation of gender differences and age dependencies in PregS could be of interest given its well-known neurostimulating effect, relatively high serum concentration, and the probable partial permeability of the blood-brain barrier for the steroid conjugate. GPCR in combination with the MEP (mean quadric error of prediction) criterion is extremely useful and appealing for constructing biased models. It can also be used for achieving such estimates with regard to keeping the model course corresponding to the data trend, especially in polynomial type
Nelemans, S A; Branje, S J T; Hale, W W; Goossens, L; Koot, H M; Oldehinkel, A J; Meeus, W H J
2016-10-01
Adolescence is a critical period for the development of depressive symptoms. Lower quality of the parent-adolescent relationship has been consistently associated with higher adolescent depressive symptoms, but discrepancies in perceptions of parents and adolescents regarding the quality of their relationship may be particularly important to consider. In the present study, we therefore examined how discrepancies in parents' and adolescents' perceptions of the parent-adolescent relationship were associated with early adolescent depressive symptoms, both concurrently and longitudinally over a 1-year period. Our sample consisted of 497 Dutch adolescents (57 % boys, M age = 13.03 years), residing in the western and central regions of the Netherlands, and their mothers and fathers, who all completed several questionnaires on two occasions with a 1-year interval. Adolescents reported on depressive symptoms and all informants reported on levels of negative interaction in the parent-adolescent relationship. Results from polynomial regression analyses including interaction terms between informants' perceptions, which have recently been proposed as more valid tests of hypotheses involving informant discrepancies than difference scores, suggested the highest adolescent depressive symptoms when both the mother and the adolescent reported high negative interaction, and when the adolescent reported high but the father reported low negative interaction. This pattern of findings underscores the need for a more sophisticated methodology such as polynomial regression analysis including tests of moderation, rather than the use of difference scores, which can adequately address both congruence and discrepancies in perceptions of adolescents and mothers/fathers of the parent-adolescent relationship in detail. Such an analysis can contribute to a more comprehensive understanding of risk factors for early adolescent depressive symptoms.
Regression Models for Repairable Systems
Czech Academy of Sciences Publication Activity Database
Novák, Petr
2015-01-01
Roč. 17, č. 4 (2015), s. 963-972 ISSN 1387-5841 Institutional support: RVO:67985556 Keywords : Reliability analysis * Repair models * Regression Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.782, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/novak-0450902.pdf
Energy Technology Data Exchange (ETDEWEB)
Kersaudy, Pierric, E-mail: pierric.kersaudy@orange.com [Orange Labs, 38 avenue du Général Leclerc, 92130 Issy-les-Moulineaux (France); Whist Lab, 38 avenue du Général Leclerc, 92130 Issy-les-Moulineaux (France); ESYCOM, Université Paris-Est Marne-la-Vallée, 5 boulevard Descartes, 77700 Marne-la-Vallée (France); Sudret, Bruno [ETH Zürich, Chair of Risk, Safety and Uncertainty Quantification, Stefano-Franscini-Platz 5, 8093 Zürich (Switzerland); Varsier, Nadège [Orange Labs, 38 avenue du Général Leclerc, 92130 Issy-les-Moulineaux (France); Whist Lab, 38 avenue du Général Leclerc, 92130 Issy-les-Moulineaux (France); Picon, Odile [ESYCOM, Université Paris-Est Marne-la-Vallée, 5 boulevard Descartes, 77700 Marne-la-Vallée (France); Wiart, Joe [Orange Labs, 38 avenue du Général Leclerc, 92130 Issy-les-Moulineaux (France); Whist Lab, 38 avenue du Général Leclerc, 92130 Issy-les-Moulineaux (France)
2015-04-01
In numerical dosimetry, the recent advances in high performance computing led to a strong reduction of the required computational time to assess the specific absorption rate (SAR) characterizing the human exposure to electromagnetic waves. However, this procedure remains time-consuming and a single simulation can request several hours. As a consequence, the influence of uncertain input parameters on the SAR cannot be analyzed using crude Monte Carlo simulation. The solution presented here to perform such an analysis is surrogate modeling. This paper proposes a novel approach to build such a surrogate model from a design of experiments. Considering a sparse representation of the polynomial chaos expansions using least-angle regression as a selection algorithm to retain the most influential polynomials, this paper proposes to use the selected polynomials as regression functions for the universal Kriging model. The leave-one-out cross validation is used to select the optimal number of polynomials in the deterministic part of the Kriging model. The proposed approach, called LARS-Kriging-PC modeling, is applied to three benchmark examples and then to a full-scale metamodeling problem involving the exposure of a numerical fetus model to a femtocell device. The performances of the LARS-Kriging-PC are compared to an ordinary Kriging model and to a classical sparse polynomial chaos expansion. The LARS-Kriging-PC appears to have better performances than the two other approaches. A significant accuracy improvement is observed compared to the ordinary Kriging or to the sparse polynomial chaos depending on the studied case. This approach seems to be an optimal solution between the two other classical approaches. A global sensitivity analysis is finally performed on the LARS-Kriging-PC model of the fetus exposure problem.
Piecewise Polynomial Aggregation as Preprocessing for Data Numerical Modeling
Dobronets, B. S.; Popova, O. A.
2018-05-01
Data aggregation issues for numerical modeling are reviewed in the present study. The authors discuss data aggregation procedures as preprocessing for subsequent numerical modeling. To calculate the data aggregation, the authors propose using numerical probabilistic analysis (NPA). An important feature of this study is how the authors represent the aggregated data. The study shows that the offered approach to data aggregation can be interpreted as the frequency distribution of a variable. To study its properties, the density function is used. For this purpose, the authors propose using the piecewise polynomial models. A suitable example of such approach is the spline. The authors show that their approach to data aggregation allows reducing the level of data uncertainty and significantly increasing the efficiency of numerical calculations. To demonstrate the degree of the correspondence of the proposed methods to reality, the authors developed a theoretical framework and considered numerical examples devoted to time series aggregation.
Hamed Kharrati; Sohrab Khanmohammadi; Witold Pedrycz; Ghasem Alizadeh
2012-01-01
This study presents an improved model and controller for nonlinear plants using polynomial fuzzy model-based (FMB) systems. To minimize mismatch between the polynomial fuzzy model and nonlinear plant, the suitable parameters of membership functions are determined in a systematic way. Defining an appropriate fitness function and utilizing Taylor series expansion, a genetic algorithm (GA) is used to form the shape of membership functions in polynomial forms, which are afterwards used in fuzzy m...
Einstein’s gravity from a polynomial affine model
Castillo-Felisola, Oscar; Skirzewski, Aureliano
2018-03-01
We show that the effective field equations for a recently formulated polynomial affine model of gravity, in the sector of a torsion-free connection, accept general Einstein manifolds—with or without cosmological constant—as solutions. Moreover, the effective field equations are partially those obtained from a gravitational Yang–Mills theory known as Stephenson–Kilmister–Yang theory. Additionally, we find a generalization of a minimally coupled massless scalar field in General Relativity within a ‘minimally’ coupled scalar field in this affine model. Finally, we present a brief (perturbative) analysis of the propagators of the gravitational theory, and count the degrees of freedom. For completeness, we prove that a Birkhoff-like theorem is valid for the analyzed sector.
Li, Xiaomiao; Lam, Hak Keung; Song, Ge; Liu, Fucai
2017-01-01
This paper deals with the stability and positivity analysis of polynomial-fuzzy-model-based ({PFMB}) control systems with time delay, which is formed by a polynomial fuzzy model and a polynomial fuzzy controller connected in a closed loop, under imperfect premise matching. To improve the design and realization flexibility, the polynomial fuzzy model and the polynomial fuzzy controller are allowed to have their own set of premise membership functions. A sum-of-squares (SOS)-based stability ana...
Directory of Open Access Journals (Sweden)
Hamed Kharrati
2012-01-01
Full Text Available This study presents an improved model and controller for nonlinear plants using polynomial fuzzy model-based (FMB systems. To minimize mismatch between the polynomial fuzzy model and nonlinear plant, the suitable parameters of membership functions are determined in a systematic way. Defining an appropriate fitness function and utilizing Taylor series expansion, a genetic algorithm (GA is used to form the shape of membership functions in polynomial forms, which are afterwards used in fuzzy modeling. To validate the model, a controller based on proposed polynomial fuzzy systems is designed and then applied to both original nonlinear plant and fuzzy model for comparison. Additionally, stability analysis for the proposed polynomial FMB control system is investigated employing Lyapunov theory and a sum of squares (SOS approach. Moreover, the form of the membership functions is considered in stability analysis. The SOS-based stability conditions are attained using SOSTOOLS. Simulation results are also given to demonstrate the effectiveness of the proposed method.
Directory of Open Access Journals (Sweden)
Martin Hallnäs
2007-03-01
Full Text Available We review a recent construction of an explicit analytic series representation for symmetric polynomials which up to a groundstate factor are eigenfunctions of Calogero-Sutherland type models. We also indicate a generalisation of this result to polynomials which give the eigenfunctions of so-called 'deformed' Calogero-Sutherland type models.
DEFF Research Database (Denmark)
Strathe, Anders B; Mark, Thomas; Nielsen, Bjarne
2014-01-01
Random regression models were used to estimate covariance functions between cumulated feed intake (CFI) and body weight (BW) in 8424 Danish Duroc pigs. Random regressions on second order Legendre polynomials of age were used to describe genetic and permanent environmental curves in BW and CFI...
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Directory of Open Access Journals (Sweden)
Ali Ghorbani
2017-01-01
Full Text Available Coupled Piled Raft Foundations (CPRFs are broadly applied to share heavy loads of superstructures between piles and rafts and reduce total and differential settlements. Settlements induced by static/coupled static-dynamic loads are one of the main concerns of engineers in designing CPRFs. Evaluation of induced settlements of CPRFs has been commonly carried out using three-dimensional finite element/finite difference modeling or through expensive real-scale/prototype model tests. Since the analyses, especially in the case of coupled static-dynamic loads, are not simply conducted, this paper presents two practical methods to gain the values of settlement. First, different nonlinear finite difference models under different static and coupled static-dynamic loads are developed to calculate exerted settlements. Analyses are performed with respect to different axial loads and pile’s configurations, numbers, lengths, diameters, and spacing for both loading cases. Based on the results of well-validated three-dimensional finite difference modeling, artificial neural networks and evolutionary polynomial regressions are then applied and introduced as capable methods to accurately present both static and coupled static-dynamic settlements. Also, using a sensitivity analysis based on Cosine Amplitude Method, axial load is introduced as the most influential parameter, while the ratio l/d is reported as the least effective parameter on the settlements of CPRFs.
Real estate value prediction using multivariate regression models
Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav
2017-11-01
The real estate market is one of the most competitive in terms of pricing and the same tends to vary significantly based on a lot of factors, hence it becomes one of the prime fields to apply the concepts of machine learning to optimize and predict the prices with high accuracy. Therefore in this paper, we present various important features to use while predicting housing prices with good accuracy. We have described regression models, using various features to have lower Residual Sum of Squares error. While using features in a regression model some feature engineering is required for better prediction. Often a set of features (multiple regressions) or polynomial regression (applying a various set of powers in the features) is used for making better model fit. For these models are expected to be susceptible towards over fitting ridge regression is used to reduce it. This paper thus directs to the best application of regression models in addition to other techniques to optimize the result.
Model selection in kernel ridge regression
DEFF Research Database (Denmark)
Exterkate, Peter
2013-01-01
Kernel ridge regression is a technique to perform ridge regression with a potentially infinite number of nonlinear transformations of the independent variables as regressors. This method is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts....... The influence of the choice of kernel and the setting of tuning parameters on forecast accuracy is investigated. Several popular kernels are reviewed, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. The latter two kernels are interpreted in terms of their smoothing properties......, and the tuning parameters associated to all these kernels are related to smoothness measures of the prediction function and to the signal-to-noise ratio. Based on these interpretations, guidelines are provided for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study...
Efficient modeling of photonic crystals with local Hermite polynomials
International Nuclear Information System (INIS)
Boucher, C. R.; Li, Zehao; Albrecht, J. D.; Ram-Mohan, L. R.
2014-01-01
Developing compact algorithms for accurate electrodynamic calculations with minimal computational cost is an active area of research given the increasing complexity in the design of electromagnetic composite structures such as photonic crystals, metamaterials, optical interconnects, and on-chip routing. We show that electric and magnetic (EM) fields can be calculated using scalar Hermite interpolation polynomials as the numerical basis functions without having to invoke edge-based vector finite elements to suppress spurious solutions or to satisfy boundary conditions. This approach offers several fundamental advantages as evidenced through band structure solutions for periodic systems and through waveguide analysis. Compared with reciprocal space (plane wave expansion) methods for periodic systems, advantages are shown in computational costs, the ability to capture spatial complexity in the dielectric distributions, the demonstration of numerical convergence with scaling, and variational eigenfunctions free of numerical artifacts that arise from mixed-order real space basis sets or the inherent aberrations from transforming reciprocal space solutions of finite expansions. The photonic band structure of a simple crystal is used as a benchmark comparison and the ability to capture the effects of spatially complex dielectric distributions is treated using a complex pattern with highly irregular features that would stress spatial transform limits. This general method is applicable to a broad class of physical systems, e.g., to semiconducting lasers which require simultaneous modeling of transitions in quantum wells or dots together with EM cavity calculations, to modeling plasmonic structures in the presence of EM field emissions, and to on-chip propagation within monolithic integrated circuits
Random regression models for daily feed intake in Danish Duroc pigs
DEFF Research Database (Denmark)
Strathe, Anders Bjerring; Mark, Thomas; Jensen, Just
The objective of this study was to develop random regression models and estimate covariance functions for daily feed intake (DFI) in Danish Duroc pigs. A total of 476201 DFI records were available on 6542 Duroc boars between 70 to 160 days of age. The data originated from the National test station......-year-season, permanent, and animal genetic effects. The functional form was based on Legendre polynomials. A total of 64 models for random regressions were initially ranked by BIC to identify the approximate order for the Legendre polynomials using AI-REML. The parsimonious model included Legendre polynomials of 2nd...... order for genetic and permanent environmental curves and a heterogeneous residual variance, allowing the daily residual variance to change along the age trajectory due to scale effects. The parameters of the model were estimated in a Bayesian framework, using the RJMC module of the DMU package, where...
A generalized multivariate regression model for modelling ocean wave heights
Wang, X. L.; Feng, Y.; Swail, V. R.
2012-04-01
In this study, a generalized multivariate linear regression model is developed to represent the relationship between 6-hourly ocean significant wave heights (Hs) and the corresponding 6-hourly mean sea level pressure (MSLP) fields. The model is calibrated using the ERA-Interim reanalysis of Hs and MSLP fields for 1981-2000, and is validated using the ERA-Interim reanalysis for 2001-2010 and ERA40 reanalysis of Hs and MSLP for 1958-2001. The performance of the fitted model is evaluated in terms of Pierce skill score, frequency bias index, and correlation skill score. Being not normally distributed, wave heights are subjected to a data adaptive Box-Cox transformation before being used in the model fitting. Also, since 6-hourly data are being modelled, lag-1 autocorrelation must be and is accounted for. The models with and without Box-Cox transformation, and with and without accounting for autocorrelation, are inter-compared in terms of their prediction skills. The fitted MSLP-Hs relationship is then used to reconstruct historical wave height climate from the 6-hourly MSLP fields taken from the Twentieth Century Reanalysis (20CR, Compo et al. 2011), and to project possible future wave height climates using CMIP5 model simulations of MSLP fields. The reconstructed and projected wave heights, both seasonal means and maxima, are subject to a trend analysis that allows for non-linear (polynomial) trends.
Directory of Open Access Journals (Sweden)
Drzewiecki Wojciech
2016-12-01
Full Text Available In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.
Gómez-Valent, Adrià; Amendola, Luca
2018-04-01
In this paper we present new constraints on the Hubble parameter H0 using: (i) the available data on H(z) obtained from cosmic chronometers (CCH); (ii) the Hubble rate data points extracted from the supernovae of Type Ia (SnIa) of the Pantheon compilation and the Hubble Space Telescope (HST) CANDELS and CLASH Multy-Cycle Treasury (MCT) programs; and (iii) the local HST measurement of H0 provided by Riess et al. (2018), H0HST=(73.45±1.66) km/s/Mpc. Various determinations of H0 using the Gaussian processes (GPs) method and the most updated list of CCH data have been recently provided by Yu, Ratra & Wang (2018). Using the Gaussian kernel they find H0=(67.42± 4.75) km/s/Mpc. Here we extend their analysis to also include the most released and complete set of SnIa data, which allows us to reduce the uncertainty by a factor ~ 3 with respect to the result found by only considering the CCH information. We obtain H0=(67.06± 1.68) km/s/Mpc, which favors again the lower range of values for H0 and is in tension with H0HST. The tension reaches the 2.71σ level. We round off the GPs determination too by taking also into account the error propagation of the kernel hyperparameters when the CCH with and without H0HST are used in the analysis. In addition, we present a novel method to reconstruct functions from data, which consists in a weighted sum of polynomial regressions (WPR). We apply it from a cosmographic perspective to reconstruct H(z) and estimate H0 from CCH and SnIa measurements. The result obtained with this method, H0=(68.90± 1.96) km/s/Mpc, is fully compatible with the GPs ones. Finally, a more conservative GPs+WPR value is also provided, H0=(68.45± 2.00) km/s/Mpc, which is still almost 2σ away from H0HST.
DEFF Research Database (Denmark)
Rakhshan, Mohsen; Vafamand, Navid; Khooban, Mohammad Hassan
2018-01-01
This paper introduces a polynomial fuzzy model (PFM)-based maximum power point tracking (MPPT) control approach to increase the performance and efficiency of the solar photovoltaic (PV) electricity generation. The proposed method relies on a polynomial fuzzy modeling, a polynomial parallel......, a direct maximum power (DMP)-based control structure is considered for MPPT. Using the PFM representation, the DMP-based control structure is formulated in terms of SOS conditions. Unlike the conventional approaches, the proposed approach does not require exploring the maximum power operational point...
Development of a polynomial nodal model to the multigroup transport equation in one dimension
International Nuclear Information System (INIS)
Feiz, M.
1986-01-01
A polynomial nodal model that uses Legendre polynomial expansions was developed for the multigroup transport equation in one dimension. The development depends upon the least-squares minimization of the residuals using the approximate functions over the node. Analytical expressions were developed for the polynomial coefficients. The odd moments of the angular neutron flux over the half ranges were used at the internal interfaces, and the Marshak boundary condition was used at the external boundaries. Sample problems with fine-mesh finite-difference solutions of the diffusion and transport equations were used for comparison with the model
A Seemingly Unrelated Poisson Regression Model
King, Gary
1989-01-01
This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.
A polynomial based model for cell fate prediction in human diseases.
Ma, Lichun; Zheng, Jie
2017-12-21
Cell fate regulation directly affects tissue homeostasis and human health. Research on cell fate decision sheds light on key regulators, facilitates understanding the mechanisms, and suggests novel strategies to treat human diseases that are related to abnormal cell development. In this study, we proposed a polynomial based model to predict cell fate. This model was derived from Taylor series. As a case study, gene expression data of pancreatic cells were adopted to test and verify the model. As numerous features (genes) are available, we employed two kinds of feature selection methods, i.e. correlation based and apoptosis pathway based. Then polynomials of different degrees were used to refine the cell fate prediction function. 10-fold cross-validation was carried out to evaluate the performance of our model. In addition, we analyzed the stability of the resultant cell fate prediction model by evaluating the ranges of the parameters, as well as assessing the variances of the predicted values at randomly selected points. Results show that, within both the two considered gene selection methods, the prediction accuracies of polynomials of different degrees show little differences. Interestingly, the linear polynomial (degree 1 polynomial) is more stable than others. When comparing the linear polynomials based on the two gene selection methods, it shows that although the accuracy of the linear polynomial that uses correlation analysis outcomes is a little higher (achieves 86.62%), the one within genes of the apoptosis pathway is much more stable. Considering both the prediction accuracy and the stability of polynomial models of different degrees, the linear model is a preferred choice for cell fate prediction with gene expression data of pancreatic cells. The presented cell fate prediction model can be extended to other cells, which may be important for basic research as well as clinical study of cell development related diseases.
Gaussian Process Regression Model in Spatial Logistic Regression
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.
Model Selection in Kernel Ridge Regression
DEFF Research Database (Denmark)
Exterkate, Peter
Kernel ridge regression is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts. This paper investigates the influence of the choice of kernel and the setting of tuning parameters on forecast accuracy. We review several popular kernels......, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. We interpret the latter two kernels in terms of their smoothing properties, and we relate the tuning parameters associated to all these kernels to smoothness measures of the prediction function and to the signal-to-noise ratio. Based...... on these interpretations, we provide guidelines for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study confirms the practical usefulness of these rules of thumb. Finally, the flexible and smooth functional forms provided by the Gaussian and Sinc kernels makes them widely...
Directory of Open Access Journals (Sweden)
Mir Mustafizur Rahman
2014-11-01
Full Text Available Thermal Infrared (TIR remote sensing images of urban environments are increasingly available from airborne and satellite platforms. However, limited access to high-spatial resolution (H-res: ~1 m TIR satellite images requires the use of TIR airborne sensors for mapping large complex urban surfaces, especially at micro-scales. A critical limitation of such H-res mapping is the need to acquire a large scene composed of multiple flight lines and mosaic them together. This results in the same scene components (e.g., roads, buildings, green space and water exhibiting different temperatures in different flight lines. To mitigate these effects, linear relative radiometric normalization (RRN techniques are often applied. However, the Earth’s surface is composed of features whose thermal behaviour is characterized by complexity and non-linearity. Therefore, we hypothesize that non-linear RRN techniques should demonstrate increased radiometric agreement over similar linear techniques. To test this hypothesis, this paper evaluates four (linear and non-linear RRN techniques, including: (i histogram matching (HM; (ii pseudo-invariant feature-based polynomial regression (PIF_Poly; (iii no-change stratified random sample-based linear regression (NCSRS_Lin; and (iv no-change stratified random sample-based polynomial regression (NCSRS_Poly; two of which (ii and iv are newly proposed non-linear techniques. When applied over two adjacent flight lines (~70 km2 of TABI-1800 airborne data, visual and statistical results show that both new non-linear techniques improved radiometric agreement over the previously evaluated linear techniques, with the new fully-automated method, NCSRS-based polynomial regression, providing the highest improvement in radiometric agreement between the master and the slave images, at ~56%. This is ~5% higher than the best previously evaluated linear technique (NCSRS-based linear regression.
International Nuclear Information System (INIS)
Tang, Kunkun; Congedo, Pietro M.; Abgrall, Rémi
2016-01-01
The Polynomial Dimensional Decomposition (PDD) is employed in this work for the global sensitivity analysis and uncertainty quantification (UQ) of stochastic systems subject to a moderate to large number of input random variables. Due to the intimate connection between the PDD and the Analysis of Variance (ANOVA) approaches, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to the Polynomial Chaos expansion (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of standard methods unaffordable for real engineering applications. In order to address the problem of the curse of dimensionality, this work proposes essentially variance-based adaptive strategies aiming to build a cheap meta-model (i.e. surrogate model) by employing the sparse PDD approach with its coefficients computed by regression. Three levels of adaptivity are carried out in this paper: 1) the truncated dimensionality for ANOVA component functions, 2) the active dimension technique especially for second- and higher-order parameter interactions, and 3) the stepwise regression approach designed to retain only the most influential polynomials in the PDD expansion. During this adaptive procedure featuring stepwise regressions, the surrogate model representation keeps containing few terms, so that the cost to resolve repeatedly the linear systems of the least-squares regression problem is negligible. The size of the finally obtained sparse PDD representation is much smaller than the one of the full expansion, since only significant terms are eventually retained. Consequently, a much smaller number of calls to the deterministic model is required to compute the final PDD coefficients.
Energy Technology Data Exchange (ETDEWEB)
Tang, Kunkun, E-mail: ktg@illinois.edu [The Center for Exascale Simulation of Plasma-Coupled Combustion (XPACC), University of Illinois at Urbana–Champaign, 1308 W Main St, Urbana, IL 61801 (United States); Inria Bordeaux – Sud-Ouest, Team Cardamom, 200 avenue de la Vieille Tour, 33405 Talence (France); Congedo, Pietro M. [Inria Bordeaux – Sud-Ouest, Team Cardamom, 200 avenue de la Vieille Tour, 33405 Talence (France); Abgrall, Rémi [Institut für Mathematik, Universität Zürich, Winterthurerstrasse 190, CH-8057 Zürich (Switzerland)
2016-06-01
The Polynomial Dimensional Decomposition (PDD) is employed in this work for the global sensitivity analysis and uncertainty quantification (UQ) of stochastic systems subject to a moderate to large number of input random variables. Due to the intimate connection between the PDD and the Analysis of Variance (ANOVA) approaches, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to the Polynomial Chaos expansion (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of standard methods unaffordable for real engineering applications. In order to address the problem of the curse of dimensionality, this work proposes essentially variance-based adaptive strategies aiming to build a cheap meta-model (i.e. surrogate model) by employing the sparse PDD approach with its coefficients computed by regression. Three levels of adaptivity are carried out in this paper: 1) the truncated dimensionality for ANOVA component functions, 2) the active dimension technique especially for second- and higher-order parameter interactions, and 3) the stepwise regression approach designed to retain only the most influential polynomials in the PDD expansion. During this adaptive procedure featuring stepwise regressions, the surrogate model representation keeps containing few terms, so that the cost to resolve repeatedly the linear systems of the least-squares regression problem is negligible. The size of the finally obtained sparse PDD representation is much smaller than the one of the full expansion, since only significant terms are eventually retained. Consequently, a much smaller number of calls to the deterministic model is required to compute the final PDD coefficients.
Model-based multi-fringe interferometry using Zernike polynomials
Gu, Wei; Song, Weihong; Wu, Gaofeng; Quan, Haiyang; Wu, Yongqian; Zhao, Wenchuan
2018-06-01
In this paper, a general phase retrieval method is proposed, which is based on one single interferogram with a small amount of fringes (either tilt or power). Zernike polynomials are used to characterize the phase to be measured; the phase distribution is reconstructed by a non-linear least squares method. Experiments show that the proposed method can obtain satisfactory results compared to the standard phase-shifting interferometry technique. Additionally, the retrace errors of proposed method can be neglected because of the few fringes; it does not need any auxiliary phase shifting facilities (low cost) and it is easy to implement without the process of phase unwrapping.
Du, Xiaosong; Leifsson, Leifur; Grandin, Robert; Meeker, William; Roberts, Ronald; Song, Jiming
2018-04-01
Probability of detection (POD) is widely used for measuring reliability of nondestructive testing (NDT) systems. Typically, POD is determined experimentally, while it can be enhanced by utilizing physics-based computational models in combination with model-assisted POD (MAPOD) methods. With the development of advanced physics-based methods, such as ultrasonic NDT testing, the empirical information, needed for POD methods, can be reduced. However, performing accurate numerical simulations can be prohibitively time-consuming, especially as part of stochastic analysis. In this work, stochastic surrogate models for computational physics-based measurement simulations are developed for cost savings of MAPOD methods while simultaneously ensuring sufficient accuracy. The stochastic surrogate is used to propagate the random input variables through the physics-based simulation model to obtain the joint probability distribution of the output. The POD curves are then generated based on those results. Here, the stochastic surrogates are constructed using non-intrusive polynomial chaos (NIPC) expansions. In particular, the NIPC methods used are the quadrature, ordinary least-squares (OLS), and least-angle regression sparse (LARS) techniques. The proposed approach is demonstrated on the ultrasonic testing simulation of a flat bottom hole flaw in an aluminum block. The results show that the stochastic surrogates have at least two orders of magnitude faster convergence on the statistics than direct Monte Carlo sampling (MCS). Moreover, the evaluation of the stochastic surrogate models is over three orders of magnitude faster than the underlying simulation model for this case, which is the UTSim2 model.
A general U-block model-based design procedure for nonlinear polynomial control systems
Zhu, Q. M.; Zhao, D. Y.; Zhang, Jianhua
2016-10-01
The proposition of U-model concept (in terms of 'providing concise and applicable solutions for complex problems') and a corresponding basic U-control design algorithm was originated in the first author's PhD thesis. The term of U-model appeared (not rigorously defined) for the first time in the first author's other journal paper, which established a framework for using linear polynomial control system design approaches to design nonlinear polynomial control systems (in brief, linear polynomial approaches → nonlinear polynomial plants). This paper represents the next milestone work - using linear state-space approaches to design nonlinear polynomial control systems (in brief, linear state-space approaches → nonlinear polynomial plants). The overall aim of the study is to establish a framework, defined as the U-block model, which provides a generic prototype for using linear state-space-based approaches to design the control systems with smooth nonlinear plants/processes described by polynomial models. For analysing the feasibility and effectiveness, sliding mode control design approach is selected as an exemplary case study. Numerical simulation studies provide a user-friendly step-by-step procedure for the readers/users with interest in their ad hoc applications. In formality, this is the first paper to present the U-model-oriented control system design in a formal way and to study the associated properties and theorems. The previous publications, in the main, have been algorithm-based studies and simulation demonstrations. In some sense, this paper can be treated as a landmark for the U-model-based research from intuitive/heuristic stage to rigour/formal/comprehensive studies.
Large N Penner matrix model and a novel asymptotic formula for the generalized Laguerre polynomials
International Nuclear Information System (INIS)
Deo, N
2003-01-01
The Gaussian Penner matrix model is re-examined in the light of the results which have been found in double-well matrix models. The orthogonal polynomials for the Gaussian Penner model are shown to be the generalized Laguerre polynomials L (α) n (x) with α and x depending on N, the size of the matrix. An asymptotic formula for the orthogonal polynomials is derived following closely the orthogonal polynomial method of Deo (1997 Nucl. Phys. B 504 609). The universality found in the double-well matrix model is extended to include non-polynomial potentials. An asymptotic formula is also found for the Laguerre polynomial using the saddle-point method by rescaling α and x with N. Combining these results a novel asymptotic formula is found for the generalized Laguerre polynomials (different from that given in Szego's book) in a different asymptotic regime. This may have applications in mathematical and physical problems in the future. The density-density correlators are derived and are the same as those found for the double-well matrix models. These correlators in the smoothed large N limit are sensitive to odd and even N where N is the size of the matrix. These results for the two-point density-density correlation function may be useful in finding eigenvalue effects in experiments in mesoscopic systems or small metallic grains. There may be applications to string theory as well as the tunnelling of an eigenvalue from one valley to the other being an important quantity there
Regression models of reactor diagnostic signals
International Nuclear Information System (INIS)
Vavrin, J.
1989-01-01
The application is described of an autoregression model as the simplest regression model of diagnostic signals in experimental analysis of diagnostic systems, in in-service monitoring of normal and anomalous conditions and their diagnostics. The method of diagnostics is described using a regression type diagnostic data base and regression spectral diagnostics. The diagnostics is described of neutron noise signals from anomalous modes in the experimental fuel assembly of a reactor. (author)
Specification Search for Identifying the Correct Mean Trajectory in Polynomial Latent Growth Models
Kim, Minjung; Kwok, Oi-Man; Yoon, Myeongsun; Willson, Victor; Lai, Mark H. C.
2016-01-01
This study investigated the optimal strategy for model specification search under the latent growth modeling (LGM) framework, specifically on searching for the correct polynomial mean or average growth model when there is no a priori hypothesized model in the absence of theory. In this simulation study, the effectiveness of different starting…
Variable importance in latent variable regression models
Kvalheim, O.M.; Arneberg, R.; Bleie, O.; Rajalahti, T.; Smilde, A.K.; Westerhuis, J.A.
2014-01-01
The quality and practical usefulness of a regression model are a function of both interpretability and prediction performance. This work presents some new graphical tools for improved interpretation of latent variable regression models that can also assist in improved algorithms for variable
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Exact solution of Chern-Simons-matter matrix models with characteristic/orthogonal polynomials
International Nuclear Information System (INIS)
Tierz, Miguel
2016-01-01
We solve for finite N the matrix model of supersymmetric U(N) Chern-Simons theory coupled to N f fundamental and N f anti-fundamental chiral multiplets of R-charge 1/2 and of mass m, by identifying it with an average of inverse characteristic polynomials in a Stieltjes-Wigert ensemble. This requires the computation of the Cauchy transform of the Stieltjes-Wigert polynomials, which we carry out, finding a relationship with Mordell integrals, and hence with previous analytical results on the matrix model. The semiclassical limit of the model is expressed, for arbitrary N f , in terms of a single Hermite polynomial. This result also holds for more general matter content, involving matrix models with double-sine functions.
[From clinical judgment to linear regression model.
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as it's first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and it is equivalent to "Y" value when "X" equals 0 and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases in one unit. In the regression line, "b" is called regression coefficient. The coefficient of determination (R 2 ) indicates the importance of independent variables in the outcome.
Wang, Y. P.; Lu, Z. P.; Sun, D. S.; Wang, N.
2016-01-01
In order to better express the characteristics of satellite clock bias (SCB) and improve SCB prediction precision, this paper proposed a new SCB prediction model which can take physical characteristics of space-borne atomic clock, the cyclic variation, and random part of SCB into consideration. First, the new model employs a quadratic polynomial model with periodic items to fit and extract the trend term and cyclic term of SCB; then based on the characteristics of fitting residuals, a time series ARIMA ~(Auto-Regressive Integrated Moving Average) model is used to model the residuals; eventually, the results from the two models are combined to obtain final SCB prediction values. At last, this paper uses precise SCB data from IGS (International GNSS Service) to conduct prediction tests, and the results show that the proposed model is effective and has better prediction performance compared with the quadratic polynomial model, grey model, and ARIMA model. In addition, the new method can also overcome the insufficiency of the ARIMA model in model recognition and order determination.
Regression Models for Market-Shares
DEFF Research Database (Denmark)
Birch, Kristina; Olsen, Jørgen Kai; Tjur, Tue
2005-01-01
On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put on the interpretat......On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put...... on the interpretation of the parameters in relation to models for the total sales based on discrete choice models.Key words and phrases. MCI model, discrete choice model, market-shares, price elasitcity, regression model....
Computation of the Likelihood in Biallelic Diffusion Models Using Orthogonal Polynomials
Directory of Open Access Journals (Sweden)
Claus Vogl
2014-11-01
Full Text Available In population genetics, parameters describing forces such as mutation, migration and drift are generally inferred from molecular data. Lately, approximate methods based on simulations and summary statistics have been widely applied for such inference, even though these methods waste information. In contrast, probabilistic methods of inference can be shown to be optimal, if their assumptions are met. In genomic regions where recombination rates are high relative to mutation rates, polymorphic nucleotide sites can be assumed to evolve independently from each other. The distribution of allele frequencies at a large number of such sites has been called “allele-frequency spectrum” or “site-frequency spectrum” (SFS. Conditional on the allelic proportions, the likelihoods of such data can be modeled as binomial. A simple model representing the evolution of allelic proportions is the biallelic mutation-drift or mutation-directional selection-drift diffusion model. With series of orthogonal polynomials, specifically Jacobi and Gegenbauer polynomials, or the related spheroidal wave function, the diffusion equations can be solved efficiently. In the neutral case, the product of the binomial likelihoods with the sum of such polynomials leads to finite series of polynomials, i.e., relatively simple equations, from which the exact likelihoods can be calculated. In this article, the use of orthogonal polynomials for inferring population genetic parameters is investigated.
Bai , Shi; Bouvier , Cyril; Kruppa , Alexander; Zimmermann , Paul
2016-01-01
International audience; The general number field sieve (GNFS) is the most efficient algo-rithm known for factoring large integers. It consists of several stages, the first one being polynomial selection. The quality of the selected polynomials can be modelled in terms of size and root properties. We propose a new kind of polynomials for GNFS: with a new degree of freedom, we further improve the size property. We demonstrate the efficiency of our algorithm by exhibiting a better polynomial tha...
Categorical regression dose-response modeling
The goal of this training is to provide participants with training on the use of the U.S. EPA’s Categorical Regression soft¬ware (CatReg) and its application to risk assessment. Categorical regression fits mathematical models to toxicity data that have been assigned ord...
Hong, X; Harris, C J
2000-01-01
This paper introduces a new neurofuzzy model construction algorithm for nonlinear dynamic systems based upon basis functions that are Bézier-Bernstein polynomial functions. This paper is generalized in that it copes with n-dimensional inputs by utilising an additive decomposition construction to overcome the curse of dimensionality associated with high n. This new construction algorithm also introduces univariate Bézier-Bernstein polynomial functions for the completeness of the generalized procedure. Like the B-spline expansion based neurofuzzy systems, Bézier-Bernstein polynomial function based neurofuzzy networks hold desirable properties such as nonnegativity of the basis functions, unity of support, and interpretability of basis function as fuzzy membership functions, moreover with the additional advantages of structural parsimony and Delaunay input space partition, essentially overcoming the curse of dimensionality associated with conventional fuzzy and RBF networks. This new modeling network is based on additive decomposition approach together with two separate basis function formation approaches for both univariate and bivariate Bézier-Bernstein polynomial functions used in model construction. The overall network weights are then learnt using conventional least squares methods. Numerical examples are included to demonstrate the effectiveness of this new data based modeling approach.
Method of moments approach to pricing double barrier contracts in polynomial jump-diffusion models
Eriksson, B.; Pistorius, M.
2011-01-01
Abstract: We present a method of moments approach to pricing double barrier contracts when the underlying is modelled by a polynomial jump-diffusion. By general principles the price is linked to certain infinite dimensional linear programming problems. Subsequently approximating these by finite
Applied Regression Modeling A Business Approach
Pardoe, Iain
2012-01-01
An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculusRegression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a
Regression Models For Multivariate Count Data.
Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei
2017-01-01
Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.
Testing homogeneity in Weibull-regression models.
Bolfarine, Heleno; Valença, Dione M
2005-10-01
In survival studies with families or geographical units it may be of interest testing whether such groups are homogeneous for given explanatory variables. In this paper we consider score type tests for group homogeneity based on a mixing model in which the group effect is modelled as a random variable. As opposed to hazard-based frailty models, this model presents survival times that conditioned on the random effect, has an accelerated failure time representation. The test statistics requires only estimation of the conventional regression model without the random effect and does not require specifying the distribution of the random effect. The tests are derived for a Weibull regression model and in the uncensored situation, a closed form is obtained for the test statistic. A simulation study is used for comparing the power of the tests. The proposed tests are applied to real data sets with censored data.
Mixed-effects regression models in linguistics
Heylen, Kris; Geeraerts, Dirk
2018-01-01
When data consist of grouped observations or clusters, and there is a risk that measurements within the same group are not independent, group-specific random effects can be added to a regression model in order to account for such within-group associations. Regression models that contain such group-specific random effects are called mixed-effects regression models, or simply mixed models. Mixed models are a versatile tool that can handle both balanced and unbalanced datasets and that can also be applied when several layers of grouping are present in the data; these layers can either be nested or crossed. In linguistics, as in many other fields, the use of mixed models has gained ground rapidly over the last decade. This methodological evolution enables us to build more sophisticated and arguably more realistic models, but, due to its technical complexity, also introduces new challenges. This volume brings together a number of promising new evolutions in the use of mixed models in linguistics, but also addres...
Regression modeling methods, theory, and computation with SAS
Panik, Michael
2009-01-01
Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs.The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,
Methods of mathematical modeling using polynomials of algebra of sets
Kazanskiy, Alexandr; Kochetkov, Ivan
2018-03-01
The article deals with the construction of discrete mathematical models for solving applied problems arising from the operation of building structures. Security issues in modern high-rise buildings are extremely serious and relevant, and there is no doubt that interest in them will only increase. The territory of the building is divided into zones for which it is necessary to observe. Zones can overlap and have different priorities. Such situations can be described using formulas algebra of sets. Formulas can be programmed, which makes it possible to work with them using computer models.
Influence diagnostics in meta-regression model.
Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua
2017-09-01
This paper studies the influence diagnostics in meta-regression model including case deletion diagnostic and local influence analysis. We derive the subset deletion formulae for the estimation of regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measure are defined. The local influence analysis based on case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme are explored. We introduce a method by simultaneous perturbing responses, covariate, and within-variance to obtain the local influence measure, which has an advantage of capable to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.
AIRLINE ACTIVITY FORECASTING BY REGRESSION MODELS
Directory of Open Access Journals (Sweden)
Н. Білак
2012-04-01
Full Text Available Proposed linear and nonlinear regression models, which take into account the equation of trend and seasonality indices for the analysis and restore the volume of passenger traffic over the past period of time and its prediction for future years, as well as the algorithm of formation of these models based on statistical analysis over the years. The desired model is the first step for the synthesis of more complex models, which will enable forecasting of passenger (income level airline with the highest accuracy and time urgency.
Modeling oil production based on symbolic regression
International Nuclear Information System (INIS)
Yang, Guangfei; Li, Xianneng; Wang, Jianliang; Lian, Lian; Ma, Tieju
2015-01-01
Numerous models have been proposed to forecast the future trends of oil production and almost all of them are based on some predefined assumptions with various uncertainties. In this study, we propose a novel data-driven approach that uses symbolic regression to model oil production. We validate our approach on both synthetic and real data, and the results prove that symbolic regression could effectively identify the true models beneath the oil production data and also make reliable predictions. Symbolic regression indicates that world oil production will peak in 2021, which broadly agrees with other techniques used by researchers. Our results also show that the rate of decline after the peak is almost half the rate of increase before the peak, and it takes nearly 12 years to drop 4% from the peak. These predictions are more optimistic than those in several other reports, and the smoother decline will provide the world, especially the developing countries, with more time to orchestrate mitigation plans. -- Highlights: •A data-driven approach has been shown to be effective at modeling the oil production. •The Hubbert model could be discovered automatically from data. •The peak of world oil production is predicted to appear in 2021. •The decline rate after peak is half of the increase rate before peak. •Oil production projected to decline 4% post-peak
Uncertainty propagation through an aeroelastic wind turbine model using polynomial surrogates
DEFF Research Database (Denmark)
Murcia Leon, Juan Pablo; Réthoré, Pierre-Elouan; Dimitrov, Nikolay Krasimirov
2018-01-01
of the uncertainty in annual energy production due to wind resource variability and/or robust wind power plant layout optimization. It can be concluded that it is possible to capture the global behavior of a modern wind turbine and its uncertainty under realistic inflow conditions using polynomial response surfaces......Polynomial surrogates are used to characterize the energy production and lifetime equivalent fatigue loads for different components of the DTU 10 MW reference wind turbine under realistic atmospheric conditions. The variability caused by different turbulent inflow fields are captured by creating......-alignment. The methodology presented extends the deterministic power and thrust coefficient curves to uncertainty models and adds new variables like damage equivalent fatigue loads in different components of the turbine. These surrogate models can then be implemented inside other work-flows such as: estimation...
Geographically weighted regression model on poverty indicator
Slamet, I.; Nugroho, N. F. T. A.; Muslich
2017-12-01
In this research, we applied geographically weighted regression (GWR) for analyzing the poverty in Central Java. We consider Gaussian Kernel as weighted function. The GWR uses the diagonal matrix resulted from calculating kernel Gaussian function as a weighted function in the regression model. The kernel weights is used to handle spatial effects on the data so that a model can be obtained for each location. The purpose of this paper is to model of poverty percentage data in Central Java province using GWR with Gaussian kernel weighted function and to determine the influencing factors in each regency/city in Central Java province. Based on the research, we obtained geographically weighted regression model with Gaussian kernel weighted function on poverty percentage data in Central Java province. We found that percentage of population working as farmers, population growth rate, percentage of households with regular sanitation, and BPJS beneficiaries are the variables that affect the percentage of poverty in Central Java province. In this research, we found the determination coefficient R2 are 68.64%. There are two categories of district which are influenced by different of significance factors.
Freud, Géza
1971-01-01
Orthogonal Polynomials contains an up-to-date survey of the general theory of orthogonal polynomials. It deals with the problem of polynomials and reveals that the sequence of these polynomials forms an orthogonal system with respect to a non-negative m-distribution defined on the real numerical axis. Comprised of five chapters, the book begins with the fundamental properties of orthogonal polynomials. After discussing the momentum problem, it then explains the quadrature procedure, the convergence theory, and G. Szegő's theory. This book is useful for those who intend to use it as referenc
International Nuclear Information System (INIS)
Cooling, C.M.; Williams, M.M.R.; Nygaard, E.T.; Eaton, M.D.
2013-01-01
Highlights: • A point kinetics model for the Medical Isotope Production Reactor is formulated. • Reactivity insertions are simulated using this model. • Polynomial chaos is used to simulate uncertainty in reactor parameters. • The computational efficiency of polynomial chaos is compared to that of Monte Carlo. -- Abstract: This paper models a conceptual Medical Isotope Production Reactor (MIPR) using a point kinetics model which is used to explore power excursions in the event of a reactivity insertion. The effect of uncertainty of key parameters is modelled using intrusive polynomial chaos. It is found that the system is stable against reactivity insertions and power excursions are all bounded and tend towards a new equilibrium state due to the negative feedbacks inherent in Aqueous Homogeneous Reactors (AHRs). The Polynomial Chaos Expansion (PCE) method is found to be much more computationally efficient than that of Monte Carlo simulation in this application
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
General regression and representation model for classification.
Directory of Open Access Journals (Sweden)
Jianjun Qian
Full Text Available Recently, the regularized coding-based classification methods (e.g. SRC and CRC show a great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR for classification. GRR not only has advantages of CRC, but also takes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients and the specific information (weight matrix of image pixels to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel weights of the test sample. With the proposed model as a platform, we design two classifiers: basic general regression and representation classifier (B-GRR and robust general regression and representation classifier (R-GRR. The experimental results demonstrate the performance advantages of proposed methods over state-of-the-art algorithms.
Confidence bands for inverse regression models
International Nuclear Information System (INIS)
Birke, Melanie; Bissantz, Nicolai; Holzmann, Hajo
2010-01-01
We construct uniform confidence bands for the regression function in inverse, homoscedastic regression models with convolution-type operators. Here, the convolution is between two non-periodic functions on the whole real line rather than between two periodic functions on a compact interval, since the former situation arguably arises more often in applications. First, following Bickel and Rosenblatt (1973 Ann. Stat. 1 1071–95) we construct asymptotic confidence bands which are based on strong approximations and on a limit theorem for the supremum of a stationary Gaussian process. Further, we propose bootstrap confidence bands based on the residual bootstrap and prove consistency of the bootstrap procedure. A simulation study shows that the bootstrap confidence bands perform reasonably well for moderate sample sizes. Finally, we apply our method to data from a gel electrophoresis experiment with genetically engineered neuronal receptor subunits incubated with rat brain extract
Sraj, Ihab; Zedler, Sarah E.; Knio, Omar; Jackson, Charles S.; Hoteit, Ibrahim
2016-01-01
The authors present a polynomial chaos (PC)-based Bayesian inference method for quantifying the uncertainties of the K-profile parameterization (KPP) within the MIT general circulation model (MITgcm) of the tropical Pacific. The inference
DEFF Research Database (Denmark)
Tan, Qihua; Bathum, L; Christiansen, L
2003-01-01
In this paper, we apply logistic regression models to measure genetic association with human survival for highly polymorphic and pleiotropic genes. By modelling genotype frequency as a function of age, we introduce a logistic regression model with polytomous responses to handle the polymorphic...... situation. Genotype and allele-based parameterization can be used to investigate the modes of gene action and to reduce the number of parameters, so that the power is increased while the amount of multiple testing minimized. A binomial logistic regression model with fractional polynomials is used to capture...... the age-dependent or antagonistic pleiotropic effects. The models are applied to HFE genotype data to assess the effects on human longevity by different alleles and to detect if an age-dependent effect exists. Application has shown that these methods can serve as useful tools in searching for important...
Directory of Open Access Journals (Sweden)
Ivanka Jerić
2011-11-01
Full Text Available Predicting antitumor activity of compounds using regression models trained on a small number of compounds with measured biological activity is an ill-posed inverse problem. Yet, it occurs very often within the academic community. To counteract, up to some extent, overfitting problems caused by a small training data, we propose to use consensus of six regression models for prediction of biological activity of virtual library of compounds. The QSAR descriptors of 22 compounds related to the opioid growth factor (OGF, Tyr-Gly-Gly-Phe-Met with known antitumor activity were used to train regression models: the feed-forward artificial neural network, the k-nearest neighbor, sparseness constrained linear regression, the linear and nonlinear (with polynomial and Gaussian kernel support vector machine. Regression models were applied on a virtual library of 429 compounds that resulted in six lists with candidate compounds ranked by predicted antitumor activity. The highly ranked candidate compounds were synthesized, characterized and tested for an antiproliferative activity. Some of prepared peptides showed more pronounced activity compared with the native OGF; however, they were less active than highly ranked compounds selected previously by the radial basis function support vector machine (RBF SVM regression model. The ill-posedness of the related inverse problem causes unstable behavior of trained regression models on test data. These results point to high complexity of prediction based on the regression models trained on a small data sample.
Multitask Quantile Regression under the Transnormal Model.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2016-01-01
We consider estimating multi-task quantile regression under the transnormal model, with focus on high-dimensional setting. We derive a surprisingly simple closed-form solution through rank-based covariance regularization. In particular, we propose the rank-based ℓ 1 penalization with positive definite constraints for estimating sparse covariance matrices, and the rank-based banded Cholesky decomposition regularization for estimating banded precision matrices. By taking advantage of alternating direction method of multipliers, nearest correlation matrix projection is introduced that inherits sampling properties of the unprojected one. Our work combines strengths of quantile regression and rank-based covariance regularization to simultaneously deal with nonlinearity and nonnormality for high-dimensional regression. Furthermore, the proposed method strikes a good balance between robustness and efficiency, achieves the "oracle"-like convergence rate, and provides the provable prediction interval under the high-dimensional setting. The finite-sample performance of the proposed method is also examined. The performance of our proposed rank-based method is demonstrated in a real application to analyze the protein mass spectroscopy data.
Estimation of genetic parameters related to eggshell strength using random regression models.
Guo, J; Ma, M; Qu, L; Shen, M; Dou, T; Wang, K
2015-01-01
This study examined the changes in eggshell strength and the genetic parameters related to this trait throughout a hen's laying life using random regression. The data were collected from a crossbred population between 2011 and 2014, where the eggshell strength was determined repeatedly for 2260 hens. Using random regression models (RRMs), several Legendre polynomials were employed to estimate the fixed, direct genetic and permanent environment effects. The residual effects were treated as independently distributed with heterogeneous variance for each test week. The direct genetic variance was included with second-order Legendre polynomials and the permanent environment with third-order Legendre polynomials. The heritability of eggshell strength ranged from 0.26 to 0.43, the repeatability ranged between 0.47 and 0.69, and the estimated genetic correlations between test weeks was high at > 0.67. The first eigenvalue of the genetic covariance matrix accounted for about 97% of the sum of all the eigenvalues. The flexibility and statistical power of RRM suggest that this model could be an effective method to improve eggshell quality and to reduce losses due to cracked eggs in a breeding plan.
Crime Modeling using Spatial Regression Approach
Saleh Ahmar, Ansari; Adiatma; Kasim Aidid, M.
2018-01-01
Act of criminality in Indonesia increased both variety and quantity every year. As murder, rape, assault, vandalism, theft, fraud, fencing, and other cases that make people feel unsafe. Risk of society exposed to crime is the number of reported cases in the police institution. The higher of the number of reporter to the police institution then the number of crime in the region is increasing. In this research, modeling criminality in South Sulawesi, Indonesia with the dependent variable used is the society exposed to the risk of crime. Modelling done by area approach is the using Spatial Autoregressive (SAR) and Spatial Error Model (SEM) methods. The independent variable used is the population density, the number of poor population, GDP per capita, unemployment and the human development index (HDI). Based on the analysis using spatial regression can be shown that there are no dependencies spatial both lag or errors in South Sulawesi.
Reduced Multivariate Polynomial Model for Manufacturing Costs Estimation of Piping Elements
Directory of Open Access Journals (Sweden)
Nibaldo Rodriguez
2013-01-01
Full Text Available This paper discusses the development and evaluation of an estimation model of manufacturing costs of piping elements through the application of a Reduced Multivariate Polynomial (RMP. The model allows obtaining accurate estimations, even when enough and adequate information is not available. This situation typically occurs in the early stages of the design process of industrial products. The experimental evaluations show that the approach is capable, with a low complexity, of reducing uncertainties and to predict costs with significant precision. Comparisons with a neural network showed also that the RMP performs better considering a set of classical performance measures with the corresponding lower complexity and higher accuracy.
Binder, Harald; Sauerbrei, Willi; Royston, Patrick
2013-06-15
In observational studies, many continuous or categorical covariates may be related to an outcome. Various spline-based procedures or the multivariable fractional polynomial (MFP) procedure can be used to identify important variables and functional forms for continuous covariates. This is the main aim of an explanatory model, as opposed to a model only for prediction. The type of analysis often guides the complexity of the final model. Spline-based procedures and MFP have tuning parameters for choosing the required complexity. To compare model selection approaches, we perform a simulation study in the linear regression context based on a data structure intended to reflect realistic biomedical data. We vary the sample size, variance explained and complexity parameters for model selection. We consider 15 variables. A sample size of 200 (1000) and R(2) = 0.2 (0.8) is the scenario with the smallest (largest) amount of information. For assessing performance, we consider prediction error, correct and incorrect inclusion of covariates, qualitative measures for judging selected functional forms and further novel criteria. From limited information, a suitable explanatory model cannot be obtained. Prediction performance from all types of models is similar. With a medium amount of information, MFP performs better than splines on several criteria. MFP better recovers simpler functions, whereas splines better recover more complex functions. For a large amount of information and no local structure, MFP and the spline procedures often select similar explanatory models. Copyright © 2012 John Wiley & Sons, Ltd.
International Nuclear Information System (INIS)
Marquette, Ian; Links, Jon
2012-01-01
We study the Bethe ansatz/ordinary differential equation (BA/ODE) correspondence for Bethe ansatz equations that belong to a certain class of coupled, nonlinear, algebraic equations. Through this approach we numerically obtain the generalized Heine–Stieltjes and Van Vleck polynomials in the degenerate, two-level limit for four cases of integrable Bardeen–Cooper–Schrieffer (BCS) pairing models. These are the s-wave pairing model, the p + ip-wave pairing model, the p + ip pairing model coupled to a bosonic molecular pair degree of freedom, and a newly introduced extended d + id-wave pairing model with additional interactions. The zeros of the generalized Heine–Stieltjes polynomials provide solutions of the corresponding Bethe ansatz equations. We compare the roots of the ground states with curves obtained from the solution of a singular integral equation approximation, which allows for a characterization of ground-state phases in these systems. Our techniques also permit the computation of the roots of the excited states. These results illustrate how the BA/ODE correspondence can be used to provide new numerical methods to study a variety of integrable systems. (paper)
AN APPLICATION OF FUNCTIONAL MULTIVARIATE REGRESSION MODEL TO MULTICLASS CLASSIFICATION
Krzyśko, Mirosław; Smaga, Łukasz
2017-01-01
In this paper, the scale response functional multivariate regression model is considered. By using the basis functions representation of functional predictors and regression coefficients, this model is rewritten as a multivariate regression model. This representation of the functional multivariate regression model is used for multiclass classification for multivariate functional data. Computational experiments performed on real labelled data sets demonstrate the effectiveness of the proposed ...
Entrepreneurial intention modeling using hierarchical multiple regression
Directory of Open Access Journals (Sweden)
Marina Jeger
2014-12-01
Full Text Available The goal of this study is to identify the contribution of effectuation dimensions to the predictive power of the entrepreneurial intention model over and above that which can be accounted for by other predictors selected and confirmed in previous studies. As is often the case in social and behavioral studies, some variables are likely to be highly correlated with each other. Therefore, the relative amount of variance in the criterion variable explained by each of the predictors depends on several factors such as the order of variable entry and sample specifics. The results show the modest predictive power of two dimensions of effectuation prior to the introduction of the theory of planned behavior elements. The article highlights the main advantages of applying hierarchical regression in social sciences as well as in the specific context of entrepreneurial intention formation, and addresses some of the potential pitfalls that this type of analysis entails.
An Additive-Multiplicative Cox-Aalen Regression Model
DEFF Research Database (Denmark)
Scheike, Thomas H.; Zhang, Mei-Jie
2002-01-01
Aalen model; additive risk model; counting processes; Cox regression; survival analysis; time-varying effects......Aalen model; additive risk model; counting processes; Cox regression; survival analysis; time-varying effects...
Fitting the Fractional Polynomial Model to Non-Gaussian Longitudinal Data
Directory of Open Access Journals (Sweden)
Ji Hoon Ryoo
2017-08-01
Full Text Available As in cross sectional studies, longitudinal studies involve non-Gaussian data such as binomial, Poisson, gamma, and inverse-Gaussian distributions, and multivariate exponential families. A number of statistical tools have thus been developed to deal with non-Gaussian longitudinal data, including analytic techniques to estimate parameters in both fixed and random effects models. However, as yet growth modeling with non-Gaussian data is somewhat limited when considering the transformed expectation of the response via a linear predictor as a functional form of explanatory variables. In this study, we introduce a fractional polynomial model (FPM that can be applied to model non-linear growth with non-Gaussian longitudinal data and demonstrate its use by fitting two empirical binary and count data models. The results clearly show the efficiency and flexibility of the FPM for such applications.
Variable selection and model choice in geoadditive regression models.
Kneib, Thomas; Hothorn, Torsten; Tutz, Gerhard
2009-06-01
Model choice and variable selection are issues of major concern in practical regression analyses, arising in many biometric applications such as habitat suitability analyses, where the aim is to identify the influence of potentially many environmental conditions on certain species. We describe regression models for breeding bird communities that facilitate both model choice and variable selection, by a boosting algorithm that works within a class of geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, and varying coefficients. The major modeling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a smooth component with one degree of freedom to obtain a fair comparison between the model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that automatically performs model choice and variable selection.
Drzewiecki, Wojciech
2016-12-01
In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques. The results proved that in case of sub-pixel evaluation the most accurate prediction of change may not necessarily be based on the most accurate individual assessments. When single methods are considered, based on obtained results Cubist algorithm may be advised for Landsat based mapping of imperviousness for single dates. However, Random Forest may be endorsed when the most reliable evaluation of imperviousness change is the primary goal. It gave lower accuracies for individual assessments, but better prediction of change due to more correlated errors of individual predictions. Heterogeneous model ensembles performed for individual time points assessments at least as well as the best individual models. In case of imperviousness change assessment the ensembles always outperformed single model approaches. It means that it is possible to improve the accuracy of sub-pixel imperviousness change assessment using ensembles of heterogeneous non-linear regression models.
Elsheikh, Ahmed H.
2014-02-01
An efficient Bayesian calibration method based on the nested sampling (NS) algorithm and non-intrusive polynomial chaos method is presented. Nested sampling is a Bayesian sampling algorithm that builds a discrete representation of the posterior distributions by iteratively re-focusing a set of samples to high likelihood regions. NS allows representing the posterior probability density function (PDF) with a smaller number of samples and reduces the curse of dimensionality effects. The main difficulty of the NS algorithm is in the constrained sampling step which is commonly performed using a random walk Markov Chain Monte-Carlo (MCMC) algorithm. In this work, we perform a two-stage sampling using a polynomial chaos response surface to filter out rejected samples in the Markov Chain Monte-Carlo method. The combined use of nested sampling and the two-stage MCMC based on approximate response surfaces provides significant computational gains in terms of the number of simulation runs. The proposed algorithm is applied for calibration and model selection of subsurface flow models. © 2013.
Hierarchical regression analysis in structural Equation Modeling
de Jong, P.F.
1999-01-01
In a hierarchical or fixed-order regression analysis, the independent variables are entered into the regression equation in a prespecified order. Such an analysis is often performed when the extra amount of variance accounted for in a dependent variable by a specific independent variable is the main
Poisson Mixture Regression Models for Heart Disease Prediction.
Mufudza, Chipo; Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model.
Wijayanto, D.; Kurohman, F.; Nugroho, RA
2018-03-01
The research purpose was to develop a model bioeconomic of profit maximization that can be applied to red tilapia culture. The development of fish growth model used polynomial growth function. Profit maximization process used the first derivative of profit equation to time of culture equal to zero. This research has also developed the equations to estimate the culture time to reach the target size of the fish harvest. The research proved that this research model could be applied in the red tilapia culture. In the case of this study, red tilapia culture can achieve the maximum profit at 584 days and the profit of Rp. 28,605,731 per culture cycle. If used size target of 250 g, the culture of red tilapia need 82 days of culture time.
Desrosiers, P; Mathieu, P; Desrosiers, Patrick; Lapointe, Luc; Mathieu, Pierre
2003-01-01
We present two constructions of the orthogonal eigenfunctions of the supersymmetric extension of the rational Calogero-Moser-Sutherland model with harmonic confinement. These eigenfunctions are the superspace extension of the generalized Hermite (or Hi-Jack) polynomials. The conserved quantities of the rational supersymmetric model are first related to their trigonometric relatives through a similarity transformation. This leads to a simple expression for the generalized Hermite superpolynomials as a differential operator acting on the corresponding Jack superpolynomials. The second construction relies on the action of the Hamiltonian on the supermonomial basis. This translates into determinantal expressions for the Hamiltonian's eigenfunctions. As an aside, the maximal superintegrability of the supersymmetric rational Calogero-Moser-Sutherland model is demonstrated.
Modeling maximum daily temperature using a varying coefficient regression model
Han Li; Xinwei Deng; Dong-Yum Kim; Eric P. Smith
2014-01-01
Relationships between stream water and air temperatures are often modeled using linear or nonlinear regression methods. Despite a strong relationship between water and air temperatures and a variety of models that are effective for data summarized on a weekly basis, such models did not yield consistently good predictions for summaries such as daily maximum temperature...
Introduction to the use of regression models in epidemiology.
Bender, Ralf
2009-01-01
Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
Lam, Hak-Keung
2016-01-01
This book presents recent research on the stability analysis of polynomial-fuzzy-model-based control systems where the concept of partially/imperfectly matched premises and membership-function dependent analysis are considered. The membership-function-dependent analysis offers a new research direction for fuzzy-model-based control systems by taking into account the characteristic and information of the membership functions in the stability analysis. The book presents on a research level the most recent and advanced research results, promotes the research of polynomial-fuzzy-model-based control systems, and provides theoretical support and point a research direction to postgraduate students and fellow researchers. Each chapter provides numerical examples to verify the analysis results, demonstrate the effectiveness of the proposed polynomial fuzzy control schemes, and explain the design procedure. The book is comprehensively written enclosing detailed derivation steps and mathematical derivations also for read...
Model performance analysis and model validation in logistic regression
Directory of Open Access Journals (Sweden)
Rosa Arboretti Giancristofaro
2007-10-01
Full Text Available In this paper a new model validation procedure for a logistic regression model is presented. At first, we illustrate a brief review of different techniques of model validation. Next, we define a number of properties required for a model to be considered "good", and a number of quantitative performance measures. Lastly, we describe a methodology for the assessment of the performance of a given model by using an example taken from a management study.
Third-order polynomial model for analyzing stickup state laminated structure in flexible electronics
Meng, Xianhong; Wang, Zihao; Liu, Boya; Wang, Shuodao
2018-02-01
Laminated hard-soft integrated structures play a significant role in the fabrication and development of flexible electronics devices. Flexible electronics have advantageous characteristics such as soft and light-weight, can be folded, twisted, flipped inside-out, or be pasted onto other surfaces of arbitrary shapes. In this paper, an analytical model is presented to study the mechanics of laminated hard-soft structures in flexible electronics under a stickup state. Third-order polynomials are used to describe the displacement field, and the principle of virtual work is adopted to derive the governing equations and boundary conditions. The normal strain and the shear stress along the thickness direction in the bi-material region are obtained analytically, which agree well with the results from finite element analysis. The analytical model can be used to analyze stickup state laminated structures, and can serve as a valuable reference for the failure prediction and optimal design of flexible electronics in the future.
Logistic Regression Modeling of Diminishing Manufacturing Sources for Integrated Circuits
National Research Council Canada - National Science Library
Gravier, Michael
1999-01-01
.... The research identified logistic regression as a powerful tool for analysis of DMSMS and further developed twenty models attempting to identify the "best" way to model and predict DMSMS using logistic regression...
Narimani, Mohammand; Lam, H K; Dilmaghani, R; Wolfe, Charles
2011-06-01
Relaxed linear-matrix-inequality-based stability conditions for fuzzy-model-based control systems with imperfect premise matching are proposed. First, the derivative of the Lyapunov function, containing the product terms of the fuzzy model and fuzzy controller membership functions, is derived. Then, in the partitioned operating domain of the membership functions, the relations between the state variables and the mentioned product terms are represented by approximated polynomials in each subregion. Next, the stability conditions containing the information of all subsystems and the approximated polynomials are derived. In addition, the concept of the S-procedure is utilized to release the conservativeness caused by considering the whole operating region for approximated polynomials. It is shown that the well-known stability conditions can be special cases of the proposed stability conditions. Simulation examples are given to illustrate the validity of the proposed approach.
Energy Technology Data Exchange (ETDEWEB)
Mehrez, Loujaine [University of Southern California; Ghanem, Roger [University of Southern California; Aitharaju, Venkat [General Motors; Rodgers, William [General Motors
2017-10-23
Design of non-crimp fabric (NCF) composites entails major challenges pertaining to (1) the complex fine-scale morphology of the constituents, (2) the manufacturing-produced inconsistency of this morphology spatially, and thus (3) the ability to build reliable, robust, and efficient computational surrogate models to account for this complex nature. Traditional approaches to construct computational surrogate models have been to average over the fluctuations of the material properties at different scale lengths. This fails to account for the fine-scale features and fluctuations in morphology, material properties of the constituents, as well as fine-scale phenomena such as damage and cracks. In addition, it fails to accurately predict the scatter in macroscopic properties, which is vital to the design process and behavior prediction. In this work, funded in part by the Department of Energy, we present an approach for addressing these challenges by relying on polynomial chaos representations of both input parameters and material properties at different scales. Moreover, we emphasize the efficiency and robustness of integrating the polynomial chaos expansion with multiscale tools to perform multiscale assimilation, characterization, propagation, and prediction, all of which are necessary to construct the data-driven surrogate models required to design under the uncertainty of composites. These data-driven constructions provide an accurate map from parameters (and their uncertainties) at all scales and the system-level behavior relevant for design. While this perspective is quite general and applicable to all multiscale systems, NCF composites present a particular hierarchy of scales that permits the efficient implementation of these concepts.
A New Six-Parameter Model Based on Chebyshev Polynomials for Solar Cells
Directory of Open Access Journals (Sweden)
Shu-xian Lun
2015-01-01
Full Text Available This paper presents a new current-voltage (I-V model for solar cells. It has been proved that series resistance of a solar cell is related to temperature. However, the existing five-parameter model ignores the temperature dependence of series resistance and then only accurately predicts the performance of monocrystalline silicon solar cells. Therefore, this paper uses Chebyshev polynomials to describe the relationship between series resistance and temperature. This makes a new parameter called temperature coefficient for series resistance introduced into the single-diode model. Then, a new six-parameter model for solar cells is established in this paper. This new model can improve the accuracy of the traditional single-diode model and reflect the temperature dependence of series resistance. To validate the accuracy of the six-parameter model in this paper, five kinds of silicon solar cells with different technology types, that is, monocrystalline silicon, polycrystalline silicon, thin film silicon, and tripe-junction amorphous silicon, are tested at different irradiance and temperature conditions. Experiment results show that the six-parameter model proposed in this paper is an I-V model with moderate computational complexity and high precision.
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
Time series modeling by a regression approach based on a latent process.
Chamroukhi, Faicel; Samé, Allou; Govaert, Gérard; Aknin, Patrice
2009-01-01
Time series are used in many domains including finance, engineering, economics and bioinformatics generally to represent the change of a measurement over time. Modeling techniques may then be used to give a synthetic representation of such data. A new approach for time series modeling is proposed in this paper. It consists of a regression model incorporating a discrete hidden logistic process allowing for activating smoothly or abruptly different polynomial regression models. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The M step of the EM algorithm uses a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm to estimate the hidden process parameters. To evaluate the proposed approach, an experimental study on simulated data and real world data was performed using two alternative approaches: a heteroskedastic piecewise regression model using a global optimization algorithm based on dynamic programming, and a Hidden Markov Regression Model whose parameters are estimated by the Baum-Welch algorithm. Finally, in the context of the remote monitoring of components of the French railway infrastructure, and more particularly the switch mechanism, the proposed approach has been applied to modeling and classifying time series representing the condition measurements acquired during switch operations.
The MIDAS Touch: Mixed Data Sampling Regression Models
Ghysels, Eric; Santa-Clara, Pedro; Valkanov, Rossen
2004-01-01
We introduce Mixed Data Sampling (henceforth MIDAS) regression models. The regressions involve time series data sampled at different frequencies. Technically speaking MIDAS models specify conditional expectations as a distributed lag of regressors recorded at some higher sampling frequencies. We examine the asymptotic properties of MIDAS regression estimation and compare it with traditional distributed lag models. MIDAS regressions have wide applicability in macroeconomics and ï¿½nance.
SPSS macros to compare any two fitted values from a regression model.
Weaver, Bruce; Dubois, Sacha
2012-12-01
In regression models with first-order terms only, the coefficient for a given variable is typically interpreted as the change in the fitted value of Y for a one-unit increase in that variable, with all other variables held constant. Therefore, each regression coefficient represents the difference between two fitted values of Y. But the coefficients represent only a fraction of the possible fitted value comparisons that might be of interest to researchers. For many fitted value comparisons that are not captured by any of the regression coefficients, common statistical software packages do not provide the standard errors needed to compute confidence intervals or carry out statistical tests-particularly in more complex models that include interactions, polynomial terms, or regression splines. We describe two SPSS macros that implement a matrix algebra method for comparing any two fitted values from a regression model. The !OLScomp and !MLEcomp macros are for use with models fitted via ordinary least squares and maximum likelihood estimation, respectively. The output from the macros includes the standard error of the difference between the two fitted values, a 95% confidence interval for the difference, and a corresponding statistical test with its p-value.
Closed-form estimates of the domain of attraction for nonlinear systems via fuzzy-polynomial models.
Pitarch, José Luis; Sala, Antonio; Ariño, Carlos Vicente
2014-04-01
In this paper, the domain of attraction of the origin of a nonlinear system is estimated in closed form via level sets with polynomial boundaries, iteratively computed. In particular, the domain of attraction is expanded from a previous estimate, such as a classical Lyapunov level set. With the use of fuzzy-polynomial models, the domain of attraction analysis can be carried out via sum of squares optimization and an iterative algorithm. The result is a function that bounds the domain of attraction, free from the usual restriction of being positive and decrescent in all the interior of its level sets.
Mixture of Regression Models with Single-Index
Xiang, Sijia; Yao, Weixin
2016-01-01
In this article, we propose a class of semiparametric mixture regression models with single-index. We argue that many recently proposed semiparametric/nonparametric mixture regression models can be considered special cases of the proposed model. However, unlike existing semiparametric mixture regression models, the new pro- posed model can easily incorporate multivariate predictors into the nonparametric components. Backfitting estimates and the corresponding algorithms have been proposed for...
Linear regression crash prediction models : issues and proposed solutions.
2010-05-01
The paper develops a linear regression model approach that can be applied to : crash data to predict vehicle crashes. The proposed approach involves novice data aggregation : to satisfy linear regression assumptions; namely error structure normality ...
Model-based Quantile Regression for Discrete Data
Padellini, Tullia; Rue, Haavard
2018-01-01
Quantile regression is a class of methods voted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite
Time-Dependent Global Sensitivity Analysis for Long-Term Degeneracy Model Using Polynomial Chaos
Directory of Open Access Journals (Sweden)
Jianbin Guo
2014-07-01
Full Text Available Global sensitivity is used to quantify the influence of uncertain model inputs on the output variability of static models in general. However, very few approaches can be applied for the sensitivity analysis of long-term degeneracy models, as far as time-dependent reliability is concerned. The reason is that the static sensitivity may not reflect the completed sensitivity during the entire life circle. This paper presents time-dependent global sensitivity analysis for long-term degeneracy models based on polynomial chaos expansion (PCE. Sobol’ indices are employed as the time-dependent global sensitivity since they provide accurate information on the selected uncertain inputs. In order to compute Sobol’ indices more efficiently, this paper proposes a moving least squares (MLS method to obtain the time-dependent PCE coefficients with acceptable simulation effort. Then Sobol’ indices can be calculated analytically as a postprocessing of the time-dependent PCE coefficients with almost no additional cost. A test case is used to show how to conduct the proposed method, then this approach is applied to an engineering case, and the time-dependent global sensitivity is obtained for the long-term degeneracy mechanism model.
Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems
Contreras, Andres A.
2016-09-19
A method is presented for inferring the presence of an inclusion inside a domain; the proposed approach is suitable to be used in a diagnostic device with low computational power. Specifically, we use the Bayesian framework for the inference of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacements field with the random elastic moduli of the materials, and are computed by means of the stochastic Galerkin (SG) projection method. Moreover, the inclusion\\'s geometry is assumed to be unknown, and this is addressed by using a dictionary consisting of several geometrical models with different configurations. A model selection approach based on the evidence provided by the data (Bayes factors) is used to discriminate among the different geometrical models and select the most suitable one. The idea of using a dictionary of pre-computed geometrical models helps to maintain the computational cost of the inference process very low, as most of the computational burden is carried out off-line for the resolution of the SG problems. Numerical tests are used to validate the methodology, assess its performance, and analyze the robustness to model errors. © 2016 Elsevier Ltd
Forecasting Ebola with a regression transmission model
Directory of Open Access Journals (Sweden)
Jason Asher
2018-03-01
Full Text Available We describe a relatively simple stochastic model of Ebola transmission that was used to produce forecasts with the lowest mean absolute error among Ebola Forecasting Challenge participants. The model enabled prediction of peak incidence, the timing of this peak, and final size of the outbreak. The underlying discrete-time compartmental model used a time-varying reproductive rate modeled as a multiplicative random walk driven by the number of infectious individuals. This structure generalizes traditional Susceptible-Infected-Recovered (SIR disease modeling approaches and allows for the flexible consideration of outbreaks with complex trajectories of disease dynamics. Keywords: Ebola, Forecasting, Mathematical modeling, Bayesian inference
Forecasting Ebola with a regression transmission model
Asher, Jason
2017-01-01
We describe a relatively simple stochastic model of Ebola transmission that was used to produce forecasts with the lowest mean absolute error among Ebola Forecasting Challenge participants. The model enabled prediction of peak incidence, the timing of this peak, and final size of the outbreak. The underlying discrete-time compartmental model used a time-varying reproductive rate modeled as a multiplicative random walk driven by the number of infectious individuals. This structure generalizes ...
International Nuclear Information System (INIS)
Alvarez R, J.T.; Morales P, R.
1992-06-01
The absorbed dose for equivalent soft tissue is determined,it is imparted by ophthalmologic applicators, ( 90 Sr/ 90 Y, 1850 MBq) using an extrapolation chamber of variable electrodes; when estimating the slope of the extrapolation curve using a simple lineal regression model is observed that the dose values are underestimated from 17.7 percent up to a 20.4 percent in relation to the estimate of this dose by means of a regression model polynomial two grade, at the same time are observed an improvement in the standard error for the quadratic model until in 50%. Finally the global uncertainty of the dose is presented, taking into account the reproducibility of the experimental arrangement. As conclusion it can infers that in experimental arrangements where the source is to contact with the extrapolation chamber, it was recommended to substitute the lineal regression model by the quadratic regression model, in the determination of the slope of the extrapolation curve, for more exact and accurate measurements of the absorbed dose. (Author)
Genetic analysis of partial egg production records in Japanese quail using random regression models.
Abou Khadiga, G; Mahmoud, B Y F; Farahat, G S; Emam, A M; El-Full, E A
2017-08-01
The main objectives of this study were to detect the most appropriate random regression model (RRM) to fit the data of monthly egg production in 2 lines (selected and control) of Japanese quail and to test the consistency of different criteria of model choice. Data from 1,200 female Japanese quails for the first 5 months of egg production from 4 consecutive generations of an egg line selected for egg production in the first month (EP1) was analyzed. Eight RRMs with different orders of Legendre polynomials were compared to determine the proper model for analysis. All criteria of model choice suggested that the adequate model included the second-order Legendre polynomials for fixed effects, and the third-order for additive genetic effects and permanent environmental effects. Predictive ability of the best model was the highest among all models (ρ = 0.987). According to the best model fitted to the data, estimates of heritability were relatively low to moderate (0.10 to 0.17) showed a descending pattern from the first to the fifth month of production. A similar pattern was observed for permanent environmental effects with greater estimates in the first (0.36) and second (0.23) months of production than heritability estimates. Genetic correlations between separate production periods were higher (0.18 to 0.93) than their phenotypic counterparts (0.15 to 0.87). The superiority of the selected line over the control was observed through significant (P egg production in earlier ages (first and second months) than later ones. A methodology based on random regression animal models can be recommended for genetic evaluation of egg production in Japanese quail. © 2017 Poultry Science Association Inc.
Focused information criterion and model averaging based on weighted composite quantile regression
Xu, Ganggang
2013-08-13
We study the focused information criterion and frequentist model averaging and their application to post-model-selection inference for weighted composite quantile regression (WCQR) in the context of the additive partial linear models. With the non-parametric functions approximated by polynomial splines, we show that, under certain conditions, the asymptotic distribution of the frequentist model averaging WCQR-estimator of a focused parameter is a non-linear mixture of normal distributions. This asymptotic distribution is used to construct confidence intervals that achieve the nominal coverage probability. With properly chosen weights, the focused information criterion based WCQR estimators are not only robust to outliers and non-normal residuals but also can achieve efficiency close to the maximum likelihood estimator, without assuming the true error distribution. Simulation studies and a real data analysis are used to illustrate the effectiveness of the proposed procedure. © 2013 Board of the Foundation of the Scandinavian Journal of Statistics..
Single-site Lennard-Jones models via polynomial chaos surrogates of Monte Carlo molecular simulation
Kadoura, Ahmad Salim; Siripatana, Adil; Sun, Shuyu; Knio, Omar; Hoteit, Ibrahim
2016-01-01
In this work, two Polynomial Chaos (PC) surrogates were generated to reproduce Monte Carlo (MC) molecular simulation results of the canonical (single-phase) and the NVT-Gibbs (two-phase) ensembles for a system of normalized structureless Lennard
Multi-model polynomial chaos surrogate dictionary for Bayesian inference in elasticity problems
Contreras, Andres A.; Le Maî tre, Olivier P.; Aquino, Wilkins; Knio, Omar
2016-01-01
of stiff inclusions embedded in a soft matrix, mimicking tumors in soft tissues. We rely on a polynomial chaos (PC) surrogate to accelerate the inference process. The PC surrogate predicts the dependence of the displacements field with the random elastic
Elsheikh, Ahmed H.; Hoteit, Ibrahim; Wheeler, Mary Fanett
2014-01-01
An efficient Bayesian calibration method based on the nested sampling (NS) algorithm and non-intrusive polynomial chaos method is presented. Nested sampling is a Bayesian sampling algorithm that builds a discrete representation of the posterior
Discrete series representations for sl(2|1), Meixner polynomials and oscillator models
International Nuclear Information System (INIS)
Jafarov, E I; Van der Jeugt, J
2012-01-01
We explore a model for a one-dimensional quantum oscillator based on the Lie superalgebra sl(2|1). For this purpose, a class of discrete series representations of sl(2|1) is constructed, each representation characterized by a real number β > 0. In this model, the position and momentum operators of the oscillator are odd elements of sl(2|1) and their expressions involve an arbitrary parameter γ. In each representation, the spectrum of the Hamiltonian is the same as that of a canonical oscillator. The spectrum of a position operator can be continuous or infinite discrete, depending on the value of γ. We determine the position wavefunctions both in the continuous and the discrete case and discuss their properties. In the discrete case, these wavefunctions are given in terms of Meixner polynomials. From the embedding osp(1|2) subset of sl(2|1), it can be seen why the case γ = 1 corresponds to a paraboson oscillator. Consequently, taking the values (β, γ) = (1/2, 1) in the sl(2|1) model yields a canonical oscillator. (paper)
Corporate prediction models, ratios or regression analysis?
Bijnen, E.J.; Wijn, M.F.C.M.
1994-01-01
The models developed in the literature with respect to the prediction of a company s failure are based on ratios. It has been shown before that these models should be rejected on theoretical grounds. Our study of industrial companies in the Netherlands shows that the ratios which are used in
STREAMFLOW AND WATER QUALITY REGRESSION MODELING ...
African Journals Online (AJOL)
... downstream Obigbo station show: consistent time-trends in degree of contamination; linear and non-linear relationships for water quality models against total dissolved solids (TDS), total suspended sediment (TSS), chloride, pH and sulphate; and non-linear relationship for streamflow and water quality transport models.
International Nuclear Information System (INIS)
Schulze-Halberg, Axel; Roy, Pinaki
2017-01-01
We construct energy-dependent potentials for which the Schrödinger equations admit solutions in terms of exceptional orthogonal polynomials. Our method of construction is based on certain point transformations, applied to the equations of exceptional Hermite, Jacobi and Laguerre polynomials. We present several examples of boundary-value problems with energy-dependent potentials that admit a discrete spectrum and the corresponding normalizable solutions in closed form.
Energy Technology Data Exchange (ETDEWEB)
Schulze-Halberg, Axel, E-mail: axgeschu@iun.edu [Department of Mathematics and Actuarial Science, Indiana University Northwest, 3400 Broadway, Gary IN 46408 (United States); Department of Physics, Indiana University Northwest, 3400 Broadway, Gary IN 46408 (United States); Roy, Pinaki, E-mail: pinaki@isical.ac.in [Physics and Applied Mathematics Unit, Indian Statistical Institute, Kolkata 700108 (India)
2017-03-15
We construct energy-dependent potentials for which the Schrödinger equations admit solutions in terms of exceptional orthogonal polynomials. Our method of construction is based on certain point transformations, applied to the equations of exceptional Hermite, Jacobi and Laguerre polynomials. We present several examples of boundary-value problems with energy-dependent potentials that admit a discrete spectrum and the corresponding normalizable solutions in closed form.
Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model
Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami
2017-06-01
A regression model is the representation of relationship between independent variable and dependent variable. The dependent variable has categories used in the logistic regression model to calculate odds on. The logistic regression model for dependent variable has levels in the logistics regression model is ordinal. GWOLR model is an ordinal logistic regression model influenced the geographical location of the observation site. Parameters estimation in the model needed to determine the value of a population based on sample. The purpose of this research is to parameters estimation of GWOLR model using R software. Parameter estimation uses the data amount of dengue fever patients in Semarang City. Observation units used are 144 villages in Semarang City. The results of research get GWOLR model locally for each village and to know probability of number dengue fever patient categories.
Multiattribute shopping models and ridge regression analysis
Timmermans, H.J.P.
1981-01-01
Policy decisions regarding retailing facilities essentially involve multiple attributes of shopping centres. If mathematical shopping models are to contribute to these decision processes, their structure should reflect the multiattribute character of retailing planning. Examination of existing
Linear Regression Models for Estimating True Subsurface ...
Indian Academy of Sciences (India)
47
The objective is to minimize the processing time and computer memory required. 10 to carry out inversion .... to the mainland by two long bridges. .... term. In this approach, the model converges when the squared sum of the differences. 143.
Moderation analysis using a two-level regression model.
Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott
2014-10-01
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
Alternative regression models to assess increase in childhood BMI
Beyerlein, Andreas; Fahrmeir, Ludwig; Mansmann, Ulrich; Toschke, André M
2008-01-01
Abstract Background Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Methods Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 childre...
Directory of Open Access Journals (Sweden)
Hong-Juan Li
2013-04-01
Full Text Available Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by strong non-linear learning capability of support vector regression (SVR, this paper presents a SVR model hybridized with the empirical mode decomposition (EMD method and auto regression (AR for electric load forecasting. The electric load data of the New South Wales (Australia market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.
Single-site Lennard-Jones models via polynomial chaos surrogates of Monte Carlo molecular simulation
Energy Technology Data Exchange (ETDEWEB)
Kadoura, Ahmad, E-mail: ahmad.kadoura@kaust.edu.sa, E-mail: adil.siripatana@kaust.edu.sa, E-mail: shuyu.sun@kaust.edu.sa, E-mail: omar.knio@kaust.edu.sa; Sun, Shuyu, E-mail: ahmad.kadoura@kaust.edu.sa, E-mail: adil.siripatana@kaust.edu.sa, E-mail: shuyu.sun@kaust.edu.sa, E-mail: omar.knio@kaust.edu.sa [Computational Transport Phenomena Laboratory, The Earth Sciences and Engineering Department, The Physical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900 (Saudi Arabia); Siripatana, Adil, E-mail: ahmad.kadoura@kaust.edu.sa, E-mail: adil.siripatana@kaust.edu.sa, E-mail: shuyu.sun@kaust.edu.sa, E-mail: omar.knio@kaust.edu.sa; Hoteit, Ibrahim, E-mail: ibrahim.hoteit@kaust.edu.sa [Earth Fluid Modeling and Predicting Group, The Earth Sciences and Engineering Department, The Physical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900 (Saudi Arabia); Knio, Omar, E-mail: ahmad.kadoura@kaust.edu.sa, E-mail: adil.siripatana@kaust.edu.sa, E-mail: shuyu.sun@kaust.edu.sa, E-mail: omar.knio@kaust.edu.sa [Uncertainty Quantification Center, The Applied Mathematics and Computational Science Department, The Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900 (Saudi Arabia)
2016-06-07
In this work, two Polynomial Chaos (PC) surrogates were generated to reproduce Monte Carlo (MC) molecular simulation results of the canonical (single-phase) and the NVT-Gibbs (two-phase) ensembles for a system of normalized structureless Lennard-Jones (LJ) particles. The main advantage of such surrogates, once generated, is the capability of accurately computing the needed thermodynamic quantities in a few seconds, thus efficiently replacing the computationally expensive MC molecular simulations. Benefiting from the tremendous computational time reduction, the PC surrogates were used to conduct large-scale optimization in order to propose single-site LJ models for several simple molecules. Experimental data, a set of supercritical isotherms, and part of the two-phase envelope, of several pure components were used for tuning the LJ parameters (ε, σ). Based on the conducted optimization, excellent fit was obtained for different noble gases (Ar, Kr, and Xe) and other small molecules (CH{sub 4}, N{sub 2}, and CO). On the other hand, due to the simplicity of the LJ model used, dramatic deviations between simulation and experimental data were observed, especially in the two-phase region, for more complex molecules such as CO{sub 2} and C{sub 2} H{sub 6}.
Non-hermitian symmetric N = 2 coset models, Poincare polynomials, and string compactification
International Nuclear Information System (INIS)
Fuchs, J.; Schweigert, C.
1994-01-01
The field identification problem, including fixed point resolution, is solved for the non-hermitian symmetric N = 2 superconformal coset theories. Thereby these models are finally identified as well-defined modular invariant conformal field theories. As an application, the theories are used as subtheories in N = 2 tensor products with c = 9, which in turn are taken as the inner sector of heterotic superstring compactifications. All string theories of this type are classified, and the chiral ring as well as the number of massless generations and anti-generations are computed with the help of the extended Poincare polynomial. Several equivalences between a priori different non-hermitian coset theories show up; in particular there is a level-rank duality for an infinite series of coset theories based on C-type Lie algebras. Further, some general results for generic N = 2 coset theories are proven: a simple formula for the number of identification currents is found, and it is shown that the set of Ramond ground states of any N = 2 coset model is invariant under charge conjugation. (orig.)
Single-site Lennard-Jones models via polynomial chaos surrogates of Monte Carlo molecular simulation
Kadoura, Ahmad Salim
2016-06-01
In this work, two Polynomial Chaos (PC) surrogates were generated to reproduce Monte Carlo (MC) molecular simulation results of the canonical (single-phase) and the NVT-Gibbs (two-phase) ensembles for a system of normalized structureless Lennard-Jones (LJ) particles. The main advantage of such surrogates, once generated, is the capability of accurately computing the needed thermodynamic quantities in a few seconds, thus efficiently replacing the computationally expensive MC molecular simulations. Benefiting from the tremendous computational time reduction, the PC surrogates were used to conduct large-scale optimization in order to propose single-site LJ models for several simple molecules. Experimental data, a set of supercritical isotherms, and part of the two-phase envelope, of several pure components were used for tuning the LJ parameters (ε, σ). Based on the conducted optimization, excellent fit was obtained for different noble gases (Ar, Kr, and Xe) and other small molecules (CH4, N2, and CO). On the other hand, due to the simplicity of the LJ model used, dramatic deviations between simulation and experimental data were observed, especially in the two-phase region, for more complex molecules such as CO2 and C2 H6.
Poisson Mixture Regression Models for Heart Disease Prediction
Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model. PMID:27999611
on the performance of Autoregressive Moving Average Polynomial
African Journals Online (AJOL)
Timothy Ademakinwa
Distributed Lag (PDL) model, Autoregressive Polynomial Distributed Lag ... Moving Average Polynomial Distributed Lag (ARMAPDL) model. ..... Global Journal of Mathematics and Statistics. Vol. 1. ... Business and Economic Research Center.
Multidimensional Gravitational Models: Fluxbrane and S-Brane Solutions with Polynomials
International Nuclear Information System (INIS)
Ivashchuk, V. D.; Melnikov, V. N.
2007-01-01
Main results in obtaining exact solutions for multidimensional models and their application to solving main problems of modern cosmology and black hole physics are described. Some new results on composite fluxbrane and S-brane solutions for a wide class of intersection rules are presented. These solutions are defined on a product manifold R* x M1 x ... x Mn which contains n Ricci-flat spaces M1,...,Mn with 1-dimensional R* and M1. They are defined up to a set of functions obeying non-linear differential equations equivalent to Toda-type equations with certain boundary conditions imposed. Exact solutions corresponding to configurations with two branes and intersections related to simple Lie algebras C2 and G2 are obtained. In these cases the functions Hs(z), s = 1, 2, are polynomials of degrees: (3, 4) and (6, 10), respectively, in agreement with a conjecture suggested earlier. Examples of simple S-brane solutions describing an accelerated expansion of a certain factor-space are given explicitely
A test for the parameters of multiple linear regression models ...
African Journals Online (AJOL)
A test for the parameters of multiple linear regression models is developed for conducting tests simultaneously on all the parameters of multiple linear regression models. The test is robust relative to the assumptions of homogeneity of variances and absence of serial correlation of the classical F-test. Under certain null and ...
Mixed Frequency Data Sampling Regression Models: The R Package midasr
Directory of Open Access Journals (Sweden)
Eric Ghysels
2016-08-01
Full Text Available When modeling economic relationships it is increasingly common to encounter data sampled at different frequencies. We introduce the R package midasr which enables estimating regression models with variables sampled at different frequencies within a MIDAS regression framework put forward in work by Ghysels, Santa-Clara, and Valkanov (2002. In this article we define a general autoregressive MIDAS regression model with multiple variables of different frequencies and show how it can be specified using the familiar R formula interface and estimated using various optimization methods chosen by the researcher. We discuss how to check the validity of the estimated model both in terms of numerical convergence and statistical adequacy of a chosen regression specification, how to perform model selection based on a information criterion, how to assess forecasting accuracy of the MIDAS regression model and how to obtain a forecast aggregation of different MIDAS regression models. We illustrate the capabilities of the package with a simulated MIDAS regression model and give two empirical examples of application of MIDAS regression.
Impact of multicollinearity on small sample hydrologic regression models
Kroll, Charles N.; Song, Peter
2013-06-01
Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R2). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely on model predictions, is it recommended that OLS be employed since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
Korff, Christian
2010-10-01
Starting from the Verma module of U_{q}\\mathfrak {sl}(2) we consider the evaluation module for affine U_{q}\\widehat{\\mathfrak {sl}}(2) and discuss its crystal limit (q → 0). There exists an associated integrable statistical mechanics model on a square lattice defined in terms of vertex configurations. Its transfer matrix is the generating function for noncommutative complete symmetric polynomials in the generators of the affine plactic algebra, an extension of the finite plactic algebra first discussed by Lascoux and Schützenberger. The corresponding noncommutative elementary symmetric polynomials were recently shown to be generated by the transfer matrix of the so-called phase model discussed by Bogoliubov, Izergin and Kitanine. Here we establish that both generating functions satisfy Baxter's TQ-equation in the crystal limit by tying them to special U_{q}\\widehat{ \\mathfrak {sl}}(2) solutions of the Yang-Baxter equation. The TQ-equation amounts to the well-known Jacobi-Trudi formula leading naturally to the definition of noncommutative Schur polynomials. The latter can be employed to define a ring which has applications in conformal field theory and enumerative geometry: it is isomorphic to the fusion ring of the \\widehat{\\mathfrak {sl}}(n)_{k} Wess-Zumino-Novikov-Witten model whose structure constants are the dimensions of spaces of generalized θ-functions over the Riemann sphere with three punctures.
Optimization over polynomials : Selected topics
Laurent, M.; Jang, Sun Young; Kim, Young Rock; Lee, Dae-Woong; Yie, Ikkwon
2014-01-01
Minimizing a polynomial function over a region defined by polynomial inequalities models broad classes of hard problems from combinatorics, geometry and optimization. New algorithmic approaches have emerged recently for computing the global minimum, by combining tools from real algebra (sums of
Identification of Influential Points in a Linear Regression Model
Directory of Open Access Journals (Sweden)
Jan Grosz
2011-03-01
Full Text Available The article deals with the detection and identification of influential points in the linear regression model. Three methods of detection of outliers and leverage points are described. These procedures can also be used for one-sample (independentdatasets. This paper briefly describes theoretical aspects of several robust methods as well. Robust statistics is a powerful tool to increase the reliability and accuracy of statistical modelling and data analysis. A simulation model of the simple linear regression is presented.
Directory of Open Access Journals (Sweden)
Yu-Bo Jiao
2015-01-01
Full Text Available The paper presents an effective approach for damage identification of bridge based on Chebyshev polynomial fitting and fuzzy logic systems without considering baseline model data. The modal curvature of damaged bridge can be obtained through central difference approximation based on displacement modal shape. Depending on the modal curvature of damaged structure, Chebyshev polynomial fitting is applied to acquire the curvature of undamaged one without considering baseline parameters. Therefore, modal curvature difference can be derived and used for damage localizing. Subsequently, the normalized modal curvature difference is treated as input variable of fuzzy logic systems for damage condition assessment. Numerical simulation on a simply supported bridge was carried out to demonstrate the feasibility of the proposed method.
Detection of epistatic effects with logic regression and a classical linear regression model.
Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata
2014-02-01
To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, usually methods as the Cockerham's model are applied. Within this framework, interactions are understood as the part of the joined effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (as disease) is caused by Boolean combinations of genotypes of several QTLs, this Cockerham's approach is often not capable to identify them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though with the logic regression approach a larger number of models has to be considered (requiring more stringent multiple testing correction) the efficient representation of higher order logic interactions in logic regression models leads to a significant increase of power to detect such interactions as compared to a Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with simulation study and real data analysis.
Random regression models for detection of gene by environment interaction
Directory of Open Access Journals (Sweden)
Meuwissen Theo HE
2007-02-01
Full Text Available Abstract Two random regression models, where the effect of a putative QTL was regressed on an environmental gradient, are described. The first model estimates the correlation between intercept and slope of the random regression, while the other model restricts this correlation to 1 or -1, which is expected under a bi-allelic QTL model. The random regression models were compared to a model assuming no gene by environment interactions. The comparison was done with regards to the models ability to detect QTL, to position them accurately and to detect possible QTL by environment interactions. A simulation study based on a granddaughter design was conducted, and QTL were assumed, either by assigning an effect independent of the environment or as a linear function of a simulated environmental gradient. It was concluded that the random regression models were suitable for detection of QTL effects, in the presence and absence of interactions with environmental gradients. Fixing the correlation between intercept and slope of the random regression had a positive effect on power when the QTL effects re-ranked between environments.
Keith, Timothy Z
2014-01-01
Multiple Regression and Beyond offers a conceptually oriented introduction to multiple regression (MR) analysis and structural equation modeling (SEM), along with analyses that flow naturally from those methods. By focusing on the concepts and purposes of MR and related methods, rather than the derivation and calculation of formulae, this book introduces material to students more clearly, and in a less threatening way. In addition to illuminating content necessary for coursework, the accessibility of this approach means students are more likely to be able to conduct research using MR or SEM--and more likely to use the methods wisely. Covers both MR and SEM, while explaining their relevance to one another Also includes path analysis, confirmatory factor analysis, and latent growth modeling Figures and tables throughout provide examples and illustrate key concepts and techniques For additional resources, please visit: http://tzkeith.com/.
Tutorial on Using Regression Models with Count Outcomes Using R
Directory of Open Access Journals (Sweden)
A. Alexander Beaujean
2016-02-01
Full Text Available Education researchers often study count variables, such as times a student reached a goal, discipline referrals, and absences. Most researchers that study these variables use typical regression methods (i.e., ordinary least-squares either with or without transforming the count variables. In either case, using typical regression for count data can produce parameter estimates that are biased, thus diminishing any inferences made from such data. As count-variable regression models are seldom taught in training programs, we present a tutorial to help educational researchers use such methods in their own research. We demonstrate analyzing and interpreting count data using Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial regression models. The count regression methods are introduced through an example using the number of times students skipped class. The data for this example are freely available and the R syntax used run the example analyses are included in the Appendix.
Regularized multivariate regression models with skew-t error distributions
Chen, Lianfu; Pourahmadi, Mohsen; Maadooliat, Mehdi
2014-01-01
We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both
Correlation-regression model for physico-chemical quality of ...
African Journals Online (AJOL)
abusaad
areas, suggesting that groundwater quality in urban areas is closely related with land use ... the ground water, with correlation and regression model is also presented. ...... WHO (World Health Organization) (1985). Health hazards from nitrates.
Santellano-Estrada, E; Becerril-Pérez, C M; de Alba, J; Chang, Y M; Gianola, D; Torres-Hernández, G; Ramírez-Valverde, R
2008-11-01
This study inferred genetic and permanent environmental variation of milk yield in Tropical Milking Criollo cattle and compared 5 random regression test-day models using Wilmink's function and Legendre polynomials. Data consisted of 15,377 test-day records from 467 Tropical Milking Criollo cows that calved between 1974 and 2006 in the tropical lowlands of the Gulf Coast of Mexico and in southern Nicaragua. Estimated heritabilities of test-day milk yields ranged from 0.18 to 0.45, and repeatabilities ranged from 0.35 to 0.68 for the period spanning from 6 to 400 d in milk. Genetic correlation between days in milk 10 and 400 was around 0.50 but greater than 0.90 for most pairs of test days. The model that used first-order Legendre polynomials for additive genetic effects and second-order Legendre polynomials for permanent environmental effects gave the smallest residual variance and was also favored by the Akaike information criterion and likelihood ratio tests.
Wavelet regression model in forecasting crude oil price
Hamid, Mohd Helmie; Shabri, Ani
2017-05-01
This study presents the performance of wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. WMLR model was developed by integrating the discrete wavelet transform (DWT) and multiple linear regression (MLR) model. The original time series was decomposed to sub-time series with different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series has been used in this study to test the prediction capability of the proposed model. The forecasting performance of WMLR model were also compared with regular multiple linear regression (MLR), Autoregressive Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) using root mean square errors (RMSE) and mean absolute errors (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting technique tested in this study.
Application of random regression models to the genetic evaluation ...
African Journals Online (AJOL)
The model included fixed regression on AM (range from 30 to 138 mo) and the effect of herd-measurement date concatenation. Random parts of the model were RRM coefficients for additive and permanent environmental effects, while residual effects were modelled to account for heterogeneity of variance by AY. Estimates ...
The APT model as reduced-rank regression
Bekker, P.A.; Dobbelstein, P.; Wansbeek, T.J.
Integrating the two steps of an arbitrage pricing theory (APT) model leads to a reduced-rank regression (RRR) model. So the results on RRR can be used to estimate APT models, making estimation very simple. We give a succinct derivation of estimation of RRR, derive the asymptotic variance of RRR
Wibowo, Wahyu; Wene, Chatrien; Budiantara, I. Nyoman; Permatasari, Erma Oktania
2017-03-01
Multiresponse semiparametric regression is simultaneous equation regression model and fusion of parametric and nonparametric model. The regression model comprise several models and each model has two components, parametric and nonparametric. The used model has linear function as parametric and polynomial truncated spline as nonparametric component. The model can handle both linearity and nonlinearity relationship between response and the sets of predictor variables. The aim of this paper is to demonstrate the application of the regression model for modeling of effect of regional socio-economic on use of information technology. More specific, the response variables are percentage of households has access to internet and percentage of households has personal computer. Then, predictor variables are percentage of literacy people, percentage of electrification and percentage of economic growth. Based on identification of the relationship between response and predictor variable, economic growth is treated as nonparametric predictor and the others are parametric predictors. The result shows that the multiresponse semiparametric regression can be applied well as indicate by the high coefficient determination, 90 percent.
Alternative regression models to assess increase in childhood BMI
Directory of Open Access Journals (Sweden)
Mansmann Ulrich
2008-09-01
Full Text Available Abstract Background Body mass index (BMI data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Methods Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs, quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS. We analyzed data of 4967 children participating in the school entry health examination in Bavaria, Germany, from 2001 to 2002. TV watching, meal frequency, breastfeeding, smoking in pregnancy, maternal obesity, parental social class and weight gain in the first 2 years of life were considered as risk factors for obesity. Results GAMLSS showed a much better fit regarding the estimation of risk factors effects on transformed and untransformed BMI data than common GLMs with respect to the generalized Akaike information criterion. In comparison with GAMLSS, quantile regression allowed for additional interpretation of prespecified distribution quantiles, such as quantiles referring to overweight or obesity. The variables TV watching, maternal BMI and weight gain in the first 2 years were directly, and meal frequency was inversely significantly associated with body composition in any model type examined. In contrast, smoking in pregnancy was not directly, and breastfeeding and parental social class were not inversely significantly associated with body composition in GLM models, but in GAMLSS and partly in quantile regression models. Risk factor specific BMI percentile curves could be estimated from GAMLSS and quantile regression models. Conclusion GAMLSS and quantile regression seem to be more appropriate than common GLMs for risk factor modeling of BMI data.
Alternative regression models to assess increase in childhood BMI.
Beyerlein, Andreas; Fahrmeir, Ludwig; Mansmann, Ulrich; Toschke, André M
2008-09-08
Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 children participating in the school entry health examination in Bavaria, Germany, from 2001 to 2002. TV watching, meal frequency, breastfeeding, smoking in pregnancy, maternal obesity, parental social class and weight gain in the first 2 years of life were considered as risk factors for obesity. GAMLSS showed a much better fit regarding the estimation of risk factors effects on transformed and untransformed BMI data than common GLMs with respect to the generalized Akaike information criterion. In comparison with GAMLSS, quantile regression allowed for additional interpretation of prespecified distribution quantiles, such as quantiles referring to overweight or obesity. The variables TV watching, maternal BMI and weight gain in the first 2 years were directly, and meal frequency was inversely significantly associated with body composition in any model type examined. In contrast, smoking in pregnancy was not directly, and breastfeeding and parental social class were not inversely significantly associated with body composition in GLM models, but in GAMLSS and partly in quantile regression models. Risk factor specific BMI percentile curves could be estimated from GAMLSS and quantile regression models. GAMLSS and quantile regression seem to be more appropriate than common GLMs for risk factor modeling of BMI data.
Exergy Analysis of a Subcritical Reheat Steam Power Plant with Regression Modeling and Optimization
Directory of Open Access Journals (Sweden)
MUHIB ALI RAJPER
2016-07-01
Full Text Available In this paper, exergy analysis of a 210 MW SPP (Steam Power Plant is performed. Firstly, the plant is modeled and validated, followed by a parametric study to show the effects of various operating parameters on the performance parameters. The net power output, energy efficiency, and exergy efficiency are taken as the performance parameters, while the condenser pressure, main steam pressure, bled steam pressures, main steam temperature, and reheat steam temperature isnominated as the operating parameters. Moreover, multiple polynomial regression models are developed to correlate each performance parameter with the operating parameters. The performance is then optimizedby using Direct-searchmethod. According to the results, the net power output, energy efficiency, and exergy efficiency are calculated as 186.5 MW, 31.37 and 30.41%, respectively under normal operating conditions as a base case. The condenser is a major contributor towards the energy loss, followed by the boiler, whereas the highest irreversibilities occur in the boiler and turbine. According to the parametric study, variation in the operating parameters greatly influences the performance parameters. The regression models have appeared to be a good estimator of the performance parameters. The optimum net power output, energy efficiency and exergy efficiency are obtained as 227.6 MW, 37.4 and 36.4, respectively, which have been calculated along with optimal values of selected operating parameters.
Robust mislabel logistic regression without modeling mislabel probabilities.
Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun
2018-03-01
Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes into consideration of mislabeled responses. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.
Karami, K; Zerehdaran, S; Barzanooni, B; Lotfi, E
2017-12-01
1. The aim of the present study was to estimate genetic parameters for average egg weight (EW) and egg number (EN) at different ages in Japanese quail using multi-trait random regression (MTRR) models. 2. A total of 8534 records from 900 quail, hatched between 2014 and 2015, were used in the study. Average weekly egg weights and egg numbers were measured from second until sixth week of egg production. 3. Nine random regression models were compared to identify the best order of the Legendre polynomials (LP). The most optimal model was identified by the Bayesian Information Criterion. A model with second order of LP for fixed effects, second order of LP for additive genetic effects and third order of LP for permanent environmental effects (MTRR23) was found to be the best. 4. According to the MTRR23 model, direct heritability for EW increased from 0.26 in the second week to 0.53 in the sixth week of egg production, whereas the ratio of permanent environment to phenotypic variance decreased from 0.48 to 0.1. Direct heritability for EN was low, whereas the ratio of permanent environment to phenotypic variance decreased from 0.57 to 0.15 during the production period. 5. For each trait, estimated genetic correlations among weeks of egg production were high (from 0.85 to 0.98). Genetic correlations between EW and EN were low and negative for the first two weeks, but they were low and positive for the rest of the egg production period. 6. In conclusion, random regression models can be used effectively for analysing egg production traits in Japanese quail. Response to selection for increased egg weight would be higher at older ages because of its higher heritability and such a breeding program would have no negative genetic impact on egg production.
Linear regression models for quantitative assessment of left ...
African Journals Online (AJOL)
Changes in left ventricular structures and function have been reported in cardiomyopathies. No prediction models have been established in this environment. This study established regression models for prediction of left ventricular structures in normal subjects. A sample of normal subjects was drawn from a large urban ...
Geographically Weighted Logistic Regression Applied to Credit Scoring Models
Directory of Open Access Journals (Sweden)
Pedro Henrique Melo Albuquerque
Full Text Available Abstract This study used real data from a Brazilian financial institution on transactions involving Consumer Direct Credit (CDC, granted to clients residing in the Distrito Federal (DF, to construct credit scoring models via Logistic Regression and Geographically Weighted Logistic Regression (GWLR techniques. The aims were: to verify whether the factors that influence credit risk differ according to the borrower’s geographic location; to compare the set of models estimated via GWLR with the global model estimated via Logistic Regression, in terms of predictive power and financial losses for the institution; and to verify the viability of using the GWLR technique to develop credit scoring models. The metrics used to compare the models developed via the two techniques were the AICc informational criterion, the accuracy of the models, the percentage of false positives, the sum of the value of false positive debt, and the expected monetary value of portfolio default compared with the monetary value of defaults observed. The models estimated for each region in the DF were distinct in their variables and coefficients (parameters, with it being concluded that credit risk was influenced differently in each region in the study. The Logistic Regression and GWLR methodologies presented very close results, in terms of predictive power and financial losses for the institution, and the study demonstrated viability in using the GWLR technique to develop credit scoring models for the target population in the study.
Physics constrained nonlinear regression models for time series
International Nuclear Information System (INIS)
Majda, Andrew J; Harlim, John
2013-01-01
A central issue in contemporary science is the development of data driven statistical nonlinear dynamical models for time series of partial observations of nature or a complex physical model. It has been established recently that ad hoc quadratic multi-level regression (MLR) models can have finite-time blow up of statistical solutions and/or pathological behaviour of their invariant measure. Here a new class of physics constrained multi-level quadratic regression models are introduced, analysed and applied to build reduced stochastic models from data of nonlinear systems. These models have the advantages of incorporating memory effects in time as well as the nonlinear noise from energy conserving nonlinear interactions. The mathematical guidelines for the performance and behaviour of these physics constrained MLR models as well as filtering algorithms for their implementation are developed here. Data driven applications of these new multi-level nonlinear regression models are developed for test models involving a nonlinear oscillator with memory effects and the difficult test case of the truncated Burgers–Hopf model. These new physics constrained quadratic MLR models are proposed here as process models for Bayesian estimation through Markov chain Monte Carlo algorithms of low frequency behaviour in complex physical data. (paper)
Chen, Sheng; Hong, Xia; Khalaf, Emad F; Alsaadi, Fuad E; Harris, Chris J
2017-12-01
Complex-valued (CV) B-spline neural network approach offers a highly effective means for identifying and inverting practical Hammerstein systems. Compared with its conventional CV polynomial-based counterpart, a CV B-spline neural network has superior performance in identifying and inverting CV Hammerstein systems, while imposing a similar complexity. This paper reviews the optimality of the CV B-spline neural network approach. Advantages of B-spline neural network approach as compared with the polynomial based modeling approach are extensively discussed, and the effectiveness of the CV neural network-based approach is demonstrated in a real-world application. More specifically, we evaluate the comparative performance of the CV B-spline and polynomial-based approaches for the nonlinear iterative frequency-domain decision feedback equalization (NIFDDFE) of single-carrier Hammerstein channels. Our results confirm the superior performance of the CV B-spline-based NIFDDFE over its CV polynomial-based counterpart.
Directory of Open Access Journals (Sweden)
Nelson Maculan
2003-01-01
Full Text Available We present integer linear models with a polynomial number of variables and constraints for combinatorial optimization problems in graphs: optimum elementary cycles, optimum elementary paths and optimum tree problems.Apresentamos modelos lineares inteiros com um número polinomial de variáveis e restrições para problemas de otimização combinatória em grafos: ciclos elementares ótimos, caminhos elementares ótimos e problemas em árvores ótimas.
Model-based Quantile Regression for Discrete Data
Padellini, Tullia
2018-04-10
Quantile regression is a class of methods voted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite the fact that this leads to a proper posterior for the regression coefficients, the resulting posterior variance is however affected by an unidentifiable parameter, hence any inferential procedure beside point estimation is unreliable. We propose a model-based approach for quantile regression that considers quantiles of the generating distribution directly, and thus allows for a proper uncertainty quantification. We then create a link between quantile regression and generalised linear models by mapping the quantiles to the parameter of the response variable, and we exploit it to fit the model with R-INLA. We extend it also in the case of discrete responses, where there is no 1-to-1 relationship between quantiles and distribution\\'s parameter, by introducing continuous generalisations of the most common discrete variables (Poisson, Binomial and Negative Binomial) to be exploited in the fitting.
Maximum Entropy Discrimination Poisson Regression for Software Reliability Modeling.
Chatzis, Sotirios P; Andreou, Andreas S
2015-11-01
Reliably predicting software defects is one of the most significant tasks in software engineering. Two of the major components of modern software reliability modeling approaches are: 1) extraction of salient features for software system representation, based on appropriately designed software metrics and 2) development of intricate regression models for count data, to allow effective software reliability data modeling and prediction. Surprisingly, research in the latter frontier of count data regression modeling has been rather limited. More specifically, a lack of simple and efficient algorithms for posterior computation has made the Bayesian approaches appear unattractive, and thus underdeveloped in the context of software reliability modeling. In this paper, we try to address these issues by introducing a novel Bayesian regression model for count data, based on the concept of max-margin data modeling, effected in the context of a fully Bayesian model treatment with simple and efficient posterior distribution updates. Our novel approach yields a more discriminative learning technique, making more effective use of our training data during model inference. In addition, it allows of better handling uncertainty in the modeled data, which can be a significant problem when the training data are limited. We derive elegant inference algorithms for our model under the mean-field paradigm and exhibit its effectiveness using the publicly available benchmark data sets.
Forecasting daily meteorological time series using ARIMA and regression models
Murat, Małgorzata; Malinowska, Iwona; Gos, Magdalena; Krzyszczak, Jaromir
2018-04-01
The daily air temperature and precipitation time series recorded between January 1, 1980 and December 31, 2010 in four European sites (Jokioinen, Dikopshof, Lleida and Lublin) from different climatic zones were modeled and forecasted. In our forecasting we used the methods of the Box-Jenkins and Holt- Winters seasonal auto regressive integrated moving-average, the autoregressive integrated moving-average with external regressors in the form of Fourier terms and the time series regression, including trend and seasonality components methodology with R software. It was demonstrated that obtained models are able to capture the dynamics of the time series data and to produce sensible forecasts.
Multiple Response Regression for Gaussian Mixture Models with Known Labels.
Lee, Wonyul; Du, Ying; Sun, Wei; Hayes, D Neil; Liu, Yufeng
2012-12-01
Multiple response regression is a useful regression technique to model multiple response variables using the same set of predictor variables. Most existing methods for multiple response regression are designed for modeling homogeneous data. In many applications, however, one may have heterogeneous data where the samples are divided into multiple groups. Our motivating example is a cancer dataset where the samples belong to multiple cancer subtypes. In this paper, we consider modeling the data coming from a mixture of several Gaussian distributions with known group labels. A naive approach is to split the data into several groups according to the labels and model each group separately. Although it is simple, this approach ignores potential common structures across different groups. We propose new penalized methods to model all groups jointly in which the common and unique structures can be identified. The proposed methods estimate the regression coefficient matrix, as well as the conditional inverse covariance matrix of response variables. Asymptotic properties of the proposed methods are explored. Through numerical examples, we demonstrate that both estimation and prediction can be improved by modeling all groups jointly using the proposed methods. An application to a glioblastoma cancer dataset reveals some interesting common and unique gene relationships across different cancer subtypes.
Thermal Efficiency Degradation Diagnosis Method Using Regression Model
International Nuclear Information System (INIS)
Jee, Chang Hyun; Heo, Gyun Young; Jang, Seok Won; Lee, In Cheol
2011-01-01
This paper proposes an idea for thermal efficiency degradation diagnosis in turbine cycles, which is based on turbine cycle simulation under abnormal conditions and a linear regression model. The correlation between the inputs for representing degradation conditions (normally unmeasured but intrinsic states) and the simulation outputs (normally measured but superficial states) was analyzed with the linear regression model. The regression models can inversely response an associated intrinsic state for a superficial state observed from a power plant. The diagnosis method proposed herein is classified into three processes, 1) simulations for degradation conditions to get measured states (referred as what-if method), 2) development of the linear model correlating intrinsic and superficial states, and 3) determination of an intrinsic state using the superficial states of current plant and the linear regression model (referred as inverse what-if method). The what-if method is to generate the outputs for the inputs including various root causes and/or boundary conditions whereas the inverse what-if method is the process of calculating the inverse matrix with the given superficial states, that is, component degradation modes. The method suggested in this paper was validated using the turbine cycle model for an operating power plant
Harrell , Jr , Frank E
2015-01-01
This highly anticipated second edition features new chapters and sections, 225 new references, and comprehensive R software. In keeping with the previous edition, this book is about the art and science of data analysis and predictive modeling, which entails choosing and using multiple tools. Instead of presenting isolated techniques, this text emphasizes problem solving strategies that address the many issues arising when developing multivariable models using real data and not standard textbook examples. It includes imputation methods for dealing with missing data effectively, methods for fitting nonlinear relationships and for making the estimation of transformations a formal part of the modeling process, methods for dealing with "too many variables to analyze and not enough observations," and powerful model validation techniques based on the bootstrap. The reader will gain a keen understanding of predictive accuracy, and the harm of categorizing continuous predictors or outcomes. This text realistically...
Machado, Fabiana Andrade; Nakamura, Fábio Yuzo; Moraes, Solange Marta Franzói De
2012-01-01
This study examined the influence of the regression model and initial intensity of an incremental test on the relationship between the lactate threshold estimated by the maximal-deviation method and the endurance performance. Sixteen non-competitive, recreational female runners performed a discontinuous incremental treadmill test. The initial speed was set at 7 km · h⁻¹, and increased every 3 min by 1 km · h⁻¹ with a 30-s rest between the stages used for earlobe capillary blood sample collection. Lactate-speed data were fitted by an exponential-plus-constant and a third-order polynomial equation. The lactate threshold was determined for both regression equations, using all the coordinates, excluding the first and excluding the first and second initial points. Mean speed of a 10-km road race was the performance index (3.04 ± 0.22 m · s⁻¹). The exponentially-derived lactate threshold had a higher correlation (0.98 ≤ r ≤ 0.99) and smaller standard error of estimate (SEE) (0.04 ≤ SEE ≤ 0.05 m · s⁻¹) with performance than the polynomially-derived equivalent (0.83 ≤ r ≤ 0.89; 0.10 ≤ SEE ≤ 0.13 m · s⁻¹). The exponential lactate threshold was greater than the polynomial equivalent (P performance index that is independent of the initial intensity of the incremental test and better than the polynomial equivalent.
Flexible competing risks regression modeling and goodness-of-fit
DEFF Research Database (Denmark)
Scheike, Thomas; Zhang, Mei-Jie
2008-01-01
In this paper we consider different approaches for estimation and assessment of covariate effects for the cumulative incidence curve in the competing risks model. The classic approach is to model all cause-specific hazards and then estimate the cumulative incidence curve based on these cause...... models that is easy to fit and contains the Fine-Gray model as a special case. One advantage of this approach is that our regression modeling allows for non-proportional hazards. This leads to a new simple goodness-of-fit procedure for the proportional subdistribution hazards assumption that is very easy...... of the flexible regression models to analyze competing risks data when non-proportionality is present in the data....
The art of regression modeling in road safety
Hauer, Ezra
2015-01-01
This unique book explains how to fashion useful regression models from commonly available data to erect models essential for evidence-based road safety management and research. Composed from techniques and best practices presented over many years of lectures and workshops, The Art of Regression Modeling in Road Safety illustrates that fruitful modeling cannot be done without substantive knowledge about the modeled phenomenon. Class-tested in courses and workshops across North America, the book is ideal for professionals, researchers, university professors, and graduate students with an interest in, or responsibilities related to, road safety. This book also: · Presents for the first time a powerful analytical tool for road safety researchers and practitioners · Includes problems and solutions in each chapter as well as data and spreadsheets for running models and PowerPoint presentation slides · Features pedagogy well-suited for graduate courses and workshops including problems, solutions, and PowerPoint p...
Model building strategy for logistic regression: purposeful selection.
Zhang, Zhongheng
2016-03-01
Logistic regression is one of the most commonly used models to account for confounders in medical literature. The article introduces how to perform purposeful selection model building strategy with R. I stress on the use of likelihood ratio test to see whether deleting a variable will have significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment of remaining covariates. Interaction should be checked to disentangle complex relationship between covariates and their synergistic effect on response variable. Model should be checked for the goodness-of-fit (GOF). In other words, how the fitted model reflects the real data. Hosmer-Lemeshow GOF test is the most widely used for logistic regression model.
Regression analysis of a chemical reaction fouling model
International Nuclear Information System (INIS)
Vasak, F.; Epstein, N.
1996-01-01
A previously reported mathematical model for the initial chemical reaction fouling of a heated tube is critically examined in the light of the experimental data for which it was developed. A regression analysis of the model with respect to that data shows that the reference point upon which the two adjustable parameters of the model were originally based was well chosen, albeit fortuitously. (author). 3 refs., 2 tabs., 2 figs
Spatial stochastic regression modelling of urban land use
International Nuclear Information System (INIS)
Arshad, S H M; Jaafar, J; Abiden, M Z Z; Latif, Z A; Rasam, A R A
2014-01-01
Urbanization is very closely linked to industrialization, commercialization or overall economic growth and development. This results in innumerable benefits of the quantity and quality of the urban environment and lifestyle but on the other hand contributes to unbounded development, urban sprawl, overcrowding and decreasing standard of living. Regulation and observation of urban development activities is crucial. The understanding of urban systems that promotes urban growth are also essential for the purpose of policy making, formulating development strategies as well as development plan preparation. This study aims to compare two different stochastic regression modeling techniques for spatial structure models of urban growth in the same specific study area. Both techniques will utilize the same datasets and their results will be analyzed. The work starts by producing an urban growth model by using stochastic regression modeling techniques namely the Ordinary Least Square (OLS) and Geographically Weighted Regression (GWR). The two techniques are compared to and it is found that, GWR seems to be a more significant stochastic regression model compared to OLS, it gives a smaller AICc (Akaike's Information Corrected Criterion) value and its output is more spatially explainable
Direction of Effects in Multiple Linear Regression Models.
Wiedermann, Wolfgang; von Eye, Alexander
2015-01-01
Previous studies analyzed asymmetric properties of the Pearson correlation coefficient using higher than second order moments. These asymmetric properties can be used to determine the direction of dependence in a linear regression setting (i.e., establish which of two variables is more likely to be on the outcome side) within the framework of cross-sectional observational data. Extant approaches are restricted to the bivariate regression case. The present contribution extends the direction of dependence methodology to a multiple linear regression setting by analyzing distributional properties of residuals of competing multiple regression models. It is shown that, under certain conditions, the third central moments of estimated regression residuals can be used to decide upon direction of effects. In addition, three different approaches for statistical inference are discussed: a combined D'Agostino normality test, a skewness difference test, and a bootstrap difference test. Type I error and power of the procedures are assessed using Monte Carlo simulations, and an empirical example is provided for illustrative purposes. In the discussion, issues concerning the quality of psychological data, possible extensions of the proposed methods to the fourth central moment of regression residuals, and potential applications are addressed.
Logistic regression for risk factor modelling in stuttering research.
Reed, Phil; Wu, Yaqionq
2013-06-01
To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed are demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article you will: (a) Summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) Follow the steps in performing a logistic regression analysis; (c) Describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) Be able to summarize its advantages over other techniques like estimation of group differences and simple regression. Copyright © 2012 Elsevier Inc. All rights reserved.
Modeling and prediction of flotation performance using support vector regression
Directory of Open Access Journals (Sweden)
Despotović Vladimir
2017-01-01
Full Text Available Continuous efforts have been made in recent year to improve the process of paper recycling, as it is of critical importance for saving the wood, water and energy resources. Flotation deinking is considered to be one of the key methods for separation of ink particles from the cellulose fibres. Attempts to model the flotation deinking process have often resulted in complex models that are difficult to implement and use. In this paper a model for prediction of flotation performance based on Support Vector Regression (SVR, is presented. Representative data samples were created in laboratory, under a variety of practical control variables for the flotation deinking process, including different reagents, pH values and flotation residence time. Predictive model was created that was trained on these data samples, and the flotation performance was assessed showing that Support Vector Regression is a promising method even when dataset used for training the model is limited.
Bayesian approach to errors-in-variables in regression models
Rozliman, Nur Aainaa; Ibrahim, Adriana Irawati Nur; Yunus, Rossita Mohammad
2017-05-01
In many applications and experiments, data sets are often contaminated with error or mismeasured covariates. When at least one of the covariates in a model is measured with error, Errors-in-Variables (EIV) model can be used. Measurement error, when not corrected, would cause misleading statistical inferences and analysis. Therefore, our goal is to examine the relationship of the outcome variable and the unobserved exposure variable given the observed mismeasured surrogate by applying the Bayesian formulation to the EIV model. We shall extend the flexible parametric method proposed by Hossain and Gustafson (2009) to another nonlinear regression model which is the Poisson regression model. We shall then illustrate the application of this approach via a simulation study using Markov chain Monte Carlo sampling methods.
Selapa, N W; Nephawe, K A; Maiwashe, A; Norris, D
2012-02-08
The aim of this study was to estimate genetic parameters for body weights of individually fed beef bulls measured at centralized testing stations in South Africa using random regression models. Weekly body weights of Bonsmara bulls (N = 2919) tested between 1999 and 2003 were available for the analyses. The model included a fixed regression of the body weights on fourth-order orthogonal Legendre polynomials of the actual days on test (7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, and 84) for starting age and contemporary group effects. Random regressions on fourth-order orthogonal Legendre polynomials of the actual days on test were included for additive genetic effects and additional uncorrelated random effects of the weaning-herd-year and the permanent environment of the animal. Residual effects were assumed to be independently distributed with heterogeneous variance for each test day. Variance ratios for additive genetic, permanent environment and weaning-herd-year for weekly body weights at different test days ranged from 0.26 to 0.29, 0.37 to 0.44 and 0.26 to 0.34, respectively. The weaning-herd-year was found to have a significant effect on the variation of body weights of bulls despite a 28-day adjustment period. Genetic correlations amongst body weights at different test days were high, ranging from 0.89 to 1.00. Heritability estimates were comparable to literature using multivariate models. Therefore, random regression model could be applied in the genetic evaluation of body weight of individually fed beef bulls in South Africa.
Time series regression model for infectious disease and weather.
Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro
2015-10-01
Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Variable Selection for Regression Models of Percentile Flows
Fouad, G.
2017-12-01
Percentile flows describe the flow magnitude equaled or exceeded for a given percent of time, and are widely used in water resource management. However, these statistics are normally unavailable since most basins are ungauged. Percentile flows of ungauged basins are often predicted using regression models based on readily observable basin characteristics, such as mean elevation. The number of these independent variables is too large to evaluate all possible models. A subset of models is typically evaluated using automatic procedures, like stepwise regression. This ignores a large variety of methods from the field of feature (variable) selection and physical understanding of percentile flows. A study of 918 basins in the United States was conducted to compare an automatic regression procedure to the following variable selection methods: (1) principal component analysis, (2) correlation analysis, (3) random forests, (4) genetic programming, (5) Bayesian networks, and (6) physical understanding. The automatic regression procedure only performed better than principal component analysis. Poor performance of the regression procedure was due to a commonly used filter for multicollinearity, which rejected the strongest models because they had cross-correlated independent variables. Multicollinearity did not decrease model performance in validation because of a representative set of calibration basins. Variable selection methods based strictly on predictive power (numbers 2-5 from above) performed similarly, likely indicating a limit to the predictive power of the variables. Similar performance was also reached using variables selected based on physical understanding, a finding that substantiates recent calls to emphasize physical understanding in modeling for predictions in ungauged basins. The strongest variables highlighted the importance of geology and land cover, whereas widely used topographic variables were the weakest predictors. Variables suffered from a high
Linearity and Misspecification Tests for Vector Smooth Transition Regression Models
DEFF Research Database (Denmark)
Teräsvirta, Timo; Yang, Yukai
The purpose of the paper is to derive Lagrange multiplier and Lagrange multiplier type specification and misspecification tests for vector smooth transition regression models. We report results from simulation studies in which the size and power properties of the proposed asymptotic tests in small...
Application of multilinear regression analysis in modeling of soil ...
African Journals Online (AJOL)
The application of Multi-Linear Regression Analysis (MLRA) model for predicting soil properties in Calabar South offers a technical guide and solution in foundation designs problems in the area. Forty-five soil samples were collected from fifteen different boreholes at a different depth and 270 tests were carried out for CBR, ...
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2009-01-01
In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2010-01-01
In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2011-01-01
In this paper, two non-parametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a more viable alternative to existing kernel-based approaches. The second estimator
A binary logistic regression model with complex sampling design of ...
African Journals Online (AJOL)
2017-09-03
Sep 3, 2017 ... Bi-variable and multi-variable binary logistic regression model with complex sampling design was fitted. .... Data was entered into STATA-12 and analyzed using. SPSS-21. .... lack of access/too far or costs too much. 35. 1.2.
Transpiration of glasshouse rose crops: evaluation of regression models
Baas, R.; Rijssel, van E.
2006-01-01
Regression models of transpiration (T) based on global radiation inside the greenhouse (G), with or without energy input from heating pipes (Eh) and/or vapor pressure deficit (VPD) were parameterized. Therefore, data on T, G, temperatures from air, canopy and heating pipes, and VPD from both a
Directory of Open Access Journals (Sweden)
Liu KJ Ray
2002-01-01
Full Text Available Orthogonal frequency division multiplexing (OFDM is an effective technique for the future 3G communications because of its great immunity to impulse noise and intersymbol interference. The channel estimation is a crucial aspect in the design of OFDM systems. In this work, we propose a channel estimation algorithm based on a time-frequency polynomial model of the fading multipath channels. The algorithm exploits the correlation of the channel responses in both time and frequency domains and hence reduce more noise than the methods using only time or frequency polynomial model. The estimator is also more robust compared to the existing methods based on Fourier transform. The simulation shows that it has more than improvement in terms of mean-squared estimation error under some practical channel conditions. The algorithm needs little prior knowledge about the delay and fading properties of the channel. The algorithm can be implemented recursively and can adjust itself to follow the variation of the channel statistics.
Approximating prediction uncertainty for random forest regression models
John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne
2016-01-01
Machine learning approaches such as random forest haveÂ increased for the spatial modeling and mapping of continuousÂ variables. Random forest is a non-parametric ensembleÂ approach, and unlike traditional regression approaches thereÂ is no direct quantification of prediction error. UnderstandingÂ prediction uncertainty is important when using model-basedÂ continuous maps as...
CICAAR - Convolutive ICA with an Auto-Regressive Inverse Model
DEFF Research Database (Denmark)
Dyrholm, Mads; Hansen, Lars Kai
2004-01-01
We invoke an auto-regressive IIR inverse model for convolutive ICA and derive expressions for the likelihood and its gradient. We argue that optimization will give a stable inverse. When there are more sensors than sources the mixing model parameters are estimated in a second step by least square...... estimation. We demonstrate the method on synthetic data and finally separate speech and music in a real room recording....
On concurvity in nonlinear and nonparametric regression models
Directory of Open Access Journals (Sweden)
Sonia Amodio
2014-12-01
Full Text Available When data are affected by multicollinearity in the linear regression framework, then concurvity will be present in fitting a generalized additive model (GAM. The term concurvity describes nonlinear dependencies among the predictor variables. As collinearity results in inflated variance of the estimated regression coefficients in the linear regression model, the result of the presence of concurvity leads to instability of the estimated coefficients in GAMs. Even if the backfitting algorithm will always converge to a solution, in case of concurvity the final solution of the backfitting procedure in fitting a GAM is influenced by the starting functions. While exact concurvity is highly unlikely, approximate concurvity, the analogue of multicollinearity, is of practical concern as it can lead to upwardly biased estimates of the parameters and to underestimation of their standard errors, increasing the risk of committing type I error. We compare the existing approaches to detect concurvity, pointing out their advantages and drawbacks, using simulated and real data sets. As a result, this paper will provide a general criterion to detect concurvity in nonlinear and non parametric regression models.
Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate
Directory of Open Access Journals (Sweden)
Minh Vu Trieu
2017-03-01
Full Text Available This paper presents statistical analyses of rock engineering properties and the measured penetration rate of tunnel boring machine (TBM based on the data of an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS, Brazilian tensile strength (BTS, rock brittleness index (BI, the distance between planes of weakness (DPW, and the alpha angle (Alpha between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP. Four (4 statistical regression models (two linear and two nonlinear are built to predict the ROP of TBM. Finally a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict the TBM performance. The R-squared value (R2 of the fuzzy logic model scores the highest value of 0.714 over the second runner-up of 0.667 from the multiple variables nonlinear regression model.
Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate
Minh, Vu Trieu; Katushin, Dmitri; Antonov, Maksim; Veinthal, Renno
2017-03-01
This paper presents statistical analyses of rock engineering properties and the measured penetration rate of tunnel boring machine (TBM) based on the data of an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), rock brittleness index (BI), the distance between planes of weakness (DPW), and the alpha angle (Alpha) between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP). Four (4) statistical regression models (two linear and two nonlinear) are built to predict the ROP of TBM. Finally a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict the TBM performance. The R-squared value (R2) of the fuzzy logic model scores the highest value of 0.714 over the second runner-up of 0.667 from the multiple variables nonlinear regression model.
Detection of Outliers in Regression Model for Medical Data
Directory of Open Access Journals (Sweden)
Stephen Raj S
2017-07-01
Full Text Available In regression analysis, an outlier is an observation for which the residual is large in magnitude compared to other observations in the data set. The detection of outliers and influential points is an important step of the regression analysis. Outlier detection methods have been used to detect and remove anomalous values from data. In this paper, we detect the presence of outliers in simple linear regression models for medical data set. Chatterjee and Hadi mentioned that the ordinary residuals are not appropriate for diagnostic purposes; a transformed version of them is preferable. First, we investigate the presence of outliers based on existing procedures of residuals and standardized residuals. Next, we have used the new approach of standardized scores for detecting outliers without the use of predicted values. The performance of the new approach was verified with the real-life data.
Hierarchical Neural Regression Models for Customer Churn Prediction
Directory of Open Access Journals (Sweden)
Golshan Mohammadi
2013-01-01
Full Text Available As customers are the main assets of each industry, customer churn prediction is becoming a major task for companies to remain in competition with competitors. In the literature, the better applicability and efficiency of hierarchical data mining techniques has been reported. This paper considers three hierarchical models by combining four different data mining techniques for churn prediction, which are backpropagation artificial neural networks (ANN, self-organizing maps (SOM, alpha-cut fuzzy c-means (α-FCM, and Cox proportional hazards regression model. The hierarchical models are ANN + ANN + Cox, SOM + ANN + Cox, and α-FCM + ANN + Cox. In particular, the first component of the models aims to cluster data in two churner and nonchurner groups and also filter out unrepresentative data or outliers. Then, the clustered data as the outputs are used to assign customers to churner and nonchurner groups by the second technique. Finally, the correctly classified data are used to create Cox proportional hazards model. To evaluate the performance of the hierarchical models, an Iranian mobile dataset is considered. The experimental results show that the hierarchical models outperform the single Cox regression baseline model in terms of prediction accuracy, Types I and II errors, RMSE, and MAD metrics. In addition, the α-FCM + ANN + Cox model significantly performs better than the two other hierarchical models.
Electricity consumption forecasting in Italy using linear regression models
Energy Technology Data Exchange (ETDEWEB)
Bianco, Vincenzo; Manca, Oronzio; Nardini, Sergio [DIAM, Seconda Universita degli Studi di Napoli, Via Roma 29, 81031 Aversa (CE) (Italy)
2009-09-15
The influence of economic and demographic variables on the annual electricity consumption in Italy has been investigated with the intention to develop a long-term consumption forecasting model. The time period considered for the historical data is from 1970 to 2007. Different regression models were developed, using historical electricity consumption, gross domestic product (GDP), gross domestic product per capita (GDP per capita) and population. A first part of the paper considers the estimation of GDP, price and GDP per capita elasticities of domestic and non-domestic electricity consumption. The domestic and non-domestic short run price elasticities are found to be both approximately equal to -0.06, while long run elasticities are equal to -0.24 and -0.09, respectively. On the contrary, the elasticities of GDP and GDP per capita present higher values. In the second part of the paper, different regression models, based on co-integrated or stationary data, are presented. Different statistical tests are employed to check the validity of the proposed models. A comparison with national forecasts, based on complex econometric models, such as Markal-Time, was performed, showing that the developed regressions are congruent with the official projections, with deviations of {+-}1% for the best case and {+-}11% for the worst. These deviations are to be considered acceptable in relation to the time span taken into account. (author)
Electricity consumption forecasting in Italy using linear regression models
International Nuclear Information System (INIS)
Bianco, Vincenzo; Manca, Oronzio; Nardini, Sergio
2009-01-01
The influence of economic and demographic variables on the annual electricity consumption in Italy has been investigated with the intention to develop a long-term consumption forecasting model. The time period considered for the historical data is from 1970 to 2007. Different regression models were developed, using historical electricity consumption, gross domestic product (GDP), gross domestic product per capita (GDP per capita) and population. A first part of the paper considers the estimation of GDP, price and GDP per capita elasticities of domestic and non-domestic electricity consumption. The domestic and non-domestic short run price elasticities are found to be both approximately equal to -0.06, while long run elasticities are equal to -0.24 and -0.09, respectively. On the contrary, the elasticities of GDP and GDP per capita present higher values. In the second part of the paper, different regression models, based on co-integrated or stationary data, are presented. Different statistical tests are employed to check the validity of the proposed models. A comparison with national forecasts, based on complex econometric models, such as Markal-Time, was performed, showing that the developed regressions are congruent with the official projections, with deviations of ±1% for the best case and ±11% for the worst. These deviations are to be considered acceptable in relation to the time span taken into account. (author)
Regression Model to Predict Global Solar Irradiance in Malaysia
Directory of Open Access Journals (Sweden)
Hairuniza Ahmed Kutty
2015-01-01
Full Text Available A novel regression model is developed to estimate the monthly global solar irradiance in Malaysia. The model is developed based on different available meteorological parameters, including temperature, cloud cover, rain precipitate, relative humidity, wind speed, pressure, and gust speed, by implementing regression analysis. This paper reports on the details of the analysis of the effect of each prediction parameter to identify the parameters that are relevant to estimating global solar irradiance. In addition, the proposed model is compared in terms of the root mean square error (RMSE, mean bias error (MBE, and the coefficient of determination (R2 with other models available from literature studies. Seven models based on single parameters (PM1 to PM7 and five multiple-parameter models (PM7 to PM12 are proposed. The new models perform well, with RMSE ranging from 0.429% to 1.774%, R2 ranging from 0.942 to 0.992, and MBE ranging from −0.1571% to 0.6025%. In general, cloud cover significantly affects the estimation of global solar irradiance. However, cloud cover in Malaysia lacks sufficient influence when included into multiple-parameter models although it performs fairly well in single-parameter prediction models.
Two-step variable selection in quantile regression models
Directory of Open Access Journals (Sweden)
FAN Yali
2015-06-01
Full Text Available We propose a two-step variable selection procedure for high dimensional quantile regressions, in which the dimension of the covariates, pn is much larger than the sample size n. In the first step, we perform ℓ1 penalty, and we demonstrate that the first step penalized estimator with the LASSO penalty can reduce the model from an ultra-high dimensional to a model whose size has the same order as that of the true model, and the selected model can cover the true model. The second step excludes the remained irrelevant covariates by applying the adaptive LASSO penalty to the reduced model obtained from the first step. Under some regularity conditions, we show that our procedure enjoys the model selection consistency. We conduct a simulation study and a real data analysis to evaluate the finite sample performance of the proposed approach.
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article are further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
Santos-Concejero, Jordan; Tucker, Ross; Granados, Cristina; Irazusta, Jon; Bidaurrazaga-Letona, Iraia; Zabala-Lili, Jon; Gil, Susana María
2014-01-01
This study investigated the influence of the regression model and initial intensity during an incremental test on the relationship between the lactate threshold estimated by the maximal-deviation method and performance in elite-standard runners. Twenty-three well-trained runners completed a discontinuous incremental running test on a treadmill. Speed started at 9 km · h(-1) and increased by 1.5 km · h(-1) every 4 min until exhaustion, with a minute of recovery for blood collection. Lactate-speed data were fitted by exponential and polynomial models. The lactate threshold was determined for both models, using all the co-ordinates, excluding the first and excluding the first and second points. The exponential lactate threshold was greater than the polynomial equivalent in any co-ordinate condition (P performance and is independent of the initial intensity of the test.
THE REGRESSION MODEL OF IRAN LIBRARIES ORGANIZATIONAL CLIMATE
Jahani, Mohammad Ali; Yaminfirooz, Mousa; Siamian, Hasan
2015-01-01
Background: The purpose of this study was to drawing a regression model of organizational climate of central libraries of Iran?s universities. Methods: This study is an applied research. The statistical population of this study consisted of 96 employees of the central libraries of Iran?s public universities selected among the 117 universities affiliated to the Ministry of Health by Stratified Sampling method (510 people). Climate Qual localized questionnaire was used as research tools. For pr...
BOX-COX transformation and random regression models for fecal egg count data
Directory of Open Access Journals (Sweden)
Marcos Vinicius Silva
2012-01-01
Full Text Available Accurate genetic evaluation of livestock is based on appropriate modeling of phenotypic measurements. In ruminants fecal egg count (FEC is commonly used to measure resistance to nematodes. FEC values are not normally distributed and logarithmic transformations have been used to achieve normality before analysis. However, the transformed data are often not normally distributed, especially when data are extremely skewed. A series of repeated FEC measurements may provide information about the population dynamics of a group or individual. A total of 6,375 FEC measures were obtained for 410 animals between 1992 and 2003 from the Beltsville Agricultural Research Center Angus herd. Original data were transformed using an extension of the Box-Cox transformation to approach normality and to estimate (covariance components. We also proposed using random regression models (RRM for genetic and non-genetic studies of FEC. Phenotypes were analyzed using RRM and restricted maximum likelihood. Within the different orders of Legendre polynomials used, those with more parameters (order 4 adjusted FEC data best. Results indicated that the transformation of FEC data utilizing the Box-Cox transformation family was effective in reducing the skewness and kurtosis, and dramatically increased estimates of heritability, and measurements of FEC obtained in the period between 12 and 26 weeks in a 26-week experimental challenge period are genetically correlated.
Box-Cox Transformation and Random Regression Models for Fecal egg Count Data.
da Silva, Marcos Vinícius Gualberto Barbosa; Van Tassell, Curtis P; Sonstegard, Tad S; Cobuci, Jaime Araujo; Gasbarre, Louis C
2011-01-01
Accurate genetic evaluation of livestock is based on appropriate modeling of phenotypic measurements. In ruminants, fecal egg count (FEC) is commonly used to measure resistance to nematodes. FEC values are not normally distributed and logarithmic transformations have been used in an effort to achieve normality before analysis. However, the transformed data are often still not normally distributed, especially when data are extremely skewed. A series of repeated FEC measurements may provide information about the population dynamics of a group or individual. A total of 6375 FEC measures were obtained for 410 animals between 1992 and 2003 from the Beltsville Agricultural Research Center Angus herd. Original data were transformed using an extension of the Box-Cox transformation to approach normality and to estimate (co)variance components. We also proposed using random regression models (RRM) for genetic and non-genetic studies of FEC. Phenotypes were analyzed using RRM and restricted maximum likelihood. Within the different orders of Legendre polynomials used, those with more parameters (order 4) adjusted FEC data best. Results indicated that the transformation of FEC data utilizing the Box-Cox transformation family was effective in reducing the skewness and kurtosis, and dramatically increased estimates of heritability, and measurements of FEC obtained in the period between 12 and 26 weeks in a 26-week experimental challenge period are genetically correlated.
Directory of Open Access Journals (Sweden)
C. I. Cho
2016-05-01
Full Text Available The objectives of the study were to estimate genetic parameters for milk production traits of Holstein cattle using random regression models (RRMs, and to compare the goodness of fit of various RRMs with homogeneous and heterogeneous residual variances. A total of 126,980 test-day milk production records of the first parity Holstein cows between 2007 and 2014 from the Dairy Cattle Improvement Center of National Agricultural Cooperative Federation in South Korea were used. These records included milk yield (MILK, fat yield (FAT, protein yield (PROT, and solids-not-fat yield (SNF. The statistical models included random effects of genetic and permanent environments using Legendre polynomials (LP of the third to fifth order (L3–L5, fixed effects of herd-test day, year-season at calving, and a fixed regression for the test-day record (third to fifth order. The residual variances in the models were either homogeneous (HOM or heterogeneous (15 classes, HET15; 60 classes, HET60. A total of nine models (3 orders of polynomials×3 types of residual variance including L3-HOM, L3-HET15, L3-HET60, L4-HOM, L4-HET15, L4-HET60, L5-HOM, L5-HET15, and L5-HET60 were compared using Akaike information criteria (AIC and/or Schwarz Bayesian information criteria (BIC statistics to identify the model(s of best fit for their respective traits. The lowest BIC value was observed for the models L5-HET15 (MILK; PROT; SNF and L4-HET15 (FAT, which fit the best. In general, the BIC values of HET15 models for a particular polynomial order was lower than that of the HET60 model in most cases. This implies that the orders of LP and types of residual variances affect the goodness of models. Also, the heterogeneity of residual variances should be considered for the test-day analysis. The heritability estimates of from the best fitted models ranged from 0.08 to 0.15 for MILK, 0.06 to 0.14 for FAT, 0.08 to 0.12 for PROT, and 0.07 to 0.13 for SNF according to days in milk of first
Online Statistical Modeling (Regression Analysis) for Independent Responses
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical analmodelling) are among statistical methods which are frequently needed in analyzing quantitative data, especially to model relationship between response and explanatory variables. Nowadays, statistical models have been developed into various directions to model various type and complex relationship of data. Rich varieties of advanced and recent statistical modelling are mostly available on open source software (one of them is R). However, these advanced statistical modelling, are not very friendly to novice R users, since they are based on programming script or command line interface. Our research aims to developed web interface (based on R and shiny), so that most recent and advanced statistical modelling are readily available, accessible and applicable on web. We have previously made interface in the form of e-tutorial for several modern and advanced statistical modelling on R especially for independent responses (including linear models/LM, generalized linier models/GLM, generalized additive model/GAM and generalized additive model for location scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including model using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/ MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web (interface) make the statistical modeling becomes easier to apply and easier to compare them in order to find the most appropriate model for the data.
Directory of Open Access Journals (Sweden)
Arístides Alejandro Legrá-Lobaina
2016-10-01
Full Text Available The local polynomial method is based on assuming that is possible to estimate the value of a U variable in any of the P coordinate through local polynomials estimated based on approximate data. This investigation analyzes the probability of modeling in two dimensions the thickness and nickel, iron and cobalt concentrations in a block of Cuban laterite ores by using the mentioned method. It was also analyzed if the results of modeling these variables depend on the estimation method that is used.
Irreducible multivariate polynomials obtained from polynomials in ...
Indian Academy of Sciences (India)
Hall, 1409 W. Green Street, Urbana, IL 61801, USA. E-mail: Nicolae. ... Theorem A. If we write an irreducible polynomial f ∈ K[X] as a sum of polynomials a0,..., an ..... This shows us that deg ai = (n − i) deg f2 for each i = 0,..., n, so min k>0.
Reconstruction of missing daily streamflow data using dynamic regression models
Tencaliec, Patricia; Favre, Anne-Catherine; Prieur, Clémentine; Mathevet, Thibault
2015-12-01
River discharge is one of the most important quantities in hydrology. It provides fundamental records for water resources management and climate change monitoring. Even very short data-gaps in this information can cause extremely different analysis outputs. Therefore, reconstructing missing data of incomplete data sets is an important step regarding the performance of the environmental models, engineering, and research applications, thus it presents a great challenge. The objective of this paper is to introduce an effective technique for reconstructing missing daily discharge data when one has access to only daily streamflow data. The proposed procedure uses a combination of regression and autoregressive integrated moving average models (ARIMA) called dynamic regression model. This model uses the linear relationship between neighbor and correlated stations and then adjusts the residual term by fitting an ARIMA structure. Application of the model to eight daily streamflow data for the Durance river watershed showed that the model yields reliable estimates for the missing data in the time series. Simulation studies were also conducted to evaluate the performance of the procedure.
Predicting and Modelling of Survival Data when Cox's Regression Model does not hold
DEFF Research Database (Denmark)
Scheike, Thomas H.; Zhang, Mei-Jie
2002-01-01
Aalen model; additive risk model; counting processes; competing risk; Cox regression; flexible modeling; goodness of fit; prediction of survival; survival analysis; time-varying effects......Aalen model; additive risk model; counting processes; competing risk; Cox regression; flexible modeling; goodness of fit; prediction of survival; survival analysis; time-varying effects...
Directory of Open Access Journals (Sweden)
Renata Pires Gonçalves
2012-02-01
. The experiments of type dosage x response are very common in the determination of levels of nutrients in optimal food balance and include the use of regression models to achieve this objective. Nevertheless, the regression analysis routine, generally, uses a priori information about a possible relationship between the response variable. The isotonic regression is a method of estimation by least squares that generates estimates which preserves data ordering. In the theory of isotonic regression this information is essential and it is expected to increase fitting efficiency. The objective of this work was to use an isotonic regression methodology, as an alternative way of analyzing data of Zn deposition in tibia of male birds of Hubbard lineage. We considered the models of plateau response of polynomial quadratic and linear exponential forms. In addition to these models, we also proposed the fitting of a logarithmic model to the data and the efficiency of the methodology was evaluated by Monte Carlo simulations, considering different scenarios for the parametric values. The isotonization of the data yielded an improvement in all the fitting quality parameters evaluated. Among the models used, the logarithmic presented estimates of the parameters more consistent with the values reported in literature.
Extended cox regression model: The choice of timefunction
Isik, Hatice; Tutkun, Nihal Ata; Karasoy, Durdu
2017-07-01
Cox regression model (CRM), which takes into account the effect of censored observations, is one the most applicative and usedmodels in survival analysis to evaluate the effects of covariates. Proportional hazard (PH), requires a constant hazard ratio over time, is the assumptionofCRM. Using extended CRM provides the test of including a time dependent covariate to assess the PH assumption or an alternative model in case of nonproportional hazards. In this study, the different types of real data sets are used to choose the time function and the differences between time functions are analyzed and discussed.
A test of inflated zeros for Poisson regression models.
He, Hua; Zhang, Hui; Ye, Peng; Tang, Wan
2017-01-01
Excessive zeros are common in practice and may cause overdispersion and invalidate inference when fitting Poisson regression models. There is a large body of literature on zero-inflated Poisson models. However, methods for testing whether there are excessive zeros are less well developed. The Vuong test comparing a Poisson and a zero-inflated Poisson model is commonly applied in practice. However, the type I error of the test often deviates seriously from the nominal level, rendering serious doubts on the validity of the test in such applications. In this paper, we develop a new approach for testing inflated zeros under the Poisson model. Unlike the Vuong test for inflated zeros, our method does not require a zero-inflated Poisson model to perform the test. Simulation studies show that when compared with the Vuong test our approach not only better at controlling type I error rate, but also yield more power.
Kruglyakov, Mikhail; Kuvshinov, Alexey
2018-05-01
3-D interpretation of electromagnetic (EM) data of different origin and scale becomes a common practice worldwide. However, 3-D EM numerical simulations (modeling)—a key part of any 3-D EM data analysis—with realistic levels of complexity, accuracy and spatial detail still remains challenging from the computational point of view. We present a novel, efficient 3-D numerical solver based on a volume integral equation (IE) method. The efficiency is achieved by using a high-order polynomial (HOP) basis instead of the zero-order (piecewise constant) basis that is invoked in all routinely used IE-based solvers. We demonstrate that usage of the HOP basis allows us to decrease substantially the number of unknowns (preserving the same accuracy), with corresponding speed increase and memory saving.
Multivariate Frequency-Severity Regression Models in Insurance
Directory of Open Access Journals (Sweden)
Edward W. Frees
2016-02-01
Full Text Available In insurance and related industries including healthcare, it is common to have several outcome measures that the analyst wishes to understand using explanatory variables. For example, in automobile insurance, an accident may result in payments for damage to one’s own vehicle, damage to another party’s vehicle, or personal injury. It is also common to be interested in the frequency of accidents in addition to the severity of the claim amounts. This paper synthesizes and extends the literature on multivariate frequency-severity regression modeling with a focus on insurance industry applications. Regression models for understanding the distribution of each outcome continue to be developed yet there now exists a solid body of literature for the marginal outcomes. This paper contributes to this body of literature by focusing on the use of a copula for modeling the dependence among these outcomes; a major advantage of this tool is that it preserves the body of work established for marginal models. We illustrate this approach using data from the Wisconsin Local Government Property Insurance Fund. This fund offers insurance protection for (i property; (ii motor vehicle; and (iii contractors’ equipment claims. In addition to several claim types and frequency-severity components, outcomes can be further categorized by time and space, requiring complex dependency modeling. We find significant dependencies for these data; specifically, we find that dependencies among lines are stronger than the dependencies between the frequency and average severity within each line.
Augmented Beta rectangular regression models: A Bayesian perspective.
Wang, Jue; Luo, Sheng
2016-01-01
Mixed effects Beta regression models based on Beta distributions have been widely used to analyze longitudinal percentage or proportional data ranging between zero and one. However, Beta distributions are not flexible to extreme outliers or excessive events around tail areas, and they do not account for the presence of the boundary values zeros and ones because these values are not in the support of the Beta distributions. To address these issues, we propose a mixed effects model using Beta rectangular distribution and augment it with the probabilities of zero and one. We conduct extensive simulation studies to assess the performance of mixed effects models based on both the Beta and Beta rectangular distributions under various scenarios. The simulation studies suggest that the regression models based on Beta rectangular distributions improve the accuracy of parameter estimates in the presence of outliers and heavy tails. The proposed models are applied to the motivating Neuroprotection Exploratory Trials in Parkinson's Disease (PD) Long-term Study-1 (LS-1 study, n = 1741), developed by The National Institute of Neurological Disorders and Stroke Exploratory Trials in Parkinson's Disease (NINDS NET-PD) network. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Bayesian semiparametric regression models to characterize molecular evolution
Directory of Open Access Journals (Sweden)
Datta Saheli
2012-10-01
Full Text Available Abstract Background Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Dirichlet process prior on the distribution of the regression coefficients that describes the relationship between the changes in amino acid distances and natural selection in protein-coding DNA sequence alignments. Results The Bayesian semiparametric approach is illustrated with simulated data and the abalone lysin sperm data. Our method identifies groups of properties which, for this particular dataset, have a similar effect on evolution. The model also provides nonparametric site-specific estimates for the strength of conservation of these properties. Conclusions The model described here is distinguished by its ability to handle a large number of amino acid properties simultaneously, while taking into account that such data can be correlated. The multi-level clustering ability of the model allows for appealing interpretations of the results in terms of properties that are roughly equivalent from the standpoint of molecular evolution.
Regularized multivariate regression models with skew-t error distributions
Chen, Lianfu
2014-06-01
We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both the regression coefficient and inverse scale matrices simultaneously. The sparsity is introduced through penalizing the negative log-likelihood by adding L1-penalties on the entries of the two matrices. Taking advantage of the hierarchical representation of skew-t distributions, and using the expectation conditional maximization (ECM) algorithm, we reduce the problem to penalized normal likelihood and develop a procedure to minimize the ensuing objective function. Using a simulation study the performance of the method is assessed, and the methodology is illustrated using a real data set with a 24-dimensional response vector. © 2014 Elsevier B.V.
Modeling the number of car theft using Poisson regression
Zulkifli, Malina; Ling, Agnes Beh Yen; Kasim, Maznah Mat; Ismail, Noriszura
2016-10-01
Regression analysis is the most popular statistical methods used to express the relationship between the variables of response with the covariates. The aim of this paper is to evaluate the factors that influence the number of car theft using Poisson regression model. This paper will focus on the number of car thefts that occurred in districts in Peninsular Malaysia. There are two groups of factor that have been considered, namely district descriptive factors and socio and demographic factors. The result of the study showed that Bumiputera composition, Chinese composition, Other ethnic composition, foreign migration, number of residence with the age between 25 to 64, number of employed person and number of unemployed person are the most influence factors that affect the car theft cases. These information are very useful for the law enforcement department, insurance company and car owners in order to reduce and limiting the car theft cases in Peninsular Malaysia.
Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald
2011-06-01
Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. HC-PLSR is a promising approach for
Directory of Open Access Journals (Sweden)
Omholt Stig W
2011-06-01
Full Text Available Abstract Background Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs to variation in features of the trajectories of the state variables (outputs throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR, where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR and ordinary least squares (OLS regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback
Branched polynomial covering maps
DEFF Research Database (Denmark)
Hansen, Vagn Lundsgaard
1999-01-01
A Weierstrass polynomial with multiple roots in certain points leads to a branched covering map. With this as the guiding example, we formally define and study the notion of a branched polynomial covering map. We shall prove that many finite covering maps are polynomial outside a discrete branch...... set. Particular studies are made of branched polynomial covering maps arising from Riemann surfaces and from knots in the 3-sphere....
Dynamic Regression Intervention Modeling for the Malaysian Daily Load
Directory of Open Access Journals (Sweden)
Fadhilah Abdrazak
2014-05-01
Full Text Available Malaysia is a unique country due to having both fixed and moving holidays. These moving holidays may overlap with other fixed holidays and therefore, increase the complexity of the load forecasting activities. The errors due to holidays’ effects in the load forecasting are known to be higher than other factors. If these effects can be estimated and removed, the behavior of the series could be better viewed. Thus, the aim of this paper is to improve the forecasting errors by using a dynamic regression model with intervention analysis. Based on the linear transfer function method, a daily load model consists of either peak or average is developed. The developed model outperformed the seasonal ARIMA model in estimating the fixed and moving holidays’ effects and achieved a smaller Mean Absolute Percentage Error (MAPE in load forecast.
Learning Supervised Topic Models for Classification and Regression from Crowds.
Rodrigues, Filipe; Lourenco, Mariana; Ribeiro, Bernardete; Pereira, Francisco C
2017-12-01
The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, the nature of most annotation tasks, prone to ambiguity and noise, often with high volumes of documents, deem learning under a single-annotator assumption unrealistic or unpractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages of the proposed model over state-of-the-art approaches.
Continuous validation of ASTEC containment models and regression testing
International Nuclear Information System (INIS)
Nowack, Holger; Reinke, Nils; Sonnenkalb, Martin
2014-01-01
The focus of the ASTEC (Accident Source Term Evaluation Code) development at GRS is primarily on the containment module CPA (Containment Part of ASTEC), whose modelling is to a large extent based on the GRS containment code COCOSYS (COntainment COde SYStem). Validation is usually understood as the approval of the modelling capabilities by calculations of appropriate experiments done by external users different from the code developers. During the development process of ASTEC CPA, bugs and unintended side effects may occur, which leads to changes in the results of the initially conducted validation. Due to the involvement of a considerable number of developers in the coding of ASTEC modules, validation of the code alone, even if executed repeatedly, is not sufficient. Therefore, a regression testing procedure has been implemented in order to ensure that the initially obtained validation results are still valid with succeeding code versions. Within the regression testing procedure, calculations of experiments and plant sequences are performed with the same input deck but applying two different code versions. For every test-case the up-to-date code version is compared to the preceding one on the basis of physical parameters deemed to be characteristic for the test-case under consideration. In the case of post-calculations of experiments also a comparison to experimental data is carried out. Three validation cases from the regression testing procedure are presented within this paper. The very good post-calculation of the HDR E11.1 experiment shows the high quality modelling of thermal-hydraulics in ASTEC CPA. Aerosol behaviour is validated on the BMC VANAM M3 experiment, and the results show also a very good agreement with experimental data. Finally, iodine behaviour is checked in the validation test-case of the THAI IOD-11 experiment. Within this test-case, the comparison of the ASTEC versions V2.0r1 and V2.0r2 shows how an error was detected by the regression testing
Modeling of the Monthly Rainfall-Runoff Process Through Regressions
Directory of Open Access Journals (Sweden)
Campos-Aranda Daniel Francisco
2014-10-01
Full Text Available To solve the problems associated with the assessment of water resources of a river, the modeling of the rainfall-runoff process (RRP allows the deduction of runoff missing data and to extend its record, since generally the information available on precipitation is larger. It also enables the estimation of inputs to reservoirs, when their building led to the suppression of the gauging station. The simplest mathematical model that can be set for the RRP is the linear regression or curve on a monthly basis. Such a model is described in detail and is calibrated with the simultaneous record of monthly rainfall and runoff in Ballesmi hydrometric station, which covers 35 years. Since the runoff of this station has an important contribution from the spring discharge, the record is corrected first by removing that contribution. In order to do this a procedure was developed based either on the monthly average regional runoff coefficients or on nearby and similar watershed; in this case the Tancuilín gauging station was used. Both stations belong to the Partial Hydrologic Region No. 26 (Lower Rio Panuco and are located within the state of San Luis Potosi, México. The study performed indicates that the monthly regression model, due to its conceptual approach, faithfully reproduces monthly average runoff volumes and achieves an excellent approximation in relation to the dispersion, proved by calculation of the means and standard deviations.
Energy Technology Data Exchange (ETDEWEB)
Wu, Xu, E-mail: xuwu2@illinois.edu; Kozlowski, Tomasz
2017-03-15
Modeling and simulations are naturally augmented by extensive Uncertainty Quantification (UQ) and sensitivity analysis requirements in the nuclear reactor system design, in which uncertainties must be quantified in order to prove that the investigated design stays within acceptance criteria. Historically, expert judgment has been used to specify the nominal values, probability density functions and upper and lower bounds of the simulation code random input parameters for the forward UQ process. The purpose of this paper is to replace such ad-hoc expert judgment of the statistical properties of input model parameters with inverse UQ process. Inverse UQ seeks statistical descriptions of the model random input parameters that are consistent with the experimental data. Bayesian analysis is used to establish the inverse UQ problems based on experimental data, with systematic and rigorously derived surrogate models based on Polynomial Chaos Expansion (PCE). The methods developed here are demonstrated with the Point Reactor Kinetics Equation (PRKE) coupled with lumped parameter thermal-hydraulics feedback model. Three input parameters, external reactivity, Doppler reactivity coefficient and coolant temperature coefficient are modeled as uncertain input parameters. Their uncertainties are inversely quantified based on synthetic experimental data. Compared with the direct numerical simulation, surrogate model by PC expansion shows high efficiency and accuracy. In addition, inverse UQ with Bayesian analysis can calibrate the random input parameters such that the simulation results are in a better agreement with the experimental data.
Interpreting parameters in the logistic regression model with random effects
DEFF Research Database (Denmark)
Larsen, Klaus; Petersen, Jørgen Holm; Budtz-Jørgensen, Esben
2000-01-01
interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects......interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects...
Khairudin, N.; Keesman, K.J.
2009-01-01
In this paper a novel approach to estimate parameters in an LTI continuous-time statespace model is proposed. Essentially, the approach is based on a so-called pqR-decomposition of the numerator and denominator polynomials of the system’s transfer function. This approach allows the physical
Sraj, Ihab
2016-08-26
The authors present a polynomial chaos (PC)-based Bayesian inference method for quantifying the uncertainties of the K-profile parameterization (KPP) within the MIT general circulation model (MITgcm) of the tropical Pacific. The inference of the uncertain parameters is based on a Markov chain Monte Carlo (MCMC) scheme that utilizes a newly formulated test statistic taking into account the different components representing the structures of turbulent mixing on both daily and seasonal time scales in addition to the data quality, and filters for the effects of parameter perturbations over those as a result of changes in the wind. To avoid the prohibitive computational cost of integrating the MITgcm model at each MCMC iteration, a surrogate model for the test statistic using the PC method is built. Because of the noise in the model predictions, a basis-pursuit-denoising (BPDN) compressed sensing approach is employed to determine the PC coefficients of a representative surrogate model. The PC surrogate is then used to evaluate the test statistic in the MCMC step for sampling the posterior of the uncertain parameters. Results of the posteriors indicate good agreement with the default values for two parameters of the KPP model, namely the critical bulk and gradient Richardson numbers; while the posteriors of the remaining parameters were barely informative. © 2016 American Meteorological Society.
Learning Supervised Topic Models for Classification and Regression from Crowds
DEFF Research Database (Denmark)
Rodrigues, Filipe; Lourenco, Mariana; Ribeiro, Bernardete
2017-01-01
problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages...... annotation tasks, prone to ambiguity and noise, often with high volumes of documents, deem learning under a single-annotator assumption unrealistic or unpractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression...
Preference learning with evolutionary Multivariate Adaptive Regression Spline model
DEFF Research Database (Denmark)
Abou-Zleikha, Mohamed; Shaker, Noor; Christensen, Mads Græsbøll
2015-01-01
This paper introduces a novel approach for pairwise preference learning through combining an evolutionary method with Multivariate Adaptive Regression Spline (MARS). Collecting users' feedback through pairwise preferences is recommended over other ranking approaches as this method is more appealing...... for function approximation as well as being relatively easy to interpret. MARS models are evolved based on their efficiency in learning pairwise data. The method is tested on two datasets that collectively provide pairwise preference data of five cognitive states expressed by users. The method is analysed...
Predicting Performance on MOOC Assessments using Multi-Regression Models
Ren, Zhiyun; Rangwala, Huzefa; Johri, Aditya
2016-01-01
The past few years has seen the rapid growth of data min- ing approaches for the analysis of data obtained from Mas- sive Open Online Courses (MOOCs). The objectives of this study are to develop approaches to predict the scores a stu- dent may achieve on a given grade-related assessment based on information, considered as prior performance or prior ac- tivity in the course. We develop a personalized linear mul- tiple regression (PLMR) model to predict the grade for a student, prior to attempt...
Analytical and regression models of glass rod drawing process
Alekseeva, L. B.
2018-03-01
The process of drawing glass rods (light guides) is being studied. The parameters of the process affecting the quality of the light guide have been determined. To solve the problem, mathematical models based on general equations of continuum mechanics are used. The conditions for the stable flow of the drawing process have been found, which are determined by the stability of the motion of the glass mass in the formation zone to small uncontrolled perturbations. The sensitivity of the formation zone to perturbations of the drawing speed and viscosity is estimated. Experimental models of the drawing process, based on the regression analysis methods, have been obtained. These models make it possible to customize a specific production process to obtain light guides of the required quality. They allow one to find the optimum combination of process parameters in the chosen area and to determine the required accuracy of maintaining them at a specified level.
Branched polynomial covering maps
DEFF Research Database (Denmark)
Hansen, Vagn Lundsgaard
2002-01-01
A Weierstrass polynomial with multiple roots in certain points leads to a branched covering map. With this as the guiding example, we formally define and study the notion of a branched polynomial covering map. We shall prove that many finite covering maps are polynomial outside a discrete branch ...... set. Particular studies are made of branched polynomial covering maps arising from Riemann surfaces and from knots in the 3-sphere. (C) 2001 Elsevier Science B.V. All rights reserved.......A Weierstrass polynomial with multiple roots in certain points leads to a branched covering map. With this as the guiding example, we formally define and study the notion of a branched polynomial covering map. We shall prove that many finite covering maps are polynomial outside a discrete branch...
Regression Models for Predicting Force Coefficients of Aerofoils
Directory of Open Access Journals (Sweden)
Mohammed ABDUL AKBAR
2015-09-01
Full Text Available Renewable sources of energy are attractive and advantageous in a lot of different ways. Among the renewable energy sources, wind energy is the fastest growing type. Among wind energy converters, Vertical axis wind turbines (VAWTs have received renewed interest in the past decade due to some of the advantages they possess over their horizontal axis counterparts. VAWTs have evolved into complex 3-D shapes. A key component in predicting the output of VAWTs through analytical studies is obtaining the values of lift and drag coefficients which is a function of shape of the aerofoil, ‘angle of attack’ of wind and Reynolds’s number of flow. Sandia National Laboratories have carried out extensive experiments on aerofoils for the Reynolds number in the range of those experienced by VAWTs. The volume of experimental data thus obtained is huge. The current paper discusses three Regression analysis models developed wherein lift and drag coefficients can be found out using simple formula without having to deal with the bulk of the data. Drag coefficients and Lift coefficients were being successfully estimated by regression models with R2 values as high as 0.98.
Complex Environmental Data Modelling Using Adaptive General Regression Neural Networks
Kanevski, Mikhail
2015-04-01
The research deals with an adaptation and application of Adaptive General Regression Neural Networks (GRNN) to high dimensional environmental data. GRNN [1,2,3] are efficient modelling tools both for spatial and temporal data and are based on nonparametric kernel methods closely related to classical Nadaraya-Watson estimator. Adaptive GRNN, using anisotropic kernels, can be also applied for features selection tasks when working with high dimensional data [1,3]. In the present research Adaptive GRNN are used to study geospatial data predictability and relevant feature selection using both simulated and real data case studies. The original raw data were either three dimensional monthly precipitation data or monthly wind speeds embedded into 13 dimensional space constructed by geographical coordinates and geo-features calculated from digital elevation model. GRNN were applied in two different ways: 1) adaptive GRNN with the resulting list of features ordered according to their relevancy; and 2) adaptive GRNN applied to evaluate all possible models N [in case of wind fields N=(2^13 -1)=8191] and rank them according to the cross-validation error. In both cases training were carried out applying leave-one-out procedure. An important result of the study is that the set of the most relevant features depends on the month (strong seasonal effect) and year. The predictabilities of precipitation and wind field patterns, estimated using the cross-validation and testing errors of raw and shuffled data, were studied in detail. The results of both approaches were qualitatively and quantitatively compared. In conclusion, Adaptive GRNN with their ability to select features and efficient modelling of complex high dimensional data can be widely used in automatic/on-line mapping and as an integrated part of environmental decision support systems. 1. Kanevski M., Pozdnoukhov A., Timonin V. Machine Learning for Spatial Environmental Data. Theory, applications and software. EPFL Press
Conditional Monte Carlo randomization tests for regression models.
Parhat, Parwen; Rosenberger, William F; Diao, Guoqing
2014-08-15
We discuss the computation of randomization tests for clinical trials of two treatments when the primary outcome is based on a regression model. We begin by revisiting the seminal paper of Gail, Tan, and Piantadosi (1988), and then describe a method based on Monte Carlo generation of randomization sequences. The tests based on this Monte Carlo procedure are design based, in that they incorporate the particular randomization procedure used. We discuss permuted block designs, complete randomization, and biased coin designs. We also use a new technique by Plamadeala and Rosenberger (2012) for simple computation of conditional randomization tests. Like Gail, Tan, and Piantadosi, we focus on residuals from generalized linear models and martingale residuals from survival models. Such techniques do not apply to longitudinal data analysis, and we introduce a method for computation of randomization tests based on the predicted rate of change from a generalized linear mixed model when outcomes are longitudinal. We show, by simulation, that these randomization tests preserve the size and power well under model misspecification. Copyright © 2014 John Wiley & Sons, Ltd.
Genomic breeding value estimation using nonparametric additive regression models
Directory of Open Access Journals (Sweden)
Solberg Trygve
2009-01-01
Full Text Available Abstract Genomic selection refers to the use of genomewide dense markers for breeding value estimation and subsequently for selection. The main challenge of genomic breeding value estimation is the estimation of many effects from a limited number of observations. Bayesian methods have been proposed to successfully cope with these challenges. As an alternative class of models, non- and semiparametric models were recently introduced. The present study investigated the ability of nonparametric additive regression models to predict genomic breeding values. The genotypes were modelled for each marker or pair of flanking markers (i.e. the predictors separately. The nonparametric functions for the predictors were estimated simultaneously using additive model theory, applying a binomial kernel. The optimal degree of smoothing was determined by bootstrapping. A mutation-drift-balance simulation was carried out. The breeding values of the last generation (genotyped was predicted using data from the next last generation (genotyped and phenotyped. The results show moderate to high accuracies of the predicted breeding values. A determination of predictor specific degree of smoothing increased the accuracy.
Global Land Use Regression Model for Nitrogen Dioxide Air Pollution.
Larkin, Andrew; Geddes, Jeffrey A; Martin, Randall V; Xiao, Qingyang; Liu, Yang; Marshall, Julian D; Brauer, Michael; Hystad, Perry
2017-06-20
Nitrogen dioxide is a common air pollutant with growing evidence of health impacts independent of other common pollutants such as ozone and particulate matter. However, the worldwide distribution of NO 2 exposure and associated impacts on health is still largely uncertain. To advance global exposure estimates we created a global nitrogen dioxide (NO 2 ) land use regression model for 2011 using annual measurements from 5,220 air monitors in 58 countries. The model captured 54% of global NO 2 variation, with a mean absolute error of 3.7 ppb. Regional performance varied from R 2 = 0.42 (Africa) to 0.67 (South America). Repeated 10% cross-validation using bootstrap sampling (n = 10,000) demonstrated a robust performance with respect to air monitor sampling in North America, Europe, and Asia (adjusted R 2 within 2%) but not for Africa and Oceania (adjusted R 2 within 11%) where NO 2 monitoring data are sparse. The final model included 10 variables that captured both between and within-city spatial gradients in NO 2 concentrations. Variable contributions differed between continental regions, but major roads within 100 m and satellite-derived NO 2 were consistently the strongest predictors. The resulting model can be used for global risk assessments and health studies, particularly in countries without existing NO 2 monitoring data or models.
Drought Patterns Forecasting using an Auto-Regressive Logistic Model
del Jesus, M.; Sheffield, J.; Méndez Incera, F. J.; Losada, I. J.; Espejo, A.
2014-12-01
Drought is characterized by a water deficit that may manifest across a large range of spatial and temporal scales. Drought may create important socio-economic consequences, many times of catastrophic dimensions. A quantifiable definition of drought is elusive because depending on its impacts, consequences and generation mechanism, different water deficit periods may be identified as a drought by virtue of some definitions but not by others. Droughts are linked to the water cycle and, although a climate change signal may not have emerged yet, they are also intimately linked to climate.In this work we develop an auto-regressive logistic model for drought prediction at different temporal scales that makes use of a spatially explicit framework. Our model allows to include covariates, continuous or categorical, to improve the performance of the auto-regressive component.Our approach makes use of dimensionality reduction (principal component analysis) and classification techniques (K-Means and maximum dissimilarity) to simplify the representation of complex climatic patterns, such as sea surface temperature (SST) and sea level pressure (SLP), while including information on their spatial structure, i.e. considering their spatial patterns. This procedure allows us to include in the analysis multivariate representation of complex climatic phenomena, as the El Niño-Southern Oscillation. We also explore the impact of other climate-related variables such as sun spots. The model allows to quantify the uncertainty of the forecasts and can be easily adapted to make predictions under future climatic scenarios. The framework herein presented may be extended to other applications such as flash flood analysis, or risk assessment of natural hazards.
A Gompertz regression model for fern spores germination
Directory of Open Access Journals (Sweden)
Gabriel y Galán, Jose María
2015-06-01
Full Text Available Germination is one of the most important biological processes for both seed and spore plants, also for fungi. At present, mathematical models of germination have been developed in fungi, bryophytes and several plant species. However, ferns are the only group whose germination has never been modelled. In this work we develop a regression model of the germination of fern spores. We have found that for Blechnum serrulatum, Blechnum yungense, Cheilanthes pilosa, Niphidium macbridei and Polypodium feuillei species the Gompertz growth model describe satisfactorily cumulative germination. An important result is that regression parameters are independent of fern species and the model is not affected by intraspecific variation. Our results show that the Gompertz curve represents a general germination model for all the non-green spore leptosporangiate ferns, including in the paper a discussion about the physiological and ecological meaning of the model.La germinación es uno de los procesos biológicos más relevantes tanto para las plantas con esporas, como para las plantas con semillas y los hongos. Hasta el momento, se han desarrollado modelos de germinación para hongos, briofitos y diversas especies de espermatófitos. Los helechos son el único grupo de plantas cuya germinación nunca ha sido modelizada. En este trabajo se desarrolla un modelo de regresión para explicar la germinación de las esporas de helechos. Observamos que para las especies Blechnum serrulatum, Blechnum yungense, Cheilanthes pilosa, Niphidium macbridei y Polypodium feuillei el modelo de crecimiento de Gompertz describe satisfactoriamente la germinación acumulativa. Un importante resultado es que los parámetros de la regresión son independientes de la especie y que el modelo no está afectado por variación intraespecífica. Por lo tanto, los resultados del trabajo muestran que la curva de Gompertz puede representar un modelo general para todos los helechos leptosporangiados
Collision prediction models using multivariate Poisson-lognormal regression.
El-Basyouny, Karim; Sayed, Tarek
2009-07-01
This paper advocates the use of multivariate Poisson-lognormal (MVPLN) regression to develop models for collision count data. The MVPLN approach presents an opportunity to incorporate the correlations across collision severity levels and their influence on safety analyses. The paper introduces a new multivariate hazardous location identification technique, which generalizes the univariate posterior probability of excess that has been commonly proposed and applied in the literature. In addition, the paper presents an alternative approach for quantifying the effect of the multivariate structure on the precision of expected collision frequency. The MVPLN approach is compared with the independent (separate) univariate Poisson-lognormal (PLN) models with respect to model inference, goodness-of-fit, identification of hot spots and precision of expected collision frequency. The MVPLN is modeled using the WinBUGS platform which facilitates computation of posterior distributions as well as providing a goodness-of-fit measure for model comparisons. The results indicate that the estimates of the extra Poisson variation parameters were considerably smaller under MVPLN leading to higher precision. The improvement in precision is due mainly to the fact that MVPLN accounts for the correlation between the latent variables representing property damage only (PDO) and injuries plus fatalities (I+F). This correlation was estimated at 0.758, which is highly significant, suggesting that higher PDO rates are associated with higher I+F rates, as the collision likelihood for both types is likely to rise due to similar deficiencies in roadway design and/or other unobserved factors. In terms of goodness-of-fit, the MVPLN model provided a superior fit than the independent univariate models. The multivariate hazardous location identification results demonstrated that some hazardous locations could be overlooked if the analysis was restricted to the univariate models.
THE REGRESSION MODEL OF IRAN LIBRARIES ORGANIZATIONAL CLIMATE.
Jahani, Mohammad Ali; Yaminfirooz, Mousa; Siamian, Hasan
2015-10-01
The purpose of this study was to drawing a regression model of organizational climate of central libraries of Iran's universities. This study is an applied research. The statistical population of this study consisted of 96 employees of the central libraries of Iran's public universities selected among the 117 universities affiliated to the Ministry of Health by Stratified Sampling method (510 people). Climate Qual localized questionnaire was used as research tools. For predicting the organizational climate pattern of the libraries is used from the multivariate linear regression and track diagram. of the 9 variables affecting organizational climate, 5 variables of innovation, teamwork, customer service, psychological safety and deep diversity play a major role in prediction of the organizational climate of Iran's libraries. The results also indicate that each of these variables with different coefficient have the power to predict organizational climate but the climate score of psychological safety (0.94) plays a very crucial role in predicting the organizational climate. Track diagram showed that five variables of teamwork, customer service, psychological safety, deep diversity and innovation directly effects on the organizational climate variable that contribution of the team work from this influence is more than any other variables. Of the indicator of the organizational climate of climateQual, the contribution of the team work from this influence is more than any other variables that reinforcement of teamwork in academic libraries can be more effective in improving the organizational climate of this type libraries.
Directory of Open Access Journals (Sweden)
Jatin Chatrath
2018-03-01
Full Text Available Reconfigurable and multi-standard RF front-ends for wireless communication and sensor networks have gained importance as building blocks for the Internet of Things. Simpler and highly-efficient transmitter architectures, which can transmit better quality signals with reduced impairments, are an important step in this direction. In this regard, mixer-less transmitter architecture, namely, the three-way amplitude modulator-based transmitter, avoids the use of imperfect mixers and frequency up-converters, and their resulting distortions, leading to an improved signal quality. In this work, an augmented memory polynomial-based model for the behavioral modeling of such mixer-less transmitter architecture is proposed. Extensive simulations and measurements have been carried out in order to validate the accuracy of the proposed modeling strategy. The performance of the proposed model is evaluated using normalized mean square error (NMSE for long-term evolution (LTE signals. NMSE for a LTE signal of 1.4 MHz bandwidth with 100,000 samples for digital combining and analog combining are recorded as −36.41 dB and −36.9 dB, respectively. Similarly, for a 5 MHz signal the proposed models achieves −31.93 dB and −32.08 dB NMSE using digital and analog combining, respectively. For further validation of the proposed model, amplitude-to-amplitude (AM-AM, amplitude-to-phase (AM-PM, and the spectral response of the modeled and measured data are plotted, reasonably meeting the desired modeling criteria.
Meta-Modeling by Symbolic Regression and Pareto Simulated Annealing
Stinstra, E.; Rennen, G.; Teeuwen, G.J.A.
2006-01-01
The subject of this paper is a new approach to Symbolic Regression.Other publications on Symbolic Regression use Genetic Programming.This paper describes an alternative method based on Pareto Simulated Annealing.Our method is based on linear regression for the estimation of constants.Interval
Modeling Information Content Via Dirichlet-Multinomial Regression Analysis.
Ferrari, Alberto
2017-01-01
Shannon entropy is being increasingly used in biomedical research as an index of complexity and information content in sequences of symbols, e.g. languages, amino acid sequences, DNA methylation patterns and animal vocalizations. Yet, distributional properties of information entropy as a random variable have seldom been the object of study, leading to researchers mainly using linear models or simulation-based analytical approach to assess differences in information content, when entropy is measured repeatedly in different experimental conditions. Here a method to perform inference on entropy in such conditions is proposed. Building on results coming from studies in the field of Bayesian entropy estimation, a symmetric Dirichlet-multinomial regression model, able to deal efficiently with the issue of mean entropy estimation, is formulated. Through a simulation study the model is shown to outperform linear modeling in a vast range of scenarios and to have promising statistical properties. As a practical example, the method is applied to a data set coming from a real experiment on animal communication.
Variable selection in Logistic regression model with genetic algorithm.
Zhang, Zhongheng; Trevino, Victor; Hoseini, Sayed Shahabuddin; Belciug, Smaranda; Boopathi, Arumugam Manivanna; Zhang, Ping; Gorunescu, Florin; Subha, Velappan; Dai, Songshi
2018-02-01
Variable or feature selection is one of the most important steps in model specification. Especially in the case of medical-decision making, the direct use of a medical database, without a previous analysis and preprocessing step, is often counterproductive. In this way, the variable selection represents the method of choosing the most relevant attributes from the database in order to build a robust learning models and, thus, to improve the performance of the models used in the decision process. In biomedical research, the purpose of variable selection is to select clinically important and statistically significant variables, while excluding unrelated or noise variables. A variety of methods exist for variable selection, but none of them is without limitations. For example, the stepwise approach, which is highly used, adds the best variable in each cycle generally producing an acceptable set of variables. Nevertheless, it is limited by the fact that it commonly trapped in local optima. The best subset approach can systematically search the entire covariate pattern space, but the solution pool can be extremely large with tens to hundreds of variables, which is the case in nowadays clinical data. Genetic algorithms (GA) are heuristic optimization approaches and can be used for variable selection in multivariable regression models. This tutorial paper aims to provide a step-by-step approach to the use of GA in variable selection. The R code provided in the text can be extended and adapted to other data analysis needs.
Electricity prices forecasting by automatic dynamic harmonic regression models
International Nuclear Information System (INIS)
Pedregal, Diego J.; Trapero, Juan R.
2007-01-01
The changes experienced by electricity markets in recent years have created the necessity for more accurate forecast tools of electricity prices, both for producers and consumers. Many methodologies have been applied to this aim, but in the view of the authors, state space models are not yet fully exploited. The present paper proposes a univariate dynamic harmonic regression model set up in a state space framework for forecasting prices in these markets. The advantages of the approach are threefold. Firstly, a fast automatic identification and estimation procedure is proposed based on the frequency domain. Secondly, the recursive algorithms applied offer adaptive predictions that compare favourably with respect to other techniques. Finally, since the method is based on unobserved components models, explicit information about trend, seasonal and irregular behaviours of the series can be extracted. This information is of great value to the electricity companies' managers in order to improve their strategies, i.e. it provides management innovations. The good forecast performance and the rapid adaptability of the model to changes in the data are illustrated with actual prices taken from the PJM interconnection in the US and for the Spanish market for the year 2002. (author)
Characteristics and Properties of a Simple Linear Regression Model
Directory of Open Access Journals (Sweden)
Kowal Robert
2016-12-01
Full Text Available A simple linear regression model is one of the pillars of classic econometrics. Despite the passage of time, it continues to raise interest both from the theoretical side as well as from the application side. One of the many fundamental questions in the model concerns determining derivative characteristics and studying the properties existing in their scope, referring to the first of these aspects. The literature of the subject provides several classic solutions in that regard. In the paper, a completely new design is proposed, based on the direct application of variance and its properties, resulting from the non-correlation of certain estimators with the mean, within the scope of which some fundamental dependencies of the model characteristics are obtained in a much more compact manner. The apparatus allows for a simple and uniform demonstration of multiple dependencies and fundamental properties in the model, and it does it in an intuitive manner. The results were obtained in a classic, traditional area, where everything, as it might seem, has already been thoroughly studied and discovered.
Bayesian Regression of Thermodynamic Models of Redox Active Materials
Energy Technology Data Exchange (ETDEWEB)
Johnston, Katherine [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-09-01
Finding a suitable functional redox material is a critical challenge to achieving scalable, economically viable technologies for storing concentrated solar energy in the form of a defected oxide. Demonstrating e ectiveness for thermal storage or solar fuel is largely accomplished by using a thermodynamic model derived from experimental data. The purpose of this project is to test the accuracy of our regression model on representative data sets. Determining the accuracy of the model includes parameter tting the model to the data, comparing the model using di erent numbers of param- eters, and analyzing the entropy and enthalpy calculated from the model. Three data sets were considered in this project: two demonstrating materials for solar fuels by wa- ter splitting and the other of a material for thermal storage. Using Bayesian Inference and Markov Chain Monte Carlo (MCMC), parameter estimation was preformed on the three data sets. Good results were achieved, except some there was some deviations on the edges of the data input ranges. The evidence values were then calculated in a variety of ways and used to compare models with di erent number of parameters. It was believed that at least one of the parameters was unnecessary and comparing evidence values demonstrated that the parameter was need on one data set and not signi cantly helpful on another. The entropy was calculated by taking the derivative in one variable and integrating over another. and its uncertainty was also calculated by evaluating the entropy over multiple MCMC samples. Afterwards, all the parts were written up as a tutorial for the Uncertainty Quanti cation Toolkit (UQTk).
Convergence diagnostics for Eigenvalue problems with linear regression model
International Nuclear Information System (INIS)
Shi, Bo; Petrovic, Bojan
2011-01-01
Although the Monte Carlo method has been extensively used for criticality/Eigenvalue problems, a reliable, robust, and efficient convergence diagnostics method is still desired. Most methods are based on integral parameters (multiplication factor, entropy) and either condense the local distribution information into a single value (e.g., entropy) or even disregard it. We propose to employ the detailed cycle-by-cycle local flux evolution obtained by using mesh tally mechanism to assess the source and flux convergence. By applying a linear regression model to each individual mesh in a mesh tally for convergence diagnostics, a global convergence criterion can be obtained. We exemplify this method on two problems and obtain promising diagnostics results. (author)
The R Package threg to Implement Threshold Regression Models
Directory of Open Access Journals (Sweden)
Tao Xiao
2015-08-01
This new package includes four functions: threg, and the methods hr, predict and plot for threg objects returned by threg. The threg function is the model-fitting function which is used to calculate regression coefficient estimates, asymptotic standard errors and p values. The hr method for threg objects is the hazard-ratio calculation function which provides the estimates of hazard ratios at selected time points for specified scenarios (based on given categories or value settings of covariates. The predict method for threg objects is used for prediction. And the plot method for threg objects provides plots for curves of estimated hazard functions, survival functions and probability density functions of the first-hitting-time; function curves corresponding to different scenarios can be overlaid in the same plot for comparison to give additional research insights.
Directory of Open Access Journals (Sweden)
A. Seyeddokht
2012-09-01
Full Text Available In this research a random regression test day model was used to estimate heritability values and calculation genetic correlations between test day milk records. a total of 140357 monthly test day milk records belonging to 28292 first lactation Holstein cattle(trice time a day milking distributed in 165 herd and calved from 2001 to 2010 belonging to the herds of Tehran province were used. The fixed effects of herd-year-month of calving as contemporary group and age at calving and Holstein gene percentage as covariate were fitted. Orthogonal legendre polynomial with a 4th-order was implemented to take account of genetic and environmental aspects of milk production over the course of lactation. RRM using Legendre polynomials as base functions appears to be the most adequate to describe the covariance structure of the data. The results showed that the average of heritability for the second half of lactation period was higher than that of the first half. The heritability value for the first month was lowest (0.117 and for the eighth month of the lactation was highest (0.230 compared to the other months of lactation. Because of genetic variation was increased gradually, and residual variance was high in the first months of lactation, heritabilities were different over the course of lactation. The RRMs with a higher number of parameters were more useful to describe the genetic variation of test-day milk yield throughout the lactation. In this research estimation of genetic parameters, and calculation genetic correlations were implemented by random regression test day model, therefore using this method is the exact way to take account of parameters rather than the other ways.
Ng, Kar Yong; Awang, Norhashidah
2018-01-06
Frequent haze occurrences in Malaysia have made the management of PM 10 (particulate matter with aerodynamic less than 10 μm) pollution a critical task. This requires knowledge on factors associating with PM 10 variation and good forecast of PM 10 concentrations. Hence, this paper demonstrates the prediction of 1-day-ahead daily average PM 10 concentrations based on predictor variables including meteorological parameters and gaseous pollutants. Three different models were built. They were multiple linear regression (MLR) model with lagged predictor variables (MLR1), MLR model with lagged predictor variables and PM 10 concentrations (MLR2) and regression with time series error (RTSE) model. The findings revealed that humidity, temperature, wind speed, wind direction, carbon monoxide and ozone were the main factors explaining the PM 10 variation in Peninsular Malaysia. Comparison among the three models showed that MLR2 model was on a same level with RTSE model in terms of forecasting accuracy, while MLR1 model was the worst.
Ultracentrifuge separative power modeling with multivariate regression using covariance matrix
International Nuclear Information System (INIS)
Migliavacca, Elder
2004-01-01
In this work, the least-squares methodology with covariance matrix is applied to determine a data curve fitting to obtain a performance function for the separative power δU of a ultracentrifuge as a function of variables that are experimentally controlled. The experimental data refer to 460 experiments on the ultracentrifugation process for uranium isotope separation. The experimental uncertainties related with these independent variables are considered in the calculation of the experimental separative power values, determining an experimental data input covariance matrix. The process variables, which significantly influence the δU values are chosen in order to give information on the ultracentrifuge behaviour when submitted to several levels of feed flow rate F, cut θ and product line pressure P p . After the model goodness-of-fit validation, a residual analysis is carried out to verify the assumed basis concerning its randomness and independence and mainly the existence of residual heteroscedasticity with any explained regression model variable. The surface curves are made relating the separative power with the control variables F, θ and P p to compare the fitted model with the experimental data and finally to calculate their optimized values. (author)
Modeling Pan Evaporation for Kuwait by Multiple Linear Regression
Almedeij, Jaber
2012-01-01
Evaporation is an important parameter for many projects related to hydrology and water resources systems. This paper constitutes the first study conducted in Kuwait to obtain empirical relations for the estimation of daily and monthly pan evaporation as functions of available meteorological data of temperature, relative humidity, and wind speed. The data used here for the modeling are daily measurements of substantial continuity coverage, within a period of 17 years between January 1993 and December 2009, which can be considered representative of the desert climate of the urban zone of the country. Multiple linear regression technique is used with a procedure of variable selection for fitting the best model forms. The correlations of evaporation with temperature and relative humidity are also transformed in order to linearize the existing curvilinear patterns of the data by using power and exponential functions, respectively. The evaporation models suggested with the best variable combinations were shown to produce results that are in a reasonable agreement with observation values. PMID:23226984
Technique for image interpolation using polynomial transforms
Escalante Ramírez, B.; Martens, J.B.; Haskell, G.G.; Hang, H.M.
1993-01-01
We present a new technique for image interpolation based on polynomial transforms. This is an image representation model that analyzes an image by locally expanding it into a weighted sum of orthogonal polynomials. In the discrete case, the image segment within every window of analysis is
International Nuclear Information System (INIS)
Antanasijević, Davor; Pocajt, Viktor; Ristić, Mirjana; Perić-Grujić, Aleksandra
2015-01-01
This paper presents a new approach for the estimation of energy-related GHG (greenhouse gas) emissions at the national level that combines the simplicity of the concept of GHG intensity and the generalization capabilities of ANNs (artificial neural networks). The main objectives of this work includes the determination of the accuracy of a GRNN (general regression neural network) model applied for the prediction of EC (energy consumption) and GHG intensity of energy consumption, utilizing general country statistics as inputs, as well as analysis of the accuracy of energy-related GHG emissions obtained by multiplying the two aforementioned outputs. The models were developed using historical data from the period 2004–2012, for a set of 26 European countries (EU Members). The obtained results demonstrate that the GRNN GHG intensity model provides a more accurate prediction, with the MAPE (mean absolute percentage error) of 4.5%, than tested MLR (multiple linear regression) and second-order and third-order non-linear MPR (multiple polynomial regression) models. Also, the GRNN EC model has high accuracy (MAPE = 3.6%), and therefore both GRNN models and the proposed approach can be considered as suitable for the calculation of GHG emissions. The energy-related predicted GHG emissions were very similar to the actual GHG emissions of EU Members (MAPE = 6.4%). - Highlights: • ANN modeling of GHG intensity of energy consumption is presented. • ANN modeling of energy consumption at the national level is presented. • GHG intensity concept was used for the estimation of energy-related GHG emissions. • The ANN models provide better results in comparison with conventional models. • Forecast of GHG emissions for 26 countries was made successfully with MAPE of 6.4%
International Nuclear Information System (INIS)
Che Jinxing; Wang Jianzhou
2010-01-01
In this paper, we present the use of different mathematical models to forecast electricity price under deregulated power. A successful prediction tool of electricity price can help both power producers and consumers plan their bidding strategies. Inspired by that the support vector regression (SVR) model, with the ε-insensitive loss function, admits of the residual within the boundary values of ε-tube, we propose a hybrid model that combines both SVR and Auto-regressive integrated moving average (ARIMA) models to take advantage of the unique strength of SVR and ARIMA models in nonlinear and linear modeling, which is called SVRARIMA. A nonlinear analysis of the time-series indicates the convenience of nonlinear modeling, the SVR is applied to capture the nonlinear patterns. ARIMA models have been successfully applied in solving the residuals regression estimation problems. The experimental results demonstrate that the model proposed outperforms the existing neural-network approaches, the traditional ARIMA models and other hybrid models based on the root mean square error and mean absolute percentage error.
International Nuclear Information System (INIS)
Lashkevich, Michael; Pugai, Yaroslav
2013-01-01
We continue the study of form factors of descendant operators in the sinh- and sine-Gordon models in the framework of the algebraic construction proposed in [1]. We find the algebraic construction to be related to a particular limit of the tensor product of the deformed Virasoro algebra and a suitably chosen Heisenberg algebra. To analyze the space of local operators in the framework of the form factor formalism we introduce screening operators and construct singular and cosingular vectors in the Fock spaces related to the free field realization of the obtained algebra. We show that the singular vectors are expressed in terms of the degenerate Macdonald polynomials with rectangular partitions. We study the matrix elements that contain a singular vector in one chirality and a cosingular vector in the other chirality and find them to lead to the resonance identities already known in the conformal perturbation theory. Besides, we give a new derivation of the equation of motion in the sinh-Gordon theory, and a new representation for conserved currents
An Ordered Regression Model to Predict Transit Passengers’ Behavioural Intentions
Energy Technology Data Exchange (ETDEWEB)
Oña, J. de; Oña, R. de; Eboli, L.; Forciniti, C.; Mazzulla, G.
2016-07-01
Passengers’ behavioural intentions after experiencing transit services can be viewed as signals that show if a customer continues to utilise a company’s service. Users’ behavioural intentions can depend on a series of aspects that are difficult to measure directly. More recently, transit passengers’ behavioural intentions have been just considered together with the concepts of service quality and customer satisfaction. Due to the characteristics of the ways for evaluating passengers’ behavioural intentions, service quality and customer satisfaction, we retain that this kind of issue could be analysed also by applying ordered regression models. This work aims to propose just an ordered probit model for analysing service quality factors that can influence passengers’ behavioural intentions towards the use of transit services. The case study is the LRT of Seville (Spain), where a survey was conducted in order to collect the opinions of the passengers about the existing transit service, and to have a measure of the aspects that can influence the intentions of the users to continue using the transit service in the future. (Author)
Heterogeneous Breast Phantom Development for Microwave Imaging Using Regression Models
Directory of Open Access Journals (Sweden)
Camerin Hahn
2012-01-01
Full Text Available As new algorithms for microwave imaging emerge, it is important to have standard accurate benchmarking tests. Currently, most researchers use homogeneous phantoms for testing new algorithms. These simple structures lack the heterogeneity of the dielectric properties of human tissue and are inadequate for testing these algorithms for medical imaging. To adequately test breast microwave imaging algorithms, the phantom has to resemble different breast tissues physically and in terms of dielectric properties. We propose a systematic approach in designing phantoms that not only have dielectric properties close to breast tissues but also can be easily shaped to realistic physical models. The approach is based on regression model to match phantom's dielectric properties with the breast tissue dielectric properties found in Lazebnik et al. (2007. However, the methodology proposed here can be used to create phantoms for any tissue type as long as ex vivo, in vitro, or in vivo tissue dielectric properties are measured and available. Therefore, using this method, accurate benchmarking phantoms for testing emerging microwave imaging algorithms can be developed.
International Nuclear Information System (INIS)
Deman, G.; Konakli, K.; Sudret, B.; Kerrou, J.; Perrochet, P.; Benabderrahmane, H.
2016-01-01
The study makes use of polynomial chaos expansions to compute Sobol' indices within the frame of a global sensitivity analysis of hydro-dispersive parameters in a simplified vertical cross-section of a segment of the subsurface of the Paris Basin. Applying conservative ranges, the uncertainty in 78 input variables is propagated upon the mean lifetime expectancy of water molecules departing from a specific location within a highly confining layer situated in the middle of the model domain. Lifetime expectancy is a hydrogeological performance measure pertinent to safety analysis with respect to subsurface contaminants, such as radionuclides. The sensitivity analysis indicates that the variability in the mean lifetime expectancy can be sufficiently explained by the uncertainty in the petrofacies, i.e. the sets of porosity and hydraulic conductivity, of only a few layers of the model. The obtained results provide guidance regarding the uncertainty modeling in future investigations employing detailed numerical models of the subsurface of the Paris Basin. Moreover, the study demonstrates the high efficiency of sparse polynomial chaos expansions in computing Sobol' indices for high-dimensional models. - Highlights: • Global sensitivity analysis of a 2D 15-layer groundwater flow model is conducted. • A high-dimensional random input comprising 78 parameters is considered. • The variability in the mean lifetime expectancy for the central layer is examined. • Sparse polynomial chaos expansions are used to compute Sobol' sensitivity indices. • The petrofacies of a few layers can sufficiently explain the response variance.
Polynomial factor models : non-iterative estimation via method-of-moments
Schuberth, Florian; Büchner, Rebecca; Schermelleh-Engel, Karin; Dijkstra, Theo K.
2017-01-01
We introduce a non-iterative method-of-moments estimator for non-linear latent variable (LV) models. Under the assumption of joint normality of all exogenous variables, we use the corrected moments of linear combinations of the observed indicators (proxies) to obtain consistent path coefficient and
application of multilinear regression analysis in modeling of soil
African Journals Online (AJOL)
Windows User
Accordingly [1, 3] in their work, they applied linear regression ... (MLRA) is a statistical technique that uses several explanatory ... order to check this, they adopted bivariate correlation analysis .... groups, namely A-1 through A-7, based on their relative expected ..... Multivariate Regression in Gorgan Province North of Iran” ...
A History of Regression and Related Model-Fitting in the Earth Sciences (1636?-2000)
International Nuclear Information System (INIS)
Howarth, Richard J.
2001-01-01
The (statistical) modeling of the behavior of a dependent variate as a function of one or more predictors provides examples of model-fitting which span the development of the earth sciences from the 17th Century to the present. The historical development of these methods and their subsequent application is reviewed. Bond's predictions (c. 1636 and 1668) of change in the magnetic declination at London may be the earliest attempt to fit such models to geophysical data. Following publication of Newton's theory of gravitation in 1726, analysis of data on the length of a 1 o meridian arc, and the length of a pendulum beating seconds, as a function of sin 2 (latitude), was used to determine the ellipticity of the oblate spheroid defining the Figure of the Earth. The pioneering computational methods of Mayer in 1750, Boscovich in 1755, and Lambert in 1765, and the subsequent independent discoveries of the principle of least squares by Gauss in 1799, Legendre in 1805, and Adrain in 1808, and its later substantiation on the basis of probability theory by Gauss in 1809 were all applied to the analysis of such geodetic and geophysical data. Notable later applications include: the geomagnetic survey of Ireland by Lloyd, Sabine, and Ross in 1836, Gauss's model of the terrestrial magnetic field in 1838, and Airy's 1845 analysis of the residuals from a fit to pendulum lengths, from which he recognized the anomalous character of measurements of gravitational force which had been made on islands. In the early 20th Century applications to geological topics proliferated, but the computational burden effectively held back applications of multivariate analysis. Following World War II, the arrival of digital computers in universities in the 1950s facilitated computation, and fitting linear or polynomial models as a function of geographic coordinates, trend surface analysis, became popular during the 1950-60s. The inception of geostatistics in France at this time by Matheron had its
Polynomial representations of GLn
Green, James A; Erdmann, Karin
2007-01-01
The first half of this book contains the text of the first edition of LNM volume 830, Polynomial Representations of GLn. This classic account of matrix representations, the Schur algebra, the modular representations of GLn, and connections with symmetric groups, has been the basis of much research in representation theory. The second half is an Appendix, and can be read independently of the first. It is an account of the Littelmann path model for the case gln. In this case, Littelmann's 'paths' become 'words', and so the Appendix works with the combinatorics on words. This leads to the repesentation theory of the 'Littelmann algebra', which is a close analogue of the Schur algebra. The treatment is self- contained; in particular complete proofs are given of classical theorems of Schensted and Knuth.
Polynomial representations of GLN
Green, James A
1980-01-01
The first half of this book contains the text of the first edition of LNM volume 830, Polynomial Representations of GLn. This classic account of matrix representations, the Schur algebra, the modular representations of GLn, and connections with symmetric groups, has been the basis of much research in representation theory. The second half is an Appendix, and can be read independently of the first. It is an account of the Littelmann path model for the case gln. In this case, Littelmann's 'paths' become 'words', and so the Appendix works with the combinatorics on words. This leads to the repesentation theory of the 'Littelmann algebra', which is a close analogue of the Schur algebra. The treatment is self- contained; in particular complete proofs are given of classical theorems of Schensted and Knuth.
Wheat flour dough Alveograph characteristics predicted by Mixolab regression models.
Codină, Georgiana Gabriela; Mironeasa, Silvia; Mironeasa, Costel; Popa, Ciprian N; Tamba-Berehoiu, Radiana
2012-02-01
In Romania, the Alveograph is the most used device to evaluate the rheological properties of wheat flour dough, but lately the Mixolab device has begun to play an important role in the breadmaking industry. These two instruments are based on different principles but there are some correlations that can be found between the parameters determined by the Mixolab and the rheological properties of wheat dough measured with the Alveograph. Statistical analysis on 80 wheat flour samples using the backward stepwise multiple regression method showed that Mixolab values using the ‘Chopin S’ protocol (40 samples) and ‘Chopin + ’ protocol (40 samples) can be used to elaborate predictive models for estimating the value of the rheological properties of wheat dough: baking strength (W), dough tenacity (P) and extensibility (L). The correlation analysis confirmed significant findings (P 0.70 for P, R²(adjusted) > 0.70 for W and R²(adjusted) > 0.38 for L, at a 95% confidence interval. Copyright © 2011 Society of Chemical Industry.
Application of regression model on stream water quality parameters
International Nuclear Information System (INIS)
Suleman, M.; Maqbool, F.; Malik, A.H.; Bhatti, Z.A.
2012-01-01
Statistical analysis was conducted to evaluate the effect of solid waste leachate from the open solid waste dumping site of Salhad on the stream water quality. Five sites were selected along the stream. Two sites were selected prior to mixing of leachate with the surface water. One was of leachate and other two sites were affected with leachate. Samples were analyzed for pH, water temperature, electrical conductivity (EC), total dissolved solids (TDS), Biological oxygen demand (BOD), chemical oxygen demand (COD), dissolved oxygen (DO) and total bacterial load (TBL). In this study correlation coefficient r among different water quality parameters of various sites were calculated by using Pearson model and then average of each correlation between two parameters were also calculated, which shows TDS and EC and pH and BOD have significantly increasing r value, while temperature and TDS, temp and EC, DO and BL, DO and COD have decreasing r value. Single factor ANOVA at 5% level of significance was used which shows EC, TDS, TCL and COD were significantly differ among various sites. By the application of these two statistical approaches TDS and EC shows strongly positive correlation because the ions from the dissolved solids in water influence the ability of that water to conduct an electrical current. These two parameters significantly vary among 5 sites which are further confirmed by using linear regression. (author)
Weierstrass polynomials for links
DEFF Research Database (Denmark)
Hansen, Vagn Lundsgaard
1997-01-01
There is a natural way of identifying links in3-space with polynomial covering spaces over thecircle. Thereby any link in 3-space can be definedby a Weierstrass polynomial over the circle. Theequivalence relation for covering spaces over thecircle is, however, completely different from...
International Nuclear Information System (INIS)
Warnaar, S.O.
1996-01-01
We compute the one-dimensional configuration sums of the AFB model using the fermionic techniques introduced in part I of this paper. Combined with the results of Andrews, Baxter, and Forrester, we prove polynominal identities for finitizations of the Virasoro characters χb, a (r-1, r) (q) as conjectured by Melzer. In the thermodynamic limit these identities reproduce Rogers-Ramanujan-type identities for the unitary minimal Virasoro characters conjectured by the Stony Brook group. We also present a list of additional Virasoro character identities which follow from our proof of Melzer's identities and application of Bailey's lemma
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
Nonnegativity of uncertain polynomials
Directory of Open Access Journals (Sweden)
iljak Dragoslav D.
1998-01-01
Full Text Available The purpose of this paper is to derive tests for robust nonnegativity of scalar and matrix polynomials, which are algebraic, recursive, and can be completed in finite number of steps. Polytopic families of polynomials are considered with various characterizations of parameter uncertainty including affine, multilinear, and polynomic structures. The zero exclusion condition for polynomial positivity is also proposed for general parameter dependencies. By reformulating the robust stability problem of complex polynomials as positivity of real polynomials, we obtain new sufficient conditions for robust stability involving multilinear structures, which can be tested using only real arithmetic. The obtained results are applied to robust matrix factorization, strict positive realness, and absolute stability of multivariable systems involving parameter dependent transfer function matrices.
MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION
Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...
Methods of Detecting Outliers in A Regression Analysis Model ...
African Journals Online (AJOL)
PROF. O. E. OSUAGWU
2013-06-01
Jun 1, 2013 ... especially true in observational studies .... Simple linear regression and multiple ... The simple linear ..... Grubbs,F.E (1950): Sample Criteria for Testing Outlying observations: Annals of ... In experimental design, the Relative.
231 Using Multiple Regression Analysis in Modelling the Role of ...
African Journals Online (AJOL)
User
of Internal Revenue, Tourism Bureau and hotel records. The multiple regression .... additional guest facilities such as restaurant, a swimming pool or child care and social function ... and provide good quality service to the public. Conclusion.
Matrix product formula for Macdonald polynomials
Cantini, Luigi; de Gier, Jan; Wheeler, Michael
2015-09-01
We derive a matrix product formula for symmetric Macdonald polynomials. Our results are obtained by constructing polynomial solutions of deformed Knizhnik-Zamolodchikov equations, which arise by considering representations of the Zamolodchikov-Faddeev and Yang-Baxter algebras in terms of t-deformed bosonic operators. These solutions are generalized probabilities for particle configurations of the multi-species asymmetric exclusion process, and form a basis of the ring of polynomials in n variables whose elements are indexed by compositions. For weakly increasing compositions (anti-dominant weights), these basis elements coincide with non-symmetric Macdonald polynomials. Our formulas imply a natural combinatorial interpretation in terms of solvable lattice models. They also imply that normalizations of stationary states of multi-species exclusion processes are obtained as Macdonald polynomials at q = 1.
Matrix product formula for Macdonald polynomials
International Nuclear Information System (INIS)
Cantini, Luigi; Gier, Jan de; Michael Wheeler
2015-01-01
We derive a matrix product formula for symmetric Macdonald polynomials. Our results are obtained by constructing polynomial solutions of deformed Knizhnik–Zamolodchikov equations, which arise by considering representations of the Zamolodchikov–Faddeev and Yang–Baxter algebras in terms of t-deformed bosonic operators. These solutions are generalized probabilities for particle configurations of the multi-species asymmetric exclusion process, and form a basis of the ring of polynomials in n variables whose elements are indexed by compositions. For weakly increasing compositions (anti-dominant weights), these basis elements coincide with non-symmetric Macdonald polynomials. Our formulas imply a natural combinatorial interpretation in terms of solvable lattice models. They also imply that normalizations of stationary states of multi-species exclusion processes are obtained as Macdonald polynomials at q = 1. (paper)
Song, Chao; Kwan, Mei-Po; Zhu, Jiping
2017-04-08
An increasing number of fires are occurring with the rapid development of cities, resulting in increased risk for human beings and the environment. This study compares geographically weighted regression-based models, including geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), which integrates spatial and temporal effects and global linear regression models (LM) for modeling fire risk at the city scale. The results show that the road density and the spatial distribution of enterprises have the strongest influences on fire risk, which implies that we should focus on areas where roads and enterprises are densely clustered. In addition, locations with a large number of enterprises have fewer fire ignition records, probably because of strict management and prevention measures. A changing number of significant variables across space indicate that heterogeneity mainly exists in the northern and eastern rural and suburban areas of Hefei city, where human-related facilities or road construction are only clustered in the city sub-centers. GTWR can capture small changes in the spatiotemporal heterogeneity of the variables while GWR and LM cannot. An approach that integrates space and time enables us to better understand the dynamic changes in fire risk. Thus governments can use the results to manage fire safety at the city scale.
DEFF Research Database (Denmark)
Azarang, Leyla; Scheike, Thomas; de Uña-Álvarez, Jacobo
2017-01-01
In this work, we present direct regression analysis for the transition probabilities in the possibly non-Markov progressive illness–death model. The method is based on binomial regression, where the response is the indicator of the occupancy for the given state along time. Randomly weighted score...
A logistic regression model for Ghana National Health Insurance claims
Directory of Open Access Journals (Sweden)
Samuel Antwi
2013-07-01
Full Text Available In August 2003, the Ghanaian Government made history by implementing the first National Health Insurance System (NHIS in Sub-Saharan Africa. Within three years, over half of the country’s population had voluntarily enrolled into the National Health Insurance Scheme. This study had three objectives: 1 To estimate the risk factors that influences the Ghana national health insurance claims. 2 To estimate the magnitude of each of the risk factors in relation to the Ghana national health insurance claims. In this work, data was collected from the policyholders of the Ghana National Health Insurance Scheme with the help of the National Health Insurance database and the patients’ attendance register of the Koforidua Regional Hospital, from 1st January to 31st December 2011. Quantitative analysis was done using the generalized linear regression (GLR models. The results indicate that risk factors such as sex, age, marital status, distance and length of stay at the hospital were important predictors of health insurance claims. However, it was found that the risk factors; health status, billed charges and income level are not good predictors of national health insurance claim. The outcome of the study shows that sex, age, marital status, distance and length of stay at the hospital are statistically significant in the determination of the Ghana National health insurance premiums since they considerably influence claims. We recommended, among other things that, the National Health Insurance Authority should facilitate the institutionalization of the collection of appropriate data on a continuous basis to help in the determination of future premiums.
Polynomial selection in number field sieve for integer factorization
Directory of Open Access Journals (Sweden)
Gireesh Pandey
2016-09-01
Full Text Available The general number field sieve (GNFS is the fastest algorithm for factoring large composite integers which is made up by two prime numbers. Polynomial selection is an important step of GNFS. The asymptotic runtime depends on choice of good polynomial pairs. In this paper, we present polynomial selection algorithm that will be modelled with size and root properties. The correlations between polynomial coefficient and number of relations have been explored with experimental findings.
Contributions to fuzzy polynomial techniques for stability analysis and control
Pitarch Pérez, José Luis
2014-01-01
The present thesis employs fuzzy-polynomial control techniques in order to improve the stability analysis and control of nonlinear systems. Initially, it reviews the more extended techniques in the field of Takagi-Sugeno fuzzy systems, such as the more relevant results about polynomial and fuzzy polynomial systems. The basic framework uses fuzzy polynomial models by Taylor series and sum-of-squares techniques (semidefinite programming) in order to obtain stability guarantees...
A generalized additive regression model for survival times
DEFF Research Database (Denmark)
Scheike, Thomas H.
2001-01-01
Additive Aalen model; counting process; disability model; illness-death model; generalized additive models; multiple time-scales; non-parametric estimation; survival data; varying-coefficient models......Additive Aalen model; counting process; disability model; illness-death model; generalized additive models; multiple time-scales; non-parametric estimation; survival data; varying-coefficient models...
Faraway, Julian J
2005-01-01
Linear models are central to the practice of statistics and form the foundation of a vast range of statistical methodologies. Julian J. Faraway''s critically acclaimed Linear Models with R examined regression and analysis of variance, demonstrated the different methods available, and showed in which situations each one applies. Following in those footsteps, Extending the Linear Model with R surveys the techniques that grow from the regression model, presenting three extensions to that framework: generalized linear models (GLMs), mixed effect models, and nonparametric regression models. The author''s treatment is thoroughly modern and covers topics that include GLM diagnostics, generalized linear mixed models, trees, and even the use of neural networks in statistics. To demonstrate the interplay of theory and practice, throughout the book the author weaves the use of the R software environment to analyze the data of real examples, providing all of the R commands necessary to reproduce the analyses. All of the ...
Polynomial Heisenberg algebras
International Nuclear Information System (INIS)
Carballo, Juan M; C, David J Fernandez; Negro, Javier; Nieto, Luis M
2004-01-01
Polynomial deformations of the Heisenberg algebra are studied in detail. Some of their natural realizations are given by the higher order susy partners (and not only by those of first order, as is already known) of the harmonic oscillator for even-order polynomials. Here, it is shown that the susy partners of the radial oscillator play a similar role when the order of the polynomial is odd. Moreover, it will be proved that the general systems ruled by such kinds of algebras, in the quadratic and cubic cases, involve Painleve transcendents of types IV and V, respectively
Generalizations of orthogonal polynomials
Bultheel, A.; Cuyt, A.; van Assche, W.; van Barel, M.; Verdonk, B.
2005-07-01
We give a survey of recent generalizations of orthogonal polynomials. That includes multidimensional (matrix and vector orthogonal polynomials) and multivariate versions, multipole (orthogonal rational functions) variants, and extensions of the orthogonality conditions (multiple orthogonality). Most of these generalizations are inspired by the applications in which they are applied. We also give a glimpse of these applications, which are usually generalizations of applications where classical orthogonal polynomials also play a fundamental role: moment problems, numerical quadrature, rational approximation, linear algebra, recurrence relations, and random matrices.
A Bayesian Nonparametric Causal Model for Regression Discontinuity Designs
Karabatsos, George; Walker, Stephen G.
2013-01-01
The regression discontinuity (RD) design (Thistlewaite & Campbell, 1960; Cook, 2008) provides a framework to identify and estimate causal effects from a non-randomized design. Each subject of a RD design is assigned to the treatment (versus assignment to a non-treatment) whenever her/his observed value of the assignment variable equals or…
Parametric vs. Nonparametric Regression Modelling within Clinical Decision Support
Czech Academy of Sciences Publication Activity Database
Kalina, Jan; Zvárová, Jana
2017-01-01
Roč. 5, č. 1 (2017), s. 21-27 ISSN 1805-8698 R&D Projects: GA ČR GA17-01251S Institutional support: RVO:67985807 Keywords : decision support systems * decision rules * statistical analysis * nonparametric regression Subject RIV: IN - Informatics, Computer Science OBOR OECD: Statistics and probability
Directory of Open Access Journals (Sweden)
Nataša Šarlija
2017-01-01
Full Text Available This study sheds light on the most common issues related to applying logistic regression in prediction models for company growth. The purpose of the paper is 1 to provide a detailed demonstration of the steps in developing a growth prediction model based on logistic regression analysis, 2 to discuss common pitfalls and methodological errors in developing a model, and 3 to provide solutions and possible ways of overcoming these issues. Special attention is devoted to the question of satisfying logistic regression assumptions, selecting and defining dependent and independent variables, using classification tables and ROC curves, for reporting model strength, interpreting odds ratios as effect measures and evaluating performance of the prediction model. Development of a logistic regression model in this paper focuses on a prediction model of company growth. The analysis is based on predominantly financial data from a sample of 1471 small and medium-sized Croatian companies active between 2009 and 2014. The financial data is presented in the form of financial ratios divided into nine main groups depicting following areas of business: liquidity, leverage, activity, profitability, research and development, investing and export. The growth prediction model indicates aspects of a business critical for achieving high growth. In that respect, the contribution of this paper is twofold. First, methodological, in terms of pointing out pitfalls and potential solutions in logistic regression modelling, and secondly, theoretical, in terms of identifying factors responsible for high growth of small and medium-sized companies.
Directory of Open Access Journals (Sweden)
Luise A Seeker
Full Text Available Telomeres cap the ends of linear chromosomes and shorten with age in many organisms. In humans short telomeres have been linked to morbidity and mortality. With the accumulation of longitudinal datasets the focus shifts from investigating telomere length (TL to exploring TL change within individuals over time. Some studies indicate that the speed of telomere attrition is predictive of future disease. The objectives of the present study were to 1 characterize the change in bovine relative leukocyte TL (RLTL across the lifetime in Holstein Friesian dairy cattle, 2 estimate genetic parameters of RLTL over time and 3 investigate the association of differences in individual RLTL profiles with productive lifespan. RLTL measurements were analysed using Legendre polynomials in a random regression model to describe TL profiles and genetic variance over age. The analyses were based on 1,328 repeated RLTL measurements of 308 female Holstein Friesian dairy cattle. A quadratic Legendre polynomial was fitted to the fixed effect of age in months and to the random effect of the animal identity. Changes in RLTL, heritability and within-trait genetic correlation along the age trajectory were calculated and illustrated. At a population level, the relationship between RLTL and age was described by a positive quadratic function. Individuals varied significantly regarding the direction and amount of RLTL change over life. The heritability of RLTL ranged from 0.36 to 0.47 (SE = 0.05-0.08 and remained statistically unchanged over time. The genetic correlation of RLTL at birth with measurements later in life decreased with the time interval between samplings from near unity to 0.69, indicating that TL later in life might be regulated by different genes than TL early in life. Even though animals differed in their RLTL profiles significantly, those differences were not correlated with productive lifespan (p = 0.954.
Superiority of legendre polynomials to Chebyshev polynomial in ...
African Journals Online (AJOL)
In this paper, we proved the superiority of Legendre polynomial to Chebyshev polynomial in solving first order ordinary differential equation with rational coefficient. We generated shifted polynomial of Chebyshev, Legendre and Canonical polynomials which deal with solving differential equation by first choosing Chebyshev ...
Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William
2016-01-01
Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19
Extended biorthogonal matrix polynomials
Directory of Open Access Journals (Sweden)
Ayman Shehata
2017-01-01
Full Text Available The pair of biorthogonal matrix polynomials for commutative matrices were first introduced by Varma and Tasdelen in [22]. The main aim of this paper is to extend the properties of the pair of biorthogonal matrix polynomials of Varma and Tasdelen and certain generating matrix functions, finite series, some matrix recurrence relations, several important properties of matrix differential recurrence relations, biorthogonality relations and matrix differential equation for the pair of biorthogonal matrix polynomials J(A,B n (x, k and K(A,B n (x, k are discussed. For the matrix polynomials J(A,B n (x, k, various families of bilinear and bilateral generating matrix functions are constructed in the sequel.
Directory of Open Access Journals (Sweden)
Soyoung Park
2017-07-01
Full Text Available This study mapped and analyzed groundwater potential using two different models, logistic regression (LR and multivariate adaptive regression splines (MARS, and compared the results. A spatial database was constructed for groundwater well data and groundwater influence factors. Groundwater well data with a high potential yield of ≥70 m3/d were extracted, and 859 locations (70% were used for model training, whereas the other 365 locations (30% were used for model validation. We analyzed 16 groundwater influence factors including altitude, slope degree, slope aspect, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport index, distance from drainage, drainage density, lithology, distance from fault, fault density, distance from lineament, lineament density, and land cover. Groundwater potential maps (GPMs were constructed using LR and MARS models and tested using a receiver operating characteristics curve. Based on this analysis, the area under the curve (AUC for the success rate curve of GPMs created using the MARS and LR models was 0.867 and 0.838, and the AUC for the prediction rate curve was 0.836 and 0.801, respectively. This implies that the MARS model is useful and effective for groundwater potential analysis in the study area.
Golden, Ryan; Cho, Ilwoo
2015-01-01
In this paper, we study structure theorems of algebras of symmetric functions. Based on a certain relation on elementary symmetric polynomials generating such algebras, we consider perturbation in the algebras. In particular, we understand generators of the algebras as perturbations. From such perturbations, define injective maps on generators, which induce algebra-monomorphisms (or embeddings) on the algebras. They provide inductive structure theorems on algebras of symmetric polynomials. As...
Semiparametric Mixtures of Regressions with Single-index for Model Based Clustering
Xiang, Sijia; Yao, Weixin
2017-01-01
In this article, we propose two classes of semiparametric mixture regression models with single-index for model based clustering. Unlike many semiparametric/nonparametric mixture regression models that can only be applied to low dimensional predictors, the new semiparametric models can easily incorporate high dimensional predictors into the nonparametric components. The proposed models are very general, and many of the recently proposed semiparametric/nonparametric mixture regression models a...
Semiparametric nonlinear quantile regression model for financial returns
Czech Academy of Sciences Publication Activity Database
Avdulaj, Krenar; Baruník, Jozef
2017-01-01
Roč. 21, č. 1 (2017), s. 81-97 ISSN 1081-1826 R&D Projects: GA ČR(CZ) GBP402/12/G097 Institutional support: RVO:67985556 Keywords : copula quantile regression * realized volatility * value-at-risk Subject RIV: AH - Economic s OBOR OECD: Applied Economic s, Econometrics Impact factor: 0.649, year: 2016 http://library.utia.cas.cz/separaty/2017/E/avdulaj-0472346.pdf
Bonellie, Sandra R
2012-10-01
To illustrate the use of regression and logistic regression models to investigate changes over time in size of babies particularly in relation to social deprivation, age of the mother and smoking. Mean birthweight has been found to be increasing in many countries in recent years, but there are still a group of babies who are born with low birthweights. Population-based retrospective cohort study. Multiple linear regression and logistic regression models are used to analyse data on term 'singleton births' from Scottish hospitals between 1994-2003. Mothers who smoke are shown to give birth to lighter babies on average, a difference of approximately 0.57 Standard deviations lower (95% confidence interval. 0.55-0.58) when adjusted for sex and parity. These mothers are also more likely to have babies that are low birthweight (odds ratio 3.46, 95% confidence interval 3.30-3.63) compared with non-smokers. Low birthweight is 30% more likely where the mother lives in the most deprived areas compared with the least deprived, (odds ratio 1.30, 95% confidence interval 1.21-1.40). Smoking during pregnancy is shown to have a detrimental effect on the size of infants at birth. This effect explains some, though not all, of the observed socioeconomic birthweight. It also explains much of the observed birthweight differences by the age of the mother. Identifying mothers at greater risk of having a low birthweight baby as important implications for the care and advice this group receives. © 2012 Blackwell Publishing Ltd.
Chromatic polynomials for simplicial complexes
DEFF Research Database (Denmark)
Møller, Jesper Michael; Nord, Gesche
2016-01-01
In this note we consider s s -chromatic polynomials for finite simplicial complexes. When s=1 s=1 , the 1 1 -chromatic polynomial is just the usual graph chromatic polynomial of the 1 1 -skeleton. In general, the s s -chromatic polynomial depends on the s s -skeleton and its value at r...
Directory of Open Access Journals (Sweden)
Mojtaba Rasouli Gandomani
2016-06-01
Full Text Available In this paper, the model of HIV infection of CD4+ T cells is considered as a system of fractional differential equations. Then, a numerical method by using collocation method based on the Müntz-Legendre polynomials to approximate solution of the model is presented. The application of the proposed numerical method causes fractional differential equations system to convert into the algebraic equations system. The new system can be solved by one of the existing methods. Finally, we compare the result of this numerical method with the result of the methods have already been presented in the literature.
Beta Regression Finite Mixture Models of Polarization and Priming
Smithson, Michael; Merkle, Edgar C.; Verkuilen, Jay
2011-01-01
This paper describes the application of finite-mixture general linear models based on the beta distribution to modeling response styles, polarization, anchoring, and priming effects in probability judgments. These models, in turn, enhance our capacity for explicitly testing models and theories regarding the aforementioned phenomena. The mixture…
A generalized exponential time series regression model for electricity prices
DEFF Research Database (Denmark)
Haldrup, Niels; Knapik, Oskar; Proietti, Tomasso
on the estimated model, the best linear predictor is constructed. Our modeling approach provides good fit within sample and outperforms competing benchmark predictors in terms of forecasting accuracy. We also find that building separate models for each hour of the day and averaging the forecasts is a better...
Forecast Model of Urban Stagnant Water Based on Logistic Regression
Directory of Open Access Journals (Sweden)
Liu Pan
2017-01-01
Full Text Available With the development of information technology, the construction of water resource system has been gradually carried out. In the background of big data, the work of water information needs to carry out the process of quantitative to qualitative change. Analyzing the correlation of data and exploring the deep value of data which are the key of water information’s research. On the basis of the research on the water big data and the traditional data warehouse architecture, we try to find out the connection of different data source. According to the temporal and spatial correlation of stagnant water and rainfall, we use spatial interpolation to integrate data of stagnant water and rainfall which are from different data source and different sensors, then use logistic regression to find out the relationship between them.
Parental Vaccine Acceptance: A Logistic Regression Model Using Previsit Decisions.
Lee, Sara; Riley-Behringer, Maureen; Rose, Jeanmarie C; Meropol, Sharon B; Lazebnik, Rina
2017-07-01
This study explores how parents' intentions regarding vaccination prior to their children's visit were associated with actual vaccine acceptance. A convenience sample of parents accompanying 6-week-old to 17-year-old children completed a written survey at 2 pediatric practices. Using hierarchical logistic regression, for hospital-based participants (n = 216), vaccine refusal history ( P < .01) and vaccine decision made before the visit ( P < .05) explained 87% of vaccine refusals. In community-based participants (n = 100), vaccine refusal history ( P < .01) explained 81% of refusals. Over 1 in 5 parents changed their minds about vaccination during the visit. Thirty parents who were previous vaccine refusers accepted current vaccines, and 37 who had intended not to vaccinate choose vaccination. Twenty-nine parents without a refusal history declined vaccines, and 32 who did not intend to refuse before the visit declined vaccination. Future research should identify key factors to nudge parent decision making in favor of vaccination.
Directory of Open Access Journals (Sweden)
Radosavljević Dragana B.
2017-01-01
Full Text Available This paper presents kinetics modeling of essential oil hydrodistillation from juniper berries (Juniperus communis L. by using a non-linear regression methodology. The proposed model has the polynomial-logarithmic form. The initial equation of the proposed non-linear model is q = q∞•(a•(logt2 + b•logt + c and by substituting a1=q∞•a, b1 = q∞•b and c1 = q∞•c, the final equation is obtained as q = a1•(logt2 + b1•logt + c1. In this equation q is the quantity of the obtained oil at time t, while a1, b1 and c1 are parameters to be determined for each sample. From the final equation it can be seen that the key parameter q∞, which presents the maximal oil quantity obtained after infinite time, is already included in parameters a1, b1 and c1. In this way, experimental determination of this parameter is avoided. Using the proposed model with parameters obtained by regression, the values of oil hydrodistillation in time are calculated for each sample and compared to the experimental values. In addition, two kinetic models previously proposed in literature were applied to the same experimental results. The developed model provided better agreements with the experimental values than the two, generally accepted kinetic models of this process. The average values of error measures (RSS, RSE, AIC and MRPD obtained for our model (0.005; 0.017; –84.33; 1.65 were generally lower than the corresponding values of the other two models (0.025; 0.041; –53.20; 3.89 and (0.0035; 0.015; –86.83; 1.59. Also, parameter estimation for the proposed model was significantly simpler (maximum 2 iterations per sample using the non-linear regression than that for the existing models (maximum 9 iterations per sample. [Project of the Serbian Ministry of Education, Science and Technological Development, Grant no. TR-35026
Stabilisation of discrete-time polynomial fuzzy systems via a polynomial lyapunov approach
Nasiri, Alireza; Nguang, Sing Kiong; Swain, Akshya; Almakhles, Dhafer
2018-02-01
This paper deals with the problem of designing a controller for a class of discrete-time nonlinear systems which is represented by discrete-time polynomial fuzzy model. Most of the existing control design methods for discrete-time fuzzy polynomial systems cannot guarantee their Lyapunov function to be a radially unbounded polynomial function, hence the global stability cannot be assured. The proposed control design in this paper guarantees a radially unbounded polynomial Lyapunov functions which ensures global stability. In the proposed design, state feedback structure is considered and non-convexity problem is solved by incorporating an integrator into the controller. Sufficient conditions of stability are derived in terms of polynomial matrix inequalities which are solved via SOSTOOLS in MATLAB. A numerical example is presented to illustrate the effectiveness of the proposed controller.
Additive Intensity Regression Models in Corporate Default Analysis
DEFF Research Database (Denmark)
Lando, David; Medhat, Mamdouh; Nielsen, Mads Stenbo
2013-01-01
We consider additive intensity (Aalen) models as an alternative to the multiplicative intensity (Cox) models for analyzing the default risk of a sample of rated, nonfinancial U.S. firms. The setting allows for estimating and testing the significance of time-varying effects. We use a variety of mo...
Misspecified poisson regression models for large-scale registry data
DEFF Research Database (Denmark)
Grøn, Randi; Gerds, Thomas A.; Andersen, Per K.
2016-01-01
working models that are then likely misspecified. To support and improve conclusions drawn from such models, we discuss methods for sensitivity analysis, for estimation of average exposure effects using aggregated data, and a semi-parametric bootstrap method to obtain robust standard errors. The methods...
Logistic regression model for detecting radon prone areas in Ireland.
Elío, J; Crowley, Q; Scanlon, R; Hodgson, J; Long, S
2017-12-01
A new high spatial resolution radon risk map of Ireland has been developed, based on a combination of indoor radon measurements (n=31,910) and relevant geological information (i.e. Bedrock Geology, Quaternary Geology, soil permeability and aquifer type). Logistic regression was used to predict the probability of having an indoor radon concentration above the national reference level of 200Bqm -3 in Ireland. The four geological datasets evaluated were found to be statistically significant, and, based on combinations of these four variables, the predicted probabilities ranged from 0.57% to 75.5%. Results show that the Republic of Ireland may be divided in three main radon risk categories: High (HR), Medium (MR) and Low (LR). The probability of having an indoor radon concentration above 200Bqm -3 in each area was found to be 19%, 8% and 3%; respectively. In the Republic of Ireland, the population affected by radon concentrations above 200Bqm -3 is estimated at ca. 460k (about 10% of the total population). Of these, 57% (265k), 35% (160k) and 8% (35k) are in High, Medium and Low Risk Areas, respectively. Our results provide a high spatial resolution utility which permit customised radon-awareness information to be targeted at specific geographic areas. Copyright © 2017 Elsevier B.V. All rights reserved.
Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.
Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko
2016-03-01
In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.
Shan, Peng; Peng, Silong; Zhao, Yuhui; Tang, Liang
2016-03-01
An analysis of binary mixtures of hydroxyl compound by Attenuated Total Reflection Fourier transform infrared spectroscopy (ATR FT-IR) and classical least squares (CLS) yield large model error due to the presence of unmodeled components such as H-bonded components. To accommodate these spectral variations, polynomial-based least squares (LSP) and polynomial-based total least squares (TLSP) are proposed to capture the nonlinear absorbance-concentration relationship. LSP is based on assuming that only absorbance noise exists; while TLSP takes both absorbance noise and concentration noise into consideration. In addition, based on different solving strategy, two optimization algorithms (limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) algorithm and Levenberg-Marquardt (LM) algorithm) are combined with TLSP and then two different TLSP versions (termed as TLSP-LBFGS and TLSP-LM) are formed. The optimum order of each nonlinear model is determined by cross-validation. Comparison and analyses of the four models are made from two aspects: absorbance prediction and concentration prediction. The results for water-ethanol solution and ethanol-ethyl lactate solution show that LSP, TLSP-LBFGS, and TLSP-LM can, for both absorbance prediction and concentration prediction, obtain smaller root mean square error of prediction than CLS. Additionally, they can also greatly enhance the accuracy of estimated pure component spectra. However, from the view of concentration prediction, the Wilcoxon signed rank test shows that there is no statistically significant difference between each nonlinear model and CLS. © The Author(s) 2016.
Logistic Regression Modeling of Diminishing Manufacturing Sources for Integrated Circuits
National Research Council Canada - National Science Library
Gravier, Michael
1999-01-01
.... This thesis draws on available data from the electronics integrated circuit industry to attempt to assess whether statistical modeling offers a viable method for predicting the presence of DMSMS...
Colouring and knot polynomials
International Nuclear Information System (INIS)
Welsh, D.J.A.
1991-01-01
These lectures will attempt to explain a connection between the recent advances in knot theory using the Jones and related knot polynomials with classical problems in combinatorics and statistical mechanics. The difficulty of some of these problems will be analysed in the context of their computational complexity. In particular we shall discuss colourings and groups valued flows in graphs, knots and the Jones and Kauffman polynomials, the Ising, Potts and percolation problems of statistical physics, computational complexity of the above problems. (author). 20 refs, 9 figs
Additive and polynomial representations
Krantz, David H; Suppes, Patrick
1971-01-01
Additive and Polynomial Representations deals with major representation theorems in which the qualitative structure is reflected as some polynomial function of one or more numerical functions defined on the basic entities. Examples are additive expressions of a single measure (such as the probability of disjoint events being the sum of their probabilities), and additive expressions of two measures (such as the logarithm of momentum being the sum of log mass and log velocity terms). The book describes the three basic procedures of fundamental measurement as the mathematical pivot, as the utiliz
U.S. Environmental Protection Agency — Spreadsheets are included here to support the manuscript "Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations and Biological Condition". This...
Martingale Regressions for a Continuous Time Model of Exchange Rates
Guo, Zi-Yi
2017-01-01
One of the daunting problems in international finance is the weak explanatory power of existing theories of the nominal exchange rates, the so-called “foreign exchange rate determination puzzle”. We propose a continuous-time model to study the impact of order flow on foreign exchange rates. The model is estimated by a newly developed econometric tool based on a time-change sampling from calendar to volatility time. The estimation results indicate that the effect of order flow on exchange rate...
Focused information criterion and model averaging based on weighted composite quantile regression
Xu, Ganggang; Wang, Suojin; Huang, Jianhua Z.
2013-01-01
We study the focused information criterion and frequentist model averaging and their application to post-model-selection inference for weighted composite quantile regression (WCQR) in the context of the additive partial linear models. With the non
Cox's regression model for dynamics of grouped unemployment data
Czech Academy of Sciences Publication Activity Database
Volf, Petr
2003-01-01
Roč. 10, č. 19 (2003), s. 151-162 ISSN 1212-074X R&D Projects: GA ČR GA402/01/0539 Institutional research plan: CEZ:AV0Z1075907 Keywords : mathematical statistics * survival analysis * Cox's model Subject RIV: BB - Applied Statistics, Operational Research
Multiple Linear Regression Model for Estimating the Price of a ...
African Journals Online (AJOL)
Ghana Mining Journal ... In the modeling, the Ordinary Least Squares (OLS) normality assumption which could introduce errors in the statistical analyses was dealt with by log transformation of the data, ensuring the data is normally ... The resultant MLRM is: Ŷi MLRM = (X'X)-1X'Y(xi') where X is the sample data matrix.
Inflation, Forecast Intervals and Long Memory Regression Models
C.S. Bos (Charles); Ph.H.B.F. Franses (Philip Hans); M. Ooms (Marius)
2001-01-01
textabstractWe examine recursive out-of-sample forecasting of monthly postwar U.S. core inflation and log price levels. We use the autoregressive fractionally integrated moving average model with explanatory variables (ARFIMAX). Our analysis suggests a significant explanatory power of leading
Inflation, Forecast Intervals and Long Memory Regression Models
Ooms, M.; Bos, C.S.; Franses, P.H.
2003-01-01
We examine recursive out-of-sample forecasting of monthly postwar US core inflation and log price levels. We use the autoregressive fractionally integrated moving average model with explanatory variables (ARFIMAX). Our analysis suggests a significant explanatory power of leading indicators
Data-driven modelling of LTI systems using symbolic regression
Khandelwal, D.; Toth, R.; Van den Hof, P.M.J.
2017-01-01
The aim of this project is to automate the task of data-driven identification of dynamical systems. The underlying goal is to develop an identification tool that models a physical system without distinguishing between classes of systems such as linear, nonlinear or possibly even hybrid systems. Such
Nonparametric Estimation of Regression Parameters in Measurement Error Models
Czech Academy of Sciences Publication Activity Database
Ehsanes Saleh, A.K.M.D.; Picek, J.; Kalina, Jan
2009-01-01
Roč. 67, č. 2 (2009), s. 177-200 ISSN 0026-1424 Grant - others:GA AV ČR(CZ) IAA101120801; GA MŠk(CZ) LC06024 Institutional research plan: CEZ:AV0Z10300504 Keywords : asymptotic relative efficiency(ARE) * asymptotic theory * emaculate mode * Me model * R-estimation * Reliabilty ratio(RR) Subject RIV: BB - Applied Statistics, Operational Research
Liu, Y.; Zheng, L.; Pau, G. S. H.
2016-12-01
A careful assessment of the risk associated with geologic CO2 storage is critical to the deployment of large-scale storage projects. While numerical modeling is an indispensable tool for risk assessment, there has been increasing need in considering and addressing uncertainties in the numerical models. However, uncertainty analyses have been significantly hindered by the computational complexity of the model. As a remedy, reduced-order models (ROM), which serve as computationally efficient surrogates for high-fidelity models (HFM), have been employed. The ROM is constructed at the expense of an initial set of HFM simulations, and afterwards can be relied upon to predict the model output values at minimal cost. The ROM presented here is part of National Risk Assessment Program (NRAP) and intends to predict the water quality change in groundwater in response to hypothetical CO2 and brine leakage. The HFM based on which the ROM is derived is a multiphase flow and reactive transport model, with 3-D heterogeneous flow field and complex chemical reactions including aqueous complexation, mineral dissolution/precipitation, adsorption/desorption via surface complexation and cation exchange. Reduced-order modeling techniques based on polynomial basis expansion, such as polynomial chaos expansion (PCE), are widely used in the literature. However, the accuracy of such ROMs can be affected by the sparse structure of the coefficients of the expansion. Failing to identify vanishing polynomial coefficients introduces unnecessary sampling errors, the accumulation of which deteriorates the accuracy of the ROMs. To address this issue, we treat the PCE as a sparse Bayesian learning (SBL) problem, and the sparsity is obtained by detecting and including only the non-zero PCE coefficients one at a time by iteratively selecting the most contributing coefficients. The computational complexity due to predicting the entire 3-D concentration fields is further mitigated by a dimension
Lam, H K
2012-02-01
This paper investigates the stability of sampled-data output-feedback (SDOF) polynomial-fuzzy-model-based control systems. Representing the nonlinear plant using a polynomial fuzzy model, an SDOF fuzzy controller is proposed to perform the control process using the system output information. As only the system output is available for feedback compensation, it is more challenging for the controller design and system analysis compared to the full-state-feedback case. Furthermore, because of the sampling activity, the control signal is kept constant by the zero-order hold during the sampling period, which complicates the system dynamics and makes the stability analysis more difficult. In this paper, two cases of SDOF fuzzy controllers, which either share the same number of fuzzy rules or not, are considered. The system stability is investigated based on the Lyapunov stability theory using the sum-of-squares (SOS) approach. SOS-based stability conditions are obtained to guarantee the system stability and synthesize the SDOF fuzzy controller. Simulation examples are given to demonstrate the merits of the proposed SDOF fuzzy control approach.
Many-body orthogonal polynomial systems
International Nuclear Information System (INIS)
Witte, N.S.
1997-03-01
The fundamental methods employed in the moment problem, involving orthogonal polynomial systems, the Lanczos algorithm, continued fraction analysis and Pade approximants has been combined with a cumulant approach and applied to the extensive many-body problem in physics. This has yielded many new exact results for many-body systems in the thermodynamic limit - for the ground state energy, for excited state gaps, for arbitrary ground state avenges - and are of a nonperturbative nature. These results flow from a confluence property of the three-term recurrence coefficients arising and define a general class of many-body orthogonal polynomials. These theorems constitute an analytical solution to the Lanczos algorithm in that they are expressed in terms of the three-term recurrence coefficients α and β. These results can also be applied approximately for non-solvable models in the form of an expansion, in a descending series of the system size. The zeroth order order this expansion is just the manifestation of the central limit theorem in which a Gaussian measure and hermite polynomials arise. The first order represents the first non-trivial order, in which classical distribution functions like the binomial distributions arise and the associated class of orthogonal polynomials are Meixner polynomials. Amongst examples of systems which have infinite order in the expansion are q-orthogonal polynomials where q depends on the system size in a particular way. (author)
Applications of polynomial optimization in financial risk investment
Zeng, Meilan; Fu, Hongwei
2017-09-01
Recently, polynomial optimization has many important applications in optimization, financial economics and eigenvalues of tensor, etc. This paper studies the applications of polynomial optimization in financial risk investment. We consider the standard mean-variance risk measurement model and the mean-variance risk measurement model with transaction costs. We use Lasserre's hierarchy of semidefinite programming (SDP) relaxations to solve the specific cases. The results show that polynomial optimization is effective for some financial optimization problems.
Shaofu Zhuyu Decoction Regresses Endometriotic Lesions in a Rat Model
Directory of Open Access Journals (Sweden)
Guanghui Zhu
2018-01-01
Full Text Available The current therapies for endometriosis are restricted by various side effects and treatment outcome has been less than satisfactory. Shaofu Zhuyu Decoction (SZD, a classic traditional Chinese medicinal (TCM prescription for dysmenorrhea, has been widely used in clinical practice by TCM doctors to relieve symptoms of endometriosis. The present study aimed to investigate the effects of SZD on a rat model of endometriosis. Forty-eight female Sprague-Dawley rats with regular estrous cycles went through autotransplantation operation to establish endometriosis model. Then 38 rats with successful ectopic implants were randomized into two groups: vehicle- and SZD-treated groups. The latter were administered SZD through oral gavage for 4 weeks. By the end of the treatment period, the volume of the endometriotic lesions was measured, the histopathological properties of the ectopic endometrium were evaluated, and levels of proliferating cell nuclear antigen (PCNA, CD34, and hypoxia inducible factor- (HIF- 1α in the ectopic endometrium were detected with immunohistochemistry. Furthermore, apoptosis was assessed using the terminal deoxynucleotidyl transferase (TdT deoxyuridine 5′-triphosphate (dUTP nick-end labeling (TUNEL assay. In this study, SZD significantly reduced the size of ectopic lesions in rats with endometriosis, inhibited cell proliferation, increased cell apoptosis, and reduced microvessel density and HIF-1α expression. It suggested that SZD could be an effective therapy for the treatment and prevention of endometriosis recurrence.
[Application of detecting and taking overdispersion into account in Poisson regression model].
Bouche, G; Lepage, B; Migeot, V; Ingrand, P
2009-08-01
Researchers often use the Poisson regression model to analyze count data. Overdispersion can occur when a Poisson regression model is used, resulting in an underestimation of variance of the regression model parameters. Our objective was to take overdispersion into account and assess its impact with an illustration based on the data of a study investigating the relationship between use of the Internet to seek health information and number of primary care consultations. Three methods, overdispersed Poisson, a robust estimator, and negative binomial regression, were performed to take overdispersion into account in explaining variation in the number (Y) of primary care consultations. We tested overdispersion in the Poisson regression model using the ratio of the sum of Pearson residuals over the number of degrees of freedom (chi(2)/df). We then fitted the three models and compared parameter estimation to the estimations given by Poisson regression model. Variance of the number of primary care consultations (Var[Y]=21.03) was greater than the mean (E[Y]=5.93) and the chi(2)/df ratio was 3.26, which confirmed overdispersion. Standard errors of the parameters varied greatly between the Poisson regression model and the three other regression models. Interpretation of estimates from two variables (using the Internet to seek health information and single parent family) would have changed according to the model retained, with significant levels of 0.06 and 0.002 (Poisson), 0.29 and 0.09 (overdispersed Poisson), 0.29 and 0.13 (use of a robust estimator) and 0.45 and 0.13 (negative binomial) respectively. Different methods exist to solve the problem of underestimating variance in the Poisson regression model when overdispersion is present. The negative binomial regression model seems to be particularly accurate because of its theorical distribution ; in addition this regression is easy to perform with ordinary statistical software packages.
Linking Simple Economic Theory Models and the Cointegrated Vector AutoRegressive Model
DEFF Research Database (Denmark)
Møller, Niels Framroze
This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector-Auto-Regressive model (CVAR). By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail, how the theoretical model and its stru....... Further fundamental extensions and advances to more sophisticated theory models, such as those related to dynamics and expectations (in the structural relations) are left for future papers......This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector-Auto-Regressive model (CVAR). By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail, how the theoretical model and its......, it is demonstrated how other controversial hypotheses such as Rational Expectations can be formulated directly as restrictions on the CVAR-parameters. A simple example of a "Neoclassical synthetic" AS-AD model is also formulated. Finally, the partial- general equilibrium distinction is related to the CVAR as well...
On the Laurent polynomial rings
International Nuclear Information System (INIS)
Stefanescu, D.
1985-02-01
We describe some properties of the Laurent polynomial rings in a finite number of indeterminates over a commutative unitary ring. We study some subrings of the Laurent polynomial rings. We finally obtain two cancellation properties. (author)
Computing the Alexander Polynomial Numerically
DEFF Research Database (Denmark)
Hansen, Mikael Sonne
2006-01-01
Explains how to construct the Alexander Matrix and how this can be used to compute the Alexander polynomial numerically.......Explains how to construct the Alexander Matrix and how this can be used to compute the Alexander polynomial numerically....
Using the classical linear regression model in analysis of the dependences of conveyor belt life
Directory of Open Access Journals (Sweden)
Miriam Andrejiová
2013-12-01
Full Text Available The paper deals with the classical linear regression model of the dependence of conveyor belt life on some selected parameters: thickness of paint layer, width and length of the belt, conveyor speed and quantity of transported material. The first part of the article is about regression model design, point and interval estimation of parameters, verification of statistical significance of the model, and about the parameters of the proposed regression model. The second part of the article deals with identification of influential and extreme values that can have an impact on estimation of regression model parameters. The third part focuses on assumptions of the classical regression model, i.e. on verification of independence assumptions, normality and homoscedasticity of residuals.
Suhartono, Lee, Muhammad Hisyam; Prastyo, Dedy Dwi
2015-12-01
The aim of this research is to develop a calendar variation model for forecasting retail sales data with the Eid ul-Fitr effect. The proposed model is based on two methods, namely two levels ARIMAX and regression methods. Two levels ARIMAX and regression models are built by using ARIMAX for the first level and regression for the second level. Monthly men's jeans and women's trousers sales in a retail company for the period January 2002 to September 2009 are used as case study. In general, two levels of calendar variation model yields two models, namely the first model to reconstruct the sales pattern that already occurred, and the second model to forecast the effect of increasing sales due to Eid ul-Fitr that affected sales at the same and the previous months. The results show that the proposed two level calendar variation model based on ARIMAX and regression methods yields better forecast compared to the seasonal ARIMA model and Neural Networks.
Stochastic Estimation via Polynomial Chaos
2015-10-01
AFRL-RW-EG-TR-2015-108 Stochastic Estimation via Polynomial Chaos Douglas V. Nance Air Force Research...COVERED (From - To) 20-04-2015 – 07-08-2015 4. TITLE AND SUBTITLE 5a. CONTRACT NUMBER Stochastic Estimation via Polynomial Chaos ...This expository report discusses fundamental aspects of the polynomial chaos method for representing the properties of second order stochastic
Statistical approach for selection of regression model during validation of bioanalytical method
Directory of Open Access Journals (Sweden)
Natalija Nakov
2014-06-01
Full Text Available The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during the bionalytical method validation. Given the wide concentration range, frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit using nonparametric statistical tests: One sample rank test and Wilcoxon signed rank test for two independent groups of samples. The results obtained with One sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because slight differences between the error (presented through the relative residuals were obtained. Estimation of the significance of the differences in the RR was achieved using Wilcoxon signed rank test, where linear and quadratic regression models were treated as two independent groups. The application of this simple non-parametric statistical test provides statistical confirmation of the choice of an adequate regression model.
On a Robust MaxEnt Process Regression Model with Sample-Selection
Directory of Open Access Journals (Sweden)
Hea-Jung Kim
2018-04-01
Full Text Available In a regression analysis, a sample-selection bias arises when a dependent variable is partially observed as a result of the sample selection. This study introduces a Maximum Entropy (MaxEnt process regression model that assumes a MaxEnt prior distribution for its nonparametric regression function and finds that the MaxEnt process regression model includes the well-known Gaussian process regression (GPR model as a special case. Then, this special MaxEnt process regression model, i.e., the GPR model, is generalized to obtain a robust sample-selection Gaussian process regression (RSGPR model that deals with non-normal data in the sample selection. Various properties of the RSGPR model are established, including the stochastic representation, distributional hierarchy, and magnitude of the sample-selection bias. These properties are used in the paper to develop a hierarchical Bayesian methodology to estimate the model. This involves a simple and computationally feasible Markov chain Monte Carlo algorithm that avoids analytical or numerical derivatives of the log-likelihood function of the model. The performance of the RSGPR model in terms of the sample-selection bias correction, robustness to non-normality, and prediction, is demonstrated through results in simulations that attest to its good finite-sample performance.
International Nuclear Information System (INIS)
Atashkari, K.; Nariman-Zadeh, N.; Goelcue, M.; Khalkhali, A.; Jamali, A.
2007-01-01
The main reason for the efficiency decrease at part load conditions for four-stroke spark-ignition (SI) engines is the flow restriction at the cross-sectional area of the intake system. Traditionally, valve-timing has been designed to optimize operation at high engine-speed and wide open throttle conditions. Several investigations have demonstrated that improvements at part load conditions in engine performance can be accomplished if the valve-timing is variable. Controlling valve-timing can be used to improve the torque and power curve as well as to reduce fuel consumption and emissions. In this paper, a group method of data handling (GMDH) type neural network and evolutionary algorithms (EAs) are firstly used for modelling the effects of intake valve-timing (V t ) and engine speed (N) of a spark-ignition engine on both developed engine torque (T) and fuel consumption (Fc) using some experimentally obtained training and test data. Using such obtained polynomial neural network models, a multi-objective EA (non-dominated sorting genetic algorithm, NSGA-II) with a new diversity preserving mechanism are secondly used for Pareto based optimization of the variable valve-timing engine considering two conflicting objectives such as torque (T) and fuel consumption (Fc). The comparison results demonstrate the superiority of the GMDH type models over feedforward neural network models in terms of the statistical measures in the training data, testing data and the number of hidden neurons. Further, it is shown that some interesting and important relationships, as useful optimal design principles, involved in the performance of the variable valve-timing four-stroke spark-ignition engine can be discovered by the Pareto based multi-objective optimization of the polynomial models. Such important optimal principles would not have been obtained without the use of both the GMDH type neural network modelling and the multi-objective Pareto optimization approach
A generalized right truncated bivariate Poisson regression model with applications to health data.
Islam, M Ataharul; Chowdhury, Rafiqul I
2017-01-01
A generalized right truncated bivariate Poisson regression model is proposed in this paper. Estimation and tests for goodness of fit and over or under dispersion are illustrated for both untruncated and right truncated bivariate Poisson regression models using marginal-conditional approach. Estimation and test procedures are illustrated for bivariate Poisson regression models with applications to Health and Retirement Study data on number of health conditions and the number of health care services utilized. The proposed test statistics are easy to compute and it is evident from the results that the models fit the data very well. A comparison between the right truncated and untruncated bivariate Poisson regression models using the test for nonnested models clearly shows that the truncated model performs significantly better than the untruncated model.
Wei, Jiawei; Carroll, Raymond J.; Maity, Arnab
2011-01-01
We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work
Killings, duality and characteristic polynomials
Álvarez, Enrique; Borlaf, Javier; León, José H.
1998-03-01
In this paper the complete geometrical setting of (lowest order) abelian T-duality is explored with the help of some new geometrical tools (the reduced formalism). In particular, all invariant polynomials (the integrands of the characteristic classes) can be explicitly computed for the dual model in terms of quantities pertaining to the original one and with the help of the canonical connection whose intrinsic characterization is given. Using our formalism the physically, and T-duality invariant, relevant result that top forms are zero when there is an isometry without fixed points is easily proved. © 1998
Orthogonal polynomials and random matrices
Deift, Percy
2000-01-01
This volume expands on a set of lectures held at the Courant Institute on Riemann-Hilbert problems, orthogonal polynomials, and random matrix theory. The goal of the course was to prove universality for a variety of statistical quantities arising in the theory of random matrix models. The central question was the following: Why do very general ensembles of random n {\\times} n matrices exhibit universal behavior as n {\\rightarrow} {\\infty}? The main ingredient in the proof is the steepest descent method for oscillatory Riemann-Hilbert problems.
Polynomial optimization : Error analysis and applications
Sun, Zhao
2015-01-01
Polynomial optimization is the problem of minimizing a polynomial function subject to polynomial inequality constraints. In this thesis we investigate several hierarchies of relaxations for polynomial optimization problems. Our main interest lies in understanding their performance, in particular how
Koon, Sharon; Petscher, Yaacov
2015-01-01
The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…
Analysis of dental caries using generalized linear and count regression models
Directory of Open Access Journals (Sweden)
Javali M. Phil
2013-11-01
Full Text Available Generalized linear models (GLM are generalization of linear regression models, which allow fitting regression models to response data in all the sciences especially medical and dental sciences that follow a general exponential family. These are flexible and widely used class of such models that can accommodate response variables. Count data are frequently characterized by overdispersion and excess zeros. Zero-inflated count models provide a parsimonious yet powerful way to model this type of situation. Such models assume that the data are a mixture of two separate data generation processes: one generates only zeros, and the other is either a Poisson or a negative binomial data-generating process. Zero inflated count regression models such as the zero-inflated Poisson (ZIP, zero-inflated negative binomial (ZINB regression models have been used to handle dental caries count data with many zeros. We present an evaluation framework to the suitability of applying the GLM, Poisson, NB, ZIP and ZINB to dental caries data set where the count data may exhibit evidence of many zeros and over-dispersion. Estimation of the model parameters using the method of maximum likelihood is provided. Based on the Vuong test statistic and the goodness of fit measure for dental caries data, the NB and ZINB regression models perform better than other count regression models.
Accounting for measurement error in log regression models with applications to accelerated testing.
Directory of Open Access Journals (Sweden)
Robert Richardson
Full Text Available In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.
Accounting for measurement error in log regression models with applications to accelerated testing.
Richardson, Robert; Tolley, H Dennis; Evenson, William E; Lunt, Barry M
2018-01-01
In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.
Complex Polynomial Vector Fields
DEFF Research Database (Denmark)
Dias, Kealey
vector fields. Since the class of complex polynomial vector fields in the plane is natural to consider, it is remarkable that its study has only begun very recently. There are numerous fundamental questions that are still open, both in the general classification of these vector fields, the decomposition...... of parameter spaces into structurally stable domains, and a description of the bifurcations. For this reason, the talk will focus on these questions for complex polynomial vector fields.......The two branches of dynamical systems, continuous and discrete, correspond to the study of differential equations (vector fields) and iteration of mappings respectively. In holomorphic dynamics, the systems studied are restricted to those described by holomorphic (complex analytic) functions...
Roots of the Chromatic Polynomial
DEFF Research Database (Denmark)
Perrett, Thomas
The chromatic polynomial of a graph G is a univariate polynomial whose evaluation at any positive integer q enumerates the proper q-colourings of G. It was introduced in connection with the famous four colour theorem but has recently found other applications in the field of statistical physics...... extend Thomassen’s technique to the Tutte polynomial and as a consequence, deduce a density result for roots of the Tutte polynomial. This partially answers a conjecture of Jackson and Sokal. Finally, we refocus our attention on the chromatic polynomial and investigate the density of chromatic roots...
Polynomials in algebraic analysis
Multarzyński, Piotr
2012-01-01
The concept of polynomials in the sense of algebraic analysis, for a single right invertible linear operator, was introduced and studied originally by D. Przeworska-Rolewicz \\cite{DPR}. One of the elegant results corresponding with that notion is a purely algebraic version of the Taylor formula, being a generalization of its usual counterpart, well known for functions of one variable. In quantum calculus there are some specific discrete derivations analyzed, which are right invertible linear ...
Generic global regression models for growth prediction of Salmonella in ground pork and pork cuts
DEFF Research Database (Denmark)
Buschhardt, Tasja; Hansen, Tina Beck; Bahl, Martin Iain
2017-01-01
Introduction and Objectives Models for the prediction of bacterial growth in fresh pork are primarily developed using two-step regression (i.e. primary models followed by secondary models). These models are also generally based on experiments in liquids or ground meat and neglect surface growth....... It has been shown that one-step global regressions can result in more accurate models and that bacterial growth on intact surfaces can substantially differ from growth in liquid culture. Material and Methods We used a global-regression approach to develop predictive models for the growth of Salmonella....... One part of obtained logtransformed cell counts was used for model development and another for model validation. The Ratkowsky square root model and the relative lag time (RLT) model were integrated into the logistic model with delay. Fitted parameter estimates were compared to investigate the effect...
Directory of Open Access Journals (Sweden)
Soldić-Aleksić Jasna
2009-01-01
Full Text Available Market segmentation presents one of the key concepts of the modern marketing. The main goal of market segmentation is focused on creating groups (segments of customers that have similar characteristics, needs, wishes and/or similar behavior regarding the purchase of concrete product/service. Companies can create specific marketing plan for each of these segments and therefore gain short or long term competitive advantage on the market. Depending on the concrete marketing goal, different segmentation schemes and techniques may be applied. This paper presents a predictive market segmentation model based on the application of logistic regression model and CHAID analysis. The logistic regression model was used for the purpose of variables selection (from the initial pool of eleven variables which are statistically significant for explaining the dependent variable. Selected variables were afterwards included in the CHAID procedure that generated the predictive market segmentation model. The model results are presented on the concrete empirical example in the following form: summary model results, CHAID tree, Gain chart, Index chart, risk and classification tables.
Global sensitivity analysis using polynomial chaos expansions
International Nuclear Information System (INIS)
Sudret, Bruno
2008-01-01
Global sensitivity analysis (SA) aims at quantifying the respective effects of input random variables (or combinations thereof) onto the variance of the response of a physical or mathematical model. Among the abundant literature on sensitivity measures, the Sobol' indices have received much attention since they provide accurate information for most models. The paper introduces generalized polynomial chaos expansions (PCE) to build surrogate models that allow one to compute the Sobol' indices analytically as a post-processing of the PCE coefficients. Thus the computational cost of the sensitivity indices practically reduces to that of estimating the PCE coefficients. An original non intrusive regression-based approach is proposed, together with an experimental design of minimal size. Various application examples illustrate the approach, both from the field of global SA (i.e. well-known benchmark problems) and from the field of stochastic mechanics. The proposed method gives accurate results for various examples that involve up to eight input random variables, at a computational cost which is 2-3 orders of magnitude smaller than the traditional Monte Carlo-based evaluation of the Sobol' indices
Global sensitivity analysis using polynomial chaos expansions
Energy Technology Data Exchange (ETDEWEB)
Sudret, Bruno [Electricite de France, R and D Division, Site des Renardieres, F 77818 Moret-sur-Loing Cedex (France)], E-mail: bruno.sudret@edf.fr
2008-07-15
Global sensitivity analysis (SA) aims at quantifying the respective effects of input random variables (or combinations thereof) onto the variance of the response of a physical or mathematical model. Among the abundant literature on sensitivity measures, the Sobol' indices have received much attention since they provide accurate information for most models. The paper introduces generalized polynomial chaos expansions (PCE) to build surrogate models that allow one to compute the Sobol' indices analytically as a post-processing of the PCE coefficients. Thus the computational cost of the sensitivity indices practically reduces to that of estimating the PCE coefficients. An original non intrusive regression-based approach is proposed, together with an experimental design of minimal size. Various application examples illustrate the approach, both from the field of global SA (i.e. well-known benchmark problems) and from the field of stochastic mechanics. The proposed method gives accurate results for various examples that involve up to eight input random variables, at a computational cost which is 2-3 orders of magnitude smaller than the traditional Monte Carlo-based evaluation of the Sobol' indices.
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables; 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
General Reducibility and Solvability of Polynomial Equations ...
African Journals Online (AJOL)
General Reducibility and Solvability of Polynomial Equations. ... Unlike quadratic, cubic, and quartic polynomials, the general quintic and higher degree polynomials cannot be solved algebraically in terms of finite number of additions, ... Galois Theory, Solving Polynomial Systems, Polynomial factorization, Polynomial Ring ...
Robust geographically weighted regression of modeling the Air Polluter Standard Index (APSI)
Warsito, Budi; Yasin, Hasbi; Ispriyanti, Dwi; Hoyyi, Abdul
2018-05-01
The Geographically Weighted Regression (GWR) model has been widely applied to many practical fields for exploring spatial heterogenity of a regression model. However, this method is inherently not robust to outliers. Outliers commonly exist in data sets and may lead to a distorted estimate of the underlying regression model. One of solution to handle the outliers in the regression model is to use the robust models. So this model was called Robust Geographically Weighted Regression (RGWR). This research aims to aid the government in the policy making process related to air pollution mitigation by developing a standard index model for air polluter (Air Polluter Standard Index - APSI) based on the RGWR approach. In this research, we also consider seven variables that are directly related to the air pollution level, which are the traffic velocity, the population density, the business center aspect, the air humidity, the wind velocity, the air temperature, and the area size of the urban forest. The best model is determined by the smallest AIC value. There are significance differences between Regression and RGWR in this case, but Basic GWR using the Gaussian kernel is the best model to modeling APSI because it has smallest AIC.
Amalia, Junita; Purhadi, Otok, Bambang Widjanarko
2017-11-01
Poisson distribution is a discrete distribution with count data as the random variables and it has one parameter defines both mean and variance. Poisson regression assumes mean and variance should be same (equidispersion). Nonetheless, some case of the count data unsatisfied this assumption because variance exceeds mean (over-dispersion). The ignorance of over-dispersion causes underestimates in standard error. Furthermore, it causes incorrect decision in the statistical test. Previously, paired count data has a correlation and it has bivariate Poisson distribution. If there is over-dispersion, modeling paired count data is not sufficient with simple bivariate Poisson regression. Bivariate Poisson Inverse Gaussian Regression (BPIGR) model is mix Poisson regression for modeling paired count data within over-dispersion. BPIGR model produces a global model for all locations. In another hand, each location has different geographic conditions, social, cultural and economic so that Geographically Weighted Regression (GWR) is needed. The weighting function of each location in GWR generates a different local model. Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (GWBPIGR) model is used to solve over-dispersion and to generate local models. Parameter estimation of GWBPIGR model obtained by Maximum Likelihood Estimation (MLE) method. Meanwhile, hypothesis testing of GWBPIGR model acquired by Maximum Likelihood Ratio Test (MLRT) method.
Can We Use Regression Modeling to Quantify Mean Annual Streamflow at a Global-Scale?
Barbarossa, V.; Huijbregts, M. A. J.; Hendriks, J. A.; Beusen, A.; Clavreul, J.; King, H.; Schipper, A.
2016-12-01
Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for a number of applications, including assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF using observations of discharge and catchment characteristics from 1,885 catchments worldwide, ranging from 2 to 106 km2 in size. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB [van Beek et al., 2011] by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area, mean annual precipitation and air temperature, average slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error values were lower (0.29 - 0.38 compared to 0.49 - 0.57) and the modified index of agreement was higher (0.80 - 0.83 compared to 0.72 - 0.75). Our regression model can be applied globally at any point of the river network, provided that the input parameters are within the range of values employed in the calibration of the model. The performance is reduced for water scarce regions and further research should focus on improving such an aspect for regression-based global hydrological models.
ANALYSIS OF THE FINANCIAL PERFORMANCES OF THE FIRM, BY USING THE MULTIPLE REGRESSION MODEL
Directory of Open Access Journals (Sweden)
Constantin Anghelache
2011-11-01
Full Text Available The information achieved through the use of simple linear regression are not always enough to characterize the evolution of an economic phenomenon and, furthermore, to identify its possible future evolution. To remedy these drawbacks, the special literature includes multiple regression models, in which the evolution of the dependant variable is defined depending on two or more factorial variables.
Modelling infant mortality rate in Central Java, Indonesia use generalized poisson regression method
Prahutama, Alan; Sudarno
2018-05-01
The infant mortality rate is the number of deaths under one year of age occurring among the live births in a given geographical area during a given year, per 1,000 live births occurring among the population of the given geographical area during the same year. This problem needs to be addressed because it is an important element of a country’s economic development. High infant mortality rate will disrupt the stability of a country as it relates to the sustainability of the population in the country. One of regression model that can be used to analyze the relationship between dependent variable Y in the form of discrete data and independent variable X is Poisson regression model. Recently The regression modeling used for data with dependent variable is discrete, among others, poisson regression, negative binomial regression and generalized poisson regression. In this research, generalized poisson regression modeling gives better AIC value than poisson regression. The most significant variable is the Number of health facilities (X1), while the variable that gives the most influence to infant mortality rate is the average breastfeeding (X9).
Directory of Open Access Journals (Sweden)
Mach Łukasz
2017-06-01
Full Text Available The research process aimed at building regression models, which helps to valuate residential real estate, is presented in the following article. Two widely used computational tools i.e. the classical multiple regression and regression models of artificial neural networks were used in order to build models. An attempt to define the utilitarian usefulness of the above-mentioned tools and comparative analysis of them is the aim of the conducted research. Data used for conducting analyses refers to the secondary transactional residential real estate market.
Linear regression metamodeling as a tool to summarize and present simulation model results.
Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M
2013-10-01
Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
Tarafder, Sumit; Toukir Ahmed, Md; Iqbal, Sumaiya; Tamjidul Hoque, Md; Sohel Rahman, M
2018-03-14
Accessible surface area (ASA) of a protein residue is an effective feature for protein structure prediction, binding region identification, fold recognition problems etc. Improving the prediction of ASA by the application of effective feature variables is a challenging but explorable task to consider, specially in the field of machine learning. Among the existing predictors of ASA, REGAd 3 p is a highly accurate ASA predictor which is based on regularized exact regression with polynomial kernel of degree 3. In this work, we present a new predictor RBSURFpred, which extends REGAd 3 p on several dimensions by incorporating 58 physicochemical, evolutionary and structural properties into 9-tuple peptides via Chou's general PseAAC, which allowed us to obtain higher accuracies in predicting both real-valued and binary ASA. We have compared RBSURFpred for both real and binary space predictions with state-of-the-art predictors, such as REGAd 3 p and SPIDER2. We also have carried out a rigorous analysis of the performance of RBSURFpred in terms of different amino acids and their properties, and also with biologically relevant case-studies. The performance of RBSURFpred establishes itself as a useful tool for the community. Copyright © 2018 Elsevier Ltd. All rights reserved.
[Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].
Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L
2017-03-10
To evaluate the estimation of prevalence ratio ( PR ) by using bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea in their infants by using bayesian log-binomial regression model in Openbugs software. The results showed that caregivers' recognition of infant' s risk signs of diarrhea was associated significantly with a 13% increase of medical care-seeking. Meanwhile, we compared the differences in PR 's point estimation and its interval estimation of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea and convergence of three models (model 1: not adjusting for the covariates; model 2: adjusting for duration of caregivers' education, model 3: adjusting for distance between village and township and child month-age based on model 2) between bayesian log-binomial regression model and conventional log-binomial regression model. The results showed that all three bayesian log-binomial regression models were convergence and the estimated PRs were 1.130(95 %CI : 1.005-1.265), 1.128(95 %CI : 1.001-1.264) and 1.132(95 %CI : 1.004-1.267), respectively. Conventional log-binomial regression model 1 and model 2 were convergence and their PRs were 1.130(95 % CI : 1.055-1.206) and 1.126(95 % CI : 1.051-1.203), respectively, but the model 3 was misconvergence, so COPY method was used to estimate PR , which was 1.125 (95 %CI : 1.051-1.200). In addition, the point estimation and interval estimation of PRs from three bayesian log-binomial regression models differed slightly from those of PRs from conventional log-binomial regression model, but they had a good consistency in estimating PR . Therefore, bayesian log-binomial regression model can effectively estimate PR with less misconvergence and have more advantages in application compared with conventional log-binomial regression model.
Amaliana, Luthfatul; Sa'adah, Umu; Wayan Surya Wardhani, Ni
2017-12-01
Tetanus Neonatorum is an infectious disease that can be prevented by immunization. The number of Tetanus Neonatorum cases in East Java Province is the highest in Indonesia until 2015. Tetanus Neonatorum data contain over dispersion and big enough proportion of zero-inflation. Negative Binomial (NB) regression is an alternative method when over dispersion happens in Poisson regression. However, the data containing over dispersion and zero-inflation are more appropriately analyzed by using Zero-Inflated Negative Binomial (ZINB) regression. The purpose of this study are: (1) to model Tetanus Neonatorum cases in East Java Province with 71.05 percent proportion of zero-inflation by using NB and ZINB regression, (2) to obtain the best model. The result of this study indicates that ZINB is better than NB regression with smaller AIC.
Poisson regression for modeling count and frequency outcomes in trauma research.
Gagnon, David R; Doron-LaMarca, Susan; Bell, Margret; O'Farrell, Timothy J; Taft, Casey T
2008-10-01
The authors describe how the Poisson regression method for analyzing count or frequency outcome variables can be applied in trauma studies. The outcome of interest in trauma research may represent a count of the number of incidents of behavior occurring in a given time interval, such as acts of physical aggression or substance abuse. Traditional linear regression approaches assume a normally distributed outcome variable with equal variances over the range of predictor variables, and may not be optimal for modeling count outcomes. An application of Poisson regression is presented using data from a study of intimate partner aggression among male patients in an alcohol treatment program and their female partners. Results of Poisson regression and linear regression models are compared.
Directory of Open Access Journals (Sweden)
Bruno Bastos Teixeira
2012-09-01
Full Text Available Objetivou-se comparar diferentes modelos de regressão aleatória por meio de funções polinomiais de Legendre de diferentes ordens, para avaliar o que melhor se ajusta ao estudo genético da curva de crescimento de codornas de corte. Foram avaliados dados de 2136 matrizes de codorna de corte, dos quais 1026 pertenciam ao grupo genético UFV1 e 1110 ao grupo UFV2. As codornas foram pesadas nos 1°, 7°, 14°, 21°, 28°, 35°, 42°, 77°, 112° e 147° dias de idade e seus pesos utilizados para a análise. Foram testadas duas possíveis modelagens de variância residual heterogênea, sendo agrupadas em 3 e 5 classes de idade. Após, foi realizado o estudo do modelo de regressão aleatória que melhor aplica-se à curva de crescimento das codornas. A comparação entre os modelos foi feita pelo Critério de Informação de Akaike (AIC, Critério de Informação Bayesiano de Schwarz (BIC, Logaritmo da função de verossimilhança (Log e L e teste da razão de verossimilhança (LRT, ao nível de 1%. O modelo que considerou a heterogeneidade de variância residual CL3 mostrou-se adequado à linhagem UFV1, e o modelo CL5 à linhagem UFV2. Uma função polinomial de Legendre com ordem 5, para efeito genético aditivo direto e 5 para efeito permanente de animal, para a linhagem UFV1 e, com ordem 3, para efeito genético aditivo direto e 5 para efeito permanente de animal para a linhagem UFV2, deve ser utilizada na avaliação genética da curva de crescimento das codornas de corte.The objective was to compare different random regression models using Legendre polynomial functions of different orders, to evaluate what best fits the genetic study of the growth curve of meat quails. It was evaluated data from 2136 cut dies quail, of which 1026 belonged to genetic group UFV1 and 1110 the group UFV2. Quail were weighed at 10, 70, 140, 210, 280, 350, 420, 770, 1120 and 1470 days of age, and weights used for the analysis. It was tested two possible modeling
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield
Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan
2018-04-01
In this paper, we propose a hybrid model which is a combination of multiple linear regression model and fuzzy c-means method. This research involved a relationship between 20 variates of the top soil that are analyzed prior to planting of paddy yields at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granary in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using multiple linear regression model and a combination of multiple linear regression model and fuzzy c-means method. Analysis of normality and multicollinearity indicate that the data is normally scattered without multicollinearity among independent variables. Analysis of fuzzy c-means cluster the yield of paddy into two clusters before the multiple linear regression model can be used. The comparison between two method indicate that the hybrid of multiple linear regression model and fuzzy c-means method outperform the multiple linear regression model with lower value of mean square error.
As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...
Electricity demand loads modeling using AutoRegressive Moving Average (ARMA) models
Energy Technology Data Exchange (ETDEWEB)
Pappas, S.S. [Department of Information and Communication Systems Engineering, University of the Aegean, Karlovassi, 83 200 Samos (Greece); Ekonomou, L.; Chatzarakis, G.E. [Department of Electrical Engineering Educators, ASPETE - School of Pedagogical and Technological Education, N. Heraklion, 141 21 Athens (Greece); Karamousantas, D.C. [Technological Educational Institute of Kalamata, Antikalamos, 24100 Kalamata (Greece); Katsikas, S.K. [Department of Technology Education and Digital Systems, University of Piraeus, 150 Androutsou Srt., 18 532 Piraeus (Greece); Liatsis, P. [Division of Electrical Electronic and Information Engineering, School of Engineering and Mathematical Sciences, Information and Biomedical Engineering Centre, City University, Northampton Square, London EC1V 0HB (United Kingdom)
2008-09-15
This study addresses the problem of modeling the electricity demand loads in Greece. The provided actual load data is deseasonilized and an AutoRegressive Moving Average (ARMA) model is fitted on the data off-line, using the Akaike Corrected Information Criterion (AICC). The developed model fits the data in a successful manner. Difficulties occur when the provided data includes noise or errors and also when an on-line/adaptive modeling is required. In both cases and under the assumption that the provided data can be represented by an ARMA model, simultaneous order and parameter estimation of ARMA models under the presence of noise are performed. The produced results indicate that the proposed method, which is based on the multi-model partitioning theory, tackles successfully the studied problem. For validation purposes the produced results are compared with three other established order selection criteria, namely AICC, Akaike's Information Criterion (AIC) and Schwarz's Bayesian Information Criterion (BIC). The developed model could be useful in the studies that concern electricity consumption and electricity prices forecasts. (author)
Yusuf, O B; Bamgboye, E A; Afolabi, R F; Shodimu, M A
2014-09-01
Logistic regression model is widely used in health research for description and predictive purposes. Unfortunately, most researchers are sometimes not aware that the underlying principles of the techniques have failed when the algorithm for maximum likelihood does not converge. Young researchers particularly postgraduate students may not know why separation problem whether quasi or complete occurs, how to identify it and how to fix it. This study was designed to critically evaluate convergence issues in articles that employed logistic regression analysis published in an African Journal of Medicine and medical sciences between 2004 and 2013. Problems of quasi or complete separation were described and were illustrated with the National Demographic and Health Survey dataset. A critical evaluation of articles that employed logistic regression was conducted. A total of 581 articles was reviewed, of which 40 (6.9%) used binary logistic regression. Twenty-four (60.0%) stated the use of logistic regression model in the methodology while none of the articles assessed model fit. Only 3 (12.5%) properly described the procedures. Of the 40 that used the logistic regression model, the problem of convergence occurred in 6 (15.0%) of the articles. Logistic regression tends to be poorly reported in studies published between 2004 and 2013. Our findings showed that the procedure may not be well understood by researchers since very few described the process in their reports and may be totally unaware of the problem of convergence or how to deal with it.
Developing and testing a global-scale regression model to quantify mean annual streamflow
Barbarossa, Valerio; Huijbregts, Mark A. J.; Hendriks, A. Jan; Beusen, Arthur H. W.; Clavreul, Julie; King, Henry; Schipper, Aafke M.
2017-01-01
Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF based on a dataset unprecedented in size, using observations of discharge and catchment characteristics from 1885 catchments worldwide, measuring between 2 and 106 km2. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area and catchment averaged mean annual precipitation and air temperature, slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error (RMSE) values were lower (0.29-0.38 compared to 0.49-0.57) and the modified index of agreement (d) was higher (0.80-0.83 compared to 0.72-0.75). Our regression model can be applied globally to estimate MAF at any point of the river network, thus providing a feasible alternative to spatially explicit process-based global hydrological models.
Polynomial approximation on polytopes
Totik, Vilmos
2014-01-01
Polynomial approximation on convex polytopes in \\mathbf{R}^d is considered in uniform and L^p-norms. For an appropriate modulus of smoothness matching direct and converse estimates are proven. In the L^p-case so called strong direct and converse results are also verified. The equivalence of the moduli of smoothness with an appropriate K-functional follows as a consequence. The results solve a problem that was left open since the mid 1980s when some of the present findings were established for special, so-called simple polytopes.
International Nuclear Information System (INIS)
Milks, Matthew M; Guise, Hubert de
2005-01-01
The construction of su(2) intelligent states is simplified using a polynomial representation of su(2). The cornerstone of the new construction is the diagonalization of a 2 x 2 matrix. The method is sufficiently simple to be easily extended to su(3), where one is required to diagonalize a single 3 x 3 matrix. For two perfectly general su(3) operators, this diagonalization is technically possible but the procedure loses much of its simplicity owing to the algebraic form of the roots of a cubic equation. Simplified expressions can be obtained by specializing the choice of su(3) operators. This simpler construction will be discussed in detail
Reflexion on linear regression trip production modelling method for ensuring good model quality
Suprayitno, Hitapriya; Ratnasari, Vita
2017-11-01
Transport Modelling is important. For certain cases, the conventional model still has to be used, in which having a good trip production model is capital. A good model can only be obtained from a good sample. Two of the basic principles of a good sampling is having a sample capable to represent the population characteristics and capable to produce an acceptable error at a certain confidence level. It seems that this principle is not yet quite understood and used in trip production modeling. Therefore, investigating the Trip Production Modelling practice in Indonesia and try to formulate a better modeling method for ensuring the Model Quality is necessary. This research result is presented as follows. Statistics knows a method to calculate span of prediction value at a certain confidence level for linear regression, which is called Confidence Interval of Predicted Value. The common modeling practice uses R2 as the principal quality measure, the sampling practice varies and not always conform to the sampling principles. An experiment indicates that small sample is already capable to give excellent R2 value and sample composition can significantly change the model. Hence, good R2 value, in fact, does not always mean good model quality. These lead to three basic ideas for ensuring good model quality, i.e. reformulating quality measure, calculation procedure, and sampling method. A quality measure is defined as having a good R2 value and a good Confidence Interval of Predicted Value. Calculation procedure must incorporate statistical calculation method and appropriate statistical tests needed. A good sampling method must incorporate random well distributed stratified sampling with a certain minimum number of samples. These three ideas need to be more developed and tested.
Using the Logistic Regression model in supporting decisions of establishing marketing strategies
Directory of Open Access Journals (Sweden)
Cristinel CONSTANTIN
2015-12-01
Full Text Available This paper is about an instrumental research regarding the using of Logistic Regression model for data analysis in marketing research. The decision makers inside different organisation need relevant information to support their decisions regarding the marketing strategies. The data provided by marketing research could be computed in various ways but the multivariate data analysis models can enhance the utility of the information. Among these models we can find the Logistic Regression model, which is used for dichotomous variables. Our research is based on explanation the utility of this model and interpretation of the resulted information in order to help practitioners and researchers to use it in their future investigations
Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi
2012-01-01
The objective of the present study was to assess the comparable applicability of orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of trans cranial doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation on the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
Chen, Baojiang; Qin, Jing
2014-05-10
In statistical analysis, a regression model is needed if one is interested in finding the relationship between a response variable and covariates. When the response depends on the covariate, then it may also depend on the function of this covariate. If one has no knowledge of this functional form but expect for monotonic increasing or decreasing, then the isotonic regression model is preferable. Estimation of parameters for isotonic regression models is based on the pool-adjacent-violators algorithm (PAVA), where the monotonicity constraints are built in. With missing data, people often employ the augmented estimating method to improve estimation efficiency by incorporating auxiliary information through a working regression model. However, under the framework of the isotonic regression model, the PAVA does not work as the monotonicity constraints are violated. In this paper, we develop an empirical likelihood-based method for isotonic regression model to incorporate the auxiliary information. Because the monotonicity constraints still hold, the PAVA can be used for parameter estimation. Simulation studies demonstrate that the proposed method can yield more efficient estimates, and in some situations, the efficiency improvement is substantial. We apply this method to a dementia study. Copyright © 2013 John Wiley & Sons, Ltd.
Twisted Polynomials and Forgery Attacks on GCM
DEFF Research Database (Denmark)
Abdelraheem, Mohamed Ahmed A. M. A.; Beelen, Peter; Bogdanov, Andrey
2015-01-01
Polynomial hashing as an instantiation of universal hashing is a widely employed method for the construction of MACs and authenticated encryption (AE) schemes, the ubiquitous GCM being a prominent example. It is also used in recent AE proposals within the CAESAR competition which aim at providing...... in an improved key recovery algorithm. As cryptanalytic applications of our twisted polynomials, we develop the first universal forgery attacks on GCM in the weak-key model that do not require nonce reuse. Moreover, we present universal weak-key forgeries for the nonce-misuse resistant AE scheme POET, which...
Zhao, Xin; Han, Meng; Ding, Lili; Calin, Adrian Cantemir
2018-01-01
The accurate forecast of carbon dioxide emissions is critical for policy makers to take proper measures to establish a low carbon society. This paper discusses a hybrid of the mixed data sampling (MIDAS) regression model and BP (back propagation) neural network (MIDAS-BP model) to forecast carbon dioxide emissions. Such analysis uses mixed frequency data to study the effects of quarterly economic growth on annual carbon dioxide emissions. The forecasting ability of MIDAS-BP is remarkably better than MIDAS, ordinary least square (OLS), polynomial distributed lags (PDL), autoregressive distributed lags (ADL), and auto-regressive moving average (ARMA) models. The MIDAS-BP model is suitable for forecasting carbon dioxide emissions for both the short and longer term. This research is expected to influence the methodology for forecasting carbon dioxide emissions by improving the forecast accuracy. Empirical results show that economic growth has both negative and positive effects on carbon dioxide emissions that last 15 quarters. Carbon dioxide emissions are also affected by their own change within 3 years. Therefore, there is a need for policy makers to explore an alternative way to develop the economy, especially applying new energy policies to establish a low carbon society.
Veerkamp, R F; Koenen, E P; De Jong, G
2001-10-01
Twenty type classifiers scored body condition (BCS) of 91,738 first-parity cows from 601 sires and 5518 maternal grandsires. Fertility data during first lactation were extracted for 177,220 cows, of which 67,278 also had a BCS observation, and first-lactation 305-d milk, fat, and protein yields were added for 180,631 cows. Heritabilities and genetic correlations were estimated using a sire-maternal grandsire model. Heritability of BCS was 0.38. Heritabilities for fertility traits were low (0.01 to 0.07), but genetic standard deviations were substantial, 9 d for days to first service and calving interval, 0.25 for number of services, and 5% for first-service conception. Phenotypic correlations between fertility and yield or BCS were small (-0.15 to 0.20). Genetic correlations between yield and all fertility traits were unfavorable (0.37 to 0.74). Genetic correlations with BCS were between -0.4 and -0.6 for calving interval and days to first service. Random regression analysis (RR) showed that correlations changed with days in milk for BCS. Little agreement was found between variances and correlations from RR, and analysis including a single month (mo 1 to 10) of data for BCS, especially during early and late lactation. However, this was due to excluding data from the conventional analysis, rather than due to the polynomials used. RR and a conventional five-traits model where BCS in mo 1, 4, 7, and 10 was treated as a separate traits (plus yield or fertility) gave similar results. Thus a parsimonious random regression model gave more realistic estimates for the (co)variances than a series of bivariate analysis on subsets of the data for BCS. A higher genetic merit for yield has unfavorable effects on fertility, but the genetic correlation suggests that BCS (at some stages of lactation) might help to alleviate the unfavorable effect of selection for higher yield on fertility.
Exponential time paradigms through the polynomial time lens
Drucker, A.; Nederlof, J.; Santhanam, R.; Sankowski, P.; Zaroliagis, C.
2016-01-01
We propose a general approach to modelling algorithmic paradigms for the exact solution of NP-hard problems. Our approach is based on polynomial time reductions to succinct versions of problems solvable in polynomial time. We use this viewpoint to explore and compare the power of paradigms such as
International Nuclear Information System (INIS)
Fang, Xiande; Xu, Yu
2011-01-01
The empirical model of turbine efficiency is necessary for the control- and/or diagnosis-oriented simulation and useful for the simulation and analysis of dynamic performances of the turbine equipment and systems, such as air cycle refrigeration systems, power plants, turbine engines, and turbochargers. Existing empirical models of turbine efficiency are insufficient because there is no suitable form available for air cycle refrigeration turbines. This work performs a critical review of empirical models (called mean value models in some literature) of turbine efficiency and develops an empirical model in the desired form for air cycle refrigeration, the dominant cooling approach in aircraft environmental control systems. The Taylor series and regression analysis are used to build the model, with the Taylor series being used to expand functions with the polytropic exponent and the regression analysis to finalize the model. The measured data of a turbocharger turbine and two air cycle refrigeration turbines are used for the regression analysis. The proposed model is compact and able to present the turbine efficiency map. Its predictions agree with the measured data very well, with the corrected coefficient of determination R c 2 ≥ 0.96 and the mean absolute percentage deviation = 1.19% for the three turbines. -- Highlights: → Performed a critical review of empirical models of turbine efficiency. → Developed an empirical model in the desired form for air cycle refrigeration, using the Taylor expansion and regression analysis. → Verified the method for developing the empirical model. → Verified the model.
Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer
2013-01-01
Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…
Structured Additive Regression Models: An R Interface to BayesX
Directory of Open Access Journals (Sweden)
Nikolaus Umlauf
2015-02-01
Full Text Available Structured additive regression (STAR models provide a flexible framework for model- ing possible nonlinear effects of covariates: They contain the well established frameworks of generalized linear models and generalized additive models as special cases but also allow a wider class of effects, e.g., for geographical or spatio-temporal data, allowing for specification of complex and realistic models. BayesX is standalone software package providing software for fitting general class of STAR models. Based on a comprehensive open-source regression toolbox written in C++, BayesX uses Bayesian inference for estimating STAR models based on Markov chain Monte Carlo simulation techniques, a mixed model representation of STAR models, or stepwise regression techniques combining penalized least squares estimation with model selection. BayesX not only covers models for responses from univariate exponential families, but also models from less-standard regression situations such as models for multi-categorical responses with either ordered or unordered categories, continuous time survival data, or continuous time multi-state models. This paper presents a new fully interactive R interface to BayesX: the R package R2BayesX. With the new package, STAR models can be conveniently specified using Rs formula language (with some extended terms, fitted using the BayesX binary, represented in R with objects of suitable classes, and finally printed/summarized/plotted. This makes BayesX much more accessible to users familiar with R and adds extensive graphics capabilities for visualizing fitted STAR models. Furthermore, R2BayesX complements the already impressive capabilities for semiparametric regression in R by a comprehensive toolbox comprising in particular more complex response types and alternative inferential procedures such as simulation-based Bayesian inference.
Nagel-Alne, G E; Krontveit, R; Bohlin, J; Valle, P S; Skjerve, E; Sølverød, L S
2014-07-01
In 2001, the Norwegian Goat Health Service initiated the Healthier Goats program (HG), with the aim of eradicating caprine arthritis encephalitis, caseous lymphadenitis, and Johne's disease (caprine paratuberculosis) in Norwegian goat herds. The aim of the present study was to explore how control and eradication of the above-mentioned diseases by enrolling in HG affected milk yield by comparison with herds not enrolled in HG. Lactation curves were modeled using a multilevel cubic spline regression model where farm, goat, and lactation were included as random effect parameters. The data material contained 135,446 registrations of daily milk yield from 28,829 lactations in 43 herds. The multilevel cubic spline regression model was applied to 4 categories of data: enrolled early, control early, enrolled late, and control late. For enrolled herds, the early and late notations refer to the situation before and after enrolling in HG; for nonenrolled herds (controls), they refer to development over time, independent of HG. Total milk yield increased in the enrolled herds after eradication: the total milk yields in the fourth lactation were 634.2 and 873.3 kg in enrolled early and enrolled late herds, respectively, and 613.2 and 701.4 kg in the control early and control late herds, respectively. Day of peak yield differed between enrolled and control herds. The day of peak yield came on d 6 of lactation for the control early category for parities 2, 3, and 4, indicating an inability of the goats to further increase their milk yield from the initial level. For enrolled herds, on the other hand, peak yield came between d 49 and 56, indicating a gradual increase in milk yield after kidding. Our results indicate that enrollment in the HG disease eradication program improved the milk yield of dairy goats considerably, and that the multilevel cubic spline regression was a suitable model for exploring effects of disease control and eradication on milk yield. Copyright © 2014
Profile-driven regression for modeling and runtime optimization of mobile networks
DEFF Research Database (Denmark)
McClary, Dan; Syrotiuk, Violet; Kulahci, Murat
2010-01-01
Computer networks often display nonlinear behavior when examined over a wide range of operating conditions. There are few strategies available for modeling such behavior and optimizing such systems as they run. Profile-driven regression is developed and applied to modeling and runtime optimization...... of throughput in a mobile ad hoc network, a self-organizing collection of mobile wireless nodes without any fixed infrastructure. The intermediate models generated in profile-driven regression are used to fit an overall model of throughput, and are also used to optimize controllable factors at runtime. Unlike...
DEFF Research Database (Denmark)
Carstensen, Bendix
1996-01-01
This paper shows how to fit excess and relative risk regression models to interval censored survival data, and how to implement the models in standard statistical software. The methods developed are used for the analysis of HIV infection rates in a cohort of Danish homosexual men.......This paper shows how to fit excess and relative risk regression models to interval censored survival data, and how to implement the models in standard statistical software. The methods developed are used for the analysis of HIV infection rates in a cohort of Danish homosexual men....
The Relationship between Economic Growth and Money Laundering – a Linear Regression Model
Directory of Open Access Journals (Sweden)
Daniel Rece
2009-09-01
Full Text Available This study provides an overview of the relationship between economic growth and money laundering modeled by a least squares function. The report analyzes statistically data collected from USA, Russia, Romania and other eleven European countries, rendering a linear regression model. The study illustrates that 23.7% of the total variance in the regressand (level of money laundering is “explained” by the linear regression model. In our opinion, this model will provide critical auxiliary judgment and decision support for anti-money laundering service systems.
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-12-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
Directory of Open Access Journals (Sweden)
Ajay Singh
2016-06-01
Full Text Available A single trait linear mixed random regression test-day model was applied for the first time for analyzing the first lactation monthly test-day milk yield records in Karan Fries cattle. The test-day milk yield data was modeled using a random regression model (RRM considering different order of Legendre polynomial for the additive genetic effect (4th order and the permanent environmental effect (5th order. Data pertaining to 1,583 lactation records spread over a period of 30 years were recorded and analyzed in the study. The variance component, heritability and genetic correlations among test-day milk yields were estimated using RRM. RRM heritability estimates of test-day milk yield varied from 0.11 to 0.22 in different test-day records. The estimates of genetic correlations between different test-day milk yields ranged 0.01 (test-day 1 [TD-1] and TD-11 to 0.99 (TD-4 and TD-5. The magnitudes of genetic correlations between test-day milk yields decreased as the interval between test-days increased and adjacent test-day had higher correlations. Additive genetic and permanent environment variances were higher for test-day milk yields at both ends of lactation. The residual variance was observed to be lower than the permanent environment variance for all the test-day milk yields.
Modeling Governance KB with CATPCA to Overcome Multicollinearity in the Logistic Regression
Khikmah, L.; Wijayanto, H.; Syafitri, U. D.
2017-04-01
The problem often encounters in logistic regression modeling are multicollinearity problems. Data that have multicollinearity between explanatory variables with the result in the estimation of parameters to be bias. Besides, the multicollinearity will result in error in the classification. In general, to overcome multicollinearity in regression used stepwise regression. They are also another method to overcome multicollinearity which involves all variable for prediction. That is Principal Component Analysis (PCA). However, classical PCA in only for numeric data. Its data are categorical, one method to solve the problems is Categorical Principal Component Analysis (CATPCA). Data were used in this research were a part of data Demographic and Population Survey Indonesia (IDHS) 2012. This research focuses on the characteristic of women of using the contraceptive methods. Classification results evaluated using Area Under Curve (AUC) values. The higher the AUC value, the better. Based on AUC values, the classification of the contraceptive method using stepwise method (58.66%) is better than the logistic regression model (57.39%) and CATPCA (57.39%). Evaluation of the results of logistic regression using sensitivity, shows the opposite where CATPCA method (99.79%) is better than logistic regression method (92.43%) and stepwise (92.05%). Therefore in this study focuses on major class classification (using a contraceptive method), then the selected model is CATPCA because it can raise the level of the major class model accuracy.
Liu, Chuang; Lam, H. K.
2015-01-01
In this paper, we propose a polynomial fuzzy observer controller for nonlinear systems, where the design is achieved through the stability analysis of polynomial-fuzzy-model-based (PFMB) observer-control system. The polynomial fuzzy observer estimates the system states using estimated premise variables. The estimated states are then employed by the polynomial fuzzy controller for the feedback control of nonlinear systems represented by the polynomial fuzzy model. The system stability of the P...
Regression analysis understanding and building business and economic models using Excel
Wilson, J Holton
2012-01-01
The technique of regression analysis is used so often in business and economics today that an understanding of its use is necessary for almost everyone engaged in the field. This book will teach you the essential elements of building and understanding regression models in a business/economic context in an intuitive manner. The authors take a non-theoretical treatment that is accessible even if you have a limited statistical background. It is specifically designed to teach the correct use of regression, while advising you of its limitations and teaching about common pitfalls. This book describe
Estimasi Model Seemingly Unrelated Regression (SUR dengan Metode Generalized Least Square (GLS
Directory of Open Access Journals (Sweden)
Ade Widyaningsih
2015-04-01
Full Text Available Regression analysis is a statistical tool that is used to determine the relationship between two or more quantitative variables so that one variable can be predicted from the other variables. A method that can used to obtain a good estimation in the regression analysis is ordinary least squares method. The least squares method is used to estimate the parameters of one or more regression but relationships among the errors in the response of other estimators are not allowed. One way to overcome this problem is Seemingly Unrelated Regression model (SUR in which parameters are estimated using Generalized Least Square (GLS. In this study, the author applies SUR model using GLS method on world gasoline demand data. The author obtains that SUR using GLS is better than OLS because SUR produce smaller errors than the OLS.
Estimasi Model Seemingly Unrelated Regression (SUR dengan Metode Generalized Least Square (GLS
Directory of Open Access Journals (Sweden)
Ade Widyaningsih
2014-06-01
Full Text Available Regression analysis is a statistical tool that is used to determine the relationship between two or more quantitative variables so that one variable can be predicted from the other variables. A method that can used to obtain a good estimation in the regression analysis is ordinary least squares method. The least squares method is used to estimate the parameters of one or more regression but relationships among the errors in the response of other estimators are not allowed. One way to overcome this problem is Seemingly Unrelated Regression model (SUR in which parameters are estimated using Generalized Least Square (GLS. In this study, the author applies SUR model using GLS method on world gasoline demand data. The author obtains that SUR using GLS is better than OLS because SUR produce smaller errors than the OLS.
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used in selecting important predictor variables, diagnostics are used to check that assumptions are valid which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographics profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
Validation of regression models for nitrate concentrations in the upper groundwater in sandy soils
International Nuclear Information System (INIS)
Sonneveld, M.P.W.; Brus, D.J.; Roelsma, J.
2010-01-01
For Dutch sandy regions, linear regression models have been developed that predict nitrate concentrations in the upper groundwater on the basis of residual nitrate contents in the soil in autumn. The objective of our study was to validate these regression models for one particular sandy region dominated by dairy farming. No data from this area were used for calibrating the regression models. The model was validated by additional probability sampling. This sample was used to estimate errors in 1) the predicted areal fractions where the EU standard of 50 mg l -1 is exceeded for farms with low N surpluses (ALT) and farms with higher N surpluses (REF); 2) predicted cumulative frequency distributions of nitrate concentration for both groups of farms. Both the errors in the predicted areal fractions as well as the errors in the predicted cumulative frequency distributions indicate that the regression models are invalid for the sandy soils of this study area. - This study indicates that linear regression models that predict nitrate concentrations in the upper groundwater using residual soil N contents should be applied with care.
A brief introduction to regression designs and mixed-effects modelling by a recent convert
Balling, Laura Winther
2008-01-01
This article discusses the advantages of multiple regression designs over the factorial designs traditionally used in many psycholinguistic experiments. It is shown that regression designs are typically more informative, statistically more powerful and better suited to the analysis of naturalistic tasks. The advantages of including both fixed and random effects are demonstrated with reference to linear mixed-effects models, and problems of collinearity, variable distribution and variable sele...
A computational approach to compare regression modelling strategies in prediction research.
Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H
2016-08-25
It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
Prahutama, Alan; Suparti; Wahyu Utami, Tiani
2018-03-01
Regression analysis is an analysis to model the relationship between response variables and predictor variables. The parametric approach to the regression model is very strict with the assumption, but nonparametric regression model isn’t need assumption of model. Time series data is the data of a variable that is observed based on a certain time, so if the time series data wanted to be modeled by regression, then we should determined the response and predictor variables first. Determination of the response variable in time series is variable in t-th (yt), while the predictor variable is a significant lag. In nonparametric regression modeling, one developing approach is to use the Fourier series approach. One of the advantages of nonparametric regression approach using Fourier series is able to overcome data having trigonometric distribution. In modeling using Fourier series needs parameter of K. To determine the number of K can be used Generalized Cross Validation method. In inflation modeling for the transportation sector, communication and financial services using Fourier series yields an optimal K of 120 parameters with R-square 99%. Whereas if it was modeled by multiple linear regression yield R-square 90%.
Constructing general partial differential equations using polynomial and neural networks.
Zjavka, Ladislav; Pedrycz, Witold
2016-01-01
Sum fraction terms can approximate multi-variable functions on the basis of discrete observations, replacing a partial differential equation definition with polynomial elementary data relation descriptions. Artificial neural networks commonly transform the weighted sum of inputs to describe overall similarity relationships of trained and new testing input patterns. Differential polynomial neural networks form a new class of neural networks, which construct and solve an unknown general partial differential equation of a function of interest with selected substitution relative terms using non-linear multi-variable composite polynomials. The layers of the network generate simple and composite relative substitution terms whose convergent series combinations can describe partial dependent derivative changes of the input variables. This regression is based on trained generalized partial derivative data relations, decomposed into a multi-layer polynomial network structure. The sigmoidal function, commonly used as a nonlinear activation of artificial neurons, may transform some polynomial items together with the parameters with the aim to improve the polynomial derivative term series ability to approximate complicated periodic functions, as simple low order polynomials are not able to fully make up for the complete cycles. The similarity analysis facilitates substitutions for differential equations or can form dimensional units from data samples to describe real-world problems. Copyright © 2015 Elsevier Ltd. All rights reserved.
truncSP: An R Package for Estimation of Semi-Parametric Truncated Linear Regression Models
Directory of Open Access Journals (Sweden)
Maria Karlsson
2014-05-01
Full Text Available Problems with truncated data occur in many areas, complicating estimation and inference. Regarding linear regression models, the ordinary least squares estimator is inconsistent and biased for these types of data and is therefore unsuitable for use. Alternative estimators, designed for the estimation of truncated regression models, have been developed. This paper presents the R package truncSP. The package contains functions for the estimation of semi-parametric truncated linear regression models using three different estimators: the symmetrically trimmed least squares, quadratic mode, and left truncated estimators, all of which have been shown to have good asymptotic and ?nite sample properties. The package also provides functions for the analysis of the estimated models. Data from the environmental sciences are used to illustrate the functions in the package.
Modeling and prediction of Turkey's electricity consumption using Support Vector Regression
International Nuclear Information System (INIS)
Kavaklioglu, Kadir
2011-01-01
Support Vector Regression (SVR) methodology is used to model and predict Turkey's electricity consumption. Among various SVR formalisms, ε-SVR method was used since the training pattern set was relatively small. Electricity consumption is modeled as a function of socio-economic indicators such as population, Gross National Product, imports and exports. In order to facilitate future predictions of electricity consumption, a separate SVR model was created for each of the input variables using their current and past values; and these models were combined to yield consumption prediction values. A grid search for the model parameters was performed to find the best ε-SVR model for each variable based on Root Mean Square Error. Electricity consumption of Turkey is predicted until 2026 using data from 1975 to 2006. The results show that electricity consumption can be modeled using Support Vector Regression and the models can be used to predict future electricity consumption. (author)
Improved model of the retardance in citric acid coated ferrofluids using stepwise regression
Lin, J. F.; Qiu, X. R.
2017-06-01
Citric acid (CA) coated Fe3O4 ferrofluids (FFs) have been conducted for biomedical application. The magneto-optical retardance of CA coated FFs was measured by a Stokes polarimeter. Optimization and multiple regression of retardance in FFs were executed by Taguchi method and Microsoft Excel previously, and the F value of regression model was large enough. However, the model executed by Excel was not systematic. Instead we adopted the stepwise regression to model the retardance of CA coated FFs. From the results of stepwise regression by MATLAB, the developed model had highly predictable ability owing to F of 2.55897e+7 and correlation coefficient of one. The average absolute error of predicted retardances to measured retardances was just 0.0044%. Using the genetic algorithm (GA) in MATLAB, the optimized parametric combination was determined as [4.709 0.12 39.998 70.006] corresponding to the pH of suspension, molar ratio of CA to Fe3O4, CA volume, and coating temperature. The maximum retardance was found as 31.712°, close to that obtained by evolutionary solver in Excel and a relative error of -0.013%. Above all, the stepwise regression method was successfully used to model the retardance of CA coated FFs, and the maximum global retardance was determined by the use of GA.
Directory of Open Access Journals (Sweden)
Alessandro Danielis
2015-01-01
Full Text Available The processing of intensity data from terrestrial laser scanners has attracted considerable attention in recent years. Accurate calibrated intensity could give added value for laser scanning campaigns, for example, in producing faithful 3D colour models of real targets and classifying easier and more reliable automatic tools. In cultural heritage area, the purely geometric information provided by the vast majority of currently available scanners is not enough for most applications, where indeed accurate colorimetric data is needed. This paper presents a remote calibration method for self-registered RGB colour data provided by a 3D tristimulus laser scanner prototype. Such distinguishing colour information opens new scenarios and problems for remote colorimetry. Using piecewise cubic Hermite polynomials, a quadratic model with nonpolynomial terms for reducing inaccuracies occurring in remote colour measurement is implemented. Colorimetric data recorded by the prototype on certified diffusive targets is processed for generating a remote Lambertian model used for assessing the accuracy of the proposed algorithm. Results concerning laser scanner digitizations of artworks are reported to confirm the effectiveness of the method.
Complex Polynomial Vector Fields
DEFF Research Database (Denmark)
The two branches of dynamical systems, continuous and discrete, correspond to the study of differential equations (vector fields) and iteration of mappings respectively. In holomorphic dynamics, the systems studied are restricted to those described by holomorphic (complex analytic) functions...... or meromorphic (allowing poles as singularities) functions. There already exists a well-developed theory for iterative holomorphic dynamical systems, and successful relations found between iteration theory and flows of vector fields have been one of the main motivations for the recent interest in holomorphic...... vector fields. Since the class of complex polynomial vector fields in the plane is natural to consider, it is remarkable that its study has only begun very recently. There are numerous fundamental questions that are still open, both in the general classification of these vector fields, the decomposition...
Polynomial methods in combinatorics
Guth, Larry
2016-01-01
This book explains some recent applications of the theory of polynomials and algebraic geometry to combinatorics and other areas of mathematics. One of the first results in this story is a short elegant solution of the Kakeya problem for finite fields, which was considered a deep and difficult problem in combinatorial geometry. The author also discusses in detail various problems in incidence geometry associated to Paul Erdős's famous distinct distances problem in the plane from the 1940s. The proof techniques are also connected to error-correcting codes, Fourier analysis, number theory, and differential geometry. Although the mathematics discussed in the book is deep and far-reaching, it should be accessible to first- and second-year graduate students and advanced undergraduates. The book contains approximately 100 exercises that further the reader's understanding of the main themes of the book. Some of the greatest advances in geometric combinatorics and harmonic analysis in recent years have been accompl...
On pseudo-values for regression analysis in competing risks models
DEFF Research Database (Denmark)
Graw, F; Gerds, Thomas Alexander; Schumacher, M
2009-01-01
For regression on state and transition probabilities in multi-state models Andersen et al. (Biometrika 90:15-27, 2003) propose a technique based on jackknife pseudo-values. In this article we analyze the pseudo-values suggested for competing risks models and prove some conjectures regarding their...
A Predictive Logistic Regression Model of World Conflict Using Open Source Data
2015-03-26
No correlation between the error terms and the independent variables 9. Absence of perfect multicollinearity (Menard, 2001) When assumptions are...some of the variables before initial model building. Multicollinearity , or near-linear dependence among the variables will cause problems in the...model. High multicollinearity tends to produce unreasonably high logistic regression coefficients and can result in coefficients that are not
Sample size calculation to externally validate scoring systems based on logistic regression models.
Directory of Open Access Journals (Sweden)
Antonio Palazón-Bru
Full Text Available A sample size containing at least 100 events and 100 non-events has been suggested to validate a predictive model, regardless of the model being validated and that certain factors can influence calibration of the predictive model (discrimination, parameterization and incidence. Scoring systems based on binary logistic regression models are a specific type of predictive model.The aim of this study was to develop an algorithm to determine the sample size for validating a scoring system based on a binary logistic regression model and to apply it to a case study.The algorithm was based on bootstrap samples in which the area under the ROC curve, the observed event probabilities through smooth curves, and a measure to determine the lack of calibration (estimated calibration index were calculated. To illustrate its use for interested researchers, the algorithm was applied to a scoring system, based on a binary logistic regression model, to determine mortality in intensive care units.In the case study provided, the algorithm obtained a sample size with 69 events, which is lower than the value suggested in the literature.An algorithm is provided for finding the appropriate sample size to validate scoring systems based on binary logistic regression models. This could be applied to determine the sample size in other similar cases.
Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.
2006-01-01
Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
de Vries, S O; Fidler, Vaclav; Kuipers, Wietze D; Hunink, Maria G M
1998-01-01
The purpose of this study was to develop a model that predicts the outcome of supervised exercise for intermittent claudication. The authors present an example of the use of autoregressive logistic regression for modeling observed longitudinal data. Data were collected from 329 participants in a
Endogenous glucose production from infancy to adulthood: a non-linear regression model
Huidekoper, Hidde H.; Ackermans, Mariëtte T.; Ruiter, An F. C.; Sauerwein, Hans P.; Wijburg, Frits A.
2014-01-01
To construct a regression model for endogenous glucose production (EGP) as a function of age, and compare this with glucose supplementation using commonly used dextrose-based saline solutions at fluid maintenance rate in children. A model was constructed based on EGP data, as quantified by
Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...
Kleijnen, J.P.C.
1995-01-01
This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for
Genomic prediction based on data from three layer lines using non-linear regression models
Huang, H.; Windig, J.J.; Vereijken, A.; Calus, M.P.L.
2014-01-01
Background - Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. Methods - In an attempt to alleviate
Logistic regression models of factors influencing the location of bioenergy and biofuels plants
T.M. Young; R.L. Zaretzki; J.H. Perdue; F.M. Guess; X. Liu
2011-01-01
Logistic regression models were developed to identify significant factors that influence the location of existing wood-using bioenergy/biofuels plants and traditional wood-using facilities. Logistic models provided quantitative insight for variables influencing the location of woody biomass-using facilities. Availability of "thinnings to a basal area of 31.7m2/ha...
Determining factors influencing survival of breast cancer by fuzzy logistic regression model.
Nikbakht, Roya; Bahrampour, Abbas
2017-01-01
Fuzzy logistic regression model can be used for determining influential factors of disease. This study explores the important factors of actual predictive survival factors of breast cancer's patients. We used breast cancer data which collected by cancer registry of Kerman University of Medical Sciences during the period of 2000-2007. The variables such as morphology, grade, age, and treatments (surgery, radiotherapy, and chemotherapy) were applied in the fuzzy logistic regression model. Performance of model was determined in terms of mean degree of membership (MDM). The study results showed that almost 41% of patients were in neoplasm and malignant group and more than two-third of them were still alive after 5-year follow-up. Based on the fuzzy logistic model, the most important factors influencing survival were chemotherapy, morphology, and radiotherapy, respectively. Furthermore, the MDM criteria show that the fuzzy logistic regression have a good fit on the data (MDM = 0.86). Fuzzy logistic regression model showed that chemotherapy is more important than radiotherapy in survival of patients with breast cancer. In addition, another ability of this model is calculating possibilistic odds of survival in cancer patients. The results of this study can be applied in clinical research. Furthermore, there are few studies which applied the fuzzy logistic models. Furthermore, we recommend using this model in various research areas.
Photovoltaic Array Condition Monitoring Based on Online Regression of Performance Model
DEFF Research Database (Denmark)
Spataru, Sergiu; Sera, Dezso; Kerekes, Tamas
2013-01-01
regression modeling, from PV array production, plane-of-array irradiance, and module temperature measurements, acquired during an initial learning phase of the system. After the model has been parameterized automatically, the condition monitoring system enters the normal operation phase, where...
The use of logistic regression in modelling the distributions of bird ...
African Journals Online (AJOL)
The method of logistic regression was used to model the observed geographical distribution patterns of bird species in Swaziland in relation to a set of environmental variables. Reporting rates derived from bird atlas data are used as an index of population densities. This is justified in part by the success of the modelling ...
A LATENT CLASS POISSON REGRESSION-MODEL FOR HETEROGENEOUS COUNT DATA
WEDEL, M; DESARBO, WS; BULT, [No Value; RAMASWAMY, [No Value
1993-01-01
In this paper an approach is developed that accommodates heterogeneity in Poisson regression models for count data. The model developed assumes that heterogeneity arises from a distribution of both the intercept and the coefficients of the explanatory variables. We assume that the mixing
The limiting behavior of the estimated parameters in a misspecified random field regression model
DEFF Research Database (Denmark)
Dahl, Christian Møller; Qin, Yu
This paper examines the limiting properties of the estimated parameters in the random field regression model recently proposed by Hamilton (Econometrica, 2001). Though the model is parametric, it enjoys the flexibility of the nonparametric approach since it can approximate a large collection of n...
Tracking time-varying parameters with local regression
DEFF Research Database (Denmark)
Joensen, Alfred Karsten; Nielsen, Henrik Aalborg; Nielsen, Torben Skov
2000-01-01
This paper shows that the recursive least-squares (RLS) algorithm with forgetting factor is a special case of a varying-coe\\$cient model, and a model which can easily be estimated via simple local regression. This observation allows us to formulate a new method which retains the RLS algorithm, bu......, but extends the algorithm by including polynomial approximations. Simulation results are provided, which indicates that this new method is superior to the classical RLS method, if the parameter variations are smooth....
Deep ensemble learning of sparse regression models for brain disease diagnosis.
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2017-04-01
Recent studies on brain imaging analysis witnessed the core roles of machine learning techniques in computer-assisted intervention for brain disease diagnosis. Of various machine-learning techniques, sparse regression models have proved their effectiveness in handling high-dimensional data but with a small number of training samples, especially in medical problems. In the meantime, deep learning methods have been making great successes by outperforming the state-of-the-art performances in various applications. In this paper, we propose a novel framework that combines the two conceptually different methods of sparse regression and deep learning for Alzheimer's disease/mild cognitive impairment diagnosis and prognosis. Specifically, we first train multiple sparse regression models, each of which is trained with different values of a regularization control parameter. Thus, our multiple sparse regression models potentially select different feature subsets from the original feature set; thereby they have different powers to predict the response values, i.e., clinical label and clinical scores in our work. By regarding the response values from our sparse regression models as target-level representations, we then build a deep convolutional neural network for clinical decision making, which thus we call 'Deep Ensemble Sparse Regression Network.' To our best knowledge, this is the first work that combines sparse regression models with deep neural network. In our experiments with the ADNI cohort, we validated the effectiveness of the proposed method by achieving the highest diagnostic accuracies in three classification tasks. We also rigorously analyzed our results and compared with the previous studies on the ADNI cohort in the literature. Copyright © 2017 Elsevier B.V. All rights reserved.
Bias and Uncertainty in Regression-Calibrated Models of Groundwater Flow in Heterogeneous Media
DEFF Research Database (Denmark)
Cooley, R.L.; Christensen, Steen
2006-01-01
small. Model error is accounted for in the weighted nonlinear regression methodology developed to estimate θ* and assess model uncertainties by incorporating the second-moment matrix of the model errors into the weight matrix. Techniques developed by statisticians to analyze classical nonlinear...... are reduced in magnitude. Biases, correction factors, and confidence and prediction intervals were obtained for a test problem for which model error is large to test robustness of the methodology. Numerical results conform with the theoretical analysis....
Efficient computation of Laguerre polynomials
A. Gil (Amparo); J. Segura (Javier); N.M. Temme (Nico)
2017-01-01
textabstractAn efficient algorithm and a Fortran 90 module (LaguerrePol) for computing Laguerre polynomials . Ln(α)(z) are presented. The standard three-term recurrence relation satisfied by the polynomials and different types of asymptotic expansions valid for . n large and . α small, are used
Longitudinal beta regression models for analyzing health-related quality of life scores over time
Directory of Open Access Journals (Sweden)
Hunger Matthias
2012-09-01
Full Text Available Abstract Background Health-related quality of life (HRQL has become an increasingly important outcome parameter in clinical trials and epidemiological research. HRQL scores are typically bounded at both ends of the scale and often highly skewed. Several regression techniques have been proposed to model such data in cross-sectional studies, however, methods applicable in longitudinal research are less well researched. This study examined the use of beta regression models for analyzing longitudinal HRQL data using two empirical examples with distributional features typically encountered in practice. Methods We used SF-6D utility data from a German older age cohort study and stroke-specific HRQL data from a randomized controlled trial. We described the conceptual differences between mixed and marginal beta regression models and compared both models to the commonly used linear mixed model in terms of overall fit and predictive accuracy. Results At any measurement time, the beta distribution fitted the SF-6D utility data and stroke-specific HRQL data better than the normal distribution. The mixed beta model showed better likelihood-based fit statistics than the linear mixed model and respected the boundedness of the outcome variable. However, it tended to underestimate the true mean at the upper part of the distribution. Adjusted group means from marginal beta model and linear mixed model were nearly identical but differences could be observed with respect to standard errors. Conclusions Understanding the conceptual differences between mixed and marginal beta regression models is important for their proper use in the analysis of longitudinal HRQL data. Beta regression fits the typical distribution of HRQL data better than linear mixed models, however, if focus is on estimating group mean scores rather than making individual predictions, the two methods might not differ substantially.
Madarang, Krish J; Kang, Joo-Hyon
2014-06-01
Stormwater runoff has been identified as a source of pollution for the environment, especially for receiving waters. In order to quantify and manage the impacts of stormwater runoff on the environment, predictive models and mathematical models have been developed. Predictive tools such as regression models have been widely used to predict stormwater discharge characteristics. Storm event characteristics, such as antecedent dry days (ADD), have been related to response variables, such as pollutant loads and concentrations. However it has been a controversial issue among many studies to consider ADD as an important variable in predicting stormwater discharge characteristics. In this study, we examined the accuracy of general linear regression models in predicting discharge characteristics of roadway runoff. A total of 17 storm events were monitored in two highway segments, located in Gwangju, Korea. Data from the monitoring were used to calibrate United States Environmental Protection Agency's Storm Water Management Model (SWMM). The calibrated SWMM was simulated for 55 storm events, and the results of total suspended solid (TSS) discharge loads and event mean concentrations (EMC) were extracted. From these data, linear regression models were developed. R(2) and p-values of the regression of ADD for both TSS loads and EMCs were investigated. Results showed that pollutant loads were better predicted than pollutant EMC in the multiple regression models. Regression may not provide the true effect of site-specific characteristics, due to uncertainty in the data. Copyright © 2014 The Research Centre for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V. All rights reserved.
Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.
Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H
2016-01-01
Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite better performance of disease risk score methods than logistic regression and propensity score models in small events per coefficient settings, bias, and coverage still deviated from nominal.
Caimmi, R.
2011-08-01
Concerning bivariate least squares linear regression, the classical approach pursued for functional models in earlier attempts ( York, 1966, 1969) is reviewed using a new formalism in terms of deviation (matrix) traces which, for unweighted data, reduce to usual quantities leaving aside an unessential (but dimensional) multiplicative factor. Within the framework of classical error models, the dependent variable relates to the independent variable according to the usual additive model. The classes of linear models considered are regression lines in the general case of correlated errors in X and in Y for weighted data, and in the opposite limiting situations of (i) uncorrelated errors in X and in Y, and (ii) completely correlated errors in X and in Y. The special case of (C) generalized orthogonal regression is considered in detail together with well known subcases, namely: (Y) errors in X negligible (ideally null) with respect to errors in Y; (X) errors in Y negligible (ideally null) with respect to errors in X; (O) genuine orthogonal regression; (R) reduced major-axis regression. In the limit of unweighted data, the results determined for functional models are compared with their counterparts related to extreme structural models i.e. the instrumental scatter is negligible (ideally null) with respect to the intrinsic scatter ( Isobe et al., 1990; Feigelson and Babu, 1992). While regression line slope and intercept estimators for functional and structural models necessarily coincide, the contrary holds for related variance estimators even if the residuals obey a Gaussian distribution, with the exception of Y models. An example of astronomical application is considered, concerning the [O/H]-[Fe/H] empirical relations deduced from five samples related to different stars and/or different methods of oxygen abundance determination. For selected samples and assigned methods, different regression models yield consistent results within the errors (∓ σ) for both
Shi, Jinfei; Zhu, Songqing; Chen, Ruwen
2017-12-01
An order selection method based on multiple stepwise regressions is proposed for General Expression of Nonlinear Autoregressive model which converts the model order problem into the variable selection of multiple linear regression equation. The partial autocorrelation function is adopted to define the linear term in GNAR model. The result is set as the initial model, and then the nonlinear terms are introduced gradually. Statistics are chosen to study the improvements of both the new introduced and originally existed variables for the model characteristics, which are adopted to determine the model variables to retain or eliminate. So the optimal model is obtained through data fitting effect measurement or significance test. The simulation and classic time-series data experiment results show that the method proposed is simple, reliable and can be applied to practical engineering.
Construction of risk prediction model of type 2 diabetes mellitus based on logistic regression
Directory of Open Access Journals (Sweden)
Li Jian
2017-01-01
Full Text Available Objective: to construct multi factor prediction model for the individual risk of T2DM, and to explore new ideas for early warning, prevention and personalized health services for T2DM. Methods: using logistic regression techniques to screen the risk factors for T2DM and construct the risk prediction model of T2DM. Results: Male’s risk prediction model logistic regression equation: logit(P=BMI × 0.735+ vegetables × (−0.671 + age × 0.838+ diastolic pressure × 0.296+ physical activity× (−2.287 + sleep ×(−0.009 +smoking ×0.214; Female’s risk prediction model logistic regression equation: logit(P=BMI ×1.979+ vegetables× (−0.292 + age × 1.355+ diastolic pressure× 0.522+ physical activity × (−2.287 + sleep × (−0.010.The area under the ROC curve of male was 0.83, the sensitivity was 0.72, the specificity was 0.86, the area under the ROC curve of female was 0.84, the sensitivity was 0.75, the specificity was 0.90. Conclusion: This study model data is from a compared study of nested case, the risk prediction model has been established by using the more mature logistic regression techniques, and the model is higher predictive sensitivity, specificity and stability.
Buonaccorsi, John P; Romeo, Giovanni; Thoresen, Magne
2018-03-01
When fitting regression models, measurement error in any of the predictors typically leads to biased coefficients and incorrect inferences. A plethora of methods have been proposed to correct for this. Obtaining standard errors and confidence intervals using the corrected estimators can be challenging and, in addition, there is concern about remaining bias in the corrected estimators. The bootstrap, which is one option to address these problems, has received limited attention in this context. It has usually been employed by simply resampling observations, which, while suitable in some situations, is not always formally justified. In addition, the simple bootstrap does not allow for estimating bias in non-linear models, including logistic regression. Model-based bootstrapping, which can potentially estimate bias in addition to being robust to the original sampling or whether the measurement error variance is constant or not, has received limited attention. However, it faces challenges that are not present in handling regression models with no measurement error. This article develops new methods for model-based bootstrapping when correcting for measurement error in logistic regression with replicate measures. The methodology is illustrated using two examples, and a series of simulations are carried out to assess and compare the simple and model-based bootstrap methods, as well as other standard methods. While not always perfect, the model-based approaches offer some distinct improvements over the other methods. © 2017, The International Biometric Society.
Multiple regression models for energy use in air-conditioned office buildings in different climates
International Nuclear Information System (INIS)
Lam, Joseph C.; Wan, Kevin K.W.; Liu Dalong; Tsang, C.L.
2010-01-01
An attempt was made to develop multiple regression models for office buildings in the five major climates in China - severe cold, cold, hot summer and cold winter, mild, and hot summer and warm winter. A total of 12 key building design variables were identified through parametric and sensitivity analysis, and considered as inputs in the regression models. The coefficient of determination R 2 varies from 0.89 in Harbin to 0.97 in Kunming, indicating that 89-97% of the variations in annual building energy use can be explained by the changes in the 12 parameters. A pseudo-random number generator based on three simple multiplicative congruential generators was employed to generate random designs for evaluation of the regression models. The difference between regression-predicted and DOE-simulated annual building energy use are largely within 10%. It is envisaged that the regression models developed can be used to estimate the likely energy savings/penalty during the initial design stage when different building schemes and design concepts are being considered.
Testing and Modeling Fuel Regression Rate in a Miniature Hybrid Burner
Directory of Open Access Journals (Sweden)
Luciano Fanton
2012-01-01
Full Text Available Ballistic characterization of an extended group of innovative HTPB-based solid fuel formulations for hybrid rocket propulsion was performed in a lab-scale burner. An optical time-resolved technique was used to assess the quasisteady regression history of single perforation, cylindrical samples. The effects of metalized additives and radiant heat transfer on the regression rate of such formulations were assessed. Under the investigated operating conditions and based on phenomenological models from the literature, analyses of the collected experimental data show an appreciable influence of the radiant heat flux from burnt gases and soot for both unloaded and loaded fuel formulations. Pure HTPB regression rate data are satisfactorily reproduced, while the impressive initial regression rates of metalized formulations require further assessment.
Regression analysis of informative current status data with the additive hazards model.
Zhao, Shishun; Hu, Tao; Ma, Ling; Wang, Peijie; Sun, Jianguo
2015-04-01
This paper discusses regression analysis of current status failure time data arising from the additive hazards model in the presence of informative censoring. Many methods have been developed for regression analysis of current status data under various regression models if the censoring is noninformative, and also there exists a large literature on parametric analysis of informative current status data in the context of tumorgenicity experiments. In this paper, a semiparametric maximum likelihood estimation procedure is presented and in the method, the copula model is employed to describe the relationship between the failure time of interest and the censoring time. Furthermore, I-splines are used to approximate the nonparametric functions involved and the asymptotic consistency and normality of the proposed estimators are established. A simulation study is conducted and indicates that the proposed approach works well for practical situations. An illustrative example is also provided.
LINEAR REGRESSION MODEL ESTİMATİON FOR RIGHT CENSORED DATA
Directory of Open Access Journals (Sweden)
Ersin Yılmaz
2016-05-01
Full Text Available In this study, firstly we will define a right censored data. If we say shortly right-censored data is censoring values that above the exact line. This may be related with scaling device. And then we will use response variable acquainted from right-censored explanatory variables. Then the linear regression model will be estimated. For censored data’s existence, Kaplan-Meier weights will be used for the estimation of the model. With the weights regression model will be consistent and unbiased with that. And also there is a method for the censored data that is a semi parametric regression and this method also give useful results for censored data too. This study also might be useful for the health studies because of the censored data used in medical issues generally.
Li, Tao
2018-06-01
The complexity of aluminum electrolysis process leads the temperature for aluminum reduction cells hard to measure directly. However, temperature is the control center of aluminum production. To solve this problem, combining some aluminum plant's practice data, this paper presents a Soft-sensing model of temperature for aluminum electrolysis process on Improved Twin Support Vector Regression (ITSVR). ITSVR eliminates the slow learning speed of Support Vector Regression (SVR) and the over-fit risk of Twin Support Vector Regression (TSVR) by introducing a regularization term into the objective function of TSVR, which ensures the structural risk minimization principle and lower computational complexity. Finally, the model with some other parameters as auxiliary variable, predicts the temperature by ITSVR. The simulation result shows Soft-sensing model based on ITSVR has short time-consuming and better generalization.
Link polynomial, crossing multiplier and surgery formula
International Nuclear Information System (INIS)
Deguchi, Tetsuo; Yamada, Yasuhiko.
1989-01-01
Relations between link polynomials constructed from exactly solvable lattice models and topological field theory are reviewed. It is found that the surgery formula for a three-sphere S 3 with Wilson lines corresponds to the Markov trace constructed from the exactly solvable models. This indicates that knot theory intimately relates various important subjects such as exactly solvable models, conformal field theories and topological quantum field theories. (author)
Polynomial estimation of the smoothing splines for the new Finnish reference values for spirometry.
Kainu, Annette; Timonen, Kirsi
2016-07-01
Background Discontinuity of spirometry reference values from childhood into adulthood has been a problem with traditional reference values, thus modern modelling approaches using smoothing spline functions to better depict the transition during growth and ageing have been recently introduced. Following the publication of the new international Global Lung Initiative (GLI2012) reference values also new national Finnish reference values have been calculated using similar GAMLSS-modelling, with spline estimates for mean (Mspline) and standard deviation (Sspline) provided in tables. The aim of this study was to produce polynomial estimates for these spline functions to use in lieu of lookup tables and to assess their validity in the reference population of healthy non-smokers. Methods Linear regression modelling was used to approximate the estimated values for Mspline and Sspline using similar polynomial functions as in the international GLI2012 reference values. Estimated values were compared to original calculations in absolute values, the derived predicted mean and individually calculated z-scores using both values. Results Polynomial functions were estimated for all 10 spirometry variables. The agreement between original lookup table-produced values and polynomial estimates was very good, with no significant differences found. The variation slightly increased in larger predicted volumes, but a range of -0.018 to +0.022 litres of FEV1 representing ± 0.4% of maximum difference in predicted mean. Conclusions Polynomial approximations were very close to the original lookup tables and are recommended for use in clinical practice to facilitate the use of new reference values.
Combination of supervised and semi-supervised regression models for improved unbiased estimation
DEFF Research Database (Denmark)
Arenas-Garía, Jeronimo; Moriana-Varo, Carlos; Larsen, Jan
2010-01-01
In this paper we investigate the steady-state performance of semisupervised regression models adjusted using a modified RLS-like algorithm, identifying the situations where the new algorithm is expected to outperform standard RLS. By using an adaptive combination of the supervised and semisupervi......In this paper we investigate the steady-state performance of semisupervised regression models adjusted using a modified RLS-like algorithm, identifying the situations where the new algorithm is expected to outperform standard RLS. By using an adaptive combination of the supervised...