Data-Driven Jump Detection Thresholds for Application in Jump Regressions
Directory of Open Access Journals (Sweden)
Robert Davies
2018-03-01
Full Text Available This paper develops a method to select the threshold in threshold-based jump detection methods. The method is motivated by an analysis of threshold-based jump detection methods in the context of jump-diffusion models. We show that over the range of sampling frequencies a researcher is most likely to encounter that the usual in-fill asymptotics provide a poor guide for selecting the jump threshold. Because of this we develop a sample-based method. Our method estimates the number of jumps over a grid of thresholds and selects the optimal threshold at what we term the ‘take-off’ point in the estimated number of jumps. We show that this method consistently estimates the jumps and their indices as the sampling interval goes to zero. In several Monte Carlo studies we evaluate the performance of our method based on its ability to accurately locate jumps and its ability to distinguish between true jumps and large diffusive moves. In one of these Monte Carlo studies we evaluate the performance of our method in a jump regression context. Finally, we apply our method in two empirical studies. In one we estimate the number of jumps and report the jump threshold our method selects for three commonly used market indices. In the other empirical application we perform a series of jump regressions using our method to select the jump threshold.
Regression Discontinuity Designs Based on Population Thresholds
DEFF Research Database (Denmark)
Eggers, Andrew C.; Freier, Ronny; Grembi, Veronica
In many countries, important features of municipal government (such as the electoral system, mayors' salaries, and the number of councillors) depend on whether the municipality is above or below arbitrary population thresholds. Several papers have used a regression discontinuity design (RDD...
Directory of Open Access Journals (Sweden)
Alaa Alaabed
2016-06-01
Full Text Available The purpose of this paper is to test the growing converging views regarding the destabilizing and growth-halting impact of interest-based debt financial system. The views are as advocated by the followers of Keynes and Hyman Minsky and those of Islam. Islam discourages interest rate based debt financing as it considers it not very conducive to productive activities and human solidarity. Likewise, since the onset of the crisis of 2007/2008, calls by skeptics of mainstream capitalism have been renewed. The paper applies a threshold regression model to Malaysian data and finds that the relationship between growth and financial development is non-linear. A threshold is estimated, after which credit expansion negatively impacts GDP growth. While the post-threshold negative relationship is found to be statistically significant, the estimated positive relationship at lower levels of financial development is insignificant. The findings provide support to the above views and are hoped to guide monetary authorities to better growth-promoting policy-making.
Thresholds and Smooth Transitions in Vector Autoregressive Models
DEFF Research Database (Denmark)
Hubrich, Kirstin; Teräsvirta, Timo
This survey focuses on two families of nonlinear vector time series models, the family of Vector Threshold Regression models and that of Vector Smooth Transition Regression models. These two model classes contain incomplete models in the sense that strongly exogeneous variables are allowed in the...
Thresholding projection estimators in functional linear models
Cardot, Hervé; Johannes, Jan
2010-01-01
We consider the problem of estimating the regression function in functional linear regression models by proposing a new type of projection estimators which combine dimension reduction and thresholding. The introduction of a threshold rule allows to get consistency under broad assumptions as well as minimax rates of convergence under additional regularity hypotheses. We also consider the particular case of Sobolev spaces generated by the trigonometric basis which permits to get easily mean squ...
Santos-Concejero, Jordan; Tucker, Ross; Granados, Cristina; Irazusta, Jon; Bidaurrazaga-Letona, Iraia; Zabala-Lili, Jon; Gil, Susana María
2014-01-01
This study investigated the influence of the regression model and initial intensity during an incremental test on the relationship between the lactate threshold estimated by the maximal-deviation method and performance in elite-standard runners. Twenty-three well-trained runners completed a discontinuous incremental running test on a treadmill. Speed started at 9 km · h(-1) and increased by 1.5 km · h(-1) every 4 min until exhaustion, with a minute of recovery for blood collection. Lactate-speed data were fitted by exponential and polynomial models. The lactate threshold was determined for both models, using all the co-ordinates, excluding the first and excluding the first and second points. The exponential lactate threshold was greater than the polynomial equivalent in any co-ordinate condition (P performance and is independent of the initial intensity of the test.
Machado, Fabiana Andrade; Nakamura, Fábio Yuzo; Moraes, Solange Marta Franzói De
2012-01-01
This study examined the influence of the regression model and initial intensity of an incremental test on the relationship between the lactate threshold estimated by the maximal-deviation method and the endurance performance. Sixteen non-competitive, recreational female runners performed a discontinuous incremental treadmill test. The initial speed was set at 7 km · h⁻¹, and increased every 3 min by 1 km · h⁻¹ with a 30-s rest between the stages used for earlobe capillary blood sample collection. Lactate-speed data were fitted by an exponential-plus-constant and a third-order polynomial equation. The lactate threshold was determined for both regression equations, using all the coordinates, excluding the first and excluding the first and second initial points. Mean speed of a 10-km road race was the performance index (3.04 ± 0.22 m · s⁻¹). The exponentially-derived lactate threshold had a higher correlation (0.98 ≤ r ≤ 0.99) and smaller standard error of estimate (SEE) (0.04 ≤ SEE ≤ 0.05 m · s⁻¹) with performance than the polynomially-derived equivalent (0.83 ≤ r ≤ 0.89; 0.10 ≤ SEE ≤ 0.13 m · s⁻¹). The exponential lactate threshold was greater than the polynomial equivalent (P performance index that is independent of the initial intensity of the incremental test and better than the polynomial equivalent.
Comparison of Classical and Robust Estimates of Threshold Auto-regression Parameters
Directory of Open Access Journals (Sweden)
V. B. Goryainov
2017-01-01
Full Text Available The study object is the first-order threshold auto-regression model with a single zero-located threshold. The model describes a stochastic temporal series with discrete time by means of a piecewise linear equation consisting of two linear classical first-order autoregressive equations. One of these equations is used to calculate a running value of the temporal series. A control variable that determines the choice between these two equations is the sign of the previous value of the same series.The first-order threshold autoregressive model with a single threshold depends on two real parameters that coincide with the coefficients of the piecewise linear threshold equation. These parameters are assumed to be unknown. The paper studies an estimate of the least squares, an estimate the least modules, and the M-estimates of these parameters. The aim of the paper is a comparative study of the accuracy of these estimates for the main probabilistic distributions of the updating process of the threshold autoregressive equation. These probability distributions were normal, contaminated normal, logistic, double-exponential distributions, a Student's distribution with different number of degrees of freedom, and a Cauchy distribution.As a measure of the accuracy of each estimate, was chosen its variance to measure the scattering of the estimate around the estimated parameter. An estimate with smaller variance made from the two estimates was considered to be the best. The variance was estimated by computer simulation. To estimate the smallest modules an iterative weighted least-squares method was used and the M-estimates were done by the method of a deformable polyhedron (the Nelder-Mead method. To calculate the least squares estimate, an explicit analytic expression was used.It turned out that the estimation of least squares is best only with the normal distribution of the updating process. For the logistic distribution and the Student's distribution with the
The R Package threg to Implement Threshold Regression Models
Directory of Open Access Journals (Sweden)
Tao Xiao
2015-08-01
This new package includes four functions: threg, and the methods hr, predict and plot for threg objects returned by threg. The threg function is the model-fitting function which is used to calculate regression coefficient estimates, asymptotic standard errors and p values. The hr method for threg objects is the hazard-ratio calculation function which provides the estimates of hazard ratios at selected time points for specified scenarios (based on given categories or value settings of covariates. The predict method for threg objects is used for prediction. And the plot method for threg objects provides plots for curves of estimated hazard functions, survival functions and probability density functions of the first-hitting-time; function curves corresponding to different scenarios can be overlaid in the same plot for comparison to give additional research insights.
International Nuclear Information System (INIS)
Huang, B.-N.; Hwang, M.J.; Yang, C.W.
2008-01-01
This paper separates data extending from 1971 to 2002 into the energy crisis period (1971-1980) and the post-energy crisis period (1981-2000) for 82 countries. The cross-sectional data (yearly averages) in these two periods are used to investigate the nonlinear relationships between energy consumption growth and economic growth when threshold variables are used. If threshold variables are higher than certain optimal threshold levels, there is either no significant relationship or else a significant negative relationship between energy consumption and economic growth. However, when these threshold variables are lower than certain optimal levels, there is a significant positive relationship between the two. In 48 out of the 82 countries studied, none of the four threshold variables is found to be higher than the optimal levels. It is inferred that these 48 countries should adopt a more aggressive energy policy. As for the other 34 countries, at least one threshold variable is higher than the optimal threshold level and thus these countries should adopt energy policies with varying degrees of conservation based on the number of threshold variables that are higher than the optimal threshold levels
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study gives attention to indicators in predictive power of the Generalized Linear Model (GLM) which are widely used; however, often having some restrictions. We are interested in regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, and defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was modifying regression correlation coefficient for Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and having multicollinearity in independent variables. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient based on Bias and the Root Mean Square Error (RMSE).
Modeling DPOAE input/output function compression: comparisons with hearing thresholds.
Bhagat, Shaum P
2014-09-01
Basilar membrane input/output (I/O) functions in mammalian animal models are characterized by linear and compressed segments when measured near the location corresponding to the characteristic frequency. A method of studying basilar membrane compression indirectly in humans involves measuring distortion-product otoacoustic emission (DPOAE) I/O functions. Previous research has linked compression estimates from behavioral growth-of-masking functions to hearing thresholds. The aim of this study was to compare compression estimates from DPOAE I/O functions and hearing thresholds at 1 and 2 kHz. A prospective correlational research design was performed. The relationship between DPOAE I/O function compression estimates and hearing thresholds was evaluated with Pearson product-moment correlations. Normal-hearing adults (n = 16) aged 22-42 yr were recruited. DPOAE I/O functions (L₂ = 45-70 dB SPL) and two-interval forced-choice hearing thresholds were measured in normal-hearing adults. A three-segment linear regression model applied to DPOAE I/O functions supplied estimates of compression thresholds, defined as breakpoints between linear and compressed segments and the slopes of the compressed segments. Pearson product-moment correlations between DPOAE compression estimates and hearing thresholds were evaluated. A high correlation between DPOAE compression thresholds and hearing thresholds was observed at 2 kHz, but not at 1 kHz. Compression slopes also correlated highly with hearing thresholds only at 2 kHz. The derivation of cochlear compression estimates from DPOAE I/O functions provides a means to characterize basilar membrane mechanics in humans and elucidates the role of compression in tone detection in the 1-2 kHz frequency range. American Academy of Audiology.
Susan L. King
2003-01-01
The performance of two classifiers, logistic regression and neural networks, are compared for modeling noncatastrophic individual tree mortality for 21 species of trees in West Virginia. The output of the classifier is usually a continuous number between 0 and 1. A threshold is selected between 0 and 1 and all of the trees below the threshold are classified as...
Linear regression metamodeling as a tool to summarize and present simulation model results.
Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M
2013-10-01
Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
Statistical analysis of sediment toxicity by additive monotone regression splines
Boer, de W.J.; Besten, den P.J.; Braak, ter C.J.F.
2002-01-01
Modeling nonlinearity and thresholds in dose-effect relations is a major challenge, particularly in noisy data sets. Here we show the utility of nonlinear regression with additive monotone regression splines. These splines lead almost automatically to the estimation of thresholds. We applied this
Morales, Esteban; de Leon, John Mark S; Abdollahi, Niloufar; Yu, Fei; Nouri-Mahdavi, Kouros; Caprioli, Joseph
2016-03-01
The study was conducted to evaluate threshold smoothing algorithms to enhance prediction of the rates of visual field (VF) worsening in glaucoma. We studied 798 patients with primary open-angle glaucoma and 6 or more years of follow-up who underwent 8 or more VF examinations. Thresholds at each VF location for the first 4 years or first half of the follow-up time (whichever was greater) were smoothed with clusters defined by the nearest neighbor (NN), Garway-Heath, Glaucoma Hemifield Test (GHT), and weighting by the correlation of rates at all other VF locations. Thresholds were regressed with a pointwise exponential regression (PER) model and a pointwise linear regression (PLR) model. Smaller root mean square error (RMSE) values of the differences between the observed and the predicted thresholds at last two follow-ups indicated better model predictions. The mean (SD) follow-up times for the smoothing and prediction phase were 5.3 (1.5) and 10.5 (3.9) years. The mean RMSE values for the PER and PLR models were unsmoothed data, 6.09 and 6.55; NN, 3.40 and 3.42; Garway-Heath, 3.47 and 3.48; GHT, 3.57 and 3.74; and correlation of rates, 3.59 and 3.64. Smoothed VF data predicted better than unsmoothed data. Nearest neighbor provided the best predictions; PER also predicted consistently more accurately than PLR. Smoothing algorithms should be used when forecasting VF results with PER or PLR. The application of smoothing algorithms on VF data can improve forecasting in VF points to assist in treatment decisions.
When do price thresholds matter in retail categories?
Pauwels, Koen; Srinivasan, Shuba; Franses, Philip Hans
2007-01-01
textabstractMarketing literature has long recognized that brand price elasticity need not be monotonic and symmetric, but has yet to provide generalizable market-level insights on threshold-based price elasticity, asymmetric thresholds, and the sign and magnitude of elasticity transitions. This paper introduces smooth transition regression models to study threshold-based price elasticity of the top 4 brands across 20 fast-moving consumer good categories. Threshold-based price elasticity is fo...
Joe-Ming Lee
2013-01-01
This study applies by the panel transition regression (PSTR) model to investigate the nonlinear dynamic relationship between equity fund flow and investment volatility in Taiwan. Our empirical results show that the equity fund managers will be different business strategy under the volatility threshold value and the control variables of asset of funds, management fee and Turnover indicator. After the financial crisis, the threshold of volatility will be an important index to different business...
When Do Price Thresholds Matter in Retail Categories?
Koen Pauwels; Shuba Srinivasan; Philip Hans Franses
2007-01-01
Marketing literature has long recognized that brand price elasticity need not be monotonic and symmetric, but has yet to provide generalizable market-level insights on threshold-based price elasticity, asymmetric thresholds, and the sign and magnitude of elasticity transitions. This paper introduces smooth transition regression models to study threshold-based price elasticity of the top 4 brands across 20 fast-moving consumer good categories. Threshold-based price elasticity is found for 76% ...
Model Threshold untuk Pembelajaran Memproduksi Pantun Kelas XI
Directory of Open Access Journals (Sweden)
Fitri Nura Murti
2017-03-01
Full Text Available Abstract: The learning pantun method in schools provided less opportunity to develop the students’ creativity in producing pantun. This situation was supported by the result of the observation conducted on eleventh graders at SMAN 2 Bondowoso. It showed that the students tend to plagiarize their pantun. The general objective of this research and development is to develop Threshold Pantun model for learning to produce pantun for elevent graders. The product was presented in guidance book for teachers entitled “Pembelajaran Memproduksi Pantun Menggunakan Model Threshold Pantun untuk Kelas XI”. This study adapted design method of Borg-Gall’s R&D procedure. The result of this study showed that Threshold Pantun model was appropriate to be implemented for learning to produce pantun. Key Words: Threshold Pantun model, produce pantun Abstrak: Pembelajaran pantun di sekolah selama ini kurang mengembangkan kreativitas siswa dalam memproduksi pantun. Hal tersebut dikuatkan oleh hasil observasi siswa kelas XI SMAN 2 Bondowoso yang menunjukkan adanya kecenderungan produk siswa bersifat plagiat. Tujuan penelitian dan pengembangan ini secara umum adalah mengembangkan model Threshold Pantun untuk pembelajaran memproduksi pantun kelas XI..Produk disajikan dalam bentuk buku panduan bagi guru dengan judul “Pembelajaran Memproduksi Pantun Menggunakan Model Threshold Pantun untuk Kelas XI”. Penelitian ini menggunakan rancangan penelitian yang diadaptasi dari prosedur penelitian dan pengembangan Borg dan Gall. Berdasarkan hasil validasi model Threshold Pantun untuk pembelajaran memproduksi pantun layak diimplementasikan. Kata kunci: model Threshold Pantun, memproduksi pantun
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Satellite rainfall retrieval by logistic regression
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors.The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistical model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variant can be estimated. A logistical model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistical model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistical model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
Regression Models for Market-Shares
DEFF Research Database (Denmark)
Birch, Kristina; Olsen, Jørgen Kai; Tjur, Tue
2005-01-01
On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put on the interpretat......On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put...... on the interpretation of the parameters in relation to models for the total sales based on discrete choice models.Key words and phrases. MCI model, discrete choice model, market-shares, price elasitcity, regression model....
Yeh, Chun-Yuan; Schafferer, Christian; Lee, Jie-Min; Ho, Li-Ming; Hsieh, Chi-Jung
2017-09-21
European Union public healthcare expenditure on treating smoking and attributable diseases is estimated at over €25bn annually. The reduction of tobacco consumption has thus become one of the major social policies of the EU. This study investigates the effects of price hikes on cigarette consumption, tobacco tax revenues and smoking-caused deaths in 28 EU countries. Employing panel data for the years 2005 to 2014 from Euromonitor International, the World Bank and the World Health Organization, we used income as a threshold variable and applied threshold regression modelling to estimate the elasticity of cigarette prices and to simulate the effect of price fluctuations. The results showed that there was an income threshold effect on cigarette prices in the 28 EU countries that had a gross national income (GNI) per capita lower than US$5418, with a maximum cigarette price elasticity of -1.227. The results of the simulated analysis showed that a rise of 10% in cigarette price would significantly reduce cigarette consumption as well the total death toll caused by smoking in all the observed countries, but would be most effective in Bulgaria and Romania, followed by Latvia and Poland. Additionally, an increase in the number of MPOWER tobacco control policies at the highest level of achievment would help reduce cigarette consumption. It is recommended that all EU countries levy higher tobacco taxes to increase cigarette prices, and thus in effect reduce cigarette consumption. The subsequent increase in tobacco tax revenues would be instrumental in covering expenditures related to tobacco prevention and control programs.
Estimating the exceedance probability of rain rate by logistic regression
Chiu, Long S.; Kedem, Benjamin
1990-01-01
Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.
Uncovering state-dependent relationships in shallow lakes using Bayesian latent variable regression.
Vitense, Kelsey; Hanson, Mark A; Herwig, Brian R; Zimmer, Kyle D; Fieberg, John
2018-03-01
Ecosystems sometimes undergo dramatic shifts between contrasting regimes. Shallow lakes, for instance, can transition between two alternative stable states: a clear state dominated by submerged aquatic vegetation and a turbid state dominated by phytoplankton. Theoretical models suggest that critical nutrient thresholds differentiate three lake types: highly resilient clear lakes, lakes that may switch between clear and turbid states following perturbations, and highly resilient turbid lakes. For effective and efficient management of shallow lakes and other systems, managers need tools to identify critical thresholds and state-dependent relationships between driving variables and key system features. Using shallow lakes as a model system for which alternative stable states have been demonstrated, we developed an integrated framework using Bayesian latent variable regression (BLR) to classify lake states, identify critical total phosphorus (TP) thresholds, and estimate steady state relationships between TP and chlorophyll a (chl a) using cross-sectional data. We evaluated the method using data simulated from a stochastic differential equation model and compared its performance to k-means clustering with regression (KMR). We also applied the framework to data comprising 130 shallow lakes. For simulated data sets, BLR had high state classification rates (median/mean accuracy >97%) and accurately estimated TP thresholds and state-dependent TP-chl a relationships. Classification and estimation improved with increasing sample size and decreasing noise levels. Compared to KMR, BLR had higher classification rates and better approximated the TP-chl a steady state relationships and TP thresholds. We fit the BLR model to three different years of empirical shallow lake data, and managers can use the estimated bifurcation diagrams to prioritize lakes for management according to their proximity to thresholds and chance of successful rehabilitation. Our model improves upon
A Seemingly Unrelated Poisson Regression Model
King, Gary
1989-01-01
This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.
A non-parametric framework for estimating threshold limit values
Directory of Open Access Journals (Sweden)
Ulm Kurt
2005-11-01
Full Text Available Abstract Background To estimate a threshold limit value for a compound known to have harmful health effects, an 'elbow' threshold model is usually applied. We are interested on non-parametric flexible alternatives. Methods We describe how a step function model fitted by isotonic regression can be used to estimate threshold limit values. This method returns a set of candidate locations, and we discuss two algorithms to select the threshold among them: the reduced isotonic regression and an algorithm considering the closed family of hypotheses. We assess the performance of these two alternative approaches under different scenarios in a simulation study. We illustrate the framework by analysing the data from a study conducted by the German Research Foundation aiming to set a threshold limit value in the exposure to total dust at workplace, as a causal agent for developing chronic bronchitis. Results In the paper we demonstrate the use and the properties of the proposed methodology along with the results from an application. The method appears to detect the threshold with satisfactory success. However, its performance can be compromised by the low power to reject the constant risk assumption when the true dose-response relationship is weak. Conclusion The estimation of thresholds based on isotonic framework is conceptually simple and sufficiently powerful. Given that in threshold value estimation context there is not a gold standard method, the proposed model provides a useful non-parametric alternative to the standard approaches and can corroborate or challenge their findings.
Panel Smooth Transition Regression Models
DEFF Research Database (Denmark)
González, Andrés; Terasvirta, Timo; Dijk, Dick van
We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...
Estimating traffic volume on Wyoming low volume roads using linear and logistic regression methods
Directory of Open Access Journals (Sweden)
Dick Apronti
2016-12-01
Full Text Available Traffic volume is an important parameter in most transportation planning applications. Low volume roads make up about 69% of road miles in the United States. Estimating traffic on the low volume roads is a cost-effective alternative to taking traffic counts. This is because traditional traffic counts are expensive and impractical for low priority roads. The purpose of this paper is to present the development of two alternative means of cost-effectively estimating traffic volumes for low volume roads in Wyoming and to make recommendations for their implementation. The study methodology involves reviewing existing studies, identifying data sources, and carrying out the model development. The utility of the models developed were then verified by comparing actual traffic volumes to those predicted by the model. The study resulted in two regression models that are inexpensive and easy to implement. The first regression model was a linear regression model that utilized pavement type, access to highways, predominant land use types, and population to estimate traffic volume. In verifying the model, an R2 value of 0.64 and a root mean square error of 73.4% were obtained. The second model was a logistic regression model that identified the level of traffic on roads using five thresholds or levels. The logistic regression model was verified by estimating traffic volume thresholds and determining the percentage of roads that were accurately classified as belonging to the given thresholds. For the five thresholds, the percentage of roads classified correctly ranged from 79% to 88%. In conclusion, the verification of the models indicated both model types to be useful for accurate and cost-effective estimation of traffic volumes for low volume Wyoming roads. The models developed were recommended for use in traffic volume estimations for low volume roads in pavement management and environmental impact assessment studies.
Threshold model of cascades in empirical temporal networks
Karimi, Fariba; Holme, Petter
2013-08-01
Threshold models try to explain the consequences of social influence like the spread of fads and opinions. Along with models of epidemics, they constitute a major theoretical framework of social spreading processes. In threshold models on static networks, an individual changes her state if a certain fraction of her neighbors has done the same. When there are strong correlations in the temporal aspects of contact patterns, it is useful to represent the system as a temporal network. In such a system, not only contacts but also the time of the contacts are represented explicitly. In many cases, bursty temporal patterns slow down disease spreading. However, as we will see, this is not a universal truth for threshold models. In this work we propose an extension of Watts’s classic threshold model to temporal networks. We do this by assuming that an agent is influenced by contacts which lie a certain time into the past. I.e., the individuals are affected by contacts within a time window. In addition to thresholds in the fraction of contacts, we also investigate the number of contacts within the time window as a basis for influence. To elucidate the model’s behavior, we run the model on real and randomized empirical contact datasets.
Directory of Open Access Journals (Sweden)
Chun-Yuan Yeh
2017-09-01
Full Text Available Abstract Background European Union public healthcare expenditure on treating smoking and attributable diseases is estimated at over €25bn annually. The reduction of tobacco consumption has thus become one of the major social policies of the EU. This study investigates the effects of price hikes on cigarette consumption, tobacco tax revenues and smoking-caused deaths in 28 EU countries. Methods Employing panel data for the years 2005 to 2014 from Euromonitor International, the World Bank and the World Health Organization, we used income as a threshold variable and applied threshold regression modelling to estimate the elasticity of cigarette prices and to simulate the effect of price fluctuations. Results The results showed that there was an income threshold effect on cigarette prices in the 28 EU countries that had a gross national income (GNI per capita lower than US$5418, with a maximum cigarette price elasticity of −1.227. The results of the simulated analysis showed that a rise of 10% in cigarette price would significantly reduce cigarette consumption as well the total death toll caused by smoking in all the observed countries, but would be most effective in Bulgaria and Romania, followed by Latvia and Poland. Additionally, an increase in the number of MPOWER tobacco control policies at the highest level of achievment would help reduce cigarette consumption. Conclusions It is recommended that all EU countries levy higher tobacco taxes to increase cigarette prices, and thus in effect reduce cigarette consumption. The subsequent increase in tobacco tax revenues would be instrumental in covering expenditures related to tobacco prevention and control programs.
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Poisson Mixture Regression Models for Heart Disease Prediction.
Mufudza, Chipo; Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model.
Poisson Mixture Regression Models for Heart Disease Prediction
Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model. PMID:27999611
Mixture of Regression Models with Single-Index
Xiang, Sijia; Yao, Weixin
2016-01-01
In this article, we propose a class of semiparametric mixture regression models with single-index. We argue that many recently proposed semiparametric/nonparametric mixture regression models can be considered special cases of the proposed model. However, unlike existing semiparametric mixture regression models, the new pro- posed model can easily incorporate multivariate predictors into the nonparametric components. Backfitting estimates and the corresponding algorithms have been proposed for...
Czech Academy of Sciences Publication Activity Database
Kyselý, Jan; Picek, J.; Beranová, Romana
2010-01-01
Roč. 72, 1-2 (2010), s. 55-68 ISSN 0921-8181 R&D Projects: GA ČR GA205/06/1535; GA ČR GAP209/10/2045 Grant - others:GA MŠk(CZ) LC06024 Institutional research plan: CEZ:AV0Z30420517 Keywords : climate change * extreme value analysis * global climate models * peaks-over-threshold method * peaks-over-quantile regression * quantile regression * Poisson process * extreme temperatures Subject RIV: DG - Athmosphere Sciences, Meteorology Impact factor: 3.351, year: 2010
An Additive-Multiplicative Cox-Aalen Regression Model
DEFF Research Database (Denmark)
Scheike, Thomas H.; Zhang, Mei-Jie
2002-01-01
Aalen model; additive risk model; counting processes; Cox regression; survival analysis; time-varying effects......Aalen model; additive risk model; counting processes; Cox regression; survival analysis; time-varying effects...
[From clinical judgment to linear regression model.
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as it's first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and it is equivalent to "Y" value when "X" equals 0 and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases in one unit. In the regression line, "b" is called regression coefficient. The coefficient of determination (R 2 ) indicates the importance of independent variables in the outcome.
Regression models of reactor diagnostic signals
International Nuclear Information System (INIS)
Vavrin, J.
1989-01-01
The application is described of an autoregression model as the simplest regression model of diagnostic signals in experimental analysis of diagnostic systems, in in-service monitoring of normal and anomalous conditions and their diagnostics. The method of diagnostics is described using a regression type diagnostic data base and regression spectral diagnostics. The diagnostics is described of neutron noise signals from anomalous modes in the experimental fuel assembly of a reactor. (author)
Forecasting with Dynamic Regression Models
Pankratz, Alan
2012-01-01
One of the most widely used tools in statistical forecasting, single equation regression models is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the auto correlation patterns of regression disturbance. It also includes six case studies.
Categorical regression dose-response modeling
The goal of this training is to provide participants with training on the use of the U.S. EPA’s Categorical Regression soft¬ware (CatReg) and its application to risk assessment. Categorical regression fits mathematical models to toxicity data that have been assigned ord...
Mixed-effects regression models in linguistics
Heylen, Kris; Geeraerts, Dirk
2018-01-01
When data consist of grouped observations or clusters, and there is a risk that measurements within the same group are not independent, group-specific random effects can be added to a regression model in order to account for such within-group associations. Regression models that contain such group-specific random effects are called mixed-effects regression models, or simply mixed models. Mixed models are a versatile tool that can handle both balanced and unbalanced datasets and that can also be applied when several layers of grouping are present in the data; these layers can either be nested or crossed. In linguistics, as in many other fields, the use of mixed models has gained ground rapidly over the last decade. This methodological evolution enables us to build more sophisticated and arguably more realistic models, but, due to its technical complexity, also introduces new challenges. This volume brings together a number of promising new evolutions in the use of mixed models in linguistics, but also addres...
Moderation analysis using a two-level regression model.
Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott
2014-10-01
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
Differential equation models for sharp threshold dynamics.
Schramm, Harrison C; Dimitrov, Nedialko B
2014-01-01
We develop an extension to differential equation models of dynamical systems to allow us to analyze probabilistic threshold dynamics that fundamentally and globally change system behavior. We apply our novel modeling approach to two cases of interest: a model of infectious disease modified for malware where a detection event drastically changes dynamics by introducing a new class in competition with the original infection; and the Lanchester model of armed conflict, where the loss of a key capability drastically changes the effectiveness of one of the sides. We derive and demonstrate a step-by-step, repeatable method for applying our novel modeling approach to an arbitrary system, and we compare the resulting differential equations to simulations of the system's random progression. Our work leads to a simple and easily implemented method for analyzing probabilistic threshold dynamics using differential equations. Published by Elsevier Inc.
Variable selection and model choice in geoadditive regression models.
Kneib, Thomas; Hothorn, Torsten; Tutz, Gerhard
2009-06-01
Model choice and variable selection are issues of major concern in practical regression analyses, arising in many biometric applications such as habitat suitability analyses, where the aim is to identify the influence of potentially many environmental conditions on certain species. We describe regression models for breeding bird communities that facilitate both model choice and variable selection, by a boosting algorithm that works within a class of geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, and varying coefficients. The major modeling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a smooth component with one degree of freedom to obtain a fair comparison between the model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that automatically performs model choice and variable selection.
The MIDAS Touch: Mixed Data Sampling Regression Models
Ghysels, Eric; Santa-Clara, Pedro; Valkanov, Rossen
2004-01-01
We introduce Mixed Data Sampling (henceforth MIDAS) regression models. The regressions involve time series data sampled at different frequencies. Technically speaking MIDAS models specify conditional expectations as a distributed lag of regressors recorded at some higher sampling frequencies. We examine the asymptotic properties of MIDAS regression estimation and compare it with traditional distributed lag models. MIDAS regressions have wide applicability in macroeconomics and ï¿½nance.
Modelling the regulatory system for diabetes mellitus with a threshold window
Yang, Jin; Tang, Sanyi; Cheke, Robert A.
2015-05-01
Piecewise (or non-smooth) glucose-insulin models with threshold windows for type 1 and type 2 diabetes mellitus are proposed and analyzed with a view to improving understanding of the glucose-insulin regulatory system. For glucose-insulin models with a single threshold, the existence and stability of regular, virtual, pseudo-equilibria and tangent points are addressed. Then the relations between regular equilibria and a pseudo-equilibrium are studied. Furthermore, the sufficient and necessary conditions for the global stability of regular equilibria and the pseudo-equilibrium are provided by using qualitative analysis techniques of non-smooth Filippov dynamic systems. Sliding bifurcations related to boundary node bifurcations were investigated with theoretical and numerical techniques, and insulin clinical therapies are discussed. For glucose-insulin models with a threshold window, the effects of glucose thresholds or the widths of threshold windows on the durations of insulin therapy and glucose infusion were addressed. The duration of the effects of an insulin injection is sensitive to the variation of thresholds. Our results indicate that blood glucose level can be maintained within a normal range using piecewise glucose-insulin models with a single threshold or a threshold window. Moreover, our findings suggest that it is critical to individualise insulin therapy for each patient separately, based on initial blood glucose levels.
Introduction to the use of regression models in epidemiology.
Bender, Ralf
2009-01-01
Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
Real estate value prediction using multivariate regression models
Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav
2017-11-01
The real estate market is one of the most competitive in terms of pricing and the same tends to vary significantly based on a lot of factors, hence it becomes one of the prime fields to apply the concepts of machine learning to optimize and predict the prices with high accuracy. Therefore in this paper, we present various important features to use while predicting housing prices with good accuracy. We have described regression models, using various features to have lower Residual Sum of Squares error. While using features in a regression model some feature engineering is required for better prediction. Often a set of features (multiple regressions) or polynomial regression (applying a various set of powers in the features) is used for making better model fit. For these models are expected to be susceptible towards over fitting ridge regression is used to reduce it. This paper thus directs to the best application of regression models in addition to other techniques to optimize the result.
Algorithmic detectability threshold of the stochastic block model
Kawamoto, Tatsuro
2018-03-01
The assumption that the values of model parameters are known or correctly learned, i.e., the Nishimori condition, is one of the requirements for the detectability analysis of the stochastic block model in statistical inference. In practice, however, there is no example demonstrating that we can know the model parameters beforehand, and there is no guarantee that the model parameters can be learned accurately. In this study, we consider the expectation-maximization (EM) algorithm with belief propagation (BP) and derive its algorithmic detectability threshold. Our analysis is not restricted to the community structure but includes general modular structures. Because the algorithm cannot always learn the planted model parameters correctly, the algorithmic detectability threshold is qualitatively different from the one with the Nishimori condition.
Model-based Quantile Regression for Discrete Data
Padellini, Tullia
2018-04-10
Quantile regression is a class of methods voted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite the fact that this leads to a proper posterior for the regression coefficients, the resulting posterior variance is however affected by an unidentifiable parameter, hence any inferential procedure beside point estimation is unreliable. We propose a model-based approach for quantile regression that considers quantiles of the generating distribution directly, and thus allows for a proper uncertainty quantification. We then create a link between quantile regression and generalised linear models by mapping the quantiles to the parameter of the response variable, and we exploit it to fit the model with R-INLA. We extend it also in the case of discrete responses, where there is no 1-to-1 relationship between quantiles and distribution\\'s parameter, by introducing continuous generalisations of the most common discrete variables (Poisson, Binomial and Negative Binomial) to be exploited in the fitting.
Analytical drift-current threshold voltage model of long-channel double-gate MOSFETs
International Nuclear Information System (INIS)
Shih, Chun-Hsing; Wang, Jhong-Sheng
2009-01-01
This paper presents a new, physical threshold voltage model to solve the ambiguity in determining the threshold voltage of double-gate (DG) MOSFETs. To avoid the difficulties of the conventional 2ψ B model in nearly undoped DG MOSFETs, this study proposes to define the on–off switching based on the actual roles of the drift and diffusion components in the total drain current. The drift current strongly enhances beyond the threshold voltage, while the diffusion current plays a major role in the subthreshold. The threshold voltage is defined as the drift component that exceeds the diffusion counterpart. From the solutions of Poisson's equation, the drift and diffusion currents of DG MOSFETs are separately formulated to derive the analytical expressions of the threshold voltage and associated threshold current. This model provides a comprehensive description of the switching behavior of DG MOSFET devices, and offers a physical onset threshold current to determine the threshold voltage in practical extraction
Variable importance in latent variable regression models
Kvalheim, O.M.; Arneberg, R.; Bleie, O.; Rajalahti, T.; Smilde, A.K.; Westerhuis, J.A.
2014-01-01
The quality and practical usefulness of a regression model are a function of both interpretability and prediction performance. This work presents some new graphical tools for improved interpretation of latent variable regression models that can also assist in improved algorithms for variable
Threshold voltage roll-off modelling of bilayer graphene field-effect transistors
International Nuclear Information System (INIS)
Saeidmanesh, M; Ismail, Razali; Khaledian, M; Karimi, H; Akbari, E
2013-01-01
An analytical model is presented for threshold voltage roll-off of double gate bilayer graphene field-effect transistors. To this end, threshold voltage models of short- and long-channel states have been developed. In the short-channel case, front and back gate potential distributions have been modelled and used. In addition, the tunnelling probability is modelled and its effect is taken into consideration in the potential distribution model. To evaluate the accuracy of the potential model, FlexPDE software is employed with proper boundary conditions and a good agreement is observed. Using the proposed models, the effect of several structural parameters on the threshold voltage and its roll-off are studied at room temperature. (paper)
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
An, Qingyu; Wu, Jun; Fan, Xuesong; Pan, Liyang; Sun, Wei
2016-01-01
The hand foot and mouth disease (HFMD) is a human syndrome caused by intestinal viruses like that coxsackie A virus 16, enterovirus 71 and easily developed into outbreak in kindergarten and school. Scientifically and accurately early detection of the start time of HFMD epidemic is a key principle in planning of control measures and minimizing the impact of HFMD. The objective of this study was to establish a reliable early detection model for start timing of hand foot mouth disease epidemic in Dalian and to evaluate the performance of model by analyzing the sensitivity in detectability. The negative binomial regression model was used to estimate the weekly baseline case number of HFMD and identified the optimal alerting threshold between tested difference threshold values during the epidemic and non-epidemic year. Circular distribution method was used to calculate the gold standard of start timing of HFMD epidemic. From 2009 to 2014, a total of 62022 HFMD cases were reported (36879 males and 25143 females) in Dalian, Liaoning Province, China, including 15 fatal cases. The median age of the patients was 3 years. The incidence rate of epidemic year ranged from 137.54 per 100,000 population to 231.44 per 100,000population, the incidence rate of non-epidemic year was lower than 112 per 100,000 population. The negative binomial regression model with AIC value 147.28 was finally selected to construct the baseline level. The threshold value was 100 for the epidemic year and 50 for the non- epidemic year had the highest sensitivity(100%) both in retrospective and prospective early warning and the detection time-consuming was 2 weeks before the actual starting of HFMD epidemic. The negative binomial regression model could early warning the start of a HFMD epidemic with good sensitivity and appropriate detection time in Dalian.
Regression modeling methods, theory, and computation with SAS
Panik, Michael
2009-01-01
Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs.The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,
Regression Models For Multivariate Count Data.
Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei
2017-01-01
Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.
Analyzing thresholds and efficiency with hierarchical Bayesian logistic regression.
Houpt, Joseph W; Bittner, Jennifer L
2018-05-10
Ideal observer analysis is a fundamental tool used widely in vision science for analyzing the efficiency with which a cognitive or perceptual system uses available information. The performance of an ideal observer provides a formal measure of the amount of information in a given experiment. The ratio of human to ideal performance is then used to compute efficiency, a construct that can be directly compared across experimental conditions while controlling for the differences due to the stimuli and/or task specific demands. In previous research using ideal observer analysis, the effects of varying experimental conditions on efficiency have been tested using ANOVAs and pairwise comparisons. In this work, we present a model that combines Bayesian estimates of psychometric functions with hierarchical logistic regression for inference about both unadjusted human performance metrics and efficiencies. Our approach improves upon the existing methods by constraining the statistical analysis using a standard model connecting stimulus intensity to human observer accuracy and by accounting for variability in the estimates of human and ideal observer performance scores. This allows for both individual and group level inferences. Copyright © 2018 Elsevier Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
Drzewiecki Wojciech
2017-12-01
Full Text Available We evaluated the performance of nine machine learning regression algorithms and their ensembles for sub-pixel estimation of impervious areas coverages from Landsat imagery. The accuracy of imperviousness mapping in individual time points was assessed based on RMSE, MAE and R2. These measures were also used for the assessment of imperviousness change intensity estimations. The applicability for detection of relevant changes in impervious areas coverages at sub-pixel level was evaluated using overall accuracy, F-measure and ROC Area Under Curve. The results proved that Cubist algorithm may be advised for Landsat-based mapping of imperviousness for single dates. Stochastic gradient boosting of regression trees (GBM may be also considered for this purpose. However, Random Forest algorithm is endorsed for both imperviousness change detection and mapping of its intensity. In all applications the heterogeneous model ensembles performed at least as well as the best individual models or better. They may be recommended for improving the quality of sub-pixel imperviousness and imperviousness change mapping. The study revealed also limitations of the investigated methodology for detection of subtle changes of imperviousness inside the pixel. None of the tested approaches was able to reliably classify changed and non-changed pixels if the relevant change threshold was set as one or three percent. Also for fi ve percent change threshold most of algorithms did not ensure that the accuracy of change map is higher than the accuracy of random classifi er. For the threshold of relevant change set as ten percent all approaches performed satisfactory.
Drzewiecki, Wojciech
2017-12-01
We evaluated the performance of nine machine learning regression algorithms and their ensembles for sub-pixel estimation of impervious areas coverages from Landsat imagery. The accuracy of imperviousness mapping in individual time points was assessed based on RMSE, MAE and R2. These measures were also used for the assessment of imperviousness change intensity estimations. The applicability for detection of relevant changes in impervious areas coverages at sub-pixel level was evaluated using overall accuracy, F-measure and ROC Area Under Curve. The results proved that Cubist algorithm may be advised for Landsat-based mapping of imperviousness for single dates. Stochastic gradient boosting of regression trees (GBM) may be also considered for this purpose. However, Random Forest algorithm is endorsed for both imperviousness change detection and mapping of its intensity. In all applications the heterogeneous model ensembles performed at least as well as the best individual models or better. They may be recommended for improving the quality of sub-pixel imperviousness and imperviousness change mapping. The study revealed also limitations of the investigated methodology for detection of subtle changes of imperviousness inside the pixel. None of the tested approaches was able to reliably classify changed and non-changed pixels if the relevant change threshold was set as one or three percent. Also for fi ve percent change threshold most of algorithms did not ensure that the accuracy of change map is higher than the accuracy of random classifi er. For the threshold of relevant change set as ten percent all approaches performed satisfactory.
Setting conservation management thresholds using a novel participatory modeling approach.
Addison, P F E; de Bie, K; Rumpff, L
2015-10-01
We devised a participatory modeling approach for setting management thresholds that show when management intervention is required to address undesirable ecosystem changes. This approach was designed to be used when management thresholds: must be set for environmental indicators in the face of multiple competing objectives; need to incorporate scientific understanding and value judgments; and will be set by participants with limited modeling experience. We applied our approach to a case study where management thresholds were set for a mat-forming brown alga, Hormosira banksii, in a protected area management context. Participants, including management staff and scientists, were involved in a workshop to test the approach, and set management thresholds to address the threat of trampling by visitors to an intertidal rocky reef. The approach involved trading off the environmental objective, to maintain the condition of intertidal reef communities, with social and economic objectives to ensure management intervention was cost-effective. Ecological scenarios, developed using scenario planning, were a key feature that provided the foundation for where to set management thresholds. The scenarios developed represented declines in percent cover of H. banksii that may occur under increased threatening processes. Participants defined 4 discrete management alternatives to address the threat of trampling and estimated the effect of these alternatives on the objectives under each ecological scenario. A weighted additive model was used to aggregate participants' consequence estimates. Model outputs (decision scores) clearly expressed uncertainty, which can be considered by decision makers and used to inform where to set management thresholds. This approach encourages a proactive form of conservation, where management thresholds and associated actions are defined a priori for ecological indicators, rather than reacting to unexpected ecosystem changes in the future. © 2015 The
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
A threshold model of investor psychology
Cross, Rod; Grinfeld, Michael; Lamba, Harbir; Seaman, Tim
2005-08-01
We introduce a class of agent-based market models founded upon simple descriptions of investor psychology. Agents are subject to various psychological tensions induced by market conditions and endowed with a minimal ‘personality’. This personality consists of a threshold level for each of the tensions being modeled, and the agent reacts whenever a tension threshold is reached. This paper considers an elementary model including just two such tensions. The first is ‘cowardice’, which is the stress caused by remaining in a minority position with respect to overall market sentiment and leads to herding-type behavior. The second is ‘inaction’, which is the increasing desire to act or re-evaluate one's investment position. There is no inductive learning by agents and they are only coupled via the global market price and overall market sentiment. Even incorporating just these two psychological tensions, important stylized facts of real market data, including fat-tails, excess kurtosis, uncorrelated price returns and clustered volatility over the timescale of a few days are reproduced. By then introducing an additional parameter that amplifies the effect of externally generated market noise during times of extreme market sentiment, long-time volatility correlations can also be recovered.
Gaussian Process Regression Model in Spatial Logistic Regression
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.
Directory of Open Access Journals (Sweden)
DARIUSZ Piwczynski
2013-03-01
Full Text Available The research was carried out on 4,030 Polish Merino ewes born in the years 1991- 2001, kept in 15 flocks from the Pomorze and Kujawy region. Fertility of ewes in subsequent reproduction seasons was analysed with the use of multiple logistic regression. The research showed that there is a statistical influence of the flock, year of birth, age of dam, flock year interaction of birth on the ewes fertility. In order to estimate the genetic parameters, the Gibbs sampling method was applied, using the univariate animal models, both linear as well as threshold. Estimates of fertility depending on the model equalled 0.067 to 0.104, whereas the estimates of repeatability equalled respectively: 0.076 and 0.139. The obtained genetic parameters were then used to estimate the breeding values of the animals in terms of controlled trait (Best Linear Unbiased Prediction method using linear and threshold models. The obtained animal breeding values rankings in respect of the same trait with the use of linear and threshold models were strongly correlated with each other (rs = 0.972. Negative genetic trends of fertility (0.01-0.08% per year were found.
International Nuclear Information System (INIS)
Jafri, Y.Z.; Kamal, L.
2007-01-01
Various statistical techniques was used on five-year data from 1998-2002 of average humidity, rainfall, maximum and minimum temperatures, respectively. The relationships to regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit, to our polynomial regression analysis time series (PRATS). The correlation to multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rand correlation and Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, respectively which yielded homoscedasticity. We also employed Bartlett's test for homogeneity of variances on a five-year data of rainfall and humidity, respectively which showed that the variances in rainfall data were not homogenous while in case of humidity, were homogenous. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)
Robust mislabel logistic regression without modeling mislabel probabilities.
Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun
2018-03-01
Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes into consideration of mislabeled responses. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.
A generalized methodology for identification of threshold for HRU delineation in SWAT model
M, J.; Sudheer, K.; Chaubey, I.; Raj, C.
2016-12-01
The distributed hydrological model, Soil and Water Assessment Tool (SWAT) is a comprehensive hydrologic model widely used for making various decisions. The simulation accuracy of the distributed hydrological model differs due to the mechanism involved in the subdivision of the watershed. Soil and Water Assessment Tool (SWAT) considers sub-dividing the watershed and the sub-basins into small computing units known as 'hydrologic response units (HRU). The delineation of HRU is done based on unique combinations of land use, soil types, and slope within the sub-watersheds, which are not spatially defined. The computations in SWAT are done at HRU level and are then aggregated up to the sub-basin outlet, which is routed through the stream system. Generally, the HRUs are delineated by considering a threshold percentage of land use, soil and slope are to be given by the modeler to decrease the computation time of the model. The thresholds constrain the minimum area for constructing an HRU. In the current HRU delineation practice in SWAT, the land use, soil and slope of the watershed within a sub-basin, which is less than the predefined threshold, will be surpassed by the dominating land use, soil and slope, and introduce some level of ambiguity in the process simulations in terms of inappropriate representation of the area. But the loss of information due to variation in the threshold values depends highly on the purpose of the study. Therefore this research studies the effects of threshold values of HRU delineation on the hydrological modeling of SWAT on sediment simulations and suggests guidelines for selecting the appropriate threshold values considering the sediment simulation accuracy. The preliminary study was done on Illinois watershed by assigning different thresholds for land use and soil. A general methodology was proposed for identifying an appropriate threshold for HRU delineation in SWAT model that considered computational time and accuracy of the simulation
Mixed Frequency Data Sampling Regression Models: The R Package midasr
Directory of Open Access Journals (Sweden)
Eric Ghysels
2016-08-01
Full Text Available When modeling economic relationships it is increasingly common to encounter data sampled at different frequencies. We introduce the R package midasr which enables estimating regression models with variables sampled at different frequencies within a MIDAS regression framework put forward in work by Ghysels, Santa-Clara, and Valkanov (2002. In this article we define a general autoregressive MIDAS regression model with multiple variables of different frequencies and show how it can be specified using the familiar R formula interface and estimated using various optimization methods chosen by the researcher. We discuss how to check the validity of the estimated model both in terms of numerical convergence and statistical adequacy of a chosen regression specification, how to perform model selection based on a information criterion, how to assess forecasting accuracy of the MIDAS regression model and how to obtain a forecast aggregation of different MIDAS regression models. We illustrate the capabilities of the package with a simulated MIDAS regression model and give two empirical examples of application of MIDAS regression.
Empirical assessment of a threshold model for sylvatic plague
DEFF Research Database (Denmark)
Davis, Stephen; Leirs, Herwig; Viljugrein, H.
2007-01-01
Plague surveillance programmes established in Kazakhstan, Central Asia, during the previous century, have generated large plague archives that have been used to parameterize an abundance threshold model for sylvatic plague in great gerbil (Rhombomys opimus) populations. Here, we assess the model...... examine six hypotheses that could explain the resulting false positive predictions, namely (i) including end-of-outbreak data erroneously lowers the estimated threshold, (ii) too few gerbils were tested, (iii) plague becomes locally extinct, (iv) the abundance of fleas was too low, (v) the climate...
Impact of multicollinearity on small sample hydrologic regression models
Kroll, Charles N.; Song, Peter
2013-06-01
Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R2). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely on model predictions, is it recommended that OLS be employed since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
Applied Regression Modeling A Business Approach
Pardoe, Iain
2012-01-01
An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculusRegression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a
Testing homogeneity in Weibull-regression models.
Bolfarine, Heleno; Valença, Dione M
2005-10-01
In survival studies with families or geographical units it may be of interest testing whether such groups are homogeneous for given explanatory variables. In this paper we consider score type tests for group homogeneity based on a mixing model in which the group effect is modelled as a random variable. As opposed to hazard-based frailty models, this model presents survival times that conditioned on the random effect, has an accelerated failure time representation. The test statistics requires only estimation of the conventional regression model without the random effect and does not require specifying the distribution of the random effect. The tests are derived for a Weibull regression model and in the uncensored situation, a closed form is obtained for the test statistic. A simulation study is used for comparing the power of the tests. The proposed tests are applied to real data sets with censored data.
Model-based Quantile Regression for Discrete Data
Padellini, Tullia; Rue, Haavard
2018-01-01
Quantile regression is a class of methods voted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite
Detection of epistatic effects with logic regression and a classical linear regression model.
Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata
2014-02-01
To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, usually methods as the Cockerham's model are applied. Within this framework, interactions are understood as the part of the joined effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (as disease) is caused by Boolean combinations of genotypes of several QTLs, this Cockerham's approach is often not capable to identify them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though with the logic regression approach a larger number of models has to be considered (requiring more stringent multiple testing correction) the efficient representation of higher order logic interactions in logic regression models leads to a significant increase of power to detect such interactions as compared to a Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with simulation study and real data analysis.
Li, Yan; Xiao, Tao; Liao, Dandan; Lee, Mei-Ling Ting
2018-03-30
The Cox proportional hazards (PH) model is a common statistical technique used for analyzing time-to-event data. The assumption of PH, however, is not always appropriate in real applications. In cases where the assumption is not tenable, threshold regression (TR) and other survival methods, which do not require the PH assumption, are available and widely used. These alternative methods generally assume that the study data constitute simple random samples. In particular, TR has not been studied in the setting of complex surveys that involve (1) differential selection probabilities of study subjects and (2) intracluster correlations induced by multistage cluster sampling. In this paper, we extend TR procedures to account for complex sampling designs. The pseudo-maximum likelihood estimation technique is applied to estimate the TR model parameters. Computationally efficient Taylor linearization variance estimators that consider both the intracluster correlation and the differential selection probabilities are developed. The proposed methods are evaluated by using simulation experiments with various complex designs and illustrated empirically by using mortality-linked Third National Health and Nutrition Examination Survey Phase II genetic data. Copyright © 2017 John Wiley & Sons, Ltd.
AN APPLICATION OF FUNCTIONAL MULTIVARIATE REGRESSION MODEL TO MULTICLASS CLASSIFICATION
Krzyśko, Mirosław; Smaga, Łukasz
2017-01-01
In this paper, the scale response functional multivariate regression model is considered. By using the basis functions representation of functional predictors and regression coefficients, this model is rewritten as a multivariate regression model. This representation of the functional multivariate regression model is used for multiclass classification for multivariate functional data. Computational experiments performed on real labelled data sets demonstrate the effectiveness of the proposed ...
The threshold of a stochastic delayed SIR epidemic model with vaccination
Liu, Qun; Jiang, Daqing
2016-11-01
In this paper, we study the threshold dynamics of a stochastic delayed SIR epidemic model with vaccination. We obtain sufficient conditions for extinction and persistence in the mean of the epidemic. The threshold between persistence in the mean and extinction of the stochastic system is also obtained. Compared with the corresponding deterministic model, the threshold affected by the white noise is smaller than the basic reproduction number Rbar0 of the deterministic system. Results show that time delay has important effects on the persistence and extinction of the epidemic.
Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model
Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami
2017-06-01
A regression model is the representation of relationship between independent variable and dependent variable. The dependent variable has categories used in the logistic regression model to calculate odds on. The logistic regression model for dependent variable has levels in the logistics regression model is ordinal. GWOLR model is an ordinal logistic regression model influenced the geographical location of the observation site. Parameters estimation in the model needed to determine the value of a population based on sample. The purpose of this research is to parameters estimation of GWOLR model using R software. Parameter estimation uses the data amount of dengue fever patients in Semarang City. Observation units used are 144 villages in Semarang City. The results of research get GWOLR model locally for each village and to know probability of number dengue fever patient categories.
Threshold corrections and gauge symmetry in twisted superstring models
International Nuclear Information System (INIS)
Pierce, D.M.
1994-01-01
Threshold corrections to the running of gauge couplings are calculated for superstring models with free complex world sheet fermions. For two N=1 SU(2)xU(1) 5 models, the threshold corrections lead to a small increase in the unification scale. Examples are given to illustrate how a given particle spectrum can be described by models with different boundary conditions on the internal fermions. We also discuss how complex twisted fermions can enhance the symmetry group of an N=4, SU(3)xU(1)xU(1) model to the gauge group SU(3)xSU(2)xU(1). It is then shown how a mixing angle analogous to the Weinberg angle depends on the boundary conditions of the internal fermions
Alternative regression models to assess increase in childhood BMI.
Beyerlein, Andreas; Fahrmeir, Ludwig; Mansmann, Ulrich; Toschke, André M
2008-09-08
Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 children participating in the school entry health examination in Bavaria, Germany, from 2001 to 2002. TV watching, meal frequency, breastfeeding, smoking in pregnancy, maternal obesity, parental social class and weight gain in the first 2 years of life were considered as risk factors for obesity. GAMLSS showed a much better fit regarding the estimation of risk factors effects on transformed and untransformed BMI data than common GLMs with respect to the generalized Akaike information criterion. In comparison with GAMLSS, quantile regression allowed for additional interpretation of prespecified distribution quantiles, such as quantiles referring to overweight or obesity. The variables TV watching, maternal BMI and weight gain in the first 2 years were directly, and meal frequency was inversely significantly associated with body composition in any model type examined. In contrast, smoking in pregnancy was not directly, and breastfeeding and parental social class were not inversely significantly associated with body composition in GLM models, but in GAMLSS and partly in quantile regression models. Risk factor specific BMI percentile curves could be estimated from GAMLSS and quantile regression models. GAMLSS and quantile regression seem to be more appropriate than common GLMs for risk factor modeling of BMI data.
Alternative regression models to assess increase in childhood BMI
Beyerlein, Andreas; Fahrmeir, Ludwig; Mansmann, Ulrich; Toschke, André M
2008-01-01
Abstract Background Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Methods Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 childre...
Thermal Efficiency Degradation Diagnosis Method Using Regression Model
International Nuclear Information System (INIS)
Jee, Chang Hyun; Heo, Gyun Young; Jang, Seok Won; Lee, In Cheol
2011-01-01
This paper proposes an idea for thermal efficiency degradation diagnosis in turbine cycles, which is based on turbine cycle simulation under abnormal conditions and a linear regression model. The correlation between the inputs for representing degradation conditions (normally unmeasured but intrinsic states) and the simulation outputs (normally measured but superficial states) was analyzed with the linear regression model. The regression models can inversely response an associated intrinsic state for a superficial state observed from a power plant. The diagnosis method proposed herein is classified into three processes, 1) simulations for degradation conditions to get measured states (referred as what-if method), 2) development of the linear model correlating intrinsic and superficial states, and 3) determination of an intrinsic state using the superficial states of current plant and the linear regression model (referred as inverse what-if method). The what-if method is to generate the outputs for the inputs including various root causes and/or boundary conditions whereas the inverse what-if method is the process of calculating the inverse matrix with the given superficial states, that is, component degradation modes. The method suggested in this paper was validated using the turbine cycle model for an operating power plant
Modeling Soil Quality Thresholds to Ecosystem Recovery at Fort Benning, Georgia, USA
Energy Technology Data Exchange (ETDEWEB)
Garten Jr., C.T.
2004-03-08
The objective of this research was to use a simple model of soil C and N dynamics to predict nutrient thresholds to ecosystem recovery on degraded soils at Fort Benning, Georgia, in the southeastern USA. The model calculates aboveground and belowground biomass, soil C inputs and dynamics, soil N stocks and availability, and plant N requirements. A threshold is crossed when predicted soil N supplies fall short of predicted N required to sustain biomass accrual at a specified recovery rate. Four factors were important to development of thresholds to recovery: (1) initial amounts of aboveground biomass, (2) initial soil C stocks (i.e., soil quality), (3) relative recovery rates of biomass, and (4) soil sand content. Thresholds to ecosystem recovery predicted by the model should not be interpreted independent of a specified recovery rate. Initial soil C stocks influenced the predicted patterns of recovery by both old field and forest ecosystems. Forests and old fields on soils with varying sand content had different predicted thresholds to recovery. Soil C stocks at barren sites on Fort Benning generally lie below predicted thresholds to 100% recovery of desired future ecosystem conditions defined on the basis of aboveground biomass (18000 versus 360 g m{sup -2} for forests and old fields, respectively). Calculations with the model indicated that reestablishment of vegetation on barren sites to a level below the desired future condition is possible at recovery rates used in the model, but the time to 100% recovery of desired future conditions, without crossing a nutrient threshold, is prolonged by a reduced rate of forest growth. Predicted thresholds to ecosystem recovery were less on soils with more than 70% sand content. The lower thresholds for old field and forest recovery on more sandy soils are apparently due to higher relative rates of net soil N mineralization in more sandy soils. Calculations with the model indicate that a combination of desired future
Random regression models for detection of gene by environment interaction
Directory of Open Access Journals (Sweden)
Meuwissen Theo HE
2007-02-01
Full Text Available Abstract Two random regression models, where the effect of a putative QTL was regressed on an environmental gradient, are described. The first model estimates the correlation between intercept and slope of the random regression, while the other model restricts this correlation to 1 or -1, which is expected under a bi-allelic QTL model. The random regression models were compared to a model assuming no gene by environment interactions. The comparison was done with regards to the models ability to detect QTL, to position them accurately and to detect possible QTL by environment interactions. A simulation study based on a granddaughter design was conducted, and QTL were assumed, either by assigning an effect independent of the environment or as a linear function of a simulated environmental gradient. It was concluded that the random regression models were suitable for detection of QTL effects, in the presence and absence of interactions with environmental gradients. Fixing the correlation between intercept and slope of the random regression had a positive effect on power when the QTL effects re-ranked between environments.
Alternative regression models to assess increase in childhood BMI
Directory of Open Access Journals (Sweden)
Mansmann Ulrich
2008-09-01
Full Text Available Abstract Background Body mass index (BMI data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Methods Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs, quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS. We analyzed data of 4967 children participating in the school entry health examination in Bavaria, Germany, from 2001 to 2002. TV watching, meal frequency, breastfeeding, smoking in pregnancy, maternal obesity, parental social class and weight gain in the first 2 years of life were considered as risk factors for obesity. Results GAMLSS showed a much better fit regarding the estimation of risk factors effects on transformed and untransformed BMI data than common GLMs with respect to the generalized Akaike information criterion. In comparison with GAMLSS, quantile regression allowed for additional interpretation of prespecified distribution quantiles, such as quantiles referring to overweight or obesity. The variables TV watching, maternal BMI and weight gain in the first 2 years were directly, and meal frequency was inversely significantly associated with body composition in any model type examined. In contrast, smoking in pregnancy was not directly, and breastfeeding and parental social class were not inversely significantly associated with body composition in GLM models, but in GAMLSS and partly in quantile regression models. Risk factor specific BMI percentile curves could be estimated from GAMLSS and quantile regression models. Conclusion GAMLSS and quantile regression seem to be more appropriate than common GLMs for risk factor modeling of BMI data.
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
Wavelet regression model in forecasting crude oil price
Hamid, Mohd Helmie; Shabri, Ani
2017-05-01
This study presents the performance of wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. WMLR model was developed by integrating the discrete wavelet transform (DWT) and multiple linear regression (MLR) model. The original time series was decomposed to sub-time series with different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series has been used in this study to test the prediction capability of the proposed model. The forecasting performance of WMLR model were also compared with regular multiple linear regression (MLR), Autoregressive Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) using root mean square errors (RMSE) and mean absolute errors (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting technique tested in this study.
Predicting and Modelling of Survival Data when Cox's Regression Model does not hold
DEFF Research Database (Denmark)
Scheike, Thomas H.; Zhang, Mei-Jie
2002-01-01
Aalen model; additive risk model; counting processes; competing risk; Cox regression; flexible modeling; goodness of fit; prediction of survival; survival analysis; time-varying effects......Aalen model; additive risk model; counting processes; competing risk; Cox regression; flexible modeling; goodness of fit; prediction of survival; survival analysis; time-varying effects...
Spatial stochastic regression modelling of urban land use
International Nuclear Information System (INIS)
Arshad, S H M; Jaafar, J; Abiden, M Z Z; Latif, Z A; Rasam, A R A
2014-01-01
Urbanization is very closely linked to industrialization, commercialization or overall economic growth and development. This results in innumerable benefits of the quantity and quality of the urban environment and lifestyle but on the other hand contributes to unbounded development, urban sprawl, overcrowding and decreasing standard of living. Regulation and observation of urban development activities is crucial. The understanding of urban systems that promotes urban growth are also essential for the purpose of policy making, formulating development strategies as well as development plan preparation. This study aims to compare two different stochastic regression modeling techniques for spatial structure models of urban growth in the same specific study area. Both techniques will utilize the same datasets and their results will be analyzed. The work starts by producing an urban growth model by using stochastic regression modeling techniques namely the Ordinary Least Square (OLS) and Geographically Weighted Regression (GWR). The two techniques are compared to and it is found that, GWR seems to be a more significant stochastic regression model compared to OLS, it gives a smaller AICc (Akaike's Information Corrected Criterion) value and its output is more spatially explainable
How to detect and visualize extinction thresholds for structured PVA models
Hildenbrandt, H.; Grimm, V.
2006-01-01
An extinction threshold is a population size below which extinction risk increases to beyond critical values. However, detecting extinction thresholds for structured population models is not straightforward because many different population structures may correspond to the same population size.
Eigen's Error Threshold and Mutational Meltdown in a Quasispecies Model
Bagnoli, F.; Bezzi, M.
1998-01-01
We introduce a toy model for interacting populations connected by mutations and limited by a shared resource. We study the presence of Eigen's error threshold and mutational meltdown. The phase diagram of the system shows that the extinction of the whole population due to mutational meltdown can occur well before an eventual error threshold transition.
Physics constrained nonlinear regression models for time series
International Nuclear Information System (INIS)
Majda, Andrew J; Harlim, John
2013-01-01
A central issue in contemporary science is the development of data driven statistical nonlinear dynamical models for time series of partial observations of nature or a complex physical model. It has been established recently that ad hoc quadratic multi-level regression (MLR) models can have finite-time blow up of statistical solutions and/or pathological behaviour of their invariant measure. Here a new class of physics constrained multi-level quadratic regression models are introduced, analysed and applied to build reduced stochastic models from data of nonlinear systems. These models have the advantages of incorporating memory effects in time as well as the nonlinear noise from energy conserving nonlinear interactions. The mathematical guidelines for the performance and behaviour of these physics constrained MLR models as well as filtering algorithms for their implementation are developed here. Data driven applications of these new multi-level nonlinear regression models are developed for test models involving a nonlinear oscillator with memory effects and the difficult test case of the truncated Burgers–Hopf model. These new physics constrained quadratic MLR models are proposed here as process models for Bayesian estimation through Markov chain Monte Carlo algorithms of low frequency behaviour in complex physical data. (paper)
Directory of Open Access Journals (Sweden)
Nataša Šarlija
2017-01-01
Full Text Available This study sheds light on the most common issues related to applying logistic regression in prediction models for company growth. The purpose of the paper is 1 to provide a detailed demonstration of the steps in developing a growth prediction model based on logistic regression analysis, 2 to discuss common pitfalls and methodological errors in developing a model, and 3 to provide solutions and possible ways of overcoming these issues. Special attention is devoted to the question of satisfying logistic regression assumptions, selecting and defining dependent and independent variables, using classification tables and ROC curves, for reporting model strength, interpreting odds ratios as effect measures and evaluating performance of the prediction model. Development of a logistic regression model in this paper focuses on a prediction model of company growth. The analysis is based on predominantly financial data from a sample of 1471 small and medium-sized Croatian companies active between 2009 and 2014. The financial data is presented in the form of financial ratios divided into nine main groups depicting following areas of business: liquidity, leverage, activity, profitability, research and development, investing and export. The growth prediction model indicates aspects of a business critical for achieving high growth. In that respect, the contribution of this paper is twofold. First, methodological, in terms of pointing out pitfalls and potential solutions in logistic regression modelling, and secondly, theoretical, in terms of identifying factors responsible for high growth of small and medium-sized companies.
Regression Models for Repairable Systems
Czech Academy of Sciences Publication Activity Database
Novák, Petr
2015-01-01
Roč. 17, č. 4 (2015), s. 963-972 ISSN 1387-5841 Institutional support: RVO:67985556 Keywords : Reliability analysis * Repair models * Regression Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.782, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/novak-0450902.pdf
Poisson versus threshold models for genetic analysis of clinical mastitis in US Holsteins.
Vazquez, A I; Weigel, K A; Gianola, D; Bates, D M; Perez-Cabal, M A; Rosa, G J M; Chang, Y M
2009-10-01
Typically, clinical mastitis is coded as the presence or absence of disease in a given lactation, and records are analyzed with either linear models or binary threshold models. Because the presence of mastitis may include cows with multiple episodes, there is a loss of information when counts are treated as binary responses. Poisson models are appropriated for random variables measured as the number of events, and although these models are used extensively in studying the epidemiology of mastitis, they have rarely been used for studying the genetic aspects of mastitis. Ordinal threshold models are pertinent for ordered categorical responses; although one can hypothesize that the number of clinical mastitis episodes per animal reflects a continuous underlying increase in mastitis susceptibility, these models have rarely been used in genetic analysis of mastitis. The objective of this study was to compare probit, Poisson, and ordinal threshold models for the genetic evaluation of US Holstein sires for clinical mastitis. Mastitis was measured as a binary trait or as the number of mastitis cases. Data from 44,908 first-parity cows recorded in on-farm herd management software were gathered, edited, and processed for the present study. The cows were daughters of 1,861 sires, distributed over 94 herds. Predictive ability was assessed via a 5-fold cross-validation using 2 loss functions: mean squared error of prediction (MSEP) as the end point and a cost difference function. The heritability estimates were 0.061 for mastitis measured as a binary trait in the probit model and 0.085 and 0.132 for the number of mastitis cases in the ordinal threshold and Poisson models, respectively; because of scale differences, only the probit and ordinal threshold models are directly comparable. Among healthy animals, MSEP was smallest for the probit model, and the cost function was smallest for the ordinal threshold model. Among diseased animals, MSEP and the cost function were smallest
Glass, Edmund R; Dozmorov, Mikhail G
2016-10-06
The goal of many human disease-oriented studies is to detect molecular mechanisms different between healthy controls and patients. Yet, commonly used gene expression measurements from blood samples suffer from variability of cell composition. This variability hinders the detection of differentially expressed genes and is often ignored. Combined with cell counts, heterogeneous gene expression may provide deeper insights into the gene expression differences on the cell type-specific level. Published computational methods use linear regression to estimate cell type-specific differential expression, and a global cutoff to judge significance, such as False Discovery Rate (FDR). Yet, they do not consider many artifacts hidden in high-dimensional gene expression data that may negatively affect linear regression. In this paper we quantify the parameter space affecting the performance of linear regression (sensitivity of cell type-specific differential expression detection) on a per-gene basis. We evaluated the effect of sample sizes, cell type-specific proportion variability, and mean squared error on sensitivity of cell type-specific differential expression detection using linear regression. Each parameter affected variability of cell type-specific expression estimates and, subsequently, the sensitivity of differential expression detection. We provide the R package, LRCDE, which performs linear regression-based cell type-specific differential expression (deconvolution) detection on a gene-by-gene basis. Accounting for variability around cell type-specific gene expression estimates, it computes per-gene t-statistics of differential detection, p-values, t-statistic-based sensitivity, group-specific mean squared error, and several gene-specific diagnostic metrics. The sensitivity of linear regression-based cell type-specific differential expression detection differed for each gene as a function of mean squared error, per group sample sizes, and variability of the proportions
Geographically weighted regression model on poverty indicator
Slamet, I.; Nugroho, N. F. T. A.; Muslich
2017-12-01
In this research, we applied geographically weighted regression (GWR) for analyzing the poverty in Central Java. We consider Gaussian Kernel as weighted function. The GWR uses the diagonal matrix resulted from calculating kernel Gaussian function as a weighted function in the regression model. The kernel weights is used to handle spatial effects on the data so that a model can be obtained for each location. The purpose of this paper is to model of poverty percentage data in Central Java province using GWR with Gaussian kernel weighted function and to determine the influencing factors in each regency/city in Central Java province. Based on the research, we obtained geographically weighted regression model with Gaussian kernel weighted function on poverty percentage data in Central Java province. We found that percentage of population working as farmers, population growth rate, percentage of households with regular sanitation, and BPJS beneficiaries are the variables that affect the percentage of poverty in Central Java province. In this research, we found the determination coefficient R2 are 68.64%. There are two categories of district which are influenced by different of significance factors.
International Nuclear Information System (INIS)
Cohen, B.L.
1992-01-01
A plot of lung-cancer rates versus radon exposures in 965 US counties, or in all US states, has a strong negative slope, b, in sharp contrast to the strong positive slope predicted by linear/no-threshold theory. The discrepancy between these slopes exceeds 20 standard deviations (SD). Including smoking frequency in the analysis substantially improves fits to a linear relationship but has little effect on the discrepancy in b, because correlations between smoking frequency and radon levels are quite weak. Including 17 socioeconomic variables (SEV) in multiple regression analysis reduces the discrepancy to 15 SD. Data were divided into segments by stratifying on each SEV in turn, and on geography, and on both simultaneously, giving over 300 data sets to be analyzed individually, but negative slopes predominated. The slope is negative whether one considers only the most urban counties or only the most rural; only the richest or only the poorest; only the richest in the South Atlantic region or only the poorest in that region, etc., etc.,; and for all the strata in between. Since this is an ecological study, the well-known problems with ecological studies were investigated and found not to be applicable here. The open-quotes ecological fallacyclose quotes was shown not to apply in testing a linear/no-threshold theory, and the vulnerability to confounding is greatly reduced when confounding factors are only weakly correlated with radon levels, as is generally the case here. All confounding factors known to correlate with radon and with lung cancer were investigated quantitatively and found to have little effect on the discrepancy
Threshold Dynamics of a Stochastic Chemostat Model with Two Nutrients and One Microorganism
Directory of Open Access Journals (Sweden)
Jian Zhang
2017-01-01
Full Text Available A new stochastic chemostat model with two substitutable nutrients and one microorganism is proposed and investigated. Firstly, for the corresponding deterministic model, the threshold for extinction and permanence of the microorganism is obtained by analyzing the stability of the equilibria. Then, for the stochastic model, the threshold of the stochastic chemostat for extinction and permanence of the microorganism is explored. Difference of the threshold of the deterministic model and the stochastic model shows that a large stochastic disturbance can affect the persistence of the microorganism and is harmful to the cultivation of the microorganism. To illustrate this phenomenon, we give some computer simulations with different intensity of stochastic noise disturbance.
Selection Strategies for Social Influence in the Threshold Model
Karampourniotis, Panagiotis; Szymanski, Boleslaw; Korniss, Gyorgy
The ubiquity of online social networks makes the study of social influence extremely significant for its applications to marketing, politics and security. Maximizing the spread of influence by strategically selecting nodes as initiators of a new opinion or trend is a challenging problem. We study the performance of various strategies for selection of large fractions of initiators on a classical social influence model, the Threshold model (TM). Under the TM, a node adopts a new opinion only when the fraction of its first neighbors possessing that opinion exceeds a pre-assigned threshold. The strategies we study are of two kinds: strategies based solely on the initial network structure (Degree-rank, Dominating Sets, PageRank etc.) and strategies that take into account the change of the states of the nodes during the evolution of the cascade, e.g. the greedy algorithm. We find that the performance of these strategies depends largely on both the network structure properties, e.g. the assortativity, and the distribution of the thresholds assigned to the nodes. We conclude that the optimal strategy needs to combine the network specifics and the model specific parameters to identify the most influential spreaders. Supported in part by ARL NS-CTA, ARO, and ONR.
Geodesic least squares regression for scaling studies in magnetic confinement fusion
International Nuclear Information System (INIS)
Verdoolaege, Geert
2015-01-01
In regression analyses for deriving scaling laws that occur in various scientific disciplines, usually standard regression methods have been applied, of which ordinary least squares (OLS) is the most popular. However, concerns have been raised with respect to several assumptions underlying OLS in its application to scaling laws. We here discuss a new regression method that is robust in the presence of significant uncertainty on both the data and the regression model. The method, which we call geodesic least squares regression (GLS), is based on minimization of the Rao geodesic distance on a probabilistic manifold. We demonstrate the superiority of the method using synthetic data and we present an application to the scaling law for the power threshold for the transition to the high confinement regime in magnetic confinement fusion devices
International Nuclear Information System (INIS)
Hategan, Cornel
2002-01-01
Theory of Threshold Phenomena in Quantum Scattering is developed in terms of Reduced Scattering Matrix. Relationships of different types of threshold anomalies both to nuclear reaction mechanisms and to nuclear reaction models are established. Magnitude of threshold effect is related to spectroscopic factor of zero-energy neutron state. The Theory of Threshold Phenomena, based on Reduced Scattering Matrix, does establish relationships between different types of threshold effects and nuclear reaction mechanisms: the cusp and non-resonant potential scattering, s-wave threshold anomaly and compound nucleus resonant scattering, p-wave anomaly and quasi-resonant scattering. A threshold anomaly related to resonant or quasi resonant scattering is enhanced provided the neutron threshold state has large spectroscopic amplitude. The Theory contains, as limit cases, Cusp Theories and also results of different nuclear reactions models as Charge Exchange, Weak Coupling, Bohr and Hauser-Feshbach models. (author)
On a Robust MaxEnt Process Regression Model with Sample-Selection
Directory of Open Access Journals (Sweden)
Hea-Jung Kim
2018-04-01
Full Text Available In a regression analysis, a sample-selection bias arises when a dependent variable is partially observed as a result of the sample selection. This study introduces a Maximum Entropy (MaxEnt process regression model that assumes a MaxEnt prior distribution for its nonparametric regression function and finds that the MaxEnt process regression model includes the well-known Gaussian process regression (GPR model as a special case. Then, this special MaxEnt process regression model, i.e., the GPR model, is generalized to obtain a robust sample-selection Gaussian process regression (RSGPR model that deals with non-normal data in the sample selection. Various properties of the RSGPR model are established, including the stochastic representation, distributional hierarchy, and magnitude of the sample-selection bias. These properties are used in the paper to develop a hierarchical Bayesian methodology to estimate the model. This involves a simple and computationally feasible Markov chain Monte Carlo algorithm that avoids analytical or numerical derivatives of the log-likelihood function of the model. The performance of the RSGPR model in terms of the sample-selection bias correction, robustness to non-normality, and prediction, is demonstrated through results in simulations that attest to its good finite-sample performance.
On concurvity in nonlinear and nonparametric regression models
Directory of Open Access Journals (Sweden)
Sonia Amodio
2014-12-01
Full Text Available When data are affected by multicollinearity in the linear regression framework, then concurvity will be present in fitting a generalized additive model (GAM. The term concurvity describes nonlinear dependencies among the predictor variables. As collinearity results in inflated variance of the estimated regression coefficients in the linear regression model, the result of the presence of concurvity leads to instability of the estimated coefficients in GAMs. Even if the backfitting algorithm will always converge to a solution, in case of concurvity the final solution of the backfitting procedure in fitting a GAM is influenced by the starting functions. While exact concurvity is highly unlikely, approximate concurvity, the analogue of multicollinearity, is of practical concern as it can lead to upwardly biased estimates of the parameters and to underestimation of their standard errors, increasing the risk of committing type I error. We compare the existing approaches to detect concurvity, pointing out their advantages and drawbacks, using simulated and real data sets. As a result, this paper will provide a general criterion to detect concurvity in nonlinear and non parametric regression models.
Zhao, Yu Xi; Xie, Ping; Sang, Yan Fang; Wu, Zi Yi
2018-04-01
Hydrological process evaluation is temporal dependent. Hydrological time series including dependence components do not meet the data consistency assumption for hydrological computation. Both of those factors cause great difficulty for water researches. Given the existence of hydrological dependence variability, we proposed a correlationcoefficient-based method for significance evaluation of hydrological dependence based on auto-regression model. By calculating the correlation coefficient between the original series and its dependence component and selecting reasonable thresholds of correlation coefficient, this method divided significance degree of dependence into no variability, weak variability, mid variability, strong variability, and drastic variability. By deducing the relationship between correlation coefficient and auto-correlation coefficient in each order of series, we found that the correlation coefficient was mainly determined by the magnitude of auto-correlation coefficient from the 1 order to p order, which clarified the theoretical basis of this method. With the first-order and second-order auto-regression models as examples, the reasonability of the deduced formula was verified through Monte-Carlo experiments to classify the relationship between correlation coefficient and auto-correlation coefficient. This method was used to analyze three observed hydrological time series. The results indicated the coexistence of stochastic and dependence characteristics in hydrological process.
Semiparametric Mixtures of Regressions with Single-index for Model Based Clustering
Xiang, Sijia; Yao, Weixin
2017-01-01
In this article, we propose two classes of semiparametric mixture regression models with single-index for model based clustering. Unlike many semiparametric/nonparametric mixture regression models that can only be applied to low dimensional predictors, the new semiparametric models can easily incorporate high dimensional predictors into the nonparametric components. The proposed models are very general, and many of the recently proposed semiparametric/nonparametric mixture regression models a...
The threshold of a stochastic SIQS epidemic model
Zhang, Xiao-Bing; Huo, Hai-Feng; Xiang, Hong; Shi, Qihong; Li, Dungang
2017-09-01
In this paper, we present the threshold of a stochastic SIQS epidemic model which determines the extinction and persistence of the disease. Furthermore, we find that noise can suppress the disease outbreak. Numerical simulations are also carried out to confirm the analytical results.
International Nuclear Information System (INIS)
Che Jinxing; Wang Jianzhou
2010-01-01
In this paper, we present the use of different mathematical models to forecast electricity price under deregulated power. A successful prediction tool of electricity price can help both power producers and consumers plan their bidding strategies. Inspired by that the support vector regression (SVR) model, with the ε-insensitive loss function, admits of the residual within the boundary values of ε-tube, we propose a hybrid model that combines both SVR and Auto-regressive integrated moving average (ARIMA) models to take advantage of the unique strength of SVR and ARIMA models in nonlinear and linear modeling, which is called SVRARIMA. A nonlinear analysis of the time-series indicates the convenience of nonlinear modeling, the SVR is applied to capture the nonlinear patterns. ARIMA models have been successfully applied in solving the residuals regression estimation problems. The experimental results demonstrate that the model proposed outperforms the existing neural-network approaches, the traditional ARIMA models and other hybrid models based on the root mean square error and mean absolute percentage error.
Modeling oil production based on symbolic regression
International Nuclear Information System (INIS)
Yang, Guangfei; Li, Xianneng; Wang, Jianliang; Lian, Lian; Ma, Tieju
2015-01-01
Numerous models have been proposed to forecast the future trends of oil production and almost all of them are based on some predefined assumptions with various uncertainties. In this study, we propose a novel data-driven approach that uses symbolic regression to model oil production. We validate our approach on both synthetic and real data, and the results prove that symbolic regression could effectively identify the true models beneath the oil production data and also make reliable predictions. Symbolic regression indicates that world oil production will peak in 2021, which broadly agrees with other techniques used by researchers. Our results also show that the rate of decline after the peak is almost half the rate of increase before the peak, and it takes nearly 12 years to drop 4% from the peak. These predictions are more optimistic than those in several other reports, and the smoother decline will provide the world, especially the developing countries, with more time to orchestrate mitigation plans. -- Highlights: •A data-driven approach has been shown to be effective at modeling the oil production. •The Hubbert model could be discovered automatically from data. •The peak of world oil production is predicted to appear in 2021. •The decline rate after peak is half of the increase rate before peak. •Oil production projected to decline 4% post-peak
Stock, Matt S; Mota, Jacob A
2017-12-01
Muscle fatigue is associated with diminished twitch force amplitude. We examined changes in the motor unit recruitment versus derecruitment threshold relationship during fatigue. Nine men (mean age = 26 years) performed repeated isometric contractions at 50% maximal voluntary contraction (MVC) knee extensor force until exhaustion. Surface electromyographic signals were detected from the vastus lateralis, and were decomposed into their constituent motor unit action potential trains. Motor unit recruitment and derecruitment thresholds and firing rates at recruitment and derecruitment were evaluated at the beginning, middle, and end of the protocol. On average, 15 motor units were studied per contraction. For the initial contraction, three subjects showed greater recruitment thresholds than derecruitment thresholds for all motor units. Five subjects showed greater recruitment thresholds than derecruitment thresholds for only low-threshold motor units at the beginning, with a mean cross-over of 31.6% MVC. As the muscle fatigued, many motor units were derecruited at progressively higher forces. In turn, decreased slopes and increased y-intercepts were observed. These shifts were complemented by increased firing rates at derecruitment relative to recruitment. As the vastus lateralis fatigued, the central nervous system's compensatory adjustments resulted in a shift of the regression line of the recruitment versus derecruitment threshold relationship. Copyright © 2017 IPEM. Published by Elsevier Ltd. All rights reserved.
Model performance analysis and model validation in logistic regression
Directory of Open Access Journals (Sweden)
Rosa Arboretti Giancristofaro
2007-10-01
Full Text Available In this paper a new model validation procedure for a logistic regression model is presented. At first, we illustrate a brief review of different techniques of model validation. Next, we define a number of properties required for a model to be considered "good", and a number of quantitative performance measures. Lastly, we describe a methodology for the assessment of the performance of a given model by using an example taken from a management study.
Model building strategy for logistic regression: purposeful selection.
Zhang, Zhongheng
2016-03-01
Logistic regression is one of the most commonly used models to account for confounders in medical literature. The article introduces how to perform purposeful selection model building strategy with R. I stress on the use of likelihood ratio test to see whether deleting a variable will have significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment of remaining covariates. Interaction should be checked to disentangle complex relationship between covariates and their synergistic effect on response variable. Model should be checked for the goodness-of-fit (GOF). In other words, how the fitted model reflects the real data. Hosmer-Lemeshow GOF test is the most widely used for logistic regression model.
The APT model as reduced-rank regression
Bekker, P.A.; Dobbelstein, P.; Wansbeek, T.J.
Integrating the two steps of an arbitrage pricing theory (APT) model leads to a reduced-rank regression (RRR) model. So the results on RRR can be used to estimate APT models, making estimation very simple. We give a succinct derivation of estimation of RRR, derive the asymptotic variance of RRR
Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William
2016-01-01
Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19
Influence diagnostics in meta-regression model.
Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua
2017-09-01
This paper studies the influence diagnostics in meta-regression model including case deletion diagnostic and local influence analysis. We derive the subset deletion formulae for the estimation of regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measure are defined. The local influence analysis based on case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme are explored. We introduce a method by simultaneous perturbing responses, covariate, and within-variance to obtain the local influence measure, which has an advantage of capable to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.
Modeling of Auditory Neuron Response Thresholds with Cochlear Implants
Directory of Open Access Journals (Sweden)
Frederic Venail
2015-01-01
Full Text Available The quality of the prosthetic-neural interface is a critical point for cochlear implant efficiency. It depends not only on technical and anatomical factors such as electrode position into the cochlea (depth and scalar placement, electrode impedance, and distance between the electrode and the stimulated auditory neurons, but also on the number of functional auditory neurons. The efficiency of electrical stimulation can be assessed by the measurement of e-CAP in cochlear implant users. In the present study, we modeled the activation of auditory neurons in cochlear implant recipients (nucleus device. The electrical response, measured using auto-NRT (neural responses telemetry algorithm, has been analyzed using multivariate regression with cubic splines in order to take into account the variations of insertion depth of electrodes amongst subjects as well as the other technical and anatomical factors listed above. NRT thresholds depend on the electrode squared impedance (β = −0.11 ± 0.02, P<0.01, the scalar placement of the electrodes (β = −8.50 ± 1.97, P<0.01, and the depth of insertion calculated as the characteristic frequency of auditory neurons (CNF. Distribution of NRT residues according to CNF could provide a proxy of auditory neurons functioning in implanted cochleas.
Logistic Regression Modeling of Diminishing Manufacturing Sources for Integrated Circuits
National Research Council Canada - National Science Library
Gravier, Michael
1999-01-01
.... The research identified logistic regression as a powerful tool for analysis of DMSMS and further developed twenty models attempting to identify the "best" way to model and predict DMSMS using logistic regression...
International Nuclear Information System (INIS)
Li, Yangfan; Li, Yi; Wu, Wei
2016-01-01
The concept of thresholds shows important implications for environmental and resource management. Here we derived potential landscape thresholds which indicated abrupt changes in water quality or the dividing points between exceeding and failing to meet national surface water quality standards for a rapidly urbanizing city on the Eastern Coast in China. The analysis of landscape thresholds was based on regression models linking each of the seven water quality variables to each of the six landscape metrics for this coupled land-water system. We found substantial and accelerating urban sprawl at the suburban areas between 2000 and 2008, and detected significant nonlinear relations between water quality and landscape pattern. This research demonstrated that a simple modeling technique could provide insights on environmental thresholds to support more-informed decision making in land use, water environmental and resilience management. - Graphical abstract: Fig. Threshold models and resilience management for water quality. Display Omitted - Highlights: • Coupling urbanization and water environmental system. • Developing threshold models of the coupled land-water systems. • Nonlinear relations between water quality variables and landscape metrics. • Enhancing resilience management of coastal rapid urbanization. - We develop environmental threshold models and provide their implications on resilience management for a coupled land-water system with rapid urbanization.
Mob control models of threshold collective behavior
Breer, Vladimir V; Rogatkin, Andrey D
2017-01-01
This book presents mathematical models of mob control with threshold (conformity) collective decision-making of the agents. Based on the results of analysis of the interconnection between the micro- and macromodels of active network structures, it considers the static (deterministic, stochastic and game-theoretic) and dynamic (discrete- and continuous-time) models of mob control, and highlights models of informational confrontation. Many of the results are applicable not only to mob control problems, but also to control problems arising in social groups, online social networks, etc. Aimed at researchers and practitioners, it is also a valuable resource for undergraduate and postgraduate students as well as doctoral candidates specializing in the field of collective behavior modeling.
Analysis of dental caries using generalized linear and count regression models
Directory of Open Access Journals (Sweden)
Javali M. Phil
2013-11-01
Full Text Available Generalized linear models (GLM are generalization of linear regression models, which allow fitting regression models to response data in all the sciences especially medical and dental sciences that follow a general exponential family. These are flexible and widely used class of such models that can accommodate response variables. Count data are frequently characterized by overdispersion and excess zeros. Zero-inflated count models provide a parsimonious yet powerful way to model this type of situation. Such models assume that the data are a mixture of two separate data generation processes: one generates only zeros, and the other is either a Poisson or a negative binomial data-generating process. Zero inflated count regression models such as the zero-inflated Poisson (ZIP, zero-inflated negative binomial (ZINB regression models have been used to handle dental caries count data with many zeros. We present an evaluation framework to the suitability of applying the GLM, Poisson, NB, ZIP and ZINB to dental caries data set where the count data may exhibit evidence of many zeros and over-dispersion. Estimation of the model parameters using the method of maximum likelihood is provided. Based on the Vuong test statistic and the goodness of fit measure for dental caries data, the NB and ZINB regression models perform better than other count regression models.
The threshold bias model: a mathematical model for the nomothetic approach of suicide.
Folly, Walter Sydney Dutra
2011-01-01
Comparative and predictive analyses of suicide data from different countries are difficult to perform due to varying approaches and the lack of comparative parameters. A simple model (the Threshold Bias Model) was tested for comparative and predictive analyses of suicide rates by age. The model comprises of a six parameter distribution that was applied to the USA suicide rates by age for the years 2001 and 2002. Posteriorly, linear extrapolations are performed of the parameter values previously obtained for these years in order to estimate the values corresponding to the year 2003. The calculated distributions agreed reasonably well with the aggregate data. The model was also used to determine the age above which suicide rates become statistically observable in USA, Brazil and Sri Lanka. The Threshold Bias Model has considerable potential applications in demographic studies of suicide. Moreover, since the model can be used to predict the evolution of suicide rates based on information extracted from past data, it will be of great interest to suicidologists and other researchers in the field of mental health.
Gauge threshold corrections for local string models
International Nuclear Information System (INIS)
Conlon, Joseph P.
2009-01-01
We study gauge threshold corrections for local brane models embedded in a large compact space. A large bulk volume gives important contributions to the Konishi and super-Weyl anomalies and the effective field theory analysis implies the unification scale should be enhanced in a model-independent way from M s to RM s . For local D3/D3 models this result is supported by the explicit string computations. In this case the scale RM s comes from the necessity of global cancellation of RR tadpoles sourced by the local model. We also study D3/D7 models and discuss discrepancies with the effective field theory analysis. We comment on phenomenological implications for gauge coupling unification and for the GUT scale.
AIRLINE ACTIVITY FORECASTING BY REGRESSION MODELS
Directory of Open Access Journals (Sweden)
Н. Білак
2012-04-01
Full Text Available Proposed linear and nonlinear regression models, which take into account the equation of trend and seasonality indices for the analysis and restore the volume of passenger traffic over the past period of time and its prediction for future years, as well as the algorithm of formation of these models based on statistical analysis over the years. The desired model is the first step for the synthesis of more complex models, which will enable forecasting of passenger (income level airline with the highest accuracy and time urgency.
[Application of detecting and taking overdispersion into account in Poisson regression model].
Bouche, G; Lepage, B; Migeot, V; Ingrand, P
2009-08-01
Researchers often use the Poisson regression model to analyze count data. Overdispersion can occur when a Poisson regression model is used, resulting in an underestimation of variance of the regression model parameters. Our objective was to take overdispersion into account and assess its impact with an illustration based on the data of a study investigating the relationship between use of the Internet to seek health information and number of primary care consultations. Three methods, overdispersed Poisson, a robust estimator, and negative binomial regression, were performed to take overdispersion into account in explaining variation in the number (Y) of primary care consultations. We tested overdispersion in the Poisson regression model using the ratio of the sum of Pearson residuals over the number of degrees of freedom (chi(2)/df). We then fitted the three models and compared parameter estimation to the estimations given by Poisson regression model. Variance of the number of primary care consultations (Var[Y]=21.03) was greater than the mean (E[Y]=5.93) and the chi(2)/df ratio was 3.26, which confirmed overdispersion. Standard errors of the parameters varied greatly between the Poisson regression model and the three other regression models. Interpretation of estimates from two variables (using the Internet to seek health information and single parent family) would have changed according to the model retained, with significant levels of 0.06 and 0.002 (Poisson), 0.29 and 0.09 (overdispersed Poisson), 0.29 and 0.13 (use of a robust estimator) and 0.45 and 0.13 (negative binomial) respectively. Different methods exist to solve the problem of underestimating variance in the Poisson regression model when overdispersion is present. The negative binomial regression model seems to be particularly accurate because of its theorical distribution ; in addition this regression is easy to perform with ordinary statistical software packages.
Variable Selection for Regression Models of Percentile Flows
Fouad, G.
2017-12-01
Percentile flows describe the flow magnitude equaled or exceeded for a given percent of time, and are widely used in water resource management. However, these statistics are normally unavailable since most basins are ungauged. Percentile flows of ungauged basins are often predicted using regression models based on readily observable basin characteristics, such as mean elevation. The number of these independent variables is too large to evaluate all possible models. A subset of models is typically evaluated using automatic procedures, like stepwise regression. This ignores a large variety of methods from the field of feature (variable) selection and physical understanding of percentile flows. A study of 918 basins in the United States was conducted to compare an automatic regression procedure to the following variable selection methods: (1) principal component analysis, (2) correlation analysis, (3) random forests, (4) genetic programming, (5) Bayesian networks, and (6) physical understanding. The automatic regression procedure only performed better than principal component analysis. Poor performance of the regression procedure was due to a commonly used filter for multicollinearity, which rejected the strongest models because they had cross-correlated independent variables. Multicollinearity did not decrease model performance in validation because of a representative set of calibration basins. Variable selection methods based strictly on predictive power (numbers 2-5 from above) performed similarly, likely indicating a limit to the predictive power of the variables. Similar performance was also reached using variables selected based on physical understanding, a finding that substantiates recent calls to emphasize physical understanding in modeling for predictions in ungauged basins. The strongest variables highlighted the importance of geology and land cover, whereas widely used topographic variables were the weakest predictors. Variables suffered from a high
Al-Asadi, H A; Al-Mansoori, M H; Ajiya, M; Hitam, S; Saripan, M I; Mahdi, M A
2010-10-11
We develop a theoretical model that can be used to predict stimulated Brillouin scattering (SBS) threshold in optical fibers that arises through the effect of Brillouin pump recycling technique. Obtained simulation results from our model are in close agreement with our experimental results. The developed model utilizes single mode optical fiber of different lengths as the Brillouin gain media. For 5-km long single mode fiber, the calculated threshold power for SBS is about 16 mW for conventional technique. This value is reduced to about 8 mW when the residual Brillouin pump is recycled at the end of the fiber. The decrement of SBS threshold is due to longer interaction lengths between Brillouin pump and Stokes wave.
International Nuclear Information System (INIS)
Huang, Bwo-Nung; Yang, C.W.; Hwang, M.J.
2009-01-01
This paper segments daily data from January of 1986 to April of 2007 into three periods based on certain important events. Both periods I and II indicate that the spot prices in general are higher than futures prices as was well-known in the literature. Only period-III (2001/9/11-2007/4/30) displays a reverse phenomenon: futures prices, in general, exceed spot prices. When the absolute value of a basis (futures-spot) is greater than the threshold value in the arbitrage area (regime 1 and 3), at least one of the error correction coefficients, representing adjustment towards equilibrium, is statistically significant. That is, there exists a tendency in the oil market in which prices move toward equilibrium. With respect to the short-run dynamic interaction between spot price change ((delta)s t ) and futures price change ((delta)f t ), our results indicate that when the spot price is higher than futures price, and the basis is less than certain threshold value (regime 3), there exists at least one causal relationship between (delta)s t and (delta)f t . Conversely, when the futures price is higher than spot price and the basis is higher than certain threshold value (regime 1), there exists at least one causal relationship between (delta)s t and (delta)f t . Finally, we use the method suggested by Diebold and Mariano [Diebold, Francis X., Mariano, Roberto S., 1995. Comparing predictive accuracy. Journal of Business and Economic Statistics 13 (3), 253-263] to compare the predictive power between the linear and nonlinear models. Our empirical results indicate that the in-sample prediction of the nonlinear model is clearly superior to that of the linear model. (author)
[Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].
Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L
2017-03-10
To evaluate the estimation of prevalence ratio ( PR ) by using bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea in their infants by using bayesian log-binomial regression model in Openbugs software. The results showed that caregivers' recognition of infant' s risk signs of diarrhea was associated significantly with a 13% increase of medical care-seeking. Meanwhile, we compared the differences in PR 's point estimation and its interval estimation of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea and convergence of three models (model 1: not adjusting for the covariates; model 2: adjusting for duration of caregivers' education, model 3: adjusting for distance between village and township and child month-age based on model 2) between bayesian log-binomial regression model and conventional log-binomial regression model. The results showed that all three bayesian log-binomial regression models were convergence and the estimated PRs were 1.130(95 %CI : 1.005-1.265), 1.128(95 %CI : 1.001-1.264) and 1.132(95 %CI : 1.004-1.267), respectively. Conventional log-binomial regression model 1 and model 2 were convergence and their PRs were 1.130(95 % CI : 1.055-1.206) and 1.126(95 % CI : 1.051-1.203), respectively, but the model 3 was misconvergence, so COPY method was used to estimate PR , which was 1.125 (95 %CI : 1.051-1.200). In addition, the point estimation and interval estimation of PRs from three bayesian log-binomial regression models differed slightly from those of PRs from conventional log-binomial regression model, but they had a good consistency in estimating PR . Therefore, bayesian log-binomial regression model can effectively estimate PR with less misconvergence and have more advantages in application compared with conventional log-binomial regression model.
Geographically Weighted Logistic Regression Applied to Credit Scoring Models
Directory of Open Access Journals (Sweden)
Pedro Henrique Melo Albuquerque
Full Text Available Abstract This study used real data from a Brazilian financial institution on transactions involving Consumer Direct Credit (CDC, granted to clients residing in the Distrito Federal (DF, to construct credit scoring models via Logistic Regression and Geographically Weighted Logistic Regression (GWLR techniques. The aims were: to verify whether the factors that influence credit risk differ according to the borrower’s geographic location; to compare the set of models estimated via GWLR with the global model estimated via Logistic Regression, in terms of predictive power and financial losses for the institution; and to verify the viability of using the GWLR technique to develop credit scoring models. The metrics used to compare the models developed via the two techniques were the AICc informational criterion, the accuracy of the models, the percentage of false positives, the sum of the value of false positive debt, and the expected monetary value of portfolio default compared with the monetary value of defaults observed. The models estimated for each region in the DF were distinct in their variables and coefficients (parameters, with it being concluded that credit risk was influenced differently in each region in the study. The Logistic Regression and GWLR methodologies presented very close results, in terms of predictive power and financial losses for the institution, and the study demonstrated viability in using the GWLR technique to develop credit scoring models for the target population in the study.
The threshold of a stochastic delayed SIR epidemic model with temporary immunity
Liu, Qun; Chen, Qingmei; Jiang, Daqing
2016-05-01
This paper is concerned with the asymptotic properties of a stochastic delayed SIR epidemic model with temporary immunity. Sufficient conditions for extinction and persistence in the mean of the epidemic are established. The threshold between persistence in the mean and extinction of the epidemic is obtained. Compared with the corresponding deterministic model, the threshold affected by the white noise is smaller than the basic reproduction number R0 of the deterministic system.
Modeling maximum daily temperature using a varying coefficient regression model
Han Li; Xinwei Deng; Dong-Yum Kim; Eric P. Smith
2014-01-01
Relationships between stream water and air temperatures are often modeled using linear or nonlinear regression methods. Despite a strong relationship between water and air temperatures and a variety of models that are effective for data summarized on a weekly basis, such models did not yield consistently good predictions for summaries such as daily maximum temperature...
Structured Additive Regression Models: An R Interface to BayesX
Directory of Open Access Journals (Sweden)
Nikolaus Umlauf
2015-02-01
Full Text Available Structured additive regression (STAR models provide a flexible framework for model- ing possible nonlinear effects of covariates: They contain the well established frameworks of generalized linear models and generalized additive models as special cases but also allow a wider class of effects, e.g., for geographical or spatio-temporal data, allowing for specification of complex and realistic models. BayesX is standalone software package providing software for fitting general class of STAR models. Based on a comprehensive open-source regression toolbox written in C++, BayesX uses Bayesian inference for estimating STAR models based on Markov chain Monte Carlo simulation techniques, a mixed model representation of STAR models, or stepwise regression techniques combining penalized least squares estimation with model selection. BayesX not only covers models for responses from univariate exponential families, but also models from less-standard regression situations such as models for multi-categorical responses with either ordered or unordered categories, continuous time survival data, or continuous time multi-state models. This paper presents a new fully interactive R interface to BayesX: the R package R2BayesX. With the new package, STAR models can be conveniently specified using Rs formula language (with some extended terms, fitted using the BayesX binary, represented in R with objects of suitable classes, and finally printed/summarized/plotted. This makes BayesX much more accessible to users familiar with R and adds extensive graphics capabilities for visualizing fitted STAR models. Furthermore, R2BayesX complements the already impressive capabilities for semiparametric regression in R by a comprehensive toolbox comprising in particular more complex response types and alternative inferential procedures such as simulation-based Bayesian inference.
Keith, Timothy Z
2014-01-01
Multiple Regression and Beyond offers a conceptually oriented introduction to multiple regression (MR) analysis and structural equation modeling (SEM), along with analyses that flow naturally from those methods. By focusing on the concepts and purposes of MR and related methods, rather than the derivation and calculation of formulae, this book introduces material to students more clearly, and in a less threatening way. In addition to illuminating content necessary for coursework, the accessibility of this approach means students are more likely to be able to conduct research using MR or SEM--and more likely to use the methods wisely. Covers both MR and SEM, while explaining their relevance to one another Also includes path analysis, confirmatory factor analysis, and latent growth modeling Figures and tables throughout provide examples and illustrate key concepts and techniques For additional resources, please visit: http://tzkeith.com/.
Time series regression model for infectious disease and weather.
Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro
2015-10-01
Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Linear regression crash prediction models : issues and proposed solutions.
2010-05-01
The paper develops a linear regression model approach that can be applied to : crash data to predict vehicle crashes. The proposed approach involves novice data aggregation : to satisfy linear regression assumptions; namely error structure normality ...
Modeling strong-field above-threshold ionization
International Nuclear Information System (INIS)
Sundaram, B.; Armstrong, L. Jr.
1990-01-01
Above-threshold ionization (ATI) by intense, short-pulse lasers is studied numerically, using the stretched hydrogen atom Hamiltonian. Within our model system, we isolate several mechanisms that contribute to the ATI process. These mechanisms, which involve both excited bound states and continuum states, all invoke intermediate, off-energy shell transitions. In particular, the importance of excited bound states and off-energy shell bound-free processes to the ionization mechanism are shown to relate to a simple physical criterion. These processes point to importance differences in the interpretation of ionization characteristics for short pulses from that for longer pulses. Our analysis concludes that although components of ATI admit of simple, few-state modeling, the ultimate synthesis points to a highly complex mechanism
Identification of Influential Points in a Linear Regression Model
Directory of Open Access Journals (Sweden)
Jan Grosz
2011-03-01
Full Text Available The article deals with the detection and identification of influential points in the linear regression model. Three methods of detection of outliers and leverage points are described. These procedures can also be used for one-sample (independentdatasets. This paper briefly describes theoretical aspects of several robust methods as well. Robust statistics is a powerful tool to increase the reliability and accuracy of statistical modelling and data analysis. A simulation model of the simple linear regression is presented.
Nguyen, Tri-Long; Collins, Gary S; Spence, Jessica; Daurès, Jean-Pierre; Devereaux, P J; Landais, Paul; Le Manach, Yannick
2017-04-28
Double-adjustment can be used to remove confounding if imbalance exists after propensity score (PS) matching. However, it is not always possible to include all covariates in adjustment. We aimed to find the optimal imbalance threshold for entering covariates into regression. We conducted a series of Monte Carlo simulations on virtual populations of 5,000 subjects. We performed PS 1:1 nearest-neighbor matching on each sample. We calculated standardized mean differences across groups to detect any remaining imbalance in the matched samples. We examined 25 thresholds (from 0.01 to 0.25, stepwise 0.01) for considering residual imbalance. The treatment effect was estimated using logistic regression that contained only those covariates considered to be unbalanced by these thresholds. We showed that regression adjustment could dramatically remove residual confounding bias when it included all of the covariates with a standardized difference greater than 0.10. The additional benefit was negligible when we also adjusted for covariates with less imbalance. We found that the mean squared error of the estimates was minimized under the same conditions. If covariate balance is not achieved, we recommend reiterating PS modeling until standardized differences below 0.10 are achieved on most covariates. In case of remaining imbalance, a double adjustment might be worth considering.
The art of regression modeling in road safety
Hauer, Ezra
2015-01-01
This unique book explains how to fashion useful regression models from commonly available data to erect models essential for evidence-based road safety management and research. Composed from techniques and best practices presented over many years of lectures and workshops, The Art of Regression Modeling in Road Safety illustrates that fruitful modeling cannot be done without substantive knowledge about the modeled phenomenon. Class-tested in courses and workshops across North America, the book is ideal for professionals, researchers, university professors, and graduate students with an interest in, or responsibilities related to, road safety. This book also: · Presents for the first time a powerful analytical tool for road safety researchers and practitioners · Includes problems and solutions in each chapter as well as data and spreadsheets for running models and PowerPoint presentation slides · Features pedagogy well-suited for graduate courses and workshops including problems, solutions, and PowerPoint p...
Directory of Open Access Journals (Sweden)
Hong-Juan Li
2013-04-01
Full Text Available Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by strong non-linear learning capability of support vector regression (SVR, this paper presents a SVR model hybridized with the empirical mode decomposition (EMD method and auto regression (AR for electric load forecasting. The electric load data of the New South Wales (Australia market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.
Robust geographically weighted regression of modeling the Air Polluter Standard Index (APSI)
Warsito, Budi; Yasin, Hasbi; Ispriyanti, Dwi; Hoyyi, Abdul
2018-05-01
The Geographically Weighted Regression (GWR) model has been widely applied to many practical fields for exploring spatial heterogenity of a regression model. However, this method is inherently not robust to outliers. Outliers commonly exist in data sets and may lead to a distorted estimate of the underlying regression model. One of solution to handle the outliers in the regression model is to use the robust models. So this model was called Robust Geographically Weighted Regression (RGWR). This research aims to aid the government in the policy making process related to air pollution mitigation by developing a standard index model for air polluter (Air Polluter Standard Index - APSI) based on the RGWR approach. In this research, we also consider seven variables that are directly related to the air pollution level, which are the traffic velocity, the population density, the business center aspect, the air humidity, the wind velocity, the air temperature, and the area size of the urban forest. The best model is determined by the smallest AIC value. There are significance differences between Regression and RGWR in this case, but Basic GWR using the Gaussian kernel is the best model to modeling APSI because it has smallest AIC.
Wesonga, Ronald; Nabugoomu, Fabian
2016-01-01
The study derives a framework for assessing airport efficiency through evaluating optimal arrival and departure delay thresholds. Assumptions of airport efficiency measurements, though based upon minimum numeric values such as 15 min of turnaround time, cannot be extrapolated to determine proportions of delay-days of an airport. This study explored the concept of delay threshold to determine the proportion of delay-days as an expansion of the theory of delay and our previous work. Data-driven approach using statistical modelling was employed to a limited set of determinants of daily delay at an airport. For the purpose of testing the efficacy of the threshold levels, operational data for Entebbe International Airport were used as a case study. Findings show differences in the proportions of delay at departure (μ = 0.499; 95 % CI = 0.023) and arrival (μ = 0.363; 95 % CI = 0.022). Multivariate logistic model confirmed an optimal daily departure and arrival delay threshold of 60 % for the airport given the four probable thresholds {50, 60, 70, 80}. The decision for the threshold value was based on the number of significant determinants, the goodness of fit statistics based on the Wald test and the area under the receiver operating curves. These findings propose a modelling framework to generate relevant information for the Air Traffic Management relevant in planning and measurement of airport operational efficiency.
Flexible competing risks regression modeling and goodness-of-fit
DEFF Research Database (Denmark)
Scheike, Thomas; Zhang, Mei-Jie
2008-01-01
In this paper we consider different approaches for estimation and assessment of covariate effects for the cumulative incidence curve in the competing risks model. The classic approach is to model all cause-specific hazards and then estimate the cumulative incidence curve based on these cause...... models that is easy to fit and contains the Fine-Gray model as a special case. One advantage of this approach is that our regression modeling allows for non-proportional hazards. This leads to a new simple goodness-of-fit procedure for the proportional subdistribution hazards assumption that is very easy...... of the flexible regression models to analyze competing risks data when non-proportionality is present in the data....
Modeling, Simulation, and Analysis of Novel Threshold Voltage Definition for Nano-MOSFET
Directory of Open Access Journals (Sweden)
Yashu Swami
2017-01-01
Full Text Available Threshold voltage (VTH is the indispensable vital parameter in MOSFET designing, modeling, and operation. Diverse expounds and extraction methods exist to model the on-off transition characteristics of the device. The governing gauge for efficient threshold voltage definition and extraction method can be itemized as clarity, simplicity, precision, and stability throughout the operating conditions and technology node. The outcomes of extraction methods diverge from the exact values due to various short-channel effects (SCEs and nonidealities present in the device. A new approach to define and extract the real value of VTH of MOSFET is proposed in the manuscript. The subsequent novel enhanced SCE-independent VTH extraction method named “hybrid extrapolation VTH extraction method” (HEEM is elaborated, modeled, and compared with few prevalent MOSFET threshold voltage extraction methods for validation of the results. All the results are verified by extensive 2D TCAD simulation and confirmed analytically at various technology nodes.
A phenomenological model on the kink mode threshold varying with the inclination of sheath boundary
International Nuclear Information System (INIS)
Sun, X.; Intrator, T. P.; Sears, J.; Weber, T.; Liu, M.
2013-01-01
In nature and many laboratory plasmas, a magnetic flux tube threaded by current or a flux rope has a footpoint at a boundary. The current driven kink mode is one of the fundamental ideal magnetohydrodynamic instabilities in plasmas. It has an instability threshold that has been found to strongly depend on boundary conditions (BCs). We provide a theoretical model to explain the transition of this threshold dependence between nonline tied and line tied boundary conditions. We evaluate model parameters using experimentally measured plasma data, explicitly verify several kink eigenfunctions, and validate the model predictions for boundary conditions BCs that span the range between NLT and LT BCs. Based on this model, one could estimate the kink threshold given knowledge of the displacement of a flux rope end, or conversely estimate flux rope end motion based on knowledge of it kink stability threshold
Maximum Entropy Discrimination Poisson Regression for Software Reliability Modeling.
Chatzis, Sotirios P; Andreou, Andreas S
2015-11-01
Reliably predicting software defects is one of the most significant tasks in software engineering. Two of the major components of modern software reliability modeling approaches are: 1) extraction of salient features for software system representation, based on appropriately designed software metrics and 2) development of intricate regression models for count data, to allow effective software reliability data modeling and prediction. Surprisingly, research in the latter frontier of count data regression modeling has been rather limited. More specifically, a lack of simple and efficient algorithms for posterior computation has made the Bayesian approaches appear unattractive, and thus underdeveloped in the context of software reliability modeling. In this paper, we try to address these issues by introducing a novel Bayesian regression model for count data, based on the concept of max-margin data modeling, effected in the context of a fully Bayesian model treatment with simple and efficient posterior distribution updates. Our novel approach yields a more discriminative learning technique, making more effective use of our training data during model inference. In addition, it allows of better handling uncertainty in the modeled data, which can be a significant problem when the training data are limited. We derive elegant inference algorithms for our model under the mean-field paradigm and exhibit its effectiveness using the publicly available benchmark data sets.
Prahutama, Alan; Suparti; Wahyu Utami, Tiani
2018-03-01
Regression analysis is an analysis to model the relationship between response variables and predictor variables. The parametric approach to the regression model is very strict with the assumption, but nonparametric regression model isn’t need assumption of model. Time series data is the data of a variable that is observed based on a certain time, so if the time series data wanted to be modeled by regression, then we should determined the response and predictor variables first. Determination of the response variable in time series is variable in t-th (yt), while the predictor variable is a significant lag. In nonparametric regression modeling, one developing approach is to use the Fourier series approach. One of the advantages of nonparametric regression approach using Fourier series is able to overcome data having trigonometric distribution. In modeling using Fourier series needs parameter of K. To determine the number of K can be used Generalized Cross Validation method. In inflation modeling for the transportation sector, communication and financial services using Fourier series yields an optimal K of 120 parameters with R-square 99%. Whereas if it was modeled by multiple linear regression yield R-square 90%.
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
General regression and representation model for classification.
Directory of Open Access Journals (Sweden)
Jianjun Qian
Full Text Available Recently, the regularized coding-based classification methods (e.g. SRC and CRC show a great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR for classification. GRR not only has advantages of CRC, but also takes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients and the specific information (weight matrix of image pixels to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel weights of the test sample. With the proposed model as a platform, we design two classifiers: basic general regression and representation classifier (B-GRR and robust general regression and representation classifier (R-GRR. The experimental results demonstrate the performance advantages of proposed methods over state-of-the-art algorithms.
A threshold model for Australian Stock Exchange equities
Bertram, William K.
2005-02-01
In this paper, we present a threshold model to describe the phenomena of zero return enhancement that is present in Australian Stock Exchange data. We examine the intraday behaviour of the ASX data and construct a new measure for the market activity using principal component analysis. We use this measure to create a business time scale that keeps the level of zero return enhancement constant throughout trading hours. Operating in this new time scale we fit the model to data for small and large time scales and find that the model affords an excellent approximation of the distribution of stock returns.
Multivariate Self-Exciting Threshold Autoregressive Models with eXogenous Input
Addo, Peter Martey
2014-01-01
This study defines a multivariate Self--Exciting Threshold Autoregressive with eXogenous input (MSETARX) models and present an estimation procedure for the parameters. The conditions for stationarity of the nonlinear MSETARX models is provided. In particular, the efficiency of an adaptive parameter estimation algorithm and LSE (least squares estimate) algorithm for this class of models is then provided via simulations.
International Nuclear Information System (INIS)
Mikhailovskii, A.B.; Shirokov, M.S.; Konovalov, S.V.; Tsypin, V.S.
2005-01-01
Transport threshold models of neoclassical tearing modes in tokamaks are investigated analytically. An analysis is made of the competition between strong transverse heat transport, on the one hand, and longitudinal heat transport, longitudinal heat convection, longitudinal inertial transport, and rotational transport, on the other hand, which leads to the establishment of the perturbed temperature profile in magnetic islands. It is shown that, in all these cases, the temperature profile can be found analytically by using rigorous solutions to the heat conduction equation in the near and far regions of a chain of magnetic islands and then by matching these solutions. Analytic expressions for the temperature profile are used to calculate the contribution of the bootstrap current to the generalized Rutherford equation for the island width evolution with the aim of constructing particular transport threshold models of neoclassical tearing modes. Four transport threshold models, differing in the underlying competing mechanisms, are analyzed: collisional, convective, inertial, and rotational models. The collisional model constructed analytically is shown to coincide exactly with that calculated numerically; the reason is that the analytical temperature profile turns out to be the same as the numerical profile. The results obtained can be useful in developing the next generation of general threshold models. The first steps toward such models have already been made
A flexible cure rate model with dependent censoring and a known cure threshold.
Bernhardt, Paul W
2016-11-10
We propose a flexible cure rate model that accommodates different censoring distributions for the cured and uncured groups and also allows for some individuals to be observed as cured when their survival time exceeds a known threshold. We model the survival times for the uncured group using an accelerated failure time model with errors distributed according to the seminonparametric distribution, potentially truncated at a known threshold. We suggest a straightforward extension of the usual expectation-maximization algorithm approach for obtaining estimates in cure rate models to accommodate the cure threshold and dependent censoring. We additionally suggest a likelihood ratio test for testing for the presence of dependent censoring in the proposed cure rate model. We show through numerical studies that our model has desirable properties and leads to approximately unbiased parameter estimates in a variety of scenarios. To demonstrate how our method performs in practice, we analyze data from a bone marrow transplantation study and a liver transplant study. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Energy Technology Data Exchange (ETDEWEB)
Arora, N D
1987-05-01
A simple and accurate semi-empirical model for the threshold voltage of a small geometry double implanted enhancement type MOSFET, especially useful in a circuit simulation program like SPICE, has been developed. The effect of short channel length and narrow width on the threshold voltage has been taken into account through a geometrical approximation, which involves parameters whose values can be determined from the curve fitting experimental data. A model for the temperature dependence of the threshold voltage for the implanted devices has also been presented. The temperature coefficient of the threshold voltage was found to change with decreasing channel length and width. Experimental results from various device sizes, both short and narrow, show very good agreement with the model. The model has been implemented in SPICE as part of the complete dc model.
A test for the parameters of multiple linear regression models ...
African Journals Online (AJOL)
A test for the parameters of multiple linear regression models is developed for conducting tests simultaneously on all the parameters of multiple linear regression models. The test is robust relative to the assumptions of homogeneity of variances and absence of serial correlation of the classical F-test. Under certain null and ...
Treatment of threshold retinopathy of prematurity
Directory of Open Access Journals (Sweden)
Deshpande Dhanashree
1998-01-01
Full Text Available This report deals with our experience in the management of threshold retinopathy of prematurity (ROP. A total of 45 eyes of 23 infants were subjected to treatment of threshold ROP. 26.1% of these infants had a birth weight of >l,500 gm. The preferred modality of treatment was laser indirect photocoagulation, which was facilitated by scleral depression. Cryopexy was done in cases with nondilating pupils or medial haze and was always under general anaesthesia. Retreatment with either modality was needed in 42.2% eyes; in this the skip areas were covered. Total regression of diseases was achieved in 91.1% eyes with no sequelae. All the 4 eyes that progressed to stage 5 despite treatment had zone 1 disease. Major treatment-induced complications did not occur in this series. This study underscores the importance of routine screening of infants upto 2,000 gm birth weight for ROP and the excellent response that is achieved with laser photocoagulation in inducing regression of threshold ROP. Laser is the preferred method of treatment in view of the absence of treatment-related morbidity to the premature infants.
Model-dependence of the CO2 threshold for melting the hard Snowball Earth
Directory of Open Access Journals (Sweden)
W. R. Peltier
2011-01-01
Full Text Available One of the critical issues of the Snowball Earth hypothesis is the CO2 threshold for triggering the deglaciation. Using Community Atmospheric Model version 3.0 (CAM3, we study the problem for the CO2 threshold. Our simulations show large differences from previous results (e.g. Pierrehumbert, 2004, 2005; Le Hir et al., 2007. At 0.2 bars of CO2, the January maximum near-surface temperature is about 268 K, about 13 K higher than that in Pierrehumbert (2004, 2005, but lower than the value of 270 K for 0.1 bar of CO2 in Le Hir et al. (2007. It is found that the difference of simulation results is mainly due to model sensitivity of greenhouse effect and longwave cloud forcing to increasing CO2. At 0.2 bars of CO2, CAM3 yields 117 Wm−2 of clear-sky greenhouse effect and 32 Wm−2 of longwave cloud forcing, versus only about 77 Wm−2 and 10.5 Wm−2 in Pierrehumbert (2004, 2005, respectively. CAM3 has comparable clear-sky greenhouse effect to that in Le Hir et al. (2007, but lower longwave cloud forcing. CAM3 also produces much stronger Hadley cells than that in Pierrehumbert (2005. Effects of pressure broadening and collision-induced absorption are also studied using a radiative-convective model and CAM3. Both effects substantially increase surface temperature and thus lower the CO2 threshold. The radiative-convective model yields a CO2 threshold of about 0.21 bars with surface albedo of 0.663. Without considering the effects of pressure broadening and collision-induced absorption, CAM3 yields an approximate CO2 threshold of about 1.0 bar for surface albedo of about 0.6. However, the threshold is lowered to 0.38 bars as both effects are considered.
Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.
Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko
2016-03-01
In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.
Stochasticity thresholds in the Fermi-Pasta-Ulam model
Energy Technology Data Exchange (ETDEWEB)
Callegari, B [Ferrara Univ. (Italy). Ist. di Matematica; Carotta, M C; Ferrario, C [Ferrara Univ. (Italy). Ist. di Fisica; Lo Vecchio, G [Ferrara Univ. (Italy). Ist. di Fisica; Gruppo Nazionale di Struttura della Materia, Ferrara (Italy)); Galgani, L [Milan Univ. (Italy). Ist. di Fisica; Milan Univ. (Italy). Ist. di Matematica)
1979-12-11
The authors consider the celebrated model of Fermi, Pasta and Ulam and give a numerical estimate for its thresholds of stochasticity, thus determining a critical energy as a function of the frequency of the corresponding oscillators. The results turn out to be qualitatively similar to those already obtained for a chain of particles with nearest-neighbour Lennard-Jones interaction potential.
Stochasticity thresholds in the Fermi-Pasta-Ulam model
International Nuclear Information System (INIS)
Callegari, B.; Galgani, L.; Milan Univ.
1979-01-01
The authors consider the celebrated model of Fermi, Pasta and Ulam and give a numerical estimate for its thresholds of stochasticity, thus determining a critical energy as a function of the frequency of the corresponding oscillators. The results turn out to be qualitatively similar to those already obtained for a chain of particles with nearest-neighbour Lennard-Jones interaction potential. (author)
Hilbe, Joseph M
2009-01-01
This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect-great clarity.The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … .-Annette J. Dobson, Biometric...
Linking Simple Economic Theory Models and the Cointegrated Vector AutoRegressive Model
DEFF Research Database (Denmark)
Møller, Niels Framroze
This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector-Auto-Regressive model (CVAR). By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail, how the theoretical model and its stru....... Further fundamental extensions and advances to more sophisticated theory models, such as those related to dynamics and expectations (in the structural relations) are left for future papers......This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector-Auto-Regressive model (CVAR). By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail, how the theoretical model and its......, it is demonstrated how other controversial hypotheses such as Rational Expectations can be formulated directly as restrictions on the CVAR-parameters. A simple example of a "Neoclassical synthetic" AS-AD model is also formulated. Finally, the partial- general equilibrium distinction is related to the CVAR as well...
Multiple Response Regression for Gaussian Mixture Models with Known Labels.
Lee, Wonyul; Du, Ying; Sun, Wei; Hayes, D Neil; Liu, Yufeng
2012-12-01
Multiple response regression is a useful regression technique to model multiple response variables using the same set of predictor variables. Most existing methods for multiple response regression are designed for modeling homogeneous data. In many applications, however, one may have heterogeneous data where the samples are divided into multiple groups. Our motivating example is a cancer dataset where the samples belong to multiple cancer subtypes. In this paper, we consider modeling the data coming from a mixture of several Gaussian distributions with known group labels. A naive approach is to split the data into several groups according to the labels and model each group separately. Although it is simple, this approach ignores potential common structures across different groups. We propose new penalized methods to model all groups jointly in which the common and unique structures can be identified. The proposed methods estimate the regression coefficient matrix, as well as the conditional inverse covariance matrix of response variables. Asymptotic properties of the proposed methods are explored. Through numerical examples, we demonstrate that both estimation and prediction can be improved by modeling all groups jointly using the proposed methods. An application to a glioblastoma cancer dataset reveals some interesting common and unique gene relationships across different cancer subtypes.
Faraway, Julian J
2005-01-01
Linear models are central to the practice of statistics and form the foundation of a vast range of statistical methodologies. Julian J. Faraway''s critically acclaimed Linear Models with R examined regression and analysis of variance, demonstrated the different methods available, and showed in which situations each one applies. Following in those footsteps, Extending the Linear Model with R surveys the techniques that grow from the regression model, presenting three extensions to that framework: generalized linear models (GLMs), mixed effect models, and nonparametric regression models. The author''s treatment is thoroughly modern and covers topics that include GLM diagnostics, generalized linear mixed models, trees, and even the use of neural networks in statistics. To demonstrate the interplay of theory and practice, throughout the book the author weaves the use of the R software environment to analyze the data of real examples, providing all of the R commands necessary to reproduce the analyses. All of the ...
Optical Associative Memory Model With Threshold Modification Using Complementary Vector
Bian, Shaoping; Xu, Kebin; Hong, Jing
1989-02-01
A new criterion to evaluate the similarity between two vectors in associative memory is presented. According to it, an experimental research about optical associative memory model with threshold modification using complementary vector is carried out. This model is capable of eliminating the posibility to recall erroneously. Therefore the accuracy of reading out is improved.
Critical thresholds for eventual extinction in randomly disturbed population growth models.
Peckham, Scott D; Waymire, Edward C; De Leenheer, Patrick
2018-02-16
This paper considers several single species growth models featuring a carrying capacity, which are subject to random disturbances that lead to instantaneous population reduction at the disturbance times. This is motivated in part by growing concerns about the impacts of climate change. Our main goal is to understand whether or not the species can persist in the long run. We consider the discrete-time stochastic process obtained by sampling the system immediately after the disturbances, and find various thresholds for several modes of convergence of this discrete process, including thresholds for the absence or existence of a positively supported invariant distribution. These thresholds are given explicitly in terms of the intensity and frequency of the disturbances on the one hand, and the population's growth characteristics on the other. We also perform a similar threshold analysis for the original continuous-time stochastic process, and obtain a formula that allows us to express the invariant distribution for this continuous-time process in terms of the invariant distribution of the discrete-time process, and vice versa. Examples illustrate that these distributions can differ, and this sends a cautionary message to practitioners who wish to parameterize these and related models using field data. Our analysis relies heavily on a particular feature shared by all the deterministic growth models considered here, namely that their solutions exhibit an exponentially weighted averaging property between a function of the initial condition, and the same function applied to the carrying capacity. This property is due to the fact that these systems can be transformed into affine systems.
Bayesian approach to errors-in-variables in regression models
Rozliman, Nur Aainaa; Ibrahim, Adriana Irawati Nur; Yunus, Rossita Mohammad
2017-05-01
In many applications and experiments, data sets are often contaminated with error or mismeasured covariates. When at least one of the covariates in a model is measured with error, Errors-in-Variables (EIV) model can be used. Measurement error, when not corrected, would cause misleading statistical inferences and analysis. Therefore, our goal is to examine the relationship of the outcome variable and the unobserved exposure variable given the observed mismeasured surrogate by applying the Bayesian formulation to the EIV model. We shall extend the flexible parametric method proposed by Hossain and Gustafson (2009) to another nonlinear regression model which is the Poisson regression model. We shall then illustrate the application of this approach via a simulation study using Markov chain Monte Carlo sampling methods.
The liability threshold model for censored twin data
DEFF Research Database (Denmark)
Holst, Klaus K.; Scheike, Thomas; Hjelmborg, Jacob B.
2016-01-01
the disease thus still being at risk. Ignoring this right-censoring can lead to severely biased estimates. The classical liability threshold model can be extended with inverse probability of censoring weighting of complete observations. This leads to a flexible way of modelling twin concordance and obtaining...... studies of diseases, as a way of quantifying such genetic contribution. The endpoint in these studies are typically defined as occurrence of a disease versus death without the disease. However, a large fraction of the subjects may still be alive at the time of follow-up without having experienced...
Song, Chao; Kwan, Mei-Po; Zhu, Jiping
2017-04-08
An increasing number of fires are occurring with the rapid development of cities, resulting in increased risk for human beings and the environment. This study compares geographically weighted regression-based models, including geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), which integrates spatial and temporal effects and global linear regression models (LM) for modeling fire risk at the city scale. The results show that the road density and the spatial distribution of enterprises have the strongest influences on fire risk, which implies that we should focus on areas where roads and enterprises are densely clustered. In addition, locations with a large number of enterprises have fewer fire ignition records, probably because of strict management and prevention measures. A changing number of significant variables across space indicate that heterogeneity mainly exists in the northern and eastern rural and suburban areas of Hefei city, where human-related facilities or road construction are only clustered in the city sub-centers. GTWR can capture small changes in the spatiotemporal heterogeneity of the variables while GWR and LM cannot. An approach that integrates space and time enables us to better understand the dynamic changes in fire risk. Thus governments can use the results to manage fire safety at the city scale.
Modeling jointly low, moderate, and heavy rainfall intensities without a threshold selection
Naveau, Philippe
2016-04-09
In statistics, extreme events are often defined as excesses above a given large threshold. This definition allows hydrologists and flood planners to apply Extreme-Value Theory (EVT) to their time series of interest. Even in the stationary univariate context, this approach has at least two main drawbacks. First, working with excesses implies that a lot of observations (those below the chosen threshold) are completely disregarded. The range of precipitation is artificially shopped down into two pieces, namely large intensities and the rest, which necessarily imposes different statistical models for each piece. Second, this strategy raises a nontrivial and very practical difficultly: how to choose the optimal threshold which correctly discriminates between low and heavy rainfall intensities. To address these issues, we propose a statistical model in which EVT results apply not only to heavy, but also to low precipitation amounts (zeros excluded). Our model is in compliance with EVT on both ends of the spectrum and allows a smooth transition between the two tails, while keeping a low number of parameters. In terms of inference, we have implemented and tested two classical methods of estimation: likelihood maximization and probability weighed moments. Last but not least, there is no need to choose a threshold to define low and high excesses. The performance and flexibility of this approach are illustrated on simulated and hourly precipitation recorded in Lyon, France.
Direction of Effects in Multiple Linear Regression Models.
Wiedermann, Wolfgang; von Eye, Alexander
2015-01-01
Previous studies analyzed asymmetric properties of the Pearson correlation coefficient using higher than second order moments. These asymmetric properties can be used to determine the direction of dependence in a linear regression setting (i.e., establish which of two variables is more likely to be on the outcome side) within the framework of cross-sectional observational data. Extant approaches are restricted to the bivariate regression case. The present contribution extends the direction of dependence methodology to a multiple linear regression setting by analyzing distributional properties of residuals of competing multiple regression models. It is shown that, under certain conditions, the third central moments of estimated regression residuals can be used to decide upon direction of effects. In addition, three different approaches for statistical inference are discussed: a combined D'Agostino normality test, a skewness difference test, and a bootstrap difference test. Type I error and power of the procedures are assessed using Monte Carlo simulations, and an empirical example is provided for illustrative purposes. In the discussion, issues concerning the quality of psychological data, possible extensions of the proposed methods to the fourth central moment of regression residuals, and potential applications are addressed.
Li, Yangfan; Li, Yi; Wu, Wei
2016-01-01
The concept of thresholds shows important implications for environmental and resource management. Here we derived potential landscape thresholds which indicated abrupt changes in water quality or the dividing points between exceeding and failing to meet national surface water quality standards for a rapidly urbanizing city on the Eastern Coast in China. The analysis of landscape thresholds was based on regression models linking each of the seven water quality variables to each of the six landscape metrics for this coupled land-water system. We found substantial and accelerating urban sprawl at the suburban areas between 2000 and 2008, and detected significant nonlinear relations between water quality and landscape pattern. This research demonstrated that a simple modeling technique could provide insights on environmental thresholds to support more-informed decision making in land use, water environmental and resilience management. Copyright © 2015 Elsevier Ltd. All rights reserved.
[Polygenic threshold model and the phenogenetic aspects of human fingerprints].
Voĭtenko, V P; Poliukhov, A M
1981-01-01
Based on the existence of the two genetic complexes determining finger prints (SU - spiral and SR - despiral), the two-compartment multithreshold polygenic model for systematization of finger prints has been proposed. It was found that the radial loop is genotypically not identical to the ulnar loop, as it was thought before, but differs very much from the latter by its print. The relative height of thresholds for each of 10 fingers has been measured. The two embryonal gradients have been established: one with a positive, and the other with a negative correlation between the threshold heights.
Meaney, Christopher; Moineddin, Rahim
2014-01-24
In biomedical research, response variables are often encountered which have bounded support on the open unit interval--(0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies may include: beta regression, variable-dispersion beta regression, and fractional logit regression models. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model to that of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models. In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided. If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25) linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed if the
Tutorial on Using Regression Models with Count Outcomes Using R
Directory of Open Access Journals (Sweden)
A. Alexander Beaujean
2016-02-01
Full Text Available Education researchers often study count variables, such as times a student reached a goal, discipline referrals, and absences. Most researchers that study these variables use typical regression methods (i.e., ordinary least-squares either with or without transforming the count variables. In either case, using typical regression for count data can produce parameter estimates that are biased, thus diminishing any inferences made from such data. As count-variable regression models are seldom taught in training programs, we present a tutorial to help educational researchers use such methods in their own research. We demonstrate analyzing and interpreting count data using Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial regression models. The count regression methods are introduced through an example using the number of times students skipped class. The data for this example are freely available and the R syntax used run the example analyses are included in the Appendix.
Hybrid data mining-regression for infrastructure risk assessment based on zero-inflated data
International Nuclear Information System (INIS)
Guikema, S.D.; Quiring, S.M.
2012-01-01
Infrastructure disaster risk assessment seeks to estimate the probability of a given customer or area losing service during a disaster, sometimes in conjunction with estimating the duration of each outage. This is often done on the basis of past data about the effects of similar events impacting the same or similar systems. In many situations this past performance data from infrastructure systems is zero-inflated; it has more zeros than can be appropriately modeled with standard probability distributions. The data are also often non-linear and exhibit threshold effects due to the complexities of infrastructure system performance. Standard zero-inflated statistical models such as zero-inflated Poisson and zero-inflated negative binomial regression models do not adequately capture these complexities. In this paper we develop a novel method that is a hybrid classification tree/regression method for complex, zero-inflated data sets. We investigate its predictive accuracy based on a large number of simulated data sets and then demonstrate its practical usefulness with an application to hurricane power outage risk assessment for a large utility based on actual data from the utility. While formulated for infrastructure disaster risk assessment, this method is promising for data-driven analysis for other situations with zero-inflated, complex data exhibiting response thresholds.
Distributed Monitoring of the R(sup 2) Statistic for Linear Regression
Bhaduri, Kanishka; Das, Kamalika; Giannella, Chris R.
2011-01-01
The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and one or more dependent target variables. This problem becomes challenging for large scale data in a distributed computing environment when only a subset of instances is available at individual nodes and the local data changes frequently. Data centralization and periodic model recomputation can add high overhead to tasks like anomaly detection in such dynamic settings. Therefore, the goal is to develop techniques for monitoring and updating the model over the union of all nodes data in a communication-efficient fashion. Correctness guarantees on such techniques are also often highly desirable, especially in safety-critical application scenarios. In this paper we develop DReMo a distributed algorithm with very low resource overhead, for monitoring the quality of a regression model in terms of its coefficient of determination (R2 statistic). When the nodes collectively determine that R2 has dropped below a fixed threshold, the linear regression model is recomputed via a network-wide convergecast and the updated model is broadcast back to all nodes. We show empirically, using both synthetic and real data, that our proposed method is highly communication-efficient and scalable, and also provide theoretical guarantees on correctness.
Verachtert, E.; Den Eeckhaut, M. Van; Poesen, J.; Govers, G.; Deckers, J.
2011-07-01
Soil piping (tunnel erosion) has been recognised as an important erosion process in collapsible loess-derived soils of temperate humid climates, which can cause collapse of the topsoil and formation of discontinuous gullies. Information about the spatial patterns of collapsed pipes and regional models describing these patterns is still limited. Therefore, this study aims at better understanding the factors controlling the spatial distribution and predicting pipe collapse. A dataset with parcels suffering from collapsed pipes (n = 560) and parcels without collapsed pipes was obtained through a regional survey in a 236 km² study area in the Flemish Ardennes (Belgium). Logistic regression was applied to find the best model describing the relationship between the presence/absence of a collapsed pipe and a set of independent explanatory variables (i.e. slope gradient, drainage area, distance-to-thalweg, curvature, aspect, soil type and lithology). Special attention was paid to the selection procedure of the grid cells without collapsed pipes. Apart from the first piping susceptibility map created by logistic regression modelling, a second map was made based on topographical thresholds of slope gradient and upslope drainage area. The logistic regression model allowed identification of the most important factors controlling pipe collapse. Pipes are much more likely to occur when a topographical threshold depending on both slope gradient and upslope area is exceeded in zones with a sufficient water supply (due to topographical convergence and/or the presence of a clay-rich lithology). On the other hand, the use of slope-area thresholds only results in reasonable predictions of piping susceptibility, with minimum information.
A Threshold Model of Social Support, Adjustment, and Distress after Breast Cancer Treatment
Mallinckrodt, Brent; Armer, Jane M.; Heppner, P. Paul
2012-01-01
This study examined a threshold model that proposes that social support exhibits a curvilinear association with adjustment and distress, such that support in excess of a critical threshold level has decreasing incremental benefits. Women diagnosed with a first occurrence of breast cancer (N = 154) completed survey measures of perceived support…
Statistical approach for selection of regression model during validation of bioanalytical method
Directory of Open Access Journals (Sweden)
Natalija Nakov
2014-06-01
Full Text Available The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during the bionalytical method validation. Given the wide concentration range, frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit using nonparametric statistical tests: One sample rank test and Wilcoxon signed rank test for two independent groups of samples. The results obtained with One sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because slight differences between the error (presented through the relative residuals were obtained. Estimation of the significance of the differences in the RR was achieved using Wilcoxon signed rank test, where linear and quadratic regression models were treated as two independent groups. The application of this simple non-parametric statistical test provides statistical confirmation of the choice of an adequate regression model.
Linear regression models for quantitative assessment of left ...
African Journals Online (AJOL)
Changes in left ventricular structures and function have been reported in cardiomyopathies. No prediction models have been established in this environment. This study established regression models for prediction of left ventricular structures in normal subjects. A sample of normal subjects was drawn from a large urban ...
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
Poisson regression for modeling count and frequency outcomes in trauma research.
Gagnon, David R; Doron-LaMarca, Susan; Bell, Margret; O'Farrell, Timothy J; Taft, Casey T
2008-10-01
The authors describe how the Poisson regression method for analyzing count or frequency outcome variables can be applied in trauma studies. The outcome of interest in trauma research may represent a count of the number of incidents of behavior occurring in a given time interval, such as acts of physical aggression or substance abuse. Traditional linear regression approaches assume a normally distributed outcome variable with equal variances over the range of predictor variables, and may not be optimal for modeling count outcomes. An application of Poisson regression is presented using data from a study of intimate partner aggression among male patients in an alcohol treatment program and their female partners. Results of Poisson regression and linear regression models are compared.
Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate
Directory of Open Access Journals (Sweden)
Minh Vu Trieu
2017-03-01
Full Text Available This paper presents statistical analyses of rock engineering properties and the measured penetration rate of tunnel boring machine (TBM based on the data of an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS, Brazilian tensile strength (BTS, rock brittleness index (BI, the distance between planes of weakness (DPW, and the alpha angle (Alpha between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP. Four (4 statistical regression models (two linear and two nonlinear are built to predict the ROP of TBM. Finally a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict the TBM performance. The R-squared value (R2 of the fuzzy logic model scores the highest value of 0.714 over the second runner-up of 0.667 from the multiple variables nonlinear regression model.
Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate
Minh, Vu Trieu; Katushin, Dmitri; Antonov, Maksim; Veinthal, Renno
2017-03-01
This paper presents statistical analyses of rock engineering properties and the measured penetration rate of tunnel boring machine (TBM) based on the data of an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), rock brittleness index (BI), the distance between planes of weakness (DPW), and the alpha angle (Alpha) between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP). Four (4) statistical regression models (two linear and two nonlinear) are built to predict the ROP of TBM. Finally a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict the TBM performance. The R-squared value (R2) of the fuzzy logic model scores the highest value of 0.714 over the second runner-up of 0.667 from the multiple variables nonlinear regression model.
Deep ensemble learning of sparse regression models for brain disease diagnosis.
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2017-04-01
Recent studies on brain imaging analysis witnessed the core roles of machine learning techniques in computer-assisted intervention for brain disease diagnosis. Of various machine-learning techniques, sparse regression models have proved their effectiveness in handling high-dimensional data but with a small number of training samples, especially in medical problems. In the meantime, deep learning methods have been making great successes by outperforming the state-of-the-art performances in various applications. In this paper, we propose a novel framework that combines the two conceptually different methods of sparse regression and deep learning for Alzheimer's disease/mild cognitive impairment diagnosis and prognosis. Specifically, we first train multiple sparse regression models, each of which is trained with different values of a regularization control parameter. Thus, our multiple sparse regression models potentially select different feature subsets from the original feature set; thereby they have different powers to predict the response values, i.e., clinical label and clinical scores in our work. By regarding the response values from our sparse regression models as target-level representations, we then build a deep convolutional neural network for clinical decision making, which thus we call 'Deep Ensemble Sparse Regression Network.' To our best knowledge, this is the first work that combines sparse regression models with deep neural network. In our experiments with the ADNI cohort, we validated the effectiveness of the proposed method by achieving the highest diagnostic accuracies in three classification tasks. We also rigorously analyzed our results and compared with the previous studies on the ADNI cohort in the literature. Copyright © 2017 Elsevier B.V. All rights reserved.
Modeling of Volatility with Non-linear Time Series Model
Kim Song Yon; Kim Mun Chol
2013-01-01
In this paper, non-linear time series models are used to describe volatility in financial time series data. To describe volatility, two of the non-linear time series are combined into form TAR (Threshold Auto-Regressive Model) with AARCH (Asymmetric Auto-Regressive Conditional Heteroskedasticity) error term and its parameter estimation is studied.
A computational approach to compare regression modelling strategies in prediction research.
Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H
2016-08-25
It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. © 2013, The International Biometric
A threshold-voltage model for small-scaled GaAs nMOSFET with stacked high-k gate dielectric
International Nuclear Information System (INIS)
Liu Chaowen; Xu Jingping; Liu Lu; Lu Hanhan; Huang Yuan
2016-01-01
A threshold-voltage model for a stacked high-k gate dielectric GaAs MOSFET is established by solving a two-dimensional Poisson's equation in channel and considering the short-channel, DIBL and quantum effects. The simulated results are in good agreement with the Silvaco TCAD data, confirming the correctness and validity of the model. Using the model, impacts of structural and physical parameters of the stack high-k gate dielectric on the threshold-voltage shift and the temperature characteristics of the threshold voltage are investigated. The results show that the stacked gate dielectric structure can effectively suppress the fringing-field and DIBL effects and improve the threshold and temperature characteristics, and on the other hand, the influence of temperature on the threshold voltage is overestimated if the quantum effect is ignored. (paper)
Removing Malmquist bias from linear regressions
Verter, Frances
1993-01-01
Malmquist bias is present in all astronomical surveys where sources are observed above an apparent brightness threshold. Those sources which can be detected at progressively larger distances are progressively more limited to the intrinsically luminous portion of the true distribution. This bias does not distort any of the measurements, but distorts the sample composition. We have developed the first treatment to correct for Malmquist bias in linear regressions of astronomical data. A demonstration of the corrected linear regression that is computed in four steps is presented.
Sri Lankan FRAX model and country-specific intervention thresholds.
Lekamwasam, Sarath
2013-01-01
There is a wide variation in fracture probabilities estimated by Asian FRAX models, although the outputs of South Asian models are concordant. Clinicians can choose either fixed or age-specific intervention thresholds when making treatment decisions in postmenopausal women. Cost-effectiveness of such approach, however, needs to be addressed. This study examined suitable fracture probability intervention thresholds (ITs) for Sri Lanka, based on the Sri Lankan FRAX model. Fracture probabilities were estimated using all Asian FRAX models for a postmenopausal woman of BMI 25 kg/m² and has no clinical risk factors apart from a fragility fracture, and they were compared. Age-specific ITs were estimated based on the Sri Lankan FRAX model using the method followed by the National Osteoporosis Guideline Group in the UK. Using the age-specific ITs as the reference standard, suitable fixed ITs were also estimated. Fracture probabilities estimated by different Asian FRAX models varied widely. Japanese and Taiwan models showed higher fracture probabilities while Chinese, Philippine, and Indonesian models gave lower fracture probabilities. Output of remaining FRAX models were generally similar. Age-specific ITs of major osteoporotic fracture probabilities (MOFP) based on the Sri Lankan FRAX model varied from 2.6 to 18% between 50 and 90 years. ITs of hip fracture probabilities (HFP) varied from 0.4 to 6.5% between 50 and 90 years. In finding fixed ITs, MOFP of 11% and HFP of 3.5% gave the lowest misclassification and highest agreement. Sri Lankan FRAX model behaves similar to other Asian FRAX models such as Indian, Singapore-Indian, Thai, and South Korean. Clinicians may use either the fixed or age-specific ITs in making therapeutic decisions in postmenopausal women. The economical aspects of such decisions, however, need to be considered.
International Nuclear Information System (INIS)
Roy, Prasun K.; Dutta Majumder, D.; Biswas, Jaydip
1999-01-01
Spontaneous regression of malignant tumours without treatment is a most enigmatic phenomenon with immense therapeutic potentialities. We analyse such cases to find that the commonest cause is a preceding episode of high fever-induced thermal fluctuation which produce fluctuation of biochemical and immunological parameters. Using Prigogine-Glansdorff thermodynamic stability formalism and biocybernetic principles, we develop the theoretical foundation of tumour regression induced by thermal, radiational or oxygenational fluctuations. For regression, a preliminary threshold condition of fluctuations is derived, namely σ > 2.83. We present some striking confirmation of such fluctuation-induced regression of various therapy-resistant masses as Ewing tumour, neurogranuloma and Lewis lung carcinoma by utilising σ > 2.83. Our biothermodynamic stability model of malignancy appears to illuminate the marked increase of aggressiveness of mammalian malignancy which occurred around 250 million years ago when homeothermic warm-blooded pre-mammals evolved. Using experimental data, we propose a novel approach of multi-modal hyper-fluctuation therapy involving modulation of radiotherapeutic hyper-fractionation, temperature, radiothermy and immune-status. (author)
DEFF Research Database (Denmark)
Mmbando, Bruno P; Lusingu, John P; Vestergaard, Lasse S
2009-01-01
BACKGROUND: In Sub-Sahara Africa, malaria due to Plasmodium falciparum is the main cause of ill health. Evaluation of malaria interventions, such as drugs and vaccines depends on clinical definition of the disease, which is still a challenge due to lack of distinct malaria specific clinical...... features. Parasite threshold is used in definition of clinical malaria in evaluation of interventions. This however, is likely to be influenced by other factors such as transmission intensity as well as individual level of immunity against malaria. METHODS: This paper describes step function and dose...... response model with threshold parameter as a tool for estimation of parasite threshold for onset of malaria fever in highlands (low transmission) and lowlands (high transmission intensity) strata. These models were fitted using logistic regression stratified by strata and age groups (0-1, 2-3, 4-5, 6...
A generalized right truncated bivariate Poisson regression model with applications to health data.
Islam, M Ataharul; Chowdhury, Rafiqul I
2017-01-01
A generalized right truncated bivariate Poisson regression model is proposed in this paper. Estimation and tests for goodness of fit and over or under dispersion are illustrated for both untruncated and right truncated bivariate Poisson regression models using marginal-conditional approach. Estimation and test procedures are illustrated for bivariate Poisson regression models with applications to Health and Retirement Study data on number of health conditions and the number of health care services utilized. The proposed test statistics are easy to compute and it is evident from the results that the models fit the data very well. A comparison between the right truncated and untruncated bivariate Poisson regression models using the test for nonnested models clearly shows that the truncated model performs significantly better than the untruncated model.
Directory of Open Access Journals (Sweden)
K.P. Pradhan
2015-12-01
Full Text Available In this paper, an analytical threshold voltage model is proposed for a cylindrical gate-all-around (CGAA MOSFET by solving the 2-D Poisson’s equation in the cylindrical coordinate system. A comparison is made for both the center and the surface potential model of CGAA MOSFET. This paper claims that the calculation of threshold voltage using center potential is more accurate rather than the calculation from surface potential. The effects of the device parameters like the drain bias (VDS, oxide thickness (tox, channel thickness (r, etc., on the threshold voltage are also studied in this paper. The model is verified with 3D numerical device simulator Sentaurus from Synopsys Inc.
International Nuclear Information System (INIS)
Mungan, M.; Coppersmith, S.; Vinokur, V.M.
1999-01-01
We analyze the strains near threshold in 1-d charge density wave models at zero temperature and strong pinning. We show that in these models local strains diverge near the depinning threshold and characterize the scaling behavior of the phenomenon. This helps quantify when the underlying elastic description breaks down and plastic effects have to be included
Hierarchical Neural Regression Models for Customer Churn Prediction
Directory of Open Access Journals (Sweden)
Golshan Mohammadi
2013-01-01
Full Text Available As customers are the main assets of each industry, customer churn prediction is becoming a major task for companies to remain in competition with competitors. In the literature, the better applicability and efficiency of hierarchical data mining techniques has been reported. This paper considers three hierarchical models by combining four different data mining techniques for churn prediction, which are backpropagation artificial neural networks (ANN, self-organizing maps (SOM, alpha-cut fuzzy c-means (α-FCM, and Cox proportional hazards regression model. The hierarchical models are ANN + ANN + Cox, SOM + ANN + Cox, and α-FCM + ANN + Cox. In particular, the first component of the models aims to cluster data in two churner and nonchurner groups and also filter out unrepresentative data or outliers. Then, the clustered data as the outputs are used to assign customers to churner and nonchurner groups by the second technique. Finally, the correctly classified data are used to create Cox proportional hazards model. To evaluate the performance of the hierarchical models, an Iranian mobile dataset is considered. The experimental results show that the hierarchical models outperform the single Cox regression baseline model in terms of prediction accuracy, Types I and II errors, RMSE, and MAD metrics. In addition, the α-FCM + ANN + Cox model significantly performs better than the two other hierarchical models.
LINEAR REGRESSION MODEL ESTİMATİON FOR RIGHT CENSORED DATA
Directory of Open Access Journals (Sweden)
Ersin Yılmaz
2016-05-01
Full Text Available In this study, firstly we will define a right censored data. If we say shortly right-censored data is censoring values that above the exact line. This may be related with scaling device. And then we will use response variable acquainted from right-censored explanatory variables. Then the linear regression model will be estimated. For censored data’s existence, Kaplan-Meier weights will be used for the estimation of the model. With the weights regression model will be consistent and unbiased with that. And also there is a method for the censored data that is a semi parametric regression and this method also give useful results for censored data too. This study also might be useful for the health studies because of the censored data used in medical issues generally.
Developing and testing a global-scale regression model to quantify mean annual streamflow
Barbarossa, Valerio; Huijbregts, Mark A. J.; Hendriks, A. Jan; Beusen, Arthur H. W.; Clavreul, Julie; King, Henry; Schipper, Aafke M.
2017-01-01
Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF based on a dataset unprecedented in size, using observations of discharge and catchment characteristics from 1885 catchments worldwide, measuring between 2 and 106 km2. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area and catchment averaged mean annual precipitation and air temperature, slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error (RMSE) values were lower (0.29-0.38 compared to 0.49-0.57) and the modified index of agreement (d) was higher (0.80-0.83 compared to 0.72-0.75). Our regression model can be applied globally to estimate MAF at any point of the river network, thus providing a feasible alternative to spatially explicit process-based global hydrological models.
A threshold-voltage model for small-scaled GaAs nMOSFET with stacked high-k gate dielectric
Chaowen, Liu; Jingping, Xu; Lu, Liu; Hanhan, Lu; Yuan, Huang
2016-02-01
A threshold-voltage model for a stacked high-k gate dielectric GaAs MOSFET is established by solving a two-dimensional Poisson's equation in channel and considering the short-channel, DIBL and quantum effects. The simulated results are in good agreement with the Silvaco TCAD data, confirming the correctness and validity of the model. Using the model, impacts of structural and physical parameters of the stack high-k gate dielectric on the threshold-voltage shift and the temperature characteristics of the threshold voltage are investigated. The results show that the stacked gate dielectric structure can effectively suppress the fringing-field and DIBL effects and improve the threshold and temperature characteristics, and on the other hand, the influence of temperature on the threshold voltage is overestimated if the quantum effect is ignored. Project supported by the National Natural Science Foundation of China (No. 61176100).
Threshold burnup for recrystallization and model for rim porosity in the high burnup UO2 fuel
International Nuclear Information System (INIS)
Lee, Byung Ho; Koo, Yang Hyun; Sohn, Dong Seong
1998-01-01
Applicability of the threshold burnup for rim formation was investigated as a function of temperature by Rest's model. The threshold burnup was the lowest in the intermediate temperature region, while on the other temperature regions the threshold burnup is higher. The rim porosity was predicted by the van der Waals equation based of the rim pore radius of 0.75μm and the overpressurization model on rim pores. The calculated centerline temperature is in good agreement with the measured temperature. However, more efforts seem to be necessary for the mechanistic model of the rim effect including rim growth with the fuel burnup
Extinction threshold of a population in spatial and stochastic model
Soroka, Yevheniia; Rublyov, Bogdan
2016-01-01
In this study, spatial stochastic and logistic model (SSLM) describing dynamics of a population of a certain species was analysed. The behaviour of the extinction threshold as a function of model parameters was studied. More specifically, we studied how the critical values for the model parameters that separate the cases of extinction and persistence depend on the spatial scales of the competition and dispersal kernels. We compared the simulations and analytical results to examine if and how ...
Determination of Cost-Effectiveness Threshold for Health Care Interventions in Malaysia.
Lim, Yen Wei; Shafie, Asrul Akmal; Chua, Gin Nie; Ahmad Hassali, Mohammed Azmi
2017-09-01
One major challenge in prioritizing health care using cost-effectiveness (CE) information is when alternatives are more expensive but more effective than existing technology. In such a situation, an external criterion in the form of a CE threshold that reflects the willingness to pay (WTP) per quality-adjusted life-year is necessary. To determine a CE threshold for health care interventions in Malaysia. A cross-sectional, contingent valuation study was conducted using a stratified multistage cluster random sampling technique in four states in Malaysia. One thousand thirteen respondents were interviewed in person for their socioeconomic background, quality of life, and WTP for a hypothetical scenario. The CE thresholds established using the nonparametric Turnbull method ranged from MYR12,810 to MYR22,840 (~US $4,000-US $7,000), whereas those estimated with the parametric interval regression model were between MYR19,929 and MYR28,470 (~US $6,200-US $8,900). Key factors that affected the CE thresholds were education level, estimated monthly household income, and the description of health state scenarios. These findings suggest that there is no single WTP value for a quality-adjusted life-year. The CE threshold estimated for Malaysia was found to be lower than the threshold value recommended by the World Health Organization. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne
2016-04-01
Existing evidence suggests that ambient ultrafine particles (UFPs) (regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Adaptive regression for modeling nonlinear relationships
Knafl, George J
2016-01-01
This book presents methods for investigating whether relationships are linear or nonlinear and for adaptively fitting appropriate models when they are nonlinear. Data analysts will learn how to incorporate nonlinearity in one or more predictor variables into regression models for different types of outcome variables. Such nonlinear dependence is often not considered in applied research, yet nonlinear relationships are common and so need to be addressed. A standard linear analysis can produce misleading conclusions, while a nonlinear analysis can provide novel insights into data, not otherwise possible. A variety of examples of the benefits of modeling nonlinear relationships are presented throughout the book. Methods are covered using what are called fractional polynomials based on real-valued power transformations of primary predictor variables combined with model selection based on likelihood cross-validation. The book covers how to formulate and conduct such adaptive fractional polynomial modeling in the s...
Confidence bands for inverse regression models
International Nuclear Information System (INIS)
Birke, Melanie; Bissantz, Nicolai; Holzmann, Hajo
2010-01-01
We construct uniform confidence bands for the regression function in inverse, homoscedastic regression models with convolution-type operators. Here, the convolution is between two non-periodic functions on the whole real line rather than between two periodic functions on a compact interval, since the former situation arguably arises more often in applications. First, following Bickel and Rosenblatt (1973 Ann. Stat. 1 1071–95) we construct asymptotic confidence bands which are based on strong approximations and on a limit theorem for the supremum of a stationary Gaussian process. Further, we propose bootstrap confidence bands based on the residual bootstrap and prove consistency of the bootstrap procedure. A simulation study shows that the bootstrap confidence bands perform reasonably well for moderate sample sizes. Finally, we apply our method to data from a gel electrophoresis experiment with genetically engineered neuronal receptor subunits incubated with rat brain extract
Threshold law for electron impact ionization in the model of Temkin and Poet
International Nuclear Information System (INIS)
Macek, J.H.
1996-01-01
The angle-Sturmian theory is used to derive the threshold law for ionization of atomic hydrogen by electron impact in the model of Temkin and Poet. In this model, the exact electron-electron interaction is replaced by its monopole term. As for Wannier's theory with the real interaction, ionization occurs only for electrons that start out nearly equidistant from the proton. Because there is a high propensity for one electron to be captured into a bound state, ionization is strongly suppressed, giving rise to a threshold law of the form σ ∝ exp[-aE -1/6 + bE 1/6 ], where a and b are constants. The exponential law appears to be the quantal counterpart of the classical offset of the ionization threshold. Relative energy distribution are computed and found to favor configurations with unequal energy sharing
Electricity consumption forecasting in Italy using linear regression models
Energy Technology Data Exchange (ETDEWEB)
Bianco, Vincenzo; Manca, Oronzio; Nardini, Sergio [DIAM, Seconda Universita degli Studi di Napoli, Via Roma 29, 81031 Aversa (CE) (Italy)
2009-09-15
The influence of economic and demographic variables on the annual electricity consumption in Italy has been investigated with the intention to develop a long-term consumption forecasting model. The time period considered for the historical data is from 1970 to 2007. Different regression models were developed, using historical electricity consumption, gross domestic product (GDP), gross domestic product per capita (GDP per capita) and population. A first part of the paper considers the estimation of GDP, price and GDP per capita elasticities of domestic and non-domestic electricity consumption. The domestic and non-domestic short run price elasticities are found to be both approximately equal to -0.06, while long run elasticities are equal to -0.24 and -0.09, respectively. On the contrary, the elasticities of GDP and GDP per capita present higher values. In the second part of the paper, different regression models, based on co-integrated or stationary data, are presented. Different statistical tests are employed to check the validity of the proposed models. A comparison with national forecasts, based on complex econometric models, such as Markal-Time, was performed, showing that the developed regressions are congruent with the official projections, with deviations of {+-}1% for the best case and {+-}11% for the worst. These deviations are to be considered acceptable in relation to the time span taken into account. (author)
Electricity consumption forecasting in Italy using linear regression models
International Nuclear Information System (INIS)
Bianco, Vincenzo; Manca, Oronzio; Nardini, Sergio
2009-01-01
The influence of economic and demographic variables on the annual electricity consumption in Italy has been investigated with the intention to develop a long-term consumption forecasting model. The time period considered for the historical data is from 1970 to 2007. Different regression models were developed, using historical electricity consumption, gross domestic product (GDP), gross domestic product per capita (GDP per capita) and population. A first part of the paper considers the estimation of GDP, price and GDP per capita elasticities of domestic and non-domestic electricity consumption. The domestic and non-domestic short run price elasticities are found to be both approximately equal to -0.06, while long run elasticities are equal to -0.24 and -0.09, respectively. On the contrary, the elasticities of GDP and GDP per capita present higher values. In the second part of the paper, different regression models, based on co-integrated or stationary data, are presented. Different statistical tests are employed to check the validity of the proposed models. A comparison with national forecasts, based on complex econometric models, such as Markal-Time, was performed, showing that the developed regressions are congruent with the official projections, with deviations of ±1% for the best case and ±11% for the worst. These deviations are to be considered acceptable in relation to the time span taken into account. (author)
Directory of Open Access Journals (Sweden)
Drzewiecki Wojciech
2016-12-01
Full Text Available In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.
Modelling infant mortality rate in Central Java, Indonesia use generalized poisson regression method
Prahutama, Alan; Sudarno
2018-05-01
The infant mortality rate is the number of deaths under one year of age occurring among the live births in a given geographical area during a given year, per 1,000 live births occurring among the population of the given geographical area during the same year. This problem needs to be addressed because it is an important element of a country’s economic development. High infant mortality rate will disrupt the stability of a country as it relates to the sustainability of the population in the country. One of regression model that can be used to analyze the relationship between dependent variable Y in the form of discrete data and independent variable X is Poisson regression model. Recently The regression modeling used for data with dependent variable is discrete, among others, poisson regression, negative binomial regression and generalized poisson regression. In this research, generalized poisson regression modeling gives better AIC value than poisson regression. The most significant variable is the Number of health facilities (X1), while the variable that gives the most influence to infant mortality rate is the average breastfeeding (X9).
Augmented Beta rectangular regression models: A Bayesian perspective.
Wang, Jue; Luo, Sheng
2016-01-01
Mixed effects Beta regression models based on Beta distributions have been widely used to analyze longitudinal percentage or proportional data ranging between zero and one. However, Beta distributions are not flexible to extreme outliers or excessive events around tail areas, and they do not account for the presence of the boundary values zeros and ones because these values are not in the support of the Beta distributions. To address these issues, we propose a mixed effects model using Beta rectangular distribution and augment it with the probabilities of zero and one. We conduct extensive simulation studies to assess the performance of mixed effects models based on both the Beta and Beta rectangular distributions under various scenarios. The simulation studies suggest that the regression models based on Beta rectangular distributions improve the accuracy of parameter estimates in the presence of outliers and heavy tails. The proposed models are applied to the motivating Neuroprotection Exploratory Trials in Parkinson's Disease (PD) Long-term Study-1 (LS-1 study, n = 1741), developed by The National Institute of Neurological Disorders and Stroke Exploratory Trials in Parkinson's Disease (NINDS NET-PD) network. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Bayesian semiparametric regression models to characterize molecular evolution
Directory of Open Access Journals (Sweden)
Datta Saheli
2012-10-01
Full Text Available Abstract Background Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Dirichlet process prior on the distribution of the regression coefficients that describes the relationship between the changes in amino acid distances and natural selection in protein-coding DNA sequence alignments. Results The Bayesian semiparametric approach is illustrated with simulated data and the abalone lysin sperm data. Our method identifies groups of properties which, for this particular dataset, have a similar effect on evolution. The model also provides nonparametric site-specific estimates for the strength of conservation of these properties. Conclusions The model described here is distinguished by its ability to handle a large number of amino acid properties simultaneously, while taking into account that such data can be correlated. The multi-level clustering ability of the model allows for appealing interpretations of the results in terms of properties that are roughly equivalent from the standpoint of molecular evolution.
Regression analysis of a chemical reaction fouling model
International Nuclear Information System (INIS)
Vasak, F.; Epstein, N.
1996-01-01
A previously reported mathematical model for the initial chemical reaction fouling of a heated tube is critically examined in the light of the experimental data for which it was developed. A regression analysis of the model with respect to that data shows that the reference point upon which the two adjustable parameters of the model were originally based was well chosen, albeit fortuitously. (author). 3 refs., 2 tabs., 2 figs
Correlation-regression model for physico-chemical quality of ...
African Journals Online (AJOL)
abusaad
areas, suggesting that groundwater quality in urban areas is closely related with land use ... the ground water, with correlation and regression model is also presented. ...... WHO (World Health Organization) (1985). Health hazards from nitrates.
A Rational Threshold Signature Model and Protocol Based on Different Permissions
Directory of Open Access Journals (Sweden)
Bojun Wang
2014-01-01
Full Text Available This paper develops a novel model and protocol used in some specific scenarios, in which the participants of multiple groups with different permissions can finish the signature together. We apply the secret sharing scheme based on difference equation to the private key distribution phase and secret reconstruction phrase of our threshold signature scheme. In addition, our scheme can achieve the signature success because of the punishment strategy of the repeated rational secret sharing. Besides, the bit commitment and verification method used to detect players’ cheating behavior acts as a contributing factor to prevent the internal fraud. Using bit commitments, verifiable parameters, and time sequences, this paper constructs a dynamic game model, which has the features of threshold signature management with different permissions, cheat proof, and forward security.
Ng, Kar Yong; Awang, Norhashidah
2018-01-06
Frequent haze occurrences in Malaysia have made the management of PM 10 (particulate matter with aerodynamic less than 10 μm) pollution a critical task. This requires knowledge on factors associating with PM 10 variation and good forecast of PM 10 concentrations. Hence, this paper demonstrates the prediction of 1-day-ahead daily average PM 10 concentrations based on predictor variables including meteorological parameters and gaseous pollutants. Three different models were built. They were multiple linear regression (MLR) model with lagged predictor variables (MLR1), MLR model with lagged predictor variables and PM 10 concentrations (MLR2) and regression with time series error (RTSE) model. The findings revealed that humidity, temperature, wind speed, wind direction, carbon monoxide and ozone were the main factors explaining the PM 10 variation in Peninsular Malaysia. Comparison among the three models showed that MLR2 model was on a same level with RTSE model in terms of forecasting accuracy, while MLR1 model was the worst.
Multivariate Frequency-Severity Regression Models in Insurance
Directory of Open Access Journals (Sweden)
Edward W. Frees
2016-02-01
Full Text Available In insurance and related industries including healthcare, it is common to have several outcome measures that the analyst wishes to understand using explanatory variables. For example, in automobile insurance, an accident may result in payments for damage to one’s own vehicle, damage to another party’s vehicle, or personal injury. It is also common to be interested in the frequency of accidents in addition to the severity of the claim amounts. This paper synthesizes and extends the literature on multivariate frequency-severity regression modeling with a focus on insurance industry applications. Regression models for understanding the distribution of each outcome continue to be developed yet there now exists a solid body of literature for the marginal outcomes. This paper contributes to this body of literature by focusing on the use of a copula for modeling the dependence among these outcomes; a major advantage of this tool is that it preserves the body of work established for marginal models. We illustrate this approach using data from the Wisconsin Local Government Property Insurance Fund. This fund offers insurance protection for (i property; (ii motor vehicle; and (iii contractors’ equipment claims. In addition to several claim types and frequency-severity components, outcomes can be further categorized by time and space, requiring complex dependency modeling. We find significant dependencies for these data; specifically, we find that dependencies among lines are stronger than the dependencies between the frequency and average severity within each line.
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based......Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights...... by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even...
Application of random regression models to the genetic evaluation ...
African Journals Online (AJOL)
The model included fixed regression on AM (range from 30 to 138 mo) and the effect of herd-measurement date concatenation. Random parts of the model were RRM coefficients for additive and permanent environmental effects, while residual effects were modelled to account for heterogeneity of variance by AY. Estimates ...
Multitask Quantile Regression under the Transnormal Model.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2016-01-01
We consider estimating multi-task quantile regression under the transnormal model, with focus on high-dimensional setting. We derive a surprisingly simple closed-form solution through rank-based covariance regularization. In particular, we propose the rank-based ℓ 1 penalization with positive definite constraints for estimating sparse covariance matrices, and the rank-based banded Cholesky decomposition regularization for estimating banded precision matrices. By taking advantage of alternating direction method of multipliers, nearest correlation matrix projection is introduced that inherits sampling properties of the unprojected one. Our work combines strengths of quantile regression and rank-based covariance regularization to simultaneously deal with nonlinearity and nonnormality for high-dimensional regression. Furthermore, the proposed method strikes a good balance between robustness and efficiency, achieves the "oracle"-like convergence rate, and provides the provable prediction interval under the high-dimensional setting. The finite-sample performance of the proposed method is also examined. The performance of our proposed rank-based method is demonstrated in a real application to analyze the protein mass spectroscopy data.
International Nuclear Information System (INIS)
Cui, Yingzhi; Yang, Jie; Du, Chunyu; Zuo, Pengjian; Gao, Yunzhi; Cheng, Xinqun; Ma, Yulin; Yin, Geping
2017-01-01
Highlights: •Open circuit voltage evolution over ageing of lithium ion batteries is deciphered. •The mechanism responsible for the end-of-life (EOL) threshold is elaborated. •A new prediction model of EOL threshold with improved accuracy is developed. •This EOL prediction model is promising for the applications in electric vehicles. -- Abstract: The end-of-life (EOL) of a lithium ion battery (LIB) is defined as the time point when the LIB can no longer provide sufficient power or energy to accomplish its intended function. Generally, the EOL occurs abruptly when the degradation of a LIB reaches the threshold. Therefore, current prediction methods of EOL by extrapolating the early degradation behavior often result in significant errors. To address this problem, this paper analyzes the reason for the EOL threshold of a LIB with shallow depth of discharge. It is found that the sudden appearance of EOL threshold results from the drift of open circuit voltage (OCV) at the end of both shallow depth and full discharges. Further, a new EOL threshold prediction model with highly improved accuracy is developed based on the OCV drifts and their evolution mechanism, which can effectively avoid the misjudgment of EOL threshold. The accuracy of this EOL threshold prediction model is verified by comparing with experimental results. The EOL threshold prediction model can be applied to other battery chemistry systems and its possible application in electric vehicles is finally discussed.
The Threshold of a Stochastic SIRS Model with Vertical Transmission and Saturated Incidence
Directory of Open Access Journals (Sweden)
Chunjuan Zhu
2017-01-01
Full Text Available The threshold of a stochastic SIRS model with vertical transmission and saturated incidence is investigated. If the noise is small, it is shown that the threshold of the stochastic system determines the extinction and persistence of the epidemic. In addition, we find that if the noise is large, the epidemic still prevails. Finally, numerical simulations are given to illustrate the results.
Approximating prediction uncertainty for random forest regression models
John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne
2016-01-01
Machine learning approaches such as random forest haveÂ increased for the spatial modeling and mapping of continuousÂ variables. Random forest is a non-parametric ensembleÂ approach, and unlike traditional regression approaches thereÂ is no direct quantification of prediction error. UnderstandingÂ prediction uncertainty is important when using model-basedÂ continuous maps as...
Threshold law for the triplet state for electron-impact ionization in the Temkin-Poet model
International Nuclear Information System (INIS)
Ihra, W.; Mota-Furtado, F.; OMahony, P.F.; Macek, J.H.
1997-01-01
We derive the analytical threshold behavior for the triplet cross section for electron-impact ionization in the Temkin-Poet model. The analytical results indicate that the most recent numerical calculations may fail to reproduce the correct threshold behavior in an energy regime below about E=0.1 a.u. We also present an analytical expression for the energy distribution of the two electrons near threshold. copyright 1997 The American Physical Society
Lagomarsino, Daniela; Rosi, Ascanio; Rossi, Guglielmo; Segoni, Samuele; Catani, Filippo
2014-05-01
This work makes a quantitative comparison between the results of landslide forecasting obtained using two different rainfall threshold models, one using intensity-duration thresholds and the other based on cumulative rainfall thresholds in an area of northern Tuscany of 116 km2. The first methodology identifies rainfall intensity-duration thresholds by means a software called MaCumBA (Massive CUMulative Brisk Analyzer) that analyzes rain-gauge records, extracts the intensities (I) and durations (D) of the rainstorms associated with the initiation of landslides, plots these values on a diagram, and identifies thresholds that define the lower bounds of the I-D values. A back analysis using data from past events can be used to identify the threshold conditions associated with the least amount of false alarms. The second method (SIGMA) is based on the hypothesis that anomalous or extreme values of rainfall are responsible for landslide triggering: the statistical distribution of the rainfall series is analyzed, and multiples of the standard deviation (σ) are used as thresholds to discriminate between ordinary and extraordinary rainfall events. The name of the model, SIGMA, reflects the central role of the standard deviations in the proposed methodology. The definition of intensity-duration rainfall thresholds requires the combined use of rainfall measurements and an inventory of dated landslides, whereas SIGMA model can be implemented using only rainfall data. These two methodologies were applied in an area of 116 km2 where a database of 1200 landslides was available for the period 2000-2012. The results obtained are compared and discussed. Although several examples of visual comparisons between different intensity-duration rainfall thresholds are reported in the international literature, a quantitative comparison between thresholds obtained in the same area using different techniques and approaches is a relatively undebated research topic.
Profile-driven regression for modeling and runtime optimization of mobile networks
DEFF Research Database (Denmark)
McClary, Dan; Syrotiuk, Violet; Kulahci, Murat
2010-01-01
Computer networks often display nonlinear behavior when examined over a wide range of operating conditions. There are few strategies available for modeling such behavior and optimizing such systems as they run. Profile-driven regression is developed and applied to modeling and runtime optimization...... of throughput in a mobile ad hoc network, a self-organizing collection of mobile wireless nodes without any fixed infrastructure. The intermediate models generated in profile-driven regression are used to fit an overall model of throughput, and are also used to optimize controllable factors at runtime. Unlike...
Heil, Peter; Matysiak, Artur; Neubauer, Heinrich
2017-09-01
Thresholds for detecting sounds in quiet decrease with increasing sound duration in every species studied. The neural mechanisms underlying this trade-off, often referred to as temporal integration, are not fully understood. Here, we probe the human auditory system with a large set of tone stimuli differing in duration, shape of the temporal amplitude envelope, duration of silent gaps between bursts, and frequency. Duration was varied by varying the plateau duration of plateau-burst (PB) stimuli, the duration of the onsets and offsets of onset-offset (OO) stimuli, and the number of identical bursts of multiple-burst (MB) stimuli. Absolute thresholds for a large number of ears (>230) were measured using a 3-interval-3-alternative forced choice (3I-3AFC) procedure. Thresholds decreased with increasing sound duration in a manner that depended on the temporal envelope. Most commonly, thresholds for MB stimuli were highest followed by thresholds for OO and PB stimuli of corresponding durations. Differences in the thresholds for MB and OO stimuli and in the thresholds for MB and PB stimuli, however, varied widely across ears, were negative in some ears, and were tightly correlated. We show that the variation and correlation of MB-OO and MB-PB threshold differences are linked to threshold microstructure, which affects the relative detectability of the sidebands of the MB stimuli and affects estimates of the bandwidth of auditory filters. We also found that thresholds for MB stimuli increased with increasing duration of the silent gaps between bursts. We propose a new model and show that it accurately accounts for our results and does so considerably better than a leaky-integrator-of-intensity model and a probabilistic model proposed by others. Our model is based on the assumption that sensory events are generated by a Poisson point process with a low rate in the absence of stimulation and higher, time-varying rates in the presence of stimulation. A subject in a 3I-3AFC
Accounting for measurement error in log regression models with applications to accelerated testing.
Richardson, Robert; Tolley, H Dennis; Evenson, William E; Lunt, Barry M
2018-01-01
In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.
Accounting for measurement error in log regression models with applications to accelerated testing.
Directory of Open Access Journals (Sweden)
Robert Richardson
Full Text Available In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.
Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi
2012-01-01
The objective of the present study was to assess the comparable applicability of orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of trans cranial doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation on the first week of admission and again six months later. All data was primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
Some Observations about the Nearest-Neighbor Model of the Error Threshold
International Nuclear Information System (INIS)
Gerrish, Philip J.
2009-01-01
I explore some aspects of the 'error threshold' - a critical mutation rate above which a population is nonviable. The phase transition that occurs as mutation rate crosses this threshold has been shown to be mathematically equivalent to the loss of ferromagnetism that occurs as temperature exceeds the Curie point. I will describe some refinements and new results based on the simplest of these mutation models, will discuss the commonly unperceived robustness of this simple model, and I will show some preliminary results comparing qualitative predictions with simulations of finite populations adapting at high mutation rates. I will talk about how these qualitative predictions are relevant to biomedical science and will discuss how my colleagues and I are looking for phase-transition signatures in real populations of Escherichia coli that go extinct as a result of excessive mutation.
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield
Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan
2018-04-01
In this paper, we propose a hybrid model which is a combination of multiple linear regression model and fuzzy c-means method. This research involved a relationship between 20 variates of the top soil that are analyzed prior to planting of paddy yields at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granary in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using multiple linear regression model and a combination of multiple linear regression model and fuzzy c-means method. Analysis of normality and multicollinearity indicate that the data is normally scattered without multicollinearity among independent variables. Analysis of fuzzy c-means cluster the yield of paddy into two clusters before the multiple linear regression model can be used. The comparison between two method indicate that the hybrid of multiple linear regression model and fuzzy c-means method outperform the multiple linear regression model with lower value of mean square error.
DEFF Research Database (Denmark)
Azarang, Leyla; Scheike, Thomas; de Uña-Álvarez, Jacobo
2017-01-01
In this work, we present direct regression analysis for the transition probabilities in the possibly non-Markov progressive illness–death model. The method is based on binomial regression, where the response is the indicator of the occupancy for the given state along time. Randomly weighted score...
SPSS macros to compare any two fitted values from a regression model.
Weaver, Bruce; Dubois, Sacha
2012-12-01
In regression models with first-order terms only, the coefficient for a given variable is typically interpreted as the change in the fitted value of Y for a one-unit increase in that variable, with all other variables held constant. Therefore, each regression coefficient represents the difference between two fitted values of Y. But the coefficients represent only a fraction of the possible fitted value comparisons that might be of interest to researchers. For many fitted value comparisons that are not captured by any of the regression coefficients, common statistical software packages do not provide the standard errors needed to compute confidence intervals or carry out statistical tests-particularly in more complex models that include interactions, polynomial terms, or regression splines. We describe two SPSS macros that implement a matrix algebra method for comparing any two fitted values from a regression model. The !OLScomp and !MLEcomp macros are for use with models fitted via ordinary least squares and maximum likelihood estimation, respectively. The output from the macros includes the standard error of the difference between the two fitted values, a 95% confidence interval for the difference, and a corresponding statistical test with its p-value.
Amalia, Junita; Purhadi, Otok, Bambang Widjanarko
2017-11-01
Poisson distribution is a discrete distribution with count data as the random variables and it has one parameter defines both mean and variance. Poisson regression assumes mean and variance should be same (equidispersion). Nonetheless, some case of the count data unsatisfied this assumption because variance exceeds mean (over-dispersion). The ignorance of over-dispersion causes underestimates in standard error. Furthermore, it causes incorrect decision in the statistical test. Previously, paired count data has a correlation and it has bivariate Poisson distribution. If there is over-dispersion, modeling paired count data is not sufficient with simple bivariate Poisson regression. Bivariate Poisson Inverse Gaussian Regression (BPIGR) model is mix Poisson regression for modeling paired count data within over-dispersion. BPIGR model produces a global model for all locations. In another hand, each location has different geographic conditions, social, cultural and economic so that Geographically Weighted Regression (GWR) is needed. The weighting function of each location in GWR generates a different local model. Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (GWBPIGR) model is used to solve over-dispersion and to generate local models. Parameter estimation of GWBPIGR model obtained by Maximum Likelihood Estimation (MLE) method. Meanwhile, hypothesis testing of GWBPIGR model acquired by Maximum Likelihood Ratio Test (MLRT) method.
Can We Use Regression Modeling to Quantify Mean Annual Streamflow at a Global-Scale?
Barbarossa, V.; Huijbregts, M. A. J.; Hendriks, J. A.; Beusen, A.; Clavreul, J.; King, H.; Schipper, A.
2016-12-01
Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for a number of applications, including assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF using observations of discharge and catchment characteristics from 1,885 catchments worldwide, ranging from 2 to 106 km2 in size. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB [van Beek et al., 2011] by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area, mean annual precipitation and air temperature, average slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error values were lower (0.29 - 0.38 compared to 0.49 - 0.57) and the modified index of agreement was higher (0.80 - 0.83 compared to 0.72 - 0.75). Our regression model can be applied globally at any point of the river network, provided that the input parameters are within the range of values employed in the calibration of the model. The performance is reduced for water scarce regions and further research should focus on improving such an aspect for regression-based global hydrological models.
Modeling Governance KB with CATPCA to Overcome Multicollinearity in the Logistic Regression
Khikmah, L.; Wijayanto, H.; Syafitri, U. D.
2017-04-01
The problem often encounters in logistic regression modeling are multicollinearity problems. Data that have multicollinearity between explanatory variables with the result in the estimation of parameters to be bias. Besides, the multicollinearity will result in error in the classification. In general, to overcome multicollinearity in regression used stepwise regression. They are also another method to overcome multicollinearity which involves all variable for prediction. That is Principal Component Analysis (PCA). However, classical PCA in only for numeric data. Its data are categorical, one method to solve the problems is Categorical Principal Component Analysis (CATPCA). Data were used in this research were a part of data Demographic and Population Survey Indonesia (IDHS) 2012. This research focuses on the characteristic of women of using the contraceptive methods. Classification results evaluated using Area Under Curve (AUC) values. The higher the AUC value, the better. Based on AUC values, the classification of the contraceptive method using stepwise method (58.66%) is better than the logistic regression model (57.39%) and CATPCA (57.39%). Evaluation of the results of logistic regression using sensitivity, shows the opposite where CATPCA method (99.79%) is better than logistic regression method (92.43%) and stepwise (92.05%). Therefore in this study focuses on major class classification (using a contraceptive method), then the selected model is CATPCA because it can raise the level of the major class model accuracy.
Time series modeling by a regression approach based on a latent process.
Chamroukhi, Faicel; Samé, Allou; Govaert, Gérard; Aknin, Patrice
2009-01-01
Time series are used in many domains including finance, engineering, economics and bioinformatics generally to represent the change of a measurement over time. Modeling techniques may then be used to give a synthetic representation of such data. A new approach for time series modeling is proposed in this paper. It consists of a regression model incorporating a discrete hidden logistic process allowing for activating smoothly or abruptly different polynomial regression models. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The M step of the EM algorithm uses a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm to estimate the hidden process parameters. To evaluate the proposed approach, an experimental study on simulated data and real world data was performed using two alternative approaches: a heteroskedastic piecewise regression model using a global optimization algorithm based on dynamic programming, and a Hidden Markov Regression Model whose parameters are estimated by the Baum-Welch algorithm. Finally, in the context of the remote monitoring of components of the French railway infrastructure, and more particularly the switch mechanism, the proposed approach has been applied to modeling and classifying time series representing the condition measurements acquired during switch operations.
Regression to Causality : Regression-style presentation influences causal attribution
DEFF Research Database (Denmark)
Bordacconi, Mats Joe; Larsen, Martin Vinæs
2014-01-01
of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...... models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results...... more likely. Our experiment drew on a sample of 235 university students from three different social science degree programs (political science, sociology and economics), all of whom had received substantial training in statistics. The subjects were asked to compare and evaluate the validity...
Low level radiation: how does the linear without threshold model provide the safety of Canadian
International Nuclear Information System (INIS)
Anon.
2010-01-01
The linear without threshold model is a model of risk used worldwide by the most of health organisms of nuclear regulation in order to establish dose limits for workers and public. It is in the heart of the approach adopted by the Canadian commission of nuclear safety (C.C.S.N.) in matter of radiation protection. The linear without threshold model presumes reasonably it exists a direct link between radiation exposure and cancer rate. It does not exist scientific evidence that chronicle exposure to radiation doses under 100 milli sievert (mSv) leads harmful effects on health. Several scientific reports highlighted scientific evidences that seem indicate a low level of radiation is less harmful than the linear without threshold predicts. As the linear without threshold model presumes that any radiation exposure brings risks, the ALARA principle obliges the licensees to get the radiation exposure at the lowest reasonably achievable level, social and economical factors taken into account. ALARA principle constitutes a basic principle in the C.C.S.N. approach in matter of radiation protection; On the radiation protection plan, C.C.S.N. gets a careful approach that allows to provide health and safety of Canadian people and the protection of their environment. (N.C.)
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-12-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.
Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H
2016-01-01
Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite better performance of disease risk score methods than logistic regression and propensity score models in small events per coefficient settings, bias, and coverage still deviated from nominal.
On the renewal risk model under a threshold strategy
Dong, Yinghui; Wang, Guojing; Yuen, Kam C.
2009-08-01
In this paper, we consider the renewal risk process under a threshold dividend payment strategy. For this model, the expected discounted dividend payments and the Gerber-Shiu expected discounted penalty function are investigated. Integral equations, integro-differential equations and some closed form expressions for them are derived. When the claims are exponentially distributed, it is verified that the expected penalty of the deficit at ruin is proportional to the ruin probability.
Regression Model to Predict Global Solar Irradiance in Malaysia
Directory of Open Access Journals (Sweden)
Hairuniza Ahmed Kutty
2015-01-01
Full Text Available A novel regression model is developed to estimate the monthly global solar irradiance in Malaysia. The model is developed based on different available meteorological parameters, including temperature, cloud cover, rain precipitate, relative humidity, wind speed, pressure, and gust speed, by implementing regression analysis. This paper reports on the details of the analysis of the effect of each prediction parameter to identify the parameters that are relevant to estimating global solar irradiance. In addition, the proposed model is compared in terms of the root mean square error (RMSE, mean bias error (MBE, and the coefficient of determination (R2 with other models available from literature studies. Seven models based on single parameters (PM1 to PM7 and five multiple-parameter models (PM7 to PM12 are proposed. The new models perform well, with RMSE ranging from 0.429% to 1.774%, R2 ranging from 0.942 to 0.992, and MBE ranging from −0.1571% to 0.6025%. In general, cloud cover significantly affects the estimation of global solar irradiance. However, cloud cover in Malaysia lacks sufficient influence when included into multiple-parameter models although it performs fairly well in single-parameter prediction models.
Polynomial regression analysis and significance test of the regression function
International Nuclear Information System (INIS)
Gao Zhengming; Zhao Juan; He Shengping
2012-01-01
In order to analyze the decay heating power of a certain radioactive isotope per kilogram with polynomial regression method, the paper firstly demonstrated the broad usage of polynomial function and deduced its parameters with ordinary least squares estimate. Then significance test method of polynomial regression function is derived considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and significance test of the polynomial function are done to the decay heating power of the iso tope per kilogram in accord with the authors' real work. (authors)
Logistic regression for risk factor modelling in stuttering research.
Reed, Phil; Wu, Yaqionq
2013-06-01
To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed are demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article you will: (a) Summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) Follow the steps in performing a logistic regression analysis; (c) Describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) Be able to summarize its advantages over other techniques like estimation of group differences and simple regression. Copyright © 2012 Elsevier Inc. All rights reserved.
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2011-01-01
In this paper, two non-parametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a more viable alternative to existing kernel-based approaches. The second estimator
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
Energy Technology Data Exchange (ETDEWEB)
Bessec, Marie [CGEMP, Universite Paris-Dauphine, Place du Marechal de Lattre de Tassigny Paris (France); Fouquau, Julien [LEO, Universite d' Orleans, Faculte de Droit, d' Economie et de Gestion, Rue de Blois, BP 6739, 45067 Orleans Cedex 2 (France)
2008-09-15
This paper investigates the relationship between electricity demand and temperature in the European Union. We address this issue by means of a panel threshold regression model on 15 European countries over the last two decades. Our results confirm the non-linearity of the link between electricity consumption and temperature found in more limited geographical areas in previous studies. By distinguishing between North and South countries, we also find that this non-linear pattern is more pronounced in the warm countries. Finally, rolling regressions show that the sensitivity of electricity consumption to temperature in summer has increased in the recent period. (author)
Crime Modeling using Spatial Regression Approach
Saleh Ahmar, Ansari; Adiatma; Kasim Aidid, M.
2018-01-01
Act of criminality in Indonesia increased both variety and quantity every year. As murder, rape, assault, vandalism, theft, fraud, fencing, and other cases that make people feel unsafe. Risk of society exposed to crime is the number of reported cases in the police institution. The higher of the number of reporter to the police institution then the number of crime in the region is increasing. In this research, modeling criminality in South Sulawesi, Indonesia with the dependent variable used is the society exposed to the risk of crime. Modelling done by area approach is the using Spatial Autoregressive (SAR) and Spatial Error Model (SEM) methods. The independent variable used is the population density, the number of poor population, GDP per capita, unemployment and the human development index (HDI). Based on the analysis using spatial regression can be shown that there are no dependencies spatial both lag or errors in South Sulawesi.
Determining factors influencing survival of breast cancer by fuzzy logistic regression model.
Nikbakht, Roya; Bahrampour, Abbas
2017-01-01
Fuzzy logistic regression model can be used for determining influential factors of disease. This study explores the important factors of actual predictive survival factors of breast cancer's patients. We used breast cancer data which collected by cancer registry of Kerman University of Medical Sciences during the period of 2000-2007. The variables such as morphology, grade, age, and treatments (surgery, radiotherapy, and chemotherapy) were applied in the fuzzy logistic regression model. Performance of model was determined in terms of mean degree of membership (MDM). The study results showed that almost 41% of patients were in neoplasm and malignant group and more than two-third of them were still alive after 5-year follow-up. Based on the fuzzy logistic model, the most important factors influencing survival were chemotherapy, morphology, and radiotherapy, respectively. Furthermore, the MDM criteria show that the fuzzy logistic regression have a good fit on the data (MDM = 0.86). Fuzzy logistic regression model showed that chemotherapy is more important than radiotherapy in survival of patients with breast cancer. In addition, another ability of this model is calculating possibilistic odds of survival in cancer patients. The results of this study can be applied in clinical research. Furthermore, there are few studies which applied the fuzzy logistic models. Furthermore, we recommend using this model in various research areas.
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2009-01-01
In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By
Efficient estimation of an additive quantile regression model
Cheng, Y.; de Gooijer, J.G.; Zerom, D.
2010-01-01
In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By
Directory of Open Access Journals (Sweden)
Matthias Schmid
Full Text Available Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1. Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
Using the classical linear regression model in analysis of the dependences of conveyor belt life
Directory of Open Access Journals (Sweden)
Miriam Andrejiová
2013-12-01
Full Text Available The paper deals with the classical linear regression model of the dependence of conveyor belt life on some selected parameters: thickness of paint layer, width and length of the belt, conveyor speed and quantity of transported material. The first part of the article is about regression model design, point and interval estimation of parameters, verification of statistical significance of the model, and about the parameters of the proposed regression model. The second part of the article deals with identification of influential and extreme values that can have an impact on estimation of regression model parameters. The third part focuses on assumptions of the classical regression model, i.e. on verification of independence assumptions, normality and homoscedasticity of residuals.
Zhang, Xin; Liu, Pan; Chen, Yuguang; Bai, Lu; Wang, Wei
2014-01-01
The primary objective of this study was to identify whether the frequency of traffic conflicts at signalized intersections can be modeled. The opposing left-turn conflicts were selected for the development of conflict predictive models. Using data collected at 30 approaches at 20 signalized intersections, the underlying distributions of the conflicts under different traffic conditions were examined. Different conflict-predictive models were developed to relate the frequency of opposing left-turn conflicts to various explanatory variables. The models considered include a linear regression model, a negative binomial model, and separate models developed for four traffic scenarios. The prediction performance of different models was compared. The frequency of traffic conflicts follows a negative binominal distribution. The linear regression model is not appropriate for the conflict frequency data. In addition, drivers behaved differently under different traffic conditions. Accordingly, the effects of conflicting traffic volumes on conflict frequency vary across different traffic conditions. The occurrences of traffic conflicts at signalized intersections can be modeled using generalized linear regression models. The use of conflict predictive models has potential to expand the uses of surrogate safety measures in safety estimation and evaluation.
Chen, Baojiang; Qin, Jing
2014-05-10
In statistical analysis, a regression model is needed if one is interested in finding the relationship between a response variable and covariates. When the response depends on the covariate, then it may also depend on the function of this covariate. If one has no knowledge of this functional form but expect for monotonic increasing or decreasing, then the isotonic regression model is preferable. Estimation of parameters for isotonic regression models is based on the pool-adjacent-violators algorithm (PAVA), where the monotonicity constraints are built in. With missing data, people often employ the augmented estimating method to improve estimation efficiency by incorporating auxiliary information through a working regression model. However, under the framework of the isotonic regression model, the PAVA does not work as the monotonicity constraints are violated. In this paper, we develop an empirical likelihood-based method for isotonic regression model to incorporate the auxiliary information. Because the monotonicity constraints still hold, the PAVA can be used for parameter estimation. Simulation studies demonstrate that the proposed method can yield more efficient estimates, and in some situations, the efficiency improvement is substantial. We apply this method to a dementia study. Copyright © 2013 John Wiley & Sons, Ltd.
Directory of Open Access Journals (Sweden)
M Taki
2017-05-01
. To measure the temperature and the relative humidity of the air, soil and roof inside and outside the greenhouse, the SHT 11 sensors were used. The accuracy of the measurement of temperature was ±0.4% at 20 °C and the precision measurement of the moisture was ±3% for a clear sky. We used these sensors in soil, on the roof (inside greenhouse and in the air of greenhouse and outside to measure the temperature and relative humidity. At a 1 m height above the ground outside the greenhouse, we used a pyranometre type TES 1333. Its sensitivity was proportional to the cosine of the incidence angle of the radiation. It is a measure of global radiation of the spectral band solar in the 400–1110 nm. Its measurement accuracy was approximately ±5%. Some heat transfer models used to predict the inside and roof temperature are according to equation (1 and (5: Results and Discussion Results showed that solar radiation on the roof of semi-solar greenhouse was higher after noon so this shape can receive high amounts of solar energy during a day. From statistical point of view, both desired and predicted test data have been analyzed to determine whether there are statistically significant differences between them. The null hypothesis assumes that statistical parameters of both series are equal. P value was used to check each hypothesis. Its threshold value was 0.05. If p value is greater than the threshold, the null hypothesis is then fulfilled. To check the differences between the data series, different tests were performed and p value was calculated for each case. The so called t-test was used to compare the means of both series. It was also assumed that the variance of both samples could be considered equal. The variance was analyzed using the F-test. Here, a normal distribution of samples was assumed. The results showed that the p values for heat model in all 2 statistical factors (Comparison of means, and variance is lower than regression model and so the heat model did not
Sorensen, James P R; Baker, Andy; Cumberland, Susan A; Lapworth, Dan J; MacDonald, Alan M; Pedley, Steve; Taylor, Richard G; Ward, Jade S T
2018-05-01
We assess the use of fluorescent dissolved organic matter at excitation-emission wavelengths of 280nm and 360nm, termed tryptophan-like fluorescence (TLF), as an indicator of faecally contaminated drinking water. A significant logistic regression model was developed using TLF as a predictor of thermotolerant coliforms (TTCs) using data from groundwater- and surface water-derived drinking water sources in India, Malawi, South Africa and Zambia. A TLF threshold of 1.3ppb dissolved tryptophan was selected to classify TTC contamination. Validation of the TLF threshold indicated a false-negative error rate of 15% and a false-positive error rate of 18%. The threshold was unsuccessful at classifying contaminated sources containing water globally. Copyright © 2017 Natural Environment Research Council (NERC), as represented by the British Geological Survey (BGS. Published by Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
Soldić-Aleksić Jasna
2009-01-01
Full Text Available Market segmentation presents one of the key concepts of the modern marketing. The main goal of market segmentation is focused on creating groups (segments of customers that have similar characteristics, needs, wishes and/or similar behavior regarding the purchase of concrete product/service. Companies can create specific marketing plan for each of these segments and therefore gain short or long term competitive advantage on the market. Depending on the concrete marketing goal, different segmentation schemes and techniques may be applied. This paper presents a predictive market segmentation model based on the application of logistic regression model and CHAID analysis. The logistic regression model was used for the purpose of variables selection (from the initial pool of eleven variables which are statistically significant for explaining the dependent variable. Selected variables were afterwards included in the CHAID procedure that generated the predictive market segmentation model. The model results are presented on the concrete empirical example in the following form: summary model results, CHAID tree, Gain chart, Index chart, risk and classification tables.
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm
Ulbrich, Norbert Manfred
2013-01-01
A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.
Madarang, Krish J; Kang, Joo-Hyon
2014-06-01
Stormwater runoff has been identified as a source of pollution for the environment, especially for receiving waters. In order to quantify and manage the impacts of stormwater runoff on the environment, predictive models and mathematical models have been developed. Predictive tools such as regression models have been widely used to predict stormwater discharge characteristics. Storm event characteristics, such as antecedent dry days (ADD), have been related to response variables, such as pollutant loads and concentrations. However it has been a controversial issue among many studies to consider ADD as an important variable in predicting stormwater discharge characteristics. In this study, we examined the accuracy of general linear regression models in predicting discharge characteristics of roadway runoff. A total of 17 storm events were monitored in two highway segments, located in Gwangju, Korea. Data from the monitoring were used to calibrate United States Environmental Protection Agency's Storm Water Management Model (SWMM). The calibrated SWMM was simulated for 55 storm events, and the results of total suspended solid (TSS) discharge loads and event mean concentrations (EMC) were extracted. From these data, linear regression models were developed. R(2) and p-values of the regression of ADD for both TSS loads and EMCs were investigated. Results showed that pollutant loads were better predicted than pollutant EMC in the multiple regression models. Regression may not provide the true effect of site-specific characteristics, due to uncertainty in the data. Copyright © 2014 The Research Centre for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V. All rights reserved.
Drzewiecki, Wojciech
2016-12-01
In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques. The results proved that in case of sub-pixel evaluation the most accurate prediction of change may not necessarily be based on the most accurate individual assessments. When single methods are considered, based on obtained results Cubist algorithm may be advised for Landsat based mapping of imperviousness for single dates. However, Random Forest may be endorsed when the most reliable evaluation of imperviousness change is the primary goal. It gave lower accuracies for individual assessments, but better prediction of change due to more correlated errors of individual predictions. Heterogeneous model ensembles performed for individual time points assessments at least as well as the best individual models. In case of imperviousness change assessment the ensembles always outperformed single model approaches. It means that it is possible to improve the accuracy of sub-pixel imperviousness change assessment using ensembles of heterogeneous non-linear regression models.
International Nuclear Information System (INIS)
Chaujar, Rishu; Kaur, Ravneet; Gupta, Mridula; Gupta, R S; Saxena, Manoj
2009-01-01
This paper discusses a threshold voltage model for novel device structure: gate electrode work function engineered recessed channel (GEWE-RC) nanoscale MOSFET, which combines the advantages of both RC and GEWE structures. In part I, the model accurately predicts (a) surface potential, (b) threshold voltage and (c) sub-threshold slope for single material gate recessed channel (SMG-RC) and GEWE-RC structures. Part II focuses on the development of compact analytical drain current model taking into account the transition regimes from sub-threshold to saturation. Furthermore, the drain conductance evaluation has also been obtained, reflecting relevance of the proposed device for analogue design. The analysis takes into account the effect of gate length and groove depth in order to develop a compact model suitable for device design. The analytical results predicted by the model confirm well with the simulated results. Results in part I also provide valuable design insights in the performance of nanoscale GEWE-RC MOSFET with optimum threshold voltage and negative junction depth (NJD), and hence serves as a tool to optimize important device and technological parameters for 40 nm technology
Regularized multivariate regression models with skew-t error distributions
Chen, Lianfu; Pourahmadi, Mohsen; Maadooliat, Mehdi
2014-01-01
We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both
Harris, L K; Whay, H R; Murrell, J C
2018-04-01
This study investigated the effects of osteoarthritis (OA) on somatosensory processing in dogs using mechanical threshold testing. A pressure algometer was used to measure mechanical thresholds in 27 dogs with presumed hind limb osteoarthritis and 28 healthy dogs. Mechanical thresholds were measured at the stifles, radii and sternum, and were correlated with scores from an owner questionnaire and a clinical checklist, a scoring system that quantified clinical signs of osteoarthritis. The effects of age and bodyweight on mechanical thresholds were also investigated. Multiple regression models indicated that, when bodyweight was taken into account, dogs with presumed osteoarthritis had lower mechanical thresholds at the stifles than control dogs, but not at other sites. Non-parametric correlations showed that clinical checklist scores and questionnaire scores were negatively correlated with mechanical thresholds at the stifles. The results suggest that mechanical threshold testing using a pressure algometer can detect primary, and possibly secondary, hyperalgesia in dogs with presumed osteoarthritis. This suggests that the mechanical threshold testing protocol used in this study might facilitate assessment of somatosensory changes associated with disease progression or response to treatment. Copyright © 2017. Published by Elsevier Ltd.
International Nuclear Information System (INIS)
Fang, Tingting; Lahdelma, Risto
2016-01-01
Highlights: • Social factor is considered for the linear regression models besides weather file. • Simultaneously optimize all the coefficients for linear regression models. • SARIMA combined with linear regression is used to forecast the heat demand. • The accuracy for both linear regression and time series models are evaluated. - Abstract: Forecasting heat demand is necessary for production and operation planning of district heating (DH) systems. In this study we first propose a simple regression model where the hourly outdoor temperature and wind speed forecast the heat demand. Weekly rhythm of heat consumption as a social component is added to the model to significantly improve the accuracy. The other type of model is the seasonal autoregressive integrated moving average (SARIMA) model with exogenous variables as a combination to take weather factors, and the historical heat consumption data as depending variables. One outstanding advantage of the model is that it peruses the high accuracy for both long-term and short-term forecast by considering both exogenous factors and time series. The forecasting performance of both linear regression models and time series model are evaluated based on real-life heat demand data for the city of Espoo in Finland by out-of-sample tests for the last 20 full weeks of the year. The results indicate that the proposed linear regression model (T168h) using 168-h demand pattern with midweek holidays classified as Saturdays or Sundays gives the highest accuracy and strong robustness among all the tested models based on the tested forecasting horizon and corresponding data. Considering the parsimony of the input, the ease of use and the high accuracy, the proposed T168h model is the best in practice. The heat demand forecasting model can also be developed for individual buildings if automated meter reading customer measurements are available. This would allow forecasting the heat demand based on more accurate heat consumption
Pigot, Corentin; Gilibert, Fabien; Reyboz, Marina; Bocquet, Marc; Zuliani, Paola; Portal, Jean-Michel
2018-04-01
Phase-change memory (PCM) compact modeling of the threshold switching based on a thermal runaway in Poole–Frenkel conduction is proposed. Although this approach is often used in physical models, this is the first time it is implemented in a compact model. The model accuracy is validated by a good correlation between simulations and experimental data collected on a PCM cell embedded in a 90 nm technology. A wide range of intermediate states is measured and accurately modeled with a single set of parameters, allowing multilevel programing. A good convergence is exhibited even in snapback simulation owing to this fully continuous approach. Moreover, threshold properties extraction indicates a thermally enhanced switching, which validates the basic hypothesis of the model. Finally, it is shown that this model is compliant with a new drift-resilient cell-state metric. Once enriched with a phase transition module, this compact model is ready to be implemented in circuit simulators.
Validation of regression models for nitrate concentrations in the upper groundwater in sandy soils
International Nuclear Information System (INIS)
Sonneveld, M.P.W.; Brus, D.J.; Roelsma, J.
2010-01-01
For Dutch sandy regions, linear regression models have been developed that predict nitrate concentrations in the upper groundwater on the basis of residual nitrate contents in the soil in autumn. The objective of our study was to validate these regression models for one particular sandy region dominated by dairy farming. No data from this area were used for calibrating the regression models. The model was validated by additional probability sampling. This sample was used to estimate errors in 1) the predicted areal fractions where the EU standard of 50 mg l -1 is exceeded for farms with low N surpluses (ALT) and farms with higher N surpluses (REF); 2) predicted cumulative frequency distributions of nitrate concentration for both groups of farms. Both the errors in the predicted areal fractions as well as the errors in the predicted cumulative frequency distributions indicate that the regression models are invalid for the sandy soils of this study area. - This study indicates that linear regression models that predict nitrate concentrations in the upper groundwater using residual soil N contents should be applied with care.
Caimmi, R.
2011-08-01
Concerning bivariate least squares linear regression, the classical approach pursued for functional models in earlier attempts ( York, 1966, 1969) is reviewed using a new formalism in terms of deviation (matrix) traces which, for unweighted data, reduce to usual quantities leaving aside an unessential (but dimensional) multiplicative factor. Within the framework of classical error models, the dependent variable relates to the independent variable according to the usual additive model. The classes of linear models considered are regression lines in the general case of correlated errors in X and in Y for weighted data, and in the opposite limiting situations of (i) uncorrelated errors in X and in Y, and (ii) completely correlated errors in X and in Y. The special case of (C) generalized orthogonal regression is considered in detail together with well known subcases, namely: (Y) errors in X negligible (ideally null) with respect to errors in Y; (X) errors in Y negligible (ideally null) with respect to errors in X; (O) genuine orthogonal regression; (R) reduced major-axis regression. In the limit of unweighted data, the results determined for functional models are compared with their counterparts related to extreme structural models i.e. the instrumental scatter is negligible (ideally null) with respect to the intrinsic scatter ( Isobe et al., 1990; Feigelson and Babu, 1992). While regression line slope and intercept estimators for functional and structural models necessarily coincide, the contrary holds for related variance estimators even if the residuals obey a Gaussian distribution, with the exception of Y models. An example of astronomical application is considered, concerning the [O/H]-[Fe/H] empirical relations deduced from five samples related to different stars and/or different methods of oxygen abundance determination. For selected samples and assigned methods, different regression models yield consistent results within the errors (∓ σ) for both
Modeling and prediction of flotation performance using support vector regression
Directory of Open Access Journals (Sweden)
Despotović Vladimir
2017-01-01
Full Text Available Continuous efforts have been made in recent year to improve the process of paper recycling, as it is of critical importance for saving the wood, water and energy resources. Flotation deinking is considered to be one of the key methods for separation of ink particles from the cellulose fibres. Attempts to model the flotation deinking process have often resulted in complex models that are difficult to implement and use. In this paper a model for prediction of flotation performance based on Support Vector Regression (SVR, is presented. Representative data samples were created in laboratory, under a variety of practical control variables for the flotation deinking process, including different reagents, pH values and flotation residence time. Predictive model was created that was trained on these data samples, and the flotation performance was assessed showing that Support Vector Regression is a promising method even when dataset used for training the model is limited.
Forecasting daily meteorological time series using ARIMA and regression models
Murat, Małgorzata; Malinowska, Iwona; Gos, Magdalena; Krzyszczak, Jaromir
2018-04-01
The daily air temperature and precipitation time series recorded between January 1, 1980 and December 31, 2010 in four European sites (Jokioinen, Dikopshof, Lleida and Lublin) from different climatic zones were modeled and forecasted. In our forecasting we used the methods of the Box-Jenkins and Holt- Winters seasonal auto regressive integrated moving-average, the autoregressive integrated moving-average with external regressors in the form of Fourier terms and the time series regression, including trend and seasonality components methodology with R software. It was demonstrated that obtained models are able to capture the dynamics of the time series data and to produce sensible forecasts.
Replica analysis of overfitting in regression models for time-to-event data
Coolen, A. C. C.; Barrett, J. E.; Paga, P.; Perez-Vicente, C. J.
2017-09-01
Overfitting, which happens when the number of parameters in a model is too large compared to the number of data points available for determining these parameters, is a serious and growing problem in survival analysis. While modern medicine presents us with data of unprecedented dimensionality, these data cannot yet be used effectively for clinical outcome prediction. Standard error measures in maximum likelihood regression, such as p-values and z-scores, are blind to overfitting, and even for Cox’s proportional hazards model (the main tool of medical statisticians), one finds in literature only rules of thumb on the number of samples required to avoid overfitting. In this paper we present a mathematical theory of overfitting in regression models for time-to-event data, which aims to increase our quantitative understanding of the problem and provide practical tools with which to correct regression outcomes for the impact of overfitting. It is based on the replica method, a statistical mechanical technique for the analysis of heterogeneous many-variable systems that has been used successfully for several decades in physics, biology, and computer science, but not yet in medical statistics. We develop the theory initially for arbitrary regression models for time-to-event data, and verify its predictions in detail for the popular Cox model.
Cross-validation pitfalls when selecting and assessing regression and classification models.
Krstajic, Damjan; Buturovic, Ljubomir J; Leahy, David E; Thomas, Simon
2014-03-29
We address the problem of selecting and assessing classification and regression models using cross-validation. Current state-of-the-art methods can yield models with high variance, rendering them unsuitable for a number of practical applications including QSAR. In this paper we describe and evaluate best practices which improve reliability and increase confidence in selected models. A key operational component of the proposed methods is cloud computing which enables routine use of previously infeasible approaches. We describe in detail an algorithm for repeated grid-search V-fold cross-validation for parameter tuning in classification and regression, and we define a repeated nested cross-validation algorithm for model assessment. As regards variable selection and parameter tuning we define two algorithms (repeated grid-search cross-validation and double cross-validation), and provide arguments for using the repeated grid-search in the general case. We show results of our algorithms on seven QSAR datasets. The variation of the prediction performance, which is the result of choosing different splits of the dataset in V-fold cross-validation, needs to be taken into account when selecting and assessing classification and regression models. We demonstrate the importance of repeating cross-validation when selecting an optimal model, as well as the importance of repeating nested cross-validation when assessing a prediction error.
A LATENT CLASS POISSON REGRESSION-MODEL FOR HETEROGENEOUS COUNT DATA
WEDEL, M; DESARBO, WS; BULT, [No Value; RAMASWAMY, [No Value
1993-01-01
In this paper an approach is developed that accommodates heterogeneity in Poisson regression models for count data. The model developed assumes that heterogeneity arises from a distribution of both the intercept and the coefficients of the explanatory variables. We assume that the mixing
International Nuclear Information System (INIS)
Bwo-Nung Huang; National Chia-Yi University; Hwang, M.J.; Hsiao-Ping Peng
2005-01-01
This paper applies the multivariate threshold model to investigate the impacts of an oil price change and its volatility on economic activities (changes in industrial production and real stock returns). The statistical test on the existence of a threshold effect indicates that a threshold value does exist. Using monthly data of the US, Canada, and Japan during the period from 1970 to 2002, we conclude: (i) the optimal threshold level seems to vary according to how an economy depends on imported oil and the attitude towards adopting energy-saving technology; (ii) an oil price change or its volatility has a limited impact on the economies if the change is below the threshold levels; (iii) if the change is above threshold levels, it appears that the change in oil price better explains macroeconomic variables than the volatility of the oil price; and (iv) if the change is above threshold levels, a change in oil price or its volatility explains the model better than the real interest rate. (author)
Two-dimensional threshold voltage analytical model of DMG strained-silicon-on-insulator MOSFETs
International Nuclear Information System (INIS)
Li Jin; Liu Hongxia; Li Bin; Cao Lei; Yuan Bo
2010-01-01
For the first time, a simple and accurate two-dimensional analytical model for the surface potential variation along the channel in fully depleted dual-material gate strained-Si-on-insulator (DMG SSOI) MOSFETs is developed. We investigate the improved short channel effect (SCE), hot carrier effect (HCE), drain-induced barrier-lowering (DIBL) and carrier transport efficiency for the novel structure MOSFET. The analytical model takes into account the effects of different metal gate lengths, work functions, the drain bias and Ge mole fraction in the relaxed SiGe buffer. The surface potential in the channel region exhibits a step potential, which can suppress SCE, HCE and DIBL. Also, strained-Si and SOI structure can improve the carrier transport efficiency, with strained-Si being particularly effective. Further, the threshold voltage model correctly predicts a 'rollup' in threshold voltage with decreasing channel length ratios or Ge mole fraction in the relaxed SiGe buffer. The validity of the two-dimensional analytical model is verified using numerical simulations. (semiconductor devices)
Continuous validation of ASTEC containment models and regression testing
International Nuclear Information System (INIS)
Nowack, Holger; Reinke, Nils; Sonnenkalb, Martin
2014-01-01
The focus of the ASTEC (Accident Source Term Evaluation Code) development at GRS is primarily on the containment module CPA (Containment Part of ASTEC), whose modelling is to a large extent based on the GRS containment code COCOSYS (COntainment COde SYStem). Validation is usually understood as the approval of the modelling capabilities by calculations of appropriate experiments done by external users different from the code developers. During the development process of ASTEC CPA, bugs and unintended side effects may occur, which leads to changes in the results of the initially conducted validation. Due to the involvement of a considerable number of developers in the coding of ASTEC modules, validation of the code alone, even if executed repeatedly, is not sufficient. Therefore, a regression testing procedure has been implemented in order to ensure that the initially obtained validation results are still valid with succeeding code versions. Within the regression testing procedure, calculations of experiments and plant sequences are performed with the same input deck but applying two different code versions. For every test-case the up-to-date code version is compared to the preceding one on the basis of physical parameters deemed to be characteristic for the test-case under consideration. In the case of post-calculations of experiments also a comparison to experimental data is carried out. Three validation cases from the regression testing procedure are presented within this paper. The very good post-calculation of the HDR E11.1 experiment shows the high quality modelling of thermal-hydraulics in ASTEC CPA. Aerosol behaviour is validated on the BMC VANAM M3 experiment, and the results show also a very good agreement with experimental data. Finally, iodine behaviour is checked in the validation test-case of the THAI IOD-11 experiment. Within this test-case, the comparison of the ASTEC versions V2.0r1 and V2.0r2 shows how an error was detected by the regression testing
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables; 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
Prediction of Mind-Wandering with Electroencephalogram and Non-linear Regression Modeling.
Kawashima, Issaku; Kumano, Hiroaki
2017-01-01
Mind-wandering (MW), task-unrelated thought, has been examined by researchers in an increasing number of articles using models to predict whether subjects are in MW, using numerous physiological variables. However, these models are not applicable in general situations. Moreover, they output only binary classification. The current study suggests that the combination of electroencephalogram (EEG) variables and non-linear regression modeling can be a good indicator of MW intensity. We recorded EEGs of 50 subjects during the performance of a Sustained Attention to Response Task, including a thought sampling probe that inquired the focus of attention. We calculated the power and coherence value and prepared 35 patterns of variable combinations and applied Support Vector machine Regression (SVR) to them. Finally, we chose four SVR models: two of them non-linear models and the others linear models; two of the four models are composed of a limited number of electrodes to satisfy model usefulness. Examination using the held-out data indicated that all models had robust predictive precision and provided significantly better estimations than a linear regression model using single electrode EEG variables. Furthermore, in limited electrode condition, non-linear SVR model showed significantly better precision than linear SVR model. The method proposed in this study helps investigations into MW in various little-examined situations. Further, by measuring MW with a high temporal resolution EEG, unclear aspects of MW, such as time series variation, are expected to be revealed. Furthermore, our suggestion that a few electrodes can also predict MW contributes to the development of neuro-feedback studies.
Prediction of Mind-Wandering with Electroencephalogram and Non-linear Regression Modeling
Directory of Open Access Journals (Sweden)
Issaku Kawashima
2017-07-01
Full Text Available Mind-wandering (MW, task-unrelated thought, has been examined by researchers in an increasing number of articles using models to predict whether subjects are in MW, using numerous physiological variables. However, these models are not applicable in general situations. Moreover, they output only binary classification. The current study suggests that the combination of electroencephalogram (EEG variables and non-linear regression modeling can be a good indicator of MW intensity. We recorded EEGs of 50 subjects during the performance of a Sustained Attention to Response Task, including a thought sampling probe that inquired the focus of attention. We calculated the power and coherence value and prepared 35 patterns of variable combinations and applied Support Vector machine Regression (SVR to them. Finally, we chose four SVR models: two of them non-linear models and the others linear models; two of the four models are composed of a limited number of electrodes to satisfy model usefulness. Examination using the held-out data indicated that all models had robust predictive precision and provided significantly better estimations than a linear regression model using single electrode EEG variables. Furthermore, in limited electrode condition, non-linear SVR model showed significantly better precision than linear SVR model. The method proposed in this study helps investigations into MW in various little-examined situations. Further, by measuring MW with a high temporal resolution EEG, unclear aspects of MW, such as time series variation, are expected to be revealed. Furthermore, our suggestion that a few electrodes can also predict MW contributes to the development of neuro-feedback studies.
Synergistic effects in threshold models on networks
Juul, Jonas S.; Porter, Mason A.
2018-01-01
Network structure can have a significant impact on the propagation of diseases, memes, and information on social networks. Different types of spreading processes (and other dynamical processes) are affected by network architecture in different ways, and it is important to develop tractable models of spreading processes on networks to explore such issues. In this paper, we incorporate the idea of synergy into a two-state ("active" or "passive") threshold model of social influence on networks. Our model's update rule is deterministic, and the influence of each meme-carrying (i.e., active) neighbor can—depending on a parameter—either be enhanced or inhibited by an amount that depends on the number of active neighbors of a node. Such a synergistic system models social behavior in which the willingness to adopt either accelerates or saturates in a way that depends on the number of neighbors who have adopted that behavior. We illustrate that our model's synergy parameter has a crucial effect on system dynamics, as it determines whether degree-k nodes are possible or impossible to activate. We simulate synergistic meme spreading on both random-graph models and networks constructed from empirical data. Using a heterogeneous mean-field approximation, which we derive under the assumption that a network is locally tree-like, we are able to determine which synergy-parameter values allow degree-k nodes to be activated for many networks and for a broad family of synergistic models.
Reconstruction of missing daily streamflow data using dynamic regression models
Tencaliec, Patricia; Favre, Anne-Catherine; Prieur, Clémentine; Mathevet, Thibault
2015-12-01
River discharge is one of the most important quantities in hydrology. It provides fundamental records for water resources management and climate change monitoring. Even very short data-gaps in this information can cause extremely different analysis outputs. Therefore, reconstructing missing data of incomplete data sets is an important step regarding the performance of the environmental models, engineering, and research applications, thus it presents a great challenge. The objective of this paper is to introduce an effective technique for reconstructing missing daily discharge data when one has access to only daily streamflow data. The proposed procedure uses a combination of regression and autoregressive integrated moving average models (ARIMA) called dynamic regression model. This model uses the linear relationship between neighbor and correlated stations and then adjusts the residual term by fitting an ARIMA structure. Application of the model to eight daily streamflow data for the Durance river watershed showed that the model yields reliable estimates for the missing data in the time series. Simulation studies were also conducted to evaluate the performance of the procedure.
Regional rainfall thresholds for landslide occurrence using a centenary database
Vaz, Teresa; Luís Zêzere, José; Pereira, Susana; Cruz Oliveira, Sérgio; Garcia, Ricardo A. C.; Quaresma, Ivânia
2018-04-01
This work proposes a comprehensive method to assess rainfall thresholds for landslide initiation using a centenary landslide database associated with a single centenary daily rainfall data set. The method is applied to the Lisbon region and includes the rainfall return period analysis that was used to identify the critical rainfall combination (cumulated rainfall duration) related to each landslide event. The spatial representativeness of the reference rain gauge is evaluated and the rainfall thresholds are assessed and calibrated using the receiver operating characteristic (ROC) metrics. Results show that landslide events located up to 10 km from the rain gauge can be used to calculate the rainfall thresholds in the study area; however, these thresholds may be used with acceptable confidence up to 50 km from the rain gauge. The rainfall thresholds obtained using linear and potential regression perform well in ROC metrics. However, the intermediate thresholds based on the probability of landslide events established in the zone between the lower-limit threshold and the upper-limit threshold are much more informative as they indicate the probability of landslide event occurrence given rainfall exceeding the threshold. This information can be easily included in landslide early warning systems, especially when combined with the probability of rainfall above each threshold.
Detection of Outliers in Regression Model for Medical Data
Directory of Open Access Journals (Sweden)
Stephen Raj S
2017-07-01
Full Text Available In regression analysis, an outlier is an observation for which the residual is large in magnitude compared to other observations in the data set. The detection of outliers and influential points is an important step of the regression analysis. Outlier detection methods have been used to detect and remove anomalous values from data. In this paper, we detect the presence of outliers in simple linear regression models for medical data set. Chatterjee and Hadi mentioned that the ordinary residuals are not appropriate for diagnostic purposes; a transformed version of them is preferable. First, we investigate the presence of outliers based on existing procedures of residuals and standardized residuals. Next, we have used the new approach of standardized scores for detecting outliers without the use of predicted values. The performance of the new approach was verified with the real-life data.
String Threshold corrections in models with spondaneously broken supersymmetry
Kiritsis, Elias B; Petropoulos, P M; Rizos, J
1999-01-01
We analyse a class of four-dimensional heterotic ground states with N=2 space-time supersymmetry. From the ten-dimensional perspective, such models can be viewed as compactifications on a six-dimensional manifold with SU(2) holonomy, which is locally but not globally K3 x T^2. The maximal N=4 supersymmetry is spontaneously broken to N=2. The masses of the two massive gravitinos depend on the (T,U) moduli of T^2. We evaluate the one-loop threshold corrections of gauge and R^2 couplings and we show that they fall in several universality classes, in contrast to what happens in usual K3 x T^2 compactifications, where the N=4 supersymmetry is explicitly broken to N=2, and where a single universality class appears. These universality properties follow from the structure of the elliptic genus. The behaviour of the threshold corrections as functions of the moduli is analysed in detail: it is singular across several rational lines of the T^2 moduli because of the appearance of extra massless states, and suffers only f...
Estimasi Model Seemingly Unrelated Regression (SUR dengan Metode Generalized Least Square (GLS
Directory of Open Access Journals (Sweden)
Ade Widyaningsih
2015-04-01
Full Text Available Regression analysis is a statistical tool that is used to determine the relationship between two or more quantitative variables so that one variable can be predicted from the other variables. A method that can used to obtain a good estimation in the regression analysis is ordinary least squares method. The least squares method is used to estimate the parameters of one or more regression but relationships among the errors in the response of other estimators are not allowed. One way to overcome this problem is Seemingly Unrelated Regression model (SUR in which parameters are estimated using Generalized Least Square (GLS. In this study, the author applies SUR model using GLS method on world gasoline demand data. The author obtains that SUR using GLS is better than OLS because SUR produce smaller errors than the OLS.
Estimasi Model Seemingly Unrelated Regression (SUR dengan Metode Generalized Least Square (GLS
Directory of Open Access Journals (Sweden)
Ade Widyaningsih
2014-06-01
Full Text Available Regression analysis is a statistical tool that is used to determine the relationship between two or more quantitative variables so that one variable can be predicted from the other variables. A method that can used to obtain a good estimation in the regression analysis is ordinary least squares method. The least squares method is used to estimate the parameters of one or more regression but relationships among the errors in the response of other estimators are not allowed. One way to overcome this problem is Seemingly Unrelated Regression model (SUR in which parameters are estimated using Generalized Least Square (GLS. In this study, the author applies SUR model using GLS method on world gasoline demand data. The author obtains that SUR using GLS is better than OLS because SUR produce smaller errors than the OLS.
On pseudo-values for regression analysis in competing risks models
DEFF Research Database (Denmark)
Graw, F; Gerds, Thomas Alexander; Schumacher, M
2009-01-01
For regression on state and transition probabilities in multi-state models Andersen et al. (Biometrika 90:15-27, 2003) propose a technique based on jackknife pseudo-values. In this article we analyze the pseudo-values suggested for competing risks models and prove some conjectures regarding their...
Modeling and prediction of Turkey's electricity consumption using Support Vector Regression
International Nuclear Information System (INIS)
Kavaklioglu, Kadir
2011-01-01
Support Vector Regression (SVR) methodology is used to model and predict Turkey's electricity consumption. Among various SVR formalisms, ε-SVR method was used since the training pattern set was relatively small. Electricity consumption is modeled as a function of socio-economic indicators such as population, Gross National Product, imports and exports. In order to facilitate future predictions of electricity consumption, a separate SVR model was created for each of the input variables using their current and past values; and these models were combined to yield consumption prediction values. A grid search for the model parameters was performed to find the best ε-SVR model for each variable based on Root Mean Square Error. Electricity consumption of Turkey is predicted until 2026 using data from 1975 to 2006. The results show that electricity consumption can be modeled using Support Vector Regression and the models can be used to predict future electricity consumption. (author)
Soyka, Florian; Giordano, Paolo Robuffo; Barnett-Cowan, Michael; Bülthoff, Heinrich H
2012-07-01
Understanding the dynamics of vestibular perception is important, for example, for improving the realism of motion simulation and virtual reality environments or for diagnosing patients suffering from vestibular problems. Previous research has found a dependence of direction discrimination thresholds for rotational motions on the period length (inverse frequency) of a transient (single cycle) sinusoidal acceleration stimulus. However, self-motion is seldom purely sinusoidal, and up to now, no models have been proposed that take into account non-sinusoidal stimuli for rotational motions. In this work, the influence of both the period length and the specific time course of an inertial stimulus is investigated. Thresholds for three acceleration profile shapes (triangular, sinusoidal, and trapezoidal) were measured for three period lengths (0.3, 1.4, and 6.7 s) in ten participants. A two-alternative forced-choice discrimination task was used where participants had to judge if a yaw rotation around an earth-vertical axis was leftward or rightward. The peak velocity of the stimulus was varied, and the threshold was defined as the stimulus yielding 75 % correct answers. In accordance with previous research, thresholds decreased with shortening period length (from ~2 deg/s for 6.7 s to ~0.8 deg/s for 0.3 s). The peak velocity was the determining factor for discrimination: Different profiles with the same period length have similar velocity thresholds. These measurements were used to fit a novel model based on a description of the firing rate of semi-circular canal neurons. In accordance with previous research, the estimates of the model parameters suggest that velocity storage does not influence perceptual thresholds.
Predictive based monitoring of nuclear plant component degradation using support vector regression
International Nuclear Information System (INIS)
Agarwal, Vivek; Alamaniotis, Miltiadis; Tsoukalas, Lefteri H.
2015-01-01
Nuclear power plants (NPPs) are large installations comprised of many active and passive assets. Degradation monitoring of all these assets is expensive (labor cost) and highly demanding task. In this paper a framework based on Support Vector Regression (SVR) for online surveillance of critical parameter degradation of NPP components is proposed. In this case, on time replacement or maintenance of components will prevent potential plant malfunctions, and reduce the overall operational cost. In the current work, we apply SVR equipped with a Gaussian kernel function to monitor components. Monitoring includes the one-step-ahead prediction of the component's respective operational quantity using the SVR model, while the SVR model is trained using a set of previous recorded degradation histories of similar components. Predictive capability of the model is evaluated upon arrival of a sensor measurement, which is compared to the component failure threshold. A maintenance decision is based on a fuzzy inference system that utilizes three parameters: (i) prediction evaluation in the previous steps, (ii) predicted value of the current step, (iii) and difference of current predicted value with components failure thresholds. The proposed framework will be tested on turbine blade degradation data.
Cluster regression model and level fluctuation features of Van Lake, Turkey
Directory of Open Access Journals (Sweden)
Z. Şen
1999-02-01
Full Text Available Lake water levels change under the influences of natural and/or anthropogenic environmental conditions. Among these influences are the climate change, greenhouse effects and ozone layer depletions which are reflected in the hydrological cycle features over the lake drainage basins. Lake levels are among the most significant hydrological variables that are influenced by different atmospheric and environmental conditions. Consequently, lake level time series in many parts of the world include nonstationarity components such as shifts in the mean value, apparent or hidden periodicities. On the other hand, many lake level modeling techniques have a stationarity assumption. The main purpose of this work is to develop a cluster regression model for dealing with nonstationarity especially in the form of shifting means. The basis of this model is the combination of transition probability and classical regression technique. Both parts of the model are applied to monthly level fluctuations of Lake Van in eastern Turkey. It is observed that the cluster regression procedure does preserve the statistical properties and the transitional probabilities that are indistinguishable from the original data.Key words. Hydrology (hydrologic budget; stochastic processes · Meteorology and atmospheric dynamics (ocean-atmosphere interactions
Cluster regression model and level fluctuation features of Van Lake, Turkey
Directory of Open Access Journals (Sweden)
Z. Şen
Full Text Available Lake water levels change under the influences of natural and/or anthropogenic environmental conditions. Among these influences are the climate change, greenhouse effects and ozone layer depletions which are reflected in the hydrological cycle features over the lake drainage basins. Lake levels are among the most significant hydrological variables that are influenced by different atmospheric and environmental conditions. Consequently, lake level time series in many parts of the world include nonstationarity components such as shifts in the mean value, apparent or hidden periodicities. On the other hand, many lake level modeling techniques have a stationarity assumption. The main purpose of this work is to develop a cluster regression model for dealing with nonstationarity especially in the form of shifting means. The basis of this model is the combination of transition probability and classical regression technique. Both parts of the model are applied to monthly level fluctuations of Lake Van in eastern Turkey. It is observed that the cluster regression procedure does preserve the statistical properties and the transitional probabilities that are indistinguishable from the original data.
Key words. Hydrology (hydrologic budget; stochastic processes · Meteorology and atmospheric dynamics (ocean-atmosphere interactions
Stylized facts from a threshold-based heterogeneous agent model
Cross, R.; Grinfeld, M.; Lamba, H.; Seaman, T.
2007-05-01
A class of heterogeneous agent models is investigated where investors switch trading position whenever their motivation to do so exceeds some critical threshold. These motivations can be psychological in nature or reflect behaviour suggested by the efficient market hypothesis (EMH). By introducing different propensities into a baseline model that displays EMH behaviour, one can attempt to isolate their effects upon the market dynamics. The simulation results indicate that the introduction of a herding propensity results in excess kurtosis and power-law decay consistent with those observed in actual return distributions, but not in significant long-term volatility correlations. Possible alternatives for introducing such long-term volatility correlations are then identified and discussed.
Directory of Open Access Journals (Sweden)
Marco Rotiroti
2018-01-01
Full Text Available The reductive dissolution of Fe-oxide driven by organic matter oxidation is the primary mechanism accepted for As mobilization in several alluvial aquifers. These processes are often mediated by microorganisms that require a minimum Gibbs energy available to conduct the reaction in order to sustain their life functions. Implementing this threshold energy in reactive transport modeling is rarely used in the existing literature. This work presents a 1D reactive transport modeling of As mobilization by the reductive dissolution of Fe-oxide and subsequent immobilization by co-precipitation in iron sulfides considering a threshold energy for the following terminal electron accepting processes: (a Fe-oxide reduction, (b sulfate reduction, and (c methanogenesis. The model is then extended by implementing a threshold energy on both reaction directions for the redox reaction pairs Fe(III reduction/Fe(II oxidation and methanogenesis/methane oxidation. The optimal threshold energy fitted in 4.50, 3.76, and 1.60 kJ/mol e− for sulfate reduction, Fe(III reduction/Fe(II oxidation, and methanogenesis/methane oxidation, respectively. The use of models implementing bidirectional threshold energy is needed when a redox reaction pair can be transported between domains with different redox potentials. This may often occur in 2D or 3D simulations.
Improved model of the retardance in citric acid coated ferrofluids using stepwise regression
Lin, J. F.; Qiu, X. R.
2017-06-01
Citric acid (CA) coated Fe3O4 ferrofluids (FFs) have been conducted for biomedical application. The magneto-optical retardance of CA coated FFs was measured by a Stokes polarimeter. Optimization and multiple regression of retardance in FFs were executed by Taguchi method and Microsoft Excel previously, and the F value of regression model was large enough. However, the model executed by Excel was not systematic. Instead we adopted the stepwise regression to model the retardance of CA coated FFs. From the results of stepwise regression by MATLAB, the developed model had highly predictable ability owing to F of 2.55897e+7 and correlation coefficient of one. The average absolute error of predicted retardances to measured retardances was just 0.0044%. Using the genetic algorithm (GA) in MATLAB, the optimized parametric combination was determined as [4.709 0.12 39.998 70.006] corresponding to the pH of suspension, molar ratio of CA to Fe3O4, CA volume, and coating temperature. The maximum retardance was found as 31.712°, close to that obtained by evolutionary solver in Excel and a relative error of -0.013%. Above all, the stepwise regression method was successfully used to model the retardance of CA coated FFs, and the maximum global retardance was determined by the use of GA.
Metro passengers’ route choice model and its application considering perceived transfer threshold
Jin, Fanglei; Zhang, Yongsheng; Liu, Shasha
2017-01-01
With the rapid development of the Metro network in China, the greatly increased route alternatives make passengers’ route choice behavior and passenger flow assignment more complicated, which presents challenges to the operation management. In this paper, a path sized logit model is adopted to analyze passengers’ route choice preferences considering such parameters as in-vehicle time, number of transfers, and transfer time. Moreover, the “perceived transfer threshold” is defined and included in the utility function to reflect the penalty difference caused by transfer time on passengers’ perceived utility under various numbers of transfers. Next, based on the revealed preference data collected in the Guangzhou Metro, the proposed model is calibrated. The appropriate perceived transfer threshold value and the route choice preferences are analyzed. Finally, the model is applied to a personalized route planning case to demonstrate the engineering practicability of route choice behavior analysis. The results show that the introduction of the perceived transfer threshold is helpful to improve the model’s explanatory abilities. In addition, personalized route planning based on route choice preferences can meet passengers’ diversified travel demands. PMID:28957376
Djulbegovic, Benjamin; van den Ende, Jef; Hamm, Robert M; Mayrhofer, Thomas; Hozo, Iztok; Pauker, Stephen G
2015-05-01
The threshold model represents an important advance in the field of medical decision-making. It is a linchpin between evidence (which exists on the continuum of credibility) and decision-making (which is a categorical exercise - we decide to act or not act). The threshold concept is closely related to the question of rational decision-making. When should the physician act, that is order a diagnostic test, or prescribe treatment? The threshold model embodies the decision theoretic rationality that says the most rational decision is to prescribe treatment when the expected treatment benefit outweighs its expected harms. However, the well-documented large variation in the way physicians order diagnostic tests or decide to administer treatments is consistent with a notion that physicians' individual action thresholds vary. We present a narrative review summarizing the existing literature on physicians' use of a threshold strategy for decision-making. We found that the observed variation in decision action thresholds is partially due to the way people integrate benefits and harms. That is, explanation of variation in clinical practice can be reduced to a consideration of thresholds. Limited evidence suggests that non-expected utility threshold (non-EUT) models, such as regret-based and dual-processing models, may explain current medical practice better. However, inclusion of costs and recognition of risk attitudes towards uncertain treatment effects and comorbidities may improve the explanatory and predictive value of the EUT-based threshold models. The decision when to act is closely related to the question of rational choice. We conclude that the medical community has not yet fully defined criteria for rational clinical decision-making. The traditional notion of rationality rooted in EUT may need to be supplemented by reflective rationality, which strives to integrate all aspects of medical practice - medical, humanistic and socio-economic - within a coherent
Accounting for spatial effects in land use regression for urban air pollution modeling.
Bertazzon, Stefania; Johnson, Markey; Eccles, Kristin; Kaplan, Gilaad G
2015-01-01
In order to accurately assess air pollution risks, health studies require spatially resolved pollution concentrations. Land-use regression (LUR) models estimate ambient concentrations at a fine spatial scale. However, spatial effects such as spatial non-stationarity and spatial autocorrelation can reduce the accuracy of LUR estimates by increasing regression errors and uncertainty; and statistical methods for resolving these effects--e.g., spatially autoregressive (SAR) and geographically weighted regression (GWR) models--may be difficult to apply simultaneously. We used an alternate approach to address spatial non-stationarity and spatial autocorrelation in LUR models for nitrogen dioxide. Traditional models were re-specified to include a variable capturing wind speed and direction, and re-fit as GWR models. Mean R(2) values for the resulting GWR-wind models (summer: 0.86, winter: 0.73) showed a 10-20% improvement over traditional LUR models. GWR-wind models effectively addressed both spatial effects and produced meaningful predictive models. These results suggest a useful method for improving spatially explicit models. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Regression analysis understanding and building business and economic models using Excel
Wilson, J Holton
2012-01-01
The technique of regression analysis is used so often in business and economics today that an understanding of its use is necessary for almost everyone engaged in the field. This book will teach you the essential elements of building and understanding regression models in a business/economic context in an intuitive manner. The authors take a non-theoretical treatment that is accessible even if you have a limited statistical background. It is specifically designed to teach the correct use of regression, while advising you of its limitations and teaching about common pitfalls. This book describe
The prediction of intelligence in preschool children using alternative models to regression.
Finch, W Holmes; Chang, Mei; Davis, Andrew S; Holden, Jocelyn E; Rothlisberg, Barbara A; McIntosh, David E
2011-12-01
Statistical prediction of an outcome variable using multiple independent variables is a common practice in the social and behavioral sciences. For example, neuropsychologists are sometimes called upon to provide predictions of preinjury cognitive functioning for individuals who have suffered a traumatic brain injury. Typically, these predictions are made using standard multiple linear regression models with several demographic variables (e.g., gender, ethnicity, education level) as predictors. Prior research has shown conflicting evidence regarding the ability of such models to provide accurate predictions of outcome variables such as full-scale intelligence (FSIQ) test scores. The present study had two goals: (1) to demonstrate the utility of a set of alternative prediction methods that have been applied extensively in the natural sciences and business but have not been frequently explored in the social sciences and (2) to develop models that can be used to predict premorbid cognitive functioning in preschool children. Predictions of Stanford-Binet 5 FSIQ scores for preschool-aged children is used to compare the performance of a multiple regression model with several of these alternative methods. Results demonstrate that classification and regression trees provided more accurate predictions of FSIQ scores than does the more traditional regression approach. Implications of these results are discussed.
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
Hosmer, David W; Sturdivant, Rodney X
2013-01-01
A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
Directory of Open Access Journals (Sweden)
Soyoung Park
2017-07-01
Full Text Available This study mapped and analyzed groundwater potential using two different models, logistic regression (LR and multivariate adaptive regression splines (MARS, and compared the results. A spatial database was constructed for groundwater well data and groundwater influence factors. Groundwater well data with a high potential yield of ≥70 m3/d were extracted, and 859 locations (70% were used for model training, whereas the other 365 locations (30% were used for model validation. We analyzed 16 groundwater influence factors including altitude, slope degree, slope aspect, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport index, distance from drainage, drainage density, lithology, distance from fault, fault density, distance from lineament, lineament density, and land cover. Groundwater potential maps (GPMs were constructed using LR and MARS models and tested using a receiver operating characteristics curve. Based on this analysis, the area under the curve (AUC for the success rate curve of GPMs created using the MARS and LR models was 0.867 and 0.838, and the AUC for the prediction rate curve was 0.836 and 0.801, respectively. This implies that the MARS model is useful and effective for groundwater potential analysis in the study area.
Sample size calculation to externally validate scoring systems based on logistic regression models.
Directory of Open Access Journals (Sweden)
Antonio Palazón-Bru
Full Text Available A sample size containing at least 100 events and 100 non-events has been suggested to validate a predictive model, regardless of the model being validated and that certain factors can influence calibration of the predictive model (discrimination, parameterization and incidence. Scoring systems based on binary logistic regression models are a specific type of predictive model.The aim of this study was to develop an algorithm to determine the sample size for validating a scoring system based on a binary logistic regression model and to apply it to a case study.The algorithm was based on bootstrap samples in which the area under the ROC curve, the observed event probabilities through smooth curves, and a measure to determine the lack of calibration (estimated calibration index were calculated. To illustrate its use for interested researchers, the algorithm was applied to a scoring system, based on a binary logistic regression model, to determine mortality in intensive care units.In the case study provided, the algorithm obtained a sample size with 69 events, which is lower than the value suggested in the literature.An algorithm is provided for finding the appropriate sample size to validate scoring systems based on binary logistic regression models. This could be applied to determine the sample size in other similar cases.
Detection of Cutting Tool Wear using Statistical Analysis and Regression Model
Ghani, Jaharah A.; Rizal, Muhammad; Nuawi, Mohd Zaki; Haron, Che Hassan Che; Ramli, Rizauddin
2010-10-01
This study presents a new method for detecting the cutting tool wear based on the measured cutting force signals. A statistical-based method called Integrated Kurtosis-based Algorithm for Z-Filter technique, called I-kaz was used for developing a regression model and 3D graphic presentation of I-kaz 3D coefficient during machining process. The machining tests were carried out using a CNC turning machine Colchester Master Tornado T4 in dry cutting condition. A Kistler 9255B dynamometer was used to measure the cutting force signals, which were transmitted, analyzed, and displayed in the DasyLab software. Various force signals from machining operation were analyzed, and each has its own I-kaz 3D coefficient. This coefficient was examined and its relationship with flank wear lands (VB) was determined. A regression model was developed due to this relationship, and results of the regression model shows that the I-kaz 3D coefficient value decreases as tool wear increases. The result then is used for real time tool wear monitoring.
A Threshold Continuum for Aeolian Sand Transport
Swann, C.; Ewing, R. C.; Sherman, D. J.
2015-12-01
The threshold of motion for aeolian sand transport marks the initial entrainment of sand particles by the force of the wind. This is typically defined and modeled as a singular wind speed for a given grain size and is based on field and laboratory experimental data. However, the definition of threshold varies significantly between these empirical models, largely because the definition is based on visual-observations of initial grain movement. For example, in his seminal experiments, Bagnold defined threshold of motion when he observed that 100% of the bed was in motion. Others have used 50% and lesser values. Differences in threshold models, in turn, result is large errors in predicting the fluxes associated with sand and dust transport. Here we use a wind tunnel and novel sediment trap to capture the fractions of sand in creep, reptation and saltation at Earth and Mars pressures and show that the threshold of motion for aeolian sand transport is best defined as a continuum in which grains progress through stages defined by the proportion of grains in creep and saltation. We propose the use of scale dependent thresholds modeled by distinct probability distribution functions that differentiate the threshold based on micro to macro scale applications. For example, a geologic timescale application corresponds to a threshold when 100% of the bed in motion whereas a sub-second application corresponds to a threshold when a single particle is set in motion. We provide quantitative measurements (number and mode of particle movement) corresponding to visual observations, percent of bed in motion and degrees of transport intermittency for Earth and Mars. Understanding transport as a continuum provides a basis for revaluating sand transport thresholds on Earth, Mars and Titan.
Rainfall thresholds and susceptibility mapping for shallow landslides and debris flows in Scotland
Postance, Benjamin; Hillier, John; Dijkstra, Tom; Dixon, Neil
2017-04-01
Shallow translational slides and debris flows (hereafter 'landslides') pose a significant threat to life and cause significant annual economic impacts (e.g. by damage and disruption of infrastructure). The focus of this research is on the definition of objective rainfall thresholds using a weather radar system and landslide susceptibility mapping. In the study area Scotland, an inventory of 75 known landslides was used for the period 2003 to 2016. First, the effect of using different rain records (i.e. time series length) on two threshold selection techniques in receiver operating characteristic (ROC) analysis was evaluated. The results show that thresholds selected by 'Threat Score' (minimising false alarms) are sensitive to rain record length and which is not routinely considered, whereas thresholds selected using 'Optimal Point' (minimising failed alarms) are not; therefore these may be suited to establishing lower limit thresholds and be of interest to those developing early warning systems. Robust thresholds are found for combinations of normalised rain duration and accumulation at 1 and 12 day's antecedence respectively; these are normalised using the rainy-day normal and an equivalent measure for rain intensity. This research indicates that, in Scotland, rain accumulation provides a better indicator than rain intensity and that landslides may be generated by threshold conditions lower than previously thought. Second, a landslide susceptibility map is constructed using a cross-validated logistic regression model. A novel element of the approach is that landslide susceptibility is calculated for individual hillslope sections. The developed thresholds and susceptibility map are combined to assess potential hazards and impacts posed to the national highway network in Scotland.
Directory of Open Access Journals (Sweden)
R. Christopher Sheldrick
2016-11-01
Full Text Available Abstract Background Clinical decision-making has been conceptualized as a sequence of two separate processes: assessment of patients’ functioning and application of a decision threshold to determine whether the evidence is sufficient to justify a given decision. A range of factors, including use of evidence-based screening instruments, has the potential to influence either or both processes. However, implementation studies seldom specify or assess the mechanism by which screening is hypothesized to influence clinical decision-making, thus limiting their ability to address unexpected findings regarding clinicians’ behavior. Building on prior theory and empirical evidence, we created a system dynamics (SD model of how physicians’ clinical decisions are influenced by their assessments of patients and by factors that may influence decision thresholds, such as knowledge of past patient outcomes. Using developmental-behavioral disorders as a case example, we then explore how referral decisions may be influenced by changes in context. Specifically, we compare predictions from the SD model to published implementation trials of evidence-based screening to understand physicians’ management of positive screening results and changes in referral rates. We also conduct virtual experiments regarding the influence of a variety of interventions that may influence physicians’ thresholds, including improved access to co-located mental health care and improved feedback systems regarding patient outcomes. Results Results of the SD model were consistent with recent implementation trials. For example, the SD model suggests that if screening improves physicians’ accuracy of assessment without also influencing decision thresholds, then a significant proportion of children with positive screens will not be referred and the effect of screening implementation on referral rates will be modest—results that are consistent with a large proportion of published
Using the Logistic Regression model in supporting decisions of establishing marketing strategies
Directory of Open Access Journals (Sweden)
Cristinel CONSTANTIN
2015-12-01
Full Text Available This paper is about an instrumental research regarding the using of Logistic Regression model for data analysis in marketing research. The decision makers inside different organisation need relevant information to support their decisions regarding the marketing strategies. The data provided by marketing research could be computed in various ways but the multivariate data analysis models can enhance the utility of the information. Among these models we can find the Logistic Regression model, which is used for dichotomous variables. Our research is based on explanation the utility of this model and interpretation of the resulted information in order to help practitioners and researchers to use it in their future investigations
Directory of Open Access Journals (Sweden)
Mok Tik
2014-06-01
Full Text Available This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as, polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to rep- resent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
Multiple regression models for energy use in air-conditioned office buildings in different climates
International Nuclear Information System (INIS)
Lam, Joseph C.; Wan, Kevin K.W.; Liu Dalong; Tsang, C.L.
2010-01-01
An attempt was made to develop multiple regression models for office buildings in the five major climates in China - severe cold, cold, hot summer and cold winter, mild, and hot summer and warm winter. A total of 12 key building design variables were identified through parametric and sensitivity analysis, and considered as inputs in the regression models. The coefficient of determination R 2 varies from 0.89 in Harbin to 0.97 in Kunming, indicating that 89-97% of the variations in annual building energy use can be explained by the changes in the 12 parameters. A pseudo-random number generator based on three simple multiplicative congruential generators was employed to generate random designs for evaluation of the regression models. The difference between regression-predicted and DOE-simulated annual building energy use are largely within 10%. It is envisaged that the regression models developed can be used to estimate the likely energy savings/penalty during the initial design stage when different building schemes and design concepts are being considered.
CICAAR - Convolutive ICA with an Auto-Regressive Inverse Model
DEFF Research Database (Denmark)
Dyrholm, Mads; Hansen, Lars Kai
2004-01-01
We invoke an auto-regressive IIR inverse model for convolutive ICA and derive expressions for the likelihood and its gradient. We argue that optimization will give a stable inverse. When there are more sensors than sources the mixing model parameters are estimated in a second step by least square...... estimation. We demonstrate the method on synthetic data and finally separate speech and music in a real room recording....
Suhartono, Lee, Muhammad Hisyam; Prastyo, Dedy Dwi
2015-12-01
The aim of this research is to develop a calendar variation model for forecasting retail sales data with the Eid ul-Fitr effect. The proposed model is based on two methods, namely two levels ARIMAX and regression methods. Two levels ARIMAX and regression models are built by using ARIMAX for the first level and regression for the second level. Monthly men's jeans and women's trousers sales in a retail company for the period January 2002 to September 2009 are used as case study. In general, two levels of calendar variation model yields two models, namely the first model to reconstruct the sales pattern that already occurred, and the second model to forecast the effect of increasing sales due to Eid ul-Fitr that affected sales at the same and the previous months. The results show that the proposed two level calendar variation model based on ARIMAX and regression methods yields better forecast compared to the seasonal ARIMA model and Neural Networks.
Directory of Open Access Journals (Sweden)
Misztal Ignacy
2009-01-01
Full Text Available Abstract A semi-parametric non-linear longitudinal hierarchical model is presented. The model assumes that individual variation exists both in the degree of the linear change of performance (slope beyond a particular threshold of the independent variable scale and in the magnitude of the threshold itself; these individual variations are attributed to genetic and environmental components. During implementation via a Bayesian MCMC approach, threshold levels were sampled using a Metropolis step because their fully conditional posterior distributions do not have a closed form. The model was tested by simulation following designs similar to previous studies on genetics of heat stress. Posterior means of parameters of interest, under all simulation scenarios, were close to their true values with the latter always being included in the uncertain regions, indicating an absence of bias. The proposed models provide flexible tools for studying genotype by environmental interaction as well as for fitting other longitudinal traits subject to abrupt changes in the performance at particular points on the independent variable scale.
Testing and Modeling Fuel Regression Rate in a Miniature Hybrid Burner
Directory of Open Access Journals (Sweden)
Luciano Fanton
2012-01-01
Full Text Available Ballistic characterization of an extended group of innovative HTPB-based solid fuel formulations for hybrid rocket propulsion was performed in a lab-scale burner. An optical time-resolved technique was used to assess the quasisteady regression history of single perforation, cylindrical samples. The effects of metalized additives and radiant heat transfer on the regression rate of such formulations were assessed. Under the investigated operating conditions and based on phenomenological models from the literature, analyses of the collected experimental data show an appreciable influence of the radiant heat flux from burnt gases and soot for both unloaded and loaded fuel formulations. Pure HTPB regression rate data are satisfactorily reproduced, while the impressive initial regression rates of metalized formulations require further assessment.
Two-step variable selection in quantile regression models
Directory of Open Access Journals (Sweden)
FAN Yali
2015-06-01
Full Text Available We propose a two-step variable selection procedure for high dimensional quantile regressions, in which the dimension of the covariates, pn is much larger than the sample size n. In the first step, we perform ℓ1 penalty, and we demonstrate that the first step penalized estimator with the LASSO penalty can reduce the model from an ultra-high dimensional to a model whose size has the same order as that of the true model, and the selected model can cover the true model. The second step excludes the remained irrelevant covariates by applying the adaptive LASSO penalty to the reduced model obtained from the first step. Under some regularity conditions, we show that our procedure enjoys the model selection consistency. We conduct a simulation study and a real data analysis to evaluate the finite sample performance of the proposed approach.
Regression Models for Predicting Force Coefficients of Aerofoils
Directory of Open Access Journals (Sweden)
Mohammed ABDUL AKBAR
2015-09-01
Full Text Available Renewable sources of energy are attractive and advantageous in a lot of different ways. Among the renewable energy sources, wind energy is the fastest growing type. Among wind energy converters, Vertical axis wind turbines (VAWTs have received renewed interest in the past decade due to some of the advantages they possess over their horizontal axis counterparts. VAWTs have evolved into complex 3-D shapes. A key component in predicting the output of VAWTs through analytical studies is obtaining the values of lift and drag coefficients which is a function of shape of the aerofoil, ‘angle of attack’ of wind and Reynolds’s number of flow. Sandia National Laboratories have carried out extensive experiments on aerofoils for the Reynolds number in the range of those experienced by VAWTs. The volume of experimental data thus obtained is huge. The current paper discusses three Regression analysis models developed wherein lift and drag coefficients can be found out using simple formula without having to deal with the bulk of the data. Drag coefficients and Lift coefficients were being successfully estimated by regression models with R2 values as high as 0.98.
The Relationship between Economic Growth and Money Laundering – a Linear Regression Model
Directory of Open Access Journals (Sweden)
Daniel Rece
2009-09-01
Full Text Available This study provides an overview of the relationship between economic growth and money laundering modeled by a least squares function. The report analyzes statistically data collected from USA, Russia, Romania and other eleven European countries, rendering a linear regression model. The study illustrates that 23.7% of the total variance in the regressand (level of money laundering is “explained” by the linear regression model. In our opinion, this model will provide critical auxiliary judgment and decision support for anti-money laundering service systems.
Regression analysis of informative current status data with the additive hazards model.
Zhao, Shishun; Hu, Tao; Ma, Ling; Wang, Peijie; Sun, Jianguo
2015-04-01
This paper discusses regression analysis of current status failure time data arising from the additive hazards model in the presence of informative censoring. Many methods have been developed for regression analysis of current status data under various regression models if the censoring is noninformative, and also there exists a large literature on parametric analysis of informative current status data in the context of tumorgenicity experiments. In this paper, a semiparametric maximum likelihood estimation procedure is presented and in the method, the copula model is employed to describe the relationship between the failure time of interest and the censoring time. Furthermore, I-splines are used to approximate the nonparametric functions involved and the asymptotic consistency and normality of the proposed estimators are established. A simulation study is conducted and indicates that the proposed approach works well for practical situations. An illustrative example is also provided.
Poisson regression approach for modeling fatal injury rates amongst Malaysian workers
International Nuclear Information System (INIS)
Kamarulzaman Ibrahim; Heng Khai Theng
2005-01-01
Many safety studies are based on the analysis carried out on injury surveillance data. The injury surveillance data gathered for the analysis include information on number of employees at risk of injury in each of several strata where the strata are defined in terms of a series of important predictor variables. Further insight into the relationship between fatal injury rates and predictor variables may be obtained by the poisson regression approach. Poisson regression is widely used in analyzing count data. In this study, poisson regression is used to model the relationship between fatal injury rates and predictor variables which are year (1995-2002), gender, recording system and industry type. Data for the analysis were obtained from PERKESO and Jabatan Perangkaan Malaysia. It is found that the assumption that the data follow poisson distribution has been violated. After correction for the problem of over dispersion, the predictor variables that are found to be significant in the model are gender, system of recording, industry type, two interaction effects (interaction between recording system and industry type and between year and industry type). Introduction Regression analysis is one of the most popular
Large signal-to-noise ratio quantification in MLE for ARARMAX models
Zou, Yiqun; Tang, Xiafei
2014-06-01
It has been shown that closed-loop linear system identification by indirect method can be generally transferred to open-loop ARARMAX (AutoRegressive AutoRegressive Moving Average with eXogenous input) estimation. For such models, the gradient-related optimisation with large enough signal-to-noise ratio (SNR) can avoid the potential local convergence in maximum likelihood estimation. To ease the application of this condition, the threshold SNR needs to be quantified. In this paper, we build the amplitude coefficient which is an equivalence to the SNR and prove the finiteness of the threshold amplitude coefficient within the stability region. The quantification of threshold is achieved by the minimisation of an elaborately designed multi-variable cost function which unifies all the restrictions on the amplitude coefficient. The corresponding algorithm based on two sets of physically realisable system input-output data details the minimisation and also points out how to use the gradient-related method to estimate ARARMAX parameters when local minimum is present as the SNR is small. Then, the algorithm is tested on a theoretical AutoRegressive Moving Average with eXogenous input model for the derivation of the threshold and a gas turbine engine real system for model identification, respectively. Finally, the graphical validation of threshold on a two-dimensional plot is discussed.
Accounting for Zero Inflation of Mussel Parasite Counts Using Discrete Regression Models
Directory of Open Access Journals (Sweden)
Emel Çankaya
2017-06-01
Full Text Available In many ecological applications, the absences of species are inevitable due to either detection faults in samples or uninhabitable conditions for their existence, resulting in high number of zero counts or abundance. Usual practice for modelling such data is regression modelling of log(abundance+1 and it is well know that resulting model is inadequate for prediction purposes. New discrete models accounting for zero abundances, namely zero-inflated regression (ZIP and ZINB, Hurdle-Poisson (HP and Hurdle-Negative Binomial (HNB amongst others are widely preferred to the classical regression models. Due to the fact that mussels are one of the economically most important aquatic products of Turkey, the purpose of this study is therefore to examine the performances of these four models in determination of the significant biotic and abiotic factors on the occurrences of Nematopsis legeri parasite harming the existence of Mediterranean mussels (Mytilus galloprovincialis L.. The data collected from the three coastal regions of Sinop city in Turkey showed more than 50% of parasite counts on the average are zero-valued and model comparisons were based on information criterion. The results showed that the probability of the occurrence of this parasite is here best formulated by ZINB or HNB models and influential factors of models were found to be correspondent with ecological differences of the regions.
Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon
2015-01-01
Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.
Generalizing a complex model for gully threshold identification in the Mediterranean environment
Torri, D.; Borselli, L.; Iaquinta, P.; Iovine, G.; Poesen, J.; Terranova, O.
2012-04-01
Among the physical processes leading to land degradation, soil erosion by water is the most important and gully erosion may contribute, at places, to 70% of the total soil loss. Nevertheless, gully erosion has often been neglected in water soil erosion modeling, whilst more prominence has been given to rill and interrill erosion. Both to facilitate the processing by agricultural machinery and to take advantage of all the arable land, gullies are commonly removed at each crop cycle, with significant soil losses due to the repeated excavation of the channel by the successive rainstorm. When the erosive forces of overland flow exceed the strength of the soil particles to detachment and displacement, water erosion occurs and usually a channel is formed. As runoff is proportional to the local catchment area, a relationship between local slope, S, and contributing area, A, is supposed to exists. A "geomorphologic threshold" scheme is therefore suitable to interpret the physical process of gully initiation: accordingly, a gully is formed when a hydraulic threshold for incision exceeds the resistance of the soil particles to detachment and transport. Similarly, it appears reasonable that a gully ends when there is a reduction of slope, or the concentrated flow meets more resistant soil-vegetation complexes. This study aims to predict the location of the beginning of gullies in the Mediterranean environment, based on an evaluation of S and A by means of a mathematical model. For the identification of the areas prone to gully erosion, the model employs two empirical thresholds relevant to the head (Thead) and to the end (Tend) of the gullies (of the type SA^ b>Thead, SA^ bsituations (usually after abandonment), and c) databases for cropland have been merged. Selected data have been examined and interpreted mathematically to assess a value to be taken as a constant for the exponent "b" of the above equation. Literature data on the problem of topological thresholds Tend are
Transpiration of glasshouse rose crops: evaluation of regression models
Baas, R.; Rijssel, van E.
2006-01-01
Regression models of transpiration (T) based on global radiation inside the greenhouse (G), with or without energy input from heating pipes (Eh) and/or vapor pressure deficit (VPD) were parameterized. Therefore, data on T, G, temperatures from air, canopy and heating pipes, and VPD from both a
Xu, Yifang; Collins, Leslie M
2004-04-01
The incorporation of low levels of noise into an electrical stimulus has been shown to improve auditory thresholds in some human subjects (Zeng et al., 2000). In this paper, thresholds for noise-modulated pulse-train stimuli are predicted utilizing a stochastic neural-behavioral model of ensemble fiber responses to bi-phasic stimuli. The neural refractory effect is described using a Markov model for a noise-free pulse-train stimulus and a closed-form solution for the steady-state neural response is provided. For noise-modulated pulse-train stimuli, a recursive method using the conditional probability is utilized to track the neural responses to each successive pulse. A neural spike count rule has been presented for both threshold and intensity discrimination under the assumption that auditory perception occurs via integration over a relatively long time period (Bruce et al., 1999). An alternative approach originates from the hypothesis of the multilook model (Viemeister and Wakefield, 1991), which argues that auditory perception is based on several shorter time integrations and may suggest an NofM model for prediction of pulse-train threshold. This motivates analyzing the neural response to each individual pulse within a pulse train, which is considered to be the brief look. A logarithmic rule is hypothesized for pulse-train threshold. Predictions from the multilook model are shown to match trends in psychophysical data for noise-free stimuli that are not always matched by the long-time integration rule. Theoretical predictions indicate that threshold decreases as noise variance increases. Theoretical models of the neural response to pulse-train stimuli not only reduce calculational overhead but also facilitate utilization of signal detection theory and are easily extended to multichannel psychophysical tasks.
Linear non-threshold (LNT) radiation hazards model and its evaluation
International Nuclear Information System (INIS)
Min Rui
2011-01-01
In order to introduce linear non-threshold (LNT) model used in study on the dose effect of radiation hazards and to evaluate its application, the analysis of comprehensive literatures was made. The results show that LNT model is more suitable to describe the biological effects in accuracy for high dose than that for low dose. Repairable-conditionally repairable model of cell radiation effects can be well taken into account on cell survival curve in the all conditions of high, medium and low absorbed dose range. There are still many uncertainties in assessment model of effective dose of internal radiation based on the LNT assumptions and individual mean organ equivalent, and it is necessary to establish gender-specific voxel human model, taking gender differences into account. From above, the advantages and disadvantages of various models coexist. Before the setting of the new theory and new model, LNT model is still the most scientific attitude. (author)
Analytical details of the instability threshold of the laser-Lorenz model
International Nuclear Information System (INIS)
Bakasov, A.A.; Abraham, N.B.
1992-11-01
An exact analysis of the second threshold of a single-mode homogeneously broadened laser is given for the most general operating conditions. We provide a general analytical proof that increasing the detuning increases the second threshold is given. An analysis of the second threshold at a fixed detuning and of the ratio of the second threshold to the first threshold reveals that the smallest values occur when the population relaxation rate is minimized. (author). 13 refs, 4 figs, 2 tabs
Terrestrial Microgravity Model and Threshold Gravity Simulation using Magnetic Levitation
Ramachandran, N.
2005-01-01
What is the threshold gravity (minimum gravity level) required for the nominal functioning of the human system? What dosage is required? Do human cell lines behave differently in microgravity in response to an external stimulus? The critical need for such a gravity simulator is emphasized by recent experiments on human epithelial cells and lymphocytes on the Space Shuttle clearly showing that cell growth and function are markedly different from those observed terrestrially. Those differences are also dramatic between cells grown in space and those in Rotating Wall Vessels (RWV), or NASA bioreactor often used to simulate microgravity, indicating that although morphological growth patterns (three dimensional growth) can be successfully simulated using RWVs, cell function performance is not reproduced - a critical difference. If cell function is dramatically affected by gravity off-loading, then cell response to stimuli such as radiation, stress, etc. can be very different from terrestrial cell lines. Yet, we have no good gravity simulator for use in study of these phenomena. This represents a profound shortcoming for countermeasures research. We postulate that we can use magnetic levitation of cells and tissue, through the use of strong magnetic fields and field gradients, as a terrestrial microgravity model to study human cells. Specific objectives of the research are: 1. To develop a tried, tested and benchmarked terrestrial microgravity model for cell culture studies; 2. Gravity threshold determination; 3. Dosage (magnitude and duration) of g-level required for nominal functioning of cells; 4. Comparisons of magnetic levitation model to other models such as RWV, hind limb suspension, etc. and 5. Cellular response to reduced gravity levels of Moon and Mars. The paper will discuss experiments md modeling work to date in support of this project.
Duda, Piotr; Jaworski, Maciej; Rutkowski, Leszek
2018-03-01
One of the greatest challenges in data mining is related to processing and analysis of massive data streams. Contrary to traditional static data mining problems, data streams require that each element is processed only once, the amount of allocated memory is constant and the models incorporate changes of investigated streams. A vast majority of available methods have been developed for data stream classification and only a few of them attempted to solve regression problems, using various heuristic approaches. In this paper, we develop mathematically justified regression models working in a time-varying environment. More specifically, we study incremental versions of generalized regression neural networks, called IGRNNs, and we prove their tracking properties - weak (in probability) and strong (with probability one) convergence assuming various concept drift scenarios. First, we present the IGRNNs, based on the Parzen kernels, for modeling stationary systems under nonstationary noise. Next, we extend our approach to modeling time-varying systems under nonstationary noise. We present several types of concept drifts to be handled by our approach in such a way that weak and strong convergence holds under certain conditions. Finally, in the series of simulations, we compare our method with commonly used heuristic approaches, based on forgetting mechanism or sliding windows, to deal with concept drift. Finally, we apply our concept in a real life scenario solving the problem of currency exchange rates prediction.
Double Photoionization Near Threshold
Wehlitz, Ralf
2007-01-01
The threshold region of the double-photoionization cross section is of particular interest because both ejected electrons move slowly in the Coulomb field of the residual ion. Near threshold both electrons have time to interact with each other and with the residual ion. Also, different theoretical models compete to describe the double-photoionization cross section in the threshold region. We have investigated that cross section for lithium and beryllium and have analyzed our data with respect to the latest results in the Coulomb-dipole theory. We find that our data support the idea of a Coulomb-dipole interaction.
Directory of Open Access Journals (Sweden)
Ivanka Jerić
2011-11-01
Full Text Available Predicting antitumor activity of compounds using regression models trained on a small number of compounds with measured biological activity is an ill-posed inverse problem. Yet, it occurs very often within the academic community. To counteract, up to some extent, overfitting problems caused by a small training data, we propose to use consensus of six regression models for prediction of biological activity of virtual library of compounds. The QSAR descriptors of 22 compounds related to the opioid growth factor (OGF, Tyr-Gly-Gly-Phe-Met with known antitumor activity were used to train regression models: the feed-forward artificial neural network, the k-nearest neighbor, sparseness constrained linear regression, the linear and nonlinear (with polynomial and Gaussian kernel support vector machine. Regression models were applied on a virtual library of 429 compounds that resulted in six lists with candidate compounds ranked by predicted antitumor activity. The highly ranked candidate compounds were synthesized, characterized and tested for an antiproliferative activity. Some of prepared peptides showed more pronounced activity compared with the native OGF; however, they were less active than highly ranked compounds selected previously by the radial basis function support vector machine (RBF SVM regression model. The ill-posedness of the related inverse problem causes unstable behavior of trained regression models on test data. These results point to high complexity of prediction based on the regression models trained on a small data sample.
A simulation study on Bayesian Ridge regression models for several collinearity levels
Efendi, Achmad; Effrihan
2017-12-01
When analyzing data with multiple regression model if there are collinearities, then one or several predictor variables are usually omitted from the model. However, there sometimes some reasons, for instance medical or economic reasons, the predictors are all important and should be included in the model. Ridge regression model is not uncommon in some researches to use to cope with collinearity. Through this modeling, weights for predictor variables are used for estimating parameters. The next estimation process could follow the concept of likelihood. Furthermore, for the estimation nowadays the Bayesian version could be an alternative. This estimation method does not match likelihood one in terms of popularity due to some difficulties; computation and so forth. Nevertheless, with the growing improvement of computational methodology recently, this caveat should not at the moment become a problem. This paper discusses about simulation process for evaluating the characteristic of Bayesian Ridge regression parameter estimates. There are several simulation settings based on variety of collinearity levels and sample sizes. The results show that Bayesian method gives better performance for relatively small sample sizes, and for other settings the method does perform relatively similar to the likelihood method.
Linear Regression with a Randomly Censored Covariate: Application to an Alzheimer's Study.
Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A
2017-01-01
The association between maternal age of onset of dementia and amyloid deposition (measured by in vivo positron emission tomography (PET) imaging) in cognitively normal older offspring is of interest. In a regression model for amyloid, special methods are required due to the random right censoring of the covariate of maternal age of onset of dementia. Prior literature has proposed methods to address the problem of censoring due to assay limit of detection, but not random censoring. We propose imputation methods and a survival regression method that do not require parametric assumptions about the distribution of the censored covariate. Existing imputation methods address missing covariates, but not right censored covariates. In simulation studies, we compare these methods to the simple, but inefficient complete case analysis, and to thresholding approaches. We apply the methods to the Alzheimer's study.
Stochastic Threshold Microdose Model for Cell Killing by Insoluble Metallic Nanomaterial Particles
Scott, Bobby R.
2010-01-01
This paper introduces a novel microdosimetric model for metallic nanomaterial-particles (MENAP)-induced cytotoxicity. The focus is on the engineered insoluble MENAP which represent a significant breakthrough in the design and development of new products for consumers, industry, and medicine. Increased production is rapidly occurring and may cause currently unrecognized health effects (e.g., nervous system dysfunction, heart disease, cancer); thus, dose-response models for MENAP-induced biological effects are needed to facilitate health risk assessment. The stochastic threshold microdose (STM) model presented introduces novel stochastic microdose metrics for use in constructing dose-response relationships for the frequency of specific cellular (e.g., cell killing, mutations, neoplastic transformation) or subcellular (e.g., mitochondria dysfunction) effects. A key metric is the exposure-time-dependent, specific burden (MENAP count) for a given critical target (e.g., mitochondria, nucleus). Exceeding a stochastic threshold specific burden triggers cell death. For critical targets in the cytoplasm, the autophagic mode of death is triggered. For the nuclear target, the apoptotic mode of death is triggered. Overall cell survival is evaluated for the indicated competing modes of death when both apply. The STM model can be applied to cytotoxicity data using Bayesian methods implemented via Markov chain Monte Carlo. PMID:21191483
Terrestrial Microgravity Model and Threshold Gravity Simulation sing Magnetic Levitation
Ramachandran, N.
2005-01-01
What is the threshold gravity (minimum gravity level) required for the nominal functioning of the human system? What dosage is required? Do human cell lines behave differently in microgravity in response to an external stimulus? The critical need for such a gravity simulator is emphasized by recent experiments on human epithelial cells and lymphocytes on the Space Shuttle clearly showing that cell growth and function are markedly different from those observed terrestrially. Those differences are also dramatic between cells grown in space and those in Rotating Wall Vessels (RWV), or NASA bioreactor often used to simulate microgravity, indicating that although morphological growth patterns (three dimensional growth) can be successiblly simulated using RWVs, cell function performance is not reproduced - a critical difference. If cell function is dramatically affected by gravity off-loading, then cell response to stimuli such as radiation, stress, etc. can be very different from terrestrial cell lines. Yet, we have no good gravity simulator for use in study of these phenomena. This represents a profound shortcoming for countermeasures research. We postulate that we can use magnetic levitation of cells and tissue, through the use of strong magnetic fields and field gradients, as a terrestrial microgravity model to study human cells. Specific objectives of the research are: 1. To develop a tried, tested and benchmarked terrestrial microgravity model for cell culture studies; 2. Gravity threshold determination; 3. Dosage (magnitude and duration) of g-level required for nominal functioning of cells; 4. Comparisons of magnetic levitation model to other models such as RWV, hind limb suspension, etc. and 5. Cellular response to reduced gravity levels of Moon and Mars.
Improved bounds on the epidemic threshold of exact SIS models on complex networks
Ruhi, Navid Azizan; Thrampoulidis, Christos; Hassibi, Babak
2017-01-01
The SIS (susceptible-infected-susceptible) epidemic model on an arbitrary network, without making approximations, is a 2n-state Markov chain with a unique absorbing state (the all-healthy state). This makes analysis of the SIS model and, in particular, determining the threshold of epidemic spread quite challenging. It has been shown that the exact marginal probabilities of infection can be upper bounded by an n-dimensional linear time-invariant system, a consequence of which is that the Markov chain is “fast-mixing” when the LTI system is stable, i.e. when equation (where β is the infection rate per link, δ is the recovery rate, and λmax(A) is the largest eigenvalue of the network's adjacency matrix). This well-known threshold has been recently shown not to be tight in several cases, such as in a star network. In this paper, we provide tighter upper bounds on the exact marginal probabilities of infection, by also taking pairwise infection probabilities into account. Based on this improved bound, we derive tighter eigenvalue conditions that guarantee fast mixing (i.e., logarithmic mixing time) of the chain. We demonstrate the improvement of the threshold condition by comparing the new bound with the known one on various networks with various epidemic parameters.
Improved bounds on the epidemic threshold of exact SIS models on complex networks
Ruhi, Navid Azizan
2017-01-05
The SIS (susceptible-infected-susceptible) epidemic model on an arbitrary network, without making approximations, is a 2n-state Markov chain with a unique absorbing state (the all-healthy state). This makes analysis of the SIS model and, in particular, determining the threshold of epidemic spread quite challenging. It has been shown that the exact marginal probabilities of infection can be upper bounded by an n-dimensional linear time-invariant system, a consequence of which is that the Markov chain is “fast-mixing” when the LTI system is stable, i.e. when equation (where β is the infection rate per link, δ is the recovery rate, and λmax(A) is the largest eigenvalue of the network\\'s adjacency matrix). This well-known threshold has been recently shown not to be tight in several cases, such as in a star network. In this paper, we provide tighter upper bounds on the exact marginal probabilities of infection, by also taking pairwise infection probabilities into account. Based on this improved bound, we derive tighter eigenvalue conditions that guarantee fast mixing (i.e., logarithmic mixing time) of the chain. We demonstrate the improvement of the threshold condition by comparing the new bound with the known one on various networks with various epidemic parameters.
Photovoltaic Array Condition Monitoring Based on Online Regression of Performance Model
DEFF Research Database (Denmark)
Spataru, Sergiu; Sera, Dezso; Kerekes, Tamas
2013-01-01
regression modeling, from PV array production, plane-of-array irradiance, and module temperature measurements, acquired during an initial learning phase of the system. After the model has been parameterized automatically, the condition monitoring system enters the normal operation phase, where...
Extended cox regression model: The choice of timefunction
Isik, Hatice; Tutkun, Nihal Ata; Karasoy, Durdu
2017-07-01
Cox regression model (CRM), which takes into account the effect of censored observations, is one the most applicative and usedmodels in survival analysis to evaluate the effects of covariates. Proportional hazard (PH), requires a constant hazard ratio over time, is the assumptionofCRM. Using extended CRM provides the test of including a time dependent covariate to assess the PH assumption or an alternative model in case of nonproportional hazards. In this study, the different types of real data sets are used to choose the time function and the differences between time functions are analyzed and discussed.
INVESTIGATION OF E-MAIL TRAFFIC BY USING ZERO-INFLATED REGRESSION MODELS
Directory of Open Access Journals (Sweden)
Yılmaz KAYA
2012-06-01
Full Text Available Based on count data obtained with a value of zero may be greater than anticipated. These types of data sets should be used to analyze by regression methods taking into account zero values. Zero- Inflated Poisson (ZIP, Zero-Inflated negative binomial (ZINB, Poisson Hurdle (PH, negative binomial Hurdle (NBH are more common approaches in modeling more zero value possessing dependent variables than expected. In the present study, the e-mail traffic of Yüzüncü Yıl University in 2009 spring semester was investigated. ZIP and ZINB, PH and NBH regression methods were applied on the data set because more zeros counting (78.9% were found in data set than expected. ZINB and NBH regression considered zero dispersion and overdispersion were found to be more accurate results due to overdispersion and zero dispersion in sending e-mail. ZINB is determined to be best model accordingto Vuong statistics and information criteria.
Focused information criterion and model averaging based on weighted composite quantile regression
Xu, Ganggang; Wang, Suojin; Huang, Jianhua Z.
2013-01-01
We study the focused information criterion and frequentist model averaging and their application to post-model-selection inference for weighted composite quantile regression (WCQR) in the context of the additive partial linear models. With the non
DEFF Research Database (Denmark)
Verhey, Jesko L.; Mauermann, Manfred; Epp, Bastian
2017-01-01
For normal-hearing listeners, auditory pure-tone thresholds in quiet often show quasi periodic fluctuations when measured with a high frequency resolution, referred to as threshold fine structure. Threshold fine structure is dependent on the stimulus duration, with smaller fluctuations for short...... than for long signals. The present study demonstrates how this effect can be captured by a nonlinear and active model of the cochlear in combination with a temporal integration stage. Since this cochlear model also accounts for fine structure and connected level dependent effects, it is superior...
Buonaccorsi, John P; Romeo, Giovanni; Thoresen, Magne
2018-03-01
When fitting regression models, measurement error in any of the predictors typically leads to biased coefficients and incorrect inferences. A plethora of methods have been proposed to correct for this. Obtaining standard errors and confidence intervals using the corrected estimators can be challenging and, in addition, there is concern about remaining bias in the corrected estimators. The bootstrap, which is one option to address these problems, has received limited attention in this context. It has usually been employed by simply resampling observations, which, while suitable in some situations, is not always formally justified. In addition, the simple bootstrap does not allow for estimating bias in non-linear models, including logistic regression. Model-based bootstrapping, which can potentially estimate bias in addition to being robust to the original sampling or whether the measurement error variance is constant or not, has received limited attention. However, it faces challenges that are not present in handling regression models with no measurement error. This article develops new methods for model-based bootstrapping when correcting for measurement error in logistic regression with replicate measures. The methodology is illustrated using two examples, and a series of simulations are carried out to assess and compare the simple and model-based bootstrap methods, as well as other standard methods. While not always perfect, the model-based approaches offer some distinct improvements over the other methods. © 2017, The International Biometric Society.
Ordinal regression models to describe tourist satisfaction with Sintra's world heritage
Mouriño, Helena
2013-10-01
In Tourism Research, ordinal regression models are becoming a very powerful tool in modelling the relationship between an ordinal response variable and a set of explanatory variables. In August and September 2010, we conducted a pioneering Tourist Survey in Sintra, Portugal. The data were obtained by face-to-face interviews at the entrances of the Palaces and Parks of Sintra. The work developed in this paper focus on two main points: tourists' perception of the entrance fees; overall level of satisfaction with this heritage site. For attaining these goals, ordinal regression models were developed. We concluded that tourist's nationality was the only significant variable to describe the perception of the admission fees. Also, Sintra's image among tourists depends not only on their nationality, but also on previous knowledge about Sintra's World Heritage status.
Combination of supervised and semi-supervised regression models for improved unbiased estimation
DEFF Research Database (Denmark)
Arenas-Garía, Jeronimo; Moriana-Varo, Carlos; Larsen, Jan
2010-01-01
In this paper we investigate the steady-state performance of semisupervised regression models adjusted using a modified RLS-like algorithm, identifying the situations where the new algorithm is expected to outperform standard RLS. By using an adaptive combination of the supervised and semisupervi......In this paper we investigate the steady-state performance of semisupervised regression models adjusted using a modified RLS-like algorithm, identifying the situations where the new algorithm is expected to outperform standard RLS. By using an adaptive combination of the supervised...
Random regression models for daily feed intake in Danish Duroc pigs
DEFF Research Database (Denmark)
Strathe, Anders Bjerring; Mark, Thomas; Jensen, Just
The objective of this study was to develop random regression models and estimate covariance functions for daily feed intake (DFI) in Danish Duroc pigs. A total of 476201 DFI records were available on 6542 Duroc boars between 70 to 160 days of age. The data originated from the National test station......-year-season, permanent, and animal genetic effects. The functional form was based on Legendre polynomials. A total of 64 models for random regressions were initially ranked by BIC to identify the approximate order for the Legendre polynomials using AI-REML. The parsimonious model included Legendre polynomials of 2nd...... order for genetic and permanent environmental curves and a heterogeneous residual variance, allowing the daily residual variance to change along the age trajectory due to scale effects. The parameters of the model were estimated in a Bayesian framework, using the RJMC module of the DMU package, where...
Castell, Stefanie; Schwab, Frank; Geffers, Christine; Bongartz, Hannah; Brunkhorst, Frank M.; Gastmeier, Petra; Mikolajczyk, Rafael T.
2014-01-01
Early and appropriate blood culture sampling is recommended as a standard of care for patients with suspected bloodstream infections (BSI) but is rarely taken into account when quality indicators for BSI are evaluated. To date, sampling of about 100 to 200 blood culture sets per 1,000 patient-days is recommended as the target range for blood culture rates. However, the empirical basis of this recommendation is not clear. The aim of the current study was to analyze the association between blood culture rates and observed BSI rates and to derive a reference threshold for blood culture rates in intensive care units (ICUs). This study is based on data from 223 ICUs taking part in the German hospital infection surveillance system. We applied locally weighted regression and segmented Poisson regression to assess the association between blood culture rates and BSI rates. Below 80 to 90 blood culture sets per 1,000 patient-days, observed BSI rates increased with increasing blood culture rates, while there was no further increase above this threshold. Segmented Poisson regression located the threshold at 87 (95% confidence interval, 54 to 120) blood culture sets per 1,000 patient-days. Only one-third of the investigated ICUs displayed blood culture rates above this threshold. We provided empirical justification for a blood culture target threshold in ICUs. In the majority of the studied ICUs, blood culture sampling rates were below this threshold. This suggests that a substantial fraction of BSI cases might remain undetected; reporting observed BSI rates as a quality indicator without sufficiently high blood culture rates might be misleading. PMID:25520442
Longitudinal beta regression models for analyzing health-related quality of life scores over time
Directory of Open Access Journals (Sweden)
Hunger Matthias
2012-09-01
Full Text Available Abstract Background Health-related quality of life (HRQL has become an increasingly important outcome parameter in clinical trials and epidemiological research. HRQL scores are typically bounded at both ends of the scale and often highly skewed. Several regression techniques have been proposed to model such data in cross-sectional studies, however, methods applicable in longitudinal research are less well researched. This study examined the use of beta regression models for analyzing longitudinal HRQL data using two empirical examples with distributional features typically encountered in practice. Methods We used SF-6D utility data from a German older age cohort study and stroke-specific HRQL data from a randomized controlled trial. We described the conceptual differences between mixed and marginal beta regression models and compared both models to the commonly used linear mixed model in terms of overall fit and predictive accuracy. Results At any measurement time, the beta distribution fitted the SF-6D utility data and stroke-specific HRQL data better than the normal distribution. The mixed beta model showed better likelihood-based fit statistics than the linear mixed model and respected the boundedness of the outcome variable. However, it tended to underestimate the true mean at the upper part of the distribution. Adjusted group means from marginal beta model and linear mixed model were nearly identical but differences could be observed with respect to standard errors. Conclusions Understanding the conceptual differences between mixed and marginal beta regression models is important for their proper use in the analysis of longitudinal HRQL data. Beta regression fits the typical distribution of HRQL data better than linear mixed models, however, if focus is on estimating group mean scores rather than making individual predictions, the two methods might not differ substantially.
Farhadian, Maryam; Aliabadi, Mohsen; Darvishi, Ebrahim
2015-01-01
Prediction models are used in a variety of medical domains, and they are frequently built from experience which constitutes data acquired from actual cases. This study aimed to analyze the potential of artificial neural networks and logistic regression techniques for estimation of hearing impairment among industrial workers. A total of 210 workers employed in a steel factory (in West of Iran) were selected, and their occupational exposure histories were analyzed. The hearing loss thresholds of the studied workers were determined using a calibrated audiometer. The personal noise exposures were also measured using a noise dosimeter in the workstations. Data obtained from five variables, which can influence the hearing loss, were used as input features, and the hearing loss thresholds were considered as target feature of the prediction methods. Multilayer feedforward neural networks and logistic regression were developed using MATLAB R2011a software. Based on the World Health Organization classification for the grades of hearing loss, 74.2% of the studied workers have normal hearing thresholds, 23.4% have slight hearing loss, and 2.4% have moderate hearing loss. The accuracy and kappa coefficient of the best developed neural networks for prediction of the grades of hearing loss were 88.6 and 66.30, respectively. The accuracy and kappa coefficient of the logistic regression were also 84.28 and 51.30, respectively. Neural networks could provide more accurate predictions of the hearing loss than logistic regression. The prediction method can provide reliable and comprehensible information for occupational health and medicine experts.
A Gompertz regression model for fern spores germination
Directory of Open Access Journals (Sweden)
Gabriel y Galán, Jose María
2015-06-01
Full Text Available Germination is one of the most important biological processes for both seed and spore plants, also for fungi. At present, mathematical models of germination have been developed in fungi, bryophytes and several plant species. However, ferns are the only group whose germination has never been modelled. In this work we develop a regression model of the germination of fern spores. We have found that for Blechnum serrulatum, Blechnum yungense, Cheilanthes pilosa, Niphidium macbridei and Polypodium feuillei species the Gompertz growth model describe satisfactorily cumulative germination. An important result is that regression parameters are independent of fern species and the model is not affected by intraspecific variation. Our results show that the Gompertz curve represents a general germination model for all the non-green spore leptosporangiate ferns, including in the paper a discussion about the physiological and ecological meaning of the model.La germinación es uno de los procesos biológicos más relevantes tanto para las plantas con esporas, como para las plantas con semillas y los hongos. Hasta el momento, se han desarrollado modelos de germinación para hongos, briofitos y diversas especies de espermatófitos. Los helechos son el único grupo de plantas cuya germinación nunca ha sido modelizada. En este trabajo se desarrolla un modelo de regresión para explicar la germinación de las esporas de helechos. Observamos que para las especies Blechnum serrulatum, Blechnum yungense, Cheilanthes pilosa, Niphidium macbridei y Polypodium feuillei el modelo de crecimiento de Gompertz describe satisfactoriamente la germinación acumulativa. Un importante resultado es que los parámetros de la regresión son independientes de la especie y que el modelo no está afectado por variación intraespecífica. Por lo tanto, los resultados del trabajo muestran que la curva de Gompertz puede representar un modelo general para todos los helechos leptosporangiados
Thermalization threshold in models of 1D fermions
Mukerjee, Subroto; Modak, Ranjan; Ramswamy, Sriram
2013-03-01
The question of how isolated quantum systems thermalize is an interesting and open one. In this study we equate thermalization with non-integrability to try to answer this question. In particular, we study the effect of system size on the integrability of 1D systems of interacting fermions on a lattice. We find that for a finite-sized system, a non-zero value of an integrability breaking parameter is required to make an integrable system appear non-integrable. Using exact diagonalization and diagnostics such as energy level statistics and the Drude weight, we find that the threshold value of the integrability breaking parameter scales to zero as a power law with system size. We find the exponent to be the same for different models with its value depending on the random matrix ensemble describing the non-integrable system. We also study a simple analytical model of a non-integrable system with an integrable limit to better understand how a power law emerges.
Generic global regression models for growth prediction of Salmonella in ground pork and pork cuts
DEFF Research Database (Denmark)
Buschhardt, Tasja; Hansen, Tina Beck; Bahl, Martin Iain
2017-01-01
Introduction and Objectives Models for the prediction of bacterial growth in fresh pork are primarily developed using two-step regression (i.e. primary models followed by secondary models). These models are also generally based on experiments in liquids or ground meat and neglect surface growth....... It has been shown that one-step global regressions can result in more accurate models and that bacterial growth on intact surfaces can substantially differ from growth in liquid culture. Material and Methods We used a global-regression approach to develop predictive models for the growth of Salmonella....... One part of obtained logtransformed cell counts was used for model development and another for model validation. The Ratkowsky square root model and the relative lag time (RLT) model were integrated into the logistic model with delay. Fitted parameter estimates were compared to investigate the effect...
Application of multilinear regression analysis in modeling of soil ...
African Journals Online (AJOL)
The application of Multi-Linear Regression Analysis (MLRA) model for predicting soil properties in Calabar South offers a technical guide and solution in foundation designs problems in the area. Forty-five soil samples were collected from fifteen different boreholes at a different depth and 270 tests were carried out for CBR, ...
Threshold behavior in electron-atom scattering
International Nuclear Information System (INIS)
Sadeghpour, H.R.; Greene, C.H.
1996-01-01
Ever since the classic work of Wannier in 1953, the process of treating two threshold electrons in the continuum of a positively charged ion has been an active field of study. The authors have developed a treatment motivated by the physics below the double ionization threshold. By modeling the double ionization as a series of Landau-Zener transitions, they obtain an analytical formulation of the absolute threshold probability which has a leading power law behavior, akin to Wannier's law. Some of the noteworthy aspects of this derivation are that the derivation can be conveniently continued below threshold giving rise to a open-quotes cuspclose quotes at threshold, and that on both sides of the threshold, absolute values of the cross sections are obtained
Masselot, Pierre; Chebana, Fateh; Bélanger, Diane; St-Hilaire, André; Abdous, Belkacem; Gosselin, Pierre; Ouarda, Taha B. M. J.
2018-01-01
In a number of environmental studies, relationships between natural processes are often assessed through regression analyses, using time series data. Such data are often multi-scale and non-stationary, leading to a poor accuracy of the resulting regression models and therefore to results with moderate reliability. To deal with this issue, the present paper introduces the EMD-regression methodology consisting in applying the empirical mode decomposition (EMD) algorithm on data series and then using the resulting components in regression models. The proposed methodology presents a number of advantages. First, it accounts of the issues of non-stationarity associated to the data series. Second, this approach acts as a scan for the relationship between a response variable and the predictors at different time scales, providing new insights about this relationship. To illustrate the proposed methodology it is applied to study the relationship between weather and cardiovascular mortality in Montreal, Canada. The results shed new knowledge concerning the studied relationship. For instance, they show that the humidity can cause excess mortality at the monthly time scale, which is a scale not visible in classical models. A comparison is also conducted with state of the art methods which are the generalized additive models and distributed lag models, both widely used in weather-related health studies. The comparison shows that EMD-regression achieves better prediction performances and provides more details than classical models concerning the relationship.
Directory of Open Access Journals (Sweden)
Mach Łukasz
2017-06-01
Full Text Available The research process aimed at building regression models, which helps to valuate residential real estate, is presented in the following article. Two widely used computational tools i.e. the classical multiple regression and regression models of artificial neural networks were used in order to build models. An attempt to define the utilitarian usefulness of the above-mentioned tools and comparative analysis of them is the aim of the conducted research. Data used for conducting analyses refers to the secondary transactional residential real estate market.
A review of a priori regression models for warfarin maintenance dose prediction.
Directory of Open Access Journals (Sweden)
Ben Francis
Full Text Available A number of a priori warfarin dosing algorithms, derived using linear regression methods, have been proposed. Although these dosing algorithms may have been validated using patients derived from the same centre, rarely have they been validated using a patient cohort recruited from another centre. In order to undertake external validation, two cohorts were utilised. One cohort formed by patients from a prospective trial and the second formed by patients in the control arm of the EU-PACT trial. Of these, 641 patients were identified as having attained stable dosing and formed the dataset used for validation. Predicted maintenance doses from six criterion fulfilling regression models were then compared to individual patient stable warfarin dose. Predictive ability was assessed with reference to several statistics including the R-square and mean absolute error. The six regression models explained different amounts of variability in the stable maintenance warfarin dose requirements of the patients in the two validation cohorts; adjusted R-squared values ranged from 24.2% to 68.6%. An overview of the summary statistics demonstrated that no one dosing algorithm could be considered optimal. The larger validation cohort from the prospective trial produced more consistent statistics across the six dosing algorithms. The study found that all the regression models performed worse in the validation cohort when compared to the derivation cohort. Further, there was little difference between regression models that contained pharmacogenetic coefficients and algorithms containing just non-pharmacogenetic coefficients. The inconsistency of results between the validation cohorts suggests that unaccounted population specific factors cause variability in dosing algorithm performance. Better methods for dosing that take into account inter- and intra-individual variability, at the initiation and maintenance phases of warfarin treatment, are needed.
A review of a priori regression models for warfarin maintenance dose prediction.
Francis, Ben; Lane, Steven; Pirmohamed, Munir; Jorgensen, Andrea
2014-01-01
A number of a priori warfarin dosing algorithms, derived using linear regression methods, have been proposed. Although these dosing algorithms may have been validated using patients derived from the same centre, rarely have they been validated using a patient cohort recruited from another centre. In order to undertake external validation, two cohorts were utilised. One cohort formed by patients from a prospective trial and the second formed by patients in the control arm of the EU-PACT trial. Of these, 641 patients were identified as having attained stable dosing and formed the dataset used for validation. Predicted maintenance doses from six criterion fulfilling regression models were then compared to individual patient stable warfarin dose. Predictive ability was assessed with reference to several statistics including the R-square and mean absolute error. The six regression models explained different amounts of variability in the stable maintenance warfarin dose requirements of the patients in the two validation cohorts; adjusted R-squared values ranged from 24.2% to 68.6%. An overview of the summary statistics demonstrated that no one dosing algorithm could be considered optimal. The larger validation cohort from the prospective trial produced more consistent statistics across the six dosing algorithms. The study found that all the regression models performed worse in the validation cohort when compared to the derivation cohort. Further, there was little difference between regression models that contained pharmacogenetic coefficients and algorithms containing just non-pharmacogenetic coefficients. The inconsistency of results between the validation cohorts suggests that unaccounted population specific factors cause variability in dosing algorithm performance. Better methods for dosing that take into account inter- and intra-individual variability, at the initiation and maintenance phases of warfarin treatment, are needed.
Modeling of chemical exergy of agricultural biomass using improved general regression neural network
International Nuclear Information System (INIS)
Huang, Y.W.; Chen, M.Q.; Li, Y.; Guo, J.
2016-01-01
A comprehensive evaluation for energy potential contained in agricultural biomass was a vital step for energy utilization of agricultural biomass. The chemical exergy of typical agricultural biomass was evaluated based on the second law of thermodynamics. The chemical exergy was significantly influenced by C and O elements rather than H element. The standard entropy of the samples also was examined based on their element compositions. Two predicted models of the chemical exergy were developed, which referred to a general regression neural network model based upon the element composition, and a linear model based upon the high heat value. An auto-refinement algorithm was firstly developed to improve the performance of regression neural network model. The developed general regression neural network model with K-fold cross-validation had a better ability for predicting the chemical exergy than the linear model, which had lower predicted errors (±1.5%). - Highlights: • Chemical exergies of agricultural biomass were evaluated based upon fifty samples. • Values for the standard entropy of agricultural biomass samples were calculated. • A linear relationship between chemical exergy and HHV of samples was detected. • An improved GRNN prediction model for the chemical exergy of biomass was developed.
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article are further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
Electricity demand loads modeling using AutoRegressive Moving Average (ARMA) models
Energy Technology Data Exchange (ETDEWEB)
Pappas, S.S. [Department of Information and Communication Systems Engineering, University of the Aegean, Karlovassi, 83 200 Samos (Greece); Ekonomou, L.; Chatzarakis, G.E. [Department of Electrical Engineering Educators, ASPETE - School of Pedagogical and Technological Education, N. Heraklion, 141 21 Athens (Greece); Karamousantas, D.C. [Technological Educational Institute of Kalamata, Antikalamos, 24100 Kalamata (Greece); Katsikas, S.K. [Department of Technology Education and Digital Systems, University of Piraeus, 150 Androutsou Srt., 18 532 Piraeus (Greece); Liatsis, P. [Division of Electrical Electronic and Information Engineering, School of Engineering and Mathematical Sciences, Information and Biomedical Engineering Centre, City University, Northampton Square, London EC1V 0HB (United Kingdom)
2008-09-15
This study addresses the problem of modeling the electricity demand loads in Greece. The provided actual load data is deseasonilized and an AutoRegressive Moving Average (ARMA) model is fitted on the data off-line, using the Akaike Corrected Information Criterion (AICC). The developed model fits the data in a successful manner. Difficulties occur when the provided data includes noise or errors and also when an on-line/adaptive modeling is required. In both cases and under the assumption that the provided data can be represented by an ARMA model, simultaneous order and parameter estimation of ARMA models under the presence of noise are performed. The produced results indicate that the proposed method, which is based on the multi-model partitioning theory, tackles successfully the studied problem. For validation purposes the produced results are compared with three other established order selection criteria, namely AICC, Akaike's Information Criterion (AIC) and Schwarz's Bayesian Information Criterion (BIC). The developed model could be useful in the studies that concern electricity consumption and electricity prices forecasts. (author)
Optimizing Systems of Threshold Detection Sensors
National Research Council Canada - National Science Library
Banschbach, David C
2008-01-01
.... Below the threshold all signals are ignored. We develop a mathematical model for setting individual sensor thresholds to obtain optimal probability of detecting a significant event, given a limit on the total number of false positives allowed...
Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen
2014-01-01
It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models.
Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen
2014-01-01
It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916
Directory of Open Access Journals (Sweden)
Wen-Cheng Wang
2014-01-01
Full Text Available It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models.
A binary logistic regression model with complex sampling design of ...
African Journals Online (AJOL)
2017-09-03
Sep 3, 2017 ... Bi-variable and multi-variable binary logistic regression model with complex sampling design was fitted. .... Data was entered into STATA-12 and analyzed using. SPSS-21. .... lack of access/too far or costs too much. 35. 1.2.
truncSP: An R Package for Estimation of Semi-Parametric Truncated Linear Regression Models
Directory of Open Access Journals (Sweden)
Maria Karlsson
2014-05-01
Full Text Available Problems with truncated data occur in many areas, complicating estimation and inference. Regarding linear regression models, the ordinary least squares estimator is inconsistent and biased for these types of data and is therefore unsuitable for use. Alternative estimators, designed for the estimation of truncated regression models, have been developed. This paper presents the R package truncSP. The package contains functions for the estimation of semi-parametric truncated linear regression models using three different estimators: the symmetrically trimmed least squares, quadratic mode, and left truncated estimators, all of which have been shown to have good asymptotic and ?nite sample properties. The package also provides functions for the analysis of the estimated models. Data from the environmental sciences are used to illustrate the functions in the package.
Weisberg, Sanford
2013-01-01
Praise for the Third Edition ""...this is an excellent book which could easily be used as a course text...""-International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
Vanderick, S; Troch, T; Gillon, A; Glorieux, G; Gengler, N
2014-12-01
Calving ease scores from Holstein dairy cattle in the Walloon Region of Belgium were analysed using univariate linear and threshold animal models. Variance components and derived genetic parameters were estimated from a data set including 33,155 calving records. Included in the models were season, herd and sex of calf × age of dam classes × group of calvings interaction as fixed effects, herd × year of calving, maternal permanent environment and animal direct and maternal additive genetic as random effects. Models were fitted with the genetic correlation between direct and maternal additive genetic effects either estimated or constrained to zero. Direct heritability for calving ease was approximately 8% with linear models and approximately 12% with threshold models. Maternal heritabilities were approximately 2 and 4%, respectively. Genetic correlation between direct and maternal additive effects was found to be not significantly different from zero. Models were compared in terms of goodness of fit and predictive ability. Criteria of comparison such as mean squared error, correlation between observed and predicted calving ease scores as well as between estimated breeding values were estimated from 85,118 calving records. The results provided few differences between linear and threshold models even though correlations between estimated breeding values from subsets of data for sires with progeny from linear model were 17 and 23% greater for direct and maternal genetic effects, respectively, than from threshold model. For the purpose of genetic evaluation for calving ease in Walloon Holstein dairy cattle, the linear animal model without covariance between direct and maternal additive effects was found to be the best choice. © 2014 Blackwell Verlag GmbH.
International Nuclear Information System (INIS)
Bergstroem, Johannes; Ohlsson, Tommy; Zhang He
2011-01-01
We show that, in the low-scale type-I seesaw model, renormalization group running of neutrino parameters may lead to significant modifications of the leptonic mixing angles in view of so-called seesaw threshold effects. Especially, we derive analytical formulas for radiative corrections to neutrino parameters in crossing the different seesaw thresholds, and show that there may exist enhancement factors efficiently boosting the renormalization group running of the leptonic mixing angles. We find that, as a result of the seesaw threshold corrections to the leptonic mixing angles, various flavor symmetric mixing patterns (e.g., bi-maximal and tri-bimaximal mixing patterns) can be easily accommodated at relatively low energy scales, which is well within the reach of running and forthcoming experiments (e.g., the LHC).
Wei, Jiawei; Carroll, Raymond J.; Maity, Arnab
2011-01-01
We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work
Threshold Dynamics of a Stochastic SIR Model with Vertical Transmission and Vaccination
Miao, Anqi; Zhang, Jian; Zhang, Tongqian; Pradeep, B. G. Sampath Aruna
2017-01-01
A stochastic SIR model with vertical transmission and vaccination is proposed and investigated in this paper. The threshold dynamics are explored when the noise is small. The conditions for the extinction or persistence of infectious diseases are deduced. Our results show that large noise can lead to the extinction of infectious diseases which is conducive to epidemic diseases control.
Jovanovic, Milos; Radovanovic, Sandro; Vukicevic, Milan; Van Poucke, Sven; Delibasic, Boris
2016-09-01
Quantification and early identification of unplanned readmission risk have the potential to improve the quality of care during hospitalization and after discharge. However, high dimensionality, sparsity, and class imbalance of electronic health data and the complexity of risk quantification, challenge the development of accurate predictive models. Predictive models require a certain level of interpretability in order to be applicable in real settings and create actionable insights. This paper aims to develop accurate and interpretable predictive models for readmission in a general pediatric patient population, by integrating a data-driven model (sparse logistic regression) and domain knowledge based on the international classification of diseases 9th-revision clinical modification (ICD-9-CM) hierarchy of diseases. Additionally, we propose a way to quantify the interpretability of a model and inspect the stability of alternative solutions. The analysis was conducted on >66,000 pediatric hospital discharge records from California, State Inpatient Databases, Healthcare Cost and Utilization Project between 2009 and 2011. We incorporated domain knowledge based on the ICD-9-CM hierarchy in a data driven, Tree-Lasso regularized logistic regression model, providing the framework for model interpretation. This approach was compared with traditional Lasso logistic regression resulting in models that are easier to interpret by fewer high-level diagnoses, with comparable prediction accuracy. The results revealed that the use of a Tree-Lasso model was as competitive in terms of accuracy (measured by area under the receiver operating characteristic curve-AUC) as the traditional Lasso logistic regression, but integration with the ICD-9-CM hierarchy of diseases provided more interpretable models in terms of high-level diagnoses. Additionally, interpretations of models are in accordance with existing medical understanding of pediatric readmission. Best performing models have
Keat, Sim Chong; Chun, Beh Boon; San, Lim Hwee; Jafri, Mohd Zubir Mat
2015-04-01
Climate change due to carbon dioxide (CO2) emissions is one of the most complex challenges threatening our planet. This issue considered as a great and international concern that primary attributed from different fossil fuels. In this paper, regression model is used for analyzing the causal relationship among CO2 emissions based on the energy consumption in Malaysia using time series data for the period of 1980-2010. The equations were developed using regression model based on the eight major sources that contribute to the CO2 emissions such as non energy, Liquefied Petroleum Gas (LPG), diesel, kerosene, refinery gas, Aviation Turbine Fuel (ATF) and Aviation Gasoline (AV Gas), fuel oil and motor petrol. The related data partly used for predict the regression model (1980-2000) and partly used for validate the regression model (2001-2010). The results of the prediction model with the measured data showed a high correlation coefficient (R2=0.9544), indicating the model's accuracy and efficiency. These results are accurate and can be used in early warning of the population to comply with air quality standards.
Better Autologistic Regression
Directory of Open Access Journals (Sweden)
Mark A. Wolters
2017-11-01
Full Text Available Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding—the two numbers used to represent the two possible states of the variables—might differ. Common coding choices are (zero, one and (minus one, plus one. Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts ".an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models.highly recommend[ed].for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
On the Appearance of Thresholds in the Dynamical Model of Star Formation
Elmegreen, Bruce G.
2018-02-01
The Kennicutt–Schmidt (KS) relationship between the surface density of the star formation rate (SFR) and the gas surface density has three distinct power laws that may result from one model in which gas collapses at a fixed fraction of the dynamical rate. The power-law slope is 1 when the observed gas has a characteristic density for detection, 1.5 for total gas when the thickness is about constant as in the main disks of galaxies, and 2 for total gas when the thickness is regulated by self-gravity and the velocity dispersion is about constant, as in the outer parts of spirals, dwarf irregulars, and giant molecular clouds. The observed scaling of the star formation efficiency (SFR per unit CO) with the dense gas fraction (HCN/CO) is derived from the KS relationship when one tracer (HCN) is on the linear part and the other (CO) is on the 1.5 part. Observations of a threshold density or column density with a constant SFR per unit gas mass above the threshold are proposed to be selection effects, as are observations of star formation in only the dense parts of clouds. The model allows a derivation of all three KS relations using the probability distribution function of density with no thresholds for star formation. Failed galaxies and systems with sub-KS SFRs are predicted to have gas that is dominated by an equilibrium warm phase where the thermal Jeans length exceeds the Toomre length. A squared relation is predicted for molecular gas-dominated young galaxies.
Time-adaptive quantile regression
DEFF Research Database (Denmark)
Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik
2008-01-01
and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power......An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method...... production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered....
Small-threshold behaviour of two-loop self-energy diagrams: two-particle thresholds
International Nuclear Information System (INIS)
Berends, F.A.; Davydychev, A.I.; Moskovskij Gosudarstvennyj Univ., Moscow; Smirnov, V.A.; Moskovskij Gosudarstvennyj Univ., Moscow
1996-01-01
The behaviour of two-loop two-point diagrams at non-zero thresholds corresponding to two-particle cuts is analyzed. The masses involved in a cut and the external momentum are assumed to be small as compared to some of the other masses of the diagram. By employing general formulae of asymptotic expansions of Feynman diagrams in momenta and masses, we construct an algorithm to derive analytic approximations to the diagrams. In such a way, we calculate several first coefficients of the expansion. Since no conditions on relative values of the small masses and the external momentum are imposed, the threshold irregularities are described analytically. Numerical examples, using diagrams occurring in the standard model, illustrate the convergence of the expansion below the first large threshold. (orig.)
Construction of risk prediction model of type 2 diabetes mellitus based on logistic regression
Directory of Open Access Journals (Sweden)
Li Jian
2017-01-01
Full Text Available Objective: to construct multi factor prediction model for the individual risk of T2DM, and to explore new ideas for early warning, prevention and personalized health services for T2DM. Methods: using logistic regression techniques to screen the risk factors for T2DM and construct the risk prediction model of T2DM. Results: Male’s risk prediction model logistic regression equation: logit(P=BMI × 0.735+ vegetables × (−0.671 + age × 0.838+ diastolic pressure × 0.296+ physical activity× (−2.287 + sleep ×(−0.009 +smoking ×0.214; Female’s risk prediction model logistic regression equation: logit(P=BMI ×1.979+ vegetables× (−0.292 + age × 1.355+ diastolic pressure× 0.522+ physical activity × (−2.287 + sleep × (−0.010.The area under the ROC curve of male was 0.83, the sensitivity was 0.72, the specificity was 0.86, the area under the ROC curve of female was 0.84, the sensitivity was 0.75, the specificity was 0.90. Conclusion: This study model data is from a compared study of nested case, the risk prediction model has been established by using the more mature logistic regression techniques, and the model is higher predictive sensitivity, specificity and stability.
Time Poverty Thresholds and Rates for the US Population
Kalenkoski, Charlene M.; Hamrick, Karen S.; Andrews, Margaret
2011-01-01
Time constraints, like money constraints, affect Americans' well-being. This paper defines what it means to be time poor based on the concepts of necessary and committed time and presents time poverty thresholds and rates for the US population and certain subgroups. Multivariate regression techniques are used to identify the key variables…
Longobardo, G S; Evangelisti, C J; Cherniack, N S
2009-12-01
We examined the effect of arousals (shifts from sleep to wakefulness) on breathing during sleep using a mathematical model. The model consisted of a description of the fluid dynamics and mechanical properties of the upper airways and lungs, as well as a controller sensitive to arterial and brain changes in CO(2), changes in arterial oxygen, and a neural input, alertness. The body was divided into multiple gas store compartments connected by the circulation. Cardiac output was constant, and cerebral blood flows were sensitive to changes in O(2) and CO(2) levels. Arousal was considered to occur instantaneously when afferent respiratory chemical and neural stimulation reached a threshold value, while sleep occurred when stimulation fell below that value. In the case of rigid and nearly incompressible upper airways, lowering arousal threshold decreased the stability of breathing and led to the occurrence of repeated apnoeas. In more compressible upper airways, to maintain stability, increasing arousal thresholds and decreasing elasticity were linked approximately linearly, until at low elastances arousal thresholds had no effect on stability. Increased controller gain promoted instability. The architecture of apnoeas during unstable sleep changed with the arousal threshold and decreases in elasticity. With rigid airways, apnoeas were central. With lower elastances, apnoeas were mixed even with higher arousal thresholds. With very low elastances and still higher arousal thresholds, sleep consisted totally of obstructed apnoeas. Cycle lengths shortened as the sleep architecture changed from mixed apnoeas to total obstruction. Deeper sleep also tended to promote instability by increasing plant gain. These instabilities could be countered by arousal threshold increases which were tied to deeper sleep or accumulated aroused time, or by decreased controller gains.
THE REGRESSION MODEL OF IRAN LIBRARIES ORGANIZATIONAL CLIMATE
Jahani, Mohammad Ali; Yaminfirooz, Mousa; Siamian, Hasan
2015-01-01
Background: The purpose of this study was to drawing a regression model of organizational climate of central libraries of Iran?s universities. Methods: This study is an applied research. The statistical population of this study consisted of 96 employees of the central libraries of Iran?s public universities selected among the 117 universities affiliated to the Ministry of Health by Stratified Sampling method (510 people). Climate Qual localized questionnaire was used as research tools. For pr...
Sciagrà, Roberto; Cipollini, Fabrizio; Berti, Valentina; Migliorini, Angela; Antoniucci, David; Pupi, Alberto
2013-04-01
In acute myocardial infarction (AMI) treated by primary percutaneous coronary intervention (PCI), there is a direct relationship between myocardial damage and consequent left ventricular (LV) functional impairment. It is however unclear whether there is a safety threshold below which infarct size does not significantly affect LV ejection fraction (EF). The aim of this study was to evaluate the relationship between infarct size and LVEF in AMI patients treated by successful PCI using a specific statistical approach to identify a possible safety threshold. Among patients with recent AMI submitted to perfusion gated single photon emission computed tomography (SPECT) to define the infarct size, the data of 427 subjects with sizable infarct size were considered. The relationship between infarct size and LVEF was analysed using a simple segmented regression (SSR) model and an iterative algorithm based on robust least squares (RLS) for parameter estimation. The RLS algorithm detected two break points in the SSR model, set at infarct size values of 11.0 and 51.5 %. Because the slope coefficients of the two extreme segments of the regression line were not significant, by constraining such segments to zero slope in the SSR model, the lower break point was identified at infarct size = 8 % and the upper one at 45 %. Using a rigorous statistical approach, it is possible to demonstrate that below a threshold of 8 % the infarct size apparently does not affect the LVEF and therefore a safety threshold could be set at this value. Furthermore, the same analysis suggests that the relationship between infarct size and LVEF impairment is lost for an infarct size > 45 %.
Directory of Open Access Journals (Sweden)
J. Kath
2014-12-01
Full Text Available Groundwater decline is widespread, yet its implications for natural systems are poorly understood. Previous research has revealed links between groundwater depth and tree condition; however, critical thresholds which might indicate ecological ‘tipping points’ associated with rapid and potentially irreversible change have been difficult to quantify. This study collated data for two dominant floodplain species, Eucalyptus camaldulensis (river red gum and E. populnea (poplar box from 118 sites in eastern Australia where significant groundwater decline has occurred. Boosted regression trees, quantile regression and Threshold Indicator Taxa Analysis were used to investigate the relationship between tree condition and groundwater depth. Distinct non-linear responses were found, with groundwater depth thresholds identified in the range from 12.1 m to 22.6 m for E. camaldulensis and 12.6 m to 26.6 m for E. populnea beyond which canopy condition declined abruptly. Non-linear threshold responses in canopy condition in these species may be linked to rooting depth, with chronic groundwater decline decoupling trees from deep soil moisture resources. The quantification of groundwater depth thresholds is likely to be critical for management aimed at conserving groundwater dependent biodiversity. Identifying thresholds will be important in regions where water extraction and drying climates may contribute to further groundwater decline. Keywords: Canopy condition, Dieback, Drought, Tipping point, Ecological threshold, Groundwater dependent ecosystems
Wilson, Barry T.; Knight, Joseph F.; McRoberts, Ronald E.
2018-03-01
Imagery from the Landsat Program has been used frequently as a source of auxiliary data for modeling land cover, as well as a variety of attributes associated with tree cover. With ready access to all scenes in the archive since 2008 due to the USGS Landsat Data Policy, new approaches to deriving such auxiliary data from dense Landsat time series are required. Several methods have previously been developed for use with finer temporal resolution imagery (e.g. AVHRR and MODIS), including image compositing and harmonic regression using Fourier series. The manuscript presents a study, using Minnesota, USA during the years 2009-2013 as the study area and timeframe. The study examined the relative predictive power of land cover models, in particular those related to tree cover, using predictor variables based solely on composite imagery versus those using estimated harmonic regression coefficients. The study used two common non-parametric modeling approaches (i.e. k-nearest neighbors and random forests) for fitting classification and regression models of multiple attributes measured on USFS Forest Inventory and Analysis plots using all available Landsat imagery for the study area and timeframe. The estimated Fourier coefficients developed by harmonic regression of tasseled cap transformation time series data were shown to be correlated with land cover, including tree cover. Regression models using estimated Fourier coefficients as predictor variables showed a two- to threefold increase in explained variance for a small set of continuous response variables, relative to comparable models using monthly image composites. Similarly, the overall accuracies of classification models using the estimated Fourier coefficients were approximately 10-20 percentage points higher than the models using the image composites, with corresponding individual class accuracies between six and 45 percentage points higher.
Yusuf, O B; Bamgboye, E A; Afolabi, R F; Shodimu, M A
2014-09-01
Logistic regression model is widely used in health research for description and predictive purposes. Unfortunately, most researchers are sometimes not aware that the underlying principles of the techniques have failed when the algorithm for maximum likelihood does not converge. Young researchers particularly postgraduate students may not know why separation problem whether quasi or complete occurs, how to identify it and how to fix it. This study was designed to critically evaluate convergence issues in articles that employed logistic regression analysis published in an African Journal of Medicine and medical sciences between 2004 and 2013. Problems of quasi or complete separation were described and were illustrated with the National Demographic and Health Survey dataset. A critical evaluation of articles that employed logistic regression was conducted. A total of 581 articles was reviewed, of which 40 (6.9%) used binary logistic regression. Twenty-four (60.0%) stated the use of logistic regression model in the methodology while none of the articles assessed model fit. Only 3 (12.5%) properly described the procedures. Of the 40 that used the logistic regression model, the problem of convergence occurred in 6 (15.0%) of the articles. Logistic regression tends to be poorly reported in studies published between 2004 and 2013. Our findings showed that the procedure may not be well understood by researchers since very few described the process in their reports and may be totally unaware of the problem of convergence or how to deal with it.
Modeling of the Monthly Rainfall-Runoff Process Through Regressions
Directory of Open Access Journals (Sweden)
Campos-Aranda Daniel Francisco
2014-10-01
Full Text Available To solve the problems associated with the assessment of water resources of a river, the modeling of the rainfall-runoff process (RRP allows the deduction of runoff missing data and to extend its record, since generally the information available on precipitation is larger. It also enables the estimation of inputs to reservoirs, when their building led to the suppression of the gauging station. The simplest mathematical model that can be set for the RRP is the linear regression or curve on a monthly basis. Such a model is described in detail and is calibrated with the simultaneous record of monthly rainfall and runoff in Ballesmi hydrometric station, which covers 35 years. Since the runoff of this station has an important contribution from the spring discharge, the record is corrected first by removing that contribution. In order to do this a procedure was developed based either on the monthly average regional runoff coefficients or on nearby and similar watershed; in this case the Tancuilín gauging station was used. Both stations belong to the Partial Hydrologic Region No. 26 (Lower Rio Panuco and are located within the state of San Luis Potosi, México. The study performed indicates that the monthly regression model, due to its conceptual approach, faithfully reproduces monthly average runoff volumes and achieves an excellent approximation in relation to the dispersion, proved by calculation of the means and standard deviations.
Genetic evaluation of European quails by random regression models
Directory of Open Access Journals (Sweden)
Flaviana Miranda Gonçalves
2012-09-01
Full Text Available The objective of this study was to compare different random regression models, defined from different classes of heterogeneity of variance combined with different Legendre polynomial orders for the estimate of (covariance of quails. The data came from 28,076 observations of 4,507 female meat quails of the LF1 lineage. Quail body weights were determined at birth and 1, 14, 21, 28, 35 and 42 days of age. Six different classes of residual variance were fitted to Legendre polynomial functions (orders ranging from 2 to 6 to determine which model had the best fit to describe the (covariance structures as a function of time. According to the evaluated criteria (AIC, BIC and LRT, the model with six classes of residual variances and of sixth-order Legendre polynomial was the best fit. The estimated additive genetic variance increased from birth to 28 days of age, and dropped slightly from 35 to 42 days. The heritability estimates decreased along the growth curve and changed from 0.51 (1 day to 0.16 (42 days. Animal genetic and permanent environmental correlation estimates between weights and age classes were always high and positive, except for birth weight. The sixth order Legendre polynomial, along with the residual variance divided into six classes was the best fit for the growth rate curve of meat quails; therefore, they should be considered for breeding evaluation processes by random regression models.
Significance tests to determine the direction of effects in linear regression models.
Wiedermann, Wolfgang; Hagmann, Michael; von Eye, Alexander
2015-02-01
Previous studies have discussed asymmetric interpretations of the Pearson correlation coefficient and have shown that higher moments can be used to decide on the direction of dependence in the bivariate linear regression setting. The current study extends this approach by illustrating that the third moment of regression residuals may also be used to derive conclusions concerning the direction of effects. Assuming non-normally distributed variables, it is shown that the distribution of residuals of the correctly specified regression model (e.g., Y is regressed on X) is more symmetric than the distribution of residuals of the competing model (i.e., X is regressed on Y). Based on this result, 4 one-sample tests are discussed which can be used to decide which variable is more likely to be the response and which one is more likely to be the explanatory variable. A fifth significance test is proposed based on the differences of skewness estimates, which leads to a more direct test of a hypothesis that is compatible with direction of dependence. A Monte Carlo simulation study was performed to examine the behaviour of the procedures under various degrees of associations, sample sizes, and distributional properties of the underlying population. An empirical example is given which illustrates the application of the tests in practice. © 2014 The British Psychological Society.
Weighted functional linear regression models for gene-based association analysis.
Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I
2018-01-01
Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18×10-6), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.
Threshold Dynamics in Stochastic SIRS Epidemic Models with Nonlinear Incidence and Vaccination
Directory of Open Access Journals (Sweden)
Lei Wang
2017-01-01
Full Text Available In this paper, the dynamical behaviors for a stochastic SIRS epidemic model with nonlinear incidence and vaccination are investigated. In the models, the disease transmission coefficient and the removal rates are all affected by noise. Some new basic properties of the models are found. Applying these properties, we establish a series of new threshold conditions on the stochastically exponential extinction, stochastic persistence, and permanence in the mean of the disease with probability one for the models. Furthermore, we obtain a sufficient condition on the existence of unique stationary distribution for the model. Finally, a series of numerical examples are introduced to illustrate our main theoretical results and some conjectures are further proposed.
Improving the Prediction of Total Surgical Procedure Time Using Linear Regression Modeling
Directory of Open Access Journals (Sweden)
Eric R. Edelman
2017-06-01
Full Text Available For efficient utilization of operating rooms (ORs, accurate schedules of assigned block time and sequences of patient cases need to be made. The quality of these planning tools is dependent on the accurate prediction of total procedure time (TPT per case. In this paper, we attempt to improve the accuracy of TPT predictions by using linear regression models based on estimated surgeon-controlled time (eSCT and other variables relevant to TPT. We extracted data from a Dutch benchmarking database of all surgeries performed in six academic hospitals in The Netherlands from 2012 till 2016. The final dataset consisted of 79,983 records, describing 199,772 h of total OR time. Potential predictors of TPT that were included in the subsequent analysis were eSCT, patient age, type of operation, American Society of Anesthesiologists (ASA physical status classification, and type of anesthesia used. First, we computed the predicted TPT based on a previously described fixed ratio model for each record, multiplying eSCT by 1.33. This number is based on the research performed by van Veen-Berkx et al., which showed that 33% of SCT is generally a good approximation of anesthesia-controlled time (ACT. We then systematically tested all possible linear regression models to predict TPT using eSCT in combination with the other available independent variables. In addition, all regression models were again tested without eSCT as a predictor to predict ACT separately (which leads to TPT by adding SCT. TPT was most accurately predicted using a linear regression model based on the independent variables eSCT, type of operation, ASA classification, and type of anesthesia. This model performed significantly better than the fixed ratio model and the method of predicting ACT separately. Making use of these more accurate predictions in planning and sequencing algorithms may enable an increase in utilization of ORs, leading to significant financial and productivity related
Improving the Prediction of Total Surgical Procedure Time Using Linear Regression Modeling.
Edelman, Eric R; van Kuijk, Sander M J; Hamaekers, Ankie E W; de Korte, Marcel J M; van Merode, Godefridus G; Buhre, Wolfgang F F A
2017-01-01
For efficient utilization of operating rooms (ORs), accurate schedules of assigned block time and sequences of patient cases need to be made. The quality of these planning tools is dependent on the accurate prediction of total procedure time (TPT) per case. In this paper, we attempt to improve the accuracy of TPT predictions by using linear regression models based on estimated surgeon-controlled time (eSCT) and other variables relevant to TPT. We extracted data from a Dutch benchmarking database of all surgeries performed in six academic hospitals in The Netherlands from 2012 till 2016. The final dataset consisted of 79,983 records, describing 199,772 h of total OR time. Potential predictors of TPT that were included in the subsequent analysis were eSCT, patient age, type of operation, American Society of Anesthesiologists (ASA) physical status classification, and type of anesthesia used. First, we computed the predicted TPT based on a previously described fixed ratio model for each record, multiplying eSCT by 1.33. This number is based on the research performed by van Veen-Berkx et al., which showed that 33% of SCT is generally a good approximation of anesthesia-controlled time (ACT). We then systematically tested all possible linear regression models to predict TPT using eSCT in combination with the other available independent variables. In addition, all regression models were again tested without eSCT as a predictor to predict ACT separately (which leads to TPT by adding SCT). TPT was most accurately predicted using a linear regression model based on the independent variables eSCT, type of operation, ASA classification, and type of anesthesia. This model performed significantly better than the fixed ratio model and the method of predicting ACT separately. Making use of these more accurate predictions in planning and sequencing algorithms may enable an increase in utilization of ORs, leading to significant financial and productivity related benefits.
Jin, Li; Hongxia, Liu; Bin, Li; Lei, Cao; Bo, Yuan
2010-08-01
For the first time, a simple and accurate two-dimensional analytical model for the surface potential variation along the channel in fully depleted dual-material gate strained-Si-on-insulator (DMG SSOI) MOSFETs is developed. We investigate the improved short channel effect (SCE), hot carrier effect (HCE), drain-induced barrier-lowering (DIBL) and carrier transport efficiency for the novel structure MOSFET. The analytical model takes into account the effects of different metal gate lengths, work functions, the drain bias and Ge mole fraction in the relaxed SiGe buffer. The surface potential in the channel region exhibits a step potential, which can suppress SCE, HCE and DIBL. Also, strained-Si and SOI structure can improve the carrier transport efficiency, with strained-Si being particularly effective. Further, the threshold voltage model correctly predicts a “rollup" in threshold voltage with decreasing channel length ratios or Ge mole fraction in the relaxed SiGe buffer. The validity of the two-dimensional analytical model is verified using numerical simulations.
Linearity and Misspecification Tests for Vector Smooth Transition Regression Models
DEFF Research Database (Denmark)
Teräsvirta, Timo; Yang, Yukai
The purpose of the paper is to derive Lagrange multiplier and Lagrange multiplier type specification and misspecification tests for vector smooth transition regression models. We report results from simulation studies in which the size and power properties of the proposed asymptotic tests in small...
Regularized multivariate regression models with skew-t error distributions
Chen, Lianfu
2014-06-01
We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both the regression coefficient and inverse scale matrices simultaneously. The sparsity is introduced through penalizing the negative log-likelihood by adding L1-penalties on the entries of the two matrices. Taking advantage of the hierarchical representation of skew-t distributions, and using the expectation conditional maximization (ECM) algorithm, we reduce the problem to penalized normal likelihood and develop a procedure to minimize the ensuing objective function. Using a simulation study the performance of the method is assessed, and the methodology is illustrated using a real data set with a 24-dimensional response vector. © 2014 Elsevier B.V.
Larson, Steven J.; Crawford, Charles G.; Gilliom, Robert J.
2004-01-01
Regression models were developed for predicting atrazine concentration distributions in rivers and streams, using the Watershed Regressions for Pesticides (WARP) methodology. Separate regression equations were derived for each of nine percentiles of the annual distribution of atrazine concentrations and for the annual time-weighted mean atrazine concentration. In addition, seasonal models were developed for two specific periods of the year--the high season, when the highest atrazine concentrations are expected in streams, and the low season, when concentrations are expected to be low or undetectable. Various nationally available watershed parameters were used as explanatory variables, including atrazine use intensity, soil characteristics, hydrologic parameters, climate and weather variables, land use, and agricultural management practices. Concentration data from 112 river and stream stations sampled as part of the U.S. Geological Survey's National Water-Quality Assessment and National Stream Quality Accounting Network Programs were used for computing the concentration percentiles and mean concentrations used as the response variables in regression models. Tobit regression methods, using maximum likelihood estimation, were used for developing the models because some of the concentration values used for the response variables were censored (reported as less than a detection threshold). Data from 26 stations not used for model development were used for model validation. The annual models accounted for 62 to 77 percent of the variability in concentrations among the 112 model development stations. Atrazine use intensity (the amount of atrazine used in the watershed divided by watershed area) was the most important explanatory variable in all models, but additional watershed parameters significantly increased the amount of variability explained by the models. Predicted concentrations from all 10 models were within a factor of 10 of the observed concentrations at most
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of Quantile Regression models This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensivedescription of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and
International Nuclear Information System (INIS)
Alvarez R, J.T.; Morales P, R.
1992-06-01
The absorbed dose for equivalent soft tissue is determined,it is imparted by ophthalmologic applicators, ( 90 Sr/ 90 Y, 1850 MBq) using an extrapolation chamber of variable electrodes; when estimating the slope of the extrapolation curve using a simple lineal regression model is observed that the dose values are underestimated from 17.7 percent up to a 20.4 percent in relation to the estimate of this dose by means of a regression model polynomial two grade, at the same time are observed an improvement in the standard error for the quadratic model until in 50%. Finally the global uncertainty of the dose is presented, taking into account the reproducibility of the experimental arrangement. As conclusion it can infers that in experimental arrangements where the source is to contact with the extrapolation chamber, it was recommended to substitute the lineal regression model by the quadratic regression model, in the determination of the slope of the extrapolation curve, for more exact and accurate measurements of the absorbed dose. (Author)
Threshold Dynamics of a Stochastic SIR Model with Vertical Transmission and Vaccination
Directory of Open Access Journals (Sweden)
Anqi Miao
2017-01-01
Full Text Available A stochastic SIR model with vertical transmission and vaccination is proposed and investigated in this paper. The threshold dynamics are explored when the noise is small. The conditions for the extinction or persistence of infectious diseases are deduced. Our results show that large noise can lead to the extinction of infectious diseases which is conducive to epidemic diseases control.
Directory of Open Access Journals (Sweden)
Nina L. Timofeeva
2014-01-01
Full Text Available The article presents the methodological and technical bases for the creation of regression models that adequately reflect reality. The focus is on methods of removing residual autocorrelation in models. Algorithms eliminating heteroscedasticity and autocorrelation of the regression model residuals: reweighted least squares method, the method of Cochran-Orkutta are given. A model of "pure" regression is build, as well as to compare the effect on the dependent variable of the different explanatory variables when the latter are expressed in different units, a standardized form of the regression equation. The scheme of abatement techniques of heteroskedasticity and autocorrelation for the creation of regression models specific to the social and cultural sphere is developed.
A generalized multivariate regression model for modelling ocean wave heights
Wang, X. L.; Feng, Y.; Swail, V. R.
2012-04-01
In this study, a generalized multivariate linear regression model is developed to represent the relationship between 6-hourly ocean significant wave heights (Hs) and the corresponding 6-hourly mean sea level pressure (MSLP) fields. The model is calibrated using the ERA-Interim reanalysis of Hs and MSLP fields for 1981-2000, and is validated using the ERA-Interim reanalysis for 2001-2010 and ERA40 reanalysis of Hs and MSLP for 1958-2001. The performance of the fitted model is evaluated in terms of Pierce skill score, frequency bias index, and correlation skill score. Being not normally distributed, wave heights are subjected to a data adaptive Box-Cox transformation before being used in the model fitting. Also, since 6-hourly data are being modelled, lag-1 autocorrelation must be and is accounted for. The models with and without Box-Cox transformation, and with and without accounting for autocorrelation, are inter-compared in terms of their prediction skills. The fitted MSLP-Hs relationship is then used to reconstruct historical wave height climate from the 6-hourly MSLP fields taken from the Twentieth Century Reanalysis (20CR, Compo et al. 2011), and to project possible future wave height climates using CMIP5 model simulations of MSLP fields. The reconstructed and projected wave heights, both seasonal means and maxima, are subject to a trend analysis that allows for non-linear (polynomial) trends.
International Nuclear Information System (INIS)
Murari, A.; Lupelli, I.; Gaudio, P.; Gelfusa, M.; Vega, J.
2012-01-01
In this paper, a refined set of statistical techniques is developed and then applied to the problem of deriving the scaling law for the threshold power to access the H-mode of confinement in tokamaks. This statistical methodology is applied to the 2010 version of the ITPA International Global Threshold Data Base v6b(IGDBTHv6b). To increase the engineering and operative relevance of the results, only macroscopic physical quantities, measured in the vast majority of experiments, have been considered as candidate variables in the models. Different principled methods, such as agglomerative hierarchical variables clustering, without assumption about the functional form of the scaling, and nonlinear regression, are implemented to select the best subset of candidate independent variables and to improve the regression model accuracy. Two independent model selection criteria, based on the classical (Akaike information criterion) and Bayesian formalism (Bayesian information criterion), are then used to identify the most efficient scaling law from candidate models. The results derived from the full multi-machine database confirm the results of previous analysis but emphasize the importance of shaping quantities, elongation and triangularity. On the other hand, the scaling laws for the different machines and at different currents are different from each other at the level of confidence well above 95%, suggesting caution in the use of the global scaling laws for both interpretation and extrapolation purposes. (paper)
The use of logistic regression in modelling the distributions of bird ...
African Journals Online (AJOL)
The method of logistic regression was used to model the observed geographical distribution patterns of bird species in Swaziland in relation to a set of environmental variables. Reporting rates derived from bird atlas data are used as an index of population densities. This is justified in part by the success of the modelling ...
Amaliana, Luthfatul; Sa'adah, Umu; Wayan Surya Wardhani, Ni
2017-12-01
Tetanus Neonatorum is an infectious disease that can be prevented by immunization. The number of Tetanus Neonatorum cases in East Java Province is the highest in Indonesia until 2015. Tetanus Neonatorum data contain over dispersion and big enough proportion of zero-inflation. Negative Binomial (NB) regression is an alternative method when over dispersion happens in Poisson regression. However, the data containing over dispersion and zero-inflation are more appropriately analyzed by using Zero-Inflated Negative Binomial (ZINB) regression. The purpose of this study are: (1) to model Tetanus Neonatorum cases in East Java Province with 71.05 percent proportion of zero-inflation by using NB and ZINB regression, (2) to obtain the best model. The result of this study indicates that ZINB is better than NB regression with smaller AIC.
Modeling Information Content Via Dirichlet-Multinomial Regression Analysis.
Ferrari, Alberto
2017-01-01
Shannon entropy is being increasingly used in biomedical research as an index of complexity and information content in sequences of symbols, e.g. languages, amino acid sequences, DNA methylation patterns and animal vocalizations. Yet, distributional properties of information entropy as a random variable have seldom been the object of study, leading to researchers mainly using linear models or simulation-based analytical approach to assess differences in information content, when entropy is measured repeatedly in different experimental conditions. Here a method to perform inference on entropy in such conditions is proposed. Building on results coming from studies in the field of Bayesian entropy estimation, a symmetric Dirichlet-multinomial regression model, able to deal efficiently with the issue of mean entropy estimation, is formulated. Through a simulation study the model is shown to outperform linear modeling in a vast range of scenarios and to have promising statistical properties. As a practical example, the method is applied to a data set coming from a real experiment on animal communication.
Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P
2015-01-01
This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict what patients are at risk to be readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex to understand among the hospital practitioners. Explore the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaning decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model. Furthermore, the developed CLR models tend to achieve better sensitivity of
Wang, Hui; Liu, Huifang; Cao, Zhiyong; Wang, Bowen
2016-01-01
This paper presents a new perspective that there is a double-threshold effect in terms of the technology gap existing in the foreign direct investment (FDI) technology spillover process in different regional Chinese industrial sectors. In this paper, a double-threshold regression model was established to examine the relation between the threshold effect of the technology gap and technology spillover. Based on the provincial panel data of Chinese industrial sectors from 2000 to 2011, the empirical results reveal that there are two threshold values, which are 1.254 and 2.163, in terms of the technology gap in the industrial sector in eastern China. There are also two threshold values in both the central and western industrial sector, which are 1.516, 2.694 and 1.635, 2.714, respectively. The technology spillover is a decreasing function of the technology gap in both the eastern and western industrial sectors, but a concave curve function of the technology gap is in the central industrial sectors. Furthermore, the FDI technology spillover has increased gradually in recent years. Based on the empirical results, suggestions were proposed to elucidate the introduction of the FDI and the improvement in the industrial added value in different regions of China.
Notes on power of normality tests of error terms in regression models
International Nuclear Information System (INIS)
Střelec, Luboš
2015-01-01
Normality is one of the basic assumptions in applying statistical procedures. For example in linear regression most of the inferential procedures are based on the assumption of normality, i.e. the disturbance vector is assumed to be normally distributed. Failure to assess non-normality of the error terms may lead to incorrect results of usual statistical inference techniques such as t-test or F-test. Thus, error terms should be normally distributed in order to allow us to make exact inferences. As a consequence, normally distributed stochastic errors are necessary in order to make a not misleading inferences which explains a necessity and importance of robust tests of normality. Therefore, the aim of this contribution is to discuss normality testing of error terms in regression models. In this contribution, we introduce the general RT class of robust tests for normality, and present and discuss the trade-off between power and robustness of selected classical and robust normality tests of error terms in regression models
Notes on power of normality tests of error terms in regression models
Energy Technology Data Exchange (ETDEWEB)
Střelec, Luboš [Department of Statistics and Operation Analysis, Faculty of Business and Economics, Mendel University in Brno, Zemědělská 1, Brno, 61300 (Czech Republic)
2015-03-10
Normality is one of the basic assumptions in applying statistical procedures. For example in linear regression most of the inferential procedures are based on the assumption of normality, i.e. the disturbance vector is assumed to be normally distributed. Failure to assess non-normality of the error terms may lead to incorrect results of usual statistical inference techniques such as t-test or F-test. Thus, error terms should be normally distributed in order to allow us to make exact inferences. As a consequence, normally distributed stochastic errors are necessary in order to make a not misleading inferences which explains a necessity and importance of robust tests of normality. Therefore, the aim of this contribution is to discuss normality testing of error terms in regression models. In this contribution, we introduce the general RT class of robust tests for normality, and present and discuss the trade-off between power and robustness of selected classical and robust normality tests of error terms in regression models.
Hellard, Philippe; Avalos, Marta; Millet, Gregoire; Lacoste, Lucien; Barale, Frederic; Chatard, Jean-Claude
2005-02-01
The aim of this study was to model the residual effects of training on the swimming performance and to compare a model that includes threshold saturation (MM) with the Banister model (BM). Seven Olympic swimmers were studied over a period of 4 +/- 2 years. For 3 training loads (low-intensity w(LIT), high-intensity w(HIT), and strength training w(ST)), 3 residual training effects were determined: short-term (STE) during the taper phase (i.e., 3 weeks before the performance [weeks 0, 1, and 2]), intermediate-term (ITE) during the intensity phase (weeks 3, 4, and 5), and long-term (LTE) during the volume phase (weeks 6, 7, and 8). ITE and LTE were positive for w(HIT) and w(LIT), respectively (p measures indicated that MM compares favorably with BM. Identifying individual training thresholds may help individualize the distribution of training loads.
Online Statistical Modeling (Regression Analysis) for Independent Responses
Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus
2017-06-01
Regression analysis (statistical analmodelling) are among statistical methods which are frequently needed in analyzing quantitative data, especially to model relationship between response and explanatory variables. Nowadays, statistical models have been developed into various directions to model various type and complex relationship of data. Rich varieties of advanced and recent statistical modelling are mostly available on open source software (one of them is R). However, these advanced statistical modelling, are not very friendly to novice R users, since they are based on programming script or command line interface. Our research aims to developed web interface (based on R and shiny), so that most recent and advanced statistical modelling are readily available, accessible and applicable on web. We have previously made interface in the form of e-tutorial for several modern and advanced statistical modelling on R especially for independent responses (including linear models/LM, generalized linier models/GLM, generalized additive model/GAM and generalized additive model for location scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including model using Computer Intensive Statistics (Bootstrap and Markov Chain Monte Carlo/ MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web (interface) make the statistical modeling becomes easier to apply and easier to compare them in order to find the most appropriate model for the data.
Alcohol and cirrhosis: dose--response or threshold effect?
DEFF Research Database (Denmark)
Kamper-Jørgensen, Mads; Grønbaek, Morten; Tolstrup, Janne
2004-01-01
BACKGROUND/AIMS: General population studies have shown a strong association between alcohol intake and death from alcoholic cirrhosis, but whether this is a dose-response or a threshold effect remains unknown, and the relation among alcohol misusers has not been studied. METHODS: A cohort of 6152...... alcohol misusing men and women aged 15-83 were interviewed about drinking pattern and social issues and followed for 84,257 person-years. Outcome was alcoholic cirrhosis mortality. Data was analyzed by means of Cox-regression models. RESULTS: In this large prospective cohort study of alcohol misusers...... there was a 27 fold increased mortality from alcoholic cirrhosis in men and a 35 fold increased mortality from alcoholic cirrhosis in women compared to the Danish population. Number of drinks per day was not significantly associated with death from alcoholic cirrhosis, since there was no additional risk of death...
The Transmuted Geometric-Weibull distribution: Properties, Characterizations and Regression Models
Directory of Open Access Journals (Sweden)
Zohdy M Nofal
2017-06-01
Full Text Available We propose a new lifetime model called the transmuted geometric-Weibull distribution. Some of its structural properties including ordinary and incomplete moments, quantile and generating functions, probability weighted moments, Rényi and q-entropies and order statistics are derived. The maximum likelihood method is discussed to estimate the model parameters by means of Monte Carlo simulation study. A new location-scale regression model is introduced based on the proposed distribution. The new distribution is applied to two real data sets to illustrate its flexibility. Empirical results indicate that proposed distribution can be alternative model to other lifetime models available in the literature for modeling real data in many areas.
Credit Scoring Problem Based on Regression Analysis
Khassawneh, Bashar Suhil Jad Allah
2014-01-01
ABSTRACT: This thesis provides an explanatory introduction to the regression models of data mining and contains basic definitions of key terms in the linear, multiple and logistic regression models. Meanwhile, the aim of this study is to illustrate fitting models for the credit scoring problem using simple linear, multiple linear and logistic regression models and also to analyze the found model functions by statistical tools. Keywords: Data mining, linear regression, logistic regression....
Conditional Monte Carlo randomization tests for regression models.
Parhat, Parwen; Rosenberger, William F; Diao, Guoqing
2014-08-15
We discuss the computation of randomization tests for clinical trials of two treatments when the primary outcome is based on a regression model. We begin by revisiting the seminal paper of Gail, Tan, and Piantadosi (1988), and then describe a method based on Monte Carlo generation of randomization sequences. The tests based on this Monte Carlo procedure are design based, in that they incorporate the particular randomization procedure used. We discuss permuted block designs, complete randomization, and biased coin designs. We also use a new technique by Plamadeala and Rosenberger (2012) for simple computation of conditional randomization tests. Like Gail, Tan, and Piantadosi, we focus on residuals from generalized linear models and martingale residuals from survival models. Such techniques do not apply to longitudinal data analysis, and we introduce a method for computation of randomization tests based on the predicted rate of change from a generalized linear mixed model when outcomes are longitudinal. We show, by simulation, that these randomization tests preserve the size and power well under model misspecification. Copyright © 2014 John Wiley & Sons, Ltd.
Bonellie, Sandra R
2012-10-01
To illustrate the use of regression and logistic regression models to investigate changes over time in size of babies particularly in relation to social deprivation, age of the mother and smoking. Mean birthweight has been found to be increasing in many countries in recent years, but there are still a group of babies who are born with low birthweights. Population-based retrospective cohort study. Multiple linear regression and logistic regression models are used to analyse data on term 'singleton births' from Scottish hospitals between 1994-2003. Mothers who smoke are shown to give birth to lighter babies on average, a difference of approximately 0.57 Standard deviations lower (95% confidence interval. 0.55-0.58) when adjusted for sex and parity. These mothers are also more likely to have babies that are low birthweight (odds ratio 3.46, 95% confidence interval 3.30-3.63) compared with non-smokers. Low birthweight is 30% more likely where the mother lives in the most deprived areas compared with the least deprived, (odds ratio 1.30, 95% confidence interval 1.21-1.40). Smoking during pregnancy is shown to have a detrimental effect on the size of infants at birth. This effect explains some, though not all, of the observed socioeconomic birthweight. It also explains much of the observed birthweight differences by the age of the mother. Identifying mothers at greater risk of having a low birthweight baby as important implications for the care and advice this group receives. © 2012 Blackwell Publishing Ltd.
Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks
Gray-Davies, Tristan; Holmes, Chris C.; Caron, François
2018-01-01
We present a novel Bayesian nonparametric regression model for covariates X and continuous response variable Y ∈ ℝ. The model is parametrized in terms of marginal distributions for Y and X and a regression function which tunes the stochastic ordering of the conditional distributions F (y|x). By adopting an approximate composite likelihood approach, we show that the resulting posterior inference can be decoupled for the separate components of the model. This procedure can scale to very large datasets and allows for the use of standard, existing, software from Bayesian nonparametric density estimation and Plackett-Luce ranking estimation to be applied. As an illustration, we show an application of our approach to a US Census dataset, with over 1,300,000 data points and more than 100 covariates. PMID:29623150
Recursive Algorithm For Linear Regression
Varanasi, S. V.
1988-01-01
Order of model determined easily. Linear-regression algorithhm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactory.
Chardon, Jérémy; Hingray, Benoit; Favre, Anne-Catherine
2018-01-01
Statistical downscaling models (SDMs) are often used to produce local weather scenarios from large-scale atmospheric information. SDMs include transfer functions which are based on a statistical link identified from observations between local weather and a set of large-scale predictors. As physical processes driving surface weather vary in time, the most relevant predictors and the regression link are likely to vary in time too. This is well known for precipitation for instance and the link is thus often estimated after some seasonal stratification of the data. In this study, we present a two-stage analog/regression model where the regression link is estimated from atmospheric analogs of the current prediction day. Atmospheric analogs are identified from fields of geopotential heights at 1000 and 500 hPa. For the regression stage, two generalized linear models are further used to model the probability of precipitation occurrence and the distribution of non-zero precipitation amounts, respectively. The two-stage model is evaluated for the probabilistic prediction of small-scale precipitation over France. It noticeably improves the skill of the prediction for both precipitation occurrence and amount. As the analog days vary from one prediction day to another, the atmospheric predictors selected in the regression stage and the value of the corresponding regression coefficients can vary from one prediction day to another. The model allows thus for a day-to-day adaptive and tailored downscaling. It can also reveal specific predictors for peculiar and non-frequent weather configurations.
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Cumulative t-link threshold models for the genetic analysis of calving ease scores
Directory of Open Access Journals (Sweden)
Tempelman Robert J
2003-09-01
Full Text Available Abstract In this study, a hierarchical threshold mixed model based on a cumulative t-link specification for the analysis of ordinal data or more, specifically, calving ease scores, was developed. The validation of this model and the Markov chain Monte Carlo (MCMC algorithm was carried out on simulated data from normally and t4 (i.e. a t-distribution with four degrees of freedom distributed populations using the deviance information criterion (DIC and a pseudo Bayes factor (PBF measure to validate recently proposed model choice criteria. The simulation study indicated that although inference on the degrees of freedom parameter is possible, MCMC mixing was problematic. Nevertheless, the DIC and PBF were validated to be satisfactory measures of model fit to data. A sire and maternal grandsire cumulative t-link model was applied to a calving ease dataset from 8847 Italian Piemontese first parity dams. The cumulative t-link model was shown to lead to posterior means of direct and maternal heritabilities (0.40 ± 0.06, 0.11 ± 0.04 and a direct maternal genetic correlation (-0.58 ± 0.15 that were not different from the corresponding posterior means of the heritabilities (0.42 ± 0.07, 0.14 ± 0.04 and the genetic correlation (-0.55 ± 0.14 inferred under the conventional cumulative probit link threshold model. Furthermore, the correlation (> 0.99 between posterior means of sire progeny merit from the two models suggested no meaningful rerankings. Nevertheless, the cumulative t-link model was decisively chosen as the better fitting model for this calving ease data using DIC and PBF.
DEFF Research Database (Denmark)
Tan, Qihua; Bathum, L; Christiansen, L
2003-01-01
In this paper, we apply logistic regression models to measure genetic association with human survival for highly polymorphic and pleiotropic genes. By modelling genotype frequency as a function of age, we introduce a logistic regression model with polytomous responses to handle the polymorphic...... situation. Genotype and allele-based parameterization can be used to investigate the modes of gene action and to reduce the number of parameters, so that the power is increased while the amount of multiple testing minimized. A binomial logistic regression model with fractional polynomials is used to capture...... the age-dependent or antagonistic pleiotropic effects. The models are applied to HFE genotype data to assess the effects on human longevity by different alleles and to detect if an age-dependent effect exists. Application has shown that these methods can serve as useful tools in searching for important...
Delbari, Masoomeh; Sharifazari, Salman; Mohammadi, Ehsan
2018-02-01
The knowledge of soil temperature at different depths is important for agricultural industry and for understanding climate change. The aim of this study is to evaluate the performance of a support vector regression (SVR)-based model in estimating daily soil temperature at 10, 30 and 100 cm depth at different climate conditions over Iran. The obtained results were compared to those obtained from a more classical multiple linear regression (MLR) model. The correlation sensitivity for the input combinations and periodicity effect were also investigated. Climatic data used as inputs to the models were minimum and maximum air temperature, solar radiation, relative humidity, dew point, and the atmospheric pressure (reduced to see level), collected from five synoptic stations Kerman, Ahvaz, Tabriz, Saghez, and Rasht located respectively in the hyper-arid, arid, semi-arid, Mediterranean, and hyper-humid climate conditions. According to the results, the performance of both MLR and SVR models was quite well at surface layer, i.e., 10-cm depth. However, SVR performed better than MLR in estimating soil temperature at deeper layers especially 100 cm depth. Moreover, both models performed better in humid climate condition than arid and hyper-arid areas. Further, adding a periodicity component into the modeling process considerably improved the models' performance especially in the case of SVR.
Interaction of radon and smoking among Czech uranium miners using model of a threshold energy
International Nuclear Information System (INIS)
Boehm, R.; Holy, K.; Sedlak, A.
2014-01-01
Exposure to radon and smoking are among the most important factors influencing the risk of lung cancer. However, the joint effect of radon and smoking has not been sufficiently investigated so far. In this paper we will try to describe by means of a threshold energy model the mechanism of synergic effect of the aforementioned factors, and compare their influence on the risk of lung cancer. The model is based on the assumption that the inactivation of cells is caused by the excess of threshold specific energy z0 in the sensitive volume of the cell. Cigarette smoking causes, among others, an increase in the synthesis of the survivin protein that protects cells from apoptosis and thereby reduces their radiosensitivity. Survivin is therefore responsible for the increase of threshold energy z0, which in turn leads to the increase of lung cancer risk. A linear relationship between the threshold energy and the number of cigarettes smoked was assumed. The effect of smoking on radon exposure was evaluated for various groups of smokers that were defined by the degree of morphometric and geometric changes in the lungs induced by smoking and various degrees of chronic obstructive pulmonary disease. We simulate various scenarios of irradiation - short-term exposure, long-term exposure, as well as various smoking habits - smoker, ex-smoker. The calculated values can be, to an extent, compared to the epidemiological analysis geometric mixture models of Tomasek, who statistically evaluated epidemiological data about lung cancer occurrence among miners working in Jachymov and Pribram mines. From the results it follows that the correlation coefficient was particularly high. Although the approach outlined in this paper is only one of the many that strive to describe in detail the synergic effect of smoking and exposition, the used model can contribute to a more precise estimate of lung cancer risk in areas with various smoking habits. (authors)
Directory of Open Access Journals (Sweden)
Renata Pires Gonçalves
2012-02-01
. The experiments of type dosage x response are very common in the determination of levels of nutrients in optimal food balance and include the use of regression models to achieve this objective. Nevertheless, the regression analysis routine, generally, uses a priori information about a possible relationship between the response variable. The isotonic regression is a method of estimation by least squares that generates estimates which preserves data ordering. In the theory of isotonic regression this information is essential and it is expected to increase fitting efficiency. The objective of this work was to use an isotonic regression methodology, as an alternative way of analyzing data of Zn deposition in tibia of male birds of Hubbard lineage. We considered the models of plateau response of polynomial quadratic and linear exponential forms. In addition to these models, we also proposed the fitting of a logarithmic model to the data and the efficiency of the methodology was evaluated by Monte Carlo simulations, considering different scenarios for the parametric values. The isotonization of the data yielded an improvement in all the fitting quality parameters evaluated. Among the models used, the logarithmic presented estimates of the parameters more consistent with the values reported in literature.
Luque-Fernandez, Miguel Angel; Belot, Aurélien; Quaresma, Manuela; Maringe, Camille; Coleman, Michel P; Rachet, Bernard
2016-10-01
In population-based cancer research, piecewise exponential regression models are used to derive adjusted estimates of excess mortality due to cancer using the Poisson generalized linear modelling framework. However, the assumption that the conditional mean and variance of the rate parameter given the set of covariates x i are equal is strong and may fail to account for overdispersion given the variability of the rate parameter (the variance exceeds the mean). Using an empirical example, we aimed to describe simple methods to test and correct for overdispersion. We used a regression-based score test for overdispersion under the relative survival framework and proposed different approaches to correct for overdispersion including a quasi-likelihood, robust standard errors estimation, negative binomial regression and flexible piecewise modelling. All piecewise exponential regression models showed the presence of significant inherent overdispersion (p-value regression modelling, with either a quasi-likelihood or robust standard errors, was the best approach as it deals with both, overdispersion due to model misspecification and true or inherent overdispersion.
Endogenous glucose production from infancy to adulthood: a non-linear regression model
Huidekoper, Hidde H.; Ackermans, Mariëtte T.; Ruiter, An F. C.; Sauerwein, Hans P.; Wijburg, Frits A.
2014-01-01
To construct a regression model for endogenous glucose production (EGP) as a function of age, and compare this with glucose supplementation using commonly used dextrose-based saline solutions at fluid maintenance rate in children. A model was constructed based on EGP data, as quantified by
Modeling animal-vehicle collisions using diagonal inflated bivariate Poisson regression.
Lao, Yunteng; Wu, Yao-Jan; Corey, Jonathan; Wang, Yinhai
2011-01-01
Two types of animal-vehicle collision (AVC) data are commonly adopted for AVC-related risk analysis research: reported AVC data and carcass removal data. One issue with these two data sets is that they were found to have significant discrepancies by previous studies. In order to model these two types of data together and provide a better understanding of highway AVCs, this study adopts a diagonal inflated bivariate Poisson regression method, an inflated version of bivariate Poisson regression model, to fit the reported AVC and carcass removal data sets collected in Washington State during 2002-2006. The diagonal inflated bivariate Poisson model not only can model paired data with correlation, but also handle under- or over-dispersed data sets as well. Compared with three other types of models, double Poisson, bivariate Poisson, and zero-inflated double Poisson, the diagonal inflated bivariate Poisson model demonstrates its capability of fitting two data sets with remarkable overlapping portions resulting from the same stochastic process. Therefore, the diagonal inflated bivariate Poisson model provides researchers a new approach to investigating AVCs from a different perspective involving the three distribution parameters (λ(1), λ(2) and λ(3)). The modeling results show the impacts of traffic elements, geometric design and geographic characteristics on the occurrences of both reported AVC and carcass removal data. It is found that the increase of some associated factors, such as speed limit, annual average daily traffic, and shoulder width, will increase the numbers of reported AVCs and carcass removals. Conversely, the presence of some geometric factors, such as rolling and mountainous terrain, will decrease the number of reported AVCs. Published by Elsevier Ltd.
Gomes, Marcos José Timbó Lima; Cunto, Flávio; da Silva, Alan Ricardo
2017-09-01
Generalized Linear Models (GLM) with negative binomial distribution for errors, have been widely used to estimate safety at the level of transportation planning. The limited ability of this technique to take spatial effects into account can be overcome through the use of local models from spatial regression techniques, such as Geographically Weighted Poisson Regression (GWPR). Although GWPR is a system that deals with spatial dependency and heterogeneity and has already been used in some road safety studies at the planning level, it fails to account for the possible overdispersion that can be found in the observations on road-traffic crashes. Two approaches were adopted for the Geographically Weighted Negative Binomial Regression (GWNBR) model to allow discrete data to be modeled in a non-stationary form and to take note of the overdispersion of the data: the first examines the constant overdispersion for all the traffic zones and the second includes the variable for each spatial unit. This research conducts a comparative analysis between non-spatial global crash prediction models and spatial local GWPR and GWNBR at the level of traffic zones in Fortaleza/Brazil. A geographic database of 126 traffic zones was compiled from the available data on exposure, network characteristics, socioeconomic factors and land use. The models were calibrated by using the frequency of injury crashes as a dependent variable and the results showed that GWPR and GWNBR achieved a better performance than GLM for the average residuals and likelihood as well as reducing the spatial autocorrelation of the residuals, and the GWNBR model was more able to capture the spatial heterogeneity of the crash frequency. Copyright © 2017 Elsevier Ltd. All rights reserved.
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Regression-based model of skin diffuse reflectance for skin color analysis
Tsumura, Norimichi; Kawazoe, Daisuke; Nakaguchi, Toshiya; Ojima, Nobutoshi; Miyake, Yoichi
2008-11-01
A simple regression-based model of skin diffuse reflectance is developed based on reflectance samples calculated by Monte Carlo simulation of light transport in a two-layered skin model. This reflectance model includes the values of spectral reflectance in the visible spectra for Japanese women. The modified Lambert Beer law holds in the proposed model with a modified mean free path length in non-linear density space. The averaged RMS and maximum errors of the proposed model were 1.1 and 3.1%, respectively, in the above range.
Xia, Li; Shihada, Basem
2014-01-01
This paper studies the joint optimization problem of energy and delay in a multi-hop wireless network. The optimization variables are the transmission rates, which are adjustable according to the packet queueing length in the buffer. The optimization goal is to minimize the energy consumption of energy-critical nodes and the packet transmission delay throughout the network. In this paper, we aim at understanding the well-known decentralized algorithms which are threshold based from a different research angle. By using a simplified network model, we show that we can adopt the semi-open Jackson network model and study this optimization problem in closed form. This simplified network model further allows us to establish some significant optimality properties. We prove that the system performance is monotonic with respect to (w.r.t.) the transmission rate. We also prove that the threshold-type policy is optimal, i.e., when the number of packets in the buffer is larger than a threshold, transmit with the maximal rate (power); otherwise, no transmission. With these optimality properties, we develop a heuristic algorithm to iteratively find the optimal threshold. Finally, we conduct some simulation experiments to demonstrate the main idea of this paper.
Xia, Li
2014-11-20
This paper studies the joint optimization problem of energy and delay in a multi-hop wireless network. The optimization variables are the transmission rates, which are adjustable according to the packet queueing length in the buffer. The optimization goal is to minimize the energy consumption of energy-critical nodes and the packet transmission delay throughout the network. In this paper, we aim at understanding the well-known decentralized algorithms which are threshold based from a different research angle. By using a simplified network model, we show that we can adopt the semi-open Jackson network model and study this optimization problem in closed form. This simplified network model further allows us to establish some significant optimality properties. We prove that the system performance is monotonic with respect to (w.r.t.) the transmission rate. We also prove that the threshold-type policy is optimal, i.e., when the number of packets in the buffer is larger than a threshold, transmit with the maximal rate (power); otherwise, no transmission. With these optimality properties, we develop a heuristic algorithm to iteratively find the optimal threshold. Finally, we conduct some simulation experiments to demonstrate the main idea of this paper.
International Nuclear Information System (INIS)
Fang, Xiande; Xu, Yu
2011-01-01
The empirical model of turbine efficiency is necessary for the control- and/or diagnosis-oriented simulation and useful for the simulation and analysis of dynamic performances of the turbine equipment and systems, such as air cycle refrigeration systems, power plants, turbine engines, and turbochargers. Existing empirical models of turbine efficiency are insufficient because there is no suitable form available for air cycle refrigeration turbines. This work performs a critical review of empirical models (called mean value models in some literature) of turbine efficiency and develops an empirical model in the desired form for air cycle refrigeration, the dominant cooling approach in aircraft environmental control systems. The Taylor series and regression analysis are used to build the model, with the Taylor series being used to expand functions with the polytropic exponent and the regression analysis to finalize the model. The measured data of a turbocharger turbine and two air cycle refrigeration turbines are used for the regression analysis. The proposed model is compact and able to present the turbine efficiency map. Its predictions agree with the measured data very well, with the corrected coefficient of determination R c 2 ≥ 0.96 and the mean absolute percentage deviation = 1.19% for the three turbines. -- Highlights: → Performed a critical review of empirical models of turbine efficiency. → Developed an empirical model in the desired form for air cycle refrigeration, using the Taylor expansion and regression analysis. → Verified the method for developing the empirical model. → Verified the model.
Kirk-lawlor, N. E.; Edwards, E. C.
2012-12-01
In many groundwater systems, the height of the water table must be above certain thresholds for some types of surface flow to exist. Examples of flows that depend on water table elevation include groundwater baseflow to river systems, groundwater flow to wetland systems, and flow to springs. Meeting many of the goals of sustainable water resource management requires maintaining these flows at certain rates. Water resource management decisions invariably involve weighing tradeoffs between different possible usage regimes and the economic consequences of potential management choices are an important factor in these tradeoffs. Policies based on sustainability may have a social cost from forgoing present income. This loss of income may be worth bearing, but should be well understood and carefully considered. Traditionally, the economic theory of groundwater exploitation has relied on the assumption of a single-cell or "bathtub" aquifer model, which offers a simple means to examine complex interactions between water user and hydrologic system behavior. However, such a model assumes a closed system and does not allow for the simulation of groundwater outflows that depend on water table elevation (e.g. baseflow, springs, wetlands), even though those outflows have value. We modify the traditional single-cell aquifer model by allowing for outflows when the water table is above certain threshold elevations. These thresholds behave similarly to holes in a bathtub, where the outflow is a positive function of the height of the water table above the threshold and the outflow is lost when the water table drops below the threshold. We find important economic consequences to this representation of the groundwater system. The economic value of services provided by threshold-dependent outflows (including non-market value), such as ecosystem services, can be incorporated. The value of services provided by these flows may warrant maintaining the water table at higher levels than would
Kuramoto model with uniformly spaced frequencies: Finite-N asymptotics of the locking threshold.
Ottino-Löffler, Bertrand; Strogatz, Steven H
2016-06-01
We study phase locking in the Kuramoto model of coupled oscillators in the special case where the number of oscillators, N, is large but finite, and the oscillators' natural frequencies are evenly spaced on a given interval. In this case, stable phase-locked solutions are known to exist if and only if the frequency interval is narrower than a certain critical width, called the locking threshold. For infinite N, the exact value of the locking threshold was calculated 30 years ago; however, the leading corrections to it for finite N have remained unsolved analytically. Here we derive an asymptotic formula for the locking threshold when N≫1. The leading correction to the infinite-N result scales like either N^{-3/2} or N^{-1}, depending on whether the frequencies are evenly spaced according to a midpoint rule or an end-point rule. These scaling laws agree with numerical results obtained by Pazó [D. Pazó, Phys. Rev. E 72, 046211 (2005)PLEEE81539-375510.1103/PhysRevE.72.046211]. Moreover, our analysis yields the exact prefactors in the scaling laws, which also match the numerics.
Linear regression in astronomy. II
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Bayesian Bandwidth Selection for a Nonparametric Regression Model with Mixed Types of Regressors
Directory of Open Access Journals (Sweden)
Xibin Zhang
2016-04-01
Full Text Available This paper develops a sampling algorithm for bandwidth estimation in a nonparametric regression model with continuous and discrete regressors under an unknown error density. The error density is approximated by the kernel density estimator of the unobserved errors, while the regression function is estimated using the Nadaraya-Watson estimator admitting continuous and discrete regressors. We derive an approximate likelihood and posterior for bandwidth parameters, followed by a sampling algorithm. Simulation results show that the proposed approach typically leads to better accuracy of the resulting estimates than cross-validation, particularly for smaller sample sizes. This bandwidth estimation approach is applied to nonparametric regression model of the Australian All Ordinaries returns and the kernel density estimation of gross domestic product (GDP growth rates among the organisation for economic co-operation and development (OECD and non-OECD countries.
Wibowo, Wahyu; Wene, Chatrien; Budiantara, I. Nyoman; Permatasari, Erma Oktania
2017-03-01
Multiresponse semiparametric regression is simultaneous equation regression model and fusion of parametric and nonparametric model. The regression model comprise several models and each model has two components, parametric and nonparametric. The used model has linear function as parametric and polynomial truncated spline as nonparametric component. The model can handle both linearity and nonlinearity relationship between response and the sets of predictor variables. The aim of this paper is to demonstrate the application of the regression model for modeling of effect of regional socio-economic on use of information technology. More specific, the response variables are percentage of households has access to internet and percentage of households has personal computer. Then, predictor variables are percentage of literacy people, percentage of electrification and percentage of economic growth. Based on identification of the relationship between response and predictor variable, economic growth is treated as nonparametric predictor and the others are parametric predictors. The result shows that the multiresponse semiparametric regression can be applied well as indicate by the high coefficient determination, 90 percent.
Threshold effect under nonlinear limitation of the intensity of high-power light
International Nuclear Information System (INIS)
Tereshchenko, S A; Podgaetskii, V M; Gerasimenko, A Yu; Savel'ev, M S
2015-01-01
A model is proposed to describe the properties of limiters of high-power laser radiation, which takes into account the threshold character of nonlinear interaction of radiation with the working medium of the limiter. The generally accepted non-threshold model is a particular case of the threshold model if the threshold radiation intensity is zero. Experimental z-scan data are used to determine the nonlinear optical characteristics of media with carbon nanotubes, polymethine and pyran dyes, zinc selenide, porphyrin-graphene and fullerene-graphene. A threshold effect of nonlinear interaction between laser radiation and some of investigated working media of limiters is revealed. It is shown that the threshold model more adequately describes experimental z-scan data. (nonlinear optical phenomena)
A hydrologic regression sediment-yield model for two ungaged watershed outlet stations in Africa
International Nuclear Information System (INIS)
Moussa, O.M.; Smith, S.E.; Shrestha, R.L.
1991-01-01
A hydrologic regression sediment-yield model was established to determine the relationship between water discharge and suspended sediment discharge at the Blue Nile and the Atbara River outlet stations during the flood season. The model consisted of two main submodels: (1) a suspended sediment discharge model, which was used to determine suspended sediment discharge for each basin outlet; and (2) a sediment rating model, which related water discharge and suspended sediment discharge for each outlet station. Due to the absence of suspended sediment concentration measurements at or near the outlet stations, a minimum norm solution, which is based on the minimization of the unknowns rather than the residuals, was used to determine the suspended sediment discharges at the stations. In addition, the sediment rating submodel was regressed by using an observation equations procedure. Verification analyses on the model were carried out and the mean percentage errors were found to be +12.59 and -12.39, respectively, for the Blue Nile and Atbara. The hydrologic regression model was found to be most sensitive to the relative weight matrix, moderately sensitive to the mean water discharge ratio, and slightly sensitive to the concentration variation along the River Nile's course
Understanding poisson regression.
Hayat, Matthew J; Higgins, Melinda
2014-04-01
Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.
Cobben, Marleen M P; van Noordwijk, Arie J
2017-10-01
Migration is a widespread phenomenon across the animal kingdom as a response to seasonality in environmental conditions. Partially migratory populations are populations that consist of both migratory and residential individuals. Such populations are very common, yet their stability has long been debated. The inheritance of migratory activity is currently best described by the threshold model of quantitative genetics. The inclusion of such a genetic threshold model for migratory behavior leads to a stable zone in time and space of partially migratory populations under a wide range of demographic parameter values, when assuming stable environmental conditions and unlimited genetic diversity. Migratory species are expected to be particularly sensitive to global warming, as arrival at the breeding grounds might be increasingly mistimed as a result of the uncoupling of long-used cues and actual environmental conditions, with decreasing reproduction as a consequence. Here, we investigate the consequences for migratory behavior and the stability of partially migratory populations under five climate change scenarios and the assumption of a genetic threshold value for migratory behavior in an individual-based model. The results show a spatially and temporally stable zone of partially migratory populations after different lengths of time in all scenarios. In the scenarios in which the species expands its range from a particular set of starting populations, the genetic diversity and location at initialization determine the species' colonization speed across the zone of partial migration and therefore across the entire landscape. Abruptly changing environmental conditions after model initialization never caused a qualitative change in phenotype distributions, or complete extinction. This suggests that climate change-induced shifts in species' ranges as well as changes in survival probabilities and reproductive success can be met with flexibility in migratory behavior at the
Directory of Open Access Journals (Sweden)
J. Chardon
2018-01-01
Full Text Available Statistical downscaling models (SDMs are often used to produce local weather scenarios from large-scale atmospheric information. SDMs include transfer functions which are based on a statistical link identified from observations between local weather and a set of large-scale predictors. As physical processes driving surface weather vary in time, the most relevant predictors and the regression link are likely to vary in time too. This is well known for precipitation for instance and the link is thus often estimated after some seasonal stratification of the data. In this study, we present a two-stage analog/regression model where the regression link is estimated from atmospheric analogs of the current prediction day. Atmospheric analogs are identified from fields of geopotential heights at 1000 and 500 hPa. For the regression stage, two generalized linear models are further used to model the probability of precipitation occurrence and the distribution of non-zero precipitation amounts, respectively. The two-stage model is evaluated for the probabilistic prediction of small-scale precipitation over France. It noticeably improves the skill of the prediction for both precipitation occurrence and amount. As the analog days vary from one prediction day to another, the atmospheric predictors selected in the regression stage and the value of the corresponding regression coefficients can vary from one prediction day to another. The model allows thus for a day-to-day adaptive and tailored downscaling. It can also reveal specific predictors for peculiar and non-frequent weather configurations.
A test of inflated zeros for Poisson regression models.
He, Hua; Zhang, Hui; Ye, Peng; Tang, Wan
2017-01-01
Excessive zeros are common in practice and may cause overdispersion and invalidate inference when fitting Poisson regression models. There is a large body of literature on zero-inflated Poisson models. However, methods for testing whether there are excessive zeros are less well developed. The Vuong test comparing a Poisson and a zero-inflated Poisson model is commonly applied in practice. However, the type I error of the test often deviates seriously from the nominal level, rendering serious doubts on the validity of the test in such applications. In this paper, we develop a new approach for testing inflated zeros under the Poisson model. Unlike the Vuong test for inflated zeros, our method does not require a zero-inflated Poisson model to perform the test. Simulation studies show that when compared with the Vuong test our approach not only better at controlling type I error rate, but also yield more power.
Baryon-antibaryon threshold and ω-baryonium mixing
International Nuclear Information System (INIS)
Gavai, R.V.
1981-01-01
It is shown that in any dual-topological-unitarization model of ω-baryonium (B) mixing at the cylinder level, in which the production of baryon-antibaryon (bb-bar) pairs can take place only above a certain threshold energy, the phenomenologically relevant ω and B trajectories do not mix below bb-bar threshold. However, their couplings to external particles do get modified. The ω-B mixing angle theta/sub omegahyphenB/, which characterizes these coupling modification effects below bb-bar threshold at t = 0, is estimated in some models. These estimates are found to agree reasonably well with the existing phenomenological bound on theta/sub omegahyphenB/
Estimating the Threshold Level of Inflation for Thailand
Jiranyakul, Komain
2017-01-01
Abstract. This paper analyzes the relationship between inflation and economic growth in Thailand using annual dataset during 1990 and 2015. The threshold model is estimated for different levels of threshold inflation rate. The results suggest that the threshold level of inflation above which inflation significantly slow growth is estimated at 3 percent. The negative relationship between inflation and growth is apparent above this threshold level of inflation. In other words, the inflation rat...
Bulcock, J. W.
The problem of model estimation when the data are collinear was examined. Though the ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…
International Nuclear Information System (INIS)
Abdolmaleki, P.; Yarmohammadi, M.; Gity, M.
2004-01-01
Background: We designed an algorithmic model based on regression analysis and a non-algorithmic model based on the Artificial Neural Network. Materials and methods: The ability of these models was compared together in clinical application to differentiate malignant from benign breast tumors in a study group of 161 patient's records. Each patient's record consisted of 6 subjective features extracted from MRI appearance. These findings were enclosed as features extracted for an Artificial Neural Network as well as a logistic regression model to predict biopsy outcome. After both models had been trained perfectly on samples (n=100), the validation samples (n=61) were presented to the trained network as well as the established logistic regression models. Finally, the diagnostic performance of models were compared to the that of the radiologist in terms of sensitivity, specificity and accuracy, using receiver operating characteristic curve analysis. Results: The average out put of the Artificial Neural Network yielded a perfect sensitivity (98%) and high accuracy (90%) similar to that one of an expert radiologist (96% and 92%) while specificity was smaller than that (67%) verses 80%). The output of the logistic regression model using significant features showed improvement in specificity from 60% for the logistic regression model using all features to 93% for the reduced logistic regression model, keeping the accuracy around 90%. Conclusion: Results show that Artificial Neural Network and logistic regression model prove the relationship between extracted morphological features and biopsy results. Using statistically significant variables reduced logistic regression model outperformed of Artificial Neural Network with remarkable specificity while keeping high sensitivity is achieved
Chandler, T L; Pralle, R S; Dórea, J R R; Poock, S E; Oetzel, G R; Fourdraine, R H; White, H M
2018-03-01
Although cowside testing strategies for diagnosing hyperketonemia (HYK) are available, many are labor intensive and costly, and some lack sufficient accuracy. Predicting milk ketone bodies by Fourier transform infrared spectrometry during routine milk sampling may offer a more practical monitoring strategy. The objectives of this study were to (1) develop linear and logistic regression models using all available test-day milk and performance variables for predicting HYK and (2) compare prediction methods (Fourier transform infrared milk ketone bodies, linear regression models, and logistic regression models) to determine which is the most predictive of HYK. Given the data available, a secondary objective was to evaluate differences in test-day milk and performance variables (continuous measurements) between Holsteins and Jerseys and between cows with or without HYK within breed. Blood samples were collected on the same day as milk sampling from 658 Holstein and 468 Jersey cows between 5 and 20 d in milk (DIM). Diagnosis of HYK was at a serum β-hydroxybutyrate (BHB) concentration ≥1.2 mmol/L. Concentrations of milk BHB and acetone were predicted by Fourier transform infrared spectrometry (Foss Analytical, Hillerød, Denmark). Thresholds of milk BHB and acetone were tested for diagnostic accuracy, and logistic models were built from continuous variables to predict HYK in primiparous and multiparous cows within breed. Linear models were constructed from continuous variables for primiparous and multiparous cows within breed that were 5 to 11 DIM or 12 to 20 DIM. Milk ketone body thresholds diagnosed HYK with 64.0 to 92.9% accuracy in Holsteins and 59.1 to 86.6% accuracy in Jerseys. Logistic models predicted HYK with 82.6 to 97.3% accuracy. Internally cross-validated multiple linear regression models diagnosed HYK of Holstein cows with 97.8% accuracy for primiparous and 83.3% accuracy for multiparous cows. Accuracy of Jersey models was 81.3% in primiparous and 83
Shi, Jinfei; Zhu, Songqing; Chen, Ruwen
2017-12-01
An order selection method based on multiple stepwise regressions is proposed for General Expression of Nonlinear Autoregressive model which converts the model order problem into the variable selection of multiple linear regression equation. The partial autocorrelation function is adopted to define the linear term in GNAR model. The result is set as the initial model, and then the nonlinear terms are introduced gradually. Statistics are chosen to study the improvements of both the new introduced and originally existed variables for the model characteristics, which are adopted to determine the model variables to retain or eliminate. So the optimal model is obtained through data fitting effect measurement or significance test. The simulation and classic time-series data experiment results show that the method proposed is simple, reliable and can be applied to practical engineering.
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used in selecting important predictor variables, diagnostics are used to check that assumptions are valid which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographics profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
The impact of public debt on the twin imbalances in Europe: A threshold model
Directory of Open Access Journals (Sweden)
Šuliková Veronika
2017-01-01
Full Text Available Recent empirical research rejecting twin deficits in indebted countries and current account imbalances adjustment in Europe led to the idea to test the twin imbalances at different public debt-to-GDP intervals. The analysis covers 14 EU countries over the time period 1995-2012. A panel data threshold model with fixed effects estimates two debt-to-GDP thresholds (40.2% and 96.6%, which determine three debt-to-GDP intervals in the twin relationship. If public debtto-GDP is less than 40.2%, the model determines a negative relationship (twin divergence between budget balance and current account. Twin deficits (surpluses are confirmed exclusively if debt-to-GDP is in the interval between 40.2% and 96.6%. A twin divergence is also confirmed if public debt-to-GDP is more than 96.6% (e.g., as in Greece and Italy. The results confirm that increased indebtedness in European countries contributed to their current account imbalance adjustment.
Capacitance Regression Modelling Analysis on Latex from Selected Rubber Tree Clones
International Nuclear Information System (INIS)
Rosli, A D; Baharudin, R; Hashim, H; Khairuzzaman, N A; Mohd Sampian, A F; Abdullah, N E; Kamaru'zzaman, M; Sulaiman, M S
2015-01-01
This paper investigates the capacitance regression modelling performance of latex for various rubber tree clones, namely clone 2002, 2008, 2014 and 3001. Conventionally, the rubber tree clones identification are based on observation towards tree features such as shape of leaf, trunk, branching habit and pattern of seeds texture. The former method requires expert persons and very time-consuming. Currently, there is no sensing device based on electrical properties that can be employed to measure different clones from latex samples. Hence, with a hypothesis that the dielectric constant of each clone varies, this paper discusses the development of a capacitance sensor via Capacitance Comparison Bridge (known as capacitance sensor) to measure an output voltage of different latex samples. The proposed sensor is initially tested with 30ml of latex sample prior to gradually addition of dilution water. The output voltage and capacitance obtained from the test are recorded and analyzed using Simple Linear Regression (SLR) model. This work outcome infers that latex clone of 2002 has produced the highest and reliable linear regression line with determination coefficient of 91.24%. In addition, the study also found that the capacitive elements in latex samples deteriorate if it is diluted with higher volume of water. (paper)
Hansen, Anja; Krueger, Alexander; Ripken, Tammo
2013-03-01
In ophthalmic microsurgery tissue dissection is achieved using femtosecond laser pulses to create an optical breakdown. For vitreo-retinal applications the irradiance distribution in the focal volume is distorted by the anterior components of the eye causing a raised threshold energy for breakdown. In this work, an adaptive optics system enables spatial beam shaping for compensation of aberrations and investigation of wave front influence on optical breakdown. An eye model was designed to allow for aberration correction as well as detection of optical breakdown. The eye model consists of an achromatic lens for modeling the eye's refractive power, a water chamber for modeling the tissue properties, and a PTFE sample for modeling the retina's scattering properties. Aberration correction was performed using a deformable mirror in combination with a Hartmann-Shack-sensor. The influence of an adaptive optics aberration correction on the pulse energy required for photodisruption was investigated using transmission measurements for determination of the breakdown threshold and video imaging of the focal region for study of the gas bubble dynamics. The threshold energy is considerably reduced when correcting for the aberrations of the system and the model eye. Also, a raise in irradiance at constant pulse energy was shown for the aberration corrected case. The reduced pulse energy lowers the potential risk of collateral damage which is especially important for retinal safety. This offers new possibilities for vitreo-retinal surgery using femtosecond laser pulses.
Forutan, M; Ansari Mahyari, S; Sargolzaei, M
2015-02-01
Calf and heifer survival are important traits in dairy cattle affecting profitability. This study was carried out to estimate genetic parameters of survival traits in female calves at different age periods, until nearly the first calving. Records of 49,583 female calves born during 1998 and 2009 were considered in five age periods as days 1-30, 31-180, 181-365, 366-760 and full period (day 1-760). Genetic components were estimated based on linear and threshold sire models and linear animal models. The models included both fixed effects (month of birth, dam's parity number, calving ease and twin/single) and random effects (herd-year, genetic effect of sire or animal and residual). Rates of death were 2.21, 3.37, 1.97, 4.14 and 12.4% for the above periods, respectively. Heritability estimates were very low ranging from 0.48 to 3.04, 0.62 to 3.51 and 0.50 to 4.24% for linear sire model, animal model and threshold sire model, respectively. Rank correlations between random effects of sires obtained with linear and threshold sire models and with linear animal and sire models were 0.82-0.95 and 0.61-0.83, respectively. The estimated genetic correlations between the five different periods were moderate and only significant for 31-180 and 181-365 (r(g) = 0.59), 31-180 and 366-760 (r(g) = 0.52), and 181-365 and 366-760 (r(g) = 0.42). The low genetic correlations in current study would suggest that survival at different periods may be affected by the same genes with different expression or by different genes. Even though the additive genetic variations of survival traits were small, it might be possible to improve these traits by traditional or genomic selection. © 2014 Blackwell Verlag GmbH.
Salmerón, Diego; Cano, Juan A; Chirlaque, María D
2015-08-30
In cohort studies, binary outcomes are very often analyzed by logistic regression. However, it is well known that when the goal is to estimate a risk ratio, the logistic regression is inappropriate if the outcome is common. In these cases, a log-binomial regression model is preferable. On the other hand, the estimation of the regression coefficients of the log-binomial model is difficult owing to the constraints that must be imposed on these coefficients. Bayesian methods allow a straightforward approach for log-binomial regression models and produce smaller mean squared errors in the estimation of risk ratios than the frequentist methods, and the posterior inferences can be obtained using the software WinBUGS. However, Markov chain Monte Carlo methods implemented in WinBUGS can lead to large Monte Carlo errors in the approximations to the posterior inferences because they produce correlated simulations, and the accuracy of the approximations are inversely related to this correlation. To reduce correlation and to improve accuracy, we propose a reparameterization based on a Poisson model and a sampling algorithm coded in R. Copyright © 2015 John Wiley & Sons, Ltd.
Koeneman, Margot M; van Lint, Freyja H M; van Kuijk, Sander M J; Smits, Luc J M; Kooreman, Loes F S; Kruitwagen, Roy F P M; Kruse, Arnold J
2017-01-01
This study aims to develop a prediction model for spontaneous regression of cervical intraepithelial neoplasia grade 2 (CIN 2) lesions based on simple clinicopathological parameters. The study was conducted at Maastricht University Medical Center, the Netherlands. The prediction model was developed in a retrospective cohort of 129 women with a histologic diagnosis of CIN 2 who were managed by watchful waiting for 6 to 24months. Five potential predictors for spontaneous regression were selected based on the literature and expert opinion and were analyzed in a multivariable logistic regression model, followed by backward stepwise deletion based on the Wald test. The prediction model was internally validated by the bootstrapping method. Discriminative capacity and accuracy were tested by assessing the area under the receiver operating characteristic curve (AUC) and a calibration plot. Disease regression within 24months was seen in 91 (71%) of 129 patients. A prediction model was developed including the following variables: smoking, Papanicolaou test outcome before the CIN 2 diagnosis, concomitant CIN 1 diagnosis in the same biopsy, and more than 1 biopsy containing CIN 2. Not smoking, Papanicolaou class predictive of disease regression. The AUC was 69.2% (95% confidence interval, 58.5%-79.9%), indicating a moderate discriminative ability of the model. The calibration plot indicated good calibration of the predicted probabilities. This prediction model for spontaneous regression of CIN 2 may aid physicians in the personalized management of these lesions. Copyright © 2016 Elsevier Inc. All rights reserved.
DEFF Research Database (Denmark)
Carstensen, Bendix
1996-01-01
This paper shows how to fit excess and relative risk regression models to interval censored survival data, and how to implement the models in standard statistical software. The methods developed are used for the analysis of HIV infection rates in a cohort of Danish homosexual men.......This paper shows how to fit excess and relative risk regression models to interval censored survival data, and how to implement the models in standard statistical software. The methods developed are used for the analysis of HIV infection rates in a cohort of Danish homosexual men....
Modeling vector nonlinear time series using POLYMARS
de Gooijer, J.G.; Ray, B.K.
2003-01-01
A modified multivariate adaptive regression splines method for modeling vector nonlinear time series is investigated. The method results in models that can capture certain types of vector self-exciting threshold autoregressive behavior, as well as provide good predictions for more general vector
Hierarchical regression analysis in structural Equation Modeling
de Jong, P.F.
1999-01-01
In a hierarchical or fixed-order regression analysis, the independent variables are entered into the regression equation in a prespecified order. Such an analysis is often performed when the extra amount of variance accounted for in a dependent variable by a specific independent variable is the main
Sornette, Didier
1993-05-01
A mean-field (MF) model of the critical behavior of charge-density waves below the threshold for sliding is proposed, which replaces the combined effect of the pinning force and of the forces exerted by the neighbors on a given particle n by an effective force threshold Xn. It allows one to rationalize the numerical results of Middleton and Fisher [Phys. Rev. Lett. 66 (1991) 92] on the divergence of the polarization and of the largest correlation length and of Pla and Nori [Phys. Rev. Lett. 67 (1991) 919] on the distribution D( d) of sliding bursts of size d, measured in narrow intervals of driving fields E at a finite distance below the threshold Ec.
Soft Sensor Modeling Based on Multiple Gaussian Process Regression and Fuzzy C-mean Clustering
Directory of Open Access Journals (Sweden)
Xianglin ZHU
2014-06-01
Full Text Available In order to overcome the difficulties of online measurement of some crucial biochemical variables in fermentation processes, a new soft sensor modeling method is presented based on the Gaussian process regression and fuzzy C-mean clustering. With the consideration that the typical fermentation process can be distributed into 4 phases including lag phase, exponential growth phase, stable phase and dead phase, the training samples are classified into 4 subcategories by using fuzzy C- mean clustering algorithm. For each sub-category, the samples are trained using the Gaussian process regression and the corresponding soft-sensing sub-model is established respectively. For a new sample, the membership between this sample and sub-models are computed based on the Euclidean distance, and then the prediction output of soft sensor is obtained using the weighting sum. Taking the Lysine fermentation as example, the simulation and experiment are carried out and the corresponding results show that the presented method achieves better fitting and generalization ability than radial basis function neutral network and single Gaussian process regression model.
An evaluation of bias in propensity score-adjusted non-linear regression models.
Wan, Fei; Mitra, Nandita
2018-03-01
Propensity score methods are commonly used to adjust for observed confounding when estimating the conditional treatment effect in observational studies. One popular method, covariate adjustment of the propensity score in a regression model, has been empirically shown to be biased in non-linear models. However, no compelling underlying theoretical reason has been presented. We propose a new framework to investigate bias and consistency of propensity score-adjusted treatment effects in non-linear models that uses a simple geometric approach to forge a link between the consistency of the propensity score estimator and the collapsibility of non-linear models. Under this framework, we demonstrate that adjustment of the propensity score in an outcome model results in the decomposition of observed covariates into the propensity score and a remainder term. Omission of this remainder term from a non-collapsible regression model leads to biased estimates of the conditional odds ratio and conditional hazard ratio, but not for the conditional rate ratio. We further show, via simulation studies, that the bias in these propensity score-adjusted estimators increases with larger treatment effect size, larger covariate effects, and increasing dissimilarity between the coefficients of the covariates in the treatment model versus the outcome model.
Regional Seismic Threshold Monitoring
National Research Council Canada - National Science Library
Kvaerna, Tormod
2006-01-01
... model to be used for predicting the travel times of regional phases. We have applied these attenuation relations to develop and assess a regional threshold monitoring scheme for selected subregions of the European Arctic...
Modeling the number of car theft using Poisson regression
Zulkifli, Malina; Ling, Agnes Beh Yen; Kasim, Maznah Mat; Ismail, Noriszura
2016-10-01
Regression analysis is the most popular statistical methods used to express the relationship between the variables of response with the covariates. The aim of this paper is to evaluate the factors that influence the number of car theft using Poisson regression model. This paper will focus on the number of car thefts that occurred in districts in Peninsular Malaysia. There are two groups of factor that have been considered, namely district descriptive factors and socio and demographic factors. The result of the study showed that Bumiputera composition, Chinese composition, Other ethnic composition, foreign migration, number of residence with the age between 25 to 64, number of employed person and number of unemployed person are the most influence factors that affect the car theft cases. These information are very useful for the law enforcement department, insurance company and car owners in order to reduce and limiting the car theft cases in Peninsular Malaysia.
Support vector regression model based predictive control of water level of U-tube steam generators
Energy Technology Data Exchange (ETDEWEB)
Kavaklioglu, Kadir, E-mail: kadir.kavaklioglu@pau.edu.tr
2014-10-15
Highlights: • Water level of U-tube steam generators was controlled in a model predictive fashion. • Models for steam generator water level were built using support vector regression. • Cost function minimization for future optimal controls was performed by using the steepest descent method. • The results indicated the feasibility of the proposed method. - Abstract: A predictive control algorithm using support vector regression based models was proposed for controlling the water level of U-tube steam generators of pressurized water reactors. Steam generator data were obtained using a transfer function model of U-tube steam generators. Support vector regression based models were built using a time series type model structure for five different operating powers. Feedwater flow controls were calculated by minimizing a cost function that includes the level error, the feedwater change and the mismatch between feedwater and steam flow rates. Proposed algorithm was applied for a scenario consisting of a level setpoint change and a steam flow disturbance. The results showed that steam generator level can be controlled at all powers effectively by the proposed method.
Oil and gas pipeline construction cost analysis and developing regression models for cost estimation
Thaduri, Ravi Kiran
In this study, cost data for 180 pipelines and 136 compressor stations have been analyzed. On the basis of the distribution analysis, regression models have been developed. Material, Labor, ROW and miscellaneous costs make up the total cost of a pipeline construction. The pipelines are analyzed based on different pipeline lengths, diameter, location, pipeline volume and year of completion. In a pipeline construction, labor costs dominate the total costs with a share of about 40%. Multiple non-linear regression models are developed to estimate the component costs of pipelines for various cross-sectional areas, lengths and locations. The Compressor stations are analyzed based on the capacity, year of completion and location. Unlike the pipeline costs, material costs dominate the total costs in the construction of compressor station, with an average share of about 50.6%. Land costs have very little influence on the total costs. Similar regression models are developed to estimate the component costs of compressor station for various capacities and locations.
Penfield, Randall D.; Myers, Nicholas D.; Wolfe, Edward W.
2008-01-01
Measurement invariance in the partial credit model (PCM) can be conceptualized in several different but compatible ways. In this article the authors distinguish between three forms of measurement invariance in the PCM: step invariance, item invariance, and threshold invariance. Approaches for modeling these three forms of invariance are proposed,…
Dynamic Regression Intervention Modeling for the Malaysian Daily Load
Directory of Open Access Journals (Sweden)
Fadhilah Abdrazak
2014-05-01
Full Text Available Malaysia is a unique country due to having both fixed and moving holidays. These moving holidays may overlap with other fixed holidays and therefore, increase the complexity of the load forecasting activities. The errors due to holidays’ effects in the load forecasting are known to be higher than other factors. If these effects can be estimated and removed, the behavior of the series could be better viewed. Thus, the aim of this paper is to improve the forecasting errors by using a dynamic regression model with intervention analysis. Based on the linear transfer function method, a daily load model consists of either peak or average is developed. The developed model outperformed the seasonal ARIMA model in estimating the fixed and moving holidays’ effects and achieved a smaller Mean Absolute Percentage Error (MAPE in load forecast.
Li, Tao
2018-06-01
The complexity of aluminum electrolysis process leads the temperature for aluminum reduction cells hard to measure directly. However, temperature is the control center of aluminum production. To solve this problem, combining some aluminum plant's practice data, this paper presents a Soft-sensing model of temperature for aluminum electrolysis process on Improved Twin Support Vector Regression (ITSVR). ITSVR eliminates the slow learning speed of Support Vector Regression (SVR) and the over-fit risk of Twin Support Vector Regression (TSVR) by introducing a regularization term into the objective function of TSVR, which ensures the structural risk minimization principle and lower computational complexity. Finally, the model with some other parameters as auxiliary variable, predicts the temperature by ITSVR. The simulation result shows Soft-sensing model based on ITSVR has short time-consuming and better generalization.
CVTresh: R Package for Level-Dependent Cross-Validation Thresholding
Directory of Open Access Journals (Sweden)
Donghoh Kim
2006-04-01
Full Text Available The core of the wavelet approach to nonparametric regression is thresholding of wavelet coefficients. This paper reviews a cross-validation method for the selection of the thresholding value in wavelet shrinkage of Oh, Kim, and Lee (2006, and introduces the R package CVThresh implementing details of the calculations for the procedures. This procedure is implemented by coupling a conventional cross-validation with a fast imputation method, so that it overcomes a limitation of data length, a power of 2. It can be easily applied to the classical leave-one-out cross-validation and K-fold cross-validation. Since the procedure is computationally fast, a level-dependent cross-validation can be developed for wavelet shrinkage of data with various sparseness according to levels.
CVTresh: R Package for Level-Dependent Cross-Validation Thresholding
Directory of Open Access Journals (Sweden)
Donghoh Kim
2006-04-01
Full Text Available The core of the wavelet approach to nonparametric regression is thresholding of wavelet coefficients. This paper reviews a cross-validation method for the selection of the thresholding value in wavelet shrinkage of Oh, Kim, and Lee (2006, and introduces the R package CVThresh implementing details of the calculations for the procedures.This procedure is implemented by coupling a conventional cross-validation with a fast imputation method, so that it overcomes a limitation of data length, a power of 2. It can be easily applied to the classical leave-one-out cross-validation and K-fold cross-validation. Since the procedure is computationally fast, a level-dependent cross-validation can be developed for wavelet shrinkage of data with various sparseness according to levels.
Saramekala, Gopi Krishna; Tiwari, Pramod Kumar
2016-10-01
This paper presents an analytical threshold voltage model for back-gated fully depleted (FD), recessed-source drain silicon-on-insulator metal-oxide-semiconductor field-effect transistors (MOSFETs). Analytical surface potential models have been developed at front and back surfaces of the channel by solving the two-dimensional (2-D) Poisson's equation in the channel region with appropriate boundary conditions assuming a parabolic potential profile in the transverse direction of the channel. The strong inversion criterion is applied to the front surface potential as well as on the back one in order to find two separate threshold voltages for front and back channels of the device, respectively. The device threshold voltage has been assumed to be associated with the surface that offers a lower threshold voltage. The developed model was analyzed extensively for a variety of device geometry parameters like the oxide and silicon channel thicknesses, the thickness of the source/drain extension in the buried oxide, and the applied bias voltages with back-gate control. The proposed model has been validated by comparing the analytical results with numerical simulation data obtained from ATLAS™, a 2-D device simulator from SILVACO.
Loch, R.A.; Sobierajski, R.; Louis, Eric; Bosgra, J.; Bosgra, J.; Bijkerk, Frederik
2012-01-01
The single shot damage thresholds of multilayer optics for highintensity short-wavelength radiation sources are theoretically investigated, using a model developed on the basis of experimental data obtained at the FLASH and LCLS free electron lasers. We compare the radiation hardness of commonly
An Efficient Code-Based Threshold Ring Signature Scheme with a Leader-Participant Model
Directory of Open Access Journals (Sweden)
Guomin Zhou
2017-01-01
Full Text Available Digital signature schemes with additional properties have broad applications, such as in protecting the identity of signers allowing a signer to anonymously sign a message in a group of signers (also known as a ring. While these number-theoretic problems are still secure at the time of this research, the situation could change with advances in quantum computing. There is a pressing need to design PKC schemes that are secure against quantum attacks. In this paper, we propose a novel code-based threshold ring signature scheme with a leader-participant model. A leader is appointed, who chooses some shared parameters for other signers to participate in the signing process. This leader-participant model enhances the performance because every participant including the leader could execute the decoding algorithm (as a part of signing process upon receiving the shared parameters from the leader. The time complexity of our scheme is close to Courtois et al.’s (2001 scheme. The latter is often used as a basis to construct other types of code-based signature schemes. Moreover, as a threshold ring signature scheme, our scheme is as efficient as the normal code-based ring signature.
History-Based Response Threshold Model for Division of Labor in Multi-Agent Systems
Lee, Wonki; Kim, DaeEun
2017-01-01
Dynamic task allocation is a necessity in a group of robots. Each member should decide its own task such that it is most commensurate with its current state in the overall system. In this work, the response threshold model is applied to a dynamic foraging task. Each robot employs a task switching function based on the local task demand obtained from the surrounding environment, and no communication occurs between the robots. Each individual member has a constant-sized task demand history that reflects the global demand. In addition, it has response threshold values for all of the tasks and manages the task switching process depending on the stimuli of the task demands. The robot then determines the task to be executed to regulate the overall division of labor. This task selection induces a specialized tendency for performing a specific task and regulates the division of labor. In particular, maintaining a history of the task demands is very effective for the dynamic foraging task. Various experiments are performed using a simulation with multiple robots, and the results show that the proposed algorithm is more effective as compared to the conventional model. PMID:28555031
Climate Impacts on Chinese Corn Yields: A Fractional Polynomial Regression Model
Kooten, van G.C.; Sun, Baojing
2012-01-01
In this study, we examine the effect of climate on corn yields in northern China using data from ten districts in Inner Mongolia and two in Shaanxi province. A regression model with a flexible functional form is specified, with explanatory variables that include seasonal growing degree days,
Gurnani, Ashita S; John, Samantha E; Gavett, Brandon E
2015-05-01
The current study developed regression-based normative adjustments for a bi-factor model of the The Brief Test of Adult Cognition by Telephone (BTACT). Archival data from the Midlife Development in the United States-II Cognitive Project were used to develop eight separate linear regression models that predicted bi-factor BTACT scores, accounting for age, education, gender, and occupation-alone and in various combinations. All regression models provided statistically significant fit to the data. A three-predictor regression model fit best and accounted for 32.8% of the variance in the global bi-factor BTACT score. The fit of the regression models was not improved by gender. Eight different regression models are presented to allow the user flexibility in applying demographic corrections to the bi-factor BTACT scores. Occupation corrections, while not widely used, may provide useful demographic adjustments for adult populations or for those individuals who have attained an occupational status not commensurate with expected educational attainment. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Modeling the residual effects and threshold saturation of training: a case study of Olympic swimmers
Hellard, Philippe; Avalos, Marta; Millet, Grégoire; Lacoste, Lucien; Barale, Frédéric; Chatard, Jean-Claude
2005-01-01
The aim of this study was to model the residual effects of training on the swimming performance and to compare a model including threshold saturation (MM) to the Banister model (BM). Seven Olympic swimmers were studied over a period of 4 ± 2 years. For three training loads (low-intensity wLIT, high-intensity wHIT and strength training wST), three residual training effects were determined: short-term (STE) during the taper phase, i.e. three weeks before the performance (weeks 0, −1, −2), intermediate-term (ITE) during the intensity phase (weeks −3, −4 and −5) and long-term (LTE) during the volume phase (weeks −6, −7, −8). ITE and LTE were positive for wHIT and wLIT, respectively (P < 0.05). wLIT during taper was related to performances by a parabolic relationship (P < 0.05). Different quality measures indicated that MM compares favorably with BM. Identifying individual training thresholds may help individualizing the distribution of training loads. PMID:15705048
Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan
2011-11-01
To Eexplore the application of negative binomial regression and modified Poisson regression analysis in analyzing the influential factors for injury frequency and the risk factors leading to the increase of injury frequency. 2917 primary and secondary school students were selected from Hefei by cluster random sampling method and surveyed by questionnaire. The data on the count event-based injuries used to fitted modified Poisson regression and negative binomial regression model. The risk factors incurring the increase of unintentional injury frequency for juvenile students was explored, so as to probe the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model existed over-dispersion (P Poisson regression and negative binomial regression model, was fitted better. respectively. Both showed that male gender, younger age, father working outside of the hometown, the level of the guardian being above junior high school and smoking might be the results of higher injury frequencies. On a tendency of clustered frequency data on injury event, both the modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better and this model could give a more accurate interpretation of relevant factors affecting the frequency of injury.
Suzuki, Makoto; Sugimura, Yuko; Yamada, Sumio; Omori, Yoshitsugu; Miyamoto, Masaaki; Yamamoto, Jun-ichi
2013-01-01
Cognitive disorders in the acute stage of stroke are common and are important independent predictors of adverse outcome in the long term. Despite the impact of cognitive disorders on both patients and their families, it is still difficult to predict the extent or duration of cognitive impairments. The objective of the present study was, therefore, to provide data on predicting the recovery of cognitive function soon after stroke by differential modeling with logarithmic and linear regression. This study included two rounds of data collection comprising 57 stroke patients enrolled in the first round for the purpose of identifying the time course of cognitive recovery in the early-phase group data, and 43 stroke patients in the second round for the purpose of ensuring that the correlation of the early-phase group data applied to the prediction of each individual's degree of cognitive recovery. In the first round, Mini-Mental State Examination (MMSE) scores were assessed 3 times during hospitalization, and the scores were regressed on the logarithm and linear of time. In the second round, calculations of MMSE scores were made for the first two scoring times after admission to tailor the structures of logarithmic and linear regression formulae to fit an individual's degree of functional recovery. The time course of early-phase recovery for cognitive functions resembled both logarithmic and linear functions. However, MMSE scores sampled at two baseline points based on logarithmic regression modeling could estimate prediction of cognitive recovery more accurately than could linear regression modeling (logarithmic modeling, R(2) = 0.676, PLogarithmic modeling based on MMSE scores could accurately predict the recovery of cognitive function soon after the occurrence of stroke. This logarithmic modeling with mathematical procedures is simple enough to be adopted in daily clinical practice.
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
International Nuclear Information System (INIS)
Shin, Ho Cheol; Park, Moon Ghu; You, Skin
2006-01-01
Recently, many on-line approaches to instrument channel surveillance (drift monitoring and fault detection) have been reported worldwide. On-line monitoring (OLM) method evaluates instrument channel performance by assessing its consistency with other plant indications through parametric or non-parametric models. The heart of an OLM system is the model giving an estimate of the true process parameter value against individual measurements. This model gives process parameter estimate calculated as a function of other plant measurements which can be used to identify small sensor drifts that would require the sensor to be manually calibrated or replaced. This paper describes an improvement of auto associative kernel regression (AAKR) by introducing a correlation coefficient weighting on kernel distances. The prediction performance of the developed method is compared with conventional auto-associative kernel regression
Scott, B R; Potter, C A
2014-07-01
Whole-body exposure to large radiation doses can cause severe loss of hematopoietic tissue cells and threaten life if the lost cells are not replaced in a timely manner through natural repopulation (a homeostatic mechanism). Repopulation to the baseline level N 0 is called reconstitution and a reconstitution deficit (repopulation shortfall) can occur in a dose-related and organ-specific manner. Scott et al. (2013) previously introduced a deterministic version of a threshold exponential (TE) model of tissue-reconstitution deficit at a given follow-up time that was applied to bone marrow and spleen cellularity (number of constituent cells) data obtained 6 weeks after whole-body gamma-ray exposure of female C.B-17 mice. In this paper a more realistic, stochastic version of the TE model is provided that allows radiation response to vary between different individuals. The Stochastic TE model is applied to post gamma-ray-exposure cellularity data previously reported and also to more limited X-ray cellularity data for whole-body irradiated female C.B-17 mice. Results indicate that the population average threshold for a tissue reconstitution deficit appears to be similar for bone marrow and spleen and for 320-kV-spectrum X-rays and Cs-137 gamma rays. This means that 320-kV spectrum X-rays could successfully be used in conducting such studies.
Directory of Open Access Journals (Sweden)
Menon Carlo
2011-09-01
Full Text Available Abstract Background Several regression models have been proposed for estimation of isometric joint torque using surface electromyography (SEMG signals. Common issues related to torque estimation models are degradation of model accuracy with passage of time, electrode displacement, and alteration of limb posture. This work compares the performance of the most commonly used regression models under these circumstances, in order to assist researchers with identifying the most appropriate model for a specific biomedical application. Methods Eleven healthy volunteers participated in this study. A custom-built rig, equipped with a torque sensor, was used to measure isometric torque as each volunteer flexed and extended his wrist. SEMG signals from eight forearm muscles, in addition to wrist joint torque data were gathered during the experiment. Additional data were gathered one hour and twenty-four hours following the completion of the first data gathering session, for the purpose of evaluating the effects of passage of time and electrode displacement on accuracy of models. Acquired SEMG signals were filtered, rectified, normalized and then fed to models for training. Results It was shown that mean adjusted coefficient of determination (Ra2 values decrease between 20%-35% for different models after one hour while altering arm posture decreased mean Ra2 values between 64% to 74% for different models. Conclusions Model estimation accuracy drops significantly with passage of time, electrode displacement, and alteration of limb posture. Therefore model retraining is crucial for preserving estimation accuracy. Data resampling can significantly reduce model training time without losing estimation accuracy. Among the models compared, ordinary least squares linear regression model (OLS was shown to have high isometric torque estimation accuracy combined with very short training times.
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other m