Liu, Xian; Engel, Charles C
2012-12-20
Researchers often encounter longitudinal health data characterized by three or more ordinal or nominal categories. Random-effects multinomial logit models are generally applied to account for potential lack of independence inherent in such clustered data. When parameter estimates are used to describe longitudinal processes, however, random effects, both between and within individuals, need to be retransformed for correctly predicting outcome probabilities. This study attempts to go beyond existing work by developing a retransformation method that derives longitudinal growth trajectories of unbiased health probabilities. We estimated variances of the predicted probabilities by using the delta method. Additionally, we transformed the covariates' regression coefficients on the multinomial logit function, which are not substantively meaningful in themselves, to the conditional effects on the predicted probabilities. The empirical illustration uses the longitudinal data from the Asset and Health Dynamics among the Oldest Old. Our analysis compared three sets of the predicted probabilities of three health states at six time points, obtained from, respectively, the retransformation method, the best linear unbiased prediction, and the fixed-effects approach. The results demonstrate that neglect of retransforming random errors in the random-effects multinomial logit model results in severely biased longitudinal trajectories of health probabilities as well as overestimated effects of covariates on the probabilities. Copyright © 2012 John Wiley & Sons, Ltd.
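Why ignoring random effects biases predicted probabilities can be seen in a small numerical sketch. It is not the authors' retransformation method. It assumes, purely for illustration, three categories with a reference category, hypothetical fixed linear predictors (1.0 and 0.5), and a single normal random effect with standard deviation 2.0 added to both non-reference logits. The averaging over the random effect uses a simple grid rather than any particular software routine.

```python
import math


def softmax_with_reference(etas):
    """Probabilities for a reference category (logit 0) plus one per entry in etas."""
    exps = [1.0] + [math.exp(e) for e in etas]
    total = sum(exps)
    return [v / total for v in exps]


def marginal_probabilities(etas, sigma, n_grid=2001, width=8.0):
    """Average the category probabilities over u ~ N(0, sigma^2) on a uniform grid.

    The same random effect u is added to every non-reference logit
    (an illustrative assumption, not taken from the abstract).
    """
    step = 2.0 * width * sigma / (n_grid - 1)
    sums = [0.0] * (len(etas) + 1)
    weight_sum = 0.0
    for i in range(n_grid):
        u = -width * sigma + i * step
        w = math.exp(-0.5 * (u / sigma) ** 2)  # unnormalised normal density
        weight_sum += w
        p = softmax_with_reference([e + u for e in etas])
        sums = [acc + w * pj for acc, pj in zip(sums, p)]
    return [acc / weight_sum for acc in sums]


# Hypothetical linear predictors and random-effect standard deviation.
fixed_only = softmax_with_reference([1.0, 0.5])       # random effect set to zero
averaged = marginal_probabilities([1.0, 0.5], sigma=2.0)
```

Comparing `fixed_only` with `averaged` shows that plugging in a zero random effect does not reproduce the population-averaged probabilities, which is the kind of discrepancy the abstract describes.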
Interpreting Results from the Multinomial Logit Model
DEFF Research Database (Denmark)
Wulff, Jesper
2015-01-01
This article provides guidelines and illustrates practical steps necessary for an analysis of results from the multinomial logit model (MLM). The MLM is a popular model in the strategy literature because it allows researchers to examine strategic choices with multiple outcomes. However, there seem to be systematic issues with regard to how researchers interpret their results when using the MLM. In this study, I present a set of guidelines critical to analyzing and interpreting results from the MLM. The procedure involves intuitive graphical representations of predicted probabilities and marginal effects suitable for both interpretation and communication of results. The practical steps are illustrated through an application of the MLM to the choice of foreign market entry mode.
and Multinomial Logistic Regression
African Journals Online (AJOL)
This work presented the results of an experimental comparison of two models: Multinomial Logistic Regression (MLR) and Artificial Neural Network (ANN) for classifying students based on their academic performance. The predictive accuracy for each model was measured by their average Classification Correct Rate (CCR).
Interpreting Marginal Effects in the Multinomial Logit Model
DEFF Research Database (Denmark)
Wulff, Jesper
2014-01-01
This paper presents the challenges when researchers interpret results about relationships between variables from discrete choice models with multiple outcomes. The recommended approach is demonstrated by testing predictions from transaction cost theory on a sample of 246 Scandinavian firms that have entered foreign markets. Through the application of a multinomial logit model, careful analysis of the marginal effects is performed through graphical representations, marginal effects at the mean, average marginal effects and elasticities. I show that increasing cultural distance is associated with a substantial increase in the probability of entering a foreign market using a joint venture, while increases in the unpredictability in the host country environment are associated with a lower probability of wholly owned subsidiaries and a higher probability of exporting entries.
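Marginal effects in a multinomial logit model can be sketched in a few lines. The code below is a minimal illustration and not taken from the paper. It uses the standard result that, for a continuous covariate k, the effect on the probability of category j is P_j times (beta_jk minus the probability-weighted average of the coefficients on k), with the reference category's coefficients fixed at zero. The coefficient values and data rows are hypothetical.

```python
import math


def mnl_probabilities(x, betas):
    """Category probabilities; betas holds one coefficient vector per non-reference category."""
    etas = [sum(b * xi for b, xi in zip(beta, x)) for beta in betas]
    exps = [1.0] + [math.exp(e) for e in etas]  # reference category has logit 0
    total = sum(exps)
    return [v / total for v in exps]


def marginal_effects(x, betas, k):
    """dP_j/dx_k = P_j * (beta_jk - sum_m P_m * beta_mk) for every category j."""
    probs = mnl_probabilities(x, betas)
    coef_k = [0.0] + [beta[k] for beta in betas]
    weighted_avg = sum(p * c for p, c in zip(probs, coef_k))
    return [p * (c - weighted_avg) for p, c in zip(probs, coef_k)]


def average_marginal_effects(rows, betas, k):
    """Average the observation-level marginal effects over all rows."""
    sums = [0.0] * (len(betas) + 1)
    for x in rows:
        sums = [s + m for s, m in zip(sums, marginal_effects(x, betas, k))]
    return [s / len(rows) for s in sums]


# Hypothetical coefficients: each vector is (intercept, slope on covariate 1).
betas = [[0.2, 0.8], [-0.5, 1.5]]
rows = [[1.0, 0.3], [1.0, 1.2], [1.0, -0.7]]  # made-up observations

me_at_first_row = marginal_effects(rows[0], betas, k=1)
ame = average_marginal_effects(rows, betas, k=1)
```

Because the effects are changes in probabilities that must still sum to one, they sum to zero across categories, and the sign of an effect can differ from the sign of the corresponding coefficient.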
Directory of Open Access Journals (Sweden)
Dilek ALTAŞ
2013-05-01
Watching commercials depends on the choice of the viewer. Most television viewing takes place during “Prime-Time”; unfortunately, many viewers opt to zap to other channels when commercials start. The television viewers’ demographic characteristics may indicate the likely zapping frequency. Analysis using the Multinomial Logit Model indicates how strongly the demographic variables affect the watching rate of the first minute of television commercials.
Multinomial logistic regression in workers' health
Grilo, Luís M.; Grilo, Helena L.; Gonçalves, Sónia P.; Junça, Ana
2017-11-01
In European countries, namely in Portugal, it is common to hear some people mentioning that they are exposed to excessive and continuous psychosocial stressors at work. This is increasing in diverse activity sectors, such as the Services sector. A representative sample was collected from a Portuguese Services organization by applying an internationally validated survey, whose variables were measured in five ordered categories on a Likert-type scale. A multinomial logistic regression model is used to estimate the probability of each category of the dependent variable, general health perception, where, among other independent variables, burnout appears as statistically significant.
The consumer’s choice among television displays: A multinomial logit approach
Directory of Open Access Journals (Sweden)
Carlos Giovanni González Espitia
2013-07-01
The consumer’s choice over a bundle of products depends on observable and unobservable characteristics of goods and consumers. This choice is made in order to maximize utility subject to a budget constraint. At the same time, firms make product differentiation decisions to maximize profit. Quality is a form of differentiation. An example of this occurs in the TV market, where several displays are developed. Our objective is to determine the probability for a consumer of choosing a type of display from among five kinds: standard tube, LCD, plasma, projection and LED. Using a multinomial logit approach, we find that electronic appliances like DVDs and audio systems, as well as socioeconomic status, increase the probability of choosing a high-tech television display. Our empirical approximation contributes to further understanding rational consumer behavior through the theory of utility maximization and highlights the importance of studying market structure and analyzing changes in welfare and efficiency.
Analysis of the liquidity risk in credit unions: a logit multinomial approach
Directory of Open Access Journals (Sweden)
Rosiane Maria Lima Gonçalves
2008-10-01
Liquidity risk in financial institutions is associated with the balance between working capital and financial demands. Other factors that affect credit union liquidity are an unanticipated increase in withdrawals without an offsetting amount of new deposits, and the lack of ability to promote the geographical diversification of products. The objective of this study is to analyze the liquidity risk of credit unions in the state of Minas Gerais and its determinants. Financial ratios and the multinomial logit model are used. The cooperatives were classified into five categories of liquidity risk: very low, low, medium, high and very high. The empirical results indicate that high levels of liquidity are related to smaller values of the outsourcing capital use, immobilization of the turnover capital, and provision ratios. They are also associated with larger values of the total deposits/credit operations and asset growth ratios.
Model-based Clustering of Categorical Time Series with Multinomial Logit Classification
Frühwirth-Schnatter, Sylvia; Pamminger, Christoph; Winter-Ebmer, Rudolf; Weber, Andrea
2010-09-01
A common problem in many areas of applied statistics is to identify groups of similar time series in a panel of time series. However, distance-based clustering methods cannot easily be extended to time series data, where an appropriate distance-measure is rather difficult to define, particularly for discrete-valued time series. Markov chain clustering, proposed by Pamminger and Frühwirth-Schnatter [6], is an approach for clustering discrete-valued time series obtained by observing a categorical variable with several states. This model-based clustering method is based on finite mixtures of first-order time-homogeneous Markov chain models. In order to further explain group membership we present an extension to the approach of Pamminger and Frühwirth-Schnatter [6] by formulating a probabilistic model for the latent group indicators within the Bayesian classification rule by using a multinomial logit model. The parameters are estimated for a fixed number of clusters within a Bayesian framework using a Markov chain Monte Carlo (MCMC) sampling scheme representing a (full) Gibbs-type sampler which involves only draws from standard distributions. Finally, an application to a panel of Austrian wage mobility data is presented which leads to an interesting segmentation of the Austrian labour market.
Determination of the Factors Influencing Store Preference in Erzurum by a Multinomial Logit Model
Directory of Open Access Journals (Sweden)
Hüseyin ÖZER
2008-12-01
The main objective of this study is to determine factors influencing the store preference of store customers in Erzurum in terms of some characteristics of the store and its products and the customers’ demographic characteristics (sex, age, marital status, level of education and income level). To this end, the Pearson chi-square test is applied to determine whether there is a relationship between store preference and characteristics of the customers, the stores and the products, and a multinomial logit model is fitted by the stepwise regression method to cross-section data compiled from a questionnaire administered to 384 store customers in the center of Erzurum province. According to the model estimation and test results, the variables marital status (married), education (primary) and cheapness (unimportant) for Migros; education (middle) for Özmar; and marital status (married) for the other stores are statistically significant at the 5 percent level.
Wu, Qiong; Zhang, Guohui; Ci, Yusheng; Wu, Lina; Tarefder, Rafiqul A; Alcántara, Adélamar Dely
2016-05-18
Teenage drivers are more likely to be involved in severely incapacitating and fatal crashes compared to adult drivers. Moreover, because two thirds of urban vehicle miles traveled are on signal-controlled roadways, significant research efforts are needed to investigate intersection-related teenage driver injury severities and their contributing factors in terms of driver behavior, vehicle-infrastructure interactions, environmental characteristics, roadway geometric features, and traffic compositions. Therefore, this study aims to explore the characteristic differences between teenage and adult drivers in intersection-related crashes, identify the significant contributing attributes, and analyze their impacts on driver injury severities. Using crash data collected in New Mexico from 2010 to 2011, 2 multinomial logit regression models were developed to analyze injury severities for teenage and adult drivers, respectively. Elasticity analyses and transferability tests were conducted to better understand the quantitative impacts of these factors and the teenage driver injury severity model's generality. The results showed that although many of the same contributing factors were found to be significant in both the teenage and adult driver models, certain different attributes must be distinguished to specifically develop effective safety solutions for the 2 driver groups. The research findings are helpful to better understand teenage crash uniqueness and develop cost-effective solutions to reduce intersection-related teenage injury severities and facilitate driver injury mitigation research.
Predicting Dropouts of University Freshmen: A Logit Regression Analysis.
Lam, Y. L. Jack
1984-01-01
Stepwise discriminant analysis coupled with logit regression analysis of freshmen data from Brandon University (Manitoba) indicated that six tested variables drawn from research on university dropouts were useful in predicting attrition: student status, residence, financial sources, distance from home town, goal fulfillment, and satisfaction with…
Joan Daouli; Eirini Konstantina Nikolatou
2015-01-01
The objective of this paper is to investigate the factors influencing the probability that a Ph.D. holder in Greece will work in the academic sector, as well as the probability of his or her choosing employment in various sectors of industry and occupational categories. Probit/multinomial logit models are employed using the 2001 Census data. The empirical results indicate that being young, married, having a Ph.D. in Natural Sciences and/or in Engineering, granted by a Greek university, increa...
Directory of Open Access Journals (Sweden)
Erik Šoltés
2018-03-01
Exclusion from the labour market is a serious social problem that is also addressed by the Europe 2020 strategy. While in the past the attention of statisticians and sociologists in the fight against poverty and social exclusion has concentrated mainly on income poverty and material deprivation, in recent times many studies and analyses are much more focused on work intensity as well. Households that use less than 20% of their work potential have a very low work intensity, and members of such households are included in the population of people who are at risk of poverty or social exclusion. Moreover, the low use of labour potential of households significantly increases the risk of income poverty and the threat of material deprivation. This article provides an analysis of work intensity levels of Slovak households depending on the factors that are monitored by the EU-SILC 2015. The impact of relevant factors is quantified by correspondence analysis and by a multinomial logistic regression model.
A dynamic random effects multinomial logit model of household car ownership
DEFF Research Database (Denmark)
Bue Bjørner, Thomas; Leth-Petersen, Søren
2007-01-01
Using a large household panel we estimate demand for car ownership by means of a dynamic multinomial model with correlated random effects. Results suggest that the persistence in car ownership observed in the data should be attributed to both true state dependence and to unobserved heterogeneity (random effects). It also appears that random effects related to single and multiple car ownership are correlated, suggesting that the IIA assumption employed in simple multinomial models of car ownership is invalid. Relatively small elasticities with respect to income and car costs are estimated...
Creel, Michael; Loomis, John
1992-10-01
The recreational benefits from providing increased quantities of water to wildlife and fisheries habitats are estimated using linked multinomial logit site selection models and count data trip frequency models. The study encompasses waterfowl hunting, fishing and wildlife viewing at 14 recreational resources in the San Joaquin Valley, including the National Wildlife Refuges, the State Wildlife Management Areas, and six river destinations. The economic benefits of increasing water supplies to wildlife refuges were also examined by using the estimated models to predict changing patterns of site selection and overall participation due to increases in water allocations. Estimates of the dollar value per acre foot of water are calculated for increases in water to refuges. The resulting model is a flexible and useful tool for estimating the economic benefits of alternative water allocation policies for wildlife habitat and rivers.
Xu, Yueqing; McNamara, Paul; Wu, Yanfang; Dong, Yue
2013-10-15
Arable land in China has been decreasing as a result of rapid population growth and economic development as well as urban expansion, especially in developed regions around cities where quality farmland quickly disappears. This paper analyzed changes in arable land utilization during 1993-2008 in the Pinggu district, Beijing, China, developed a multinomial logit (MNL) model to determine spatial driving factors influencing arable land-use change, and simulated arable land transition probabilities. Land-use maps, as well as social-economic and geographical data, were used in the study. The results indicated that arable land decreased significantly between 1993 and 2008. Lost arable land shifted into orchard, forestland, settlement, and transportation land. Significant differences existed for arable land transitions among different landform areas. Slope, elevation, population density, urbanization rate, distance to settlements, and distance to roadways were strong drivers influencing arable land transition to other uses. The MNL model proved effective for predicting transition probabilities in land use from arable land to other land-use types, and thus can be used for scenario analysis to develop land-use policies and land-management measures in this metropolitan area. Copyright © 2013 Elsevier Ltd. All rights reserved.
Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun
2014-12-01
Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.
Modeling Information Content Via Dirichlet-Multinomial Regression Analysis.
Ferrari, Alberto
2017-01-01
Shannon entropy is being increasingly used in biomedical research as an index of complexity and information content in sequences of symbols, e.g. languages, amino acid sequences, DNA methylation patterns and animal vocalizations. Yet, distributional properties of information entropy as a random variable have seldom been the object of study, leading to researchers mainly using linear models or simulation-based analytical approach to assess differences in information content, when entropy is measured repeatedly in different experimental conditions. Here a method to perform inference on entropy in such conditions is proposed. Building on results coming from studies in the field of Bayesian entropy estimation, a symmetric Dirichlet-multinomial regression model, able to deal efficiently with the issue of mean entropy estimation, is formulated. Through a simulation study the model is shown to outperform linear modeling in a vast range of scenarios and to have promising statistical properties. As a practical example, the method is applied to a data set coming from a real experiment on animal communication.
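For readers unfamiliar with the quantity being modelled, the sketch below computes the simple plug-in (maximum-likelihood) estimate of Shannon entropy, in bits, for a sequence of symbols. It is only a baseline illustration and is not the Dirichlet-multinomial regression model proposed in the paper. The function name is a hypothetical choice.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Plug-in Shannon entropy (bits) from the observed symbol frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    # H = -sum_i p_i * log2(p_i), with p_i estimated by relative frequency.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


balanced = shannon_entropy("aabb")   # two equally frequent symbols
constant = shannon_entropy("aaaa")   # a single repeated symbol
```

A sequence with two equally frequent symbols carries one bit per symbol, while a constant sequence carries none. The plug-in estimate is known to be biased downward in small samples, which is one motivation for model-based approaches to entropy estimation.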
Energy Technology Data Exchange (ETDEWEB)
Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam [Pusat Pengajian Sains Matematik, Universiti Sains Malaysia, 11800 USM, Pulau Pinang, Malaysia amirul@unisel.edu.my, zalila@cs.usm.my, norlida@usm.my, adam@usm.my (Malaysia)
2015-10-22
Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.
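The statement that q categories give q-1 logit equations can be made concrete with a short sketch. Each equation is the log-odds of one category against a chosen reference category, and the q probabilities can be recovered from the q-1 logits. The function names and the example probabilities are hypothetical and are not drawn from the study.

```python
import math


def baseline_logits(probs, ref=0):
    """Return the q-1 log-odds log(P_j / P_ref) for every non-reference category j."""
    return [math.log(p / probs[ref]) for j, p in enumerate(probs) if j != ref]


def probs_from_logits(logits):
    """Invert the q-1 logits; the reference category is listed first (logit 0)."""
    exps = [1.0] + [math.exp(v) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


# Hypothetical probabilities for q = 3 categories (reference category first).
probs = [0.5, 0.3, 0.2]
logits = baseline_logits(probs)        # two equations for three categories
recovered = probs_from_logits(logits)  # should reproduce probs
```

In a fitted model each logit would be an affine function of the predictors. Here the logits are computed from known probabilities only to show that q-1 equations are enough to determine all q category probabilities.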
Meaney, Christopher; Moineddin, Rahim
2014-01-24
response data are generated from a discrete multinomial distribution with support on (0,1). The linear regression model, the variable-dispersion beta regression model and the fractional logit regression model all perform well across the simulation experiments under consideration. When employing beta regression to estimate covariate effects on (0,1) response data, researchers should ensure their dispersion sub-model is properly specified, else inferential errors could arise.
PREDICTING FINANCIAL DISTRESS OF TELECOMMUNICATION COMPANIES WITH BINARY LOGIT REGRESSION
Directory of Open Access Journals (Sweden)
Tiara Widya Antikasari
2017-04-01
In this era of globalization, the telecommunication sub-sector has developed rapidly as the number of customers has grown. However, this growth has not been matched by growth in operational revenue. Therefore, it is important to analyze financial distress in telecommunication companies in order to avoid bankruptcy. This research aimed to investigate the effect of financial ratios in predicting the probability of financial distress. The financial ratio indicators used were the profitability ratio, liquidity ratio, activity ratio, and leverage ratio. The population in this research was telecommunication companies listed on the Indonesia Stock Exchange in the period 2009-2016. Based on a purposive sampling method, financial distress in this study was measured by negative net operating income for two years, while the statistical analysis used was logistic regression with a significance level of 10%. The result was that the liquidity ratio (current ratio) and activity ratio (total asset turnover ratio) had a significant negative effect, and the profitability ratio (return on assets) and leverage ratio (debt to total assets) had a significant positive effect in predicting financial distress.
Wage mobility in Europe. A comparative analysis using restricted multinomial logit regression
Pavlopoulos, D.; Muffels, R.; Vermunt, J.K.
2010-01-01
In this paper, we investigate cross-country differences in wage mobility in Europe using the European Community Household Panel. The paper is particularly focused on examining the impact of economic conditions, welfare state regimes and employment regulation on wage mobility. We apply a log-linear
Albaqshi, Amani Mohammed H.
2017-01-01
Functional Data Analysis (FDA) has attracted substantial attention for the last two decades. Within FDA, classifying curves into two or more categories is consistently of interest to scientists, but multi-class prediction within FDA is challenged in that most classification tools have been limited to binary response applications. The functional…
Fuzzy multinomial logistic regression analysis: A multi-objective programming approach
Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan
2017-05-01
Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, especially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate parameters of the multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus the ML approach. Results show that the newly proposed model outperforms ML in cases of small datasets.
Sample size determination for logistic regression on a logit-normal distribution.
Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance
2017-06-01
Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination (R²) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for R² for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.
Steven F. Koch; Jeffrey S. Racine
2013-01-01
We apply parametric and nonparametric regression discontinuity methodology within a multinomial choice setting to examine the impact of public health care user fee abolition on health facility choice using data from South Africa. The nonparametric model is found to outperform the parametric model both in- and out-of-sample, while also delivering more plausible estimates of the impact of user fee abolition (i.e. the 'treatment effect'). In the parametric framework, treatment effects were relat...
Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen
2017-12-01
Although the sparse multinomial logistic regression (SMLR) has provided a useful tool for sparse classification, it suffers from inefficacy in dealing with high dimensional features and manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. In order to tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected to a new feature space with randomly generated weight and bias. Second, an optimization model is established by the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR via minimizing the training error and the regressor value. Furthermore, the extended multi-attribute profiles (EMAPs) are utilized for extracting both the spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, the logistic regression via the variable splitting and the augmented Lagrangian (LORSAL) is adopted in the proposed framework for reducing the computational time. Experiments are conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, which have shown the fast and robust performance of the proposed ESMLR framework.
Classification of Effective Soil Depth by Using Multinomial Logistic Regression Analysis
Chang, C. H.; Chan, H. C.; Chen, B. A.
2016-12-01
Classification of effective soil depth is a task of determining the slopeland utilizable limitation in Taiwan. The "Slopeland Conservation and Utilization Act" categorizes the slopeland into agriculture and husbandry land, land suitable for forestry and land for enhanced conservation according to factors including average slope, effective soil depth, soil erosion and parental rock. However, site investigation of the effective soil depth requires cost-effective field work. This research aimed to classify the effective soil depth by using multinomial logistic regression with environmental factors. The Wen-Shui Watershed, located in central Taiwan, was selected as the study area. The multinomial logistic regression analysis was performed with the assistance of a Geographic Information System (GIS). The effective soil depth was categorized into four levels: deeper, deep, shallow and shallower. The environmental factors of slope, aspect, digital elevation model (DEM), curvature and normalized difference vegetation index (NDVI) were selected for classifying the soil depth. An Error Matrix was then used to assess the model accuracy. The results showed an overall accuracy of 75%. Finally, a map of effective soil depth was produced to help planners and decision makers in determining the slopeland utilizable limitation in the study area.
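The overall accuracy reported from an error matrix is the share of correctly classified cases, that is, the sum of the diagonal divided by the total count. The sketch below shows the calculation for four classes. The matrix values are invented for illustration and are not the study's data.

```python
def overall_accuracy(matrix):
    """Overall accuracy of a square error (confusion) matrix.

    Rows are taken as reference classes and columns as predicted classes;
    the diagonal holds the correctly classified counts.
    """
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up counts for four depth classes: deeper, deep, shallow, shallower.
example_matrix = [
    [9, 1, 0, 0],
    [1, 8, 1, 0],
    [0, 2, 7, 1],
    [0, 0, 2, 8],
]
accuracy = overall_accuracy(example_matrix)  # 32 correct out of 40 cases
```

Overall accuracy summarises the whole matrix in one number, so it can hide classes that are classified poorly. Per-class rates read from individual rows or columns are a common complement.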
DETERMINATION OF FACTORS AFFECTING LENGTH OF STAY WITH MULTINOMIAL LOGISTIC REGRESSION IN TURKEY
Directory of Open Access Journals (Sweden)
Öğr. Gör. Rukiye NUMAN TEKİN
2016-08-01
Length of stay (LOS), which has important implications for various aspects of health services, can vary according to a wide range of factors. LOS has been largely neglected in both theoretical studies and the practice of health care management in Turkey. The main purpose of this study is to identify factors related to LOS in Turkey. A retrospective analysis was conducted of 2,255,836 patients hospitalized in private, university, foundation university and other (municipality, association and foreigners/minority) hospitals that have an agreement with the Social Security Institution (SSI) in Turkey, from January 1, 2010 to December 31, 2010. Patient data were taken from MEDULA (National Electronic Invoice System), and SPSS 18.0 was used to perform the statistical analysis. The t-test, one-way ANOVA and multinomial logistic regression were used to determine variables that may affect LOS. The average LOS was 3.93 days (SD = 5.882). LOS showed a statistically significant difference according to all independent variables used in the study (age, gender, disease class, type of hospitalization, presence of comorbidity, type and number of surgeries, season of hospitalization, and hospital ownership, bed capacity, geographical region, residential area and type of service). According to the results of the multinomial logistic regression analysis, LOS was negatively affected by gender, presence of comorbidity and geographical region of hospital, and positively affected by age, season of hospitalization, and hospital bed capacity, ownership, type of service and residential area.
Ardoino, Ilaria; Lanzoni, Monica; Marano, Giuseppe; Boracchi, Patrizia; Sagrini, Elisabetta; Gianstefani, Alice; Piscaglia, Fabio; Biganzoli, Elia M
2017-04-01
The interpretation of regression models results can often benefit from the generation of nomograms, 'user friendly' graphical devices especially useful for assisting the decision-making processes. However, in the case of multinomial regression models, whenever categorical responses with more than two classes are involved, nomograms cannot be drawn in the conventional way. Such a difficulty in managing and interpreting the outcome could often result in a limitation of the use of multinomial regression in decision-making support. In the present paper, we illustrate the derivation of a non-conventional nomogram for multinomial regression models, intended to overcome this issue. Although it may appear less straightforward at first sight, the proposed methodology allows an easy interpretation of the results of multinomial regression models and makes them more accessible for clinicians and general practitioners too. Development of prediction model based on multinomial logistic regression and of the pertinent graphical tool is illustrated by means of an example involving the prediction of the extent of liver fibrosis in hepatitis C patients by routinely available markers.
Lewis, Kristin Nicole; Heckman, Bernadette Davantes; Himawan, Lina
2011-08-01
Growth mixture modeling (GMM) identified latent groups based on treatment outcome trajectories of headache disability measures in patients in headache subspecialty treatment clinics. Using a longitudinal design, 219 patients in headache subspecialty clinics in 4 large cities throughout Ohio provided data on their headache disability at pretreatment and 3 follow-up assessments. GMM identified 3 treatment outcome trajectory groups: (1) patients who initiated treatment with elevated disability levels and who reported statistically significant reductions in headache disability (high-disability improvers; 11%); (2) patients who initiated treatment with elevated disability but who reported no reductions in disability (high-disability nonimprovers; 34%); and (3) patients who initiated treatment with moderate disability and who reported statistically significant reductions in headache disability (moderate-disability improvers; 55%). Based on the final multinomial logistic regression model, a dichotomized treatment appointment attendance variable was a statistically significant predictor for differentiating high-disability improvers from high-disability nonimprovers. Three-fourths of patients who initiated treatment with elevated disability levels did not report reductions in disability after 5 months of treatment with new preventive pharmacotherapies. Preventive headache agents may be most efficacious for patients with moderate levels of disability and for patients with high disability levels who attend all treatment appointments. Copyright © 2011 International Association for the Study of Pain. Published by Elsevier B.V. All rights reserved.
Snedden, Gregg A.; Steyer, Gregory D.
2013-01-01
Understanding plant community zonation along estuarine stress gradients is critical for effective conservation and restoration of coastal wetland ecosystems. We related the presence of plant community types to estuarine hydrology at 173 sites across coastal Louisiana. Percent relative cover by species was assessed at each site near the end of the growing season in 2008, and hourly water level and salinity were recorded at each site Oct 2007–Sep 2008. Nine plant community types were delineated with k-means clustering, and indicator species were identified for each of the community types with indicator species analysis. An inverse relation between salinity and species diversity was observed. Canonical correspondence analysis (CCA) effectively segregated the sites across ordination space by community type, and indicated that salinity and tidal amplitude were both important drivers of vegetation composition. Multinomial logistic regression (MLR) and Akaike's Information Criterion (AIC) were used to predict the probability of occurrence of the nine vegetation communities as a function of salinity and tidal amplitude, and probability surfaces obtained from the MLR model corroborated the CCA results. The weighted kappa statistic, calculated from the confusion matrix of predicted versus actual community types, was 0.7 and indicated good agreement between observed community types and model predictions. Our results suggest that models based on a few key hydrologic variables can be valuable tools for predicting vegetation community development when restoring and managing coastal wetlands.
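As a hedged illustration of the agreement measure mentioned above, the following minimal Python sketch computes a weighted kappa from a confusion matrix of predicted versus observed classes. The abstract does not state which weighting scheme was used, so the linear disagreement weights here are an assumption, and the small matrix is hypothetical rather than the study's data.

```python
def weighted_kappa(confusion, weights=None):
    """Weighted kappa from a square confusion matrix (rows: observed, cols: predicted).

    `weights[i][j]` is a disagreement weight (0 on the diagonal).
    Linear weights are used by default; this is an assumption, not the
    scheme reported in the abstract.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if weights is None:
        weights = [[abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Observed and chance-expected weighted disagreement, as proportions.
    observed = sum(weights[i][j] * confusion[i][j]
                   for i in range(k) for j in range(k)) / n
    expected = sum(weights[i][j] * row_tot[i] * col_tot[j]
                   for i in range(k) for j in range(k)) / n ** 2
    return 1.0 - observed / expected


# Hypothetical 3-class confusion matrix (not data from the study).
example = [
    [20, 3, 1],
    [4, 15, 2],
    [0, 5, 10],
]
kappa = weighted_kappa(example)
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance; for purely nominal classes, a custom `weights` matrix may be more appropriate than the linear default.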
Snedden, Gregg A.; Steyer, Gregory D.
2013-02-01
Understanding plant community zonation along estuarine stress gradients is critical for effective conservation and restoration of coastal wetland ecosystems. We related the presence of plant community types to estuarine hydrology at 173 sites across coastal Louisiana. Percent relative cover by species was assessed at each site near the end of the growing season in 2008, and hourly water level and salinity were recorded at each site Oct 2007-Sep 2008. Nine plant community types were delineated with k-means clustering, and indicator species were identified for each of the community types with indicator species analysis. An inverse relation between salinity and species diversity was observed. Canonical correspondence analysis (CCA) effectively segregated the sites across ordination space by community type, and indicated that salinity and tidal amplitude were both important drivers of vegetation composition. Multinomial logistic regression (MLR) and Akaike's Information Criterion (AIC) were used to predict the probability of occurrence of the nine vegetation communities as a function of salinity and tidal amplitude, and probability surfaces obtained from the MLR model corroborated the CCA results. The weighted kappa statistic, calculated from the confusion matrix of predicted versus actual community types, was 0.7 and indicated good agreement between observed community types and model predictions. Our results suggest that models based on a few key hydrologic variables can be valuable tools for predicting vegetation community development when restoring and managing coastal wetlands.
Al-Mudhafar, W. J.
2013-12-01
Precise prediction of rock facies leads to adequate reservoir characterization by improving the porosity-permeability relationships used to estimate properties in non-cored intervals. It also helps to accurately identify the spatial facies distribution in order to build an accurate reservoir model for optimal future reservoir performance. In this paper, facies estimation has been done through multinomial logistic regression (MLR) with respect to the well logs and core data in a well in the upper sandstone formation of the South Rumaila oil field. The independent variables are gamma ray, formation density, water saturation, shale volume, log porosity, core porosity, and core permeability. First, the Robust Sequential Imputation Algorithm was used to impute the missing data. This algorithm starts from a complete subset of the dataset and sequentially estimates the missing values in an incomplete observation by minimizing the determinant of the covariance of the augmented data matrix. Then, the observation is added to the complete data matrix and the algorithm continues with the next observation with missing values. The MLR has been chosen to estimate the maximum likelihood and minimize the standard error for the nonlinear relationships between facies and the core and log data. The MLR is used to predict the probabilities of the different possible facies given each independent variable by constructing a linear predictor function having a set of weights that are linearly combined with the independent variables by using a dot product. A beta distribution of facies has been considered as prior knowledge, and the resulting predicted probability (posterior) has been estimated from MLR based on Bayes' theorem, which relates the predicted probability (posterior) to the conditional probability and the prior knowledge. To assess the statistical accuracy of the model, the bootstrap should be carried out to estimate extra-sample prediction error by randomly
Nong, Yu; Du, Qingyun; Wang, Kun; Miao, Lei; Zhang, Weiwei
2008-10-01
Urban growth modeling, one of the most important aspects of land use and land cover change study, has attracted substantial attention because it helps to comprehend the mechanisms of land use change and thus supports the making of relevant policies. This study applied multinomial logistic regression to model urban growth in Jiayu county of Hubei province, China, to discover the relationship between urban growth and its driving forces, with biophysical and socio-economic factors selected as independent variables. This type of regression is similar to the binary logistic regression used in previous studies, but it is more general because the dependent variable is not restricted to two categories. The multinomial model can simulate the process of multiple land use competition between urban land, bare land, cultivated land and orchard land. Taking the urban land use type as the reference category, parameters could be estimated as odds ratios. A probability map is generated from the model to predict where urban growth will occur.
Analysis of Functional Data with Focus on Multinomial Regression and Multilevel Data
DEFF Research Database (Denmark)
Mousavi, Seyed Nourollah
Functional data analysis (FDA) is a fast-growing area in statistical research with an increasingly diverse range of applications in economics, medicine, agriculture, chemometrics, etc. Functional regression is an area of FDA which has received the most attention both in aspects of application and methodological development. Our main...
A general equation to obtain multiple cut-off scores on a test from multinomial logistic regression.
Bersabé, Rosa; Rivas, Teresa
2010-05-01
The authors derive a general equation to compute multiple cut-offs on a total test score in order to classify individuals into more than two ordinal categories. The equation is derived from the multinomial logistic regression (MLR) model, which is an extension of the binary logistic regression (BLR) model to accommodate polytomous outcome variables. From this analytical procedure, cut-off scores are established at the test score (the predictor variable) at which an individual is as likely to be in category j as in category j+1 of an ordinal outcome variable. The application of the complete procedure is illustrated by an example with data from an actual study on eating disorders. In this example, two cut-off scores on the Eating Attitudes Test (EAT-26) scores are obtained in order to classify individuals into three ordinal categories: asymptomatic, symptomatic and eating disorder. Diagnoses were made from the responses to a self-report (Q-EDD) that operationalises DSM-IV criteria for eating disorders. Alternatives to the MLR model to set multiple cut-off scores are discussed.
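The cut-off idea can be sketched as follows, under the assumption of a single predictor x and the usual baseline-category parameterization log(P_j / P_ref) = a_j + b_j x (the article's own notation is not given in the abstract). Setting P_j = P_{j+1} gives a_j + b_j x = a_{j+1} + b_{j+1} x, so x = (a_j - a_{j+1}) / (b_{j+1} - b_j). The coefficients below are hypothetical and are not taken from the EAT-26 study.

```python
import math


def category_probs(x, intercepts, slopes):
    """Category probabilities; the first category is the reference (logit 0)."""
    logits = [0.0] + [a + b * x for a, b in zip(intercepts, slopes)]
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cutoff(a_j, b_j, a_next, b_next):
    """Score at which categories j and j+1 are equally likely."""
    if b_next == b_j:
        raise ValueError("equal slopes: no unique cut-off")
    return (a_j - a_next) / (b_next - b_j)


# Hypothetical coefficients for categories 2 and 3 versus reference category 1.
intercepts = [-4.0, -10.0]
slopes = [0.2, 0.4]

# The reference category has intercept 0 and slope 0 by construction.
c12 = cutoff(0.0, 0.0, intercepts[0], slopes[0])
c23 = cutoff(intercepts[0], slopes[0], intercepts[1], slopes[1])
```

Evaluating `category_probs` at each cut-off is a quick way to check that the two adjacent categories do receive equal probability there.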
Dai, Huanping; Micheyl, Christophe
2012-11-01
Psychophysical "reverse-correlation" methods allow researchers to gain insight into the perceptual representations and decision weighting strategies of individual subjects in perceptual tasks. Although these methods have gained momentum, until recently their development was limited to experiments involving only two response categories. Recently, two approaches for estimating decision weights in m-alternative experiments have been put forward. One approach extends the two-category correlation method to m > 2 alternatives; the second uses multinomial logistic regression (MLR). In this article, the relative merits of the two methods are discussed, and the issues of convergence and statistical efficiency of the methods are evaluated quantitatively using Monte Carlo simulations. The results indicate that, for a range of values of the number of trials, the estimated weighting patterns are closer to their asymptotic values for the correlation method than for the MLR method. Moreover, for the MLR method, weight estimates for different stimulus components can exhibit strong correlations, making the analysis and interpretation of measured weighting patterns less straightforward than for the correlation method. These and other advantages of the correlation method, which include computational simplicity and a close relationship to other well-established psychophysical reverse-correlation methods, make it an attractive tool to uncover decision strategies in m-alternative experiments.
Perumal, Vanamail
2014-07-01
To assess reproductive risk factors for anaemia among pregnant women in urban and rural areas of India. The International Institute of Population Sciences, India, carried out third National Family Health Survey in 2005-2006 to estimate a key indicator from a sample of ever-married women in the reproductive age group 15-49 years. Data on various dimensions were collected using a structured questionnaire, and anaemia was measured using a portable HemoCue instrument. Anaemia prevalence among pregnant women was compared between rural and urban areas using chi-square test and odds ratio. Multinomial logistic regression analysis was used to determine risk factors. Anaemia prevalence was assessed among 3355 pregnant women from rural areas and 1962 pregnant women from urban areas. Moderate-to-severe anaemia in rural areas (32.4%) is significantly more common than in urban areas (27.3%) with an excess risk of 30%. Gestational age specific prevalence of anaemia significantly increases in rural areas after 6 months. Pregnancy duration is a significant risk factor in both urban and rural areas. In rural areas, increasing age at marriage and mass media exposure are significant protective factors of anaemia. However, more births in the last five years, alcohol consumption and smoking habits are significant risk factors. In rural areas, various reproductive factors and lifestyle characteristics constitute significant risk factors for moderate-to-severe anaemia. Therefore, intensive health education on reproductive practices and the impact of lifestyle characteristics are warranted to reduce anaemia prevalence. © 2014 John Wiley & Sons Ltd.
Directory of Open Access Journals (Sweden)
Jason W. Osborne
2012-06-01
Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These outcomes represent important social science lines of research: retention in, or dropout from school, using illicit drugs, underage alcohol consumption, antisocial behavior, purchasing decisions, voting patterns, risky behavior, and so on. The goal of this paper is to briefly lead the reader through the surprisingly simple mathematics that underpins logistic regression: probabilities, odds, odds ratios, and logits. Anyone with spreadsheet software or a scientific calculator can follow along, and in turn, this knowledge can be used to make much more interesting, clear, and accurate presentations of results (especially to non-technical audiences). In particular, I will share an example of an interaction in logistic regression, how it was originally graphed, and how the graph was made substantially more user-friendly by converting the original metric (logits) to a more readily interpretable metric (probability) through three simple steps.
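The relationships among these quantities can be sketched in a few lines of Python. This shows the standard conversions (logit to odds by exponentiation, odds to probability by odds / (1 + odds)); it is not necessarily the exact three-step procedure of the paper, which the abstract does not spell out, and the function names are illustrative.

```python
import math


def prob_to_odds(p):
    """Odds in favour of an event with probability p (0 < p < 1)."""
    return p / (1.0 - p)


def odds_to_logit(odds):
    """The logit is the natural log of the odds."""
    return math.log(odds)


def logit_to_prob(logit):
    """Convert a logit back to a probability via the odds."""
    odds = math.exp(logit)
    return odds / (1.0 + odds)


def odds_ratio(p_group, p_reference):
    """Ratio of the odds in one group to the odds in a reference group."""
    return prob_to_odds(p_group) / prob_to_odds(p_reference)
```

For instance, a probability of 0.8 corresponds to odds of 4 to 1, and a logit of 0 corresponds to a probability of 0.5.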
Directory of Open Access Journals (Sweden)
Varga Csaba
2012-10-01
Background: Identifying risk factors for Salmonella Enteritidis (SE) infections in Ontario will assist public health authorities to design effective control and prevention programs to reduce the burden of SE infections. Our research objective was to identify risk factors for acquiring SE infections with various phage types (PT) in Ontario, Canada. We hypothesized that certain PTs (e.g., PT8 and PT13a) have specific risk factors for infection. Methods: Our study included endemic SE cases with various PTs whose isolates were submitted to the Public Health Laboratory-Toronto from January 20th to August 12th, 2011. Cases were interviewed using a standardized questionnaire that included questions pertaining to demographics, travel history, clinical symptoms, contact with animals, and food exposures. A multinomial logistic regression method using the Generalized Linear Latent and Mixed Model procedure and a case-case study design were used to identify risk factors for acquiring SE infections with various PTs in Ontario, Canada. In the multinomial logistic regression model, the outcome variable had three categories representing human infections caused by SE PT8, PT13a, and all other SE PTs (i.e., non-PT8/non-PT13a) as a referent category to which the other two categories were compared. Results: In the multivariable model, SE PT8 was positively associated with contact with dogs (OR=2.17, 95% CI 1.01-4.68) and negatively associated with pepper consumption (OR=0.35, 95% CI 0.13-0.94), after adjusting for age categories and gender, and using exposure periods and health regions as random effects to account for clustering. Conclusions: Our study findings offer interesting hypotheses about the role of phage type-specific risk factors. Multinomial logistic regression analysis and the case-case study approach are novel methodologies to evaluate associations among SE infections with different PTs and various risk factors.
Sequential and Simultaneous Logit: A Nested Model.
van Ophem, J.C.M.; Schram, A.J.H.C.
1997-01-01
A nested model is presented which has both the sequential and the multinomial logit model as special cases. This model provides a simple test to investigate the validity of these specifications. Some theoretical properties of the model are discussed. In the analysis a distribution function is
Directory of Open Access Journals (Sweden)
Milewska Anna Justyna
2017-09-01
Infertility is a huge problem nowadays, not only from the medical but also from the social point of view. A key step to improve treatment outcomes is the possibility of effective prediction of the treatment result. When a phenomenon with more than two states needs to be explained, e.g. pregnancy, miscarriage, non-pregnancy, the use of multinomial logistic regression is a good solution. The aim of this paper is to select those features that have a significant impact on achieving clinical pregnancy as well as those that determine the occurrence of spontaneous miscarriage (non-pregnancy was set as the reference category). Two multi-factor models for predicting infertility treatment outcomes were obtained. One of the models led to the conclusion that the number of follicles and the percentage of retrieved mature oocytes have a significant impact when the treatment outcome is predicted on the basis of information about oocytes. The other model, built on the basis of information about embryos, showed the significance of the number of fertilized oocytes, the percentage of at least 7-cell embryos on day 3, the percentage of blasts on day 5, and the day of transfer.
Regression Models For Multivariate Count Data.
Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei
2017-01-01
Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.
Osborne, Jason W.
2012-01-01
Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These…
Interpreting and Understanding Logits, Probits, and other Non-Linear Probability Models
DEFF Research Database (Denmark)
Breen, Richard; Karlson, Kristian Bernt; Holm, Anders
2018-01-01
Methods textbooks in sociology and other social sciences routinely recommend the use of the logit or probit model when an outcome variable is binary, an ordered logit or ordered probit when it is ordinal, and a multinomial logit when it has more than two categories. But these methodological guidelines take little or no account of a body of work that, over the past 30 years, has pointed to problematic aspects of these nonlinear probability models and, particularly, to difficulties in interpreting their parameters. In this review, we draw on that literature to explain the problems, show...
Street Choice Logit Model for Visitors in Shopping Districts
Directory of Open Access Journals (Sweden)
Ko Kawada
2014-07-01
In this study, we propose two models for predicting people’s activity. The first model is the pedestrian distribution prediction (or postdiction) model by multiple regression analysis using space syntax indices of urban fabric and people distribution data obtained from a field survey. The second model is a street choice model for visitors using multinomial logit model. We performed a questionnaire survey on the field to investigate the strolling routes of 46 visitors and obtained a total of 1211 street choices in their routes. We proposed a utility function, sum of weighted space syntax indices, and other indices, and estimated the parameters for weights on the basis of maximum likelihood. These models consider both street networks, distance from destination, direction of the street choice and other spatial compositions (numbers of pedestrians, cars, shops, and elevation). The first model explains the characteristics of the street where many people tend to walk or stay. The second model explains the mechanism underlying the street choice of visitors and clarifies the differences in the weights of street choice parameters among the various attributes, such as gender, existence of destinations, number of people, etc. For all the attributes considered, the influences of DISTANCE and DIRECTION are strong. On the other hand, the influences of Int.V, SHOPS, CARS, ELEVATION, and WIDTH are different for each attribute. People with defined destinations tend to choose streets that “have more shops, and are wider and lower”. In contrast, people with undefined destinations tend to choose streets of high Int.V. The choice of males is affected by Int.V, SHOPS, WIDTH (positive) and CARS (negative). Females prefer streets that have many shops, and couples tend to choose downhill streets. The behavior of individual persons is affected by all variables. The behavior of people visiting in groups is affected by SHOP and WIDTH (positive).
Street Choice Logit Model for Visitors in Shopping Districts
Kawada, Ko; Yamada, Takashi; Kishimoto, Tatsuya
2014-01-01
In this study, we propose two models for predicting people’s activity. The first model is the pedestrian distribution prediction (or postdiction) model by multiple regression analysis using space syntax indices of urban fabric and people distribution data obtained from a field survey. The second model is a street choice model for visitors using multinomial logit model. We performed a questionnaire survey on the field to investigate the strolling routes of 46 visitors and obtained a total of 1211 street choices in their routes. We proposed a utility function, sum of weighted space syntax indices, and other indices, and estimated the parameters for weights on the basis of maximum likelihood. These models consider both street networks, distance from destination, direction of the street choice and other spatial compositions (numbers of pedestrians, cars, shops, and elevation). The first model explains the characteristics of the street where many people tend to walk or stay. The second model explains the mechanism underlying the street choice of visitors and clarifies the differences in the weights of street choice parameters among the various attributes, such as gender, existence of destinations, number of people, etc. For all the attributes considered, the influences of DISTANCE and DIRECTION are strong. On the other hand, the influences of Int.V, SHOPS, CARS, ELEVATION, and WIDTH are different for each attribute. People with defined destinations tend to choose streets that “have more shops, and are wider and lower”. In contrast, people with undefined destinations tend to choose streets of high Int.V. The choice of males is affected by Int.V, SHOPS, WIDTH (positive) and CARS (negative). Females prefer streets that have many shops, and couples tend to choose downhill streets. The behavior of individual persons is affected by all variables. The behavior of people visiting in groups is affected by SHOP and WIDTH (positive). PMID:25379274
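A minimal sketch of the kind of street choice model described above, assuming the standard multinomial logit form in which each candidate street's utility is a weighted sum of its attributes and the choice probability is the softmax of the utilities. The attribute names DISTANCE, SHOPS and WIDTH follow the abstract, but the weights and attribute values are hypothetical, and the authors' actual specification may differ.

```python
import math


def choice_probabilities(alternatives, weights):
    """Multinomial logit choice probabilities over candidate streets.

    `alternatives` is a list of dicts mapping attribute name to value;
    `weights` maps attribute name to its utility weight.
    """
    utilities = [sum(weights[name] * value for name, value in alt.items())
                 for alt in alternatives]
    m = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def log_likelihood(observations, weights):
    """Log-likelihood of observed choices: list of (alternatives, chosen_index)."""
    return sum(math.log(choice_probabilities(alts, weights)[chosen])
               for alts, chosen in observations)


# Hypothetical weights and one hypothetical choice situation (not survey data).
weights = {"DISTANCE": -0.8, "SHOPS": 0.3, "WIDTH": 0.1}
streets = [
    {"DISTANCE": 1.0, "SHOPS": 4.0, "WIDTH": 6.0},
    {"DISTANCE": 2.0, "SHOPS": 1.0, "WIDTH": 8.0},
    {"DISTANCE": 1.5, "SHOPS": 0.0, "WIDTH": 4.0},
]
probs = choice_probabilities(streets, weights)
ll = log_likelihood([(streets, 0)], weights)
```

Maximum likelihood estimation would then search for the weights that maximize `log_likelihood` over all observed street choices, for example with a general-purpose numerical optimizer.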
Comparison of multinomial and binomial proportion methods for analysis of multinomial count data.
Galyean, M L; Wester, D B
2010-10-01
Simulation methods were used to generate 1,000 experiments, each with 3 treatments and 10 experimental units/treatment, in completely randomized (CRD) and randomized complete block designs. Data were counts in 3 ordered or 4 nominal categories from multinomial distributions. For the 3-category analyses, category probabilities were 0.6, 0.3, and 0.1, respectively, for 2 of the treatments, and 0.5, 0.35, and 0.15 for the third treatment. In the 4-category analysis (CRD only), probabilities were 0.3, 0.3, 0.2, and 0.2 for treatments 1 and 2 vs. 0.4, 0.4, 0.1, and 0.1 for treatment 3. The 3-category data were analyzed with generalized linear mixed models as an ordered multinomial distribution with a cumulative logit link or by regrouping the data (e.g., counts in 1 category/sum of counts in all categories), followed by analysis of single categories as binomial proportions. Similarly, the 4-category data were analyzed as a nominal multinomial distribution with a glogit link or by grouping data as binomial proportions. For the 3-category CRD analyses, empirically determined type I error rates based on pair-wise comparisons (F- and Wald χ² tests) did not differ between multinomial and individual binomial category analyses with 10 (P = 0.38 to 0.60) or 50 (P = 0.19 to 0.67) sampling units/experimental unit. When analyzed as binomial proportions, power estimates varied among categories, with analysis of the category with the greatest counts yielding power similar to the multinomial analysis. Agreement between methods (percentage of experiments with the same results for the overall test for treatment effects) varied considerably among categories analyzed and sampling unit scenarios for the 3-category CRD analyses. Power (F-test) was 24.3, 49.1, 66.9, 83.5, 86.8, and 99.7% for 10, 20, 30, 40, 50, and 100 sampling units/experimental unit for the 3-category multinomial CRD analyses. Results with randomized complete block design simulations were similar to those with the CRD
Modeling a Multinomial Logit Model of Intercity Travel Mode Choice Behavior for All Trips in Libya
Manssour A. Abdulsalam Bin Miskeen; Ahmed Mohamed Alhodairi; Riza Atiq Abdullah Bin O. K. Rahmat
2013-01-01
From a planning point of view, it is essential to model mode choice, owing to the massive costs incurred in transportation systems. Intercity travellers in Libya have distinct features compared with travellers from other countries, including cultural and socioeconomic factors. Consequently, the goal of this study is to characterize intercity travel behavior using disaggregate models in order to project nation-level intercity travel demand in Libya. Multinom...
Identifying response styles: A latent-class bilinear multinomial logit model
van Rosmalen, J.; van Herk, H.; Groenen, P.J.F.
2010-01-01
Respondents can vary strongly in the way they use rating scales. Specifically, respondents can exhibit a variety of response styles, which threatens the validity of the responses. The purpose of this article is to investigate how response style and content of the items affect rating scale responses.
A Multinomial Logit Approach to Estimating Regional Inventories by Product Class
Lawrence Teeter; Xiaoping Zhou
1998-01-01
Current timber inventory projections generally lack information on inventory by product classes. Most models available for inventory projection and linked to supply analyses are limited to projecting aggregate softwood and hardwood. The objective of this research is to develop a methodology to distribute the volume on each FIA survey plot to product classes and...
Mixed multinomial logit model for out-of-home leisure activity choice
Grigolon, A.B.; Kemperman, A.D.A.M.; Timmermans, H.J.P.
2013-01-01
This paper documents the design and results of a study on the factors influencing the choice of out-of-home leisure activities. Influencing factors seem related to socio-demographic characteristics, personal preferences, characteristics of the built environment and other aspects of the activities
Directory of Open Access Journals (Sweden)
JORGE E. CÓRDOBA MAQUILÓN
2012-01-01
Travel demand models mainly use modal attributes and socioeconomic characteristics as explanatory variables. It has also been established that attitudes and perceptions influence user behavior. However, the psychological variables of the individual condition user behavior. In this study, the latent variable personality was included in the estimation of the hybrid discrete choice model, which is a good alternative for incorporating the effects of subjective factors. The latent variable personality was assessed with the internationally validated 16PF psychometric test. The article analyzes the results of applying this model to a population of university employees and faculty, and also proposes a way to use psychometric tests in hybrid discrete choice models. Our results show that hybrid models that include psychological latent variables are superior to traditional models that ignore the effects of user behavior.
Reasons for not buying a car : a probit-selection multinomial logit choice model
Gao, Y.; Rasouli, S.; Timmermans, H.J.P.
2014-01-01
Generating and maintaining gradients of cell density and extracellular matrix (ECM) components is a prerequisite for the development of functionality of healthy tissue. Therefore, gaining insights into the drivers of spatial organization of cells and the role of ECM during tissue morphogenesis is
Identifying Unknown Response Styles: A Latent-Class Bilinear Multinomial Logit Model
J.M. van Rosmalen (Joost); H. van Herk (Hester); P.J.F. Groenen (Patrick)
2007-01-01
Respondents can vary significantly in the way they use rating scales. Specifically, respondents can exhibit varying degrees of response style, which threatens the validity of the responses. The purpose of this article is to investigate to what extent rating scale responses show response
Unobserved Heterogeneity in the Binary Logit Model with Cross-Sectional Data and Short Panels
DEFF Research Database (Denmark)
Holm, Anders; Jæger, Mads Meier; Pedersen, Morten
This paper proposes a new approach to dealing with unobserved heterogeneity in applied research using the binary logit model with cross-sectional data and short panels. Unobserved heterogeneity is particularly important in non-linear regression models such as the binary logit model because, unlike in linear regression models, estimates of the effects of observed independent variables are biased even when omitted independent variables are uncorrelated with the observed independent variables. We propose an extension of the binary logit model based on a finite mixture approach in which we conceptualize...
FORMULASI MODEL PERMUTASI SIKLIS DENGAN OBJEK MULTINOMIAL
Directory of Open Access Journals (Sweden)
Sukma Adi Perdana
2016-10-01
This study aims to construct a mathematical model for counting the number of arrangements of objects in a cyclic permutation that has multinomial objects. The model is limited to cyclic permutations with multinomial objects in which at least one kind of object has a single member. Modelling is based on the mathematical structure of cyclic permutations and multinomial permutations. A cyclic permutation model with multinomial objects has been formulated. The model was proved by validating its structure and by validating its results, the latter by comparing the counts computed from the model with counts obtained by manual enumeration. A theorem on cyclic permutations with multinomial objects has also been developed. Keywords: modelling, cyclic permutation, multinomial permutation
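The abstract does not reproduce the formula itself, so the following is only a sketch under an assumption: it uses the standard combinatorial result that, when at least one kind of object occurs exactly once, fixing that object removes the rotational symmetry and leaves (n-1)!/(n_1! n_2! ... n_k!) distinct cyclic arrangements. This may or may not be the form the author derived. The function names are hypothetical. The brute-force counter mirrors the validation by enumeration described above.

```python
from itertools import permutations
from math import factorial


def cyclic_count_formula(counts):
    """Count cyclic arrangements of a multiset, assuming some kind occurs once.

    counts[i] is the number of copies of object kind i.
    Assumed standard result (not taken from the abstract):
    (n - 1)! / (n_1! * n_2! * ... * n_k!).
    """
    if 1 not in counts:
        raise ValueError("at least one kind of object must occur exactly once")
    n = sum(counts)
    denom = 1
    for c in counts:
        denom *= factorial(c)
    return factorial(n - 1) // denom


def cyclic_count_bruteforce(counts):
    """Count the same arrangements by enumeration, treating rotations as equal."""
    items = [kind for kind, c in enumerate(counts) for _ in range(c)]
    seen = set()
    for p in permutations(items):
        # The lexicographically smallest rotation is a canonical representative.
        canon = min(p[i:] + p[:i] for i in range(len(p)))
        seen.add(canon)
    return len(seen)
```

The brute-force version is only practical for small inputs, but it gives an independent check of the closed form.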
Memprediksi Financial Distress dengan Binary Logit Regression Perusahaan Telekomunikasi
antikasari, tiara widya; Djuminah, Djuminah
2017-01-01
In this era of globalization, the telecommunication industry sub-sector has developed rapidly as its number of customers has grown. However, this growth has not been matched by growth in operating revenue. Therefore, it is important to analyze financial distress in telecommunication companies in order to avoid bankruptcy. This research aimed to investigate the effect of financial ratios in predicting the probability of financial distress. Financial ratios indicator used profitability ra...
Kallas, Zein; Borrisser-Pairó, Francesc; Martínez, Beatriz; Vieira, Ceferina; Panella-Riera, Nuria; Olivar, Maria Angels; Gil Roig, José María
2016-01-01
European societies increasingly require that animals be raised as closely as possible to their natural conditions. Growing concern about animal welfare is resulting in continuous modifications of regulations and policies, which have led to bans on a number of intensive farming methods. The European authorities consider pig welfare a priority issue. They are considering a ban on surgical pig castration by 2018, which may seriously affect markets and consumers because of boar-tainted meat. This stud...
Parameter identification in multinomial processing tree models
Schmittmann, V.D.; Dolan, C.V.; Raijmakers, M.E.J.; Batchelder, W.H.
2010-01-01
Multinomial processing tree models form a popular class of statistical models for categorical data that have applications in various areas of psychological research. As in all statistical models, establishing which parameters are identified is necessary for model inference and selection on the basis
DEFF Research Database (Denmark)
Kaplan, Sigal; Prato, Carlo Giacomo
' propensity to engage in various corrective maneuvers in the case of the critical event of vehicle travelling. Five lateral and speed control maneuvers are considered: “braking”, “steering”, “braking & steering”, and “other maneuvers”, in addition to a “no action” option. The analyzed data are retrieved from the United States National Automotive Sampling System General Estimates System (GES) crash database for the years 2005-2009. Results show (i) the correlation between crash avoidance maneuvers and crash severity, and (ii) the link between drivers' attributes, risky driving behavior, road characteristics...
Ordered LOGIT Model approach for the determination of financial distress.
Kinay, B
2010-01-01
Nowadays, as a result of global competition, numerous companies face financial distress. Predicting such problems and taking proactive approaches to them is important. Thus, the prediction of crisis and financial distress is essential for revealing the financial condition of companies. In this study, financial ratios of 156 industrial firms quoted on the Istanbul Stock Exchange are used, and probabilities of financial distress are predicted by means of an ordered logit regression model. The dependent variable is constructed by scaling the level of risk using Altman's Z Score. Thus, a model that can serve as an early warning system and predict financial distress is proposed.
Analisis Faktor yang Mempengaruhi Tingkat Kesehatan Bank dengan Regresi Logit
Directory of Open Access Journals (Sweden)
Titik Aryati
2007-09-01
The article aims to find the effects of CAMEL ratios on the probability of a bank's health level. The statistical method used to test the research hypothesis was logit regression. The dependent variable was the bank's health level, and the independent variables were CAMEL financial ratios consisting of CAR, NPL, ROA, ROE, LDR, and NIM. The data were extracted from banks' published financial reports, compiled by the Infobank research bureau, with valuation based on Bank Indonesia policy. The sample consisted of 60 healthy banks and 14 unhealthy banks in 2005 and 2006. The empirical results indicate that Non Performing Loan is the significant variable affecting bank health level.
Analysis of RIA standard curve by log-logistic and cubic log-logit models
International Nuclear Information System (INIS)
Yamada, Hideo; Kuroda, Akira; Yatabe, Tami; Inaba, Taeko; Chiba, Kazuo
1981-01-01
In order to improve goodness-of-fit in RIA standard analysis, programs for computing log-logistic and cubic log-logit were written in BASIC using personal computer P-6060 (Olivetti). Iterative least square method of Taylor series was applied for non-linear estimation of logistic and log-logistic. Here "log-logistic" represents $Y = (a - d)/(1 + (\log(X)/c)^{b}) + d$. As weights, either $1$, $1/\mathrm{var}(Y)$ or $1/\sigma^2$ were used in logistic or log-logistic, and either $Y^2(1 - Y)^2$, $Y^2(1 - Y)^2/\mathrm{var}(Y)$, or $Y^2(1 - Y)^2/\sigma^2$ were used in quadratic or cubic log-logit. The term $\mathrm{var}(Y)$ represents squares of pure error and $\sigma^2$ represents estimated variance calculated using the following equation: $\log(\sigma^2 + 1) = \log(A) + J \log(y)$. As indicators for goodness-of-fit, $\mathrm{MSL}/S_e^2$, CMD% and WRV (see text) were used. Better regression was obtained in case of alpha-fetoprotein by log-logistic than by logistic. Cortisol standard curve was much better fitted with cubic log-logit than quadratic log-logit. Predicted precision of AFP standard curve was below 5% in log-logistic instead of 8% in logistic analysis. Predicted precision obtained using cubic log-logit was about five times lower than that with quadratic log-logit. Importance of selecting good models in RIA data processing was stressed in conjunction with intrinsic precision of radioimmunoassay system indicated by predicted precision. (author)
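As a minimal illustration of the "log-logistic" curve quoted in this abstract, the sketch below evaluates Y = (a - d)/(1 + (log(X)/c)^b) + d in Python rather than the original BASIC. The function name and the choice of the natural logarithm are assumptions; the abstract does not state the base of the logarithm, and no fitting or weighting is attempted here.

```python
import math


def log_logistic(x, a, b, c, d):
    """Evaluate Y = (a - d) / (1 + (log(x) / c) ** b) + d.

    Assumes the natural logarithm and log(x) / c > 0, so that the
    power is well defined for non-integer b.
    """
    ratio = math.log(x) / c
    if ratio <= 0:
        raise ValueError("log(x) / c must be positive")
    return (a - d) / (1.0 + ratio ** b) + d
```

Under these assumptions, the curve passes through the midpoint (a + d)/2 where log(x) equals c, which is a quick sanity check for any fitted parameter set.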
Environmental regulations and plant exit: A logit analysis based on established panel data
Energy Technology Data Exchange (ETDEWEB)
Bioern, E; Golombek, R; Raknerud, A
1995-12-01
This publication uses a model to study the relationship between environmental regulations and plant exit. It has the main characteristics of a multinomial qualitative response model of the logit type, but also has elements of a Markov chain model. The model uses Norwegian panel data for establishments in three manufacturing sectors with high shares of units which have been under strict environmental regulations. In two of the sectors, the exit probability of non-regulated establishments is about three times higher than for regulated ones. It is also found that the probability of changing regulation status from non-regulated to regulated depends significantly on economic factors. In particular, establishments with weak profitability are the most likely to become subject to environmental regulation. 12 refs., 2 figs., 6 tabs.
A Multinomial Probit Model with Latent Factors
DEFF Research Database (Denmark)
Piatek, Rémi; Gensowski, Miriam
2017-01-01
We develop a parametrization of the multinomial probit model that yields greater insight into the underlying decision-making process, by decomposing the error terms of the utilities into latent factors and noise. The latent factors are identified without a measurement system, and they can be meaningfully linked to an economic model. We provide sufficient conditions that make this structure identified and interpretable. For inference, we design a Markov chain Monte Carlo sampler based on marginal data augmentation. A simulation exercise shows the good numerical performance of our sampler...
Numerical processing of radioimmunoassay results using logit-log transformation method
International Nuclear Information System (INIS)
Textoris, R.
1983-01-01
The mathematical model and algorithm are described for the numerical processing of radioimmunoassay results by the logit-log transformation method and by linear regression with weight factors. The limiting value of the curve for zero concentration is optimized with regard to the residual sum by an iterative method that repeats the linear regression multiple times. Typical examples are presented of the approximation of calibration curves. The method proved suitable for all hitherto used RIA sets and is well suited for small computers with at least 8 Kbyte of internal memory. (author)
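The logit-log idea can be sketched as follows, under assumptions not stated in the abstract: the response is normalized by a zero-concentration value `b0`, the model is logit(response/b0) = intercept + slope * log(concentration), and the weights are supplied by the caller. All names are hypothetical, and the iterative optimization of the zero-concentration limit described above is not implemented.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def fit_logit_log(conc, response, b0, weights=None):
    """Weighted least-squares fit of logit(response / b0) on log(conc).

    Returns (intercept, slope). With weights=None all points count equally.
    """
    if weights is None:
        weights = [1.0] * len(conc)
    xs = [math.log(c) for c in conc]
    ys = [logit(r / b0) for r in response]
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def predict_conc(response, b0, intercept, slope):
    """Invert the fitted line to read a concentration off the calibration curve."""
    return math.exp((logit(response / b0) - intercept) / slope)
```

Once the calibration line is fitted, `predict_conc` reads an unknown sample's concentration from its measured response.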
Implicit moral evaluations: A multinomial modeling approach.
Cameron, C Daryl; Payne, B Keith; Sinnott-Armstrong, Walter; Scheffer, Julian A; Inzlicht, Michael
2017-01-01
Implicit moral evaluations-i.e., immediate, unintentional assessments of the wrongness of actions or persons-play a central role in supporting moral behavior in everyday life. Yet little research has employed methods that rigorously measure individual differences in implicit moral evaluations. In five experiments, we develop a new sequential priming measure-the Moral Categorization Task-and a multinomial model that decomposes judgment on this task into multiple component processes. These include implicit moral evaluations of moral transgression primes (Unintentional Judgment), accurate moral judgments about target actions (Intentional Judgment), and a directional tendency to judge actions as morally wrong (Response Bias). Speeded response deadlines reduced Intentional Judgment but not Unintentional Judgment (Experiment 1). Unintentional Judgment was stronger toward moral transgression primes than non-moral negative primes (Experiments 2-4). Intentional Judgment was associated with increased error-related negativity, a neurophysiological indicator of behavioral control (Experiment 4). Finally, people who voted for an anti-gay marriage amendment had stronger Unintentional Judgment toward gay marriage primes (Experiment 5). Across Experiments 1-4, implicit moral evaluations converged with moral personality: Unintentional Judgment about wrong primes, but not negative primes, was negatively associated with psychopathic tendencies and positively associated with moral identity and guilt proneness. Theoretical and practical applications of formal modeling for moral psychology are discussed. Copyright © 2016 Elsevier B.V. All rights reserved.
The Finite and Moving Order Multinomial Universal Portfolio
International Nuclear Information System (INIS)
Tan, Choon Peng; Pang, Sook Theng
2013-01-01
An upper bound for the ratio of wealths of the best constant-rebalanced portfolio to that of the multinomial universal portfolio is derived. The finite-order multinomial universal portfolios can reduce the implementation time and computer-memory requirements for computation. The improved performance of the finite-order portfolios on some selected local stock-price data sets is observed.
Naive Bayesian classifiers for multinomial features: a theoretical analysis
CSIR Research Space (South Africa)
Van Dyk, E
2007-11-01
The authors investigate the use of naive Bayesian classifiers for multinomial feature spaces and derive error estimates for these classifiers. The error analysis is done by developing a mathematical model to estimate the probability density...
Age and pedestrian injury severity in motor-vehicle crashes: a heteroskedastic logit analysis.
Kim, Joon-Ki; Ulfarsson, Gudmundur F; Shankar, Venkataraman N; Kim, Sungyop
2008-09-01
This research explores the injury severity of pedestrians in motor-vehicle crashes. It is hypothesized that the variance of unobserved pedestrian characteristics increases with age. In response, a heteroskedastic generalized extreme value model is used. The analysis links explanatory factors with four injury outcomes: fatal, incapacitating, non-incapacitating, and possible or no injury. Police-reported crash data between 1997 and 2000 from North Carolina, USA, are used. The results show that pedestrian age induces heteroskedasticity which affects the probability of fatal injury. The effect grows more pronounced with increasing age past 65. The heteroskedastic model provides a better fit than the multinomial logit model. Notable factors increasing the probability of fatal pedestrian injury: increasing pedestrian age, male driver, intoxicated driver (2.7 times greater probability of fatality), traffic sign, commercial area, darkness with or without streetlights (2-4 times greater probability of fatality), sport-utility vehicle, truck, freeway, two-way divided roadway, speeding-involved, off roadway, motorist turning or backing, both driver and pedestrian at fault, and pedestrian only at fault. Conversely, the probability of a fatal injury decreased: with increasing driver age, during the PM traffic peak, with traffic signal control, in inclement weather, on a curved roadway, at a crosswalk, and when walking along roadway.
STAS and Logit Modeling of Advertising and Promotion Effects
DEFF Research Database (Denmark)
Hansen, Flemming; Yssing Hansen, Lotte; Grønholdt, Lars
2002-01-01
This paper describes the preliminary studies of the effect of advertising and promotion on purchases using the British single-source database Adlab. STAS and logit modeling are the two measures studied. Results from the two measures have been compared to determine the extent to which they give...
Spatial age-length key modelling using continuation ratio logits
DEFF Research Database (Denmark)
Berg, Casper W.; Kristensen, Kasper
2012-01-01
-called age-length key (ALK) is then used to obtain the age distribution. Regional differences in ALKs are not uncommon, but stratification is often problematic due to a small number of samples. Here, we combine generalized additive modelling with continuation ratio logits to model the probability of age...
Directory of Open Access Journals (Sweden)
E Haji Nejad
2001-06-01
Different aspects of multinomial statistical modelling and its classifications have been studied so far. In this type of problem, Y is a qualitative random variable with T possible states, which are treated as classes. The goal is prediction of Y based on a random vector $X \in \mathbb{R}^m$. Many methods for analyzing these problems have been considered. One modern and general method of classification is Classification and Regression Trees (CART). Another is recursive partitioning, which has a strange relationship with nonparametric regression. Classical discriminant analysis is a standard method for analyzing this type of data. Flexible discriminant analysis is a combination of nonparametric regression and discriminant analysis, and classification using splines includes least squares regression and additive cubic splines. Neural networks are an advanced statistical method for analyzing these types of data. In this paper, properties of multinomial logistic regression were investigated, and this method was used to model the factors affecting the choice of contraceptive method among married women aged 15-49 in Ghom province. The response variable has a tetranomial distribution. Its levels are: nothing, pills, traditional, and a collection of other contraceptive methods. The significant independent variables were: place, age of the woman, education, history of pregnancy and family size. Menstruation age and age at marriage were not statistically significant.
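For a response with four levels, a multinomial logistic model assigns each level a probability through a softmax over linear predictors. The sketch below assumes the usual baseline-category parametrization, with the first level as reference; the function name, coefficient layout and covariate values are hypothetical and are not estimates from the study.

```python
import math


def multinomial_logit_probs(x, coefs):
    """Category probabilities under a baseline-category multinomial logit.

    x     : list of covariate values (include 1.0 for an intercept).
    coefs : one coefficient list per non-reference category.
    The reference category has linear predictor 0 and is returned first.
    """
    etas = [0.0] + [sum(b * xi for b, xi in zip(beta, x)) for beta in coefs]
    shift = max(etas)  # subtract the maximum for numerical stability
    exps = [math.exp(e - shift) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]
```

With three coefficient vectors, the function returns four probabilities that sum to one, matching a tetranomial response.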
Pricing Mining Concessions Based on Combined Multinomial Pricing Model
Directory of Open Access Journals (Sweden)
Chang Xiao
2017-01-01
A combined multinomial pricing model is proposed for pricing mining concession in which the annualized volatility of the price of mineral products follows a multinomial distribution. First, a combined multinomial pricing model is proposed which consists of binomial pricing models calculated according to different volatility values. Second, a method is provided to calculate the annualized volatility and the distribution. Third, the value of convenience yields is calculated based on the relationship between the futures price and the spot price. The notion of convenience yields is used to adjust our model as well. Based on an empirical study of a Chinese copper mine concession, we verify that our model is easy to use and better than the model with constant volatility when considering the changing annualized volatility of the price of the mineral product.
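One possible reading of "binomial pricing models calculated according to different volatility values" is a probability-weighted average of binomial-tree values, one tree per volatility. The sketch below follows that reading only as an assumption; it is not the authors' model. It uses a plain Cox-Ross-Rubinstein tree for a European call as a stand-in payoff and omits the convenience-yield adjustment. All function and parameter names are hypothetical.

```python
import math


def crr_call(spot, strike, rate, sigma, years, steps):
    """European call value on a Cox-Ross-Rubinstein binomial tree."""
    dt = years / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    q = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability
    value = 0.0
    for j in range(steps + 1):
        prob = math.comb(steps, j) * q ** j * (1.0 - q) ** (steps - j)
        payoff = max(spot * up ** j * down ** (steps - j) - strike, 0.0)
        value += prob * payoff
    return math.exp(-rate * years) * value


def combined_price(spot, strike, rate, vol_dist, years, steps):
    """Weight binomial values by the probability of each volatility level.

    vol_dist is a list of (sigma, probability) pairs; this weighting scheme
    is an assumption made for illustration.
    """
    return sum(p * crr_call(spot, strike, rate, s, years, steps)
               for s, p in vol_dist)
```

With a single volatility of probability one, the combined value reduces to the constant-volatility binomial value, which is the comparison case mentioned in the abstract.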
Un procedimiento para selección de los modelos Logit Mixtos
Ruíz Gallegos, José de Jesús
2004-01-01
This work reviews two models that have been widely applied to discrete choice problems: the logit model and the mixed logit model. In addition, the use of the Cox statistic is proposed for model selection in the mixed logit model.
A nested recursive logit model for route choice analysis
DEFF Research Database (Denmark)
Mai, Tien; Frejinger, Emma; Fosgerau, Mogens
2015-01-01
We propose a route choice model that relaxes the independence from irrelevant alternatives property of the logit model by allowing scale parameters to be link specific. Similar to the recursive logit (RL) model proposed by Fosgerau et al. (2013), the choice of path is modeled as a sequence of link choices and the model does not require any sampling of choice sets. Furthermore, the model can be consistently estimated and efficiently used for prediction. A key challenge lies in the computation of the value functions, i.e. the expected maximum utility from any position in the network to a destination. The value functions are the solution to a system of non-linear equations. We propose an iterative method with dynamic accuracy that allows to efficiently solve these systems. We report estimation results and a cross-validation study for a real network. The results show that the NRL model yields sensible...
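To make the value-function system concrete, the sketch below assumes the log-sum form V(k) = mu_k * log(sum over successors a of exp((v(a|k) + V(a)) / mu_k)), with V fixed at 0 at the destination, and solves it by plain fixed-point iteration. This is not the dynamic-accuracy method the authors propose, and the data layout and names are hypothetical.

```python
import math


def value_functions(successors, utility, scale, dest, tol=1e-10, max_iter=1000):
    """Fixed-point iteration for link-specific-scale value functions.

    successors : dict mapping each non-destination state to its next states.
    utility    : dict mapping (state, next_state) to the link utility v(a|k).
    scale      : dict mapping each state k to its scale parameter mu_k.
    """
    V = {k: 0.0 for k in successors}
    V[dest] = 0.0
    for _ in range(max_iter):
        delta = 0.0
        for k, nxt in successors.items():
            if k == dest:
                continue
            mu = scale[k]
            terms = [(utility[(k, a)] + V[a]) / mu for a in nxt]
            top = max(terms)  # log-sum-exp shift for numerical stability
            new = mu * (top + math.log(sum(math.exp(t - top) for t in terms)))
            delta = max(delta, abs(new - V[k]))
            V[k] = new
        if delta < tol:
            break
    return V
```

When every scale equals one, the system reduces to the recursive logit case, which gives a simple check on a small network with two parallel two-link paths.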
Total, Direct, and Indirect Effects in Logit Models
DEFF Research Database (Denmark)
Karlson, Kristian Bernt; Holm, Anders; Breen, Richard
It has long been believed that the decomposition of the total effect of one variable on another into direct and indirect effects, while feasible in linear models, is not possible in non-linear probability models such as the logit and probit. In this paper we present a new and simple method...... average partial effects, as defined by Wooldridge (2002). We present the method graphically and illustrate it using the National Educational Longitudinal Study of 1988...
Multinomial logistic models explaining income changes of migrants to high-amenity counties.
Von Reichert, C; Rudzitis, G
1992-01-01
"A survey of residents of and migrants to 15 fast-growing wilderness counties [in the United States] showed that only 25 percent of the migrants increased their income, while almost 50 percent accepted income losses upon their moves to high-amenity counties. Concomitantly, amenities and quality of life were more important factors in the migration decision than was employment, for instance. We focused on migrants in the labor force and employed multinomial logistic regression to identify the impact of migrants' characteristics, their satisfaction/dissatisfaction with previous location (push), and the importance of destination features (pull) on income change." excerpt
Hierarchical Multinomial Processing Tree Models: A Latent-Trait Approach
Klauer, Karl Christoph
2010-01-01
Multinomial processing tree models are widely used in many areas of psychology. A hierarchical extension of the model class is proposed, using a multivariate normal distribution of person-level parameters with the mean and covariance matrix to be estimated from the data. The hierarchical model allows one to take variability between persons into…
Directory of Open Access Journals (Sweden)
Alberto Gómez Mejía
2011-01-01
of Doing Business explain how, when a country implements these criteria, it increases its chances of moving to a higher level of per capita income, which implies greater economic growth and potential economic development.
Multinomial-exponential reliability function: a software reliability model
International Nuclear Information System (INIS)
Saiz de Bustamante, Amalio; Saiz de Bustamante, Barbara
2003-01-01
The multinomial-exponential reliability function (MERF) was developed during a detailed study of the software failure/correction processes. Later on, MERF was approximated by a much simpler exponential reliability function (EARF), which keeps most of MERF's mathematical properties, so the two functions together make up a single reliability model. The reliability model MERF/EARF considers the software failure process as a non-homogeneous Poisson process (NHPP), and the repair (correction) process as a multinomial distribution. The model supposes that both processes are statistically independent. The paper discusses the model's theoretical basis, its mathematical properties and its application to software reliability. Applications of the model to inspection and maintenance of physical systems are also foreseen. The paper includes a complete numerical example of the model's application to a software reliability analysis.
Measuring political sentiment on Twitter: factor-optimal design for multinomial inverse regression
Taddy, Matt
2012-01-01
This article presents a short case study in text analysis: the scoring of Twitter posts for positive, negative, or neutral sentiment directed towards particular US politicians. The study requires selection of a sub-sample of representative posts for sentiment scoring, a common and costly aspect of sentiment mining. As a general contribution, our application is preceded by a proposed algorithm for maximizing sampling efficiency. In particular, we outline and illustrate greedy selection of docu...
Woo-Yong Hyun; Robert B. Ditton
2007-01-01
The concept of recreation substitutability has been a continuing research topic for outdoor recreation researchers. This study explores the relationships among variables regarding the willingness to substitute one location for another location. The objectives of the study are 1) to ascertain and predict the extent to which saltwater anglers were willing to substitute...
Directory of Open Access Journals (Sweden)
Yeison Díaz-Mateus
2017-07-01
Decision making in supply chains is influenced by demand variations, and hence sales, purchase orders and inventory levels are affected. This paper presents a non-linear optimization model for a two-echelon supply chain with a single product. In addition, the model includes the consumers' maximum willingness to pay, taking socioeconomic differences into account. To do so, the constrained multinomial logit for discrete choices is used to estimate demand levels. Then, a metaheuristic approach based on particle swarm optimization is proposed to determine the optimal product sales price and inventory coordination variables. To validate the proposed model, a supply chain of a technological product was chosen and three scenarios are analyzed: discounts, demand segmentation and demand overestimation. Results are analyzed on the basis of profits, lot sizing, inventory turnover and market share. It can be concluded that the maximum willingness to pay must be taken into consideration; otherwise fictitious profits may mislead decision making, and although the market share would seem to improve, overall profits are not in fact necessarily better.
Directory of Open Access Journals (Sweden)
Matthias Schmid
Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
Essays on pricing dynamics, price dispersion, and nested logit modelling
Verlinda, Jeremy Alan
The body of this dissertation comprises three standalone essays, presented in three respective chapters. Chapter One explores the possibility that local market power contributes to the asymmetric relationship observed between wholesale costs and retail prices in gasoline markets. I exploit an original data set of weekly gas station prices in Southern California from September 2002 to May 2003, and take advantage of highly detailed station and local market-level characteristics to determine the extent to which spatial differentiation influences price-response asymmetry. I find that brand identity, proximity to rival stations, bundling and advertising, operation type, and local market features and demographics each influence a station's predicted asymmetric relationship between prices and wholesale costs. Chapter Two extends the existing literature on the effect of market structure on price dispersion in airline fares by modeling the effect at the disaggregate ticket level. Whereas past studies rely on aggregate measures of price dispersion such as the Gini coefficient or the standard deviation of fares, this paper estimates the entire empirical distribution of airline fares and documents how the shape of the distribution is determined by market structure. Specifically, I find that monopoly markets favor a wider distribution of fares with more mass in the tails while duopoly and competitive markets exhibit a tighter fare distribution. These findings indicate that the dispersion of airline fares may result from the efforts of airlines to practice second-degree price discrimination. Chapter Three adopts a Bayesian approach to the problem of tree structure specification in nested logit modelling, which requires a heavy computational burden in calculating marginal likelihoods. I compare two different techniques for estimating marginal likelihoods: (1) the Laplace approximation, and (2) reversible jump MCMC. I apply the techniques to both a simulated and a travel mode
Gaussian Process Regression Model in Spatial Logistic Regression
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to address the issue. In this paper, we focus on spatial modeling with GPR for binomial data with a logit link function. The performance of the model will be investigated. We discuss inference, namely how to estimate the parameters and hyper-parameters and how to predict. Furthermore, simulation studies will be explained in the last section.
Spady, Richard; Stouli, Sami
2012-01-01
We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while avoiding the need for repairing the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach introduces a mathematical programming characterization of conditional distribution f...
Directory of Open Access Journals (Sweden)
Barbara Reis-Santos
2013-09-01
OBJECTIVE: To analyze the association between clinical/epidemiological characteristics and outcomes of tuberculosis treatment in patients with concomitant tuberculosis and chronic kidney disease (CKD) in Brazil. METHODS: We used the Brazilian Ministry of Health National Case Registry Database to identify patients with tuberculosis and CKD, treated between 2007 and 2011. The tuberculosis treatment outcomes were compared with epidemiological and clinical characteristics of the subjects using a hierarchical multinomial logistic regression model, in which cure was the reference outcome. RESULTS: The prevalence of CKD among patients with tuberculosis was 0.4% (95% CI: 0.37-0.42%). The sample comprised 1,077 subjects. The outcomes were cure, in 58%; treatment abandonment, in 7%; death from tuberculosis, in 13%; and death from other causes, in 22%. The characteristics that differentiated the ORs for treatment abandonment or death were age; alcoholism; AIDS; previous noncompliance with treatment; transfer to another facility; suspected tuberculosis on chest X-ray; positive results in the first smear microscopy; and indications for/use of directly observed treatment, short-course strategy. CONCLUSIONS: Our data indicate the importance of sociodemographic characteristics for the diagnosis of tuberculosis in patients with CKD and underscore the need for tuberculosis control strategies targeting patients with chronic noncommunicable diseases, such as CKD.
Modelling Stochastic Route Choice Behaviours with a Closed-Form Mixed Logit Model
Directory of Open Access Journals (Sweden)
Xinjun Lai
2015-01-01
A closed-form mixed Logit approach is proposed to model stochastic route choice behaviours. It combines the advantages of both Probit and Logit to provide a flexible form for correlation among alternatives and a tractable form of expression; in addition, heterogeneity in the variance of alternatives can be addressed. Paths are compared in pairs, so that the strengths of the binary Probit can be fully used. The Probit-based aggregation is also used for a nested Logit structure. Case studies on both numerical and empirical examples demonstrate that the new method is valid and practical. This paper thus provides an operational solution for incorporating the normal distribution in route choice with an analytical expression.
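DETEKSI DINI KRISIS PERBANKAN INDONESIA: IDENTIFIKASI VARIABEL MAKRO DENGAN MODEL LOGIT [Early Detection of Indonesian Banking Crises: Identification of Macro Variables with the Logit Model]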
Directory of Open Access Journals (Sweden)
Shanty Oktavilia
2012-01-01
Indonesia has suffered banking crises several times, as a consequence of the severe crisis that occurred in 1997. The Thai baht, which plunged 27.8% in the third quarter of 1997, was the initial problem that triggered the Asian currency crisis. This study analyzes the influence of macro indicators as an early warning system by using a logit econometric model to predict the possibility of banking crises that may occur in Indonesia. Keywords: Banking crisis, macroeconomic indicator, EWS-logit model
Multiple equilibria and limit cycles in evolutionary games with Logit Dynamics
Hommes, C.H.; Ochea, M.I.
2012-01-01
This note shows, by means of two simple, three-strategy games, the existence of stable periodic orbits and of multiple, interior steady states in a smooth version of the Best-Response Dynamics, the Logit Dynamics. The main finding is that, unlike Replicator Dynamics, generic Hopf bifurcation and
Hommes, C.H.; Ochea, M.I.
2010-01-01
This paper investigates, by means of simple, three and four strategy games, the occurrence of periodic and chaotic behaviour in a smooth version of the Best Response Dynamics, the Logit Dynamics. The main finding is that, unlike Replicator Dynamics, generic Hopf bifurcation and thus, stable limit
Another Look at the Method of Y-Standardization in Logit and Probit Models
DEFF Research Database (Denmark)
Karlson, Kristian Bernt
2015-01-01
This paper takes another look at the derivation of the method of Y-standardization used in sociological analysis involving comparisons of coefficients across logit or probit models. It shows that the method can be derived under less restrictive assumptions than hitherto suggested. Rather than...
Logit Estimation of a Gravity Model of the College Enrollment Decision.
Leppel, Karen
1993-01-01
A study investigated the factors influencing students' decisions about attending a college to which they had been admitted. Logit analysis confirmed gravity model predictions that geographic distance and student ability would most influence the enrollment decision and found other variables, although affecting earlier stages of decision making, did…
An integrated Markov decision process and nested logit consumer response model of air ticket pricing
Lu, J.; Feng, T.; Timmermans, H.P.J.; Yang, Z.
2017-01-01
The paper attempts to propose an optimal air ticket pricing model during the booking horizon by taking into account passengers' purchasing behavior of air tickets. A Markov decision process incorporating a nested logit consumer response model is established to model the dynamic pricing process.
DEFF Research Database (Denmark)
Kaplan, Sigal; Prato, Carlo Giacomo
2012-01-01
as from the key role of the ability of drivers to perform effective corrective maneuvers for the success of automated in-vehicle warning and driver assistance systems. The analysis is conducted by means of a mixed logit model that accommodates correlations across alternatives and heteroscedasticity. Data...
Mixed logit model of intended residential mobility in renovated historical blocks in China
Jiang, W.; Timmermans, H.J.P.; Li, H.; Feng, T.
2016-01-01
Using data from 8 historical blocks in China, the influence of social-demographic characteristics and residential satisfaction on intended residential mobility is analysed. The results of a mixed logit model indicate that higher residential satisfaction will lead to a lower intention to move house,
Zhang, Hongyang; Welch, William J.; Zamar, Ruben H.
2017-01-01
Tomal et al. (2015) introduced the notion of "phalanxes" in the context of rare-class detection in two-class classification problems. A phalanx is a subset of features that work well for classification tasks. In this paper, we propose a different class of phalanxes for application in regression settings. We define a "Regression Phalanx" - a subset of features that work well together for prediction. We propose a novel algorithm which automatically chooses Regression Phalanxes from high-dimensi...
Street, Nathan Lee
2017-01-01
Teacher value-added measures (VAM) are designed to provide information regarding teachers' causal impact on the academic growth of students while controlling for exogenous variables. While some researchers contend VAMs successfully and authentically measure teacher causality on learning, others suggest VAMs cannot adequately control for exogenous…
Estimation from incomplete multinomial data. Ph.D. Thesis - Harvard Univ.
Credeur, K. R.
1978-01-01
The vector of multinomial cell probabilities was estimated from incomplete data, incomplete in that it contains partially classified observations. Each such partially classified observation was observed to fall in one of two or more selected categories but was not classified further into a single category. The data were assumed to be incomplete at random. The estimation criterion was minimization of risk for quadratic loss. The estimators were the classical maximum likelihood estimate, the Bayesian posterior mode, and the posterior mean. An approximation was developed for the posterior mean. The Dirichlet, the conjugate prior for the multinomial distribution, was assumed for the prior distribution.
DEFF Research Database (Denmark)
Dlugosz, Stephan; Mammen, Enno; Wilke, Ralf
We consider the semiparametric generalised linear regression model which has mainstream empirical models such as the (partially) linear mean regression, logistic and multinomial regression as special cases. As an extension to related literature we allow a misclassified covariate to be interacted...
Study on Emission Measurement of Vehicle on Road Based on Binomial Logit Model
Aly, Sumarni Hamid; Selintung, Mary; Ramli, Muhammad Isran; Sumi, Tomonori
2011-01-01
This research attempts to evaluate emission measurement of on-road vehicles. To this end, it develops a failure probability model of the vehicle emission test for passenger cars using a binomial logit model. The dependent variables are failure of the CO and HC emission tests for the gasoline car category and failure of the opacity emission test for the diesel-fuel car category; the independent variables are vehicle age, engine size, brand and type of car. In order to imp...
Analysis of Internet Usage Intensity in Iraq: An Ordered Logit Model
Almas Heshmati; Firas H. Al-Hammadany; Ashraf Bany-Mohammed
2013-01-01
Intensity of Internet use is significantly influenced by government policies, people’s levels of income, education, employment and general development and economic conditions. Iraq has very low Internet usage levels compared to the region and the world. This study uses an ordered logit model to analyse the intensity of Internet use in Iraq. The results showed that economic reasons (Internet cost and income level) were the key causes of low usage intensity. About 68% of the population ...
Matson, Johnny L.; Kozlowski, Alison M.
2010-01-01
Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…
A metric for cross-sample comparisons using logit and probit
DEFF Research Database (Denmark)
Karlson, Kristian Bernt
relative to an arbitrary scale, which makes the coefficients difficult both to interpret and to compare across groups or samples. Do differences in coefficients reflect true differences or differences in scales? This cross-sample comparison problem raises concerns for comparative research. However, we......* across groups or samples, making it suitable for situations met in real applications in comparative research. Our derivations also extend to the probit and to ordered and multinomial models. The new metric is implemented in the Stata command nlcorr....
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures
Atar, Burcu; Kamata, Akihito
2011-01-01
The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…
John Hogland; Nedret Billor; Nathaniel Anderson
2013-01-01
Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...
The importance of examining movements within the US health care system: sequential logit modeling
Directory of Open Access Journals (Sweden)
Lee Chioun
2010-09-01
Background: Utilization of specialty care may not be a discrete, isolated behavior but rather a behavior of sequential movements within the health care system. Although patients may often visit their primary care physician and receive a referral before utilizing specialty care, prior studies have underestimated the importance of accounting for these sequential movements. Methods: The sample included 6,772 adults aged 18 years and older who participated in the 2001 Survey on Disparities in Quality of Care, sponsored by the Commonwealth Fund. A sequential logit model was used to account for movement in all stages of utilization: use of any health services (i.e., first stage), having a perceived need for specialty care (i.e., second stage), and utilization of specialty care (i.e., third stage). In the sequential logit model, all stages are nested within the previous stage. Results: Gender, race/ethnicity, education and poor health had significant explanatory effects with regard to use of any health services and having a perceived need for specialty care; however, racial/ethnic, gender, and educational disparities were not present in utilization of specialty care. After controlling for use of any health services and having a perceived need for specialty care, inability to pay for specialty care via income (AOR = 1.334, CI = 1.10 to 1.62) or health insurance (unstable insurance: AOR = 0.26, CI = 0.14 to 0.48; no insurance: AOR = 0.12, CI = 0.07 to 0.20) were significant barriers to utilization of specialty care. Conclusions: Use of a sequential logit model to examine utilization of specialty care resulted in a detailed representation of utilization behaviors and patient characteristics that impact these behaviors at all stages within the health care system. After controlling for sequential movements within the health care system, the biggest barrier to utilizing specialty care is the inability to pay, while racial, gender, and educational disparities
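Because each stage of a sequential logit is nested within the previous one, the unconditional probability of reaching the last stage is the product of the stage-wise conditional probabilities. The following is a minimal Python sketch of that idea, assuming each stage is a binary logit with a single covariate; the coefficient values and the names `STAGES`, `stage_probabilities` and `reach_probability` are hypothetical and are not taken from the study.

```python
import math

def sigmoid(eta):
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical (intercept, slope) pairs for one covariate x.
# These numbers are illustrative only, not estimates from the study.
STAGES = [
    (1.0, 0.8),   # stage 1: any use of health services
    (-0.5, 0.3),  # stage 2: perceived need for specialty care, given stage 1
    (0.2, 1.1),   # stage 3: use of specialty care, given stage 2
]

def stage_probabilities(x, stages=STAGES):
    """Conditional probability of passing each stage, given the previous one."""
    return [sigmoid(a + b * x) for a, b in stages]

def reach_probability(x, stages=STAGES):
    """Unconditional probability of passing every stage (product of conditionals)."""
    prob = 1.0
    for p in stage_probabilities(x, stages):
        prob *= p
    return prob
```

Calling `reach_probability(1.0)` and `reach_probability(0.0)` contrasts two covariate values; the result can never exceed the smallest stage-wise probability, which is why barriers at an early stage can mask effects at later stages.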
DEFF Research Database (Denmark)
Kaplan, Sigal; Prato, Carlo Giacomo
2012-01-01
Introduction: Recent years have witnessed a growing interest in improving bus safety operations worldwide. While in the United States buses are considered relatively safe, the number of bus accidents is far from being negligible, triggering the introduction of the Motor-coach Enhanced Safety Act of 2011. Method: The current study investigates the underlying risk factors of bus accident severity in the United States by estimating a generalized ordered logit model. Data for the analysis are retrieved from the General Estimates System (GES) database for the years 2005–2009. Results: Results show that accident severity increases: (i) for young bus drivers under the age of 25; (ii) for drivers beyond the age of 55, and most prominently for drivers over 65 years old; (iii) for female drivers; (iv) for very high (over 65 mph) and very low (under 20 mph) speed limits; (v) at intersections; (vi) because...
Stability of Mixed-Strategy-Based Iterative Logit Quantal Response Dynamics in Game Theory
Zhuang, Qian; Di, Zengru; Wu, Jinshan
2014-01-01
Using the Logit quantal response form as the response function in each step, the original definition of static quantal response equilibrium (QRE) is extended into an iterative evolution process. QREs remain as the fixed points of the dynamic process. However, depending on whether such fixed points are the long-term solutions of the dynamic process, they can be classified into stable (SQREs) and unstable (USQREs) equilibriums. This extension resembles the extension from static Nash equilibriums (NEs) to evolutionary stable solutions in the framework of evolutionary game theory. The relation between SQREs and other solution concepts of games, including NEs and QREs, is discussed. Using experimental data from other published papers, we perform a preliminary comparison between SQREs, NEs, QREs and the observed behavioral outcomes of those experiments. For certain games, we determine that SQREs have better predictive power than QREs and NEs. PMID:25157502
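The iterative process described above can be sketched for a two-player game as follows. This is a hedged illustration, not the authors' procedure: it assumes a simultaneous update in which each player's next mixed strategy is the logit response to the opponent's current mixed strategy, with a precision parameter `lam`; the function names and the matching-pennies payoffs are illustrative choices.

```python
import math

def logit_response(payoffs, lam):
    """Logit (softmax) choice probabilities for a list of expected payoffs."""
    m = max(payoffs)  # subtract the maximum for numerical stability
    weights = [math.exp(lam * (u - m)) for u in payoffs]
    total = sum(weights)
    return [w / total for w in weights]

def step(A, B, p, q, lam):
    """One simultaneous update; A[i][j] and B[i][j] are row/column payoffs."""
    u_row = [sum(A[i][j] * q[j] for j in range(len(q))) for i in range(len(p))]
    u_col = [sum(B[i][j] * p[i] for i in range(len(p))) for j in range(len(q))]
    return logit_response(u_row, lam), logit_response(u_col, lam)

def iterate(A, B, p, q, lam, n_steps):
    """Apply the update n_steps times and return the final mixed strategies."""
    for _ in range(n_steps):
        p, q = step(A, B, p, q, lam)
    return p, q

# Matching pennies (illustrative): the QRE is (0.5, 0.5) for every lam.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
stable_p, stable_q = iterate(A, B, [0.9, 0.1], [0.2, 0.8], 0.5, 200)
unstable_p, unstable_q = iterate(A, B, [0.9, 0.1], [0.2, 0.8], 2.0, 200)
```

Under this particular update rule the uniform mixed strategy is a fixed point for any `lam`, but the iteration only converges to it when `lam` is small (here `lam = 0.5`); with `lam = 2.0` the trajectory keeps cycling away from it, which mirrors the distinction between stable and unstable QREs.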
Using continuation-ratio logits to analyze the variation of the age composition of fish catches
DEFF Research Database (Denmark)
Kvist, Trine; Gislason, Henrik; Thyregod, Poul
2000-01-01
Major sources of information for the estimation of the size of the fish stocks and the rate of their exploitation are samples from which the age composition of catches may be determined. However, the age composition in the catches often varies as a result of several factors. Stratification of the sampling is desirable, because it leads to better estimates of the age composition, and the corresponding variances and covariances. The analysis is impeded by the fact that the response is ordered categorical. This paper introduces an easily applicable method to analyze such data. The method combines... be applied separately to each level of the logits. The method is illustrated by the analysis of age-composition data collected from the Danish sandeel fishery in the North Sea in 1993. The significance of possible sources of variation is evaluated, and formulae for estimating the proportions of each age...
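As a rough illustration of what continuation-ratio logits look like for ordered age classes, the sketch below uses one common definition, the logit of P(age = j | age >= j). The abstract is truncated, so whether the paper uses this form or its complement is an assumption; the function names and the sample counts are hypothetical.

```python
import math

def continuation_ratio_logits(counts):
    """Logit of P(Y = j | Y >= j) for j = 1..K-1, from ordered category counts.

    Assumes every count is positive so that each logit is finite.
    """
    logits = []
    remaining = sum(counts)  # number of fish in class j or older
    for n_j in counts[:-1]:
        logits.append(math.log(n_j / (remaining - n_j)))
        remaining -= n_j
    return logits

def proportions_from_logits(logits):
    """Invert the continuation-ratio logits back to K category proportions."""
    proportions = []
    surviving = 1.0  # probability of being in class j or older
    for eta in logits:
        conditional = 1.0 / (1.0 + math.exp(-eta))
        proportions.append(surviving * conditional)
        surviving *= 1.0 - conditional
    proportions.append(surviving)  # last class takes what is left
    return proportions

# Hypothetical catch sample: counts of fish in four ordered age classes.
sample_counts = [120, 60, 15, 5]
sample_logits = continuation_ratio_logits(sample_counts)
recovered = proportions_from_logits(sample_logits)
```

Each logit involves only a binary split (this age class versus all older ones), which is what allows an ordinary binary model to be fitted separately at each level.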
DEFF Research Database (Denmark)
Cherchi, Elisabetta; Guevara, Cristian
2012-01-01
The random coefficients logit model allows a more realistic representation of agents' behavior. However, the estimation of that model may involve simulation, which may become impractical with many random coefficients because of the curse of dimensionality. In this paper, the traditional maximum simulated likelihood (MSL) method is compared with the alternative expectation-maximization (EM) method, which does not require simulation. Previous literature had shown that for cross-sectional data, MSL outperforms the EM method in the ability to recover the true parameters and estimation time... with cross-sectional or with panel data, and (d) EM systematically attained more efficient estimators than the MSL method. The results imply that if the purpose of the estimation is only to determine the ratios of the model parameters (e.g., the value of time), the EM method should be preferred. For all...
Airport Choice in Sao Paulo Metropolitan Area: An Application of the Conditional Logit Model
Moreno, Marcelo Baena; Muller, Carlos
2003-01-01
Using the conditional LOGIT model, this paper addresses airport choice in the Sao Paulo Metropolitan Area. In this region, Guarulhos International Airport (GRU) and Congonhas Airport (CGH) compete for passengers flying to several domestic destinations. The airport choice is believed to result from the tradeoff passengers make among airport access characteristics, airline level of service characteristics and passenger experience with the analyzed airports. Access time to the airports was found to explain the airport choice better than access distance, and direct flight frequencies explained the airport choice better than indirect (connections and stops) and total (direct plus indirect) flight frequencies. Out of 15 tested variables, passenger experience with the analyzed airports was the variable that best explained the airport choice in the region. Model specifications considering 1, 2 or 3 variables were tested. The specification that best fit the observed data considered access time, direct flight frequencies in the travel period (morning or afternoon peak) and passenger experience with the analyzed airports. The influence of these variables was therefore analyzed across market segments according to departure airport and flight duration criteria. The choice of GRU (located adjacent to Sao Paulo city) is not well explained by the rationale of saving access time and by the increased supply of direct flight frequencies, while the choice of CGH (located inside Sao Paulo city) is. Access time was found to be more important to passengers flying shorter distances, while direct flight frequencies in the travel period were more significant to those flying longer distances. Keywords: Airport choice, Multiple airport region, Conditional LOGIT model, Access time, Flight frequencies, Passenger experience with the analyzed airports, Transportation planning
Prospective memory after moderate-to-severe traumatic brain injury: a multinomial modeling approach.
Pavawalla, Shital P; Schmitter-Edgecombe, Maureen; Smith, Rebekah E
2012-01-01
Prospective memory (PM), which can be understood as the processes involved in realizing a delayed intention, is consistently found to be impaired after a traumatic brain injury (TBI). Although PM can be empirically dissociated from retrospective memory, it inherently involves both a prospective component (i.e., remembering that an action needs to be carried out) and retrospective components (i.e., remembering what action needs to be executed and when). This study utilized a multinomial processing tree model to disentangle the prospective (that) and retrospective recognition (when) components underlying PM after moderate-to-severe TBI. Seventeen participants with moderate to severe TBI and 17 age- and education-matched control participants completed an event-based PM task that was embedded within an ongoing computer-based color-matching task. The multinomial processing tree modeling approach revealed a significant group difference in the prospective component, indicating that the control participants allocated greater preparatory attentional resources to the PM task compared to the TBI participants. Participants in the TBI group were also found to be significantly more impaired than controls in the when aspect of the retrospective component. These findings indicated that the TBI participants had greater difficulty allocating the necessary preparatory attentional resources to the PM task and greater difficulty discriminating between PM targets and nontargets during task execution, despite demonstrating intact posttest recall and/or recognition of the PM tasks and targets.
Parameter Estimation in Probit Model for Multivariate Multinomial Response Using SMLE
Directory of Open Access Journals (Sweden)
Jaka Nugraha
2012-02-01
Research in transportation, market research and politics often involves multivariate multinomial responses. In this paper, we discuss the modeling of multivariate multinomial responses using a probit model. The parameters were estimated by maximum likelihood estimation (MLE) based on GHK simulation, a method known as simulated maximum likelihood estimation (SMLE). The likelihood function of the probit model contains probability values that must be evaluated by simulation. Using the GHK simulation algorithm, the estimating equations for the parameters of the probit model were obtained. Keywords: Probit model, Newton-Raphson iteration, GHK simulator, MLE, simulated log-likelihood
Uncovering a latent multinomial: Analysis of mark-recapture data with misidentification
Link, W.A.; Yoshizaki, J.; Bailey, L.L.; Pollock, K.H.
2010-01-01
Natural tags based on DNA fingerprints or natural features of animals are now becoming very widely used in wildlife population biology. However, classic capture-recapture models do not allow for misidentification of animals which is a potentially very serious problem with natural tags. Statistical analysis of misidentification processes is extremely difficult using traditional likelihood methods but is easily handled using Bayesian methods. We present a general framework for Bayesian analysis of categorical data arising from a latent multinomial distribution. Although our work is motivated by a specific model for misidentification in closed population capture-recapture analyses, with crucial assumptions which may not always be appropriate, the methods we develop extend naturally to a variety of other models with similar structure. Suppose that observed frequencies f are a known linear transformation f = A′x of a latent multinomial variable x with cell probability vector π = π(θ). Given that full conditional distributions [θ | x] can be sampled, implementation of Gibbs sampling requires only that we can sample from the full conditional distribution [x | f, θ], which is made possible by knowledge of the null space of A′. We illustrate the approach using two data sets with individual misidentification, one simulated, the other summarizing recapture data for salamanders based on natural marks. © 2009, The International Biometric Society.
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used in selecting important predictor variables, diagnostics are used to check that assumptions are valid which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographics profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
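The statement that the log odds are a linear combination of the predictors can be made concrete with a small Python sketch. The intercept, the coefficients and the two binary predictors below are hypothetical values chosen for illustration; they are not estimates from the study.

```python
import math

def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))

def predicted_probability(intercept, coefs, x):
    """P(outcome = 1 | x) when the log odds are intercept + sum(coef * x)."""
    eta = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical model: two binary predictors, e.g. family history of obesity
# and frequent routine meals. The numbers are illustrative only.
intercept = -2.0
coefs = [1.2, 0.5]

p_without = predicted_probability(intercept, coefs, [0, 0])
p_with = predicted_probability(intercept, coefs, [1, 0])

# Odds ratio for the first predictor; it equals exp(coefs[0]).
odds_ratio = (p_with / (1.0 - p_with)) / (p_without / (1.0 - p_without))
```

The odds ratio for a predictor depends only on its coefficient, not on the values of the other predictors, unless the model includes interaction terms such as the ethnicity-by-meals interaction mentioned above.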
Bayesian multinomial cluster analysis: application to electoral data
Montón Domingo, Maria
2009-01-01
This master's thesis aims to offer a different view of election results data, specifically those of the Parliament of Catalonia. The analytical tool used is Bayesian multinomial cluster analysis, and the units of study are the small research zones (zones de recerca petita) of the city of Barcelona. In this way, it is determined how the different small research zones of Barcelona cluster in terms of their voting and what relationship exists between the parties...
Hocine, Mounia; Guillemot, Didier; Tubert-Bitter, Pascale; Moreau, Thierry
2005-12-30
In case-series or cohort studies, we propose a test of independence between the occurrences of two types of recurrent events (such as two repeated infections) related to an intermittent exposure (such as an antibiotic treatment). The test relies upon an extension of a recent method for analysing case-series data, in the presence of one type of recurrent event. The test statistic is derived from a bivariate Poisson generated-multinomial distribution. Simulations for checking the validity of the test concerning the type I error and the power properties are presented. The test is illustrated using data from a cohort on antibiotics bacterial resistance in schoolchildren. Copyright 2005 John Wiley & Sons, Ltd.
Multinomial Bayesian learning for modeling classical and nonclassical receptive field properties.
Hosoya, Haruo
2012-08-01
We study the interplay of Bayesian inference and natural image learning in a hierarchical vision system, in relation to the response properties of early visual cortex. We particularly focus on a Bayesian network with multinomial variables that can represent discrete feature spaces similar to hypercolumns combining minicolumns, enforce sparsity of activation to learn efficient representations, and explain divisive normalization. We demonstrate that maximal-likelihood learning using sampling-based Bayesian inference gives rise to classical receptive field properties similar to V1 simple cells and V2 cells, while inference performed on the trained network yields nonclassical context-dependent response properties such as cross-orientation suppression and filling in. Comparison with known physiological properties reveals some qualitative and quantitative similarities.
The empathy impulse: A multinomial model of intentional and unintentional empathy for pain.
Cameron, C Daryl; Spring, Victoria L; Todd, Andrew R
2017-04-01
Empathy for pain is often described as automatic. Here, we used implicit measurement and multinomial modeling to formally quantify unintentional empathy for pain: empathy that occurs despite intentions to the contrary. We developed the pain identification task (PIT), a sequential priming task wherein participants judge the painfulness of target experiences while trying to avoid the influence of prime experiences. Using multinomial modeling, we distinguished 3 component processes underlying PIT performance: empathy toward target stimuli (Intentional Empathy), empathy toward prime stimuli (Unintentional Empathy), and bias to judge target stimuli as painful (Response Bias). In Experiment 1, imposing a fast (vs. slow) response deadline uniquely reduced Intentional Empathy. In Experiment 2, inducing imagine-self (vs. imagine-other) perspective-taking uniquely increased Unintentional Empathy. In Experiment 3, Intentional and Unintentional Empathy were stronger toward targets with typical (vs. atypical) pain outcomes, suggesting that outcome information matters and that effects on the PIT are not reducible to affective priming. Typicality of pain outcomes more weakly affected task performance when target stimuli were merely categorized rather than judged for painfulness, suggesting that effects on the latter are not reducible to semantic priming. In Experiment 4, Unintentional Empathy was stronger for participants who engaged in costly donation to cancer charities, but this parameter was also high for those who donated to an objectively worse but socially more popular charity, suggesting that overly high empathy may facilitate maladaptive altruism. Theoretical and practical applications of our modeling approach for understanding variation in empathy are discussed. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Targeting: Logistic Regression, Special Cases and Extensions
Directory of Open Access Journals (Sweden)
Helmut Schaeben
2014-12-01
Logistic regression is a classical linear model for logit-transformed conditional probabilities of a binary target variable. It recovers the true conditional probabilities if the joint distribution of predictors and the target is of log-linear form. Weights-of-evidence is an ordinary logistic regression with parameters equal to the differences of the weights of evidence if all predictor variables are discrete and conditionally independent given the target variable. The hypothesis of conditional independence can be tested in terms of log-linear models. If the assumption of conditional independence is violated, the application of weights-of-evidence corrupts not only the predicted conditional probabilities but also their rank transform. Logistic regression models that include interaction terms can account for the lack of conditional independence; appropriate interaction terms compensate exactly for violations of conditional independence. Multilayer artificial neural nets may be seen as nested regression-like models with some sigmoidal activation function. Most often, the logistic function is used as the activation function. If the net topology, i.e., its control, is sufficiently versatile to mimic interaction terms, artificial neural nets are able to account for violations of conditional independence and yield very similar results. Weights-of-evidence cannot reasonably include interaction terms; subsequent modifications of the weights, as often suggested, cannot emulate the effect of interaction terms.
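The equivalence between weights-of-evidence and logistic regression under conditional independence can be checked numerically. The sketch below assumes binary predictors and the usual definitions W+ = ln[P(B=1|T=1)/P(B=1|T=0)] and W- = ln[P(B=0|T=1)/P(B=0|T=0)]; the prior and the conditional probabilities are hypothetical numbers, and the function names are illustrative.

```python
import math

def weights(p1, p0):
    """Weights of evidence for a binary predictor B.

    p1 = P(B=1 | T=1), p0 = P(B=1 | T=0).
    """
    w_plus = math.log(p1 / p0)
    w_minus = math.log((1.0 - p1) / (1.0 - p0))
    return w_plus, w_minus

def woe_as_logistic(prior, cond):
    """Intercept and slopes of the logistic model implied by the weights."""
    intercept = math.log(prior / (1.0 - prior))
    slopes = []
    for p1, p0 in cond:
        w_plus, w_minus = weights(p1, p0)
        intercept += w_minus             # baseline: predictor absent
        slopes.append(w_plus - w_minus)  # contrast = difference of weights
    return intercept, slopes

def posterior_logistic(intercept, slopes, b):
    eta = intercept + sum(s * bi for s, bi in zip(slopes, b))
    return 1.0 / (1.0 + math.exp(-eta))

def posterior_bayes(prior, cond, b):
    """Exact P(T=1 | b) when predictors are conditionally independent given T."""
    like1, like0 = prior, 1.0 - prior
    for (p1, p0), bi in zip(cond, b):
        like1 *= p1 if bi else 1.0 - p1
        like0 *= p0 if bi else 1.0 - p0
    return like1 / (like1 + like0)

# Hypothetical prior and conditional probabilities for two predictors.
prior = 0.1
cond = [(0.7, 0.2), (0.4, 0.3)]
intercept, slopes = woe_as_logistic(prior, cond)
```

For every combination of predictor values, `posterior_logistic(intercept, slopes, b)` matches `posterior_bayes(prior, cond, b)`. The match relies on conditional independence being built into `posterior_bayes`; with dependent predictors the two would diverge unless interaction terms were added.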
A mixed logit analysis of two-vehicle crash severities involving a motorcycle.
Shaheed, Mohammad Saad B; Gkritza, Konstantina; Zhang, Wei; Hans, Zachary
2013-12-01
Using motorcycle crash data for Iowa from 2001 to 2008, this paper estimates a mixed logit model to investigate the factors that affect crash severity outcomes in a collision between a motorcycle and another vehicle. These include crash-specific factors (such as manner of collision, motorcycle rider and non-motorcycle driver and vehicle actions), roadway and environmental conditions, location and time, motorcycle rider and non-motorcycle driver and vehicle attributes. The methodological approach allows the parameters to vary across observations as opposed to a single parameter representing all observations. Our results showed non-uniform effects of rear-end collisions on minor injury crashes, as well as of the roadway speed limit greater than or equal to 55 mph, the type of area (urban), the riding season (summer) and motorcyclist's gender on low severity crashes. We also found significant effects of the roadway surface condition, clear vision (not obscured by moving vehicles, trees, buildings, or other), light conditions, speed limit, and helmet use on severe injury outcomes. Copyright © 2013 Elsevier Ltd. All rights reserved.
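Letting a parameter vary across observations means that outcome probabilities are averages of ordinary logit probabilities over the distribution of that parameter. The sketch below illustrates the idea by simulation with a single normally distributed coefficient; the attribute values, mean, standard deviation and function names are hypothetical and are not taken from the paper.

```python
import math
import random

def logit_probs(utilities):
    """Multinomial logit probabilities for a list of utilities."""
    m = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def mixed_logit_probs(x, beta_mean, beta_sd, n_draws=2000, seed=0):
    """Simulated mixed logit probabilities with one random coefficient.

    x holds the (hypothetical) attribute value of each outcome. The
    coefficient is drawn from a normal distribution, and the logit
    probabilities are averaged over the draws.
    """
    rng = random.Random(seed)
    sums = [0.0] * len(x)
    for _ in range(n_draws):
        beta = rng.gauss(beta_mean, beta_sd)
        for j, p in enumerate(logit_probs([beta * xj for xj in x])):
            sums[j] += p
    return [s / n_draws for s in sums]

x = [0.0, 1.0, 2.0]  # illustrative attribute values for three outcomes
fixed = logit_probs([0.5 * xj for xj in x])
mixed = mixed_logit_probs(x, beta_mean=0.5, beta_sd=1.0)
print(fixed)
print(mixed)
```

Setting `beta_sd` to zero recovers the fixed-parameter logit, which makes the single-parameter model a special case of the random-parameter one.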
Oktaviana, P. P.; Fithriasari, K.
2018-04-01
Most Indonesian citizens consume vannamei shrimp as food. Vannamei shrimp is also one of Indonesia's mainstay export commodities. Vannamei shrimp in ponds and markets can be contaminated by Salmonella sp. bacteria, which endanger human health. Salmonella sp. contamination of vannamei shrimp can be affected by many factors. This study is intended to identify which factors influence Salmonella sp. contamination of vannamei shrimp. The researchers used the test result for Salmonella sp. contamination of vannamei shrimp as the response variable. This response variable has two categories: 0 = the test result indicates that there is no Salmonella sp. on the vannamei shrimp; 1 = the test result indicates that there is Salmonella sp. on the vannamei shrimp. Four factors are supposed to influence the contamination: the test result for Salmonella sp. contamination on the farmer's hand swab; the subdistrict of the vannamei shrimp ponds; the fish processing unit supplied; and the pond area in hectares. These four factors are used as predictor variables. Because the response variable has two categories, the analysis uses the binary logit model approach. The results indicate that the predictor variables that significantly affect Salmonella sp. contamination of vannamei shrimp are the test result for Salmonella sp. contamination on the farmer's hand swab and the subdistrict of the vannamei shrimp ponds.
Directory of Open Access Journals (Sweden)
Bowen Dong
2018-01-01
Road traffic accidents are believed to be associated not only with road geometric features and traffic characteristics, but also with weather conditions. To address these safety issues, it is of paramount importance to understand how these factors affect the occurrence of crashes. Existing studies have suggested that the mechanisms of single-vehicle (SV) accidents and multivehicle (MV) accidents can be very different. Few studies have examined the difference between SV and MV accident probability while addressing unobserved heterogeneity at the same time. To investigate the different contributing factors for SV and MV accidents, a mixed logit model is employed using disaggregated data with the response variable categorized as no accidents, SV accidents, and MV accidents. The results indicate that, in addition to speed gap, length of segment, and wet road surfaces, which are significant for both SV and MV accidents, most other variables are significant only for MV accidents. Traffic, road, and surface characteristics are the main factors influencing SV and MV accident probability. Hourly traffic volume, inside shoulder width, and wet road surface are found to produce statistically significant random parameters. Their effects on the probability of SV and MV accidents vary across different road segments.
Time-varying mixed logit model for vehicle merging behavior in work zone merging areas.
Weng, Jinxian; Du, Gang; Li, Dan; Yu, Yao
2018-08-01
This study aims to develop a time-varying mixed logit model for the vehicle merging behavior in work zone merging areas during the merging implementation period from the time of starting a merging maneuver to that of completing the maneuver. From the safety perspective, vehicle crash probability and severity between the merging vehicle and its surrounding vehicles are regarded as major factors influencing vehicle merging decisions. Model results show that the model with the use of vehicle crash risk probability and severity could provide higher prediction accuracy than previous models with the use of vehicle speeds and gap sizes. It is found that lead vehicle type, through lead vehicle type, through lag vehicle type, crash probability of the merging vehicle with respect to the through lag vehicle, crash severities of the merging vehicle with respect to the through lead and lag vehicles could exhibit time-varying effects on the merging behavior. One important finding is that the merging vehicle could become more and more aggressive in order to complete the merging maneuver as quickly as possible over the elapsed time, even if it has high vehicle crash risk with respect to the through lead and lag vehicles. Copyright © 2018 Elsevier Ltd. All rights reserved.
A Subpath-based Logit Model to Capture the Correlation of Routes
Directory of Open Access Journals (Sweden)
Xinjun Lai
2016-06-01
A subpath-based methodology is proposed to capture travellers’ route choice behaviours and their perceptual correlation of routes, because the original link-based style may not be suitable in application: (1) travellers do not process road network information and construct the chosen route in a link-by-link style; (2) observations from questionnaires and GPS data, however, are not always link-specific. Subpaths are defined as important portions of the route, such as major roads and landmarks. The cross-nested Logit (CNL) structure is used for its tractable closed form and its capability to explicitly capture route correlation. Nests represent subpaths rather than links, so that the number of nests is significantly reduced. Moreover, the proposed method simplifies the original link-based CNL model; therefore, it alleviates the estimation and computation difficulties. The estimation and forecast validation with real data are presented, and the results suggest that the new method is practical.
Application of LogitBoost Classifier for Traceability Using SNP Chip Data.
Kim, Kwondo; Seo, Minseok; Kang, Hyunsung; Cho, Seoae; Kim, Heebal; Seo, Kang-Seok
2015-01-01
Consumer attention to food safety has increased rapidly due to animal-related diseases; therefore, it is important to identify their places of origin (POO) for safety purposes. However, only a few studies have addressed this issue and focused on machine learning-based approaches. In the present study, classification analyses were performed using a customized SNP chip for POO prediction. To accomplish this, 4,122 pigs originating from 104 farms were genotyped using the SNP chip. Several factors were considered to establish the best prediction model based on these data. We also assessed the applicability of the suggested model using a kinship coefficient-filtering approach. Our results showed that the LogitBoost-based prediction model outperformed other classifiers in terms of classification performance under most conditions. Specifically, a greater level of accuracy was observed when a higher kinship-based cutoff was employed. These results demonstrated the applicability of a machine learning-based approach using SNP chip data for practical traceability.
Vazquez, A I; Gianola, D; Bates, D; Weigel, K A; Heringstad, B
2009-02-01
Clinical mastitis is typically coded as presence/absence during some period of exposure, and records are analyzed with linear or binary data models. Because presence includes cows with multiple episodes, there is loss of information when a count is treated as a binary response. The Poisson model is designed for counting random variables, and although it is used extensively in epidemiology of mastitis, it has rarely been used for studying the genetics of mastitis. Many models have been proposed for genetic analysis of mastitis, but they have not been formally compared. The main goal of this study was to compare linear (Gaussian), Bernoulli (with logit link), and Poisson models for the purpose of genetic evaluation of sires for mastitis in dairy cattle. The response variables were clinical mastitis (CM; 0, 1) and number of CM cases (NCM; 0, 1, 2, ..). Data consisted of records on 36,178 first-lactation daughters of 245 Norwegian Red sires distributed over 5,286 herds. Predictive ability of models was assessed via a 3-fold cross-validation using mean squared error of prediction (MSEP) as the end-point. Between-sire variance estimates for NCM were 0.065 in Poisson and 0.007 in the linear model. For CM the between-sire variance was 0.093 in logit and 0.003 in the linear model. The ratio between herd and sire variances for the models with NCM response was 4.6 and 3.5 for Poisson and linear, respectively, and for model for CM was 3.7 in both logit and linear models. The MSEP for all cows was similar. However, within healthy animals, MSEP was 0.085 (Poisson), 0.090 (linear for NCM), 0.053 (logit), and 0.056 (linear for CM). For mastitic animals the MSEP values were 1.206 (Poisson), 1.185 (linear for NCM response), 1.333 (logit), and 1.319 (linear for CM response). The models for count variables had a better performance when predicting diseased animals and also had a similar performance between them. Logit and linear models for CM had better predictive ability for healthy
Directory of Open Access Journals (Sweden)
Xiao-Jun Yu
2014-02-01
The efficiency loss of mixed equilibrium associated with two categories of users is investigated in this paper. The first category of users are altruistic users (AU), who have the same altruism coefficient and try to minimize their own perceived cost, which is assumed to be a linear combination of a selfish component and an altruistic component. The second category of users are Logit-based stochastic users (LSU), who choose the route according to the Logit-based stochastic user equilibrium (SUE) principle. The variational inequality (VI) model is used to formulate the mixed route choice behaviours associated with AU and LSU. The efficiency loss caused by the two categories of users is analytically derived and the relations to some network parameters are discussed. The numerical tests validate our analytical results. Our result takes the results in the existing literature as its special cases.
2018-04-01
Reports an error in "The empathy impulse: A multinomial model of intentional and unintentional empathy for pain" by C. Daryl Cameron, Victoria L. Spring and Andrew R. Todd (Emotion, 2017[Apr], Vol 17[3], 395-411). In this article, there was an error in the calculation of some of the effect sizes. The w effect size was manually computed incorrectly. The incorrect number of total observations was used, which affected the final effect size estimates. This computing error does not change any of the results or interpretations about model fit based on the G² statistic, or about significant differences across conditions in process parameters. Therefore, it does not change any of the hypothesis tests or conclusions. The w statistics for overall model fit should be .02 instead of .04 in Study 1, .01 instead of .02 in Study 2, .01 instead of .03 for the OIT in Study 3 (model fit for the PIT remains the same: .00), and .02 instead of .03 in Study 4. The corrected tables can be seen here: http://osf.io/qebku at the Open Science Framework site for the article. (The following abstract of the original article appeared in record 2017-01641-001.) Empathy for pain is often described as automatic. Here, we used implicit measurement and multinomial modeling to formally quantify unintentional empathy for pain: empathy that occurs despite intentions to the contrary. We developed the pain identification task (PIT), a sequential priming task wherein participants judge the painfulness of target experiences while trying to avoid the influence of prime experiences. Using multinomial modeling, we distinguished 3 component processes underlying PIT performance: empathy toward target stimuli (Intentional Empathy), empathy toward prime stimuli (Unintentional Empathy), and bias to judge target stimuli as painful (Response Bias). In Experiment 1, imposing a fast (vs. slow) response deadline uniquely reduced Intentional Empathy. In Experiment 2, inducing imagine-self (vs. imagine
Kaplan, Sigal; Prato, Carlo Giacomo
2012-01-01
The current study focuses on the propensity of drivers to engage in crash avoidance maneuvers in relation to driver attributes, critical events, crash characteristics, vehicles involved, road characteristics, and environmental conditions. The importance of avoidance maneuvers derives from the key role of proactive and state-aware road users within the concept of sustainable safety systems, as well as from the key role of effective corrective maneuvers in the success of automated in-vehicle warning and driver assistance systems. The analysis is conducted by means of a mixed logit model that represents the selection among 5 emergency lateral and speed control maneuvers (i.e., "no avoidance maneuvers," "braking," "steering," "braking and steering," and "other maneuvers") while accommodating correlations across maneuvers and heteroscedasticity. Data for the analysis were retrieved from the General Estimates System (GES) crash database for the year 2009 by considering drivers for which crash avoidance maneuvers are known. The results show that (1) the nature of the critical event that made the crash imminent greatly influences the choice of crash avoidance maneuvers, (2) women and elderly have a relatively lower propensity to conduct crash avoidance maneuvers, (3) drowsiness and fatigue have a greater negative marginal effect on the tendency to engage in crash avoidance maneuvers than alcohol and drug consumption, (4) difficult road conditions increase the propensity to perform crash avoidance maneuvers, and (5) visual obstruction and artificial illumination decrease the probability to carry out crash avoidance maneuvers. The results emphasize the need for public awareness campaigns to promote safe driving style for senior drivers and warning about the risks of driving under fatigue and distraction being comparable to the risks of driving under the influence of alcohol and drugs. Moreover, the results suggest the need to educate drivers about hazard perception, designing
Differentiating regressed melanoma from regressed lichenoid keratosis.
Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A
2017-04-01
Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Yoo, J.; Kong, K.
2010-12-01
This research presents the findings from a discrete-choice experiment designed to estimate the economic benefits associated with the Anyangcheon watershed improvements in Rep. of Korea. The Anyangcheon watershed has suffered from streamflow depletion and poor stream quality, which often negatively affect instream and near-stream ecologic integrity, as well as water supply. Such distortions in the hydrologic cycle mainly result from the rapid increase of impermeable area due to urbanization, decreases of baseflow runoff due to groundwater pumping, and reduced precipitation inputs driven by climate forcing. In addition, combined sewer overflows and the increase of non-point source pollution from urban regions decrease water quality. The appeal of choice experiments (CE) in economic analysis is that they are based on random utility theory (McFadden, 1974; Ben-Akiva and Lerman, 1985). In contrast to the contingent valuation method (CVM), which asks people to choose between a base case and a specific alternative, CE asks people to choose between cases that are described by attributes. The attributes of this study were selected from hydrologic vulnerability components that represent flood damage possibility, instreamflow depletion, water quality deterioration, form of the watershed and tax. Their levels were divided into three grades, including the status quo. Two grades represented the ideal conditions. These scenarios were constructed from a 3^5 orthogonal main-effects design. This design resulted in twenty-seven choice sets. The design had nine different choice scenarios presented to each respondent. The most popular choice model in use is the conditional logit (CNL). This model provides a closed-form choice probability calculation. The shortcoming of CNL comes from its independence of irrelevant alternatives (IIA) property. In this paper, the mixed logit (ML) is applied to allow the coefficients to vary, capturing random taste heterogeneity in the population. The mixed logit model (with normal distributions for the attributes) fit the
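The closed-form probability of the conditional logit and its IIA property can be illustrated with a small numerical sketch. The alternative names and utility values below are hypothetical and are not taken from the study; the point is only that the ratio of two choice probabilities does not change when a third alternative is removed.

```python
import math

def conditional_logit(utilities):
    """Closed-form choice probabilities: exp(V_j) / sum_k exp(V_k)."""
    m = max(utilities.values())  # subtract the maximum for stability
    exps = {k: math.exp(v - m) for k, v in utilities.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

# Hypothetical systematic utilities for three alternatives.
v = {"status_quo": 0.0, "plan_a": 0.8, "plan_b": 0.3}

full = conditional_logit(v)
reduced = conditional_logit({k: u for k, u in v.items() if k != "plan_b"})

# Under IIA the odds of plan_a versus status_quo are exp(0.8 - 0.0)
# in both choice sets, whether or not plan_b is offered.
ratio_full = full["plan_a"] / full["status_quo"]
ratio_reduced = reduced["plan_a"] / reduced["status_quo"]
print(ratio_full, ratio_reduced)
```

A mixed logit relaxes this restriction because averaging over random coefficients no longer leaves the odds of two alternatives independent of the others.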
Directory of Open Access Journals (Sweden)
Ibsen Chivatá Cárdenas
2008-05-01
This article presents a rainfall model constructed by applying non-parametric modelling and imprecise probabilities; these tools were used because there was not enough homogeneous information in the study area. The area’s hydrological information regarding rainfall was scarce and existing hydrological time series were not uniform. A distributed extended rainfall model was constructed from so-called probability boxes (p-boxes), multinomial probability distribution and confidence intervals (a friendly algorithm was constructed for non-parametric modelling by combining the last two tools). This model confirmed the high level of uncertainty involved in local rainfall modelling. Uncertainty encompassed the whole range (domain) of probability values, thereby showing the severe limitations on information and leading to the conclusion that a detailed estimation of probability would lead to significant error. Nevertheless, relevant information was extracted; it was estimated that the maximum daily rainfall threshold (70 mm) would be surpassed at least once every three years, and the magnitude of uncertainty affecting hydrological parameter estimation was assessed. This paper’s conclusions may be of interest to non-parametric modellers and decision-makers, as such modelling and imprecise probability represent an alternative for hydrological variable assessment and may be an obligatory procedure in the future. Its potential lies in treating scarce information, and it represents a robust modelling strategy for non-seasonal stochastic modelling conditions.
Comparison of 13 Confidence Intervals for the Parameters of the Multinomial Distribution
Directory of Open Access Journals (Sweden)
Difariney González-Gómez
2015-07-01
The multinomial distribution is fundamental for describing phenomena in which k > 2 mutually exclusive events can occur, each with probability π = (π1, π2, . . . , πk). Examples of this distribution include product quality and multiple-choice surveys. A problem of great interest in statistical inference is the construction of confidence intervals for the parameters π. In this work, 13 methodologies for constructing confidence intervals for this distribution are compared through a simulation study. Using as comparison criteria the nominal confidence level, the interval length and a combination of the two, it is found that confidence intervals based on the Central Limit Theorem do not show the best performance. Finally, the methods based on the F distribution (Leemis, 1996) are recommended, followed by the relative likelihood method (Kalbfleish, 1985) and Quesenberry & Hurst (1964).
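As an illustration of one of the methods named above, the sketch below computes the Quesenberry & Hurst (1964) simultaneous intervals for multinomial proportions. It is a minimal example under stated assumptions: the counts are made up, the function name is hypothetical, and the chi-square quantile is supplied by the caller so that only the standard library is needed.

```python
import math

def quesenberry_hurst(counts, chi2_quantile):
    """Simultaneous confidence intervals for multinomial proportions.

    chi2_quantile is the (1 - alpha) quantile of the chi-square
    distribution with k - 1 degrees of freedom, where k = len(counts).
    """
    n = sum(counts)
    a = chi2_quantile
    denom = 2.0 * (n + a)
    intervals = []
    for n_i in counts:
        centre = a + 2.0 * n_i
        half = math.sqrt(a * (a + 4.0 * n_i * (n - n_i) / n))
        intervals.append(((centre - half) / denom, (centre + half) / denom))
    return intervals

counts = [50, 30, 20]  # hypothetical observed counts, k = 3 categories
alpha = 0.05

# With k = 3 the chi-square has 2 degrees of freedom, and its
# (1 - alpha) quantile has the closed form -2 * ln(alpha).
a = -2.0 * math.log(alpha)

intervals = quesenberry_hurst(counts, a)
for n_i, (lo, hi) in zip(counts, intervals):
    print(n_i / sum(counts), lo, hi)
```

For other values of k the quantile has no such closed form and would have to come from a table or a statistics library.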
Zhang, Yongsheng; Wei, Heng; Zheng, Kangning
2017-01-01
Considering that metro network expansion provides more alternative routes, it is attractive to integrate the impacts of the route set and the interdependency among alternative routes on route choice probability into route choice modeling. Therefore, the formulation, estimation and application of a constrained multinomial probit (CMNP) route choice model in the metro network are carried out in this paper. The utility function is formulated as three components: the compensatory component is a function of influencing factors; the non-compensatory component measures the impacts of the route set on utility; and the error component follows a multivariate normal distribution whose covariance is structured into three parts, representing the correlation among routes, the transfer variance of the route, and the unobserved variance, respectively. Considering the multidimensional integrals of the multivariate normal probability density function, the CMNP model is rewritten in hierarchical Bayes form, and a Markov chain Monte Carlo approach based on the Metropolis-Hastings (M-H) sampling algorithm is constructed to estimate all parameters. Based on Guangzhou Metro data, reliable estimation results are obtained. Furthermore, the proposed CMNP model also shows good forecasting performance for the calculation of route choice probabilities and good application performance for transfer flow volume prediction. PMID:28591188
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Rodriguez Santana, Idaira; Chalkley, Martin
2017-08-11
To analyse how training doctors' demographic and socioeconomic characteristics vary according to the specialty that they are training for. Descriptive statistics and mixed logistic regression analysis of cross-sectional survey data to quantify evidence of systematic relationships between doctors' characteristics and their specialty. Doctors in training in the United Kingdom in 2013. 27 530 doctors in training but not in their foundation year who responded to the National Training Survey 2013. Mixed logit regression estimates and the corresponding odds ratios (calculated separately for all doctors in training and a subsample comprising those educated in the UK), relating gender, age, ethnicity, place of studies, socioeconomic background and parental education to the probability of training for a particular specialty. Being female and being white British increase the chances of being in general practice with respect to any other specialty, while coming from a better-off socioeconomic background and having parents with tertiary education have the opposite effect. Mixed results are found for age and place of studies. For example, the difference between men and women is greatest for surgical specialties, for which a man is 12.121 times more likely to be training for a surgical specialty (relative to general practice) than a woman (p-value<0.01). There are systematic and substantial differences between specialties in respect of training doctors' gender, ethnicity, age and socioeconomic background. The persistent underrepresentation in some specialties of women, minority ethnic groups and of those coming from disadvantaged backgrounds will impact on the representativeness of the profession into the future. Further research is needed to understand how the processes of selection and the self-selection of applicants into specialties give rise to these observed differences. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article
The Effect of Task Duration on Event-Based Prospective Memory: A Multinomial Modeling Approach
Directory of Open Access Journals (Sweden)
Hongxia Zhang
2017-11-01
Remembering to perform an action when a specific event occurs is referred to as Event-Based Prospective Memory (EBPM). This study investigated how EBPM performance is affected by task duration by having university students (n = 223) perform an EBPM task that was embedded within an ongoing computer-based color-matching task. For this experiment, we separated the overall task’s duration into the filler task duration and the ongoing task duration. The filler task duration is the length of time between the intention and the beginning of the ongoing task, and the ongoing task duration is the length of time between the beginning of the ongoing task and the appearance of the first Prospective Memory (PM) cue. The filler task duration and ongoing task duration were further divided into three levels: 3, 6, and 9 min. Two factors were then orthogonally manipulated between-subjects using a multinomial processing tree model to separate the effects of different task durations on the two EBPM components. A mediation model was then created to verify whether task duration influences EBPM via self-reminding or discrimination. The results reveal three points. (1) Lengthening the duration of ongoing tasks had a negative effect on EBPM performance while lengthening the duration of the filler task had no significant effect on it. (2) As the filler task was lengthened, both the prospective and retrospective components show a decreasing and then increasing trend. Also, when the ongoing task duration was lengthened, the prospective component decreased while the retrospective component significantly increased. (3) The mediating effect of discrimination between the task duration and EBPM performance was significant. We concluded that different task durations influence EBPM performance through different components with discrimination being the mediator between task duration and EBPM performance.
Mollenhauer, Robert; Brewer, Shannon K.
2017-01-01
Failure to account for variable detection across survey conditions constrains progressive stream ecology and can lead to erroneous stream fish management and conservation decisions. In addition to variable detection’s confounding long-term stream fish population trends, reliable abundance estimates across a wide range of survey conditions are fundamental to establishing species–environment relationships. Despite major advancements in accounting for variable detection when surveying animal populations, these approaches remain largely ignored by stream fish scientists, and CPUE remains the most common metric used by researchers and managers. One notable advancement for addressing the challenges of variable detection is the multinomial N-mixture model. Multinomial N-mixture models use a flexible hierarchical framework to model the detection process across sites as a function of covariates; they also accommodate common fisheries survey methods, such as removal and capture–recapture. Effective monitoring of stream-dwelling Smallmouth Bass Micropterus dolomieu populations has long been challenging; therefore, our objective was to examine the use of multinomial N-mixture models to improve the applicability of electrofishing for estimating absolute abundance. We sampled Smallmouth Bass populations by using tow-barge electrofishing across a range of environmental conditions in streams of the Ozark Highlands ecoregion. Using an information-theoretic approach, we identified effort, water clarity, wetted channel width, and water depth as covariates that were related to variable Smallmouth Bass electrofishing detection. Smallmouth Bass abundance estimates derived from our top model consistently agreed with baseline estimates obtained via snorkel surveys. Additionally, confidence intervals from the multinomial N-mixture models were consistently more precise than those of unbiased Petersen capture–recapture estimates due to the dependency among data sets in the
Fakherpour, Atousa; Ghaem, Haleh; Fattahi, Zeinabsadat; Zaree, Samaneh
2018-01-01
Although spinal anaesthesia (SA) is nowadays the preferred anaesthesia technique for caesarean section (CS), it is associated with considerable haemodynamic effects, such as maternal hypotension. This study aimed to evaluate a wide range of variables (related to parturient and anaesthesia techniques) associated with the incidence of different degrees of SA-induced hypotension during elective CS. This prospective study was conducted on 511 mother-infant pairs, in which the mother underwent elective CS under SA. The data were collected through preset proforma containing three parts related to the parturient, anaesthetic techniques and a table for recording maternal blood pressure. It was hypothesized that some maternal (such as age) and anaesthesia-related risk factors (such as block height) were associated with occurrence of SA-induced hypotension during elective CS. The incidence of mild, moderate and severe hypotension was 20%, 35% and 40%, respectively. Eventually, ten risk factors were found to be associated with hypotension, including age >35 years, body mass index ≥25 kg/m², 11-20 kg weight gain, gravidity ≥4, history of hypotension, baseline systolic blood pressure (SBP) 100 beats/min in maternal modelling, fluid preloading ≥1000 ml, adding sufentanil to bupivacaine and sensory block height >T4 in anaesthesia-related modelling (P < 0.05). Age, body mass index, weight gain, gravidity, history of hypotension, baseline SBP and heart rate, fluid preloading, adding sufentanil to bupivacaine and sensory block height were the main risk factors identified in the study for SA-induced hypotension during CS.
Directory of Open Access Journals (Sweden)
Atousa Fakherpour
2018-01-01
Background and Aims: Although spinal anaesthesia (SA) is nowadays the preferred anaesthesia technique for caesarean section (CS), it is associated with considerable haemodynamic effects, such as maternal hypotension. This study aimed to evaluate a wide range of variables (related to parturient and anaesthesia techniques) associated with the incidence of different degrees of SA-induced hypotension during elective CS. Methods: This prospective study was conducted on 511 mother–infant pairs, in which the mother underwent elective CS under SA. The data were collected through preset proforma containing three parts related to the parturient, anaesthetic techniques and a table for recording maternal blood pressure. It was hypothesized that some maternal (such as age) and anaesthesia-related risk factors (such as block height) were associated with occurrence of SA-induced hypotension during elective CS. Results: The incidence of mild, moderate and severe hypotension was 20%, 35% and 40%, respectively. Eventually, ten risk factors were found to be associated with hypotension, including age >35 years, body mass index ≥25 kg/m², 11–20 kg weight gain, gravidity ≥4, history of hypotension, baseline systolic blood pressure (SBP) 100 beats/min in maternal modelling, fluid preloading ≥1000 ml, adding sufentanil to bupivacaine and sensory block height >T4 in anaesthesia-related modelling (P < 0.05). Conclusion: Age, body mass index, weight gain, gravidity, history of hypotension, baseline SBP and heart rate, fluid preloading, adding sufentanil to bupivacaine and sensory block height were the main risk factors identified in the study for SA-induced hypotension during CS.
Energy Technology Data Exchange (ETDEWEB)
Cardil Forradellas, A.; Molina Terrén, D.M.; Oliveres, J.; Castellnou, M.
2016-07-01
Aim of study: In this study we compare the accuracy of three bivariate distributions: Johnson’s SBB, Weibull-2P and LL-2P functions for characterizing the joint distribution of tree diameters and heights. Area of study: North-West of Spain. Material and methods: Diameter and height measurements of 128 plots of pure and even-aged Tasmanian blue gum (Eucalyptus globulus Labill.) stands located in the North-west of Spain were considered in the present study. The SBB bivariate distribution was obtained from SB marginal distributions using a Normal Copula based on a four-parameter logistic transformation. The Plackett Copula was used to obtain the bivariate models from the Weibull and Logit-logistic univariate marginal distributions. The negative logarithm of the maximum likelihood function was used to compare the results and the Wilcoxon signed-rank test was used to compare the related samples of these logarithms calculated for each sample plot and each distribution. Main results: The best results were obtained by using the Plackett copula and the best marginal distribution was the Logit-logistic. Research highlights: The copulas used in this study have shown a good performance for modeling the joint distribution of tree diameters and heights. They could be easily extended for modelling multivariate distributions involving other tree variables, such as tree volume or biomass. (Author)
International Nuclear Information System (INIS)
Byun, Hyunsuk; Lee, Chul-Yong
2017-01-01
Generally, consumers use electricity without considering the source the electricity was generated from. Since different energy sources exert varying effects on society, it is necessary to analyze consumers’ latent preference for electricity generation sources. The present study estimates Korean consumers’ marginal utility and derives an appropriate generation mix using the hierarchical Bayesian logit model in a discrete choice experiment. The results show that consumers consider the danger posed by the source of electricity as the most important factor among the effects of electricity generation sources. Additionally, Korean consumers wish to reduce the contribution of nuclear power from the existing 32% to 11%, and increase that of renewable energy from the existing 4% to 32%. - Highlights: • We derive an electricity mix reflecting Korean consumers’ latent preferences. • We use the discrete choice experiment and hierarchical Bayesian logit model. • The danger posed by the generation source is the most important attribute. • The consumers wish to increase the renewable energy proportion from 4.3% to 32.8%. • Korea's cost-oriented energy supply policy and consumers’ preference differ markedly.
Very short-term probabilistic forecasting of wind power with generalized logit-Normal distributions
DEFF Research Database (Denmark)
Pinson, Pierre
2012-01-01
Very-short-term probabilistic forecasts, which are essential for an optimal management of wind generation, ought to account for the non-linear and double-bounded nature of that stochastic process. They take here the form of discrete–continuous mixtures of generalized logit–normal distributions and probability masses at the bounds. Both auto-regressive and conditional parametric auto-regressive models are considered for the dynamics of their location and scale parameters. Estimation is performed in a recursive least squares framework with exponential forgetting. The superiority of this proposal over...
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
Directory of Open Access Journals (Sweden)
Ifechukwude Obiamaka Okwechime
Individuals with pre-diabetes and diabetes have increased risks of developing macro-vascular complications including heart disease and stroke, which are the leading causes of death globally. The objective of this study was to estimate the prevalence of pre-diabetes and diabetes, and to investigate their predictors among adults ≥18 years in Florida. Data covering the time period January-December 2013 were obtained from Florida's Behavioral Risk Factor Surveillance System (BRFSS). Survey design of the study was declared using the SVYSET statement of STATA 13.1. Descriptive analyses were performed to estimate the prevalence of pre-diabetes and diabetes. Predictors of pre-diabetes and diabetes were investigated using a multinomial logistic regression model. Model goodness-of-fit was evaluated using both the multinomial goodness-of-fit test proposed by Fagerland, Hosmer, and Bofin and the Hosmer-Lemeshow goodness-of-fit test. There were approximately 2,983 (7.3%) and 5,189 (12.1%) adults in Florida diagnosed with pre-diabetes and diabetes, respectively. Over half of the study respondents were white, married and over the age of 45 years, while 36.4% reported being physically inactive, overweight (36.4%) or obese (26.4%), hypertensive (34.6%), hypercholesterolemic (40.3%), and 26% were arthritic. Based on the final multivariable multinomial model, only being overweight (Relative Risk Ratio [RRR] = 1.85, 95% Confidence Interval [95% CI] = 1.41, 2.42), obese (RRR = 3.41, 95% CI = 2.61, 4.45), hypertensive (RRR = 1.69, 95% CI = 1.33, 2.15), hypercholesterolemic (RRR = 1.94, 95% CI = 1.55, 2.43), and arthritic (RRR = 1.24, 95% CI = 1.00, 1.55) had significant associations with pre-diabetes. However, more predictors had significant associations with diabetes and the strengths of associations tended to be higher than for the association with pre-diabetes. For instance, the relative risk ratios for the association between diabetes and being overweight (RRR = 2.00, 95
A Smooth Transition Logit Model of the Effects of Deregulation in the Electricity Market
DEFF Research Database (Denmark)
Hurn, A.S.; Silvennoinen, Annastiina; Teräsvirta, Timo
We consider a nonlinear vector model called the logistic vector smooth transition autoregressive model. The bivariate single-transition vector smooth transition regression model of Camacho (2004) is generalised to a multivariate and multitransition one. A modelling strategy consisting of specification, including testing linearity, estimation and evaluation of these models is constructed. Nonlinear least squares estimation of the parameters of the model is discussed. Evaluation by misspecification tests is carried out using tests derived in a companion paper. The use of the modelling strategy...
Gregory, T.; Sewando, P.
2013-01-01
Adoption of technology is an important factor in economic development. The thrust of this study was to establish factors affecting adoption of QPM technology in Northern zone of Tanzania. Primary data was collected from a random sample of 120 smallholder maize farmers in four villages. Data collected were analysed using descriptive and quantitative methods. Logit model was used to determine factors that influence adoption of QPM technology. The regression results indicated that education of t...
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
Directory of Open Access Journals (Sweden)
Tolga Kaya
2010-11-01
The purpose of this study is to compare the performances of Artificial Neural Networks (ANN) and Multinomial Probit (MNP) approaches in modeling the choice decision within the fast moving consumer goods sector. To do this, based on 2597 toothpaste purchases of a panel sample of 404 households, choice models are built and their performances are compared on the 861 purchases of a test sample of 135 households. Results show that ANN's predictions are better while MNP is useful in providing marketing insight.
Directory of Open Access Journals (Sweden)
ALI AHMED MOHAMMED
2013-06-01
A study was carried out to examine students' perceptions and preferences in choosing a mode of transportation for travel on a university campus. The study focused on offering private transport users alternative road transport options as a countermeasure aimed at shifting car users to other modes of transportation. In total, 456 questionnaires were administered to model transportation mode preferences. A logit model and SPSS were then used to identify the factors that affect the choice of transportation mode. Results indicated that reducing travel time by 70% would reduce the number of private car users by 84%, while reducing travel cost was found to greatly improve the utilization of public modes. The study shows that positive incentives are needed to shift travellers from private to public modes. These incentives, reductions in travel time and travel cost, improve services and thereby contribute to sustainability.
Chen, Feng; Chen, Suren; Ma, Xiaoxiang
2018-06-01
Driving environment, including road surface conditions and traffic states, often changes over time and influences crash probability considerably. It becomes stretched for traditional crash frequency models developed in large temporal scales to capture the time-varying characteristics of these factors, which may cause substantial loss of critical driving environmental information on crash prediction. Crash prediction models with refined temporal data (hourly records) are developed to characterize the time-varying nature of these contributing factors. Unbalanced panel data mixed logit models are developed to analyze hourly crash likelihood of highway segments. The refined temporal driving environmental data, including road surface and traffic condition, obtained from the Road Weather Information System (RWIS), are incorporated into the models. Model estimation results indicate that the traffic speed, traffic volume, curvature and chemically wet road surface indicator are better modeled as random parameters. The estimation results of the mixed logit models based on unbalanced panel data show that there are a number of factors related to crash likelihood on I-25. Specifically, weekend indicator, November indicator, low speed limit and long remaining service life of rutting indicator are found to increase crash likelihood, while 5-am indicator and number of merging ramps per lane per mile are found to decrease crash likelihood. The study underscores and confirms the unique and significant impacts on crash imposed by the real-time weather, road surface, and traffic conditions. With the unbalanced panel data structure, the rich information from real-time driving environmental big data can be well incorporated. Copyright © 2018 National Safety Council and Elsevier Ltd. All rights reserved.
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based...
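The principle behind quantile regression can be illustrated in its simplest form, without covariates: a sample quantile is the value that minimises the asymmetric absolute ("check" or pinball) loss. The sketch below is not taken from the article; the function names and the data are hypothetical, and the search is restricted to observed values for simplicity.

```python
def check_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau."""
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def sample_quantile_by_check_loss(y, tau):
    """Return the observed value that minimises the total check loss."""
    return min(sorted(y), key=lambda q: sum(check_loss(v - q, tau) for v in y))


# Hypothetical data with one large outlier.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
median = sample_quantile_by_check_loss(y, 0.5)
upper = sample_quantile_by_check_loss(y, 0.9)
```

With covariates, the same loss is minimised over the coefficients of a linear predictor instead of over a single constant, which is why different quantile levels can yield different partial effects.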
Handayani, Dewi; Cahyaning Putri, Hera; Mahmudah, AMH
2017-12-01
The Solo-Ngawi toll road project is part of the Trans Java toll road mega project initiated by the government and is still under construction. PT Solo Ngawi Jaya (SNJ), the Solo-Ngawi toll management company, needs to determine a toll fare that is in accordance with its business plan. Appropriate toll rates will affect regional economic sustainability and reduce traffic congestion. This policy instrument is crucial for achieving environmentally sustainable transport. Therefore, the objective of this research is to determine the toll fare sensitivity of the Solo-Ngawi toll road based on Willingness To Pay (WTP). Primary data were obtained by distributing stated preference questionnaires to users of four-wheeled vehicles on the Kartasura-Palang Joglo artery road segment. The data were then analysed with logit and probit models. The analysis shows that, under the same travel conditions, the effect of a fare change on WTP is more sensitive in the binomial logit model than in the probit model. The range of tariff change against values of WTP in the binomial logit model is 20% greater than the range of values in the probit model. On the other hand, the probabilities obtained from the binomial logit model and the binary probit model show no significant difference (less than 1%).
Construction of risk prediction model of type 2 diabetes mellitus based on logistic regression
Directory of Open Access Journals (Sweden)
Li Jian
2017-01-01
Objective: To construct a multi-factor prediction model for the individual risk of T2DM, and to explore new ideas for early warning, prevention and personalized health services for T2DM. Methods: Logistic regression techniques were used to screen the risk factors for T2DM and to construct the risk prediction model of T2DM. Results: The logistic regression equation of the risk prediction model for males was logit(P) = BMI × 0.735 + vegetables × (−0.671) + age × 0.838 + diastolic pressure × 0.296 + physical activity × (−2.287) + sleep × (−0.009) + smoking × 0.214; for females it was logit(P) = BMI × 1.979 + vegetables × (−0.292) + age × 1.355 + diastolic pressure × 0.522 + physical activity × (−2.287) + sleep × (−0.010). For males, the area under the ROC curve was 0.83, the sensitivity 0.72 and the specificity 0.86; for females, the area under the ROC curve was 0.84, the sensitivity 0.75 and the specificity 0.90. Conclusion: The data for this model come from a nested case-control study; the risk prediction model was built with well-established logistic regression techniques and shows high predictive sensitivity, specificity and stability.
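To show how such an equation turns into a predicted risk, the following sketch applies the inverse logit to the male equation. The coefficients are the ones reported in the abstract; everything else is an assumption made for illustration: the abstract gives no intercept (so none is used) and does not state how the predictors are coded, so the dictionary keys and the 0/1 coding below are hypothetical.

```python
import math

# Coefficients reported in the abstract for the male model (no intercept reported).
MALE_COEFS = {
    "bmi": 0.735,
    "vegetables": -0.671,
    "age": 0.838,
    "diastolic_pressure": 0.296,
    "physical_activity": -2.287,
    "sleep": -0.009,
    "smoking": 0.214,
}


def inv_logit(z):
    """Map a logit value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_probability(coded_values, coefs=MALE_COEFS):
    """P = inv_logit(sum of coefficient * coded predictor value).

    Predictors missing from coded_values are treated as 0 (assumed reference level).
    """
    z = sum(coefs[name] * coded_values.get(name, 0.0) for name in coefs)
    return inv_logit(z)


# Hypothetical coding: all predictors at the reference level vs. physically active only.
baseline = predicted_probability({})
active = predicted_probability({"physical_activity": 1})
odds_ratio_activity = (active / (1 - active)) / (baseline / (1 - baseline))
```

Under this assumed coding, the odds ratio for physical activity equals exp(−2.287), which is how each coefficient in the equation would be read.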
Introduction to regression graphics
Cook, R Dennis
2009-01-01
Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts "...an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models...highly recommend[ed]...for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
Analysis of academic performance using a logit model
Directory of Open Access Journals (Sweden)
María del Carmen Ibarra
2010-07-01
This work analyses the determining factors that influence students' performance at university. The research covers five engineering student cohorts (1999-2003) from the Facultad de Ingeniería of the Universidad Nacional de Misiones (UNaM), comprising 589 students. Academic performance is defined as the average number of subjects passed per year. By means of the multivariate logistic regression technique, we determine the impact of different personal, socioeconomic and academic factors. The main conclusion is that students' performance is related to the grade point average (GPA) at high school, the kind of high school (public or private) the students attended, and the number of subjects passed in their first year at university. The latter is the most important factor, which emphasizes the importance of this first stage at university for the student's later academic performance.
Directory of Open Access Journals (Sweden)
César Rubicundo
2016-08-01
In Venezuela there have been more than 30 mergers since the approval of the Banking Act in 1999: from 103 institutions, the financial system closed the year 2013 with 35 brokerage firms, which represents a decrease of 66% due to 20 coalitions, 30 transformations and 18 settlements. Therefore, an analysis is proposed of the current economic situation of the financial system in the context of mergers and interventions, considering internal and external factors according to the constituted capital. The study was based on information from 37 private capital institutions and 4 public institutions, between January 2009 and December 2013. The preliminary analysis showed that privately held banks have a 78.40% chance of not incurring situations of fragility, while the State capital banks have an 83.30% chance of surviving in the market. As for the estimated logit models, it was found that the liquidity ratio, ROE, management index and inflation are components that push towards a fragile situation for privately held banks, with a probability forecast of fragility. With regard to the state capital banks, this situation is explained by a 62.50% equity index, ROE, and inflation. A probability of stability for these banks is expected. The joint model forecasted a probability of a stable financial system for the coming months.
Understanding logistic regression analysis.
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
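The link between logistic regression coefficients and odds ratios can be checked on the smallest possible case. The sketch below is not from the article: it fits a logistic model with an intercept and one binary exposure by Newton-Raphson on a hypothetical 2×2 table and compares exp(coefficient) with the cross-product odds ratio. With several explanatory variables the same exponentiation gives adjusted odds ratios.

```python
import math


def fit_logistic_2x2(events_unexposed, n_unexposed, events_exposed, n_exposed,
                     iterations=25):
    """Newton-Raphson fit of logit(p) = b0 + b1 * exposed on grouped counts."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p0 = 1.0 / (1.0 + math.exp(-b0))          # fitted risk, unexposed
        p1 = 1.0 / (1.0 + math.exp(-(b0 + b1)))   # fitted risk, exposed
        w0 = n_unexposed * p0 * (1.0 - p0)        # binomial variance weights
        w1 = n_exposed * p1 * (1.0 - p1)
        # Score vector (gradient of the log-likelihood).
        g0 = (events_unexposed - n_unexposed * p0) + (events_exposed - n_exposed * p1)
        g1 = events_exposed - n_exposed * p1
        # Solve the 2x2 system [[w0 + w1, w1], [w1, w1]] * step = [g0, g1].
        det = (w0 + w1) * w1 - w1 * w1
        b0 += (w1 * g0 - w1 * g1) / det
        b1 += (-w1 * g0 + (w0 + w1) * g1) / det
    return b0, b1


# Hypothetical table: 30 events among 100 exposed, 10 events among 100 unexposed.
b0, b1 = fit_logistic_2x2(10, 100, 30, 100)
odds_ratio_from_model = math.exp(b1)
odds_ratio_from_table = (30 * 90) / (70 * 10)
```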
Weisberg, Sanford
2013-01-01
Praise for the Third Edition "...this is an excellent book which could easily be used as a course text..."-International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
Hosmer, David W; Sturdivant, Rodney X
2013-01-01
A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
Directory of Open Access Journals (Sweden)
Ahmed Yangui
2014-07-01
Methods that account for preference heterogeneity have received a significant amount of attention in recent literature. Most of them have focused on preference heterogeneity around the mean of the random parameters, which has been specified as a function of socio-demographic characteristics. This paper aims at analyzing consumers’ preferences towards extra-virgin olive oil in Catalonia using a methodological framework with two novelties over past studies: (1) it accounts for preference heterogeneity around both the mean and the variance; and (2) it considers both socio-demographic characteristics of consumers and their attitudinal factors. Estimated coefficients and moments of willingness to pay (WTP) distributions are compared with those obtained from alternative Random Parameter Logit (RPL) models. Results suggest that the proposed framework increases the goodness-of-fit and provides more useful insights for policy analysis. The most important attributes affecting consumers’ preferences towards extra virgin olive oil are the price and the product’s origin. The consumers perceive the organic olive oil attribute negatively, as they think that it is not worth paying a premium for a product that is healthy in nature.
Understanding poisson regression.
Hayat, Matthew J; Higgins, Melinda
2014-04-01
Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.
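One common way to screen for the assumption violation mentioned here is the Pearson dispersion statistic, which should be near 1 when the Poisson variance assumption holds. The sketch below is not from the article and uses the simplest case, an intercept-only model whose fitted mean is the sample mean; the function name and the counts are hypothetical.

```python
def pearson_dispersion(counts):
    """Pearson chi-square divided by residual degrees of freedom.

    Intercept-only Poisson model: the fitted mean for every observation is the
    sample mean, and one parameter is estimated, so the degrees of freedom are n - 1.
    """
    n = len(counts)
    mean = sum(counts) / n
    chi_square = sum((y - mean) ** 2 / mean for y in counts)
    return chi_square / (n - 1)


# Hypothetical count data with the same mean (4) but very different spread.
tight_counts = [3, 5, 4, 6, 2, 4]
clumped_counts = [0, 0, 1, 0, 12, 11]

phi_tight = pearson_dispersion(tight_counts)
phi_clumped = pearson_dispersion(clumped_counts)
```

A value far above 1, as for the clumped counts, is the situation in which an overdispersion parameter or negative binomial regression would be considered.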
Directory of Open Access Journals (Sweden)
Mok Tik
2014-06-01
This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable) is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables) and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables) also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to represent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
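A minimal sketch of the idea of representing 2-D vectors as complex numbers and estimating vector-valued coefficients by least squares is given below. It is not the authors' procedure: it assumes a model with one explanatory vector that is linear in complex arithmetic, w = a + b·z, and solves the least-squares problem directly with complex numbers rather than through the real-valued transformation described in the abstract. All names and data are hypothetical.

```python
def fit_complex_line(z, w):
    """Least-squares estimates of complex a, b in w = a + b * z.

    Minimises the sum of squared moduli |w_i - a - b * z_i|^2.
    """
    n = len(z)
    z_bar = sum(z) / n
    w_bar = sum(w) / n
    sxx = sum(abs(zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar).conjugate() * (wi - w_bar) for zi, wi in zip(z, w))
    b = sxy / sxx
    a = w_bar - b * z_bar
    return a, b


# Hypothetical explanatory vectors (x + iy) and noise-free responses.
z = [1 + 0j, 0 + 1j, -1 + 2j, 2 - 1j]
true_a, true_b = 1 + 2j, 0.5 - 1j
w = [true_a + true_b * zi for zi in z]

a_hat, b_hat = fit_complex_line(z, w)
```

The estimated coefficient b_hat is itself a vector quantity: its modulus scales the explanatory vector and its argument rotates it, which is what distinguishes this setup from a regression with scalar coefficients.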
Multicollinearity and Regression Analysis
Daoud, Jamal I.
2017-12-01
In regression analysis a correlation between the response and the predictor(s) is expected, but correlation among the predictors is undesirable. The number of predictors included in the regression model depends on many factors, among them historical data and experience. In the end, the selection of the most important predictors is a judgment made by the researcher. Multicollinearity is a phenomenon in which two or more predictors are correlated; if this happens, the standard errors of the coefficients will increase [8]. Increased standard errors mean that the coefficients for some or all independent variables may be found not to be significantly different from zero. In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on multicollinearity, its causes, and its consequences for the reliability of the regression model.
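The inflation of standard errors can be quantified with the variance inflation factor (VIF). The sketch below is not from the paper; it covers only the two-predictor case, where the VIF of either predictor is 1 / (1 − r²) with r the correlation between the two predictors, and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors: one uncorrelated with x1, one nearly collinear with it.
x1 = [1.0, 2.0, 3.0, 4.0]
x_uncorrelated = [1.0, -1.0, -1.0, 1.0]
x_collinear = [1.1, 1.9, 3.2, 3.9]

vif_low = vif_two_predictors(x1, x_uncorrelated)
vif_high = vif_two_predictors(x1, x_collinear)
```

The standard error of a coefficient is multiplied by the square root of its VIF relative to the uncorrelated case, so a large VIF can render a genuinely relevant predictor statistically insignificant.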
DEFF Research Database (Denmark)
Bache, Stefan Holst
A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is, however, a different and therefore new estimator. It allows for both linear- and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question.
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface for predicting the covariate specific absolute risks, their confidence intervals, and their confidence bands based on right censored time to event data. We provide explicit formulas for our implementation of the estimator of the (stratified) baseline hazard function in the presence of tied event times. As a by... functionals. The software presented here is implemented in the riskRegression package.
Chen, Chen; Anderson, Jason C; Wang, Haizhong; Wang, Yinhai; Vogt, Rachel; Hernandez, Salvador
2017-11-01
Transportation agencies need efficient methods to determine how to reduce bicycle accidents while promoting cycling activities and prioritizing safety improvement investments. Many studies have used standalone methods, such as level of traffic stress (LTS) and bicycle level of service (BLOS), to better understand bicycle mode share and network connectivity for a region. However, in most cases, other studies rely on crash severity models to explain what variables contribute to the severity of bicycle related crashes. This research uniquely correlates bicycle LTS with reported bicycle crash locations for four cities in New Hampshire through geospatial mapping. LTS measurements and crash locations are compared visually using a GIS framework. Next, a bicycle injury severity model, that incorporates LTS measurements, is created through a mixed logit modeling framework. Results of the visual analysis show some geospatial correlation between higher LTS roads and "Injury" type bicycle crashes. It was determined, statistically, that LTS has an effect on the severity level of bicycle crashes and high LTS can have varying effects on severity outcome. However, it is recommended that further analyses be conducted to better understand the statistical significance and effect of LTS on injury severity. As such, this research will validate the use of LTS as a proxy for safety risk regardless of the recorded bicycle crash history. This research will help identify the clustering patterns of bicycle crashes on high-risk corridors and, therefore, assist with bicycle route planning and policy making. This paper also suggests low-cost countermeasures or treatments that can be implemented to address high-risk areas. Specifically, with the goal of providing safer routes for cyclists, such countermeasures or treatments have the potential to substantially reduce the number of fatalities and severe injuries. Published by Elsevier Ltd.
Higgs, Megan D.; Link, William; White, Gary C.; Haroldson, Mark A.; Bjornlie, Daniel D.
2013-01-01
Mark-resight designs for estimation of population abundance are common and attractive to researchers. However, inference from such designs is very limited when faced with sparse data, either from a low number of marked animals, a low probability of detection, or both. In the Greater Yellowstone Ecosystem, yearly mark-resight data are collected for female grizzly bears with cubs-of-the-year (FCOY), and inference suffers from both limitations. To overcome difficulties due to sparseness, we assume homogeneity in sighting probabilities over 16 years of bi-annual aerial surveys. We model counts of marked and unmarked animals as multinomial random variables, using the capture frequencies of marked animals for inference about the latent multinomial frequencies for unmarked animals. We discuss undesirable behavior of the commonly used discrete uniform prior distribution on the population size parameter and provide OpenBUGS code for fitting such models. The application provides valuable insights into subtleties of implementing Bayesian inference for latent multinomial models. We tie the discussion to our application, though the insights are broadly useful for applications of the latent multinomial model.
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Bayesian logistic regression analysis
Van Erp, H.R.N.; Van Gelder, P.H.A.J.M.
2012-01-01
In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuisance parameters, the Jacobian transformation is an
Seber, George A F
2012-01-01
Concise, mathematically clear, and comprehensive treatment of the subject. Expanded coverage of diagnostics and methods of model fitting. Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models. More than 200 problems throughout the book plus outline solutions for the exercises. This revision has been extensively class-tested.
Ritz, Christian; Parmigiani, Giovanni
2009-01-01
R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. This book provides a coherent treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology.
Bayesian ARTMAP for regression.
Sasu, L M; Andonie, R
2013-10-01
Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. Copyright © 2013 Elsevier Ltd. All rights reserved.
Bounded Gaussian process regression
DEFF Research Database (Denmark)
Jensen, Bjørn Sand; Nielsen, Jens Brehm; Larsen, Jan
2013-01-01
We extend the Gaussian process (GP) framework for bounded regression by introducing two bounded likelihood functions that model the noise on the dependent variable explicitly. This is fundamentally different from the implicit noise assumption in the previously suggested warped GP framework. We...... with the proposed explicit noise-model extension....
Mechanisms of neuroblastoma regression
Brodeur, Garrett M.; Bagatell, Rochelle
2014-01-01
Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
Khorramdel, Lale; von Davier, Matthias
2014-01-01
This study shows how to address the problem of trait-unrelated response styles (RS) in rating scales using multidimensional item response theory. The aim is to test and correct data for RS in order to provide fair assessments of personality. Expanding on an approach presented by Böckenholt (2012), observed rating data are decomposed into multiple response processes based on a multinomial processing tree. The data come from a questionnaire consisting of 50 items of the International Personality Item Pool measuring the Big Five dimensions administered to 2,026 U.S. students with a 5-point rating scale. It is shown that this approach can be used to test if RS exist in the data and that RS can be differentiated from trait-related responses. Although the extreme RS appear to be unidimensional after exclusion of only 1 item, a unidimensional measure for the midpoint RS is obtained only after exclusion of 10 items. Both RS measurements show high cross-scale correlations and item response theory-based (marginal) reliabilities. Cultural differences could be found in giving extreme responses. Moreover, it is shown how to score rating data to correct for RS after being proved to exist in the data.
Para Krizleri Öngörüsünde Logit Model ve Sinyal Yaklaşımının Değeri: Türkiye Tecrübesi
Kaya, Vedat; Yilmaz, Omer
2007-01-01
The logit model and the signal approach are two analysis methods commonly used to forecast and explain currency crises. The logit model is successful at identifying the explanatory variables of a crisis and at calculating the probability of a crisis, particularly during a period in which a crisis is experienced. The signal approach, on the other hand, aims to detect a possible currency crisis in advance by following variables that show unusual changes over periods of economic fluctuation, and thus it pr...
Ridge Regression Signal Processing
Kuhl, Mark R.
1990-01-01
The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
Subset selection in regression
Miller, Alan
2002-01-01
Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition: a separate chapter on Bayesian methods; complete revision of the chapter on estimation; a major example from the field of near infrared spectroscopy; more emphasis on cross-validation; greater focus on bootstrapping; stochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible; software available on the Internet for implementing many of the algorithms presented; more examples. Subset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...
Better Autologistic Regression
Directory of Open Access Journals (Sweden)
Mark A. Wolters
2017-11-01
Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine) to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding, that is, the two numbers used to represent the two possible states of the variables, might differ. Common coding choices are (zero, one) and (minus one, plus one). Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
Regression in organizational leadership.
Kernberg, O F
1979-02-01
The choice of good leaders is a major task for all organizations. Information regarding the prospective administrator's personality should complement questions regarding his previous experience, his general conceptual skills, his technical knowledge, and the specific skills in the area for which he is being selected. The growing psychoanalytic knowledge about the crucial importance of internal, in contrast to external, object relations, and about the mutual relationships of regression in individuals and in groups, constitutes an important practical tool for the selection of leaders.
Classification and regression trees
Breiman, Leo; Olshen, Richard A; Stone, Charles J
1984-01-01
The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
Hilbe, Joseph M
2009-01-01
This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect: great clarity. The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … .-Annette J. Dobson, Biometric...
Steganalysis using logistic regression
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-the-art 686-dimensional SPAM feature set, in three image sets.
SEPARATION PHENOMENA LOGISTIC REGRESSION
Directory of Open Access Journals (Sweden)
Ikaro Daniel de Carvalho Barreto
2014-03-01
This paper applies concepts from maximum likelihood estimation of the binomial logistic regression model to the separation phenomenon. Separation generates bias in the estimation, leads to different interpretations of the estimates under the different statistical tests (Wald, Likelihood Ratio and Score), and yields different estimates under the different iterative methods (Newton-Raphson and Fisher scoring). The paper also presents an example that demonstrates the direct implications for the validation of the model and of its variables, and the implications for estimates of odds ratios and confidence intervals generated from the Wald statistics. Furthermore, we briefly present the Firth correction to circumvent the separation phenomenon.
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface......-product we obtain fast access to the baseline hazards (compared to survival::basehaz()) and predictions of survival probabilities, their confidence intervals and confidence bands. Confidence intervals and confidence bands are based on point-wise asymptotic expansions of the corresponding statistical...
Adaptive metric kernel regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
2000-01-01
Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Adaptive Metric Kernel Regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
1998-01-01
Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...
Energy Technology Data Exchange (ETDEWEB)
Otake, M [Hiroshima Univ. (Japan). Faculty of Science
1976-12-01
Various statistical models designed to determine the effects of radiation dose on mortality of atomic bomb survivors in Hiroshima and Nagasaki from specific cancers were evaluated on the basis of a basic k(age) x c(dose) x 2 contingency table. From the aspects of application and fits of different models, analysis based on the additive logit model was applied to the mortality experience of this population during the 22-year period from 1 Oct. 1950 to 31 Dec. 1972. The advantages and disadvantages of the additive logit model were demonstrated. Leukemia mortality showed a sharp rise with an increase in dose. The dose response relationship suggests a possible curvature or a log linear model, particularly if doses estimated to be more than 600 rad were set arbitrarily at 600 rad, since the average dose in the 200+ rad group would then change from 434 to 350 rad. In the 22-year period from 1950 to 1972, a high mortality risk due to radiation was observed in survivors with doses of 200 rad and over for all cancers except leukemia. On the other hand, during the latest period from 1965 to 1972 a significant risk was noted also for stomach and breast cancers. Survivors who were 9 years old or less at the time of the bomb and who were exposed to high doses of 200+ rad appeared to show a high mortality risk for all cancers except leukemia, although the number of observed deaths is as yet small. A number of interesting areas are discussed from the statistical and epidemiological standpoints, i.e., the numerical comparison of risks in various models, the general evaluation of cancer mortality by the additive logit model, the dose response relationship, the relative risk in the high dose group, the time period of radiation induced cancer mortality, the difference of dose response between Hiroshima and Nagasaki and the relative biological effectiveness of neutrons.
Luis Gabriel Márquez Díaz
2013-01-01
Abstract: The study analyzes the difference in the willingness to pay of students and workers to reduce travel time, in a transport mode choice context for the city of Tunja (Colombia). A mixed logit model was used, calibrated with data from a stated preference survey. The model specification assumed random variation of the coefficients for access time, waiting time and travel time. It was found that the willingness to p...
DEFF Research Database (Denmark)
Hansen, Henrik; Tarp, Finn
2001-01-01
This paper examines the relationship between foreign aid and growth in real GDP per capita as it emerges from simple augmentations of popular cross country growth specifications. It is shown that aid in all likelihood increases the growth rate, and this result is not conditional on ‘good’ policy....... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regressions are used for policy purposes.
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study examines widely used indicators of the predictive power of the Generalized Linear Model (GLM), which often have some restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This measure of predictive power is defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)], where the dependent variable follows a Poisson distribution. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables and in the presence of multicollinearity among the independent variables. The results show that the proposed regression correlation coefficient performs better than the traditional one in terms of bias and root mean square error (RMSE).
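As a rough illustration of the traditional regression correlation coefficient described above, the following sketch fits a log-link Poisson regression and then correlates Y with the fitted E(Y|X). It does not reproduce the authors' modified coefficient, whose definition is not given in the abstract. The simulated data, the function names and the plain Newton (IRLS) fitting routine are all assumptions made for the example.

```python
import numpy as np

def fit_poisson_irls(X, y, n_iter=50, tol=1e-10):
    """Fit a log-link Poisson regression by Newton/IRLS steps (illustrative helper)."""
    design = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    beta = np.zeros(design.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)               # start from the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(design @ beta)                  # current fitted means
        # Newton step: solve (X' W X) step = X' (y - mu), with W = diag(mu)
        step = np.linalg.solve(design.T @ (mu[:, None] * design),
                               design.T @ (y - mu))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta, np.exp(design @ beta)

def regression_correlation(y, mu_hat):
    """Pearson correlation between Y and the fitted E(Y|X)."""
    return np.corrcoef(y, mu_hat)[0, 1]

# Hypothetical simulated data: two predictors, Poisson-distributed response.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = rng.poisson(np.exp(0.5 + 0.8 * X[:, 0] - 0.4 * X[:, 1]))

beta_hat, mu_hat = fit_poisson_irls(X, y)
r = regression_correlation(y, mu_hat)
print("coefficients:", beta_hat)
print("regression correlation coefficient:", r)
```

Values of r closer to 1 indicate that the fitted means track the observed counts more closely.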
Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun
2016-07-01
In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Relating cost-benefit analysis results with transport project decisions in the Netherlands
Annema, Jan Anne; Frenken, Koen; Koopmans, Carl; Kroesen, Maarten
2017-01-01
This paper relates the cost-benefit analysis (CBA) results of transportation policy proposals in the Netherlands with the decision to implement or abandon the proposal. The aim of this study is to explore the relation between the CBA results and decision-making. Multinomial logit regression models
Relating cost-benefit analysis results with transport project decisions in the Netherlands
Annema, J.A.; Frenken, Koen; Koopmans, Carl; Kroesen, M.
2017-01-01
This paper relates the cost-benefit analysis (CBA) results of transportation policy proposals in the Netherlands with the decision to implement or abandon the proposal. The aim of this study is to explore the relation between the CBA results and decision-making. Multinomial logit regression
Farmers' choice of cattle marketing channels under transaction cost ...
African Journals Online (AJOL)
The theoretical predictions of transaction cost economics were tested based on primary data collected from 230 cattle farm households in 13 communities of the Okhahlamba Local Municipality. The results of a multinomial logit regression revealed some unique insights. They showed that the probability of selling at auction ...
An Analysis of Losses to the Southern Commercial Timberland Base
Ian A. Munn; David Cleaves
1998-01-01
Demographic and physical factors influencing the conversion of commercial timberland in the South to non-forestry uses between the last two Forest Inventory Analysis (FIA) surveys were investigated. GIS techniques linked Census data and FIA plot level data. Multinomial logit regression identified factors associated with losses to the timberland base. Conversion to...
Polynomial regression analysis and significance test of the regression function
International Nuclear Information System (INIS)
Gao Zhengming; Zhao Juan; He Shengping
2012-01-01
To analyze the decay heating power per kilogram of a certain radioactive isotope with the polynomial regression method, the paper first demonstrates the broad applicability of polynomial functions and derives their parameters by ordinary least squares estimation. A significance test for the polynomial regression function is then derived by exploiting the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and the significance test of the polynomial function are applied to the decay heating power of the isotope per kilogram, in accordance with the authors' practical work. (authors)
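The idea of treating a polynomial fit as a multivariable linear regression can be sketched as follows: the powers of x become the regressors, the coefficients come from ordinary least squares, and the overall significance is judged with the usual regression F statistic. This is a minimal sketch on made-up data, not the authors' procedure or measurements; the function name and the quadratic test signal are assumptions.

```python
import numpy as np

def polyfit_f_test(x, y, degree):
    """OLS polynomial fit plus the overall F statistic of the regression."""
    # Design matrix with columns 1, x, x^2, ..., x^degree
    design = np.vander(x, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((y - fitted) ** 2)           # residual sum of squares
    ss_reg = np.sum((fitted - y.mean()) ** 2)    # regression sum of squares
    df_reg, df_res = degree, len(y) - degree - 1
    f_stat = (ss_reg / df_reg) / (ss_res / df_res)
    return coef, f_stat, (df_reg, df_res)

# Hypothetical data: a noisy quadratic, standing in for real measurements.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 40)
y = 5.0 - 1.2 * x + 0.3 * x**2 + rng.normal(scale=0.5, size=x.size)

coef, f_stat, dof = polyfit_f_test(x, y, degree=2)
print("coefficients (constant first):", coef)
print("F statistic:", f_stat, "with degrees of freedom", dof)
```

The F statistic is compared with the F distribution on the reported degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_stat, *dof)` gives the corresponding p-value.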
Recursive Algorithm For Linear Regression
Varanasi, S. V.
1988-01-01
Order of model determined easily. Linear-regression algorithm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactorily.
Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo
2015-05-12
To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a Logistic regression model of noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary Logistic analysis. Multivariate Logistic regression was used for further analysis of significant noninvasive independent variables. Logistic regression model was established and odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of model were evaluated by the curve of receiver operating characteristic (ROC). According to univariate Logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major occurring factors for PHG. The established regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4. The accuracy of model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predicted Logistic regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4.
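To show how a fitted logit equation of this form turns into a predicted probability, the sketch below applies the inverse logit, P = 1 / (1 + exp(-Logit P)), to the coefficients quoted in the abstract. The abstract does not state how X1 to X4 are coded or scaled, so the 0/1 inputs used here are purely hypothetical and the printed probabilities are illustrative only.

```python
import math

# Coefficients as quoted in the abstract: Logit P = -2.667 + 2.186*X1 - 2.167*X2 + 0.725*X3 + 0.976*X4
INTERCEPT = -2.667
COEFS = (2.186, -2.167, 0.725, 0.976)

def predicted_probability(x1, x2, x3, x4):
    """Inverse-logit transform of the linear predictor."""
    logit = INTERCEPT + sum(b * x for b, x in zip(COEFS, (x1, x2, x3, x4)))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical inputs: the coding of X1..X4 is an assumption, not taken from the paper.
p_all_zero = predicted_probability(0, 0, 0, 0)
p_all_one = predicted_probability(1, 1, 1, 1)
print(f"all predictors 0: P = {p_all_zero:.3f}")
print(f"all predictors 1: P = {p_all_one:.3f}")
```

A classification rule would then compare P with a chosen cut-off; the abstract does not report the cut-off behind its accuracy, sensitivity and specificity figures.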
Combining Alphas via Bounded Regression
Directory of Open Access Journals (Sweden)
Zura Kakushadze
2015-11-01
We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted) regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.
Regression in autistic spectrum disorders.
Stefanatos, Gerry A
2008-12-01
A significant proportion of children diagnosed with Autistic Spectrum Disorder experience a developmental regression characterized by a loss of previously-acquired skills. This may involve a loss of speech or social responsivity, but often entails both. This paper critically reviews the phenomenon of regression in autistic spectrum disorders, highlighting the characteristics of regression, age of onset, temporal course, and long-term outcome. Important considerations for diagnosis are discussed and multiple etiological factors currently hypothesized to underlie the phenomenon are reviewed. It is argued that regressive autistic spectrum disorders can be conceptualized on a spectrum with other regressive disorders that may share common pathophysiological features. The implications of this viewpoint are discussed.
Linear regression in astronomy. I
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
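The five slope estimators named above can all be written in terms of the centred sums of squares Sxx, Syy and the cross product Sxy. The sketch below uses the standard textbook forms of these slopes; it omits the uncertainty formulas that the paper provides, and the simulated data and function name are assumptions for illustration.

```python
import numpy as np

def five_slopes(x, y):
    """Slopes of five symmetric and asymmetric linear fits to bivariate data."""
    xc, yc = x - x.mean(), y - y.mean()
    sxx, syy, sxy = np.sum(xc**2), np.sum(yc**2), np.sum(xc * yc)
    b1 = sxy / sxx                         # OLS regression of Y on X
    b2 = syy / sxy                         # OLS regression of X on Y, expressed as dy/dx
    # Line bisecting the angle between the two OLS lines
    b3 = (b1 * b2 - 1.0 + np.sqrt((1.0 + b1**2) * (1.0 + b2**2))) / (b1 + b2)
    # Orthogonal regression (minimises perpendicular distances)
    d = b2 - 1.0 / b1
    b4 = 0.5 * (d + np.sign(sxy) * np.sqrt(4.0 + d**2))
    # Reduced major-axis: geometric mean of the two OLS slopes
    b5 = np.sign(sxy) * np.sqrt(b1 * b2)
    return {"ols_y_on_x": b1, "ols_x_on_y": b2, "bisector": b3,
            "orthogonal": b4, "reduced_major_axis": b5}

# Hypothetical data with scatter in both variables.
rng = np.random.default_rng(2)
x_true = rng.normal(size=200)
x = x_true + rng.normal(scale=0.3, size=200)
y = 1.0 + 2.0 * x_true + rng.normal(scale=0.6, size=200)

slopes = five_slopes(x, y)
for name, slope in slopes.items():
    # Every fitted line passes through the centroid, which fixes the intercept.
    print(f"{name}: slope = {slope:.3f}, intercept = {y.mean() - slope * x.mean():.3f}")
```

For positively correlated data the two OLS slopes bracket the other three, which is one way to see why the choice of method matters when neither variable is clearly the dependent one.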
No rationale for 1 variable per 10 events criterion for binary logistic regression analysis
Directory of Open Access Journals (Sweden)
Maarten van Smeden
2016-11-01
Background: Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. Methods: The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth’s correction, are compared. Results: The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect (‘separation’). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth’s correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. Conclusions: The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.
van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B
2016-11-24
Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
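To make the two central terms concrete, the sketch below computes EPV and checks for the simplest form of separation. Both helpers are hypothetical illustrations rather than the authors' simulation code. EPV is taken here, as an assumption, to be the size of the smaller outcome group divided by the number of candidate regression parameters (intercept excluded). The separation check only looks at one covariate at a time and would miss separation by a combination of covariates.

```python
def events_per_variable(outcomes, n_parameters):
    """EPV under the assumed definition: smaller outcome group / parameters."""
    events = sum(1 for o in outcomes if o == 1)
    non_events = len(outcomes) - events
    return min(events, non_events) / n_parameters


def is_completely_separated(x, outcomes):
    """True if one cutpoint on a single covariate x splits the 0s from the 1s."""
    x0 = [xi for xi, o in zip(x, outcomes) if o == 0]
    x1 = [xi for xi, o in zip(x, outcomes) if o == 1]
    if not x0 or not x1:
        return False  # only one outcome class present, nothing to separate
    return max(x0) < min(x1) or max(x1) < min(x0)


if __name__ == "__main__":
    y = [0, 0, 0, 1, 1, 0, 1, 0, 0, 0]  # hypothetical outcomes, 3 events
    print(events_per_variable(y, n_parameters=2))
    print(is_completely_separated([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1]))
```

When the check returns true, the maximum likelihood estimate for that covariate does not exist as a finite number, which is the situation the abstract says Firth's correction can alleviate.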
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
Directory of Open Access Journals (Sweden)
Guanghao Sun
2016-11-01
Background and Objectives: Heart rate variability (HRV) has been intensively studied as a promising biological marker of major depressive disorder (MDD). Our previous study confirmed that autonomic activity and reactivity in depression revealed by HRV during rest and mental task (MT) conditions can be used as diagnostic measures and in clinical evaluation. In this study, logistic regression analysis (LRA) was utilized for the classification and prediction of MDD based on HRV data obtained in an MT paradigm. Methods: Power spectral analysis of HRV on R-R intervals before, during, and after an MT (random number generation) was performed in 44 drug-naïve patients with MDD and 47 healthy control subjects at the Department of Psychiatry in Shizuoka Saiseikai General Hospital. Logit scores of LRA determined by HRV indices and heart rates discriminated patients with MDD from healthy subjects. The high frequency (HF) component of HRV and the ratio of the low frequency (LF) component to the HF component (LF/HF) correspond to parasympathetic and sympathovagal balance, respectively. Results: The LRA achieved a sensitivity and specificity of 80.0% and 79.0%, respectively, at an optimum cutoff logit score (0.28). Misclassifications occurred only when the logit score was close to the cutoff score. Logit scores also correlated significantly with subjective self-rating depression scale scores (p < 0.05). Conclusion: HRV indices recorded during a mental task may be an objective tool for screening patients with MDD in psychiatric practice. The proposed method appears promising for not only objective and rapid MDD screening, but also evaluation of its severity.
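The reported sensitivity and specificity come from thresholding each subject's logit score at a cutoff. A minimal sketch of that calculation follows. The scores below are invented for illustration, and the convention that a score at or above the cutoff is classified as MDD is an assumption, since the abstract does not state the direction.

```python
def sensitivity_specificity(logit_scores, is_patient, cutoff):
    """Classify score >= cutoff as positive and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, patient in zip(logit_scores, is_patient):
        predicted_positive = score >= cutoff
        if patient and predicted_positive:
            tp += 1
        elif patient:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical logit scores: three patients followed by three controls
    scores = [1.2, 0.5, 0.1, -0.3, 0.4, -1.0]
    labels = [True, True, True, False, False, False]
    print(sensitivity_specificity(scores, labels, cutoff=0.28))
```

In this toy data the two misclassified subjects are the ones whose scores lie nearest the cutoff, which mirrors the pattern the abstract describes.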
Generazio, Edward R.
2014-01-01
Unknown risks are introduced into failure critical systems when probability of detection (POD) capabilities are accepted without a complete understanding of the statistical method applied and the interpretation of the statistical results. The presence of this risk in the nondestructive evaluation (NDE) community is revealed in common statements about POD. These statements are often interpreted in a variety of ways and therefore, the very existence of the statements identifies the need for a more comprehensive understanding of POD methodologies. Statistical methodologies have data requirements to be met, procedures to be followed, and requirements for validation or demonstration of adequacy of the POD estimates. Risks are further enhanced due to the wide range of statistical methodologies used for determining the POD capability. Receiver/Relative Operating Characteristics (ROC) Display, simple binomial, logistic regression, and Bayes' rule POD methodologies are widely used in determining POD capability. This work focuses on Hit-Miss data to reveal the framework of the interrelationships between Receiver/Relative Operating Characteristics Display, simple binomial, logistic regression, and Bayes' Rule methodologies as they are applied to POD. Knowledge of these interrelationships leads to an intuitive and global understanding of the statistical data, procedural and validation requirements for establishing credible POD estimates.
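Of the methodologies listed, the simple binomial approach is the easiest to make concrete for hit-miss data. The sketch below computes a one-sided lower confidence bound on POD from the number of hits in a number of trials, using the exact (Clopper-Pearson) binomial bound found by bisection. It is an illustration under that stated assumption, not the procedure of the report, and the function names and trial counts are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k + 1))


def pod_lower_bound(hits, trials, confidence=0.95):
    """One-sided exact lower confidence bound on the probability of detection."""
    if hits == 0:
        return 0.0
    alpha = 1.0 - confidence
    lo, hi = 0.0, 1.0
    for _ in range(100):  # bisection on p; P(X >= hits | p) increases with p
        mid = 0.5 * (lo + hi)
        upper_tail = 1.0 - binom_cdf(hits - 1, trials, mid)
        if upper_tail < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (28, 29):
        print(n, round(pod_lower_bound(n, n), 4))
```

With every trial a hit, the bound reduces to alpha ** (1 / trials), so the number of flawless trials needed to demonstrate a given POD at a given confidence follows directly.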
Linear regression in astronomy. II
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Time-adaptive quantile regression
DEFF Research Database (Denmark)
Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik
2008-01-01
An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered.
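The quantity that the linear program minimises is the sum of check (pinball) losses of the residuals. The sketch below shows that objective for the simplest possible model, a constant, and finds the minimiser by brute force over the data points rather than by the simplex method. It is an illustration of the loss only, not the time-adaptive algorithm, and the function names are hypothetical.

```python
def check_loss(residual, tau):
    """Pinball loss: tau * r for r >= 0 and (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant_quantile(y, tau):
    """Minimise the summed check loss over constants.

    A minimiser is always attained at one of the observations, so it is
    enough to try each data point in turn.
    """
    return min(sorted(y),
               key=lambda q: sum(check_loss(yi - q, tau) for yi in y))


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9, 2]  # hypothetical observations
    print(best_constant_quantile(data, 0.5))
    print(best_constant_quantile(data, 0.9))
```

Replacing the constant with a linear predictor and splitting each residual into its positive and negative parts turns the same objective into a linear program, which is what makes the simplex algorithm applicable.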
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of Quantile Regression models. This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensive description of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and
Nobre, Aline Araújo; Carvalho, Marilia Sá; Griep, Rosane Härter; Fonseca, Maria de Jesus Mendes da; Melo, Enirtes Caetano Prates; Santos, Itamar de Souza; Chor, Dora
2017-08-17
To compare two methodological approaches: the multinomial model and the zero-inflated gamma model, evaluating the factors associated with the practice and amount of time spent on leisure time physical activity. Data collected from 14,823 baseline participants in the Longitudinal Study of Adult Health (ELSA-Brasil - Estudo Longitudinal de Saúde do Adulto) have been analysed. Regular leisure time physical activity has been measured using the leisure time physical activity module of the International Physical Activity Questionnaire. The explanatory variables considered were gender, age, education level, and annual per capita family income. The main advantage of the zero-inflated gamma model over the multinomial model is that it estimates mean time (minutes per week) spent on leisure time physical activity. For example, on average, men spent 28 minutes/week longer on leisure time physical activity than women did. The most sedentary groups were young women with low education level and income. The zero-inflated gamma model, which is rarely used in epidemiological studies, can give more appropriate answers in several situations. In our case, we have obtained important information on the main determinants of the duration of leisure time physical activity. This information can help guide efforts towards the most vulnerable groups since physical inactivity is associated with different diseases and even premature death.
Directory of Open Access Journals (Sweden)
Aline Araújo Nobre
2017-08-01
OBJECTIVE: To compare two methodological approaches: the multinomial model and the zero-inflated gamma model, evaluating the factors associated with the practice and amount of time spent on leisure time physical activity. METHODS: Data collected from 14,823 baseline participants in the Longitudinal Study of Adult Health (ELSA-Brasil – Estudo Longitudinal de Saúde do Adulto) have been analysed. Regular leisure time physical activity has been measured using the leisure time physical activity module of the International Physical Activity Questionnaire. The explanatory variables considered were gender, age, education level, and annual per capita family income. RESULTS: The main advantage of the zero-inflated gamma model over the multinomial model is that it estimates mean time (minutes per week) spent on leisure time physical activity. For example, on average, men spent 28 minutes/week longer on leisure time physical activity than women did. The most sedentary groups were young women with low education level and income. CONCLUSIONS: The zero-inflated gamma model, which is rarely used in epidemiological studies, can give more appropriate answers in several situations. In our case, we have obtained important information on the main determinants of the duration of leisure time physical activity. This information can help guide efforts towards the most vulnerable groups since physical inactivity is associated with different diseases and even premature death.
Panel Smooth Transition Regression Models
DEFF Research Database (Denmark)
González, Andrés; Terasvirta, Timo; Dijk, Dick van
We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13] (H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337, doi: 10.1214/aos/1018031100)
Testing discontinuities in nonparametric regression
Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun
2017-01-01
In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13] (H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337, doi: 10.1214/aos/1018031100)
Logistic Regression: Concept and Application
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
International Nuclear Information System (INIS)
Leng Ling; Zhang Tianyi; Kleinman, Lawrence; Zhu Wei
2007-01-01
Regression analysis, especially the ordinary least squares method which assumes that errors are confined to the dependent variable, has seen a fair share of its applications in aerosol science. The ordinary least squares approach, however, could be problematic due to the fact that atmospheric data often does not lend itself to calling one variable independent and the other dependent. Errors often exist for both measurements. In this work, we examine two regression approaches available to accommodate this situation. They are orthogonal regression and geometric mean regression. Comparisons are made theoretically as well as numerically through an aerosol study examining whether the ratio of organic aerosol to CO would change with age
Tumor regression patterns in retinoblastoma
International Nuclear Information System (INIS)
Zafar, S.N.; Siddique, S.N.; Zaheer, N.
2016-01-01
To observe the types of tumor regression after treatment, and identify the common pattern of regression in our patients. Study Design: Descriptive study. Place and Duration of Study: Department of Pediatric Ophthalmology and Strabismus, Al-Shifa Trust Eye Hospital, Rawalpindi, Pakistan, from October 2011 to October 2014. Methodology: Children with unilateral and bilateral retinoblastoma were included in the study. Patients were referred to Pakistan Institute of Medical Sciences, Islamabad, for chemotherapy. After every cycle of chemotherapy, dilated fundus examination under anesthesia was performed to record the response to treatment. Regression patterns were recorded on RetCam II. Results: Seventy-four tumors were included in the study. Out of 74 tumors, 3 were ICRB group A tumors, 43 were ICRB group B tumors, 14 tumors belonged to ICRB group C, and the remaining 14 were ICRB group D tumors. Type IV regression was seen in 39.1% (n=29) tumors, type II in 29.7% (n=22), type III in 25.6% (n=19), and type I in 5.4% (n=4). All group A tumors (100%) showed type IV regression. Seventeen (39.5%) group B tumors showed type IV regression. In group C, 5 tumors (35.7%) showed type II regression and 5 tumors (35.7%) showed type IV regression. In group D, 6 tumors (42.9%) regressed to type II non-calcified remnants. Conclusion: The response and success of the focal and systemic treatment, as judged by the appearance of different patterns of tumor regression, varies with the ICRB grouping of the tumor. (author)
Regression to Causality : Regression-style presentation influences causal attribution
DEFF Research Database (Denmark)
Bordacconi, Mats Joe; Larsen, Martin Vinæs
2014-01-01
... models, one of the primary vehicles for analyzing statistical results in political science, encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results more likely. Our experiment drew on a sample of 235 university students from three different social science degree programs (political science, sociology and economics), all of whom had received substantial training in statistics. The subjects were asked to compare and evaluate the validity of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression ...
Regression analysis with categorized regression calibrated exposure: some interesting findings
Directory of Open Access Journals (Sweden)
Hjartåker Anette
2006-07-01
Background: Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile) scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods: We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC). Results: In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion: Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
Logic regression and its extensions.
Schwender, Holger; Ruczinski, Ingo
2010-01-01
Logic regression is an adaptive classification and regression procedure, initially developed to reveal interacting single nucleotide polymorphisms (SNPs) in genetic association studies. In general, this approach can be used in any setting with binary predictors, when the interaction of these covariates is of primary interest. Logic regression searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome variable, and thus, reveals variables and interactions that are associated with the response and/or have predictive capabilities. The logic expressions are embedded in a generalized linear regression framework, and thus, logic regression can handle a variety of outcome types, such as binary responses in case-control studies, numeric responses, and time-to-event data. In this chapter, we provide an introduction to the logic regression methodology, list some applications in public health and medicine, and summarize some of the direct extensions and modifications of logic regression that have been proposed in the literature. Copyright © 2010 Elsevier Inc. All rights reserved.
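The core idea, searching for a Boolean combination of binary predictors that best explains the outcome, can be shown with a deliberately tiny search space. The sketch below tries every two-variable AND/OR rule with optional negations and keeps the one with the fewest misclassifications. This is a brute-force toy, not the logic regression procedure itself, which searches over larger logic expressions and embeds them in a generalized linear model. The function name and data are hypothetical.

```python
from itertools import combinations, product


def best_pair_rule(X, y):
    """Exhaustively search two-variable Boolean rules for a binary outcome.

    X is a list of rows of 0/1 predictors and y a list of 0/1 outcomes.
    Returns (errors, i, negate_i, operator, j, negate_j) for the first rule
    that attains the minimum number of misclassified rows.
    """
    n_vars = len(X[0])
    best = None
    for i, j in combinations(range(n_vars), 2):
        for neg_i, neg_j, op in product((False, True), (False, True),
                                        ("and", "or")):
            errors = 0
            for row, target in zip(X, y):
                a = bool(row[i]) != neg_i  # apply optional negation
                b = bool(row[j]) != neg_j
                pred = (a and b) if op == "and" else (a or b)
                errors += int(pred != bool(target))
            if best is None or errors < best[0]:
                best = (errors, i, neg_i, op, j, neg_j)
    return best


if __name__ == "__main__":
    # All eight combinations of three binary predictors; the outcome is
    # generated as x0 AND (NOT x2)
    rows = [list(bits) for bits in product((0, 1), repeat=3)]
    outcome = [int(r[0] == 1 and r[2] == 0) for r in rows]
    print(best_pair_rule(rows, outcome))
```

Exhaustive enumeration is only feasible for very small rule families, which is one reason a guided search is needed in realistic settings with many SNPs.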
Abstract Expression Grammar Symbolic Regression
Korns, Michael F.
This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, which is faster, and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, aged layered populations, plus discrete and continuous differential evolution are used to produce an improved symbolic regression system. Nine base test cases, from the literature, are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial strength symbolic regression systems.
Quantile Regression With Measurement Error
Wei, Ying; Carroll, Raymond J.
2009-01-01
... The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a
From Rasch scores to regression
DEFF Research Database (Denmark)
Christensen, Karl Bang
2006-01-01
Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population, a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale properties. This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score.
Testing Heteroscedasticity in Robust Regression
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2011-01-01
Vol. 1, No. 4 (2011), pp. 25-28. ISSN 2045-3345. Grant (others): GA ČR (CZ) GA402/09/0557. Institutional research plan: CEZ:AV0Z10300504. Keywords: robust regression; heteroscedasticity; regression quantiles; diagnostics. Subject RIV: BB - Applied Statistics, Operational Research. http://www.researchjournals.co.uk/documents/Vol4/06%20Kalina.pdf
Regression methods for medical research
Tai, Bee Choo
2013-01-01
Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demand an understanding of more complex and sophisticated analytic procedures. The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
Forecasting with Dynamic Regression Models
Pankratz, Alan
2012-01-01
One of the most widely used tools in statistical forecasting, the single-equation regression model, is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the autocorrelation patterns of regression disturbance. It also includes six case studies.
DEFF Research Database (Denmark)
Tvedebrink, Torben; Eriksen, Poul Svante; Morling, Niels
2015-01-01
In this paper, we discuss the construction of a multivariate generalisation of the Dirichlet-multinomial distribution. An example from forensic genetics in the statistical analysis of DNA mixtures motivates the study of this multivariate extension. In forensic genetics, adjustment of the match probabilities due to remote ancestry in the population is often done using the so-called θ-correction. This correction increases the probability of observing multiple copies of rare alleles in a subpopulation and thereby reduces the weight of the evidence for rare genotypes. A recent publication by Cowell et al. (2015) showed elegantly how to use Bayesian networks for efficient computations of likelihood ratios in a forensic genetic context. However, their underlying population genetic model assumed independence of alleles, which is not realistic in real populations. We demonstrate how the so-called θ...
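The effect of the θ-correction on rare alleles can be seen in a small calculation. The sketch below assumes that the correction takes the form of the widely used sampling formula in which, after m copies of an allele have been seen among n sampled alleles, the probability that the next allele is of the same type is (mθ + (1 − θ)p) / (1 + (n − 1)θ). Whether the paper uses exactly this form is an assumption, and the function names, allele frequency and θ value are illustrative.

```python
def allele_probability(p, m, n, theta):
    """Probability that the next sampled allele is of a given type.

    p     : population frequency of the allele
    m     : copies of this allele among the n alleles already sampled
    theta : coancestry coefficient (theta = 0 gives independent alleles)
    """
    return (m * theta + (1.0 - theta) * p) / (1.0 + (n - 1) * theta)


def homozygote_probability(p, theta):
    """Probability that two alleles sampled in sequence are both this type."""
    first = allele_probability(p, 0, 0, theta)   # reduces to p
    second = allele_probability(p, 1, 1, theta)  # theta + (1 - theta) * p
    return first * second


if __name__ == "__main__":
    rare = 0.01  # hypothetical rare allele frequency
    print(homozygote_probability(rare, 0.0))
    print(homozygote_probability(rare, 0.03))
```

With θ = 0 the homozygote probability is p squared. A positive θ raises it, which lowers the weight of evidence for rare genotypes, as the abstract describes.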
Directory of Open Access Journals (Sweden)
Luis Gabriel Marquez Diaz
2013-01-01
Abstract: The study analyzes the difference between students and workers in willingness to pay for reduced travel time, in a transport mode choice context for the city of Tunja (Colombia). A mixed logit model was used, calibrated with data from a stated preference survey. The model specification assumed random variation in the coefficients of access time, waiting time and travel time. The willingness to pay for reduced travel time was found to be 38.14 $/min for students, 23% higher for lower-income workers and 73% higher for higher-income workers. The value of waiting time was found to be 1.95 times that of travel time, while access time stands in a ratio of 1 to 2.57 with respect to travel time, a ratio considered valid only for the context studied.
Flexible link functions in nonparametric binary regression with Gaussian process priors.
Li, Dan; Wang, Xia; Lin, Lizhen; Dey, Dipak K
2016-09-01
In many scientific fields, it is a common practice to collect a sequence of 0-1 binary responses from a subject across time, space, or a collection of covariates. Researchers are interested in finding out how the expected binary outcome is related to covariates, and aim at better prediction in the future 0-1 outcomes. Gaussian processes have been widely used to model nonlinear systems; in particular to model the latent structure in a binary regression model allowing nonlinear functional relationship between covariates and the expectation of binary outcomes. A critical issue in modeling binary response data is the appropriate choice of link functions. Commonly adopted link functions such as probit or logit links have fixed skewness and lack the flexibility to allow the data to determine the degree of the skewness. To address this limitation, we propose a flexible binary regression model which combines a generalized extreme value link function with a Gaussian process prior on the latent structure. Bayesian computation is employed in model estimation. Posterior consistency of the resulting posterior distribution is demonstrated. The flexibility and gains of the proposed model are illustrated through detailed simulation studies and two real data examples. Empirical results show that the proposed model outperforms a set of alternative models, which only have either a Gaussian process prior on the latent regression function or a Dirichlet prior on the link function. © 2015, The International Biometric Society.
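The point about fixed versus flexible skewness can be illustrated by comparing the logistic response curve with a generalized extreme value (GEV) one. The sketch below assumes one common parameterization of the GEV link, p = 1 − exp(−(1 − ξη)^(−1/ξ)) with the bracket truncated at zero, where η is the latent predictor and ξ the shape parameter. Whether the paper uses exactly this form is an assumption, the Gaussian process prior is not modelled here, and the function names are hypothetical.

```python
import math


def gev_link_probability(eta, xi):
    """Success probability under the assumed GEV link with shape xi."""
    if abs(xi) < 1e-12:
        # Limit xi -> 0: the complementary log-log response curve
        return 1.0 - math.exp(-math.exp(eta))
    base = 1.0 - xi * eta
    if base <= 0.0:
        # Outside the support of the GEV distribution
        return 1.0 if xi > 0 else 0.0
    return 1.0 - math.exp(-base ** (-1.0 / xi))


def logistic_probability(eta):
    """Success probability under the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


if __name__ == "__main__":
    for xi in (-0.3, 0.0, 0.3):
        skew = gev_link_probability(1.0, xi) + gev_link_probability(-1.0, xi)
        print(xi, round(skew, 4))  # would equal 1 for a symmetric link
    print(logistic_probability(1.0) + logistic_probability(-1.0))
```

For the logit link the two probabilities at +η and −η always sum to one. Under the GEV link the sum depends on ξ, which is the extra degree of freedom that lets the data determine the skewness.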
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Directory of Open Access Journals (Sweden)
Maria José Silva
2013-01-01
Full Text Available This research examined the relative importance of the internal and external determinants of firms' innovative capacity among Portuguese service companies. Based on the literature, a conceptual model was built and several research hypotheses were formulated and tested empirically, using secondary data provided by the Observatory for Science and Higher Education (OCES) from the 4th Community Innovation Survey (CIS 4), supervised by EUROSTAT. The method used was logistic regression. According to the results, the greater the financial investment in internal research and development activities, in the acquisition of external knowledge, and in marketing activities, the greater the propensity of firms to innovate in services. The results allow a joint analysis of the factors that promote and restrict the innovative capacity of service companies, identifying the main determinants and improving knowledge about innovation in services. The contribution of the results lies in identifying which factors are really important in stimulating innovation in service firms.
Producing The New Regressive Left
DEFF Research Database (Denmark)
Crone, Christine
members, this thesis investigates a growing political trend and ideological discourse in the Arab world that I have called The New Regressive Left. On the premise that a media outlet can function as a forum for ideology production, the thesis argues that an analysis of this material can help to trace...... the contexture of The New Regressive Left. If the first part of the thesis lays out the theoretical approach and draws the contextual framework, through an exploration of the surrounding Arab media-and ideoscapes, the second part is an analytical investigation of the discourse that permeates the programmes aired...... becomes clear from the analytical chapters is the emergence of the new cross-ideological alliance of The New Regressive Left. This emerging coalition between Shia Muslims, religious minorities, parts of the Arab Left, secular cultural producers, and the remnants of the political,strategic resistance...
A Matlab program for stepwise regression
Directory of Open Access Journals (Sweden)
Yanhong Qi
2016-03-01
Full Text Available Stepwise linear regression is a multivariable regression procedure for identifying the statistically significant variables in a linear regression equation. In the present study, we present a Matlab program for stepwise regression.
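As a rough illustration of the idea (not the authors' Matlab program), forward stepwise selection can be sketched in Python with NumPy. The names `rss` and `forward_stepwise`, the threshold `min_improvement`, and the use of a relative reduction in the residual sum of squares in place of a formal F-test are all assumptions made for this sketch.

```python
import numpy as np

def rss(design, y):
    """Residual sum of squares of an OLS fit of y on the given design matrix."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)

def forward_stepwise(X, y, min_improvement=0.05):
    """Greedily add the predictor that lowers the RSS most.

    Stops when the best candidate reduces the RSS by less than
    `min_improvement` (a hypothetical relative threshold, not an F-test).
    Returns the selected column indices in the order they were added.
    """
    n, p = X.shape
    selected = []
    design = np.ones((n, 1))          # start from the intercept-only model
    current = rss(design, y)
    while len(selected) < p:
        best_j, best_rss = None, current
        for j in range(p):
            if j in selected:
                continue
            candidate = rss(np.column_stack([design, X[:, j]]), y)
            if candidate < best_rss:
                best_j, best_rss = j, candidate
        if best_j is None or (current - best_rss) / current < min_improvement:
            break
        selected.append(best_j)
        design = np.column_stack([design, X[:, best_j]])
        current = best_rss
    return selected
```

A full stepwise procedure would normally also test whether previously added variables should be dropped again; this sketch only covers the forward step.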
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
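The distinction between the two coefficients can be made concrete with a small sketch. The helper names below (`pearson_r`, `ranks`, `spearman_rho`, `simple_linear_regression`) and the toy data are assumptions for illustration, not taken from the article; the rank helper assumes there are no ties.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: strength of the linear relationship."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def ranks(v):
    """Ranks 1..n of the values in v (assumes no ties)."""
    v = np.asarray(v, dtype=float)
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(1, len(v) + 1)
    return r

def spearman_rho(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

def simple_linear_regression(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    slope = float((xc @ (y - y.mean())) / (xc @ xc))
    intercept = float(y.mean() - slope * x.mean())
    return slope, intercept

# Toy data: y is a monotone but nonlinear function of x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 3
r = pearson_r(x, y)          # below 1, because the relation is not linear
rho = spearman_rho(x, y)     # equals 1, because the relation is monotone
slope, intercept = simple_linear_regression(x, y)
```

On this toy data the Spearman rho reaches 1 while the Pearson coefficient stays below 1, which is the behaviour the tutorial's comparison turns on.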
Regression filter for signal resolution
International Nuclear Information System (INIS)
Matthes, W.
1975-01-01
The problem considered is that of resolving a measured pulse height spectrum of a material mixture (e.g. a gamma-ray spectrum or Raman spectrum) into a weighted sum of the spectra of the individual constituents. The model on which the analytical formulation is based is described. The problem reduces to that of a multiple linear regression. A stepwise linear regression procedure was constructed. The efficiency of this method was then tested by turning the procedure into a computer programme, which was used to unfold test spectra obtained by mixing some spectra from a library of arbitrarily chosen spectra and adding a noise component. (U.K.)
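The reduction to multiple linear regression can be sketched as follows. This is a plain least-squares fit, not the stepwise procedure described in the report, and the function name `unmix_spectrum` and the small library matrix are hypothetical values chosen for illustration.

```python
import numpy as np

def unmix_spectrum(library, measured):
    """Weights w minimizing ||library @ w - measured||^2.

    `library` has one column per constituent spectrum and one row per channel.
    """
    weights, *_ = np.linalg.lstsq(library, measured, rcond=None)
    return weights

# Hypothetical library: 4 channels (rows) x 2 constituents (columns).
library = np.array([
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 2.0],
    [0.0, 1.0],
])
true_weights = np.array([2.0, 0.5])
measured = library @ true_weights        # noise-free mixture for the demo
estimated = unmix_spectrum(library, measured)
```

With noise added to `measured`, the estimate would only approximate the true weights, and a stepwise variant would additionally decide which library spectra to include.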
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Cactus: An Introduction to Regression
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
Regression Models for Repairable Systems
Czech Academy of Sciences Publication Activity Database
Novák, Petr
2015-01-01
Vol. 17, No. 4 (2015), pp. 963-972 ISSN 1387-5841 Institutional support: RVO:67985556 Keywords: Reliability analysis * Repair models * Regression Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.782, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/novak-0450902.pdf
Survival analysis II: Cox regression
Stel, Vianda S.; Dekker, Friedo W.; Tripepi, Giovanni; Zoccali, Carmine; Jager, Kitty J.
2011-01-01
In contrast to the Kaplan-Meier method, Cox proportional hazards regression can provide an effect estimate by quantifying the difference in survival between patient groups and can adjust for confounding effects of other variables. The purpose of this article is to explain the basic concepts of the
Kernel regression with functional response
Ferraty, Frédéric; Laksaci, Ali; Tadj, Amel; Vieu, Philippe
2011-01-01
We consider the kernel regression estimate when both the response variable and the explanatory one are functional. The rates of uniform almost complete convergence are stated as a function of the small ball probability of the predictor and as a function of the entropy of the set on which uniformity is obtained.
Milte, Rachel; Ratcliffe, Julie; Chen, Gang; Lancsar, Emily; Miller, Michelle; Crotty, Maria
2014-07-01
This exploratory study sought to investigate the effect of cognitive functioning on the consistency of individual responses to a discrete choice experiment (DCE) study conducted exclusively with older people. A DCE to investigate preferences for multidisciplinary rehabilitation was administered to a consenting sample of older patients (aged 65 years and older) after surgery to repair a fractured hip (N = 84). Conditional logit, mixed logit, heteroscedastic conditional logit, and generalized multinomial logit regression models were used to analyze the DCE data and to explore the relationship between the level of cognitive functioning (specifically the absence or presence of mild cognitive impairment as assessed by the Mini-Mental State Examination) and preference and scale heterogeneity. Both the heteroscedastic conditional logit and generalized multinomial logit models indicated that the presence of mild cognitive impairment did not have a significant effect on the consistency of responses to the DCE. This study provides important preliminary evidence relating to the effect of mild cognitive impairment on DCE responses for older people. It is important that further research be conducted in larger samples and more diverse populations to further substantiate the findings from this exploratory study and to assess the practicality and validity of the DCE approach with populations of older people. Copyright © 2014 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Quantile Regression With Measurement Error
Wei, Ying
2009-08-27
Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. © 2009 American Statistical Association.
Multivariate and semiparametric kernel regression
Härdle, Wolfgang; Müller, Marlene
1997-01-01
The paper gives an introduction to theory and application of multivariate and semiparametric kernel smoothing. Multivariate nonparametric density estimation is an often used pilot tool for examining the structure of data. Regression smoothing helps in investigating the association between covariates and responses. We concentrate on kernel smoothing using local polynomial fitting which includes the Nadaraya-Watson estimator. Some theory on the asymptotic behavior and bandwidth selection is pro...
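For orientation, the Nadaraya-Watson estimator mentioned above is a kernel-weighted average of the responses (the local constant case of local polynomial fitting). The sketch below assumes a Gaussian kernel and a user-supplied bandwidth; the function name `nadaraya_watson` and the toy data are illustrative assumptions, and bandwidth selection is not addressed.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, bandwidth):
    """Kernel regression estimate m(x) = sum_i K_h(x - x_i) y_i / sum_i K_h(x - x_i)."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    # Scaled distances between every evaluation point and every training point.
    u = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * u ** 2)      # Gaussian kernel (constant factor cancels)
    return (weights @ y_train) / weights.sum(axis=1)

# Toy check: midway between two training points the weights are equal,
# so the estimate is the average of the two responses.
midpoint_estimate = nadaraya_watson([0.0, 1.0], [0.0, 2.0], 0.5, bandwidth=0.3)
```

A smaller bandwidth makes the fit follow the data more closely, while a larger one smooths more heavily.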
Regression algorithm for emotion detection
Berthelon, Franck; Sander, Peter
2013-01-01
We present here two components of a computational system for emotion detection. PEMs (Personalized Emotion Maps) store links between bodily expressions and emotion values, and are individually calibrated to capture each person's emotion profile. They are an implementation based on aspects of Scherer's theoretical complex system model of emotion. We also present a regression algorithm that determines a person's emotional feeling from sensor m...
Directional quantile regression in R
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2017-01-01
Vol. 53, No. 3 (2017), pp. 480-492 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords: multivariate quantile * regression quantile * halfspace depth * depth contour Subject RIV: BD - Theory of Information OECD field: Applied mathematics Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2017/SI/bocek-0476587.pdf
Polylinear regression analysis in radiochemistry
International Nuclear Information System (INIS)
Kopyrin, A.A.; Terent'eva, T.N.; Khramov, N.N.
1995-01-01
A number of radiochemical problems have been formulated in the framework of polylinear regression analysis, which permits the use of conventional mathematical methods for their solution. The authors have considered features of the use of polylinear regression analysis for estimating the contributions of various sources to atmospheric pollution, for studying irradiated nuclear fuel, for estimating concentrations from spectral data, for measuring neutron fields of a nuclear reactor, for estimating crystal lattice parameters from X-ray diffraction patterns, for interpreting X-ray fluorescence analysis data, for estimating complex formation constants, and for analyzing results of radiometric measurements. The problem of estimating the target parameters can be ill-posed for certain properties of the system under study. The authors showed the possibility of regularization by adding a fictitious set of data "obtained" from the orthogonal design. To estimate only a part of the parameters under consideration, the authors used incomplete rank models. In this case, it is necessary to take into account the possibility of confounding estimates. An algorithm for evaluating the degree of confounding is presented, which is realized using standard software or regression analysis.
Directory of Open Access Journals (Sweden)
Gregory, T.
2013-06-01
Full Text Available Adoption of technology is an important factor in economic development. The aim of this study was to establish the factors affecting adoption of QPM technology in the Northern zone of Tanzania. Primary data were collected from a random sample of 120 smallholder maize farmers in four villages. The data were analysed using descriptive and quantitative methods. A logit model was used to determine the factors that influence adoption of QPM technology. The regression results indicated that the education of the household head, farmers' participation in demonstration trials, attendance at field days, and the number of livestock owned positively influenced the rate of adoption of the technology. Access to credit and farmers' perception of QPM marketing problems negatively influenced the rate of adoption. The study recommended that the government ensure an efficient input-output linkage for QPM production.
Spontaneous regression of pulmonary bullae
International Nuclear Information System (INIS)
Satoh, H.; Ishikawa, H.; Ohtsuka, M.; Sekizawa, K.
2002-01-01
The natural history of pulmonary bullae is often characterized by gradual, progressive enlargement. Spontaneous regression of bullae is, however, very rare. We report a case in which complete resolution of pulmonary bullae in the left upper lung occurred spontaneously. The management of pulmonary bullae is occasionally made difficult because of gradual progressive enlargement associated with abnormal pulmonary function. Some patients have multiple bullae in both lungs and/or have a history of pulmonary emphysema. Others have a giant bulla without emphysematous change in the lungs. Our patient had treated lung cancer with no evidence of local recurrence. He had no emphysematous change on lung function testing and had no complaints, although the high-resolution CT scan showed evidence of underlying minimal changes of emphysema. Ortin and Gurney presented three cases of spontaneous reduction in the size of bullae. Interestingly, one of them had a marked decrease in the size of a bulla in association with thickening of the wall of the bulla, which was also observed in our patient. The case we describe is of interest not only because of the rarity with which regression of pulmonary bullae has been reported in the literature, but also because of the spontaneous improvement in the radiological picture in the absence of overt infection or tumor. Copyright (2002) Blackwell Science Pty Ltd
Quantum algorithm for linear regression
Wang, Guoming
2017-07-01
We present a quantum algorithm for fitting a linear regression model to a given data set using the least-squares approach. Differently from previous algorithms which yield a quantum state encoding the optimal parameters, our algorithm outputs these numbers in the classical form. So by running it once, one completely determines the fitted model and then can use it to make predictions on new data at little cost. Moreover, our algorithm works in the standard oracle model, and can handle data sets with nonsparse design matrices. It runs in time poly(log2(N), d, κ, 1/ɛ), where N is the size of the data set, d is the number of adjustable parameters, κ is the condition number of the design matrix, and ɛ is the desired precision in the output. We also show that the polynomial dependence on d and κ is necessary. Thus, our algorithm cannot be significantly improved. Furthermore, we also give a quantum algorithm that estimates the quality of the least-squares fit (without computing its parameters explicitly). This algorithm runs faster than the one for finding this fit, and can be used to check whether the given data set qualifies for linear regression in the first place.
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
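As a small worked example of coefficient interpretation in logistic regression: a coefficient is a log odds ratio, so exponentiating it gives the multiplicative change in the odds per one-unit increase in the covariate. The function names and the numeric values below are hypothetical and are not taken from the article's respiratory health study.

```python
import math

def odds_ratio(beta):
    """Odds ratio associated with a one-unit increase in the covariate."""
    return math.exp(beta)

def predicted_probability(intercept, beta, x):
    """P(outcome = 1 | x) under the model logit(p) = intercept + beta * x."""
    return 1.0 / (1.0 + math.exp(-(intercept + beta * x)))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# Hypothetical fitted values: intercept -1.0, coefficient log(2).
intercept, beta = -1.0, math.log(2.0)
or_per_unit = odds_ratio(beta)                       # doubling of the odds
p0 = predicted_probability(intercept, beta, 0.0)
p1 = predicted_probability(intercept, beta, 1.0)
ratio_of_odds = odds(p1) / odds(p0)                  # matches or_per_unit
```

Note that the odds ratio is constant across values of the covariate, whereas the change in predicted probability (here from `p0` to `p1`) depends on the starting point.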
Prediction, Regression and Critical Realism
DEFF Research Database (Denmark)
Næss, Petter
2004-01-01
This paper considers the possibility of prediction in land use planning, and the use of statistical research methods in analyses of relationships between urban form and travel behaviour. Influential writers within the tradition of critical realism reject the possibility of predicting social...... phenomena. This position is fundamentally problematic to public planning. Without at least some ability to predict the likely consequences of different proposals, the justification for public sector intervention into market mechanisms will be frail. Statistical methods like regression analyses are commonly...... seen as necessary in order to identify aggregate level effects of policy measures, but are questioned by many advocates of critical realist ontology. Using research into the relationship between urban structure and travel as an example, the paper discusses relevant research methods and the kinds...
On Weighted Support Vector Regression
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... shrinks the coefficient of each observation in the estimated functions; thus, it is widely used for minimizing influence of outliers. We propose to additionally add weights to the slack variables in the constraints (CF‐weights) and call the combination of weights the doubly weighted SVR. We illustrate...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...
Wilson, Edward C F; Usher-Smith, Juliet A; Emery, Jon; Corrie, Pippa G; Walter, Fiona M
2018-06-01
Expert elicitation is required to inform decision making when relevant "better quality" data either do not exist or cannot be collected. An example of this is to inform decisions as to whether to screen for melanoma. A key input is the counterfactual, in this case the natural history of melanoma in patients who are undiagnosed and hence untreated. To elicit expert opinion on the probability of disease progression in patients with melanoma that is undetected and hence untreated. A bespoke webinar-based expert elicitation protocol was administered to 14 participants in the United Kingdom, Australia, and New Zealand, comprising 12 multinomial questions on the probability of progression from one disease stage to another in the absence of treatment. A modified Connor-Mosimann distribution was fitted to individual responses to each question. Individual responses were pooled using a Monte-Carlo simulation approach. Participants were asked to provide feedback on the process. A pooled modified Connor-Mosimann distribution was successfully derived from participants' responses. Feedback from participants was generally positive, with 86% willing to take part in such an exercise again. Nevertheless, only 57% of participants felt that this was a valid approach to determine the risk of disease progression. Qualitative feedback reflected some understanding of the need to rely on expert elicitation in the absence of "hard" data. We successfully elicited and pooled the beliefs of experts in melanoma regarding the probability of disease progression in a format suitable for inclusion in a decision-analytic model. Copyright © 2018 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Credit Scoring Problem Based on Regression Analysis
Khassawneh, Bashar Suhil Jad Allah
2014-01-01
ABSTRACT: This thesis provides an explanatory introduction to the regression models of data mining and contains basic definitions of key terms in the linear, multiple and logistic regression models. Meanwhile, the aim of this study is to illustrate fitting models for the credit scoring problem using simple linear, multiple linear and logistic regression models and also to analyze the found model functions by statistical tools. Keywords: Data mining, linear regression, logistic regression....
Regularized Label Relaxation Linear Regression.
Fang, Xiaozhao; Xu, Yong; Li, Xuelong; Lai, Zhihui; Wong, Wai Keung; Fang, Bingwu
2018-04-01
Linear regression (LR) and some of its variants have been widely used for classification problems. Most of these methods assume that during the learning phase, the training samples can be exactly transformed into a strict binary label matrix, which has too little freedom to fit the labels adequately. To address this problem, in this paper, we propose a novel regularized label relaxation LR method, which has the following notable characteristics. First, the proposed method relaxes the strict binary label matrix into a slack variable matrix by introducing a nonnegative label relaxation matrix into LR, which provides more freedom to fit the labels and simultaneously enlarges the margins between different classes as much as possible. Second, the proposed method constructs the class compactness graph based on manifold learning and uses it as the regularization item to avoid the problem of overfitting. The class compactness graph is used to ensure that the samples sharing the same labels can be kept close after they are transformed. Two different algorithms, which are, respectively, based on -norm and -norm loss functions are devised. These two algorithms have compact closed-form solutions in each iteration so that they are easily implemented. Extensive experiments show that these two algorithms outperform the state-of-the-art algorithms in terms of the classification accuracy and running time.
Directory of Open Access Journals (Sweden)
Pedro Silveira Máximo
2009-12-01
Full Text Available The objective of this study was to identify which method, LOGIT or multivariate analysis, is more effective for estimating coffee planters' Willingness to Accept Compensation (WAC) when marginal utility bias is liable to occur. To this end, a questionnaire with 33 questions was prepared, covering the coffee planters' socioeconomic characteristics, the use of the contingent valuation method (CVM), and the "bidding game" payment vehicle, which revealed the willingness to accept compensation for exchanging one hectare of coffee for one hectare of forest. As expected, because of the marginal utility bias, the LOGIT method was unable to produce consistent results. Estimating the WAC by multivariate analysis, however, showed that if the government were willing to increase the provision of forest by 70 hectares, it would have to spend 254,200 reais per year (around 116,000 dollars), considering only the coffee planters enrolled in the PRO-CAFÉ program.
Machado, Michele Rílany Rodrigues; Gartner, Ivan Ricardo
2017-01-01
ABSTRACT This article fills a technical-scientific gap that currently exists in the Brazilian literature on corporate fraud, by combining the theoretical framework of agency theory, of criminology, and of the economics of crime. In addition, it focuses on a sector that is usually excluded from analyses due to its specific characteristics and shows the application of multinomial logit panel data regression with random effects, which is rarely used in studies in the area of accounting. The ai...
Linking apple farmers to markets: Determinants and impacts of marketing contracts in China
Ma, Wanglin; Abdulai, Awudu
2015-01-01
This study investigates the determinants of marketing contract choices and the related impact on farm net returns of apple farmers in China. We employ a two-stage selection correction approach (BFG) for the multinomial logit model. On the basis of the BFG estimation, we also use an endogenous switching regression model and a propensity score matching technique to estimate the causal effects of marketing contract choices on net returns. The empirical results reveal that written contracts incre...
Working Paper 175 - Youth Employment in Africa: New Evidence and Policies from Swaziland
Zuzana Brixiova; Thierry Kangoye
2013-01-01
Drawing on the 2007 and 2010 Swaziland Labor Force Surveys, this paper provides first systematic evidence on recent youth employment challenges in Swaziland, a small, land-locked, middle-income country with one of the highest youth unemployment rates in Africa. The paper first documents the various labor market disadvantages faced by the Swazi youth, such as high unemployment and discouragement, and how they changed from 2007 to 2010. A multinomial logit regression analysis is carried out to ...
Youth Employment in Africa: New Evidence and Policies from Swaziland
Brixiova, Zuzana; Kangoye, Thierry
2013-01-01
Drawing on the 2007 and 2010 Swaziland Labor Force Surveys, this paper provides first systematic evidence on recent youth employment challenges in Swaziland, a small, land-locked, middle-income country with one of the highest youth unemployment rates in Africa. The paper first documents the various labor market disadvantages faced by the Swazi youth, such as high unemployment and discouragement, and how they changed from 2007 to 2010. A multinomial logit regression analysis is then carried ou...
How do we value our income from which we save?
Barbara Liberda; Marek Pęczkowski; Ewa Gucwa-Leśny
2011-01-01
In this paper we analyze the relationship between a household's perception of its income as satisfying its needs and its saving rate. Using the multinomial logit regression function, we measure the probability that a household falls into one of the groups categorized by the subjective perception of income in relation to the current household disposable income. The variable specified for the valuation of income is income perception, defined as a class of observed disposable income located...
Logistic regression analysis of financial literacy implications for retirement planning in Croatia
Directory of Open Access Journals (Sweden)
Dajana Barbić
2016-12-01
Full Text Available The relationship between financial literacy and financial behavior is important, as individuals are increasingly being asked to take responsibility for their financial wellbeing, especially their retirement. Analyzing individual savings and attitudes towards retirement planning is important, as these types of investments are a way of preserving security during years of financial vulnerability. Research indicates that individuals who do not save adequately for their retirement generally have a relatively low level of financial literacy. This research investigates the relationship between financial literacy and retirement planning in Croatia. To analyze this relationship, maximum likelihood logistic regression analysis was used. The paper shows that those who answer financial literacy questions correctly are more likely to have a positive attitude towards retirement planning and are more likely to save for retirement, which gives them a higher level of financial security in retirement. The goodness-of-fit evaluation for the estimated logit model was performed using the Andrews and Hosmer-Lemeshow tests.
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces the indices used to diagnose multicollinearity, the basic principle of principal component regression, and a method for determining the 'best' equation. It uses an example to show how to carry out principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. Carried out with SPSS, the analysis becomes simpler and faster while remaining accurate.
Patterns and correlates of solid waste disposal practices in Dar es ...
African Journals Online (AJOL)
USER
collection. Key words: Solid waste, garbage, waste disposal, waste management, Multinomial Logit model. INTRODUCTION. Urbanization introduces society to a new, modern way of ..... Multinomial logistic estimation. .... The trend of using.
Unbalanced Regressions and the Predictive Equation
DEFF Research Database (Denmark)
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoreti...
Semiparametric regression during 2003–2007
Ruppert, David; Wand, M.P.; Carroll, Raymond J.
2009-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.
Gaussian process regression analysis for functional data
Shi, Jian Qing
2011-01-01
Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically the methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables. Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high dime
Regression Analysis by Example. 5th Edition
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Standards for Standardized Logistic Regression Coefficients
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
A Seemingly Unrelated Poisson Regression Model
King, Gary
1989-01-01
This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.
Regression with Sparse Approximations of Data
DEFF Research Database (Denmark)
Noorzad, Pardis; Sturm, Bob L.
2012-01-01
We propose sparse approximation weighted regression (SPARROW), a method for local estimation of the regression function that uses sparse approximation with a dictionary of measurements. SPARROW estimates the regression function at a point with a linear combination of a few regressands selected by a sparse approximation of the point in terms of the regressors. We show SPARROW can be considered a variant of k-nearest neighbors regression (k-NNR), and more generally, local polynomial kernel regression. Unlike k-NNR, however, SPARROW can adapt the number of regressors to use based...
Spontaneous regression of a congenital melanocytic nevus
Directory of Open Access Journals (Sweden)
Amiya Kumar Nath
2011-01-01
Congenital melanocytic nevus (CMN) may rarely regress, and the regression may be associated with a halo or vitiligo. We describe a 10-year-old girl who had a CMN on the left leg since birth, which recently started to regress spontaneously with associated depigmentation in the lesion and at a distant site. Dermoscopy performed at different sites of the regressing lesion demonstrated loss of epidermal pigments first, followed by loss of dermal pigments. Histopathology and Masson-Fontana stain demonstrated lymphocytic infiltration and loss of pigment production in the regressing area. Immunohistochemistry staining (S100 and HMB-45), however, showed that nevus cells were present in the regressing areas.
Energy Technology Data Exchange (ETDEWEB)
Ryding, Kristen E.; Skalski, John R.
1999-06-01
The purpose of this report is to illustrate the development of a stochastic model using coded wire-tag (CWT) release and age-at-return data, in order to regress first year ocean survival probabilities against coastal ocean conditions and climate covariates.
Applied regression analysis a research tool
Pantula, Sastry; Dickey, David
1998-01-01
Least squares estimation, when used appropriately, is a powerful research tool. A deeper understanding of the regression concepts is essential for achieving optimal benefits from a least squares analysis. This book builds on the fundamentals of statistical methods and provides appropriate concepts that will allow a scientist to use least squares as an effective research tool. Applied Regression Analysis is aimed at the scientist who wishes to gain a working knowledge of regression analysis. The basic purpose of this book is to develop an understanding of least squares and related statistical methods without becoming excessively mathematical. It is the outgrowth of more than 30 years of consulting experience with scientists and many years of teaching an applied regression course to graduate students. Applied Regression Analysis serves as an excellent text for a service course on regression for non-statisticians and as a reference for researchers. It also provides a bridge between a two-semester introduction to...
Regression models of reactor diagnostic signals
International Nuclear Information System (INIS)
Vavrin, J.
1989-01-01
The application of an autoregression model, the simplest regression model of diagnostic signals, is described for the experimental analysis of diagnostic systems and for in-service monitoring and diagnostics of normal and anomalous conditions. A diagnostic method using a regression-type diagnostic database and regression spectral diagnostics is described, together with the diagnostics of neutron noise signals from anomalous modes in the experimental fuel assembly of a reactor. (author)
Bulcock, J. W.
The problem of model estimation when the data are collinear was examined. Though ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem-free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…
Multivariate Regression Analysis and Slaughter Livestock,
AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS, ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
[From clinical judgment to linear regression model].
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as the linear regression model, we tend to assume that these terms are used only by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated another way, regression is used to predict a measure based on the knowledge of at least one other variable. The first objective of linear regression is to determine the slope or inclination of the regression line Y = a + bX, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when "X" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of the independent variables in the outcome.
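As a rough illustration of the regression line Y = a + bX described in this abstract, the following Python sketch fits the intercept and slope by ordinary least squares and computes the coefficient of determination. The function name `fit_line` and the data values are hypothetical and are not taken from the article.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, r_squared)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sums of squares and cross-products around the means
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope: change in y per one-unit change in x
    a = y_bar - b * x_bar    # intercept: fitted value of y when x equals 0
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return a, b, r_squared

# Hypothetical data lying roughly on the line y = 1 + 2x
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
a, b, r_squared = fit_line(xs, ys)
print(f"intercept={a:.3f} slope={b:.3f} R^2={r_squared:.3f}")
```

With data this close to a straight line, the fitted slope should land near 2 and R² near 1; noisier data would lower R² without changing how the coefficients are read.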
Regression modeling methods, theory, and computation with SAS
Panik, Michael
2009-01-01
Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs. The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,
RAWS II: A MULTIPLE REGRESSION ANALYSIS PROGRAM,
This memorandum gives instructions for the use and operation of a revised version of RAWS, a multiple regression analysis program. The program...of preprocessed data, the directed retention of variable, listing of the matrix of the normal equations and its inverse, and the bypassing of the regression analysis to provide the input variable statistics only. (Author)
A Simulation Investigation of Principal Component Regression.
Allen, David E.
Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…
Hierarchical regression analysis in structural Equation Modeling
de Jong, P.F.
1999-01-01
In a hierarchical or fixed-order regression analysis, the independent variables are entered into the regression equation in a prespecified order. Such an analysis is often performed when the extra amount of variance accounted for in a dependent variable by a specific independent variable is the main
Categorical regression dose-response modeling
The goal of this training is to provide participants with training on the use of the U.S. EPA’s Categorical Regression software (CatReg) and its application to risk assessment. Categorical regression fits mathematical models to toxicity data that have been assigned ord...
Variable importance in latent variable regression models
Kvalheim, O.M.; Arneberg, R.; Bleie, O.; Rajalahti, T.; Smilde, A.K.; Westerhuis, J.A.
2014-01-01
The quality and practical usefulness of a regression model are a function of both interpretability and prediction performance. This work presents some new graphical tools for improved interpretation of latent variable regression models that can also assist in improved algorithms for variable
Stepwise versus Hierarchical Regression: Pros and Cons
Lewis, Mitzi
2007-01-01
Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…
Suppression Situations in Multiple Linear Regression
Shieh, Gwowen
2006-01-01
This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…
Gibrat’s law and quantile regressions
DEFF Research Database (Denmark)
Distante, Roberta; Petrella, Ivan; Santoro, Emiliano
2017-01-01
The nexus between firm growth, size and age in U.S. manufacturing is examined through the lens of quantile regression models. This methodology allows us to overcome serious shortcomings entailed by linear regression models employed by much of the existing literature, unveiling a number of important...
Regression Analysis and the Sociological Imagination
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Repeated Results Analysis for Middleware Regression Benchmarking
Czech Academy of Sciences Publication Activity Database
Bulej, Lubomír; Kalibera, T.; Tůma, P.
2005-01-01
Vol. 60 (2005), pp. 345-358 ISSN 0166-5316 R&D Projects: GA ČR GA102/03/0672 Institutional research plan: CEZ:AV0Z10300504 Keywords : middleware benchmarking * regression benchmarking * regression testing Subject RIV: JD - Computer Applications, Robotics Impact factor: 0.756, year: 2005
Principles of Quantile Regression and an Application
Chen, Fang; Chalhoub-Deville, Micheline
2014-01-01
Newer statistical procedures are typically introduced to help address the limitations of those already in practice or to deal with emerging research needs. Quantile regression (QR) is introduced in this paper as a relatively new methodology, which is intended to overcome some of the limitations of least squares mean regression (LMR). QR is more…
ON REGRESSION REPRESENTATIONS OF STOCHASTIC-PROCESSES
RUSCHENDORF, L; DEVALK
We construct a.s. nonlinear regression representations of general stochastic processes (X(n)), n ∈ N. As a consequence we obtain in particular special regression representations of Markov chains and of certain m-dependent sequences. For m-dependent sequences we obtain a constructive
Cade, Brian S.; Noon, Barry R.; Scherer, Rick D.; Keane, John J.
2017-01-01
Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical conditional distribution of a bounded discrete random variable. The logistic quantile regression model requires that counts are randomly jittered to a continuous random variable, logit transformed to bound them between specified lower and upper values, then estimated in conventional linear quantile regression, repeating the 3 steps and averaging estimates. Back-transformation to the original discrete scale relies on the fact that quantiles are equivariant to monotonic transformations. We demonstrate this statistical procedure by modeling 20 years of California Spotted Owl fledgling production (0−3 per territory) on the Lassen National Forest, California, USA, as related to climate, demographic, and landscape habitat characteristics at territories. Spotted Owl fledgling counts increased nonlinearly with decreasing precipitation in the early nesting period, in the winter prior to nesting, and in the prior growing season; with increasing minimum temperatures in the early nesting period; with adult compared to subadult parents; when there was no fledgling production in the prior year; and when percentage of the landscape surrounding nesting sites (202 ha) with trees ≥25 m height increased. Changes in production were primarily driven by changes in the proportion of territories with 2 or 3 fledglings. Average variances of the discrete cumulative distributions of the estimated fledgling counts indicated that temporal changes in climate and parent age class explained 18% of the annual variance in owl fledgling production, which was 34% of the total variance. Prior fledgling production explained as much of
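The jitter, logit-transform, and back-transform steps named in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the offset `eps`, the choice of bounds, the data, and all function names are hypothetical, and the linear quantile regression step is replaced here by a plain sample median so that the equivariance property is easy to see.

```python
import math
import random

def logit_transform(y, lower, upper):
    """Map y in the open interval (lower, upper) onto the real line."""
    return math.log((y - lower) / (upper - y))

def inverse_logit_transform(z, lower, upper):
    """Map a real value z back into (lower, upper)."""
    return (math.exp(z) * upper + lower) / (1.0 + math.exp(z))

def sample_median(values):
    """Middle order statistic; assumes an odd number of values."""
    return sorted(values)[len(values) // 2]

random.seed(0)
counts = [0, 1, 1, 2, 2, 2, 3, 0, 1]   # hypothetical counts bounded by 0 and 3
eps = 0.001                            # assumed offset so the bounds are never reached
lower, upper = 0 - eps, 3 + 1 + eps    # jittered values fall in [0, 4)

# Step 1: jitter the discrete counts to a continuous variable
jittered = [c + random.random() for c in counts]
# Step 2: logit-transform between the lower and upper bounds
z = [logit_transform(y, lower, upper) for y in jittered]
# Step 3 (stand-in): a quantile on the transformed scale; a real analysis
# would fit a linear quantile regression on covariates here.
z_median = sample_median(z)

# Quantiles are equivariant to monotonic transformations, so back-transforming
# the median of z recovers the median of the jittered counts.
back = inverse_logit_transform(z_median, lower, upper)
print(back, sample_median(jittered))
```

Because jittering is random, the abstract's procedure repeats the three steps and averages the estimates; the sketch shows a single pass only.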
Regression of environmental noise in LIGO data
International Nuclear Information System (INIS)
Tiwari, V; Klimenko, S; Mitselmakher, G; Necula, V; Drago, M; Prodi, G; Frolov, V; Yakushin, I; Re, V; Salemi, F; Vedovato, G
2015-01-01
We address the problem of noise regression in the output of gravitational-wave (GW) interferometers, using data from the physical environmental monitors (PEM). The objective of the regression analysis is to predict environmental noise in the GW channel from the PEM measurements. One of the most promising regression methods is based on the construction of Wiener–Kolmogorov (WK) filters. Using this method, the seismic noise cancellation from the LIGO GW channel has already been performed. In the presented approach the WK method has been extended, incorporating banks of Wiener filters in the time–frequency domain, multi-channel analysis and regulation schemes, which greatly enhance the versatility of the regression analysis. Also we present the first results on regression of the bi-coherent noise in the LIGO data. (paper)
Pathological assessment of liver fibrosis regression
Directory of Open Access Journals (Sweden)
WANG Bingqiong
2017-03-01
Hepatic fibrosis is the common pathological outcome of chronic hepatic diseases. An accurate assessment of fibrosis degree provides an important reference for a definite diagnosis of diseases, treatment decision-making, treatment outcome monitoring, and prognostic evaluation. At present, many clinical studies have proven that regression of hepatic fibrosis and early-stage liver cirrhosis can be achieved by effective treatment, and a correct evaluation of fibrosis regression has become a hot topic in clinical research. Liver biopsy has long been regarded as the gold standard for the assessment of hepatic fibrosis, and thus it plays an important role in the evaluation of fibrosis regression. This article reviews the clinical application of current pathological staging systems in the evaluation of fibrosis regression from the perspectives of semi-quantitative scoring system, quantitative approach, and qualitative approach, in order to propose a better pathological evaluation system for the assessment of fibrosis regression.
Should metacognition be measured by logistic regression?
Rausch, Manuel; Zehetleitner, Michael
2017-03-01
Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent of rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as a measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Variable and subset selection in PLS regression
DEFF Research Database (Denmark)
Høskuldsson, Agnar
2001-01-01
The purpose of this paper is to present some useful methods for introductory analysis of variables and subsets in relation to PLS regression. We present here methods that are efficient in finding the appropriate variables or subset to use in the PLS regression. The general conclusion is that variable selection is important for successful analysis of chemometric data. An important aspect of the results presented is that lack of variable selection can spoil the PLS regression, and that cross-validation measures using a test set can show larger variation, when we use different subsets of X, than...
Applied Regression Modeling A Business Approach
Pardoe, Iain
2012-01-01
An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculus. Regression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a
Vectors, a tool in statistical regression theory
Corsten, L.C.A.
1958-01-01
Using linear algebra this thesis developed linear regression analysis including analysis of variance, covariance analysis, special experimental designs, linear and fertility adjustments, analysis of experiments at different places and times. The determination of the orthogonal projection, yielding
Genetics Home Reference: caudal regression syndrome
... umbilical artery: Further support for a caudal regression-sirenomelia spectrum. Am J Med Genet A. 2007 Dec ... AK, Dickinson JE, Bower C. Caudal dysgenesis and sirenomelia-single centre experience suggests common pathogenic basis. Am ...
Dynamic travel time estimation using regression trees.
2008-10-01
This report presents a methodology for travel time estimation by using regression trees. The dissemination of travel time information has become crucial for effective traffic management, especially under congested road conditions. In the absence of c...
Two Paradoxes in Linear Regression Analysis
FENG, Ge; PENG, Jing; TU, Dongke; ZHENG, Julia Z.; FENG, Changyong
2016-01-01
Summary: Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection. PMID:28638214
Discriminative Elastic-Net Regularized Linear Regression.
Zhang, Zheng; Lai, Zhihui; Xu, Yong; Shao, Ling; Wu, Jian; Xie, Guo-Sen
2017-03-01
In this paper, we aim at learning compact and discriminative linear regression models. Linear regression has been widely used in different problems. However, most of the existing linear regression methods exploit the conventional zero-one matrix as the regression targets, which greatly narrows the flexibility of the regression model. Another major limitation of these methods is that the learned projection matrix fails to precisely project the image features to the target space due to their weak discriminative capability. To this end, we present an elastic-net regularized linear regression (ENLR) framework, and develop two robust linear regression models which possess the following special characteristics. First, our methods exploit two particular strategies to enlarge the margins of different classes by relaxing the strict binary targets into a more feasible variable matrix. Second, a robust elastic-net regularization of singular values is introduced to enhance the compactness and effectiveness of the learned projection matrix. Third, the resulting optimization problem of ENLR has a closed-form solution in each iteration, which can be solved efficiently. Finally, rather than directly exploiting the projection matrix for recognition, our methods employ the transformed features as the new discriminative representations to make the final image classification. Compared with the traditional linear regression model and some of its variants, our method is much more accurate in image classification. Extensive experiments conducted on publicly available data sets demonstrate that the proposed framework can outperform the state-of-the-art methods. The MATLAB code for our methods is available at http://www.yongxu.org/lunwen.html.
Fuzzy multiple linear regression: A computational approach
Juang, C. H.; Huang, X. H.; Fleming, J. W.
1992-01-01
This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.
Computing multiple-output regression quantile regions
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2012-01-01
Vol. 56, No. 4 (2012), pp. 840-853 ISSN 0167-9473 R&D Projects: GA MŠk(CZ) 1M06047 Institutional research plan: CEZ:AV0Z10750506 Keywords : halfspace depth * multiple-output regression * parametric linear programming * quantile regression Subject RIV: BA - General Mathematics Impact factor: 1.304, year: 2012 http://library.utia.cas.cz/separaty/2012/SI/siman-0376413.pdf
There is No Quantum Regression Theorem
International Nuclear Information System (INIS)
Ford, G.W.; O'Connell, R.F.
1996-01-01
The Onsager regression hypothesis states that the regression of fluctuations is governed by macroscopic equations describing the approach to equilibrium. It is here asserted that this hypothesis fails in the quantum case. This is shown first by explicit calculation for the example of quantum Brownian motion of an oscillator and then in general from the fluctuation-dissipation theorem. It is asserted that the correct generalization of the Onsager hypothesis is the fluctuation-dissipation theorem. Copyright 1996 The American Physical Society
Caudal regression syndrome : a case report
International Nuclear Information System (INIS)
Lee, Eun Joo; Kim, Hi Hye; Kim, Hyung Sik; Park, So Young; Han, Hye Young; Lee, Kwang Hun
1998-01-01
Caudal regression syndrome is a rare congenital anomaly, which results from a developmental failure of the caudal mesoderm during the fetal period. We present a case of caudal regression syndrome composed of a spectrum of anomalies including sirenomelia, dysplasia of the lower lumbar vertebrae, sacrum, coccyx and pelvic bones, genitourinary and anorectal anomalies, and dysplasia of the lung, as seen during infantography and MR imaging
Caudal regression syndrome : a case report
Energy Technology Data Exchange (ETDEWEB)
Lee, Eun Joo; Kim, Hi Hye; Kim, Hyung Sik; Park, So Young; Han, Hye Young; Lee, Kwang Hun [Chungang Gil Hospital, Incheon (Korea, Republic of)
1998-07-01
Caudal regression syndrome is a rare congenital anomaly, which results from a developmental failure of the caudal mesoderm during the fetal period. We present a case of caudal regression syndrome composed of a spectrum of anomalies including sirenomelia, dysplasia of the lower lumbar vertebrae, sacrum, coccyx and pelvic bones, genitourinary and anorectal anomalies, and dysplasia of the lung, as seen during infantography and MR imaging.
Spontaneous regression of metastatic Merkel cell carcinoma.
LENUS (Irish Health Repository)
Hassan, S J
2010-01-01
Merkel cell carcinoma is a rare aggressive neuroendocrine carcinoma of the skin predominantly affecting elderly Caucasians. It has a high rate of local recurrence and regional lymph node metastases. It is associated with a poor prognosis. Complete spontaneous regression of Merkel cell carcinoma has been reported but is a poorly understood phenomenon. Here we present a case of complete spontaneous regression of metastatic Merkel cell carcinoma demonstrating a markedly different pattern of events from those previously published.
Forecasting exchange rates: a robust regression approach
Preminger, Arie; Franck, Raphael
2005-01-01
The least squares estimation method, as well as other ordinary estimation methods for regression models, can be severely affected by a small number of outliers, thus providing poor out-of-sample forecasts. This paper suggests a robust regression approach, based on the S-estimation method, to construct forecasting models that are less sensitive to data contamination by outliers. A robust linear autoregressive (RAR) model and a robust neural network (RNN) model are estimated to study the predictabil...
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.
2010-08-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been proposed, a relatively simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.; Carroll, R.J.; Wand, M.P.
2010-01-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been proposed, a relatively simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
Post-processing through linear regression
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
Post-processing through linear regression
Directory of Open Access Journals (Sweden)
B. Van Schaeybroeck
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified.
These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
Unbalanced Regressions and the Predictive Equation
DEFF Research Database (Denmark)
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoretical predictive equation by suggesting a data generating process, where returns are generated as linear functions of a lagged latent I(0) risk process. The observed predictor is a function of this latent I(0) process, but it is corrupted by a fractionally integrated noise. Such a process may arise due to aggregation or unexpected level shifts. In this setup, the practitioner estimates a misspecified, unbalanced, and endogenous predictive regression. We show that the OLS estimate of this regression is inconsistent, but standard inference is possible. To obtain a consistent slope estimate, we then suggest...
Directory of Open Access Journals (Sweden)
Folefack, AJZ.
2018-01-01
Three years after the beginning of a goat project in the Centre region of Cameroon, the engagement of farmers in this activity has been timid. As this region is not a traditional pastoral zone, farmers have not yet incorporated crop-livestock integration into their habits. Hence, this paper uses a logistic regression approach to analyse the factors affecting the adoption of goat raising by farmers of this locality. The computed odds ratios indicate that the practice of goat raising is significantly influenced by the farmer's age, gender, farming experience, practice of other livestock activities, frequency of contact with extension agents, access to credit and farm income. However, being a goat raiser does not depend on the farmer's marital status, education, farm size, household size, or membership in a common initiative group. The study therefore recommends that the government authorities give more attention to the significant factors so as to popularize goat raising in this region.
Energy Technology Data Exchange (ETDEWEB)
Schlattmann, Peter [University Hospital of Friedrich-Schiller University Jena, Department of Medical Statistics, Informatics and Documentation, Jena (Germany); Schuetz, Georg M. [Freie Universitaet Berlin, Charite, Medical School, Department of Radiology, Humboldt-Universitaet zu Berlin, Berlin (Germany); Dewey, Marc [Freie Universitaet Berlin, Charite, Medical School, Department of Radiology, Humboldt-Universitaet zu Berlin, Berlin (Germany); Charite, Institut fuer Radiologie, Berlin (Germany)
2011-09-15
To evaluate the impact of coronary artery disease (CAD) prevalence on the predictive values of coronary CT angiography. We performed a meta-regression based on a generalised linear mixed model using the binomial distribution and a logit link to analyse the influence of the prevalence of CAD in published studies on the per-patient negative and positive predictive values of CT in comparison to conventional coronary angiography as the reference standard. A prevalence range in which the negative predictive value was higher than 90%, while at the same time the positive predictive value was higher than 70% was considered appropriate. The summary negative and positive predictive values of coronary CT angiography were 93.7% (95% confidence interval [CI] 92.8-94.5%) and 87.5% (95% CI, 86.5-88.5%), respectively. With 95% confidence, negative and positive predictive values higher than 90% and 70% were available with CT for a CAD prevalence of 18-63%. CT systems with >16 detector rows met these requirements for the positive (P < 0.01) and negative (P < 0.05) predictive values in a significantly broader range than systems with ≤16 detector rows. It is reasonable to perform coronary CT angiography as a rule-out test in patients with a low-to-intermediate likelihood of disease. (orig.)
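The dependence of predictive values on prevalence described in this abstract can be illustrated with Bayes' theorem alone. The sketch below is not the generalised linear mixed model used in the study; the sensitivity and specificity values are hypothetical and chosen only to show the direction of the effect.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from Bayes' theorem for a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics, NOT taken from the study.
SENS, SPEC = 0.95, 0.85

for prev in (0.10, 0.30, 0.50, 0.70):
    ppv, npv = predictive_values(SENS, SPEC, prev)
    print(f"prevalence={prev:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

As prevalence rises, the positive predictive value increases while the negative predictive value falls, which is why only an intermediate prevalence range can satisfy thresholds on both at once.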
International Nuclear Information System (INIS)
Schlattmann, Peter; Schuetz, Georg M.; Dewey, Marc
2011-01-01
To evaluate the impact of coronary artery disease (CAD) prevalence on the predictive values of coronary CT angiography. We performed a meta-regression based on a generalised linear mixed model using the binomial distribution and a logit link to analyse the influence of the prevalence of CAD in published studies on the per-patient negative and positive predictive values of CT in comparison to conventional coronary angiography as the reference standard. A prevalence range in which the negative predictive value was higher than 90%, while at the same time the positive predictive value was higher than 70% was considered appropriate. The summary negative and positive predictive values of coronary CT angiography were 93.7% (95% confidence interval [CI] 92.8-94.5%) and 87.5% (95% CI, 86.5-88.5%), respectively. With 95% confidence, negative and positive predictive values higher than 90% and 70% were available with CT for a CAD prevalence of 18-63%. CT systems with >16 detector rows met these requirements for the positive (P < 0.01) and negative (P < 0.05) predictive values in a significantly broader range than systems with ≤16 detector rows. It is reasonable to perform coronary CT angiography as a rule-out test in patients with a low-to-intermediate likelihood of disease. (orig.)
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
Is past life regression therapy ethical?
Andrade, Gabriel
2017-01-01
Past life regression therapy is used by some physicians in cases with some mental diseases. Anxiety disorders, mood disorders, and gender dysphoria have all been treated using life regression therapy by some doctors on the assumption that they reflect problems in past lives. Although it is not supported by psychiatric associations, few medical associations have actually condemned it as unethical. In this article, I argue that past life regression therapy is unethical for two basic reasons. First, it is not evidence-based. Past life regression is based on the reincarnation hypothesis, but this hypothesis is not supported by evidence, and in fact, it faces some insurmountable conceptual problems. If patients are not fully informed about these problems, they cannot provide an informed consent, and hence, the principle of autonomy is violated. Second, past life regression therapy has the great risk of implanting false memories in patients, and thus, causing significant harm. This is a violation of the principle of non-maleficence, which is surely the most important principle in medical ethics.
On Solving Lq-Penalized Regressions
Directory of Open Access Journals (Sweden)
Tracy Zhou Wu
2007-01-01
Lq-penalized regression arises in multidimensional statistical modelling where all or part of the regression coefficients are penalized to achieve both accuracy and parsimony of statistical models. There is often substantial computational difficulty except for the quadratic penalty case. The difficulty is partly due to the nonsmoothness of the objective function inherited from the use of the absolute value. We propose a new solution method for the general Lq-penalized regression problem based on space transformation and thus efficient optimization algorithms. The new method has immediate applications in statistics, notably in penalized spline smoothing problems. In particular, the LASSO problem is shown to be polynomial time solvable. Numerical studies show promise of our approach.
Refractive regression after laser in situ keratomileusis.
Yan, Mabel K; Chang, John Sm; Chan, Tommy Cy
2018-04-26
Uncorrected refractive errors are a leading cause of visual impairment across the world. In today's society, laser in situ keratomileusis (LASIK) has become the most commonly performed surgical procedure to correct refractive errors. However, regression of the initially achieved refractive correction has been a widely observed phenomenon following LASIK since its inception more than two decades ago. Despite technological advances in laser refractive surgery and various proposed management strategies, post-LASIK regression is still frequently observed and has significant implications for the long-term visual performance and quality of life of patients. This review explores the mechanism of refractive regression after both myopic and hyperopic LASIK, predisposing risk factors and its clinical course. In addition, current preventative strategies and therapies are also reviewed. © 2018 Royal Australian and New Zealand College of Ophthalmologists.
Influence diagnostics in meta-regression model.
Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua
2017-09-01
This paper studies influence diagnostics in the meta-regression model, including case deletion diagnostics and local influence analysis. We derive the subset deletion formulae for the estimation of the regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measures are defined. The local influence analysis based on the case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme is explored. We introduce a method that simultaneously perturbs responses, covariates, and within-variance to obtain the local influence measure, which has the advantage of being able to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.
Principal component regression for crop yield estimation
Suryanarayana, T M V
2016-01-01
This book highlights the estimation of crop yield in Central Gujarat, especially with regard to the development of Multiple Regression Models and Principal Component Regression (PCR) models using climatological parameters as independent variables and crop yield as a dependent variable. It subsequently compares the multiple linear regression (MLR) and PCR results, and discusses the significance of PCR for crop yield estimation. In this context, the book also covers Principal Component Analysis (PCA), a statistical procedure used to reduce a number of correlated variables into a smaller number of uncorrelated variables called principal components (PC). This book will be helpful to the students and researchers, starting their works on climate and agriculture, mainly focussing on estimation models. The flow of chapters takes the readers in a smooth path, in understanding climate and weather and impact of climate change, and gradually proceeds towards downscaling techniques and then finally towards development of ...
Regression Models for Market-Shares
DEFF Research Database (Denmark)
Birch, Kristina; Olsen, Jørgen Kai; Tjur, Tue
2005-01-01
On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put on the interpretation of the parameters in relation to models for the total sales based on discrete choice models. Key words and phrases: MCI model, discrete choice model, market-shares, price elasticity, regression model.
On directional multiple-output quantile regression
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2011-01-01
Vol. 102, No. 2 (2011), pp. 193-212 ISSN 0047-259X R&D Projects: GA MŠk(CZ) 1M06047 Grant - others: Commission EC(BE) Fonds National de la Recherche Scientifique Institutional research plan: CEZ:AV0Z10750506 Keywords: multivariate quantile * quantile regression * multiple-output regression * halfspace depth * portfolio optimization * value-at-risk Subject RIV: BA - General Mathematics Impact factor: 0.879, year: 2011 http://library.utia.cas.cz/separaty/2011/SI/siman-0364128.pdf
Removing Malmquist bias from linear regressions
Verter, Frances
1993-01-01
Malmquist bias is present in all astronomical surveys where sources are observed above an apparent brightness threshold. Those sources which can be detected at progressively larger distances are progressively more limited to the intrinsically luminous portion of the true distribution. This bias does not distort any of the measurements, but distorts the sample composition. We have developed the first treatment to correct for Malmquist bias in linear regressions of astronomical data. A demonstration of the corrected linear regression that is computed in four steps is presented.
Robust median estimator in logistic regression
Czech Academy of Sciences Publication Activity Database
Hobza, T.; Pardo, L.; Vajda, Igor
2008-01-01
Vol. 138, No. 12 (2008), pp. 3822-3840 ISSN 0378-3758 R&D Projects: GA MŠk 1M0572 Grant - others: Instituto Nacional de Estadistica (ES) MPO FI - IM3/136; GA MŠk(CZ) MTM 2006-06872 Institutional research plan: CEZ:AV0Z10750506 Keywords: Logistic regression * Median * Robustness * Consistency and asymptotic normality * Morgenthaler * Bianco and Yohai * Croux and Haesbroeck Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.679, year: 2008 http://library.utia.cas.cz/separaty/2008/SI/vajda-robust%20median%20estimator%20in%20logistic%20regression.pdf
Demonstration of a Fiber Optic Regression Probe
Korman, Valentin; Polzin, Kurt A.
2010-01-01
The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics make it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers change, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for
KELEŞ, Taliha; ALTUN, Murat
2016-01-01
Regression analysis is a statistical technique for investigating and modeling the relationship between variables. The purpose of this study was the trivial presentation of the equation for orthogonal regression (OR) and the comparison of classical linear regression (CLR) and OR techniques with respect to the sum of squared perpendicular distances. For that purpose, the analyses were shown by an example. It was found that the sum of squared perpendicular distances of OR is smaller. Thus, it wa...
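The comparison between classical linear regression (CLR) and orthogonal regression (OR) described above can be sketched in a few lines. This is a minimal illustration, assuming the standard closed-form OR slope for equal error variances in both variables; the data points are made up for the example and are not from the study.

```python
import math


def fit_clr_and_or(xs, ys):
    """Return ((a_clr, b_clr), (a_or, b_or)) for the line y = a + b * x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    # Classical least squares minimizes vertical distances.
    b_clr = sxy / sxx
    # Orthogonal regression minimizes perpendicular distances (requires sxy != 0).
    b_or = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return (ybar - b_clr * xbar, b_clr), (ybar - b_or * xbar, b_or)


def perpendicular_ss(xs, ys, a, b):
    """Sum of squared perpendicular distances from the points to y = a + b * x."""
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / (1.0 + b * b)


# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]

(a_clr, b_clr), (a_or, b_or) = fit_clr_and_or(xs, ys)
print("CLR perpendicular SS:", perpendicular_ss(xs, ys, a_clr, b_clr))
print("OR  perpendicular SS:", perpendicular_ss(xs, ys, a_or, b_or))
```

Because OR minimizes exactly this criterion, its sum of squared perpendicular distances can never exceed that of the CLR line on the same data.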
Islam, Mohammad Mafijul; Alam, Morshed; Tariquzaman, Md; Kabir, Mohammad Alamgir; Pervin, Rokhsona; Begum, Munni; Khan, Md Mobarak Hossain
2013-01-08
Malnutrition is one of the principal causes of child mortality in developing countries including Bangladesh. According to our knowledge, most of the available studies, that addressed the issue of malnutrition among under-five children, considered the categorical (dichotomous/polychotomous) outcome variables and applied logistic regression (binary/multinomial) to find their predictors. In this study malnutrition variable (i.e. outcome) is defined as the number of under-five malnourished children in a family, which is a non-negative count variable. The purposes of the study are (i) to demonstrate the applicability of the generalized Poisson regression (GPR) model as an alternative of other statistical methods and (ii) to find some predictors of this outcome variable. The data is extracted from the Bangladesh Demographic and Health Survey (BDHS) 2007. Briefly, this survey employs a nationally representative sample which is based on a two-stage stratified sample of households. A total of 4,460 under-five children is analysed using various statistical techniques namely Chi-square test and GPR model. The GPR model (as compared to the standard Poisson regression and negative Binomial regression) is found to be justified to study the above-mentioned outcome variable because of its under-dispersion (variance variable namely mother's education, father's education, wealth index, sanitation status, source of drinking water, and total number of children ever born to a woman. Consistencies of our findings in light of many other studies suggest that the GPR model is an ideal alternative of other statistical models to analyse the number of under-five malnourished children in a family. Strategies based on significant predictors may improve the nutritional status of children in Bangladesh.
Method for nonlinear exponential regression analysis
Junkin, B. G.
1972-01-01
Two computer programs developed according to two general types of exponential models for conducting nonlinear exponential regression analysis are described. Least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. Program is written in FORTRAN 5 for the Univac 1108 computer.
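The linearize-and-solve procedure mentioned in this abstract can be sketched as a Gauss-Newton iteration. The abstract does not give the two model forms, so the single-term model y = a * exp(b * x), the function name and the starting values below are assumptions made for illustration only, and the sketch is in Python rather than the original FORTRAN.

```python
import math


def fit_exponential(xs, ys, a, b, tol=1e-12, max_iter=50):
    """Gauss-Newton fit of y = a * exp(b * x), starting from the guess (a, b)."""
    for _ in range(max_iter):
        # First-order Taylor expansion of the model around the current (a, b).
        e = [math.exp(b * x) for x in xs]
        r = [y - a * ei for y, ei in zip(ys, e)]      # residuals
        ja = e                                        # d(model)/da
        jb = [a * x * ei for x, ei in zip(xs, e)]     # d(model)/db
        # Normal equations (J^T J) delta = J^T r of the linearized problem.
        s11 = sum(u * u for u in ja)
        s12 = sum(u * v for u, v in zip(ja, jb))
        s22 = sum(v * v for v in jb)
        g1 = sum(u * ri for u, ri in zip(ja, r))
        g2 = sum(v * ri for v, ri in zip(jb, r))
        det = s11 * s22 - s12 * s12
        da = (g1 * s22 - g2 * s12) / det
        db = (s11 * g2 - s12 * g1) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


# Noise-free synthetic data generated from a = 2, b = 0.5 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

a_hat, b_hat = fit_exponential(xs, ys, a=1.5, b=0.4)
print(a_hat, b_hat)
```

Plain Gauss-Newton has no step-size control, so it relies on a reasonable starting guess; a damped variant would be safer for noisy data or poor initial values.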
Measurement Error in Education and Growth Regressions
Portela, Miguel; Alessie, Rob; Teulings, Coen
2010-01-01
The use of the perpetual inventory method for the construction of education data per country leads to systematic measurement error. This paper analyzes its effect on growth regressions. We suggest a methodology for correcting this error. The standard attenuation bias suggests that using these
The M Word: Multicollinearity in Multiple Regression.
Morrow-Howell, Nancy
1994-01-01
Notes that existence of substantial correlation between two or more independent variables creates problems of multicollinearity in multiple regression. Discusses multicollinearity problem in social work research in which independent variables are usually intercorrelated. Clarifies problems created by multicollinearity, explains detection of…
Regression Discontinuity Designs Based on Population Thresholds
DEFF Research Database (Denmark)
Eggers, Andrew C.; Freier, Ronny; Grembi, Veronica
In many countries, important features of municipal government (such as the electoral system, mayors' salaries, and the number of councillors) depend on whether the municipality is above or below arbitrary population thresholds. Several papers have used a regression discontinuity design (RDD...
Deriving the Regression Line with Algebra
Quintanilla, John A.
2017-01-01
Exploration with spreadsheets and reliance on previous skills can lead students to determine the line of best fit. To perform linear regression on a set of data, students in Algebra 2 (or, in principle, Algebra 1) do not have to settle for using the mysterious "black box" of their graphing calculators (or other classroom technologies).…
Piecewise linear regression splines with hyperbolic covariates
International Nuclear Information System (INIS)
Cologne, John B.; Sposto, Richard
1992-09-01
Consider the problem of fitting a curve to data that exhibit a multiphase linear response with smooth transitions between phases. We propose substituting hyperbolas as covariates in piecewise linear regression splines to obtain curves that are smoothly joined. The method provides an intuitive and easy way to extend the two-phase linear hyperbolic response model of Griffiths and Miller and Watts and Bacon to accommodate more than two linear segments. The resulting regression spline with hyperbolic covariates may be fit by nonlinear regression methods to estimate the degree of curvature between adjoining linear segments. The added complexity of fitting nonlinear, as opposed to linear, regression models is not great. The extra effort is particularly worthwhile when investigators are unwilling to assume that the slope of the response changes abruptly at the join points. We can also estimate the join points (the values of the abscissas where the linear segments would intersect if extrapolated) if their number and approximate locations may be presumed known. An example using data on changing age at menarche in a cohort of Japanese women illustrates the use of the method for exploratory data analysis. (author)
Functional data analysis of generalized regression quantiles
Guo, Mengmeng
2013-11-05
Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.
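The iterative least asymmetrically weighted squares idea can be shown in its simplest setting, a single sample expectile. This sketch leaves out the splines, principal component functions and penalty of the paper's joint estimator; the function name and data are hypothetical.

```python
def sample_expectile(ys, tau, tol=1e-12, max_iter=200):
    """tau-expectile of a sample (0 < tau < 1) by asymmetrically weighted means."""
    mu = sum(ys) / len(ys)  # start from the ordinary mean
    for _ in range(max_iter):
        # Observations above the current estimate get weight tau, others 1 - tau.
        ws = [tau if y > mu else 1.0 - tau for y in ys]
        new_mu = sum(w * y for w, y in zip(ws, ys)) / sum(ws)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Made-up illustrative sample with a long right tail.
data = [1.0, 2.0, 3.0, 4.0, 10.0]

for tau in (0.1, 0.5, 0.9):
    print(tau, sample_expectile(data, tau))
```

With tau = 0.5 the weights are symmetric and the expectile coincides with the mean; values of tau near 0 or 1 move the estimate toward the corresponding tail.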
Regression testing Ajax applications : Coping with dynamism
Roest, D.; Mesbah, A.; Van Deursen, A.
2009-01-01
Note: This paper is a pre-print of: Danny Roest, Ali Mesbah and Arie van Deursen. Regression Testing AJAX Applications: Coping with Dynamism. In Proceedings of the 3rd International Conference on Software Testing, Verification and Validation (ICST’10), Paris, France. IEEE Computer Society, 2010.
Group-wise partial least square regression
Camacho, José; Saccenti, Edoardo
2018-01-01
This paper introduces the group-wise partial least squares (GPLS) regression. GPLS is a new sparse PLS technique where the sparsity structure is defined in terms of groups of correlated variables, similarly to what is done in the related group-wise principal component analysis. These groups are
Functional data analysis of generalized regression quantiles
Guo, Mengmeng; Zhou, Lan; Huang, Jianhua Z.; Härdle, Wolfgang Karl
2013-01-01
Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.
Finite Algorithms for Robust Linear Regression
DEFF Research Database (Denmark)
Madsen, Kaj; Nielsen, Hans Bruun
1990-01-01
The Huber M-estimator for robust linear regression is analyzed. Newton type methods for solution of the problem are defined and analyzed, and finite convergence is proved. Numerical experiments with a large number of test problems demonstrate efficiency and indicate that this kind of approach may...
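To illustrate what the Huber M-estimator does, the sketch below fits a straight line by iteratively reweighted least squares (IRLS). This is a simpler substitute for the Newton-type finite algorithms analyzed in the paper, not a reproduction of them. The tuning constant k is assumed to be given in the units of the residuals (no scale estimation), and the data are made up.

```python
def weighted_line(xs, ys, ws):
    """Weighted least-squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return ybar - b * xbar, b


def huber_line(xs, ys, k=1.0, tol=1e-10, max_iter=500):
    """Huber M-estimate of a line via IRLS, starting from ordinary least squares."""
    a, b = weighted_line(xs, ys, [1.0] * len(xs))
    for _ in range(max_iter):
        residuals = [y - a - b * x for x, y in zip(xs, ys)]
        # Huber weights: 1 inside [-k, k], k / |r| outside.
        ws = [1.0 if abs(r) <= k else k / abs(r) for r in residuals]
        a_new, b_new = weighted_line(xs, ys, ws)
        if abs(a_new - a) + abs(b_new - b) < tol:
            return a_new, b_new
        a, b = a_new, b_new
    return a, b


# Made-up data: y = 1 + 2x exactly, except for one gross outlier at x = 9.
xs = [float(i) for i in range(10)]
ys = [1.0 + 2.0 * x for x in xs]
ys[9] += 30.0

a_ols, b_ols = weighted_line(xs, ys, [1.0] * len(xs))
a_hub, b_hub = huber_line(xs, ys)
print("OLS slope:  ", b_ols)
print("Huber slope:", b_hub)
```

The outlier pulls the ordinary least-squares slope well away from 2, while the Huber fit bounds its influence and stays much closer to the underlying line.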
Function approximation with polynomial regression splines
International Nuclear Information System (INIS)
Urbanski, P.
1996-01-01
Principles of the polynomial regression splines as well as algorithms and programs for their computation are presented. The programs prepared using software package MATLAB are generally intended for approximation of the X-ray spectra and can be applied in the multivariate calibration of radiometric gauges. (author)
Assessing risk factors for periodontitis using regression
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using regression, linear and logistic models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
Predicting Social Trust with Binary Logistic Regression
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
Yet another look at MIDAS regression
Ph.H.B.F. Franses (Philip Hans)
2016-01-01
A MIDAS regression involves a dependent variable observed at a low frequency and independent variables observed at a higher frequency. This paper relates a true high frequency data generating process, where also the dependent variable is observed (hypothetically) at the high frequency,
Revisiting Regression in Autism: Heller's "Dementia Infantilis"
Westphal, Alexander; Schelinski, Stefanie; Volkmar, Fred; Pelphrey, Kevin
2013-01-01
Theodor Heller first described a severe regression of adaptive function in normally developing children, something he termed dementia infantilis, over 100 years ago. Dementia infantilis is most closely related to the modern diagnosis, childhood disintegrative disorder. We translate Heller's paper, Uber Dementia Infantilis, and discuss…
Fast multi-output relevance vector regression
Ha, Youngmin
2017-01-01
This paper aims to decrease the time complexity of multi-output relevance vector regression from O(VM^3) to O(V^3+M^3), where V is the number of output dimensions, M is the number of basis functions, and V
Regression Equations for Birth Weight Estimation using ...
African Journals Online (AJOL)
In this study, Birth Weight has been estimated from anthropometric measurements of hand and foot. Linear regression equations were formed from each of the measured variables. These simple equations can be used to estimate Birth Weight of new born babies, in order to identify those with low birth weight and referred to ...
Superquantile Regression: Theory, Algorithms, and Applications
2014-12-01
...Navy submariners, reliability engineering, uncertainty quantification, and financial risk management. Superquantile, superquantile regression...Royset Carlos F. Borges Associate Professor of Operations Research Dissertation Supervisor Professor of Applied Mathematics Lyn R. Whitaker Javier
Measurement error in education and growth regressions
Portela, Miguel; Teulings, Coen; Alessie, R.
2004-01-01
The perpetual inventory method used for the construction of education data per country leads to systematic measurement error. This paper analyses the effect of this measurement error on GDP regressions. There is a systematic difference in the education level between census data and observations
Panel data specifications in nonparametric kernel regression
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...
transformation of independent variables in polynomial regression ...
African Journals Online (AJOL)
Ada
preferable when possible to work with a simple functional form in transformed variables rather than with a more complicated form in the original variables. In this paper, it is shown that linear transformations applied to independent variables in polynomial regression models affect the t ratio and hence the statistical ...
Multiple Linear Regression: A Realistic Reflector.
Nutt, A. T.; Batsell, R. R.
Examples of the use of Multiple Linear Regression (MLR) techniques are presented. This is done to show how MLR aids data processing and decision-making by providing the decision-maker with freedom in phrasing questions and by accurately reflecting the data on hand. A brief overview of the rationale underlying MLR is given, some basic definitions…
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
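The rule of a group difference equal to the slope times twice the standard deviation of the covariate can be illustrated with a short calculation. The sketch below assumes a linear regression with normally distributed errors and a normal-approximation two-sample test; it does not reproduce the paper's treatment of logistic or Cox models (which also fixes the expected number of events), and the function names and example values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def two_sample_n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided two-sample test."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    return 2.0 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


def regression_total_n(slope, sd_x, sigma, alpha=0.05, power=0.80):
    """Total sample size for testing a slope via the equivalent two-sample problem."""
    delta = slope * 2.0 * sd_x  # group difference: slope times twice the SD of x
    return ceil(2.0 * two_sample_n_per_group(delta, sigma, alpha, power))


# Illustrative values only: slope 0.5, covariate SD 1.0, residual SD 2.0.
n_total = regression_total_n(slope=0.5, sd_x=1.0, sigma=2.0)
```

Under these assumptions the total agrees with the usual closed form for a slope test, (z_alpha + z_beta)^2 * sigma^2 / (slope^2 * sd_x^2), which is one way to see why the two-sample problem is called equivalent.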
Controlling attribute effect in linear regression
Calders, Toon; Karim, Asim A.; Kamiran, Faisal; Ali, Wasif Mohammad; Zhang, Xiangliang
2013-01-01
In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then we analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or residuals of the models. Experiments with discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques are successful in controlling attribute effects in linear regression models. © 2013 IEEE.
Stochastic development regression using method of moments
DEFF Research Database (Denmark)
Kühnel, Line; Sommer, Stefan Horst
2017-01-01
This paper considers the estimation problem arising when inferring parameters in the stochastic development regression model for manifold valued non-linear data. Stochastic development regression captures the relation between manifold-valued response and Euclidean covariate variables using the stochastic development construction. It is thereby able to incorporate several covariate variables and random effects. The model is intrinsically defined using the connection of the manifold, and the use of stochastic development avoids linearizing the geometry. We propose to infer parameters using the Method of Moments procedure that matches known constraints on moments of the observations conditional on the latent variables. The performance of the model is investigated in a simulation example using data on finite dimensional landmark manifolds.
Beta-binomial regression and bimodal utilization.
Liu, Chuan-Fen; Burgess, James F; Manning, Willard G; Maciejewski, Matthew L
2013-10-01
To illustrate how the analysis of bimodal U-shaped distributed utilization can be modeled with beta-binomial regression, which is rarely used in health services research. Veterans Affairs (VA) administrative data and Medicare claims in 2001-2004 for 11,123 Medicare-eligible VA primary care users in 2000. We compared means and distributions of VA reliance (the proportion of all VA/Medicare primary care visits occurring in VA) predicted from beta-binomial, binomial, and ordinary least-squares (OLS) models. Beta-binomial model fits the bimodal distribution of VA reliance better than binomial and OLS models due to the nondependence on normality and the greater flexibility in shape parameters. Increased awareness of beta-binomial regression may help analysts apply appropriate methods to outcomes with bimodal or U-shaped distributions. © Health Research and Educational Trust.
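To see why a beta-binomial distribution can accommodate a U-shaped outcome, it may help to evaluate its probability mass function for shape parameters below one. This is a minimal sketch of the distribution only, not of the regression model fitted in the study; the shape values 0.5 and 0.5 and the number of visits n = 10 are illustrative assumptions.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """P(K = k) for K ~ BetaBinomial(n, a, b)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


n = 10
# With both shape parameters below 1 the mass piles up at k = 0 and k = n.
pmf = [beta_binomial_pmf(k, n, a=0.5, b=0.5) for k in range(n + 1)]
```

A plain binomial with the same n has a single interior mode, so it cannot place most of its mass at both extremes in this way.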
Testing homogeneity in Weibull-regression models.
Bolfarine, Heleno; Valença, Dione M
2005-10-01
In survival studies with families or geographical units it may be of interest to test whether such groups are homogeneous for given explanatory variables. In this paper we consider score type tests for group homogeneity based on a mixing model in which the group effect is modelled as a random variable. As opposed to hazard-based frailty models, this model presents survival times that, conditioned on the random effect, have an accelerated failure time representation. The test statistic requires only estimation of the conventional regression model without the random effect and does not require specifying the distribution of the random effect. The tests are derived for a Weibull regression model and, in the uncensored situation, a closed form is obtained for the test statistic. A simulation study is used for comparing the power of the tests. The proposed tests are applied to real data sets with censored data.
Are increases in cigarette taxation regressive?
Borren, P; Sutton, M
1992-12-01
Using the latest published data from Tobacco Advisory Council surveys, this paper re-evaluates the question of whether or not increases in cigarette taxation are regressive in the United Kingdom. The extended data set shows no evidence of increasing price-elasticity by social class as found in a major previous study. To the contrary, there appears to be no clear pattern in the price responsiveness of smoking behaviour across different social classes. Increases in cigarette taxation, while reducing smoking levels in all groups, fall most heavily on men and women in the lowest social class. Men and women in social class five can expect to pay eight and eleven times more of a tax increase respectively, than their social class one counterparts. Taken as a proportion of relative incomes, the regressive nature of increases in cigarette taxation is even more pronounced.
Model selection in kernel ridge regression
DEFF Research Database (Denmark)
Exterkate, Peter
2013-01-01
Kernel ridge regression is a technique to perform ridge regression with a potentially infinite number of nonlinear transformations of the independent variables as regressors. This method is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts. The influence of the choice of kernel and the setting of tuning parameters on forecast accuracy is investigated. Several popular kernels are reviewed, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. The latter two kernels are interpreted in terms of their smoothing properties, and the tuning parameters associated to all these kernels are related to smoothness measures of the prediction function and to the signal-to-noise ratio. Based on these interpretations, guidelines are provided for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study...
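As a rough illustration of the technique described above, the sketch below fits a kernel ridge regression with a Gaussian kernel on a one-dimensional input using only the standard library. It is a minimal sketch under simplifying assumptions (scalar inputs, a hand-written linear solver, no cross-validation), and the names `fit_krr`, `predict_krr`, `bandwidth` and `ridge` are hypothetical rather than taken from the paper.

```python
from math import exp


def gaussian_kernel(x, z, bandwidth):
    """Gaussian kernel between two scalar inputs."""
    return exp(-((x - z) ** 2) / (2.0 * bandwidth ** 2))


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_krr(xs, ys, bandwidth, ridge):
    """Dual coefficients alpha solving (K + ridge * I) alpha = y."""
    k = [[gaussian_kernel(xi, xj, bandwidth) + (ridge if i == j else 0.0)
          for j, xj in enumerate(xs)] for i, xi in enumerate(xs)]
    return solve(k, ys)


def predict_krr(x_new, xs, alpha, bandwidth):
    """Prediction as a kernel-weighted sum over the training inputs."""
    return sum(a * gaussian_kernel(x_new, xi, bandwidth) for a, xi in zip(alpha, xs))


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 for x in xs]
alpha = fit_krr(xs, ys, bandwidth=1.0, ridge=1e-6)
```

In practice the bandwidth and ridge values would be chosen from a small grid by cross-validation, as the abstract suggests; they are fixed here only to keep the example short.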
Confidence bands for inverse regression models
International Nuclear Information System (INIS)
Birke, Melanie; Bissantz, Nicolai; Holzmann, Hajo
2010-01-01
We construct uniform confidence bands for the regression function in inverse, homoscedastic regression models with convolution-type operators. Here, the convolution is between two non-periodic functions on the whole real line rather than between two periodic functions on a compact interval, since the former situation arguably arises more often in applications. First, following Bickel and Rosenblatt (1973 Ann. Stat. 1 1071–95) we construct asymptotic confidence bands which are based on strong approximations and on a limit theorem for the supremum of a stationary Gaussian process. Further, we propose bootstrap confidence bands based on the residual bootstrap and prove consistency of the bootstrap procedure. A simulation study shows that the bootstrap confidence bands perform reasonably well for moderate sample sizes. Finally, we apply our method to data from a gel electrophoresis experiment with genetically engineered neuronal receptor subunits incubated with rat brain extract
Regressing Atherosclerosis by Resolving Plaque Inflammation
2017-07-01
regression requires the alteration of macrophages in the plaques to a tissue repair “alternatively” activated state. This switch in activation state requires the action of TH2 cytokines interleukin (IL)-4 or IL-13. To...
Determination of regression laws: Linear and nonlinear
International Nuclear Information System (INIS)
Onishchenko, A.M.
1994-01-01
A detailed mathematical determination of regression laws is presented in the article. Particular emphasis is placed on determining the laws of X_j on X_l to account for source nuclei decay and detector errors in nuclear physics instrumentation. Both linear and nonlinear relations are presented. Linearization of 19 functions is tabulated, including graph, relation, variable substitution, obtained linear function, and remarks. 6 refs., 1 tab
Directional quantile regression in Octave (and MATLAB)
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2016-01-01
Vol. 52, No. 1 (2016), pp. 28-51. ISSN 0023-5954. R&D Projects: GA ČR GA14-07234S. Institutional support: RVO:67985556. Keywords: quantile regression; multivariate quantile; depth contour; Matlab. Subject RIV: IN - Informatics, Computer Science. Impact factor: 0.379, year: 2016. http://library.utia.cas.cz/separaty/2016/SI/bocek-0458380.pdf
Logistic regression: a self-learning text
Kleinbaum, David G
1994-01-01
This textbook provides students and professionals in the health sciences with a presentation of the use of logistic regression in research. The text is self-contained, and designed to be used both in class or as a tool for self-study. It arises from the author's many years of experience teaching this material and the notes on which it is based have been extensively used throughout the world.
Multitask Quantile Regression under the Transnormal Model.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2016-01-01
We consider estimating multi-task quantile regression under the transnormal model, with focus on high-dimensional setting. We derive a surprisingly simple closed-form solution through rank-based covariance regularization. In particular, we propose the rank-based ℓ 1 penalization with positive definite constraints for estimating sparse covariance matrices, and the rank-based banded Cholesky decomposition regularization for estimating banded precision matrices. By taking advantage of alternating direction method of multipliers, nearest correlation matrix projection is introduced that inherits sampling properties of the unprojected one. Our work combines strengths of quantile regression and rank-based covariance regularization to simultaneously deal with nonlinearity and nonnormality for high-dimensional regression. Furthermore, the proposed method strikes a good balance between robustness and efficiency, achieves the "oracle"-like convergence rate, and provides the provable prediction interval under the high-dimensional setting. The finite-sample performance of the proposed method is also examined. The performance of our proposed rank-based method is demonstrated in a real application to analyze the protein mass spectroscopy data.
Complex regression Doppler optical coherence tomography
Elahi, Sahar; Gu, Shi; Thrane, Lars; Rollins, Andrew M.; Jenkins, Michael W.
2018-04-01
We introduce a new method to measure Doppler shifts more accurately and extend the dynamic range of Doppler optical coherence tomography (OCT). The two-point estimate of the conventional Doppler method is replaced with a regression that is applied to high-density B-scans in polar coordinates. We built a high-speed OCT system using a 1.68-MHz Fourier domain mode locked laser to acquire high-density B-scans (16,000 A-lines) at high enough frame rates (~100 fps) to accurately capture the dynamics of the beating embryonic heart. Flow phantom experiments confirm that the complex regression lowers the minimum detectable velocity from 12.25 mm/s to 374 μm/s, whereas the maximum velocity of 400 mm/s is measured without phase wrapping. Complex regression Doppler OCT also demonstrates higher accuracy and precision compared with the conventional method, particularly when signal-to-noise ratio is low. The extended dynamic range allows monitoring of blood flow over several stages of development in embryos without adjusting the imaging parameters. In addition, applying complex averaging recovers hidden features in structural images.
Linear regression and the normality assumption.
Schmidt, Amand F; Finan, Chris
2017-12-16
Researchers often perform arbitrary outcome transformations to fulfill the normality assumption of a linear regression model. This commentary explains and illustrates that in large data settings, such transformations are often unnecessary and, worse, may bias model estimates. Linear regression assumptions are illustrated using simulated data and an empirical example on the relation between time since type 2 diabetes diagnosis and glycated hemoglobin levels. Simulation results were evaluated on coverage; i.e., the number of times the 95% confidence interval included the true slope coefficient. Although outcome transformations bias point estimates, violations of the normality assumption in linear regression analyses do not. The normality assumption is necessary to unbiasedly estimate standard errors, and hence confidence intervals and P-values. However, in large sample sizes (e.g., where the number of observations per variable is >10) violations of this normality assumption often do not noticeably impact results. Contrary to this, assumptions on the parametric model, absence of extreme observations, homoscedasticity, and independence of the errors remain influential even in large sample size settings. Given that modern healthcare research typically includes thousands of subjects, focusing on the normality assumption is often unnecessary, does not guarantee valid results, and, worse, may bias estimates due to the practice of outcome transformations. Copyright © 2017 Elsevier Inc. All rights reserved.
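The coverage criterion described above can be sketched with a small simulation: fit a simple linear regression to data with skewed errors many times and count how often the 95% confidence interval contains the true slope. This is an illustrative sketch, not the commentary's simulation design; the exponential error distribution, the sample size of 200, the seed and the function names are all assumptions.

```python
import random
from math import sqrt


def slope_ci_covers(n, true_slope, rng):
    """Fit y = a + b*x by OLS and report whether the 95% CI for b covers true_slope."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    # Skewed, non-normal errors: exponential with mean 1, centred at zero.
    ys = [1.0 + true_slope * x + (rng.expovariate(1.0) - 1.0) for x in xs]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar
    rss = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    se = sqrt(rss / (n - 2) / sxx)
    # Normal critical value; a reasonable stand-in for the t quantile when n is large.
    return abs(b - true_slope) <= 1.96 * se


def coverage(n=200, reps=1000, true_slope=2.0, seed=1):
    """Share of simulated data sets whose interval contains the true slope."""
    rng = random.Random(seed)
    return sum(slope_ci_covers(n, true_slope, rng) for _ in range(reps)) / reps
```

Calling `coverage()` and comparing the result with the nominal 0.95, then repeating with a much smaller `n`, is one way to explore how sample size moderates the effect of non-normal errors.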
Satellite rainfall retrieval by logistic regression
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistical model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistical model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistical model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistical model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Modeling oil production based on symbolic regression
International Nuclear Information System (INIS)
Yang, Guangfei; Li, Xianneng; Wang, Jianliang; Lian, Lian; Ma, Tieju
2015-01-01
Numerous models have been proposed to forecast the future trends of oil production and almost all of them are based on some predefined assumptions with various uncertainties. In this study, we propose a novel data-driven approach that uses symbolic regression to model oil production. We validate our approach on both synthetic and real data, and the results prove that symbolic regression could effectively identify the true models beneath the oil production data and also make reliable predictions. Symbolic regression indicates that world oil production will peak in 2021, which broadly agrees with other techniques used by researchers. Our results also show that the rate of decline after the peak is almost half the rate of increase before the peak, and it takes nearly 12 years to drop 4% from the peak. These predictions are more optimistic than those in several other reports, and the smoother decline will provide the world, especially the developing countries, with more time to orchestrate mitigation plans. -- Highlights: •A data-driven approach has been shown to be effective at modeling the oil production. •The Hubbert model could be discovered automatically from data. •The peak of world oil production is predicted to appear in 2021. •The decline rate after peak is half of the increase rate before peak. •Oil production projected to decline 4% post-peak
Face Alignment via Regressing Local Binary Features.
Ren, Shaoqing; Cao, Xudong; Wei, Yichen; Sun, Jian
2016-03-01
This paper presents a highly efficient and accurate regression approach for face alignment. Our approach has two novel components: 1) a set of local binary features and 2) a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. This approach achieves state-of-the-art results when tested on the most challenging benchmarks to date. Furthermore, because extracting and regressing local binary features are computationally very cheap, our system is much faster than previous methods. It achieves over 3000 frames per second (FPS) on a desktop or 300 FPS on a mobile phone for locating a few dozen landmarks. We also study a key issue that is important but has received little attention in previous research, namely the face detector used to initialize alignment. We investigate several face detectors and perform a quantitative evaluation of how they affect alignment accuracy. We find that an alignment-friendly detector can further greatly boost the accuracy of our alignment method, reducing the error by up to 16% in relative terms. To facilitate practical usage of face detection/alignment methods, we also propose a convenient metric to measure how good a detector is for alignment initialization.
Geographically weighted regression model on poverty indicator
Slamet, I.; Nugroho, N. F. T. A.; Muslich
2017-12-01
In this research, we applied geographically weighted regression (GWR) to analyze poverty in Central Java, using a Gaussian kernel as the weighting function. GWR uses the diagonal matrix obtained from evaluating the Gaussian kernel function as the weight matrix in the regression model. The kernel weights handle spatial effects in the data so that a model can be obtained for each location. The purpose of this paper is to model poverty percentage data in Central Java province using GWR with a Gaussian kernel weighting function and to determine the influencing factors in each regency/city in Central Java province. Based on the research, we obtained a geographically weighted regression model with a Gaussian kernel weighting function for the poverty percentage data in Central Java province. We found that the percentage of the population working as farmers, the population growth rate, the percentage of households with regular sanitation, and BPJS beneficiaries are the variables that affect the percentage of poverty in Central Java province. The coefficient of determination R2 is 68.64%. There are two categories of districts, which are influenced by different significant factors.
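The role of the Gaussian kernel weights in a location-specific fit can be sketched as follows. This is a minimal illustration with a single covariate and planar coordinates, not the study's model; the weight form exp(-0.5 * (d / bandwidth)^2) is one common Gaussian kernel and is assumed here, as are the function names and the toy data.

```python
from math import exp, hypot


def gaussian_weights(target, coords, bandwidth):
    """Gaussian kernel weight for each observation relative to the target location."""
    return [exp(-0.5 * (hypot(u - target[0], v - target[1]) / bandwidth) ** 2)
            for u, v in coords]


def local_fit(target, coords, xs, ys, bandwidth):
    """Weighted least squares fit of y = a + b*x centred on one location."""
    w = gaussian_weights(target, coords, bandwidth)
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, xs)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Toy data: four locations on a line, with y = 1 + 2x holding exactly everywhere.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
xs = [1.0, 2.0, 4.0, 3.0]
ys = [3.0, 5.0, 9.0, 7.0]
intercept, slope = local_fit((0.0, 0.0), coords, xs, ys, bandwidth=1.0)
```

Repeating `local_fit` with each regency or city as the target yields one set of coefficients per location, which is how GWR allows the influencing factors to differ across space; the weights correspond to the diagonal of the weight matrix mentioned in the abstract.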
Mixed-effects regression models in linguistics
Heylen, Kris; Geeraerts, Dirk
2018-01-01
When data consist of grouped observations or clusters, and there is a risk that measurements within the same group are not independent, group-specific random effects can be added to a regression model in order to account for such within-group associations. Regression models that contain such group-specific random effects are called mixed-effects regression models, or simply mixed models. Mixed models are a versatile tool that can handle both balanced and unbalanced datasets and that can also be applied when several layers of grouping are present in the data; these layers can either be nested or crossed. In linguistics, as in many other fields, the use of mixed models has gained ground rapidly over the last decade. This methodological evolution enables us to build more sophisticated and arguably more realistic models, but, due to its technical complexity, also introduces new challenges. This volume brings together a number of promising new evolutions in the use of mixed models in linguistics, but also addres...
On logistic regression analysis of dichotomized responses.
Lu, Kaifeng
2017-01-01
We study the properties of the treatment effect estimate in terms of odds ratio at the study end point from a logistic regression model adjusting for the baseline value when the underlying continuous repeated measurements follow a multivariate normal distribution. Compared with the analysis that does not adjust for the baseline value, the adjusted analysis produces a larger treatment effect as well as a larger standard error. However, the increase in standard error is more than offset by the increase in treatment effect so that the adjusted analysis is more powerful than the unadjusted analysis for detecting the treatment effect. On the other hand, the true adjusted odds ratio implied by the normal distribution of the underlying continuous variable is a function of the baseline value and hence is unlikely to be adequately represented by a single value of adjusted odds ratio from the logistic regression model. In contrast, the risk difference function derived from the logistic regression model provides a reasonable approximation to the true risk difference function implied by the normal distribution of the underlying continuous variable over the range of the baseline distribution. We show that different metrics of treatment effect have similar statistical power when evaluated at the baseline mean. Copyright © 2016 John Wiley & Sons, Ltd.
General regression and representation model for classification.
Directory of Open Access Journals (Sweden)
Jianjun Qian
Recently, regularized coding-based classification methods (e.g. SRC and CRC) have shown great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has the advantages of CRC, but also makes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients) and the specific information (weight matrix of image pixels) to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: the basic general regression and representation classifier (B-GRR) and the robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of the proposed methods over state-of-the-art algorithms.
Image superresolution using support vector regression.
Ni, Karl S; Nguyen, Truong Q
2007-06-01
A thorough investigation of the application of support vector regression (SVR) to the superresolution problem is conducted through various frameworks. Prior to the study, the SVR problem is enhanced by finding the optimal kernel. This is done by formulating the kernel learning problem in SVR form as a convex optimization problem, specifically a semi-definite programming (SDP) problem. An additional constraint is added to reduce the SDP to a quadratically constrained quadratic programming (QCQP) problem. After this optimization, investigation of the relevancy of SVR to superresolution proceeds with the possibility of using a single and general support vector regression for all image content, and the results are impressive for small training sets. This idea is improved upon by observing structural properties in the discrete cosine transform (DCT) domain to aid in learning the regression. Further improvement involves a combination of classification and SVR-based techniques, extending works in resolution synthesis. This method, termed kernel resolution synthesis, uses specific regressors for isolated image content to describe the domain through a partitioned look of the vector space, thereby yielding good results.
Directory of Open Access Journals (Sweden)
H. Mohammadi
2016-03-01
governmental job (21%), non-related to field of study and governmental job (25.5%), related to field of study and private job (25.7%), non-related to field of study and private job (8.4%), and 41.5% of graduates were unemployed. 77% of the statistical population said that the cause of unemployment is lack of capital. The variables of professionalism and evaluation of work culture in the statistical population were intermediate. The model showed that gender, field of study choice based on the level of interest, satisfaction with the studies, lack of capital as a barrier to employment, and evaluation of the educational and professional skills acquired during the studies have a significant impact on employment. The regression equation as a whole is significant. According to this, the likelihood of women working in government jobs and private and non-private jobs was lower than that of men. With increasing interest in the field of study, the possibility of working in related and unrelated private jobs increased. By increasing the degree of satisfaction, the possibility of working in government jobs increased. The lack of funds for activities in the field of employment is one of the problems. Therefore, the more the lack of capital to start and continue is highlighted, the stronger the tendency of people to seek employment in a public office and to avoid setting up private businesses. For the possibility of working in unrelated government jobs, none of the variables showed a significant relationship. For the possibility of working in private jobs, gender and level of interest in the choice of the field of study showed a significant relationship. Women are facing more difficulties than men in employment in the agricultural sector. Conclusion: The risk of unemployment is higher among women graduates since the usual trend is to create better conditions and motivations for male farmers who live in villages.
Also, they should be given some low-interest loans using banks' financial resources until economic activities are undertaken by
International Nuclear Information System (INIS)
Jafri, Y.Z.; Kamal, L.
2007-01-01
Various statistical techniques were used on five-year data (1998-2002) of average humidity, rainfall, and maximum and minimum temperatures. The relationships to regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters, on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit for our polynomial regression analysis time series (PRATS). The correlation to multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, and in both cases indicated homoscedasticity. We also employed Bartlett's test for homogeneity of variances on the five-year data of rainfall and humidity, which showed that the variances in the rainfall data were not homogeneous while those in the humidity data were. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)
Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M
2007-09-01
Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.
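The remission criterion stated in the abstract above can be written as a small predicate. This is only a sketch: the function name and parameters are hypothetical, and it assumes that "a reduction of seven points" means a drop of at least seven points from baseline, which the abstract does not spell out.

```python
def is_remission(score, baseline_score, threshold=13, min_reduction=7):
    """Return True when a Y-BOCS score meets the remission rule in the abstract.

    Assumption: the reduction is read as "at least `min_reduction` points
    below the baseline score"; the abstract does not say whether the bound
    is inclusive.
    """
    below_threshold = score < threshold
    reduced_enough = (baseline_score - score) >= min_reduction
    return below_threshold and reduced_enough
```

Applied at every measurement occasion, such a rule yields the sequence of remissions and relapses that a recurrent events analysis would use.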
A method for nonlinear exponential regression analysis
Junkin, B. G.
1971-01-01
A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
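The procedure described in the abstract above (a linear fit for the initial nominal estimates, then repeated corrections from a first-order Taylor expansion) can be sketched as follows. This is an illustrative sketch, not the report's program: it assumes a single-term model y = a·exp(-b·t) with strictly positive data, and the function name, stopping rule, and tolerance are hypothetical.

```python
import numpy as np

def fit_exponential_decay(t, y, max_iter=50, tol=1e-10):
    """Fit y ~ a * exp(-b * t) by iterated linearized least squares."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)

    # Initial nominal estimates: straight-line fit to log(y), which needs y > 0.
    slope, intercept = np.polyfit(t, np.log(y), 1)
    a, b = np.exp(intercept), -slope

    for _ in range(max_iter):
        decay = np.exp(-b * t)
        residual = y - a * decay
        # Partial derivatives of the model with respect to a and b
        # (the first-order Taylor expansion around the current estimate).
        jacobian = np.column_stack((decay, -a * t * decay))
        # Least-squares correction applied to the nominal estimate.
        delta, *_ = np.linalg.lstsq(jacobian, residual, rcond=None)
        a, b = a + delta[0], b + delta[1]
        # Assumed stopping criterion: the correction has become negligible.
        if np.max(np.abs(delta)) < tol:
            break
    return a, b
```

Calling `fit_exponential_decay(t, y)` on decay-type measurements returns the amplitude and decay rate; a model with several exponential terms would need a wider Jacobian and a different starting procedure.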
Three Contributions to Robust Regression Diagnostics
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2015-01-01
Vol. 11, No. 2 (2015), pp. 69-78 ISSN 1336-9180 Grant - others: GA ČR(CZ) GA13-01930S; Nadační fond na podporu vědy(CZ) Neuron Institutional support: RVO:67985807 Keywords: robust regression * robust econometrics * hypothesis testing Subject RIV: BA - General Mathematics http://www.degruyter.com/view/j/jamsi.2015.11.issue-2/jamsi-2015-0013/jamsi-2015-0013.xml?format=INT
SDE based regression for random PDEs
Bayer, Christian
2016-01-01
A simulation based method for the numerical solution of PDE with random coefficients is presented. By the Feynman-Kac formula, the solution can be represented as conditional expectation of a functional of a corresponding stochastic differential equation driven by independent noise. A time discretization of the SDE for a set of points in the domain and a subsequent Monte Carlo regression lead to an approximation of the global solution of the random PDE. We provide an initial error and complexity analysis of the proposed method along with numerical examples illustrating its behaviour.
Bayesian regression of piecewise homogeneous Poisson processes
Directory of Open Access Journals (Sweden)
Diego Sevilla
2015-12-01
In this paper, a Bayesian method for piecewise regression is adapted to handle counting processes data distributed as Poisson. A numerical code in Mathematica is developed and tested analyzing simulated data. The resulting method is valuable for detecting breaking points in the count rate of time series for Poisson processes. Received: 2 November 2015, Accepted: 27 November 2015; Edited by: R. Dickman; Reviewed by: M. Hutter, Australian National University, Canberra, Australia; DOI: http://dx.doi.org/10.4279/PIP.070018 Cite as: D J R Sevilla, Papers in Physics 7, 070018 (2015)
Selecting a Regression Saturated by Indicators
DEFF Research Database (Denmark)
Hendry, David F.; Johansen, Søren; Santos, Carlos
We consider selecting a regression model, using a variant of Gets, when there are more variables than observations, in the special case that the variables are impulse dummies (indicators) for every observation. We show that the setting is unproblematic if tackled appropriately, and obtain the finite-sample distribution of estimators of the mean and variance in a simple location-scale model under the null that no impulses matter. A Monte Carlo simulation confirms the null distribution, and shows power against an alternative of interest.
Mapping geogenic radon potential by regression kriging
Energy Technology Data Exchange (ETDEWEB)
Pásztor, László [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Szabó, Katalin Zsuzsanna, E-mail: sz_k_zs@yahoo.de [Department of Chemistry, Institute of Environmental Science, Szent István University, Páter Károly u. 1, Gödöllő 2100 (Hungary); Szatmári, Gábor; Laborczi, Annamária [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Horváth, Ákos [Department of Atomic Physics, Eötvös University, Pázmány Péter sétány 1/A, 1117 Budapest (Hungary)
2016-02-15
Radon (²²²Rn) gas is produced in the radioactive decay chain of uranium (²³⁸U), which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil, depending mainly on the physical and meteorological parameters of the soil, and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by geogenic radon potential (GRP). Identification of areas with high health risks requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which is interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore, the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. - Highlights: • A new method
Fixed kernel regression for voltammogram feature extraction
International Nuclear Information System (INIS)
Acevedo Rodriguez, F J; López-Sastre, R J; Gil-Jiménez, P; Maldonado Bascón, S; Ruiz-Reyes, N
2009-01-01
Cyclic voltammetry is an electroanalytical technique for obtaining information about substances under analysis without the need for complex flow systems. However, classifying the information in voltammograms obtained using this technique is difficult. In this paper, we propose the use of fixed kernel regression as a method for extracting features from these voltammograms, reducing the information to a few coefficients. The proposed approach has been applied to a wine classification problem with accuracy rates of over 98%. Although the method is described here for extracting voltammogram information, it can be used for other types of signals
Regression analysis for the social sciences
Gordon, Rachel A
2010-01-01
The book provides graduate students in the social sciences with the basic skills that they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly-available social science data or drawn from the literature; thorough integration of teaching statistical theory with teaching data processing and analysis; and teaching of both SAS and Stata "side-by-side", with chapter exercises in which students practice programming and interpretation on the same data set and course exercises in which students can choose their own research questions and data set.
SDE based regression for random PDEs
Bayer, Christian
2016-01-06
A simulation based method for the numerical solution of PDE with random coefficients is presented. By the Feynman-Kac formula, the solution can be represented as conditional expectation of a functional of a corresponding stochastic differential equation driven by independent noise. A time discretization of the SDE for a set of points in the domain and a subsequent Monte Carlo regression lead to an approximation of the global solution of the random PDE. We provide an initial error and complexity analysis of the proposed method along with numerical examples illustrating its behaviour.
Neutrosophic Correlation and Simple Linear Regression
Directory of Open Access Journals (Sweden)
A. A. Salama
2014-09-01
Since the world is full of indeterminacy, neutrosophics has found its place in contemporary research. The fundamental concepts of the neutrosophic set were introduced by Smarandache. Recently, Salama et al. introduced the concept of the correlation coefficient of neutrosophic data. In this paper, we introduce and study the concepts of correlation and correlation coefficient of neutrosophic data in probability spaces and study some of their properties. Also, we introduce and study the neutrosophic simple linear regression model. Possible applications to data processing are touched upon.
Spectral density regression for bivariate extremes
Castro Camilo, Daniela
2016-05-11
We introduce a density regression model for the spectral density of a bivariate extreme value distribution, that allows us to assess how extremal dependence can change over a covariate. Inference is performed through a double kernel estimator, which can be seen as an extension of the Nadaraya–Watson estimator where the usual scalar responses are replaced by mean constrained densities on the unit interval. Numerical experiments with the methods illustrate their resilience in a variety of contexts of practical interest. An extreme temperature dataset is used to illustrate our methods. © 2016 Springer-Verlag Berlin Heidelberg
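For orientation, the classical Nadaraya–Watson estimator that the abstract above says it extends can be sketched for ordinary scalar responses. This is not the paper's double kernel estimator for densities; the Gaussian kernel, the function name, and the bandwidth parameter are assumptions made for illustration.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Kernel-weighted average of scalar responses at a single query point."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Gaussian kernel weights: observations near x_query count the most.
    weights = np.exp(-0.5 * ((x_query - x_train) / bandwidth) ** 2)
    # The estimate is the weighted mean of the observed responses.
    return float(np.sum(weights * y_train) / np.sum(weights))
```

In the setting of the abstract, each scalar response would be replaced by a mean-constrained density on the unit interval, so the weighted average is taken over functions rather than numbers.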
SPE dose prediction using locally weighted regression
International Nuclear Information System (INIS)
Hines, J. W.; Townsend, L. W.; Nichols, T. F.
2005-01-01
When astronauts are outside earth's protective magnetosphere, they are subject to large radiation doses resulting from solar particle events (SPEs). The total dose received from a major SPE in deep space could cause severe radiation poisoning. The dose is usually received over a 20-40 h time interval but the event's effects may be mitigated with an early warning system. This paper presents a method to predict the total dose early in the event. It uses a locally weighted regression model, which is easier to train and provides predictions as accurate as neural network models previously used. (authors)
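A locally weighted regression of the general kind named in the abstract above fits a separate weighted linear model around each query point. The sketch below is generic and is not the authors' dose model: the Gaussian weighting, the function name, and the bandwidth parameter are assumptions.

```python
import numpy as np

def locally_weighted_predict(X, y, x_query, bandwidth=1.0):
    """Predict at x_query from a linear fit weighted toward nearby samples."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a 1-D input as a single feature column
    y = np.asarray(y, dtype=float)
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))

    # Gaussian weights based on squared distance to the query point.
    sq_dist = np.sum((X - x_query) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))

    # Weighted least squares with an intercept, solved by scaling each row
    # of the design matrix and response by the square root of its weight.
    design = np.column_stack((np.ones(len(X)), X))
    root_w = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return float(np.concatenate(([1.0], x_query)) @ beta)
```

There is no separate training phase beyond storing the data, which is consistent with the abstract's remark that the model is easier to train than a neural network; the cost is one small least-squares solve per prediction.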
Mapping geogenic radon potential by regression kriging
International Nuclear Information System (INIS)
Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos
2016-01-01
Radon (²²²Rn) gas is produced in the radioactive decay chain of uranium (²³⁸U), which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil, depending mainly on the physical and meteorological parameters of the soil, and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by geogenic radon potential (GRP). Identification of areas with high health risks requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which is interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore, the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. - Highlights: • A new method, regression
SPE dose prediction using locally weighted regression
International Nuclear Information System (INIS)
Hines, J. W.; Townsend, L. W.; Nichols, T. F.
2005-01-01
When astronauts are outside Earth's protective magnetosphere, they are subject to large radiation doses resulting from solar particle events. The total dose received from a major solar particle event in deep space could cause severe radiation poisoning. The dose is usually received over a 20-40 h time interval but the event's effects may be reduced with an early warning system. This paper presents a method to predict the total dose early in the event. It uses a locally weighted regression model, which is easier to train, and provides predictions as accurate as the neural network models that were used previously. (authors)
AIRLINE ACTIVITY FORECASTING BY REGRESSION MODELS
Directory of Open Access Journals (Sweden)
Н. Білак
2012-04-01
Linear and nonlinear regression models are proposed that take into account the trend equation and seasonality indices, both for analyzing and reconstructing the volume of passenger traffic over past periods and for predicting it for future years, together with an algorithm for forming these models based on statistical analysis over the years. The desired model is the first step toward the synthesis of more complex models, which will enable forecasting of an airline's passenger traffic (income level) with the highest accuracy and timeliness.
Logistic regression applied to natural hazards: rare event logistic regression with replications
Guns, M.; Vanacker, Veerle
2012-01-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logisti...
Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon
2015-01-01
Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.
Random Coefficient Logit Model for Large Datasets
C. Hernández-Mireles (Carlos); D. Fok (Dennis)
2010-01-01
We present an approach for analyzing market shares and products price elasticities based on large datasets containing aggregate sales data for many products, several markets and for relatively long time periods. We consider the recently proposed Bayesian approach of Jiang et al [Jiang,
Logit Analysis for Profit Maximizing Loan Classification
Watt, David L.; Mortensen, Timothy L.; Leistritz, F. Larry
1988-01-01
Lending criteria and loan classification methods are developed. Rating system breaking points are analyzed to present a method to maximize loan revenues. Financial characteristics of farmers are used as determinants of delinquency in a multivariate logistic model. Results indicate that the debt-to-asset and operating ratios are most indicative of default.
Bayesian nonlinear regression for large small problems
Chakraborty, Sounak; Ghosh, Malay; Mallick, Bani K.
2012-01-01
Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in the near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
Spontaneous regression of intracranial malignant lymphoma
International Nuclear Information System (INIS)
Kojo, Nobuto; Tokutomi, Takashi; Eguchi, Gihachirou; Takagi, Shigeyuki; Matsumoto, Tomie; Sasaguri, Yasuyuki; Shigemori, Minoru.
1988-01-01
In a 46-year-old female with a 1-month history of gait and speech disturbances, computed tomography (CT) demonstrated mass lesions of slightly high density in the left basal ganglia and left frontal lobe. The lesions were markedly enhanced by contrast medium. The patient received no specific treatment, but her clinical manifestations gradually abated and the lesions decreased in size. Five months after her initial examination, the lesions were absent on CT scans; only a small area of low density remained. Residual clinical symptoms included mild right hemiparesis and aphasia. After 14 months the patient again deteriorated, and a CT scan revealed mass lesions in the right frontal lobe and the pons. However, no enhancement was observed in the previously affected regions. A biopsy revealed malignant lymphoma. Despite treatment with steroids and radiation, the patient's clinical status progressively worsened and she died 27 months after initial presentation. Seven other cases of spontaneous regression of primary malignant lymphoma have been reported. In this case, the mechanism of the spontaneous regression was not clear, but changes in immunologic status may have been involved. (author)
Regression testing in the TOTEM DCS
International Nuclear Information System (INIS)
Rodríguez, F Lucas; Atanassov, I; Burkimsher, P; Frost, O; Taskinen, J; Tulimaki, V
2012-01-01
The Detector Control System of the TOTEM experiment at the LHC is built with the industrial product WinCC OA (PVSS). The TOTEM system is generated automatically through scripts using as input the detector Product Breakdown Structure (PBS) and its pinout connectivity, archiving and alarm meta-information, and some other heuristics based on the naming conventions. When those initial parameters and automation code are modified to include new features, the resulting PVSS system can also exhibit side effects. On a daily basis, a custom-developed regression testing tool takes the most recent code from a Subversion (SVN) repository and builds a new control system from scratch. This system is exported in plain text format using the PVSS export tool, and compared with a system previously validated by a human. A report is sent to the developers with any differences highlighted, in readiness for validation and acceptance as a new stable version. This regression approach is not dependent on any development framework or methodology. This process has worked satisfactorily for several months, proving to be a very valuable tool before deploying new versions in the production systems.
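The comparison step described in the abstract above (a fresh plain-text export checked against a human-validated baseline) can be sketched with the Python standard library. This is not the TOTEM tool: the function name and the "validated"/"candidate" labels are hypothetical, and reading the exports from disk or mailing the report is left out.

```python
import difflib

def diff_against_baseline(baseline_text, candidate_text):
    """Return unified-diff lines between a validated export and a new one.

    An empty list means the new build matches the validated baseline.
    """
    diff = difflib.unified_diff(
        baseline_text.splitlines(),
        candidate_text.splitlines(),
        fromfile="validated",   # hypothetical label for the approved export
        tofile="candidate",     # hypothetical label for the nightly build
        lineterm="",
    )
    return list(diff)
```

A nightly job could pass the two exported texts to this function and send the returned lines to the developers only when the list is non-empty; once a difference is accepted, the candidate export becomes the new baseline.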
Supporting Regularized Logistic Regression Privately and Efficiently
Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei
2016-01-01
As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nevertheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc. PMID:27271738
Structural Break Tests Robust to Regression Misspecification
Directory of Open Access Journals (Sweden)
Alaa Abi Morshed
2018-05-01
Structural break tests for regression models are sensitive to model misspecification. We show, analytically and through simulations, that the sup Wald test for breaks in the conditional mean and variance of a time series process exhibits severe size distortions when the conditional mean dynamics are misspecified. We also show that the sup Wald test for breaks in the unconditional mean and variance does not have the same size distortions, yet benefits from similar power to its conditional counterpart in correctly specified models. Hence, we propose using it as an alternative and complementary test for breaks. We apply the unconditional and conditional mean and variance tests to three US series: unemployment, industrial production growth and interest rates. Both the unconditional and the conditional mean tests detect a break in the mean of interest rates. However, for the other two series, the unconditional mean test does not detect a break, while the conditional mean tests based on dynamic regression models occasionally detect a break, with the implied break-point estimator varying across different dynamic specifications. For all series, the unconditional variance test does not detect a break, while most tests for the conditional variance do detect a break, which also varies across specifications.
Bayesian nonlinear regression for large p small n problems
Chakraborty, Sounak
2012-07-01
Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p, small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of the support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use a Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
Hyperspectral Unmixing with Robust Collaborative Sparse Regression
Directory of Open Access Journals (Sweden)
Chang Li
2016-07-01
Recently, sparse unmixing (SU) of hyperspectral data has received particular attention for analyzing remote sensing images. However, most SU methods are based on the commonly admitted linear mixing model (LMM), which ignores the possible nonlinear effects (i.e., nonlinearity). In this paper, we propose a new method named robust collaborative sparse regression (RCSR) based on the robust LMM (rLMM) for hyperspectral unmixing. The rLMM takes the nonlinearity into consideration, and the nonlinearity is simply treated as an outlier, which has the underlying sparse property. The RCSR simultaneously takes the collaborative sparse property of the abundance and the sparsely distributed additive property of the outlier into consideration, which can be formed as a robust joint sparse regression problem. The inexact augmented Lagrangian method (IALM) is used to optimize the proposed RCSR. The qualitative and quantitative experiments on synthetic datasets and real hyperspectral images demonstrate that the proposed RCSR is efficient for solving the hyperspectral SU problem compared with the other four state-of-the-art algorithms.
Interpreting parameters in the logistic regression model with random effects
DEFF Research Database (Denmark)
Larsen, Klaus; Petersen, Jørgen Holm; Budtz-Jørgensen, Esben
2000-01-01
interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects
Directory of Open Access Journals (Sweden)
Perinetti, G.
2016-04-01
Introduction: The identification of the onset of the pubertal growth spurt has major clinical implications when dealing with orthodontic treatment in growing subjects. Aim: Through multivariate methods, this study evaluated possible relationships between the gingival crevicular fluid (GCF) alkaline phosphatase (ALP) activity and pubertal growth spurt and dentition phase. Materials and methods: One hundred healthy growing subjects (62 females, 38 males; mean age, 11.5±2.4 years) were enrolled into this double-blind, prospective, cross-sectional-design study. Phases of skeletal maturation (pre-pubertal, pubertal, post-pubertal) were assessed using the cervical vertebral maturation method. Samples of GCF for the ALP activity determination were collected at the mesial and distal sites of the mandibular central incisors. The phases of the dentition were recorded as intermediate mixed, late mixed, or permanent. A multinomial multiple logistic regression model was used to assess relationships of the enzymatic activity to growth phases and dentition phases. Results: The GCF ALP activity was greater in the pubertal growth phase as compared to the pre-pubertal and post-pubertal growth phases. Significant adjusted odds ratios for the GCF ALP activity for the pre-pubertal and post-pubertal subjects, in relation to the pubertal group, were 0.76 and 0.84, respectively. No significant correlations were seen for the dentition phase. Conclusions: The GCF ALP activity is a valid candidate as a non-invasive biomarker for the identification of the pubertal growth spurt irrespective of the dentition phase.
BANK FAILURE PREDICTION WITH LOGISTIC REGRESSION
Directory of Open Access Journals (Sweden)
Taha Zaghdoudi
2013-04-01
In recent years the economic and financial world has been shaken by a wave of financial crises that resulted in huge losses for banks. Several authors have focused on the study of these crises in order to develop an early warning model. Our work takes its inspiration from the same path. We have tried to develop a predictive model of Tunisian bank failures using the binary logistic regression method. The specificity of our prediction model is that it takes into account microeconomic indicators of bank failures. The results obtained using our provisional model show that a bank's ability to repay its debt, the coefficient of banking operations, bank profitability per employee and the financial leverage ratio have a negative impact on the probability of failure.
Robust Mediation Analysis Based on Median Regression
Yuan, Ying; MacKinnon, David P.
2014-01-01
Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. PMID:24079925
ANYOLS, Least Square Fit by Stepwise Regression
International Nuclear Information System (INIS)
Atwoods, C.L.; Mathews, S.
1986-01-01
Description of program or function: ANYOLS is a stepwise program which fits data using ordinary or weighted least squares. Variables are selected for the model in a stepwise way based on a user-specified input criterion or a user-written subroutine. The order in which variables are entered can be influenced by user-defined forcing priorities. Instead of stepwise selection, ANYOLS can try all possible combinations of any desired subset of the variables. Automatic output for the final model in a stepwise search includes plots of the residuals, 'studentized' residuals, and leverages; if the model is not too large, the output also includes partial regression and partial leverage plots. A data set may be re-used so that several selection criteria can be tried. Flexibility is increased by allowing the substitution of user-written subroutines for several default subroutines.
Nonparametric additive regression for repeatedly measured data
Carroll, R. J.
2009-05-20
We develop an easily computed smooth backfitting algorithm for additive model fitting in repeated measures problems. Our methodology easily copes with various settings, such as when some covariates are the same over repeated response measurements. We allow for a working covariance matrix for the regression errors, showing that our method is most efficient when the correct covariance matrix is used. The component functions achieve the known asymptotic variance lower bound for the scalar argument case. Smooth backfitting also leads directly to design-independent biases in the local linear case. Simulations show our estimator has smaller variance than the usual kernel estimator. This is also illustrated by an example from nutritional epidemiology. © 2009 Biometrika Trust.
Conjoined legs: Sirenomelia or caudal regression syndrome?
Das, Sakti Prasad; Ojha, Niranjan; Ganesh, G Shankar; Mohanty, Ram Narayan
2013-07-01
The presence of a single umbilical persistent vitelline artery distinguishes sirenomelia from caudal regression syndrome. We report the case of a 12-year-old boy with bilateral umbilical arteries who presented with fusion of both legs in the lower third of the leg. Both feet were rudimentary. The right foot had a valgus rocker-bottom deformity. All toes were present but rudimentary. The left foot showed absence of all toes. Physical examination showed left tibia vara. Evaluation of the chest in the sitting position revealed pigeon chest and an elevated right shoulder. Posterior examination of the trunk showed thoracic scoliosis with convexity to the right. The patient was operated on, and at 1-year follow-up the boy had two separate legs with good aesthetic and functional results.
Logistic regression against a divergent Bayesian network
Directory of Open Access Journals (Sweden)
Noel Antonio Sánchez Trujillo
2015-01-01
This article is a discussion about two statistical tools used for prediction and causality assessment: logistic regression and Bayesian networks. Using data of a simulated example from a study assessing factors that might predict pulmonary emphysema (where fingertip pigmentation and smoking are considered), we posed the following questions. Is pigmentation a confounding, causal or predictive factor? Is there perhaps another factor, like smoking, that confounds? Is there a synergy between pigmentation and smoking? The results, in terms of prediction, are similar with the two techniques; regarding causation, differences arise. We conclude that, in decision-making, the combination of a statistical tool used with common sense and previous evidence, which takes years or even centuries to develop, is better than the automatic and exclusive use of statistical resources.
Adaptive regression for modeling nonlinear relationships
Knafl, George J
2016-01-01
This book presents methods for investigating whether relationships are linear or nonlinear and for adaptively fitting appropriate models when they are nonlinear. Data analysts will learn how to incorporate nonlinearity in one or more predictor variables into regression models for different types of outcome variables. Such nonlinear dependence is often not considered in applied research, yet nonlinear relationships are common and so need to be addressed. A standard linear analysis can produce misleading conclusions, while a nonlinear analysis can provide novel insights into data, not otherwise possible. A variety of examples of the benefits of modeling nonlinear relationships are presented throughout the book. Methods are covered using what are called fractional polynomials based on real-valued power transformations of primary predictor variables combined with model selection based on likelihood cross-validation. The book covers how to formulate and conduct such adaptive fractional polynomial modeling in the s...
Crime Modeling using Spatial Regression Approach
Saleh Ahmar, Ansari; Adiatma; Kasim Aidid, M.
2018-01-01
Criminal acts in Indonesia increase in both variety and quantity every year, including murder, rape, assault, vandalism, theft, fraud, fencing, and other offences that make people feel unsafe. The risk of society's exposure to crime is measured by the number of cases reported to the police: the higher the number of reports to the police, the higher the level of crime in the region. This research models criminality in South Sulawesi, Indonesia, with society's exposure to the risk of crime as the dependent variable. The modelling follows an area approach using the Spatial Autoregressive (SAR) and Spatial Error Model (SEM) methods. The independent variables are population density, the number of poor people, GDP per capita, unemployment and the human development index (HDI). The spatial regression analysis shows that there is no spatial dependence, in either lag or error, in South Sulawesi.
Regression analysis for the social sciences
Gordon, Rachel A
2015-01-01
Provides graduate students in the social sciences with the basic skills they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly available social science data or drawn from the literature; thorough integration of teaching statistical theory with teaching data processing and analysis; and teaching of Stata and use of chapter exercises in which students practice programming and interpretation on the same data set. A separate set of exercises allows students to select a data set to apply the concepts learned in each chapter to a research question of interest to them, all updated for this edition.
Entrepreneurial intention modeling using hierarchical multiple regression
Directory of Open Access Journals (Sweden)
Marina Jeger
2014-12-01
The goal of this study is to identify the contribution of effectuation dimensions to the predictive power of the entrepreneurial intention model over and above that which can be accounted for by other predictors selected and confirmed in previous studies. As is often the case in social and behavioral studies, some variables are likely to be highly correlated with each other. Therefore, the relative amount of variance in the criterion variable explained by each of the predictors depends on several factors such as the order of variable entry and sample specifics. The results show the modest predictive power of two dimensions of effectuation prior to the introduction of the theory of planned behavior elements. The article highlights the main advantages of applying hierarchical regression in social sciences as well as in the specific context of entrepreneurial intention formation, and addresses some of the potential pitfalls that this type of analysis entails.
Gaussian process regression for geometry optimization
Denzel, Alexander; Kästner, Johannes
2018-03-01
We implemented a geometry optimizer based on Gaussian process regression (GPR) to find minimum structures on potential energy surfaces. We tested both a two times differentiable form of the Matérn kernel and the squared exponential kernel. The Matérn kernel performs much better. We give a detailed description of the optimization procedures. These include overshooting the step resulting from GPR in order to obtain a higher degree of interpolation vs. extrapolation. In a benchmark against the Limited-memory Broyden-Fletcher-Goldfarb-Shanno optimizer of the DL-FIND library on 26 test systems, we found the new optimizer to generally reduce the number of required optimization steps.
Least square regularized regression in sum space.
Xu, Yong-Li; Chen, Di-Rong; Li, Han-Xiong; Liu, Lu
2013-04-01
This paper proposes a least square regularized regression algorithm in sum space of reproducing kernel Hilbert spaces (RKHSs) for nonflat function approximation, and obtains the solution of the algorithm by solving a system of linear equations. This algorithm can approximate the low- and high-frequency components of the target function with large and small scale kernels, respectively. The convergence and learning rate are analyzed. We measure the complexity of the sum space by its covering number and demonstrate that the covering number can be bounded by the product of the covering numbers of basic RKHSs. For sum space of RKHSs with Gaussian kernels, by choosing appropriate parameters, we trade off the sample error and regularization error, and obtain a polynomial learning rate, which is better than that in any single RKHS. The utility of this method is illustrated with two simulated data sets and five real-life databases.
Statistical learning from a regression perspective
Berk, Richard A
2016-01-01
This textbook considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. This fully revised new edition includes important developments over the past 8 years. Consistent with modern data analytics, it emphasizes that a proper statistical learning data analysis derives from sound data collection, intelligent data management, appropriate statistical procedures, and an accessible interpretation of results. A continued emphasis on the implications for practice runs through the text. Among the statistical learning procedures examined are bagging, random forests, boosting, support vector machines and neural networks. Response variables may be quantitative or categorical. As in the first edition, a unifying theme is supervised learning that can be trea...
Model Selection in Kernel Ridge Regression
DEFF Research Database (Denmark)
Exterkate, Peter
Kernel ridge regression is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts. This paper investigates the influence of the choice of kernel and the setting of tuning parameters on forecast accuracy. We review several popular kernels, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. We interpret the latter two kernels in terms of their smoothing properties, and we relate the tuning parameters associated with all these kernels to smoothness measures of the prediction function and to the signal-to-noise ratio. Based on these interpretations, we provide guidelines for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study confirms the practical usefulness of these rules of thumb. Finally, the flexible and smooth functional forms provided by the Gaussian and Sinc kernels make them widely...
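The selection of tuning parameters from a small grid by cross-validation, as described in the abstract above, can be illustrated with a minimal sketch. It assumes a one-dimensional input, a Gaussian kernel, leave-one-out cross-validation, and a toy sine-wave data set; the grid values and all function names (`krr_fit`, `loo_cv_error`, and so on) are hypothetical and are not the paper's own rules of thumb.

```python
import math

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel between two scalars; sigma controls smoothness."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def krr_fit(xs, ys, sigma, lam):
    """Kernel ridge coefficients: solve (K + lam * I) alpha = y."""
    n = len(xs)
    K = [[gaussian_kernel(xs[i], xs[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, ys)

def krr_predict(xs, alpha, sigma, x_new):
    """Prediction is a kernel-weighted sum over the training points."""
    return sum(a * gaussian_kernel(x, x_new, sigma) for a, x in zip(alpha, xs))

def loo_cv_error(xs, ys, sigma, lam):
    """Mean squared leave-one-out prediction error for one (sigma, lam) pair."""
    err = 0.0
    for i in range(len(xs)):
        tr_x, tr_y = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        alpha = krr_fit(tr_x, tr_y, sigma, lam)
        err += (krr_predict(tr_x, alpha, sigma, xs[i]) - ys[i]) ** 2
    return err / len(xs)

# Toy data (assumption): noiseless sine wave on an even grid.
xs = [i / 10.0 for i in range(20)]
ys = [math.sin(2.0 * x) for x in xs]

# Small hypothetical grid over kernel width and ridge penalty.
grid = [(s, l) for s in (0.1, 0.5, 2.0) for l in (1e-3, 1e-1)]
best_sigma, best_lam = min(grid, key=lambda p: loo_cv_error(xs, ys, *p))
best_error = loo_cv_error(xs, ys, best_sigma, best_lam)
alpha = krr_fit(xs, ys, best_sigma, best_lam)
```

A new point can then be forecast with `krr_predict(xs, alpha, best_sigma, 0.75)`. The grid is deliberately tiny, which mirrors the idea of searching only a few candidate values rather than a fine mesh.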
Learning Inverse Rig Mappings by Nonlinear Regression.
Holden, Daniel; Saito, Jun; Komura, Taku
2017-03-01
We present a framework to design inverse rig-functions, that is, functions that map low level representations of a character's pose, such as joint positions or surface geometry, to the representation used by animators, called the animation rig. Animators design scenes using an animation rig, a framework widely adopted in animation production which allows animators to design character poses and geometry via intuitive parameters and interfaces. Yet most state-of-the-art computer animation techniques control characters through raw, low level representations such as joint angles, joint positions, or vertex coordinates. This difference often stops the adoption of state-of-the-art techniques in animation production. Our framework solves this issue by learning a mapping between the low level representations of the pose and the animation rig. We use nonlinear regression techniques, learning from example animation sequences designed by the animators. When new motions are provided in the skeleton space, the learned mapping is used to estimate the rig controls that reproduce such a motion. We introduce two nonlinear functions for producing such a mapping: Gaussian process regression and feedforward neural networks. The appropriate solution depends on the nature of the rig and the amount of data available for training. We show our framework applied to various examples including articulated biped characters, quadruped characters, facial animation rigs, and deformable characters. With our system, animators have the freedom to apply any motion synthesis algorithm to arbitrary rigging and animation pipelines for immediate editing. This greatly improves the productivity of 3D animation, while retaining the flexibility and creativity of artistic input.
DRREP: deep ridge regressed epitope predictor.
Sher, Gene; Zhi, Degui; Zhang, Shaojie
2017-10-03
The ability to predict epitopes plays an enormous role in vaccine development in terms of our ability to zero in on where to do a more thorough in-vivo analysis of the protein in question. Though for the past decade there have been numerous advancements and improvements in epitope prediction, on average the best benchmark prediction accuracies are still only around 60%. New machine learning algorithms have arisen within the domain of deep learning, text mining, and convolutional networks. This paper presents a novel analytically trained, string-kernel-based deep neural network tailored for continuous epitope prediction, called Deep Ridge Regressed Epitope Predictor (DRREP). DRREP was tested on long protein sequences from the following datasets: SARS, Pellequer, HIV, AntiJen, and SEQ194. DRREP was compared to numerous state-of-the-art epitope predictors, including the most recently published predictors called LBtope and DMNLBE. Using area under ROC curve (AUC), DRREP achieved a performance improvement over the best performing predictors on SARS (13.7%), HIV (8.9%), Pellequer (1.5%), and SEQ194 (3.1%), with its performance being matched only on the AntiJen dataset, by the LBtope predictor, where both DRREP and LBtope achieved an AUC of 0.702. DRREP is an analytically trained deep neural network, thus capable of learning in a single step through regression. By combining the features of deep learning, string kernels, and convolutional networks, the system is able to perform residue-by-residue prediction of continuous epitopes with higher accuracy than the current state-of-the-art predictors.
Collaborative regression-based anatomical landmark detection
International Nuclear Information System (INIS)
Gao, Yaozong; Shen, Dinggang
2015-01-01
Anatomical landmark detection plays an important role in medical image analysis, e.g. for registration, segmentation and quantitative analysis. Among the various existing methods for landmark detection, regression-based methods have recently attracted much attention due to their robustness and efficiency. In these methods, landmarks are localised through voting from all image voxels, which is completely different from the classification-based methods that use voxel-wise classification to detect landmarks. Despite their robustness, the accuracy of regression-based landmark detection methods is often limited due to (1) the inclusion of uninformative image voxels in the voting procedure, and (2) the lack of effective ways to incorporate inter-landmark spatial dependency into the detection step. In this paper, we propose a collaborative landmark detection framework to address these limitations. The concept of collaboration is reflected in two aspects. (1) Multi-resolution collaboration. A multi-resolution strategy is proposed to hierarchically localise landmarks by gradually excluding uninformative votes from faraway voxels. Moreover, for informative voxels near the landmark, a spherical sampling strategy is also designed at the training stage to improve their prediction accuracy. (2) Inter-landmark collaboration. A confidence-based landmark detection strategy is proposed to improve the detection accuracy of ‘difficult-to-detect’ landmarks by using spatial guidance from ‘easy-to-detect’ landmarks. To evaluate our method, we conducted experiments extensively on three datasets for detecting prostate landmarks and head and neck landmarks in computed tomography images, and also dental landmarks in cone beam computed tomography images. The results show the effectiveness of our collaborative landmark detection framework in improving landmark detection accuracy, compared to other state-of-the-art methods. (paper)
Multinomial malware classification based on call graphs
Østbye, Morten Oscar
2017-01-01
Ever since the computer was invented, people have found ways to evolve interaction or simplify tasks with computational resources, this for both good and bad. For the known lifespan of the digital age, malicious software (malware) has been a constant threat to computer systems. Malware has been the cause of enormous damage related to both governmental and private sectors, but also for individuals. Malware has evolved to target different systems and environments and therefore there exists a va...
Logistic regression applied to natural hazards: rare event logistic regression with replications
Guns, M.; Vanacker, V.
2012-06-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
Smith, Paul F; Ganesh, Siva; Liu, Ping
2013-10-30
Regression is a common statistical tool for prediction in neuroscience. However, linear regression is by far the most common form of regression used, with regression trees receiving comparatively little attention. In this study, the results of conventional multiple linear regression (MLR) were compared with those of random forest regression (RFR), in the prediction of the concentrations of 9 neurochemicals in the vestibular nucleus complex and cerebellum that are part of the l-arginine biochemical pathway (agmatine, putrescine, spermidine, spermine, l-arginine, l-ornithine, l-citrulline, glutamate and γ-aminobutyric acid (GABA)). The R² values for the MLRs were higher than the proportion of variance explained values for the RFRs: 6/9 of them were ≥ 0.70 compared to 4/9 for RFRs. Even the variables that had the lowest R² values for the MLRs, e.g. ornithine (0.50) and glutamate (0.61), had much lower proportion of variance explained values for the RFRs (0.27 and 0.49, respectively). The RSE values for the MLRs were lower than those for the RFRs in all but two cases. In general, MLRs seemed to be superior to the RFRs in terms of predictive value and error. In the case of this data set, MLR appeared to be superior to RFR in terms of its explanatory value and error. This result suggests that MLR may have advantages over RFR for prediction in neuroscience with this kind of data set, but that RFR can still have good predictive value in some cases. Copyright © 2013 Elsevier B.V. All rights reserved.
Ridge regression estimator: combining unbiased and ordinary ridge regression methods of estimation
Directory of Open Access Journals (Sweden)
Sharad Damodar Gore
2009-10-01
Statistical literature has several methods for coping with multicollinearity. This paper introduces a new shrinkage estimator, called modified unbiased ridge (MUR). This estimator is obtained from unbiased ridge regression (URR) in the same way that ordinary ridge regression (ORR) is obtained from ordinary least squares (OLS). Properties of MUR are derived. Results on its matrix mean squared error (MMSE) are obtained. MUR is compared with ORR and URR in terms of MMSE. These results are illustrated with an example based on data generated by Hoerl and Kennard (1975).
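The relationship among the four estimators can be written out as a short worked derivation. The forms below follow the usual definitions in the ridge regression literature and are an assumption here, since the abstract does not state them: k > 0 is the ridge parameter and J is the prior vector used by unbiased ridge regression. The operator that turns OLS into ORR is applied unchanged to URR to give a sketch of MUR.

```latex
\begin{aligned}
\hat\beta_{\mathrm{OLS}} &= (X'X)^{-1}X'y, \\
\hat\beta_{\mathrm{ORR}}(k) &= (X'X + kI)^{-1}X'y
  = \bigl[I - k(X'X + kI)^{-1}\bigr]\hat\beta_{\mathrm{OLS}}, \\
\hat\beta_{\mathrm{URR}}(k, J) &= (X'X + kI)^{-1}(X'y + kJ), \\
\hat\beta_{\mathrm{MUR}}(k, J) &= \bigl[I - k(X'X + kI)^{-1}\bigr]\hat\beta_{\mathrm{URR}}(k, J).
\end{aligned}
```

The second line uses the identity (X'X + kI)^{-1}X'X = I - k(X'X + kI)^{-1}, which is what makes ORR a shrunken version of OLS.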
Directory of Open Access Journals (Sweden)
Hong-Juan Li
2013-04-01
Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by the strong non-linear learning capability of support vector regression (SVR), this paper presents an SVR model hybridized with the empirical mode decomposition (EMD) method and auto regression (AR) for electric load forecasting. The electric load data of the New South Wales (Australia) market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
Gorgees, HazimMansoor; Mahdi, FatimahAssim
2018-05-01
This article compares the performance of different types of ordinary ridge regression estimators that have already been proposed to estimate the regression parameters when near-exact linear relationships among the explanatory variables are present. For this situation we employ data obtained from the tagi gas filling company during the period 2008-2010. The main result is that the method based on the condition number performs better than the other methods, since it has a smaller mean square error (MSE).
AN APPLICATION OF FUNCTIONAL MULTIVARIATE REGRESSION MODEL TO MULTICLASS CLASSIFICATION
Krzyśko, Mirosław; Smaga, Łukasz
2017-01-01
In this paper, the scale response functional multivariate regression model is considered. By using the basis functions representation of functional predictors and regression coefficients, this model is rewritten as a multivariate regression model. This representation of the functional multivariate regression model is used for multiclass classification for multivariate functional data. Computational experiments performed on real labelled data sets demonstrate the effectiveness of the proposed ...
Spatial vulnerability assessments by regression kriging
Pásztor, László; Laborczi, Annamária; Takács, Katalin; Szatmári, Gábor
2016-04-01
information representing IEW or GRP forming environmental factors were taken into account to support the spatial inference of the locally experienced IEW frequency and measured GRP values respectively. An efficient spatial prediction methodology was applied to construct reliable maps, namely regression kriging (RK) using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Application of RK also provides the possibility of inherent accuracy assessment. The resulting maps are characterized by global and local measures of its accuracy. Additionally the method enables interval estimation for spatial extension of the areas of predefined risk categories. All of these outputs provide useful contribution to spatial planning, action planning and decision making. Acknowledgement: Our work was partly supported by the Hungarian National Scientific Research Foundation (OTKA, Grant No. K105167).
Introduction to the use of regression models in epidemiology.
Bender, Ralf
2009-01-01
Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
Automation of Flight Software Regression Testing
Tashakkor, Scott B.
2016-01-01
NASA is developing the Space Launch System (SLS) to be a heavy lift launch vehicle supporting human and scientific exploration beyond earth orbit. SLS will have a common core stage, an upper stage, and different permutations of boosters and fairings to perform various crewed or cargo missions. Marshall Space Flight Center (MSFC) is writing the Flight Software (FSW) that will operate the SLS launch vehicle. The FSW is developed in an incremental manner based on "Agile" software techniques. As the FSW is incrementally developed, testing the functionality of the code needs to be performed continually to ensure that the integrity of the software is maintained. Manually testing the functionality on an ever-growing set of requirements and features is not an efficient solution and therefore needs to be done automatically to ensure testing is comprehensive. To support test automation, a framework for a regression test harness has been developed and used on SLS FSW. The test harness provides a modular design approach that can compile or read in the required information specified by the developer of the test. The modularity provides independence between groups of tests and the ability to add and remove tests without disturbing others. This provides the SLS FSW team a time saving feature that is essential to meeting SLS Program technical and programmatic requirements. During development of SLS FSW, this technique has proved to be a useful tool to ensure all requirements have been tested, and that desired functionality is maintained, as changes occur. It also provides a mechanism for developers to check functionality of the code that they have developed. With this system, automation of regression testing is accomplished through a scheduling tool and/or commit hooks. Key advantages of this test harness capability include execution support for multiple independent test cases, the ability for developers to specify precisely what they are testing and how, the ability to add
Laplacian embedded regression for scalable manifold regularization.
Chen, Lin; Tsang, Ivor W; Xu, Dong
2012-06-01
Semi-supervised learning (SSL), as a powerful tool to learn from a limited number of labeled data and a large number of unlabeled data, has been attracting increasing attention in the machine learning community. In particular, the manifold regularization framework has laid solid theoretical foundations for a large family of SSL algorithms, such as Laplacian support vector machine (LapSVM) and Laplacian regularized least squares (LapRLS). However, most of these algorithms are limited to small scale problems due to the high computational cost of the matrix inversion operation involved in the optimization problem. In this paper, we propose a novel framework called Laplacian embedded regression by introducing an intermediate decision variable into the manifold regularization framework. By using ε-insensitive loss, we obtain the Laplacian embedded support vector regression (LapESVR) algorithm, which inherits the sparse solution from SVR. Also, we derive Laplacian embedded RLS (LapERLS) corresponding to RLS under the proposed framework. Both LapESVR and LapERLS possess a simpler form of a transformed kernel, which is the summation of the original kernel and a graph kernel that captures the manifold structure. The benefits of the transformed kernel are two-fold: (1) we can deal with the original kernel matrix and the graph Laplacian matrix in the graph kernel separately and (2) if the graph Laplacian matrix is sparse, we only need to perform the inverse operation for a sparse matrix, which is much more efficient when compared with that for a dense one. Inspired by kernel principal component analysis, we further propose to project the introduced decision variable into a subspace spanned by a few eigenvectors of the graph Laplacian matrix in order to better reflect the data manifold, as well as accelerate the calculation of the graph kernel, allowing our methods to efficiently and effectively cope with large scale SSL problems. Extensive experiments on both toy and real
Directory of Open Access Journals (Sweden)
Qiutong Jin
2016-06-01
Estimating the spatial distribution of precipitation is an important and challenging task in hydrology, climatology, ecology, and environmental science. In order to generate a highly accurate distribution map of average annual precipitation for the Loess Plateau in China, multiple linear regression Kriging (MLRK) and geographically weighted regression Kriging (GWRK) methods were employed using precipitation data from the period 1980–2010 from 435 meteorological stations. The predictors in regression Kriging were selected by stepwise regression analysis from many auxiliary environmental factors, such as elevation (DEM), normalized difference vegetation index (NDVI), solar radiation, slope, and aspect. All predictor distribution maps had a 500 m spatial resolution. Validation precipitation data from 130 hydrometeorological stations were used to assess the prediction accuracies of the MLRK and GWRK approaches. Results showed that both prediction maps with a 500 m spatial resolution interpolated by MLRK and GWRK had a high accuracy and captured detailed spatial distribution data; however, MLRK produced a lower prediction error and a higher variance explanation than GWRK, although the differences were small, in contrast to conclusions from similar studies.
Hecht, Jeffrey B.
The analysis of regression residuals and detection of outliers are discussed, with emphasis on determining how deviant an individual data point must be to be considered an outlier and the impact that multiple suspected outlier data points have on the process of outlier determination and treatment. Only bivariate (one dependent and one independent)…
A rotor optimization using regression analysis
Giansante, N.
1984-01-01
The design and development of helicopter rotors is subject to the many design variables and their interactions that affect rotor operation. Until recently, selection of rotor design variables to achieve specified rotor operational qualities has been a costly, time consuming, repetitive task. For the past several years, Kaman Aerospace Corporation has successfully applied multiple linear regression analysis, coupled with optimization and sensitivity procedures, in the analytical design of rotor systems. It is concluded that approximating equations can be developed rapidly for a multiplicity of objective and constraint functions and optimizations can be performed in a rapid and cost effective manner; the number and/or range of design variables can be increased by expanding the data base and developing approximating functions to reflect the expanded design space; the order of the approximating equations can be expanded easily to improve correlation between analyzer results and the approximating equations; gradients of the approximating equations can be calculated easily and these gradients are smooth functions reducing the risk of numerical problems in the optimization; the use of approximating functions allows the problem to be started easily and rapidly from various initial designs to enhance the probability of finding a global optimum; and the approximating equations are independent of the analysis or optimization codes used.
Regression analysis of sparse asynchronous longitudinal data.
Cao, Hongyuan; Zeng, Donglin; Fine, Jason P
2015-09-01
We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data, the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time invariant or time-dependent coefficients under smoothness assumptions for the covariate processes which are similar to those for synchronous data. For models with either time invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal but converge at slower rates than those achieved with synchronous data. Simulation studies evidence that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last value carried forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus.
Free Software Development. 1. Fitting Statistical Regressions
Directory of Open Access Journals (Sweden)
Lorentz JÄNTSCHI
2002-12-01
The present paper is focused on modeling of statistical data processing with applications in the field of material science and engineering. A new method of data processing is presented and applied to a set of 10 Ni–Mn–Ga ferromagnetic ordered shape memory alloys that are known to exhibit phonon softening and soft mode condensation into a premartensitic phase prior to the martensitic transformation itself. The method makes it possible to identify the correlations between data sets and to exploit them later in the statistical study of alloys. An algorithm for computing data was implemented in preprocessed hypertext language (PHP), a hypertext markup language interface for it was also realized and put onto the comp.east.utcluj.ro educational web server, and it is accessible via the http protocol at the address http://vl.academicdirect.ro/applied_statistics/linear_regression/multiple/v1.5/. Running the program on the set of alloys makes it possible to identify groups of alloy properties and gives a qualitative measure of the correlations between properties. Surfaces of property dependencies are also fitted.
Kepler AutoRegressive Planet Search (KARPS)
Caceres, Gabriel
2018-01-01
One of the main obstacles in detecting faint planetary transits is the intrinsic stellar variability of the host star. The Kepler AutoRegressive Planet Search (KARPS) project implements statistical methodology associated with autoregressive processes (in particular, ARIMA and ARFIMA) to model stellar lightcurves in order to improve exoplanet transit detection. We also develop a novel Transit Comb Filter (TCF) applied to the AR residuals which provides a periodogram analogous to the standard Box-fitting Least Squares (BLS) periodogram. We train a random forest classifier on known Kepler Objects of Interest (KOIs) using select features from different stages of this analysis, and then use ROC curves to define and calibrate the criteria to recover the KOI planet candidates with high fidelity. These statistical methods are detailed in a contributed poster (Feigelson et al., this meeting). These procedures are applied to the full DR25 dataset of NASA’s Kepler mission. Using the classification criteria, a vast majority of known KOIs are recovered and dozens of new KARPS Candidate Planets (KCPs) discovered, including ultra-short period exoplanets. The KCPs will be briefly presented and discussed.
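As a rough illustration of why autoregressive modelling helps, the sketch below fits an AR(1) model to a simulated, strongly autocorrelated series by least squares and compares the interquartile range (IQR) of the raw series with that of the residuals. The simulated data, the AR(1) order, and the function names `fit_ar1` and `iqr` are assumptions made for illustration; the KARPS project itself uses ARIMA and ARFIMA models on Kepler light curves.

```python
import random
import statistics


def fit_ar1(series):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + e[t] (mean-centred)."""
    mean = statistics.fmean(series)
    centred = [x - mean for x in series]
    num = sum(a * b for a, b in zip(centred[1:], centred[:-1]))
    den = sum(a * a for a in centred[:-1])
    phi = num / den
    # One-step-ahead residuals: what is left after removing the AR(1) prediction.
    residuals = [b - phi * a for a, b in zip(centred[:-1], centred[1:])]
    return phi, residuals


def iqr(values):
    """Interquartile range computed from the quartile cut points."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


# Simulate an autocorrelated "stellar variability" signal (hypothetical data).
rng = random.Random(0)
true_phi = 0.95
flux = [0.0]
for _ in range(1999):
    flux.append(true_phi * flux[-1] + rng.gauss(0.0, 1.0))

phi_hat, resid = fit_ar1(flux)
print(f"estimated phi = {phi_hat:.3f}")
print(f"IQR raw = {iqr(flux):.2f}, IQR residuals = {iqr(resid):.2f}")
```

A transit search would then run on `resid` rather than on `flux`, because the residuals carry much less of the correlated variability that can mask a shallow periodic dip.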
DNBR Prediction Using a Support Vector Regression
International Nuclear Information System (INIS)
Yang, Heon Young; Na, Man Gyun
2008-01-01
PWRs (Pressurized Water Reactors) generally operate in the nucleate boiling state. However, the conversion of nucleate boiling into film boiling with conspicuously reduced heat transfer induces a boiling crisis that may cause the fuel clad melting in the long run. This type of boiling crisis is called Departure from Nucleate Boiling (DNB) phenomena. Because the prediction of minimum DNBR in a reactor core is very important to prevent the boiling crisis such as clad melting, a lot of research has been conducted to predict DNBR values. The object of this research is to predict minimum DNBR applying support vector regression (SVR) by using the measured signals of a reactor coolant system (RCS). The SVR has extensively and successfully been applied to nonlinear function approximation like the proposed problem for estimating DNBR values that will be a function of various input variables such as reactor power, reactor pressure, core mass flowrate, control rod positions and so on. The minimum DNBR in a reactor core is predicted using these various operating condition data as the inputs to the SVR. The minimum DNBR values predicted by the SVR confirm its correctness compared with COLSS values
Agogo, George O; van der Voet, Hilko; van't Veer, Pieter; Ferrari, Pietro; Leenders, Max; Muller, David C; Sánchez-Cantalejo, Emilio; Bamia, Christina; Braaten, Tonje; Knüppel, Sven; Johansson, Ingegerd; van Eeuwijk, Fred A; Boshuizen, Hendriek
2014-01-01
In epidemiologic studies, measurement error in dietary variables often attenuates the association between dietary intake and disease occurrence. To adjust for the attenuation caused by error in dietary intake, regression calibration is commonly used. To apply regression calibration, unbiased reference measurements are required. Short-term reference measurements for foods that are not consumed daily contain excess zeroes that pose challenges in the calibration model. We adapted the two-part regression calibration model, initially developed for multiple replicates of reference measurements per individual, to a single-replicate setting. We showed how to handle excess zero reference measurements with a two-step modeling approach, how to explore heteroscedasticity in the consumed amount with a variance-mean graph, how to explore nonlinearity with the generalized additive modeling (GAM) and the empirical logit approaches, and how to select covariates in the calibration model. The performance of the two-part calibration model was compared with its one-part counterpart. We used vegetable intake and mortality data from the European Prospective Investigation on Cancer and Nutrition (EPIC) study. In EPIC, reference measurements were taken with 24-hour recalls. For each of the three vegetable subgroups assessed separately, correcting for error with an appropriately specified two-part calibration model resulted in about a threefold increase in the strength of association with all-cause mortality, as measured by the log hazard ratio. We further found that the standard way of including covariates in the calibration model can lead to overfitting the two-part calibration model. Moreover, the extent of adjusting for error is influenced by the number and forms of covariates in the calibration model. For episodically consumed foods, we advise researchers to pay special attention to response distribution, nonlinearity, and covariate inclusion in specifying the calibration model.
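The two-part idea can be sketched in a few lines: one model gives the probability that the reference measurement is non-zero, a second gives the expected amount when it is non-zero, and the calibrated intake is their product. The logistic and linear forms, the coefficient values, and the function names below are all hypothetical; the study itself explores nonlinearity and covariate selection rather than assuming this simple form.

```python
import math


def prob_consumed(q, a0, a1):
    """Part 1: logistic model for the probability of a non-zero reference measurement."""
    return 1.0 / (1.0 + math.exp(-(a0 + a1 * q)))


def amount_if_consumed(q, b0, b1):
    """Part 2: linear model for the consumed amount, given that it is non-zero."""
    return b0 + b1 * q


def calibrated_intake(q, part1, part2):
    """Predicted intake: P(consumed) * E[amount | consumed]."""
    return prob_consumed(q, *part1) * amount_if_consumed(q, *part2)


# Hypothetical coefficients; q is the questionnaire-reported intake (g/day).
part1 = (-0.5, 0.02)   # intercept and slope on the logit scale
part2 = (40.0, 0.6)    # intercept and slope on the amount scale

for q in (0.0, 50.0, 150.0):
    print(f"reported={q:6.1f}  calibrated={calibrated_intake(q, part1, part2):7.2f}")
```

A one-part model would instead regress the reference measurement, zeroes included, directly on `q`, which is what the excess zeroes make problematic.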
Sirenomelia and severe caudal regression syndrome.
Seidahmed, Mohammed Z; Abdelbasit, Omer B; Alhussein, Khalid A; Miqdad, Abeer M; Khalil, Mohammed I; Salih, Mustafa A
2014-12-01
To describe cases of sirenomelia and severe caudal regression syndrome (CRS), to report the prevalence of sirenomelia, and compare our findings with the literature. Retrospective data was retrieved from the medical records of infants with the diagnosis of sirenomelia and CRS and their mothers from 1989 to 2010 (22 years) at the Security Forces Hospital, Riyadh, Saudi Arabia. A perinatologist, neonatologist, pediatric neurologist, and radiologist ascertained the diagnoses. The cases were identified as part of a study of neural tube defects during that period. A literature search was conducted using MEDLINE. During the 22-year study period, the total number of deliveries was 124,933 out of whom, 4 patients with sirenomelia, and 2 patients with severe forms of CRS were identified. All the patients with sirenomelia had single umbilical artery, and none were the infant of a diabetic mother. One patient was a twin, and another was one of triplets. The 2 patients with CRS were sisters, their mother suffered from type II diabetes mellitus and morbid obesity on insulin, and neither of them had a single umbilical artery. Other associated anomalies with sirenomelia included an absent radius, thumb, and index finger in one patient, Potter's syndrome, abnormal ribs, microphthalmia, congenital heart disease, hypoplastic lungs, and diaphragmatic hernia. The prevalence of sirenomelia (3.2 per 100,000) is high compared with the international prevalence of one per 100,000. Both cases of CRS were infants of type II diabetic mother with poor control, supporting the strong correlation of CRS and maternal diabetes.
Gaussian process regression for tool wear prediction
Kong, Dongdong; Chen, Yongjie; Li, Ning
2018-05-01
To realize and accelerate the pace of intelligent manufacturing, this paper presents a novel tool wear assessment technique based on integrated radial basis function based kernel principal component analysis (KPCA_IRBF) and Gaussian process regression (GPR) for accurate, real-time monitoring of the in-process tool wear parameter (flank wear width). KPCA_IRBF is a new nonlinear dimension-increment technique, proposed here for the first time for feature fusion. The tool wear predictive value and the corresponding confidence interval are both provided by the GPR model. GPR also performs better than artificial neural networks (ANN) and support vector machines (SVM) in prediction accuracy, since Gaussian noise can be modeled quantitatively in the GPR model. However, the presence of noise seriously affects the stability of the confidence interval. In this work, the proposed KPCA_IRBF technique helps to remove the noise and weaken its negative effects, so that the confidence interval becomes much narrower and smoother, which is conducive to monitoring the tool wear accurately. Moreover, the kernel parameter in KPCA_IRBF can be selected easily from a much larger region than in the conventional KPCA_RBF technique, which helps to improve the efficiency of model construction. Ten sets of cutting tests are conducted to validate the effectiveness of the presented tool wear assessment technique. The experimental results show that the in-process flank wear width of tool inserts can be monitored accurately with the presented technique, which is robust under a variety of cutting conditions. This study lays the foundation for tool wear monitoring in real industrial settings.
Keith, Timothy Z
2014-01-01
Multiple Regression and Beyond offers a conceptually oriented introduction to multiple regression (MR) analysis and structural equation modeling (SEM), along with analyses that flow naturally from those methods. By focusing on the concepts and purposes of MR and related methods, rather than the derivation and calculation of formulae, this book introduces material to students more clearly, and in a less threatening way. In addition to illuminating content necessary for coursework, the accessibility of this approach means students are more likely to be able to conduct research using MR or SEM, and more likely to use the methods wisely. Covers both MR and SEM, while explaining their relevance to one another. Also includes path analysis, confirmatory factor analysis, and latent growth modeling. Figures and tables throughout provide examples and illustrate key concepts and techniques. For additional resources, please visit: http://tzkeith.com/.
Kepler AutoRegressive Planet Search
Caceres, Gabriel Antonio; Feigelson, Eric
2016-01-01
The Kepler AutoRegressive Planet Search (KARPS) project uses statistical methodology associated with autoregressive (AR) processes to model Kepler lightcurves in order to improve exoplanet transit detection in systems with high stellar variability. We also introduce a planet-search algorithm to detect transits in time-series residuals after application of the AR models. One of the main obstacles in detecting faint planetary transits is the intrinsic stellar variability of the host star. The variability displayed by many stars may have autoregressive properties, wherein later flux values are correlated with previous ones in some manner. Our analysis procedure consists of three steps: pre-processing of the data to remove discontinuities, gaps and outliers; AR-type model selection and fitting; and transit signal search of the residuals using a new Transit Comb Filter (TCF) that replaces traditional box-finding algorithms. The analysis procedures of the project are applied to a portion of the publicly available Kepler light curve data for the full 4-year mission duration. Tests of the methods have been made on a subset of Kepler Objects of Interest (KOI) systems, classified both as planetary `candidates' and `false positives' by the Kepler Team, as well as a random sample of unclassified systems. We find that the ARMA-type modeling successfully reduces the stellar variability, by a factor of 10 or more in active stars and by smaller factors in more quiescent stars. A typical quiescent Kepler star has an interquartile range (IQR) of ~10 e-/sec, which may improve slightly after modeling, while those with IQR ranging from 20 to 50 e-/sec have improvements from 20% up to 70%. High activity stars (IQR exceeding 100) markedly improve. A periodogram based on the TCF is constructed to concentrate the signal of these periodic spikes. When a periodic transit is found, the model is displayed on a standard period-folded averaged light curve. Our findings to date on real
Detection of epistatic effects with logic regression and a classical linear regression model.
Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata
2014-02-01
To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, methods such as Cockerham's model are usually applied. Within this framework, interactions are understood as the part of the joint effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (such as disease) is caused by Boolean combinations of genotypes of several QTLs, Cockerham's approach is often not capable of identifying them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though a larger number of models has to be considered with the logic regression approach (requiring more stringent multiple testing correction), the efficient representation of higher order logic interactions in logic regression models leads to a significant increase in power to detect such interactions as compared to Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with a simulation study and real data analysis.
National and international graduate migration flows.
Mosca, Irene; Wright, Robert E
2010-01-01
This article examines the nature of national and international graduate migration flows in the UK. Migration equations are estimated with microdata from a matched dataset of Students and Destinations of Leavers from Higher Education, information collected by the Higher Education Statistical Agency. The probability of migrating is related to a set of observable characteristics using multinomial logit regression. The analysis suggests that migration is a selective process with graduates with certain characteristics having considerably higher probabilities of migrating, both to other regions of the UK and abroad.
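A multinomial logit model turns a set of linear predictors, one per non-reference outcome, into probabilities that sum to one. The sketch below shows that mapping for three destinations (staying, moving to another UK region, moving abroad). The coefficients, the two covariates, and the function name `multinomial_logit_probs` are hypothetical and are not estimates from the article.

```python
import math


def multinomial_logit_probs(x, coefs):
    """Return outcome probabilities for covariates x.

    coefs maps each non-reference outcome to (intercept, slopes);
    the reference outcome has its linear predictor fixed at 0.
    """
    scores = {"reference": 0.0}
    for outcome, (intercept, slopes) in coefs.items():
        scores[outcome] = intercept + sum(b * v for b, v in zip(slopes, x))
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    expd = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(expd.values())
    return {k: v / total for k, v in expd.items()}


# Hypothetical coefficients; covariates are (first-class degree, studied away from home).
coefs = {
    "other_uk_region": (-1.0, [0.4, 0.8]),
    "abroad": (-2.5, [0.6, 0.5]),
}
probs = multinomial_logit_probs([1, 0], coefs)
for outcome, p in probs.items():
    print(f"{outcome}: {p:.3f}")
```

Each coefficient is a log odds ratio against the reference outcome, which is why predicted probabilities like these are usually easier to interpret than the raw coefficients.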
Sparse Regression by Projection and Sparse Discriminant Analysis
Qi, Xin; Luo, Ruiyan; Carroll, Raymond J.; Zhao, Hongyu
2015-01-01
predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths
Poisson Mixture Regression Models for Heart Disease Prediction.
Mufudza, Chipo; Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model.
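The clustering step in a Poisson mixture model rests on the posterior probability that an observed count came from each component. The sketch below computes those probabilities for a two-component mixture. The mixing weights, the component means, and the function names are hypothetical values chosen for illustration, not results from the paper.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass function, computed on the log scale."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def responsibilities(k, weights, rates):
    """Posterior probability that count k came from each mixture component."""
    joint = [w * poisson_pmf(k, lam) for w, lam in zip(weights, rates)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical two-component mixture: a low-risk and a high-risk cluster.
weights = [0.7, 0.3]   # mixing proportions
rates = [1.0, 5.0]     # component Poisson means

for count in (0, 2, 6):
    low, high = responsibilities(count, weights, rates)
    print(f"count={count}: P(low)={low:.3f}, P(high)={high:.3f}")
```

In a concomitant variable model the mixing proportions would themselves depend on covariates instead of being fixed numbers as they are here.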
Dimension Reduction and Discretization in Stochastic Problems by Regression Method
DEFF Research Database (Denmark)
Ditlevsen, Ove Dalager
1996-01-01
The chapter mainly deals with dimension reduction and field discretizations based directly on the concept of linear regression. Several examples of interesting applications in stochastic mechanics are also given. Keywords: Random fields discretization, Linear regression, Stochastic interpolation, ...
Linear regression crash prediction models : issues and proposed solutions.
2010-05-01
The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novice data aggregation to satisfy linear regression assumptions; namely error structure normality ...
An Additive-Multiplicative Cox-Aalen Regression Model
DEFF Research Database (Denmark)
Scheike, Thomas H.; Zhang, Mei-Jie
2002-01-01
Aalen model; additive risk model; counting processes; Cox regression; survival analysis; time-varying effects...
Logistic Regression Modeling of Diminishing Manufacturing Sources for Integrated Circuits
National Research Council Canada - National Science Library
Gravier, Michael
1999-01-01
.... The research identified logistic regression as a powerful tool for analysis of DMSMS and further developed twenty models attempting to identify the "best" way to model and predict DMSMS using logistic regression...
Model-based Quantile Regression for Discrete Data
Padellini, Tullia; Rue, Haavard
2018-01-01
Quantile regression is a class of methods devoted to the modelling of conditional quantiles. In a Bayesian framework, quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite
The MIDAS Touch: Mixed Data Sampling Regression Models
Ghysels, Eric; Santa-Clara, Pedro; Valkanov, Rossen
2004-01-01
We introduce Mixed Data Sampling (henceforth MIDAS) regression models. The regressions involve time series data sampled at different frequencies. Technically speaking, MIDAS models specify conditional expectations as a distributed lag of regressors recorded at some higher sampling frequencies. We examine the asymptotic properties of MIDAS regression estimation and compare it with traditional distributed lag models. MIDAS regressions have wide applicability in macroeconomics and finance.
Regression Benchmarking: An Approach to Quality Assurance in Performance
Bulej, Lubomír
2005-01-01
The paper presents a short summary of our work in the area of regression benchmarking and its application to software development. Specifically, we explain the concept of regression benchmarking, the requirements for employing regression testing in a software project, and methods used for analyzing the vast amounts of data resulting from repeated benchmarking. We present the application of regression benchmarking on a real software project and conclude with a glimpse at the challenges for the fu...
Sosa-Rubi, Sandra G.; Galárraga, Omar
2009-01-01
Objective We evaluated the impact of Seguro Popular (SP), a program introduced in 2001 in Mexico primarily to finance health care for the poor. We focused on the effect of household enrollment in SP on pregnant women’s access to obstetrical services, an important outcome measure of both maternal and infant health. Data We relied upon data from the cross-sectional 2006 National Health and Nutrition Survey (ENSANUT) in Mexico. We analyzed the responses of 3,890 women who delivered babies during 2001–2006 and whose households lacked employer-based health care coverage. Methods We formulated a multinomial probit model that distinguished between three mutually exclusive sites for delivering a baby: a health unit specifically accredited by SP; a non-SP-accredited clinic run by the Department of Health (Secretaría de Salud, or SSA); and private obstetrical care. Our model accounted for the endogeneity of the household’s binary decision to enroll in the SP program. Results Women in households that participated in the SP program had a much stronger preference for having a baby in a SP-sponsored unit rather than paying out of pocket for a private delivery. At the same time, participation in SP was associated with a stronger preference for delivering in the private sector rather than at a state-run SSA clinic. On balance, the Seguro Popular program reduced pregnant women’s attendance at an SSA clinic much more than it reduced the probability of delivering a baby in the private sector. The quantitative impact of the SP program varied with the woman’s education and health, as well as the assets and location (rural versus urban) of the household. Conclusions The SP program had a robust, significantly positive impact on access to obstetrical services. Our finding that women enrolled in SP switched from non-SP state-run facilities, rather than from out-of-pocket private services, is important for public policy and requires further exploration. PMID:18824268
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R² analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
Meta-Modeling by Symbolic Regression and Pareto Simulated Annealing
Stinstra, E.; Rennen, G.; Teeuwen, G.J.A.
2006-01-01
The subject of this paper is a new approach to Symbolic Regression. Other publications on Symbolic Regression use Genetic Programming. This paper describes an alternative method based on Pareto Simulated Annealing. Our method is based on linear regression for the estimation of constants. Interval
Li, Jiangtong; Luo, Yongdao; Dai, Honglin
2018-01-01
Water is the source of life and the essential foundation of all life. With the development of industrialization, water pollution is becoming more and more frequent, directly affecting human survival and development. Water quality detection is one of the necessary measures to protect water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, in which partial least squares regression (PLSR) is becoming the predominant technique; in some special cases, however, PLSR produces considerable errors. To solve this problem, the traditional principal component regression (PCR) method was improved in this paper by using the principle of PLSR. The experimental results show that, for some special experimental data sets, the improved PCR method performs better than PLSR. PCR and PLSR are the focus of this paper. First, principal component analysis (PCA) is performed in MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal components, which carry most of the original data information, are extracted by using the principle of PLSR. Second, linear regression analysis of the principal components is carried out with the Statistical Package for the Social Sciences (SPSS), from which the coefficients and relations of the principal components are obtained. Finally, the same water spectral data set is calculated by PLSR and by improved PCR, and the two results are analyzed and compared: improved PCR and PLSR give similar results for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in ultraviolet spectral analysis of water, but for data near the detection limit, improved PCR gives better results than PLSR.
Morales, Esteban; de Leon, John Mark S; Abdollahi, Niloufar; Yu, Fei; Nouri-Mahdavi, Kouros; Caprioli, Joseph
2016-03-01
The study was conducted to evaluate threshold smoothing algorithms to enhance prediction of the rates of visual field (VF) worsening in glaucoma. We studied 798 patients with primary open-angle glaucoma and 6 or more years of follow-up who underwent 8 or more VF examinations. Thresholds at each VF location for the first 4 years or first half of the follow-up time (whichever was greater) were smoothed with clusters defined by the nearest neighbor (NN), Garway-Heath, Glaucoma Hemifield Test (GHT), and weighting by the correlation of rates at all other VF locations. Thresholds were regressed with a pointwise exponential regression (PER) model and a pointwise linear regression (PLR) model. Smaller root mean square error (RMSE) values of the differences between the observed and the predicted thresholds at the last two follow-ups indicated better model predictions. The mean (SD) follow-up times for the smoothing and prediction phase were 5.3 (1.5) and 10.5 (3.9) years. The mean RMSE values for the PER and PLR models were as follows: unsmoothed data, 6.09 and 6.55; NN, 3.40 and 3.42; Garway-Heath, 3.47 and 3.48; GHT, 3.57 and 3.74; and correlation of rates, 3.59 and 3.64. Smoothed VF data predicted better than unsmoothed data. Nearest neighbor provided the best predictions; PER also predicted consistently more accurately than PLR. Smoothing algorithms should be used when forecasting VF results with PER or PLR. The application of smoothing algorithms on VF data can improve forecasting in VF points to assist in treatment decisions.
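The RMSE criterion used in the abstract above can be illustrated with a minimal sketch. The function name `rmse` and the sample threshold values are hypothetical and are not taken from the study; the sketch only shows how a root mean square error between observed and predicted thresholds might be computed.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical thresholds (dB) at the last two follow-ups for one VF location.
observed_thresholds = [30.0, 28.0]
predicted_thresholds = [27.0, 32.0]
error = rmse(observed_thresholds, predicted_thresholds)
```

A smaller value indicates that the predictions lie closer to the observed thresholds, which is the sense in which the abstract ranks the smoothing methods.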
Regression: The Apple Does Not Fall Far From the Tree.
Vetter, Thomas R; Schober, Patrick
2018-05-15
Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
Few crystal balls are crystal clear : eyeballing regression
International Nuclear Information System (INIS)
Wittebrood, R.T.
1998-01-01
The theory of regression and statistical analysis as it applies to reservoir analysis was discussed. It was argued that regression lines are not always the final truth. It was suggested that regression lines and eyeballed lines are often equally accurate. The many conditions that must be fulfilled to calculate a proper regression were discussed. Mentioned among these conditions were the distribution of the data, hidden variables, knowledge of how the data was obtained, the need for causal correlation of the variables, and knowledge of the manner in which the regression results are going to be used. 1 tab., 13 figs
Sparse reduced-rank regression with covariance estimation
Chen, Lisha
2014-12-08
Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.
Sparse reduced-rank regression with covariance estimation
Chen, Lisha; Huang, Jianhua Z.
2014-01-01
Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.
Nahib, Irmadi; Suryanta, Jaka; Niedyawati; Kardono, Priyadi; Turmudi; Lestari, Sri; Windiastuti, Rizka
2018-05-01
The Ministry of Agriculture has targeted production of 1.718 million tons of dry grain harvest during the period 2016-2021 to achieve food self-sufficiency, through optimization of special commodities including paddy, soybean and corn. This research was conducted to develop a sustainable paddy field zone delineation model using logistic regression and multicriteria land evaluation in Indramayu Regency. A model was built on the characteristics of local function conversion by considering the concept of sustainable development. A spatial data overlay was constructed using available data, and the model was then built upon the occurrence of paddy fields between 1998 and 2015. The equation obtained for the model of paddy field changes was: logit(paddy field conversion) = -2.3048 + 0.0032*X1 - 0.0027*X2 + 0.0081*X3 + 0.0025*X4 + 0.0026*X5 + 0.0128*X6 - 0.0093*X7 + 0.0032*X8 + 0.0071*X9 - 0.0046*X10, where X1 to X10 were variables that determine the occurrence of changes in paddy fields, with a Relative Operating Characteristics (ROC) value of 0.8262. The weakest variable in influencing the change of paddy field function was X7 (paddy field price), while the most influential factor was X1 (distance from river). The result of the logistic regression was used as a weight for multicriteria land evaluation, which recommended three scenarios of paddy field protection policy: standard, protective, and permissive. As a result of this modelling, the priority paddy fields for the protected scenario were obtained, as well as the buffer zones for the surrounding paddy fields.
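As a rough illustration of how a logit equation of this form maps to a probability, the sketch below uses the intercept and coefficients printed in the abstract above. The function name `conversion_probability` and the all-zero input are hypothetical; the abstract does not give the units or scaling of X1 to X10, so no real predictor values are assumed.

```python
import math

# Intercept and coefficients for X1..X10 as printed in the abstract.
INTERCEPT = -2.3048
COEFFS = [0.0032, -0.0027, 0.0081, 0.0025, 0.0026,
          0.0128, -0.0093, 0.0032, 0.0071, -0.0046]

def conversion_probability(x):
    """Inverse-logit transform: P = 1 / (1 + exp(-logit))."""
    if len(x) != len(COEFFS):
        raise ValueError("expected 10 predictor values")
    logit = INTERCEPT + sum(b * xi for b, xi in zip(COEFFS, x))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical input: every predictor at zero, so only the intercept contributes.
baseline = conversion_probability([0.0] * 10)
```

With every predictor at zero the result depends only on the intercept, which makes it a convenient sanity check before substituting real predictor values.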
Takagi, Daisuke; Ikeda, Ken'ichi; Kawachi, Ichiro
2012-11-01
Crime is an important determinant of public health outcomes, including quality of life, mental well-being, and health behavior. A body of research has documented the association between community social capital and crime victimization. The association between social capital and crime victimization has been examined at multiple levels of spatial aggregation, ranging from entire countries, to states, metropolitan areas, counties, and neighborhoods. In multilevel analysis, the spatial boundaries at level 2 are most often drawn from administrative boundaries (e.g., Census tracts in the U.S.). One problem with adopting administrative definitions of neighborhoods is that it ignores spatial spillover. We conducted a study of social capital and crime victimization in one ward of Tokyo city, using a spatial Durbin model with an inverse-distance weighting matrix that assigned each respondent a unique level of "exposure" to social capital based on all other residents' perceptions. The study is based on a postal questionnaire sent to residents of Arakawa Ward, Tokyo, aged 20-69 years. The response rate was 43.7%. We examined the contextual influence of generalized trust, perceptions of reciprocity, two types of social network variables, as well as two principal components of social capital (constructed from the above four variables). Our outcome measure was self-reported crime victimization in the last five years. In the spatial Durbin model, we found that neighborhood generalized trust, reciprocity, supportive networks and two principal components of social capital were each inversely associated with crime victimization. By contrast, a multilevel regression performed with the same data (using administrative neighborhood boundaries) found generally null associations between neighborhood social capital and crime. Spatial regression methods may be more appropriate for investigating the contextual influence of social capital in homogeneous cultural settings such as Japan.
Understanding farmers' response to climate variability in Nigeria
African Journals Online (AJOL)
Osondu
Data were analyzed using descriptive statistics, and multinomial logit models. Farmers used multiple adaptation strategies; Crop Diversification (CD), Soil ... Increases in temperature, cloud ... and the effect of climate elements and their extreme ...
Labour market transitions and job satisfaction
G.E. Bijwaard (Govert); A. van Dijk (Bram); J. de Koning (Jaap)
2003-01-01
The paper investigates the relationship between job satisfaction and labour market transitions. Using a multinomial logit model, a model is estimated on the basis of individual data in which transitions are explained from individual characteristics, job characteristics, dissatisfaction
Determinants of agricultural micro-credit repayment-evidence from ...
African Journals Online (AJOL)
Agro-Science ... for the study while descriptive statistics and multinomial binary logit model were employed for data analyses. ... on long term funds (without prejudice to the existing revolving loan mechanism) such as the pension contributions, ...
Robust Regression and its Application in Financial Data Analysis
Mansoor Momeni; Mahmoud Dehghan Nayeri; Ali Faal Ghayoumi; Hoda Ghorbani
2010-01-01
This research aims to describe the application of robust regression and its advantages over the least squares regression method in analyzing financial data. To do this, the relationship between earnings per share, book value of equity per share and share price (the price model) and between earnings per share, annual change in earnings per share and stock return (the return model) is discussed using both robust and least squares regressions, and finally the outcomes are compared. Comparing the results from th...
Local bilinear multiple-output quantile/depth regression
Czech Academy of Sciences Publication Activity Database
Hallin, M.; Lu, Z.; Paindaveine, D.; Šiman, Miroslav
2015-01-01
Vol. 21, No. 3 (2015), pp. 1435-1466. ISSN 1350-7265. R&D Projects: GA MŠk(CZ) 1M06047. Institutional support: RVO:67985556. Keywords: conditional depth; growth chart; halfspace depth; local bilinear regression; multivariate quantile; quantile regression; regression depth. Subject RIV: BA - General Mathematics. Impact factor: 1.372 (2015). http://library.utia.cas.cz/separaty/2015/SI/siman-0446857.pdf