Random regression test-day model for the analysis of dairy cattle ...
African Journals Online (AJOL)
Random regression test-day model for the analysis of dairy cattle production data in South Africa: Creating the framework. EF Dzomba, KA Nephawe, AN Maiwashe, SWP Cloete, M Chimonyo, CB Banga, CJC Muller, K Dzama ...
Directory of Open Access Journals (Sweden)
Ajay Singh
2016-06-01
A single-trait linear mixed random regression test-day model was applied for the first time to analyze first-lactation monthly test-day milk yield records in Karan Fries cattle. The test-day milk yield data were modeled using a random regression model (RRM) with Legendre polynomials of different orders for the additive genetic effect (4th order) and the permanent environmental effect (5th order). A total of 1,583 lactation records spread over a period of 30 years were analyzed. Variance components, heritabilities and genetic correlations among test-day milk yields were estimated using the RRM. RRM heritability estimates of test-day milk yield varied from 0.11 to 0.22 across test days. Estimates of genetic correlations between test-day milk yields ranged from 0.01 (test-day 1 [TD-1] and TD-11) to 0.99 (TD-4 and TD-5). The magnitude of the genetic correlations decreased as the interval between test days increased, and adjacent test days had higher correlations. Additive genetic and permanent environmental variances were higher for test-day milk yields at both ends of lactation. The residual variance was lower than the permanent environmental variance for all test-day milk yields.
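As a rough illustration of how days in milk (DIM) are usually turned into Legendre covariates for a random regression model, the sketch below standardizes DIM to [-1, 1] and evaluates normalized Legendre polynomials. The DIM range of 5 to 305 and the function names are assumptions made for the example, not details from the abstract; note also that "order" sometimes means the number of coefficients and sometimes the polynomial degree, so `n_coef` here is simply the number of covariates returned.

```python
import math

def standardize_dim(dim, dim_min=5, dim_max=305):
    """Map days in milk onto [-1, 1], the domain of Legendre polynomials.
    The 5..305 range is an assumed default, not taken from the abstract."""
    return -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)

def legendre_covariates(dim, n_coef, dim_min=5, dim_max=305):
    """Return n_coef normalized Legendre covariates for one test day."""
    x = standardize_dim(dim, dim_min, dim_max)
    # Bonnet recursion: (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x)
    p = [1.0, x]
    for k in range(1, n_coef - 1):
        p.append(((2 * k + 1) * x * p[k] - k * p[k - 1]) / (k + 1))
    # Normalization commonly used in random regression: sqrt((2k+1)/2)
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(n_coef)]

if __name__ == "__main__":
    for dim in (5, 155, 305):
        print(dim, [round(v, 4) for v in legendre_covariates(dim, 4)])
```

Each animal's additive genetic and permanent environmental deviations are then modeled as linear combinations of these covariates, which is what lets the variances change along the lactation.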
Selection of locations of knots for linear splines in random regression test-day models.
Jamrozik, J; Bohmanova, J; Schaeffer, L R
2010-04-01
Using spline functions (segmented polynomials) in regression models requires the knowledge of the location of the knots. Knots are the points at which independent linear segments are connected. Optimal positions of knots for linear splines of different orders were determined in this study for different scenarios, using existing estimates of covariance functions and an optimization algorithm. The traits considered were test-day milk, fat and protein yields, and somatic cell score (SCS) in the first three lactations of Canadian Holsteins. Two ranges of days in milk (from 5 to 305 and from 5 to 365) were taken into account. In addition, four different populations of Holstein cows, from Australia, Canada, Italy and New Zealand, were examined with respect to first lactation (305 days) milk only. The estimates of genetic and permanent environmental covariance functions were based on single- and multiple-trait test-day models, with Legendre polynomials of order 4 as random regressions. A differential evolution algorithm was applied to find the best location of knots for splines of orders 4 to 7 and the criterion for optimization was the goodness-of-fit of the spline covariance function. Results indicated that the optimal position of knots for linear splines differed between genetic and permanent environmental effects, as well as between traits and lactations. Different populations also exhibited different patterns of optimal knot locations. With linear splines, different positions of knots should therefore be used for different effects and traits in random regression test-day models when analysing milk production traits.
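A minimal sketch of linear spline covariates may help make the role of the knots concrete. For a test day falling between two adjacent knots, only those two knots receive non-zero weights, and the weights sum to one. The knot positions in the example are hypothetical and are not the optimized locations reported in the study.

```python
def linear_spline_covariates(dim, knots):
    """Return one covariate per knot for a given day in milk.

    knots must be sorted in ascending order. Days outside the knot range
    are assigned entirely to the nearest end knot (an assumption made here
    for simplicity).
    """
    cov = [0.0] * len(knots)
    if dim <= knots[0]:
        cov[0] = 1.0
        return cov
    if dim >= knots[-1]:
        cov[-1] = 1.0
        return cov
    for i in range(len(knots) - 1):
        lo, hi = knots[i], knots[i + 1]
        if lo <= dim <= hi:
            w = (dim - lo) / (hi - lo)  # linear interpolation weight
            cov[i] = 1.0 - w
            cov[i + 1] = w
            return cov
    return cov

if __name__ == "__main__":
    example_knots = [5, 65, 155, 305]  # hypothetical knot positions
    print(linear_spline_covariates(35, example_knots))
```

Because each covariate row has at most two non-zero entries, moving a knot changes which test days inform which regression coefficient, which is why knot placement matters for the fit of the covariance function.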
Review Random regression test-day model for the analysis of dairy ...
African Journals Online (AJOL)
jannes
Abstract. Genetic evaluation of dairy cattle using test-day models is now common internationally. In South Africa a fixed regression test-day model is used to generate breeding values for dairy animals on a routine basis. The model is, however, often criticized for erroneously assuming a standard lactation curve for cows.
Gonzalez-Herrera, L G; El Faro, L; Bignardi, A B; Pereira, R J; Machado, C H C; Albuquerque, L G
2015-12-09
The objective of the present study was to estimate the genetic parameters for test-day milk yields (TDMY) in the first and second lactations using random regression models (RRM) in order to contribute to the application of these models in genetic evaluation of milk yield in Gyr cattle. A total of 53,328 TDMY records from 7118 lactations of 5853 Gyr cows were analyzed. The model included the direct additive, permanent environmental, and residual random effects. In addition, contemporary group and linear and quadratic effects of the age of cows at calving were included as fixed effects. A random regression model fitting fourth-order Legendre polynomials for additive genetic and permanent environmental effects, with five classes of residual variance, was applied. In the first lactation, the heritabilities increased from early lactation (0.26) until TDMY3 (0.38), followed by a decrease until the end of lactation. In the second lactation, the estimates increased from the first (0.29) to the fifth test day (0.36), with a slight decrease thereafter, and again increased on the last two test days (0.34 and 0.41). There were positive and high genetic correlations estimated between first-lactation TDMY and the remaining TDMY of the two lactations. The moderate heritability estimates, as well as the high genetic correlations between half the first-lactation TDMY and all TDMY of the two lactations, suggest that the selection based only on first lactation TDMY is the best selection strategy to increase milk production across first and second lactations of Gyr cows.
Sesana, R C; Bignardi, A B; Borquis, R R A; El Faro, L; Baldi, F; Albuquerque, L G; Tonhati, H
2010-10-01
The objective of this work was to estimate covariance functions for additive genetic and permanent environmental effects and, subsequently, to obtain genetic parameters for buffalo test-day milk production using random regression models on Legendre polynomials (LPs). A total of 17 935 test-day milk yield (TDMY) records from 1433 first lactations of Murrah buffaloes, calving from 1985 to 2005 and belonging to 12 herds located in São Paulo state, Brazil, were analysed. Contemporary groups (CGs) were defined by herd, year and month of milk test. Residual variances were modelled through variance functions, from second to fourth order, and also by a step function with 1, 4, 6, 22 and 42 classes. The model included the fixed effects of CG and number of milkings, age of cow at calving as a covariable (linear and quadratic) and the mean trend of the population. The additive genetic and permanent environmental effects were included as random effects and were modelled by LPs of days in milk, from quadratic to seventh-degree polynomial functions. The model with additive genetic and animal permanent environmental effects fitted by quintic and sixth-order LPs, respectively, and residual variance modelled through a step function with six classes was the most adequate to describe the covariance structure of the data. Heritability estimates decreased from 0.44 (first week) to 0.18 (fourth week). Unexpected negative genetic correlation estimates were obtained between TDMY records in the first weeks and records from the middle to the end of lactation, with values ranging from -0.07 (second with eighth week) to -0.34 (1st with 42nd week). TDMY heritability estimates were moderate over the course of lactation, suggesting that this trait could be used as a selection criterion in milking buffaloes. Copyright 2010 Blackwell Verlag GmbH.
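To show how genetic parameters follow from estimated covariance functions, the sketch below computes heritability at one test day and the genetic correlation between two test days. It assumes the usual random regression setup, in which the variance at a given day is the quadratic form z'Kz of the covariate vector z (for example Legendre covariates) and the coefficient covariance matrix K. All matrices and values in the example are invented for illustration and are not estimates from any study listed here.

```python
import math

def quad_form(z1, K, z2):
    """Compute z1' K z2 for plain Python lists."""
    return sum(z1[i] * K[i][j] * z2[j]
               for i in range(len(z1)) for j in range(len(z2)))

def heritability(z, K_a, K_pe, var_e):
    """h2(t) = genetic variance / (genetic + permanent env. + residual)."""
    var_a = quad_form(z, K_a, z)
    var_pe = quad_form(z, K_pe, z)
    return var_a / (var_a + var_pe + var_e)

def genetic_correlation(z_i, z_j, K_a):
    """Genetic correlation between two test days with covariates z_i, z_j."""
    cov_ij = quad_form(z_i, K_a, z_j)
    return cov_ij / math.sqrt(quad_form(z_i, K_a, z_i) * quad_form(z_j, K_a, z_j))

if __name__ == "__main__":
    K_a = [[2.0, 0.3], [0.3, 0.5]]    # illustrative values only
    K_pe = [[3.0, 0.2], [0.2, 0.8]]   # illustrative values only
    z_early, z_late = [0.7071, -1.2247], [0.7071, 1.2247]
    print(round(heritability(z_early, K_a, K_pe, 4.0), 3))
    print(round(genetic_correlation(z_early, z_late, K_a), 3))
```

With a step function for the residual variance, `var_e` would simply be looked up from the class that contains the test day in question.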
Directory of Open Access Journals (Sweden)
Y Shamshirgaran
2011-12-01
The fixed regression test-day model (FRM) and the random regression test-day model (RRM) for genetic evaluation of milk yield of dairy cattle in Khorasan Razavi province were studied. Breeding values and genetic parameters for milk yield from the two models were compared. A total of 164391 monthly test-day milk records (three milkings per day) obtained from 19217 Holstein cows distributed in 172 herds and calved from 1991 to 2008 were used to estimate genetic parameters and to predict breeding values. The contemporary group of herd-year-month of production was fitted as a fixed effect in the models. Linear and quadratic forms of age at calving and Holstein gene percentage were fitted as covariates. The random factors of the models were additive genetic and permanent environmental effects. In the random regression model, orthogonal Legendre polynomials up to order 4 (cubic) were implemented to account for genetic and environmental aspects of milk production over the course of lactation. The heritability estimate from the FRM was 0.15. The average RRM heritability estimate of monthly test-day milk production was higher for the second half of lactation than for the first half. The lowest and highest heritability values were found for the first (0.102) and sixth (0.235) months of lactation, respectively. Breeding values of animals predicted from the FRM and RRM were also compared. The results showed similar ranking of animals based on their breeding values from both models.
Directory of Open Access Journals (Sweden)
Roberto Mantovani
2010-01-01
This study aimed to compare repeatability (RP-TDm) and random regression test-day models (RR-TDm) in genetic evaluations of milk (M), fat (F) and protein (P) yields in the Rendena breed. Variance components for M, F and P were estimated on a sample of 43,842 TD records belonging to 2,692 animals recorded over 15 years (1990-2005). RP-TDm estimates of h2 were 0.21 for M and 0.17 for both F and P, whereas RR-TDm provided h2 ranging from 0.15 to 0.34 for M, 0.15 to 0.31 for F and 0.10 to 0.24 for P. Both RP-TDm and RR-TDm results agreed with the literature, even though RR-TDm provided a pattern of h2 along the lactation different from other studies, with the lowest h2 at the beginning and at the end of lactation. PSB, MAD and -2Log L parameters revealed lower power of RP-TDm compared with RR-TDm.
Baba, Toshimi; Gotoh, Yusaku; Yamaguchi, Satoshi; Nakagawa, Satoshi; Abe, Hayato; Masuda, Yutaka; Kawahara, Takayoshi
2017-08-01
This study aimed to evaluate the validation reliability of single-step genomic best linear unbiased prediction (ssGBLUP) with a multiple-lactation random regression test-day model and to investigate the effect of adding genotyped cows on the reliability. Two data sets of test-day records from the first three lactations were used: full data from February 1975 to December 2015 (60 850 534 records from 2 853 810 cows) and reduced data cut off in 2011 (53 091 066 records from 2 502 307 cows). We used marker genotypes of 4480 bulls and 608 cows. Genomic enhanced breeding values (GEBV) of 305-day milk yield in all the lactations were estimated for at least 535 young bulls using two marker data sets: bull genotypes only, and both bull and cow genotypes. The realized reliability (R²) from linear regression analysis was used as an indicator of validation reliability. Using only genotyped bulls, R² ranged from 0.41 to 0.46 and was always higher than that of parent averages. Very similar R² values were observed when genotyped cows were added. An application of ssGBLUP to a multiple-lactation random regression model is feasible, and adding a limited number of genotyped cows has no significant effect on the reliability of GEBV for genotyped bulls. © 2016 Japanese Society of Animal Science.
Meseret, S.; Tamir, B.; Gebreyohannes, G.; Lidauer, M.; Negussie, E.
2015-01-01
The development of effective genetic evaluations and selection of sires requires accurate estimates of genetic parameters for all economically important traits in the breeding goal. The main objective of this study was to assess the relative performance of the traditional lactation average model (LAM) against the random regression test-day model (RRM) in the estimation of genetic parameters and prediction of breeding values for Holstein Friesian herds in Ethiopia. The data used consisted of 6,500 test-day (TD) records from 800 first-lactation Holstein Friesian cows that calved between 1997 and 2013. Co-variance components were estimated using the average information restricted maximum likelihood method under single trait animal model. The estimate of heritability for first-lactation milk yield was 0.30 from LAM whilst estimates from the RRM model ranged from 0.17 to 0.29 for the different stages of lactation. Genetic correlations between different TDs in first-lactation Holstein Friesian ranged from 0.37 to 0.99. The observed genetic correlation was less than unity between milk yields at different TDs, which indicated that the assumption of LAM may not be optimal for accurate evaluation of the genetic merit of animals. A close look at estimated breeding values from both models showed that RRM had higher standard deviation compared to LAM indicating that the TD model makes efficient utilization of TD information. Correlations of breeding values between models ranged from 0.90 to 0.96 for different group of sires and cows and marked re-rankings were observed in top sires and cows in moving from the traditional LAM to RRM evaluations. PMID:26194217
Directory of Open Access Journals (Sweden)
Claudio Napolis Costa
2005-10-01
... number of negative estimates between test-day milk yields (PLC) at the beginning and end of lactation than the FAS. Except for the FAS, genetic correlation estimates fell from near unity between adjacent PLC to negative values between PLC at the beginning and end of lactation. Among the Legendre polynomials, the fifth-order polynomial gave the best fit to the PLC. The results indicate the potential of random regression, with the LP5 and FAS models being the most suitable for modeling the genetic and permanent environmental variances of PLC in the Gyr breed.
Data comprising 8,183 test day records of 1,273 first lactations of Gyr cows from herds supervised by ABCZ were used to estimate variance components and genetic parameters for milk yield using repeatability and random regression animal models by REML. Genetic modelling of logarithmic (FAS) and exponential (FW) curves was compared to orthogonal Legendre polynomials (LP) of order 3 to 5. Residual variance was assumed to be constant in all (ME=1) or some (ME=4) periods of lactation. Lactation milk yield in 305-d was also analyzed with an animal model. Genetic variance, heritability and repeatability for test day milk yields estimated by a repeatability animal model were 1.74 kg², 0.27, and 0.76, respectively. Genetic variance and heritability estimates for lactation milk yield were respectively 121,094.6 and 0.22. Heritability estimates from FAS and FW, respectively, decreased from 0.59 and 0.74 at the beginning of lactation to 0.20 at the end of the period. Except for a fifth-order LP with ME=1, heritability estimates decreased from around 0.70 at early lactation to 0.30 at the end of lactation. Residual variance estimates were slightly smaller for logarithmic than for exponential curves both for homogeneous and heterogeneous variance assumptions. Estimates of residual variance in all stages of lactation decreased as the order of LP increased and depended on the assumption about ME
Directory of Open Access Journals (Sweden)
Lenira El Faro
2003-10-01
Fourteen random regression models were used to fit 86,598 test-day milk yield records from 2,155 first lactations of Caracu cows, truncated at 305 days. The models included the fixed effects of contemporary group and the covariable age of cow at calving. A cubic orthogonal regression was used to model the mean trajectory of the population. Additive genetic and permanent environmental effects were modeled through random regressions using cubic-order Legendre orthogonal polynomials. Different residual variance structures were tested, using classes containing 1, 10, 15 and 43 residual variances and variance functions (VF) based on ordinary and orthogonal polynomials, with orders ranging from quadratic to sixth. The models were compared using the likelihood ratio test, the Akaike Information Criterion and the Schwarz Bayesian Information Criterion. The tests indicated that the higher the order of the variance function, the better the fit. Among the ordinary polynomials, the sixth-order function was superior. Models with residual variance classes were apparently superior to those with variance functions. The model with homogeneous variances was inadequate. The model with 15 heterogeneous classes gave the best fit to the residual variances; however, the estimated genetic parameters were very similar for the models with 10, 15 or 43 variance classes or with the sixth-order VF.
Fourteen random regression models were used to adjust 86,595 test-day milk records of 2,155 first lactation of native Caracu cows. The models include fixed effects of contemporary group and age of cow as covariable. A cubic regression on Legendre orthogonal polynomial of days in milk was used to model the mean trend and the additive genetic and permanent environmental regressions
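The three model comparison criteria named in this abstract can be written down in a few lines. The sketch below uses the textbook definitions; the log-likelihood values and parameter counts in the example are invented, and the choice of sample size for BIC (under REML, often the number of records minus the rank of the fixed-effects design matrix) is an assumption left to the caller.

```python
import math

def aic(log_lik, n_params):
    """Akaike Information Criterion: -2 log L + 2 p (smaller is better)."""
    return -2.0 * log_lik + 2.0 * n_params

def bic(log_lik, n_params, n_obs):
    """Schwarz Bayesian Information Criterion: -2 log L + p ln(n)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)

def lrt_statistic(log_lik_full, log_lik_reduced):
    """Likelihood ratio statistic for nested models; compare against a
    chi-square with df equal to the difference in parameter counts."""
    return 2.0 * (log_lik_full - log_lik_reduced)

if __name__ == "__main__":
    # Hypothetical models: homogeneous residual variance vs. 10 classes
    print(aic(-1050.0, 21), aic(-1020.0, 30))
    print(bic(-1050.0, 21, 5000), bic(-1020.0, 30, 5000))
    print(lrt_statistic(-1020.0, -1050.0))
```

BIC penalizes extra parameters more heavily than AIC once the sample is moderately large, so the two criteria can disagree about how many residual variance classes are worth fitting.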
Directory of Open Access Journals (Sweden)
Jaime Araujo Cobuci
2004-06-01
A total of 87,045 first-lactation milk yield records from 11,023 Holstein cows, obtained from 1997 to 2001 in different herds distributed across ten centres in the state of Minas Gerais, were used. Six measures of lactation persistency were evaluated using breeding values for milk yield obtained with a random regression model (RRM). The Wilmink function was used to describe the random and fixed effects in the RRM. Heritability and genetic correlation estimates for the various persistency measures varied according to the definition of persistency. Heritability estimates for lactation persistency ranged from 0.11 to 0.27, and genetic correlation estimates between persistency measures and 305-day milk yield ranged from -0.31 to 0.55, indicating that lactation persistency is a moderately heritable trait that is weakly correlated with 305-day milk yield. Selecting animals for lactation persistency, with the aim of changing the shape of the lactation curve, may be effective.
A total of 87,045 milk yield records of 11,023 first-parity Holstein cows was utilized, obtained from 1997 to 2001 from different herds of 10 Minas Gerais locations. Six types of persistency measures in lactation were evaluated using milk yield breeding values, obtained by means of Random Regression Model - RRM. The Wilmink function was used to describe the random and fixed effects by RRM. Heritability estimates and genetic correlations for various persistency measures in lactation were dependent on the definition of persistency. The heritability estimates for persistency in lactation ranged from 0.11 to 0.27 and the genetic correlations between persistency measures in lactation and milk yield up to d 305 ranged from -0.31 to 0.55, showing that
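For readers unfamiliar with it, the Wilmink function describes yield at day t as b0 + b1*t + b2*exp(-k*t). The sketch below evaluates that curve and one simple persistency measure, the difference between late and early lactation yield. The decay constant k = 0.05 is a commonly used value but is an assumption here, and the persistency definition and coefficient values are illustrative only; the abstract does not say which six measures were compared.

```python
import math

def wilmink(dim, b0, b1, b2, k=0.05):
    """Wilmink lactation curve: level + linear trend + early-lactation term."""
    return b0 + b1 * dim + b2 * math.exp(-k * dim)

def persistency_difference(b0, b1, b2, dim_late=280, dim_early=60, k=0.05):
    """Illustrative persistency measure: yield at a late day minus yield at
    an early day. Values closer to zero indicate a flatter curve."""
    return wilmink(dim_late, b0, b1, b2, k) - wilmink(dim_early, b0, b1, b2, k)

if __name__ == "__main__":
    coefs = (25.0, -0.03, -8.0)  # hypothetical coefficients, kg/day scale
    for day in (5, 60, 150, 280):
        print(day, round(wilmink(day, *coefs), 2))
    print(round(persistency_difference(*coefs), 2))
```

In a random regression model the same three covariates (1, t, exp(-k*t)) would be used for each animal's genetic deviation, so a persistency measure like this one can be computed directly from predicted breeding values for the coefficients.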
African Journals Online (AJOL)
zlukovi
modelled as a quadratic regression, nested within parity. The previous lactation length was ... This proportion was mainly covered by linear and quadratic coefficients. Results suggest that RRM could .... The multiple trait models in scalar notation are presented by equations (1, 2), while equation. (3) represents the random ...
Genetic analysis of somatic cell score in Danish dairy cattle using random regression test-day model
DEFF Research Database (Denmark)
Elsaid, Reda; Sabry, Ayman; Lund, Mogens Sandø
2011-01-01
... with fifth order LP for PE effect and genetic effect were adequate to fit the data. The average heritability differed over the lactation and was lowest at the beginning (0.098) and higher at the end of lactation (0.138 to 0.151). Genetic correlations between daily SCS were high for adjacent tests (nearly 1 ...
... The objective of this study was to estimate the genetic and permanent environmental (PE) covariance functions for test-day records of logarithm of somatic cell count (SCS) of the first lactation for Danish Holstein cattle, and to test the hypotheses that: genetic and environmental variances change ...
... over first lactation, genetic correlations are near unity between any time points in first lactation, and including a Wilmink term will improve the likelihood of more than an extra order Legendre polynomial. Ten data sets, consisting of 1,190,584 test day somatic cell count (SCC) records from 149...
Chandler, T L; Pralle, R S; Dórea, J R R; Poock, S E; Oetzel, G R; Fourdraine, R H; White, H M
2018-03-01
Although cowside testing strategies for diagnosing hyperketonemia (HYK) are available, many are labor intensive and costly, and some lack sufficient accuracy. Predicting milk ketone bodies by Fourier transform infrared spectrometry during routine milk sampling may offer a more practical monitoring strategy. The objectives of this study were to (1) develop linear and logistic regression models using all available test-day milk and performance variables for predicting HYK and (2) compare prediction methods (Fourier transform infrared milk ketone bodies, linear regression models, and logistic regression models) to determine which is the most predictive of HYK. Given the data available, a secondary objective was to evaluate differences in test-day milk and performance variables (continuous measurements) between Holsteins and Jerseys and between cows with or without HYK within breed. Blood samples were collected on the same day as milk sampling from 658 Holstein and 468 Jersey cows between 5 and 20 d in milk (DIM). Diagnosis of HYK was at a serum β-hydroxybutyrate (BHB) concentration ≥1.2 mmol/L. Concentrations of milk BHB and acetone were predicted by Fourier transform infrared spectrometry (Foss Analytical, Hillerød, Denmark). Thresholds of milk BHB and acetone were tested for diagnostic accuracy, and logistic models were built from continuous variables to predict HYK in primiparous and multiparous cows within breed. Linear models were constructed from continuous variables for primiparous and multiparous cows within breed that were 5 to 11 DIM or 12 to 20 DIM. Milk ketone body thresholds diagnosed HYK with 64.0 to 92.9% accuracy in Holsteins and 59.1 to 86.6% accuracy in Jerseys. Logistic models predicted HYK with 82.6 to 97.3% accuracy. Internally cross-validated multiple linear regression models diagnosed HYK of Holstein cows with 97.8% accuracy for primiparous and 83.3% accuracy for multiparous cows. Accuracy of Jersey models was 81.3% in primiparous and 83
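The accuracy figures quoted in this abstract come from comparing a predicted classification with the blood-based diagnosis. A minimal sketch of that comparison for a single threshold is given below. The milk BHB values and the 0.10 threshold are hypothetical; only the idea of classifying cows as positive when the predictor meets a cutoff is taken from the text.

```python
def diagnostic_accuracy(values, has_condition, threshold):
    """Classify each value as positive when it is >= threshold and compare
    with the reference diagnosis. Returns sensitivity, specificity, accuracy."""
    tp = fp = tn = fn = 0
    for value, truth in zip(values, has_condition):
        predicted = value >= threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / total if total else float("nan"),
    }

if __name__ == "__main__":
    milk_bhb = [0.05, 0.20, 0.15, 0.08, 0.30]       # hypothetical values
    hyk = [False, True, False, True, True]          # reference diagnosis
    print(diagnostic_accuracy(milk_bhb, hyk, threshold=0.10))
```

Reporting sensitivity and specificity alongside accuracy matters when the condition is uncommon, since a rule that never flags a cow can still score a high overall accuracy.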
Test-day models : breeding value estimation based on individual test-day records
Pool, M.H.
2000-01-01
The studies described in this thesis were achieved within the graduate school Wageningen Institute of Animal Science (WIAS), carried out at the Institute for Animal Science and Health (ID-Lelystad BV) at the department of Genetics and Reproduction, and financially supported by the product division NRS of CR-DELTA.
This thesis describes choices and decisions made to develop a random regression test-day model. Studies included were performed on Dutch dairy cattle data...
Interpreting parameters in the logistic regression model with random effects
DEFF Research Database (Denmark)
Larsen, Klaus; Petersen, Jørgen Holm; Budtz-Jørgensen, Esben
2000-01-01
interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects
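The median odds ratio listed among these keywords has a closed form when the random intercept is normally distributed: MOR = exp(sqrt(2 * variance) * z), where z is the 75th percentile of the standard normal distribution. The short sketch below evaluates it; the function name and the example variance are assumptions for illustration.

```python
import math
from statistics import NormalDist

def median_odds_ratio(random_intercept_variance):
    """Median odds ratio for a logistic model with a normally distributed
    random intercept: exp(sqrt(2 * var) * Phi^{-1}(0.75))."""
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2.0 * random_intercept_variance) * z75)

if __name__ == "__main__":
    print(round(median_odds_ratio(1.0), 3))
```

The MOR expresses cluster-level heterogeneity on the odds ratio scale, so it can be compared directly with the odds ratios of the fixed effects; a value of 1 means there is no between-cluster variation.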
SDE based regression for random PDEs
Bayer, Christian
2016-01-06
A simulation based method for the numerical solution of PDE with random coefficients is presented. By the Feynman-Kac formula, the solution can be represented as conditional expectation of a functional of a corresponding stochastic differential equation driven by independent noise. A time discretization of the SDE for a set of points in the domain and a subsequent Monte Carlo regression lead to an approximation of the global solution of the random PDE. We provide an initial error and complexity analysis of the proposed method along with numerical examples illustrating its behaviour.
Centers for Disease Control (CDC) Podcasts
2011-06-09
Dr. Kevin A. Fenton, Director of CDC's National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention, discusses National HIV Testing Day, an annual observance which raises awareness of the importance of knowing one's HIV status and encourages at-risk individuals to get an HIV test. Created: 6/9/2011 by National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention (NCHHSTP). Date Released: 6/9/2011.
Random regression models for milk, fat and protein in Colombian Buffaloes
Directory of Open Access Journals (Sweden)
Naudin Hurtado-Lugo
2015-01-01
Objective. Covariance functions for additive genetic and permanent environmental effects and, subsequently, genetic parameters for test-day milk (MY), fat (FY) and protein (PY) yields and mozzarella cheese (MP) in buffaloes from Colombia were estimated using random regression models (RRM) with Legendre polynomials (LP). Materials and Methods. Test-day records of MY, FY, PY and MP from 1884 first lactations of buffalo cows from 228 sires were analyzed. The animals belonged to 14 herds in Colombia between 1995 and 2011. Ten monthly classes of days in milk were considered for test-day yields. The contemporary groups were defined as herd-year-month of milk test-day. Random additive genetic, permanent environmental and residual effects were included in the model. Fixed effects included the contemporary group, linear and quadratic effects of age at calving, and the average lactation curve of the population, which was modeled by third-order LP. Random additive genetic and permanent environmental effects were estimated by RRM using third- to sixth-order LP. Residual variances were modeled using homogeneous and heterogeneous structures. Results. The heritabilities for MY, FY, PY and MP ranged from 0.38 to 0.05, 0.67 to 0.11, 0.50 to 0.07 and 0.50 to 0.11, respectively. Conclusions. In general, the RRM are adequate to describe the genetic variation in test-day MY, FY, PY and MP in Colombian buffaloes.
Zoche-Golob, V; Heuwieser, W; Krömker, V
2015-09-01
The objective of the present study was to investigate the association between the milk fat-protein ratio and the incidence rate of clinical mastitis including repeated cases of clinical mastitis to determine the usefulness of this association to monitor metabolic disorders as risk factors for udder health. Herd records from 10 dairy herds of Holstein cows in Saxony, Germany, from September 2005-2011 (36,827 lactations of 17,657 cows) were used for statistical analysis. A mixed Poisson regression model with the weekly incidence rate of clinical mastitis as outcome variable was fitted. The model included repeated events of the outcome, time-varying covariates and multilevel clustering. Because the recording of clinical mastitis might have been imperfect, a probabilistic bias analysis was conducted to assess the impact of the misclassification of clinical mastitis on the conventional results. The lactational incidence of clinical mastitis was 38.2%. In 36.2% and 34.9% of the lactations, there was at least one dairy herd test day with a fat-protein ratio of <1.0 or >1.5, respectively. Misclassification of clinical mastitis was assumed to have resulted in bias towards the null. A clinical mastitis case increased the incidence rate of following cases of the same cow. Fat-protein ratios of <1.0 and >1.5 were associated with higher incidence rates of clinical mastitis depending on week in milk. The effect of a fat-protein ratio >1.5 on the incidence rate of clinical mastitis increased considerably over the course of lactation, whereas the effect of a fat-protein ratio <1.0 decreased. Fat-protein ratios <1.0 or >1.5 on the precedent test days of all cows irrespective of their time in milk seemed to be better predictors for clinical mastitis than the first test day results per lactation. Copyright © 2015 Elsevier B.V. All rights reserved.
Berry, D.P.; Buckley, F.; Dillon, P.; Evans, R.D.; Rath, M.; Veerkamp, R.F.
2003-01-01
Genetic (co)variances between body condition score (BCS), body weight (BW), milk yield, and fertility were estimated using a random regression animal model extended to multivariate analysis. The data analyzed included 81,313 BCS observations, 91,937 BW observations, and 100,458 milk test-day yields
Neither fixed nor random: weighted least squares meta-regression.
Stanley, T D; Doucouliagos, Hristos
2017-03-01
Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias, and is as good as FE-MRA in all cases and better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.
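The difference between the fixed-effects and unrestricted WLS estimators is easiest to see in the simplest case, a meta-regression with only an intercept. Both use inverse-variance weights and give the same point estimate; the unrestricted version rescales the variance by the weighted residual mean square instead of fixing it at one. The sketch below is a reading of that idea under these stated assumptions, not the authors' code, and the effect sizes in the example are invented.

```python
import math

def wls_meta_average(effects, std_errors):
    """Intercept-only meta-regression with weights 1/SE^2.

    Returns the common point estimate, the fixed-effects standard error,
    and the unrestricted WLS standard error, which multiplies the
    fixed-effects variance by the estimated dispersion phi.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum_w
    k = len(effects)
    # Weighted residual mean square; needs at least two estimates
    phi = sum(w * (y - estimate) ** 2 for w, y in zip(weights, effects)) / (k - 1)
    return {
        "estimate": estimate,
        "se_fixed": math.sqrt(1.0 / sum_w),
        "se_wls": math.sqrt(phi / sum_w),
        "phi": phi,
    }

if __name__ == "__main__":
    print(wls_meta_average([0.10, 0.30, 0.25], [0.05, 0.10, 0.08]))
```

When the estimates are more scattered than their standard errors imply, phi exceeds one and the WLS interval widens, without shifting weight toward small, imprecise studies as random-effects weighting does.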
Solving large test-day models by iteration on data and preconditioned conjugate gradient.
Lidauer, M; Strandén, I; Mäntysaari, E A; Pösö, J; Kettunen, A
1999-12-01
A preconditioned conjugate gradient method was implemented into an iteration on data program for the estimation of breeding values, and its convergence characteristics were studied. An algorithm was used as a reference in which one fixed effect was solved by the Gauss-Seidel method, and other effects were solved by a second-order Jacobi method. Implementation of the preconditioned conjugate gradient required storing four vectors (size equal to number of unknowns in the mixed model equations) in random access memory and reading the data at each round of iteration. The preconditioner comprised diagonal blocks of the coefficient matrix. Comparison of algorithms was based on solutions of mixed model equations obtained by a single-trait animal model and a single-trait, random regression test-day model. Data sets for both models used milk yield records of primiparous Finnish dairy cows. Animal model data comprised 665,629 lactation milk yields and random regression test-day model data of 6,732,765 test-day milk yields. Both models included pedigree information of 1,099,622 animals. The animal model [random regression test-day model] required 122 [305] rounds of iteration to converge with the reference algorithm, but only 88 [149] were required with the preconditioned conjugate gradient. To solve the random regression test-day model with the preconditioned conjugate gradient required 237 megabytes of random access memory and took 14% of the computation time needed by the reference algorithm.
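For readers unfamiliar with the method, the following is a minimal sketch of a preconditioned conjugate gradient solver. It is not the authors' program: it holds a small dense coefficient matrix in memory rather than iterating on data, and it uses a plain diagonal (Jacobi) preconditioner, which is the special case of diagonal blocks of size one. The function name, tolerance and stopping rule are assumptions made for illustration.

```python
from math import sqrt


def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (list of lists).

    Illustrative sketch with a diagonal preconditioner; returns the
    solution and the number of rounds of iteration used.
    """
    n = len(b)
    m_inv = [1.0 / A[i][i] for i in range(n)]   # inverse of diag(A)
    x = [0.0] * n
    r = list(b)                                  # residual b - A x, with x = 0
    z = [m_inv[i] * r[i] for i in range(n)]      # preconditioned residual
    p = list(z)                                  # search direction
    rz = sum(ri * zi for ri, zi in zip(r, z))
    b_norm = sqrt(sum(bi * bi for bi in b))
    if b_norm == 0.0:
        return x, 0                              # zero right-hand side
    for it in range(1, max_iter + 1):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if sqrt(sum(ri * ri for ri in r)) / b_norm < tol:
            return x, it
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

The working vectors here (solution, residual, search direction, and the matrix-vector product) correspond to the kind of storage the abstract describes, each with one entry per unknown.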
Evaluation of random forest regression for prediction of breeding ...
Indian Academy of Sciences (India)
cation of the random forest (RF), a model-free ensemble learning method, is not widely used for prediction. In this study, the ... [Sarkar R. K., Rao A. R., Meher P. K., Nepolean T. and Mohapatra T. 2015 Evaluation of random forest regression for prediction of breeding value from .... Ten-fold cross validation technique (Stone.
Buffalos milk yield analysis using random regression models
Directory of Open Access Journals (Sweden)
A.S. Schierholt
2010-02-01
Full Text Available Data comprising 1,719 milk yield records from 357 females (predominantly Murrah breed), daughters of 110 sires, with births from 1974 to 2004, obtained from the Programa de Melhoramento Genético de Bubalinos (PROMEBUL) and from records of EMBRAPA Amazônia Oriental - EAO herd, located in Belém, Pará, Brazil, were used to compare random regression models for estimating variance components and predicting breeding values of the sires. The data were analyzed by different models using Legendre polynomial functions from second to fourth orders. The random regression models included the effects of herd-year, month of parity date of the control; regression coefficients for age of females (in order to describe the fixed part of the lactation curve) and random regression coefficients related to the direct genetic and permanent environment effects. The comparisons among the models were based on the Akaike Information Criterion. The random effects regression model using third order Legendre polynomials with four classes of the environmental effect was the one that best described the additive genetic variation in milk yield. The heritability estimates varied from 0.08 to 0.40. The genetic correlation between milk yields at younger ages was close to unity, but at older ages it was low.
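Several abstracts in this listing fit Legendre polynomials of days in milk or age as random regression covariates. A minimal sketch of how such covariates are commonly built follows: the time variable is first rescaled to the interval from -1 to 1, and the polynomials are then evaluated with the standard three-term recurrence and normalized. The function names are hypothetical, and the argument `n_coef` counts coefficients (so `n_coef=3` gives polynomials of degree 0 to 2); how that maps onto the "order" reported in any given paper is an assumption the reader should check.

```python
from math import sqrt


def standardize(t, t_min, t_max):
    """Rescale a time point (e.g. days in milk) to the interval [-1, 1]."""
    return -1.0 + 2.0 * (t - t_min) / (t_max - t_min)


def normalized_legendre(x, n_coef):
    """Return the first n_coef normalized Legendre polynomials at x.

    Uses (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x) and the
    normalization sqrt((2k + 1) / 2).
    """
    p = [1.0, x]
    for k in range(1, n_coef - 1):
        p.append(((2 * k + 1) * x * p[k] - k * p[k - 1]) / (k + 1))
    return [sqrt((2 * k + 1) / 2.0) * p[k] for k in range(n_coef)]
```

For a test day at 155 days in milk within a 5 to 305 day window, `normalized_legendre(standardize(155, 5, 305), 3)` gives the three covariates multiplied by an animal's random regression coefficients.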
Strategies for estimating the parameters needed for different test-day models
Misztal, I.; Strabel, T.; Jamrozik, J.; Mäntysaari, E.A.; Meuwissen, T.H.E.
2000-01-01
Currently, most analyses of parameters in test-day models involve two types of models: random regression, where various functions describe variability of (co)variances with regard to days in milk, and multiple traits, where observations in adjacent days in milk are treated as one trait. The
Generalized and synthetic regression estimators for randomized branch sampling
David L. R. Affleck; Timothy G. Gregoire
2015-01-01
In felled-tree studies, ratio and regression estimators are commonly used to convert more readily measured branch characteristics to dry crown mass estimates. In some cases, data from multiple trees are pooled to form these estimates. This research evaluates the utility of both tactics in the estimation of crown biomass following randomized branch sampling (...
Bioprocess data mining using regularized regression and random forests.
Hassan, Syeda; Farhan, Muhammad; Mangayil, Rahul; Huttunen, Heikki; Aho, Tommi
2013-01-01
In bioprocess development, the needs of data analysis include (1) getting overview to existing data sets, (2) identifying primary control parameters, (3) determining a useful control direction, and (4) planning future experiments. In particular, the integration of multiple data sets causes that these needs cannot be properly addressed by regression models that assume linear input-output relationship or unimodality of the response function. Regularized regression and random forests, on the other hand, have several properties that may appear important in this context. They are capable, e.g., in handling small number of samples with respect to the number of variables, feature selection, and the visualization of response surfaces in order to present the prediction results in an illustrative way. In this work, the applicability of regularized regression (Lasso) and random forests (RF) in bioprocess data mining was examined, and their performance was benchmarked against multiple linear regression. As an example, we used data from a culture media optimization study for microbial hydrogen production. All the three methods were capable in providing a significant model when the five variables of the culture media optimization were linearly included in modeling. However, multiple linear regression failed when also the multiplications and squares of the variables were included in modeling. In this case, the modeling was still successful with Lasso (correlation between the observed and predicted yield was 0.69) and RF (0.91). We found that both regularized regression and random forests were able to produce feasible models, and the latter was efficient in capturing the non-linearity in the data. In this kind of a data mining task of bioprocess data, both methods outperform multiple linear regression.
Genetic evaluation for persistency of lactation in Holstein cows using a random regression model
Directory of Open Access Journals (Sweden)
Jaime Araujo Cobuci
2007-03-01
Full Text Available A model for analyzing test day records including both fixed and random coefficients was applied to the genetic evaluation of first lactation data for Holstein cows. Data comprised 87045 test-day milk yield records from calvings between 1997 and 2001 in Holstein herds in 10 regions of the Brazilian state of Minas Gerais. Six persistency of lactation measures were evaluated using breeding values obtained by random regression analyses. The Wilmink function was used to model the additive genetic and permanent environmental effects. Residual variance was constant throughout lactation. Ranking for animals did not change among criteria for persistency measurements, but ranking changes were observed when the estimated breeding value (EBV) for persistency of lactation was contrasted with those estimated for 305-day milk yield (305MY). The rank correlation estimates for persistency of lactation and 305MY were practically the same for sires and cows, and ranged from -0.45 to 0.69. The EBVs for milk yield during lactation for sires producing daughters with superior 305MY indicate genetic differences between sires regarding their ability to transmit desirable persistency of lactation traits. This suggests that selection for total lactation milk yield does not identify sires or cows that are genetically superior in regard to persistency of lactation. Genetic evaluation for persistency of lactation is important for improving the efficiency of the milk production capacity of Holstein cows.
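The Wilmink function mentioned above is usually written as a + b*t + c*exp(-k*t), with t in days in milk. The sketch below shows the three covariates it implies and the resulting curve. The decay constant k = 0.05 is a value commonly used in the literature and is an assumption here, since the abstract does not state it; the function names and example coefficients are hypothetical.

```python
from math import exp


def wilmink_covariates(dim, decay=0.05):
    """Covariates [1, t, exp(-decay * t)] for a given day in milk."""
    return [1.0, float(dim), exp(-decay * dim)]


def wilmink_curve(dim, a, b, c, decay=0.05):
    """Evaluate a + b * t + c * exp(-decay * t) at day in milk t."""
    cov = wilmink_covariates(dim, decay)
    return a * cov[0] + b * cov[1] + c * cov[2]
```

In a random regression setting, each animal receives its own random coefficients on these three covariates for the additive genetic and permanent environmental effects.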
Lidauer, M H; Emmerling, R; Mäntysaari, E A
2008-06-01
A multiplicative random regression (M-RRM) test-day (TD) model was used to analyse daily milk yields from all available parities of German and Austrian Simmental dairy cattle. The method to account for heterogeneous variance (HV) was based on the multiplicative mixed model approach of Meuwissen. The variance model for the heterogeneity parameters included a fixed region x year x month x parity effect and a random herd x test-month effect with a within-herd first-order autocorrelation between test-months. Acceleration of variance model solutions after each multiplicative model cycle enabled fast convergence of adjustment factors and reduced total computing time significantly. Maximum Likelihood estimation of within-strata residual variances was enhanced by inclusion of approximated information on loss in degrees of freedom due to estimation of location parameters. This improved heterogeneity estimates for very small herds. The multiplicative model was compared with a model that assumed homogeneous variance. Re-estimated genetic variances, based on Mendelian sampling deviations, were homogeneous for the M-RRM TD model but heterogeneous for the homogeneous random regression TD model. Accounting for HV had large effect on cow ranking but moderate effect on bull ranking.
Smith, Paul F; Ganesh, Siva; Liu, Ping
2013-10-30
Regression is a common statistical tool for prediction in neuroscience. However, linear regression is by far the most common form of regression used, with regression trees receiving comparatively little attention. In this study, the results of conventional multiple linear regression (MLR) were compared with those of random forest regression (RFR), in the prediction of the concentrations of 9 neurochemicals in the vestibular nucleus complex and cerebellum that are part of the l-arginine biochemical pathway (agmatine, putrescine, spermidine, spermine, l-arginine, l-ornithine, l-citrulline, glutamate and γ-aminobutyric acid (GABA)). The R² values for the MLRs were higher than the proportion of variance explained values for the RFRs: 6/9 of them were ≥ 0.70 compared to 4/9 for RFRs. Even the variables that had the lowest R² values for the MLRs, e.g. ornithine (0.50) and glutamate (0.61), had much lower proportion of variance explained values for the RFRs (0.27 and 0.49, respectively). The RSE values for the MLRs were lower than those for the RFRs in all but two cases. In general, MLRs seemed to be superior to the RFRs in terms of predictive value and error. In the case of this data set, MLR appeared to be superior to RFR in terms of its explanatory value and error. This result suggests that MLR may have advantages over RFR for prediction in neuroscience with this kind of data set, but that RFR can still have good predictive value in some cases. Copyright © 2013 Elsevier B.V. All rights reserved.
Portolano, B.; Maizon, D. O.; Riggio, V.; Tolone, M.; Cacioppo, D.
2007-01-01
The aims of the present study were to compare estimated breeding values (EBV) for milk yield using different testing schemes with a test-day animal model and to evaluate the effect of different testing schemes on the ranking of top sheep. Alternative recording schemes that use less information than that currently obtained with a monthly test-day schedule were employed to estimate breeding values. A random regression animal mixed model that used a spline function of days in milk was fitted. EB...
Genetic evaluation of European quails by random regression models
Directory of Open Access Journals (Sweden)
Flaviana Miranda Gonçalves
2012-09-01
Full Text Available The objective of this study was to compare different random regression models, defined from different classes of heterogeneity of variance combined with different Legendre polynomial orders for the estimate of (co)variance of quails. The data came from 28,076 observations of 4,507 female meat quails of the LF1 lineage. Quail body weights were determined at birth and 1, 14, 21, 28, 35 and 42 days of age. Six different classes of residual variance were fitted to Legendre polynomial functions (orders ranging from 2 to 6) to determine which model had the best fit to describe the (co)variance structures as a function of time. According to the evaluated criteria (AIC, BIC and LRT), the model with six classes of residual variances and a sixth-order Legendre polynomial was the best fit. The estimated additive genetic variance increased from birth to 28 days of age, and dropped slightly from 35 to 42 days. The heritability estimates decreased along the growth curve and changed from 0.51 (1 day) to 0.16 (42 days). Animal genetic and permanent environmental correlation estimates between weights and age classes were always high and positive, except for birth weight. The sixth order Legendre polynomial, along with the residual variance divided into six classes, was the best fit for the growth rate curve of meat quails; therefore, they should be considered for breeding evaluation processes by random regression models.
Calibration of stormwater quality regression models: a random process?
Dembélé, A; Bertrand-Krajewski, J-L; Barillon, B
2010-01-01
Regression models are among the most frequently used models to estimate pollutants event mean concentrations (EMC) in wet weather discharges in urban catchments. Two main questions dealing with the calibration of EMC regression models are investigated: i) the sensitivity of models to the size and the content of data sets used for their calibration, ii) the change of modelling results when models are re-calibrated when data sets grow and change with time when new experimental data are collected. Based on an experimental data set of 64 rain events monitored in a densely urbanised catchment, four TSS EMC regression models (two log-linear and two linear models) with two or three explanatory variables have been derived and analysed. Model calibration with the iterative re-weighted least squares method is less sensitive and leads to more robust results than the ordinary least squares method. Three calibration options have been investigated: two options accounting for the chronological order of the observations, one option using random samples of events from the whole available data set. Results obtained with the best performing non linear model clearly indicate that the model is highly sensitive to the size and the content of the data set used for its calibration.
Directory of Open Access Journals (Sweden)
D. Cacioppo
2010-04-01
Full Text Available The aims of the present study were to compare estimated breeding values (EBV) for milk yield using different testing schemes with a test-day animal model and to evaluate the effect of different testing schemes on the ranking of top sheep. Alternative recording schemes that use less information than that currently obtained with a monthly test-day schedule were employed to estimate breeding values. A random regression animal mixed model that used a spline function of days in milk was fitted. EBVs obtained with alternative recording schemes showed different degrees of Spearman correlation with EBVs obtained using the monthly recording scheme. These correlations ranged from 0.77 to 0.92. A reduction in accuracy and intensity of selection could be anticipated if these alternative schemes are used; more research in this area is needed to reduce the costs of test-day recording.
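As a minimal sketch of the comparison described above, the Spearman correlation between two sets of EBVs for the same animals can be computed as the Pearson correlation of their ranks, with tied values given their average rank. The function names are hypothetical and no data from the study are reproduced; the code also assumes each input has some variation, since a constant vector would give a zero denominator.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation, e.g. between EBVs from two schemes."""
    return pearson(average_ranks(x), average_ranks(y))
```

A value near 1 means the alternative scheme ranks animals almost as the monthly scheme does; lower values mean more re-ranking among selection candidates.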
Mookprom, S; Boonkum, W; Kunhareang, S; Siripanya, S; Duangjinda, M
2017-02-01
The objective of this research is to investigate appropriate random regression models with various covariance functions, for the genetic evaluation of test-day egg production. Data included 7,884 monthly egg production records from 657 Thai native chickens (Pradu Hang Dam) that were obtained during the first to sixth generation and were born during 2007 to 2014 at the Research and Development Network Center for Animal Breeding (Native Chickens), Khon Kaen University. Average annual and monthly egg productions were 117 ± 41 and 10.20 ± 6.40 eggs, respectively. Nine random regression models were analyzed using the Wilmink function (WM), Koops and Grossman function (KG), Legendre polynomial functions with second, third, and fourth orders (LG2, LG3, LG4), and spline functions with 4, 5, 6, and 8 knots (SP4, SP5, SP6, and SP8). All covariance functions were nested within the same additive genetic and permanent environmental random effects, and the variance components were estimated by Restricted Maximum Likelihood (REML). In model comparisons, mean square error (MSE) and the coefficient of determination (R²) were used to assess goodness of fit, and the correlation between observed and predicted values was used to calculate the cross-validated predictive abilities. We found that the covariance functions of SP5, SP6, and SP8 proved appropriate for the genetic evaluation of the egg production curves for Thai native chickens. The estimated heritability of monthly egg production ranged from 0.07 to 0.39, and the highest heritability was found during the first to third months of egg production. In conclusion, the spline functions within monthly egg production can be applied to breeding programs for the improvement of both egg number and persistence of egg production. © 2016 Poultry Science Association Inc.
Directory of Open Access Journals (Sweden)
Mohammad Jabarzadeh Ivrigh
2016-04-01
Full Text Available Introduction Productive traits such as milk production and fat and protein percentage have economic importance in the livestock industry. Accurate prediction of the breeding value of animals is one of the best tools available for maximizing response to a selection program. The main objective of a breeding program is to achieve the maximum economic benefit. For breeders of dairy cattle, milk, fat, and protein are the main sources of income, which makes them the most important traits in the firm goals. For evaluating dairy cattle based on these traits (milk production, fat, and protein percentage), prediction of breeding values is essential. The present study was performed in order to estimate the genetic and phenotypic parameters and the genetic and phenotypic trends of production traits in the Mediterranean climate of Iran (including Ardebil, Hamadan, East and West Azerbaijan and Zanjan provinces) using 105118 test-day records and 30985 305-day lactation records related to 8808 herds of first-lactation Holstein cattle calving between 2003 and 2013. All records were collected by the Animal Breeding Center of Iran. Materials and Methods Records were edited using FoxPro 8.0 and ACCESS 2010 software, and wrong and unusual records were removed from the dataset. All analyses were performed using the RR (random regression) routine of the WOMBAT software package using the AIREML algorithm on the Linux operating system. Test day records were analyzed with the following random regression model (RRM): where Yklimnptv is test day record i obtained at DIM t of cow p calved at the nth age group in herd-test day m; Pk is the kth fixed effect of province; YSl is the lth fixed effect of year-season of calving; HTDm is the fixed effect of the mth herd-test date; Cf is the fth fixed regression coefficient for calving age; agen is the nth calving age; k is the order of fit for fixed regression coefficients (k=4); βr is the rth fixed regression coefficient; ka is the order of fit for additive
Random forest regression for magnetic resonance image synthesis.
Jog, Amod; Carass, Aaron; Roy, Snehashis; Pham, Dzung L; Prince, Jerry L
2017-01-01
By choosing different pulse sequences and their parameters, magnetic resonance imaging (MRI) can generate a large variety of tissue contrasts. This very flexibility, however, can yield inconsistencies with MRI acquisitions across datasets or scanning sessions that can in turn cause inconsistent automated image analysis. Although image synthesis of MR images has been shown to be helpful in addressing this problem, an inability to synthesize both T2-weighted brain images that include the skull and FLuid Attenuated Inversion Recovery (FLAIR) images has been reported. The method described herein, called REPLICA, addresses these limitations. REPLICA is a supervised random forest image synthesis approach that learns a nonlinear regression to predict intensities of alternate tissue contrasts given specific input tissue contrasts. Experimental results include direct image comparisons between synthetic and real images, results from image analysis tasks on both synthetic and real images, and comparison against other state-of-the-art image synthesis methods. REPLICA is computationally fast, and is shown to be comparable to other methods on tasks they are able to perform. Additionally REPLICA has the capability to synthesize both T2-weighted images of the full head and FLAIR images, and perform intensity standardization between different imaging datasets. Copyright © 2016 Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
Ali William Canaza-Cayo
2015-10-01
Full Text Available A total of 32,817 test-day milk yield (TDMY) records of the first lactation of 4,056 Girolando cows, daughters of 276 sires, collected from 118 herds between 2000 and 2011, were utilized to estimate the genetic parameters for TDMY via random regression models (RRM) using Legendre polynomial functions whose orders varied from 3 to 5. In addition, nine measures of persistency in milk yield (PSi) and the genetic trend of 305-day milk yield (305MY) were evaluated. The fit quality criteria used indicated RRM employing the Legendre polynomial of orders 3 and 5 for fitting the genetic additive and permanent environment effects, respectively, as the best model. The heritability and genetic correlation for TDMY throughout the lactation, obtained with the best model, varied from 0.18 to 0.23 and from −0.03 to 1.00, respectively. The heritability and genetic correlation for persistency and 305MY varied from 0.10 to 0.33 and from −0.98 to 1.00, respectively. The use of PS7 would be the most suitable option for the evaluation of Girolando cattle. The estimated breeding values for 305MY of sires and cows showed significant and positive genetic trends. Thus, the use of selection indices would be indicated in the genetic evaluation of Girolando cattle for both traits.
Directory of Open Access Journals (Sweden)
A. KETTUNEN
2008-12-01
Full Text Available Genetic parameters for test-day milk production at different stages of lactation of Finnish Ayrshire heifers were estimated with the REML method using the AI algorithm and animal model. The data consisted of 38679 first lactation test-day milk yields of 4205 cows from 231 herds in three geographical regions (North Savo, Central Ostrobothnia and Lapland). To identify different test days, records were numbered according to the days in milk after calving, and were further categorized into three part-lactations according to the test-day classification. Expressions in the three part-lactations were considered as separate traits, and tests were treated as repeated observations within the trait. Heritability estimates for test-day milk yield varied between 0.11 and 0.17, being lowest at the beginning of lactation. Genetic correlations between test-day milk yields at different trimesters ranged from 0.64 to 0.91, being highest between consecutive trimesters. Standard errors of the estimates of genetic parameters varied between 0.02 and 0.08. Genetic interrelationships differed from 1.0, supporting the assumption that genetic variation exists in the shape of the lactation curve. The necessity of considering deviations from the general lactation curve in the test-day model, e.g. fitting random regression coefficients, is discussed.
A Unified Approach to Power Calculation and Sample Size Determination for Random Regression Models
Shieh, Gwowen
2007-01-01
The underlying statistical models for multiple regression analysis are typically attributed to two types of modeling: fixed and random. The procedures for calculating power and sample size under the fixed regression models are well known. However, the literature on random regression models is limited and has been confined to the case of all…
Milk yield persistency in Brazilian Gyr cattle based on a random regression model.
Pereira, R J; Verneque, R S; Lopes, P S; Santana, M L; Lagrotta, M R; Torres, R A; Vercesi Filho, A E; Machado, M A
2012-06-15
With the objective of evaluating measures of milk yield persistency, 27,000 test-day milk yield records from 3362 first lactations of Brazilian Gyr cows that calved between 1990 and 2007 were analyzed with a random regression model. Random, additive genetic and permanent environmental effects were modeled using Legendre polynomials of order 4 and 5, respectively. Residual variance was modeled using five classes. The average lactation curve was modeled using a fourth-order Legendre polynomial. Heritability estimates for measures of persistency ranged from 0.10 to 0.25. Genetic correlations between measures of persistency and 305-day milk yield (Y305) ranged from -0.52 to 0.03. At high selection intensities for persistency measures and Y305, few animals were selected in common. As the selection intensity for the two traits decreased, a higher percentage of animals were selected in common. The average predicted breeding values for Y305 according to year of birth of the cows had a substantial annual genetic gain. In contrast, no improvement in the average persistency breeding value was observed. We conclude that selection for total milk yield during lactation does not identify bulls or cows that are genetically superior in terms of milk yield persistency. A measure of persistency represented by the sum of deviations of estimated breeding value for days 31 to 280 in relation to estimated breeding value for day 30 should be preferred in genetic evaluations of this trait in the Gyr breed, since this measure showed a medium heritability and a genetic correlation with 305-day milk yield close to zero. In addition, this measure is more adequate at the time of peak lactation, which occurs between days 25 and 30 after calving in this breed.
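The persistency measure recommended in this abstract, the sum of deviations of the estimated breeding value (EBV) for days 31 to 280 from the EBV for day 30, can be sketched as follows. The function name and the callable `ebv_by_day` are hypothetical; in practice the daily EBV would come from an animal's predicted random regression coefficients applied to the covariates for each day.

```python
def persistency(ebv_by_day, start=31, end=280, reference_day=30):
    """Sum of (EBV on day d - EBV on the reference day) for d in start..end.

    ebv_by_day is any function mapping a day in milk to an EBV
    (a placeholder for a fitted random regression curve).
    """
    reference = ebv_by_day(reference_day)
    return sum(ebv_by_day(day) - reference for day in range(start, end + 1))
```

A flat genetic curve gives a value of zero; a curve that falls away after day 30 gives a negative value, indicating lower persistency.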
Genetic parameters for various random regression models to describe the weight data of pigs
Huisman, A.E.; Veerkamp, R.F.; Arendonk, van J.A.M.
2002-01-01
Various random regression models have been advocated for the fitting of covariance structures. It was suggested that a spline model would fit better to weight data than a random regression model that utilizes orthogonal polynomials. The objective of this study was to investigate which kind of random
Genetic parameters for different random regression models to describe weight data of pigs
Huisman, A.E.; Veerkamp, R.F.; Arendonk, van J.A.M.
2001-01-01
Various random regression models have been advocated for the fitting of covariance structures. It was suggested that a spline model would fit better to weight data than a random regression model that utilizes orthogonal polynomials. The objective of this study was to investigate which kind of random
Directory of Open Access Journals (Sweden)
Maria Gabriela Campolina Diniz Peixoto
2014-05-01
Full Text Available The objective of this work was to compare random regression models for the estimation of genetic parameters for Guzerat milk production, using orthogonal Legendre polynomials. Records (20,524) of test-day milk yield (TDMY) from 2,816 first-lactation Guzerat cows were used. TDMY grouped into 10-monthly classes were analyzed for additive genetic effect and for permanent environmental and residual effects (random effects), whereas the contemporary group, calving age (linear and quadratic effects) and mean lactation curve were analyzed as fixed effects. Trajectories for the additive genetic and permanent environmental effects were modeled by means of a covariance function employing orthogonal Legendre polynomials ranging from the second to the fifth order. Residual variances were considered in one, four, six, or ten variance classes. The best model had six residual variance classes. The heritability estimates for the TDMY records varied from 0.19 to 0.32. The random regression model that used a second-order Legendre polynomial for the additive genetic effect and a fifth-order polynomial for the permanent environmental effect is adequate according to the main criteria employed. The model with a second-order Legendre polynomial for the additive genetic effect and a fourth-order polynomial for the permanent environmental effect could also be employed in these analyses.
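Heritability along the lactation in a random regression model is typically obtained from the covariance function: the variance of each random effect at a given time is the quadratic form of the covariate vector with the coefficient (co)variance matrix. The following is a minimal sketch under that standard formulation. All names and the example matrices in the usage note are hypothetical, and the residual variance is passed in for whichever variance class the time point belongs to.

```python
def quad_form(phi, K):
    """Return phi' K phi for a covariate vector phi and square matrix K."""
    n = len(phi)
    return sum(phi[i] * K[i][j] * phi[j] for i in range(n) for j in range(n))


def heritability(phi_a, K_a, phi_p, K_p, residual_var):
    """Heritability at one time point from covariance-function components.

    phi_a, phi_p: covariates (e.g. Legendre polynomials) at that time for
    the additive genetic and permanent environmental effects.
    K_a, K_p: (co)variance matrices of the random regression coefficients.
    """
    var_a = quad_form(phi_a, K_a)
    var_p = quad_form(phi_p, K_p)
    return var_a / (var_a + var_p + residual_var)
```

For example, `heritability([1.0, 0.0], [[2.0, 0.0], [0.0, 1.0]], [1.0, 0.0], [[3.0, 0.0], [0.0, 1.0]], 5.0)` divides an additive variance of 2 by a total of 10. The two covariate vectors may differ in length when the effects are fitted with different polynomial orders.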
Evaluation of random forest regression for prediction of breeding ...
Indian Academy of Sciences (India)
Model-based techniques have been widely used for prediction of breeding values of genotypes from genomewide association studies. However, application of the random forest (RF), a model-free ensemble learning method, is not widely used for prediction. In this study, the optimum values of tuning parameters of RF have ...
Application of Random-Effects Probit Regression Models.
Gibbons, Robert D.; Hedeker, Donald
1994-01-01
Develops random-effects probit model for case in which outcome of interest is series of correlated binary responses, obtained as product of longitudinal response process where individual is repeatedly classified on binary outcome variable or in multilevel or clustered problems in which individuals within groups are considered to share…
Random Decrement and Regression Analysis of Traffic Responses of Bridges
DEFF Research Database (Denmark)
Asmussen, J. C.; Ibrahim, S. R.; Brincker, Rune
1996-01-01
The topic of this paper is the estimation of modal parameters from ambient data by applying the Random Decrement technique. The data from the Queensborough Bridge over the Fraser River in Vancouver, Canada have been applied. The loads producing the dynamic response are ambient, e.g. wind, traffic...... and small ground motion. The Random Decrement technique is used to estimate the correlation function or the free decays from the ambient data. From these functions, the modal parameters are extracted using the Ibrahim Time Domain method. The possible influence of the traffic mass load on the bridge...... is investigated by assuming that the response level of the bridge is dependent on the mass of the vehicle load. The eigenfrequencies of the bridge are estimated as a function of the response level. This indicates the degree of influence of the mass load on the estimated eigenfrequencies. The results...
Random Decrement and Regression Analysis of Traffic Responses of Bridges
DEFF Research Database (Denmark)
Asmussen, J. C.; Ibrahim, S. R.; Brincker, Rune
The topic of this paper is the estimation of modal parameters from ambient data by applying the Random Decrement technique. The data from the Queensborough Bridge over the Fraser River in Vancouver, Canada have been applied. The loads producing the dynamic response are ambient, e.g. wind, traffic...... and small ground motion. The Random Decrement technique is used to estimate the correlation function or the free decays from the ambient data. From these functions, the modal parameters are extracted using the Ibrahim Time Domain method. The possible influence of the traffic mass load on the bridge...... is investigated by assuming that the response level of the bridge is dependent on the mass of the vehicle load. The eigenfrequencies of the bridge are estimated as a function of the response level. This indicates the degree of influence of the mass load on the estimated eigenfrequencies. The results...
DEFF Research Database (Denmark)
Strathe, Anders B; Mark, Thomas; Nielsen, Bjarne
Random regression models were used to estimate covariance functions between cumulated feed intake (CFI) and body weight (BW) in 8424 Danish Duroc pigs. Random regressions on second order Legendre polynomials of age were used to describe genetic and permanent environmental curves in BW and CFI. Ba...
Technology diffusion in hospitals : A log odds random effects regression model
Blank, J.L.T.; Valdmanis, V.G.
2013-01-01
This study identifies the factors that affect the diffusion of hospital innovations. We apply a log odds random effects regression model on hospital micro data. We introduce the concept of clustering innovations and the application of a log odds random effects regression model to describe the
Technology diffusion in hospitals: A log odds random effects regression model
J.L.T. Blank (Jos); V.G. Valdmanis (Vivian G.)
2015-01-01
This study identifies the factors that affect the diffusion of hospital innovations. We apply a log odds random effects regression model on hospital micro data. We introduce the concept of clustering innovations and the application of a log odds random effects regression model to
Directory of Open Access Journals (Sweden)
Fatemh Kazemi Borzel Abad
2016-08-01
Full Text Available Introduction Milk solid no-fat is economically very important in the cheese industry. Compared to other kinds of milk, ewe’s milk contains a higher amount of milk solids no-fat. Milk solids no-fat (MSNF) contains lactose, caseins, whey proteins, and minerals. The use of test-day records in the random regression method has several benefits, including flexibility to account for the environmental and genetic components of the shape of lactation, reducing the generation interval and the cost of recording by making fewer measurements, increasing the accuracy of genetic evaluation, and direct correction for fixed effects. Therefore, the objective of the present study was to estimate genetic parameters for test-day milk solid no-fat percentage in Kurdi sheep of Shirvan using fixed and random regression models. Materials and methods In the present investigation, genetic analysis of milk solid no-fat percentage was carried out using fixed and random regression models with the Wombat software. Data included 1094 test-day records of milk solid no-fat percentage collected from 250 ewes at the Hossien Abad Kurdi sheep breeding station. Milking was carried out by hand milking combined with lamb suckling at 14-day intervals from May to August 2012. Then, 50 ml milk samples were immediately analysed by Ecomilk total to determine the milk solid no-fat percentage. Fixed effects of litter size, parity, and month of recording, days in milk as a covariate, and random direct genetic and permanent environmental effects were included in the models. A general linear model was used to identify the fixed effects influencing the trait, using SAS 9.1 software. Variance and covariance components were estimated using the restricted maximum likelihood procedure. In the random regression model, orthogonal Legendre polynomials of order 2 were fitted for the permanent environmental and additive genetic effects. Results and Discussion Average milk solid no-fat percentage of Kurdi ewes was 11.83. Average
Random regression models in the evaluation of the growth curve of Simbrasil beef cattle
Mota, M.; Marques, F.A.; Lopes, P.S.; Hidalgo, A.M.
2013-01-01
Random regression models were used to estimate the types and orders of random effects of (co)variance functions in the description of the growth trajectory of the Simbrasil cattle breed. Records for 7049 animals totaling 18,677 individual weighings were submitted to 15 models from the third to the
Bouwmeester, Walter; Twisk, Jos W R; Kappen, Teus H; van Klei, Wilton A; Moons, Karel G M; Vergouwe, Yvonne
2013-02-15
When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. The models with random intercept discriminate better than the standard model only
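The discrimination measure quoted in this abstract (the standard c-index) can be illustrated with a short sketch: it is the proportion of event/non-event pairs in which the event received the higher predicted risk, with ties counted as one half. The function name and the toy risks are hypothetical; the cluster-adapted performance measures mentioned in the abstract are not reproduced here.

```python
from itertools import product

def c_index(predicted_risk, outcome):
    """Concordance index for a binary outcome (1 = event, 0 = no event)."""
    events = [p for p, y in zip(predicted_risk, outcome) if y == 1]
    non_events = [p for p, y in zip(predicted_risk, outcome) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            score += 1.0      # concordant pair
        elif p_event == p_non_event:
            score += 0.5      # tied predictions count as half
    return score / (len(events) * len(non_events))

# Toy example: two patients with the event, two without.
perfect = c_index([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
mixed = c_index([0.9, 0.2, 0.8, 0.3], [1, 1, 0, 0])
```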
Strategies for estimating the parameters needed for different test-day models.
Misztal, I; Strabel, T; Jamrozik, J; Mäntysaari, E A; Meuwissen, T H
2000-05-01
Currently, most analyses of parameters in test-day models involve two types of models: random regression, where various functions describe variability of (co)variances with regard to days in milk, and multiple traits, where observations in adjacent days in milk are treated as one trait. The methodologies used for estimation of parameters included Bayesian via Gibbs sampling, and REML in the form of derivative-free, expectation-maximization, or average-information algorithms. The first method is simpler and uses less memory but may need many rounds to produce posterior samples. In REML, however, the stopping point is well established. Because of computing limitations, the largest estimations of parameters were on fewer than 20,000 animals. The magnitude and pattern of heritabilities varied widely, which could be caused by simplifications in the model, overparameterization, small sample size, and unrepresentative samples. Patterns of heritability differ among random regression and multiple-trait models. Accurate parameters for large multi-trait random regression models may be difficult to obtain at the present time. Parameters that are sufficiently accurate in practice may be obtained outside the complete prediction model by a constructive approach, where parameters averaged over the lactation would be combined with several typical curves for (co)variances for days in milk. Obtained parameters could be used for any model, and could also aid in comparison of models.
Directory of Open Access Journals (Sweden)
Mahdi Elahi Torshizi
2017-10-01
Full Text Available Objective During the last decade, genetic evaluation of dairy cows using longitudinal data (test-day milk yield or 305-day milk yield) with the random regression method has been officially adopted in several countries. The objectives of this study were to estimate covariance functions for genetic and permanent environmental effects and to obtain genetic parameters of 305-day milk yield over seven parities. Methods Data included 60,279 total 305-day milk yield records of 17,309 Iranian Holstein dairy cows in 7 parities, calved between 20 and 140 months of age between 2004 and 2011. Residual variances were modeled by homogeneous and step functions with 7 and 10 classes. Results The results showed that a third order polynomial for additive genetic and permanent environmental effects plus a step function with 10 classes for the residual variance was the most adequate and parsimonious model to describe the covariance structure of the data. Heritability estimates obtained by this model varied from 0.17 to 0.28. The performance of this model was better than that of the repeatability model. Moreover, 10 classes of residual variance produced more accurate results than 7 classes or a homogeneous residual effect. Conclusion A quadratic Legendre polynomial for additive genetic and permanent environmental effects with a 10-class step function for the residual variance is sufficient to produce a parsimonious model that explains the change in 305-day milk yield over consecutive parities of Iranian Holstein cows.
Fagard, Robert H; Celis, Hilde; Thijs, Lutgarde; Wouters, Stijn
2009-11-01
Blood pressure-lowering therapy reduces left ventricular mass, but the question of whether differences exist among drug classes has not been fully resolved. Our aim was to compare the effects of diuretics, beta-blockers, calcium channel blockers, angiotensin-converting enzyme inhibitors, and angiotensin receptor blockers on left ventricular mass regression in patients with hypertension on the basis of prospective, randomized comparative studies. We performed meta-analyses, involving pooled pairwise comparisons of the drug classes and of each class versus other classes statistically combined, and meta-regression analyses to identify the determinants of the regression. The 75 relevant publications involved 84 pairwise comparisons and 6001 patients. Regression of left ventricular mass was significantly less (P=0.01) with beta-blockers (9.8%) than with angiotensin receptor blockers (12.5%), but none of the other analyzable pairwise comparisons between drug classes revealed significant differences (P>0.10). In addition, beta-blockers showed less regression than the other 4 classes statistically combined (Pmeta-regression analysis on all of the treatment arms, beta-blocker treatment was a significant and negative predictor of the regression (-3.6%; Pclasses, including angiotensin receptor blockers. In conclusion, beta-blockers show less regression of left ventricular mass, whereas angiotensin receptor blockers may induce larger regression. The inferiority of beta-blockers appears to be more convincing than the superiority of angiotensin receptor blockers.
A random regression model in analysis of litter size in pigs | Luković ...
African Journals Online (AJOL)
Dispersion parameters for number of piglets born alive (NBA) were estimated using a random regression model (RRM). Two data sets of litter records from the Nemščak farm in Slovenia were used for analyses. The first dataset (DS1) included records from the first to the sixth parity. The second dataset (DS2) was extended ...
Jeffrey T. Walton
2008-01-01
Three machine learning subpixel estimation methods (Cubist, Random Forests, and support vector regression) were applied to estimate urban cover. Urban forest canopy cover and impervious surface cover were estimated from Landsat-7 ETM+ imagery using a higher resolution cover map resampled to 30 m as training and reference data. Three different band combinations (...
Random regression models for daily feed intake in Danish Duroc pigs
DEFF Research Database (Denmark)
Strathe, Anders Bjerring; Mark, Thomas; Jensen, Just
The objective of this study was to develop random regression models and estimate covariance functions for daily feed intake (DFI) in Danish Duroc pigs. A total of 476201 DFI records were available on 6542 Duroc boars between 70 and 160 days of age. The data originated from the National test statio...
DEFF Research Database (Denmark)
Petersen, Jørgen Holm
2016-01-01
This paper describes a new approach to the estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied...
Genetic analysis of tolerance to infections using random regressions: a simulation study
Kause, A.
2011-01-01
Tolerance to infections is the ability of a host to limit the impact of a given pathogen burden on host performance. This simulation study demonstrated the merit of using random regressions to estimate unbiased genetic variances for tolerance slope and its genetic correlations with other traits,
Wang, Wei; Griswold, Michael E.
2016-01-01
The random effect Tobit model is a regression model that accommodates both left- and/or right-censoring and within-cluster dependence of the outcome variable. Regression coefficients of random effect Tobit models have conditional interpretations on a constructed latent dependent variable and do not provide inference of overall exposure effects on the original outcome scale. Marginalized random effects model (MREM) permits likelihood-based estimation of marginal mean parameters for the clustered data. For random effect Tobit models, we extend the MREM to marginalize over both the random effects and the normal space and boundary components of the censored response to estimate overall exposure effects at population level. We also extend the ‘Average Predicted Value’ method to estimate the model-predicted marginal means for each person under different exposure status in a designated reference group by integrating over the random effects and then use the calculated difference to assess the overall exposure effect. The maximum likelihood estimation is proposed utilizing a quasi-Newton optimization algorithm with Gauss-Hermite quadrature to approximate the integration of the random effects. We use these methods to carefully analyze two real datasets. PMID:27449636
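The Gauss-Hermite step mentioned in this abstract, approximating an integral over a normally distributed random effect, can be sketched as follows. This is a minimal illustration of the quadrature rule only, assuming a single random intercept b ~ N(0, sigma^2); the function name and the logistic example are hypothetical, and the Tobit likelihood itself is not implemented.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def marginal_mean(g, sigma, n_nodes=20):
    """Approximate E[g(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature."""
    nodes, weights = hermgauss(n_nodes)      # rule for the weight function exp(-x^2)
    b = np.sqrt(2.0) * sigma * nodes         # change of variable b = sqrt(2) * sigma * x
    return float(np.sum(weights * g(b)) / np.sqrt(np.pi))

# Example: population-averaged probability from a random-intercept logistic model
# with a conditional linear predictor of 0.5 (both values are illustrative).
p_marginal = marginal_mean(lambda b: 1.0 / (1.0 + np.exp(-(0.5 + b))), sigma=1.0)
```

More nodes increase accuracy at a higher cost per likelihood evaluation, which is the trade-off a quasi-Newton optimizer would face at every iteration.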
Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.
Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A
2016-01-01
Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.
The limiting behavior of the estimated parameters in a misspecified random field regression model
DEFF Research Database (Denmark)
Dahl, Christian Møller; Qin, Yu
This paper examines the limiting properties of the estimated parameters in the random field regression model recently proposed by Hamilton (Econometrica, 2001). Though the model is parametric, it enjoys the flexibility of the nonparametric approach since it can approximate a large collection......, as a consequence the random field model specification introduces non-stationarity and non-ergodicity in the misspecified model and it becomes non-trivial, relative to the existing literature, to establish the limiting behavior of the estimated parameters. The asymptotic results are obtained by applying some...
Zheng, Shimin; Rao, Uma; Bartolucci, Alfred A.; Singh, Karan P.
2011-01-01
Bartolucci et al. (2003) extended the distribution assumption from the normal (Lyles et al., 2000) to the elliptical contoured distribution (ECD) for random regression models used in analysis of longitudinal data accounting for both undetectable values and informative drop-outs. In this paper, the random regression models are constructed on the multivariate skew ECD. A real data set is used to illustrate that the skew ECDs can fit some unimodal continuous data better than the Gaussian distributions or more general continuous symmetric distributions when the symmetric distribution assumption is violated. Also, a simulation study is done for illustrating the model fitness from a variety of skew ECDs. The software we used is SAS/STAT, V. 9.13. PMID:21637734
Suzuki, Kohta; Kondo, Naoki; Sato, Miri; Tanaka, Taichiro; Ando, Daisuke; Yamagata, Zentaro
2012-01-01
Background Although maternal smoking during pregnancy has been reported to have an effect on childhood overweight/obesity, the impact of maternal smoking on the trajectory of the body mass of their offspring is not very clear. Previously, we investigated this effect by using a fixed-effect model. However, this analysis was limited because it rounded and categorized the age of the children. Therefore, we used a random-effects hierarchical linear regression model in the present study. Methods T...
Weight evaluation of Tabapuã cattle raised in northeastern Brazil using random-regression models
Directory of Open Access Journals (Sweden)
M.R. Oliveira
Full Text Available ABSTRACT The objective of this study is to compare random-regression models used to describe changes in evaluation parameters for growth in Tabapuã cattle raised in the Northeast of Brazil. The M4532-5 random-regression model was found to be best for estimating the variation and heritability of growth characteristics in the animals evaluated. Estimates of direct additive genetic variance increased with age, while the maternal additive genetic variance increased from birth up to nearly 420 days of age. The genetic correlations between the first four characteristics were positive, with moderate to large magnitudes. The greatest genetic correlation was observed between birth weight and weight at 240 days of age (0.82). The phenotypic correlation between birth weight and other characteristics was low. The M4532-5 random-regression model with 39 parameters was found to be best for describing the growth curve of the animals evaluated, providing improved selection for heavier animals when performed after weaning. The interpretation of genetic parameters to predict the growth curve of cattle may allow the selection of animals to accelerate slaughter procedures.
Directory of Open Access Journals (Sweden)
Jaime Araújo Cobuci
2011-03-01
Full Text Available Records of test-day milk yields of the first three lactations of 25,500 Holstein cows were used to estimate genetic parameters for milk yield by using two alternative definitions of the fixed regression in random regression models (RRM). Legendre polynomials of fourth and fifth orders were used to model the fixed regression curves (defined based on the average test-day milk yields of the population or of multiple sub-populations formed by grouping animals that calved at the same age and in the same season of the year) and the random lactation curves (additive genetic and permanent environment). Akaike information criterion (AIC) and Bayesian information criterion (BIC) indicated that the models which used multiple fixed lactation curves had the best fit for the first lactation test-day milk yields and the models which used a single fixed regression curve had the best fit for the second and third lactations. Heritability estimates for milk yield during lactation did not vary among models but ranged from 0.22 to 0.34, from 0.11 to 0.21, and from 0.10 to 0.20, respectively, in the first three lactations. Similarly to heritability, estimates of genetic correlations did not vary among models. The use of single or multiple fixed regressions for fixed lactation curves by RRM does not influence the estimates of genetic parameters for test-day milk yield across lactations.
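The Legendre polynomial covariates used in random regression test-day models such as this one can be built as in the following minimal sketch. It assumes a lactation range of 5 to 305 days in milk and the normalization sqrt((2j + 1) / 2) that is common in this literature; neither is stated in the abstract. Note also that "order" is used in some papers for the number of coefficients and in others for the polynomial degree, so the `n_coef` argument here is an explicit count of coefficients.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(dim, dim_min, dim_max, n_coef):
    """Normalized Legendre covariates for days in milk (one row per record)."""
    dim = np.asarray(dim, dtype=float)
    # Standardize days in milk to the interval [-1, 1].
    t = -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)
    # Columns are P_0(t), ..., P_{n_coef-1}(t).
    basis = legendre.legvander(t, n_coef - 1)
    # Normalize each polynomial (assumed convention, see lead-in).
    norm = np.sqrt((2 * np.arange(n_coef) + 1) / 2.0)
    return basis * norm

# Covariates for three hypothetical test days with four regression coefficients.
Z = legendre_covariates([5, 155, 305], dim_min=5, dim_max=305, n_coef=4)
```

Each row of `Z` would multiply an animal's vector of random regression coefficients to give its deviation from the fixed lactation curve on that test day.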
Bowden, Jack; Davey Smith, George; Burgess, Stephen
2015-04-01
The number of Mendelian randomization analyses including large numbers of genetic variants is rapidly increasing. This is due to the proliferation of genome-wide association studies, and the desire to obtain more precise estimates of causal effects. However, some genetic variants may not be valid instrumental variables, in particular due to them having more than one proximal phenotypic correlate (pleiotropy). We view Mendelian randomization with multiple instruments as a meta-analysis, and show that bias caused by pleiotropy can be regarded as analogous to small study bias. Causal estimates using each instrument can be displayed visually by a funnel plot to assess potential asymmetry. Egger regression, a tool to detect small study bias in meta-analysis, can be adapted to test for bias from pleiotropy, and the slope coefficient from Egger regression provides an estimate of the causal effect. Under the assumption that the association of each genetic variant with the exposure is independent of the pleiotropic effect of the variant (not via the exposure), Egger's test gives a valid test of the null causal hypothesis and a consistent causal effect estimate even when all the genetic variants are invalid instrumental variables. We illustrate the use of this approach by re-analysing two published Mendelian randomization studies of the causal effect of height on lung function, and the causal effect of blood pressure on coronary artery disease risk. The conservative nature of this approach is illustrated with these examples. An adaption of Egger regression (which we call MR-Egger) can detect some violations of the standard instrumental variable assumptions, and provide an effect estimate which is not subject to these violations. The approach provides a sensitivity analysis for the robustness of the findings from a Mendelian randomization investigation. © The Author 2015; Published by Oxford University Press on behalf of the International Epidemiological Association.
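The regression described in this abstract can be sketched in a few lines: variant-outcome associations are regressed on variant-exposure associations with an intercept, using inverse-variance weights, so that the slope estimates the causal effect and the intercept captures average directional pleiotropy. The function names, the orientation step, and the noise-free toy data are assumptions for illustration; standard errors and tests are omitted.

```python
import numpy as np

def mr_egger(beta_x, beta_y, se_y):
    """Weighted regression of beta_y on beta_x with an intercept (MR-Egger sketch)."""
    beta_x, beta_y, se_y = (np.asarray(a, dtype=float) for a in (beta_x, beta_y, se_y))
    # Orient every variant so that its association with the exposure is positive.
    sign = np.where(beta_x < 0, -1.0, 1.0)
    bx, by = beta_x * sign, beta_y * sign
    w = 1.0 / se_y**2
    X = np.column_stack([np.ones_like(bx), bx])
    XtW = X.T * w
    intercept, slope = np.linalg.solve(XtW @ X, XtW @ by)
    return intercept, slope

def ivw(beta_x, beta_y, se_y):
    """Inverse-variance weighted estimate: the same regression forced through the origin."""
    beta_x, beta_y, se_y = (np.asarray(a, dtype=float) for a in (beta_x, beta_y, se_y))
    w = 1.0 / se_y**2
    return float(np.sum(w * beta_x * beta_y) / np.sum(w * beta_x**2))

# Toy data: true causal effect 0.5 plus a constant pleiotropic effect of 0.05.
bx = np.array([0.1, 0.2, 0.3, 0.4])
by = 0.05 + 0.5 * bx
se = np.full(4, 0.01)
egger_intercept, egger_slope = mr_egger(bx, by, se)
ivw_slope = ivw(bx, by, se)
```

In this toy setting the intercept absorbs the pleiotropic offset, whereas the estimate forced through the origin is pulled away from the true slope.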
Ryu, Duchwan
2010-09-28
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.
Random-effects regression analysis of correlated grouped-time survival data.
Hedeker, D; Siddiqui, O; Hu, F B
2000-04-01
Random-effects regression modelling is proposed for analysis of correlated grouped-time survival data. Two analysis approaches are considered. The first treats survival time as an ordinal outcome, which is either right-censored or not. The second approach treats survival time as a set of dichotomous indicators of whether the event occurred for time periods up to the period of the event or censor. For either approach both proportional hazards and proportional odds versions of the random-effects model are developed, while partial proportional hazards and odds generalizations are described for the latter approach. For estimation, a full-information maximum marginal likelihood solution is implemented using numerical quadrature to integrate over the distribution of multiple random effects. The quadrature solution allows some flexibility in the choice of distributions for the random effects; both normal and rectangular distributions are considered in this article. An analysis of a dataset where students are clustered within schools is used to illustrate features of random-effects analysis of clustered grouped-time survival data.
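The second approach in this abstract, treating survival time as a set of dichotomous indicators, amounts to expanding each subject into one row per time period at risk. A minimal sketch of that expansion follows; the function name and the tuple layout are hypothetical, and the random-effects model fitted to the expanded data is not shown.

```python
def person_period(subject_id, last_period, event):
    """Expand one subject into (subject, period, indicator) rows.

    last_period is the period of the event or of censoring; the indicator is 1
    only in the final period of a subject who experienced the event.
    """
    rows = []
    for period in range(1, last_period + 1):
        indicator = 1 if (event and period == last_period) else 0
        rows.append((subject_id, period, indicator))
    return rows

# Subject A has the event in period 3; subject B is censored after period 2.
rows_a = person_period("A", 3, True)
rows_b = person_period("B", 2, False)
```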
DEFF Research Database (Denmark)
Shirali, Mahmoud; Nielsen, Vivi Hunnicke; Møller, Steen Henrik
at the end compared to the early growing period suggesting that heterogeneous residual variance should be considered for analyzing feed efficiency data in mink. This study suggests random regression methods are suitable for analyzing feed efficiency and that genetic selection for RFI in mink is promising.......Heritability of residual feed intake (RFI) increased from low to high over the growing period in male and female mink. The lowest heritability for RFI (male: 0.04 ± 0.01 standard deviation (SD); female: 0.05 ± 0.01 SD) was in early and the highest heritability (male: 0.33 ± 0.02; female: 0.34 ± 0...
Proceedings of the 17th Dutch Testing Day: Testing Evolvability
Stoelinga, Mariëlle Ida Antoinette; Timmer, Mark
These are the postproceedings of the 17th Dutch Testing Day, held on the 29th of November 2011 at the - recently completely renovated - campus of the University of Twente. These postproceedings cover a selection of the material presented during the Dutch Testing Day. The synergy between academic and
Liu, Xian; Engel, Charles C
2012-12-20
Researchers often encounter longitudinal health data characterized with three or more ordinal or nominal categories. Random-effects multinomial logit models are generally applied to account for potential lack of independence inherent in such clustered data. When parameter estimates are used to describe longitudinal processes, however, random effects, both between and within individuals, need to be retransformed for correctly predicting outcome probabilities. This study attempts to go beyond existing work by developing a retransformation method that derives longitudinal growth trajectories of unbiased health probabilities. We estimated variances of the predicted probabilities by using the delta method. Additionally, we transformed the covariates' regression coefficients on the multinomial logit function, not substantively meaningful, to the conditional effects on the predicted probabilities. The empirical illustration uses the longitudinal data from the Asset and Health Dynamics among the Oldest Old. Our analysis compared three sets of the predicted probabilities of three health states at six time points, obtained from, respectively, the retransformation method, the best linear unbiased prediction, and the fixed-effects approach. The results demonstrate that neglect of retransforming random errors in the random-effects multinomial logit model results in severely biased longitudinal trajectories of health probabilities as well as overestimated effects of covariates on the probabilities. Copyright © 2012 John Wiley & Sons, Ltd.
Eisavi, Vahid; Homayouni, Saeid
2016-10-01
Information on land use and land cover changes is considered as a foremost requirement for monitoring environmental change. Developing change detection methodology in the remote sensing community is an active research topic. However, to the best of our knowledge, no research has been conducted so far on the application of random forest regression (RFR) and support vector regression (SVR) for natural hazard change detection from high-resolution optical remote sensing observations. Hence, the objective of this study is to examine the use of RFR and SVR to discriminate between changed and unchanged areas after a tsunami. For this study, RFR and SVR were applied to two different pilot coastlines in Indonesia and Japan. Two different remotely sensed data sets acquired by Quickbird and Ikonos sensors were used for efficient evaluation of the proposed methodology. The results demonstrated better performance of SVM compared to random forest (RF) with an overall accuracy higher by 3% to 4% and kappa coefficient by 0.05 to 0.07. Using McNemar's test, statistically significant differences (Z≥1.96), at the 5% significance level, between the confusion matrices of the RF classifier and the support vector classifier were observed in both study areas. The high accuracy of change detection obtained in this study confirms that these methods have the potential to be used for detecting changes due to natural hazards.
Suzuki, Kohta; Kondo, Naoki; Sato, Miri; Tanaka, Taichiro; Ando, Daisuke; Yamagata, Zentaro
2012-01-01
Although maternal smoking during pregnancy has been reported to have an effect on childhood overweight/obesity, the impact of maternal smoking on the trajectory of the body mass of their offspring is not very clear. Previously, we investigated this effect by using a fixed-effect model. However, this analysis was limited because it rounded and categorized the age of the children. Therefore, we used a random-effects hierarchical linear regression model in the present study. The study population comprised children born between 1 April 1991 and 31 March 1999 in Koshu City, Japan and their mothers. Maternal smoking during early pregnancy was the exposure studied. The body mass index (BMI) z-score trajectory of children born to smoking and non-smoking mothers, by gender, was used as the outcome. We modeled BMI trajectory using a 2-level random intercept and slope regression. The participating mothers delivered 1619 babies during the study period. For male children, there was very strong evidence that the effect of age in months on the increase in BMI z-score was enhanced by maternal smoking during pregnancy (P smoking during pregnancy (P = 0.054), which suggests that the effect of maternal smoking during pregnancy on the early-life BMI trajectory of offspring differed by gender. These results may be valuable for exploring the mechanism of fetal programming and might therefore be clinically important.
Technology diffusion in hospitals: a log odds random effects regression model.
Blank, Jos L T; Valdmanis, Vivian G
2015-01-01
This study identifies the factors that affect the diffusion of hospital innovations. We apply a log odds random effects regression model on hospital micro data. We introduce the concept of clustering innovations and the application of a log odds random effects regression model to describe the diffusion of technologies. We distinguish a number of determinants, such as service, physician, and environmental, financial and organizational characteristics of the 60 Dutch hospitals in our sample. On the basis of this data set on Dutch general hospitals over the period 1995-2002, we conclude that there is a relation between a number of determinants and the diffusion of innovations underlining conclusions from earlier research. Positive effects were found on the basis of the size of the hospitals, competition and a hospital's commitment to innovation. It appears that if a policy is developed to further diffuse innovations, the external effects of demand and market competition need to be examined, which would de facto lead to an efficient use of technology. For the individual hospital, instituting an innovations office appears to be the most prudent course of action. © 2013 The Authors. International Journal of Health Planning and Management published by John Wiley & Sons, Ltd.
Ricci, Claudio; Casadei, Riccardo; Taffurelli, Giovanni; Pacilio, Carlo Alberto; Beltrami, Denis; Minni, Francesco
The aims were to compare the clinically relevant postoperative pancreatic fistula (POPF) rate between pancreatogastrostomy (PG) and pancreaticojejunostomy (PJ) after pancreaticoduodenectomy (PD), and to evaluate the confounding factors affecting the meta-analytic results. A systematic literature search was carried out for randomized clinical trials (RCTs) comparing PG to PJ using the International Study Group of Pancreatic Fistula (ISGPF) definition of POPF. Risk difference (RD) and number needed to treat or harm (NNT and NNH) were used. Fixed and random-effect models were applied. The impact of confounding covariates on the meta-analytic results was evaluated using meta-regression analysis, reporting β coefficient ± standard error (SE). Seven RCTs were identified involving 1184 patients: 603 PG and 581 PJ. In the fixed model, the RD of clinically relevant POPFs suggested that PG was superior to PJ (RD = -0.07; 95% CI: -0.11 to -0.03) with an NNT of 14 (95% CI: 9 to 33). In the random model, PG was not superior to PJ (RD = -0.06; 95% CI: -0.13 to 0.01) with an NNT of 17 and a possibility of harm in some cases (NNH = 100). Meta-regression suggested that an increase in the proportion of "soft pancreas" in the PG arm corresponded to a more positive value of RD (β = 0.47 ± 0.19; P value: 0.045 ± 0.003). PG could be slightly superior to PJ in the prevention of clinically relevant POPF. The presence of a high-risk pancreatic remnant remains the main limitation of PG. Copyright © 2017. Published by Elsevier B.V.
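The NNT and NNH figures in this abstract are consistent with the usual reciprocal relationship between risk difference and number needed to treat. The following is a minimal sketch of that arithmetic, not the authors' code; the function name `nnt_from_rd` is hypothetical, and rounding to the nearest integer is an assumption chosen because it reproduces the reported values (some texts round NNT up instead).

```python
import math


def nnt_from_rd(rd):
    """Number needed to treat (or harm) as 1/|RD|, rounded to the nearest integer.

    A negative RD (fewer events with the treatment) is read as an NNT,
    a positive RD as an NNH. An RD of zero gives an infinite value.
    """
    if rd == 0:
        return math.inf
    return round(1.0 / abs(rd))


# Fixed-effect model: RD = -0.07 (95% CI -0.11 to -0.03)
print(nnt_from_rd(-0.07), nnt_from_rd(-0.11), nnt_from_rd(-0.03))

# Random-effects model: RD = -0.06, upper CI bound +0.01 (possible harm)
print(nnt_from_rd(-0.06), nnt_from_rd(0.01))
```

Because the random-effects confidence interval crosses zero, its bounds map to an NNT on one side and an NNH on the other, which is why the abstract reports a possibility of harm.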
Comparison between the Lactation Model and the Test-Day Model ...
African Journals Online (AJOL)
ARC-IRENE
Genetic Evaluation, using a Lactation Model (LM). The other set was obtained in the 2004 South African. National Genetic Evaluation, using a Fixed Regression Test-day Model (TDM). This comparison is made for. Ayrshire, Guernsey, Holstein and Jersey cows participating in the South African Dairy Animal Improvement.
ESTIMATION OF GENETIC PARAMETERS IN TROPICARNE CATTLE WITH RANDOM REGRESSION MODELS USING B-SPLINES
Directory of Open Access Journals (Sweden)
Joel Domínguez Viveros
2015-04-01
The objectives were to estimate variance components and direct (h2) and maternal (m2) heritability for growth in Tropicarne cattle, based on a random regression model using B-splines for random effects modeling. Information from 12,890 monthly weighings of 1787 calves, from birth to 24 months old, was analyzed. The pedigree included 2504 animals. The random effects model included genetic and permanent environmental effects (direct and maternal) of cubic order, and residuals. The fixed effects included contemporary groups (year–season of weighing), sex and the covariate age of the cow (linear and quadratic). The B-splines were defined with four knots across the growth period analyzed. Analyses were performed with the software Wombat. The phenotypic and residual variances behaved similarly: they showed a negative trend from 7 to 12 months of age, a positive trend from birth to 6 months and from 13 to 18 months, and remained constant after 19 months. The m2 estimates were low and near zero, with an average of 0.06 in an interval of 0.04 to 0.11; the h2 estimates were also close to zero, with an average of 0.10 in an interval of 0.03 to 0.23.
A review of R-packages for random-intercept probit regression in small clusters
Directory of Open Access Journals (Sweden)
Haeike Josephy
2016-10-01
Generalized Linear Mixed Models (GLMMs) are widely used to model clustered categorical outcomes. To tackle the intractable integration over the random effects distributions, several approximation approaches have been developed for likelihood-based inference. As these seldom yield satisfactory results when analyzing binary outcomes from small clusters, estimation within the Structural Equation Modeling (SEM) framework is proposed as an alternative. We compare the performance of R-packages for random-intercept probit regression relying on: the Laplace approximation, adaptive Gaussian quadrature (AGQ), penalized quasi-likelihood, an MCMC-implementation, and integrated nested Laplace approximation within the GLMM-framework, and a robust diagonally weighted least squares estimation within the SEM-framework. In terms of bias for the fixed and random effect estimators, SEM usually performs best for cluster size two, while AGQ prevails in terms of precision (mainly because of SEM's robust standard errors). As the cluster size increases, however, AGQ becomes the best choice for both bias and precision.
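To make the "intractable integration" concrete, the sketch below approximates the marginal likelihood of one cluster in a random-intercept probit model with ordinary (non-adaptive) Gauss-Hermite quadrature. It is an illustration under stated assumptions, not any of the R packages reviewed: the function names, the linear predictor values and the number of nodes are hypothetical, and the adaptive variant (AGQ) discussed in the abstract would additionally recentre and rescale the nodes for each cluster.

```python
import math

import numpy as np
from numpy.polynomial.hermite import hermgauss

_erf = np.vectorize(math.erf)


def norm_cdf(z):
    """Standard normal CDF, evaluated elementwise."""
    return 0.5 * (1.0 + _erf(np.asarray(z, dtype=float) / math.sqrt(2.0)))


def cluster_marginal_likelihood(y, eta, sigma, n_nodes=30):
    """Integrate the random intercept b ~ N(0, sigma^2) out of one cluster.

    y     : binary outcomes (0/1) of the cluster
    eta   : fixed-effect linear predictors, one per outcome
    sigma : standard deviation of the random intercept
    """
    y = np.asarray(y, dtype=float)
    eta = np.asarray(eta, dtype=float)
    nodes, weights = hermgauss(n_nodes)        # rule for the weight exp(-x^2)
    b = math.sqrt(2.0) * sigma * nodes         # change of variable b = sqrt(2)*sigma*x
    p = norm_cdf(eta[:, None] + b[None, :])    # P(y=1 | b), shape (n_obs, n_nodes)
    conditional = np.prod(np.where(y[:, None] == 1, p, 1.0 - p), axis=0)
    return float(np.sum(weights * conditional) / math.sqrt(math.pi))


# A cluster of size two with hypothetical linear predictors
print(cluster_marginal_likelihood([1, 0], [0.3, -0.2], sigma=1.0))
```

For a single observation the integral has the closed form Φ(η / sqrt(1 + σ²)), which gives a convenient check on the quadrature.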
Directory of Open Access Journals (Sweden)
Giselle Mariano Lessa de Assis
2006-06-01
Random regression models were used in this study to estimate genetic parameters for test-day milk yield (PLDC) in Alpine dairy goats, using Bayesian methods with Gibbs sampling. The estimates were compared with those obtained by random regression analysis using REML. Heritability estimates obtained by Bayesian analysis ranged from 0.18 to 0.37, while those obtained by REML ranged from 0.09 to 0.32. Genetic correlations between close test days approached unity and decreased gradually as the interval between test days increased. The results indicate that: the covariance structure of PLDC in goats over the lactation can be modeled adequately by random regression; predicting genetic gains and selecting genetically superior animals is feasible along the whole lactation trajectory; and the results of the random regression analyses using Gibbs sampling and REML were similar, although the estimates of genetic variances and heritabilities were slightly higher in the Bayesian analysis using Gibbs sampling.
Zhang, Zhiwei; Cheon, Kyeongmi
2017-04-01
A common problem in randomized clinical trials is nonignorable missingness, namely that the clinical outcome(s) of interest can be missing in a way that is not fully explained by the observed quantities. This happens when the continued participation of patients depends on the current outcome after adjusting for the observed history. Standard methods for handling nonignorable missingness typically require specification of the response mechanism, which can be difficult in practice. This article proposes a reverse regression approach that does not require a model for the response mechanism. Instead, the proposed approach relies on the assumption that missingness is independent of treatment assignment upon conditioning on the relevant outcome(s). This conditional independence assumption is motivated by the observation that, when patients are effectively masked to the assigned treatment, their decision to either stay in the trial or drop out cannot depend on the assigned treatment directly. Under this assumption, one can estimate parameters in the reverse regression model, test for the presence of a treatment effect, and in some cases estimate the outcome distributions. The methodology can be extended to longitudinal outcomes under natural conditions. The proposed approach is illustrated with real data from a cardiovascular study.
Tchetgen Tchetgen, Eric J; Wirth, Kathleen E
2017-02-23
The instrumental variable (IV) design is a well-known approach for unbiased evaluation of causal effects in the presence of unobserved confounding. In this article, we study the IV approach to account for selection bias in regression analysis with outcome missing not at random. In such a setting, a valid IV is a variable which (i) predicts the nonresponse process, and (ii) is independent of the outcome in the underlying population. We show that under the additional assumption (iii) that the IV is independent of the magnitude of selection bias due to nonresponse, the population regression in view is nonparametrically identified. For point estimation under (i)-(iii), we propose a simple complete-case analysis which modifies the regression of primary interest by carefully incorporating the IV to account for selection bias. The approach is developed for the identity, log and logit link functions. For inferences about the marginal mean of a binary outcome assuming (i) and (ii) only, we describe novel and approximately sharp bounds which, unlike Robins-Manski bounds, are smooth in model parameters, therefore allowing for a straightforward approach to account for uncertainty due to sampling variability. These bounds provide a more honest account of uncertainty and allow one to assess the extent to which a violation of the key identifying condition (iii) might affect inferences. For illustration, the methods are used to account for selection bias induced by HIV testing nonparticipation in the evaluation of HIV prevalence in the Zambian Demographic and Health Surveys. © 2017, The International Biometric Society.
Genetic parameters for test day somatic cell score in Brazilian Holstein cattle.
Costa, C N; Santos, G G; Cobuci, J A; Thompson, G; Carvalheira, J G V
2015-12-29
Selection for lower somatic cell count has been included in the breeding objectives of several countries in order to increase resistance to mastitis. Genetic parameters of somatic cell scores (SCS) were estimated from the first lactation test day records of Brazilian Holstein cows using random-regression models with Legendre polynomials (LP) of the order 3-5. Data consisted of 87,711 TD produced by 10,084 cows, sired by 619 bulls calved from 1993 to 2007. Heritability estimates varied from 0.06 to 0.14 and decreased from the beginning of the lactation up to 60 days in milk (DIM) and increased thereafter to the end of lactation. Genetic correlations between adjacent DIM were very high (>0.83) but decreased to negative values, obtained with LP of order four, between DIM in the extremes of lactation. Despite the favorable trend, genetic changes in SCS were not significant and did not differ among LP. There was little benefit of fitting an LP of an order >3 to model animal genetic and permanent environment effects for SCS. Estimates of variance components found in this study may be used for breeding value estimation for SCS and selection for mastitis resistance in Holstein cattle in Brazil.
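Random regression models of this kind regress each animal's records on Legendre polynomials evaluated at standardized days in milk. The sketch below builds those covariates and evaluates a covariance function from them; it is illustrative only. The DIM range of 5 to 305, the function names and the choice of three coefficients are assumptions, and "order" is taken here to mean the number of coefficients (polynomial degree plus one), a convention that varies between papers.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_covariates(dim, dim_min=5, dim_max=305, n_coef=3):
    """Normalized Legendre covariates for days in milk (DIM).

    DIM is first mapped to [-1, 1]; column k then holds
    sqrt((2k + 1) / 2) * P_k(x), the normalization commonly used
    in random regression test-day models.
    """
    dim = np.asarray(dim, dtype=float)
    x = 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0
    phi = legendre.legvander(x, n_coef - 1)    # columns P_0(x) .. P_{n_coef-1}(x)
    k = np.arange(n_coef)
    return phi * np.sqrt((2 * k + 1) / 2.0)


def covariance_function(phi, k_matrix):
    """(Co)variances between all pairs of DIM implied by the coefficient
    covariance matrix k_matrix: Phi K Phi'."""
    return phi @ np.asarray(k_matrix, dtype=float) @ phi.T


phi = legendre_covariates([5, 155, 305])
print(np.round(phi, 4))
```

Genetic correlations between test days, such as those reported above, would follow by scaling the covariance function by the square roots of its diagonal.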
Sasaki, O; Aihara, M; Nishiura, A; Takeda, H; Satoh, M
2015-08-01
Longevity is a crucial economic trait in the dairy farming industry. In this study, our objective was to develop a random regression model for genetic evaluation of survival. For the analysis, we used test-day records obtained for the first 5 lactations of 380,252 cows from 1,296 herds in Japan between 2001 and 2010; this data set was randomly divided into 7 subsets. The cumulative pseudo-survival rate (PSR) was determined according to whether a cow was alive (1) or absent (0) in her herd on the test day within each lactation group. Each lactation number was treated as an independent trait in a random regression multiple-trait model (MTM) or as a repeated measure in a random regression single-trait repeatability model (STRM). A proportional hazard model (PHM) was also developed as a piecewise-hazards model. The average (± standard deviation) heritability estimates of the PSR at 365 d in milk (DIM) among the 7 data sets in the first (LG1), second (LG2), and third to fifth lactations (LG3) of the MTM were 0.042±0.007, 0.070±0.012, and 0.084±0.007, respectively. The heritability estimate of the STRM was 0.038±0.004. The genetic correlations of PSR between distinct DIM within or between lactation groups were high when the interval between DIM was short. These results indicated that whereas the genetic factors contributing to the PSR between closely associated DIM would be similar even for different lactation numbers, the genetic factors contributing to PSR would differ between distinct lactation periods. The average (± standard deviation) effective heritability estimate based on the relative risk of the PHM among the 7 data sets was 0.068±0.009. The estimated breeding values (EBV) in LG1, LG2, LG3, the STRM, and the PHM were unbiased estimates of the genetic trend. The absolute values of the Spearman's rank correlation coefficients between the EBV of the relative risk of the PHM and the EBV of PSR at 365 DIM for LG1, LG2, LG3, and the STRM were 0.75, 0.87, 0
Lin, Yi Hung; Tu, Yu Kang; Lu, Chun Tai; Chung, Wen Chen; Huang, Chiung Fang; Huang, Mao Suan; Lu, Hsein Kun
2014-01-01
Repigmentation variably occurs with different treatment methods in patients with gingival pigmentation. A systematic review was conducted of various treatment modalities for eliminating melanin pigmentation of the gingiva, comprising bur abrasion, scalpel surgery, cryosurgery, electrosurgery, gingival grafts, and laser techniques, to compare the recurrence rates (Rrs) of these treatment procedures. Electronic databases, including PubMed, Web of Science, Google, and Medline, were comprehensively searched, and manual searches were conducted for studies published from January 1951 to June 2013. After applying inclusion and exclusion criteria, the final list of articles was reviewed in depth to achieve the objectives of this review. A Poisson regression was used to analyze the outcome of depigmentation using the various treatment methods. The systematic review was based mainly on case reports. In total, 61 eligible publications met the defined criteria. The various therapeutic procedures showed variable clinical results with a wide range of Rrs. A random-effects Poisson regression showed that cryosurgery (Rr = 0.32%), electrosurgery (Rr = 0.74%), and laser depigmentation (Rr = 1.16%) yielded superior results, whereas bur abrasion yielded the highest Rr (8.89%). Within the limit of the sampling level, the present evidence-based results show that cryosurgery exhibits the optimal predictability for depigmentation of the gingiva among all procedures examined, followed by electrosurgery and laser techniques. It is possible to treat melanin pigmentation of the gingiva with various methods and prevent repigmentation. Among those treatment modalities, cryosurgery, electrosurgery, and laser surgery appear to be the best choices for treating gingival pigmentation. © 2014 Wiley Periodicals, Inc.
Multi-fidelity Gaussian process regression for prediction of random fields
Energy Technology Data Exchange (ETDEWEB)
Parussini, L. [Department of Engineering and Architecture, University of Trieste (Italy); Venturi, D., E-mail: venturi@ucsc.edu [Department of Applied Mathematics and Statistics, University of California Santa Cruz (United States); Perdikaris, P. [Department of Mechanical Engineering, Massachusetts Institute of Technology (United States); Karniadakis, G.E. [Division of Applied Mathematics, Brown University (United States)
2017-05-01
We propose a new multi-fidelity Gaussian process regression (GPR) approach for prediction of random fields based on observations of surrogate models or hierarchies of surrogate models. Our method builds upon recent work on recursive Bayesian techniques, in particular recursive co-kriging, and extends it to vector-valued fields and various types of covariances, including separable and non-separable ones. The framework we propose is general and can be used to perform uncertainty propagation and quantification in model-based simulations, multi-fidelity data fusion, and surrogate-based optimization. We demonstrate the effectiveness of the proposed recursive GPR techniques through various examples. Specifically, we study the stochastic Burgers equation and the stochastic Oberbeck–Boussinesq equations describing natural convection within a square enclosure. In both cases we find that the standard deviation of the Gaussian predictors as well as the absolute errors relative to benchmark stochastic solutions are very small, suggesting that the proposed multi-fidelity GPR approaches can yield highly accurate results.
2013-01-01
Objectives. Global perceptions of stress (GPS) have major implications for mental and physical health, and stress in midlife may influence adaptation in later life. Thus, it is important to determine the unique and interactive effects of diverse influences of role stress (at work or in personal relationships), loneliness, life events, time pressure, caregiving, finances, discrimination, and neighborhood circumstances on these GPS. Method. Exploratory regression trees and random forests were used to examine complex interactions among myriad events and chronic stressors in middle-aged participants’ (N = 410; mean age = 52.12) GPS. Results. Different role and domain stressors were influential at high and low levels of loneliness. Varied combinations of these stressors resulting in similar levels of perceived stress are also outlined as examples of equifinality. Loneliness emerged as an important predictor across trees. Discussion. Exploring multiple stressors simultaneously provides insights into the diversity of stressor combinations across individuals—even those with similar levels of global perceived stress—and answers theoretical mandates to better understand the influence of stress by sampling from many domain and role stressors. Further, the unique influences of each predictor relative to the others inform theory and applied work. Finally, examples of equifinality and multifinality call for targeted interventions. PMID:23341437
Box-Cox Transformation and Random Regression Models for Fecal egg Count Data.
da Silva, Marcos Vinícius Gualberto Barbosa; Van Tassell, Curtis P; Sonstegard, Tad S; Cobuci, Jaime Araujo; Gasbarre, Louis C
2011-01-01
Accurate genetic evaluation of livestock is based on appropriate modeling of phenotypic measurements. In ruminants, fecal egg count (FEC) is commonly used to measure resistance to nematodes. FEC values are not normally distributed and logarithmic transformations have been used in an effort to achieve normality before analysis. However, the transformed data are often still not normally distributed, especially when data are extremely skewed. A series of repeated FEC measurements may provide information about the population dynamics of a group or individual. A total of 6375 FEC measures were obtained for 410 animals between 1992 and 2003 from the Beltsville Agricultural Research Center Angus herd. Original data were transformed using an extension of the Box-Cox transformation to approach normality and to estimate (co)variance components. We also proposed using random regression models (RRM) for genetic and non-genetic studies of FEC. Phenotypes were analyzed using RRM and restricted maximum likelihood. Within the different orders of Legendre polynomials used, those with more parameters (order 4) adjusted FEC data best. Results indicated that the transformation of FEC data utilizing the Box-Cox transformation family was effective in reducing the skewness and kurtosis, and dramatically increased estimates of heritability, and measurements of FEC obtained in the period between 12 and 26 weeks in a 26-week experimental challenge period are genetically correlated.
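As a rough illustration of why such a transformation helps with skewed counts, the sketch below applies the standard one-parameter Box-Cox transformation and compares sample skewness before and after. It is not the extended transformation used in the study; the function names and the toy values are hypothetical, and real FEC data containing zeros would first need a positive offset.

```python
import math


def box_cox(y, lam):
    """Standard one-parameter Box-Cox transform of a positive value."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def skewness(values):
    """Population skewness: third central moment over sd cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical, strongly right-skewed counts (not data from the study)
counts = [1, 2, 3, 10, 100]
transformed = [box_cox(c, 0) for c in counts]   # lambda = 0 is the log transform

print(skewness(counts), skewness(transformed))
```

In practice the parameter lambda is usually estimated from the data, for example by maximizing a profile likelihood, rather than fixed in advance as it is here.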
BOX-COX transformation and random regression models for fecal egg count data
Directory of Open Access Journals (Sweden)
Marcos Vinicius Silva
2012-01-01
Accurate genetic evaluation of livestock is based on appropriate modeling of phenotypic measurements. In ruminants, fecal egg count (FEC) is commonly used to measure resistance to nematodes. FEC values are not normally distributed and logarithmic transformations have been used to achieve normality before analysis. However, the transformed data are often not normally distributed, especially when data are extremely skewed. A series of repeated FEC measurements may provide information about the population dynamics of a group or individual. A total of 6,375 FEC measures were obtained for 410 animals between 1992 and 2003 from the Beltsville Agricultural Research Center Angus herd. Original data were transformed using an extension of the Box-Cox transformation to approach normality and to estimate (co)variance components. We also proposed using random regression models (RRM) for genetic and non-genetic studies of FEC. Phenotypes were analyzed using RRM and restricted maximum likelihood. Within the different orders of Legendre polynomials used, those with more parameters (order 4) adjusted FEC data best. Results indicated that the transformation of FEC data utilizing the Box-Cox transformation family was effective in reducing the skewness and kurtosis, and dramatically increased estimates of heritability, and measurements of FEC obtained in the period between 12 and 26 weeks in a 26-week experimental challenge period are genetically correlated.
Microbiome Data Accurately Predicts the Postmortem Interval Using Random Forest Regression Models
Directory of Open Access Journals (Sweden)
Aeriel Belk
2018-02-01
Death investigations often include an effort to establish the postmortem interval (PMI) in cases in which the time of death is uncertain. The postmortem interval can lead to the identification of the deceased and the validation of witness statements and suspect alibis. Recent research has demonstrated that microbes provide an accurate clock that starts at death and relies on ecological change in the microbial communities that normally inhabit a body and its surrounding environment. Here, we explore how to build the most robust Random Forest regression models for prediction of PMI by testing models built on different sample types (gravesoil, skin of the torso, skin of the head), gene markers (16S ribosomal RNA (rRNA), 18S rRNA, internal transcribed spacer regions (ITS)), and taxonomic levels (sequence variants, species, genus, etc.). We also tested whether particular suites of indicator microbes were informative across different datasets. Generally, results indicate that the most accurate models for predicting PMI were built using gravesoil and skin data using the 16S rRNA genetic marker at the taxonomic level of phyla. Additionally, several phyla consistently contributed highly to model accuracy and may be candidate indicators of PMI.
Directory of Open Access Journals (Sweden)
Cláudio Vieira de Araújo
2006-06-01
Data comprising 68,523 test-day milk yield records of 8,536 Holstein cows, calving from 1996 to 2001, were used to compare random regression models for estimating variance components. Test-day records (TD) were analyzed as multiple traits, considering each TD as a different trait. The same test-day records were analyzed as longitudinal data by random regression models that differed in the function used to describe the trajectory of the lactation curve of the animals. The functions used were the Wilmink exponential function, the Ali and Schaeffer function, and Legendre polynomials of second and fourth degree. The models were compared on the following criteria: variance component estimates obtained from the multiple-trait model and from random regression; residual variance values; and values of the log-likelihood function. Heritability estimates obtained from the multiple-trait models ranged from 0.110 to 0.244. For the random regression models, these values ranged from 0.127 to 0.301, with the largest estimates in the models with the largest number of parameters. The random regression models that used Legendre polynomials described the genetic variation in milk yield best.
Manafiazar, G; McFadden, T; Goonewardene, L; Okine, E; Basarab, J; Li, P; Wang, Z
2013-01-01
Residual Feed Intake (RFI) is a measure of energy efficiency. Developing an appropriate model to predict expected energy intake while accounting for multifunctional energy requirements of metabolic body weight (MBW), empty body weight (EBW), milk production energy requirements (MPER), and their nonlinear lactation profiles, is the key to successful prediction of RFI in dairy cattle. Individual daily actual energy intake and monthly body weight of 281 first-lactation dairy cows from 1 to 305 d in milk were recorded at the Dairy Research and Technology Centre of the University of Alberta (Edmonton, AB, Canada); individual monthly milk yield and compositions were obtained from the Dairy Herd Improvement Program. Combinations of different orders (1-5) of fixed (F) and random (R) factors were fitted using Legendre polynomial regression to model the nonlinear lactation profiles of MBW, EBW, and MPER over 301 d. The F5R3, F5R3, and F5R2 (subscripts indicate the order fitted) models were selected, based on the combination of the log-likelihood ratio test and the Bayesian information criterion, as the best prediction equations for MBW, EBW, and MPER, respectively. The selected models were used to predict daily individual values for these traits. To consider the body reserve changes, the differences of predicted EBW between 2 consecutive days were considered as the EBW change between these days. The smoothed total 301-d actual energy intake was then linearly regressed on the total 301-d predicted traits of MBW, EBW change, and MPER to obtain the first-lactation RFI (coefficient of determination=0.68). The mean of predicted daily average lactation RFI was 0 and ranged from -6.58 to 8.64 Mcal of NE(L)/d. Fifty-one percent of the animals had an RFI value below the mean (efficient) and 49% of them had an RFI value above the mean (inefficient). These results indicate that the first-lactation RFI can be predicted from its component traits with a reasonable coefficient of
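The final step described above, regressing total actual energy intake on the predicted component traits and taking the residual as RFI, can be sketched as follows. This is a simplified illustration with simulated numbers, not the authors' analysis: the function name, the simulated trait values and the regression coefficients are all hypothetical, and the Legendre smoothing of the daily profiles is omitted.

```python
import numpy as np


def residual_feed_intake(intake, mbw, ebw_change, mper):
    """RFI as the residual of an ordinary least squares regression of
    actual energy intake on MBW, EBW change and MPER (with intercept)."""
    intake = np.asarray(intake, dtype=float)
    design = np.column_stack([np.ones(len(intake)), mbw, ebw_change, mper])
    beta, *_ = np.linalg.lstsq(design, intake, rcond=None)
    return intake - design @ beta


# Simulated cows; every number here is made up for illustration
rng = np.random.default_rng(0)
n = 281
mbw = rng.normal(120.0, 8.0, n)
ebw_change = rng.normal(0.0, 0.3, n)
mper = rng.normal(25.0, 4.0, n)
intake = 2.0 + 0.1 * mbw + 3.0 * ebw_change + 0.9 * mper + rng.normal(0.0, 2.0, n)

rfi = residual_feed_intake(intake, mbw, ebw_change, mper)
print(round(float(rfi.mean()), 6), float((rfi < 0).mean()))
```

Because the regression includes an intercept, the residuals average zero by construction, which matches the reported mean RFI of 0; animals with negative RFI eat less than predicted and are classed as efficient.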
Doron, J; Martinent, G
2017-09-01
Understanding more about the stress process is important for the performance of athletes during stressful situations. Grounded in Lazarus's (1991, 1999, 2000) CMRT of emotion, this study tracked longitudinally the relationships between cognitive appraisal, coping, emotions, and performance in nine elite fencers across 14 international matches (representing 619 momentary assessments) using a naturalistic, video-assisted methodology. A series of hierarchical linear modeling analyses were conducted to: (a) explore the relationships between cognitive appraisals (challenge and threat), coping strategies (task- and disengagement oriented coping), emotions (positive and negative) and objective performance; (b) ascertain whether the relationship between appraisal and emotion was mediated by coping; and (c) examine whether the relationship between appraisal and objective performance was mediated by emotion and coping. The results of the random coefficient regression models showed: (a) positive relationships between challenge appraisal, task-oriented coping, positive emotions, and performance, as well as between threat appraisal, disengagement-oriented coping and negative emotions; (b) that disengagement-oriented coping partially mediated the relationship between threat and negative emotions, whereas task-oriented coping partially mediated the relationship between challenge and positive emotions; and (c) that disengagement-oriented coping mediated the relationship between threat and performance, whereas task-oriented coping and positive emotions partially mediated the relationship between challenge and performance. As a whole, this study furthered knowledge during sport performance situations of Lazarus's (1999) claim that these psychological constructs exist within a conceptual unit. Specifically, our findings indicated that the ways these constructs are inter-related influence objective performance within competitive settings. © 2016 John Wiley & Sons A/S. Published by
NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel
2017-08-01
Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.
Directory of Open Access Journals (Sweden)
Cláudio Vieira de Araújo
2006-06-01
Data comprising 68,523 test-day milk yield records of 8,536 Holstein cows, daughters of 537 sires, distributed in 266 herds, calving from 1996 to 2001, were used to compare random regression models for estimating variance components. The random regression models differed in the degree of the Legendre polynomial used to describe the trajectory of the lactation curve of the animals; Legendre orthogonal polynomials of second, third and fourth order were used. The models included the effects of herd-month-year of test, genetic composition of the animals, daily milking frequency, a polynomial regression within each age-season of calving class to describe the fixed part of the lactation curve, and random polynomial regressions for the direct genetic and permanent environmental effects. The heritability estimates ranged from 0.122 to 0.291. According to the Akaike criterion, the random regression model that used the highest order of Legendre polynomials described the genetic variation in milk yield best.
Meer, D. van der; Hoekstra, P.J.; Donkelaar, M.M.J. van; Bralten, J.B.; Oosterlaan, J.; Heslenfeld, D.; Faraone, S.V; Franke, B.; Buitelaar, J.K.; Hartman, C.A.
2017-01-01
Identifying genetic variants contributing to attention-deficit/hyperactivity disorder (ADHD) is complicated by the involvement of numerous common genetic variants with small effects, interacting with each other as well as with environmental factors, such as stress exposure. Random forest regression
Shabani, Farzin; Kumar, Lalit; Solhjouy-fard, Samaneh
2017-08-01
The aim of this study was to compare and evaluate the capabilities of correlative and mechanistic modeling processes, applied to the projection of future distributions of date palm in novel environments, and to establish a method of minimizing uncertainty in the projections of differing techniques. The study area comprises Middle Eastern countries. We compared the mechanistic model CLIMEX (CL) with the correlative models MaxEnt (MX), Boosted Regression Trees (BRT), and Random Forests (RF) to project current and future distributions of date palm (Phoenix dactylifera L.). The Global Climate Model (GCM), the CSIRO-Mk3.0 (CS) using the A2 emissions scenario, was selected for making projections. Both indigenous and alien distribution data of the species were utilized in the modeling process. The common areas predicted by MX, BRT, RF, and CL from the CS GCM were extracted and compared to ascertain projection uncertainty levels of each individual technique. The common areas identified by all four modeling techniques were used to produce a map indicating suitable and unsuitable areas for date palm cultivation for Middle Eastern countries, for the present and the year 2100. The four different modeling approaches predict fairly different distributions. Projections from CL were more conservative than from MX. The BRT and RF were the most conservative methods in terms of projections for the current time. The combination of the final CL and MX projections for the present and 2100 provides higher certainty concerning those areas that will become highly suitable for future date palm cultivation. According to the four models, cold, hot, and wet stress, with differences on a regional basis, appear to be the major restrictions on future date palm distribution. The results demonstrate variances in the projections, resulting from different techniques. The assessment and interpretation of model projections requires reservations
Datema, Frank R; Moya, Ana; Krause, Peter; Bäck, Thomas; Willmes, Lars; Langeveld, Ton; Baatenburg de Jong, Robert J; Blom, Henk M
2012-01-01
Electronic patient files generate an enormous amount of medical data. These data can be used for research, such as prognostic modeling. Automatization of statistical prognostication processes allows automatic updating of models when new data is gathered. The increase of power behind an automated prognostic model makes its predictive capability more reliable. Cox proportional hazard regression is most frequently used in prognostication. Automatization of a Cox model is possible, but we expect the updating process to be time-consuming. A possible solution lies in an alternative modeling technique called random survival forests (RSFs). RSF is easily automated and is known to handle the proportionality assumption coherently and automatically. Performance of RSF has not yet been tested on a large head and neck oncological dataset. This study investigates performance of head and neck overall survival of RSF models. Performances are compared to a Cox model as the "gold standard." RSF might be an interesting alternative modeling approach for automatization when performances are similar. RSF models were created in R (Cox also in SPSS). Four RSF splitting rules were used: log-rank, conservation of events, log-rank score, and log-rank approximation. Models were based on historical data of 1371 patients with primary head-and-neck cancer, diagnosed between 1981 and 1998. Models contain 8 covariates: tumor site, T classification, N classification, M classification, age, sex, prior malignancies, and comorbidity. Model performances were determined by Harrell's concordance error rate, in which 33% of the original data served as a validation sample. RSF and Cox models delivered similar error rates. The Cox model performed slightly better (error rate, 0.2826). The log-rank splitting approach gave the best RSF performance (error rate, 0.2873). In accord with Cox and RSF models, high T classification, high N classification, and severe comorbidity are very important covariates in the
Directory of Open Access Journals (Sweden)
C.K.P. Dorneles
2009-04-01
A total of 21,702 test-day milk yield records from 2,429 first-lactation Holstein cows, sired by 233 bulls and collected in 33 herds in the State of Rio Grande do Sul from 1991 to 2003, were used to estimate genetic parameters for test-day milk yield. The random regression model fitted to test-day records between the 6th and the 305th day of lactation included the effect of herd-year-month of the test day, the age of the cow at calving, and the parameters of a fourth-order Legendre polynomial to model the population mean milk yield curve, together with parameters of the same polynomial to model the random additive genetic and permanent environmental effects. Genetic and permanent environmental variances for test-day milk yield ranged from 2.38 to 3.14 and from 7.55 to 10.35, respectively. Heritability estimates increased gradually from the beginning (0.14) to the end (0.20) of lactation, indicating a moderately heritable trait. Genetic correlations between milk yields at different lactation stages ranged from 0.33 to 0.99 and were higher between adjacent test days. Permanent environmental correlations followed the same trend as the genetic correlations. The random regression model with a fourth-order Legendre polynomial can be considered a good tool for estimating genetic parameters for milk yield throughout lactation.
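As a rough illustration of how Legendre polynomials enter a random regression test-day model such as the one above, the sketch below standardizes days in milk (DIM) to the interval [-1, 1] and evaluates normalized Legendre covariates. The function names, the reading of "order four" as four coefficients, and the identity coefficient matrix used in the usage note are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_covariates(dim, dim_min=6, dim_max=305, n_coef=4):
    """Normalized Legendre covariates for days in milk (hypothetical helper).

    dim_min/dim_max follow the 6-305 day range quoted in the abstract;
    n_coef=4 assumes "order four" means four coefficients (cubic fit).
    """
    dim = np.asarray(dim, dtype=float)
    # Map DIM linearly onto [-1, 1], the domain of Legendre polynomials.
    x = 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0
    # Columns are P_0(x), ..., P_{n_coef-1}(x).
    basis = legendre.legvander(x, n_coef - 1)
    # Scale so each polynomial has unit norm on [-1, 1].
    norm = np.sqrt((2.0 * np.arange(n_coef) + 1.0) / 2.0)
    return basis * norm


def variance_along_lactation(K, dim, **kwargs):
    """Variance at each DIM implied by a coefficient covariance matrix K."""
    phi = legendre_covariates(dim, **kwargs)
    return np.einsum("ij,jk,ik->i", phi, K, phi)
```

With an estimated additive genetic coefficient matrix in place of the placeholder, `variance_along_lactation(K, [6, 155, 305])` would give the genetic variance at those days; heritability at each day then follows by dividing by the sum of the genetic, permanent environmental and residual variances.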
Application of Test-day Models for Variance Components Estimation ...
African Journals Online (AJOL)
Julio Carvalheira
the random effect of the animal, LTE is the random effect of the long-term environmental effects accounting for the autocorrelations generated by the cow across repeated lactations, STE is the random effect of short term environmental effects accounting for the autocorrelations due to cow within each lactation, and e is the.
Liu, Xian; Engel, Charles C.
2012-01-01
Researchers often encounter longitudinal health data characterized with three or more ordinal or nominal categories. Random-effects multinomial logit models are generally applied to account for potential lack of independence inherent in such clustered data. When parameter estimates are used to describe longitudinal processes, however, random effects, both between and within individuals, need to be retransformed for correctly predicting outcome probabilities. This study attempts to go beyond e...
DEFF Research Database (Denmark)
Shirali, Mahmoud; Nielsen, Vivi Hunnicke; Møller, Steen Henrik
2014-01-01
The aim of this study was to determine genetic background of longitudinal residual feed intake (RFI) and body weight (BW) growth in farmed mink using random regression methods considering heterogeneous residual variances. Eight BW measures for each mink were recorded every three weeks from 63 to 210 days of age for 2139 male mink and the same number of females. Cumulative feed intake was calculated six times with three weeks interval based on daily feed consumption between weighings from 105 to 210 days of age. Heritability estimates for RFI increased by age from 0.18 (0.03, standard deviation...... be obtained by only considering RFI estimate and BW at pelting, however, lower genetic correlations than unity indicate that extra genetic gain can be obtained by including estimates of these traits at the growing period. This study suggests random regression methods are suitable for analysing feed efficiency......
Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S
2016-01-01
Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0-20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The
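The validation statistics quoted above (ME, MAE, RMSE and R2) can be computed from paired observed and predicted values as in the following minimal sketch. The function name is hypothetical, the sign convention for ME (predicted minus observed) is an assumption, and R2 is taken here as the coefficient of determination, which may differ from the definition used in the study.

```python
import numpy as np


def prediction_errors(observed, predicted):
    """Summary error statistics for a validation set (illustrative helper)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    resid = predicted - observed  # assumed sign convention
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return {
        "ME": resid.mean(),                     # mean error (bias)
        "MAE": np.abs(resid).mean(),            # mean absolute error
        "RMSE": np.sqrt(np.mean(resid ** 2)),   # root mean squared error
        "R2": 1.0 - ss_res / ss_tot,            # coefficient of determination
    }
```

For example, `prediction_errors([1, 2, 3], [1, 2, 4])` yields an ME and MAE of one third and an R2 of 0.5.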
Directory of Open Access Journals (Sweden)
Lefeng Qiu
Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0-20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries
Genetic parameters of test day milk yields of Holstein cows
Directory of Open Access Journals (Sweden)
S.G. Machado
1999-09-01
Data were obtained from 17,968 records from 2,130 first lactations of Holstein cows calving between 1988 and 1991. The subjects were daughters of 136 sires monitored by Brazilian Breeders Association, Animal Science Institute, Department of Agriculture, a branch of the State of São Paulo. Data were divided into 10 subsets according to test-day number (M1 to M10). Test day milk yields (M1 to M10) and 305-day milk yield (M305) were the traits studied. These traits were adjusted for several environmental effects: class of cow age at calving, interval from calving to first test day, and herd-year-season. Restricted maximum likelihood estimates of (co)variance components were obtained from single- and two-trait analyses under a sire model. Estimates of heritabilities for M ranged from 0.04 to 0.32. The highest values were found in the second half of lactation (M5 to M7). Heritability estimate for M305 was 0.32. Genetic correlations between individual test days and M305 ranged from 0.78 to 1.00. Results suggested that test day milk yields, mainly in mid-lactation, can be used instead of 305-day milk yield in genetic evaluations, because the heritability estimates of the two traits are nearly alike. Moreover, early selection can reduce generation intervals.
DEFF Research Database (Denmark)
Larsen, Klaus; Merlo, Juan
2005-01-01
The logistic regression model is frequently used in epidemiologic studies, yielding odds ratio or relative risk interpretations. Inspired by the theory of linear normal models, the logistic regression model has been extended to allow for correlated responses by introducing random effects. However, the model does not inherit the interpretational features of the normal model. In this paper, the authors argue that the existing measures are unsatisfactory (and some of them are even improper) when quantifying results from multilevel logistic regression analyses. The authors suggest a measure of heterogeneity, the median odds ratio, that quantifies cluster heterogeneity and facilitates a direct comparison between covariate effects and the magnitude of heterogeneity in terms of well-known odds ratios. Quantifying cluster-level covariates in a meaningful way is a challenge in multilevel logistic...
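For a random-intercept logistic model with normally distributed cluster effects, the median odds ratio is commonly written as MOR = exp(sqrt(2 * sigma2) * z_0.75), where sigma2 is the cluster-level variance on the log-odds scale and z_0.75 is the 75th percentile of the standard normal distribution (about 0.6745). The sketch below evaluates this expression; the function and argument names are illustrative assumptions.

```python
import math
from statistics import NormalDist


def median_odds_ratio(cluster_variance):
    """Median odds ratio from a random-intercept variance (log-odds scale)."""
    if cluster_variance < 0:
        raise ValueError("variance must be non-negative")
    # 75th percentile of the standard normal distribution.
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2.0 * cluster_variance) * z75)
```

A variance of zero gives a MOR of 1 (no between-cluster heterogeneity), and larger variances give larger values that can be read on the same scale as the odds ratios of the covariates.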
Sørensen, By Ole H
2016-10-01
Organizational-level occupational health interventions have great potential to improve employees' health and well-being. However, they often compare unfavourably to individual-level interventions. This calls for improving methods for designing, implementing and evaluating organizational interventions. This paper presents and discusses the regression discontinuity design because, like the randomized control trial, it is a strong summative experimental design, but it typically fits organizational-level interventions better. The paper explores advantages and disadvantages of a regression discontinuity design with an embedded randomized control trial. It provides an example from an intervention study focusing on reducing sickness absence in 196 preschools. The paper demonstrates that such a design fits the organizational context, because it allows management to focus on organizations or workgroups with the most salient problems. In addition, organizations may accept an embedded randomized design because the organizations or groups with most salient needs receive obligatory treatment as part of the regression discontinuity design. Copyright © 2016 John Wiley & Sons, Ltd.
Directory of Open Access Journals (Sweden)
Künzi Niklaus
2002-01-01
A random regression model for daily feed intake and a conventional multiple trait animal model for the four traits average daily gain on test (ADG), feed conversion ratio (FCR), carcass lean content and meat quality index were combined to analyse data from 1,449 castrated male Large White pigs performance tested in two French central testing stations in 1997. Group housed pigs fed ad libitum with electronic feed dispensers were tested from 35 to 100 kg live body weight. A quadratic polynomial in days on test was used as a regression function for weekly means of daily feed intake and to describe its residual variance. The same fixed (batch) and random (additive genetic, pen and individual permanent environmental) effects were used for regression coefficients of feed intake and single measured traits. Variance components were estimated by means of a Bayesian analysis using Gibbs sampling. Four Gibbs chains were run for 550,000 rounds each, from which 50,000 rounds were discarded from the burn-in period. Estimates of posterior means of covariance matrices were calculated from the remaining two million samples. Low heritabilities of linear and quadratic regression coefficients and their unfavourable genetic correlations with other performance traits reveal that altering the shape of the feed intake curve by direct or indirect selection is difficult.
Norajitra, Tobias; Meinzer, Hans-Peter; Maier-Hein, Klaus H.
2015-03-01
During image segmentation, 3D Statistical Shape Models (SSM) usually conduct a limited search for target landmarks within one-dimensional search profiles perpendicular to the model surface. In addition, landmark appearance is modeled only locally based on linear profiles and weak learners, altogether leading to segmentation errors from landmark ambiguities and limited search coverage. We present a new method for 3D SSM segmentation based on 3D Random Forest Regression Voting. For each surface landmark, a Random Regression Forest is trained that learns a 3D spatial displacement function between the according reference landmark and a set of surrounding sample points, based on an infinite set of non-local randomized 3D Haar-like features. Landmark search is then conducted omni-directionally within 3D search spaces, where voxelwise forest predictions on landmark position contribute to a common voting map which reflects the overall position estimate. Segmentation experiments were conducted on a set of 45 CT volumes of the human liver, of which 40 images were randomly chosen for training and 5 for testing. Without parameter optimization, using a simple candidate selection and a single resolution approach, excellent results were achieved, while faster convergence and better concavity segmentation were observed, altogether underlining the potential of our approach in terms of increased robustness from distinct landmark detection and from better search coverage.
Directory of Open Access Journals (Sweden)
Humberto Tonhati
2008-01-01
Due to the great demand for buffalo milk by-products, interest in technical-scientific information about this species is increasing. Our objective was to propose selection criteria for milk yield in buffaloes based on total milk yield, 305-day milk yield (M305), and test-day milk yield. A total of 3,888 lactations from 1,630 Murrah (Bubalus bubalis) cows recorded between 1987 and 2001, from 10 herds in the State of São Paulo, Brazil, were analyzed. Covariance components were obtained using the restricted maximum likelihood method applied to a bivariate animal model. Additive genetic and permanent environmental effects were considered as random, and contemporary group and lactation order as fixed effects. The heritability estimates were 0.22 for total milk yield and 0.19 for M305. For test-day yields, the heritability estimates ranged from 0.12 to 0.30, with the highest values being observed up to the third test month, followed by a decline until the end of lactation. The present results show that test-day milk yield, mainly during the first six months of lactation, could be adopted as a selection criterion to increase total milk yield.
Huang, Lei
2015-09-30
To solve the problem in which the conventional ARMA modeling methods for gyro random noise require a large number of samples and converge slowly, an ARMA modeling method using a robust Kalman filtering is developed. The ARMA model parameters are employed as state arguments. Unknown time-varying estimators of observation noise are used to achieve the estimated mean and variance of the observation noise. Using the robust Kalman filtering, the ARMA model parameters are estimated accurately. The developed ARMA modeling method has the advantages of a rapid convergence and high accuracy. Thus, the required sample size is reduced. It can be applied to modeling applications for gyro random noise in which a fast and accurate ARMA modeling method is required.
Silva, L P; Ribeiro, J C; Leite, C D S; Sousa, M F; Bonafé, C M; Caetano, G C; Crispim, A C; Torres, R A
2013-05-13
Data from 8759 meat-type quails from the UFV1 strain and 9128 from the UFV2 strain were used to assess the possibility of reducing the number of body weight records in genetic evaluations. The evaluated animals were weighed weekly from hatching to the 6th week of life, with up to 7 records of body weight for each bird. The data were evaluated by random regression models, with 9 alternative schemes of data recording, which included 4 records for each scheme and their covariance functions for additive and permanent environmental effects of order 3, fitting 4 intervals for residual variance, and a complete scheme, with 7 records, order of fit 6 for additive and permanent environmental effects and 7 intervals for residual variance. Estimates of heritability for body weight at the 6th week varied from 0.45 to 0.53 for the UFV1 strain and from 0.28 to 0.54 for the UFV2 strain. The schemes that had more records at the final extreme of the age range showed better estimates, which was likely due to certain properties of polynomial regression that lead to biased results at the final extreme of the age range when data are unbalanced. The reduction of the number of body weight records taken during the growth phase is feasible, with little change to breeding value estimates, when 4 body weight records are used in random regression models.
Kim, Jane Paik
2013-03-01
In the context of randomized trials, Rosenblum and van der Laan (2009, Biometrics 63, 937-945) considered the null hypothesis of no treatment effect on the mean outcome within strata of baseline variables. They showed that hypothesis tests based on linear regression models and generalized linear regression models are guaranteed to have asymptotically correct Type I error regardless of the actual data generating distribution, assuming the treatment assignment is independent of covariates. We consider another important outcome in randomized trials, the time from randomization until failure, and the null hypothesis of no treatment effect on the survivor function conditional on a set of baseline variables. By a direct application of arguments in Rosenblum and van der Laan (2009), we show that hypothesis tests based on multiplicative hazards models with an exponential link, i.e., proportional hazards models, and multiplicative hazards models with linear link functions where the baseline hazard is parameterized, are asymptotically valid under model misspecification provided that the censoring distribution is independent of the treatment assignment given the covariates. In the case of the Cox model and linear link model with unspecified baseline hazard function, the arguments in Rosenblum and van der Laan (2009) cannot be applied to show the robustness of a misspecified model. Instead, we adopt an approach used in previous literature (Struthers and Kalbfleisch, 1986, Biometrika 73, 363-369) to show that hypothesis tests based on these models, including models with interaction terms, have correct type I error. Copyright © 2013, The International Biometric Society.
Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele
2015-11-01
The aim of this work is to define reliable susceptibility models for shallow landslides using Logistic Regression and Random Forests multivariate statistical techniques. The study area, located in North-East Sicily, was hit on October 1st 2009 by a severe rainstorm (225 mm of cumulative rainfall in 7 h) which caused flash floods and more than 1000 landslides. Several small villages, such as Giampilieri, were hit with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructures. Landslides, mainly types such as earth and debris translational slides evolving into debris flows, were triggered on steep slopes and involved colluvium and regolith materials which cover the underlying metamorphic bedrock. The work has been carried out with the following steps: i) realization of a detailed event landslide inventory map through field surveys coupled with observation of high resolution aerial colour orthophoto; ii) identification of landslide source areas; iii) data preparation of landslide controlling factors and descriptive statistics based on a bivariate method (Frequency Ratio) to get an initial overview on existing relationships between causative factors and shallow landslide source areas; iv) choice of criteria for the selection and sizing of the mapping unit; v) implementation of 5 multivariate statistical susceptibility models based on Logistic Regression and Random Forests techniques and focused on landslide source areas; vi) evaluation of the influence of sample size and type of sampling on results and performance of the models; vii) evaluation of the predictive capabilities of the models using ROC curve, AUC and contingency tables; viii) comparison of model results and obtained susceptibility maps; and ix) analysis of temporal variation of landslide susceptibility related to input parameter changes. Models based on Logistic Regression and Random Forests have demonstrated excellent predictive capabilities. Land use and wildfire
Production loss due to new subclinical mastitis in Dutch dairy cows estimated with a test-day model
Halasa, T.; Nielen, M.; Roos, de S.; Hoorne, van R.; Jong, de G.; Lam, T.J.G.M.; Werven, van T.; Hogeveen, H.
2009-01-01
Milk, fat, and protein loss due to a new subclinical mastitis case may be economically important, and the objective of this study was to estimate this loss. The loss was estimated based on test-day (TD) cow records collected over a 1-yr period from 400 randomly selected Dutch dairy herds. After
Directory of Open Access Journals (Sweden)
Chong Wei
2015-01-01
Logistic regression models have been widely used in previous studies to analyze public transport utilization. These studies have shown travel time to be an indispensable variable for such analysis and usually consider it to be a deterministic variable. This formulation does not allow us to capture travelers’ perception error regarding travel time, and recent studies have indicated that this error can have a significant effect on modal choice behavior. In this study, we propose a logistic regression model with a hierarchical random error term. The proposed model adds a new random error term for the travel time variable. This term structure enables us to investigate travelers’ perception error regarding travel time from a given choice behavior dataset. We also propose an extended model that allows constraining the sign of this error in the model. We develop two Gibbs samplers to estimate the basic hierarchical model and the extended model. The performance of the proposed models is examined using a well-known dataset.
Directory of Open Access Journals (Sweden)
Diego Augusto Campos da Cruz
2012-12-01
The electrical conductivity of milk is an indirect method of mastitis diagnosis and can be used as a selection criterion in breeding programs to obtain animals resistant to infection. The present study used data from 9,302 measurements of milk electrical conductivity in the morning (ECM) from 1,129 Holstein cows in first lactation, calving between 2001 and 2011 and belonging to eight herds in the Southeast of Brazil, obtained from automated milking equipment WESTFALIA® with the management system "Dairyplan". Classes of ECM were formed at weekly intervals, representing a total of 42 classes. The model included direct additive genetic, permanent environmental and residual effects as random, the fixed effect of contemporary group (herd - year and season of the control), and age at calving as a covariate (linear and quadratic). Mean trends were modeled by an orthogonal Legendre polynomial of days in milk with three coefficients. The residual variance was considered homogeneous throughout lactation. Variance components were estimated by the restricted maximum likelihood method (REML), using the statistical package Wombat (Meyer, 2006). The mean and standard deviation of the electrical conductivity of milk were 4.799 ± 0.543 mS/cm. Heritability estimates for ECM increased from the beginning to the middle of lactation (154 days), when they reached the maximum value (0.44), decreasing thereafter and reaching the minimum value at 300 days (0.17). Genetic correlations between ECM at different periods of lactation were high and positive across the course of lactation, ranging from 0.73 to 0.99. Correlation estimates were considerably lower between ECM at 300 days and ECM in other periods. The data suggest that significant gains can be obtained via selection when using ECM as a selection criterion aimed at resistance to mastitis. It was also verified that selection for this trait in the early period of lactation, to
Dong, Chunjiao; Clarke, David B; Yan, Xuedong; Khattak, Asad; Huang, Baoshan
2014-09-01
Crash data are collected through police reports and integrated with road inventory data for further analysis. Integrated police reports and inventory data yield correlated multivariate data for roadway entities (e.g., segments or intersections). Analysis of such data reveals important relationships that can help focus on high-risk situations and coming up with safety countermeasures. To understand relationships between crash frequencies and associated variables, while taking full advantage of the available data, multivariate random-parameters models are appropriate since they can simultaneously consider the correlation among the specific crash types and account for unobserved heterogeneity. However, a key issue that arises with correlated multivariate data is the number of crash-free samples increases, as crash counts have many categories. In this paper, we describe a multivariate random-parameters zero-inflated negative binomial (MRZINB) regression model for jointly modeling crash counts. The full Bayesian method is employed to estimate the model parameters. Crash frequencies at urban signalized intersections in Tennessee are analyzed. The paper investigates the performance of MZINB and MRZINB regression models in establishing the relationship between crash frequencies, pavement conditions, traffic factors, and geometric design features of roadway intersections. Compared to the MZINB model, the MRZINB model identifies additional statistically significant factors and provides better goodness of fit in developing the relationships. The empirical results show that MRZINB model possesses most of the desirable statistical properties in terms of its ability to accommodate unobserved heterogeneity and excess zero counts in correlated data. Notably, in the random-parameters MZINB model, the estimated parameters vary significantly across intersections for different crash types. Copyright © 2014 Elsevier Ltd. All rights reserved.
He, Jie; Zhao, Yunfeng; Zhao, Jingli; Gao, Jin; Han, Dandan; Xu, Pao; Yang, Runqing
2017-11-02
Because of their high economic importance, growth traits in fish are under continuous improvement. For growth traits that are recorded at multiple time-points in life, the use of univariate and multivariate animal models is limited because of the variable and irregular timing of these measures. Thus, the univariate random regression model (RRM) was introduced for the genetic analysis of dynamic growth traits in fish breeding. We used a multivariate random regression model (MRRM) to analyze genetic changes in growth traits recorded at multiple time-points in genetically improved farmed tilapia. Legendre polynomials of different orders were applied to characterize the influences of fixed and random effects on growth trajectories. The final MRRM was determined by optimizing the univariate RRM for the analyzed traits separately via penalizing adaptively the likelihood statistical criterion, which is superior to both the Akaike information criterion and the Bayesian information criterion. In the selected MRRM, the additive genetic effects were modeled by Legendre polynomials of three orders for body weight (BWE) and body length (BL) and of two orders for body depth (BD). By using the covariance functions of the MRRM, estimated heritabilities were between 0.086 and 0.628 for BWE, 0.155 and 0.556 for BL, and 0.056 and 0.607 for BD. Only heritabilities for BD measured from 60 to 140 days of age were consistently higher than those estimated by the univariate RRM. All genetic correlations between growth time-points exceeded 0.5 for either single or pairwise time-points. Moreover, correlations between early and late growth time-points were lower. Thus, for phenotypes that are measured repeatedly in aquaculture, an MRRM can enhance the efficiency of the comprehensive selection for BWE and the main morphological traits.
Fischer, A; Friggens, N C; Berry, D P; Faverdin, P
2017-12-11
The ability to properly assess and accurately phenotype true differences in feed efficiency among dairy cows is key to the development of breeding programs for improving feed efficiency. The variability among individuals in feed efficiency is commonly characterised by the residual intake approach. Residual feed intake is represented by the residuals of a linear regression of intake on the corresponding quantities of the biological functions that consume (or release) energy. However, the residuals include both model-fitting and measurement errors as well as any variability in cow efficiency. The objective of this study was to isolate the individual animal variability in feed efficiency from the residual component. Two separate models were fitted. In the first, the standard residual energy intake (REI) was calculated as the residual of a multiple linear regression of lactation average net energy intake (NEI) on lactation average milk energy output, average metabolic BW, as well as lactation loss and gain of body condition score. In the second, a linear mixed model was used to simultaneously fit fixed linear regressions and random cow levels on the biological traits and intercept using fortnightly repeated measures for the variables. This method split the predicted NEI into two parts: one quantifying the population mean intercept and coefficients, and one quantifying cow-specific deviations in the intercept and coefficients. The cow-specific part of predicted NEI was assumed to isolate true differences in feed efficiency among cows. NEI and associated energy expenditure phenotypes were available for the first 17 fortnights of lactation from 119 Holstein cows, all fed a constant energy-rich diet. Mixed models fitting cow-specific intercept and coefficients to different combinations of the aforementioned energy expenditure traits, calculated on a fortnightly basis, were compared. The variance of REI estimated with the lactation average model represented only 8% of the variance of
Pham, Binh Thai; Prakash, Indra; Tien Bui, Dieu
2018-02-01
A hybrid machine learning approach combining Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. The model combines the RSS method, which is known as an efficient ensemble technique, with CART, a state-of-the-art classifier. The Luc Yen district of Yen Bai province, a prominent landslide-prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models, namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In developing the model, ten important landslide-affecting factors related to geomorphology, geology and geo-environment were considered, including slope angles, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. The RSSCART model performed best (AUC = 0.841) compared with the other popular landslide models, namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that RSSCART is a promising method for spatial landslide prediction.
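The AUC values quoted in this abstract can be read as the probability that a randomly chosen landslide location receives a higher predicted score than a randomly chosen non-landslide location. A minimal sketch of that pairwise definition follows; the function name and the scores are made up for illustration and are not taken from the study.

```python
def auc_roc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Hypothetical susceptibility scores for landslide and non-landslide cells.
    landslide = [0.9, 0.8, 0.4]
    stable = [0.7, 0.3, 0.2]
    print(auc_roc(landslide, stable))
```

The double loop is quadratic in the number of samples, which is fine for a sketch; rank-based formulas give the same value more efficiently on large rasters.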
Directory of Open Access Journals (Sweden)
Cristian Kelen Pinto Dorneles
2009-08-01
A total of 21,702 test-day milk yield records from 2,429 primiparous Holstein cows, daughters of 2,031 dams and 233 sires, collected in 33 herds in the state of Rio Grande do Sul between 1992 and 2003, were used to estimate genetic parameters for three measures of lactation persistency (PS1, PS2 and PS3) and for milk yield up to 305 days of lactation (P305). The random regression models fitted to test-day records between the 6th and the 300th day of lactation included the effect of herd-year-month of test, the age of the cow at calving, and the parameters of a fourth-order Legendre polynomial to model the mean milk yield curve of the population, together with the parameters of the same polynomial to model the direct additive genetic and permanent environmental random effects. The heritability estimates obtained were 0.05, 0.08 and 0.19 for PS1, PS2 and PS3, respectively, and 0.25 for P305, suggesting the possibility of genetic gain through selection for PS3 and for P305. The genetic correlations between the three persistency measures and P305 ranged from -0.05 to 0.07, indicating that persistency and production are traits determined by different groups of genes. Consequently, selection for P305, as generally practiced, does not promote genetic progress in persistency.
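Random regression models of this kind regress each record on Legendre polynomial covariates of the standardized day in milk. The sketch below shows one common way to build those covariates. It is an assumption-laden illustration: the function names are hypothetical, "fourth order" is taken to mean four coefficients, the sqrt((2k+1)/2) normalisation is the convention often used in animal breeding, and the day range 6 to 300 is only an example.

```python
import math


def standardize(t: float, t_min: float, t_max: float) -> float:
    """Map a day in milk onto [-1, 1], the interval where Legendre polynomials are orthogonal."""
    return -1.0 + 2.0 * (t - t_min) / (t_max - t_min)


def legendre_basis(x: float, n_coef: int) -> list:
    """Normalised Legendre covariates phi_0..phi_{n_coef-1} evaluated at x in [-1, 1]."""
    p = [1.0, x]
    # Bonnet recurrence: (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
    for n in range(1, n_coef - 1):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(n_coef)]


if __name__ == "__main__":
    # Covariates for a hypothetical test day at 150 days in milk.
    x = standardize(150, 6, 300)
    print([round(v, 4) for v in legendre_basis(x, 4)])
```

Each animal's additive genetic and permanent environmental deviations are then modelled as its own set of coefficients on these covariates.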
Sasaki, O; Aihara, M; Nishiura, A; Takeda, H
2017-09-01
Trends in genetic correlations between longevity, milk yield, and somatic cell score (SCS) during lactation in cows are difficult to trace. In this study, changes in the genetic correlations between milk yield, SCS, and cumulative pseudo-survival rate (PSR) during lactation were examined, and the effect of milk yield and SCS information on the reliability of estimated breeding value (EBV) of PSR were determined. Test day milk yield, SCS, and PSR records were obtained for Holstein cows in Japan from 2004 to 2013. A random subset of the data was used for the analysis (825 herds, 205,383 cows). This data set was randomly divided into 5 subsets (162-168 herds, 83,389-95,854 cows), and genetic parameters were estimated in each subset independently. Data were analyzed using multiple-trait random regression animal models including either the residual effect for the whole lactation period (H0), the residual effects for 5 lactation stages (H5), or both of these residual effects (HD). Milk yield heritability increased until 310 to 351 d in milk (DIM) and SCS heritability increased until 330 to 344 DIM. Heritability estimates for PSR increased with DIM from 0.00 to 0.05. The genetic correlation between milk yield and SCS increased negatively to under -0.60 at 455 DIM. The genetic correlation between milk yield and PSR increased until 342 to 355 DIM (0.53-0.57). The genetic correlation between the SCS and PSR was -0.82 to -0.83 at around 180 DIM, and decreased to -0.65 to -0.71 at 455 DIM. The reliability of EBV of PSR for sires with 30 or more recorded daughters was 0.17 to 0.45 when the effects of correlated traits were ignored. The maximum reliability of EBV was observed at 257 (H0) or 322 (HD) DIM. When the correlations of PSR with milk yield and SCS were considered, the reliabilities of PSR estimates increased to 0.31-0.76. The genetic parameter estimates of H5 were the same as those for HD. The rank correlation coefficients of the EBV of PSR between H0 and H5 or HD were
Cruyff, M.; Böckenholt, U.; van der Heijden, P.G.M.; Frank, L.E.
2016-01-01
In survey research, it is often problematic to ask people sensitive questions because they may refuse to answer or they may provide a socially desirable answer that does not reveal their true status on the sensitive question. To solve this problem Warner (1965) proposed randomized response (RR).
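For readers unfamiliar with Warner's design, a minimal sketch may help. It assumes the classic setup in which a randomizing device directs each respondent, with known probability p, to answer whether they belong to the sensitive group and otherwise whether they do not belong to it, so the interviewer never learns which statement was answered. The function name and the numbers are illustrative only.

```python
def warner_estimate(n_yes: int, n: int, p: float):
    """Estimate the sensitive-trait prevalence pi under Warner's design.

    P(yes) = p * pi + (1 - p) * (1 - pi), so pi = (P(yes) + p - 1) / (2p - 1).
    Returns the point estimate and an estimated sampling variance.
    """
    if p == 0.5:
        raise ValueError("p = 0.5 carries no information about pi")
    lam = n_yes / n  # observed proportion of 'yes' answers
    pi_hat = (lam + p - 1.0) / (2.0 * p - 1.0)
    var_hat = lam * (1.0 - lam) / (n * (2.0 * p - 1.0) ** 2)
    return pi_hat, var_hat


if __name__ == "__main__":
    # Hypothetical survey: 380 'yes' answers out of 1000 with p = 0.7.
    print(warner_estimate(380, 1000, 0.7))
```

The closer p is to 0.5, the stronger the privacy protection and the larger the variance of the estimate.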
2014-01-01
Background: Meta-regression is becoming increasingly used to model study level covariate effects. However, this type of statistical analysis presents many difficulties and challenges. Here two methods for calculating confidence intervals for the magnitude of the residual between-study variance in random effects meta-regression models are developed. A further suggestion for calculating credible intervals using informative prior distributions for the residual between-study variance is presented. Methods: Two recently proposed and, under the assumptions of the random effects model, exact methods for constructing confidence intervals for the between-study variance in random effects meta-analyses are extended to the meta-regression setting. The use of Generalised Cochran heterogeneity statistics is extended to the meta-regression setting and a Newton-Raphson procedure is developed to implement the Q profile method for meta-analysis and meta-regression. WinBUGS is used to implement informative priors for the residual between-study variance in the context of Bayesian meta-regressions. Results: Results are obtained for two contrasting examples, where the first example involves a binary covariate and the second involves a continuous covariate. Intervals for the residual between-study variance are wide for both examples. Conclusions: Statistical methods, and R computer software, are available to compute exact confidence intervals for the residual between-study variance under the random effects model for meta-regression. These frequentist methods are almost as easily implemented as their established counterparts for meta-analysis. Bayesian meta-regressions are also easily performed by analysts who are comfortable using WinBUGS. Estimates of the residual between-study variance in random effects meta-regressions should be routinely reported and accompanied by some measure of their uncertainty. Confidence and/or credible intervals are well-suited to this purpose. PMID:25196829
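The Q profile idea can be sketched for the simpler meta-analysis case without covariates. The generalised Q statistic is evaluated at a trial between-study variance tau2, and the interval endpoints are the tau2 values at which Q equals the upper and lower chi-square quantiles with k - 1 degrees of freedom. The sketch below swaps the abstract's Newton-Raphson procedure for plain bisection, expects the chi-square quantile to be supplied by the caller, and uses hypothetical function names and data.

```python
def generalised_q(y, v, tau2):
    """Generalised Q: weighted squared deviations with weights 1 / (v_i + tau2)."""
    w = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, y))


def solve_tau2(y, v, target, tol=1e-10):
    """Find tau2 >= 0 with Q(tau2) = target by bisection (Q decreases in tau2)."""
    if generalised_q(y, v, 0.0) <= target:
        return 0.0  # bound truncated at zero
    lo, hi = 0.0, 1.0
    while generalised_q(y, v, hi) > target:
        hi *= 2.0  # expand until the root is bracketed
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if generalised_q(y, v, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Made-up effect estimates and within-study variances for four studies.
    effects = [0.1, 0.5, 0.9, 0.3]
    variances = [0.04, 0.05, 0.04, 0.06]
    # 0.216 approximates the 2.5% chi-square quantile with 3 degrees of freedom.
    print(solve_tau2(effects, variances, 0.216))
```

Solving at the 97.5% quantile gives the lower endpoint and at the 2.5% quantile the upper endpoint; in meta-regression the weighted mean is replaced by fitted values and the degrees of freedom shrink with the number of regression coefficients.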
Li, Hongjian; Leung, Kwong-Sak; Wong, Man-Hon; Ballester, Pedro J
2014-08-27
State-of-the-art protein-ligand docking methods are generally limited by the traditionally low accuracy of their scoring functions, which are used to predict binding affinity and thus vital for discriminating between active and inactive compounds. Despite intensive research over the years, classical scoring functions have reached a plateau in their predictive performance. These assume a predetermined additive functional form for some sophisticated numerical features, and use standard multivariate linear regression (MLR) on experimental data to derive the coefficients. In this study we show that such a simple functional form is detrimental for the prediction performance of a scoring function, and replacing linear regression by machine learning techniques like random forest (RF) can improve prediction performance. We investigate the conditions of applying RF under various contexts and find that given sufficient training samples RF manages to comprehensively capture the non-linearity between structural features and measured binding affinities. Incorporating more structural features and training with more samples can both boost RF performance. In addition, we analyze the importance of structural features to binding affinity prediction using the RF variable importance tool. Lastly, we use Cyscore, a top performing empirical scoring function, as a baseline for comparison study. Machine-learning scoring functions are fundamentally different from classical scoring functions because the former circumvents the fixed functional form relating structural features with binding affinities. RF, but not MLR, can effectively exploit more structural features and more training samples, leading to higher prediction performance. The future availability of more X-ray crystal structures will further widen the performance gap between RF-based and MLR-based scoring functions. This further stresses the importance of substituting RF for MLR in scoring function development.
Stubbs, Brendon; Vancampfort, Davy; Rosenbaum, Simon; Ward, Philip B; Richards, Justin; Soundy, Andrew; Veronese, Nicola; Solmi, Marco; Schuch, Felipe B
2016-01-15
Exercise has established efficacy in improving depressive symptoms. Dropouts from randomized controlled trials (RCT's) pose a threat to the validity of this evidence base, with dropout rates varying across studies. We conducted a systematic review and meta-analysis to investigate the prevalence and predictors of dropout rates among adults with depression participating in exercise RCT's. Three authors identified RCT's from a recent Cochrane review and conducted updated searches of major electronic databases from 01/2013 to 08/2015. We included RCT's of exercise interventions in people with depression (including major depressive disorder (MDD) and depressive symptoms) that reported dropout rates. A random effects meta-analysis and meta regression were conducted. Overall, 40 RCT's were included reporting dropout rates across 52 exercise interventions including 1720 people with depression (49.1 years (range=19-76 years), 72% female (range=0-100)). The trim and fill adjusted prevalence of dropout across all studies was 18.1% (95%CI=15.0-21.8%) and 17.2% (95%CI=13.5-21.7, N=31) in MDD only. In MDD participants, higher baseline depressive symptoms (β=0.0409, 95%CI=0.0809-0.0009, P=0.04) predicted greater dropout, whilst supervised interventions delivered by physiotherapists (β=-1.2029, 95%CI=-2.0967 to -0.3091, p=0.008) and exercise physiologists (β=-1.3396, 95%CI=-2.4478 to -0.2313, p=0.01) predicted lower dropout. A comparative meta-analysis (N=29) established dropout was lower in exercise than control conditions (OR=0.642, 95%CI=0.43-0.95, p=0.02). Exercise is well tolerated by people with depression and drop out in RCT's is lower than control conditions. Thus, exercise is a feasible treatment, in particular when delivered by healthcare professionals with specific training in exercise prescription. Copyright © 2015 Elsevier B.V. All rights reserved.
Sun, Jin; Rutkoski, Jessica E; Poland, Jesse A; Crossa, José; Jannink, Jean-Luc; Sorrells, Mark E
2017-07-01
High-throughput phenotyping (HTP) platforms can be used to measure traits that are genetically correlated with wheat (Triticum aestivum L.) grain yield across time. Incorporating such secondary traits in the multivariate pedigree and genomic prediction models would be desirable to improve indirect selection for grain yield. In this study, we evaluated three statistical models, simple repeatability (SR), multitrait (MT), and random regression (RR), for the longitudinal data of secondary traits and compared the impact of the proposed models for secondary traits on their predictive abilities for grain yield. Grain yield and secondary traits, canopy temperature (CT) and normalized difference vegetation index (NDVI), were collected in five diverse environments for 557 wheat lines with available pedigree and genomic information. A two-stage analysis was applied for pedigree and genomic selection (GS). First, secondary traits were fitted by SR, MT, or RR models, separately, within each environment. Then, best linear unbiased predictions (BLUPs) of secondary traits from the above models were used in the multivariate prediction models to compare predictive abilities for grain yield. Predictive ability was substantially improved by 70%, on average, from multivariate pedigree and genomic models when including secondary traits in both training and test populations. Additionally, (i) predictive abilities varied slightly for MT, RR, or SR models in this data set, (ii) results indicated that including BLUPs of secondary traits from the MT model was the best in severe drought, and (iii) the RR model was slightly better than SR and MT models under drought environment. Copyright © 2017 Crop Science Society of America.
Meister, Ramona; Jansen, Alessa; Härter, Martin; Nestoriuc, Yvonne; Kriston, Levente
2017-06-01
We aimed to investigate placebo and nocebo reactions in randomized controlled trials (RCT) of pharmacological treatments for persistent depressive disorder (PDD). We conducted a systematic electronic search and included RCTs investigating antidepressants for the treatment of PDD. Outcomes were the number of patients experiencing response and remission in placebo arms (=placebo reaction). Additional outcomes were the incidence of patients experiencing adverse events and related discontinuations in placebo arms (=nocebo reaction). A priori defined effect modifiers were analyzed using a series of meta-regression analyses. Twenty-three trials were included in the analyses. We found a pooled placebo response rate of 31% and a placebo remission rate of 22%. The pooled adverse event rate and related discontinuations were 57% and 4%, respectively. All placebo arm outcomes were positively associated with the corresponding medication arm outcomes. Placebo response rate was associated with a greater proportion of patients with early onset depression, a smaller chance to receive placebo and a larger sample size. The adverse event rate in placebo arms was associated with a greater proportion of patients with early onset depression, a smaller proportion of females and a more recent publication. Pooled placebo and nocebo reaction rates in PDD were comparable to those in episodic depression. The identified effect modifiers should be considered to assess unbiased effects in RCTs, to influence placebo and nocebo reactions in practice. Limitations result from the methodology applied, the fact that we conducted only univariate analyses, and the number and quality of included trials. Copyright © 2017 Elsevier B.V. All rights reserved.
Norajitra, Tobias; Maier-Hein, Klaus H
2017-01-01
3D Statistical Shape Models (3D-SSM) are widely used for medical image segmentation. However, during segmentation, they typically perform a very limited unidirectional search for suitable landmark positions in the image, relying on weak learners or use-case specific appearance models that solely take local image information into account. As a consequence, segmentation errors arise, and results in general depend on the accuracy of a previous model initialization. Furthermore, these methods become subject to a tedious and use-case dependent parameter tuning in order to obtain optimized results. To overcome these limitations, we propose an extension of 3D-SSM by landmark-wise random regression forests that perform an enhanced omni-directional search for landmark positions, thereby taking rich non-local image information into account. In addition, we provide a long distance model fitting based on a multi-scale approach, that allows an accurate and reproducible segmentation even from distant image positions, thus enabling an application without model initialization. Finally, translation of the proposed method to different organs is straightforward and requires no adaptation of the training process. In segmentation experiments on 45 clinical CT volumes, the proposed omni-directional search significantly increased accuracy and displayed great precision regardless of model initialization. Furthermore, for liver, spleen and kidney segmentation in a competitive multi-organ labeling challenge on publicly available data, the proposed method achieved similar or better results than the state of the art. Finally, liver segmentation results were obtained that successfully compete with specialized state-of-the-art methods from the well-known liver segmentation challenge SLIVER.
Golkarian, Ali; Naghibi, Seyed Amir; Kalantar, Bahareh; Pradhan, Biswajeet
2018-02-17
Ever-increasing demand for water resources for different purposes makes it essential to have better understanding and knowledge about water resources. Groundwater is one of the main water resources, especially in countries with arid climatic conditions. Thus, this study seeks to provide groundwater potential maps (GPMs) employing new algorithms. Accordingly, this study aims to validate the performance of C5.0, random forest (RF), and multivariate adaptive regression splines (MARS) algorithms for generating GPMs in the eastern part of Mashhad Plain, Iran. For this purpose, a dataset was produced consisting of spring locations as indicator and groundwater-conditioning factors (GCFs) as input. In this research, 13 GCFs were selected including altitude, slope aspect, slope angle, plan curvature, profile curvature, topographic wetness index (TWI), slope length, distance from rivers and faults, rivers and faults density, land use, and lithology. The dataset was divided into training and validation classes with 70 and 30% of the springs, respectively. Then, C5.0, RF, and MARS algorithms were employed using R statistical software, and the final values were transformed into GPMs. Finally, two evaluation criteria, Kappa and the area under the receiver operating characteristics curve (AUC-ROC), were calculated. According to the findings of this research, MARS had the best performance with AUC-ROC of 84.2%, followed by RF and C5.0 algorithms with AUC-ROC values of 79.7 and 77.3%, respectively. The results indicated that AUC-ROC values for the employed models are more than 70%, which shows their acceptable performance. In conclusion, the produced methodology could be used in other geographical areas. GPMs could be used by water resource managers and related organizations to accelerate and facilitate water resource exploitation.
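The Kappa criterion named in this abstract is commonly computed as Cohen's kappa from a two-class confusion matrix, which corrects the observed agreement for the agreement expected by chance. A minimal sketch under that assumption, with a hypothetical function name and made-up validation counts:

```python
def cohen_kappa(tp: int, fn: int, fp: int, tn: int) -> float:
    """Cohen's kappa for a 2x2 confusion matrix: (p_o - p_e) / (1 - p_e)."""
    n = tp + fn + fp + tn
    p_observed = (tp + tn) / n
    # Chance agreement from the row (actual) and column (predicted) marginals.
    p_expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical validation set: 50 spring cells and 50 non-spring cells.
    print(cohen_kappa(tp=40, fn=10, fp=5, tn=45))
```

Unlike raw accuracy, kappa stays near zero for a classifier that simply reproduces the class proportions.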
Estimation of genetic parameters of test day fat and protein yields in ...
African Journals Online (AJOL)
This study was aimed to estimate variance components and genetic parameters for daily fat and protein yields of Brazilian Holstein cattle, using an autoregressive test day multiple lactations (AR) animal model. Data consisted of test day (TD) records produced by Holstein cows under milk recording supervised by the ...
A Model for Quantifying Sources of Variation in Test-day Milk Yield ...
African Journals Online (AJOL)
A cow's test-day milk yield is influenced by several systematic environmental effects, which have to be removed when estimating the genetic potential of an animal. The present study quantified the variation due to test date and month of test in test-day lactation yield records using full and reduced models. The data consisted ...
Analysis of test day yield data of Costa Rican dairy cattle.
Vargas, B.; Perez, E.; Arendonk, van J.A.M.
1998-01-01
Estimates of variance components for test day records in an animal model that considered multiple traits over multiple lactations were calculated using REML methodology. Test day records were classified into 11 periods within first and later lactations. Missing ancestors in the relationship matrix
Berry, D.P.; Buckley, F.; Dillon, P.; Evans, R.D.; Rath, M.; Veerkamp, R.F.
2003-01-01
(Co)variance components for milk yield, body condition score (BCS), body weight (BW), BCS change and BW change over different herd-year mean milk yields (HMY) and nutritional environments (concentrate feeding level, grazing severity and silage quality) were estimated using a random regression model.
L.R. Iverson; A.M. Prasad; A. Liaw
2004-01-01
More and better machine learning tools are becoming available for landscape ecologists to aid in understanding species-environment relationships and to map probable species occurrence now and potentially into the future. To that end, we evaluated three statistical models: Regression Tree Analysis (RTA), Bagging Trees (BT) and Random Forest (RF) for their utility in...
Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele
2015-04-01
first phase of the work aimed to identify the spatial relationships between landslide locations and the 13 related factors using the Frequency Ratio bivariate statistical method. The analysis was then carried out by adopting a multivariate statistical approach, using the Logistic Regression technique and the Random Forests technique, which gave the best results in terms of AUC. The models were run and evaluated with different sample sizes, also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are the relevant influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.
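The Frequency Ratio method mentioned in this abstract is usually defined per factor class as the share of landslide cells falling in that class divided by the share of the study area the class occupies, so values above 1 flag classes that are over-represented among landslides. A minimal sketch under that usual definition; the function name, class labels, and counts are hypothetical.

```python
def frequency_ratio(class_counts):
    """Frequency ratio per class from {class: (landslide_cells, total_cells)}."""
    total_landslides = sum(ls for ls, _ in class_counts.values())
    total_cells = sum(cells for _, cells in class_counts.values())
    return {
        name: (ls / total_landslides) / (cells / total_cells)
        for name, (ls, cells) in class_counts.items()
    }


if __name__ == "__main__":
    # Made-up land-use classes: (landslide cells, all cells in the class).
    counts = {"forest": (10, 500), "burned": (30, 500)}
    print(frequency_ratio(counts))
```

In this toy example the burned class holds 75% of the landslides on 50% of the area, giving a ratio of 1.5.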
Hamidi, Omid; Tapak, Leili; Abbasi, Hamed; Maryanaji, Zohreh
2017-10-01
We have conducted a case study to investigate the performance of support vector machine, multivariate adaptive regression splines, and random forest time series methods in snowfall modeling. These models were applied to a data set of monthly snowfall collected during six cold months at Hamadan Airport sample station located in the Zagros Mountain Range in Iran. We considered monthly data of snowfall from 1981 to 2008 during the period from October/November to April/May as the training set and the data from 2009 to 2015 as the testing set. The root mean square errors (RMSE), mean absolute errors (MAE), determination coefficient (R²), coefficient of efficiency (E%), and intra-class correlation coefficient (ICC) statistics were used as evaluation criteria. Our results indicated that the random forest time series model outperformed the support vector machine and multivariate adaptive regression splines models in predicting monthly snowfall in terms of several criteria. The RMSE, MAE, R², E, and ICC for the testing set were 7.84, 5.52, 0.92, 0.89, and 0.93, respectively. The overall results indicated that the random forest time series model could be successfully used to estimate monthly snowfall values. Moreover, the support vector machine model showed substantial performance as well, suggesting it may also be applied to forecast snowfall in this area.
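Several of the evaluation criteria listed in this abstract have simple closed forms. The sketch below assumes that the determination coefficient is the squared Pearson correlation and that the coefficient of efficiency is the Nash-Sutcliffe form (one minus the ratio of squared error to the variance of the observations); the abstract does not state which definitions were used, and the function names and data are illustrative.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = sum((o - mo) ** 2 for o in obs)
    sp = sum((p - mp) ** 2 for p in pred)
    return cov * cov / (so * sp)


def efficiency(obs, pred):
    """Nash-Sutcliffe style efficiency: 1 - SSE / SST (assumed definition)."""
    mo = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mo) ** 2 for o in obs)
    return 1.0 - sse / sst


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]   # made-up monthly values
    predicted = [1.5, 2.5, 2.5, 3.5]
    print(rmse(observed, predicted), mae(observed, predicted))
    print(r_squared(observed, predicted), efficiency(observed, predicted))
```

The two fit measures can differ on the same predictions because correlation ignores bias while the efficiency penalises it.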
Directory of Open Access Journals (Sweden)
Ibrahim Fayad
2014-11-01
Estimating forest canopy height from large-footprint satellite LiDAR waveforms is challenging given the complex interaction between LiDAR waveforms, terrain, and vegetation, especially in dense tropical and equatorial forests. In this study, canopy height in French Guiana was estimated using multiple linear regression models and the Random Forest technique (RF). This analysis was either based on LiDAR waveform metrics extracted from the GLAS (Geoscience Laser Altimeter System) spaceborne LiDAR data and terrain information derived from the SRTM (Shuttle Radar Topography Mission) DEM (Digital Elevation Model) or on Principal Component Analysis (PCA) of GLAS waveforms. Results show that the best statistical model for estimating forest height based on waveform metrics and digital elevation data is a linear regression of waveform extent, trailing edge extent, and terrain index (RMSE of 3.7 m). For the PCA based models, better canopy height estimation results were observed using a regression model that incorporated both the first 13 principal components (PCs) and the waveform extent (RMSE = 3.8 m). Random Forest regressions revealed that the best configuration for canopy height estimation used all the following metrics: waveform extent, leading edge, trailing edge, and terrain index (RMSE = 3.4 m). Waveform extent was the variable that best explained canopy height, with an importance factor almost three times higher than those for the other three metrics (leading edge, trailing edge, and terrain index). Furthermore, the Random Forest regression incorporating the first 13 PCs and the waveform extent had a slightly improved canopy height estimation in comparison to the linear model, with an RMSE of 3.6 m. In conclusion, multiple linear regressions and RF regressions provided canopy height estimations with similar precision using either LiDAR metrics or PCs. However, a regression model (linear regression or RF) based on the PCA of waveform samples with waveform
Ibrahim Fayad; Nicolas Baghdadi; Jean-Stéphane Bailly; Nicolas Barbier; Valéry Gond; Mahmoud El Hajj; Frédéric Fabre; Bernard Bourgine
2014-01-01
Estimating forest canopy height from large-footprint satellite LiDAR waveforms is challenging given the complex interaction between LiDAR waveforms, terrain, and vegetation, especially in dense tropical and equatorial forests. In this study, canopy height in French Guiana was estimated using multiple linear regression models and the Random Forest technique (RF). This analysis was either based on LiDAR waveform metrics extracted from the GLAS (Geoscience Laser Altimeter...
Directory of Open Access Journals (Sweden)
Aderbal Cavalcante-Neto
2011-12-01
The objective of this work was to compare random regression models with different residual variance structures in order to find the best modeling for the trait litter size at birth (LSB) in swine. A total of 1,701 LSB records were analyzed using a single-trait random regression animal model. The fixed and random regressions were represented by continuous functions of parity order, fitted by third-order orthogonal Legendre polynomials. To determine the best modeling of the residual variance, heterogeneity of variance was considered through 1 to 7 residual variance classes. The general analysis model included contemporary group as a fixed effect; the fixed regression coefficients to model the mean trajectory of the population; the random regression coefficients for the direct additive genetic, common litter, and animal permanent environmental effects; and the random residual effect. The likelihood ratio test, the Akaike information criterion, and Schwarz's Bayesian information criterion indicated that the model assuming homogeneity of variance provided the best fit to the data. The heritabilities obtained were close to zero (0.002 to 0.006). The permanent environmental effect increased from the 1st (0.06) to the 5th (0.28) parity, but decreased from that point to the 7th parity (0.18). The common litter effect showed low values (0.01 to 0.02). Homogeneous residual variance was the most adequate for modeling the variances associated with litter size at birth in this data set.
Directory of Open Access Journals (Sweden)
Francesca M. Sarti
2015-07-01
The Appenninica breed is an Italian meat sheep; the rams are approved according to a phenotypic index based on average daily gain at performance test. The 8546 live weights of 1930 Appenninica male lambs tested in the performance station of the ASSONAPA (National Sheep Breeders Association, Italy) from 1986 to 2010 showed great variability in age at weighing and in number of records by year. The goal of the study was to verify the feasibility of estimating a genetic index for weight in the Appenninica sheep with a mixed model, and to explore the use of random regression to avoid corrections for weighing at different ages. The heritability and repeatability (mean±SE) of the average live weight were 0.27±0.04 and 0.54±0.08, respectively; the heritabilities of weights recorded on different weighing days ranged from 0.27 to 0.58, while the heritabilities of weights at different ages showed a narrower range (0.29 to 0.41). The estimates of live weight heritability by random regression ranged between 0.34 at 123 d of age and 0.52 at 411 d. The results proved that the random regression model is the most adequate for analysing the data of the Appenninica breed.
Cappio-Borlino, A; Portolano, B; Todaro, M; Macciotta, N P; Giaccone, P; Pulina, G
1997-11-01
Test day models were used to estimate the lactation curves for Valle del Belice ewes and to study the main environmental effects on milk yield and on percentage of fat and protein. Environmental effects were treated as fixed. A random effect was associated with each lactation to evaluate the mean correlation among all test day records of an individual ewe. Lactation curves were constructed by adding solutions for classes of either days in milk nested within parity or days in milk nested within season of lambing to appropriate general means. Parity primarily affected the lactation curve for milk yield, which was lower and flatter for first lambing ewes; effects on fat and protein were smaller. Season of lambing affected all traits. Seasonal productivity had the greatest effect on milk composition, resulting in an imbalance between fat and protein percentages. Flock and feed supplementation affected only the lactation curve for milk yield. The lactation curve of Valle del Belice ewes stood at a relatively high level. However, the presence of notable, perturbative effects (environmental and random variation) on milk yield and composition suggests that management is unable to meet the requirements of ewes consistently.
Qi-ling Yuan; Peng Wang; Liang Liu; Fu Sun; Yong-song Cai; Wen-tao Wu; Mao-lin Ye; Jiang-tao Ma; Bang-bang Xu; Yin-gang Zhang
2016-01-01
The aims of this systematic review were to study the analgesic effect of real acupuncture and to explore whether sham acupuncture (SA) type is related to the estimated effect of real acupuncture for musculoskeletal pain. Five databases were searched. The outcome was pain or disability immediately (≤1 week) following an intervention. Standardized mean differences (SMDs) with 95% confidence intervals were calculated. Meta-regression was used to explore possible sources of heterogeneity. Sixty-t...
2014-09-18
when 3 or more images were used for classification. All three easily outperform the Naive Bayes Maximum Likelihood method. In [85], the Random Forest is ... Maximum Likelihood (MDA/ML) ... Naive Bayes ... probabilities [59], log-likelihood is used here. Given a known distribution, ML provides optimal classification performance [125]. 2.5.1.2 Naive Bayes If N
NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel
2016-01-01
Cluster-level dynamic treatment regimens can be used to guide sequential, intervention or treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level DTR, the intervention or treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including based on aggregate measures of the individuals or patients that comprise it. Cluster-randomized sequentia...
Yuan, Qi-Ling; Wang, Peng; Liu, Liang; Sun, Fu; Cai, Yong-Song; Wu, Wen-Tao; Ye, Mao-Lin; Ma, Jiang-Tao; Xu, Bang-Bang; Zhang, Yin-Gang
2016-07-29
The aims of this systematic review were to study the analgesic effect of real acupuncture and to explore whether sham acupuncture (SA) type is related to the estimated effect of real acupuncture for musculoskeletal pain. Five databases were searched. The outcome was pain or disability immediately (≤1 week) following an intervention. Standardized mean differences (SMDs) with 95% confidence intervals were calculated. Meta-regression was used to explore possible sources of heterogeneity. Sixty-three studies (6382 individuals) were included. Eight condition types were included. The pooled effect size was moderate for pain relief (59 trials, 4980 individuals, SMD -0.61, 95% CI -0.76 to -0.47; P meta-regression model, sham needle location and/or depth could explain most or all heterogeneities for some conditions (e.g., shoulder pain, low back pain, osteoarthritis, myofascial pain, and fibromyalgia); however, the interactions between subgroups via these covariates were not significant (P < 0.05). Our review provided low-quality evidence that real acupuncture has a moderate effect (approximate 12-point reduction on the 100-mm visual analogue scale) on musculoskeletal pain. SA type did not appear to be related to the estimated effect of real acupuncture.
Directory of Open Access Journals (Sweden)
Luis Gabriel González Herrera
2008-09-01
of Gyr cows calving between 1990 and 2005 were used to estimate genetic parameters of monthly test-day milk yield (TDMY). Records were analyzed by random regression models (MRA) that included the additive genetic and permanent environmental random effects and the contemporary group, age of cow at calving (linear and quadratic components) and the average trend of the population as fixed effects. Random trajectories were fitted by Wilmink's (WIL) and Ali & Schaeffer's (AS) parametric functions. Residual variances were fitted by step functions with 1, 4, 6 or 10 classes. The contemporary group was defined by herd-year-season of test-day and included at least three animals. Models were compared by Akaike's and Schwarz's Bayesian (BIC) information criteria. The AS function used for modeling the additive genetic and permanent environmental effects, with heterogeneous residual variances fitted by a step function with four classes, was the best-fitting model. Heritability estimates ranged from 0.21 to 0.33 for the AS function and from 0.17 to 0.30 for the WIL function and were larger in the first half of the lactation period. Genetic correlations between TDMY were high and positive for adjacent test-days and decreased as days between records increased. Predicted breeding values for total 305-day milk yield (MRA305) and specific periods of lactation (obtained by the mean of all breeding values in the periods) using the AS function were compared with those predicted by a standard model using accumulated 305-day milk yield (PTA305) by rank correlation. The magnitude of the correlations suggested that differences may be observed in ranking animals by the different criteria compared in this study.
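The two parametric lactation functions named in this abstract can be sketched as covariate builders. The forms below are the commonly cited ones (Wilmink: intercept, an exponential term and a linear term in days in milk; Ali and Schaeffer: polynomial terms in DIM/305 and logarithmic terms in 305/DIM); the fixed exponent `k = 0.05`, the 305-day scaling and the function names are assumptions for illustration rather than details stated in the abstract.

```python
import numpy as np


def wilmink_covariates(dim, k=0.05):
    """Covariates [1, exp(-k * DIM), DIM] of the Wilmink function.

    k = 0.05 is a commonly used fixed value and is an assumption here.
    """
    dim = np.asarray(dim, dtype=float)
    return np.column_stack([np.ones_like(dim), np.exp(-k * dim), dim])


def ali_schaeffer_covariates(dim, lactation_length=305.0):
    """Covariates [1, x, x^2, w, w^2] with x = DIM/305 and w = ln(305/DIM)."""
    dim = np.asarray(dim, dtype=float)
    x = dim / lactation_length
    w = np.log(lactation_length / dim)
    return np.column_stack([np.ones_like(dim), x, x**2, w, w**2])


# Hypothetical test days (days in milk), for illustration only.
test_days = np.array([5.0, 35.0, 155.0, 305.0])
W = wilmink_covariates(test_days)        # 3 regression coefficients per effect
A = ali_schaeffer_covariates(test_days)  # 5 regression coefficients per effect
```

In a random regression setting, matrices like `W` or `A` play the role that Legendre covariates play in polynomial models: each row weights an animal's regression coefficients on a given test day.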
Gómez, M D; Menendez-Buxadera, A; Valera, M; Molina, A
2010-10-01
A total of 71 522 records (from 3154 horses) with the times per kilometre (TPK), recorded in Spanish Trotter horses (individual races) from racing performances held from 1991 to 2007, were available for this study. The TPK values for the different age groups (young and adult horses) and different distances (1600-2700 m) were considered as different traits, and a two-trait random regression model (RRM) was applied to estimate the (co)variance components throughout the trajectory of age groups and distances. The following effects were considered as fixed: the combination of hippodrome-date of race (404 levels); sex of the animals (3 levels); type of start (2 levels) and a fixed regression of Legendre polynomials (order 2). Those considered as random effects were the random regression Legendre polynomial (order 1) for animals (9201 animals in the pedigree); the individual permanent environment (3154 animals with data) and the driver (n = 957 levels). The residual variance was considered as heterogeneous with two classes (ages). The heritability estimated by distance ranged from 0.12 to 0.34, with a different trajectory for the two age groups. Within each age group, the genetic correlations between adjacent distances were high (>0.90), but decreased when the differences between them were over 400 metres for both age groups. The genetic correlations for the same distance across the age groups ranged from 0.47 to 0.78. Accordingly, TPK measurements along the trajectory of distance and age can be considered positively genetically correlated but distinct traits. Therefore, some re-ranking should be expected in the breeding values of the horses under different race conditions. The use of RRM is recommended because it allows the breeding value to be estimated along the whole trajectory of race competition. Copyright 2010 Blackwell Verlag GmbH.
Genetic analysis of Test Day Milk Yields of Brown Swiss cattle raised ...
African Journals Online (AJOL)
A total of 3696 Test Day Milk Yield (TDMY) records of Brown Swiss cows raised at Konuklar State Farm in the Konya Province of Turkey were used for estimating phenotypic and genetic parameters for TDMY. The phenotypic and genetic parameters were estimated by an MTDFREML programme using a Single Trait Animal ...
Test-day models for South African dairy cattle for participation in ...
African Journals Online (AJOL)
Variance components and breeding values of production traits and somatic cell score of South African Guernsey, Ayrshire, Holstein and Jersey breeds have been estimated using a multi-lactation repeatability test-day model, including tests of the first three lactations as repeated measures and fitting the permanent ...
Genetic parameters for test-day milk yield in tropical Holstein ...
African Journals Online (AJOL)
Accurate estimates of genetic parameters are essential for genetic improvement of milk yield in dairy cattle and for setting up breeding programmes. Estimates of genetic parameters from test-day models, particularly for Holstein Friesian cattle maintained in tropical environments, are scant in the literature. The objective of ...
Portolano, B.; Maizon, D.O.; Riggio, V.; Tolone, M.; Cacioppo, D.
2007-01-01
The aims of the present study were to compare estimated breeding values (EBV) for milk yield using different testing schemes with a test-day animal model and to evaluate the effect of different testing schemes on the ranking of top sheep. Alternative recording schemes that use less information than
Strategic test-day recording regimes to estimate lactation yield in tropical dairy animals
McGill, D.M.; Thomson, P.C.; Mulder, H.A.; Lievaart, J.J.
2014-01-01
Background In developing dairy sectors, genetic improvement programs have limited resources and recording of herds is minimal. This study evaluated different methods to estimate lactation yield and sampling schedules with fewer test-day records per lactation to determine recording regimes that (1)
Directory of Open Access Journals (Sweden)
Bruno Bastos Teixeira
2012-09-01
The objective was to compare random regression models using Legendre polynomial functions of different orders, in order to identify the one that best fits the genetic study of the growth curve of meat-type quail. Data from 2,136 meat-type quail breeder females were evaluated, of which 1,026 belonged to genetic group UFV1 and 1,110 to group UFV2. The quail were weighed at 1, 7, 14, 21, 28, 35, 42, 77, 112 and 147 days of age, and these weights were used in the analysis. Two possible structures of heterogeneous residual variance were tested, with ages grouped into 3 and 5 classes. The random regression model best suited to the quail growth curve was then studied. Models were compared using Akaike's information criterion (AIC), Schwarz's Bayesian information criterion (BIC), the logarithm of the likelihood function (Log e L) and the likelihood ratio test (LRT) at the 1% level. The model with residual variance heterogeneity CL3 was adequate for line UFV1, and model CL5 for line UFV2. A Legendre polynomial function of order 5 for the direct additive genetic effect and order 5 for the permanent animal effect should be used for line UFV1, and of order 3 for the direct additive genetic effect and order 5 for the permanent animal effect for line UFV2, in the genetic evaluation of the growth curve of meat-type quail.
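The model-comparison criteria listed in this abstract can be written out in a few lines. The sketch below uses the textbook definitions AIC = -2 log L + 2p and BIC = -2 log L + p ln(n), plus the likelihood ratio statistic 2(log L_full - log L_reduced); the log-likelihoods, parameter counts and sample size are made-up numbers, and in a REML analysis the effective n would usually be adjusted for the fixed effects, which this sketch ignores.

```python
import math


def aic(log_l, n_params):
    """Akaike's information criterion (smaller is better)."""
    return -2.0 * log_l + 2.0 * n_params


def bic(log_l, n_params, n_obs):
    """Schwarz's Bayesian information criterion (smaller is better)."""
    return -2.0 * log_l + n_params * math.log(n_obs)


def lrt_statistic(log_l_full, log_l_reduced):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (log_l_full - log_l_reduced)


# Hypothetical fits of two nested random regression models.
candidates = {
    "reduced": {"log_l": -1250.0, "n_params": 10},
    "full": {"log_l": -1240.0, "n_params": 16},
}
n_obs = 2000  # hypothetical number of records

for fit in candidates.values():
    fit["aic"] = aic(fit["log_l"], fit["n_params"])
    fit["bic"] = bic(fit["log_l"], fit["n_params"], n_obs)

best_by_aic = min(candidates, key=lambda name: candidates[name]["aic"])
best_by_bic = min(candidates, key=lambda name: candidates[name]["bic"])
lrt = lrt_statistic(candidates["full"]["log_l"], candidates["reduced"]["log_l"])
```

With these invented numbers AIC prefers the larger model while BIC, which penalizes parameters more heavily, prefers the smaller one. The LRT statistic would be compared with a chi-square quantile whose degrees of freedom equal the difference in parameter counts.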
Milner, Allison; LaMontagne, Anthony D
2017-05-01
Underemployment occurs when workers are available for more hours of work than offered. It is a serious problem in many Organisation for Economic Co-operation and Development (OECD) countries, and particularly in Australia, where it affects about 8% of the employed population. This paper seeks to answer the question: does an increase in underemployment have an influence on mental health? The current paper uses data from an Australian cohort of working people (2001-2013) to investigate both within-person and between-person differences in mental health associated with being underemployed compared with being fully employed. The main exposure was underemployment (not underemployed, underemployed 1-5, 6-10, 11-20 and over 21 hours), and the outcome was the five-item Mental Health Inventory. Results suggest that stepwise declines in mental health are associated with an increasing number of hours underemployed. Results were stronger in the random-effects (11-20 hours =-1.53, 95% CI -2.03 to -1.03, p<0.001; 21 hours and over -2.24, 95% CI -3.06 to -1.43, p<0.001) than fixed-effects models (11-20 hours =-1.11, 95% CI -1.63 to -0.58, p<0.001; 21 hours and over -1.19, 95% CI -2.06 to -0.32, p=0.008). This likely reflects the fact that certain workers were more likely to suffer the negative effects of underemployment than others (eg, women, younger workers, workers in lower-skilled jobs and who were casually employed). We suggest underemployment to be a target of future workplace prevention strategies. Published by the BMJ Publishing Group Limited.
Ikeda, Koji; Takahashi, Tomosaburo; Yamada, Hiroyuki; Matsui, Kiyoaki; Sawada, Takahisa; Nakamura, Takashi; Matsubara, Hiroaki
2013-12-01
Atherosclerosis often advances before symptoms appear. It remains uncertain whether intensive cholesterol-lowering therapy with statin is beneficial when compared with moderate cholesterol-lowering therapy in patients with subclinical carotid atherosclerosis. The PEACE study was a prospective, randomized, open-labeled, blinded end points, two-arm parallel treatment group comparison study conducted at 15 centers in Japan. A total of 303 patients with carotid intima-media thickness (CIMT) thickening (>1.1 mm) whose low-density lipoprotein cholesterol (LDL-C) level was more than 100 mg/dl were enrolled, in which 223 patients completed the 12 months' follow-up study. Patients were randomly assigned to receive either moderate (target LDL-C; 100 mg/dl) or intensive (target LDL-C; 80 mg/dl) cholesterol-lowering therapy with pitavastatin. The primary end point was the change in mean far wall common CIMT. LDL-C level declined to 89.4 ± 20 mg/dl in the intensive group, while it declined to 95.1 ± 22.5 mg/dl in the moderate group at 12 months' follow-up (p confidence interval -0.046 to -0.0014) mm/year (p confidence interval -0.028 to 0.012) mm/year (p = 0.4406 vs. baseline) in the moderate group. However, there was no significant difference in the change in mean far wall common CIMT between the groups (p = 0.29). Intensive cholesterol-lowering therapy did not show superior effects on the progression of CIMT to moderate cholesterol-lowering therapy, whereas only intensive cholesterol-lowering therapy regressed the carotid atherosclerosis over one year.
Rotondi, Michael Anthony; Khobzi, Nooshin
2010-09-01
To assess the relationship between the prevalence of vitamin A deficiency among pregnant women and the effect of neonatal vitamin A supplementation on infant mortality. Studies of neonatal supplementation with vitamin A have yielded contradictory findings with regard to its effect on the risk of infant death, possibly owing to heterogeneity between studies. One source of that heterogeneity is the prevalence of vitamin A deficiency among pregnant women, which we examined using meta-regression techniques on eligible individual and cluster-randomized trials. Adapting standard techniques to control for the inclusion of a cluster-randomized trial, we modelled the logarithm of the relative risk of infant death comparing vitamin A supplementation at birth to a standard treatment, as a linear function of the prevalence of vitamin A deficiency in pregnant women. Meta-regression analysis revealed a statistically significant linear relationship between the prevalence of vitamin A deficiency in pregnant women and the observed effectiveness of vitamin A supplementation at birth. In regions where at least 22% of pregnant women have vitamin A deficiency, giving neonates vitamin A supplements will have a protective effect against infant death. A meta-regression analysis is observational in nature and may suffer from confounding bias. Nevertheless, our study suggests that vitamin A supplementation can reduce infant mortality in regions where this micronutrient deficiency is common. Thus, neonatal supplementation programmes may prove most beneficial in regions where the prevalence of vitamin A deficiency among pregnant women is high.
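The modelling step described here, the log relative risk regressed linearly on the prevalence of deficiency, can be sketched as an inverse-variance weighted least-squares fit. Everything below is illustrative: the data are synthetic, the function name `weighted_meta_regression` is hypothetical, and the sketch omits both between-study variance and the adjustment for cluster-randomized trials that the authors mention.

```python
import numpy as np


def weighted_meta_regression(effect, variance, covariate):
    """Fit effect = b0 + b1 * covariate with weights 1 / variance.

    Returns the coefficient vector [b0, b1] and its covariance matrix
    under a fixed-effect (no between-study variance) assumption.
    """
    y = np.asarray(effect, dtype=float)
    v = np.asarray(variance, dtype=float)
    x = np.asarray(covariate, dtype=float)
    w = 1.0 / v
    X = np.column_stack([np.ones_like(x), x])
    xtwx = X.T @ (w[:, None] * X)
    xtwy = X.T @ (w * y)
    beta = np.linalg.solve(xtwx, xtwy)
    return beta, np.linalg.inv(xtwx)


# Synthetic studies: prevalence of deficiency and log relative risk of death.
prevalence = np.array([0.05, 0.10, 0.20, 0.30, 0.40])
log_rr = 0.10 - 0.50 * prevalence          # made-up, exactly linear
variance = np.array([0.02, 0.03, 0.01, 0.04, 0.02])

beta, cov = weighted_meta_regression(log_rr, variance, prevalence)
# Prevalence at which the fitted log RR crosses zero (RR = 1).
threshold = -beta[0] / beta[1]
```

A threshold of this kind is how a fitted line translates into a statement such as "supplementation is protective above a given prevalence"; its uncertainty would have to be derived from `cov`, which the sketch leaves out.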
Vargas, Maria; Chiumello, Davide; Sutherasan, Yuda; Ball, Lorenzo; Esquinas, Antonio M; Pelosi, Paolo; Servillo, Giuseppe
2017-05-29
The aims of this systematic review and meta-analysis of randomized controlled trials are to evaluate the effects of active heated humidifiers (HHs) and heat and moisture exchangers (HMEs) in preventing artificial airway occlusion and pneumonia, and on mortality in adult critically ill patients. In addition, we planned to perform a meta-regression analysis to evaluate the relationship between the incidence of artificial airway occlusion, pneumonia and mortality and clinical features of adult critically ill patients. Computerized databases were searched for randomized controlled trials (RCTs) comparing HHs and HMEs and reporting artificial airway occlusion, pneumonia and mortality as predefined outcomes. Relative risk (RR), 95% confidence interval and I² were estimated for each outcome. Furthermore, weighted random-effect meta-regression analysis was performed to test the relationship between the effect size on each considered outcome and covariates. Eighteen RCTs and 2442 adult critically ill patients were included in the analysis. The incidence of artificial airway occlusion (RR = 1.853; 95% CI 0.792-4.338), pneumonia (RR = 0.932; 95% CI 0.730-1.190) and mortality (RR = 1.023; 95% CI 0.878-1.192) were not different in patients treated with HMEs and HHs. However, in the subgroup analyses the incidence of airway occlusion was higher in HMEs compared with HHs with non-heated wire (RR = 3.776; 95% CI 1.560-9.143). According to the meta-regression, the effect size in the treatment group on artificial airway occlusion was influenced by the percentage of patients with pneumonia (β = -0.058; p = 0.027; favors HMEs in studies with high prevalence of pneumonia), and a trend was observed for an effect of the duration of mechanical ventilation (MV) (β = -0.108; p = 0.054; favors HMEs in studies with longer MV time). In this meta-analysis we found no superiority of HMEs and HHs, in terms of artificial airway occlusion, pneumonia and
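For readers unfamiliar with how a relative risk and its 95% confidence interval are obtained for a single trial, a minimal sketch follows. It uses the standard large-sample formula on the log scale; the event counts are invented, the function name is hypothetical, and it assumes no arm has zero events.

```python
import math


def relative_risk(events_a, total_a, events_b, total_b, z=1.96):
    """Relative risk of arm A versus arm B with a Wald interval on the log scale.

    Assumes non-zero event counts in both arms (no continuity correction).
    """
    rr = (events_a / total_a) / (events_b / total_b)
    se_log_rr = math.sqrt(
        1.0 / events_a - 1.0 / total_a + 1.0 / events_b - 1.0 / total_b
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Invented counts: 10/100 events in one arm, 20/100 in the other.
rr, ci_lower, ci_upper = relative_risk(10, 100, 20, 100)
```

An interval that contains 1, as in several of the pooled results quoted above, is what underlies a conclusion of "not different" between arms.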
Jackson, Dan; White, Ian R; Riley, Richard D
2013-03-01
Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
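The univariate method of moments of DerSimonian and Laird, to which the proposed multivariate estimator is said to reduce in one dimension, can be sketched as follows. This is only the standard univariate estimator, not the authors' multivariate method, and the two input studies are invented.

```python
import numpy as np


def dersimonian_laird(effects, variances):
    """Univariate DerSimonian-Laird random-effects meta-analysis.

    Returns the between-study variance tau^2, the pooled estimate and
    its standard error.
    """
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v
    mu_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu_fixed) ** 2)          # Cochran's Q
    k = y.size
    denom = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / denom)       # truncated at zero
    w_star = 1.0 / (v + tau2)
    mu = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return tau2, mu, se


# Two invented studies with equal within-study variance.
tau2, mu, se = dersimonian_laird([0.0, 2.0], [0.5, 0.5])
```

The multivariate generalisation replaces the scalar Q and the scalar tau^2 with matrix-valued counterparts; that extension is what the paper contributes and is not reproduced here.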
Jackson, Dan; White, Ian R; Riley, Richard D
2013-01-01
Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
Gould, Rebecca L; Coulson, Mark C; Howard, Robert J
2012-02-01
To review the magnitude and duration of and factors associated with effects of cognitive behavioral therapy (CBT) for anxiety disorders in older people. Electronic literature databases and the Cochrane Trials Registry were searched for articles. A systematic critical review, random-effects meta-analysis, and meta-regression of randomized controlled trials were conducted. Community outpatient clinics. People with diagnoses of anxiety disorders. Outcome measures of anxiety and depression. Twelve studies were included. CBT was significantly more effective than treatment as usual or being on a waiting list at reducing anxiety symptoms at 0-month follow-up, with the effect size being moderate, but when CBT was compared with an active control condition, the between-group difference in favor of CBT was not statistically significant, and the effect size was small. At 6- but not 3- or 12-month follow-up, CBT was significantly more effective at reducing anxiety symptoms than an active control condition, although the effect size was again small. Meta-regression analyses revealed only one factor (type of control group) to be significantly associated with the magnitude of effect sizes. The review confirms the effectiveness of CBT for anxiety disorders in older people but is suggestive of lower efficacy in older than working-age people. The small effect sizes in favor of CBT over an active control condition illustrate the need to investigate other treatment approaches that may be used to substitute or augment CBT to increase the effectiveness of treatment of anxiety disorders in older people. © 2012, Copyright the Authors Journal compilation © 2012, The American Geriatrics Society.
Test-day records as a tool for subclinical ketosis detection
Gantner Vesna; Potočnik K.; Jovanovac Sonja
2009-01-01
The prevalence, as well as the effect of subclinical ketosis on daily milk yield, was observed using 1,299,630 test-day records collected from January 2000 to December 2005 on 73,255 Slovenian Holstein cows. Subclinical ketosis was indicated by a fat to protein ratio (F/P ratio) higher than 1.5 in cows that yielded between 33 and 50 kg of milk per day (Eicher, 2004). The ketosis index was defined in relation to the timing of subclinical ketosis detection to the subsequent measures of test-da...
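The indicator described in this abstract is simple enough to state as a rule on a single test-day record. The sketch below encodes it directly; the function name, the argument units (fat and protein in percent, milk in kg per day) and the choice to treat the 33 and 50 kg bounds as inclusive are assumptions, since the abstract does not specify them.

```python
def subclinical_ketosis_flag(fat_pct, protein_pct, milk_kg,
                             ratio_threshold=1.5, milk_range=(33.0, 50.0)):
    """Flag a test-day record whose fat-to-protein ratio exceeds the threshold.

    The rule only applies to cows within the stated daily milk yield range;
    bounds are treated as inclusive here, which is an assumption.
    """
    if protein_pct <= 0:
        raise ValueError("protein_pct must be positive")
    ratio = fat_pct / protein_pct
    in_range = milk_range[0] <= milk_kg <= milk_range[1]
    return ratio > ratio_threshold and in_range


# Invented records: (fat %, protein %, milk kg/day).
records = [(5.0, 3.0, 40.0), (4.0, 3.2, 40.0), (5.0, 3.0, 25.0)]
flags = [subclinical_ketosis_flag(f, p, m) for f, p, m in records]
```

Only the first invented record is flagged: the second has a ratio of 1.25, and the third falls outside the milk yield range.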
Effects of the DGAT1 polymorphism on test-day milk production traits throughout lactation
DEFF Research Database (Denmark)
Bovenhuis, Henk; Visker, H P W; van Valenberg, H J F
2015-01-01
Several studies have shown that the diacylglycerol O-acyltransferase 1 (DGAT1) K232A polymorphism has a major effect on milk production traits. It is less clear how effects of DGAT1 on milk production traits change throughout lactation, if dominance effects of DGAT1 are relevant, and whether DGAT1...... also affects lactose content, lactose yield, and total energy output in milk. Results from this study, using test-day records of 3 subsequent parities of around 1,800 cows, confirm previously reported effects of the DGAT1 polymorphism on milk, fat, and protein yield, as well as fat and protein content...
Estimation of genetic parameters for test day records of dairy traits in the first three lactations
Directory of Open Access Journals (Sweden)
Ducrocq Vincent
2005-05-01
Application of test-day models for the genetic evaluation of dairy populations requires the solution of large mixed model equations. The size of the (co)variance matrices required with such models can be reduced through the use of their first eigenvectors. Here, the first two eigenvectors of (co)variance matrices estimated for dairy traits in first lactation were used as covariables to jointly estimate genetic parameters of the first three lactations. These eigenvectors appear to be similar across traits and have a biological interpretation, one being related to the level of production and the other to persistency. Furthermore, they explain more than 95% of the total genetic variation. Variances and heritabilities obtained with this model were consistent with previous studies. High correlations were found among production levels in different lactations. Persistency measures were less correlated. Genetic correlations between second and third lactations were close to one, indicating that these can be considered as the same trait. Genetic correlations within lactation were high except between extreme parts of the lactation. This study shows that the use of eigenvectors can reduce the rank of (co)variance matrices for the test-day model and can provide consistent genetic parameters.
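The rank-reduction idea in this abstract, keeping only the leading eigenvectors of a (co)variance matrix, can be illustrated with a small eigendecomposition. The 2 x 2 matrix below is a toy example rather than an estimate from the study, and `leading_eigenvectors` is a hypothetical helper.

```python
import numpy as np


def leading_eigenvectors(cov, n_keep):
    """Keep the n_keep largest eigenpairs of a symmetric covariance matrix.

    Returns the retained eigenvectors (as columns), the proportion of total
    variance they explain, and the reduced-rank approximation of cov.
    """
    cov = np.asarray(cov, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    kept_vals, kept_vecs = eigvals[:n_keep], eigvecs[:, :n_keep]
    explained = kept_vals.sum() / eigvals.sum()
    reduced = (kept_vecs * kept_vals) @ kept_vecs.T
    return kept_vecs, explained, reduced


# Toy covariance matrix with eigenvalues 3 and 1.
G = np.array([[2.0, 1.0],
              [1.0, 2.0]])
vecs, explained, G_rank1 = leading_eigenvectors(G, 1)
```

In the toy case one eigenvector explains 75% of the variance; the abstract reports that two eigenvectors explained more than 95% of the genetic variation, which is what makes using them as covariables attractive.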
Analyses of fixed effects for genetic evaluation of dairy cattle using test day records in Indonesia
Directory of Open Access Journals (Sweden)
Asep Anang
2010-06-01
Season, rainfall, day of rain, temperature, humidity, year and farm are fixed effects that have been reported to influence milk yield. These factors are often linked and jointly contribute to the variation in milk production. This research studied the fixed effects, including the lactation curve, that should be considered for genetic evaluation of milk yield based on test day records of dairy cattle. The data were taken from four different farms: PT. Taurus Dairy Farm, BPPT Cikole, Bandang Dairy Farm, and BBPTU Baturraden. In total, 16,806 test day records were evaluated, consisting of 9,302 from first and 7,504 from second lactation. The results indicated that fixed effects were very specific and that their influence followed a different pattern on each farm. Consequently, in a genetic evaluation, factors such as lactation, temperature, year, day of rain, and humidity need to be evaluated first. The Ali-Schaeffer curve was the most appropriate curve to use in the genetic evaluation of dairy cattle in Indonesia.
Kahane, Leo H
2007-01-01
Using a friendly, nontechnical approach, the Second Edition of Regression Basics introduces readers to the fundamentals of regression. Accessible to anyone with an introductory statistics background, this book builds from a simple two-variable model to a model of greater complexity. Author Leo H. Kahane weaves four engaging examples throughout the text to illustrate not only the techniques of regression but also how this empirical tool can be applied in creative ways to consider a broad array of topics. New to the Second Edition Offers greater coverage of simple panel-data estimation:
Meyskens, F L; Surwit, E; Moon, T E; Childers, J M; Davis, J R; Dorr, R T; Johnson, C S; Alberts, D S
1994-04-06
Retinoids enhance differentiation of most epithelial tissues. Epidemiologic studies have shown an inverse relationship between dietary intake or serum levels of vitamin A and the development of cervical dysplasia and/or cervical cancer. Pilot and phase I investigations demonstrated the feasibility of the local delivery of all-trans-retinoic acid (RA) to the cervix using a collagen sponge insert and cervical cap. A phase II trial produced a clinical complete response rate of 50%. This randomized phase III trial was designed to determine whether topically applied RA reversed moderate cervical intraepithelial neoplasia (CIN) II or severe CIN. Analyses were based on 301 women with CIN (moderate dysplasia, 151 women; severe dysplasia, 150 women), evaluated by serial colposcopy, Papanicolaou cytology, and cervical biopsy. Cervical caps with sponges containing either 1.0 mL of 0.372% beta-trans-RA or a placebo were inserted daily for 4 days when women entered the trial, and for 2 days at months 3 and 6. Patients receiving treatment and those receiving placebo were similar with respect to age, ethnicity, birth-control methods, histologic features of the endocervical biopsy specimen and koilocytotic atypia, and percentage of involvement of the cervix at study. Treatment effects were compared using Fisher's exact test and logistic regression methods. Side effects were recorded, and differences were compared using Fisher's exact test. RA increased the complete histologic regression rate of CIN II from 27% in the placebo group to 43% in the retinoic acid treatment group (P = .041). No treatment difference between the two arms was evident in the severe dysplasia group. More vaginal and vulvar side effects were seen in the patients receiving RA, but these effects were mild and reversible. A short course of locally applied RA can reverse CIN II, but not more advanced dysplasia, with acceptable local side effects. A derivative of vitamin A can reverse or suppress an epithelial
Brokamp, Cole; Jandarov, Roman; Rao, M. B.; LeMasters, Grace; Ryan, Patrick
2017-02-01
Exposure assessment for elemental components of particulate matter (PM) using land use modeling is a complex problem due to the high spatial and temporal variations in pollutant concentrations at the local scale. Land use regression (LUR) models may fail to capture complex interactions and non-linear relationships between pollutant concentrations and land use variables. The increasing availability of big spatial data and machine learning methods present an opportunity for improvement in PM exposure assessment models. In this manuscript, our objective was to develop a novel land use random forest (LURF) model and compare its accuracy and precision to a LUR model for elemental components of PM in the urban city of Cincinnati, Ohio. PM smaller than 2.5 μm (PM2.5) and eleven elemental components were measured at 24 sampling stations from the Cincinnati Childhood Allergy and Air Pollution Study (CCAAPS). Over 50 different predictors associated with transportation, physical features, community socioeconomic characteristics, greenspace, land cover, and emission point sources were used to construct LUR and LURF models. Cross validation was used to quantify and compare model performance. LURF and LUR models were created for aluminum (Al), copper (Cu), iron (Fe), potassium (K), manganese (Mn), nickel (Ni), lead (Pb), sulfur (S), silicon (Si), vanadium (V), zinc (Zn), and total PM2.5 in the CCAAPS study area. LURF utilized a more diverse and greater number of predictors than LUR, and LURF models for Al, K, Mn, Pb, Si, Zn, TRAP, and PM2.5 all showed a decrease in fractional predictive error of at least 5% compared to their LUR models. LURF models for Al, Cu, Fe, K, Mn, Pb, Si, Zn, TRAP, and PM2.5 all had a cross validated fractional predictive error less than 30%. Furthermore, LUR models showed a differential exposure assessment bias and had a higher prediction error variance. Random forest and other machine learning methods may provide more accurate exposure assessment.
Directory of Open Access Journals (Sweden)
Narjes Khavasi
2017-12-01
Purpose: Despite numerous studies on the effects of complementary medicine, to our knowledge, there is no study on the effects of Capparis spinosa on disease regression in non-alcoholic fatty liver disease (NAFLD) patients. We compared the effects of caper fruit pickle consumption, as an Iranian traditional medicine product, on the anthropometric measures and biochemical parameters in different NAFLD patients. Methods: A 12-week randomized, controlled, double-blind trial was designed in 44 NAFLD patients randomly assigned to the control (n=22) or caper (n=22) group. The caper group received 40-50 g of caper fruit pickles with meals daily. Before and after treatment, we assessed anthropometric measures, grade of fatty liver, serum lipoproteins and liver enzymes. Results: Weight and BMI were significantly decreased in the caper (p<0.001 and p<0.001) and control groups (p=0.001 and p=0.001), respectively. Serum TG, TC and LDL-C were significantly decreased only in the control group (p=0.01, p<0.001 and p<0.001, respectively). Adjusted for baseline measures, the reductions in serum ALT and AST from baseline to the end of the study were significantly greater in the caper group than in the control group (p<0.001 and p=0.02, respectively). After 12 weeks, disease severity was significantly decreased in the caper group (p<0.001). Conclusion: Our results suggest that daily caper fruit pickle consumption for 12 weeks may be effective in improving the biochemical parameters in NAFLD patients. Larger controlled trials are needed to verify these results.
Mota, L F M; Martins, P G M A; Littiere, T O; Abreu, L R A; Silva, M A; Bonafé, C M
2017-08-14
The objective was to estimate (co)variance functions using random regression models (RRM) with Legendre polynomials, B-spline function and multi-trait models aimed at evaluating genetic parameters of growth traits in meat-type quail. A database containing the complete pedigree information of 7000 meat-type quail was utilized. The models included the fixed effects of contemporary group and generation. Direct additive genetic and permanent environmental effects, considered as random, were modeled using B-spline functions considering quadratic and cubic polynomials for each individual segment, and Legendre polynomials for age. Residual variances were grouped in four age classes. Direct additive genetic and permanent environmental effects were modeled using 2 to 4 segments and were modeled by Legendre polynomial with orders of fit ranging from 2 to 4. The model with quadratic B-spline adjustment, using four segments for direct additive genetic and permanent environmental effects, was the most appropriate and parsimonious to describe the covariance structure of the data. The RRM using Legendre polynomials presented an underestimation of the residual variance. Lesser heritability estimates were observed for multi-trait models in comparison with RRM for the evaluated ages. In general, the genetic correlations between measures of BW from hatching to 35 days of age decreased as the range between the evaluated ages increased. Genetic trend for BW was positive and significant along the selection generations. The genetic response to selection for BW in the evaluated ages presented greater values for RRM compared with multi-trait models. In summary, RRM using B-spline functions with four residual variance classes and segments were the best fit for genetic evaluation of growth traits in meat-type quail. In conclusion, RRM should be considered in genetic evaluation of breeding programs.
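The Legendre-polynomial covariates that random regression models of this kind regress on can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the age range of 0 to 35 days, and the choice of three coefficients are assumptions made for the example. It uses the normalization commonly applied in covariance-function work, phi_k(x) = sqrt((2k + 1) / 2) * P_k(x), with age rescaled to [-1, 1].

```python
import math


def standardize_age(t, t_min, t_max):
    """Map age t in [t_min, t_max] onto [-1, 1], the domain of Legendre polynomials."""
    return -1.0 + 2.0 * (t - t_min) / (t_max - t_min)


def legendre_covariates(t, t_min, t_max, n_coef):
    """Return the first n_coef normalized Legendre polynomials evaluated at age t."""
    x = standardize_age(t, t_min, t_max)
    # Unnormalized P_0, P_1, ... via Bonnet's recursion:
    # (k + 1) * P_{k+1}(x) = (2k + 1) * x * P_k(x) - k * P_{k-1}(x)
    p = [1.0, x]
    for k in range(1, n_coef - 1):
        p.append(((2 * k + 1) * x * p[k] - k * p[k - 1]) / (k + 1))
    # Normalize so that each polynomial has unit norm on [-1, 1]
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(n_coef)]


# Hypothetical ages (days); each row would be one record's covariates
for age in (0, 7, 21, 35):
    row = legendre_covariates(age, 0, 35, 3)
    print(age, [round(v, 4) for v in row])
```

Each animal's random regression coefficients multiply such a row, so the (co)variance between any two ages follows from the coefficient covariance matrix and the two rows.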
Brunetti, Natale Daniele; De Gennaro, Luisa; Correale, Michele; Santoro, Francesco; Caldarola, Pasquale; Gaglione, Antonio; Di Biase, Matteo
2017-04-01
A shorter time to treatment has been shown to be associated with lower mortality rates in acute myocardial infarction (AMI). Several strategies have been adopted with the aim to reduce any delay in diagnosis of AMI: pre-hospital triage with telemedicine is one of such strategies. We therefore aimed to measure the real effect of pre-hospital triage with telemedicine in case of AMI in a meta-analysis study. We performed a meta-analysis of non-randomized studies with the aim to quantify the exact reduction of time to treatment achieved by pre-hospital triage with telemedicine. Data were pooled and compared by relative time reduction and 95% C.I.s. A meta-regression analysis was performed in order to find possible predictors of shorter time to treatment. Eleven studies were selected and finally evaluated in the study. The overall relative reduction of time to treatment with pre-hospital triage and telemedicine was -38/-40% (ptriage with telemedicine is associated with a near halved time to treatment in AMI. The benefit is larger in terms of absolute time to treatment reduction in populations with larger delays to treatment. Copyright © 2017 Elsevier B.V. All rights reserved.
Jamrozik, J; Schaeffer, L R
2012-02-01
Test-day (TD) records of milk, fat-to-protein ratio (F:P) and somatic cell score (SCS) of first-lactation Canadian Holstein cows were analysed by a three-trait finite mixture random regression model, with the purpose of revealing hidden structures in the data owing to putative, sub-clinical mastitis. Different distributions of the data were allowed in 30 intervals of days in milk (DIM), covering the lactation from 5 to 305 days. Bayesian analysis with Gibbs sampling was used for model inferences. Estimated proportion of TD records originated from cows infected with mastitis was 0.66 in DIM from 5 to 15 and averaged 0.2 in the remaining part of lactation. Data from healthy and mastitic cows exhibited markedly different distributions, with respect to both average value and the variance, across all parts of lactation. Heterogeneity of distributions for infected cows was also apparent in different DIM intervals. Cows with mastitis were characterized by smaller milk yield (down to -5 kg) and larger F:P (up to 0.13) and SCS (up to 1.3) compared with healthy contemporaries. Differences in averages between healthy and infected cows for F:P were the most profound at the beginning of lactation, when a dairy cow suffers the strongest energy deficit and is therefore more prone to mammary infection. Residual variances for data from infected cows were substantially larger than for the other mixture components. Fat-to-protein ratio had a significant genetic component, with estimates of heritability that were larger or comparable with milk yield, and was not strongly correlated with milk and SCS on both genetic and environmental scales. Daily milk, F:P and SCS are easily available from milk-recording data for most breeding schemes in dairy cattle. Fat-to-protein ratio can potentially be a valuable addition to SCS and milk yield as an indicator trait for selection against mastitis. © 2011 Blackwell Verlag GmbH.
Negussie, E; Strandén, I; Mäntysaari, E A
2013-02-01
Interest is growing in finding indicator traits for the evaluation of nutritional or tissue energy status of animals directly at the individual animal level. The development and subsequent use of such traits in practice demands a clear understanding of the genetic and phenotypic associations with the various production and functional traits. In this study, the relationships during lactation between milk fat:protein ratio (FPR) and production and functional traits were estimated for Nordic Red cattle, in which published information is scarce. The objectives of this study were to estimate genetic associations of FPR with milk yield (MY), fertility, and udder health traits during different stages of lactation. Traits included in the analyses were MY, 4 fertility traits-days from calving to insemination (DFI), days open (DO), number of inseminations (NI), and nonreturn rate to 56 d (NRR)-and 2 udder health traits-test-day somatic cell score (SCS) and clinical mastitis (CM). Data were from a total of 22,422 first-lactation cows. Random regression models were used to estimate genetic parameters and associations between traits. The mean FPR in first-lactation cows was 1.28 and ranged from 1.25 to 1.45. During first lactation, the heritability of FPR ranged from 0.14 to 0.25. Genetic correlations between FPR and MY in early lactation (until 50 d in milk) were positive and ranged from 0.05 to 0.22; later in lactation, they were close to zero or negative, indicating that cows may have come out of the negative state of energy balance. The strength of genetic associations between FPR and fertility traits varied during lactation. In early lactation, correlations between FPR and the interval fertility traits DFI and DO were positive and ranged from 0.14 to 0.28. Genetic correlations between FPR and the udder health traits SCS and CM in early lactation ranged from 0.09 to 0.20. Milk fat:protein ratio is a heritable trait and easily available from routine milk-recording schemes
Sayyah-Melli, M; Mobasseri, M; Gharabaghi, P M; Ouladsahebmadarek, E; Rahmani, V
2017-03-01
To evaluate the effect of letrozole in combination with cabergoline and letrozole alone on regression of symptomatic uterine myomas in women of reproductive age. Randomized controlled clinical trial. University hospital. Ninety-one women of reproductive age were enrolled in the study and 88 women were eligible. Eight participants were excluded from the study. Eighty women of reproductive age with symptomatic myomas >4cm were evaluated in two groups. Participants in Group 1 received 2.5mg letrozole once daily and cabergoline 0.5mg/week from the first day of the menstrual cycle for 12 weeks, and participants in Group 2 received letrozole alone. Changes in uterine size and volume; myoma size, volume and number; and side effects of treatment. Overall, 76 patients completed the study. Compared with baseline values, mean uterine volume was reduced significantly in both groups (p=0.01), and there was no significant difference between groups (p=0.99). The mean number of dominant myomas was reduced significantly in both groups (p=0.03), with no significant difference between groups (p=0.6). The mean volume of myomas was reduced significantly in both groups (p=0.01), with no significant difference between groups (p=0.45). Although a significant decrease in number and volume of myomas was documented in each group (pcabergoline group (nine vs two cases, p=0.02), but the two groups were comparable for the remaining minor side effects. This study showed that 12 weeks of treatment with letrozole with and without cabergoline improved the size and volume of the uterus and myomas, led to symptom improvement, and could be used for short-term treatment prior to surgery or fertility programmes. Condensation letrozole in combination with cabergoline in the management of uterine fibroids. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Directory of Open Access Journals (Sweden)
Igor K. Kochanenko
2013-01-01
Procedures for constructing a regression curve by the criterion of least fractals, i.e. the greatest probability of the sums of powers of the smallest deviations of the measured intensity from its modelled values, are substantiated. The exponent is defined as the fractal dimension of the time series. The difference between the results of the proposed method and those of the method of least squares is estimated quantitatively.
Effects of the DGAT1 polymorphism on test-day milk production traits throughout lactation.
Bovenhuis, H; Visker, M H P W; van Valenberg, H J F; Buitenhuis, A J; van Arendonk, J A M
2015-09-01
Several studies have shown that the diacylglycerol O-acyltransferase 1 (DGAT1) K232A polymorphism has a major effect on milk production traits. It is less clear how effects of DGAT1 on milk production traits change throughout lactation, if dominance effects of DGAT1 are relevant, and whether DGAT1 also affects lactose content, lactose yield, and total energy output in milk. Results from this study, using test-day records of 3 subsequent parities of around 1,800 cows, confirm previously reported effects of the DGAT1 polymorphism on milk, fat, and protein yield, as well as fat and protein content. In addition, we found significant effects of the DGAT1 polymorphism on lactose content and lactose yield. No significant effects on somatic cell score were detected. The effect of DGAT1 on total energy excreted in milk was only significant in parity 1 and is mainly due to a higher energy output in milk of heterozygous cows. Significant but relatively small dominance effects of DGAT1 on fat content and yield were detected, which are of little practical relevance. Significant DGAT1 by lactation stage interaction was detected for milk yield, lactose yield, fat content, and protein content, indicating that the effect of the DGAT1 polymorphism changes during lactation. In general, the DGAT1 effect shows a large increase during early lactation (from the start of lactation to d 50 to 150) and tends to decrease later in lactation. No DGAT1 by lactation stage interaction for fat yield was observed. Similar to DGAT1, effects of other genes also might vary throughout lactation and, therefore, using longitudinal models is recommended. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Esposito, Carlo; Barra, Anna; Evans, Stephen G.; Scarascia Mugnozza, Gabriele; Delaney, Keith
2014-05-01
The study of landslide susceptibility by multivariate statistical methods is based on finding a quantitative relationship between controlling factors and landslide occurrence. Such studies have become popular in the last few decades thanks to the development of geographic information systems (GIS) software and the related improved data management. In this work we applied a statistical approach to an area of high landslide susceptibility mainly due to its tropical climate and geological-geomorphological setting. The study area is located in the south-east region of Brazil that has frequently been affected by flood and landslide hazard, especially because of heavy rainfall events during the summer season. In this work we studied a disastrous event that occurred on January 11th and 12th of 2011, which involved Região Serrana (the mountainous region of Rio de Janeiro State) and caused more than 5000 landslides and at least 904 deaths. In order to produce susceptibility maps, we focused our attention on an area of 93,6 km2 that includes Nova Friburgo city. We utilized two different multivariate statistic methods: Logistic Regression (LR), already widely used in applied geosciences, and Random Forest (RF), which has only recently been applied to landslide susceptibility analysis. With reference to each mapping unit, the first method (LR) results in a probability of landslide occurrence, while the second one (RF) gives a prediction in terms of % of area susceptible to slope failure. With this aim in mind, a landslide inventory map (related to the studied event) has been drawn up through analyses of high-resolution GeoEye satellite images, in a GIS environment. Data layers of 11 causative factors have been created and processed in order to be used as continuous numerical or discrete categorical variables in statistical analysis. In particular, the logistic regression method has frequent difficulties in managing numerical continuous and discrete categorical variables
Posavac, Steven S.; Posavac, Emil J.
2017-01-01
The authors describe the Pennies for Milk exercise, a participative classroom experience in which students generate a regression to the mean effect within the context of simulated household milk purchases. Regression to the mean is a ubiquitous threat for marketing researchers and managers but is often hard for students to understand. The Pennies…
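The regression to the mean effect can be reproduced with a short simulation. The sketch below is not the Pennies for Milk exercise itself; the function name, the normal distributions, and all numeric settings are assumptions chosen for illustration. Each simulated household has a stable underlying level plus independent noise in two periods, and the households that scored highest in the first period are re-examined in the second.

```python
import random
import statistics


def simulate_regression_to_mean(n=10_000, top_fraction=0.1, seed=1):
    """Return the period-1 and period-2 means of the households ranked highest in period 1."""
    rng = random.Random(seed)
    # Stable underlying level per household (hypothetical units)
    true_level = [rng.gauss(100.0, 10.0) for _ in range(n)]
    # Two noisy observations of the same underlying level
    period1 = [mu + rng.gauss(0.0, 10.0) for mu in true_level]
    period2 = [mu + rng.gauss(0.0, 10.0) for mu in true_level]
    # Select the top fraction on the period-1 observation only
    k = int(n * top_fraction)
    top = sorted(range(n), key=lambda i: period1[i], reverse=True)[:k]
    mean1 = statistics.fmean(period1[i] for i in top)
    mean2 = statistics.fmean(period2[i] for i in top)
    return mean1, mean2


first, second = simulate_regression_to_mean()
print(f"top group, period 1: {first:.1f}")
print(f"same group, period 2: {second:.1f}")
```

No intervention occurs between the two periods, yet the selected group's mean moves back toward the population mean, because part of its extreme first-period score was noise.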
Use of test day milk fat and milk protein to detect subclinical ketosis in dairy cattle in Ontario.
Duffield, T F; Kelton, D F; Leslie, K E; Lissemore, K D; Lumsden, J H
1997-01-01
Serum beta-hydroxybutyrate (BHB) levels were determined for 1333 dairy cows in various stages of lactation and parity on 93 dairy farms in Ontario. The data were collected in a cross-sectional manner, as part of the 1992 Ontario Dairy Monitoring and Analysis Program. The median serum BHB was 536 μmol/L for all cows, with a range of 0 to 5801 μmol/L. When subclinical ketosis was defined as a serum BHB level of 1200 μmol/L or higher, the prevalence of ketosis for cows in early lactation ( 149 DIM), and dry cows were 5.3%, 3.2%, and 1.6%, respectively. The mean serum BHB was significantly higher in the early group compared with each of the other 3 groups (P ketosis. However, test-day fat percent and test-day protein percent, used alone or in combination, were not useful screening tests for identifying cows with subclinical ketosis. PMID:9360791
DEFF Research Database (Denmark)
Petersen, Mette Bisgaard; Tolver, Anders; Husted, Louise
2016-01-01
-off value of 7 mmol/L had a sensitivity of 0.66 and a specificity of 0.92 in predicting survival. In independent test data, the sensitivity was 0.69 and the specificity was 0.76. At the observed survival rate (38%), the optimal decision tree identified horses as non-survivors when the Lac at admission...... admitted with acute colitis (trees, as well as random...
Duda, David P.; Minnis, Patrick
2009-01-01
Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
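The two accuracy measures named in this abstract can be computed from a 2x2 contingency table once a probability threshold turns each forecast into yes/no. The sketch below uses the standard definitions (PC is the fraction of correct forecasts; HKD is the hit rate minus the false alarm rate). The function names and the eight probability/observation pairs are invented for illustration and are not data from the study.

```python
def contingency_table(probabilities, observed, threshold):
    """Count hits, misses, false alarms and correct negatives at a given threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for p, obs in zip(probabilities, observed):
        forecast_yes = p >= threshold
        if forecast_yes and obs:
            hits += 1
        elif not forecast_yes and obs:
            misses += 1
        elif forecast_yes and not obs:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def percent_correct(hits, misses, false_alarms, correct_negatives):
    """Fraction of all forecasts that were correct."""
    total = hits + misses + false_alarms + correct_negatives
    return (hits + correct_negatives) / total


def hanssen_kuipers(hits, misses, false_alarms, correct_negatives):
    """Hit rate minus false alarm rate."""
    return hits / (hits + misses) - false_alarms / (false_alarms + correct_negatives)


# Hypothetical model probabilities and observed occurrence (1 = contrail seen)
probabilities = [0.9, 0.6, 0.4, 0.45, 0.1, 0.2, 0.05, 0.15]
observed = [1, 1, 1, 0, 0, 0, 0, 0]
climatology = sum(observed) / len(observed)  # observed frequency of occurrence

for threshold in (0.5, climatology):
    table = contingency_table(probabilities, observed, threshold)
    print(threshold, table, percent_correct(*table), round(hanssen_kuipers(*table), 3))
```

Lowering the threshold from 0.5 to the climatological frequency trades misses for false alarms, which is why the two scores can rank the thresholds differently.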
Hohenemser, K. H.; Crews, S. T.
1972-01-01
A two bladed 16-inch hingeless rotor model was built and tested outside and inside a 24 by 24 inch wind tunnel test section at collective pitch settings up to 5 deg and rotor advance ratios up to .4. The rotor model has a simple eccentric mechanism to provide progressing or regressing cyclic pitch excitation. The flapping responses were compared to analytically determined responses which included flap-bending elasticity but excluded rotor wake effects. Substantial systematic deviations of the measured responses from the computed responses were found, which were interpreted as the effects of interaction of the blades with a rotating asymmetrical wake.
García de la Torre, Nuria; Durán, Alejandra; Del Valle, Laura; Fuentes, Manuel; Barca, Idoya; Martín, Patricia; Montañez, Carmen; Perez-Ferre, Natalia; Abad, Rosario; Sanz, Fuencisla; Galindo, Mercedes; Rubio, Miguel A; Calle-Pascual, Alfonso L
2013-08-01
The aims are to define the regression rate in newly diagnosed type 2 diabetes after lifestyle intervention and pharmacological therapy based on a SMBG (self-monitoring of blood glucose) strategy in routine practice as compared to standard HbA1c-based treatment and to assess whether a supervised exercise program has additional effects. The St Carlos study is a 3-year, prospective, randomized, clinic-based, interventional study with three parallel groups. One hundred and ninety-five patients were randomized to the SMBG intervention group [I group; n = 130; Ia: SMBG (n = 65) and Ib: SMBG + supervised exercise (n = 65)] and to the HbA1c control group (C group) (n = 65). The primary outcome was to estimate the regression rate of type 2 diabetes (HbA1c 4 kg was 3.6 (1.8-7); p < 0.001. This study shows that the use of SMBG in an educational program effectively increases the regression rate in newly diagnosed type 2 diabetic patients after 3 years of follow-up. These data suggest that SMBG-based programs should be extended to primary care settings where diabetic patients are usually attended.
Directory of Open Access Journals (Sweden)
Severino Cavalcante de Sousa Júnior
2010-05-01
A total of 35,732 weight records from birth to 660 days of age of 8,458 Tabapuã animals were used to estimate covariance functions using random regression models on Legendre polynomials. The models included: as random effects, the direct additive genetic, maternal genetic, animal permanent environmental and maternal permanent environmental effects; as fixed effects, contemporary group; and as covariates, age of the animal at weighing and age of the cow at calving (linear and quadratic). An orthogonal Legendre polynomial on age at weighing (cubic regression) was used to model the mean curve of the population. The residual was modeled with seven variance classes, and the models were compared by the Schwarz Bayesian and Akaike information criteria. The best model had orders 4, 3, 6 and 3 for the direct additive genetic, maternal genetic, animal permanent environmental and maternal permanent environmental effects, respectively. Covariance and heritability estimates obtained with the two-trait model and with the random regression model were similar. Heritability estimates for the direct additive genetic effect, obtained with the random regression model, increased from birth (0.15) to 660 days of age (0.45). Higher maternal heritability estimates were obtained for weights measured soon after birth. Genetic correlations ranged from moderate to high and decreased as the distance between weighings increased. Selection for higher weights at any age promotes greater weight gain from birth to 660 days of age.
Hohenemser, K. H.; Crews, S. T.
1973-01-01
The experiments with progressing/regressing forced rotor flapping modes have been extended in several directions and the data processing method has been considerably refined. The 16 inch hingeless 2-bladed rotor model was equipped with a new set of high precision blades which removed previously encountered tracking difficulties at high advance ratio, so that tests up to .8 rotor advance ratio could be conducted. In addition to data with 1.20 blade natural flapping frequency data at 1.10 flapping frequency were obtained. Outside the wind tunnel, tests with a ground plate located at different distances below the rotor were conducted while recording the dynamic downflow at a station .2R below the rotor plane with a hot wire anemometer.
Pan, Shin-Liang; Chen, Hsiu-Hsi
2010-09-01
The rates of functional recovery after stroke tend to decrease with time. Time-varying Markov processes (TVMP) may be more biologically plausible than time-invariant Markov process for modeling such data. However, analysis of such stochastic processes, particularly tackling reversible transitions and the incorporation of random effects into models, can be analytically intractable. We make use of ordinary differential equations to solve continuous-time TVMP with reversible transitions. The proportional hazard form was used to assess the effects of an individual's covariates on multi-state transitions with the incorporation of random effects that capture the residual variation after being explained by measured covariates under the concept of generalized linear model. We further built up Bayesian directed acyclic graphic model to obtain full joint posterior distribution. Markov chain Monte Carlo (MCMC) with Gibbs sampling was applied to estimate parameters based on posterior marginal distributions with multiple integrands. The proposed method was illustrated with empirical data from a study on the functional recovery after stroke. Copyright 2010 Elsevier Inc. All rights reserved.
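The idea of solving a continuous-time, time-varying Markov process with reversible transitions through ordinary differential equations can be illustrated with a deliberately reduced two-state case. This is a sketch under stated assumptions, not the authors' model: it has no covariates or random effects, the state labels, rate functions and numeric values are invented, and a classical fourth-order Runge-Kutta integrator is used in place of whatever solver the study employed. With p(t) the probability of being in the recovered state, the forward equation is dp/dt = (1 - p) * q01(t) - p * q10(t).

```python
import math


def recovery_probability(t_end, q01, q10, p0=0.0, step=0.01):
    """Integrate dp/dt = (1 - p) * q01(t) - p * q10(t) from 0 to t_end with RK4.

    q01(t): rate of moving from 'not recovered' to 'recovered' at time t
    q10(t): rate of the reverse transition at time t
    """
    def dp_dt(t, p):
        return (1.0 - p) * q01(t) - p * q10(t)

    n_steps = max(1, int(round(t_end / step)))
    h = t_end / n_steps
    t, p = 0.0, p0
    for _ in range(n_steps):
        k1 = dp_dt(t, p)
        k2 = dp_dt(t + h / 2, p + h * k1 / 2)
        k3 = dp_dt(t + h / 2, p + h * k2 / 2)
        k4 = dp_dt(t + h, p + h * k3)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return p


# Hypothetical rates: recovery slows down over time, relapse rate is constant
def decaying_recovery(t):
    return 0.5 * math.exp(-0.3 * t)


def constant_relapse(t):
    return 0.05


for months in (1, 3, 6, 12):
    print(months, round(recovery_probability(months, decaying_recovery, constant_relapse), 4))
```

With constant rates a and c the same equation has the closed form p(t) = a / (a + c) * (1 - exp(-(a + c) * t)), which gives a convenient check on the integrator.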
Demirel, Aynur; Yorubulut, Mehmet; Ergun, Nevin
2017-09-22
The aim of the study was to determine whether Non-invasive Spinal Decompression Therapy (NSDT) was effective in resorption of herniation and in increasing disc height in patients with lumbar disc herniation (LHNP). A total of twenty patients diagnosed with LHNP and suffering from pain for at least 8 weeks were enrolled in the study. Patients were randomly allocated to the study group (SG) or the control group (CG). Both groups received a combination of electrotherapy, deep friction massage and stabilization exercise for fifteen sessions. In addition, the SG received NSDT. The Numeric Analog Scale, straight leg raise test and Oswestry Disability Index (ODI) were applied at baseline and after treatment. Disc height and herniation thickness were measured on magnetic resonance imaging performed at baseline and three months after therapy. Both treatments had a positive effect on pain, functional restoration and reduction in herniation thickness. Although the reduction in herniation size was greater in the SG than in the CG, no significant differences were found between the groups (p>0.05). This study showed that patients with LHNP who received physiotherapy improved on the basis of clinical and radiologic evidence. NSDT can be used as an adjunct to other physiotherapy methods in the treatment of lumbar disc herniation.
Directory of Open Access Journals (Sweden)
Kassiana Adriano Pinto de Oliveira
2010-05-01
A data set of 19,303 weight records of Santa Inês sheep was evaluated in order to compare polynomial functions of different orders for fitting the fixed and random regressions of the growth trajectory and to estimate (co)variance components and genetic parameters of this trajectory. The fixed effects used in the analysis were contemporary group, sex and birth type. Ordinary and Legendre polynomials of orders two to four were evaluated for the fixed regression of the average growth trajectory. Legendre and quadratic B-spline functions of orders three to four were evaluated for the random regressions. Legendre polynomials of fourth order were suitable for fitting the random regressions, while ordinary polynomials of third order were the best for the fixed trajectory. Direct heritabilities on days 1, 50, 150, 250 and 411 were 0.24, 0.12, 0.44, 0.84, and 0.96, respectively, while maternal heritabilities for the same ages were 0.24, 0.19, 0.09, 0.02, and 0.01, respectively. Genetic correlations among weights at consecutive ages were high, tending to unity, and there were negative correlations between weights at early ages and weights at late ages. Given the observed genetic variability, it is possible to modify the growth trajectory by selection. Genetic control of weights at early ages is not the same as at late ages, so selection of animals for slaughter at an early age must differ from that of replacement animals.
Differentiating regressed melanoma from regressed lichenoid keratosis.
Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A
2017-04-01
Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Better Autologistic Regression
Directory of Open Access Journals (Sweden)
Mark A. Wolters
2017-11-01
Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine) to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding—the two numbers used to represent the two possible states of the variables—might differ. Common coding choices are (zero, one) and (minus one, plus one). Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
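A two-variable toy case makes the effect of coding concrete. The sketch below assumes an unnormalized probability of the form exp(alpha_1 * z_1 + alpha_2 * z_2 + lambda * z_1 * z_2); that parametrization, the function names and the parameter values are illustrative assumptions rather than the article's notation. It only shows that identical parameter values yield different distributions under the two codings, not the stronger non-nestedness result the abstract reports.

```python
import math
from itertools import product


def joint_pmf(coding, alpha, lam):
    """Enumerate the joint distribution of two binary variables under a given coding.

    coding: (low, high) numeric values used for the two states, e.g. (0, 1) or (-1, 1)
    alpha:  pair of unary parameters
    lam:    pairwise association parameter
    """
    low, high = coding
    weights = {}
    for z1, z2 in product((low, high), repeat=2):
        weights[(z1, z2)] = math.exp(alpha[0] * z1 + alpha[1] * z2 + lam * z1 * z2)
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


def prob_first_high(coding, alpha, lam):
    """Marginal probability that the first variable is in its 'high' state."""
    high = coding[1]
    pmf = joint_pmf(coding, alpha, lam)
    return sum(p for (z1, _), p in pmf.items() if z1 == high)


alpha, lam = (0.0, 0.0), 1.0
print("zero/one coding:  ", round(prob_first_high((0, 1), alpha, lam), 4))
print("plus/minus coding:", round(prob_first_high((-1, 1), alpha, lam), 4))
```

With zero unary parameters the plus/minus coding treats the two states symmetrically, while the zero/one coding lets the positive association term favor the high state.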
Njubi, D M; Wakhungu, J W; Badamana, M S
2010-04-01
The study is focused on the capability of artificial neural networks (ANNs) to predict next month and first lactation 305-day milk yields (FLMY305) of Kenyan Holstein-Friesian (KHF) dairy cows based on a few available test-day (TD) records in early lactation. The developed model was compared with multiple linear regression (MLR). A total of 39,034 first parity TD records of KHF dairy cows collected over 102 herds were analyzed. Different ANNs were modeled and the best performing number of hidden layers and neurons and training algorithms retained. The best ANN model had one hidden layer of logistic transfer function for all models, but hidden nodes varied from 2 to 7. The R² values for ANNs for training, validation, and test data were consistently high, showing that the models captured the features accurately. The R², r, and root mean square were consistently superior for ANN compared with MLR but not significantly different (p > 0.05). The prediction equation with four variables, i.e., first, second, third, and fourth TD milk yield, gave adequate accuracy (79.0%) in estimating the FLMY305 from TD yield. It emerges from this study that the ANN model can be an alternative for prediction of FLMY305 and monthly TD in KHF.
Regression analysis by example
National Research Council Canada - National Science Library
Chatterjee, Samprit; Hadi, Ali S
2012-01-01
.... The emphasis continues to be on exploratory data analysis rather than statistical theory. The coverage offers in-depth treatment of regression diagnostics, transformation, multicollinearity, logistic regression, and robust regression...
Directory of Open Access Journals (Sweden)
Cláudio Manoel Rodrigues de Melo
2005-06-01
Covariance components for test-day records and lactation milk yield were estimated by REML under animal models, using 263,390 records of 32,448 first-lactation Holstein cows. Besides the lactation model, two alternative repeatability models (RM) were analyzed. The lactation model included fixed effects of herd-year-season and age of cow with linear and quadratic terms, and random effects of animal and error. The first model for test-day yield (RMF) included the same effects, except that the fixed effect of contemporary group was defined as herd-year-month of test. The second model for test-day yield (RMS) used a logarithmic polynomial sub-model for the shape of the lactation curve. Heritability for lactation yield (0.27) was smaller than those estimated by RMF and RMS, 0.30 and 0.43, respectively. Heritability estimates for univariate (0.22-0.36) and bivariate models (0.23-0.33) for test-day milk yields were found to be smallest during early and late lactation. The heritability estimate for lactation milk yield from the univariate model (0.27) was smaller than the estimates obtained by bivariate models (0.27-0.30). Genetic correlations were higher between consecutive test days than between test days at the beginning and end of lactation. Larger heritability estimates for test-day models and large genetic correlations between test-day and lactation yield (0.86-0.99) indicate a potential use of test-day records in genetic evaluations of Holstein cattle in Brazil. Although predominantly high, the estimates of genetic correlation among test-day yields were not homogeneous (0.64-1.0); however, the highest frequencies were for values close or equal to 1. Random regression models should therefore also be evaluated before concluding on the best use of test-day records of the Holstein breed in Brazil.
Directory of Open Access Journals (Sweden)
Luciele Cristina Pelicioni
2009-01-01
permanent environmental effects, using random regression models. The random effects included were modeled by regression on Legendre Polynomials with orders ranging from linear to quartic. The models were compared through the likelihood ratio test, Akaike's information criterion and the Schwarz's Bayesian information criterion. The model with 18 heterogeneous classes was the one that best fitted the residual variances, according to the statistical tests; however, the model with variance function of 5th order also showed to be appropriate. The direct heritability estimates were higher than those found in literature, ranging from 0.04 to 0.53, showing similar trends when compared to those estimated using univariate models. Selection on body weight in any age should improve the body weight in all ages in the studied interval.
Directory of Open Access Journals (Sweden)
P.R.C. Nobre
2009-08-01
Expected progeny differences (EPD) of Nellore cattle estimated by a random regression model (RRM) and a multiple trait model (MTM) were compared. Genetic evaluation data included 3,819,895 records of up to nine sequential weights of 963,227 animals measured at ages ranging from one day (birth weight) to 733 days. Traits considered were weights at birth, ten to 110-day old, 102 to 202-day old, 193 to 293-day old, 283 to 383-day old, 376 to 476-day old, 551 to 651-day old, and 633 to 733-day old. Seven data samples were created. Two of them were chosen because their parameter estimates were biologically more plausible: one with 84,426 records and another with 72,040. Records preadjusted to a fixed age were analyzed by an MTM, which included the effects of contemporary group, age of dam class, additive direct, additive maternal, and maternal permanent environment. Analyses were carried out by REML, with five traits at a time. The RRM included the effects of age of animal, contemporary group, age of dam class, additive direct, permanent environment, additive maternal, and maternal permanent environment. Legendre polynomials of different degrees were used to describe random effects. MTM estimated covariance components and genetic parameters for weight at birth and sequential weights, and RRM for all ages. Because the correlations among the EPD estimates from MTM and all the tested RRM were not equal to 1.0, it is not possible to recommend RRM for genetic evaluation of large data sets.
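Random regression models of this kind regress each random effect on Legendre polynomials of standardized age. The following is a minimal sketch of how such covariables can be built, assuming the usual mapping of age onto [-1, 1] and the normalization sqrt((2k+1)/2) that is common in the animal breeding literature; the abstract does not state which normalization was used, and the function names are hypothetical.

```python
import math

def standardize_age(age, age_min, age_max):
    """Map an age in [age_min, age_max] linearly onto [-1, 1]."""
    return 2.0 * (age - age_min) / (age_max - age_min) - 1.0

def legendre_covariables(age, age_min, age_max, degree):
    """Normalized Legendre polynomials of degree 0..degree at one age.

    Uses the three-term recurrence
    (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x).
    """
    x = standardize_age(age, age_min, age_max)
    p = [1.0, x]
    for k in range(1, degree):
        p.append(((2 * k + 1) * x * p[k] - k * p[k - 1]) / (k + 1))
    p = p[: degree + 1]
    # Assumed normalization so the polynomials are orthonormal on [-1, 1].
    return [math.sqrt((2 * k + 1) / 2.0) * pk for k, pk in enumerate(p)]

# Ages from 1 to 733 days, as in the abstract; cubic polynomial.
print(legendre_covariables(367, 1, 733, degree=3))
```

Each animal's additive genetic (or permanent environmental) deviation at a given age is then the dot product of these covariables with that animal's vector of random regression coefficients.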
Directory of Open Access Journals (Sweden)
L.S. Freitas
2010-04-01
The heritability and the genetic and permanent environmental correlations among six measures of lactation persistency in Guzerat cows were estimated using a random regression model. A total of 8,403 test-day milk yield records from 1,034 first-lactation cows were evaluated (the Portuguese abstract reports 8,276 records from 1,021 cows). The random regression was fitted with the logarithmic function of Ali and Schaeffer and with Legendre polynomials to obtain coefficients for fixed, additive genetic and permanent environmental effects. The Ali and Schaeffer function fitted the data best, but it had convergence problems. The results showed that persistency is a trait with moderate heritability and low correlation with the breeding value for 305-day milk yield, indicating that it is possible to select for a change in the shape of the lactation curve without negatively affecting total lactation milk yield. The persistency measure computed as the difference in milk yield between the intermediate and initial phases of lactation showed a high correlation with 305-day yield.
Directory of Open Access Journals (Sweden)
Lenira El Faro
2005-04-01
Breeding values were predicted for test-day milk yields of Caracu cows using a standard univariate model (TDMO) and a random regression model (RRM). Animals were ranked on breeding values for accumulated 305-day milk yield (PTA305), as is traditionally done in dairy cattle. Besides breeding values for individual test-day yields, the RRM provided predictions of breeding values for yield up to 305 days, obtained by summing the daily breeding values (RRM305). Spearman rank correlations were estimated among the breeding values predicted by the different methods, and the coincidence of sires ranked in the top decile (10% best sires) was observed. The genetic lactation curves of the five best sires were compared. Rank correlations among breeding values for test-day milk yields predicted by TDMO were lower than 0.80. For RRM the rank correlations were higher, ranging from 0.41 to 1.00. Between PTA305 and RRM305 the rank correlation was 0.87 for sires. Although high, this correlation does not ensure coincidence of ranking for sires. The best sires for PTA305 also did not have the best genetic lactation curves estimated by RRM. The RRM appears to be more suitable than the TDMO for replacing accumulated 305-day yield in genetic evaluation programs.
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating eigenvalues and eigenvectors. We give a number of different applications to regression and time series analysis, and show how the reduced rank regression estimator can be derived as a Gaussian maximum likelihood estimator. We briefly mention asymptotic results...
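To make the eigenvector step concrete, here is a minimal pure-Python sketch of a rank-one reduced rank regression. It is a simplification and not the estimator described in the paper: it uses identity weighting (project the least-squares coefficients onto the leading eigenvector of the fitted-value cross-product matrix), whereas the Gaussian maximum likelihood version weights by the error covariance and leads to canonical correlations. All function names and the toy data are hypothetical.

```python
import math

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(a[i]) + list(b[i]) for i in range(n)]
    for c in range(n):
        pivot_row = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot_row] = m[pivot_row], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c:
                factor = m[r][c]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

def top_eigenvector(s, iterations=500):
    """Leading eigenvector of a symmetric PSD matrix by power iteration."""
    v = [1.0] * len(s)
    for _ in range(iterations):
        w = [sum(sij * vj for sij, vj in zip(row, v)) for row in s]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def rank_one_rrr(x, y):
    """Identity-weighted rank-one coefficient matrix for Y ~ X B."""
    xt = transpose(x)
    b_ols = solve(matmul(xt, x), matmul(xt, y))      # ordinary least squares
    fitted = matmul(x, b_ols)
    v = top_eigenvector(matmul(transpose(fitted), fitted))
    b_v = [sum(bij * vj for bij, vj in zip(row, v)) for row in b_ols]
    return [[bi * vj for vj in v] for bi in b_v]     # B_ols v v'

# Toy data whose true coefficient matrix [[3, -1], [6, -2]] has rank one.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [[3.0, -1.0], [6.0, -2.0], [9.0, -3.0], [12.0, -4.0]]
print(rank_one_rrr(x, y))
```

Power iteration keeps the sketch dependency-free; with a linear algebra library the same step would normally be a singular value or eigenvalue decomposition.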
Maroco, João; Silva, Dina; Rodrigues, Ana; Guerreiro, Manuela; Santana, Isabel; de Mendonça, Alexandre
2011-08-17
Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Press' Q test showed that all classifiers performed better than chance alone (p classification accuracy (Median (Me) = 0.76) an area under the ROC (Me = 0.90). However this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73) specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72) specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most
Directory of Open Access Journals (Sweden)
Asres Berhan
The development of tipranavir and darunavir, second generation non-peptidic HIV protease inhibitors with markedly improved resistance profiles, has opened a new perspective on the treatment of antiretroviral therapy (ART) experienced HIV patients with poor viral load control. The aim of this study was to determine the virologic response in ART experienced patients to tipranavir-ritonavir and darunavir-ritonavir based regimens. A computer based literature search was conducted in the databases of HINARI (Health InterNetwork Access to Research Initiative), Medline and Cochrane library. Meta-analysis was performed by including randomized controlled studies that were conducted in ART experienced patients with plasma viral load above 1,000 copies HIV RNA/ml. The odds ratios and 95% confidence intervals (CI) for viral loads of <50 copies and <400 copies HIV RNA/ml at the end of the intervention were determined by the random effects model. Meta-regression, sensitivity analysis and funnel plots were done. The number of HIV-1 patients who were on either a tipranavir-ritonavir or darunavir-ritonavir based regimen and achieved viral load less than 50 copies HIV RNA/ml was significantly higher (overall OR = 3.4; 95% CI, 2.61-4.52) than the number of HIV-1 patients who were on investigator selected boosted comparator HIV-1 protease inhibitors (CPIs-ritonavir). Similarly, the number of patients with viral load less than 400 copies HIV RNA/ml was significantly higher in either the tipranavir-ritonavir or darunavir-ritonavir based regimen treated group (overall OR = 3.0; 95% CI, 2.15-4.11). Meta-regression showed that the viral load reduction was independent of baseline viral load, baseline CD4 count and duration of tipranavir-ritonavir or darunavir-ritonavir based regimen. Tipranavir and darunavir based regimens were more effective in patients who were ART experienced and had poor viral load control. Further studies are required to determine their consistent
Storli, K S; Klemetsdal, G; Volden, H; Salte, R
2017-09-01
Today's Norwegian Red (NR) is markedly different from the one that existed 25 yr ago due to the continuous genetic improvement of economically important traits. Still, current national recommendations on replacement heifer rearing largely are based on results from Danish studies from the late 1980s to the mid 1990s. The objectives of the present study were to gain information on (1) growth and growth profiles of modern NR replacement heifers in commercial dairy herds and (2) how growth during the rearing period affects the heifers' milk yield during their first lactation. To this end, we conducted a field study on 5 high-producing and 5 low-producing commercial dairy farms from each of 3 geographical regions in Norway. On these 30 farms, we combined repeated onsite registrations of growth on all available females from newborn to calving with registrations deriving from the Norwegian Dairy Herd Recording System. Each herd was visited 6 to 8 times over a period of 2 yr. At each visit, heart girth circumference on all available young females was measured. Registrations were made on a total of 3,110 heifers. After imposing restrictions on the data, growth parameters were estimated based on information from 536 animals, whereas 350 of these animals had the required information needed to estimate the relationship between growth and test-day milk yield. Our findings pointed toward an optimal ADG of 830 g/d from 10 to 15 mo of age that would optimize first-lactation yield of heifers in an average Norwegian dairy herd. The optimum will likely increase from selection over time. Utilizing simple proportionality, the ADG between 5 and 10 mo of age ideally should be 879 g/d, taking into account the fact that animal growth rate is higher at low ages and that a high prepubertal growth rate had no negative effect on first-lactation yield. When such a rearing practice is used to meet the requirements of today's genetically improved NR heifer, heifers can both optimize production in
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
Flexible survival regression modelling
DEFF Research Database (Denmark)
Cortese, Giuliana; Scheike, Thomas H; Martinussen, Torben
2009-01-01
Regression analysis of survival data, and more generally event history data, is typically based on Cox's regression model. We here review some recent methodology, focusing on the limitations of Cox's regression model. The key limitation is that the model is not well suited to represent time-varyi...
Wassef, Anthony W A; Khafaji, Hadi; Syed, Ishba; Yan, Andrew T; Udell, Jacob A; Goodman, Shaun G; Cheema, Asim N; Bagai, Akshay
2016-12-01
Current guidelines recommend 12 months of dual-antiplatelet therapy (DAPT) after percutaneous coronary intervention (PCI) with drug-eluting stent (DES) implantation. Whether the duration of DAPT can be safely shortened with use of second-generation DESs is unclear. We conducted a meta-analysis of randomized controlled trials comparing short duration (SD) (3-6 months) with standard longer duration (LD) (≥12 months) DAPT in patients treated with primarily second-generation DES implantation. Meta-regression was performed to explore the relationship between acute coronary syndrome (ACS) and the effect of DAPT duration. Six studies were included, with 12,752/13,928 (91.5%) patients receiving second-generation DESs. A total of 5367 patients (39%) had PCI in the setting of ACS. There was no difference in all-cause mortality (1.1% vs 1.2%; odds ratio [OR], 0.86; 95% confidence interval [CI], 0.63-1.18; P=.36) or cardiac mortality (0.9% vs 1.0%; OR, 0.92; 95% CI, 0.61-1.39; P=.69) with SD-DAPT vs LD-DAPT, respectively. Definite/probable stent thrombosis (0.5% vs 0.3%; OR, 1.33; 95% CI, 0.75-2.34; P=.51), myocardial infarction (1.5% vs 1.3%; OR, 1.17; 95% CI, 0.88-1.56; P=.29), and stroke (0.4% vs 0.4%; OR, 1.04; 95% CI, 0.60-1.81; P=.88) were similar between the groups. Compared with LD-DAPT, SD-DAPT was associated with lower clinically significant bleeding (0.9% vs 1.4%; OR, 0.64; 95% CI, 0.46-0.89; P=.01). Meta-regression analysis showed no significant association between the proportion of ACS patients in trials and duration of DAPT for the outcomes of mortality (P=.95), myocardial infarction (P=.98), or stent thrombosis (P=.89). In low-risk patients treated with contemporary second-generation DES implantation, SD-DAPT has similar rates of mortality, myocardial infarction, and stent thrombosis, with lower rates of bleeding compared with LD-DAPT.
Pickup, John C; Reznik, Yves; Sutton, Alex J
2017-05-01
To compare glycemic control during continuous subcutaneous insulin infusion (CSII) and multiple daily insulin injections (MDI) in people with type 2 diabetes to identify patient characteristics that determine those best treated by CSII. Randomized controlled trials were selected comparing HbA1c during CSII versus MDI in people with type 2 diabetes. Data sources included Cochrane database and Ovid Medline. We explored patient-level determinants of final HbA1c level and insulin dose using Bayesian meta-regression models of individual patient data and summary effects using two-step meta-analysis. Hypoglycemia data were unavailable. Five trials were identified, with 287 patients randomized to receive MDI and 303 to receive CSII. Baseline HbA1c was the best determinant of final HbA1c: HbA1c difference (%) = 1.575 - (0.216 [95% credible interval 0.371-0.043] × baseline HbA1c) for all trials, but with largest effect in the trial with prerandomization optimization of control. Baseline insulin dose was best predictor of final insulin dose: insulin dose difference (units/kg) = 0.1245 - (0.382 [0.510-0.254] × baseline insulin dose). Overall HbA1c difference was -0.40% (-0.86 to 0.05 [-4.4 mmol/mol (-9.4 to 0.6)]). Overall insulin dose was reduced by -0.25 units/kg (-0.31 to -0.19) (26% reduction on CSII), and by -24.0 units/day (-30.6 to -17.5). Mean weight did not differ between treatments (0.08 kg [-0.33 to 0.48]). CSII achieves better glycemic control than MDI in people with poorly controlled type 2 diabetes, with ∼26% reduction in insulin requirements and no weight change. The best effect is in those worst controlled and with the highest insulin dose at baseline. © 2017 by the American Diabetes Association.
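The two meta-regression equations quoted in the abstract can be evaluated directly. The sketch below simply codes the reported point estimates; the function names are hypothetical, the credible intervals are ignored, and reading a negative difference as a lower value on CSII than on MDI is an assumption based on the reported overall difference of -0.40%.

```python
def hba1c_difference(baseline_hba1c):
    """Predicted HbA1c difference (%) from the reported equation."""
    return 1.575 - 0.216 * baseline_hba1c

def insulin_dose_difference(baseline_dose):
    """Predicted insulin dose difference (units/kg) from the reported equation."""
    return 0.1245 - 0.382 * baseline_dose

# Baseline HbA1c at which the predicted difference is zero.
break_even = 1.575 / 0.216

for baseline in (7.0, 9.0, 11.0):
    print(baseline, round(hba1c_difference(baseline), 3))
print(round(break_even, 2))
```

Under these point estimates the predicted difference changes sign at a baseline HbA1c of roughly 7.3% and grows more negative as baseline HbA1c rises, consistent with the statement that the best effect is in those worst controlled at baseline.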
Ritsner, Michael S; Strous, Rael D
2010-01-01
While neurosteroids exert multiple effects in the central nervous system, their associations with neurocognitive deficits in schizophrenia are not yet fully understood. The purpose of this study was to identify the contribution of circulating levels of dehydroepiandrosterone (DHEA), its sulfate (DHEAS), androstenedione, and cortisol to neurocognitive deficits through DHEA administration in schizophrenia. Data regarding cognitive function, symptom severity, daily doses, side effects of antipsychotic agents and blood levels of DHEA, DHEAS, androstenedione and cortisol were collected among 55 schizophrenia patients in a double-blind, randomized, placebo-controlled, crossover trial with DHEA at three intervals: upon study entry, after 6 weeks of DHEA administration (200 mg/d), and after 6 weeks of a placebo period. Multiple regression analysis was applied for predicting sustained attention, memory, and executive function scores across three examinations controlling for clinical, treatment and background covariates. Findings indicated that circulating DHEAS and androstenedione levels were positive predictors of cognitive functioning, while DHEA level was a negative predictor. Overall, blood neurosteroid levels and their molar ratios accounted for 16.5% of the total variance in sustained attention, 8-13% in visual memory tasks, and about 12% in executive functions. In addition, effects of symptoms, illness duration, daily doses of antipsychotic agents, side effects, education, and age of onset accounted for variability in cognitive functioning in schizophrenia. The present study suggests that alterations in circulating levels of neurosteroids and their molar ratios may reflect pathophysiological processes, which, at least partially, underlie cognitive dysfunction in schizophrenia. Copyright 2009 Elsevier Ltd. All rights reserved.
Visualisation of Regression Trees
Brunsdon, Chris
2007-01-01
The regression tree [1] has been used as a tool for exploring multivariate data sets for some time. As in multiple linear regression, the technique is applied to a data set consisting of a continuous response variable y and a set of predictor variables {x_1, x_2, ..., x_k} which may be continuous or categorical. However, instead of modelling y as a linear function of the predictors, regression trees model y as a series of ...
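The basic building block of a regression tree can be sketched as a search for the single split of one continuous predictor that minimizes the within-group sum of squared errors. This is the usual CART-style criterion and is assumed here rather than taken from the truncated abstract; the function names and toy data are hypothetical.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, cost) for the best binary split of y on x.

    Candidate thresholds are midpoints between consecutive distinct
    sorted x values; cost is the summed SSE of the two resulting groups.
    Returns None if x has fewer than two distinct values.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        cost = sum_squared_error(left) + sum_squared_error(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best

print(best_split([1, 2, 3, 10, 11, 12], [5, 5, 5, 20, 20, 20]))
```

A full tree applies this search recursively to each resulting group, over all predictors, until a stopping rule is met; each leaf then predicts the mean response of its group.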
Dabrowska, Dorota M.
1997-01-01
Nonparametric regression was shown by Beran and McKeague and Utikal to provide a flexible method for analysis of censored failure times and more general counting processes models in the presence of covariates. We discuss application of kernel smoothing towards estimation in a generalized Cox regression model with baseline intensity dependent on a covariate. Under regularity conditions we show that estimates of the regression parameters are asymptotically normal at rate root-n, and we also dis...
Introduction to regression graphics
Cook, R Dennis
2009-01-01
Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts "...an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models...highly recommend[ed]...for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
Directory of Open Access Journals (Sweden)
Maria Eugênia Zerlotti Mercadante
2002-07-01
Genetic parameters for days to calving were estimated using a random regression model, with orthogonal polynomials of age at breeding season (in years) as covariable. The records of days to calving (4,118) came from 929 cows (926 in the Portuguese abstract) from three experimental Nelore herds, the selection and traditional herds being selected for higher yearling weight and the control herd for mean yearling weight. Additive genetic and permanent environmental variances were described by a fourth-order polynomial function, with nine measures of error. The phenotypic and additive genetic variances were high in late records, especially after the 6th breeding season. Heritability estimates increased from 0.08 to 0.28 from the first to the 6th breeding season. Genetic correlations were moderate between the first performance and the others (0.32 to 0.66), high between adjacent performances (0.98 to 0.99), and somewhat lower between non-adjacent ones (0.63 to 0.98). Selection for weight did not change the mean breeding value of cows in the selected herds; however, the mean breeding values of cows in the control herd showed a declining trend over the years.
Directory of Open Access Journals (Sweden)
Gilberto Romeiro de Oliveira Menezes
2010-08-01
A total of 10,238 weekly test-day milk yield records from 388 first lactations of Saanen goats were used to evaluate six measures of lactation persistency, in order to determine which is the most suitable for use in genetic evaluations of the trait. The six measures evaluated are adaptations of measures used in dairy cattle, obtained by replacing, in the formulas, the bovine reference values with those of goats. The values used in the calculations were obtained from random regression models. Heritability estimates for the persistency measures ranged from 0.03 to 0.09. Genetic correlations between persistency measures and milk yield up to 268 days ranged from -0.64 to 0.67. Because it showed the lowest genetic correlation with yield at 268 days (0.14), the persistency measure PS4, obtained as the sum of the values from the 41st to the 240th day of lactation expressed as deviations from yield at 40 days of lactation, is the most suitable for genetic evaluations of lactation persistency in Saanen goats. Selecting goats for better lactation persistency therefore does not alter yield at 268 days. Because of the low heritability of this measure (0.03), small responses to selection are expected in this herd.
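The PS4 measure described above is a sum of deviations, which a short sketch makes explicit. The function name and the toy lactation curve are hypothetical; in the study the daily values come from the random regression model rather than from a simple linear curve.

```python
def ps4(value_by_day):
    """Sum of the values on days 41..240 as deviations from day 40.

    value_by_day maps day in milk (int) to a predicted value for that day.
    """
    reference = value_by_day[40]
    return sum(value_by_day[day] - reference for day in range(41, 241))

# Toy curve: 3.0 kg at day 40, declining by 0.005 kg per day afterwards.
declining = {day: 3.0 - 0.005 * (day - 40) for day in range(1, 269)}
flat = {day: 3.0 for day in range(1, 269)}
print(ps4(declining))  # negative: yield falls away from the day-40 level
print(ps4(flat))       # zero: perfectly persistent curve
```

Values closer to zero (or positive) indicate a flatter curve after day 40, that is, a more persistent lactation.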
Directory of Open Access Journals (Sweden)
Michel Pompeu Barros de Oliveira Sá
2012-12-01
BACKGROUND: The most recent published meta-analysis of randomized controlled trials (RCTs) showed that off-pump coronary artery bypass graft surgery (CABG) reduces the incidence of stroke by 30% compared with on-pump CABG, but showed no difference in other outcomes. New RCTs have since been published, indicating the need for a new meta-analysis to investigate pooled results including these further studies. METHODS: MEDLINE, EMBASE, CENTRAL/CCTR, SciELO, LILACS, Google Scholar and reference lists of relevant articles were searched for RCTs that compared 30-day outcomes (all-cause mortality, myocardial infarction or stroke) between off-pump and on-pump CABG until May 2012. The principal summary measures were relative risk (RR) with 95% confidence interval (CI) and P values (considered statistically significant when <0.05). The RRs were combined across studies using the DerSimonian-Laird random-effects model. Meta-analysis and meta-regression were completed using the software Comprehensive Meta-Analysis version 2 (Biostat Inc., Englewood
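The DerSimonian-Laird pooling step named in the methods can be sketched in a few lines. This is a generic textbook version working on log relative risks, not a reproduction of the commercial software used in the study; the function names and the example counts are hypothetical and no study data are implied.

```python
import math

def log_relative_risk(events_t, n_t, events_c, n_c):
    """Log RR and its approximate variance from one two-arm trial."""
    log_rr = math.log((events_t / n_t) / (events_c / n_c))
    variance = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return log_rr, variance

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error, and tau^2."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method-of-moments between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical trials: (events, n) in treatment arm, (events, n) in control arm.
trials = [(10, 100, 20, 100), (15, 200, 18, 200), (5, 150, 12, 150)]
effects, variances = zip(*(log_relative_risk(*t) for t in trials))
pooled, se, tau2 = dersimonian_laird(effects, variances)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(math.exp(pooled), math.exp(low), math.exp(high), tau2)
```

Pooling is done on the log scale and the result is exponentiated back to a relative risk with its 95% confidence interval; when the estimated between-study variance is zero the result coincides with the fixed-effect inverse-variance estimate.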
Energy Technology Data Exchange (ETDEWEB)
Gerber, Samuel [Univ. of Utah, Salt Lake City, UT (United States); Rubel, Oliver [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Bremer, Peer -Timo [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Whitaker, Ross T. [Univ. of Utah, Salt Lake City, UT (United States)
2012-01-19
This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights...
Dimension Reduction and Discretization in Stochastic Problems by Regression Method
DEFF Research Database (Denmark)
Ditlevsen, Ove Dalager
1996-01-01
The chapter mainly deals with dimension reduction and field discretizations based directly on the concept of linear regression. Several examples of interesting applications in stochastic mechanics are also given. Keywords: Random fields discretization, Linear regression, Stochastic interpolation...
DEFF Research Database (Denmark)
Bordacconi, Mats Joe; Larsen, Martin Vinæs
2014-01-01
Humans are fundamentally primed for making causal attributions based on correlations. This implies that researchers must be careful to present their results in a manner that inhibits unwarranted causal attribution. In this paper, we present the results of an experiment that suggests regression...... models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results...... of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...
Weisberg, Sanford
2013-01-01
Praise for the Third Edition: "...this is an excellent book which could easily be used as a course text..." -International Statistical Institute. The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
Hosmer, David W; Sturdivant, Rodney X
2013-01-01
A new edition of the definitive guide to logistic regression modeling for health science and other applications. This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
Three contributions to robust regression diagnostics
Directory of Open Access Journals (Sweden)
Kalina J.
2015-12-01
Full Text Available Robust regression methods have been developed not only as a diagnostic tool for standard least squares estimation in statistical and econometric applications, but can also be used as self-standing regression estimation procedures. Therefore, they need to be equipped with their own diagnostic tools. This paper is devoted to robust regression and presents three contributions to its diagnostic tools or to estimating regression parameters under non-standard conditions. Firstly, we derive the Durbin-Watson test of independence of random regression errors for the regression median. The approach is based on the approximation to the exact null distribution of the test statistic. Secondly, we accompany the least trimmed squares estimator by a subjective criterion for selecting a suitable value of the trimming constant. Thirdly, we propose a robust version of the instrumental variables estimator. The new methods are illustrated on examples with real data and their advantages and limitations are discussed.
Semiparametric Regression Pursuit.
Huang, Jian; Wei, Fengrong; Ma, Shuangge
2012-10-01
The semiparametric partially linear model allows flexible modeling of covariate effects on the response variable in regression. It combines the flexibility of nonparametric regression and parsimony of linear regression. The most important assumption in the existing estimation methods for this model is that it is known a priori which covariates have a linear effect and which do not. However, in applied work, this is rarely known in advance. We consider the problem of estimation in the partially linear models without assuming a priori which covariates have linear effects. We propose a semiparametric regression pursuit method for identifying the covariates with a linear effect. Our proposed method is a penalized regression approach using a group minimax concave penalty. Under suitable conditions we show that the proposed approach is model-pursuit consistent, meaning that it can correctly determine which covariates have a linear effect and which do not with high probability. The performance of the proposed method is evaluated using simulation studies, which support our theoretical results. A real data example is used to illustrate the application of the proposed method.
Almost opposite regression dependence in bivariate distributions
Siburg, Karl Friedrich; Stoimenov, Pavel A.
2014-01-01
Let X,Y be two continuous random variables. Investigating the regression dependence of Y on X, respectively, of X on Y, we show that the two of them can have almost opposite behavior. Indeed, given any e > 0, we construct a bivariate random vector (X,Y) such that the respective regression dependence measures r2|1(X,Y), r1|2(X,Y) ∈ [0,1] introduced in Dette et al. (2013) satisfy r2|1(X,Y) = 1 as well as r1|2(X,Y)
[Understanding logistic regression].
El Sanharawi, M; Naudet, F
2013-10-01
Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors susceptible to influence it (explicative variables). The choice of explicative variables that should be included in the logistic regression model is based on prior knowledge of the disease physiopathology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps for the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
Simultaneous Inference in Regression
Liu, Wei
2010-01-01
The use of simultaneous confidence bands in linear regression is a vibrant area of research. This book presents an overview of the methodology and applications, including necessary background material on linear models. A special chapter on logistic regression gives readers a glimpse into how these methods can be used for generalized linear models. The appendices provide computational tools for simulating confidence bands. The author also includes MATLAB® programs for all examples on the web. With many numerical examples and software implementation, this text serves the needs of rese
Regression Estimator Using Double Ranked Set Sampling
Directory of Open Access Journals (Sweden)
Hani M. Samawi
2002-06-01
Full Text Available The performance of a regression estimator based on the double ranked set sample (DRSS) scheme, introduced by Al-Saleh and Al-Kadiri (2000), is investigated when the mean of the auxiliary variable X is unknown. Our primary analysis and simulation indicate that using the DRSS regression estimator for estimating the population mean substantially increases relative efficiency compared to using the regression estimator based on simple random sampling (SRS) or the ranked set sampling (RSS) (Yu and Lam, 1997) regression estimator. Moreover, the regression estimator using DRSS is also more efficient than the naïve estimators of the population mean using SRS, RSS (when the correlation coefficient is at least 0.4) and DRSS for high correlation coefficient (at least 0.91). The theory is illustrated using a real data set of trees.
Ritz, Christian; Parmigiani, Giovanni
2009-01-01
R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. This book provides a coherent treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology.
Modern Regression Discontinuity Analysis
Bloom, Howard S.
2012-01-01
This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Seber, George A F
2012-01-01
Concise, mathematically clear, and comprehensive treatment of the subject. Expanded coverage of diagnostics and methods of model fitting. Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models. More than 200 problems throughout the book plus outline solutions for the exercises. This revision has been extensively class-tested.
Bayesian ARTMAP for regression.
Sasu, L M; Andonie, R
2013-10-01
Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. Copyright © 2013 Elsevier Ltd. All rights reserved.
Bounded Gaussian process regression
DEFF Research Database (Denmark)
Jensen, Bjørn Sand; Nielsen, Jens Brehm; Larsen, Jan
2013-01-01
We extend the Gaussian process (GP) framework for bounded regression by introducing two bounded likelihood functions that model the noise on the dependent variable explicitly. This is fundamentally different from the implicit noise assumption in the previously suggested warped GP framework. We...
Mechanisms of neuroblastoma regression
Brodeur, Garrett M.; Bagatell, Rochelle
2014-01-01
Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
Subset selection in regression
Miller, Alan
2002-01-01
Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition: a separate chapter on Bayesian methods; complete revision of the chapter on estimation; a major example from the field of near infrared spectroscopy; more emphasis on cross-validation; greater focus on bootstrapping; stochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible; software available on the Internet for implementing many of the algorithms presented; more examples. Subset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...
Classification and regression trees
Breiman, Leo; Olshen, Richard A; Stone, Charles J
1984-01-01
The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
DEFF Research Database (Denmark)
Hansen, Henrik; Tarp, Finn
2001-01-01
. There are, however, decreasing returns to aid, and the estimated effectiveness of aid is highly sensitive to the choice of estimator and the set of control variables. When investment and human capital are controlled for, no positive effect of aid is found. Yet, aid continues to impact on growth via...... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regressions are used for policy purposes....
Hierarchical Logistic Regression in Course Placement
Schulz, E. Matthew; Betebenner, Damian; Ahn, Meeyeon
2004-01-01
Whether hierarchical logistic regression can reduce the sample size requirement for estimating optimal cutoff scores in a course placement service where predictive validity is measured by a threshold utility function is explored. Data from courses with varying class size were randomly partitioned into two halves per course. Nonhierarchical and…
Nonparametric and semiparametric dynamic additive regression models
DEFF Research Database (Denmark)
Scheike, Thomas Harder; Martinussen, Torben
Dynamic additive regression models provide a flexible class of models for analysis of longitudinal data. The approach suggested in this work is suited for measurements obtained at random time points and aims at estimating time-varying effects. Both fully nonparametric and semiparametric models can...
Complex Regression Functional And Load Tests Development
Directory of Open Access Journals (Sweden)
Anton Andreevich Krasnopevtsev
2015-10-01
Full Text Available The article describes practical approaches to implementing automated functional regression testing and load testing of an arbitrary software-hardware system, using «MARSh 3.0» as an example. Test automation is being implemented to improve the information security of «MARSh 3.0».
Hilbe, Joseph M
2009-01-01
This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect: great clarity. The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … -Annette J. Dobson, Biometric...
Adaptive metric kernel regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
2000-01-01
Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Adaptive Metric Kernel Regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
1998-01-01
Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...
An Introduction to Logistic Regression.
Cizek, Gregory J.; Fitzgerald, Shawn M.
1999-01-01
Where linearity cannot be assumed, logistic regression may be appropriate. This article describes conditions and tests for using logistic regression; introduces the logistic-regression model, the use of logistic-regression software, and some applications in published literature. Univariate and multiple independent-variable conditions and…
Reciprocal Causation in Regression Analysis.
Wolfle, Lee M.
1979-01-01
With even the simplest bivariate regression, least-squares solutions are inappropriate unless one assumes a priori that reciprocal effects are absent, or at least implausible. While this discussion is limited to bivariate regression, the issues apply equally to multivariate regression, including stepwise regression. (Author/CTM)
Bias-Robust Estimates of Regression Based on Projections
Maronna, Ricardo A.; Yohai, Victor J
1993-01-01
A new class of bias-robust estimates of multiple regression is introduced. If $y$ and $x$ are two real random variables, let $T(y, x)$ be a univariate robust estimate of regression of $y$ on $x$ through the origin. The regression estimate $\mathbf{T}(y, \mathbf{x})$ of a random variable $y$ on a random vector $\mathbf{x} = (x_1,\cdots, x_p)'$ is defined as the vector $\mathbf{t} \in \mathfrak{R}^p$ which minimizes $\sup_{\|\mathbf{\lambda}\| = 1} \mid T(y - \mathbf{t'x, \lambda' x}) \mid s(\m...
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study examines measures of the predictive power of the Generalized Linear Model (GLM), which are widely used but often subject to restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables and in the presence of multicollinearity among the independent variables. The results show that the proposed regression correlation coefficient performs better than the traditional regression correlation coefficient in terms of bias and root mean square error (RMSE).
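As a rough illustration of the traditional coefficient only (the abstract does not state the modified version), the sketch below fits a one-predictor Poisson regression by Newton-Raphson and then takes the Pearson correlation between Y and the fitted E(Y|X). Reading "relationship" as a Pearson correlation is an assumption here, and the function names and data are hypothetical.

```python
import math

def fit_poisson(x, y, iters=50):
    """Fit log E(Y|x) = b0 + b1*x by Newton-Raphson (single predictor)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0   # start at the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2x2, symmetric).
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)

def regression_correlation(x, y):
    """Correlation between Y and the fitted mean E(Y|X) (assumed definition)."""
    b0, b1 = fit_poisson(x, y)
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    return pearson(y, mu)

# Hypothetical count data.
x = [0, 1, 2, 3, 4, 5]
y = [1, 1, 2, 4, 6, 10]
r = regression_correlation(x, y)
```

With several correlated predictors the same idea applies, but the fit would need a general matrix solver rather than the explicit 2x2 step used here.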
Gwin, Jess A; Maki, Kevin C; Leidy, Heather J
2017-12-01
Background: Higher-protein (HP) energy-restriction diets improve weight management to a greater extent than normal-protein (NP) versions. Potential mechanisms of action with regard to assessment of eating behaviors across the day have not been widely examined during energy restriction. Objectives: The objectives of this study were to test whether the consumption of an HP energy-restriction diet reduces carbohydrate and fat intakes through improvements in daily appetite, satiety, and food cravings compared with NP versions and to test whether protein type within the NP diets alters protein-related satiety. Methods: Seventeen overweight women [mean ± SEM age: 36 ± 1 y; body mass index (kg/m²): 28.4 ± 0.1] completed a randomized, controlled-feeding crossover study. Participants were provided with the following ∼1250-kcal/d energy-restricted (-750-kcal/d deficit) diets, each for 6 d: HP [124 g protein/d; 60% from beef and 40% from plant sources (HP-BEEF)] or NP (48 g protein/d) that was protein-type matched (NP-BEEF) or unmatched [100% from plant-based sources (NP-PLANT)]. On day 6 of each diet period, participants completed a 12-h testing day containing repetitive appetite, satiety, and food-craving questionnaires. On day 7, the participants were asked to consume their protein requirement within each respective diet but were provided with a surplus of carbohydrate- and fat-rich foods to consume, ad libitum, at each eating occasion across the day. All outcomes reported were primary study outcomes. Results: The HP-BEEF diet reduced daily hunger by 16%, desire to eat by 15%, prospective food consumption by 14%, and fast-food cravings by 15% but increased daily fullness by 25% compared with the NP-BEEF and NP-PLANT diets (all P day did not reduce the energy consumed ad libitum from the fat- and carbohydrate-rich foods (HP-BEEF: 2000 ± 180 kcal/d; NP-BEEF: 2120 ± 190 kcal/d; NP-PLANT: 2070 ± 180 kcal/d). None of the outcomes differed between the NP-BEEF and NP
Insulin resistance: regression and clustering.
Directory of Open Access Journals (Sweden)
Sangho Yoon
Full Text Available In this paper we try to define insulin resistance (IR) precisely for a group of Chinese women. Our definition deliberately does not depend upon body mass index (BMI) or age, although in other studies, with particular random effects models quite different from models used here, BMI accounts for a large part of the variability in IR. We accomplish our goal through application of Gauss mixture vector quantization (GMVQ), a technique for clustering that was developed for application to lossy data compression. Defining data come from measurements that play major roles in medical practice. A precise statement of what the data are is in Section 1. Their family structures are described in detail. They concern levels of lipids and the results of an oral glucose tolerance test (OGTT). We apply GMVQ to residuals obtained from regressions of outcomes of an OGTT and lipids on functions of age and BMI that are inferred from the data. A bootstrap procedure developed for our family data supplemented by insights from other approaches leads us to believe that two clusters are appropriate for defining IR precisely. One cluster consists of women who are IR, and the other of women who seem not to be. Genes and other features are used to predict cluster membership. We argue that prediction with "main effects" is not satisfactory, but prediction that includes interactions may be.
Directory of Open Access Journals (Sweden)
Laila Talarico Dias
2006-10-01
, consisting of 21,762 records from 4,221 animals of Tabapuã cattle, weighed from birth to 550 days of age, were used to estimate covariance functions by random regression models using Legendre polynomials of order two to five. Models included the direct and maternal genetic, animal and maternal permanent environmental random effects and were compared by Schwarz's Bayesian information criterion (BIC) and Akaike's information criterion (AIC). Both criteria suggested that the best model to describe the covariance structure of the database was the one including direct genetic, maternal genetic, animal permanent and maternal permanent environmental effects, fitted respectively by cubic, quadratic, fourth order and linear polynomials, with residual variances fitted by a fifth order variance function. Direct heritability estimates were higher at the beginning and at the end of the growth trajectory. Maternal heritability estimates increased from birth to 160 days of age and decreased thereafter. In general, genetic correlation estimates decreased as the age interval between weights increased. Efficiency of selection may be improved by using weights of the post weaning period because of their higher genetic variance and heritability estimates.
Directory of Open Access Journals (Sweden)
Saroj Kumar Sahoo
2014-12-01
Full Text Available Aim: The aim was to estimate genetic parameters of weekly test-day (TD) milk yields and first lactation 305-day milk yield (FL305DY) in Murrah buffaloes. Materials and Methods: Data on 39059 weekly test-day milk yield (WTDY) records during the first lactation of 961 Murrah buffaloes calved from 1977 to 2012 and sired by 101 bulls, maintained on an organized farm at the National Dairy Research Institute, Karnal, were analyzed to study the effect of genetic and non-genetic factors. A least squares maximum likelihood program was used to estimate genetic and non-genetic parameters affecting WTDY and FL305DY. Heritability was estimated using the paternal half-sib correlation method. The genetic and phenotypic correlations among WTDY and 305-day milk yield were calculated from the analysis of variance and covariance among sire groups. Results: The least squares mean for FL305DY was found to be 1853.49±15.88 kg. The least squares means of overall WTDY ranged from 2.44±0.07 kg (TD-43) to 7.95±0.06 kg (TD-8). The effects of period, season and age at first calving group on FL305DY were found to be highly significant (p<0.01), significant (p<0.05) and non-significant, respectively. The h2 estimate of FL305DY was 0.25±0.09. The estimates of phenotypic and genetic correlations between 305-day milk yield and different WTDY ranged from 0.52 to 0.84 and from 0.19 to 0.98, respectively. Conclusions: Our study showed that the effect of period of calving was highly significant (p<0.01) on FL305DY as well as on all the WTDY. The estimates of phenotypic and genetic correlations were generally higher in the middle segment of lactation, suggesting that these TD yields could be used as the selection criteria for early evaluation and selection of animals.
A logistic regression estimating function for spatial Gibbs point processes
DEFF Research Database (Denmark)
Baddeley, Adrian; Coeurjolly, Jean-François; Rubak, Ege
We propose a computationally efficient logistic regression estimating function for spatial Gibbs point processes. The sample points for the logistic regression consist of the observed point pattern together with a random pattern of dummy points. The estimating function is closely related...
Population-Sample Regression in the Estimation of Population Proportions
Weitzman, R. A.
2006-01-01
Focusing on a single sample obtained randomly with replacement from a single population, this article examines the regression of population on sample proportions and develops an unbiased estimator of the square of the correlation between them. This estimator turns out to be the regression coefficient. Use of the squared-correlation estimator as a…
Combining Alphas via Bounded Regression
Directory of Open Access Journals (Sweden)
Zura Kakushadze
2015-11-01
Full Text Available We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted) regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.
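To make the idea of bounds imposed inside the regression concrete, here is a minimal sketch of box-constrained least squares. It uses generic cyclic coordinate descent with clipping, which is a swapped-in technique and not the paper's algorithm or source code; it also regresses on raw columns rather than on SCM principal components. The function name and the toy data are hypothetical.

```python
def bounded_least_squares(X, y, lower, upper, n_sweeps=200):
    """Minimise sum((y - X w)^2) subject to lower <= w_j <= upper.

    X is a list of rows. Each coordinate is set to its one-dimensional
    optimum with the others held fixed, then clipped to the bounds.
    """
    n, p = len(X), len(X[0])
    start = min(max(0.0, lower), upper)      # zero, moved inside the box
    w = [start] * p
    # Residual r = y - X w, kept up to date as the weights change.
    r = [y[i] - sum(X[i][j] * w[j] for j in range(p)) for i in range(n)]
    for _ in range(n_sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            norm2 = sum(c * c for c in col)
            if norm2 == 0.0:
                continue
            # Unconstrained optimum for w_j given the current residual.
            target = w[j] + sum(c * ri for c, ri in zip(col, r)) / norm2
            new_wj = min(max(target, lower), upper)
            delta = new_wj - w[j]
            if delta != 0.0:
                r = [ri - c * delta for ri, c in zip(r, col)]
                w[j] = new_wj
    return w


# Toy data: the unconstrained fit is w = (0.9, 0.1); the bound caps w[0].
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [0.9, 0.1, 1.0]
w = bounded_least_squares(X, y, lower=0.0, upper=0.5)
print(w)
```

In this toy case the first weight is held at the upper bound and the second weight absorbs part of the signal, which is the diversification effect the abstract describes.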
Time-adaptive quantile regression
DEFF Research Database (Denmark)
Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik
2008-01-01
An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method...... and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power...... production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered....
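For reference, the linear optimization formulation of quantile regression is commonly written as below. This is the standard textbook form, given here as an assumption about what the abstract refers to; the symbols (quantile level \tau, design rows x_i, residual parts u_i^+ and u_i^-) are not taken from the paper.

```latex
\begin{aligned}
\min_{\beta,\,u^{+},\,u^{-}} \quad & \sum_{i=1}^{n} \bigl( \tau\, u_i^{+} + (1-\tau)\, u_i^{-} \bigr) \\
\text{subject to} \quad & y_i = x_i^{\top}\beta + u_i^{+} - u_i^{-}, \qquad i = 1,\dots,n, \\
& u_i^{+} \ge 0, \quad u_i^{-} \ge 0 .
\end{aligned}
```

Each residual is split into its positive part u_i^+ and negative part u_i^-, so the asymmetric absolute loss becomes linear in the decision variables and the problem can be handed to a simplex-type solver.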
Mixed-effects regression models in linguistics
Heylen, Kris; Geeraerts, Dirk
2018-01-01
When data consist of grouped observations or clusters, and there is a risk that measurements within the same group are not independent, group-specific random effects can be added to a regression model in order to account for such within-group associations. Regression models that contain such group-specific random effects are called mixed-effects regression models, or simply mixed models. Mixed models are a versatile tool that can handle both balanced and unbalanced datasets and that can also be applied when several layers of grouping are present in the data; these layers can either be nested or crossed. In linguistics, as in many other fields, the use of mixed models has gained ground rapidly over the last decade. This methodological evolution enables us to build more sophisticated and arguably more realistic models, but, due to its technical complexity, also introduces new challenges. This volume brings together a number of promising new evolutions in the use of mixed models in linguistics, but also addres...
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
Bias-corrected quantile regression estimation of censored regression models
Cizek, Pavel; Sadikoglu, Serhan
2018-01-01
In this paper, an extension of the indirect inference methodology to semiparametric estimation is explored in the context of censored regression. Motivated by weak small-sample performance of the censored regression quantile estimator proposed by Powell (J Econom 32:143–155, 1986a), two- and
Quantum assisted Gaussian process regression
Zhao, Zhikuan; Fitzsimons, Jack K.; Fitzsimons, Joseph F.
2015-01-01
Gaussian processes (GP) are a widely used model for regression problems in supervised machine learning. Implementation of GP regression typically requires $O(n^3)$ logic gates. We show that the quantum linear systems algorithm [Harrow et al., Phys. Rev. Lett. 103, 150502 (2009)] can be applied to Gaussian process regression (GPR), leading to an exponential reduction in computation time in some instances. We show that even in some cases not ideally suited to the quantum linear systems algorith...
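The cubic cost mentioned above comes from factorizing the n x n kernel matrix. The sketch below is a classical (not quantum) illustration in plain Python, with the triple loop of the Cholesky factorization written out; the squared-exponential kernel, the noise level, and the data are assumptions made for the example.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel (an assumed choice for this sketch)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A; the nested loops cost O(n^3)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_cholesky(L, b):
    """Solve (L L^T) x = b by forward then back substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):                      # forward: L z = b
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):            # backward: L^T x = z
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_posterior_mean(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean k(x*, X) (K + noise * I)^{-1} y of a zero-mean GP."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve_cholesky(cholesky(K), y_train)
    return [sum(rbf(xs, x_train[i]) * alpha[i] for i in range(n))
            for xs in x_test]


# Hypothetical one-dimensional data.
x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [0.0, 0.8, 0.9, 0.1]
mean = gp_posterior_mean(x_train, y_train, x_train)
print(mean)
```

With a very small noise term the posterior mean nearly interpolates the training targets; the linear solve is the step the quantum linear systems algorithm is proposed to accelerate.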
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of Quantile Regression models. This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensive description of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model and diagnostic tools. Each methodological aspect is explored and
Stochastic development regression using method of moments
DEFF Research Database (Denmark)
Kühnel, Line; Sommer, Stefan Horst
2017-01-01
This paper considers the estimation problem arising when inferring parameters in the stochastic development regression model for manifold valued non-linear data. Stochastic development regression captures the relation between manifold-valued response and Euclidean covariate variables using...... the stochastic development construction. It is thereby able to incorporate several covariate variables and random effects. The model is intrinsically defined using the connection of the manifold, and the use of stochastic development avoids linearizing the geometry. We propose to infer parameters using...... the Method of Moments procedure that matches known constraints on moments of the observations conditional on the latent variables. The performance of the model is investigated in a simulation example using data on finite dimensional landmark manifolds....
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Logistic Regression: Concept and Application
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Panel Smooth Transition Regression Models
DEFF Research Database (Denmark)
González, Andrés; Terasvirta, Timo; Dijk, Dick van
We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...
van Leeuwen, Nikki; Lingsma, Hester F.; de Craen, Anton J. M.; Nieboer, Daan; Mooijaart, Simon P.; Richard, Edo; Steyerberg, Ewout W.
2016-01-01
In epidemiology, the regression discontinuity design has received increasing attention recently and might be an alternative to randomized controlled trials (RCTs) to evaluate treatment effects. In regression discontinuity, treatment is assigned above a certain threshold of an assignment variable for
Regression analysis with categorized regression calibrated exposure: some interesting findings
Directory of Open Access Journals (Sweden)
Hjartåker Anette
2006-07-01
Full Text Available Abstract. Background: Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile) scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods: We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC). Results: In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion: Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
Logic regression and its extensions.
Schwender, Holger; Ruczinski, Ingo
2010-01-01
Logic regression is an adaptive classification and regression procedure, initially developed to reveal interacting single nucleotide polymorphisms (SNPs) in genetic association studies. In general, this approach can be used in any setting with binary predictors, when the interaction of these covariates is of primary interest. Logic regression searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome variable, and thus, reveals variables and interactions that are associated with the response and/or have predictive capabilities. The logic expressions are embedded in a generalized linear regression framework, and thus, logic regression can handle a variety of outcome types, such as binary responses in case-control studies, numeric responses, and time-to-event data. In this chapter, we provide an introduction to the logic regression methodology, list some applications in public health and medicine, and summarize some of the direct extensions and modifications of logic regression that have been proposed in the literature. Copyright © 2010 Elsevier Inc. All rights reserved.
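As a toy illustration of searching for Boolean combinations of binary predictors, the sketch below exhaustively scores every two-variable AND/OR rule (with optional negations) by its misclassification count. This brute-force search is an assumption made for brevity: it is not the adaptive search used in logic regression proper, and it ignores the generalized linear model in which the logic expressions are embedded. All names and data are hypothetical.

```python
from itertools import combinations, product


def best_pair_rule(X, y):
    """Return (errors, description) of the best 2-variable Boolean rule."""
    p = len(X[0])
    ops = {"AND": lambda a, b: a and b, "OR": lambda a, b: a or b}
    best = None
    for (i, j), (neg_i, neg_j), (name, op) in product(
            combinations(range(p), 2),
            product([False, True], repeat=2),
            ops.items()):
        errors = 0
        for row, target in zip(X, y):
            a = (not row[i]) if neg_i else bool(row[i])
            b = (not row[j]) if neg_j else bool(row[j])
            errors += int(op(a, b) != bool(target))
        desc = (f"{'NOT ' if neg_i else ''}x{i} {name} "
                f"{'NOT ' if neg_j else ''}x{j}")
        # Strict inequality keeps the first rule found among ties.
        if best is None or errors < best[0]:
            best = (errors, desc)
    return best


# All 8 combinations of three binary predictors; the outcome is an
# interaction of x0 and x2 that no single predictor explains.
X = list(product([0, 1], repeat=3))
y = [int(r[0] and not r[2]) for r in X]
print(best_pair_rule(X, y))
```

The search recovers the interaction rule directly, which is the kind of structure a main-effects-only regression would miss.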
Application of random regression models to the genetic evaluation ...
African Journals Online (AJOL)
Estimates of genetic correlations were greater than 0.82 among measures of weight at all ages. The resulting covariance functions were used to estimate breeding values of each animal along the age trajectory. Genetic trends for CW over the years showed only a slightly increasing pattern, suggesting that CW did not ...
Covariance Functions and Random Regression Models in the ...
African Journals Online (AJOL)
ARC-IRENE
Since its inception the application of genetic principles to selective breeding of farm animals has led ... animal increases in size or weight continuously over time until reaching a plateau at maturity. Such a process .... where A and I are the numerator relationship matrix and an identity matrix, respectively; KG and KC are the.
Covariance Functions and Random Regression Models in the ...
African Journals Online (AJOL)
ARC-IRENE
many, highly correlated measures (Meyer, 1998a). Several approaches have been proposed to deal with such data, from simplest repeatability models (SRM) to complex multivariate models (MTM). The SRM considers different measurements at different stages (ages) as a realization of the same genetic trait with constant.
Evaluation of Development Programs: Randomized Controlled Trials or Regressions?
Elbers, C.T.M.; Gunning, J.W.
2014-01-01
Can project evaluation methods be used to evaluate programs: complex interventions involving multiple activities? A program evaluation cannot be based simply on separate evaluations of its components if interactions between the activities are important. In this paper a measure is proposed, the total
Practical Session: Simple Linear Regression
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate the simple linear regression. The first one is based on the famous Galton's data set on heredity. We use the lm R command and get coefficients estimates, standard error of the error, R2, residuals … In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate on multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
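The quantities listed in this abstract (coefficient estimates, residual standard error, R2) have closed forms in simple linear regression. The sketch below computes them in plain Python rather than with R's lm; the data are hypothetical and are not Galton's data set.

```python
def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, residual_std_error, r_squared).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    fitted = [intercept + slope * xi for xi in x]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    # Residual standard error uses n - 2 degrees of freedom.
    sigma = (ss_res / (n - 2)) ** 0.5
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, sigma, r_squared


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, sigma, r_squared = simple_linear_regression(x, y)
print(slope, intercept, sigma, r_squared)
```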
Multiple Regression and Its Discontents
Snell, Joel C.; Marsh, Mitchell
2012-01-01
Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.
Regression methods for medical research
Tai, Bee Choo
2013-01-01
Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demand an understanding of more complex and sophisticated analytic procedures. The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
Forecasting with Dynamic Regression Models
Pankratz, Alan
2012-01-01
One of the most widely used tools in statistical forecasting, the single-equation regression model, is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the autocorrelation patterns of regression disturbance. It also includes six case studies.
Linear and logistic regression analysis
Tripepi, G.; Jager, K. J.; Dekker, F. W.; Zoccali, C.
2008-01-01
In previous articles of this series, we focused on relative risks and odds ratios as measures of effect to assess the relationship between exposure to risk factors and clinical outcomes and on control for confounding. In randomized clinical trials, the random allocation of patients is hoped to
Hypothesis Testing of Parameters for Ordinary Linear Circular Regression
Directory of Open Access Journals (Sweden)
Abdul Ghapor Hussin
2006-07-01
Full Text Available This paper presents hypothesis testing of parameters for the ordinary linear circular regression model, assuming the circular random error follows a von Mises distribution. The main interest is in testing the intercept and slope parameters of the regression line. As an illustration, this hypothesis testing is used in analyzing wind and wave direction data recorded by two different techniques, an HF radar system and an anchored wave buoy.
Inferential Models for Linear Regression
Directory of Open Access Journals (Sweden)
Zuoyi Zhang
2011-09-01
Full Text Available Linear regression is arguably one of the most widely used statistical methods in applications. However, important problems, especially variable selection, remain a challenge for classical modes of inference. This paper develops a recently proposed framework of inferential models (IMs) in the linear regression context. In general, an IM is able to produce meaningful probabilistic summaries of the statistical evidence for and against assertions about the unknown parameter of interest and, moreover, these summaries are shown to be properly calibrated in a frequentist sense. Here we demonstrate, using simple examples, that the IM framework is promising for linear regression analysis (including model checking, variable selection, and prediction) and for uncertain inference in general.
A Matlab program for stepwise regression
Directory of Open Access Journals (Sweden)
Yanhong Qi
2016-03-01
Full Text Available Stepwise linear regression is a multi-variable regression method for identifying statistically significant variables in the linear regression equation. In the present study, we present a Matlab program for stepwise regression.
Logistic regression for circular data
Al-Daffaie, Kadhem; Khan, Shahjahan
2017-05-01
This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
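A minimal sketch of the estimation approach described above: a binary response, a circular predictor entered through its cosine and sine, and maximum likelihood by Newton-Raphson. The linear-circular form logit(p) = b0 + b1*cos(theta) + b2*sin(theta) is an assumption (the abstract does not state the exact parameterisation), the data are simulated rather than the Toowoomba records, and Python with NumPy is used here in place of R.

```python
import numpy as np


def fit_circular_logistic(theta, y, n_iter=25, tol=1e-10):
    """Newton-Raphson MLE for logit(p) = b0 + b1*cos(theta) + b2*sin(theta)."""
    theta = np.asarray(theta, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(theta), np.cos(theta), np.sin(theta)])
    beta = np.zeros(3)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - p)                          # gradient of log-likelihood
        info = X.T @ (X * (p * (1.0 - p))[:, None])    # observed information
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Simulated data with hypothetical true coefficients.
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2.0 * np.pi, size=2000)
true_beta = np.array([-0.5, 1.0, 0.5])
prob = 1.0 / (1.0 + np.exp(-(true_beta[0]
                             + true_beta[1] * np.cos(theta)
                             + true_beta[2] * np.sin(theta))))
y = (rng.random(2000) < prob).astype(float)
beta_hat = fit_circular_logistic(theta, y)
print(beta_hat)
```

Using cosine and sine terms keeps the fitted probability periodic, so directions of 0 and 2*pi are treated as the same angle.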
Quasi-least squares regression
Shults, Justine
2014-01-01
Drawing on the authors' substantial expertise in modeling longitudinal and clustered data, Quasi-Least Squares Regression provides a thorough treatment of quasi-least squares (QLS) regression, a computational approach for the estimation of correlation parameters within the framework of generalized estimating equations (GEEs). The authors present a detailed evaluation of QLS methodology, demonstrating the advantages of QLS in comparison with alternative methods. They describe how QLS can be used to extend the application of the traditional GEE approach to the analysis of unequally spaced longitu
Biplots in Reduced-Rank Regression
Braak, ter C.J.F.; Looman, C.W.N.
1994-01-01
Regression problems with a number of related response variables are typically analyzed by separate multiple regressions. This paper shows how these regressions can be visualized jointly in a biplot based on reduced-rank regression. Reduced-rank regression combines multiple regression and principal
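The rank constraint behind reduced-rank regression can be sketched in a few lines: fit ordinary least squares for all responses, then project the coefficient matrix onto the leading right singular vectors of the fitted values. This sketch assumes identity weighting of the responses and omits the biplot itself; the function name and simulated data are hypothetical.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank):
    """Rank-constrained coefficient matrix (identity weighting assumed)."""
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Principal directions of the fitted values X @ B_ols.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T
    return B_ols @ V_r @ V_r.T


# Simulated data: 4 predictors, 3 related responses.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
Y = X @ rng.normal(size=(4, 3)) + 0.1 * rng.normal(size=(50, 3))
B_rr = reduced_rank_regression(X, Y, rank=1)
print(np.linalg.matrix_rank(B_rr))
```

A rank-2 coefficient matrix factors into two low-dimensional sets of scores, which is what allows predictors and responses to be drawn together in a two-dimensional biplot.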
Growth Regression and Economic Theory
Elbers, Chris; Gunning, Jan Willem
2002-01-01
In this note we show that the standard, loglinear growth regression specification is consistent with one and only one model in the class of stochastic Ramsey models. This model is highly restrictive: it requires a Cobb-Douglas technology and a 100% depreciation rate and it implies that risk does not
Regression of lumbar disk herniation
Directory of Open Access Journals (Sweden)
G. Yu Evzikov
2015-01-01
Full Text Available Compression of the spinal nerve root, giving rise to pain and sensory and motor disorders in the area of its innervation, is the most vivid manifestation of herniated intervertebral disk. Different treatment modalities, including neurosurgery, for these conditions are discussed. There is recent evidence that disk herniation can regress spontaneously. The paper describes a female patient with a large lateralized disc extrusion that caused compression of the S1 nerve root, leading to obvious myotonic and radicular syndrome. Magnetic resonance imaging showed that the clinical manifestations of discogenic radiculopathy, as well as myotonic syndrome and morphological changes, had completely regressed 8 months later. The likely mechanism is inflammation-induced resorption of a large herniated disk fragment, which agrees with the data available in the literature. A decision to perform neurosurgery, for which the patient had indications, was made during her first consultation. After regression of discogenic radiculopathy, there was only moderate pain caused by musculoskeletal diseases (facet syndrome, piriformis syndrome) that were successfully eliminated by minimally invasive techniques.
Claim reserving with fuzzy regression
Bahrami, Tahereh; Bahrami, Masuod
2015-01-01
Abstract. Claims reserving plays a key role for the insurance. Therefore, various statistical methods are used to provide for an adequate amount of claim reserves. Since claim reserves are always variable, fuzzy set theory is used to handle this variability. In this paper, non-symmetric fuzzy regression is integrated in the Taylor’s method to develop a new method for claim reserving.
Multimodality in GARCH regression models
Ooms, M.; Doornik, J.A.
2008-01-01
It is shown empirically that mixed autoregressive moving average regression models with generalized autoregressive conditional heteroskedasticity (Reg-ARMA-GARCH models) can have multimodality in the likelihood that is caused by a dummy variable in the conditional mean. Maximum likelihood estimates
Fungible Weights in Multiple Regression
Waller, Niels G.
2008-01-01
Every set of alternate weights (i.e., nonleast squares weights) in a multiple regression analysis with three or more predictors is associated with an infinite class of weights. All members of a given class can be deemed "fungible" because they yield identical "SSE" (sum of squared errors) and R² values. Equations for generating…
On Weighted Support Vector Regression
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...
PROBIT REGRESSION IN PREDICTION ANALYSIS
African Journals Online (AJOL)
Admin
2008-12-12
Dec 12, 2008 ... GLOBAL JOURNAL OF MATHEMATICAL SCIENCES VOL. ... INTRODUCTION. For some dichotomous variables, the response y is actually a proxy for a variable that is continuous (Newsom, 2005). A regression ... M. E. Nja, Dept. of Mathematics / Statistics Cross River University of Technology, Calabar ...
Ridge Regression for Interactive Models.
Tate, Richard L.
1988-01-01
An exploratory study of the value of ridge regression for interactive models is reported. Assuming that the linear terms in a simple interactive model are centered to eliminate non-essential multicollinearity, a variety of common models, representing both ordinal and disordinal interactions, are shown to have "orientations" that are…
Logistic regression: a brief primer.
Stoltzfus, Jill C
2011-10-01
Regression techniques are versatile in their application to medical research because they can measure associations, predict outcomes, and control for confounding variable effects. As one such technique, logistic regression is an efficient and powerful way to analyze the effect of a group of independent variables on a binary outcome by quantifying each independent variable's unique contribution. Using components of linear regression reflected in the logit scale, logistic regression iteratively identifies the strongest linear combination of variables with the greatest probability of detecting the observed outcome. Important considerations when conducting logistic regression include selecting independent variables, ensuring that relevant assumptions are met, and choosing an appropriate model building strategy. For independent variable selection, one should be guided by such factors as accepted theory, previous empirical investigations, clinical considerations, and univariate statistical analyses, with acknowledgement of potential confounding variables that should be accounted for. Basic assumptions that must be met for logistic regression include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers. Additionally, there should be an adequate number of events per independent variable to avoid an overfit model, with commonly recommended minimum "rules of thumb" ranging from 10 to 20 events per covariate. Regarding model building strategies, the three general types are direct/standard, sequential/hierarchical, and stepwise/statistical, with each having a different emphasis and purpose. Before reaching definitive conclusions from the results of any of these methods, one should formally quantify the model's internal validity (i.e., replicability within the same data set) and external validity (i.e., generalizability beyond the current sample). The resulting logistic regression model
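The events-per-variable rule of thumb quoted above is easy to check before fitting a model. In the sketch below, the convention of counting the rarer outcome class as the "events" is an assumption, and the function names, thresholds as defaults, and counts are hypothetical.

```python
def events_per_variable(n_events, n_nonevents, n_covariates):
    """Events per covariate, using the rarer outcome class (assumed convention)."""
    if n_covariates <= 0:
        raise ValueError("need at least one covariate")
    return min(n_events, n_nonevents) / n_covariates


def epv_assessment(epv, low=10.0, high=20.0):
    """Compare EPV with the 10-to-20 range of commonly cited minimums."""
    if epv < low:
        return "below the commonly cited minimum; risk of an overfit model"
    if epv < high:
        return "within the range of commonly cited minimums"
    return "above the commonly cited minimums"


# Hypothetical study: 60 events, 340 non-events, 8 candidate covariates.
epv = events_per_variable(60, 340, 8)
print(epv, epv_assessment(epv))
```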
Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors.
Woodard, Dawn B; Crainiceanu, Ciprian; Ruppert, David
2013-01-01
We propose a new method for regression using a parsimonious and scientifically interpretable representation of functional predictors. Our approach is designed for data that exhibit features such as spikes, dips, and plateaus whose frequency, location, size, and shape varies stochastically across subjects. We propose Bayesian inference of the joint functional and exposure models, and give a method for efficient computation. We contrast our approach with existing state-of-the-art methods for regression with functional predictors, and show that our method is more effective and efficient for data that include features occurring at varying locations. We apply our methodology to a large and complex dataset from the Sleep Heart Health Study, to quantify the association between sleep characteristics and health outcomes. Software and technical appendices are provided in online supplemental materials.
Investigating the Accuracy of Three Estimation Methods for Regression Discontinuity Design
Sun, Shuyan; Pan, Wei
2013-01-01
Regression discontinuity design is an alternative to randomized experiments to make causal inference when random assignment is not possible. This article first presents the formal identification and estimation of regression discontinuity treatment effects in the framework of Rubin's causal model, followed by a thorough literature review of…
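One simple estimator for a sharp regression discontinuity fits a separate straight line on each side of the cutoff, within a bandwidth, and takes the difference of the two fitted values at the cutoff. The sketch below is only that generic local linear idea; it is not claimed to be one of the three methods compared in the article, and the bandwidth, names, and noise-free data are hypothetical.

```python
def _ols_line(x, y):
    """Intercept and slope of a least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def sharp_rd_effect(running, outcome, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff (treated minus untreated)."""
    # Centre the running variable so each intercept is the value at the cutoff.
    left = [(r - cutoff, o) for r, o in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, o) for r, o in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    a_left, _ = _ols_line(*zip(*left))
    a_right, _ = _ols_line(*zip(*right))
    return a_right - a_left


# Noise-free toy data: treatment at running >= 0 adds 2 to the outcome.
running = [i / 10 for i in range(-20, 21)]
outcome = [1.0 + 0.5 * r + (2.0 if r >= 0 else 0.0) for r in running]
effect = sharp_rd_effect(running, outcome, cutoff=0.0, bandwidth=1.0)
print(effect)
```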
Least-squares regression of adsorption equilibrium data: comparing the options.
El-Khaiary, Mohammad I
2008-10-01
Experimental and simulated adsorption equilibrium data were analyzed by different methods of least-squares regression. The methods used were linear regression, nonlinear regression, and orthogonal distance regression. The results of the regression analysis of the experimental data showed that the different regression methods produced different estimates of the adsorption isotherm parameters, and consequently, different conclusions about the surface properties of the adsorbent and the mechanism of adsorption. A Langmuir-type simulated data set was calculated and several levels of random error were added to the data set. The results of regression analysis of the simulated data set showed that orthogonal distance regression gives the most accurate and efficient estimates of the isotherm parameters. Nonlinear regression and one form of the linearized Langmuir isotherm also gave accurate estimates, but only at low levels of random error.
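One common linearized form of the Langmuir isotherm q = q_max*K*C / (1 + K*C) is C/q = 1/(q_max*K) + C/q_max, so a straight-line fit of C/q against C yields both parameters. The sketch below applies it to noise-free simulated data with hypothetical parameter values; it does not reproduce the article's comparison, and with noisy data the transformation changes the error structure, which is the kind of effect the article examines.

```python
def langmuir(c, q_max, k):
    """Langmuir isotherm: adsorbed amount q at equilibrium concentration c."""
    return q_max * k * c / (1.0 + k * c)


def fit_linearized_langmuir(c, q):
    """Regress C/q on C: slope = 1/q_max, intercept = 1/(q_max * K)."""
    x = list(c)
    y = [ci / qi for ci, qi in zip(c, q)]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    q_max = 1.0 / slope
    k = slope / intercept
    return q_max, k


# Noise-free simulated data with hypothetical parameters.
conc = [1.0, 2.0, 5.0, 10.0, 20.0]
q_obs = [langmuir(ci, q_max=50.0, k=0.2) for ci in conc]
q_max_hat, k_hat = fit_linearized_langmuir(conc, q_obs)
print(q_max_hat, k_hat)
```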
Meaney, Christopher; Moineddin, Rahim
2014-01-24
In biomedical research, response variables are often encountered which have bounded support on the open unit interval (0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies may include: beta regression, variable-dispersion beta regression, and fractional logit regression models. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model to that of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models. In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided. If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25) linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed if the
Statistical learning from a regression perspective
Berk, Richard A
2016-01-01
This textbook considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. This fully revised new edition includes important developments over the past 8 years. Consistent with modern data analytics, it emphasizes that a proper statistical learning data analysis derives from sound data collection, intelligent data management, appropriate statistical procedures, and an accessible interpretation of results. A continued emphasis on the implications for practice runs through the text. Among the statistical learning procedures examined are bagging, random forests, boosting, support vector machines and neural networks. Response variables may be quantitative or categorical. As in the first edition, a unifying theme is supervised learning that can be trea...
Active set support vector regression.
Musicant, David R; Feinberg, Alexander
2004-03-01
This paper presents active set support vector regression (ASVR), a new active set strategy to solve a straightforward reformulation of the standard support vector regression problem. This new algorithm is based on the successful ASVM algorithm for classification problems, and consists of solving a finite number of linear equations with a typically large dimensionality equal to the number of points to be approximated. However, by making use of the Sherman-Morrison-Woodbury formula, a much smaller matrix of the order of the original input space is inverted at each step. The algorithm requires no specialized quadratic or linear programming code, but merely a linear equation solver which is publicly available. ASVR is extremely fast, produces comparable generalization error to other popular algorithms, and is available on the web for download.
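For reference, the Sherman-Morrison-Woodbury identity that the abstract relies on can be written in its general form as below. The symbols A, U, C, V, H and ν are generic placeholders chosen for this note, not notation taken from the paper, and the second line is only an illustrative special case of the kind that makes a large n × n inverse computable from a much smaller one.

```latex
% General identity: A is n x n, U is n x m, C is m x m, V is m x n.
(A + U C V)^{-1} = A^{-1} - A^{-1} U \left( C^{-1} + V A^{-1} U \right)^{-1} V A^{-1}

% Illustrative special case with A = I/\nu, U = H, C = I, V = H^{\top} (H is n x m, m << n):
\left( \frac{I}{\nu} + H H^{\top} \right)^{-1}
  = \nu \left( I - H \left( \frac{I}{\nu} + H^{\top} H \right)^{-1} H^{\top} \right)
```

In the special case the only matrix actually inverted on the right-hand side is m × m, which is consistent with the abstract's remark that a matrix of the order of the input space is inverted at each step.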
AUTISTIC EPILEPTIFORM REGRESSION (A REVIEW)
Directory of Open Access Journals (Sweden)
L. Yu. Glukhova
2012-01-01
The author presents a review of the current scientific literature devoted to autistic epileptiform regression, a special form of autistic disorder characterized by the development of severe communicative disorders in children as a result of continuous prolonged epileptiform activity on EEG. This condition was described by R.F. Tuchman and I. Rapin in 1997. The author describes aspects of the pathogenesis, clinical picture and diagnosis of this disorder, including the peculiar anomalies on EEG (benign epileptiform patterns of childhood) with a high index of epileptiform activity, especially during sleep. Special attention is given to approaches to the treatment of autistic epileptiform regression. The efficacy of valproates, corticosteroid hormones and antiepileptic drugs of other groups is considered.
Binary data regression: Weibull distribution
Caron, Renault; Polpo, Adriano
2009-12-01
The problem of estimation in binary response data has received a great number of alternative statistical solutions. Generalized linear models allow for a wide range of statistical models for regression data. The most used model is the logistic regression, see Hosmer et al. [6]. However, as Chen et al. [5] mention, when the probability of a given binary response approaches 0 at a different rate than it approaches 1, symmetric linkages are inappropriate. A class of models based on the Weibull distribution indexed by three parameters is introduced here. Maximum likelihood methods are employed to estimate the parameters. The objective of the present paper is to show a solution for the estimation problem under the Weibull model. An example showing the quality of the model is illustrated by comparing it with the alternative probit and logit models.
Spontaneous regression of colon cancer.
Kihara, Kyoichi; Fujita, Shin; Ohshiro, Taihei; Yamamoto, Seiichiro; Sekine, Shigeki
2015-01-01
A case of spontaneous regression of transverse colon cancer is reported. A 64-year-old man was diagnosed as having cancer of the transverse colon at a local hospital. Initial and second colonoscopy examinations revealed a typical cancer of the transverse colon, which was diagnosed as moderately differentiated adenocarcinoma. The patient underwent right hemicolectomy 6 weeks after the initial colonoscopy. The resected specimen showed only a scar at the tumor site, and no cancerous tissue was proven histologically. The patient is alive with no evidence of recurrence 1 year after surgery. Although an antitumor immune response is the most likely explanation, the exact nature of the phenomenon was unclear. We describe this rare case and review the literature pertaining to spontaneous regression of colorectal cancer. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Polynomial Regressions and Nonsense Inference
Directory of Open Access Journals (Sweden)
Daniel Ventosa-Santaulària
2013-11-01
Polynomial specifications are widely used, not only in applied economics, but also in epidemiology, physics, political analysis and psychology, just to mention a few examples. In many cases, the data employed to estimate such specifications are time series that may exhibit stochastic nonstationary behavior. We extend Phillips’ results (Phillips, P. Understanding spurious regressions in econometrics. J. Econom. 1986, 33, 311–340) by proving that an inference drawn from polynomial specifications, under stochastic nonstationarity, is misleading unless the variables cointegrate. We use a generalized polynomial specification as a vehicle to study its asymptotic and finite-sample properties. Our results, therefore, lead to a call to be cautious whenever practitioners estimate polynomial regressions.
Quantile Regression With Measurement Error
Wei, Ying
2009-08-27
Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. © 2009 American Statistical Association.
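As background for the abstract above, the following minimal sketch shows the check (pinball) loss that ordinary quantile regression minimizes, in the simplest case of fitting a single constant. It does not implement the measurement-error correction proposed in the paper; the function names and the toy data are hypothetical and chosen only for illustration.

```python
def check_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(values, q, tau):
    """Sum of check losses when every value is predicted by the constant q."""
    return sum(check_loss(v - q, tau) for v in values)


def best_constant(values, tau):
    """Data point that minimizes the total check loss (a sample tau-quantile)."""
    return min(sorted(values), key=lambda q: total_check_loss(values, q, tau))


# Toy data with one large outlier (hypothetical values).
data = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(data, 0.5)  # robust to the outlier
upper_fit = best_constant(data, 0.9)   # an upper quantile tracks the outlier
```

The search is restricted to observed values because the total check loss is piecewise linear and convex in q, so a minimizer is always attained at a data point.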
Directional quantile regression in R
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2017-01-01
Vol. 53, No. 3 (2017), pp. 480-492. ISSN 0023-5954. R&D Projects: GA ČR GA14-07234S. Institutional support: RVO:67985556. Keywords: multivariate quantile; regression quantile; halfspace depth; depth contour. Subject RIV: BD - Theory of Information. Impact factor: 0.379 (2016). http://library.utia.cas.cz/separaty/2017/SI/bocek-0476587.pdf
QUANTILE CALCULUS AND CENSORED REGRESSION.
Huang, Yijian
2010-06-01
Quantile regression has been advocated in survival analysis to assess evolving covariate effects. However, challenges arise when the censoring time is not always observed and may be covariate-dependent, particularly in the presence of continuously-distributed covariates. In spite of several recent advances, existing methods either involve algorithmic complications or impose a probability grid. The former leads to difficulties in the implementation and asymptotics, whereas the latter introduces undesirable grid dependence. To resolve these issues, we develop fundamental and general quantile calculus on cumulative probability scale in this article, upon recognizing that probability and time scales do not always have a one-to-one mapping given a survival distribution. These results give rise to a novel estimation procedure for censored quantile regression, based on estimating integral equations. A numerically reliable and efficient Progressive Localized Minimization (PLMIN) algorithm is proposed for the computation. This procedure reduces exactly to the Kaplan-Meier method in the k-sample problem, and to standard uncensored quantile regression in the absence of censoring. Under regularity conditions, the proposed quantile coefficient estimator is uniformly consistent and converges weakly to a Gaussian process. Simulations show good statistical and algorithmic performance. The proposal is illustrated in the application to a clinical study.
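The abstract notes that the proposed procedure reduces to the Kaplan-Meier method in the k-sample problem. As a point of reference, a minimal sketch of the standard Kaplan-Meier product-limit estimator for one sample is given below. It is not the PLMIN algorithm; the function name and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    events[i] is True if the event was observed at times[i],
    False if the observation was right-censored at times[i].
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everything that happens at time t (events and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


# Five subjects; the second subject at t=2 and the one at t=4 are censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

Censored subjects leave the risk set without changing the survival estimate, which is why the curve has no step at t=4.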
Gaussian Process Regression Model in Spatial Logistic Regression
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.
Producing The New Regressive Left
DEFF Research Database (Denmark)
Crone, Christine
to be a committed artist, and how that translates into supporting al-Assad’s rule in Syria; the Ramadan programme Harrir Aqlak’s attempt to relaunch an intellectual renaissance and to promote religious pluralism; and finally, al-Mayadeen’s cooperation with the pan-Latin American TV station TeleSur and its ambitions...... becomes clear from the analytical chapters is the emergence of the new cross-ideological alliance of The New Regressive Left. This emerging coalition between Shia Muslims, religious minorities, parts of the Arab Left, secular cultural producers, and the remnants of the political, strategic resistance...
An Evaluation of Ridge Regression.
1981-12-01
of the parameter estimates, is a decreasing function of k. The idea of ridge regression, as suggested by Hoerl and Kennard (Ref 12:58-63), is to pick...
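To make the role of the ridge constant k concrete, here is a minimal sketch for the simplest case of one predictor and no intercept, where the ridge estimate reduces to sum(x*y) / (sum(x*x) + k). The function name and the toy data are hypothetical, and the sketch only shows how the estimate shrinks toward zero as k grows; it does not reproduce the evaluation carried out in the report.

```python
def ridge_slope(x, y, k):
    """Ridge estimate of the slope for one predictor without an intercept.

    With k = 0 this is the ordinary least squares slope through the origin.
    """
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / (sxx + k)


# Hypothetical centered data with a true slope of about 2.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.2, 2.1, 3.9]

# Larger k means more shrinkage of the estimate toward zero.
slopes = [ridge_slope(x, y, k) for k in (0.0, 1.0, 10.0)]
```

Here sum(x*x) is 10 and sum(x*y) is 20, so the three estimates are 20/10, 20/11 and 20/20.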
Varying-coefficient functional linear regression
Wu, Yichao; Fan, Jianqing; Müller, Hans-Georg
2010-01-01
Functional linear regression analysis aims to model regression relations which include a functional predictor. The analog of the regression parameter vector or matrix in conventional multivariate or multiple-response linear regression models is a regression parameter function in one or two arguments. If, in addition, one has scalar predictors, as is often the case in applications to longitudinal studies, the question arises how to incorporate these into a functional regression model. We study...
Nonparametric Regression with Common Shocks
Directory of Open Access Journals (Sweden)
Eduardo A. Souza-Rodrigues
2016-09-01
This paper considers a nonparametric regression model for cross-sectional data in the presence of common shocks. Common shocks are allowed to be very general in nature; they do not need to be finite dimensional with a known (small) number of factors. I investigate the properties of the Nadaraya-Watson kernel estimator and determine how general the common shocks can be while still obtaining meaningful kernel estimates. Restrictions on the common shocks are necessary because kernel estimators typically manipulate conditional densities, and conditional densities do not necessarily exist in the present case. By appealing to disintegration theory, I provide sufficient conditions for the existence of such conditional densities and show that the estimator converges in probability to the Kolmogorov conditional expectation given the sigma-field generated by the common shocks. I also establish the rate of convergence and the asymptotic distribution of the kernel estimator.
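For readers unfamiliar with the estimator studied in this abstract, a minimal sketch of the textbook Nadaraya-Watson kernel estimator follows. It uses a Gaussian kernel and a fixed bandwidth and ignores the common-shock structure analyzed in the paper; the function names, bandwidth and data are assumptions made for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys, with weights centered at x0."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical data on the curve y = x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]
estimate = nadaraya_watson(2.0, xs, ys, bandwidth=0.5)
```

Because the estimate is a weighted average with positive weights, it always lies between the smallest and largest observed response; here it sits slightly above 4 because the curve is convex around x = 2.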
Practical Session: Multiple Linear Regression
Clausel, M.; Grégoire, G.
2014-12-01
Three exercises are proposed to illustrate simple linear regression. The first one investigates the influence of several factors on atmospheric pollution. It was proposed by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data from 20 U.S. cities. Exercise 2 is an introduction to model selection, whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 were proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).
Kernel Multitask Regression for Toxicogenetics.
Bernard, Elsa; Jiao, Yunlong; Scornet, Erwan; Stoven, Veronique; Walter, Thomas; Vert, Jean-Philippe
2017-10-01
The development of high-throughput in vitro assays to study quantitatively the toxicity of chemical compounds on genetically characterized human-derived cell lines paves the way to predictive toxicogenetics, where one would be able to predict the toxicity of any particular compound on any particular individual. In this paper we present a machine learning-based approach for that purpose, kernel multitask regression (KMR), which combines chemical characterizations of molecular compounds with genetic and transcriptomic characterizations of cell lines to predict the toxicity of a given compound on a given cell line. We demonstrate the relevance of the method on the recent DREAM8 Toxicogenetics challenge, where it ranked among the best state-of-the-art models, and discuss the importance of choosing good descriptors for cell lines and chemicals. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Lumbar herniated disc: spontaneous regression.
Altun, Idiris; Yüksel, Kasım Zafer
2017-01-01
Low back pain is a frequent condition that results in substantial disability and causes admission of patients to neurosurgery clinics. The aim was to evaluate and present the therapeutic outcomes in lumbar disc hernia (LDH) patients treated by means of a conservative approach, consisting of bed rest and medical therapy. This retrospective cohort study was carried out in the neurosurgery departments of hospitals in Kahramanmaraş city, and 23 patients diagnosed with LDH at the levels of L3-L4, L4-L5 or L5-S1 were enrolled. The average age was 38.4 ± 8.0 and the chief complaint was low back pain and sciatica radiating to one or both lower extremities. Conservative treatment was administered. Neurological examination findings, durations of treatment and intervals until symptomatic recovery were recorded. Lasègue tests and neurosensory examination revealed that mild neurological deficits existed in 16 of our patients. Previously, 5 patients had received physiotherapy and 7 patients had been on medical treatment. The numbers of patients with LDH at the levels of L3-L4, L4-L5, and L5-S1 were 1, 13, and 9, respectively. All patients reported that they had benefited from medical treatment and bed rest, and radiologic improvement was observed simultaneously on MRI scans. The average duration until symptomatic recovery and/or regression of LDH symptoms was 13.6 ± 5.4 months (range: 5-22). It should be kept in mind that lumbar disc hernias could regress with medical treatment and rest without surgery, and there should be an awareness that these patients could recover radiologically. This condition must be taken into account during decision making for surgical intervention in LDH patients devoid of indications for emergent surgery.
A fitter use of Monte Carlo simulations in regression models
Directory of Open Access Journals (Sweden)
Alessandro Ferrarini
2011-12-01
In this article, I focus on the use of Monte Carlo simulations (MCS) within regression models, an application that is very frequent in biology, ecology and economics. I am interested in highlighting a typical fault in this application of MCS: the correlations among the independent variables are not used when generating random numbers that fit their distributions. By means of an illustrative example, I show that this misuse of MCS in regression models produces misleading results. I also provide a solution to this problem.
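The fault described in this abstract can be illustrated with a small sketch. One common way to preserve the correlation between two independent variables when simulating them (not necessarily the author's solution, which the abstract does not describe) is to build the second variable from the first using the target correlation rho. The function names, sample size and value of rho below are assumptions for illustration, and both variables are taken to be standard normal.

```python
import math
import random


def correlated_normal_pairs(n, rho, seed=0):
    """Draw n pairs of standard normal values with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # 2x2 Cholesky factor of the correlation matrix [[1, rho], [rho, 1]].
        pairs.append((z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(p[0] for p in pairs) / n
    mean_y = sum(p[1] for p in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)


sample = correlated_normal_pairs(20000, rho=0.7)
r = pearson(sample)  # close to 0.7, unlike independently drawn columns
```

Drawing the two columns independently would instead give a sample correlation near zero, which is the situation the abstract warns against.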
Inconsistency Between Univariate and Multiple Logistic Regressions
WANG, HONGYUE; Peng, Jing; Wang, Bokai; Lu, Xiang; ZHENG, Julia Z.; Wang, Kejia; Tu, Xin M.; Feng, Changyong
2017-01-01
Summary Logistic regression is a popular statistical method in studying the effects of covariates on binary outcomes. It has been widely used in both clinical trials and observational studies. However, the results from the univariate regression and from the multiple logistic regression tend to be conflicting. A covariate may show very strong effect on the outcome in the multiple regression but not in the univariate regression, and vice versa. These facts have not been well appreciated in biom...
Knowledge and Awareness: Linear Regression
Directory of Open Access Journals (Sweden)
Monika Raghuvanshi
2016-12-01
Knowledge and awareness are factors guiding development of an individual. These may seem simple and practicable, but in reality a proper combination of these is a complex task. Economically driven state of development in younger generations is an impediment to the correct manner of development. As youths are at the learning phase, they can be molded to follow a correct lifestyle. Awareness and knowledge are important components of any formal or informal environmental education. The purpose of this study is to evaluate the relationship of these components among students of secondary/senior secondary schools who have undergone a formal study of environment in their curricula. A suitable instrument is developed in order to measure the elements of Awareness and Knowledge among the participants of the study. Data were collected from various secondary and senior secondary school students in the age group 14 to 20 years using a cluster sampling technique from the city of Bikaner, India. Linear regression analysis was performed using the IBM SPSS 23 statistical tool. There exists a weak relation between knowledge and awareness about environmental issues, caused by mishandling in routine practices; hence one component can be complemented by the other for improvement in both. Knowledge and awareness are crucial factors and can provide huge opportunities in any field. Resource utilization for economic solutions may pave the way for eco-friendly products and practices. If green practices are inculcated at the learning phase, they may become normal routine. This will also help in replenishment of the environment.
Estimating equivalence with quantile regression
Cade, B.S.
2011-01-01
Equivalence testing and corresponding confidence interval estimates are used to provide more enlightened statistical statements about parameter estimates by relating them to intervals of effect sizes deemed to be of scientific or practical importance rather than just to an effect size of zero. Equivalence tests and confidence interval estimates are based on a null hypothesis that a parameter estimate is either outside (inequivalence hypothesis) or inside (equivalence hypothesis) an equivalence region, depending on the question of interest and assignment of risk. The former approach, often referred to as bioequivalence testing, is often used in regulatory settings because it reverses the burden of proof compared to a standard test of significance, following a precautionary principle for environmental protection. Unfortunately, many applications of equivalence testing focus on establishing average equivalence by estimating differences in means of distributions that do not have homogeneous variances. I discuss how to compare equivalence across quantiles of distributions using confidence intervals on quantile regression estimates that detect differences in heterogeneous distributions missed by focusing on means. I used one-tailed confidence intervals based on inequivalence hypotheses in a two-group treatment-control design for estimating bioequivalence of arsenic concentrations in soils at an old ammunition testing site and bioequivalence of vegetation biomass at a reclaimed mining site. Two-tailed confidence intervals based both on inequivalence and equivalence hypotheses were used to examine quantile equivalence for negligible trends over time for a continuous exponential model of amphibian abundance. © 2011 by the Ecological Society of America.
Mondrian Forests: Efficient Online Random Forests
Lakshminarayanan, Balaji; Roy, Daniel M.; Teh, Yee Whye
2014-01-01
Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics. Random forests achieve competitive predictive performance and are computationally efficient to train and test, making them excellent candidates for real-world prediction tasks. The most popular random forest variants (such as Breiman's random forest and extremely randomized trees) operate on batches of training data. Online ...
Hazard rate estimation in nonparametric regression with censored data
VAN KEILEGOM, Ingrid; VERAVERBEKE, Noel
2001-01-01
Consider a regression model in which the responses are subject to random right censoring. In this model, Beran studied the nonparametric estimation of the conditional cumulative hazard function and the corresponding cumulative distribution function. The main idea is to use smoothing in the covariates. Here we study asymptotic properties of the corresponding hazard function estimator obtained by convolution smoothing of Beran’s cumulative hazard estimator. We establish asymptotic expressions ...
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces the indices used to diagnose multicollinearity, the basic principle of principal component regression and the method for determining the 'best' equation. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. Carrying out principal component regression analysis with SPSS makes the statistical analysis simpler, faster and more accurate.
Edgington, Eugene
2007-01-01
Statistical Tests That Do Not Require Random Sampling Randomization Tests Numerical Examples Randomization Tests and Nonrandom Samples The Prevalence of Nonrandom Samples in Experiments The Irrelevance of Random Samples for the Typical Experiment Generalizing from Nonrandom Samples Intelligibility Respect for the Validity of Randomization Tests Versatility Practicality Precursors of Randomization Tests Other Applications of Permutation Tests Questions and Exercises Notes References Randomized Experiments Unique Benefits of Experiments Experimentation without Mani
A Spreadsheet Model for Teaching Regression Analysis.
Wood, William C.; O'Hare, Sharon L.
1992-01-01
Presents a spreadsheet model that is useful in introducing students to regression analysis and the computation of regression coefficients. Includes spreadsheet layouts and formulas so that the spreadsheet can be implemented. (Author)
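The computation of regression coefficients that such a spreadsheet walks through can be sketched in a few lines. The version below uses the standard least squares formulas for one predictor (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x); the function name and the data are hypothetical and are not taken from the article's spreadsheet layout.

```python
def simple_regression(x, y):
    """Return (intercept, slope) of the least squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means, as in a spreadsheet column.
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = simple_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

Each intermediate quantity (the means, Sxx, Sxy) corresponds to one cell or column total in a spreadsheet implementation, which is what makes the calculation easy to follow step by step.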
Complete regression of primary malignant melanoma.
Emanuel, Patrick O; Mannion, Meghan; Phelps, Robert G
2008-04-01
Over the years, histopathologic studies to determine the nature and significance of regression in malignant melanoma have yielded different results. At least in part, this most likely reflects differences in the definition of what constitutes regression. Although partial regression is relatively common, complete regression is rare. It has been said that complete regression of a primary lesion is associated with metastatic disease, but the evidence for this is largely anecdotal: the literature contains only case reports and small series. We found 2 cases of complete regression in our dermatopathology database. Metastatic disease was identified in both cases; in 1 case, the suspicion of melanoma was raised on the initial biopsy and subsequent workup revealed lymph node metastasis. These cases illustrate the histologic features of a completely regressed primary melanoma and add credence to the theory that completely regressed melanoma is associated with a poor outcome.
Unbalanced Regressions and the Predictive Equation
DEFF Research Database (Denmark)
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoreti...
Power analysis for a linear regression model when regressors are matrix sampled
Kolenikov, Stanislav; Hammer, Heather
2017-01-01
Multiple matrix sampling is a survey methodology technique that randomly chooses a relatively small subset of items to be presented to survey respondents for the purpose of reducing respondent burden. The data produced are missing completely at random (MCAR), and special missing data techniques should be used in linear regression and other multivariate statistical analysis. We derive asymptotic variances of regression parameter estimates that allow us to conduct power analysis for linear regr...
Semiparametric regression during 2003–2007
Ruppert, David
2009-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.
Regression with Sparse Approximations of Data
DEFF Research Database (Denmark)
Noorzad, Pardis; Sturm, Bob L.
2012-01-01
We propose sparse approximation weighted regression (SPARROW), a method for local estimation of the regression function that uses sparse approximation with a dictionary of measurements. SPARROW estimates the regression function at a point with a linear combination of a few regressands selected by...... on the sparse approximation process. Our experimental results show the locally constant form of SPARROW performs competitively....
Regression Analysis by Example. 5th Edition
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Standards for Standardized Logistic Regression Coefficients
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Forecasting Using Random Subspace Methods
T. Boot (Tom); D. Nibbering (Didier)
2016-01-01
Random subspace methods are a novel approach to obtain accurate forecasts in high-dimensional regression settings. We provide a theoretical justification of the use of random subspace methods and show their usefulness when forecasting monthly macroeconomic variables. We focus on two
A note on the maximum likelihood estimator in the gamma regression model
Directory of Open Access Journals (Sweden)
Jerzy P. Rydlewski
2009-01-01
This paper considers a nonlinear regression model, in which the dependent variable has the gamma distribution. A model is considered in which the shape parameter of the random variable is the sum of continuous and algebraically independent functions. The paper proves that there is exactly one maximum likelihood estimator for the gamma regression model.
Shear, Benjamin R.; Zumbo, Bruno D.
2013-01-01
Type I error rates in multiple regression, and hence the chance for false positive research findings, can be drastically inflated when multiple regression models are used to analyze data that contain random measurement error. This article shows the potential for inflated Type I error rates in commonly encountered scenarios and provides new…
Design and analysis of experiments classical and regression approaches with SAS
Onyiah, Leonard C
2008-01-01
Introductory Statistical Inference and Regression Analysis Elementary Statistical Inference Regression Analysis Experiments, the Completely Randomized Design (CRD)-Classical and Regression Approaches Experiments Experiments to Compare Treatments Some Basic Ideas Requirements of a Good Experiment One-Way Experimental Layout or the CRD: Design and Analysis Analysis of Experimental Data (Fixed Effects Model) Expected Values for the Sums of Squares The Analysis of Variance (ANOVA) Table Follow-Up Analysis to Check fo
Arruda, A G; Godden, S; Rapnicki, P; Gorden, P; Timms, L; Aly, S S; Lehenbauer, T W; Champagne, J
2013-10-01
The objective of this randomized noninferiority clinical trial was to compare the effect of treatment with 3 different dry cow therapy formulations at dry-off on cow-level health and production parameters in the first 100 d in milk (DIM) in the subsequent lactation, including 305-d mature-equivalent (305 ME) milk production, linear score (LS), risk for the cow experiencing a clinical mastitis event, risk for culling or death, and risk for pregnancy by 100 DIM. A total of 1,091 cows from 6 commercial dairy herds in 4 states (California, Iowa, Minnesota, and Wisconsin) were randomly assigned at dry-off to receive treatment with 1 of 3 commercial products: Quartermaster (QT; Zoetis Animal Health, Madison, NJ), Spectramast DC (SP; Zoetis Animal Health) or ToMorrow Dry Cow (TM; Boehringer Ingelheim Vetmedica Inc., St Joseph, MO). All clinical mastitis, pregnancy, culling, and death events occurring in the first 100 DIM were recorded by farm staff using an on-farm electronic record-keeping system. Dairy Herd Improvement Association test-day records of milk production and milk component testing were retrieved electronically. Mixed linear regression analysis was used to describe the effect of treatment on 305ME milk production and LS recorded on the last Dairy Herd Improvement Association test day before 100 DIM. Cox proportional hazards regression analysis was used to describe the effect of treatment on risk for experiencing a case of clinical mastitis, risk for leaving the herd, and risk for pregnancy between calving and 100 DIM. Results showed no effect of treatment on adjusted mean 305 ME milk production (QT=11,759 kg, SP=11,574 kg, and TM=11,761 kg) or adjusted mean LS (QT=1.8, SP=1.9, and TM=1.6) on the last test day before 100 DIM. Similarly, no effect of treatment was observed on risk for a clinical mastitis event (QT=14.8%, SP=12.7%, and TM=15.0%), risk for leaving the herd (QT=7.5%, SP=9.2%, and TM=10.3%), or risk for pregnancy (QT=31.5%, SP=26.1%, and TM=26
Fully Regressive Melanoma: A Case Without Metastasis.
Ehrsam, Eric; Kallini, Joseph R; Lebas, Damien; Khachemoune, Amor; Modiano, Philippe; Cotten, Hervé
2016-08-01
Fully regressive melanoma is a phenomenon in which the primary cutaneous melanoma becomes completely replaced by fibrotic components as a result of host immune response. Although 10 to 35 percent of cases of cutaneous melanomas may partially regress, fully regressive melanoma is very rare; only 47 cases have been reported in the literature to date. All of the cases of fully regressive melanoma reported in the literature were diagnosed in conjunction with metastasis in the patient. The authors describe a case of fully regressive melanoma without any metastases at the time of its diagnosis. Characteristic findings on dermoscopy, as well as the absence of melanoma on final biopsy, confirmed the diagnosis.
Analyzing genotype-by-environment interaction using curvilinear regression
Directory of Open Access Journals (Sweden)
Dulce Gamito Santinhos Pereira
2012-12-01
In the context of multi-environment trials, where a series of experiments is conducted across different environmental conditions, the analysis of the structure of genotype-by-environment interaction is an important topic. This paper presents a generalization of the joint regression analysis for the cases where the response (e.g. yield) is not linear across environments and can be written as a second (or higher) order polynomial or another non-linear function. After identifying the common form regression function for all genotypes, we propose a selection procedure based on the adaptation of two tests: (i) a test for parallelism of regression curves; and (ii) a test of coincidence for those regressions. When the hypothesis of parallelism is rejected, subgroups of genotypes where the responses are parallel (or coincident) should be identified. The use of the Scheffé multiple comparison method for regression coefficients in second-order polynomials allows the genotypes to be grouped into two types of groups: one with upward-facing concavity (i.e. potential yield growth), and the other with downward-facing concavity (i.e. the yield approaches saturation). Theoretical results for genotype comparison and genotype selection are illustrated with an example of yield from a non-orthogonal series of experiments with winter rye (Secale cereale L.). We have deleted 10 % of that data at random to show that our methodology is fully applicable to incomplete data sets, often observed in multi-environment trials.
Spontaneous Regression of Lumbar Herniated Disc
Directory of Open Access Journals (Sweden)
Chun-Wei Chang
2009-12-01
Intervertebral disc herniation of the lumbar spine is a common disease presenting with low back pain and involving nerve root radiculopathy. Some neurological symptoms in the majority of patients frequently improve after a period of conservative treatment. This has been regarded as the result of a decrease of pressure exerted from the herniated disc on neighboring neurostructures and a gradual regression of inflammation. Recently, with advances in magnetic resonance imaging, many reports have demonstrated that the herniated disc has the potential for spontaneous regression. Regression coincided with the improvement of associated symptoms. However, the exact regression mechanism remains unclear. Here, we present 2 cases of lumbar intervertebral disc herniation with spontaneous regression. We review the literature and discuss the possible mechanisms, the precipitating factors of spontaneous disc regression and the proper timing of surgical intervention.
Applied regression analysis a research tool
Pantula, Sastry; Dickey, David
1998-01-01
Least squares estimation, when used appropriately, is a powerful research tool. A deeper understanding of the regression concepts is essential for achieving optimal benefits from a least squares analysis. This book builds on the fundamentals of statistical methods and provides appropriate concepts that will allow a scientist to use least squares as an effective research tool. Applied Regression Analysis is aimed at the scientist who wishes to gain a working knowledge of regression analysis. The basic purpose of this book is to develop an understanding of least squares and related statistical methods without becoming excessively mathematical. It is the outgrowth of more than 30 years of consulting experience with scientists and many years of teaching an applied regression course to graduate students. Applied Regression Analysis serves as an excellent text for a service course on regression for non-statisticians and as a reference for researchers. It also provides a bridge between a two-semester introduction to...
Gong, Qi; Schaubel, Douglas E
2018-01-22
Mean survival time is often of inherent interest in medical and epidemiologic studies. In the presence of censoring and when covariate effects are of interest, Cox regression is the strong default, but mostly due to convenience and familiarity. When survival times are uncensored, covariate effects can be estimated as differences in mean survival through linear regression. Tobit regression can validly be performed through maximum likelihood when the censoring times are fixed (i.e., known for each subject, even in cases where the outcome is observed). However, Tobit regression is generally inapplicable when the response is subject to random right censoring. We propose Tobit regression methods based on weighted maximum likelihood which are applicable to survival times subject to both fixed and random censoring times. Under the proposed approach, known right censoring is handled naturally through the Tobit model, with inverse probability of censoring weighting used to overcome random censoring. Essentially, the re-weighted data are intended to represent those that would have been observed in the absence of random censoring. We develop methods for estimating the Tobit regression parameter, then the population mean survival time. A closed form large-sample variance estimator is proposed for the regression parameter estimator, with a semiparametric bootstrap standard error estimator derived for the population mean. The proposed methods are easily implementable using standard software. Finite-sample properties are assessed through simulation. The methods are applied to a large cohort of patients wait-listed for kidney transplantation. Copyright © 2018 John Wiley & Sons, Ltd.
Regression techniques for Portfolio Optimisation using MOSEK
Schmelzer, Thomas; Hauser, Raphael; Andersen, Erling; Dahl, Joachim
2013-01-01
Regression is widely used by practitioners across many disciplines. We reformulate the underlying optimisation problem as a second-order conic program providing the flexibility often needed in applications. Using examples from portfolio management and quantitative trading we solve regression problems with and without constraints. Several Python code fragments are given. The code and data are available online at http://www.github.com/tschm/MosekRegression.
Bulcock, J. W.
The problem of model estimation when the data are collinear was examined. Though the ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…
New ridge parameters for ridge regression
Directory of Open Access Journals (Sweden)
A.V. Dorugade
2014-04-01
Hoerl and Kennard (1970a) introduced the ridge regression estimator as an alternative to the ordinary least squares (OLS) estimator in the presence of multicollinearity. In ridge regression, the ridge parameter plays an important role in parameter estimation. In this article, a new method for estimating ridge parameters in both situations of ordinary ridge regression (ORR) and generalized ridge regression (GRR) is proposed. The simulation study evaluates the performance of the proposed estimator based on the mean squared error (MSE) criterion and indicates that under certain conditions the proposed estimators perform well compared to OLS and other well-known estimators reviewed in this article.
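For orientation, the ordinary ridge estimator that this abstract builds on has the well-known form b(k) = (X'X + kI)^-1 X'y, which reduces to OLS at k = 0. The following is a minimal sketch for two centred predictors, using only the Python standard library. The function name `ridge_2d`, the toy data, and the value k = 1.0 are illustrative assumptions; in particular, k here is an arbitrary choice and not the ridge parameter proposed in the article.

```python
def ridge_2d(x1, x2, y, k):
    """Ridge estimate for two centred predictors: solves (X'X + k*I) b = X'y."""
    s11 = sum(a * a for a in x1) + k          # diagonal entries get the ridge term k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))  # off-diagonal entry is unchanged
    g1 = sum(a * c for a, c in zip(x1, y))    # X'y
    g2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12               # 2x2 solve by Cramer's rule
    return ((s22 * g1 - s12 * g2) / det, (s11 * g2 - s12 * g1) / det)


# Hypothetical, nearly collinear toy data (all three series sum to zero).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.0, 0.9, 2.1]
y = [-4.0, -2.1, 0.1, 1.9, 4.1]

ols = ridge_2d(x1, x2, y, k=0.0)    # k = 0 reduces to ordinary least squares
ridge = ridge_2d(x1, x2, y, k=1.0)  # k > 0 shrinks the coefficient vector
```

With collinear predictors the OLS coefficients are unstable; a positive k trades a little bias for a shorter, more stable coefficient vector, which is the trade-off the MSE criterion in the abstract evaluates.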
Directory of Open Access Journals (Sweden)
Severino Benone Paes Barbosa
2007-02-01
samples before analysis (DS) and the random effects of herd, cow and residual. Data were analyzed by REML using the spatial power procedure. Observed means for SCC (×1000 cells/mL) and SCS ranged from 400,245±468,558 and 3.55±0.82 in 1993 to 241,360±514,969 and 2.41±1.35 in 1998, respectively. DIM, AC, MTD and DS effects were significant for SCC and SCS. The phenotypic correlations between SCC and SCS on consecutive test-days suggest it is possible to establish management routines to reduce the presence of clinical and/or subclinical mastitis in first lactation Holstein cows.
Panel regressions to estimate low-flow response to rainfall variability in ungaged basins
Bassiouni, Maoya; Vogel, Richard M.; Archfield, Stacey A.
2016-12-01
Multicollinearity and omitted-variable bias are major limitations to developing multiple linear regression models to estimate streamflow characteristics in ungaged areas and varying rainfall conditions. Panel regression is used to overcome limitations of traditional regression methods, and obtain reliable model coefficients, in particular to understand the elasticity of streamflow to rainfall. Using annual rainfall and selected basin characteristics at 86 gaged streams in the Hawaiian Islands, regional regression models for three stream classes were developed to estimate the annual low-flow duration discharges. Three panel-regression structures (random effects, fixed effects, and pooled) were compared to traditional regression methods, in which space is substituted for time. Results indicated that panel regression generally was able to reproduce the temporal behavior of streamflow and reduce the standard errors of model coefficients compared to traditional regression, even for models in which the unobserved heterogeneity between streams is significant and the variance inflation factor for rainfall is much greater than 10. This is because both spatial and temporal variability were better characterized in panel regression. In a case study, regional rainfall elasticities estimated from panel regressions were applied to ungaged basins on Maui, using available rainfall projections to estimate plausible changes in surface-water availability and usable stream habitat for native species. The presented panel-regression framework is shown to offer benefits over existing traditional hydrologic regression methods for developing robust regional relations to investigate streamflow response in a changing climate.
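As a rough illustration of the fixed-effects structure mentioned above, the "within" estimator removes each stream's time-invariant level by demeaning the data per stream before fitting a slope. The sketch below uses a single regressor and the Python standard library only. The function name `within_slope` and the two-stream toy data are hypothetical, and the authors' actual model specification (for example log-transformed variables for elasticities, or additional basin characteristics) is not given in the abstract.

```python
from collections import defaultdict


def within_slope(groups, x, y):
    """Fixed-effects (within) slope: demean x and y per group, then pooled OLS."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # per group: sum x, sum y, count
    for g, xi, yi in zip(groups, x, y):
        totals[g][0] += xi
        totals[g][1] += yi
        totals[g][2] += 1
    num = den = 0.0
    for g, xi, yi in zip(groups, x, y):
        sx, sy, n = totals[g]
        dx = xi - sx / n   # deviation from the group's own mean
        dy = yi - sy / n
        num += dx * dy
        den += dx * dx
    return num / den


# Hypothetical data: two streams with different baselines but the same
# response of low flow to rainfall (slope 0.5).
stream = ["A", "A", "A", "B", "B", "B"]
rain = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
flow = [10.5, 11.0, 11.5, 4.0, 4.5, 5.0]

slope = within_slope(stream, rain, flow)
```

A pooled regression on these toy data would be dominated by the difference in baselines between the two streams; the within transformation isolates the temporal response, which is the kind of unobserved heterogeneity the abstract refers to.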
Test-day models : breeding value estimation based on individual test-day records
Pool, M.H.
2000-01-01
The studies described in this thesis were achieved within the graduate school Wageningen Institute of Animal Science (WIAS), carried out at the Institute for Animal Science and Health (ID-Lelystad BV) at the department of Genetics and Reproduction, and financially supported by the product
Robust regression for large-scale neuroimaging studies.
Fritsch, Virgile; Da Mota, Benoit; Loth, Eva; Varoquaux, Gaël; Banaschewski, Tobias; Barker, Gareth J; Bokde, Arun L W; Brühl, Rüdiger; Butzek, Brigitte; Conrod, Patricia; Flor, Herta; Garavan, Hugh; Lemaitre, Hervé; Mann, Karl; Nees, Frauke; Paus, Tomas; Schad, Daniel J; Schümann, Gunter; Frouin, Vincent; Poline, Jean-Baptiste; Thirion, Bertrand
2015-05-01
Multi-subject datasets used in neuroimaging group studies have a complex structure, as they exhibit non-stationary statistical properties across regions and display various artifacts. While studies with small sample sizes can rarely be shown to deviate from standard hypotheses (such as the normality of the residuals) due to the poor sensitivity of normality tests with low degrees of freedom, large-scale studies (e.g. >100 subjects) exhibit more obvious deviations from these hypotheses and call for more refined models for statistical inference. Here, we demonstrate the benefits of robust regression as a tool for analyzing large neuroimaging cohorts. First, we use an analytic test based on robust parameter estimates; based on simulations, this procedure is shown to provide an accurate statistical control without resorting to permutations. Second, we show that robust regression yields more detections than standard algorithms using as an example an imaging genetics study with 392 subjects. Third, we show that robust regression can avoid false positives in a large-scale analysis of brain-behavior relationships with over 1500 subjects. Finally we embed robust regression in the Randomized Parcellation Based Inference (RPBI) method and demonstrate that this combination further improves the sensitivity of tests carried out across the whole brain. Altogether, our results show that robust procedures provide important advantages in large-scale neuroimaging group studies. Copyright © 2015 Elsevier Inc. All rights reserved.
Bayes Estimation of Two-Phase Linear Regression Model
Directory of Open Access Journals (Sweden)
Mayuri Pandya
2011-01-01
Let the regression model be $Y_i = \beta_1 X_i + \varepsilon_i$, where the $\varepsilon_i$ are i.i.d. $N(0, \sigma^2)$ random errors with variance $\sigma^2 > 0$, but later it was found that there was a change in the system at some point of time $m$, reflected in the sequence after $X_m$ by a change in slope to the regression parameter $\beta_2$. The problem of study is when and where this change started occurring. This is called the change point inference problem. The estimators of $m$, $\beta_1$, $\beta_2$ are derived under asymmetric loss functions, namely the Linex loss and General Entropy loss functions. The effects of correct and wrong prior information on the Bayes estimates are studied.
On Regression Estimators Using Extreme Ranked Set Samples
Directory of Open Access Journals (Sweden)
Hani M. Samawi
2004-06-01
Regression is used to estimate the population mean of the response variable in the two cases where the population mean of the concomitant (auxiliary) variable is known and where it is unknown. In the latter case, a double sampling method is used to estimate the population mean of the concomitant variable. We investigate the performance of the two methods using extreme ranked set sampling (ERSS), as discussed by Samawi et al. (1996). Theoretical and Monte Carlo evaluation results as well as an illustration using actual data are presented. The results show that if the underlying joint distribution of the response and concomitant variables is symmetric, then using ERSS to obtain regression estimates is more efficient than using ranked set sampling (RSS) or simple random sampling (SRS).
The regress problem : Metatheory, development, and criticism
Peijnenburg, Jeanne; Aikin, Scott
This introduction presents selected proceedings of a two-day meeting on the regress problem, sponsored by the Netherlands Organization for Scientific Research (NWO) and hosted by Vanderbilt University in October 2013, along with other submitted essays. Three forms of research on the regress problem
A Simulation Investigation of Principal Component Regression.
Allen, David E.
Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…
Regression Analysis and the Sociological Imagination
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Regression Analysis: Legal Applications in Institutional Research
Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.
2008-01-01
This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…
Variable importance in latent variable regression models
Kvalheim, O.M.; Arneberg, R.; Bleie, O.; Rajalahti, T.; Smilde, A.K.; Westerhuis, J.A.
2014-01-01
The quality and practical usefulness of a regression model are a function of both interpretability and prediction performance. This work presents some new graphical tools for improved interpretation of latent variable regression models that can also assist in improved algorithms for variable
An identity for kernel ridge regression
Zhdanov, Fedor; Kalnishkan, Yuri
2013-01-01
This paper derives an identity connecting the square loss of ridge regression in on-line mode with the loss of the retrospectively best regressor. Some corollaries about the properties of the cumulative loss of on-line ridge regression are also obtained.
ON REGRESSION REPRESENTATIONS OF STOCHASTIC-PROCESSES
RUSCHENDORF, L; DEVALK
We construct a.s. nonlinear regression representations of general stochastic processes $(X_n)_{n \in \mathbb{N}}$. As a consequence we obtain in particular special regression representations of Markov chains and of certain m-dependent sequences. For m-dependent sequences we obtain a constructive
Pathological assessment of liver fibrosis regression
Directory of Open Access Journals (Sweden)
WANG Bingqiong
2017-03-01
Hepatic fibrosis is the common pathological outcome of chronic hepatic diseases. An accurate assessment of fibrosis degree provides an important reference for a definite diagnosis of diseases, treatment decision-making, treatment outcome monitoring, and prognostic evaluation. At present, many clinical studies have proven that regression of hepatic fibrosis and early-stage liver cirrhosis can be achieved by effective treatment, and a correct evaluation of fibrosis regression has become a hot topic in clinical research. Liver biopsy has long been regarded as the gold standard for the assessment of hepatic fibrosis, and thus it plays an important role in the evaluation of fibrosis regression. This article reviews the clinical application of current pathological staging systems in the evaluation of fibrosis regression from the perspectives of semi-quantitative scoring system, quantitative approach, and qualitative approach, in order to propose a better pathological evaluation system for the assessment of fibrosis regression.
Investigating bias in squared regression structure coefficients.
Nimon, Kim F; Zientek, Linda R; Thompson, Bruce
2015-01-01
The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients.
Regression of altitude-produced cardiac hypertrophy.
Sizemore, D. A.; Mcintyre, T. W.; Van Liere, E. J.; Wilson, M. F.
1973-01-01
The rate of regression of cardiac hypertrophy with time has been determined in adult male albino rats. The hypertrophy was induced by intermittent exposure to simulated high altitude. The percentage hypertrophy was much greater (46%) in the right ventricle than in the left (16%). The regression could be adequately fitted to a single exponential function with a half-time of 6.73 ± 0.71 days (90% CI). There was no significant difference in the rates of regression for the two ventricles.
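To make the "single exponential function with a half-time" concrete: for a decay h(t) = h0 * exp(-lam * t), the half-time is ln(2) / lam, and lam can be estimated by least squares on ln(h) against t. The sketch below is an illustration only; the abstract does not state how the authors fitted their curve, the function name `fit_half_time` and the sampling days are assumptions, and the data are synthetic, generated from the reported 46% initial right-ventricle hypertrophy and 6.73-day half-time.

```python
import math


def fit_half_time(days, hypertrophy):
    """Least-squares fit of ln(h) = ln(h0) - lam * t; returns (h0, half_time)."""
    n = len(days)
    logs = [math.log(h) for h in hypertrophy]
    t_bar = sum(days) / n
    l_bar = sum(logs) / n
    sxy = sum((t - t_bar) * (l - l_bar) for t, l in zip(days, logs))
    sxx = sum((t - t_bar) ** 2 for t in days)
    slope = sxy / sxx                        # equals -lam for a decaying curve
    h0 = math.exp(l_bar - slope * t_bar)     # back-transformed intercept
    return h0, math.log(2) / -slope


# Synthetic, noise-free data (illustration only, not the study's measurements).
days = [0, 3, 7, 14, 21]
hyp = [46.0 * 0.5 ** (t / 6.73) for t in days]

h0, t_half = fit_half_time(days, hyp)
```

On real measurements the log transform changes the error weighting, so a nonlinear least-squares fit on the original scale may give a somewhat different half-time.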
Competing Risks Quantile Regression at Work
DEFF Research Database (Denmark)
Dlugosz, Stephan; Lo, Simon M. S.; Wilke, Ralf
2017-01-01
Despite its emergence as a frequently used method for the empirical analysis of multivariate data, quantile regression is yet to become a mainstream tool for the analysis of duration data. We present a pioneering empirical study on the grounds of a competing risks quantile regression model. We us...... into the distribution of transitions out of maternity leave. It is found that cumulative incidences implied by the quantile regression model differ from those implied by a proportional hazards model. To foster the use of the model, we make an R-package (cmprskQR) available....
Particle Swarm Optimization and Regression Analysis II
Mohanty, Soumya D.
2012-10-01
In the first part of this article, Particle Swarm Optimization (PSO) was applied to the problem of optimizing knot placement in the regression spline method. Although promising for broadband signals having smooth, but otherwise unknown, waveforms, this simple approach fails in the case of narrowband signals when the carrier frequency as well as the amplitude and phase modulations are unknown. A method is presented that addresses this challenge by using PSO based regression splines for the in-phase and quadrature amplitudes separately. It is thereby seen that PSO is an effective tool for regression analysis of a broad class of signals.
Applied Regression Modeling A Business Approach
Pardoe, Iain
2012-01-01
An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculus. Regression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a
Model checking for ROC regression analysis.
Cai, Tianxi; Zheng, Yingye
2007-03-01
The receiver operating characteristic (ROC) curve is a prominent tool for characterizing the accuracy of a continuous diagnostic test. To account for factors that might influence the test accuracy, various ROC regression methods have been proposed. However, as in any regression analysis, when the assumed models do not fit the data well, these methods may render invalid and misleading results. To date, practical model-checking techniques suitable for validating existing ROC regression models are not yet available. In this article, we develop cumulative residual-based procedures to graphically and numerically assess the goodness of fit for some commonly used ROC regression models, and show how specific components of these models can be examined within this framework. We derive asymptotic null distributions for the residual processes and discuss resampling procedures to approximate these distributions in practice. We illustrate our methods with a dataset from the cystic fibrosis registry.
Weighted regression analysis and interval estimators
Donald W. Seegrist
1974-01-01
A method is given for deriving the weighted least squares estimators for the parameters of a multiple regression model. Confidence intervals for expected values and prediction intervals for the means of future samples are also given.
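As a small illustration of weighted least squares, the sketch below handles the one-predictor special case y = a + b*x, minimising the weighted sum of squared residuals. The report itself treats the general multiple regression case; the function name `wls_line`, the toy data, and the weights (imagined here as inverse variances, with the last observation treated as much noisier) are assumptions made for the example.

```python
def wls_line(x, y, w):
    """Weighted least squares for y = a + b*x.

    Minimises sum(w_i * (y_i - a - b*x_i)**2) and returns (a, b).
    """
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted means
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Hypothetical data: the last point is noisy and therefore down-weighted.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.1, 5.9, 9.0]
w = [1.0, 1.0, 1.0, 0.1]

intercept, slope = wls_line(x, y, w)
```

Setting all weights equal recovers ordinary least squares, so the function can be checked against any unweighted fit.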
Multiple Instance Regression with Structured Data
Wagstaff, Kiri L.; Lane, Terran; Roper, Alex
2008-01-01
This slide presentation reviews the use of multiple instance regression with structured data from multiple and related data sets. It applies the concept to a practical problem, that of estimating crop yield using remotely sensed, country-wide weekly observations.
Patterns of Regression in Rett Syndrome
Directory of Open Access Journals (Sweden)
J Gordon Millichap
2002-10-01
Patterns and features of regression in a case series of 53 girls and women with Rett syndrome were studied at the Institute of Child Health and Great Ormond Street Children’s Hospital, London, UK.
Dynamic travel time estimation using regression trees.
2008-10-01
This report presents a methodology for travel time estimation by using regression trees. The dissemination of travel time information has become crucial for effective traffic management, especially under congested road conditions. In the absence of c...
STREAMFLOW AND WATER QUALITY REGRESSION MODELING ...
African Journals Online (AJOL)
STREAMFLOW AND WATER QUALITY REGRESSION MODELING OF IMO RIVER SYSTEM: A CASE STUDY. ... Journal of Modeling, Design and Management of Engineering Systems ... Possible sources of contamination of Imo-river system within Nekede and Obigbo hydrological stations watershed were traced.
Leffondré, Karen; Jager, Kitty J.; Boucquemont, Julie; Stel, Vianda S.; Heinze, Georg
2014-01-01
Regression models are being used to quantify the effect of an exposure on an outcome, while adjusting for potential confounders. While the type of regression model to be used is determined by the nature of the outcome variable, e.g. linear regression has to be applied for continuous outcome
DART: Dropouts meet Multiple Additive Regression Trees
Rashmi, K. V.; Gilad-Bachrach, Ran
2015-01-01
Multiple Additive Regression Trees (MART), an ensemble model of boosted regression trees, is known to deliver high prediction accuracy for diverse tasks, and it is widely used in practice. However, it suffers an issue which we call over-specialization, wherein trees added at later iterations tend to impact the prediction of only a few instances, and make negligible contribution towards the remaining instances. This negatively affects the performance of the model on unseen data, and also makes...
Multinomial probit Bayesian additive regression trees.
Kindo, Bereket P; Wang, Hao; Peña, Edsel A
This article proposes multinomial probit Bayesian additive regression trees (MPBART) as a multinomial probit extension of BART - Bayesian additive regression trees. MPBART is flexible to allow inclusion of predictors that describe the observed units as well as the available choice alternatives. Through two simulation studies and four real data examples, we show that MPBART exhibits very good predictive performance in comparison to other discrete choice and multiclass classification methods. To implement MPBART, the R package mpbart is freely available from CRAN repositories.
Spontaneous regression of herniated lumbar discs.
Kim, Eric S; Oladunjoye, Azeem O; Li, Jay A; Kim, Kee D
2014-06-01
The spontaneous regression of a lumbar herniated disc is a common occurrence. Studies using imaging techniques as well as immunohistologic analyses have attempted to explain the mechanism for regression. However, the exact mechanism remains elusive. Understanding the process by which herniated discs disappear in the absence of surgery may better guide treatment. Recent case reports, radiographic and immunohistologic studies show that the extent of extrusion of the nucleus pulposus is related to a higher likelihood of regression. To our knowledge, Patient 3 is the first report of spontaneous regression occurring within 2 months. This occurrence was discovered intraoperatively. We present three illustrative patients. Patient 1, a 53-year-old man, presented with a large L2-L3 disc herniation. His 2 year follow-up MRI revealed a complete regression of the extruded fragment. Patient 2, a 58-year-old man, presented with an L3-L4 disc herniation with cephalad migration of a free fragment. MRI 9 months later showed no free fragment but progression of a disc bulge. Intraoperative exploration during the L3-L4 microdiscectomy confirmed the absence of the free fragment. Patient 3, a 58-year-old woman, presented with a large L2-L3 disc extrusion with cephalad migration. An imaging study performed 2 months after the initial study revealed an absence of the free fragment. Our case reports demonstrate the temporal variance in disc regression. While the time course and extent of regression vary widely, the rapid time in which regression can occur should caution surgeons contemplating discectomy based on an MRI performed a significant period prior to surgery. Copyright © 2013 Elsevier Ltd. All rights reserved.
Spontaneous regression of metastatic Merkel cell carcinoma.
LENUS (Irish Health Repository)
Hassan, S J
2010-01-01
Merkel cell carcinoma is a rare aggressive neuroendocrine carcinoma of the skin predominantly affecting elderly Caucasians. It has a high rate of local recurrence and regional lymph node metastases. It is associated with a poor prognosis. Complete spontaneous regression of Merkel cell carcinoma has been reported but is a poorly understood phenomenon. Here we present a case of complete spontaneous regression of metastatic Merkel cell carcinoma demonstrating a markedly different pattern of events from those previously published.
Online Active Linear Regression via Thresholding
Riquelme, Carlos; Johari, Ramesh; Zhang, Baosen
2016-01-01
We consider the problem of online active learning to collect data for regression modeling. Specifically, we consider a decision maker with a limited experimentation budget who must efficiently learn an underlying linear population model. Our main contribution is a novel threshold-based algorithm for selection of most informative observations; we characterize its performance and fundamental lower bounds. We extend the algorithm and its guarantees to sparse linear regression in high-dimensional...
Fuzzy multiple linear regression: A computational approach
Juang, C. H.; Huang, X. H.; Fleming, J. W.
1992-01-01
This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.
2010-08-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been made, a relatively simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
The Geometry of Enhancement in Multiple Regression
Waller, Niels G.
2011-01-01
In linear multiple regression, "enhancement" is said to occur when $R^2 = b'r > r'r$, where b is a $p \times 1$ vector of standardized regression coefficients and r is a $p \times 1$ vector of correlations between a criterion y and a set of standardized regressors, x. When p = 1 then $b \cong r$ and…
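The inequality above can be checked numerically for p = 2, where the standardized coefficients follow from b = Rxx^-1 r with Rxx the 2 x 2 predictor correlation matrix. The sketch below is a minimal illustration; the function name `r2_two_predictors` and the correlation values (0.5 and 0.0 with the criterion, 0.5 between the predictors, a classical suppressor pattern) are chosen for the example and do not come from the article.

```python
def r2_two_predictors(r1, r2, r12):
    """Return (R^2, r'r) for two standardized predictors.

    b = Rxx^{-1} r with Rxx = [[1, r12], [r12, 1]], and R^2 = b'r.
    """
    det = 1.0 - r12 * r12
    b1 = (r1 - r12 * r2) / det
    b2 = (r2 - r12 * r1) / det
    return b1 * r1 + b2 * r2, r1 * r1 + r2 * r2


# Predictor 2 is uncorrelated with y but correlated with predictor 1.
r_squared, rr = r2_two_predictors(0.5, 0.0, 0.5)
enhanced = r_squared > rr   # enhancement: R^2 exceeds the sum of squared correlations
```

Here R^2 is 1/3 while r'r is 0.25, so the two predictors together explain more than their separate squared correlations would suggest.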
Two Paradoxes in Linear Regression Analysis.
Feng, Ge; Peng, Jing; Tu, Dongke; Zheng, Julia Z; Feng, Changyong
2016-12-25
Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection.
Variable and subset selection in PLS regression
DEFF Research Database (Denmark)
Høskuldsson, Agnar
2001-01-01
The purpose of this paper is to present some useful methods for introductory analysis of variables and subsets in relation to PLS regression. We present here methods that are efficient in finding the appropriate variables or subset to use in the PLS regression. The general conclusion is that vari...... obtained by different methods. We also present an approach to orthogonal scatter correction. The procedures and comparisons are applied to industrial data. (C) 2001 Elsevier Science B.V. All rights reserved....
Post-processing through linear regression
Directory of Open Access Journals (Sweden)
B. Van Schaeybroeck
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified.
These techniques are applied in the context of the Lorenz '63 system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR), which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
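The "optimal variability" criterion can be illustrated with a single predictor. The following is a minimal sketch, not the authors' implementation: it assumes that geometric-mean regression takes the slope sign(cov) * s_obs / s_fc (the usual reduced-major-axis form), and all function names are hypothetical. It shows that an OLS-corrected forecast has less variance than the observations, while the GM-corrected forecast matches the observed variance.

```python
import math


def variance(values):
    """Population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def fit_ols_and_gm(forecast, observed):
    """Return {'ols': (slope, intercept), 'gm': (slope, intercept)}."""
    n = len(forecast)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    var_f = variance(forecast)
    var_o = variance(observed)
    cov = sum((f - mean_f) * (o - mean_o)
              for f, o in zip(forecast, observed)) / n
    ols_slope = cov / var_f
    # Assumed GM form: ratio of standard deviations, signed like the covariance.
    gm_slope = math.copysign(math.sqrt(var_o / var_f), cov)
    return {
        "ols": (ols_slope, mean_o - ols_slope * mean_f),
        "gm": (gm_slope, mean_o - gm_slope * mean_f),
    }


def correct(forecast, slope, intercept):
    """Apply a linear correction to each forecast value."""
    return [slope * f + intercept for f in forecast]


if __name__ == "__main__":
    fc = [1.0, 2.0, 3.0, 4.0, 5.0]      # made-up forecasts
    ob = [1.2, 1.9, 3.4, 3.6, 5.1]      # made-up observations
    fits = fit_ols_and_gm(fc, ob)
    for name, (slope, intercept) in fits.items():
        print(name, variance(correct(fc, slope, intercept)), variance(ob))
```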
Multiple-Instance Regression with Structured Data
Wagstaff, Kiri L.; Lane, Terran; Roper, Alex
2008-01-01
We present a multiple-instance regression algorithm that models internal bag structure to identify the items most relevant to the bag labels. Multiple-instance regression (MIR) operates on a set of bags with real-valued labels, each containing a set of unlabeled items, in which the relevance of each item to its bag label is unknown. The goal is to predict the labels of new bags from their contents. Unlike previous MIR methods, MI-ClusterRegress can operate on bags that are structured in that they contain items drawn from a number of distinct (but unknown) distributions. MI-ClusterRegress simultaneously learns a model of the bag's internal structure, the relevance of each item, and a regression model that accurately predicts labels for new bags. We evaluated this approach on the challenging MIR problem of crop yield prediction from remote sensing data. MI-ClusterRegress provided predictions that were more accurate than those obtained with non-multiple-instance approaches or MIR methods that do not model the bag structure.
A highly efficient design strategy for regression with outcome pooling.
Mitchell, Emily M; Lyles, Robert H; Manatunga, Amita K; Perkins, Neil J; Schisterman, Enrique F
2014-12-10
The potential for research involving biospecimens can be hindered by the prohibitive cost of performing laboratory assays on individual samples. To mitigate this cost, strategies such as randomly selecting a portion of specimens for analysis or randomly pooling specimens prior to performing laboratory assays may be employed. These techniques, while effective in reducing cost, are often accompanied by a considerable loss of statistical efficiency. We propose a novel pooling strategy based on the k-means clustering algorithm to reduce laboratory costs while maintaining a high level of statistical efficiency when predictor variables are measured on all subjects, but the outcome of interest is assessed in pools. We perform simulations motivated by the BioCycle study to compare this k-means pooling strategy with current pooling and selection techniques under simple and multiple linear regression models. While all of the methods considered produce unbiased estimates and confidence intervals with appropriate coverage, pooling under k-means clustering provides the most precise estimates, closely approximating results from the full data and losing minimal precision as the total number of pools decreases. The benefits of k-means clustering evident in the simulation study are then applied to an analysis of the BioCycle dataset. In conclusion, when the number of lab tests is limited by budget, pooling specimens based on k-means clustering prior to performing lab assays can be an effective way to save money with minimal information loss in a regression setting. Copyright © 2014 John Wiley & Sons, Ltd.
Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression.
Chen, Yanguang
2016-01-01
In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson's statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 regions of China. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test.
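The order dependence described above can be seen in a small sketch. It is an illustration under stated assumptions, not the paper's two new statistics: the Moran-type coefficient is taken as z'Wz, with z the residuals standardized by their population standard deviation and W an inverse-distance weight matrix scaled to sum to one, and the sample points and residuals are made up.

```python
import math


def durbin_watson(residuals):
    """Classical Durbin-Watson statistic; depends on the order of the data."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    return num / sum(e * e for e in residuals)


def inverse_distance_weights(points):
    """Hypothetical spatial weights: 1/distance off the diagonal, 0 on it."""
    n = len(points)
    return [[0.0 if i == j else
             1.0 / math.hypot(points[i][0] - points[j][0],
                              points[i][1] - points[j][1])
             for j in range(n)] for i in range(n)]


def moran_residual_index(residuals, weights):
    """z' W z with standardized residuals z and W normalized to sum to 1."""
    n = len(residuals)
    mean = sum(residuals) / n
    sd = math.sqrt(sum((e - mean) ** 2 for e in residuals) / n)
    z = [(e - mean) / sd for e in residuals]
    total = sum(weights[i][j] for i in range(n) for j in range(n) if i != j)
    return sum(weights[i][j] / total * z[i] * z[j]
               for i in range(n) for j in range(n) if i != j)


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (0, 2), (3, 1)]
    res = [1.0, -0.5, 0.3, -0.8]
    print(durbin_watson(res),
          moran_residual_index(res, inverse_distance_weights(pts)))
```

Relabeling the sample points changes the Durbin-Watson value but leaves the weight-matrix statistic unchanged, because the weights travel with the points.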
Tightness of M-estimators for multiple linear regression in time series
Johansen, Søren; Nielsen, Bent
2016-01-01
We show tightness of a general M-estimator for multiple linear regression in time series. The positive criterion function for the M-estimator is assumed lower semi-continuous and sufficiently large for large argument: Particular cases are the Huber-skip and quantile regression. Tightness requires an assumption on the frequency of small regressors. We show that this is satisfied for a variety of deterministic and stochastic regressors, including stationary and random walk regressors. The resul...
Symbolic Regression of Conditional Target Expressions
Korns, Michael F.
This chapter examines techniques for improving symbolic regression systems in cases where the target expression contains conditionals. In three previous papers we experimented with combining high performance techniques from the literature to produce a large scale, industrial strength, symbolic regression-classification system. Performance metrics across multiple problems show deterioration in accuracy for problems where the target expression contains conditionals. The techniques described herein are shown to improve accuracy on such conditional problems. Nine base test cases, from the literature, are used to test the improvement in accuracy. A previously published regression system combining standard genetic programming with abstract expression grammars, particle swarm optimization, differential evolution, context aware crossover and age-layered populations is tested on the nine base test cases. The regression system is enhanced with these additional techniques: pessimal vertical slicing, splicing of uncorrelated champions via abstract conditional expressions, and abstract mutation and crossover. The enhanced symbolic regression system is applied to the nine base test cases and an improvement in accuracy is observed.
Regression Test Selection for C# Programs
Directory of Open Access Journals (Sweden)
Nashat Mansour
2009-01-01
We present a regression test selection technique for C# programs. C# is fairly new and is often used within the Microsoft .Net framework to give programmers a solid base to develop a variety of applications. Regression testing is done after modifying a program. Regression test selection refers to selecting a suitable subset of test cases from the original test suite in order to be rerun. It aims to provide confidence that the modifications are correct and did not affect other unmodified parts of the program. The regression test selection technique presented in this paper accounts for C#.Net specific features. Our technique is based on three phases; the first phase builds an Affected Class Diagram consisting of classes that are affected by the change in the source code. The second phase builds a C# Interclass Graph (CIG) from the affected class diagram based on C# specific features. In this phase, we reduce the number of selected test cases. The third phase involves further reduction and a new metric for assigning weights to test cases for prioritizing the selected test cases. We have empirically validated the proposed technique by using case studies. The empirical results show the usefulness of the proposed regression testing technique for C#.Net programs.
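The general idea of selecting and ranking tests from the set of affected classes can be sketched as follows. This is a simplified illustration, not the paper's Affected Class Diagram, CIG, or weighting metric: the dependency map, the coverage map, and the weight (number of affected classes a test exercises) are all hypothetical stand-ins.

```python
from collections import deque


def affected_classes(changed, dependents):
    """Changed classes plus every class that transitively depends on them.

    changed:    iterable of class names that were modified
    dependents: dict mapping a class to the classes that use it
    """
    seen = set(changed)
    queue = deque(seen)
    while queue:
        cls = queue.popleft()
        for user in dependents.get(cls, ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen


def select_and_rank_tests(coverage, affected):
    """Keep tests touching an affected class; rank by a stand-in weight.

    coverage: dict mapping a test name to the set of classes it exercises
    """
    weights = {test: len(classes & affected)
               for test, classes in coverage.items()}
    selected = [test for test, weight in weights.items() if weight > 0]
    # Highest weight first; ties broken alphabetically for a stable order.
    return sorted(selected, key=lambda test: (-weights[test], test))


if __name__ == "__main__":
    deps = {"Order": ["Invoice", "Cart"], "Invoice": ["Report"]}
    cov = {"test_cart": {"Cart"},
           "test_report": {"Report", "Invoice"},
           "test_user": {"User"}}
    print(select_and_rank_tests(cov, affected_classes({"Order"}, deps)))
```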
National Research Council Canada - National Science Library
de Koning, Lawrence; Merchant, Anwar T; Pogue, Janice; Anand, Sonia S
2007-01-01
.... Methods and results This meta-regression analysis used a search strategy of keywords and MeSH terms to identify prospective cohort studies and randomized clinical trials of CVD risk and abdominal...
Johnson, A P; Godden, S M; Royster, E; Zuidhof, S; Miller, B; Sorg, J
2016-01-01
The study objective was to compare the efficacy of 2 commercial dry cow mastitis formulations containing cloxacillin benzathine or ceftiofur hydrochloride. Quarter-level outcomes included prevalence of intramammary infection (IMI) postcalving, risk for cure of preexisting infections, risk for acquiring a new IMI during the dry period, and risk for clinical mastitis between dry off and 100 d in milk (DIM). Cow-level outcomes included the risk for clinical mastitis and the risk for removal from the herd between dry off and 100 DIM, as well as Dairy Herd Improvement Association (DHIA) test-day milk component and production measures between calving and 100 DIM. A total of 799 cows from 4 Wisconsin dairy herds were enrolled at dry off and randomized to 1 of the 2 commercial dry cow therapy (DCT) treatments: cloxacillin benzathine (DC; n=401) or ceftiofur hydrochloride (SM; n=398). Aseptic quarter milk samples were collected for routine bacteriological culture before DCT at dry off and again at 0 to 10 DIM. Data describing clinical mastitis cases and DHIA test-day results were retrieved from on-farm electronic records. The overall crude quarter-level prevalence of IMI at dry off was 34.7% and was not different between treatment groups. Ninety-six percent of infections at dry off were of gram-positive organisms, with coagulase-negative Staphylococcus and Aerococcus spp. isolated most frequently. Mixed logistic regression analysis showed no difference between treatments as to the risk for presence of IMI at 0 to 10 DIM (DC=22.4%, SM=19.9%) or on the risk for acquiring a new IMI between dry off and 0 to 10 DIM (DC=16.6%, SM=14.1%). Noninferiority analysis and mixed logistic regression analysis both showed no treatment difference in risk for a cure between dry off and 0 to 10 DIM (DC=84.8%, SM=85.7%). Cox proportional hazards regression showed no difference between treatments in quarter-level risk for clinical mastitis (DC=1.99%, SM=2.96%), cow-level risk for clinical
Principal component regression for crop yield estimation
Suryanarayana, T M V
2016-01-01
This book highlights the estimation of crop yield in Central Gujarat, especially with regard to the development of Multiple Regression Models and Principal Component Regression (PCR) models using climatological parameters as independent variables and crop yield as a dependent variable. It subsequently compares the multiple linear regression (MLR) and PCR results, and discusses the significance of PCR for crop yield estimation. In this context, the book also covers Principal Component Analysis (PCA), a statistical procedure used to reduce a number of correlated variables into a smaller number of uncorrelated variables called principal components (PC). This book will be helpful to students and researchers starting their work on climate and agriculture, mainly focusing on estimation models. The flow of chapters leads readers smoothly through an understanding of climate and weather and the impact of climate change, and gradually proceeds towards downscaling techniques and then finally towards development of ...
On Solving Lq-Penalized Regressions
Directory of Open Access Journals (Sweden)
Tracy Zhou Wu
2007-01-01
Lq-penalized regression arises in multidimensional statistical modelling where all or part of the regression coefficients are penalized to achieve both accuracy and parsimony of statistical models. There is often substantial computational difficulty except for the quadratic penalty case. The difficulty is partly due to the nonsmoothness of the objective function inherited from the use of the absolute value. We propose a new solution method for the general Lq-penalized regression problem based on space transformation and thus efficient optimization algorithms. The new method has immediate applications in statistics, notably in penalized spline smoothing problems. In particular, the LASSO problem is shown to be polynomial time solvable. Numerical studies show promise of our approach.
LINEAR REGRESSION WITH R AND HADOOP
Directory of Open Access Journals (Sweden)
Bogdan OANCEA
2015-07-01
In this paper we present a way to solve the linear regression model with R and Hadoop using the RHadoop library. We show how the linear regression model can be solved even for very large models that require special technologies. For storing the data we used Hadoop and for computation we used R. The interface between R and Hadoop is the open source library RHadoop. We present the main features of the Hadoop and R software systems and the way of interconnecting them. We then show how the least squares solution for the linear regression problem could be expressed in terms of the map-reduce programming paradigm and how it could be implemented using the RHadoop library.
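The map-reduce formulation of least squares can be sketched without Hadoop. The example below is plain Python rather than R or RHadoop, and it is an illustration under assumptions, not the paper's code: each data chunk is mapped to the sums that make up X'X and X'y for a one-predictor model with intercept, the partial sums are reduced by addition, and the normal equations are solved once at the end. All function names are hypothetical.

```python
from functools import reduce


def map_chunk(chunk):
    """Map step: sufficient statistics (n, sum x, sum y, sum x^2, sum xy)."""
    return (
        len(chunk),
        sum(x for x, _ in chunk),
        sum(y for _, y in chunk),
        sum(x * x for x, _ in chunk),
        sum(x * y for x, y in chunk),
    )


def combine(left, right):
    """Reduce step: partial sums from different chunks simply add up."""
    return tuple(a + b for a, b in zip(left, right))


def solve_normal_equations(stats):
    """Solve the 2 x 2 normal equations for (intercept, slope)."""
    n, sx, sy, sxx, sxy = stats
    det = n * sxx - sx * sx
    slope = (n * sxy - sx * sy) / det
    intercept = (sy - slope * sx) / n
    return intercept, slope


def fit_mapreduce(chunks):
    """Fit y = intercept + slope * x over data split into chunks."""
    return solve_normal_equations(reduce(combine, map(map_chunk, chunks)))


if __name__ == "__main__":
    # Made-up data split across three "nodes".
    chunks = [[(0, 1), (1, 3)], [(2, 5)], [(3, 7), (4, 9)]]
    print(fit_mapreduce(chunks))
```

Only the five partial sums cross chunk boundaries, which is why the approach scales to data sets too large for a single machine.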
Computing aspects of power for multiple regression.
Dunlap, William P; Xin, Xue; Myers, Leann
2004-11-01
Rules of thumb for power in multiple regression research abound. Most such rules dictate the necessary sample size, but they are based only upon the number of predictor variables, usually ignoring other critical factors necessary to compute power accurately. Other guides to power in multiple regression typically use approximate rather than precise equations for the underlying distribution; entail complex preparatory computations; require interpolation with tabular presentation formats; run only under software such as Mathematica or SAS that may not be immediately available to the user; or are sold to the user as parts of power computation packages. In contrast, the program we offer herein is immediately downloadable at no charge, runs under Windows, is interactive, self-explanatory, flexible to fit the user's own regression problems, and is as accurate as single precision computation ordinarily permits.
Regression Models for Market-Shares
DEFF Research Database (Denmark)
Birch, Kristina; Olsen, Jørgen Kai; Tjur, Tue
2005-01-01
On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put on the interpretation of the parameters in relation to models for the total sales based on discrete choice models. Key words and phrases: MCI model, discrete choice model, market-shares, price elasticity, regression model.
Influence diagnostics in meta-regression model.
Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua
2017-09-01
This paper studies influence diagnostics in the meta-regression model, including case deletion diagnostics and local influence analysis. We derive the subset deletion formulae for the estimation of the regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measures are defined. Local influence analyses based on the case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme are explored. We introduce a method that simultaneously perturbs responses, covariates, and within-variance to obtain the local influence measure, which has the advantage of being able to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.
Groupwise Retargeted Least-Squares Regression.
Wang, Lingfeng; Pan, Chunhong
2017-01-25
In this brief, we propose a new groupwise retargeted least squares regression (GReLSR) model for multicategory classification. The main motivation behind GReLSR is to utilize an additional regularization to restrict the translation values of ReLSR, so that they should be similar within the same class. By analyzing the regression targets of ReLSR, we propose a new formulation of ReLSR, where the translation values are expressed explicitly. On the basis of the new formulation, discriminative least-squares regression can be regarded as a special case of ReLSR with zero translation values. Moreover, a groupwise constraint is added to ReLSR to form the new GReLSR model. Extensive experiments on various machine learning data sets illustrate that our method outperforms the current state-of-the-art approaches.
Liu, Zhan-yu; Huang, Jing-feng; Shi, Jing-jing; Tao, Rong-xiang; Zhou, Wan; Zhang, Li-Li
2007-10-01
Detecting plant health conditions plays a key role in farm pest management and crop protection. In this study, measurement of hyperspectral leaf reflectance in rice crop (Oryza sativa L.) was conducted on groups of healthy leaves and leaves infected by the fungus Bipolaris oryzae (Helminthosporium oryzae Breda. de Hann) through the wavelength range from 350 to 2,500 nm. The percentage of leaf surface lesions was estimated and defined as the disease severity. Statistical methods like multiple stepwise regression, principal component analysis and partial least-square regression were utilized to calculate and estimate the disease severity of rice brown spot at the leaf level. Our results revealed that multiple stepwise linear regressions could efficiently estimate disease severity with three wavebands in seven steps. The root mean square errors (RMSEs) for the training (n=210) and testing (n=53) datasets were 6.5% and 5.8%, respectively. Principal component analysis showed that the first principal component could explain approximately 80% of the variance of the original hyperspectral reflectance. The regression model with the first two principal components predicted disease severity with RMSEs of 16.3% and 13.9% for the training and testing datasets, respectively. Partial least-square regression with seven extracted factors could most effectively predict disease severity compared with other statistical methods, with RMSEs of 4.1% and 2.0% for the training and testing datasets, respectively. Our research demonstrates that it is feasible to estimate the disease severity of rice brown spot using hyperspectral reflectance data at the leaf level.
Multicollinearity in cross-sectional regressions
Lauridsen, Jørgen; Mur, Jesùs
2006-10-01
The paper examines robustness of results from cross-sectional regression paying attention to the impact of multicollinearity. It is well known that the reliability of estimators (least-squares or maximum-likelihood) gets worse as the linear relationships between the regressors become more acute. We resolve the discussion in a spatial context, looking closely into the behaviour shown, under several unfavourable conditions, by the most outstanding misspecification tests when collinear variables are added to the regression. A Monte Carlo simulation is performed. The conclusions point to the fact that these statistics react in different ways to the problems posed.
Multiple regression modeling of nonlinear data sets
Kravtsov, S.; Kondrashov, D.; Ghil, M.
2003-04-01
Application of multiple polynomial regression modeling to observational and model generated data sets is discussed. Here the form of classical multiple linear regression is generalized to a model that is still linear in its parameters, but includes general multivariate polynomials of predictor variables as the basis functions. The system's low-frequency evolution is assumed to be the result of deterministic, possibly nonlinear, dynamics excited by a temporally white, but geographically coherent and normally distributed white noise. In determining the appropriate structure of the latter, the multi-level generalization of multiple polynomial regression, where the residual stochastic forcing at a given level is subsequently modeled as a function of variables at this, and all preceding levels, has turned out to be useful. The number of levels is determined so that lag-0 covariance of the residual forcing converges to a constant matrix, while its lag-1 covariance vanishes. The method has been applied to the output from a three-layer quasi-geostrophic model, to the analysis of the Northern Hemisphere wintertime geopotential height anomalies, and to global sea-surface temperature (SST) data. In the former two cases, the nonlinear multi-regime structure of probability density function (PDF) constructed in the phase subspace of a few leading empirical orthogonal functions (EOFs), as well as the detailed spectrum of the data's temporal evolution, have been well reproduced by the regression simulations. We have given a simple dynamical interpretation of these results in terms of synoptic-eddy feedback on the system's low-frequency variability. In modeling of SST data, a simple way to include the seasonal cycle into the regression model has been developed. The regression simulation in this case produces ENSO events with maximum amplitude in December/January, while the positive events generally tend to have a larger amplitude than the negative events -- a feature that cannot be
Multispectral colormapping using penalized least square regression
DEFF Research Database (Denmark)
Dissing, Bjørn Skovlund; Carstensen, Jens Michael; Larsen, Rasmus
2010-01-01
The authors propose a novel method to map a multispectral image into the device independent color space CIE-XYZ. This method provides a way to visualize multispectral images by predicting colorvalues from spectral values while maintaining interpretability and is tested on a light emitting diode......-XYZ color matching functions. The target of the regression is a well known color chart, and the models are validated using leave one out cross validation in order to maintain best possible generalization ability. The authors compare the method with a direct linear regression and see...
Salience Assignment for Multiple-Instance Regression
Wagstaff, Kiri L.; Lane, Terran
2007-01-01
We present a Multiple-Instance Learning (MIL) algorithm for determining the salience of each item in each bag with respect to the bag's real-valued label. We use an alternating-projections constrained optimization approach to simultaneously learn a regression model and estimate all salience values. We evaluate this algorithm on a significant real-world problem, crop yield modeling, and demonstrate that it provides more extensive, intuitive, and stable salience models than Primary-Instance Regression, which selects a single relevant item from each bag.
Demonstration of a Fiber Optic Regression Probe
Korman, Valentin; Polzin, Kurt A.
2010-01-01
The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics make it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers change, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for
Regression models for predicting anthropometric measurements of ...
African Journals Online (AJOL)
... System (ANFIS) was employed to select the two most influential of the five input measurements. This search was separately conducted for each of the output measurements. Regression models were developed from the collected anthropometric data. Also, the predictive performance of these models was examined using ...
Linear Regression Models for Estimating True Subsurface ...
Indian Academy of Sciences (India)
Because subsurface resistivity is nonlinear, the datasets were first transformed into logarithmic scale to satisfy the basic regression assumptions. Three models, one each for the three array types, are thus developed based on simple linear relationships between the dependent and independent variables.
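The transform-then-fit step can be sketched as an ordinary least-squares line on log10-transformed values. This is a minimal illustration with made-up data, not the paper's models: the names `predictor` and `response` and the choice of base-10 logarithms are assumptions.

```python
import math


def fit_log_log(predictor, response):
    """Fit log10(response) = intercept + slope * log10(predictor) by OLS."""
    lx = [math.log10(v) for v in predictor]
    ly = [math.log10(v) for v in response]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(value, intercept, slope):
    """Back-transform a prediction from log10 scale to the original scale."""
    return 10 ** (intercept + slope * math.log10(value))


if __name__ == "__main__":
    xs = [10.0, 20.0, 50.0, 100.0, 200.0]   # made-up positive measurements
    ys = [3.0 * x ** 0.8 for x in xs]       # made-up power-law response
    intercept, slope = fit_log_log(xs, ys)
    print(intercept, slope, predict(40.0, intercept, slope))
```

A straight line in log-log space corresponds to a power law on the original scale, which is why the back-transformation in `predict` uses `10 **`.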
Method for nonlinear exponential regression analysis
Junkin, B. G.
1972-01-01
Two computer programs developed according to two general types of exponential models for conducting nonlinear exponential regression analysis are described. A least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. The program is written in FORTRAN 5 for the Univac 1108 computer.
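Linearizing a nonlinear least-squares problem with a first-order Taylor expansion leads to the Gauss-Newton iteration. The sketch below is a Python illustration, not the FORTRAN programs described: it assumes the single model y = a * exp(b * x), and the function names, starting-value helper, and stopping rule are hypothetical.

```python
import math


def log_linear_start(xs, ys):
    """Starting values from the line ln y = ln a + b x (requires all y > 0)."""
    n = len(xs)
    ly = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(ly) / n
    b = (sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, ly))
         / sum((x - mean_x) ** 2 for x in xs))
    return math.exp(mean_l - b * mean_x), b


def fit_exponential(xs, ys, a, b, iterations=20, tol=1e-12):
    """Gauss-Newton for y = a * exp(b * x), starting from (a, b)."""
    for _ in range(iterations):
        s11 = s12 = s22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            j1 = e              # partial derivative of the model w.r.t. a
            j2 = a * x * e      # partial derivative of the model w.r.t. b
            r = y - a * e       # residual at the current estimate
            s11 += j1 * j1
            s12 += j1 * j2
            s22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        # Solve the 2 x 2 linearized normal equations for the update.
        det = s11 * s22 - s12 * s12
        da = (s22 * g1 - s12 * g2) / det
        db = (s11 * g2 - s12 * g1) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [2.0 * math.exp(-0.5 * x) for x in xs]   # made-up noise-free data
    print(fit_exponential(xs, ys, *log_linear_start(xs, ys)))
```

The log-linear fit is used only to obtain starting values; Gauss-Newton can diverge from a poor start, so a damped or trust-region variant may be preferable for noisy data.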
Panel data specifications in nonparametric kernel regression
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...
Predicting Social Trust with Binary Logistic Regression
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
Spontaneous regression of an intraspinal disc cyst
Energy Technology Data Exchange (ETDEWEB)
Demaerel, P.; Eerens, I.; Wilms, G. [University Hospital, Leuven (Belgium). Dept. of Radiology; Goffin, J. [Dept. of Neurosurgery, University Hospitals, Leuven (Belgium)
2001-11-01
We present a patient with a so-called disc cyst. Its location in the ventrolateral epidural space and its communication with the herniated disc are clearly shown. The disc cyst developed rapidly and regressed spontaneously. This observation, which has not been reported until now, appears to support focal degeneration with cyst formation as the pathogenesis. (orig.)
Optimal Changepoint Tests for Normal Linear Regression
Donald W.K. Andrews; Inpyo Lee; Werner Ploberger
1992-01-01
This paper determines a class of finite sample optimal tests for the existence of a changepoint at an unknown time in a normal linear multiple regression model with known variance. Optimal tests for multiple changepoints are also derived. Power comparisons of several tests are provided based on simulations.
A Skew-Normal Mixture Regression Model
Liu, Min; Lin, Tsung-I
2014-01-01
A challenge associated with traditional mixture regression models (MRMs), which rest on the assumption of normally distributed errors, is determining the number of unobserved groups. Specifically, even slight deviations from normality can lead to the detection of spurious classes. The current work aims to (a) examine how sensitive the commonly…
Structural Break Tests Robust to Regression Misspecification
Abi Morshed, Alaa; Andreou, E.; Boldea, Otilia
2016-01-01
Structural break tests developed in the literature for regression models are sensitive to model misspecification. We show - analytically and through simulations - that the sup Wald test for breaks in the conditional mean and variance of a time series process exhibits severe size distortions when the
Assumptions of Multiple Regression: Correcting Two Misconceptions
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason
2013-01-01
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Invariant Ordering of Item-Total Regressions
Tijmstra, Jesper; Hessen, David J.; van der Heijden, Peter G. M.; Sijtsma, Klaas
2011-01-01
A new observable consequence of the property of invariant item ordering is presented, which holds under Mokken's double monotonicity model for dichotomous data. The observable consequence is an invariant ordering of the item-total regressions. Kendall's measure of concordance "W" and a weighted version of this measure are proposed as measures for…
The M Word: Multicollinearity in Multiple Regression.
Morrow-Howell, Nancy
1994-01-01
Notes that existence of substantial correlation between two or more independent variables creates problems of multicollinearity in multiple regression. Discusses multicollinearity problem in social work research in which independent variables are usually intercorrelated. Clarifies problems created by multicollinearity, explains detection of…
Finite Algorithms for Robust Linear Regression
DEFF Research Database (Denmark)
Madsen, Kaj; Nielsen, Hans Bruun
1990-01-01
The Huber M-estimator for robust linear regression is analyzed. Newton type methods for solution of the problem are defined and analyzed, and finite convergence is proved. Numerical experiments with a large number of test problems demonstrate efficiency and indicate that this kind of approach may...
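For readers unfamiliar with the Huber M-estimator, the sketch below computes it by iteratively reweighted least squares (IRLS). Note that this is a different and simpler algorithm than the Newton-type methods analyzed in the paper, and it assumes a fixed threshold `delta` in the units of the response rather than an estimated scale; the function name `huber_irls` is hypothetical.

```python
import numpy as np


def huber_irls(X, y, delta=1.345, iterations=200, tol=1e-10):
    """Huber M-estimate of linear-regression coefficients via IRLS."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Start from the ordinary least-squares solution.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    for _ in range(iterations):
        abs_r = np.abs(y - X @ beta)
        # Huber weights: 1 for small residuals, delta / |r| for large ones.
        weights = np.where(abs_r <= delta, 1.0, delta / np.maximum(abs_r, 1e-12))
        root_w = np.sqrt(weights)
        new_beta, *_ = np.linalg.lstsq(X * root_w[:, None], y * root_w, rcond=None)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Exact line y = 1 + 2x with one gross outlier in the last observation.
x = np.arange(10, dtype=float)
y = 1.0 + 2.0 * x
y[-1] += 50.0
design = np.column_stack([np.ones_like(x), x])
ols_beta, *_ = np.linalg.lstsq(design, y, rcond=None)
huber_beta = huber_irls(design, y)
```

In this example the outlier pulls the least-squares slope far from 2, while the Huber slope stays close to it.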
Macroeconomic Forecasting Using Penalized Regression Methods
Smeekes, Stephan; Wijler, Etiënne
2016-01-01
We study the suitability of lasso-type penalized regression techniques when applied to macroeconomic forecasting with high-dimensional datasets. We consider performance of the lasso-type methods when the true DGP is a factor model, contradicting the sparsity assumption underlying penalized
Creativity and Regression on the Rorschach.
Lazar, Billie S.
This paper describes the results of a study to further test and replicate previous studies partially supporting Kris's view that creativity is a regression in the service of the ego. For this sample of 42 female art and business college students, it was predicted that (1) highly creative Ss (measured by the Torrance Tests) produce more, and more…
Assessing risk factors for periodontitis using regression
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
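As a small illustration of how an odds ratio and its 95% confidence interval follow from a logistic-regression coefficient, here is a sketch using the usual Wald interval. The function name and the example numbers are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(coef, std_err, z=1.96):
    """Return (odds ratio, lower, upper) for a logistic-regression coefficient.

    Uses the Wald interval exp(coef +/- z * std_err); z = 1.96 gives an
    approximate 95% interval.
    """
    return (
        math.exp(coef),
        math.exp(coef - z * std_err),
        math.exp(coef + z * std_err),
    )


# Hypothetical coefficient and standard error, for illustration only.
odds_ratio, lower, upper = odds_ratio_ci(0.0, 0.5)
```

An interval that contains 1, as in this example, indicates no significant association at the chosen level.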
Regression testing Ajax applications : Coping with dynamism
Roest, D.; Mesbah, A.; Van Deursen, A.
2009-01-01
Note: This paper is a pre-print of: Danny Roest, Ali Mesbah and Arie van Deursen. Regression Testing AJAX Applications: Coping with Dynamism. In Proceedings of the 3rd International Conference on Software Testing, Verification and Validation (ICST’10), Paris, France. IEEE Computer Society, 2010.
Regression Formulae for Predicting Hematologic and Liver ...
African Journals Online (AJOL)
Dr Femi Olaleye
Full Length Research Article. Regression Formulae for Predicting Hematologic and Liver Functions from Years of Exposure to Cement Dust in Cement Factory Workers in Sokoto, Nigeria. Mojiminiyi, F.B.O. (1), Merenu, I.A. (2), Njoku, C.H. (3), Ibrahim, M.T.O. (2). Departments of Physiology (1), Community Health (2) and Medicine (3), ...
Measurement Error in Education and Growth Regressions*
Portela, Miguel; Alessie, Rob; Teulings, Coen
2010-01-01
The use of the perpetual inventory method for the construction of education data per country leads to systematic measurement error. This paper analyzes its effect on growth regressions. We suggest a methodology for correcting this error. The standard attenuation bias suggests that using these
Revisiting Regression in Autism: Heller's "Dementia Infantilis"
Westphal, Alexander; Schelinski, Stefanie; Volkmar, Fred; Pelphrey, Kevin
2013-01-01
Theodor Heller first described a severe regression of adaptive function in normally developing children, something he termed dementia infantilis, over 100 years ago. Dementia infantilis is most closely related to the modern diagnosis, childhood disintegrative disorder. We translate Heller's paper, Über Dementia Infantilis, and discuss…
A Logistic Regression Model for Personnel Selection.
Raju, Nambury S.; And Others
1991-01-01
A two-parameter logistic regression model for personnel selection is proposed. The model was tested with a database of 84,808 military enlistees. The probability of job success was related directly to trait levels, addressing such topics as selection, validity generalization, employee classification, selection bias, and utility-based fair…
Targeting: Logistic Regression, Special Cases and Extensions
Directory of Open Access Journals (Sweden)
Helmut Schaeben
2014-12-01
Full Text Available Logistic regression is a classical linear model for logit-transformed conditional probabilities of a binary target variable. It recovers the true conditional probabilities if the joint distribution of predictors and the target is of log-linear form. Weights-of-evidence is an ordinary logistic regression with parameters equal to the differences of the weights of evidence if all predictor variables are discrete and conditionally independent given the target variable. The hypothesis of conditional independence can be tested in terms of log-linear models. If the assumption of conditional independence is violated, the application of weights-of-evidence does not only corrupt the predicted conditional probabilities, but also their rank transform. Logistic regression models, including the interaction terms, can account for the lack of conditional independence, appropriate interaction terms compensate exactly for violations of conditional independence. Multilayer artificial neural nets may be seen as nested regression-like models, with some sigmoidal activation function. Most often, the logistic function is used as the activation function. If the net topology, i.e., its control, is sufficiently versatile to mimic interaction terms, artificial neural nets are able to account for violations of conditional independence and yield very similar results. Weights-of-evidence cannot reasonably include interaction terms; subsequent modifications of the weights, as often suggested, cannot emulate the effect of interaction terms.
Deriving the Regression Line with Algebra
Quintanilla, John A.
2017-01-01
Exploration with spreadsheets and reliance on previous skills can lead students to determine the line of best fit. To perform linear regression on a set of data, students in Algebra 2 (or, in principle, Algebra 1) do not have to settle for using the mysterious "black box" of their graphing calculators (or other classroom technologies).…
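The line of best fit that this article derives algebraically has a well-known closed form, which can be sketched in a few lines without any "black box". The function name `best_fit_line` is hypothetical, and the article's own derivation may proceed differently.

```python
def best_fit_line(xs, ys):
    """Least-squares slope and intercept from sums, using only algebra."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Solving the two normal equations for the slope m and intercept b:
    #   m * sum_xx + b * sum_x = sum_xy
    #   m * sum_x  + b * n     = sum_y
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


slope, intercept = best_fit_line([0, 1, 2], [1, 3, 5])  # points on y = 2x + 1
```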
Simulation Optimization through Regression or Kriging Metamodels
Kleijnen, J.P.C.
2017-01-01
This chapter surveys two methods for the optimization of real-world systems that are modelled through simulation. These methods use either linear regression metamodels, or Kriging (Gaussian processes). The metamodel type guides the design of the experiment; this design fixes the input combinations
Functional data analysis of generalized regression quantiles
Guo, Mengmeng
2013-11-05
Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.
Williams, John D.; Lindem, Alfred C.
Four computer programs using the general purpose multiple linear regression program have been developed. Setwise regression analysis is a stepwise procedure for sets of variables; there will be as many steps as there are sets. Covarmlt allows a solution to the analysis of covariance design with multiple covariates. A third program has three…
Bayesian Regression with Network Prior: Optimal Bayesian Filtering Perspective.
Qian, Xiaoning; Dougherty, Edward R
2016-12-01
The recently introduced intrinsically Bayesian robust filter (IBRF) provides fully optimal filtering relative to a prior distribution over an uncertainty class of joint random process models, whereas formerly the theory was limited to model-constrained Bayesian robust filters, for which optimization was limited to the filters that are optimal for models in the uncertainty class. This paper extends the IBRF theory to the situation where there are both a prior on the uncertainty class and sample data. The result is optimal Bayesian filtering (OBF), where optimality is relative to the posterior distribution derived from the prior and the data. The IBRF theories for effective characteristics and canonical expansions extend to the OBF setting. A salient focus of the present work is to demonstrate the advantages of Bayesian regression within the OBF setting over the classical Bayesian approach in the context of linear Gaussian models.
Controlling attribute effect in linear regression
Calders, Toon
2013-12-01
In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then we analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or residuals of the models. Experiments with discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques are successful in controlling attribute effects in linear regression models. © 2013 IEEE.
Particle Swarm Optimization and regression analysis I
Mohanty, Souyma D.
2012-04-01
Particle Swarm Optimization (PSO) is now widely used in many problems that require global optimization of high-dimensional and highly multi-modal functions. However, PSO has not yet seen widespread use in astronomical data analysis even though optimization problems in this field have become increasingly complex. In this two-part article, we first provide an overview of the PSO method in the concrete context of a ubiquitous problem in astronomy, namely, regression analysis. In particular, we consider the problem of optimizing the placement of knots in regression based on cubic splines (spline smoothing). The second part will describe an in-depth investigation of PSO in some realistic data analysis challenges.
OPTIMAL DESIGNS FOR SPLINE WAVELET REGRESSION MODELS.
Maronge, Jacob M; Zhai, Yi; Wiens, Douglas P; Fang, Zhide
2017-05-01
In this article we investigate the optimal design problem for some wavelet regression models. Wavelets are very flexible in modeling complex relations, and optimal designs are appealing as a means of increasing the experimental precision. In contrast to the designs for the Haar wavelet regression model (Herzberg and Traves 1994; Oyet and Wiens 2000), the I-optimal designs we construct are different from the D-optimal designs. We also obtain c-optimal designs. Optimal (D- and I-) quadratic spline wavelet designs are constructed, both analytically and numerically. A case study shows that a significant saving of resources may be realized by employing an optimal design. We also construct model robust designs, to address response misspecification arising from fitting an incomplete set of wavelets.
Inverse Regression for the Wiener Class of Systems
Lyzell, Christian; Enqvist, Martin
2011-01-01
The concept of inverse regression has turned out to be quite useful for dimension reduction in regression analysis problems. Using methods like sliced inverse regression (SIR) and directional regression (DR), some high-dimensional nonlinear regression problems can be turned into more tractable low-dimensional problems. Here, the usefulness of inverse regression for identification of nonlinear dynamical systems will be discussed. In particular, it will be shown that the inverse regression meth...
Correlated Action Effects in Decision Theoretic Regression
Boutilier, Craig
2013-01-01
Much recent research in decision theoretic planning has adopted Markov decision processes (MDPs) as the model of choice, and has attempted to make their solution more tractable by exploiting problem structure. One particular algorithm, structured policy construction achieves this by means of a decision theoretic analog of goal regression using action descriptions based on Bayesian networks with tree-structured conditional probability tables. The algorithm as presented is not able to deal with...
Multiple Imputations for Linear Regression Models
Brownstone, David
1991-01-01
Rubin (1987) has proposed multiple imputations as a general method for estimation in the presence of missing data. Rubin's results only strictly apply to Bayesian models, but Schenker and Welsh (1988) directly prove the consistency of multiple imputations inference when there are missing values of the dependent variable in linear regression models. This paper extends and modifies Schenker and Welsh's theorems to give conditions where multiple imputations yield consistent inferences for bo...
Logistic regression a self-learning text
Kleinbaum, David G
1994-01-01
This textbook provides students and professionals in the health sciences with a presentation of the use of logistic regression in research. The text is self-contained, and designed to be used both in class or as a tool for self-study. It arises from the author's many years of experience teaching this material and the notes on which it is based have been extensively used throughout the world.
In utero diagnosis of caudal regression syndrome
Directory of Open Access Journals (Sweden)
Lindsey M. Negrete, BS
2015-01-01
Full Text Available We present a case of caudal regression syndrome (CRS), a relatively uncommon defect of the lower spine accompanied by a wide range of developmental abnormalities. CRS is closely associated with pregestational diabetes and is nearly 200 times more prevalent in infants of diabetic mothers (1, 2). We report a case of prenatally suspected CRS in a fetus of a nondiabetic mother and discuss how the initial neurological abnormalities found on imaging correlate with the postnatal clinical deficits.
Three Contributions to Robust Regression Diagnostics
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2015-01-01
Vol. 11, No. 2 (2015), pp. 69-78. ISSN 1336-9180. Grant - others: GA ČR (CZ) GA13-01930S; Nadační fond na podporu vědy (CZ) Neuron. Institutional support: RVO:67985807. Keywords: robust regression * robust econometrics * hypothesis testing. Subject RIV: BA - General Mathematics. http://www.degruyter.com/view/j/jamsi.2015.11.issue-2/jamsi-2015-0013/jamsi-2015-0013.xml?format=INT
Prediction of Rainfall Using Logistic Regression
A. H. M. Rahmatullah Imon; Manos C. Roy; S. K. Bhattacharjee
2012-01-01
The use of logistic regression modeling for prediction and forecasting has exploded during the past decade. From its original acceptance in epidemiologic research, the method is now commonly employed in almost all branches of knowledge. Rainfall is one of the most important phenomena of the climate system. It is well known that the variability and intensity of rainfall act on natural, agricultural, human and even entire biological systems. So it is essential to be able to predict rainfall by findi...
Regression analysis of growth responses to water depth in three wetland plant species
DEFF Research Database (Denmark)
Sorrell, Brian K; Tanner, Chris C; Brix, Hans
2012-01-01
) differing in depth preferences in wetlands, using non-linear and quantile regression analyses to establish how flooding tolerance can explain field zonation. Methodology Plants were established for 8 months in outdoor cultures in waterlogged soil without standing water, and then randomly allocated to water...
Price Sensitivity of Demand for Prescription Drugs: Exploiting a Regression Kink Design
DEFF Research Database (Denmark)
Simonsen, Marianne; Skipper, Lars; Skipper, Niels
This paper investigates price sensitivity of demand for prescription drugs using drug purchase records for a 20% random sample of the Danish population. We identify price responsiveness by exploiting exogenous variation in prices caused by kinked reimbursement schemes and implement a regression ...
A systematic review and meta-regression analysis of mivacurium for tracheal intubation
Vanlinthout, L.E.H.; Mesfin, S.H.; Hens, N.; Vanacker, B.F.; Robertson, E.N.; Booij, L.H.D.J.
2014-01-01
We systematically reviewed factors associated with intubation conditions in randomised controlled trials of mivacurium, using random-effects meta-regression analysis. We included 29 studies of 1050 healthy participants. Four factors explained 72.9% of the variation in the probability of excellent
Geographically weighted regression model on poverty indicator
Slamet, I.; Nugroho, N. F. T. A.; Muslich
2017-12-01
In this research, we applied geographically weighted regression (GWR) to analyze poverty in Central Java, using a Gaussian kernel as the weighting function. GWR uses the diagonal matrix obtained by evaluating the Gaussian kernel function as the weight matrix in the regression model. The kernel weights are used to handle spatial effects in the data so that a model can be obtained for each location. The purpose of this paper is to model poverty percentage data in Central Java province using GWR with a Gaussian kernel weighting function and to determine the influencing factors in each regency/city in Central Java province. Based on the research, we obtained a geographically weighted regression model with a Gaussian kernel weighting function for the poverty percentage data in Central Java province. We found that the percentage of the population working as farmers, the population growth rate, the percentage of households with regular sanitation, and BPJS beneficiaries are the variables that affect the percentage of poverty in Central Java province. The coefficient of determination R2 is 68.64%. There are two categories of district, which are influenced by different significant factors.
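The role of the diagonal Gaussian-kernel weight matrix in GWR can be sketched as a weighted least-squares fit repeated at every location. This is a minimal illustration under assumptions, not the authors' implementation: the function name `gwr_coefficients`, the fixed `bandwidth`, and the synthetic data are all hypothetical.

```python
import numpy as np


def gwr_coefficients(coords, X, y, bandwidth):
    """Local regression coefficients at each location (Gaussian kernel).

    coords: (n, 2) locations; X: (n,) or (n, k) predictors; y: (n,) response.
    Returns an (n, k + 1) array whose first column is the local intercept.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    betas = np.empty((len(y), design.shape[1]))
    for i, point in enumerate(coords):
        distance = np.linalg.norm(coords - point, axis=1)
        # Diagonal of the weight matrix W_i: Gaussian kernel of the distance.
        weights = np.exp(-0.5 * (distance / bandwidth) ** 2)
        xtw = design.T * weights  # X^T W_i without forming the full matrix
        betas[i] = np.linalg.solve(xtw @ design, xtw @ y)
    return betas


# Synthetic check: with an exact global relationship, every local model agrees.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(30, 2))
x = rng.normal(size=30)
y = 1.0 + 2.0 * x
local_betas = gwr_coefficients(coords, x, y, bandwidth=3.0)
```

With real data the coefficients vary from location to location, which is what allows a separate set of significant factors per regency or city.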
Spontaneous Regression of a Cervical Disk Herniation
Directory of Open Access Journals (Sweden)
Emre Delen
2014-03-01
Full Text Available A 54-year-old female patient was admitted to our outpatient clinic with a two-month history of muscle spasms of her neck and pain radiating to the left upper extremity. Magnetic resonance imaging had shown a large left-sided paracentral disk herniation at the C6-C7 disk space (Figure 1). Neurological examination showed no obvious neurological deficit. She received conservative treatment including bed rest, rehabilitation, and analgesic drugs. After 13 months, a second magnetic resonance imaging study, performed at the patient's request, showed resolution of the disc herniation (Figure 2). Although the literature contains several reports of spontaneous regression of herniated lumbar discs without surgical intervention, reports of this phenomenon at the cervical level are rare [1]. In conclusion, herniated intervertebral discs have the potential to regress spontaneously regardless of the spine level. Further studies to determine predictive signs of spontaneous regression, which would favor conservative treatment, would be beneficial.
General regression and representation model for classification.
Directory of Open Access Journals (Sweden)
Jianjun Qian
Full Text Available Recently, the regularized coding-based classification methods (e.g. SRC and CRC) show a great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has advantages of CRC, but also takes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients) and the specific information (weight matrix of image pixels) to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: basic general regression and representation classifier (B-GRR) and robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of proposed methods over state-of-the-art algorithms.
Multitask Quantile Regression under the Transnormal Model.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2016-01-01
We consider estimating multi-task quantile regression under the transnormal model, with focus on high-dimensional setting. We derive a surprisingly simple closed-form solution through rank-based covariance regularization. In particular, we propose the rank-based ℓ1 penalization with positive definite constraints for estimating sparse covariance matrices, and the rank-based banded Cholesky decomposition regularization for estimating banded precision matrices. By taking advantage of alternating direction method of multipliers, nearest correlation matrix projection is introduced that inherits sampling properties of the unprojected one. Our work combines strengths of quantile regression and rank-based covariance regularization to simultaneously deal with nonlinearity and nonnormality for high-dimensional regression. Furthermore, the proposed method strikes a good balance between robustness and efficiency, achieves the "oracle"-like convergence rate, and provides the provable prediction interval under the high-dimensional setting. The finite-sample performance of the proposed method is also examined. The performance of our proposed rank-based method is demonstrated in a real application to analyze the protein mass spectroscopy data.
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Leukemia prediction using sparse logistic regression.
Directory of Open Access Journals (Sweden)
Tapio Manninen
Full Text Available We describe a supervised prediction method for diagnosis of acute myeloid leukemia (AML) from patient samples based on flow cytometry measurements. We use a data driven approach with machine learning methods to train a computational model that takes in flow cytometry measurements from a single patient and gives a confidence score of the patient being AML-positive. Our solution is based on an [Formula: see text] regularized logistic regression model that aggregates AML test statistics calculated from individual test tubes with different cell populations and fluorescent markers. The model construction is entirely data driven and no prior biological knowledge is used. The described solution scored a 100% classification accuracy in the DREAM6/FlowCAP2 Molecular Classification of Acute Myeloid Leukaemia Challenge against a golden standard consisting of 20 AML-positive and 160 healthy patients. Here we perform a more extensive validation of the prediction model performance and further improve and simplify our original method showing that statistically equal results can be obtained by using simple average marker intensities as features in the logistic regression model. In addition to the logistic regression based model, we also present other classification models and compare their performance quantitatively. The key benefit in our prediction method compared to other solutions with similar performance is that our model only uses a small fraction of the flow cytometry measurements making our solution highly economical.
[Caudal regression syndrome--two case reports].
Kokrdová, Z; Pavlíková, J
2008-01-01
The authors demonstrate two cases of caudal regression syndrome (CRS), a rare malformative syndrome, seen mainly in cases of maternal diabetes with poor metabolic control. Case report. Department of Obstetrics and Gynecology, Department of Medicine, Regional Hospital Pardubice. Caudal regression syndrome was revealed in two women with pregestational diabetes. The diagnosis was made at 18 and 20 weeks. The characteristic ultrasound findings include abrupt interruption of the spine and abnormal position of the lower limbs. The femur bones are fixed in a "V" pattern, giving a typical "Buddha's pose". A complete examination must be conducted for possible urinary and intestinal malformations. The mechanism leading to malformation is discussed in the article. Preventing pregnancy while diabetes is poorly controlled is the only way to minimize the risk of a congenitally malformed baby, including caudal regression syndrome, in the population of diabetic mothers. Family planning and supervision by specialists are always advisable. Early diagnosis of CRS is possible using vaginal ultrasound. Emphasis is placed on the association of abrupt disruption of the dorsal or lumbar spine and abnormal images of the lower limbs fixed in a "V" formation, which is a characteristic sign of CRS.
PREDICTION OF EUCALYPTUS WOOD BY COKRIGING, KRIGING AND REGRESSION
Directory of Open Access Journals (Sweden)
Wellington Jorge Cavalcanti Lundgren
2015-06-01
Full Text Available In the Gypsum Pole of Araripe, in the semiarid zone of Pernambuco, which produces 97% of the plaster consumed in Brazil, a forest experiment with 1875 eucalyptus trees was cut and all the trees were rigorously cubed by the Smalian method. The location of each tree was marked on a Cartesian plane, and a sample of 200 trees was drawn by an entirely random process. For the 200 sample trees, three methods were used to estimate the timber volume: regression analysis, kriging and cokriging. For the cokriging method, the secondary variable was the DBH (Diameter at Breast Height), and the regression used the Spurr model, or combined-variable model, with two explanatory variables: total height of the tree (H) and the DBH. The variables volume and DBH showed spatial dependency. To compare the methods, the coefficient of determination (R2) and the residual distribution of the errors (real x estimated data) were used. The best results were achieved with the Spurr equation, with R2 = 0.82 and an estimated total volume of 166.25 m3. Cokriging gave R2 = 0.72 with an estimated total volume of 164.14 m3, and kriging gave R2 = 0.32 with an estimated total volume of 163.21 m3. The real volume of the experiment was 166.14 m3. Key words: Forest inventory, Volume of timber, Geostatistics.
ajansen; kwhitefoot; panteltje1; edprochak; sudhakar, the
2014-07-01
In reply to the physicsworld.com news story “How to make a quantum random-number generator from a mobile phone” (16 May, http://ow.ly/xFiYc, see also p5), which describes a way of delivering random numbers by counting the number of photons that impinge on each of the individual pixels in the camera of a Nokia N9 smartphone.
Regression calibration method for correcting measurement-error bias in nutritional epidemiology.
Spiegelman, D; McDermott, A; Rosner, B
1997-04-01
Regression calibration is a statistical method for adjusting point and interval estimates of effect obtained from regression models commonly used in epidemiology for bias due to measurement error in assessing nutrients or other variables. Previous work developed regression calibration for use in estimating odds ratios from logistic regression. We extend this here to estimating incidence rate ratios from Cox proportional hazards models and regression slopes from linear-regression models. Regression calibration is appropriate when a gold standard is available in a validation study and a linear measurement error with constant variance applies or when replicate measurements are available in a reliability study and linear random within-person error can be assumed. In this paper, the method is illustrated by correction of rate ratios describing the relations between the incidence of breast cancer and dietary intakes of vitamin A, alcohol, and total energy in the Nurses' Health Study. An example using linear regression is based on estimation of the relation between ultradistal radius bone density and dietary intakes of caffeine, calcium, and total energy in the Massachusetts Women's Health Study. Software implementing these methods uses SAS macros.
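For the simplest linear case with one error-prone exposure, regression calibration is often described as dividing the naive slope (outcome regressed on the error-prone measure W) by the calibration slope (gold standard X regressed on W in the validation study). The sketch below illustrates only that single-covariate point-estimate step; the function names are hypothetical, and the multivariate and variance-correction parts of the method are not shown.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def calibrated_slope(w_main, y_main, w_val, x_val):
    """Correct a naive linear-regression slope for measurement error.

    w_main, y_main: error-prone exposure and outcome in the main study.
    w_val, x_val:   error-prone exposure and gold standard in the
                    validation study.
    """
    beta_naive = ols_slope(w_main, y_main)   # biased by measurement error
    lam = ols_slope(w_val, x_val)            # calibration slope of X on W
    return beta_naive / lam
```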
Humanoid environmental perception with Gaussian process regression
Directory of Open Access Journals (Sweden)
Dingsheng Luo
2016-11-01
Full Text Available Humanoids are increasingly expected to act in the real world and complete high-level tasks in a human-like, intelligent manner. This is hard because the real world is complicated and full of variation. For a robot acting in the real world, precisely perceiving environmental changes is therefore an essential prerequisite. Unlike a human being, a humanoid robot usually has far fewer sensors with which to gather information from the real world, which makes the environmental perception problem more challenging. Although the problem can be tackled by establishing direct sensory mappings or adopting probabilistic filtering methods, the nonlinearity and uncertainty caused by both the complexity of the environment and the many degrees of freedom of the robot make modeling difficult. In our study, an alternative learning approach to this modeling problem is proposed and discussed within the Gaussian process regression framework. To reduce the impact of limited sensors, information from multiple sensors is also fused. To evaluate its effectiveness, the proposed approach is applied to a humanoid equipped only with a three-axis gyroscope and a three-axis accelerometer, on two representative environment-changing tasks: suffering unknown external pushes and suddenly encountering sloped terrain. Experimental results reveal that the proposed Gaussian process regression-based approach is effective in coping with the nonlinearity and uncertainty of the humanoid environmental perception problem. Further, a humanoid balancing controller is developed, which takes the output of the Gaussian process regression-based environmental perception as the seed to activate the corresponding balancing strategy. Both simulated and hardware experiments consistently show that our approach is valuable and leads to a
Prediction of Rainfall Using Logistic Regression
Directory of Open Access Journals (Sweden)
A.H.M. Rahmatullah Imon
2012-07-01
Full Text Available The use of logistic regression modeling for prediction and forecasting has exploded during the past decade. From its original acceptance in epidemiologic research, the method is now commonly employed in almost all branches of knowledge. Rainfall is one of the most important phenomena of the climate system. It is well known that the variability and intensity of rainfall act on natural, agricultural, human and even entire biological systems. So it is essential to be able to predict rainfall by finding appropriate predictors. In this paper an attempt has been made to use logistic regression for predicting rainfall. Climatic data are often subject to gross recording errors, though this problem often goes unnoticed by analysts. In this paper we have used very recent screening methods to check and correct the climatic data that we use in our study. We have used fourteen years' daily rainfall data to formulate our model. We then use two years' observed daily rainfall data, treated as future data, for the cross validation of our model. Our findings clearly show that if we are able to choose appropriate predictors for rainfall, the logistic regression model can predict rainfall very efficiently.
Bayesian regression of piecewise homogeneous Poisson processes
Directory of Open Access Journals (Sweden)
Diego Sevilla
2015-12-01
Full Text Available In this paper, a Bayesian method for piecewise regression is adapted to handle counting processes data distributed as Poisson. A numerical code in Mathematica is developed and tested analyzing simulated data. The resulting method is valuable for detecting breaking points in the count rate of time series for Poisson processes. Received: 2 November 2015, Accepted: 27 November 2015; Edited by: R. Dickman; Reviewed by: M. Hutter, Australian National University, Canberra, Australia.; DOI: http://dx.doi.org/10.4279/PIP.070018 Cite as: D J R Sevilla, Papers in Physics 7, 070018 (2015
Mapping geogenic radon potential by regression kriging
Energy Technology Data Exchange (ETDEWEB)
Pásztor, László [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Szabó, Katalin Zsuzsanna, E-mail: sz_k_zs@yahoo.de [Department of Chemistry, Institute of Environmental Science, Szent István University, Páter Károly u. 1, Gödöllő 2100 (Hungary); Szatmári, Gábor; Laborczi, Annamária [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Horváth, Ákos [Department of Atomic Physics, Eötvös University, Pázmány Péter sétány 1/A, 1117 Budapest (Hungary)
2016-02-15
Radon (²²²Rn) gas is produced in the radioactive decay chain of uranium (²³⁸U), an element that is naturally present in soils. Radon is transported through the soil mainly by diffusion and convection, depending mainly on the physical and meteorological parameters of the soil, and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by the geogenic radon potential (GRP). Identification of areas with high health risks requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central Hungary, spatial auxiliary information representing the environmental factors that form GRP was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were sought to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which is interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore, the spatial reliability of the resultant map is estimated by calculating the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. - Highlights: • A new method
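The two-part prediction described in this abstract is usually written in the following generic form. The notation is an assumption taken from the general regression kriging literature rather than from this paper: $q_k$ are the auxiliary covariates (with $q_0 \equiv 1$), $\hat{\beta}_k$ the fitted regression coefficients, and $\lambda_i$ the kriging weights derived from the residual variogram.

```latex
\hat{z}(s_0) \;=\; \underbrace{\sum_{k=0}^{p} \hat{\beta}_k \, q_k(s_0)}_{\text{regression (deterministic) part}}
\;+\; \underbrace{\sum_{i=1}^{n} \lambda_i \, e(s_i)}_{\text{kriged residual part}},
\qquad
e(s_i) \;=\; z(s_i) - \sum_{k=0}^{p} \hat{\beta}_k \, q_k(s_i)
```

Here $s_0$ is the prediction location and $s_1, \dots, s_n$ are the measured sites, so the final map value is the sum of the regression prediction and the interpolated residual.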
Paraneoplastic pemphigus regression after thymoma resection
Directory of Open Access Journals (Sweden)
Stergiou Eleni
2008-08-01
Full Text Available Abstract. Background: Among human neoplasms, thymomas are the most frequently associated with paraneoplastic autoimmune diseases. Case presentation: A case of a 42-year-old woman with paraneoplastic pemphigus as the first manifestation of thymoma is reported. Transsternal complete thymoma resection achieved pemphigus regression. The clinical correlations between pemphigus and thymoma are presented. Conclusion: Our case report provides further evidence for the important role of autoantibodies in the pathogenesis of paraneoplastic skin diseases in thymoma patients. It also documents the improvement of the associated pemphigus after radical treatment of the thymoma.
A method for nonlinear exponential regression analysis
Junkin, B. G.
1971-01-01
A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
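The procedure outlined in this abstract (a linear fit for initial estimates, then repeated Taylor-series linearization with a correction applied until a criterion is met) corresponds to what is now usually called Gauss-Newton iteration. The sketch below assumes a single-term model y = a·exp(b·t) with positive y values; the model form, function names and stopping rule are illustrative assumptions, not the report's actual program.

```python
import math


def _line_fit(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_exponential(t, y, tol=1e-10, max_iter=50):
    """Fit y = a * exp(b * t) by Gauss-Newton iteration."""
    # Initial nominal estimates from the linearized model ln y = ln a + b t.
    ln_a, b = _line_fit(t, [math.log(v) for v in y])
    a = math.exp(ln_a)
    for _ in range(max_iter):
        e = [math.exp(b * ti) for ti in t]
        r = [yi - a * ei for yi, ei in zip(y, e)]        # residuals
        ja = e                                           # df/da
        jb = [a * ti * ei for ti, ei in zip(t, e)]       # df/db
        # Normal equations (J^T J) delta = J^T r for the 2 parameters.
        s11 = sum(u * u for u in ja)
        s12 = sum(u * v for u, v in zip(ja, jb))
        s22 = sum(v * v for v in jb)
        g1 = sum(u * v for u, v in zip(ja, r))
        g2 = sum(u * v for u, v in zip(jb, r))
        det = s11 * s22 - s12 * s12
        da = (g1 * s22 - g2 * s12) / det
        db = (g2 * s11 - g1 * s12) / det
        a, b = a + da, b + db                            # apply correction
        if abs(da) + abs(db) < tol:                      # stopping criterion
            break
    return a, b
```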
Multinomial logistic regression in workers' health
Grilo, Luís M.; Grilo, Helena L.; Gonçalves, Sónia P.; Junça, Ana
2017-11-01
In European countries, including Portugal, it is common to hear people say that they are exposed to excessive and continuous psychosocial stressors at work. This is increasing in diverse activity sectors, such as the Services sector. A representative sample was collected from a Portuguese Services organization by applying an internationally validated survey, whose variables were measured in five ordered categories on a Likert-type scale. A multinomial logistic regression model is used to estimate the probability of each category of the dependent variable, general health perception, where, among other independent variables, burnout appears as statistically significant.
Inferring gene regression networks with model trees
Directory of Open Access Journals (Sweden)
Aguilar-Ruiz Jesus S
2010-10-01
Full Text Available Abstract. Background: Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes by building so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful for determining whether two genes have a strong global similarity, but they do not detect local similarities. Results: We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph of all the relationships among output and input genes is built, taking into account whether each pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: the Saccharomyces cerevisiae and E. coli data sets. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results with those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions: REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. They are very often more precise than linear regression models because they can add just different linear
Non-Standard Semiparametric Regression via BRugs
Directory of Open Access Journals (Sweden)
Jennifer K. Marley
2010-11-01
Full Text Available We provide several illustrations of Bayesian semiparametric regression analyses in the BRugs package. BRugs facilitates use of the BUGS inference engine from the R computing environment and allows analyses to be managed using scripts. The examples are chosen to represent an array of non-standard situations, for which mixed model software is not viable. The situations include: the response variable being outside of the one-parameter exponential family, data subject to missingness, data subject to measurement error and parameters entering the model via an index.
Cyclodextrin promotes atherosclerosis regression via macrophage reprogramming
DEFF Research Database (Denmark)
2016-01-01
Atherosclerosis is an inflammatory disease linked to elevated blood cholesterol concentrations. Despite ongoing advances in the prevention and treatment of atherosclerosis, cardiovascular disease remains the leading cause of death worldwide. Continuous retention of apolipoprotein B...... that increases cholesterol solubility in preventing and reversing atherosclerosis. We showed that CD treatment of murine atherosclerosis reduced atherosclerotic plaque size and CC load and promoted plaque regression even with a continued cholesterol-rich diet. Mechanistically, CD increased oxysterol production...... of CD as well as for augmented reverse cholesterol transport. Because CD treatment in humans is safe and CD beneficially affects key mechanisms of atherogenesis, it may therefore be used clinically to prevent or treat human atherosclerosis....
Affine Projection Algorithm Using Regressive Estimated Error
Zhang, Shu; Zhi, Yongfeng
2011-01-01
An affine projection algorithm using regressive estimated error (APA-REE) is presented in this paper. By redefining the iterated error of the affine projection algorithm (APA), a new algorithm is obtained that improves the convergence rate of adaptive filtering. We analyze the iterated error signal and the stability of the APA-REE algorithm. The steady-state weights of the APA-REE algorithm are proved to be unbiased and consistent. The simulation results show that the proposed algorithm has a f...
Spectral density regression for bivariate extremes
Castro Camilo, Daniela
2016-05-11
We introduce a density regression model for the spectral density of a bivariate extreme value distribution, that allows us to assess how extremal dependence can change over a covariate. Inference is performed through a double kernel estimator, which can be seen as an extension of the Nadaraya–Watson estimator where the usual scalar responses are replaced by mean constrained densities on the unit interval. Numerical experiments with the methods illustrate their resilience in a variety of contexts of practical interest. An extreme temperature dataset is used to illustrate our methods. © 2016 Springer-Verlag Berlin Heidelberg
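For orientation, the classical Nadaraya–Watson estimator that this abstract takes as its starting point is a kernel-weighted average of scalar responses. The sketch below shows only that classical scalar version with a Gaussian kernel; the function name and the kernel choice are assumptions, and the paper's double kernel estimator for mean-constrained densities is not reproduced here.

```python
import math


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys at the covariate value x0.

    Uses a Gaussian kernel; observations whose covariate xs[i] lies
    close to x0 (relative to the bandwidth) receive more weight.
    """
    weights = [math.exp(-0.5 * ((x0 - xi) / bandwidth) ** 2) for xi in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total
```

A smaller bandwidth makes the estimate more local (and noisier), while a larger bandwidth pulls it toward the overall mean of the responses.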
Bry, Xavier; Verron, Thomas; Cazes, Pierre
2008-01-01
A variable group Y is assumed to depend upon R thematic variable groups X_1, ..., X_R. We assume that components in Y depend linearly upon components in the X_r's. In this work, we propose a multiple covariance criterion which extends that of PLS regression to this multiple predictor groups situation. On this criterion, we build a PLS-type exploratory method - Structural Equation Exploratory Regression (SEER) - that allows one to simultaneously perform dimension reduction in groups and investiga...
Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon
2015-01-01
Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.
ON THE GASTALDI – D’URSO FUZZY LINEAR REGRESSION
Directory of Open Access Journals (Sweden)
DANA-FLORENTA SIMION
2011-04-01
Full Text Available In crisp regression models, the differences between observed and calculated values are assumed to be caused by randomly distributed errors, although these are due to observation errors and an inappropriate model structure. So the fuzzy character of the model prevails. Fuzzy linear regression models (FLRM) are, roughly speaking, of two kinds: fuzzy linear programming (FLP) based methods and fuzzy least squares (FLS) methods. The FLP methods were initiated by H. Tanaka (1982) and developed by H. Ishibuchi et al. The classical FLR model, Y=A0+A1X1+...+AkXk, has a fuzzy triangular explained variable, Y, fuzzy triangular coefficients {Aj} and crisp explanatory variables {Xj}: the parameters {Aj} of the model are estimated by minimizing the total indetermination of the model, so that each data point lies within the limits of the response variable. In a large number of situations the prediction interval of the FLR model was much narrower than the interval obtained by applying the classical multiple linear regression model (see V.M. Kandala – 2002, 2003). However, this approach is somewhat heuristic; moreover, the complexity of the LP model increases excessively as the number of data points increases. The FLS approach (P. Diamond; Miin-Shen Yang, Hsien-Hsiung Liu – 1988 et al.) is an extension of the classical OLS method, using various metrics defined on the space of fuzzy numbers. A significant number of recent works (McCauley-Bell (1999); J. de A. Sanchez and A. T. Gomez (2003), who used FLS to estimate the term structure of interest rates) deal with models with a fuzzy output, fuzzy coefficients and a crisp input vector. All the fuzzy components are symmetric triangular fuzzy numbers: the main idea of the method is to minimize the total support of the fuzzy coefficients. Sometimes, different restrictions occur. In our paper, we intend to build some examples for the P. d’Urso and T. Gastaldi models that allow a comparative study of various options
Energy Technology Data Exchange (ETDEWEB)
Dotsenko, Viktor S [Landau Institute for Theoretical Physics, Russian Academy of Sciences, Moscow (Russian Federation)
2011-03-31
In the last two decades, it has been established that a single universal probability distribution function, known as the Tracy-Widom (TW) distribution, in many cases provides a macroscopic-level description of the statistical properties of microscopically different systems, including both purely mathematical ones, such as increasing subsequences in random permutations, and quite physical ones, such as directed polymers in random media or polynuclear crystal growth. In the first part of this review, we use a number of models to examine this phenomenon at a simple qualitative level and then consider the exact solution for one-dimensional directed polymers in a random environment, showing that free energy fluctuations in such a system are described by the universal TW distribution. The second part provides detailed appendix material containing the necessary mathematical background for the first part. (reviews of topical problems)
Regression Models For Saffron Yields in Iran
S. H, Sanaeinejad; S. N, Hosseini
Saffron is an important crop, both socially and economically, in Khorassan Province (Northeast Iran). In this research we tried to evaluate trends in saffron yield in recent years and to study the relationship between saffron yield and climate change. A regression analysis was used to predict saffron yield based on 20 years of yield data from the cities of Birjand, Ghaen and Ferdows. Climatological data for the same periods were provided by the database of the Khorassan Climatology Center. The climatological data included temperature, rainfall, relative humidity and sunshine hours for Model I, and temperature and rainfall for Model II. The results showed that the coefficients of determination for Birjand, Ferdows and Ghaen for Model I were 0.69, 0.50 and 0.81 respectively. The coefficients of determination for the same cities for Model II were 0.53, 0.50 and 0.72 respectively. Multiple regression analysis indicated that, among the weather variables, temperature was the key parameter for variation in saffron yield. It was concluded that increasing spring temperature was the main cause of the decline in saffron yield during recent years across the province. Finally, the yield trend was predicted for the last 5 years using time series analysis.
Bayesian nonlinear regression for large small problems
Chakraborty, Sounak
2012-07-01
Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p, small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of the support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM, which relies on type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use a Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
Supporting Regularized Logistic Regression Privately and Efficiently.
Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei
2016-01-01
As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are subject to strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.
Regression testing in the TOTEM DCS
Rodríguez, F. Lucas; Atanassov, I.; Burkimsher, P.; Frost, O.; Taskinen, J.; Tulimaki, V.
2012-12-01
The Detector Control System of the TOTEM experiment at the LHC is built with the industrial product WinCC OA (PVSS). The TOTEM system is generated automatically through scripts that take as input the detector Product Breakdown Structure (PBS) and its pinout connectivity, archiving and alarm metainformation, and some other heuristics based on the naming conventions. When those initial parameters and the automation code are modified to include new features, the resulting PVSS system can also exhibit side-effects. On a daily basis, a custom-developed regression testing tool takes the most recent code from a Subversion (SVN) repository and builds a new control system from scratch. This system is exported in plain text format using the PVSS export tool, and compared with a system previously validated by a human. A report is sent to the developers with any differences highlighted, in readiness for validation and acceptance as a new stable version. This regression approach is not dependent on any development framework or methodology. The process has worked satisfactorily for several months, proving to be a very valuable tool before deploying new versions in the production systems.
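The comparison step described in this abstract (a freshly built system exported as plain text and compared with a previously validated export) can be sketched with a standard text diff. The function name and the sample lines below are hypothetical; the actual TOTEM tool and the PVSS export format are not shown here.

```python
import difflib


def export_diff(validated_lines, candidate_lines):
    """Return unified-diff lines between a validated export and a new one.

    An empty list means the new build matches the validated baseline,
    so there is nothing for the developers to review.
    """
    return list(
        difflib.unified_diff(
            validated_lines,
            candidate_lines,
            fromfile="validated",
            tofile="candidate",
            lineterm="",
        )
    )


# Hypothetical usage with two tiny in-memory exports.
baseline = ["alarm.limit = 10", "archive.rate = 1"]
rebuilt = ["alarm.limit = 12", "archive.rate = 1"]
report = export_diff(baseline, rebuilt)
```

Any non-empty report would be sent to the developers, who either fix the side-effect or accept the new export as the next validated baseline.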
Cox regression model with doubly truncated data.
Rennert, Lior; Xie, Sharon X
2017-10-26
Truncation is a well-known phenomenon that may be present in observational studies of time-to-event data. While many methods exist to adjust for either left or right truncation, there are very few methods that adjust for simultaneous left and right truncation, also known as double truncation. We propose a Cox regression model to adjust for this double truncation using a weighted estimating equation approach, where the weights are estimated from the data both parametrically and nonparametrically, and are inversely proportional to the probability that a subject is observed. The resulting weighted estimators of the hazard ratio are consistent. The parametric weighted estimator is asymptotically normal and a consistent estimator of the asymptotic variance is provided. For the nonparametric weighted estimator, we apply the bootstrap technique to estimate the variance and confidence intervals. We demonstrate through extensive simulations that the proposed estimators greatly reduce the bias compared to the unweighted Cox regression estimator which ignores truncation. We illustrate our approach in an analysis of autopsy-confirmed Alzheimer's disease patients to assess the effect of education on survival. © 2017, The International Biometric Society.
Scientific Progress or Regress in Sports Physiology?
Böning, Dieter
2016-11-01
In modern societies there is strong belief in scientific progress, but, unfortunately, a parallel partial regress occurs because of often avoidable mistakes. Mistakes are mainly forgetting, erroneous theories, errors in experiments and manuscripts, prejudice, selected publication of "positive" results, and fraud. An example of forgetting is that methods introduced decades ago are used without knowing the underlying theories: Basic articles are no longer read or cited. This omission may cause incorrect interpretation of results. For instance, false use of actual base excess instead of standard base excess for calculation of the number of hydrogen ions leaving the muscles raised the idea that an unknown fixed acid is produced in addition to lactic acid during exercise. An erroneous theory led to the conclusion that lactate is not the anion of a strong acid but a buffer. Mistakes occur after incorrect application of a method, after exclusion of unwelcome values, during evaluation of measurements by false calculations, or during preparation of manuscripts. Co-authors, as well as reviewers, do not always carefully read papers before publication. Peer reviewers might be biased against a hypothesis or an author. A general problem is selected publication of positive results. An example of fraud in sports medicine is the presence of doped subjects in groups of investigated athletes. To reduce regress, it is important that investigators search both original and recent articles on a topic and conscientiously examine the data. All co-authors and reviewers should read the text thoroughly and inspect all tables and figures in a manuscript.
Optimization of DWDM Demultiplexer Using Regression Analysis
Directory of Open Access Journals (Sweden)
Venkatachalam Rajarajan Balaji
2016-01-01
Full Text Available We propose a novel twelve-channel Dense Wavelength Division Multiplexing (DWDM) demultiplexer that uses a two-dimensional photonic crystal (2D PC) with a square resonant cavity (SRC) and conforms to the ITU-T G.694.1 standard. The DWDM demultiplexer consists of an input waveguide, an SRC, and an output waveguide. The SRC in the proposed demultiplexer consists of a square resonator and a microcavity. The microcavity center rod radius (Rm) is proportional to refractive index. The refractive index property of the rods filters the wavelengths of odd and even channels. The proposed microcavity can filter twelve ITU-T G.694.1 standard wavelengths with 0.2 nm/25 GHz channel spacing between the wavelengths. From the simulation, we optimize the rod radius and wavelength with linear regression analysis. From the regression analysis, we can achieve 95% accuracy with an average quality factor of 7890, a uniform spectral line-width of 0.2 nm, a transmission efficiency of 90%, crosstalk of −42 dB, and a footprint of about 784 μm².
Downscaling Wind Forecasts via Clustering and Regression
Lee, H. S.; Zhang, Y.; Liu, Y.; Wu, L.; He, Y.; Schaake, J. C.
2016-12-01
Wind is an important weather variable and a key determinant of evaporation, snowfall and coastal flooding. At present, wind information from medium-range weather forecasts is of limited accuracy, and the associated resolution is often too coarse to be used directly for hydrologic prediction purposes. This work presents a statistical post-processing framework that will be used to generate fine-scale wind products to serve NOAA's National Water Model effort. The prototype of this framework consists of two components: a) a cluster analysis module that classifies Automated Surface Observing System (ASOS) stations into multiple groups based on elevation and/or surface roughness lengths derived from the National Land Cover Database 2011 (NLCD2011), and b) a regression module based on the Heteroscedastic Extended Logistic Regression (HXLR) technique that statistically downscales Global Ensemble Forecast System (GEFS) wind hindcasts to the location of the closest station within each identified cluster. The efficacy of the framework is assessed for a region that is roughly the service area of NOAA's Middle Atlantic River Forecast Center (MARFC). For this region, wind hindcasts generated from GEFS are downscaled and corrected using a digital elevation model and the National Land Cover Database; observations from ASOS serve both as the predictands for establishing the relationship and as the reference for validation. Our results showed that this framework considerably enhances the quality of the wind forecast, with the Nash-Sutcliffe efficiency of the downscaled wind speed improving by 0.2 to 0.4 relative to the raw GEFS forecast.
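The Nash-Sutcliffe efficiency used as the skill score above can be sketched as follows. This is a minimal illustration of the standard definition (one minus the ratio of squared forecast error to the squared deviation of the observations from their mean); the function name and the wind speeds are invented for illustration and are not data from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / (squared deviation of observations from their mean)."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("observed and simulated must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("observations are constant; the score is undefined")
    return 1.0 - sse / sst


# Synthetic wind speeds in m/s (assumed values, for illustration only).
obs = [3.0, 5.0, 4.0, 6.0, 2.0]
raw = [4.0, 4.0, 4.0, 4.0, 4.0]          # a forecast no better than the observed mean
downscaled = [3.5, 4.5, 4.0, 5.5, 2.5]   # a forecast that tracks the observations

nse_raw = nash_sutcliffe(obs, raw)
nse_downscaled = nash_sutcliffe(obs, downscaled)
```

A score of 1 indicates a perfect match and a score of 0 indicates a forecast that is no better than the observed mean, so an improvement of 0.2 to 0.4 is a sizeable gain on this scale.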
Optical proximity correction with principal component regression
Gao, Peiran; Gu, Allan; Zakhor, Avideh
2008-03-01
An important step in today's Integrated Circuit (IC) manufacturing is optical proximity correction (OPC). In model-based OPC, masks are systematically modified to compensate for the non-ideal optical and process effects of the optical lithography system. The polygons in the layout are fragmented, and simulations are performed to determine the image intensity pattern on the wafer. Then the mask is perturbed by moving the fragments to match the desired wafer pattern. This iterative process continues until the pattern on the wafer matches the desired one. Although OPC increases the fidelity of pattern transfer to the wafer, it is quite CPU intensive; OPC for modern IC designs can take days to complete on computer clusters with thousands of CPUs. In this paper, techniques from statistical machine learning are used to predict the fragment movements. The goal is to reduce the number of iterations required in model-based OPC by using a fast and efficient solution as the initial guess to model-based OPC. To determine the best model, we train and evaluate several principal component regression models based on prediction error. Experimental results show that fragment movement predictions via a regression model significantly decrease the number of iterations required in model-based OPC.
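Principal component regression, as named above, can be sketched in its simplest form: project centered predictors onto the leading principal component, then regress the response on that score. The sketch below is a generic single-component illustration, not the authors' models; the function name, the power-iteration shortcut, and the toy features (standing in for fragment descriptors and fragment movement) are all assumptions.

```python
def pcr_one_component(X, y, iters=200):
    """Fit principal component regression with a single component; return a predict function."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]   # centered predictors
    y_mean = sum(y) / n
    # Unnormalised covariance matrix of the predictors (p x p).
    C = [[sum(row[a] * row[b] for row in Xc) for b in range(p)] for a in range(p)]
    # Power iteration for the leading eigenvector (assumes the all-ones start
    # vector is not orthogonal to it and that C is not the zero matrix).
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    scores = [sum(row[j] * v[j] for j in range(p)) for row in Xc]
    # Least-squares slope of the centered response on the component score.
    gamma = sum(s * (yi - y_mean) for s, yi in zip(scores, y)) / sum(s * s for s in scores)

    def predict(row):
        score = sum((row[j] - means[j]) * v[j] for j in range(p))
        return y_mean + gamma * score

    return predict


# Toy data (assumed): two perfectly correlated features and a linear response.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
y = [2.0, 4.0, 6.0, 8.0]
predict = pcr_one_component(X, y)
prediction = predict([5.0, 5.0])
```

Keeping only the leading components is what makes the approach attractive when predictors are strongly correlated; a practical model would retain several components and choose their number by prediction error, as the abstract describes.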
A reconsideration of the concept of regression.
Dowling, A Scott
2004-01-01
Regression has been a useful psychoanalytic concept, linking present mental functioning with past experiences and levels of functioning. The concept originated as an extension of the evolutionary zeitgeist of the day as enunciated by H. Spencer and H. Jackson and applied by Freud to psychological phenomena. The value system implicit in the contrast of evolution/progression vs dissolution/regression has given rise to unfortunate and powerful assumptions of social, cultural, developmental and individual value as embodied in notions of "higher," "lower," "primitive," "mature," "archaic," and "advanced." The unhelpful results of these assumptions are evident, for example, in attitudes concerning cultural, sexual, and social "correctness," same-sex object choice, and goals of treatment. An alternative, a continuously constructed, continuously emerging mental life, in analogy to the ever changing, continuous physical body, is suggested. This view retains the fundamentals of psychoanalysis, for example, unconscious mental life, drive, defense, and psychic structure, but stresses a functional, ever changing, present oriented understanding of mental life as contrasted with a static, onion-layered view.
Regression trees for regulatory element identification.
Phuong, Tu Minh; Lee, Doheon; Lee, Kwang Hyung
2004-03-22
The transcription of a gene is largely determined by short sequence motifs that serve as binding sites for transcription factors. Recent findings suggest direct relationships between the motifs and gene expression levels. In this work, we present a method for identifying regulatory motifs. Our method makes use of tree-based techniques for recovering the relationships between motifs and gene expression levels. We treat regulatory motifs and gene expression levels as predictor variables and responses, respectively, and use a regression tree model to identify the structural relationships between them. The regression tree methodology is extended to handle responses from multiple experiments by modifying the split function. The significance of regulatory elements is determined by analyzing tree structures and using a variable importance measure. When applied to two data sets of the yeast Saccharomyces cerevisiae, the method successfully identifies most of the regulatory motifs that are known to control gene transcription under the given experimental conditions, and suggests several new putative motifs. Analysis of the tree structures also reconfirms several pairs of motifs that are known to regulate gene transcription in combination. http://if.kaist.ac.kr/~phuong/RegTree
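The idea of modifying the split function to handle responses from multiple experiments can be sketched as follows. The abstract does not give the authors' actual split criterion, so this is only one plausible variant: the cost of a split is the within-node sum of squared errors summed over all response columns. The function names and the tiny motif-count data set are hypothetical.

```python
def multi_response_sse(rows):
    """Squared deviation from the column means, summed over all response columns."""
    if not rows:
        return 0.0
    n = len(rows)
    total = 0.0
    for column in zip(*rows):
        mean = sum(column) / n
        total += sum((value - mean) ** 2 for value in column)
    return total


def best_split(X, Y):
    """Return (feature_index, threshold, cost) minimising the summed SSE of the two children."""
    best = None
    for j in range(len(X[0])):
        # Skip the largest value: splitting there would leave the right child empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [y for x, y in zip(X, Y) if x[j] <= t]
            right = [y for x, y in zip(X, Y) if x[j] > t]
            cost = multi_response_sse(left) + multi_response_sse(right)
            if best is None or cost < best[2]:
                best = (j, t, cost)
    return best


# Hypothetical data: rows are genes, columns of X are motif counts,
# columns of Y are expression levels in two experiments.
X = [[0, 1], [0, 0], [1, 1], [1, 0], [0, 1], [1, 0]]
Y = [[0.1, 0.0], [0.0, 0.2], [2.0, 1.9], [2.1, 2.0], [0.2, 0.1], [1.9, 2.1]]
feature, threshold, cost = best_split(X, Y)
```

In this toy example the first motif separates low-expression from high-expression genes in both experiments, so it is chosen as the split variable; growing a full tree would apply the same search recursively to each child.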
Ogutu, Joseph O; Schulz-Streeck, Torben; Piepho, Hans-Peter
2012-05-21
Genomic selection (GS) is emerging as an efficient and cost-effective method for estimating breeding values using molecular markers distributed over the entire genome. In essence, it involves estimating the simultaneous effects of all genes or chromosomal segments and combining the estimates to predict the total genomic breeding value (GEBV). Accurate prediction of GEBVs is a central and recurring challenge in plant and animal breeding. The existence of a bewildering array of approaches for predicting breeding values using markers underscores the importance of identifying approaches able to efficiently and accurately predict breeding values. Here, we comparatively evaluate the predictive performance of six regularized linear regression methods-- ridge regression, ridge regression BLUP, lasso, adaptive lasso, elastic net and adaptive elastic net-- for predicting GEBV using dense SNP markers. We predicted GEBVs for a quantitative trait using a dataset on 3000 progenies of 20 sires and 200 dams and an accompanying genome consisting of five chromosomes with 9990 biallelic SNP-marker loci simulated for the QTL-MAS 2011 workshop. We applied all the six methods that use penalty-based (regularization) shrinkage to handle datasets with far more predictors than observations. The lasso, elastic net and their adaptive extensions further possess the desirable property that they simultaneously select relevant predictive markers and optimally estimate their effects. The regression models were trained with a subset of 2000 phenotyped and genotyped individuals and used to predict GEBVs for the remaining 1000 progenies without phenotypes. Predictive accuracy was assessed using the root mean squared error, the Pearson correlation between predicted GEBVs and (1) the true genomic value (TGV), (2) the true breeding value (TBV) and (3) the simulated phenotypic values based on fivefold cross-validation (CV). The elastic net, lasso, adaptive lasso and the adaptive elastic net all had
Matula, Dominik
2013-01-01
The author summarizes some previous results concerning random triangles. He describes the Gaussian triangle and random triangles whose vertices lie in a unit n-dimensional ball, in a rectangle or in a general bounded convex set. In the second part, the author deals with a triangle inscribed in a triangle: let ABC be an equilateral triangle and let M, N, O be three points, each lying on one side of ABC. We call MNO an inscribed triangle (in an equilateral triangle). The median triangle ...
Mehta, Madan Lal
1990-01-01
Since the publication of Random Matrices (Academic Press, 1967) so many new results have emerged both in theory and in applications, that this edition is almost completely revised to reflect the developments. For example, the theory of matrices with quaternion elements was developed to compute certain multiple integrals, and the inverse scattering theory was used to derive asymptotic results. The discovery of Selberg's 1944 paper on a multiple integral also gave rise to hundreds of recent publications. This book presents a coherent and detailed analytical treatment of random matrices, leading
Tatone, Elise H; Duffield, Todd F; LeBlanc, Stephen J; DeVries, Trevor J; Gordon, Jessica L
2017-02-01
An observational study of 790 to over 3,000 herds was conducted to estimate the within-herd prevalence and cow-level risk factors for ketosis in dairy cattle in herds that participate in a Dairy Herd Improvement Association (DHIA) program. Ketosis or hyperketolactia (KET) was diagnosed as milk β-hydroxybutyrate ≥0.15 mmol/L at first DHIA test when tested within the first 30 d in milk. Seven hundred ninety-five herds providing at least 61 first milk tests from June 2014 to December 2015 were used to estimate the provincial within-herd prevalence of KET. All herds on DHIA in Ontario (n = 3,042) were used to construct cow-level multilevel logistic regression models to investigate the association of DHIA collected variables with the odds of KET at first DHIA milk test. Primiparous and multiparous animals were modeled independently. The cow-level KET prevalence in Ontario was 21%, with an average within-herd prevalence of 21% (standard deviation = 10.6) for dairy herds enrolled in a DHIA program. The prevalence of KET had a distinct seasonality with the lowest prevalence occurring from July to November. Automatic milking systems (AMS) were associated with increased within-herd prevalence, as well as increased odds of KET in multiparous animals at first test (odds ratio: 1.45; 95% confidence interval: 1.30 to 1.63). Jersey cattle had over 1.46 times higher odds of KET than Holstein cattle. Milk fat yield ≥1.12 kg/d at the last test of the previous lactation was associated with decreased odds of KET in the current lactation (odds ratio: 0.56; 95% confidence interval: 0.53 to 0.59). Increased days dry and longer calving intervals, for multiparous animals, and older age at first calving for primiparous animals increased the odds of KET at first test. This study confirms previous findings that increased days dry, longer calving intervals, and increased age at first calving are associated with increased odds of KET and is the first report of increased KET in herds with
Logistic regression against a divergent Bayesian network
Directory of Open Access Journals (Sweden)
Noel Antonio Sánchez Trujillo
2015-01-01
Full Text Available This article discusses two statistical tools used for prediction and causality assessment: logistic regression and Bayesian networks. Using data from a simulated example of a study assessing factors that might predict pulmonary emphysema (where fingertip pigmentation and smoking are considered), we posed the following questions. Is pigmentation a confounding, causal or predictive factor? Is there perhaps another factor, like smoking, that confounds? Is there a synergy between pigmentation and smoking? The results, in terms of prediction, are similar with the two techniques; regarding causation, differences arise. We conclude that, in decision-making, the combination of a statistical tool used with common sense and previous evidence, which can take years or even centuries to develop, is better than the automatic and exclusive use of statistical resources.
Robust mediation analysis based on median regression.
Yuan, Ying; Mackinnon, David P
2014-03-01
Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. (c) 2014 APA, all rights reserved.
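For orientation, the single-mediator model that such an analysis typically builds on can be written out as below. The abstract does not state its equations, so this is the textbook formulation rather than the authors' exact specification; the symbols (X for the independent variable, M for the mediator, Y for the outcome, and a, b, c' for the path coefficients) are assumptions. Median regression replaces the least-squares criterion with the sum of absolute residuals.

```latex
\begin{align}
M &= i_1 + aX + e_1, \\
Y &= i_2 + c'X + bM + e_2, \\
\text{mediated (indirect) effect} &= ab, \\
(\hat{i}_1, \hat{a}) &= \arg\min_{i_1,\,a} \sum_{k=1}^{n} \bigl| M_k - i_1 - aX_k \bigr|, \\
(\hat{i}_2, \hat{c}', \hat{b}) &= \arg\min_{i_2,\,c',\,b} \sum_{k=1}^{n} \bigl| Y_k - i_2 - c'X_k - bM_k \bigr|.
\end{align}
```

Because absolute residuals grow linearly rather than quadratically, a few extreme observations pull the estimates of a and b less than they would under least squares, which is the source of the robustness to heavy-tailed and contaminated errors described above.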
Adaptive regression for modeling nonlinear relationships
Knafl, George J
2016-01-01
This book presents methods for investigating whether relationships are linear or nonlinear and for adaptively fitting appropriate models when they are nonlinear. Data analysts will learn how to incorporate nonlinearity in one or more predictor variables into regression models for different types of outcome variables. Such nonlinear dependence is often not considered in applied research, yet nonlinear relationships are common and so need to be addressed. A standard linear analysis can produce misleading conclusions, while a nonlinear analysis can provide novel insights into data, not otherwise possible. A variety of examples of the benefits of modeling nonlinear relationships are presented throughout the book. Methods are covered using what are called fractional polynomials based on real-valued power transformations of primary predictor variables combined with model selection based on likelihood cross-validation. The book covers how to formulate and conduct such adaptive fractional polynomial modeling in the s...
Conjoined legs: Sirenomelia or caudal regression syndrome?
Directory of Open Access Journals (Sweden)
Sakti Prasad Das
2013-01-01
Full Text Available The presence of a single umbilical persistent vitelline artery distinguishes sirenomelia from caudal regression syndrome. We report the case of a 12-year-old boy with bilateral umbilical arteries who presented with fusion of both legs in the lower third of the leg. Both feet were rudimentary. The right foot had a valgus rocker-bottom deformity. All toes were present but rudimentary. The left foot showed absence of all toes. Physical examination showed left tibia vara. The chest evaluation in sitting revealed pigeon chest and an elevated right shoulder. Posterior examination of the trunk showed thoracic scoliosis with convexity to the right. The patient was operated on, and at 1-year follow-up the boy had two separate legs with good aesthetic and functional results.
Macrophages, Dendritic Cells, and Regression of Atherosclerosis
Directory of Open Access Journals (Sweden)
Jonathan E. Feig
2012-07-01
Full Text Available Atherosclerosis is the number one cause of death in the Western world. It results from the interaction between modified lipoproteins and monocyte-derived cells such as macrophages, dendritic cells, T cells, and other cellular elements of the arterial wall. This inflammatory process can ultimately lead to the development of complex lesions, or plaques, that protrude into the arterial lumen. Ultimately, plaque rupture and thrombosis can occur, leading to the clinical complications of myocardial infarction or stroke. Although each of the cell types plays a role in the pathogenesis of atherosclerosis, this review focuses primarily on the monocyte-derived cells: macrophages and dendritic cells. The roles of these cell types in atherogenesis will be highlighted. Finally, the mechanisms of atherosclerosis regression as they relate to these cells will be discussed.
Nonparametric additive regression for repeatedly measured data
Carroll, R. J.
2009-05-20
We develop an easily computed smooth backfitting algorithm for additive model fitting in repeated measures problems. Our methodology easily copes with various settings, such as when some covariates are the same over repeated response measurements. We allow for a working covariance matrix for the regression errors, showing that our method is most efficient when the correct covariance matrix is used. The component functions achieve the known asymptotic variance lower bound for the scalar argument case. Smooth backfitting also leads directly to design-independent biases in the local linear case. Simulations show our estimator has smaller variance than the usual kernel estimator. This is also illustrated by an example from nutritional epidemiology. © 2009 Biometrika Trust.
Early development and regression in Rett syndrome.
Lee, J Y L; Leonard, H; Piek, J P; Downs, J
2013-12-01
This study utilized developmental profiling to examine symptoms in 14 girls with genetically confirmed Rett syndrome whose families were participating in the Australian Rett syndrome or InterRett database. Regression was mostly characterized by loss of hand and/or communication skills (13/14), except that one girl demonstrated slowing of skill development. Social withdrawal and inconsolable crying often developed simultaneously (9/14), with social withdrawal lasting for a shorter duration than inconsolable crying. Previously acquired gross motor skills declined in just over half of the sample (8/14), mostly observed as a loss of balance. Early abnormalities such as vomiting and strabismus were also seen. Our findings provide additional insight into the early clinical profile of Rett syndrome. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Entrepreneurial intention modeling using hierarchical multiple regression
Directory of Open Access Journals (Sweden)
Marina Jeger
2014-12-01
Full Text Available The goal of this study is to identify the contribution of effectuation dimensions to the predictive power of the entrepreneurial intention model over and above that which can be accounted for by other predictors selected and confirmed in previous studies. As is often the case in social and behavioral studies, some variables are likely to be highly correlated with each other. Therefore, the relative amount of variance in the criterion variable explained by each of the predictors depends on several factors such as the order of variable entry and sample specifics. The results show the modest predictive power of two dimensions of effectuation prior to the introduction of the theory of planned behavior elements. The article highlights the main advantages of applying hierarchical regression in social sciences as well as in the specific context of entrepreneurial intention formation, and addresses some of the potential pitfalls that this type of analysis entails.
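The core quantity in hierarchical multiple regression, the gain in explained variance when a new block of predictors is entered after an earlier block, can be sketched as below. This is a generic illustration using ordinary least squares via the normal equations; the function names, block names, and all numbers are invented and do not come from the study.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit of y on an intercept plus the given predictor columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic scores (assumed): block_one is entered first, block_two second.
block_one = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
block_two = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
intention = [2.0, 2.5, 5.0, 5.5, 8.0, 8.5, 11.0, 11.5]   # block_one + 0.5 * block_two

r2_step1 = r_squared([block_one], intention)
r2_step2 = r_squared([block_one, block_two], intention)
delta_r2 = r2_step2 - r2_step1   # variance explained over and above block_one
```

Because the two blocks are correlated, the size of the increment depends on which block is entered first, which is the order-of-entry caveat raised in the abstract.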