random regression sire: Topics by WorldWideScience.org

Sample records for random regression sire

Genetic correlations between body condition scores and fertility in dairy cattle using bivariate random regression models.

Science.gov (United States)

De Haas, Y; Janss, L L G; Kadarmideen, H N

2007-10-01

Genetic correlations between body condition score (BCS) and fertility traits in dairy cattle were estimated using bivariate random regression models. BCS was recorded by the Swiss Holstein Association on 22,075 lactating heifers (primiparous cows) from 856 sires. Fertility data during first lactation were extracted for 40,736 cows. The fertility traits were days to first service (DFS), days between first and last insemination (DFLI), calving interval (CI), number of services per conception (NSPC) and conception rate to first insemination (CRFI). A bivariate model was used to estimate genetic correlations between BCS as a longitudinal trait, modelled by random regression components, and daughter's fertility at the sire level as a single lactation measurement. Heritability of BCS was 0.17, and heritabilities for fertility traits were low (0.01-0.08). Genetic correlations between BCS and fertility over the lactation varied from -0.45 to -0.14 for DFS, from -0.75 to 0.03 for DFLI, from -0.59 to -0.02 for CI, from -0.47 to 0.33 for NSPC and from 0.08 to 0.82 for CRFI. These results show (genetic) interactions between fat reserves and reproduction along the lactation trajectory of modern dairy cows, which can be useful in genetic selection as well as in management. Maximum genetic gain in fertility from indirect selection on BCS should be based on measurements taken in mid lactation, when the genetic variance for BCS is largest and the genetic correlations between BCS and fertility are strongest.
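The statement that genetic variance for BCS peaks in mid lactation can be illustrated with a covariance function. The sketch below assumes a Legendre polynomial basis on standardized days in milk, which is common in random regression analyses but is not stated in the abstract. The coefficient covariance matrix HYPOTHETICAL_K and the 5 to 305 day range are invented for illustration and are not estimates from the study.

```python
import math

def legendre_basis(t, t_min, t_max, order):
    """Normalized Legendre polynomials of degree 0..order-1 evaluated at time t."""
    x = 2.0 * (t - t_min) / (t_max - t_min) - 1.0  # standardize t to [-1, 1]
    p = [1.0, x]
    # Bonnet's recursion: (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}
    for n in range(1, order - 1):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    p = p[:order]
    return [math.sqrt((2 * n + 1) / 2.0) * p[n] for n in range(order)]

def genetic_variance(t, K, t_min=5, t_max=305):
    """Genetic variance at time t: phi(t)' K phi(t)."""
    phi = legendre_basis(t, t_min, t_max, len(K))
    size = len(K)
    return sum(phi[i] * K[i][j] * phi[j] for i in range(size) for j in range(size))

# Hypothetical covariance matrix of the random regression coefficients
# (intercept, linear, quadratic); values are illustrative only.
HYPOTHETICAL_K = [
    [0.10, 0.00, -0.02],
    [0.00, 0.02, 0.00],
    [-0.02, 0.00, 0.01],
]

for day in (5, 155, 305):
    print(day, round(genetic_variance(day, HYPOTHETICAL_K), 4))
```

With this illustrative matrix the variance is larger at day 155 than at either end of the lactation; a different matrix could give a different shape.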
Siring Success and Paternal Effects in Heterodichogamous Acer opalus

Science.gov (United States)

Gleiser, Gabriela; Segarra-Moragues, José Gabriel; Pannell, John Richard; Verdú, Miguel

2008-01-01

Background and Aims Heterodichogamy (a dimorphic breeding system comprising protandrous and protogynous individuals) is a potential starting point in the evolution of dioecy from hermaphroditism. In the genus Acer, previous work suggests that dioecy evolved from heterodichogamy through an initial spread of unisexual males. Here, the question is asked as to whether the different morphs in Acer opalus, a species in which males co-exist with heterodichogamous hermaphrodites, differ in various components of male fitness. Methods Several components of male fertility were analysed. Pollination rates in the male phase were recorded across one flowering period. Pollen viability was compared among morphs through hand pollinations both with pollen from a single sexual morph and also simulating a situation of pollen competition; in the latter experiment, paternity was assessed with microsatellite markers. It was also determined whether effects of genetic relatedness between pollen donors and recipients could influence the siring success. Finally, paternal effects occurring beyond the fertilization process were tested for by measuring the height reached by seedlings with different sires over three consecutive growing seasons. Key Results The males and protandrous morphs had higher pollination rates than the protogynous morph, and the seedlings they sired grew taller. No differences in male fertility were found between males and protandrous individuals. Departures from random mating due to effects of genetic relatedness among sires and pollen recipients were also ruled out. Conclusions Males and protandrous individuals are probably better sires than protogynous individuals, as shown by the higher pollination rates and the differential growth of the seedlings sired by these morphs. In contrast, the fertility of males was not higher than the male fertility of the protandrous morph. While the appearance of males in sexually specialized heterodichogamous populations is possible
Genetic correlations among body condition score, yield, and fertility in first-parity cows estimated by random regression models.

Science.gov (United States)

Veerkamp, R F; Koenen, E P; De Jong, G

2001-10-01

Twenty type classifiers scored body condition (BCS) of 91,738 first-parity cows from 601 sires and 5518 maternal grandsires. Fertility data during first lactation were extracted for 177,220 cows, of which 67,278 also had a BCS observation, and first-lactation 305-d milk, fat, and protein yields were added for 180,631 cows. Heritabilities and genetic correlations were estimated using a sire-maternal grandsire model. Heritability of BCS was 0.38. Heritabilities for fertility traits were low (0.01 to 0.07), but genetic standard deviations were substantial: 9 d for days to first service and calving interval, 0.25 for number of services, and 5% for first-service conception. Phenotypic correlations between fertility and yield or BCS were small (-0.15 to 0.20). Genetic correlations between yield and all fertility traits were unfavorable (0.37 to 0.74). Genetic correlations with BCS were between -0.4 and -0.6 for calving interval and days to first service. Random regression analysis (RR) showed that correlations changed with days in milk for BCS. Little agreement was found between variances and correlations from RR and from analyses including a single month (mo 1 to 10) of data for BCS, especially during early and late lactation. However, this was due to excluding data from the conventional analysis, rather than due to the polynomials used. RR and a conventional five-trait model in which BCS in mo 1, 4, 7, and 10 were treated as separate traits (plus yield or fertility) gave similar results. Thus a parsimonious random regression model gave more realistic estimates for the (co)variances than a series of bivariate analyses on subsets of the data for BCS. A higher genetic merit for yield has unfavorable effects on fertility, but the genetic correlation suggests that BCS (at some stages of lactation) might help to alleviate the unfavorable effect of selection for higher yield on fertility.
Evaluation of Columbia, USMARC-Composite, Suffolk, and Texel rams as terminal sires in an extensive rangeland production system: VIII. Quality measures of lamb longissimus dorsi.

Science.gov (United States)

Mousel, M R; Notter, D R; Leeds, T D; Zerby, H N; Moeller, S J; Taylor, J B; Lewis, G S

2014-07-01

Quality measures of lamb longissimus dorsi were evaluated in 514 crossbred wether lambs to assess sire breed differences. Wethers were produced over 3 yr from single-sire matings of 22 Columbia, 22 U.S. Meat Animal Research Center (USMARC)-Composite (Composite), 21 Suffolk, and 17 Texel rams to adult Rambouillet ewes. Lambs were reared to weaning in an extensive western rangeland production system and finished in a feedlot on a high-energy finishing diet. Each lamb was randomly assigned to one of three harvest groups, and lambs were transported to The Ohio State University abattoir when the mean BW of wethers remaining in the feedlot reached 54.4, 61.2, or 68.0 kg. After harvest, subjective lean quality scores were assigned and LM pH (immediately after and 24 h after harvest), color (quantified as Minolta L*, a*, and b*), intramuscular fat (IMF), cooking loss percentage, and Warner-Bratzler shear force (WBSF) were determined. Statistical models included fixed effects of sire breed, year of birth, and harvest group and random effects of sire (nested within sire breed and year) and maternal grandsire. Year and harvest group were significant (P 0.28). At comparable numbers of days on feed, Texel-sired wethers had the greatest (more desirable; P lambs were intermediate and Columbia-sired lambs had the lowest (less desirable). Minolta L* values were greater (P lambs, although this difference is not visually discernible by humans. No significant (P > 0.05) sire breed effects were detected for LM pH at or 24 h after harvest, Minolta a* and b*, IMF, percentage of cooking loss, and WBSF at comparable numbers of days on feed. At comparable chilled carcass weight, significant (P lambs had greater scores than Columbia- and Suffolk-sired lambs, but Composite-sired lambs did not differ from lambs sired by the other sire breeds. Sire breed effects were not detected (P > 0.15) for LM pH at or 24 h after harvest, Minolta L*, a*, and b*, cooking loss percentage, IMF, and WBSF at
Bull fertility evaluations for Angus service sires bred to Holstein cows

Science.gov (United States)

Sire conception rate (SCR), a phenotypic evaluation of service-sire fertility implemented in August 2008, is based on data from the most recent 4 years, conventional-semen breedings up to 7 services, and cow parities 1 through 5. Many US dairy cows are now being bred to Angus sires because beef pric...
Critical evaluation and thermodynamic optimisation of the Si-RE systems: Part II. Si-RE system (RE = Gd, Tb, Dy, Ho, Er, Tm, Lu and Y)

International Nuclear Information System (INIS)

Kim, Junghwan; Jung, In-Ho

2015-01-01

Highlights: • The (Si-RE) (RE = Gd, Tb, Dy, Ho, Er, Tm, Lu and Y) systems have been reviewed. • The thermodynamic optimization of the (Si-RE) systems has been performed. • Systematic changes and similarities in the (Si-RE) systems were found. • The systematic approach resolved inconsistencies in the experimental data. • The systematic approach was used to assess the unexplored phase diagrams. - Abstract: A critical evaluation and optimisation of all available phase diagrams and thermodynamic data of the (Si-RE) (RE = Gd, Tb, Dy, Ho, Er, Tm, Lu and Y) systems was conducted to obtain reliable thermodynamic functions of all the phases in the system. In the thermodynamic modelling, a systematic analysis involving the similarity and periodicity observed in the lanthanide series was applied to resolve inconsistencies in the experimental data and to estimate the unknown thermodynamic properties and phase equilibria data. In particular, the phase diagrams and thermodynamic properties of the (Si-Tm) and (Si-Lu) systems, which are rarely investigated, can be predicted from this approach. Systematic trends in thermodynamic properties of solid and liquid phases and phase diagrams of the entire (Si-RE) systems were summarized.
Smart Integrated Renewable Energy Systems (SIRES): A Novel Approach for Sustainable Development

Directory of Open Access Journals (Sweden)

Zeel Maheshwari

2017-08-01

Technical and economic aspects of the viability of SIRES (Smart Integrated Renewable Energy Systems) for sustainable development of remote and rural areas of the world are discussed. The hallmark of the proposed SIRES is the smart utilization of several renewable resources in an integrated fashion and matching of resources and needs a priori with the ultimate goal of “energization”, not just “electrification”. Historical background leading to this approach is succinctly presented along with a comprehensive schematic diagram. Modeling of various components and their collective use in optimizing SIRES with the aid of a genetic algorithm are presented using a typical hypothetical example. SIRES is also compared with various approaches for rural development based on Annualized Cost of System (ACS) and installation costs. Implementation of SIRES will lead to overall sustainable development of rural communities.
Effects of Using Dorper, Hampshire Down, Bluefaced Leicester and German Blackheaded Rams as Terminal Sires in Extensive Low-Input Production Systems

Directory of Open Access Journals (Sweden)

Dinu Gavojdian

2017-05-01

The current study was conducted to evaluate Dorper, Hampshire Down, Bluefaced Leicester and German Blackheaded breeds as terminal sires in an extensive low-input production system under European temperate conditions, when crossed with the native Turcana breed as a maternal genotype. The project breeding herd consisted of 300 multiparous purebred Turcana ewes, managed under an extensive low-input production system. Six breeding herds were set up, with randomly selected ewes (50/group) being exposed to Dorper, Hampshire Down, Bluefaced Leicester, German Blackheaded and Turcana (control group) rams. Lamb birth weight was influenced (p≤0.01) for the F1 Hampshire Down x Turcana and F1 German Blackheaded x Turcana crossbreds, compared to their counterparts. Lamb survival from birth to weaning was the lowest (88.4±3.30%) for the Dorper sired lambs, and the highest (94.0±1.84%) in the Bluefaced Leicester sired lambs (p≤0.01). Hampshire Down and German Blackheaded sired lambs had similar survival rates to the purebred Turcana lambs (p>0.05). Body weight of lambs at the age of 8 months was significantly higher (p≤0.001) in Dorper (41.3±0.51), Bluefaced Leicester (41.2±0.34) and German Blackheaded (42.4±0.58) sired genotypes, while the Hampshire Down half-breeds (39.3±0.65) had intermediate body weights (p≤0.01) compared to the controls (34.6±0.49) and the better performing genotypes.
Random regression models for detection of gene by environment interaction

Directory of Open Access Journals (Sweden)

Meuwissen Theo HE

2007-02-01

Two random regression models, where the effect of a putative QTL was regressed on an environmental gradient, are described. The first model estimates the correlation between intercept and slope of the random regression, while the other model restricts this correlation to 1 or -1, which is expected under a bi-allelic QTL model. The random regression models were compared to a model assuming no gene by environment interactions. The comparison was done with regard to the models' ability to detect QTL, to position them accurately and to detect possible QTL by environment interactions. A simulation study based on a granddaughter design was conducted, and QTL were assumed, either by assigning an effect independent of the environment or as a linear function of a simulated environmental gradient. It was concluded that the random regression models were suitable for detection of QTL effects, in the presence and absence of interactions with environmental gradients. Fixing the correlation between intercept and slope of the random regression had a positive effect on power when the QTL effects re-ranked between environments.
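The idea of regressing a QTL effect on an environmental gradient can be sketched in a few lines. The example below is an illustration only: it treats the effect as a straight line in the environment and flags re-ranking when the effect changes sign inside the observed range. The function names and numbers are hypothetical, and the sketch is not the estimation procedure used in the paper.

```python
def qtl_effect(intercept, slope, env):
    """QTL effect modelled as a linear function of the environmental value."""
    return intercept + slope * env

def reranks(intercept, slope, env_min, env_max):
    """True if the effect changes sign somewhere between env_min and env_max.

    A sign change means the favourable allele in one environment is the
    unfavourable allele in another, i.e. genotypes re-rank.
    """
    low = qtl_effect(intercept, slope, env_min)
    high = qtl_effect(intercept, slope, env_max)
    return low * high < 0

# Hypothetical values on a standardized gradient from -1 to 1.
print(reranks(0.2, 0.5, -1.0, 1.0))  # effect runs from -0.3 to 0.7
print(reranks(1.0, 0.5, -1.0, 1.0))  # effect runs from 0.5 to 1.5
```

When the slope is small relative to the intercept, the effect only scales across environments and no re-ranking occurs.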
Interpreting parameters in the logistic regression model with random effects

DEFF Research Database (Denmark)

Larsen, Klaus; Petersen, Jørgen Holm; Budtz-Jørgensen, Esben

2000-01-01

interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects...
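The record lists the median odds ratio among its keywords. As a hedged illustration, the sketch below uses the form in which this quantity is commonly written for a random-intercept logistic model with normally distributed random effects, MOR = exp(sqrt(2 * variance) * z), where z is the 75th percentile of the standard normal distribution. The function name and the variance value are assumptions chosen for the example.

```python
import math
from statistics import NormalDist

def median_odds_ratio(random_effect_variance):
    """Median odds ratio for a normally distributed random intercept."""
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1)
    return math.exp(math.sqrt(2.0 * random_effect_variance) * z75)

# Hypothetical between-cluster variance on the log-odds scale.
print(round(median_odds_ratio(1.0), 3))
```

A variance of zero gives a median odds ratio of 1, meaning no heterogeneity between clusters.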
Multi-trait and random regression mature weight heritability and ...

African Journals Online (AJOL)

Legendre polynomials of orders 4, 3, 6 and 3 were used for animal and maternal genetic and permanent environmental effects, respectively, considering five classes of residual variances. Mature weight (five years) direct heritability estimates were 0.35 (MM) and 0.38 (RRM). Rank correlation between sires' breeding values ...
Polyandry in dragon lizards: inbred paternal genotypes sire fewer offspring

Science.gov (United States)

Frère, Celine H; Chandrasoma, Dani; Whiting, Martin J

2015-01-01

Multiple mating in female animals is something of a paradox because it can either be risky (e.g., higher probability of disease transmission, social costs) or provide substantial fitness benefits (e.g., genetic bet hedging whereby the likelihood of reproductive failure is lowered). The genetic relatedness of parental units, particularly in lizards, has rarely been studied in the wild. Here, we examined levels of multiple paternity in Australia's largest agamid lizard, the eastern water dragon (Intellagama lesueurii), and determined whether male reproductive success is best explained by its heterozygosity coefficient or the extent to which it is related to the mother. Female polyandry was the norm: 2/22 clutches (9.2%) were sired by three or more fathers, 17/22 (77.2%) were sired by two fathers, and only 3/22 (13.6%) clutches were sired by one father. Moreover, we reconstructed the paternal genotypes for 18 known mother–offspring clutches and found no evidence that females were favoring less related males or that less related males had higher fitness. However, males with greater heterozygosity sired more offspring. While the postcopulatory mechanisms underlying this pattern are not understood, female water dragons likely represent another example of reproduction through cryptic means (sperm selection/sperm competition) in a lizard, and through which they may ameliorate the effects of male-driven precopulatory sexual selection. PMID:25937911
Effect of Wagyu- versus Angus-sired calves on feedlot performance, carcass characteristics, and tenderness.

Science.gov (United States)

Radunz, A E; Loerch, S C; Lowe, G D; Fluharty, F L; Zerby, H N

2009-09-01

Wagyu-sired (n = 20) and Angus-sired (n = 19) steers and heifers were used to compare the effects of sire breed on feedlot performance, carcass characteristics, and meat tenderness. Calves were weaned at 138 +/- 5 d of age and individually fed a finishing diet consisting of 65% whole corn, 20% protein/vitamin/mineral supplement, and 15% corn silage on a DM basis. Heifers and steers were slaughtered at 535 and 560 kg of BW, respectively. Carcasses were ribbed between the 12th and 13th (USDA grading system) and the 6th and 7th ribs (Japanese grading system) to measure fat thickness, LM area (LMA), and intramuscular fat (IMF). Two steaks were removed from the 12th rib location and aged for 72 h and 14 d to determine Warner-Bratzler shear force and cooking loss. Sire breed x sex interactions were not significant (P > 0.05). Angus-sired calves had greater (P Angus. Sire breed did not affect (P > 0.20) HCW, 12th-rib fat, or USDA yield grade. Carcasses of Wagyu had greater (P = 0.0001) marbling scores at the 12th rib than those of Angus (770.9 vs. 597.3 +/- 41.01, respectively). Carcasses of Wagyu also had greater (P Angus, resulting in a greater proportion of carcasses grading Prime (65.0 vs. 21.1%; P = 0.006). Carcasses from Wagyu tended (P = 0.08) to have greater LMA at the 12th rib, whereas Angus carcasses had greater (P Angus and Wagyu had similar (P > 0.50) tenderness at aging times of 72 h and 14 d. Cooking loss was greater (P Angus than Wagyu steaks at 72 h and 14 d. Using Wagyu sires vs. Angus sires on British-based commercial cows combined with early weaning management strategies has the potential to produce a product with greater marbling, but is unlikely to significantly enhance tenderness.
Bacterial chondronecrosis with osteomyelitis in broilers: influence of sires and straight-run versus sex-separate rearing.

Science.gov (United States)

Wideman, R F; Al-Rubaye, A; Reynolds, D; Yoho, D; Lester, H; Spencer, C; Hughes, J D; Pevzner, I Y

2014-07-01

Two experiments (E1, E2) were conducted to compare the influence of sires (sire A on dam C vs. sire B on dam C) and straight-run versus sex-separate rearing on the incidence of bacterial chondronecrosis with osteomyelitis (BCO) in broilers. Fertile eggs from commercial breeder flocks were incubated and hatched at the University of Arkansas Poultry Research Hatchery. Male and female chicks were reared together (straight-run) or separately (sex-separate) in 3 × 3 m pens on litter or flat wire flooring with 65 (E1) or 60 (E2) birds per pen. Necropsies revealed lesions that are pathognomonic for BCO in ≥98% of the birds that became lame. The SigmaStat Z-test was used to compare cumulative BCO incidences through 8 wk of age. For birds reared on litter, the incidences of BCO were low regardless of cross or sex (range: 1.7 to 5.1%; P ≥ 0.6). Within a cross and sex, rearing the broilers straight-run versus sex-separate on wire flooring did not significantly affect the incidence of BCO. Significant incidences of BCO did not develop until after d 40. Males from the sire A cross developed a higher incidence of BCO than males from the sire B cross in E1 (27 vs. 17%, respectively; P = 0.009) but not in E2 (28.5 vs. 22.6%, respectively; P = 0.141). In both experiments, males from the sire A cross developed higher incidences of BCO than females from the sire B cross (27 vs. 11.9%, in E1; 28.5 vs. 14.8%, in E2). With the sexes pooled, broilers from the sire A cross consistently developed higher incidences of BCO than broilers from the sire B cross (21.4 vs. 14.9%, P = 0.005 in E1; 26.5 vs. 18.7%, P = 0.003 in E2). High susceptibilities to both femoral head (all femoral head necrosis = 66 to 85% incidences) and tibial head (all tibial head necrosis = 81 to 96% incidences) BCO lesions were demonstrated in lame birds from both sexes and crosses. This study supports a sire influence on the susceptibility of broilers to BCO. Sire lines can be chosen to reduce BCO susceptibility
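Comparisons of cumulative incidence such as 27 vs. 17% can be illustrated with a pooled two-proportion z-test. This is a sketch under assumptions: the abstract does not give the number of birds per group, so the counts below are hypothetical, and the exact procedure implemented in SigmaStat (for example any continuity correction) is not reproduced. The resulting P value therefore should not be expected to match the published one.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 54 of 200 lame birds (27%) vs. 34 of 200 (17%).
z, p = two_proportion_z_test(54, 200, 34, 200)
print(f"z = {z:.2f}, two-sided P = {p:.3f}")
```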
Live animal measurements, carcass composition and plasma hormone and metabolite concentrations in male progeny of sires differing in genetic merit for beef production.

Science.gov (United States)

Clarke, A M; Drennan, M J; McGee, M; Kenny, D A; Evans, R D; Berry, D P

2009-07-01

In genetic improvement programmes for beef cattle, the effect of selecting for a given trait or index on other economically important traits, or their predictors, must be quantified to ensure no deleterious consequential effects go unnoticed. The objective was to compare live animal measurements, carcass composition and plasma hormone and metabolite concentrations of male progeny of sires selected on an economic index in Ireland. This beef carcass index (BCI) is expressed in euros and based on weaning weight, feed intake, carcass weight and carcass conformation and fat scores. The index is used to aid in the genetic comparison of animals for the expected profitability of their progeny at slaughter. A total of 107 progeny from beef sires of high (n = 11) or low (n = 11) genetic merit for the BCI were compared in either a bull (slaughtered at 16 months of age) or steer (slaughtered at 24 months of age) production system, following purchase after weaning (8 months of age) from commercial beef herds. Data were analysed as a 2 × 2 factorial design (two levels of genetic merit by two production systems). Progeny of high BCI sires had heavier carcasses, greater (P animal value (obtained by multiplying carcass weight by carcass value, which was based on the weight of meat in each cut by its commercial value) than progeny of low BCI sires. Regression of progeny performance on sire genetic merit was also undertaken across the entire data set. In steers, the effect of BCI on carcass meat proportion, calculated carcass value (c/kg) and animal value was positive (P carcass fat proportion (P carcass weight followed the same trends as BCI. Muscularity scores, carcass meat proportion and calculated carcass value increased, whereas scanned fat depth, carcass fat and bone proportions decreased with increasing sire EPD for conformation score. The opposite association was observed for sire EPD for fat score. Results from this study show that selection using the BCI had positive
SDE based regression for random PDEs

KAUST Repository

Bayer, Christian

2016-01-01

A simulation based method for the numerical solution of PDE with random coefficients is presented. By the Feynman-Kac formula, the solution can be represented as conditional expectation of a functional of a corresponding stochastic differential equation driven by independent noise. A time discretization of the SDE for a set of points in the domain and a subsequent Monte Carlo regression lead to an approximation of the global solution of the random PDE. We provide an initial error and complexity analysis of the proposed method along with numerical examples illustrating its behaviour.
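The Feynman-Kac representation mentioned above can be illustrated on a much simpler problem than the one treated in the record. The sketch below assumes a one-dimensional backward heat equation u_t + 0.5 u_xx = 0 with terminal condition u(T, x) = g(x), for which u(t, x) = E[g(x + W_{T-t})] with W a Brownian motion. It estimates that expectation by plain Monte Carlo at a single point. There are no random coefficients, no time discretization and no regression step here, so this is not the method of the paper; the function name and parameters are hypothetical.

```python
import random

def feynman_kac_estimate(g, x, horizon, n_paths=20000, seed=0):
    """Monte Carlo estimate of E[g(x + W_horizon)] for Brownian motion W."""
    rng = random.Random(seed)
    sd = horizon ** 0.5  # W_horizon is N(0, horizon)
    total = 0.0
    for _ in range(n_paths):
        total += g(x + rng.gauss(0.0, sd))
    return total / n_paths

# For g(y) = y**2 the exact solution is x**2 + horizon, here 0.25 + 1.0 = 1.25.
estimate = feynman_kac_estimate(lambda y: y * y, x=0.5, horizon=1.0)
print(estimate)
```

The estimate carries sampling error of order 1/sqrt(n_paths), so it should only be compared with the exact value up to a tolerance.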
Principal component approach in variance component estimation for international sire evaluation

Directory of Open Access Journals (Sweden)

Jakobsen Jette

2011-05-01

Background The dairy cattle breeding industry is a highly globalized business, which needs internationally comparable and reliable breeding values of sires. The international Bull Evaluation Service, Interbull, was established in 1983 to respond to this need. Currently, Interbull performs multiple-trait across country evaluations (MACE) for several traits and breeds in dairy cattle and provides international breeding values to its member countries. Estimating parameters for MACE is challenging since the structure of datasets and conventional use of multiple-trait models easily result in over-parameterized genetic covariance matrices. The number of parameters to be estimated can be reduced by taking into account only the leading principal components of the traits considered. For MACE, this is readily implemented in a random regression model. Methods This article compares two principal component approaches to estimate variance components for MACE using real datasets. The methods tested were a REML approach that directly estimates the genetic principal components (direct PC) and the so-called bottom-up REML approach (bottom-up PC), in which traits are sequentially added to the analysis and the statistically significant genetic principal components are retained. Furthermore, this article evaluates the utility of the bottom-up PC approach to determine the appropriate rank of the (co)variance matrix. Results Our study demonstrates the usefulness of both approaches and shows that they can be applied to large multi-country models considering all concerned countries simultaneously. These strategies can thus replace the current practice of estimating the covariance components required through a series of analyses involving selected subsets of traits. Our results support the importance of using the appropriate rank in the genetic (co)variance matrix. Using too low a rank resulted in biased parameter estimates, whereas too high a rank did not result in
Comparison between a sire model and an animal model for genetic evaluation of fertility traits in Danish Holstein population

DEFF Research Database (Denmark)

Sun, C; Madsen, P; Nielsen, U S

2009-01-01

Comparisons between a sire model, a sire-dam model, and an animal model were carried out to evaluate the ability of the models to predict breeding values of fertility traits, based on data including 471,742 records from the first lactation of Danish Holstein cows, covering insemination years from 1995 to 2004. The traits in the analysis were days from calving to first insemination, calving interval, days open, days from first to last insemination, number of inseminations per conception, and nonreturn rate within 56 d after first service. The correlations between sire estimated breeding value... the results suggest that the animal model, rather than the sire model, should be used for genetic evaluation of fertility traits.
Approximating prediction uncertainty for random forest regression models

Science.gov (United States)

John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne

2016-01-01

Machine learning approaches such as random forest have increased for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches there is no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as...

Growth at fattening and carcass characteristics of D'man, Sardi and meat-sire crossbred lambs slaughtered at two stages of maturity.

Science.gov (United States)

Boujenane, Ismaïl

2015-10-01

The objective of this study was to evaluate growth at fattening, carcass characteristics and carcass measurements of 19 Sardi, 19 D'man and 52 meat-sire crossbred lambs of both sexes slaughtered at 50 and 70 % of mature weight. Crossbred lambs were born from Sardi, D'man and F1 Sardi × D'man ewes mated to meat-breed rams (Ile de France and Mérinos Précoce). Lambs of each group (breed type by sex) were chosen at random and slaughtered either at 50 or 70 % of mature weight. Sardi and D'man purebred lambs had significantly lower growth at fattening, pre-slaughter live weight, empty body weight, hot carcass weight, carcass conformation, carcass fatness, red offal, white offal, sub-products and non-carcass fat than meat-sire crossbred lambs. Differences were 42.2 g/day, 5.03 kg, 4.46 kg, 2.57 kg, 0.96, 0.59, 0.18 kg, 0.39 kg, 0.63 kg and 0.12 kg, respectively. B and Wr measurements of meat-sire crossbred lambs were higher than those of Sardi and D'man purebred lambs, whereas the F measurement was in favour of purebreds. Likewise, CC, G/F and Wr/Th indices of meat-sire crossbreds were higher than those of purebreds; however, the opposite was observed for L/G and Th/G indices. Maternal heterosis was positive and not significant for most traits, negative and not significant for few traits (P > 0.05), but negative and significant for the proportion of sub-products (P carcass characteristics of local sheep can be significantly improved by terminal crossbreeding.
Effects of herd origin, AI stud and sire identification on genetic evaluation of Holstein Friesian bulls

Directory of Open Access Journals (Sweden)

Giovanni Bittante

2010-01-01

The purpose of this study was to estimate the effects of herd origin of bull, AI stud and sire identification number (ID) on official estimated breeding values (EBV) for production traits of Holstein Friesian proven bulls. The data included 1,005 Italian Holstein-Friesian bulls, sons of 76 sires, born in 100 herds and progeny tested by 10 AI studs. Bulls were required to have date of first proof between September 1992 and September 1997, to be born in a herd with at least one other bull and to have sire and dam with official EBV when bull was selected for progeny testing. Records of sires with only one son were also discarded. The dependent variable analyzed was the official genetic evaluation for a “quantity and quality of milk” index (ILQ). The linear model to predict breeding values of bulls included the fixed class effects of herd origin of bull, AI testing organization, birth year of bull, and estimated breeding values of sire and dam, both as linear covariates. The R² of the model was 45% and a significant effect was found for genetic merit of sire (P for herd origin of bull (P nificant. The range of herd origin effect was 872 kg of ILQ. However, in this study, the causes of this result were not clear; it may be due to numerous factors, one of which may be preferential treatment on dams of bulls. Analyses of residuals on breeding value of proven bulls for ILQ showed a non significant effect of sire ID, after adjusting for parent average, herd origin effect and birth year effect. Although the presence of bias in genetic evaluation of dairy bulls is not evident, further research is recommended firstly to understand the reasons of the significant herd origin effect, secondly to monitor and guarantee the greatest accuracy and reliability of genetic evaluation procedures.
Breeding objectives for Angus and Charolais specialized sire lines ...

African Journals Online (AJOL)

Breeding indigenous cows to terminal sires may facilitate production of calves in the emerging sector that better meet commercial feedlot requirements. ... On average, relative emphasis given to breeding values for survival, direct weaning weight, postweaning daily gain, postweaning daily feed intake, dressing percent, and ...
Comparison of Holstein service-sire fertility for heifer and cow breedings with conventional and sexed semen.

Science.gov (United States)

Sire conception rate (SCR), a service-sire fertility evaluation implemented in August 2008, is based on up to 7 conventional-semen breedings for parities 1 through 5 (Ccow). The same procedure was used to derive SCR for other types of breedings: sexed semen for cows (Scow) and conventional semen and...
The effect of sire selection on the response of lambs to vaccination with irradiated Trichostrongylus colubriformis larvae

International Nuclear Information System (INIS)

Dineen, J.K.; Windon, R.G.

1980-01-01

Rams selected for responsiveness and unresponsiveness to vaccination with irradiated T. colubriformis larvae at an early age were mated to unselected random bred ewes. Progeny were vaccinated with 20,000 irradiated larvae at 8 and 12 weeks of age, given anthelmintic treatment at 16 weeks and challenged with 20,000 normal larvae at 17 weeks. The results, based on wether worm counts and ewe faecal egg counts, showed significant differences between responder and non-responder progeny. There was a significant correlation between worm counts and faecal egg counts of half-sibs from the same sire group. The occurrence of globule leucocytes was inversely related to worm burdens of wether progeny, however, no clear relationship was found with eosinophils. In vitro lymphocyte stimulation using T. colubriformis L 3 antigen, concanavalin A and lipopolysaccharide showed that statistically defined responder progeny, pooled from both responder and non-responder sire groups, gave higher responses than non-responder lambs after vaccination. The results confirm that genetically-determined factors are involved in the response of lambs to vaccination at an early age, and indicate that rapid genetic progress may be achieved in the type of mating usually carried out under field conditions. (author)
Doe productivity indices and sire effects of a heterogeneous rabbit ...

African Journals Online (AJOL)

IJAAAR

reproductive data obtained include annual productivity indices for each doe and sire family at birth, ... contributed to their productivity success in ..... susceptible to heat stress at temperatures above. 300c. ... Effects of weaning litter size and sex.
Effects of sires with different weight gain potentials and varying planes of nutrition on growth of growing-finishing pigs.

Science.gov (United States)

Ha, Duck-Min; Jung, Dae-Yun; Park, Man Jong; Park, Byung-Chul; Lee, C Young

2014-01-01

The present study was performed to investigate the effects of two groups of sires with 'medium' and 'high' weight gain potentials (M-sires and H-sires, respectively) on growth of their progenies on varying planes of nutrition during the growing-finishing period. The ADG of the M-sires' progeny was greater (P plane of nutrition (H plane) followed by the medium (M) and low (L) planes (0.65, 0.61, and 0.51 kg, respectively; P planes vs. L plane (0.63, 0.62, and 0.54 kg, respectively). The ADG of pigs on the M or H plane during the grower phase and switched to the H plane thereafter (M-to-H or H-to-H planes) was greater than that of pigs on the L-to-L planes (0.99 vs. 0.78 kg) during the early finisher phase in the M-sires' progeny (P planes did not differ from that of pigs on the M-to-M or H-to-M planes (0.94 vs. 0.96 kg). Results suggest that the H-to-H or H-to-M planes and M-to-M or M-to-L planes are optimal for maximal growth of the M- and H-sires' progenies, respectively.
Conditional Monte Carlo randomization tests for regression models.

Science.gov (United States)

Parhat, Parwen; Rosenberger, William F; Diao, Guoqing

2014-08-15

We discuss the computation of randomization tests for clinical trials of two treatments when the primary outcome is based on a regression model. We begin by revisiting the seminal paper of Gail, Tan, and Piantadosi (1988), and then describe a method based on Monte Carlo generation of randomization sequences. The tests based on this Monte Carlo procedure are design based, in that they incorporate the particular randomization procedure used. We discuss permuted block designs, complete randomization, and biased coin designs. We also use a new technique by Plamadeala and Rosenberger (2012) for simple computation of conditional randomization tests. Like Gail, Tan, and Piantadosi, we focus on residuals from generalized linear models and martingale residuals from survival models. Such techniques do not apply to longitudinal data analysis, and we introduce a method for computation of randomization tests based on the predicted rate of change from a generalized linear mixed model when outcomes are longitudinal. We show, by simulation, that these randomization tests preserve the size and power well under model misspecification. Copyright © 2014 John Wiley & Sons, Ltd.
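The Monte Carlo idea in this abstract can be illustrated with a minimal sketch. It covers only the simplest case mentioned (an unconditional test under complete randomization, applied to residuals from an already fitted model); it is not the authors' conditional procedure, and the function name, the test statistic (absolute difference in mean residuals) and the number of draws are assumptions made for illustration.

```python
import random

def randomization_test(residuals, assignment, n_draws=5000, seed=0):
    """Monte Carlo randomization p-value for a two-arm trial.

    residuals  -- model residuals, one per subject (assumed already computed)
    assignment -- observed treatment labels, 1 or 0, one per subject
    """
    rng = random.Random(seed)

    def statistic(assign):
        # Absolute difference in mean residuals between the two arms
        a = [r for r, g in zip(residuals, assign) if g == 1]
        b = [r for r, g in zip(residuals, assign) if g == 0]
        if not a or not b:
            return 0.0  # degenerate draw: one arm is empty
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = statistic(assignment)
    extreme = 0
    for _ in range(n_draws):
        # Complete randomization: an independent fair coin flip per subject
        redraw = [rng.randint(0, 1) for _ in residuals]
        if statistic(redraw) >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive
    return (extreme + 1) / (n_draws + 1)
```

To make the test design based for a permuted block or biased coin design, the redraw step would be replaced by a generator for that randomization procedure.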
INFLUENCE OF AGE, GENDER AND SIRE LINE ON YOUNG CATTLE BEHAVIOUR TRAITS

Directory of Open Access Journals (Sweden)

Jan Broucek

2013-03-01

The aim of this study was to test effects of age, gender, and sire line on dairy cattle behaviour. We analyzed results of ethological tests for 40 Holstein animals (23 males and 17 females), offspring of three sires. Maintenance behaviour was observed at the age of 90, 130 and 170 days. Behaviour in the maze was tested at the age of 119 days, and an open-field test was applied at the age of 124, 168, and 355 days. Social behaviour was determined during feeding on the 155th day of age. The times and the number of periods in all activities of maintenance behaviour changed significantly (P<0.001) with age. The times of total lying, lying with ruminating, total ruminating, and feeding increased, and the time of standing decreased, from the age of 90 days to the age of 170 days. Calves spent more time lying on the left side than on the right side. The number of ruminating periods increased with age. Eating periods decreased from the age of 90 to 170 days. The most lying periods were recorded at the age of 130 days. Differences between sexes were found in total time of lying and lying on the right side (P<0.05); males rested longer and had more periods of lying than females. Comparing the behaviour of the calves by sire, we found differences in times of feeding (P<0.001), total lying, standing (P<0.01), and lying on the left side (P<0.05). Sire genotypes were significantly manifested in the number of periods of total lying (P<0.001), lying on the right side, feeding (P<0.01), and standing (P<0.05). Males stood in the first part of the maze longer than females (P<0.001), and the length of total standing was also longer in bulls (P<0.01). Heifers took a shorter time to leave the maze than bulls (P<0.05). Sire lineages significantly
Assessment of glucose homeostasis in crossbred steer progeny sired by Brahman bulls that experienced prenatal transportation stress

Science.gov (United States)

The objective of this experiment was to assess glucose homeostasis of crossbred male progeny whose Brahman sires experienced prenatal transportation stress (PS) in utero. Sixteen steers (PNS group) sired by 3 PS bulls gestating dams were transported for 2 h at 60, 80, 100, 120, and 140 ± 5 d of gest...
The Initial Regression Statistical Characteristics of Intervals Between Zeros of Random Processes

Directory of Open Access Journals (Sweden)

V. K. Hohlov

2014-01-01

The article substantiates the initial regression statistical characteristics of intervals between zeros of realizations of random processes and studies their properties, allowing the use of these features in autonomous information systems (AIS) of near location (NL). Coefficients of the initial regression (CIR) to minimize the residual sum of squares of multiple initial regression views are justified on the basis of vector representations associated with a random vector notion of analyzed signal parameters. It is shown that even with no covariance-based private CIR it is possible to predict one random variable through another with respect to the deterministic components. The paper studies the dependence of the CIR of interval sizes between zeros of a narrowband wide-sense stationary random process on its energy spectrum. Particular CIR for random processes with Gaussian and rectangular energy spectra are obtained. It is shown that the considered CIRs do not depend on the average frequency of spectra, are determined by the relative bandwidth of the energy spectra, and weakly depend on the type of spectrum. CIR properties enable its use as an informative parameter when implementing temporary regression methods of signal processing, invariant to the average rate and variance of the input implementations. We consider estimates of the average energy spectrum frequency of the random stationary process by calculating the length of the time interval corresponding to the specified number of intervals between zeros. It is shown that the relative variance in estimation of the average energy spectrum frequency of stationary random process with increasing relative bandwidth ceases to depend on the last process implementation in processing above ten intervals between zeros. The obtained results can be used in the AIS NL to solve the tasks of detection and signal recognition, when a decision is made in conditions of unknown mathematical expectations on a limited observation
Paternal Retrieval Behavior Regulated by Brain Estrogen Synthetase (Aromatase) in Mouse Sires that Engage in Communicative Interactions with Pairmates.

Science.gov (United States)

Akther, Shirin; Huang, Zhiqi; Liang, Mingkun; Zhong, Jing; Fakhrul, Azam A K M; Yuhi, Teruko; Lopatina, Olga; Salmina, Alla B; Yokoyama, Shigeru; Higashida, Chiharu; Tsuji, Takahiro; Matsuo, Mie; Higashida, Haruhiro

2015-01-01

Parental behaviors involve complex social recognition and memory processes and interactive behavior with children that can greatly facilitate healthy human family life. Fathers play a substantial role in child care in a small but significant number of mammals, including humans. However, the brain mechanism that controls male parental behavior is much less understood than that controlling female parental behavior. Fathers of non-monogamous laboratory ICR mice are an interesting model for examining the factors that influence paternal responsiveness because sires can exhibit maternal-like parental care (retrieval of pups) when separated from their pups along with their pairmates because of olfactory and auditory signals from the dams. Here we tested whether paternal behavior is related to femininity by the aromatization of testosterone. For this purpose, we measured the immunoreactivity of aromatase [cytochrome P450 family 19 (CYP19)], which synthesizes estrogen from androgen, in nine brain regions of the sire. We observed higher levels of aromatase expression in these areas of the sire brain when they engaged in communicative interactions with dams in separate cages. Interestingly, the number of nuclei with aromatase immunoreactivity in sires left together with maternal mates in the home cage after pup-removing was significantly larger than that in sires housed with a whole family. The capacity of sires to retrieve pups was increased following a period of 5 days spent with the pups as a whole family after parturition, whereas the acquisition of this ability was suppressed in sires treated daily with an aromatase inhibitor. The results demonstrate that the dam significantly stimulates aromatase in the male brain and that the presence of the pups has an inhibitory effect on this increase. These results also suggest that brain aromatization regulates the initiation, development, and maintenance of paternal behavior in the ICR male mice.
Comparing spatial regression to random forests for large ...

Science.gov (United States)

Environmental data may be “large” due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates, whereas spatial regression, when using reduced rank methods, has a reputation for good predictive performance when using many records. In this study, we compare these two techniques using a data set containing the macroinvertebrate multimetric index (MMI) at 1859 stream sites with over 200 landscape covariates. Our primary goal is predicting MMI at over 1.1 million perennial stream reaches across the USA. For spatial regression modeling, we develop two new methods to accommodate large data: (1) a procedure that estimates optimal Box-Cox transformations to linearize covariate relationships; and (2) a computationally efficient covariate selection routine that takes into account spatial autocorrelation. We show that our new methods lead to cross-validated performance similar to random forests, but that there is an advantage for spatial regression when quantifying the uncertainty of the predictions. Simulations are used to clarify advantages for each method. This research investigates different approaches for modeling and mapping national stream condition. We use MMI data from the EPA's National Rivers and Streams Assessment and predictors from StreamCat (Hill et al., 2015). Previous studies have focused on modeling the MMI condition classes (i.e., good, fair, and po
Edificio de oficinas. Gualca – Sire, Turín

Directory of Open Access Journals (Sweden)

Casalegno, G.

1958-09-01

Although it can be considered a single architectural unit, since it occupies a single city block, the building is in fact divided into two nearly equal parts: one, facing south, owned by the company "Gualca", designed by Casalegno and devoted to rental units, and the other, owned by the company "Sire", designed by Ceresa and Levi-Montalcini.
Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.

Science.gov (United States)

Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

2016-01-01

Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.
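After a covariate has been imputed several times and the logistic model fitted to each completed data set, the per-imputation results have to be combined. The abstract does not say which pooling rule was used; the sketch below assumes the standard combining rules usually attributed to Rubin, and the function and argument names are illustrative only.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets.

    estimates -- coefficient estimate from each imputed data set
    variances -- squared standard error of that estimate in each data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total
```

The between-imputation term is what separates multiple imputation from single imputation or substitution, which treat the filled-in values as if they had been observed.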
The inclusion of herd-year-season by sire interaction in the ...

African Journals Online (AJOL)

The inclusion of herd-year-season by sire interaction in the estimation of genetic parameters in Bonsmara cattle. FWC Neser, KV Konstantinov, GJ Erasmus. No abstract.
Application of random regression models to the genetic evaluation ...

African Journals Online (AJOL)

The model included fixed regression on AM (range from 30 to 138 mo) and the effect of herd-measurement date concatenation. Random parts of the model were RRM coefficients for additive and permanent environmental effects, while residual effects were modelled to account for heterogeneity of variance by AY. Estimates ...
Genetic Parameter Estimates for Metabolizing Two Common Pharmaceuticals in Swine

Directory of Open Access Journals (Sweden)

Jeremy T. Howard

2018-02-01

In livestock, the regulation of drugs used to treat livestock has received increased attention and it is currently unknown how much of the phenotypic variation in drug metabolism is due to the genetics of an animal. Therefore, the objective of the study was to determine the amount of phenotypic variation in fenbendazole and flunixin meglumine drug metabolism due to genetics. The population consisted of crossbred female and castrated male nursery pigs (n = 198) that were sired by boars represented by four breeds. The animals were spread across nine batches. Drugs were administered intravenously and blood collected a minimum of 10 times over a 48 h period. Genetic parameters for the parent drug and metabolite concentration within each drug were estimated based on pharmacokinetics (PK) parameters or concentrations across time utilizing a random regression model. The PK parameters were estimated using a non-compartmental analysis. The PK model included fixed effects of sex and breed of sire along with random sire and batch effects. The random regression model utilized Legendre polynomials and included a fixed population concentration curve, sex, and breed of sire effects along with a random sire deviation from the population curve and batch effect. The sire effect included the intercept for all models except for the fenbendazole metabolite (i.e., intercept and slope). The mean heritability across PK parameters for the fenbendazole and flunixin meglumine parent drug (metabolite) was 0.15 (0.18) and 0.31 (0.40), respectively. For the parent drug (metabolite), the mean heritability across time was 0.27 (0.60) and 0.14 (0.44) for fenbendazole and flunixin meglumine, respectively. The errors surrounding the heritability estimates for the random regression model were smaller compared to estimates obtained from PK parameters. Across both the PK and plasma drug concentration across model, a moderate heritability was estimated. The model that utilized the plasma drug
Genetic Parameter Estimates for Metabolizing Two Common Pharmaceuticals in Swine

Science.gov (United States)

Howard, Jeremy T.; Ashwell, Melissa S.; Baynes, Ronald E.; Brooks, James D.; Yeatts, James L.; Maltecca, Christian

2018-01-01

In livestock, the regulation of drugs used to treat livestock has received increased attention and it is currently unknown how much of the phenotypic variation in drug metabolism is due to the genetics of an animal. Therefore, the objective of the study was to determine the amount of phenotypic variation in fenbendazole and flunixin meglumine drug metabolism due to genetics. The population consisted of crossbred female and castrated male nursery pigs (n = 198) that were sired by boars represented by four breeds. The animals were spread across nine batches. Drugs were administered intravenously and blood collected a minimum of 10 times over a 48 h period. Genetic parameters for the parent drug and metabolite concentration within each drug were estimated based on pharmacokinetics (PK) parameters or concentrations across time utilizing a random regression model. The PK parameters were estimated using a non-compartmental analysis. The PK model included fixed effects of sex and breed of sire along with random sire and batch effects. The random regression model utilized Legendre polynomials and included a fixed population concentration curve, sex, and breed of sire effects along with a random sire deviation from the population curve and batch effect. The sire effect included the intercept for all models except for the fenbendazole metabolite (i.e., intercept and slope). The mean heritability across PK parameters for the fenbendazole and flunixin meglumine parent drug (metabolite) was 0.15 (0.18) and 0.31 (0.40), respectively. For the parent drug (metabolite), the mean heritability across time was 0.27 (0.60) and 0.14 (0.44) for fenbendazole and flunixin meglumine, respectively. The errors surrounding the heritability estimates for the random regression model were smaller compared to estimates obtained from PK parameters. Across both the PK and plasma drug concentration across model, a moderate heritability was estimated. The model that utilized the plasma drug
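A short sketch may help show how a random regression sire model yields a heritability at each sampling time. It assumes normalized Legendre polynomials on time rescaled to [-1, 1] and the classical half-sib relation in which the sire variance is one quarter of the additive genetic variance. The (co)variance values in the usage note are invented for illustration and are not estimates from the study, and the abstract does not state the exact heritability formula the authors used.

```python
import math

def legendre_basis(t, t_min, t_max):
    """Normalized Legendre polynomials of order 0 to 2 at time t."""
    x = 2.0 * (t - t_min) / (t_max - t_min) - 1.0  # rescale time to [-1, 1]
    return [
        math.sqrt(0.5),                              # intercept term
        math.sqrt(1.5) * x,                          # linear (slope) term
        math.sqrt(2.5) * 0.5 * (3.0 * x * x - 1.0),  # quadratic term
    ]

def sire_variance(phi, K):
    """Sire variance at one time point: phi' K phi, with K the (co)variance
    matrix of the random sire regression coefficients."""
    n = len(K)
    return sum(phi[i] * K[i][j] * phi[j] for i in range(n) for j in range(n))

def sire_model_heritability(var_sire, var_other):
    """h2 = 4 * sire variance / phenotypic variance (half-sib assumption)."""
    return 4.0 * var_sire / (var_sire + var_other)
```

For example, with a hypothetical intercept-and-slope matrix `K = [[0.02, 0.0], [0.0, 0.01]]` and a 48 h sampling window, `sire_variance(legendre_basis(48, 0, 48)[:2], K)` gives the sire variance at the last sampling time, which `sire_model_heritability` then converts to a heritability once the remaining (batch plus residual) variance is supplied.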
Comparison of several methods of sires evaluation for total milk ...

African Journals Online (AJOL)

A total of 956 lactation records of Holstein cows kept at Kaa Albon station, Imuran Governorate, Yemen during the period from 1991 to 2003 were used to investigate the effect of some genetic and non-genetic factors (Sire, parity, season of calving, year of calving and age at first calving as covariate) on the Total Milk Yield ...

Deriving Genomic Breeding Values for Residual Feed Intake from Covariance Functions of Random Regression Models

DEFF Research Database (Denmark)

Strathe, Anders B; Mark, Thomas; Nielsen, Bjarne

2014-01-01

Random regression models were used to estimate covariance functions between cumulated feed intake (CFI) and body weight (BW) in 8424 Danish Duroc pigs. Random regressions on second order Legendre polynomials of age were used to describe genetic and permanent environmental curves in BW and CFI...
A comparison of random forest regression and multiple linear regression for prediction in neuroscience.

Science.gov (United States)

Smith, Paul F; Ganesh, Siva; Liu, Ping

2013-10-30

Regression is a common statistical tool for prediction in neuroscience. However, linear regression is by far the most common form of regression used, with regression trees receiving comparatively little attention. In this study, the results of conventional multiple linear regression (MLR) were compared with those of random forest regression (RFR), in the prediction of the concentrations of 9 neurochemicals in the vestibular nucleus complex and cerebellum that are part of the l-arginine biochemical pathway (agmatine, putrescine, spermidine, spermine, l-arginine, l-ornithine, l-citrulline, glutamate and γ-aminobutyric acid (GABA)). The R(2) values for the MLRs were higher than the proportion of variance explained values for the RFRs: 6/9 of them were ≥ 0.70 compared to 4/9 for RFRs. Even the variables that had the lowest R(2) values for the MLRs, e.g. ornithine (0.50) and glutamate (0.61), had much lower proportion of variance explained values for the RFRs (0.27 and 0.49, respectively). The RSE values for the MLRs were lower than those for the RFRs in all but two cases. In general, MLRs seemed to be superior to the RFRs in terms of predictive value and error. In the case of this data set, MLR appeared to be superior to RFR in terms of its explanatory value and error. This result suggests that MLR may have advantages over RFR for prediction in neuroscience with this kind of data set, but that RFR can still have good predictive value in some cases. Copyright © 2013 Elsevier B.V. All rights reserved.
Paternal retrieval behavior regulated by brain estrogen synthetase (aromatase) in mouse sires that engage in communicative interactions with pairmates

Directory of Open Access Journals (Sweden)

Shirin Akther

2015-12-01

Parental behaviors involve complex social recognition and memory processes and interactive behavior with children that can greatly facilitate healthy human family life. Fathers play a substantial role in child care in a small but significant number of mammals, including humans. However, the brain mechanism that controls male parental behavior is much less understood than that controlling female parental behavior. Fathers of non-monogamous laboratory ICR mice are an interesting model for examining the factors that influence paternal responsiveness because sires can exhibit maternal-like parental care (retrieval of pups) when separated from their pups along with their pairmates because of olfactory and auditory signals from the dams. Here we tested whether paternal behavior is related to femininity by the aromatization of testosterone. For this purpose, we measured the immunoreactivity of aromatase [cytochrome P450 family 19 (CYP19)], which synthesizes estrogen from androgen, in nine brain regions of the sire. We observed higher levels of aromatase expression in these areas of the sire brain when they engaged in communicative interactions with dams in separate cages. The capacity of sires to retrieve pups was increased following a period of five days spent with the pups as a whole family after parturition, whereas the acquisition of this ability was suppressed in sires treated daily with an aromatase inhibitor. These results suggest that brain aromatization regulates the initiation, development, and maintenance of paternal behavior in the ICR mice.
Comparison of several methods of sires evaluation for total milk ...

African Journals Online (AJOL)

2015-01-24

Comparison of several methods of sires evaluation for total milk yield in a herd of Holstein cows in Yemen. F.R. Al-Samarai, Y.K. Abdulrahman, F.A. Mohammed, F.H. Al-Zaidi and N.N. Al-Anbari. Department of Veterinary Public Health, College of Veterinary Medicine, University of Baghdad, Iraq.
Effects of herd origin, AI stud and sire identification on genetic evaluation of Holstein Friesian bulls

OpenAIRE

Giovanni Bittante; Paolo Carnier; Luigi Gallo Gallo; Riccardo Dal Zotto; Martino Cassandro

2010-01-01

The purpose of this study was to estimate the effects of herd origin of bull, AI stud and sire identification number (ID) on official estimated breeding values (EBV) for production traits of Holstein Friesian proven bulls. The data included 1,005 Italian Holstein-Friesian bulls, sons of 76 sires, born in 100 herds and progeny tested by 10 AI studs. Bulls were required to have date of first proof between September 1992 and September 1997, to be born in a herd with at least on...
Growth curves of crossbred cows sired by Hereford, Angus, Belgian Blue, Brahman, Boran, and Tuli bulls, and the fraction of mature body weight and height at puberty.

Science.gov (United States)

Freetly, H C; Kuehn, L A; Cundiff, L V

2011-08-01

The objective of this study was to evaluate the growth curves of females to determine if mature size and relative rates of maturation among breeds differed. Body weight and hip height data were fitted to the nonlinear function BW = f(age) = A - B·e^(k×age), where A is an estimate of mature BW and k determines the rate that BW or height moves from B to A. Cows represented progeny from 28 Hereford, 38 Angus, 25 Belgian Blue, 34 Brahman, 8 Boran, and 9 Tuli sires. Bulls from these breeds were mated by AI to Angus, Hereford, and MARC III composite (1/4 Angus, 1/4 Hereford, 1/4 Red Poll, and 1/4 Pinzgauer) cows to produce calves in 1992, 1993, and 1994. These matings resulted in 516 mature cows whose growth curves were subsequently evaluated. Hereford-sired cows tended to have heavier mature BW, as estimated by parameter A, than Angus- (P=0.09) and Brahman-sired cows (P=0.06), and were heavier than the other breeds (P Angus-sired cows were heavier than Boran- (P Angus-sired cows did not differ from Brahman-sired cows (P=0.94). Brahman-sired cows had a heavier mature BW than Boran- (P Angus-sired cows matured faster (k) than cows sired by Hereford (P=0.03), Brahman (P Angus-sired cows (P=0.09), and had reached a greater proportion of their mature BW at puberty than had Hereford- (P < 0.001), Tuli- (P < 0.001), and Belgian Blue-sired cows (P < 0.001). Within species of cattle, the relative range in proportion of mature BW at puberty (Bos taurus 0.56 through 0.58, and Bos indicus 0.60) was highly conserved, suggesting that proportion of mature BW is a more robust predictor of age at puberty across breeds than is absolute weight or age. © 2011 American Society of Animal Science. All rights reserved.
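The nonlinear growth function in this abstract, and the "proportion of mature BW" derived from it, can be evaluated with a few lines of code. This is a sketch only: with the sign convention as printed, k has to be negative for BW to approach A, and the parameter values in the usage note are hypothetical rather than estimates from the study.

```python
import math

def body_weight(age, A, B, k):
    """BW = A - B * exp(k * age); A is the asymptotic (mature) weight.
    With this sign convention k must be negative for BW to approach A."""
    return A - B * math.exp(k * age)

def fraction_of_mature(age, A, B, k):
    """Proportion of mature BW reached at a given age (e.g. age at puberty)."""
    return body_weight(age, A, B, k) / A
```

With hypothetical values A = 600 kg, B = 560 kg and k = -0.003 per day, `body_weight(0, 600.0, 560.0, -0.003)` returns the implied birth weight A - B = 40 kg, and `fraction_of_mature(365, 600.0, 560.0, -0.003)` gives the share of mature weight reached at one year of age.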
Comparison of muscle fibre characteristics and production traits among offspring from Meishan dams mated to different sires

Directory of Open Access Journals (Sweden)

Ki-Chang Hong

2010-01-01

This study evaluated how various porcine sires affected muscle fibre characteristics, with respect to production traits. Sires from Berkshire, Duroc, Meishan, and Yorkshire pigs were mated to Meishan dams (BM, DM, MM, and YM offspring, respectively). A total of 96 pigs were evaluated for muscle fibre characteristics and production traits. The progeny from Duroc and Yorkshire sires had the greatest number of total fibres (P<0.05) and exhibited less backfat thickness (P<0.001) and larger loin muscle areas (P<0.05) than BM pigs. The DM and BM crossbreds showed higher marbling (P<0.01) and colour scores (P<0.05), as well as lower shear force scores (P<0.001). The MM pigs had greater proportional area of type IIb muscle fibres (P<0.05), and also displayed higher drip loss (P<0.01), higher lightness (P<0.001), and a greater incidence of PSE pork (pale, soft, and exudative; 25%) than DM, BM, and YM. These results showed that a greater number of total muscle fibres without increasing the cross sectional area of fibres improved lean meat production, and that a lower proportion of type IIb fibres was associated with better meat quality. For these reasons, the Duroc sire × Meishan dam crossbreed emerged as the most appropriate mating type examined herein to simultaneously enhance both lean meat production and meat quality.
Analysis of lines and breeds of sires in the breeding of the Czech warmblood horses based on grading their offspring in rearing facilities for testing young horses (RFT)

Directory of Open Access Journals (Sweden)

Hana Černohorská

2013-01-01

The objective of the present study was to evaluate the effect of the breed of sire and line of sire on grading of the body conformation and performance of colts of warmblood horses in rearing facilities for testing young horses (RFT). The groundwork database contained data from 2001 to 2011 from nine RFTs. The database was processed statistically using the GLM method to assess the statistical significance of the effect of the breed of the sire and line of the sire on body conformation and performance of the colts. By multiple comparisons of the individual effects using the Tukey-B method we discovered statistically significant differences in the body conformation and performance of colts of sires among the respective breeds and lines. The performance of the offspring of Dutch warmblood, Hanoverian horse and Holsteiner horse sires is better than of the offspring of sires of the Thoroughbred, Czech warmblood and Selle Francais. The conformation of the offspring by sires of the Holsteiner horse and Hanoverian horse breeds is superior to that of offspring by sires of the Selle Francais and Czech warmblood. The mechanics of movement of the offspring of the 2300 Shagya XVIII-Báb. line is inferior to the offspring of the following lines: 3100 Adeptus xx, 67 Dark Ronald, 1000 Der Lowe xx, 3250 Dwinger 3257, 4800 Ladykiller xx, Orange Peel xx – Alme Z, 1100 Przedswit VI-Rad., 4900 Rantzau xx – Cor De La Bryere, 4600 Rittersporn xx – Ramzes 4028, 60 St. Simon and 88 Teddy. The effect of the line of the sires on the body conformation of colts has not been proved.
Sire breed and breed genotype of dam effects in crossbreeding beef ...

African Journals Online (AJOL)

Cows bred to Afrikaner bulls were less (P < 0.05) productive than cows bred to other Bos taurus sires. An increase in proportion Afrikaner breeding in dam resulted in longer calving intervals and a decline in cow productivity, but these differences were not always significant. A breeding strategy for the retainment of superior ...
Influence of sire breed on the interplay among rumen microbial populations inhabiting the rumen liquid of the progeny in beef cattle.

Directory of Open Access Journals (Sweden)

Emma Hernandez-Sanabria

This study aimed to evaluate whether the host genetic background impacts the ruminal microbial communities of the progeny of sires from three different breeds under different diets. Eighty five bacterial and twenty eight methanogen phylotypes from 49 individuals of diverging sire breed (Angus, ANG; Charolais, CHA; and Hybrid, HYB), fed high energy density (HE) and low energy density (LE) diets, were determined and correlated with breed, rumen fermentation and phenotypic variables, using multivariate statistical approaches. When bacterial phylotypes were compared between diets, ANG offspring showed the lowest number of diet-associated phylotypes, whereas CHA and HYB progenies had seventeen and twenty-three diet-associated phylotypes, respectively. For the methanogen phylotypes, there were no sire breed-associated phylotypes; however, seven phylotypes were significantly different among breeds on either diet (P<0.05). Sire breed did not influence the metabolic variables measured when high energy diet was fed. A correlation matrix of all pairwise comparisons among frequencies of bacterial and methanogen phylotypes uncovered their relationships with sire breed. A cluster containing methanogen phylotypes M16 (Methanobrevibacter gottschalkii) and M20 (Methanobrevibacter smithii), and bacterial phylotype B62 (Robinsoniella sp.) in Angus offspring fed low energy diet reflected the metabolic interactions among microbial consortia. The clustering of the phylotype frequencies from the three breeds indicated that phylotypes detected in CHA and HYB progenies are more similar to each other than to those in ANG animals. Our results revealed that the frequency of particular microbial phylotypes in the progeny of cattle may be influenced by the sire breed when different diets are fed and ultimately further impact host metabolic functions, such as feed efficiency.
Influence of dietary amino acid level on chemical body composition and performance of growing-finishing boars of two sire lines.

Science.gov (United States)

Otten, Caroline; Berk, Andreas; Müller, Simone; Weber, Manfred; Dänicke, Sven

2013-12-01

Little information is available concerning the chemical body composition of growing-finishing boars. For that reason, a total of 26 entire male pigs (boars) of two different Piétrain sire lines were fed with different levels of dietary essential amino acids (EAA) and the influence of this treatment on performance and chemical body composition was evaluated. In addition, an initial group of eight boars (n = 4 per sire line) was slaughtered at approximately 21 kg live weight (LW). The other 26 boars were fed three different diets containing 11.5, 13.2 and 14.9 g lysine/kg during the grower period and 9.0, 10.4, 11.7 g lysine/kg during the finisher period, respectively. Other EAA were added in relation to lysine (Lys: Met + Cys: Thr: Trp: Val = 1: 0.60: 0.65: 0.18: 0.75). At a LW of approximately 122 kg these 26 boars (six groups with three to seven animals each) were also slaughtered. The effects of EAA level and sire line on fattening and slaughter performance were recorded, and body and weight gain composition were analysed. There were no significant effects of EAA level on performance or on chemical body composition. Boars sired by Piétrain line 1 demonstrated increased lean meat content and protein body content (p < 0.05) as compared to Piétrain line 2-sired boars.
Multilevel covariance regression with correlated random effects in the mean and variance structure.

Science.gov (United States)

Quintero, Adrian; Lesaffre, Emmanuel

2017-09-01

Multivariate regression methods generally assume a constant covariance matrix for the observations. When a heteroscedastic model is needed, the parametric and nonparametric covariance regression approaches in the literature can be restrictive. We propose a multilevel regression model for the mean and covariance structure, including random intercepts in both components and allowing for correlation between them. The implied conditional covariance function can be different across clusters as a result of the random effect in the variance structure. In addition, allowing for correlation between the random intercepts in the mean and covariance makes the model convenient for skewedly distributed responses. Furthermore, it permits us to analyse directly the relation between the mean response level and the variability in each cluster. Parameter estimation is carried out via Gibbs sampling. We compare the performance of our model to other covariance modelling approaches in a simulation study. Finally, the proposed model is applied to the RN4CAST dataset to identify the variables that impact burnout of nurses in Belgium. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Bumblebee workers from different sire groups vary in susceptibility to parasite infection

DEFF Research Database (Denmark)

Baer, Boris; Schmid-Hempel, Paul

2003-01-01

is so far only supported indirectly. Here we tested this crucial assumption using data from a study on the bumblebee Bombus terrestris L. with queens inseminated with sperm of either one or several males that originated from different sire groups (i.e. groups of brothers). We found that, under field...
Genetic parameters of calving ease using sire-maternal grandsire model in Korean Holsteins

Directory of Open Access Journals (Sweden)

Mahboob Alam

2017-09-01

Objective: Calving ease (CE) is a complex reproductive trait of economic importance in dairy cattle. This study aimed to investigate the genetic merits of CE for Holsteins in Korea. Methods: A total of 297,614 field records of CE, from 2000 to 2015, from first parity Holstein heifers were recorded initially. After necessary data pruning such as age at first calving (18 to 42 mo), gestation length, and presence of sire information, final datasets for CE consisted of 147,526 and 132,080 records for service sire calving ease (SCE) and daughter calving ease (DCE) evaluations, respectively. The CE categories were ordered and scores ranged from CE1 to CE5 (CE1, easy; CE2, slight assistance; CE3, moderate assistance; CE4, difficult calving; CE5, extreme difficulty calving). A linear transformation of CE score was obtained on each category using Snell procedure, and a scaling factor was applied to attain the spread between 0 (CE5) and 100% (CE1). A sire-maternal grandsire model analysis was performed using ASREML 3.0 software package. Results: The estimated direct heritabilities (h2) from SCE and DCE evaluations were 0.11±0.01 and 0.08±0.01, respectively. Maternal h2 estimates were 0.05±0.02 and 0.04±0.01 from SCE and DCE approaches, respectively. Estimates of genetic correlations between direct and maternal genetic components were −0.68±0.09 (SCE) and −0.71±0.09 (DCE). The average direct genetic effect increased over time, whereas average maternal effect was low and consistent. The estimated direct predicted transmitting ability (PTA) was desirable and increasing over time, but the maternal PTA was undesirable and decreasing. Conclusion: The evidence on sufficient genetic variances in this study could reflect a possible selection improvement over time regarding ease of calving. It is expected that the estimated genetic parameters could be a valuable resource to formulate sire selection and breeding plans which would be directed towards the reduction of
Comparing spatial regression to random forests for large environmental data sets

Science.gov (United States)

Environmental data may be “large” due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates, whereas spatial regression, when using reduced rank methods, has a reputatio...
Random regression models for daily feed intake in Danish Duroc pigs

DEFF Research Database (Denmark)

Strathe, Anders Bjerring; Mark, Thomas; Jensen, Just

The objective of this study was to develop random regression models and estimate covariance functions for daily feed intake (DFI) in Danish Duroc pigs. A total of 476201 DFI records were available on 6542 Duroc boars between 70 to 160 days of age. The data originated from the National test station......-year-season, permanent, and animal genetic effects. The functional form was based on Legendre polynomials. A total of 64 models for random regressions were initially ranked by BIC to identify the approximate order for the Legendre polynomials using AI-REML. The parsimonious model included Legendre polynomials of 2nd...... order for genetic and permanent environmental curves and a heterogeneous residual variance, allowing the daily residual variance to change along the age trajectory due to scale effects. The parameters of the model were estimated in a Bayesian framework, using the RJMC module of the DMU package, where...
Methods for identifying SNP interactions: a review on variations of Logic Regression, Random Forest and Bayesian logistic regression.

Science.gov (United States)

Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula

2011-01-01

Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets come the inherent challenges of new methods of statistical analysis and modeling. Considering that a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.
A sex-chromosome inversion causes strong overdominance for sperm traits that affect siring success.

Science.gov (United States)

Knief, Ulrich; Forstmeier, Wolfgang; Pei, Yifan; Ihle, Malika; Wang, Daiping; Martin, Katrin; Opatová, Pavlína; Albrechtová, Jana; Wittig, Michael; Franke, Andre; Albrecht, Tomáš; Kempenaers, Bart

2017-08-01

Male reproductive success depends on the competitive ability of sperm to fertilize the ova, which should lead to strong selection on sperm characteristics. This raises the question of how heritable variation in sperm traits is maintained. Here we show that in zebra finches (Taeniopygia guttata) nearly half of the variance in sperm morphology is explained by an inversion on the Z chromosome with a 40% allele frequency in the wild. The sperm of males that are heterozygous for the inversion had the longest midpieces and the highest velocity. Furthermore, such males achieved the highest fertility and the highest siring success, both within-pair and extra-pair. Males homozygous for the derived allele show detrimental sperm characteristics and the lowest siring success. Our results suggest heterozygote advantage as the mechanism that maintains the inversion polymorphism and hence variance in sperm design and in fitness.
Power of QTL detection by either fixed or random models in half-sib designs

Directory of Open Access Journals (Sweden)

Schaeffer Lawrence R

2005-11-01

The aim of this study was to compare the variance component approach for QTL linkage mapping in half-sib designs to the simple regression method. Empirical power was determined by Monte Carlo simulation in granddaughter designs. The factors studied (base values in parentheses) included the number of sires (5) and sons per sire (80), ratio of QTL variance to total genetic variance (λ = 0.1), marker spacing (10 cM), and QTL allele frequency (0.5). A single bi-allelic QTL and six equally spaced markers with six alleles each were simulated. Empirical power using the regression method was 0.80, 0.92 and 0.98 for 5, 10, and 20 sires, respectively, versus 0.88, 0.98 and 0.99 using the variance component method. Power was 0.74, 0.80, 0.93, and 0.95 using regression versus 0.77, 0.88, 0.94, and 0.97 using the variance component method for QTL variance ratios (λ) of 0.05, 0.1, 0.2, and 0.3, respectively. Power was 0.79, 0.85, 0.80 and 0.87 using regression versus 0.80, 0.86, 0.88, and 0.85 using the variance component method for QTL allele frequencies of 0.1, 0.3, 0.5, and 0.8, respectively. The log10 of type I error profiles were quite flat at close marker spacing (1 cM), confirming the inability to fine-map QTL by linkage analysis in half-sib designs. The variance component method showed slightly more potential than the regression method in QTL mapping.
Weighted SGD for ℓp Regression with Randomized Preconditioning

Science.gov (United States)

Yang, Jiyan; Chow, Yin-Lam; Ré, Christopher; Mahoney, Michael W.

2018-01-01

In recent years, stochastic gradient descent (SGD) methods and randomized linear algebra (RLA) algorithms have been applied to many large-scale problems in machine learning and data analysis. SGD methods are easy to implement and applicable to a wide range of convex optimization problems. In contrast, RLA algorithms provide much stronger performance guarantees but are applicable to a narrower class of problems. We aim to bridge the gap between these two methods in solving constrained overdetermined linear regression problems, e.g., ℓ2 and ℓ1 regression problems. We propose a hybrid algorithm named pwSGD that uses RLA techniques for preconditioning and constructing an importance sampling distribution, and then performs an SGD-like iterative process with weighted sampling on the preconditioned system. By rewriting a deterministic ℓp regression problem as a stochastic optimization problem, we connect pwSGD to several existing ℓp solvers including RLA methods with algorithmic leveraging (RLA for short). We prove that pwSGD inherits faster convergence rates that only depend on the lower dimension of the linear system, while maintaining low computation complexity. Such SGD convergence rates are superior to other related SGD algorithms such as the weighted randomized Kaczmarz algorithm. Particularly, when solving ℓ1 regression with size n by d, pwSGD returns an approximate solution with ε relative error in the objective value in 𝒪(log n·nnz(A) + poly(d)/ε²) time. This complexity is uniformly better than that of RLA methods in terms of both ε and d when the problem is unconstrained. In the presence of constraints, pwSGD only has to solve a sequence of much simpler and smaller optimization problems over the same constraints. In general this is more efficient than solving the constrained subproblem required in RLA. For ℓ2 regression, pwSGD returns an approximate solution with ε relative error in the objective value and the solution vector measured in

Genetic correlations between body condition score, yield and fertility in Holstein heifers estimated by random regression models

NARCIS (Netherlands)

Veerkamp, R.F.; Koenen, E.P.C.; Jong, de G.

2001-01-01

Twenty type classifiers scored body condition (BCS) of 91,738 first-parity cows from 601 sires and 5518 maternal grandsires. Fertility data during first lactation were extracted for 177,220 cows, of which 67,278 also had a BCS observation, and first-lactation 305-d milk, fat, and protein yields were
Linear Regression with a Randomly Censored Covariate: Application to an Alzheimer's Study.

Science.gov (United States)

Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

2017-01-01

The association between maternal age of onset of dementia and amyloid deposition (measured by in vivo positron emission tomography (PET) imaging) in cognitively normal older offspring is of interest. In a regression model for amyloid, special methods are required due to the random right censoring of the covariate of maternal age of onset of dementia. Prior literature has proposed methods to address the problem of censoring due to assay limit of detection, but not random censoring. We propose imputation methods and a survival regression method that do not require parametric assumptions about the distribution of the censored covariate. Existing imputation methods address missing covariates, but not right censored covariates. In simulation studies, we compare these methods to the simple, but inefficient complete case analysis, and to thresholding approaches. We apply the methods to the Alzheimer's study.
Genetic evaluation of European quails by random regression models

Directory of Open Access Journals (Sweden)

Flaviana Miranda Gonçalves

2012-09-01

The objective of this study was to compare different random regression models, defined from different classes of heterogeneity of variance combined with different Legendre polynomial orders, for the estimation of (co)variances in quails. The data came from 28,076 observations of 4,507 female meat quails of the LF1 lineage. Quail body weights were determined at birth and 1, 14, 21, 28, 35 and 42 days of age. Six different classes of residual variance were fitted to Legendre polynomial functions (orders ranging from 2 to 6) to determine which model had the best fit to describe the (co)variance structures as a function of time. According to the evaluated criteria (AIC, BIC and LRT), the model with six classes of residual variances and a sixth-order Legendre polynomial was the best fit. The estimated additive genetic variance increased from birth to 28 days of age, and dropped slightly from 35 to 42 days. The heritability estimates decreased along the growth curve and changed from 0.51 (1 day) to 0.16 (42 days). Animal genetic and permanent environmental correlation estimates between weights and age classes were always high and positive, except for birth weight. The sixth order Legendre polynomial, along with the residual variance divided into six classes, was the best fit for the growth rate curve of meat quails; therefore, they should be considered for breeding evaluation processes by random regression models.
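As a rough illustration of how Legendre polynomials enter a random regression model like the one described above, the following Python sketch builds the covariate matrix for a set of weighing ages and evaluates the variance implied along the age trajectory. The normalization, the function names, the choice of a second-order fit, and the matrix `k_example` are all assumptions made for illustration; none of them is taken from the study.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_covariates(ages, order):
    """Return an (n, order + 1) matrix of normalized Legendre polynomials.

    Ages are rescaled to [-1, 1]; column k holds sqrt((2k + 1) / 2) * P_k(x).
    This normalization is an assumption, not a detail from the abstract.
    """
    ages = np.asarray(ages, dtype=float)
    x = 2.0 * (ages - ages.min()) / (ages.max() - ages.min()) - 1.0
    basis = legendre.legvander(x, order)  # columns P_0(x) .. P_order(x)
    norm = np.sqrt((2.0 * np.arange(order + 1) + 1.0) / 2.0)
    return basis * norm


def variance_along_trajectory(phi, k_matrix):
    """Variance at each age implied by a covariance function: diag(Phi K Phi')."""
    return np.einsum("ij,jk,ik->i", phi, k_matrix, phi)


# Post-hatch weighing ages in days; k_example is a made-up coefficient
# covariance matrix, used only to show the calculation.
ages = [1, 14, 21, 28, 35, 42]
phi = legendre_covariates(ages, order=2)
k_example = np.array([[4.0, 1.0, 0.2],
                      [1.0, 2.0, 0.1],
                      [0.2, 0.1, 0.5]])
genetic_variance = variance_along_trajectory(phi, k_example)
```

With estimated coefficient matrices in place of `k_example`, the same product gives the genetic and permanent environmental variances at each age, from which age-specific heritabilities can be formed.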
Bayesian Nonparametric Regression Analysis of Data with Random Effects Covariates from Longitudinal Measurements

KAUST Repository

Ryu, Duchwan

2010-09-28

We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.
Simultaneous confidence bands for Cox regression from semiparametric random censorship.

Science.gov (United States)

Mondal, Shoubhik; Subramanian, Sundarraman

2016-01-01

Cox regression is combined with semiparametric random censorship models to construct simultaneous confidence bands (SCBs) for subject-specific survival curves. Simulation results are presented to compare the performance of the proposed SCBs with the SCBs that are based only on standard Cox. The new SCBs provide correct empirical coverage and are more informative. The proposed SCBs are illustrated with two real examples. An extension to handle missing censoring indicators is also outlined.
Genomic stability and physiological assessments of live offspring sired by a bull clone, Starbuck II.

Science.gov (United States)

Ortegon, H; Betts, D H; Lin, L; Coppola, G; Perrault, S D; Blondin, P; King, W A

2007-01-01

It appears that overt phenotypic abnormalities observed in some domestic animal clones are not transmitted to their progeny. The current study monitored Holstein heifers sired by a bull clone, Starbuck II, from weaning to puberty. Genomic stability was assessed by telomere length status and chromosomal analysis. Growth parameters, blood profiles, physical exams and reproductive parameters were assessed for 12 months (and compared to age-matched control heifers). Progeny sired by the clone bull did not differ (P>0.05) in weight, length and height compared to controls. However, progeny had lower heart rates (HR) (P=0.009), respiratory rates (RR) (P=0.007) and body temperature (P=0.03). Hematological profiles were within normal ranges and did not differ (P>0.05) between both groups. External and internal genitalia were normal and both groups reached puberty at expected ages. Progeny had two or three ovarian follicular waves per estrous cycle and serum progesterone concentrations were similar (P=0.99) to controls. Telomere lengths of sperm and blood cells from Starbuck II were not different (P>0.05) than those of non-cloned cattle; telomere lengths of progeny were not different (P>0.05) from age-matched controls. In addition, progeny had normal karyotypes in peripheral blood leukocytes compared to controls (89.1% versus 86.3% diploid, respectively). In summary, heifers sired by a bull clone had normal chromosomal stability, growth, physical, hematological and reproductive parameters, compared to normal heifers. Furthermore, they had moderate stress responses to routine handling and restraint.
Evaluating an Organizational-Level Occupational Health Intervention in a Combined Regression Discontinuity and Randomized Control Design.

Science.gov (United States)

Sørensen, Ole H

2016-10-01

Organizational-level occupational health interventions have great potential to improve employees' health and well-being. However, they often compare unfavourably to individual-level interventions. This calls for improving methods for designing, implementing and evaluating organizational interventions. This paper presents and discusses the regression discontinuity design because, like the randomized control trial, it is a strong summative experimental design, but it typically fits organizational-level interventions better. The paper explores advantages and disadvantages of a regression discontinuity design with an embedded randomized control trial. It provides an example from an intervention study focusing on reducing sickness absence in 196 preschools. The paper demonstrates that such a design fits the organizational context, because it allows management to focus on organizations or workgroups with the most salient problems. In addition, organizations may accept an embedded randomized design because the organizations or groups with most salient needs receive obligatory treatment as part of the regression discontinuity design. Copyright © 2016 John Wiley & Sons, Ltd.
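The core calculation behind a sharp regression discontinuity design can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the data are simulated, the variable names, cutoff, bandwidth and effect size are hypothetical, and the embedded randomized trial discussed in the paper is not modelled.

```python
import numpy as np


def sharp_rd_estimate(score, outcome, cutoff, bandwidth):
    """Estimate the jump in outcome at the cutoff with two local linear fits.

    Units with score >= cutoff are treated. Each side gets its own
    least-squares line within the bandwidth; the estimate is the
    difference between the two fitted values at the cutoff.
    """
    score = np.asarray(score, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    centered = score - cutoff
    in_window = np.abs(centered) <= bandwidth
    treated = centered >= 0

    def intercept(mask):
        slope, const = np.polyfit(centered[mask], outcome[mask], deg=1)
        return const  # fitted value at centered == 0, i.e. at the cutoff

    return intercept(in_window & treated) - intercept(in_window & ~treated)


# Simulated illustration: 196 units, true effect of -2.0 above the cutoff.
rng = np.random.default_rng(0)
baseline_absence = rng.uniform(0.0, 20.0, size=196)
is_treated = baseline_absence >= 10.0
followup_absence = (0.8 * baseline_absence - 2.0 * is_treated
                    + rng.normal(0.0, 0.5, size=196))
effect = sharp_rd_estimate(baseline_absence, followup_absence,
                           cutoff=10.0, bandwidth=5.0)
```

Fitting separate lines on each side keeps the estimate from being driven by units far from the cutoff, at the cost of using fewer observations.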
Comparison of Growth Performance of Antibiotic-free Yorkshire Crossbreds Sired by Berkshire, Large Black, and Tamworth Breeds Raised in Hoop Structures

Directory of Open Access Journals (Sweden)

N. Whitley

2012-10-01

The objective of this study was to compare body weight, ADG, and feed:gain ratio of antibiotic-free pigs from Yorkshire dams and sired by Yorkshire (YY), Berkshire (BY), Large Black (LBY) or Tamworth (TY) boars. All the crossbred pigs in each of three trials were raised as one group from weaning to finishing in the same deep-bedded hoop, providing a comfortable environment for the animals which allowed rooting and other natural behaviors. Birth, weaning and litter weights were measured and recorded. From approximately 50 kg to market weight (125 kg), feed intake and body weights were recorded manually (body weight) or using a FIRE (Feed Intake Recording Equipment, Osborne Industries Inc., Osborne, Kansas) system with eight individual feeding stations. Feed intake data for 106 finishing pigs between 140 and 210 d of age and the resulting weights and feed conversion ratios were analyzed by breed type. Least square means for body weights (birth, weaning and to 240 d) were estimated with Proc Mixed in SAS 9.2 for fixed effects such as crossbreed and days of age within the sire breed. The differences within fixed effects were compared using least significant differences with DIFF option. Individual birth weights and weaning weights were influenced by sire breed (p<0.05). For birth weight, BY pigs were the lightest, TY and YY pigs were the heaviest but similar to each other and LBY pigs were intermediate. For weaning weights, BY and LBY pigs were heavier than TY and YY pigs. However, litter birth and weaning weights were not influenced by sire breed, and average daily gain was also not significantly different among breed types. Tamworth sired pigs had lower overall body weight gain, and feed conversion was lower in TY and YY groups than BY and LBY groups (p<0.05); however, the number of observations was somewhat limited for feed conversion and for Tamworth pigs. Overall, no convincing differences among breed types were noted for this study, but growth performance in
Evaluations for service-sire conception rate for heifer and cow inseminations with conventional and sexed semen

Science.gov (United States)

Service-sire conception rate (SCR), a phenotypic fertility evaluation based on conventional (nonsexed) inseminations from parities 1 through 5, was implemented by USDA in August 2008. Using insemination data from 2005 through 2009, the SCR procedure was applied separately for nulliparous heifer inse...
A random regression model in analysis of litter size in pigs

African Journals Online (AJOL)

Dispersion parameters for number of piglets born alive (NBA) were estimated using a random regression model (RRM). Two data sets of litter records from the Nemščak farm in Slovenia were used for analyses. The first dataset (DS1) included records from the first to the sixth parity. The second dataset (DS2) was extended ...
The performance of random coefficient regression in accounting for residual confounding.

Science.gov (United States)

Gustafson, Paul; Greenland, Sander

2006-09-01

Greenland (2000, Biometrics 56, 915-921) describes the use of random coefficient regression to adjust for residual confounding in a particular setting. We examine this setting further, giving theoretical and empirical results concerning the frequentist and Bayesian performance of random coefficient regression. Particularly, we compare estimators based on this adjustment for residual confounding to estimators based on the assumption of no residual confounding. This devolves to comparing an estimator from a nonidentified but more realistic model to an estimator from a less realistic but identified model. The approach described by Gustafson (2005, Statistical Science 20, 111-140) is used to quantify the performance of a Bayesian estimator arising from a nonidentified model. From both theoretical calculations and simulations we find support for the idea that superior performance can be obtained by replacing unrealistic identifying constraints with priors that allow modest departures from those constraints. In terms of point-estimator bias this superiority arises when the extent of residual confounding is substantial, but the advantage is much broader in terms of interval estimation. The benefit from modeling residual confounding is maintained when the prior distributions employed only roughly correspond to reality, for the standard identifying constraints are equivalent to priors that typically correspond much worse.
Evaluation of Columbia, USMARC-Composite, Suffolk, and Texel rams as terminal sires in an extensive rangeland production system: I. Ewe productivity and crossbred lamb survival and preweaning growth

Science.gov (United States)

A 3-yr study was conducted to comprehensively evaluate Columbia, Suffolk, USMARC-Composite (Composite), and Texel breeds as terminal sires in an extensive rangeland production system. The objective was to estimate breed-of-ram effects on ewe fertility, prolificacy, and dystocia, and sire breed effe...
Estimating overall exposure effects for the clustered and censored outcome using random effect Tobit regression models.

Science.gov (United States)

Wang, Wei; Griswold, Michael E

2016-11-30

The random effect Tobit model is a regression model that accommodates both left- and/or right-censoring and within-cluster dependence of the outcome variable. Regression coefficients of random effect Tobit models have conditional interpretations on a constructed latent dependent variable and do not provide inference of overall exposure effects on the original outcome scale. Marginalized random effects model (MREM) permits likelihood-based estimation of marginal mean parameters for the clustered data. For random effect Tobit models, we extend the MREM to marginalize over both the random effects and the normal space and boundary components of the censored response to estimate overall exposure effects at population level. We also extend the 'Average Predicted Value' method to estimate the model-predicted marginal means for each person under different exposure status in a designated reference group by integrating over the random effects and then use the calculated difference to assess the overall exposure effect. The maximum likelihood estimation is proposed utilizing a quasi-Newton optimization algorithm with Gauss-Hermite quadrature to approximate the integration of the random effects. We use these methods to carefully analyze two real datasets. Copyright © 2016 John Wiley & Sons, Ltd.
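The Gauss-Hermite step mentioned above, integrating a function over a normally distributed random effect, can be sketched as follows in Python. This is a generic illustration of the quadrature rule, not the authors' likelihood: the function name `normal_expectation`, the number of nodes and the test integrands are assumptions chosen because their exact values are known.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def normal_expectation(func, sigma, n_nodes=20):
    """Approximate E[func(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature.

    hermgauss returns nodes x_i and weights w_i for the weight exp(-x^2),
    so the change of variable b = sqrt(2) * sigma * x gives
    E[func(b)] ~= sum_i w_i * func(sqrt(2) * sigma * x_i) / sqrt(pi).
    """
    nodes, weights = hermgauss(n_nodes)
    values = func(np.sqrt(2.0) * sigma * nodes)
    return float(np.sum(weights * values) / np.sqrt(np.pi))


sigma = 0.7
second_moment = normal_expectation(lambda b: b ** 2, sigma)  # exact: sigma^2
mean_exp = normal_expectation(np.exp, sigma)                 # exact: exp(sigma^2 / 2)
```

In a random effects likelihood, `func` would be the conditional likelihood contribution of one cluster given its random effect, and the quadrature sum would replace the integral inside the optimizer's objective.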
A comparison of pollen-siring ability and life history between males and hermaphrodites of subdioecious Silene acaulis

DEFF Research Database (Denmark)

Philipp, Marianne; Jakobsen, Ruth Bruus; Nachman, Gøsta Støger

2009-01-01

A pollen-competition experiment was performed in which females were hand pollinated with a mixture of pollen from males and hermaphrodites, all with known isozyme alleles, which allowed determination of who sired each seed. We recorded plant size, flower morphology......, hermaphrodite, and male individuals. The sex expression of males and hermaphrodites can vary over years for the same individual, while females are always females. Previous studies have shown that outcrossed seeds from females become seedlings with higher survival and growth rates than those from outcrossed...... seeds from hermaphrodites. Questions: (1) Do pollen grains from males exhibit some advantage over pollen from hermaphrodites? In particular, do they sire more seeds than hermaphrodites? (2) Is the reproductive system of S. acaulis...
Left ventricular mass regression after porcine versus bovine aortic valve replacement: a randomized comparison.

Science.gov (United States)

Suri, Rakesh M; Zehr, Kenton J; Sundt, Thoralf M; Dearani, Joseph A; Daly, Richard C; Oh, Jae K; Schaff, Hartzell V

2009-10-01

It is unclear whether small differences in transprosthetic gradient between porcine and bovine biologic aortic valves translate into improved regression of left ventricular (LV) hypertrophy after aortic valve replacement. We investigated transprosthetic gradient, aortic valve orifice area, and LV mass in patients randomized to aortic valve replacement with either the Medtronic Mosaic (MM) porcine or an Edwards Perimount (EP) bovine pericardial bioprosthesis. One hundred fifty-two patients with aortic valve disease were randomly assigned to receive either the MM (n = 76) or an EP prosthesis. There were 89 men (59%), and the mean age was 76 years. Echocardiograms from preoperative, postoperative, predismissal, and 1-year time points were analyzed. Baseline characteristics and preoperative echocardiograms were similar between the two groups. The median implant size was 23 mm for both. There were no early deaths, and 10 patients (7%) died after dismissal. One hundred seven of 137 patients (78%) had a 1-year echocardiogram, and none required aortic valve reoperation. The mean aortic valve gradient at dismissal was 19.4 mm Hg (MM) versus 13.5 mm Hg (EP; p regression of LV mass index (MM, -32.4 g/m(2) versus EP, -27.0 g/m(2); p = 0.40). Greater preoperative LV mass index was the sole independent predictor of greater LV mass regression after surgery (p regression of LV mass during the first year after aortic valve replacement.
Estimation of genetic parameters related to eggshell strength using random regression models.

Science.gov (United States)

Guo, J; Ma, M; Qu, L; Shen, M; Dou, T; Wang, K

2015-01-01

This study examined the changes in eggshell strength and the genetic parameters related to this trait throughout a hen's laying life using random regression. The data were collected from a crossbred population between 2011 and 2014, where the eggshell strength was determined repeatedly for 2260 hens. Using random regression models (RRMs), several Legendre polynomials were employed to estimate the fixed, direct genetic and permanent environment effects. The residual effects were treated as independently distributed with heterogeneous variance for each test week. The direct genetic variance was modeled with second-order Legendre polynomials and the permanent environment variance with third-order Legendre polynomials. The heritability of eggshell strength ranged from 0.26 to 0.43, the repeatability ranged between 0.47 and 0.69, and the estimated genetic correlations between test weeks were high (> 0.67). The first eigenvalue of the genetic covariance matrix accounted for about 97% of the sum of all the eigenvalues. The flexibility and statistical power of RRMs suggest that this model could be an effective method to improve eggshell quality and to reduce losses due to cracked eggs in a breeding plan.
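To make the random regression idea in the abstract above concrete, the following sketch evaluates normalized Legendre polynomials at a standardized test week and turns coefficient (co)variance matrices into a heritability at that week. The matrices K_G and K_PE, the residual variance, the week range and all function names are hypothetical; "order 2" is taken here to mean polynomials of degree 0 to 2, which is an assumption about the convention used.

```python
import math

def legendre_basis(t, t_min, t_max, order):
    """Normalized Legendre polynomials phi_0..phi_order evaluated at time t."""
    x = 2.0 * (t - t_min) / (t_max - t_min) - 1.0  # standardize t to [-1, 1]
    p = [1.0, x]
    for n in range(1, order):
        # Bonnet recursion: (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(order + 1)]

def quad_form(phi, K):
    """phi' K phi: variance at one time point implied by coefficient matrix K."""
    size = len(phi)
    return sum(phi[i] * K[i][j] * phi[j] for i in range(size) for j in range(size))

def heritability(t, t_min, t_max, K_g, K_pe, var_e):
    """Genetic variance over total variance at time t."""
    g = quad_form(legendre_basis(t, t_min, t_max, len(K_g) - 1), K_g)
    pe = quad_form(legendre_basis(t, t_min, t_max, len(K_pe) - 1), K_pe)
    return g / (g + pe + var_e)

# Hypothetical coefficient (co)variance matrices, NOT estimates from the study.
K_G = [[4.0, 0.5, 0.1],
       [0.5, 1.0, 0.0],
       [0.1, 0.0, 0.2]]
K_PE = [[3.0, 0.2, 0.0, 0.0],
        [0.2, 0.8, 0.0, 0.0],
        [0.0, 0.0, 0.3, 0.0],
        [0.0, 0.0, 0.0, 0.1]]

# Heritability along a hypothetical trajectory of test weeks 1..40.
h2_by_week = [heritability(w, 1, 40, K_G, K_PE, 5.0) for w in range(1, 41)]
```

The same quadratic form with two different time points on either side of K gives the covariance between test weeks, which is how the genetic correlations reported in such studies are typically derived.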
Subordinate Males Sire Offspring in Madagascar Fish-eagle (Haliaeetus Vociferoides) Polyandrous Breeding Groups

OpenAIRE

Tingay, Ruth E.; Culver, Melanie; Hallerman, Eric M.; Fraser, James D.; Watson, Richard T.

2002-01-01

The island endemic Madagascar Fish-Eagle (Haliaeetus vociferoides) is one of the most endangered birds of prey. Certain populations in west-central Madagascar sometimes exhibit a third, and sometimes a fourth, adult involved in breeding activities at a nest. We applied DNA fingerprinting to assess relatedness among 17 individuals at four nests. In all nests with young, a subordinate rather than the dominant male sired the offspring. Within-nest relatedness comparisons showed that some dominan...
Genetic analysis of body weights of individually fed beef bulls in South Africa using random regression models.

Science.gov (United States)

Selapa, N W; Nephawe, K A; Maiwashe, A; Norris, D

2012-02-08

The aim of this study was to estimate genetic parameters for body weights of individually fed beef bulls measured at centralized testing stations in South Africa using random regression models. Weekly body weights of Bonsmara bulls (N = 2919) tested between 1999 and 2003 were available for the analyses. The model included a fixed regression of the body weights on fourth-order orthogonal Legendre polynomials of the actual days on test (7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, and 84) for starting age and contemporary group effects. Random regressions on fourth-order orthogonal Legendre polynomials of the actual days on test were included for additive genetic effects and additional uncorrelated random effects of the weaning-herd-year and the permanent environment of the animal. Residual effects were assumed to be independently distributed with heterogeneous variance for each test day. Variance ratios for additive genetic, permanent environment and weaning-herd-year for weekly body weights at different test days ranged from 0.26 to 0.29, 0.37 to 0.44 and 0.26 to 0.34, respectively. The weaning-herd-year was found to have a significant effect on the variation of body weights of bulls despite a 28-day adjustment period. Genetic correlations amongst body weights at different test days were high, ranging from 0.89 to 1.00. Heritability estimates were comparable to literature using multivariate models. Therefore, random regression model could be applied in the genetic evaluation of body weight of individually fed beef bulls in South Africa.
The limiting behavior of the estimated parameters in a misspecified random field regression model

DEFF Research Database (Denmark)

Dahl, Christian Møller; Qin, Yu

This paper examines the limiting properties of the estimated parameters in the random field regression model recently proposed by Hamilton (Econometrica, 2001). Though the model is parametric, it enjoys the flexibility of the nonparametric approach since it can approximate a large collection of n...
Evaluation of Rambouillet, Polypay, and Romanov-White Dorper × Rambouillet ewes mated to terminal sires in an extensive rangeland production system: Lamb production.

Science.gov (United States)

Notter, D R; Mousel, M R; Lewis, G S; Leymaster, K A; Taylor, J B

2017-09-01

Ewe productivity (i.e., total number or weight of lambs weaned per breeding ewe) is a key indicator of lamb production efficiency. This study compared various measures of ewe productivity and ewe and lamb performance among ewes of 3 breed types mated to rams of 4 terminal-sire breed types in an extensive rangeland production system. Purebred Rambouillet (n = 212), purebred Polypay (n = 236), and crossbred Romanov-White Dorper × Rambouillet (RW-RA; n = 231) ewes were produced from locally adapted Polypay and Rambouillet ewes and then annually mated to Columbia, Suffolk, Columbia × Suffolk, or Suffolk × Columbia sires for up to 4 yr, beginning at 1 yr of age. The cumulative number and weight of lambs weaned through 4 yr were greater for RW-RA (5.9 lambs and 153 kg, respectively) and Polypay ewes (4.9 lambs and 123 kg, respectively) than for Rambouillet ewes (2.9 lambs and 99 kg, respectively) and also were greater for RW-RA ewes than for Polypay ewes (all ewes, compared with Rambouillet ewes, was driven by greater lambing rates (ewes lambing per ewe exposed) as ewe lambs (87 and 77 vs. 31%, respectively; ewe lambs (1.3, 1.3, and 1.0, respectively) and adult ewes (2.1, 2.0, and 1.6, respectively). The RW-RA ewes also had greater longevity ( ewes. Lamb BW at birth and weaning in adult ewes favored less-prolific Rambouillet ewes ( ewe breed types were small and not significant (P = 0.08). Effects of sire breed type on measures of cumulative ewe productivity were not significant (P > 0.74), but Suffolk-sired lambs had the heaviest adjusted birth weights (P = 0.01) and Columbia-sired lambs tended to have the lightest adjusted weaning weights (P = 0.12). Combined effects of heterosis and additive breed effects were associated with greater lambing rates in ewe lambs, larger litters at all ages, and substantially greater number and weight of lambs weaned for Polypay and RW-RA ewes than for Rambouillet ewes.

Semi-parametric estimation of random effects in a logistic regression model using conditional inference

DEFF Research Database (Denmark)

Petersen, Jørgen Holm

2016-01-01

This paper describes a new approach to the estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied...
Logistic Regression with Multiple Random Effects: A Simulation Study of Estimation Methods and Statistical Packages

Science.gov (United States)

Kim, Yoonsang; Emery, Sherry

2013-01-01

Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods’ performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages—SAS GLIMMIX Laplace and SuperMix Gaussian quadrature—perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes. PMID:24288415
Logistic Regression with Multiple Random Effects: A Simulation Study of Estimation Methods and Statistical Packages.

Science.gov (United States)

Kim, Yoonsang; Choi, Young-Ku; Emery, Sherry

2013-08-01

Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods' performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages-SAS GLIMMIX Laplace and SuperMix Gaussian quadrature-perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes.
Comparing cluster-level dynamic treatment regimens using sequential, multiple assignment, randomized trials: Regression estimation and sample size considerations.

Science.gov (United States)

NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel

2017-08-01

Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.
Association of bovine leptin polymorphisms with energy output and energy storage traits in progeny tested Holstein-Friesian dairy cattle sires

Science.gov (United States)

2010-01-01

Background: Leptin modulates appetite, energy expenditure and the reproductive axis by signalling via its receptor the status of body energy stores to the brain. The present study aimed to quantify the associations between 10 novel and known single nucleotide polymorphisms in genes coding for leptin and leptin receptor with performance traits in 848 Holstein-Friesian sires, estimated from performance of up to 43,117 daughter-parity records per sire. Results: All single nucleotide polymorphisms were segregating in this sample population and none deviated (P > 0.05) from Hardy-Weinberg equilibrium. Complete linkage disequilibrium existed between the novel polymorphism LEP-1609, and the previously identified polymorphisms LEP-1457 and LEP-580. LEP-2470 associated (P body condition score, reduced milk yield and shorter gestation (P fertility in the Holstein-Friesian dairy cow. PMID:20670403
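The Hardy-Weinberg check mentioned in the abstract above is commonly done with a one-degree-of-freedom chi-square test on genotype counts. The sketch below shows that standard test; it is not taken from the study, and the genotype counts and function name are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    Assumes a biallelic locus with both alleles present (a segregating SNP),
    so that no expected genotype count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)  # frequency of allele A
    q = 1.0 - p
    expected = [n * p * p, 2.0 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Critical value of the chi-square distribution with 1 df at alpha = 0.05.
CHI2_CRIT_1DF_05 = 3.841

# Hypothetical genotype counts for one SNP in 848 sires (not data from the study).
stat = hwe_chi_square(300, 400, 148)
deviates = stat > CHI2_CRIT_1DF_05
```

A statistic below the critical value corresponds to P > 0.05, the criterion quoted in the abstract.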
Predicting longitudinal trajectories of health probabilities with random-effects multinomial logit regression.

Science.gov (United States)

Liu, Xian; Engel, Charles C

2012-12-20

Researchers often encounter longitudinal health data characterized with three or more ordinal or nominal categories. Random-effects multinomial logit models are generally applied to account for potential lack of independence inherent in such clustered data. When parameter estimates are used to describe longitudinal processes, however, random effects, both between and within individuals, need to be retransformed for correctly predicting outcome probabilities. This study attempts to go beyond existing work by developing a retransformation method that derives longitudinal growth trajectories of unbiased health probabilities. We estimated variances of the predicted probabilities by using the delta method. Additionally, we transformed the covariates' regression coefficients on the multinomial logit function, not substantively meaningful, to the conditional effects on the predicted probabilities. The empirical illustration uses the longitudinal data from the Asset and Health Dynamics among the Oldest Old. Our analysis compared three sets of the predicted probabilities of three health states at six time points, obtained from, respectively, the retransformation method, the best linear unbiased prediction, and the fixed-effects approach. The results demonstrate that neglect of retransforming random errors in the random-effects multinomial logit model results in severely biased longitudinal trajectories of health probabilities as well as overestimated effects of covariates on the probabilities. Copyright © 2012 John Wiley & Sons, Ltd.
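The delta method used in the abstract above for the variance of predicted probabilities can be shown in its simplest form. The sketch below uses a binary logit rather than the multinomial logit with random effects treated in the paper, so it is a simplified stand-in; the function name and input values are hypothetical.

```python
import math

def predicted_prob_with_se(eta, var_eta):
    """Predicted probability and delta-method standard error for a binary logit.

    eta is the linear predictor and var_eta its estimated variance.
    Delta method: Var(p) is approximately (dp/d_eta)^2 * Var(eta),
    with dp/d_eta = p * (1 - p) for the logistic function.
    """
    p = 1.0 / (1.0 + math.exp(-eta))
    grad = p * (1.0 - p)
    return p, math.sqrt(grad * grad * var_eta)

# Hypothetical linear predictor and variance.
p_hat, se_hat = predicted_prob_with_se(0.0, 0.04)
```

In the multinomial case the gradient becomes a vector over all category-specific linear predictors and the variance a quadratic form in their covariance matrix, but the principle is the same.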
Influence of Maximum Inbreeding Avoidance under BLUP EBV Selection on Pinzgau Population Diversity

Directory of Open Access Journals (Sweden)

Radovan Kasarda

2011-05-01

The effect of mating strategy (random vs. maximum avoidance of inbreeding) under a BLUP EBV selection strategy was evaluated. The existing population structure was analyzed by Monte Carlo stochastic simulation with the aim of minimizing the increase in inbreeding. Averaged over 10 generations of development, maximum avoidance of inbreeding under BLUP selection resulted in an increase in inbreeding comparable to that under random mating. After 10 generations of simulating the mating strategy, the observed ΔF was 6.51% (2 sires), 5.20% (3 sires), 3.22% (4 sires) and 2.94% (5 sires). As the number of selected sires increased, a smaller increase in inbreeding was observed. With 4 or 5 sires, the increase in inbreeding was comparable to that of random mating with phenotypic selection. To preserve genetic diversity and prevent population loss, it is important to minimize the increase in inbreeding in small populations. The classical approach was based on balancing the ratio of sires and dams in the mating program. In contrast, most commercial populations used a small number of sires with a high mating ratio.
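The dependence of inbreeding on the number of sires can be illustrated with Wright's textbook approximation for an idealized random-mating population with unequal numbers of males and females. This sketch does not reproduce the simulated values in the abstract, which come from BLUP selection over 10 generations; the number of dams and the function names are hypothetical.

```python
def effective_size(n_sires, n_dams):
    """Wright's effective population size for unequal sex ratio."""
    return 4.0 * n_sires * n_dams / (n_sires + n_dams)

def delta_f(n_sires, n_dams):
    """Expected rate of inbreeding per generation, 1 / (2 * Ne)."""
    return 1.0 / (2.0 * effective_size(n_sires, n_dams))

# Hypothetical herd of 100 dams mated to 2, 3, 4 or 5 sires.
rates = {s: delta_f(s, 100) for s in (2, 3, 4, 5)}
```

Because the rarer sex dominates the formula, adding sires lowers the rate of inbreeding much more than adding dams, which matches the direction of the trend reported above.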
Reduction of the number of parameters needed for a polynomial random regression test-day model

NARCIS (Netherlands)

Pool, M.H.; Meuwissen, T.H.E.

2000-01-01

Legendre polynomials were used to describe the (co)variance matrix within a random regression test day model. The goodness of fit depended on the polynomial order of fit, i.e., number of parameters to be estimated per animal but is limited by computing capacity. Two aspects: incomplete lactation
Estimação de parâmetros genéticos para produção de leite de vacas da raça Holandesa via regressão aleatória [Estimation of genetic parameters for Holstein cows milk production by random regression]

Directory of Open Access Journals (Sweden)

C.K.P. Dorneles

2009-04-01

A total of 21,702 test-day milk yield records from 2,429 first-lactation Holstein cows, sired by 233 bulls, collected in 33 herds in the State of Rio Grande do Sul from 1991 to 2003, were used to estimate genetic parameters for test-day milk yield. The random regression model fitted to test-day records between the 6th and the 305th day of lactation included the effect of herd-year-month of the test day, the age of the cow at calving, and the parameters of a fourth-order Legendre polynomial to model the mean milk yield curve of the population, together with parameters of the same polynomial to model the additive genetic and permanent environmental random effects. The genetic and permanent environmental variances for test-day milk yield ranged from 2.38 to 3.14 and from 7.55 to 10.35, respectively. Heritability estimates increased gradually from the beginning (0.14) to the end (0.20) of the lactation period, indicating a trait of moderate heritability. Genetic correlations between milk yields at different stages of lactation ranged from 0.33 to 0.99 and were higher between adjacent test days. The permanent environmental correlations followed the same trend as the genetic correlations. The random regression model with a fourth-order Legendre polynomial can be considered a good tool for estimating genetic parameters for milk yield throughout lactation.
Modeling urban coastal flood severity from crowd-sourced flood reports using Poisson regression and Random Forest

Science.gov (United States)

Sadler, J. M.; Goodall, J. L.; Morsy, M. M.; Spencer, K.

2018-04-01

Sea level rise has already caused more frequent and severe coastal flooding and this trend will likely continue. Flood prediction is an essential part of a coastal city's capacity to adapt to and mitigate this growing problem. Complex coastal urban hydrological systems however, do not always lend themselves easily to physically-based flood prediction approaches. This paper presents a method for using a data-driven approach to estimate flood severity in an urban coastal setting using crowd-sourced data, a non-traditional but growing data source, along with environmental observation data. Two data-driven models, Poisson regression and Random Forest regression, are trained to predict the number of flood reports per storm event as a proxy for flood severity, given extensive environmental data (i.e., rainfall, tide, groundwater table level, and wind conditions) as input. The method is demonstrated using data from Norfolk, Virginia USA from September 2010 to October 2016. Quality-controlled, crowd-sourced street flooding reports ranging from 1 to 159 per storm event for 45 storm events are used to train and evaluate the models. Random Forest performed better than Poisson regression at predicting the number of flood reports and had a lower false negative rate. From the Random Forest model, total cumulative rainfall was by far the most dominant input variable in predicting flood severity, followed by low tide and lower low tide. These methods serve as a first step toward using data-driven methods for spatially and temporally detailed coastal urban flood prediction.
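As a rough illustration of the Poisson regression half of the comparison described above, the sketch below fits a log-linear count model with a single covariate by Newton-Raphson. It is not the authors' model, which used several environmental inputs; the rainfall and report values, the function name and the single-covariate setup are all assumptions.

```python
import math

def fit_poisson(x, y, iters=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood.

    Assumes x is not constant and y has a positive mean.
    """
    b0 = math.log(sum(y) / len(y))  # start at the intercept-only fit
    b1 = 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix entries.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical storm events: cumulative rainfall (cm) and number of flood reports.
rainfall = [1.0, 2.5, 4.0, 6.0, 8.5, 11.0]
reports = [1, 2, 5, 9, 30, 80]
b0_hat, b1_hat = fit_poisson(rainfall, reports)
predicted = [math.exp(b0_hat + b1_hat * r) for r in rainfall]
```

A Random Forest would replace the single exponential curve with an average over many regression trees, which is one plausible reason it handled the wide range of report counts better in the study.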
Genetic analyses of partial egg production in Japanese quail using multi-trait random regression models.

Science.gov (United States)

Karami, K; Zerehdaran, S; Barzanooni, B; Lotfi, E

2017-12-01

1. The aim of the present study was to estimate genetic parameters for average egg weight (EW) and egg number (EN) at different ages in Japanese quail using multi-trait random regression (MTRR) models. 2. A total of 8534 records from 900 quail, hatched between 2014 and 2015, were used in the study. Average weekly egg weights and egg numbers were measured from second until sixth week of egg production. 3. Nine random regression models were compared to identify the best order of the Legendre polynomials (LP). The most optimal model was identified by the Bayesian Information Criterion. A model with second order of LP for fixed effects, second order of LP for additive genetic effects and third order of LP for permanent environmental effects (MTRR23) was found to be the best. 4. According to the MTRR23 model, direct heritability for EW increased from 0.26 in the second week to 0.53 in the sixth week of egg production, whereas the ratio of permanent environment to phenotypic variance decreased from 0.48 to 0.1. Direct heritability for EN was low, whereas the ratio of permanent environment to phenotypic variance decreased from 0.57 to 0.15 during the production period. 5. For each trait, estimated genetic correlations among weeks of egg production were high (from 0.85 to 0.98). Genetic correlations between EW and EN were low and negative for the first two weeks, but they were low and positive for the rest of the egg production period. 6. In conclusion, random regression models can be used effectively for analysing egg production traits in Japanese quail. Response to selection for increased egg weight would be higher at older ages because of its higher heritability and such a breeding program would have no negative genetic impact on egg production.
3D statistical shape models incorporating 3D random forest regression voting for robust CT liver segmentation

Science.gov (United States)

Norajitra, Tobias; Meinzer, Hans-Peter; Maier-Hein, Klaus H.

2015-03-01

During image segmentation, 3D Statistical Shape Models (SSM) usually conduct a limited search for target landmarks within one-dimensional search profiles perpendicular to the model surface. In addition, landmark appearance is modeled only locally based on linear profiles and weak learners, altogether leading to segmentation errors from landmark ambiguities and limited search coverage. We present a new method for 3D SSM segmentation based on 3D Random Forest Regression Voting. For each surface landmark, a Random Regression Forest is trained that learns a 3D spatial displacement function between the according reference landmark and a set of surrounding sample points, based on an infinite set of non-local randomized 3D Haar-like features. Landmark search is then conducted omni-directionally within 3D search spaces, where voxelwise forest predictions on landmark position contribute to a common voting map which reflects the overall position estimate. Segmentation experiments were conducted on a set of 45 CT volumes of the human liver, of which 40 images were randomly chosen for training and 5 for testing. Without parameter optimization, using a simple candidate selection and a single resolution approach, excellent results were achieved, while faster convergence and better concavity segmentation were observed, altogether underlining the potential of our approach in terms of increased robustness from distinct landmark detection and from better search coverage.
Birth and weaning traits in crossbred cattle from Hereford, Angus, Norwegian Red, Swedish Red and White, Wagyu, and Friesian sires.

Science.gov (United States)

Casas, E; Thallman, R M; Cundiff, L V

2012-09-01

The objective of this study was to characterize breeds representing diverse biological types for birth and weaning traits in crossbred cattle (Bos taurus). Gestation length, calving difficulty, percentage of unassisted calving, percentage of perinatal survival, percentage of survival from birth to weaning, birth weight, weaning weight, BW at 205 d, and ADG were measured in 1,370 calves born and 1,285 calves weaned. Calves were obtained by mating Hereford, Angus, and MARC III (1/4 Hereford, 1/4 Angus, 1/4 Pinzgauer, and 1/4 Red Poll) mature cows to Hereford or Angus (British breeds), Norwegian Red, Swedish Red and White, Wagyu, and Friesian sires. Calves were born during the spring of 1997 and 1998. Sire breed was significant for gestation length, birth weight, BW at 205 d, and ADG (P Angus cows had the shortest (282 d). Offspring from MARC III cows were the heaviest at birth (39.4 kg) when compared with offspring from Hereford (38.2 kg) and Angus (38.6 kg) cows. Progeny from Angus cows were the heaviest at 205 d (235 kg) and grew faster (0.96 kg/d), whereas offspring from Hereford cows were the lightest at 205 d (219 kg) and were the slowest in growth (0.88 kg/d). Sex was significant for gestation length (P = 0.026), birth weight, BW at 205 d, and ADG (P < 0.001). Male calves had a longer gestation length (284 d) when compared with female calves (283 d). Males were heavier than females at birth and at 205 d, and grew faster. Sire breed effects can be optimized by selection and use of appropriate crossbreeding systems.
Genetic Parameters for Body condition score, Body weight, Milk yield and Fertility estimated using random regression models

NARCIS (Netherlands)

Berry, D.P.; Buckley, F.; Dillon, P.; Evans, R.D.; Rath, M.; Veerkamp, R.F.

2003-01-01

Genetic (co)variances between body condition score (BCS), body weight (BW), milk yield, and fertility were estimated using a random regression animal model extended to multivariate analysis. The data analyzed included 81,313 BCS observations, 91,937 BW observations, and 100,458 milk test-day yields
A Logistic Regression Model with a Hierarchical Random Error Term for Analyzing the Utilization of Public Transport

Directory of Open Access Journals (Sweden)

Chong Wei

2015-01-01

Logistic regression models have been widely used in previous studies to analyze public transport utilization. These studies have shown travel time to be an indispensable variable for such analysis and usually consider it to be a deterministic variable. This formulation does not allow us to capture travelers’ perception error regarding travel time, and recent studies have indicated that this error can have a significant effect on modal choice behavior. In this study, we propose a logistic regression model with a hierarchical random error term. The proposed model adds a new random error term for the travel time variable. This term structure enables us to investigate travelers’ perception error regarding travel time from a given choice behavior dataset. We also propose an extended model that allows constraining the sign of this error in the model. We develop two Gibbs samplers to estimate the basic hierarchical model and the extended model. The performance of the proposed models is examined using a well-known dataset.
An investigation into beef calf mortality on five high-altitude ranches that selected sires with low pulmonary arterial pressures for over 20 years.

Science.gov (United States)

Neary, Joseph M; Gould, Daniel H; Garry, Franklyn B; Knight, Anthony P; Dargatz, David A; Holt, Timothy N

2013-03-01

Producer reports from ranches over 2,438 meters in southwest Colorado suggest that the mortality of preweaned beef calves may be substantially higher than the national average despite the selection of low pulmonary pressure herd sires for over 20 years. Diagnostic investigations of this death loss problem have been limited due to the extensive mountainous terrain over which these calves are grazed with their dams. The objective of the current study was to determine the causes of calf mortality on 5 high-altitude ranches in Colorado that have been selectively breeding sires with low pulmonary pressure (branding (6 weeks of age) in the spring to weaning in the fall (7 months of age). Clinical signs were recorded, and blood samples were taken from sick calves. Postmortem examinations were performed, and select tissue samples were submitted for aerobic culture and/or histopathology. On the principal study ranch, 9.6% (59/612) of the calves that were branded in the spring either died or were presumed dead by weaning in the fall. In total, 28 necropsies were performed: 14 calves (50%) had lesions consistent with pulmonary hypertension and right-sided heart failure, and 14 calves (50%) died from bronchopneumonia. Remodeling of the pulmonary arterial system, indicative of pulmonary hypertension, was evident in the former and to varying degrees in the latter. There is a need to better characterize the additional risk factors that complicate pulmonary arterial pressure testing of herd sires as a strategy to control pulmonary hypertension.
Random regression models to estimate genetic parameters for milk production of Guzerat cows using orthogonal Legendre polynomials

Directory of Open Access Journals (Sweden)

Maria Gabriela Campolina Diniz Peixoto

2014-05-01

The objective of this work was to compare random regression models for the estimation of genetic parameters for Guzerat milk production, using orthogonal Legendre polynomials. Records (20,524) of test-day milk yield (TDMY) from 2,816 first-lactation Guzerat cows were used. TDMY grouped into 10 monthly classes were analyzed for additive genetic, permanent environmental and residual effects (random effects), whereas the contemporary group, calving age (linear and quadratic effects) and mean lactation curve were analyzed as fixed effects. Trajectories for the additive genetic and permanent environmental effects were modeled by means of a covariance function employing orthogonal Legendre polynomials ranging from the second to the fifth order. Residual variances were considered in one, four, six, or ten variance classes. The best model had six residual variance classes. The heritability estimates for the TDMY records varied from 0.19 to 0.32. The random regression model that used a second-order Legendre polynomial for the additive genetic effect and a fifth-order polynomial for the permanent environmental effect was adequate according to the main comparison criteria employed. The model with a second-order Legendre polynomial for the additive genetic effect and a fourth-order polynomial for the permanent environmental effect could also be employed in these analyses.
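As a rough illustration of how a covariance function built on Legendre polynomials turns coefficient (co)variances into a heritability at a given test day, the sketch below may help. It is not the authors' code: the function names are hypothetical, the time range and matrices in the usage note are invented, and "order" is taken here to mean polynomial degree, which the abstract does not specify.

```python
import math


def standardize(t, t_min, t_max):
    """Map a time point (e.g. days in milk) onto the interval [-1, 1]."""
    return -1.0 + 2.0 * (t - t_min) / (t_max - t_min)


def legendre_covariables(x, degree):
    """Normalized Legendre polynomials phi_0..phi_degree evaluated at x in [-1, 1]."""
    p = [1.0, x]  # unnormalized P_0 and P_1
    for n in range(1, degree):
        # Bonnet recursion: (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    # Normalization so that each phi_k has unit squared norm on [-1, 1]
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(degree + 1)]


def quadratic_form(phi, k_matrix):
    """phi' K phi: the variance at one time point implied by coefficient covariance K."""
    size = len(phi)
    return sum(phi[i] * k_matrix[i][j] * phi[j] for i in range(size) for j in range(size))


def heritability(t, t_min, t_max, k_additive, k_perm_env, residual_var):
    """Heritability at time t from (hypothetical) coefficient covariance matrices."""
    x = standardize(t, t_min, t_max)
    var_a = quadratic_form(legendre_covariables(x, len(k_additive) - 1), k_additive)
    var_pe = quadratic_form(legendre_covariables(x, len(k_perm_env) - 1), k_perm_env)
    return var_a / (var_a + var_pe + residual_var)
```

With invented inputs such as `heritability(155, 5, 305, [[2.0, 0.0], [0.0, 1.0]], [[3.0]], 5.0)`, the additive variance at mid-trajectory is 1.0, the permanent environmental variance 1.5, and the ratio is about 0.13. Using a separate residual variance per class, as in the abstract, would only change the `residual_var` argument from one test day to the next.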
Multi-fidelity Gaussian process regression for prediction of random fields

International Nuclear Information System (INIS)

Parussini, L.; Venturi, D.; Perdikaris, P.; Karniadakis, G.E.

2017-01-01

We propose a new multi-fidelity Gaussian process regression (GPR) approach for prediction of random fields based on observations of surrogate models or hierarchies of surrogate models. Our method builds upon recent work on recursive Bayesian techniques, in particular recursive co-kriging, and extends it to vector-valued fields and various types of covariances, including separable and non-separable ones. The framework we propose is general and can be used to perform uncertainty propagation and quantification in model-based simulations, multi-fidelity data fusion, and surrogate-based optimization. We demonstrate the effectiveness of the proposed recursive GPR techniques through various examples. Specifically, we study the stochastic Burgers equation and the stochastic Oberbeck–Boussinesq equations describing natural convection within a square enclosure. In both cases we find that the standard deviation of the Gaussian predictors as well as the absolute errors relative to benchmark stochastic solutions are very small, suggesting that the proposed multi-fidelity GPR approaches can yield highly accurate results.
Multi-fidelity Gaussian process regression for prediction of random fields

Energy Technology Data Exchange (ETDEWEB)

Parussini, L. [Department of Engineering and Architecture, University of Trieste (Italy); Venturi, D., E-mail: venturi@ucsc.edu [Department of Applied Mathematics and Statistics, University of California Santa Cruz (United States); Perdikaris, P. [Department of Mechanical Engineering, Massachusetts Institute of Technology (United States); Karniadakis, G.E. [Division of Applied Mathematics, Brown University (United States)

2017-05-01

We propose a new multi-fidelity Gaussian process regression (GPR) approach for prediction of random fields based on observations of surrogate models or hierarchies of surrogate models. Our method builds upon recent work on recursive Bayesian techniques, in particular recursive co-kriging, and extends it to vector-valued fields and various types of covariances, including separable and non-separable ones. The framework we propose is general and can be used to perform uncertainty propagation and quantification in model-based simulations, multi-fidelity data fusion, and surrogate-based optimization. We demonstrate the effectiveness of the proposed recursive GPR techniques through various examples. Specifically, we study the stochastic Burgers equation and the stochastic Oberbeck–Boussinesq equations describing natural convection within a square enclosure. In both cases we find that the standard deviation of the Gaussian predictors as well as the absolute errors relative to benchmark stochastic solutions are very small, suggesting that the proposed multi-fidelity GPR approaches can yield highly accurate results.
Predicting attention-deficit/hyperactivity disorder severity from psychosocial stress and stress-response genes : A random forest regression approach

NARCIS (Netherlands)

Van Der Meer, D.; Hoekstra, P. J.; Van Donkelaar, M.; Bralten, J.; Oosterlaan, J.; Heslenfeld, D.; Faraone, S. V.; Franke, B.; Buitelaar, J. K.; Hartman, C. A.

2017-01-01

Identifying genetic variants contributing to attention-deficit/hyperactivity disorder (ADHD) is complicated by the involvement of numerous common genetic variants with small effects, interacting with each other as well as with environmental factors, such as stress exposure. Random forest regression

Predicting attention-deficit/hyperactivity disorder severity from psychosocial stress and stress-response genes : a random forest regression approach

NARCIS (Netherlands)

van der Meer, D.; Hoekstra, P. J.; van Donkelaar, Marjolein M. J.; Bralten, Janita; Oosterlaan, J; Heslenfeld, Dirk J.; Faraone, S. V.; Franke, B.; Buitelaar, J. K.; Hartman, C. A.

2017-01-01

Identifying genetic variants contributing to attention-deficit/hyperactivity disorder (ADHD) is complicated by the involvement of numerous common genetic variants with small effects, interacting with each other as well as with environmental factors, such as stress exposure. Random forest regression
Appropriate assessment of neighborhood effects on individual health: integrating random and fixed effects in multilevel logistic regression

DEFF Research Database (Denmark)

Larsen, Klaus; Merlo, Juan

2005-01-01

The logistic regression model is frequently used in epidemiologic studies, yielding odds ratio or relative risk interpretations. Inspired by the theory of linear normal models, the logistic regression model has been extended to allow for correlated responses by introducing random effects. However......, the model does not inherit the interpretational features of the normal model. In this paper, the authors argue that the existing measures are unsatisfactory (and some of them are even improper) when quantifying results from multilevel logistic regression analyses. The authors suggest a measure...... of heterogeneity, the median odds ratio, that quantifies cluster heterogeneity and facilitates a direct comparison between covariate effects and the magnitude of heterogeneity in terms of well-known odds ratios. Quantifying cluster-level covariates in a meaningful way is a challenge in multilevel logistic...
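The abstract names the median odds ratio (MOR) without giving its formula. For a random-intercept logistic model with cluster variance sigma^2 on the log-odds scale, the MOR is commonly computed as exp(sqrt(2 * sigma^2) * z), where z is the 75th percentile of the standard normal distribution. The sketch below assumes that definition; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def median_odds_ratio(cluster_variance):
    """MOR for a random-intercept logistic model.

    cluster_variance is the between-cluster variance of the random
    intercept on the log-odds scale (an assumed input, not a value
    taken from the abstract).
    """
    z75 = NormalDist().inv_cdf(0.75)  # about 0.6745
    return math.exp(math.sqrt(2.0 * cluster_variance) * z75)
```

A cluster variance of 0 gives an MOR of 1 (no heterogeneity), and a variance of 1 gives roughly 2.6, which can be read on the same scale as the odds ratios of the covariates.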
An Introduction to Recursive Partitioning: Rationale, Application, and Characteristics of Classification and Regression Trees, Bagging, and Random Forests

Science.gov (United States)

Strobl, Carolin; Malley, James; Tutz, Gerhard

2009-01-01

Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and…
ESTIMATION OF GENETIC PARAMETERS IN TROPICARNE CATTLE WITH RANDOM REGRESSION MODELS USING B-SPLINES

Directory of Open Access Journals (Sweden)

Joel Domínguez Viveros

2015-04-01

The objectives were to estimate variance components and direct (h2) and maternal (m2) heritability for growth of Tropicarne cattle, based on a random regression model using B-splines to model the random effects. Information from 12 890 monthly weighings of 1787 calves, from birth to 24 months of age, was analyzed. The pedigree included 2504 animals. The random effects in the model were the genetic and permanent environmental effects (direct and maternal), of cubic order, and the residual. The fixed effects included contemporary group (year–season of weighing), sex, and the age of the cow as a covariate (linear and quadratic). The B-splines were defined with four knots across the growth period analyzed. Analyses were performed with the software Wombat. The phenotypic and residual variances behaved similarly: they showed a positive trend from birth to 6 months and from 13 to 18 months, a negative trend from 7 to 12 months of age, and remained constant after 19 months. The m2 estimates were low and near zero, with an average of 0.06 and a range of 0.04 to 0.11; the h2 estimates were also close to zero, with an average of 0.10 and a range of 0.03 to 0.23.
Genetic correlations among body condition score, yield and fertility in multiparous cows using random regression models

OpenAIRE

Bastin, Catherine; Gillon, Alain; Massart, Xavier; Bertozzi, Carlo; Vanderick, Sylvie; Gengler, Nicolas

2010-01-01

Genetic correlations between body condition score (BCS) in lactation 1 to 3 and four economically important traits (days open, 305-days milk, fat, and protein yields recorded in the first 3 lactations) were estimated on about 12,500 Walloon Holstein cows using 4-trait random regression models. Results indicated moderate favorable genetic correlations between BCS and days open (from -0.46 to -0.62) and suggested the use of BCS for indirect selection on fertility. However, unfavorable genetic c...
Kindergarten is being made more child-centered / Punamäe, Anita; Sarap, Anu; Peterson, Ester; Kala, Sire; Laanemäe-Räim, Consuelo; interviewed by Kaile Kabun

Index Scriptorium Estoniae

2007-01-01

The round-table participants are Anita Punamäe, chief specialist of the county government's education and culture department; Anu Sarap, head of Parksepa kindergarten; Ester Peterson, deputy head for teaching and educational work at Päkapiku kindergarten; Sire Kala, teacher at Sõlekese kindergarten; and Consuelo Laanemäe-Räim, a parent and member of the board of trustees of Punamütsike kindergarten.
Association of bovine leptin polymorphisms with energy output and energy storage traits in progeny tested Holstein-Friesian dairy cattle sires

Directory of Open Access Journals (Sweden)

Waters Sinead M

2010-07-01

Background: Leptin modulates appetite, energy expenditure and the reproductive axis by signalling, via its receptor, the status of body energy stores to the brain. The present study aimed to quantify the associations between 10 novel and known single nucleotide polymorphisms in genes coding for leptin and leptin receptor with performance traits in 848 Holstein-Friesian sires, estimated from performance of up to 43,117 daughter-parity records per sire. Results: All single nucleotide polymorphisms were segregating in this sample population and none deviated (P > 0.05) from Hardy-Weinberg equilibrium. Complete linkage disequilibrium existed between the novel polymorphism LEP-1609 and the previously identified polymorphisms LEP-1457 and LEP-580. LEP-2470 associated (P ... Conclusions: Several leptin polymorphisms (LEP-2470, LEP-1238, LEP-963, Y7F and R25C) associated with the energetically expensive process of lactogenesis. Only SNP Y7F associated with energy storage. Associations were also observed between leptin polymorphisms and calving difficulty, gestation length and calf perinatal mortality. The lack of an association between the leptin variants investigated and calving interval in this large data set would question the potential importance of these leptin variants, or indeed leptin, in selection for improved fertility in the Holstein-Friesian dairy cow.
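The abstract reports that no polymorphism deviated from Hardy-Weinberg equilibrium but does not say which test was used. One common choice is a chi-square goodness-of-fit test with one degree of freedom on the three genotype counts of a bi-allelic SNP; the sketch below assumes that test, and the function name and counts in the usage note are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for one bi-allelic SNP.

    Returns (chi2, p_value), with the p-value taken from a chi-square
    distribution with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Classes with zero expectation (a monomorphic SNP) are skipped
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom the survival function reduces to erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, counts of 25, 50 and 25 match the expected proportions exactly (chi-square 0), while counts of 30, 40 and 30 give a chi-square of 4 and a p-value just under 0.05.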
Effects of time of weaning, supplement, and sire breed of calf during the fall grazing period on cow and calf performance.

Science.gov (United States)

Short, R E; Grings, E E; MacNeil, M D; Heitschmidt, R K; Haferkamp, M R; Adams, D C

1996-07-01

A 4-yr experiment was conducted to determine effects of protein supplementation, age at weaning, and calf sire breed on cow and calf performance during fall grazing. Each year 48 pregnant, crossbred cows nursing steer calves (mean calving date = April 8) were assigned to a 2 x 2 x 2 factorial experiment replicated in three native range pastures. Treatment factors were: 1) no supplement (NS) or an individually fed supplement (S, 3 kg of a 34% protein supplement fed to cows every 3rd d); 2) calves weaned at the beginning (W, mid to late September) or at the end (NW, mid to late December) of the trial each year; or 3) calves sired by Hereford or Charolais bulls. Data were adjusted for cow size (initial hip height and initial and final weights and condition scores) by analyses of covariance using principal component coefficients as covariates. Change in cow weight and condition score were increased by S and W (P Forage intake was decreased (P intake (forage+supplement) was not affected by S but was decreased by W (P effects of treatments were observed the next spring in cow weight, condition score, and birth weight (NW decreased birth weight by 2 kg, P effects by the next fall on weaning weights or pregnancy rates. Milk yield decreased during the experimental period, and S maintained higher milk production in late lactation (P Calf ADG was increased by S and Charolais sires (P effects of feeding a 34% protein supplement to cows were to increase calf gains and improve persistency of lactation and efficiency; 2) delaying weaning decreased cow weight and condition score; 3) effects of weaning age and protein supplementation were highly dependent on forage and environmental conditions in any given year; and 4) whatever effects existed in a given year did not carry over to effects on next year's production as measured by pregnancy rates and weaning weights.
Degummed crude canola oil, sire breed and gender effects on intramuscular long-chain omega-3 fatty acid properties of raw and cooked lamb meat

Directory of Open Access Journals (Sweden)

Aaron Ross Flakemore

2017-08-01

Background: Omega-3 long-chain (≥C20) polyunsaturated fatty acids (ω3 LC-PUFA) confer important attributes to health-conscious meat consumers due to the significant role they play in brain development, prevention of coronary heart disease, obesity and hypertension. In this study, the ω3 LC-PUFA content of raw and cooked Longissimus thoracis et lumborum (LTL) muscle from genetically divergent Australian prime lambs supplemented with dietary degummed crude canola oil (DCCO) was evaluated. Methods: Samples of LTL muscle were sourced from 24 first cross ewe and wether lambs sired by Dorset, White Suffolk and Merino rams joined to Merino dams that were assigned to supplemental regimes of DCCO: a control diet at 0 mL/kg DM of DCCO (DCCOC); 25 mL/kg DM of DCCO (DCCOM) and 50 mL/kg DCCO (DCCOH). Lambs were individually housed and offered 1 kg/day/head for 42 days before being slaughtered. Samples for cooked analysis were prepared to a core temperature of 70 °C using conductive dry-heat. Results: Within raw meats: DCCOH supplemented lambs had significantly (P < 0.05) higher concentrations of eicosapentaenoic (EPA, 20:5ω3) and EPA + docosahexaenoic (DHA, 22:6ω3) acids than those supplemented with DCCOM or DCCOC; Dorset sired lambs contained significantly (P < 0.05) more EPA and EPA + DHA than other sire breeds; diet and sire breed interactions were significant (P < 0.05) in affecting EPA and EPA + DHA concentrations. In cooked meat, ω3 LC-PUFA concentrations in DCCOM (32 mg/100 g), DCCOH (38 mg/100 g), Dorset (36 mg/100 g), White Suffolk (32 mg/100 g), ewes (32 mg/100 g) and wethers (33 mg/100 g) all exceeded the minimum content of 30 mg/100 g of edible cooked portion of EPA + DHA for Australian defined ‘source’ level ω3 LC-PUFA classification. Conclusion: These results indicate that combinations of dietary degummed crude canola oil, sheep genetics and culinary preparation method can be used as
Lactation persistency for Holstein cows raised in the State of Rio Grande do Sul using a random regression model

Directory of Open Access Journals (Sweden)

Cristian Kelen Pinto Dorneles

2009-08-01

A total of 21,702 test-day milk yield records from 2,429 primiparous Holstein cows, daughters of 233 sires, collected in 33 herds in the State of Rio Grande do Sul between 1992 and 2003, were used to estimate genetic parameters for three measures of persistency (PS1, PS2 and PS3) and for milk yield up to 305 days of lactation (P305). The random regression models fitted to test-day records between the sixth and the 300th day of lactation included the effect of herd-year-month of test, the age of the cow at calving, and the parameters of a fourth-order Legendre polynomial to model the mean milk yield curve of the population, as well as the parameters of the same polynomial to model the direct additive genetic and permanent environmental random effects. The heritability estimates obtained were 0.05, 0.08 and 0.19 for PS1, PS2 and PS3, respectively, and 0.25 for P305, suggesting the possibility of genetic gain through selection for PS3 and P305. The genetic correlations between the three persistency measures and P305 ranged from -0.05 to 0.07, indicating that persistency and yield are traits determined by different groups of genes. Consequently, selection for P305, as generally practiced, does not promote genetic progress in persistency. A total of 21,702 test-day milk yields from 2,429 first-parity Holstein cows, daughters of 2,031 dams and 233 sires, distributed over 33 herds in the state of Rio Grande do Sul from 1992 to 2003, were used. Genetic parameters for three measures of lactation persistency (PS1, PS2 and PS3) and for milk production to 305 days (P305) were evaluated. A random regression model adjusted by fourth order Legendre polynomial was used. The random regression model adjusted to test day between the sixth and the 305th lactation day included the herd-year-season of the test day, the age of the cow at the parturition effects and the
Regression modeling methods, theory, and computation with SAS

CERN Document Server

Panik, Michael

2009-01-01

Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs. The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,
Genetic analysis of partial egg production records in Japanese quail using random regression models.

Science.gov (United States)

Abou Khadiga, G; Mahmoud, B Y F; Farahat, G S; Emam, A M; El-Full, E A

2017-08-01

The main objectives of this study were to detect the most appropriate random regression model (RRM) to fit the data of monthly egg production in 2 lines (selected and control) of Japanese quail and to test the consistency of different criteria of model choice. Data from 1,200 female Japanese quails for the first 5 months of egg production from 4 consecutive generations of an egg line selected for egg production in the first month (EP1) were analyzed. Eight RRMs with different orders of Legendre polynomials were compared to determine the proper model for analysis. All criteria of model choice suggested that the adequate model included the second-order Legendre polynomials for fixed effects, and the third-order for additive genetic effects and permanent environmental effects. Predictive ability of the best model was the highest among all models (ρ = 0.987). According to the best model fitted to the data, estimates of heritability were relatively low to moderate (0.10 to 0.17) and showed a descending pattern from the first to the fifth month of production. A similar pattern was observed for permanent environmental effects, with greater estimates in the first (0.36) and second (0.23) months of production than heritability estimates. Genetic correlations between separate production periods were higher (0.18 to 0.93) than their phenotypic counterparts (0.15 to 0.87). The superiority of the selected line over the control was observed through significant (P egg production in earlier ages (first and second months) than later ones. A methodology based on random regression animal models can be recommended for genetic evaluation of egg production in Japanese quail. © 2017 Poultry Science Association Inc.
On marker-based parentage verification via non-linear optimization.

Science.gov (United States)

Boerner, Vinzent

2017-06-15

Parentage verification by molecular markers is mainly based on short tandem repeat markers. Single nucleotide polymorphisms (SNPs) as bi-allelic markers have become the markers of choice for genotyping projects. Thus, the subsequent step is to use SNP genotypes for parentage verification as well. Recent developments of algorithms such as evaluating opposing homozygous SNP genotypes have drawbacks, for example the inability of rejecting all animals of a sample of potential parents. This paper describes an algorithm for parentage verification by constrained regression which overcomes the latter limitation and proves to be very fast and accurate even when the number of SNPs is as low as 50. The algorithm was tested on a sample of 14,816 animals with 50, 100 and 500 SNP genotypes randomly selected from 40k genotypes. The samples of putative parents of these animals contained either five random animals, or four random animals and the true sire. Parentage assignment was performed by ranking of regression coefficients, or by setting a minimum threshold for regression coefficients. The assignment quality was evaluated by the power of assignment (P[Formula: see text]) and the power of exclusion (P[Formula: see text]). If the sample of putative parents contained the true sire and parentage was assigned by coefficient ranking, P[Formula: see text] and P[Formula: see text] were both higher than 0.99 for the 500 and 100 SNP genotypes, and higher than 0.98 for the 50 SNP genotypes. When parentage was assigned by a coefficient threshold, P[Formula: see text] was higher than 0.99 regardless of the number of SNPs, but P[Formula: see text] decreased from 0.99 (500 SNPs) to 0.97 (100 SNPs) and 0.92 (50 SNPs). If the sample of putative parents did not contain the true sire and parentage was rejected using a coefficient threshold, the algorithm achieved a P[Formula: see text] of 1 (500 SNPs), 0.99 (100 SNPs) and 0.97 (50 SNPs). The algorithm described here is easy to implement
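The abstract contrasts its constrained regression algorithm with the older approach of evaluating opposing homozygous SNP genotypes. The sketch below illustrates only that baseline, not the paper's algorithm. It assumes genotypes coded as 0, 1 or 2 copies of one allele, with `None` for a missing call; the function and animal names are hypothetical.

```python
def opposing_homozygotes(offspring, candidate):
    """Count loci where offspring and candidate are opposite homozygotes (0 vs 2).

    Such loci are inconsistent with a parent-offspring relationship unless
    there is a genotyping error. Loci with a missing call (None) never match.
    """
    return sum(1 for o, c in zip(offspring, candidate) if {o, c} == {0, 2})


def rank_candidates(offspring, candidates):
    """Sort candidate parents by their number of opposing homozygous loci."""
    counts = {name: opposing_homozygotes(offspring, genotypes)
              for name, genotypes in candidates.items()}
    return sorted(counts.items(), key=lambda item: item[1])
```

Called with an offspring `[0, 2, 1, 0, 2]` and candidates `{"sire_a": [0, 2, 1, 1, 2], "sire_b": [2, 0, 1, 0, 0]}`, the ranking puts `sire_a` first with 0 conflicts and `sire_b` second with 3. Ranking alone always returns a best candidate, which is the drawback the abstract points out: rejecting every candidate requires a threshold on the count.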
Testing homogeneity in Weibull-regression models.

Science.gov (United States)

Bolfarine, Heleno; Valença, Dione M

2005-10-01

In survival studies with families or geographical units, it may be of interest to test whether such groups are homogeneous for given explanatory variables. In this paper we consider score-type tests for group homogeneity based on a mixing model in which the group effect is modelled as a random variable. As opposed to hazard-based frailty models, this model presents survival times that, conditioned on the random effect, have an accelerated failure time representation. The test statistic requires only estimation of the conventional regression model without the random effect and does not require specifying the distribution of the random effect. The tests are derived for a Weibull regression model and, in the uncensored situation, a closed form is obtained for the test statistic. A simulation study is used to compare the power of the tests. The proposed tests are applied to real data sets with censored data.
Random regression models with different residual variance structures for describing litter size in swine

Directory of Open Access Journals (Sweden)

Aderbal Cavalcante-Neto

2011-12-01

The objective of this work was to compare random regression models with different residual variance structures, so as to obtain the best modeling for the trait litter size at birth (LSB) in swine. A total of 1,701 LSB records were analyzed by means of a single-trait random regression animal model. The fixed and random regressions were represented by continuous functions of parity order, fitted by orthogonal Legendre polynomials of order 3. To determine the best modeling of the residual variance, heterogeneity of variance was considered through 1 to 7 residual variance classes. The general analysis model included contemporary group as a fixed effect; the fixed regression coefficients to model the mean trajectory of the population; the random regression coefficients for the direct additive genetic, common litter and animal permanent environmental effects; and the random residual effect. The likelihood ratio test, the Akaike information criterion and the Schwarz Bayesian information criterion indicated that the model assuming homogeneity of variance provided the best fit to the data. The heritabilities obtained were close to zero (0.002 to 0.006). The permanent environmental effect increased from the 1st (0.06) to the 5th (0.28) parity, but decreased from that point to the 7th parity (0.18). The common litter effect showed low values (0.01 to 0.02). Assuming homogeneity of residual variance was more adequate for modeling the variances associated with litter size at birth in this data set.
Genetic parameters for body condition score, body weight, milk yield, and fertility estimated using random regression models.

Science.gov (United States)

Berry, D P; Buckley, F; Dillon, P; Evans, R D; Rath, M; Veerkamp, R F

2003-11-01

Genetic (co)variances between body condition score (BCS), body weight (BW), milk yield, and fertility were estimated using a random regression animal model extended to multivariate analysis. The data analyzed included 81,313 BCS observations, 91,937 BW observations, and 100,458 milk test-day yields from 8725 multiparous Holstein-Friesian cows. A cubic random regression was sufficient to model the changing genetic variances for BCS, BW, and milk across different days in milk. The genetic correlations between BCS and fertility changed little over the lactation; genetic correlations between BCS and interval to first service and between BCS and pregnancy rate to first service varied from -0.47 to -0.31, and from 0.15 to 0.38, respectively. This suggests that maximum genetic gain in fertility from indirect selection on BCS should be based on measurements taken in midlactation when the genetic variance for BCS is largest. Selection for increased BW resulted in shorter intervals to first service, but more services and poorer pregnancy rates; genetic correlations between BW and pregnancy rate to first service varied from -0.52 to -0.45. Genetic selection for higher lactation milk yield alone through selection on increased milk yield in early lactation is likely to have a more deleterious effect on genetic merit for fertility than selection on higher milk yield in late lactation.
Late foetal life nutrient restriction and sire genotype affect postnatal performance of lambs

DEFF Research Database (Denmark)

Tygesen, Malin Plumhoff; Tauson, Anne-Helen; Blache, D.

2008-01-01

This experiment investigates the effects of maternal nutrient restriction in late gestation on the offsprings' postnatal metabolism and performance. Forty purebred Shropshire twin lambs born to ewes fed either a high-nutrition diet (H) (according to standard) or a low-nutrition (L) diet (50% during...... the last 6 weeks of gestation) were studied from birth until 145 days of age. In each feeding group, two different sires were represented, ‘growth' (G) and ‘meat' (M), having different breeding indices for the lean : fat ratio. Post partum all ewes were fed the same diet. Lambs born to L-ewes had...... significantly lower birth weights and pre-weaning growth rates. This was especially pronounced in L-lambs born to the M-ram, which also had markedly lower pre-weaning glucose concentrations than the other three groups of lambs. L-lambs converted milk to live weight with an increased efficiency in week 3 of life...
A logistic regression estimating function for spatial Gibbs point processes

DEFF Research Database (Denmark)

Baddeley, Adrian; Coeurjolly, Jean-François; Rubak, Ege

We propose a computationally efficient logistic regression estimating function for spatial Gibbs point processes. The sample points for the logistic regression consist of the observed point pattern together with a random pattern of dummy points. The estimating function is closely related to the p...
Genetic determination of mortality rate in Danish dairy cows

DEFF Research Database (Denmark)

Maia, Rafael Pimentel; Ask, Birgitte; Madsen, Per

2014-01-01

: a sire random component with pedigree representing the sire genetic effects and a herd-year-season component. Moreover, the level of heterozygosity and the sire breed proportions were included in the models as covariates in order to account for potential non-additive genetic effects due to the massive...... introduction of genetic material from other populations. The correlations between the sire components for death rate and slaughter rate were negative and small for the 3 populations, suggesting the existence of specific genetic mechanisms for each culling reason and common concurrent genetic mechanisms...
A role of the sires and dams in the hermaphrodite phenomenon linked with polled Damascus goat breed

Directory of Open Access Journals (Sweden)

M. Roukbi

2013-12-01

Selection for the polled character as a preferred trait in the Damascus breed spreads individuals homozygous for the polled gene and polled intersexes, and consequently causes further economic losses in this breed. It is therefore important to study the genetic origin of intersexuality linked with hornlessness, the role of sires and dams in its development, and other effects on the excess of intersexes in the caprine herd. To this end, data on 52 intersexes resulting from mating 19 polled bucks with 12 horned and 37 polled goats at the Humeimeh research station, belonging to the General Commission for Agricultural Scientific Research, were collected and analyzed by means of Chi-Square (SAS, 1998). The results showed a significant effect of sires (P≤0.007) and a nonsignificant effect of dams (P≥0.05) on the development of polled intersexes in the Damascus goat breed. Sires produced 10, 5, 4, 3, 2 and 1 intersex kids in 1, 2, 2, 1, 5 and 8 cases, respectively, whereas dams produced only 2 and 1 intersex kids in 3 and 46 cases, respectively. Sex of the kid, kidding type and the horned character of the goat all had highly significant effects (P≤0.001): intersex cases arose from single births, twin births (co-twin to a male or co-twin to a female) and triplet births (littermate to a male and a female, or to two males) 17, 18, 14, 2 and 1 times, respectively. Single, twin and triplet births accounted for 17, 32 and 3 cases, respectively, and horned and polled goats for 14 and 38 cases, respectively. It was concluded that the genetics of hornlessness and multiple births play an important role in the development of polled intersexes in the Damascus goat breed.

Application of single-step genomic best linear unbiased prediction with a multiple-lactation random regression test-day model for Japanese Holsteins.

Science.gov (United States)

Baba, Toshimi; Gotoh, Yusaku; Yamaguchi, Satoshi; Nakagawa, Satoshi; Abe, Hayato; Masuda, Yutaka; Kawahara, Takayoshi

2017-08-01

This study aimed to evaluate the validation reliability of single-step genomic best linear unbiased prediction (ssGBLUP) with a multiple-lactation random regression test-day model and to investigate the effect of adding genotyped cows on the reliability. Two data sets of test-day records from the first three lactations were used: full data from February 1975 to December 2015 (60 850 534 records from 2 853 810 cows) and reduced data cut off in 2011 (53 091 066 records from 2 502 307 cows). We used marker genotypes of 4480 bulls and 608 cows. Genomic enhanced breeding values (GEBV) of 305-day milk yield in all the lactations were estimated for at least 535 young bulls using two marker data sets: bull genotypes only and both bull and cow genotypes. The realized reliability (R²) from linear regression analysis was used as an indicator of validation reliability. Using only genotyped bulls, R² ranged from 0.41 to 0.46 and was always higher than that of parent averages. Very similar R² values were observed when genotyped cows were added. Application of ssGBLUP to a multiple-lactation random regression model is feasible, and adding a limited number of genotyped cows has no significant effect on the reliability of GEBV for genotyped bulls. © 2016 Japanese Society of Animal Science.
New machine learning tools for predictive vegetation mapping after climate change: Bagging and Random Forest perform better than Regression Tree Analysis

Science.gov (United States)

L.R. Iverson; A.M. Prasad; A. Liaw

2004-01-01

More and better machine learning tools are becoming available for landscape ecologists to aid in understanding species-environment relationships and to map probable species occurrence now and potentially into the future. To that end, we evaluated three statistical models: Regression Tree Analysis (RTA), Bagging Trees (BT) and Random Forest (RF) for their utility in...
Influence of pollen transport dynamics on sire profiles and multiple paternity in flowering plants.

Directory of Open Access Journals (Sweden)

Randall J Mitchell

Full Text Available In many flowering plants individual fruits contain a mixture of half- and full- siblings, reflecting pollination by several fathers. To better understand the mechanisms generating multiple paternity within fruits we present a theoretical framework linking pollen carryover with patterns of pollinator movement. This 'sire profile' model predicts that species with more extensive pollen carryover will have a greater number of mates. It also predicts that flowers on large displays, which are often probed consecutively during a single pollinator visitation sequence, will have a lower effective number of mates. We compared these predictions with observed values for bumble bee-pollinated Mimulus ringens, which has restricted carryover, and hummingbird-pollinated Ipomopsis aggregata, which has extensive carryover. The model correctly predicted that the effective number of mates is much higher in the species with more extensive carryover. This work extends our knowledge of plant mating systems by highlighting mechanisms influencing the genetic composition of sibships.
Mixed-effects regression models in linguistics

CERN Document Server

Heylen, Kris; Geeraerts, Dirk

2018-01-01

When data consist of grouped observations or clusters, and there is a risk that measurements within the same group are not independent, group-specific random effects can be added to a regression model in order to account for such within-group associations. Regression models that contain such group-specific random effects are called mixed-effects regression models, or simply mixed models. Mixed models are a versatile tool that can handle both balanced and unbalanced datasets and that can also be applied when several layers of grouping are present in the data; these layers can either be nested or crossed. In linguistics, as in many other fields, the use of mixed models has gained ground rapidly over the last decade. This methodological evolution enables us to build more sophisticated and arguably more realistic models, but, due to its technical complexity, also introduces new challenges. This volume brings together a number of promising new evolutions in the use of mixed models in linguistics, but also addres...
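The idea of a group-specific random effect can be illustrated with a small sketch. It is not taken from the volume: it uses a one-way ANOVA (method-of-moments) estimator for a balanced design rather than the likelihood-based fitting that mixed-model software normally performs, and the group names and values are hypothetical. The intraclass correlation it returns is the share of total variance attributable to the grouping, which is the within-group dependence a random intercept is meant to absorb.

```python
def variance_components(groups):
    """Method-of-moments (one-way ANOVA) variance estimates, balanced design only."""
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equally sized groups")
    n = sizes.pop()          # observations per group
    k = len(groups)          # number of groups
    means = {g: sum(v) / n for g, v in groups.items()}
    grand = sum(means.values()) / k
    ms_between = n * sum((m - grand) ** 2 for m in means.values()) / (k - 1)
    ms_within = sum(
        (y - means[g]) ** 2 for g, v in groups.items() for y in v
    ) / (k * (n - 1))
    # Between-group variance; truncated at zero if the estimate is negative.
    between = max(0.0, (ms_between - ms_within) / n)
    return between, ms_within


def intraclass_correlation(groups):
    """Share of total variance explained by group membership."""
    between, within = variance_components(groups)
    return between / (between + within)


# Hypothetical measurements clustered by speaker.
groups = {
    "speaker_a": [1.0, 2.0, 3.0],
    "speaker_b": [4.0, 5.0, 6.0],
    "speaker_c": [7.0, 8.0, 9.0],
}
print(round(intraclass_correlation(groups), 3))
```

A value close to 1 means observations within a group are strongly alike, so treating them as independent in an ordinary regression would overstate the information in the data.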
Increased progesterone production in cumulus-oocyte complexes of female mice sired by males with the Y-chromosome long arm deletion and its potential influence on fertilization efficiency.

Science.gov (United States)

Kotarska, Katarzyna; Galas, Jerzy; Przybyło, Małgorzata; Bilińska, Barbara; Styrna, Józefa

2015-02-01

It was revealed previously that B10.BR(Y(del)) females sired by males with the Y-chromosome long arm deletion differ from genetically identical B10.BR females sired by males with the intact Y chromosome. This is interpreted as a result of different epigenetic information which females of both groups inherit from their fathers. In the following study, we show that cumulus-oocyte complexes ovulated by B10.BR(Y(del)) females synthesize increased amounts of progesterone, which is an important sperm stimulator. Because their extracellular matrix is excessively firm, the increased progesterone secretion presumably belongs to the factors that compensate for this feature, enabling unchanged fertilization ratios. The described compensatory mechanism can act only on sperm of high quality, presenting proper receptors. Indeed, a low proportion of sperm of Y(del) males that poorly fertilize B10.BR(Y(del)) oocytes demonstrates positive staining of membrane progesterone receptors. This proportion is significantly higher for sperm of control males that fertilize B10.BR(Y(del)) and B10.BR oocytes with the same efficiency. © The Author(s) 2014.
Microbiome Data Accurately Predicts the Postmortem Interval Using Random Forest Regression Models

Directory of Open Access Journals (Sweden)

Aeriel Belk

2018-02-01

Full Text Available Death investigations often include an effort to establish the postmortem interval (PMI) in cases in which the time of death is uncertain. The postmortem interval can lead to the identification of the deceased and the validation of witness statements and suspect alibis. Recent research has demonstrated that microbes provide an accurate clock that starts at death and relies on ecological change in the microbial communities that normally inhabit a body and its surrounding environment. Here, we explore how to build the most robust Random Forest regression models for prediction of PMI by testing models built on different sample types (gravesoil, skin of the torso, skin of the head), gene markers (16S ribosomal RNA (rRNA), 18S rRNA, internal transcribed spacer regions (ITS)), and taxonomic levels (sequence variants, species, genus, etc.). We also tested whether particular suites of indicator microbes were informative across different datasets. Generally, results indicate that the most accurate models for predicting PMI were built using gravesoil and skin data using the 16S rRNA genetic marker at the taxonomic level of phyla. Additionally, several phyla consistently contributed highly to model accuracy and may be candidate indicators of PMI.
Estimation of genotype X environment interactions, in a grass-based system, for milk yield, body condition score, and body weight using random regression models

NARCIS (Netherlands)

Berry, D.P.; Buckley, F.; Dillon, P.; Evans, R.D.; Rath, M.; Veerkamp, R.F.

2003-01-01

(Co)variance components for milk yield, body condition score (BCS), body weight (BW), BCS change and BW change over different herd-year mean milk yields (HMY) and nutritional environments (concentrate feeding level, grazing severity and silage quality) were estimated using a random regression model.
Robust linear registration of CT images using random regression forests

Science.gov (United States)

Konukoglu, Ender; Criminisi, Antonio; Pathak, Sayan; Robertson, Duncan; White, Steve; Haynor, David; Siddiqui, Khan

2011-03-01

Global linear registration is a necessary first step for many different tasks in medical image analysis. Comparing longitudinal studies [1], cross-modality fusion [2], and many other applications depend heavily on the success of the automatic registration. The robustness and efficiency of this step is crucial as it affects all subsequent operations. Most common techniques cast the linear registration problem as the minimization of a global energy function based on the image intensities. Although these algorithms have proved useful, their robustness in fully automated scenarios is still an open question. In fact, the optimization step often gets caught in local minima yielding unsatisfactory results. Recent algorithms constrain the space of registration parameters by exploiting implicit or explicit organ segmentations, thus increasing robustness [4,5]. In this work we propose a novel robust algorithm for automatic global linear image registration. Our method uses random regression forests to estimate posterior probability distributions for the locations of anatomical structures, represented as axis-aligned bounding boxes [6]. These posterior distributions are later integrated in a global linear registration algorithm. The biggest advantage of our algorithm is that it does not require pre-defined segmentations or regions. Yet it yields robust registration results. We compare the robustness of our algorithm with that of the state-of-the-art Elastix toolbox [7]. Validation is performed via 1464 pair-wise registrations in a database of very diverse 3D CT images. We show that our method decreases the "failure" rate of the global linear registration from 12.5% (Elastix) to only 1.9%.
Estimating the Performance of Random Forest versus Multiple Regression for Predicting Prices of the Apartments

Directory of Open Access Journals (Sweden)

Marjan Čeh

2018-05-01

Full Text Available The goal of this study is to analyse the predictive performance of the random forest machine learning technique in comparison to commonly used hedonic models based on multiple regression for the prediction of apartment prices. A data set that includes 7407 records of apartment transactions referring to real estate sales from 2008–2013 in the city of Ljubljana, the capital of Slovenia, was used in order to test and compare the predictive performances of both models. Apparent challenges faced during modelling included (1) the non-linear nature of the prediction assignment task; (2) input data being based on transactions occurring over a period of great price changes in Ljubljana whereby a 28% decline was noted in six consecutive testing years; and (3) the complex urban form of the case study area. Available explanatory variables, organised as a Geographic Information Systems (GIS) ready dataset, including the structural and age characteristics of the apartments as well as environmental and neighbourhood information were considered in the modelling procedure. All performance measures (R2 values, sales ratios, mean average percentage error (MAPE), coefficient of dispersion (COD)) revealed significantly better results for predictions obtained by the random forest method, which confirms the prospective of this machine learning technique on apartment price prediction.
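Two of the performance measures named in this abstract can be sketched in a few lines. The sketch assumes the usual definitions: MAPE as the mean absolute percentage error, the sales ratio as predicted price divided by transaction price, and COD as the mean absolute deviation of the ratios from their median, expressed as a percentage of the median. The abstract does not spell out the formulas the authors used, and the prices below are hypothetical.

```python
from statistics import median


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def sales_ratios(actual, predicted):
    """Predicted (appraised) value divided by the observed transaction price."""
    return [p / a for a, p in zip(actual, predicted)]


def coefficient_of_dispersion(ratios):
    """Mean absolute deviation from the median ratio, as a percent of the median."""
    mid = median(ratios)
    return 100.0 * sum(abs(r - mid) for r in ratios) / (len(ratios) * mid)


# Hypothetical transaction prices and model predictions (same currency units).
actual = [100_000.0, 150_000.0, 200_000.0, 250_000.0]
predicted = [110_000.0, 135_000.0, 200_000.0, 275_000.0]

print(round(mape(actual, predicted), 2))
print(round(coefficient_of_dispersion(sales_ratios(actual, predicted)), 2))
```

Lower values of both measures indicate predictions that are closer to, and more uniformly scattered around, the observed prices.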
A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design.

Science.gov (United States)

Meaney, Christopher; Moineddin, Rahim

2014-01-24

In biomedical research, response variables are often encountered which have bounded support on the open unit interval--(0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies may include: beta regression, variable-dispersion beta regression, and fractional logit regression models. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model to that of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models. In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided. If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25) linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed if the
Genetic Analysis of Milk Yield Using Random Regression Test Day Model in Tehran Province Holstein Dairy Cow

Directory of Open Access Journals (Sweden)

A. Seyeddokht

2012-09-01

Full Text Available In this research a random regression test-day model was used to estimate heritabilities and genetic correlations between test-day milk records. A total of 140357 monthly test-day milk records belonging to 28292 first-lactation Holstein cows (milked three times a day), distributed across 165 herds of Tehran province and calved from 2001 to 2010, were used. The fixed effects of herd-year-month of calving as contemporary group, and age at calving and Holstein gene percentage as covariates, were fitted. Orthogonal Legendre polynomials of 4th order were used to account for genetic and environmental aspects of milk production over the course of lactation. RRM using Legendre polynomials as base functions appeared to be the most adequate to describe the covariance structure of the data. The results showed that the average heritability for the second half of the lactation period was higher than that of the first half. The heritability was lowest for the first month (0.117) and highest for the eighth month of lactation (0.230). Because genetic variance increased gradually and residual variance was high in the first months of lactation, heritabilities differed over the course of lactation. The RRMs with a higher number of parameters were more useful to describe the genetic variation of test-day milk yield throughout the lactation. In this research genetic parameters and genetic correlations were estimated with a random regression test-day model; this method accounts for such parameters more accurately than the alternatives.
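As a rough illustration of how Legendre polynomials enter a random regression test-day model, the sketch below builds normalized Legendre covariates for a given day in milk and evaluates the genetic variance at that day as φ(t)ᵀKφ(t). The lactation range of 5 to 305 days, the polynomial degree, and the coefficient covariance matrix K are all assumptions made for the example; the abstract does not report them.

```python
import math


def legendre_covariates(dim, dim_min=5, dim_max=305, degree=4):
    """Normalized Legendre polynomials evaluated at a day in milk (dim)."""
    # Standardize days in milk to the interval [-1, 1].
    x = -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)
    # Bonnet recurrence: (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}.
    p = [1.0, x]
    for n in range(1, degree):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    # Normalization commonly used in random regression models.
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(degree + 1)]


def genetic_variance(dim, K, dim_min=5, dim_max=305):
    """Variance at `dim` implied by the coefficient covariance matrix K."""
    size = len(K)
    phi = legendre_covariates(dim, dim_min, dim_max, degree=size - 1)
    return sum(phi[i] * K[i][j] * phi[j] for i in range(size) for j in range(size))


# Hypothetical 2 x 2 covariance matrix of intercept and slope coefficients.
K = [[1.0, 0.0], [0.0, 1.0]]
for day in (5, 155, 305):
    print(day, round(genetic_variance(day, K), 3))
```

Dividing such a genetic variance by the sum of genetic, permanent environmental and residual variances at the same day would give a heritability trajectory of the kind the abstract describes.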
Systematic review of treatment modalities for gingival depigmentation: a random-effects poisson regression analysis.

Science.gov (United States)

Lin, Yi Hung; Tu, Yu Kang; Lu, Chun Tai; Chung, Wen Chen; Huang, Chiung Fang; Huang, Mao Suan; Lu, Hsein Kun

2014-01-01

Repigmentation variably occurs with different treatment methods in patients with gingival pigmentation. A systematic review was conducted of various treatment modalities for eliminating melanin pigmentation of the gingiva, comprising bur abrasion, scalpel surgery, cryosurgery, electrosurgery, gingival grafts, and laser techniques, to compare the recurrence rates (Rrs) of these treatment procedures. Electronic databases, including PubMed, Web of Science, Google, and Medline were comprehensively searched, and manual searches were conducted for studies published from January 1951 to June 2013. After applying inclusion and exclusion criteria, the final list of articles was reviewed in depth to achieve the objectives of this review. A Poisson regression was used to analyze the outcome of depigmentation using the various treatment methods. The systematic review was based mainly on case reports. In total, 61 eligible publications met the defined criteria. The various therapeutic procedures showed variable clinical results with a wide range of Rrs. A random-effects Poisson regression showed that cryosurgery (Rr = 0.32%), electrosurgery (Rr = 0.74%), and laser depigmentation (Rr = 1.16%) yielded superior results, whereas bur abrasion yielded the highest Rr (8.89%). Within the limit of the sampling level, the present evidence-based results show that cryosurgery exhibits the optimal predictability for depigmentation of the gingiva among all procedures examined, followed by electrosurgery and laser techniques. It is possible to treat melanin pigmentation of the gingiva with various methods and prevent repigmentation. Among those treatment modalities, cryosurgery, electrosurgery, and laser surgery appear to be the best choices for treating gingival pigmentation. © 2014 Wiley Periodicals, Inc.
Uso de modelos de regressão aleatória para descrever a variação genética da produção de leite na raça Holandesa / Random regressions models to describe the genetic variation of milk yield in Holstein breed

Directory of Open Access Journals (Sweden)

Cláudio Vieira de Araújo

2006-06-01

Full Text Available Data comprising 68,523 test-day milk yield records of 8,536 Holstein cows, calving from 1996 to 2001, were used to compare random regression models for estimating variance components. Test-day records were analyzed as multiple traits, considering each test day as a distinct trait. The same test-day records were analyzed as longitudinal data by random regression models that differed in the function used to describe the trajectory of the animals' lactation curve. The functions used were the Wilmink exponential function, the Ali and Schaeffer function, and Legendre polynomials of second and fourth degree. Models were compared on the following criteria: variance component estimates obtained from the multiple-trait model and by random regression; residual variance values; and values of the logarithm of the likelihood function. Heritability estimates obtained from the multiple-trait models ranged from 0.110 to 0.244. For the random regression models, these values ranged from 0.127 to 0.301, with the highest estimates observed in the models with the largest number of parameters. The random regression models that used Legendre polynomials described the genetic variation of milk yield best.
Dimension Reduction and Discretization in Stochastic Problems by Regression Method

DEFF Research Database (Denmark)

Ditlevsen, Ove Dalager

1996-01-01

The chapter mainly deals with dimension reduction and field discretizations based directly on the concept of linear regression. Several examples of interesting applications in stochastic mechanics are also given. Keywords: Random fields discretization, Linear regression, Stochastic interpolation, ...
Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

Directory of Open Access Journals (Sweden)

Drzewiecki Wojciech

2016-12-01

Full Text Available In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.
Interfamiliar specific fertility in Italian Brown Swiss cattle

Directory of Open Access Journals (Sweden)

Alessandro Bagnato

2010-01-01

Full Text Available The aim of this study is to evaluate the effects of the interaction between sire of cow and service sire on the success or failure of inseminations. Data from insemination events of Italian Brown Swiss cows collected from January 1993 through August 2007 were restricted to repeat breeder cows. A cluster analysis was carried out to group herds with very few observations into clusters with at least 15 observations. The edited data set included 102,710 services of 10,708 cows, daughters of 1,716 sires and mated to 3,108 service sires. The success or failure of each insemination was evaluated by a linear mixed model including the fixed effects of herd-year interaction, month of insemination and age, and the random effects of the service sire × sire of cow interaction and residual. The distribution of bull combination estimates was bimodal. When the tails of the distribution (best and worst 5% of estimates) were considered, 271 service sires were included in both tails. Results suggest that major genes can affect the survival of embryos and that positive or negative interactions between paternal and maternal genotypes can affect this reproductive trait.
BOX-COX transformation and random regression models for fecal egg count data

Directory of Open Access Journals (Sweden)

Marcos Vinicius Silva

2012-01-01

Full Text Available Accurate genetic evaluation of livestock is based on appropriate modeling of phenotypic measurements. In ruminants, fecal egg count (FEC) is commonly used to measure resistance to nematodes. FEC values are not normally distributed and logarithmic transformations have been used to achieve normality before analysis. However, the transformed data are often not normally distributed, especially when data are extremely skewed. A series of repeated FEC measurements may provide information about the population dynamics of a group or individual. A total of 6,375 FEC measures were obtained for 410 animals between 1992 and 2003 from the Beltsville Agricultural Research Center Angus herd. Original data were transformed using an extension of the Box-Cox transformation to approach normality and to estimate (co)variance components. We also proposed using random regression models (RRM) for genetic and non-genetic studies of FEC. Phenotypes were analyzed using RRM and restricted maximum likelihood. Within the different orders of Legendre polynomials used, those with more parameters (order 4) adjusted FEC data best. Results indicated that the transformation of FEC data utilizing the Box-Cox transformation family was effective in reducing the skewness and kurtosis, and dramatically increased estimates of heritability, and measurements of FEC obtained in the period between 12 and 26 weeks in a 26-week experimental challenge period are genetically correlated.
Box-Cox Transformation and Random Regression Models for Fecal egg Count Data.

Science.gov (United States)

da Silva, Marcos Vinícius Gualberto Barbosa; Van Tassell, Curtis P; Sonstegard, Tad S; Cobuci, Jaime Araujo; Gasbarre, Louis C

2011-01-01

Accurate genetic evaluation of livestock is based on appropriate modeling of phenotypic measurements. In ruminants, fecal egg count (FEC) is commonly used to measure resistance to nematodes. FEC values are not normally distributed and logarithmic transformations have been used in an effort to achieve normality before analysis. However, the transformed data are often still not normally distributed, especially when data are extremely skewed. A series of repeated FEC measurements may provide information about the population dynamics of a group or individual. A total of 6375 FEC measures were obtained for 410 animals between 1992 and 2003 from the Beltsville Agricultural Research Center Angus herd. Original data were transformed using an extension of the Box-Cox transformation to approach normality and to estimate (co)variance components. We also proposed using random regression models (RRM) for genetic and non-genetic studies of FEC. Phenotypes were analyzed using RRM and restricted maximum likelihood. Within the different orders of Legendre polynomials used, those with more parameters (order 4) adjusted FEC data best. Results indicated that the transformation of FEC data utilizing the Box-Cox transformation family was effective in reducing the skewness and kurtosis, and dramatically increased estimates of heritability, and measurements of FEC obtained in the period between 12 and 26 weeks in a 26-week experimental challenge period are genetically correlated.
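The effect of a Box-Cox-type transformation on skewed count data can be sketched as follows. The abstract does not say which extension of the Box-Cox family the authors used; the version below is the standard one-parameter form with an added shift constant so that zero counts can be transformed, and both the shift and the egg counts are assumptions made for illustration.

```python
import math


def box_cox(x, lam, shift=0.0):
    """Shifted Box-Cox transform: ((x + shift)**lam - 1) / lam, or log when lam is 0."""
    z = x + shift
    if z <= 0:
        raise ValueError("x + shift must be positive")
    if abs(lam) < 1e-12:
        return math.log(z)
    return (z ** lam - 1.0) / lam


def skewness(values):
    """Moment-based sample skewness (third central moment over variance**1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical fecal egg counts, strongly right-skewed and including zeros.
counts = [0, 0, 50, 100, 150, 400, 1200]
transformed = [box_cox(c, lam=0.0, shift=1.0) for c in counts]

print(round(skewness(counts), 2))
print(round(skewness(transformed), 2))
```

In practice the exponent would be chosen by maximizing a likelihood over a grid of candidate values rather than fixed in advance as it is here.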
Logistic regression for dichotomized counts.

Science.gov (United States)

Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W

2016-12-01

Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Multiple Mating, Paternity and Complex Fertilisation Patterns in the Chokka Squid Loligo reynaudii.

Directory of Open Access Journals (Sweden)

Marie-Jose Naud

Full Text Available Polyandry is widespread and influences patterns of sexual selection, with implications for sexual conflict over mating. Assessing sperm precedence patterns is a first step towards understanding sperm competition within a female and elucidating the roles of male- and female-controlled factors. In this study behavioural field data and genetic data were combined to investigate polyandry in the chokka squid Loligo reynaudii. Microsatellite DNA-based paternity analysis revealed multiple paternity to be the norm, with 79% of broods sired by at least two males. Genetic data also determined that the male who was guarding the female at the moment of sampling was a sire in 81% of the families tested, highlighting mate guarding as a successful male tactic with postcopulatory benefits linked to sperm deposition site giving privileged access to extruded egg strings. As females lay multiple eggs in capsules (egg strings) wherein their position is not altered during maturation, it is possible to describe the spatial/temporal sequence of fertilisation/sperm precedence. There were four different patterns of fertilisation found among the tested egg strings: 1) unique sire; 2) dominant sire, with one or more rare sires; 3) randomly mixed paternity (two or more sires); and 4) a distinct switch in paternity occurring along the egg string. The latter pattern cannot be explained by a random use of stored sperm, and suggests postcopulatory female sperm choice. Collectively the data indicate multiple levels of male- and female-controlled influences on sperm precedence, and highlight squid as interesting models to study the interplay between sexual and natural selection.

Genomic selection for tolerance to heat stress in Australian dairy cattle.

Science.gov (United States)

Nguyen, Thuy T T; Bowman, Phil J; Haile-Mariam, Mekonnen; Pryce, Jennie E; Hayes, Benjamin J

2016-04-01

Temperature and humidity levels above a certain threshold decrease milk production in dairy cattle, and genetic variation is associated with the amount of lost production. To enable selection for improved heat tolerance, the aim of this study was to develop genomic estimated breeding values (GEBV) for heat tolerance in dairy cattle. Heat tolerance was defined as the rate of decline in production under heat stress. We combined herd test-day recording data from 366,835 Holstein and 76,852 Jersey cows with daily temperature and humidity measurements from weather stations closest to the tested herds for test days between 2003 and 2013. We used daily mean values of temperature-humidity index averaged for the day of test and the 4 previous days as the measure of heat stress. Tolerance to heat stress was estimated for each cow using a random regression model with a common threshold of temperature-humidity index=60 for all cows. The slope solutions for cows from this model were used to define the daughter trait deviations of their sires. Genomic best linear unbiased prediction was used to calculate GEBV for heat tolerance for milk, fat, and protein yield. Two reference populations were used, the first consisted of genotyped sires only (2,300 Holstein and 575 Jersey sires), and the other included genotyped sires and cows (2,189 Holstein and 1,188 Jersey cows). The remainder of the genotyped sires were used as a validation set. All animals had genotypes for 632,003 single nucleotide polymorphisms. When using only genotyped sires in the reference set and only the first parity data, the accuracy of GEBV for heat tolerance in relation to changes in milk, fat, and protein yield were 0.48, 0.50, and 0.49 in the Holstein validation sires and 0.44, 0.61, and 0.53 in the Jersey validation sires, respectively. Some slight improvement in the accuracy of prediction was achieved when cows were included in the reference population for Holsteins. No clear improvements in the accuracy of
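The heat-stress covariate described in this abstract can be sketched in a few lines: the temperature-humidity index (THI) is averaged over the test day and the 4 previous days, and only the part above the common threshold of 60 counts as heat load. How the authors coded this covariate inside their random regression model is not stated in the abstract, so the broken-line form below and the daily THI values are assumptions.

```python
def mean_thi(daily_thi, test_day_index, window=5):
    """Average THI over the test day and the (window - 1) days before it."""
    start = test_day_index - (window - 1)
    if start < 0:
        raise ValueError("not enough preceding days for the averaging window")
    days = daily_thi[start:test_day_index + 1]
    return sum(days) / len(days)


def heat_load(thi, threshold=60.0):
    """Units of THI above the threshold; zero when conditions are below it."""
    return max(0.0, thi - threshold)


# Hypothetical daily mean THI values from the nearest weather station.
daily_thi = [58.0, 62.0, 66.0, 70.0, 74.0, 55.0]

thi_on_test_day = mean_thi(daily_thi, test_day_index=4)
print(thi_on_test_day)             # 66.0
print(heat_load(thi_on_test_day))  # 6.0
```

A cow-specific slope on this heat load then expresses how quickly her yield declines per unit of THI above the threshold, which is how heat tolerance is defined in the abstract.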
[Application of negative binomial regression and modified Poisson regression in the research of risk factors for injury frequency].

Science.gov (United States)

Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan

2011-11-01

To explore the application of negative binomial regression and modified Poisson regression analysis in analyzing the influential factors for injury frequency and the risk factors leading to an increase in injury frequency. A total of 2917 primary and secondary school students were selected from Hefei by cluster random sampling and surveyed by questionnaire. The count data on injury events were used to fit modified Poisson regression and negative binomial regression models. The risk factors for increased unintentional injury frequency among juvenile students were explored, so as to assess the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model showed over-dispersion (P […]); the modified Poisson regression and negative binomial regression models fitted better. Both showed that male gender, younger age, father working outside of the hometown, guardian's education level above junior high school, and smoking might be risk factors for higher injury frequencies. For frequency data on injury events that tend to cluster, both the modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better and this model could give a more accurate interpretation of relevant factors affecting the frequency of injury.
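Over-dispersion of the kind reported here can be screened for with a simple moment check before any model is fitted. The sketch below is not the authors' test: it compares the sample variance of the counts with their mean, which are equal under a Poisson model, and derives a method-of-moments dispersion parameter under the common negative binomial form with variance μ + αμ². The injury counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; about 1 for Poisson data."""
    return variance(counts) / mean(counts)


def nb_alpha_moments(counts):
    """Method-of-moments alpha for Var = mu + alpha * mu**2 (zero if not over-dispersed)."""
    mu = mean(counts)
    return max(0.0, (variance(counts) - mu) / mu ** 2)


# Hypothetical number of injury events per student over the survey period.
injuries = [0, 0, 0, 0, 1, 1, 2, 5, 8]

print(round(dispersion_index(injuries), 2))
print(round(nb_alpha_moments(injuries), 2))
```

An index well above 1 suggests that plain Poisson standard errors would be too small, which is the situation in which a negative binomial model or a Poisson model with robust (sandwich) variance is usually preferred.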
Genetic evaluation of calf and heifer survival in Iranian Holstein cattle using linear and threshold models.

Science.gov (United States)

Forutan, M; Ansari Mahyari, S; Sargolzaei, M

2015-02-01

Calf and heifer survival are important traits in dairy cattle affecting profitability. This study was carried out to estimate genetic parameters of survival traits in female calves at different age periods, until nearly the first calving. Records of 49,583 female calves born between 1998 and 2009 were considered in five age periods: days 1-30, 31-180, 181-365, 366-760 and the full period (days 1-760). Genetic components were estimated based on linear and threshold sire models and linear animal models. The models included both fixed effects (month of birth, dam's parity number, calving ease and twin/single) and random effects (herd-year, genetic effect of sire or animal and residual). Rates of death were 2.21, 3.37, 1.97, 4.14 and 12.4% for the above periods, respectively. Heritability estimates were very low, ranging from 0.48 to 3.04, 0.62 to 3.51 and 0.50 to 4.24% for the linear sire model, animal model and threshold sire model, respectively. Rank correlations between random effects of sires obtained with linear and threshold sire models and with linear animal and sire models were 0.82-0.95 and 0.61-0.83, respectively. The estimated genetic correlations between the five different periods were moderate and only significant for 31-180 and 181-365 (r(g) = 0.59), 31-180 and 366-760 (r(g) = 0.52), and 181-365 and 366-760 (r(g) = 0.42). The low genetic correlations in the current study would suggest that survival at different periods may be affected by the same genes with different expression or by different genes. Even though the additive genetic variations of survival traits were small, it might be possible to improve these traits by traditional or genomic selection. © 2014 Blackwell Verlag GmbH.
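For readers unfamiliar with sire models, the sketch below shows how heritability is commonly derived from sire-model variance components. It uses the textbook relationship that the sire variance is one quarter of the additive genetic variance, so that h2 = 4 * var_sire / var_phenotypic. Whether the authors counted herd-year variance in the phenotypic variance is not stated in the abstract, so including it here is an assumption, and all numeric values are hypothetical.

```python
def sire_model_heritability(var_sire, var_herd_year, var_residual):
    """Heritability from linear sire-model variance components.

    Assumes sire variance equals one quarter of the additive genetic variance
    and that herd-year variance is part of the phenotypic variance.
    """
    var_phenotypic = var_sire + var_herd_year + var_residual
    return 4.0 * var_sire / var_phenotypic

# Hypothetical variance components for a binary survival trait (not from the study)
h2 = sire_model_heritability(var_sire=0.0002, var_herd_year=0.0018, var_residual=0.0380)
print(f"h2 = {100 * h2:.1f}%")
```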
Influence of sire x herd interaction on the estimation of correlation between direct and maternal genetic effects in Nellore cattle

Directory of Open Access Journals (Sweden)

Joanir Pereira Eler

2000-12-01

Full Text Available Sire x herd interactions were studied in 30,789 records of birth weight (BW), weaning weight (WW) and weight gain from weaning to 18 months of age (G345) of Nellore cattle born from 1984 to 1994 on twelve farms located in three states of central and southeastern Brazil, with a total of 48,495 animals in the pedigree. Sire x herd interaction was considered as a random effect in single-trait and two-trait animal models using MTDFREML. This effect was important for BW (6% of the phenotypic variance) and affected the variance and covariance components and, consequently, the genetic parameters. The effect was smaller for WW (around 1% of the phenotypic variance), but it influenced the estimates of (co)variance components. For G345, the sire x herd effect was small. Likelihood tests showed that this effect was significant for all traits. This study showed that genetic correlations between direct and maternal effects are close to zero, or even positive, when the sire x herd interaction is included in the model, and always negative when it is omitted.
Herd-specific random regression carcass profiles for beef cattle after adjustment for animal genetic merit.

Science.gov (United States)

Englishby, Tanya M; Moore, Kirsty L; Berry, Donagh P; Coffey, Mike P; Banos, Georgios

2017-07-01

Abattoir data are an important source of information for the genetic evaluation of carcass traits, but also for on-farm management purposes. The present study aimed to quantify the contribution of herd environment to beef carcass characteristics (weight, conformation score and fat score) with particular emphasis on generating finishing herd-specific profiles for these traits across different ages at slaughter. Abattoir records from 46,115 heifers and 78,790 steers aged between 360 and 900 days, and from 22,971 young bulls aged between 360 and 720 days, were analysed. Finishing herd-year and animal genetic (co)variance components for each trait were estimated using random regression models. Across slaughter age and gender, the ratio of finishing herd-year to total phenotypic variance ranged from 0.31 to 0.72 for carcass weight, 0.21 to 0.57 for carcass conformation and 0.11 to 0.44 for carcass fat score. These parameters indicate that the finishing herd environment is an important contributor to carcass trait variability and amenable to improvement with management practices. Copyright © 2017 Elsevier Ltd. All rights reserved.
Estimation of Genetic Parameters for First Lactation Monthly Test-day Milk Yields using Random Regression Test Day Model in Karan Fries Cattle

Directory of Open Access Journals (Sweden)

Ajay Singh

2016-06-01

Full Text Available A single-trait linear mixed random regression test-day model was applied for the first time to analyze first-lactation monthly test-day milk yield records in Karan Fries cattle. The test-day milk yield data were modeled using a random regression model (RRM) with Legendre polynomials of different orders for the additive genetic effect (4th order) and the permanent environmental effect (5th order). Data pertaining to 1,583 lactation records spread over a period of 30 years were recorded and analyzed in the study. The variance components, heritabilities and genetic correlations among test-day milk yields were estimated using the RRM. RRM heritability estimates of test-day milk yield varied from 0.11 to 0.22 across test-day records. The estimates of genetic correlations between different test-day milk yields ranged from 0.01 (test-day 1 [TD-1] and TD-11) to 0.99 (TD-4 and TD-5). The magnitudes of genetic correlations between test-day milk yields decreased as the interval between test days increased, and adjacent test days had higher correlations. Additive genetic and permanent environmental variances were higher for test-day milk yields at both ends of lactation. The residual variance was lower than the permanent environmental variance for all test-day milk yields.
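As a rough illustration of how a random regression test-day model turns Legendre polynomial coefficients into a variance at a given day in milk, the sketch below builds normalized Legendre covariates and evaluates the quadratic form phi' K phi. The lactation range (5 to 305 days), the first-order fit, and the covariance matrix `K` are hypothetical assumptions chosen for illustration; they are not estimates from this study.

```python
import math

def standardize(t, t_min, t_max):
    """Map a time point (e.g. days in milk) onto the interval [-1, 1]."""
    return -1.0 + 2.0 * (t - t_min) / (t_max - t_min)

def legendre_basis(x, order):
    """Normalized Legendre polynomials phi_0..phi_order evaluated at x in [-1, 1]."""
    p = [1.0, x]
    for n in range(1, order):
        # Bonnet's recursion: (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(order + 1)]

def variance_at(phi, K):
    """Quadratic form phi' K phi: the variance implied at one time point."""
    size = len(phi)
    return sum(phi[i] * K[i][j] * phi[j] for i in range(size) for j in range(size))

# Hypothetical covariance matrix of the random regression coefficients
K = [[4.0, 0.5], [0.5, 1.0]]
for dim in (5, 155, 305):
    phi = legendre_basis(standardize(dim, 5, 305), order=1)
    print(dim, round(variance_at(phi, K), 3))
```

Heritability at a given day would then be the genetic variance divided by the sum of genetic, permanent environmental and residual variances at that day.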
A bullet-sired bone cyst

Energy Technology Data Exchange (ETDEWEB)

Brogdon, B.G. [University of South Alabama Medical Center, Department of Radiology, Mobile, AL (United States); Cottrell, W.C. [Orthopaedic Associates of West Florida, Clearwater, FL (United States); Nimityongskul, P. [University of South Alabama Medical Center, Department of Orthopaedic Surgery, Mobile, AL (United States); Takhtani, D. [Johns Hopkins School of Medicine, Department of Radiology, Baltimore, MD (United States)

2006-12-15

Random gunfire deposited a bullet in the proximal tibial metaphysis of a 9-year-old girl. The wound was not incapacitating and was treated conservatively. Within 17 months, soreness developed in the proximal leg, and radiography revealed a large unicameral cyst within which the bullet freely tumbled. Eventually, fear of impending fracture prompted further radiography, computed tomography, surgical intervention and pathological examination of the cyst wall. We believe this is only the second description in the English-language literature of this rare sequence of events. (orig.)
A bullet-sired bone cyst

International Nuclear Information System (INIS)

Brogdon, B.G.; Cottrell, W.C.; Nimityongskul, P.; Takhtani, D.

2006-01-01

Random gunfire deposited a bullet in the proximal tibial metaphysis of a 9-year-old girl. The wound was not incapacitating and was treated conservatively. Within 17 months, soreness developed in the proximal leg, and radiography revealed a large unicameral cyst within which the bullet freely tumbled. Eventually, fear of impending fracture prompted further radiography, computed tomography, surgical intervention and pathological examination of the cyst wall. We believe this is only the second description in the English-language literature of this rare sequence of events. (orig.)
Better Autologistic Regression

Directory of Open Access Journals (Sweden)

Mark A. Wolters

2017-11-01

Full Text Available Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine) to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding (the two numbers used to represent the two possible states of the variables) might differ. Common coding choices are (zero, one) and (minus one, plus one). Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
Inferring genetic parameters of lactation in Tropical Milking Criollo cattle with random regression test-day models.

Science.gov (United States)

Santellano-Estrada, E; Becerril-Pérez, C M; de Alba, J; Chang, Y M; Gianola, D; Torres-Hernández, G; Ramírez-Valverde, R

2008-11-01

This study inferred genetic and permanent environmental variation of milk yield in Tropical Milking Criollo cattle and compared 5 random regression test-day models using Wilmink's function and Legendre polynomials. Data consisted of 15,377 test-day records from 467 Tropical Milking Criollo cows that calved between 1974 and 2006 in the tropical lowlands of the Gulf Coast of Mexico and in southern Nicaragua. Estimated heritabilities of test-day milk yields ranged from 0.18 to 0.45, and repeatabilities ranged from 0.35 to 0.68 for the period spanning from 6 to 400 d in milk. Genetic correlation between days in milk 10 and 400 was around 0.50 but greater than 0.90 for most pairs of test days. The model that used first-order Legendre polynomials for additive genetic effects and second-order Legendre polynomials for permanent environmental effects gave the smallest residual variance and was also favored by the Akaike information criterion and likelihood ratio tests.
Potential misinterpretation of treatment effects due to use of odds ratios and logistic regression in randomized controlled trials.

Directory of Open Access Journals (Sweden)

Mirjam J Knol

Full Text Available BACKGROUND: In randomized controlled trials (RCTs), the odds ratio (OR) can substantially overestimate the risk ratio (RR) if the incidence of the outcome is over 10%. This study determined the frequency of use of ORs and the frequency of overestimation of the OR as compared with its accompanying RR in published RCTs, and assessed how often regression models that calculate RRs were used. METHODS: We included 288 RCTs published in 2008 in five major general medical journals (Annals of Internal Medicine, British Medical Journal, Journal of the American Medical Association, Lancet, New England Journal of Medicine). If an OR was reported, we calculated the corresponding RR, and we calculated the percentage of overestimation by using the formula . RESULTS: Of 193 RCTs with a dichotomous primary outcome, 24 (12.4%) presented a crude and/or adjusted OR for the primary outcome. In five RCTs (2.6%), the OR differed more than 100% from its accompanying RR on the log scale. Forty-one of all included RCTs (n = 288; 14.2%) presented ORs for other outcomes, or for subgroup analyses. Nineteen of these RCTs (6.6%) had at least one OR that deviated more than 100% from its accompanying RR on the log scale. Of 53 RCTs that adjusted for baseline variables, 15 used logistic regression. Alternative methods to estimate RRs were only used in four RCTs. CONCLUSION: ORs and logistic regression are often used in RCTs, and in many articles the OR did not approximate the RR. Although the authors did not explicitly misinterpret these ORs as RRs, misinterpretation by readers can seriously affect treatment decisions and policy making.
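A minimal sketch of why the OR drifts away from the RR as the outcome becomes common. The 2x2 counts are hypothetical, and the conversion in `or_to_rr` is the widely used relationship RR = OR / (1 - p0 + p0 * OR), where p0 is the control-group risk; it is not the overestimation formula from the abstract, which is not preserved in this record.

```python
def risk_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Ratio of the outcome risk in the treatment arm to that in the control arm."""
    return (events_trt / n_trt) / (events_ctl / n_ctl)

def odds_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Ratio of the outcome odds in the treatment arm to that in the control arm."""
    odds_trt = events_trt / (n_trt - events_trt)
    odds_ctl = events_ctl / (n_ctl - events_ctl)
    return odds_trt / odds_ctl

def or_to_rr(or_value, control_risk):
    """Convert an odds ratio to a risk ratio given the control-group risk p0."""
    return or_value / (1.0 - control_risk + control_risk * or_value)

# Hypothetical trials with the same RR but different outcome incidence
rare = (6, 100, 4, 100)      # 6% vs 4% incidence
common = (60, 100, 40, 100)  # 60% vs 40% incidence
for label, table in (("rare", rare), ("common", common)):
    print(label, round(risk_ratio(*table), 3), round(odds_ratio(*table), 3))
```

Both toy trials share the same RR, yet the OR for the common outcome is much further from it, which is the pattern the abstract warns about.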
Heterosis and direct effects for Charolais-sired calf weight and growth, cow weight and weight change, and ratios of cow and calf weights and weight changes across warm season lactation in Romosinuano, Angus, and F cows in Arkansas.

Science.gov (United States)

Riley, D G; Burke, J M; Chase, C C; Coleman, S W

2016-01-01

The use of Brahman in cow-calf production offers some adaptation to the harsh characteristics of endophyte-infected tall fescue. Criollo breeds, such as the Romosinuano, may have similar adaptation. The objectives were to estimate genetic effects in Romosinuano, Angus, and crossbred cows for their weights, weights of their calves, and ratios (calf weight:cow weight and cow weight change:calf weight gain) across lactation and to assess the influence of forage on traits and estimates. Cows (n = 91) were bred to Charolais bulls after their second parity. Calves (n = 214) were born from 2006 to 2009. Cows and calves were weighed in early (April and June), mid- (July), and late lactation (August and October). Animal was a random effect in analyses of calf data; sire was random in analyses of cow records and ratios. Fixed effects investigated included calf age, calf sex, cow age-year combinations, sire breed of cow, dam breed of cow, and interactions. Subsequent analyses evaluated the effect of forage grazed: endophyte-free or endophyte-infected tall fescue. Estimates of maternal heterosis for calf weight ranged from 9.3 ± 4.3 to 15.4 ± 5.7 kg from mid-lactation through weaning ( cow) were -6.8 ± 3.0 and -8.9 ± 4.2 kg for weights recorded in April and June. Calf weights and weight gains from birth were greater ( cows grazing endophyte-free tall fescue except in mid-summer. Cow weight change from April to each time was negative for Angus cows and lower ( Cows grazing endophyte-free tall fescue were heavier ( cows had the lowest ( cow weight change:calf weight gain, indicating an energy-deficit condition. Cows grazing endophyte-free tall fescue had more negative ( cow weight, 7.9 ± 3.0 to 15.8 ± 5.0 kg for cow weight change, and 0.07 ± 0.03 to 0.27 ± 0.1 for cow weight change:calf weight gain. Direct Romosinuano effects ranged from 14.8 ± 4.2 to 49.8 ± 7.7 kg for cow weight change and 0.2 ± 0.04 to 0.51 ± 0.14 for cow weight change:calf weight gain. The adaptive
Little genetic variability in resilience among cattle exists for a range of performance traits across herds in Ireland differing in Fasciola hepatica prevalence.

Science.gov (United States)

Twomey, Alan J; Graham, David A; Doherty, Michael L; Blom, Astrid; Berry, Donagh P

2018-06-04

It is anticipated that in the future, livestock will be exposed to a greater risk of infection from parasitic diseases. Therefore, future breeding strategies for livestock, which are generally long-term strategies for change, should target animals adaptable to environments with a high parasitic load. Covariance components were estimated in the present study for a selection of dairy and beef performance traits over herd-years differing in Fasciola hepatica load using random regression sire models. Herd-year prevalence of F. hepatica was determined by using F. hepatica-damaged liver phenotypes which were recorded in abattoirs nationally. The data analyzed consisted of up to 83,821 lactation records from dairy cows for a range of milk production and fertility traits, as well as 105,054 young animals with carcass-related information obtained at slaughter. Reaction norms for individual sires were derived from the random regression coefficients. The heritability and additive genetic standard deviations for all traits analyzed remained relatively constant as the herd-year F. hepatica prevalence gradient increased up to a prevalence level of 0.7; although there was a large increase in heritability and additive genetic standard deviation for milk and fertility traits in the observed F. hepatica prevalence levels >0.7, only 5% of the data existed in herd-year prevalence levels >0.7. Very little rescaling, therefore, exists across differing herd-year F. hepatica prevalence levels. Within-trait genetic correlations among the performance traits across different herd-year F. hepatica prevalence levels were less than unity for all traits. Nevertheless, within-trait genetic correlations for milk production and carcass traits were all >0.8 for F. hepatica prevalence levels between 0.2 and 0.8. The lowest estimate of within-trait genetic correlations for the different fertility traits ranged from -0.03 (SE = 1.09) in age at first calving to 0.54 (SE = 0.22) for calving to first service
Supremum Norm Posterior Contraction and Credible Sets for Nonparametric Multivariate Regression

NARCIS (Netherlands)

Yoo, W.W.; Ghosal, S

2016-01-01

In the setting of nonparametric multivariate regression with unknown error variance, we study asymptotic properties of a Bayesian method for estimating a regression function f and its mixed partial derivatives. We use a random series of tensor product of B-splines with normal basis coefficients as a
Testing overall and moderator effects meta-regression

NARCIS (Netherlands)

Huizenga, H.M.; Visser, I.; Dolan, C.V.

2011-01-01

Random effects meta-regression is a technique to synthesize results of multiple studies. It allows for a test of an overall effect, as well as for tests of effects of study characteristics, that is, (discrete or continuous) moderator effects. We describe various procedures to test moderator effects:
Adiposity, lipogenesis, and fatty acid composition of subcutaneous and intramuscular adipose tissues of Brahman and Angus crossbred cattle.

Science.gov (United States)

Campbell, E M G; Sanders, J O; Lunt, D K; Gill, C A; Taylor, J F; Davis, S K; Riley, D G; Smith, S B

2016-04-01

The objective of this study was to demonstrate differences in aspects of adipose tissue cellularity, lipid metabolism, and fatty acid and cholesterol composition in Angus and Brahman crossbred cattle. We hypothesized that in vitro measures of lipogenesis would be greater in three-fourths Angus progeny than in three-fourths Brahman progeny, especially in intramuscular (i.m.) adipose tissue. Progeny (n = 227) were fed a standard, corn-based diet for approximately 150 d before slaughter. Breed was considered to be the effect of interest and was forced into the model. There were 9 breed groups including all 4 kinds of three-fourths Angus calves: Angus bulls × Angus-sired F cows (n = 32), Angus bulls × Brahman-sired F cows (n = 20), Brahman-sired F bulls × Angus cows (n = 24), and Angus-sired F bulls × Angus cows (n = 20). There were all 4 kinds of three-fourths Brahman calves: Brahman bulls × Brahman-sired F cows (n = 21), Brahman bulls × Angus-sired F cows (n = 43), Brahman-sired F bulls × Brahman cows (n = 26), and Angus-sired F bulls × Brahman cows (n = 13). Additionally, F calves (one-half Brahman and one-half Angus) were produced only from Brahman-sired F bulls × Angus-sired F cows (n = 28). Contrasts were calculated when breed was an important fixed effect, using the random effect family(breed) as the error term. Most contrasts were nonsignificant (P > 0.10). Those that were significant ( Angus > F, three-fourths Brahman > F, and three-fourths crossbred progeny combined > F), s.c. adipocyte volume (three-fourths Angus > F and three-fourths bloods combined > F), lipogenesis from acetate in s.c. adipose tissue (three-fourths Brahman calves from Brahman dams > three-fourths Brahman calves from F dams), and percentage 18:3n-3 in s.c. adipose tissue (three-fourths Brahman calves from Brahman-sired F dams Angus-sired F dams). Intramuscular adipocyte volume ( Angus cattle. Additionally, several differences were observed in i.m. adipose tissue that were consistent with this being a less-developed adipose
Design and analysis of experiments classical and regression approaches with SAS

CERN Document Server

Onyiah, Leonard C

2008-01-01

Introductory Statistical Inference and Regression Analysis Elementary Statistical Inference Regression Analysis Experiments, the Completely Randomized Design (CRD)-Classical and Regression Approaches Experiments Experiments to Compare Treatments Some Basic Ideas Requirements of a Good Experiment One-Way Experimental Layout or the CRD: Design and Analysis Analysis of Experimental Data (Fixed Effects Model) Expected Values for the Sums of Squares The Analysis of Variance (ANOVA) Table Follow-Up Analysis to Check fo
The gradient boosting algorithm and random boosting for genome-assisted evaluation in large data sets.

Science.gov (United States)

González-Recio, O; Jiménez-Montero, J A; Alenda, R

2013-01-01

In the next few years, with the advent of high-density single nucleotide polymorphism (SNP) arrays and genome sequencing, genomic evaluation methods will need to deal with a large number of genetic variants and an increasing sample size. The boosting algorithm is a machine-learning technique that may alleviate the drawbacks of dealing with such large data sets. This algorithm combines different predictors in a sequential manner with some shrinkage on them; each predictor is applied consecutively to the residuals from the committee formed by the previous ones to form a final prediction based on a subset of covariates. Here, a detailed description is provided and examples using a toy data set are included. A modification of the algorithm called "random boosting" was proposed to increase predictive ability and decrease computation time of genome-assisted evaluation in large data sets. Random boosting uses a random selection of markers to add a subsequent weak learner to the predictive model. These modifications were applied to a real data set composed of 1,797 bulls genotyped for 39,714 SNP. Deregressed proofs of 4 yield traits and 1 type trait from January 2009 routine evaluations were used as dependent variables. A 2-fold cross-validation scenario was implemented. Sires born before 2005 were used as a training sample (1,576 and 1,562 for production and type traits, respectively), whereas younger sires were used as a testing sample to evaluate predictive ability of the algorithm on yet-to-be-observed phenotypes. Comparison with the original algorithm was provided. The predictive ability of the algorithm was measured as Pearson correlations between observed and predicted responses. Further, estimated bias was computed as the average difference between observed and predicted phenotypes. The results showed that the modification of the original boosting algorithm could be run in 1% of the time used with the original algorithm and with negligible differences in accuracy
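The idea of random boosting described in this abstract can be sketched as follows: at each iteration a weak learner is fitted to the current residuals using only a random subset of markers, and its shrunken prediction is added to the ensemble. The weak learner used here (the best single-marker least-squares fit), the parameter names, and the toy data are all assumptions for illustration; the abstract does not specify these details.

```python
import random

def fit_single_marker(x, r):
    """Least-squares slope and intercept of residuals r on one covariate x."""
    n = len(x)
    mx, mr = sum(x) / n, sum(r) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return 0.0, mr  # constant marker: fall back to the residual mean
    slope = sum((xi - mx) * (ri - mr) for xi, ri in zip(x, r)) / sxx
    return slope, mr - slope * mx

def random_boost(X, y, n_iter=200, shrinkage=0.1, subset_frac=0.5, seed=0):
    """L2 boosting where each iteration only sees a random subset of markers."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    intercept = sum(y) / n
    pred = [intercept] * n
    learners = []
    k = max(1, int(subset_frac * p))
    for _ in range(n_iter):
        resid = [yi - fi for yi, fi in zip(y, pred)]
        best = None
        for j in rng.sample(range(p), k):  # random selection of markers
            slope, icpt = fit_single_marker(cols[j], resid)
            sse = sum((ri - (icpt + slope * xi)) ** 2 for xi, ri in zip(cols[j], resid))
            if best is None or sse < best[0]:
                best = (sse, j, slope, icpt)
        _, j, slope, icpt = best
        learners.append((j, shrinkage * slope, shrinkage * icpt))
        pred = [fi + shrinkage * (icpt + slope * xi) for fi, xi in zip(pred, cols[j])]
    return intercept, learners

def predict(model, row):
    """Sum the intercept and all shrunken weak-learner contributions."""
    intercept, learners = model
    return intercept + sum(b0 + b1 * row[j] for j, b1, b0 in learners)

# Toy data (hypothetical): 40 animals, 6 markers coded 0/1/2; markers 0 and 3 have effects
rng = random.Random(1)
X = [[rng.randint(0, 2) for _ in range(6)] for _ in range(40)]
y = [2.0 * row[0] - 1.0 * row[3] + rng.gauss(0, 0.3) for row in X]
model = random_boost(X, y)
mse = sum((yi - predict(model, row)) ** 2 for row, yi in zip(X, y)) / len(y)
print(f"training MSE = {mse:.3f}")
```

Scanning only a fraction of the markers per iteration is what reduces the cost per weak learner; the 1% run time reported in the abstract refers to the authors' implementation, not to this sketch.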
Spatial prediction of landslides using a hybrid machine learning approach based on Random Subspace and Classification and Regression Trees

Science.gov (United States)

Pham, Binh Thai; Prakash, Indra; Tien Bui, Dieu

2018-02-01

A hybrid machine learning approach combining Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. This model is a combination of the RSS method, which is known as an efficient ensemble technique, and CART, which is a state-of-the-art classifier. The Luc Yen district of Yen Bai province, a prominent landslide-prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models, namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In developing the model, ten important landslide-affecting factors related to geomorphology, geology and geo-environment were considered, namely slope angles, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Performance of the RSSCART model (AUC = 0.841) is the best compared with other popular landslide models, namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that the RSSCART model is a promising method for spatial landslide prediction.
Sire influence on reproductive, performance characteristics and ...

African Journals Online (AJOL)

, Panda White x Cinnamon Brown (PWxCB) and Silver Brown x Cinnamon Brown (SBxCB). The experiment was a randomized complete block design. Parameters measured include: fertility and hatchability traits, growth performance traits and ...

Random Intercept and Random Slope 2-Level Multilevel Models

Directory of Open Access Journals (Sweden)

Rehan Ahmad Khan

2012-11-01

Full Text Available A random intercept model and a random intercept and random slope model with two levels of hierarchy in the population are presented and compared with the traditional regression approach. The impact of students' satisfaction on their grade point average (GPA) was explored with and without controlling for teachers' influence. The variation at level 1 can be controlled by introducing higher levels of hierarchy in the model. The fanning out of the fitted lines indicates variation in student grades across teachers.
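The two models named in this abstract can be written out in standard two-level notation. The symbols below are assumptions following common multilevel-modeling convention (student i nested within teacher j, y = GPA, x = satisfaction); they are not taken from the article itself.

```latex
% Level 1 (students within teachers)
y_{ij} = \beta_{0j} + \beta_{1j}\, x_{ij} + e_{ij}, \qquad e_{ij} \sim N(0, \sigma^2_e)

% Level 2, random intercept model: only the intercept varies across teachers
\beta_{0j} = \gamma_{00} + u_{0j}, \qquad \beta_{1j} = \gamma_{10}, \qquad u_{0j} \sim N(0, \tau_{00})

% Level 2, random intercept and random slope model: both vary across teachers
\beta_{0j} = \gamma_{00} + u_{0j}, \qquad \beta_{1j} = \gamma_{10} + u_{1j}, \qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim
N\!\left( \mathbf{0},\ \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix} \right)
```

With a nonzero slope variance, the teacher-specific regression lines are no longer parallel, which is what produces the fan-shaped pattern of fitted lines.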
Random Decrement and Regression Analysis of Traffic Responses of Bridges

DEFF Research Database (Denmark)

Asmussen, J. C.; Ibrahim, S. R.; Brincker, Rune

1996-01-01

The topic of this paper is the estimation of modal parameters from ambient data by applying the Random Decrement technique. The data from the Queensborough Bridge over the Fraser River in Vancouver, Canada have been applied. The loads producing the dynamic response are ambient, e.g. wind, traffic...
Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

Science.gov (United States)

Drzewiecki, Wojciech

2016-12-01

In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques. The results proved that in case of sub-pixel evaluation the most accurate prediction of change may not necessarily be based on the most accurate individual assessments. When single methods are considered, based on obtained results Cubist algorithm may be advised for Landsat based mapping of imperviousness for single dates. However, Random Forest may be endorsed when the most reliable evaluation of imperviousness change is the primary goal. It gave lower accuracies for individual assessments, but better prediction of change due to more correlated errors of individual predictions. Heterogeneous model ensembles performed for individual time points assessments at least as well as the best individual models. In case of imperviousness change assessment the ensembles always outperformed single model approaches. It means that it is possible to improve the accuracy of sub-pixel imperviousness change assessment using ensembles of heterogeneous non-linear regression models.
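The finding that less accurate single-date predictions can still give a better change estimate follows from how errors combine: the error of a predicted change is the difference of the two single-date errors, so positively correlated errors partly cancel. The sketch below shows this with hypothetical error values; the two "models" are illustrative stand-ins and not the algorithms compared in the study.

```python
import math

def rmse(errors):
    """Root mean squared error of a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def change_errors(errors_t1, errors_t2):
    """Error of predicted change (t2 minus t1) at each pixel."""
    return [e2 - e1 for e1, e2 in zip(errors_t1, errors_t2)]

# Hypothetical per-pixel errors (percentage points of imperviousness)
# Model A: smaller errors at each date, but they are negatively correlated
a_t1, a_t2 = [2.0, -2.0, 2.0, -2.0], [-2.0, 2.0, -2.0, 2.0]
# Model B: larger errors at each date, but they are strongly correlated
b_t1, b_t2 = [3.0, -3.0, 3.0, -3.0], [3.5, -2.5, 3.5, -2.5]

for name, t1, t2 in (("A", a_t1, a_t2), ("B", b_t1, b_t2)):
    print(name, round(rmse(t1), 2), round(rmse(t2), 2), round(rmse(change_errors(t1, t2)), 2))
```

In variance terms, Var(e2 - e1) = Var(e1) + Var(e2) - 2 Cov(e1, e2), so a larger covariance lowers the change error.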
Regression Discontinuity and Randomized Controlled Trial Estimates: An Application to The Mycotic Ulcer Treatment Trials.

Science.gov (United States)

Oldenburg, Catherine E; Venkatesh Prajna, N; Krishnan, Tiruvengada; Rajaraman, Revathi; Srinivasan, Muthiah; Ray, Kathryn J; O'Brien, Kieran S; Glymour, M Maria; Porco, Travis C; Acharya, Nisha R; Rose-Nussbaumer, Jennifer; Lietman, Thomas M

2018-08-01

We compare results from regression discontinuity (RD) analysis to primary results of a randomized controlled trial (RCT) utilizing data from two contemporaneous RCTs for treatment of fungal corneal ulcers. Patients were enrolled in the Mycotic Ulcer Treatment Trials I and II (MUTT I & MUTT II) based on baseline visual acuity: patients with acuity ≤ 20/400 (logMAR 1.3) enrolled in MUTT I, and >20/400 in MUTT II. MUTT I investigated the effect of topical natamycin versus voriconazole on best spectacle-corrected visual acuity. MUTT II investigated the effect of topical voriconazole plus placebo versus topical voriconazole plus oral voriconazole. We compared the RD estimate (natamycin arm of MUTT I [N = 162] versus placebo arm of MUTT II [N = 54]) to the RCT estimate from MUTT I (topical natamycin [N = 162] versus topical voriconazole [N = 161]). In the RD, patients receiving natamycin had mean improvement of 4-lines of visual acuity at 3 months (logMAR -0.39, 95% CI: -0.61, -0.17) compared to topical voriconazole plus placebo, and 2-lines in the RCT (logMAR -0.18, 95% CI: -0.30, -0.05) compared to topical voriconazole. The RD and RCT estimates were similar, although the RD design overestimated effects compared to the RCT.
Carcass characteristics and meat quality of Hereford sired steers born to beef-cross-dairy and Angus breeding cows.

Science.gov (United States)

Coleman, Lucy W; Hickson, Rebecca E; Schreurs, Nicola M; Martin, Natalia P; Kenyon, Paul R; Lopez-Villalobos, Nicolas; Morris, Stephen T

2016-11-01

Steers from Angus, Angus×Holstein Friesian, Angus×Holstein Friesian-Jersey and Angus×Jersey cows and a Hereford sire were measured for their carcass and meat quality characteristics. Steers from the Angus×Holstein Friesian cows had a greater final body weight and carcass weight (P<0.05). Steers from Angus×Jersey cows had the lowest carcass weight and dressing-out percentage (P<0.05). There was a greater fat depth over the rump at 12 and 18 months of age for the steers from Angus cows (P<0.05), but not at 24 months of age. The steers had similar meat quality characteristics across the breed groups. Steers from Angus×Holstein Friesian and Angus×Jersey cows had a higher ratio of n6 to n3 fatty acids. Using beef-cross-dairy cows to produce steers for meat production does not impact on meat quality. Using Jersey in the breed cross reduced the carcass tissues in the live weight and the potential meat yield. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Differentiating regressed melanoma from regressed lichenoid keratosis.

Science.gov (United States)

Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A

2017-04-01

Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% of regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Dropout from exercise randomized controlled trials among people with depression: A meta-analysis and meta regression.

Science.gov (United States)

Stubbs, Brendon; Vancampfort, Davy; Rosenbaum, Simon; Ward, Philip B; Richards, Justin; Soundy, Andrew; Veronese, Nicola; Solmi, Marco; Schuch, Felipe B

2016-01-15

Exercise has established efficacy in improving depressive symptoms. Dropouts from randomized controlled trials (RCT's) pose a threat to the validity of this evidence base, with dropout rates varying across studies. We conducted a systematic review and meta-analysis to investigate the prevalence and predictors of dropout rates among adults with depression participating in exercise RCT's. Three authors identified RCT's from a recent Cochrane review and conducted updated searches of major electronic databases from 01/2013 to 08/2015. We included RCT's of exercise interventions in people with depression (including major depressive disorder (MDD) and depressive symptoms) that reported dropout rates. A random effects meta-analysis and meta regression were conducted. Overall, 40 RCT's were included reporting dropout rates across 52 exercise interventions including 1720 people with depression (49.1 years (range=19-76 years), 72% female (range=0-100)). The trim and fill adjusted prevalence of dropout across all studies was 18.1% (95%CI=15.0-21.8%) and 17.2% (95%CI=13.5-21.7, N=31) in MDD only. In MDD participants, higher baseline depressive symptoms (β=0.0409, 95%CI=0.0809-0.0009, P=0.04) predicted greater dropout, whilst supervised interventions delivered by physiotherapists (β=-1.2029, 95%CI=-2.0967 to -0.3091, p=0.008) and exercise physiologists (β=-1.3396, 95%CI=-2.4478 to -0.2313, p=0.01) predicted lower dropout. A comparative meta-analysis (N=29) established dropout was lower in exercise than control conditions (OR=0.642, 95%CI=0.43-0.95, p=0.02). Exercise is well tolerated by people with depression and drop out in RCT's is lower than control conditions. Thus, exercise is a feasible treatment, in particular when delivered by healthcare professionals with specific training in exercise prescription. Copyright © 2015 Elsevier B.V. All rights reserved.
Stochastic development regression using method of moments

DEFF Research Database (Denmark)

Kühnel, Line; Sommer, Stefan Horst

2017-01-01

This paper considers the estimation problem arising when inferring parameters in the stochastic development regression model for manifold valued non-linear data. Stochastic development regression captures the relation between manifold-valued response and Euclidean covariate variables using...... the stochastic development construction. It is thereby able to incorporate several covariate variables and random effects. The model is intrinsically defined using the connection of the manifold, and the use of stochastic development avoids linearizing the geometry. We propose to infer parameters using...... the Method of Moments procedure that matches known constraints on moments of the observations conditional on the latent variables. The performance of the model is investigated in a simulation example using data on finite dimensional landmark manifolds....
Detection of bias in animal model pedigree indices of heifers

Directory of Open Access Journals (Sweden)

M. LIDAUER

2008-12-01

The objective of the study was to test whether the pedigree indices (PI) of heifers are biased, and if so, whether the magnitude of the bias varies in different groups of heifers. Therefore, two animal model evaluations with two different data sets were computed. Data with all the records from the national evaluation in December 1994 was used to obtain estimated breeding values (EBV) for 305-day milk yield and protein yield. In the second evaluation, the PIs were estimated for cows calving the first time in 1993 by excluding all their production records from the data. Three different statistics, a simple t-test, the linear regression of EBV on PI, and the polynomial regression of the difference in the predictions (EBV-PI) on PI, were computed for three groups of first parity Ayrshire cows: daughters of proven sires, daughters of young sires, and daughters of bull dam candidates. A practically relevant bias was found only in the PIs for the daughters of young sires. On average their PIs were biased upwards by 0.20 standard deviations (78.8 kg) for the milk yield and by 0.21 standard deviations (2.2 kg) for the protein yield. The polynomial regression analysis showed that the magnitude of the bias in the PIs changed somewhat with the size of the PIs.
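Two of the three statistics named in this abstract (the t-test on the mean difference and the linear regression of EBV on PI) can be illustrated with a minimal sketch. The function name `pedigree_index_checks` and the five data points are hypothetical and chosen only for illustration; the record does not give the study's actual computations or data.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pedigree_index_checks(pi, ebv):
    """Paired t statistic for PI - EBV and the regression of EBV on PI.

    An unbiased PI would give a mean difference near 0 and a slope near 1.
    """
    n = len(pi)
    diffs = [p - e for p, e in zip(pi, ebv)]
    d_bar = mean(diffs)
    # Sample standard deviation of the paired differences.
    sd = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))
    t_stat = d_bar / (sd / math.sqrt(n))

    # Ordinary least squares slope and intercept of EBV on PI.
    pi_bar, ebv_bar = mean(pi), mean(ebv)
    sxy = sum((p - pi_bar) * (e - ebv_bar) for p, e in zip(pi, ebv))
    sxx = sum((p - pi_bar) ** 2 for p in pi)
    slope = sxy / sxx
    intercept = ebv_bar - slope * pi_bar
    return {"mean_bias": d_bar, "t": t_stat, "slope": slope, "intercept": intercept}


# Hypothetical values in standard-deviation units, not data from the study.
example_pi = [0.0, 1.0, 2.0, 3.0, 4.0]
example_ebv = [-0.3, 0.9, 1.7, 2.9, 3.8]
result = pedigree_index_checks(example_pi, example_ebv)
```

A positive `mean_bias` corresponds to PIs that are too high on average, which is the direction reported for daughters of young sires.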
Downscaling of surface moisture flux and precipitation in the Ebro Valley (Spain) using analogues and analogues followed by random forests and multiple linear regression

Directory of Open Access Journals (Sweden)

G. Ibarra-Berastegi

2011-06-01

In this paper, reanalysis fields from the ECMWF have been statistically downscaled to predict, from large-scale atmospheric fields, surface moisture flux and daily precipitation at two observatories (Zaragoza and Tortosa, Ebro Valley, Spain) during the 1961–2001 period. Three types of downscaling models have been built: (i) analogues, (ii) analogues followed by random forests and (iii) analogues followed by multiple linear regression. The inputs consist of data (predictor fields) taken from the ERA-40 reanalysis. The predicted fields are precipitation and surface moisture flux as measured at the two observatories. With the aim of reducing the dimensionality of the problem, the ERA-40 fields have been decomposed using empirical orthogonal functions. Available daily data has been divided into two parts: a training period (1961–1996), used to find a group of about 300 analogues to build the downscaling model, and a test period (1997–2001), where models' performance has been assessed using independent data. In the case of surface moisture flux, the models based on analogues followed by random forests do not clearly outperform those built on analogues plus multiple linear regression, while simple averages calculated from the nearest analogues found in the training period yielded only slightly worse results. In the case of precipitation, the three types of model performed equally. These results suggest that most of the models' downscaling capabilities can be attributed to the analogues-calculation stage.
Genetic parameters for quail body weights using a random ...

African Journals Online (AJOL)

A model including fixed and random linear regressions is described for analyzing body weights at different ages. In this study, (co)variance components, heritabilities for quail weekly weights and genetic correlations among these weights were estimated using a random regression model by DFREML under DXMRR option.
Multiple regression models for energy use in air-conditioned office buildings in different climates

International Nuclear Information System (INIS)

Lam, Joseph C.; Wan, Kevin K.W.; Liu Dalong; Tsang, C.L.

2010-01-01

An attempt was made to develop multiple regression models for office buildings in the five major climates in China - severe cold, cold, hot summer and cold winter, mild, and hot summer and warm winter. A total of 12 key building design variables were identified through parametric and sensitivity analysis, and considered as inputs in the regression models. The coefficient of determination R² varies from 0.89 in Harbin to 0.97 in Kunming, indicating that 89-97% of the variations in annual building energy use can be explained by the changes in the 12 parameters. A pseudo-random number generator based on three simple multiplicative congruential generators was employed to generate random designs for evaluation of the regression models. The differences between regression-predicted and DOE-simulated annual building energy use are largely within 10%. It is envisaged that the regression models developed can be used to estimate the likely energy savings/penalty during the initial design stage when different building schemes and design concepts are being considered.
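The abstract does not name the generator it used. One well-known generator that combines three simple multiplicative congruential generators is the Wichmann-Hill algorithm, so the sketch below uses it as an assumption. The class `WichmannHill`, the helper `random_design` and the variable bounds are hypothetical illustrations, not the authors' code.

```python
class WichmannHill:
    """Combined generator built from three multiplicative congruential generators."""

    def __init__(self, s1=1, s2=2, s3=3):
        # Seeds should be integers between 1 and 30000.
        self.s1, self.s2, self.s3 = s1, s2, s3

    def random(self):
        # Advance each component generator: s <- (a * s) mod m.
        self.s1 = (171 * self.s1) % 30269
        self.s2 = (172 * self.s2) % 30307
        self.s3 = (170 * self.s3) % 30323
        # Sum the three scaled states and keep the fractional part.
        return (self.s1 / 30269 + self.s2 / 30307 + self.s3 / 30323) % 1.0


def random_design(generator, bounds):
    """Draw one value per design variable, uniformly within its (low, high) bounds."""
    return [low + (high - low) * generator.random() for low, high in bounds]


# Twelve hypothetical design variables, each scaled to the range 0 to 1.
hypothetical_bounds = [(0.0, 1.0)] * 12
design = random_design(WichmannHill(1, 2, 3), hypothetical_bounds)
```

Each random design would then be passed both to the regression model and to the building simulation, so that the two predictions can be compared.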
Genomic prediction of reproduction traits for Merino sheep.

Science.gov (United States)

Bolormaa, S; Brown, D J; Swan, A A; van der Werf, J H J; Hayes, B J; Daetwyler, H D

2017-06-01

Economically important reproduction traits in sheep, such as number of lambs weaned and litter size, are expressed only in females and later in life after most selection decisions are made, which makes them ideal candidates for genomic selection. Accurate genomic predictions would lead to greater genetic gain for these traits by enabling accurate selection of young rams with high genetic merit. The aim of this study was to design and evaluate the accuracy of a genomic prediction method for female reproduction in sheep using daughter trait deviations (DTD) for sires and ewe phenotypes (when individual ewes were genotyped) for three reproduction traits: number of lambs born (NLB), litter size (LSIZE) and number of lambs weaned. Genomic best linear unbiased prediction (GBLUP), BayesR and pedigree BLUP analyses of the three reproduction traits measured on 5340 sheep (4503 ewes and 837 sires) with real and imputed genotypes for 510 174 SNPs were performed. The prediction of breeding values using both sire and ewe trait records was validated in Merino sheep. Prediction accuracy was evaluated by across sire family and random cross-validations. Accuracies of genomic estimated breeding values (GEBVs) were assessed as the mean Pearson correlation adjusted by the accuracy of the input phenotypes. The addition of sire DTD into the prediction analysis resulted in higher accuracies compared with using only ewe records in genomic predictions or pedigree BLUP. Using GBLUP, the average accuracy based on the combined records (ewes and sire DTD) was 0.43 across traits, but the accuracies varied by trait and type of cross-validations. The accuracies of GEBVs from random cross-validations (range 0.17-0.61) were higher than were those from sire family cross-validations (range 0.00-0.51). The GEBV accuracies of 0.41-0.54 for NLB and LSIZE based on the combined records were amongst the highest in the study. Although BayesR was not significantly different from GBLUP in prediction accuracy
Comparison of innate immune responses and somatotropic axis components of Holstein and Montbéliarde-sired crossbred dairy cows during the transition period.

Science.gov (United States)

Mendonça, L G D; Litherland, N B; Lucy, M C; Keisler, D H; Ballou, M A; Hansen, L B; Chebel, R C

2013-06-01

Objectives were to compare parameters related to innate immune responses and somatotropic axis of Holstein (HO) and Montbéliarde (MO)-sired crossbred cows during the transition from late gestation to early lactation. Cows (40 HO and 47 MO-sired crossbred) were enrolled in the study 45d before expected calving date (study d 0=calving). Polymorphonuclear leukocytes (PMNL) isolated from blood samples collected weekly from study d -7 to 21 and on study d 42 were used for determination of percentage of PMNL positive for phagocytosis (PA+) and oxidative burst (OB+), intensity of PA and OB, percentage of PMNL expressing CD18 (CD18+) and L-selectin (LS+), and intensity of CD18 and LS expression. Blood was sampled weekly from study d -7 to 14 and on study d 28, 42, and 56 for determination of insulin, growth hormone (GH), leptin, and insulin-like growth factor (IGF)-1 concentrations. Blood sampled weekly from study d -14 to 21 and on study d 42 was used to determine cortisol concentration. Liver biopsies were performed on study d -14, 7, 14, and 28 for determination of mRNA expression for insulin receptor B (IRB), total GH receptor (GHRtot), GHR variant 1A (GHR1A), and IGF-1. Data were analyzed by ANOVA for repeated measures or by ANOVA using the GLM procedure of SAS (SAS Institute Inc., Cary, NC). Intensity of CD18 expression was greater in PMNL from crossbred cows compared with PMNL from HO cows [1,482.1 ± 82.3 vs. 1,286.6 ± 69.8 geometric mean fluorescence intensity (GMFI)]. Furthermore, among HO cows, the percentage of PA+ PMNL on study d -7 (64.4 ± 5.2%) tended to be greater than on study d 0 (57.1 ± 5.1%), but no differences in percentage of PA+ PMNL between study d -7 and 0 were observed in crossbred cows. Similarly, intensity of PA in PA+ PMNL from HO cows decreased from study d -7 to 0 (4,750.6 ± 1,217.0 vs. 1,964.7 ± 1,227.9 GMFI), but no changes in intensity of PA in PA+ PMNL from crossbred cows were observed. On study d 0, intensity of PA tended to be
Direct modeling of regression effects for transition probabilities in the progressive illness-death model

DEFF Research Database (Denmark)

Azarang, Leyla; Scheike, Thomas; de Uña-Álvarez, Jacobo

2017-01-01

In this work, we present direct regression analysis for the transition probabilities in the possibly non-Markov progressive illness–death model. The method is based on binomial regression, where the response is the indicator of the occupancy for the given state along time. Randomly weighted score...
Regression dilution bias: tools for correction methods and sample size calculation.

Science.gov (United States)

Berglund, Lars

2012-08-01

Random errors in measurement of a risk factor will introduce downward bias of an estimated association to a disease or a disease marker. This phenomenon is called regression dilution bias. A bias correction may be made with data from a validity study or a reliability study. In this article we give a non-technical description of designs of reliability studies with emphasis on selection of individuals for a repeated measurement, assumptions of measurement error models, and correction methods for the slope in a simple linear regression model where the dependent variable is a continuous variable. Also, we describe situations where correction for regression dilution bias is not appropriate. The methods are illustrated with the association between insulin sensitivity measured with the euglycaemic insulin clamp technique and fasting insulin, where measurement of the latter variable carries noticeable random error. We provide software tools for estimation of a corrected slope in a simple linear regression model assuming data for a continuous dependent variable and a continuous risk factor from a main study and an additional measurement of the risk factor in a reliability study. Also, we supply programs for estimation of the number of individuals needed in the reliability study and for choice of its design. Our conclusion is that correction for regression dilution bias is seldom applied in epidemiological studies. This may cause important effects of risk factors with large measurement errors to be neglected.
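The slope correction described in this abstract can be illustrated with a small simulation. This is a sketch under simplifying assumptions (classical additive measurement error, and a repeated measurement available for every individual rather than for a reliability subsample). It is not the software the authors provide, and all function names are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def slope(xs, ys):
    """Ordinary least squares slope of ys on xs."""
    return cov(xs, ys) / cov(xs, xs)


def reliability_ratio(first, second):
    """Estimate var(true) / var(observed) from two replicate measurements.

    The covariance between replicates estimates the variance of the true
    value when the two measurement errors are independent.
    """
    observed_var = (cov(first, first) + cov(second, second)) / 2
    return cov(first, second) / observed_var


def simulate(n=20000, true_slope=2.0, seed=1):
    rng = random.Random(seed)
    truth = [rng.gauss(0, 1) for _ in range(n)]
    x1 = [t + rng.gauss(0, 1) for t in truth]  # main-study measurement
    x2 = [t + rng.gauss(0, 1) for t in truth]  # repeated measurement
    y = [true_slope * t + rng.gauss(0, 1) for t in truth]
    naive = slope(x1, y)                # attenuated towards zero
    lam = reliability_ratio(x1, x2)     # close to 0.5 in this setup
    return naive, lam, naive / lam      # last value is the corrected slope


naive_slope, lam, corrected_slope = simulate()
```

With equal true-value and error variances the naive slope is roughly half the true slope, and dividing by the estimated reliability ratio recovers a value close to the true slope.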
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.

Science.gov (United States)

Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg

2009-11-01

G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
Longitudinal changes in telomere length and associated genetic parameters in dairy cattle analysed using random regression models.

Directory of Open Access Journals (Sweden)

Luise A Seeker

Telomeres cap the ends of linear chromosomes and shorten with age in many organisms. In humans short telomeres have been linked to morbidity and mortality. With the accumulation of longitudinal datasets the focus shifts from investigating telomere length (TL) to exploring TL change within individuals over time. Some studies indicate that the speed of telomere attrition is predictive of future disease. The objectives of the present study were to (1) characterize the change in bovine relative leukocyte TL (RLTL) across the lifetime in Holstein Friesian dairy cattle, (2) estimate genetic parameters of RLTL over time and (3) investigate the association of differences in individual RLTL profiles with productive lifespan. RLTL measurements were analysed using Legendre polynomials in a random regression model to describe TL profiles and genetic variance over age. The analyses were based on 1,328 repeated RLTL measurements of 308 female Holstein Friesian dairy cattle. A quadratic Legendre polynomial was fitted to the fixed effect of age in months and to the random effect of the animal identity. Changes in RLTL, heritability and within-trait genetic correlation along the age trajectory were calculated and illustrated. At a population level, the relationship between RLTL and age was described by a positive quadratic function. Individuals varied significantly regarding the direction and amount of RLTL change over life. The heritability of RLTL ranged from 0.36 to 0.47 (SE = 0.05-0.08) and remained statistically unchanged over time. The genetic correlation of RLTL at birth with measurements later in life decreased with the time interval between samplings from near unity to 0.69, indicating that TL later in life might be regulated by different genes than TL early in life. Even though animals differed in their RLTL profiles significantly, those differences were not correlated with productive lifespan (p = 0.954).
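The quadratic Legendre covariates used in random regression models of this kind can be sketched as follows. The normalisation factor sqrt((2k + 1) / 2) is a common convention in animal breeding and is assumed here, since the abstract does not state which form was used. The function names and the coefficient matrix `K` are hypothetical.

```python
import math


def standardize(age, age_min, age_max):
    """Map an age onto the interval [-1, 1] on which Legendre polynomials are defined."""
    return -1.0 + 2.0 * (age - age_min) / (age_max - age_min)


def legendre_covariates(x):
    """Normalised Legendre polynomials of order 0, 1 and 2 evaluated at x."""
    unnormalised = [1.0, x, 0.5 * (3.0 * x * x - 1.0)]
    return [math.sqrt((2 * k + 1) / 2.0) * p for k, p in enumerate(unnormalised)]


def variance_at(x, K):
    """Variance at standardised age x implied by a 3 x 3 coefficient (co)variance matrix K."""
    phi = legendre_covariates(x)
    return sum(phi[i] * K[i][j] * phi[j] for i in range(3) for j in range(3))


# Hypothetical coefficient (co)variance matrix, not an estimate from the study.
K = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
x_oldest = standardize(120, 0, 120)  # assumed age range of 0 to 120 months
genetic_variance_oldest = variance_at(x_oldest, K)
```

Evaluating `variance_at` over a grid of ages, once with a genetic and once with a residual matrix, is how heritability along the age trajectory is typically derived from such a model.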
Longitudinal analysis of the strengths and difficulties questionnaire scores of the Millennium Cohort Study children in England using M-quantile random-effects regression.

Science.gov (United States)

Tzavidis, Nikos; Salvati, Nicola; Schmid, Timo; Flouri, Eirini; Midouhas, Emily

2016-02-01

Multilevel modelling is a popular approach for longitudinal data analysis. Statistical models conventionally target a parameter at the centre of a distribution. However, when the distribution of the data is asymmetric, modelling other location parameters, e.g. percentiles, may be more informative. We present a new approach, M-quantile random-effects regression, for modelling multilevel data. The proposed method is used for modelling location parameters of the distribution of the strengths and difficulties questionnaire scores of children in England who participate in the Millennium Cohort Study. Quantile mixed models are also considered. The analyses offer insights to child psychologists about the differential effects of risk factors on children's outcomes.

Retro-regression--another important multivariate regression improvement.

Science.gov (United States)

Randić, M

2001-01-01

We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
Modified Regression Correlation Coefficient for Poisson Regression Model

Science.gov (United States)

Kaengthong, Nattacha; Domthong, Uthumporn

2017-09-01

This study gives attention to indicators in predictive power of the Generalized Linear Model (GLM) which are widely used; however, often having some restrictions. We are interested in regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, and defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was modifying regression correlation coefficient for Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and having multicollinearity in independent variables. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient based on Bias and the Root Mean Square Error (RMSE).
A Bayesian Nonparametric Causal Model for Regression Discontinuity Designs

Science.gov (United States)

Karabatsos, George; Walker, Stephen G.

2013-01-01

The regression discontinuity (RD) design (Thistlethwaite & Campbell, 1960; Cook, 2008) provides a framework to identify and estimate causal effects from a non-randomized design. Each subject of an RD design is assigned to the treatment (versus assignment to a non-treatment) whenever her/his observed value of the assignment variable equals or…
Random Forest as a Predictive Analytics Alternative to Regression in Institutional Research

Science.gov (United States)

He, Lingjun; Levine, Richard A.; Fan, Juanjuan; Beemer, Joshua; Stronach, Jeanne

2018-01-01

In institutional research, modern data mining approaches are seldom considered to address predictive analytics problems. The goal of this paper is to highlight the advantages of tree-based machine learning algorithms over classic (logistic) regression methods for data-informed decision making in higher education problems, and stress the success of…
Bayesian Nonparametric Regression Analysis of Data with Random Effects Covariates from Longitudinal Measurements

KAUST Repository

Ryu, Duchwan; Li, Erning; Mallick, Bani K.

2010-01-01

" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.
Genome-wide prediction of discrete traits using bayesian regressions and machine learning

Directory of Open Access Journals (Sweden)

Forni Selma

2011-02-01

Background: Genomic selection has gained much attention and the main goal is to increase the predictive accuracy and the genetic gain in livestock using dense marker information. Most methods dealing with the large p (number of covariates), small n (number of observations) problem have dealt only with continuous traits, but there are many important traits in livestock that are recorded in a discrete fashion (e.g. pregnancy outcome, disease resistance). It is necessary to evaluate alternatives to analyze discrete traits in a genome-wide prediction context. Methods: This study shows two threshold versions of Bayesian regressions (Bayes A and Bayesian LASSO) and two machine learning algorithms (boosting and random forest) to analyze discrete traits in a genome-wide prediction context. These methods were evaluated using simulated and field data to predict yet-to-be observed records. Performances were compared based on the models' predictive ability. Results: The simulation showed that machine learning had some advantages over Bayesian regressions when a small number of QTL regulated the trait under pure additivity. However, differences were small and disappeared with a large number of QTL. Bayesian threshold LASSO and boosting achieved the highest accuracies, whereas Random Forest presented the highest classification performance. Random Forest was the most consistent method in detecting resistant and susceptible animals; its phi correlation was up to 81% greater than that of Bayesian regressions. Random Forest outperformed other methods in correctly classifying resistant and susceptible animals in the two pure swine lines evaluated. Boosting and Bayes A were more accurate with crossbred data. Conclusions: The results of this study suggest that the best method for genome-wide prediction may depend on the genetic basis of the population analyzed. All methods were less accurate at correctly classifying intermediate animals than extreme animals. Among the different
Models for Estimating Genetic Parameters of Milk Production Traits Using Random Regression Models in Korean Holstein Cattle

Directory of Open Access Journals (Sweden)

C. I. Cho

2016-05-01

The objectives of the study were to estimate genetic parameters for milk production traits of Holstein cattle using random regression models (RRMs), and to compare the goodness of fit of various RRMs with homogeneous and heterogeneous residual variances. A total of 126,980 test-day milk production records of the first parity Holstein cows between 2007 and 2014 from the Dairy Cattle Improvement Center of National Agricultural Cooperative Federation in South Korea were used. These records included milk yield (MILK), fat yield (FAT), protein yield (PROT), and solids-not-fat yield (SNF). The statistical models included random effects of genetic and permanent environments using Legendre polynomials (LP) of the third to fifth order (L3–L5), fixed effects of herd-test day, year-season at calving, and a fixed regression for the test-day record (third to fifth order). The residual variances in the models were either homogeneous (HOM) or heterogeneous (15 classes, HET15; 60 classes, HET60). A total of nine models (3 orders of polynomials × 3 types of residual variance), including L3-HOM, L3-HET15, L3-HET60, L4-HOM, L4-HET15, L4-HET60, L5-HOM, L5-HET15, and L5-HET60, were compared using Akaike information criteria (AIC) and/or Schwarz Bayesian information criteria (BIC) statistics to identify the model(s) of best fit for their respective traits. The lowest BIC value was observed for the models L5-HET15 (MILK; PROT; SNF) and L4-HET15 (FAT), which fit the best. In general, the BIC values of HET15 models for a particular polynomial order were lower than those of the HET60 models in most cases. This implies that the orders of LP and types of residual variances affect the goodness of fit of the models. Also, the heterogeneity of residual variances should be considered for the test-day analysis. The heritability estimates from the best fitted models ranged from 0.08 to 0.15 for MILK, 0.06 to 0.14 for FAT, 0.08 to 0.12 for PROT, and 0.07 to 0.13 for SNF according to days in milk of first
Model comparison on genomic predictions using high-density markers for different groups of bulls in the Nordic Holstein population

DEFF Research Database (Denmark)

Gao, Hongding; Su, Guosheng; Janss, Luc

2013-01-01

This study compared genomic predictions based on imputed high-density markers (~777,000) in the Nordic Holstein population using a genomic BLUP (GBLUP) model, 4 Bayesian exponential power models with different shape parameters (0.3, 0.5, 0.8, and 1.0) for the exponential power distribution...... relationship with the training population. Group_smgs had both the sire and the maternal grandsire (MGS), Group_sire only had the sire, Group_mgs only had the MGS, and Group_non had neither the sire nor the MGS in the training population. Reliability of DGV was measured as the squared correlation between DGV...... and DRP divided by the reliability of DRP for the bulls in the validation data set. Unbiasedness of DGV was measured as the regression of DRP on DGV. The results indicated that DGV were more accurate and less biased for animals that were more related to the training population. In general, the Bayesian...
Association between cow reproduction and calf growth traits and ELISA scores for paratuberculosis in a multibreed herd of beef cattle.

Science.gov (United States)

Elzo, M A; Rae, D O; Lanhart, S E; Hembry, F G; Wasdin, J G; Driver, J D

2009-08-01

The objective of this research was to assess the association between 4 cow reproductive and weight traits, and 2 preweaning calf traits and ELISA scores for paratuberculosis (0 = negative, 1 = suspect, 2 = weak-positive, and 3 = positive) in a multibreed herd of cows ranging from 100% Angus (A) to 100% Brahman (B). Cow data were 624 gestation lengths (GL), 358 records of time open (TO), 605 calving intervals (CI), and 1240 weight changes from November to weaning in September (WC) from 502 purebred and crossbred cows. Calf data consisted of 956 birth weights (BWT), and 923 weaning weights adjusted to 205 d of age (WW205) from 956 purebred and crossbred calves. Traits were analyzed individually using multibreed mixed models that assumed homogeneity of variances across breed groups. Covariances among random effects were assumed to be zero. Fixed effects were year, age of cow, sex of calf, year × age of cow interaction (except WC), age of cow × sex of calf interaction (only for WC), and covariates for B fraction of sire and cow, heterosis of cow and calf, and ELISA score. Random effects were sire (except for TO and CI), dam, and residual. Regression estimates of cow and calf traits on ELISA scores indicated that lower cow fertility (longer TO), lower ability of cows to maintain weight (negative WC), lower calf BWT, and lower calf WW205 were associated with higher cow ELISA scores. Further research on the effects of subclinical paratuberculosis in beef cattle at regional and national levels seems advisable considering the large potential economic cost of this disease.
On the null distribution of Bayes factors in linear regression

Science.gov (United States)

We show that under the null, the 2 log (Bayes factor) is asymptotically distributed as a weighted sum of chi-squared random variables with a shifted mean. This claim holds for Bayesian multi-linear regression with a family of conjugate priors, namely, the normal-inverse-gamma prior, the g-prior, and...
Mortality risk prediction in burn injury: Comparison of logistic regression with machine learning approaches.

Science.gov (United States)

Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W

2015-08-01

Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.
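The evaluation measures named in this abstract (sensitivity, specificity, positive predictive value and Youden's index) can all be computed from a 2x2 confusion table. The sketch below is only an illustration under that assumption; the function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute basic diagnostic measures from confusion-table counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives (hypothetical inputs).
    """
    sensitivity = tp / (tp + fn)      # share of actual positives detected
    specificity = tn / (tn + fp)      # share of actual negatives detected
    ppv = tp / (tp + fp)              # positive predictive value
    youden = sensitivity + specificity - 1.0  # Youden's index J
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "youden": youden,
    }


# Made-up counts, for illustration only.
metrics = diagnostic_metrics(tp=40, fp=5, tn=45, fn=10)
```

Youden's index summarises sensitivity and specificity in a single number, which is one reason it is convenient when several classifiers are compared at a chosen threshold.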
Dual Regression

OpenAIRE

Spady, Richard; Stouli, Sami

2012-01-01

We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while avoiding the need for repairing the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach introduces a mathematical programming characterization of conditional distribution f...
Substituting random forest for multiple linear regression improves binding affinity prediction of scoring functions: Cyscore as a case study.

Science.gov (United States)

Li, Hongjian; Leung, Kwong-Sak; Wong, Man-Hon; Ballester, Pedro J

2014-08-27

State-of-the-art protein-ligand docking methods are generally limited by the traditionally low accuracy of their scoring functions, which are used to predict binding affinity and thus vital for discriminating between active and inactive compounds. Despite intensive research over the years, classical scoring functions have reached a plateau in their predictive performance. These assume a predetermined additive functional form for some sophisticated numerical features, and use standard multivariate linear regression (MLR) on experimental data to derive the coefficients. In this study we show that such a simple functional form is detrimental for the prediction performance of a scoring function, and replacing linear regression by machine learning techniques like random forest (RF) can improve prediction performance. We investigate the conditions of applying RF under various contexts and find that given sufficient training samples RF manages to comprehensively capture the non-linearity between structural features and measured binding affinities. Incorporating more structural features and training with more samples can both boost RF performance. In addition, we analyze the importance of structural features to binding affinity prediction using the RF variable importance tool. Lastly, we use Cyscore, a top performing empirical scoring function, as a baseline for comparison study. Machine-learning scoring functions are fundamentally different from classical scoring functions because the former circumvents the fixed functional form relating structural features with binding affinities. RF, but not MLR, can effectively exploit more structural features and more training samples, leading to higher prediction performance. The future availability of more X-ray crystal structures will further widen the performance gap between RF-based and MLR-based scoring functions. This further stresses the importance of substituting RF for MLR in scoring function development.
Genome-wide associations for milk production and somatic cell score in Holstein-Friesian cattle in Ireland

Directory of Open Access Journals (Sweden)

Meredith Brian K

2012-03-01

Background Contemporary dairy breeding goals have broadened to include, along with milk production traits, a number of non-production-related traits in an effort to improve the overall functionality of the dairy cow. Increased indirect selection for resistance to mastitis, one of the most important production-related diseases in the dairy sector, via selection for reduced somatic cell count has been part of these broadened goals. A number of genome-wide association studies have identified genetic variants associated with milk production traits and mastitis resistance; however, the majority of these studies have been based on animals which were predominantly kept in confinement and fed a concentrate-based diet (i.e. high-input production systems). This genome-wide association study aims to detect associations using genotypic and phenotypic data from Irish Holstein-Friesian cattle fed predominantly grazed grass in a pasture-based production system (low-input). Results Significant associations were detected for milk yield, fat yield, protein yield, fat percentage, protein percentage and somatic cell score using separate single-locus, frequentist and multi-locus, Bayesian approaches. These associations were detected using two separate populations of Holstein-Friesian sires and cows. In total, 1,529 and 37 associations were detected in the sires using a single SNP regression and a Bayesian method, respectively. There were 103 associations in common between the sires and cows across all the traits. As well as detecting associations within known QTL regions, a number of novel associations were detected; the most notable of these was a region of chromosome 13 associated with milk yield in the population of Holstein-Friesian sires. Conclusions A total of 276 novel SNPs were detected in the sires using a single SNP regression approach. Although obvious candidate genes may not be initially forthcoming, this study provides a preliminary framework
Genome-wide associations for milk production and somatic cell score in Holstein-Friesian cattle in Ireland

Science.gov (United States)

2012-01-01

Background Contemporary dairy breeding goals have broadened to include, along with milk production traits, a number of non-production-related traits in an effort to improve the overall functionality of the dairy cow. Increased indirect selection for resistance to mastitis, one of the most important production-related diseases in the dairy sector, via selection for reduced somatic cell count has been part of these broadened goals. A number of genome-wide association studies have identified genetic variants associated with milk production traits and mastitis resistance; however, the majority of these studies have been based on animals which were predominantly kept in confinement and fed a concentrate-based diet (i.e. high-input production systems). This genome-wide association study aims to detect associations using genotypic and phenotypic data from Irish Holstein-Friesian cattle fed predominantly grazed grass in a pasture-based production system (low-input). Results Significant associations were detected for milk yield, fat yield, protein yield, fat percentage, protein percentage and somatic cell score using separate single-locus, frequentist and multi-locus, Bayesian approaches. These associations were detected using two separate populations of Holstein-Friesian sires and cows. In total, 1,529 and 37 associations were detected in the sires using a single SNP regression and a Bayesian method, respectively. There were 103 associations in common between the sires and cows across all the traits. As well as detecting associations within known QTL regions, a number of novel associations were detected; the most notable of these was a region of chromosome 13 associated with milk yield in the population of Holstein-Friesian sires. Conclusions A total of 276 novel SNPs were detected in the sires using a single SNP regression approach. Although obvious candidate genes may not be initially forthcoming, this study provides a preliminary framework upon which to identify the
GENETIC CONTRIBUTION OF RAM ON LITTER SIZE IN ŠUMAVA SHEEP

Directory of Open Access Journals (Sweden)

Jitka Schmidová

2015-09-01

The objective of the present study was to quantify the service sire effect in terms of (co)variance components for the number of lambs born and weaned, and to propose models for the potential inclusion of this effect in the linear equations for breeding value estimation. A database of 21,324 lambings in Šumava sheep from 1992-2013 was used. The basic model equation for the analysis of variance of litter size contained the effects of the ewe's age at lambing, contemporary group, permanent environmental effect of the ewe and direct additive genetic effect of the ewe. Two modifications of the basic model were used to estimate the service sire effect. The proportions of variance for the service sire effect for number of lambs born and weaned were 2.1% and 2.0% when the service sire was not included in the relationship matrix. When the service sire was included in the relationship matrix and the effect was divided into a genetic contribution and a permanent environmental effect, the nongenetic effect appeared to be larger than the genetic effect (0.013 vs. 0.009 for number born and 0.017 vs. 0.004 for number weaned). Changes in other variance components were relatively small, except for contemporary group. A model including the service sire effect as a simple random effect, without inclusion in the genetic relationship matrix, is recommended for genetic evaluation of litter size traits.
Genetic Analysis of Daily Maximum Milking Speed by a Random Walk Model in Dairy Cows

DEFF Research Database (Denmark)

Karacaören, Burak; Janss, Luc; Kadarmideen, Haja

Data were obtained from dairy cows stationed at research farm ETH Zurich for maximum milking speed. The main aims of this paper are a) to evaluate if the Wood curve is suitable to model mean lactation curve b) to predict longitudinal breeding values by random regression and random walk models of maximum milking speed. Wood curve did not provide a good fit to the data set. Quadratic random regressions gave better predictions compared with the random walk model. However random walk model does not need to be evaluated for different orders of regression coefficients. In addition with the Kalman filter applications: random walk model could give online prediction of breeding values. Hence without waiting for whole lactation records, genetic evaluation could be made when the daily or monthly data is available.
Estimating HIES Data through Ratio and Regression Methods for Different Sampling Designs

Directory of Open Access Journals (Sweden)

Faqir Muhammad

2007-01-01

In this study, different sampling designs were compared using the HIES data of North West Frontier Province (NWFP) for 2001-02 and 1998-99, collected from the Federal Bureau of Statistics, Statistical Division, Government of Pakistan, Islamabad. The performance of the estimators was also assessed using bootstrap and jackknife. HIES adopts a two-stage stratified random sample design. In the first stage, enumeration blocks and villages are treated as the Primary Sampling Units (PSU). The sample PSUs are selected with probability proportional to size. Secondary Sampling Units (SSU), i.e., households, are selected by systematic sampling with a random start. HIES used a single study variable. We compared the HIES technique with some other designs: Stratified Simple Random Sampling, Stratified Systematic Sampling, Stratified Ranked Set Sampling and Stratified Two Phase Sampling. Ratio and regression methods were applied with two study variables: income (y) and household size (x). Jackknife and bootstrap were used for variance replication. Simple Random Sampling with sample sizes of 462 to 561 gave moderate variances with both jackknife and bootstrap. Systematic Sampling gave moderate variance with a sample size of 467. With jackknife and Systematic Sampling, the variance of the regression estimator was greater than that of the ratio estimator for sample sizes of 467 to 631. At a sample size of 952, the variance of the ratio estimator becomes greater than that of the regression estimator. The most efficient design turned out to be Ranked Set Sampling. Ranked Set Sampling with jackknife and bootstrap gives the minimum variance even with the smallest sample size (467). Two Phase Sampling performed poorly. Multi-stage sampling as applied by HIES gave large variances, especially when used with a single study variable.
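The ratio and regression methods mentioned in this abstract can be illustrated with the textbook estimators of a population mean under simple random sampling. This is a minimal sketch, not the study's own procedure: the function names, the toy sample and the assumed known population mean of x are all hypothetical.

```python
def ratio_estimate(y, x, pop_mean_x):
    """Ratio estimator of the population mean of y: (ybar / xbar) * Xbar."""
    y_bar = sum(y) / len(y)
    x_bar = sum(x) / len(x)
    return (y_bar / x_bar) * pop_mean_x


def regression_estimate(y, x, pop_mean_x):
    """Linear regression estimator: ybar + b * (Xbar - xbar),
    where b is the least-squares slope of y on x in the sample."""
    n = len(y)
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar + b * (pop_mean_x - x_bar)


# Toy sample: x could stand for household size and y for income.
x_sample = [1, 2, 3, 4]
y_sample = [2, 4, 6, 8]
est_ratio = ratio_estimate(y_sample, x_sample, pop_mean_x=3.0)
est_reg = regression_estimate(y_sample, x_sample, pop_mean_x=3.0)
```

Both estimators use the auxiliary variable x to adjust the sample mean of y; they coincide when y is exactly proportional to x, as in the toy data.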
A systematic review and meta-regression analysis of mivacurium for tracheal intubation

NARCIS (Netherlands)

Vanlinthout, L.E.H.; Mesfin, S.H.; Hens, N.; Vanacker, B.F.; Robertson, E.N.; Booij, L.H.D.J.

2014-01-01

We systematically reviewed factors associated with intubation conditions in randomised controlled trials of mivacurium, using random-effects meta-regression analysis. We included 29 studies of 1050 healthy participants. Four factors explained 72.9% of the variation in the probability of excellent
Modelos de regressão aleatória para avaliação da curva de crescimento em matrizes de codorna de corte Random regression models for growth evaluation of meat-type quail hens

Directory of Open Access Journals (Sweden)

Bruno Bastos Teixeira

2012-09-01

The objective was to compare random regression models using Legendre polynomial functions of different orders, in order to identify the one that best fits the genetic study of the growth curve of meat-type quail. Data from 2136 meat-type quail hens were evaluated, of which 1026 belonged to genetic group UFV1 and 1110 to group UFV2. The quail were weighed at 1, 7, 14, 21, 28, 35, 42, 77, 112 and 147 days of age, and these weights were used in the analysis. Two possible structures for heterogeneous residual variance were tested, grouping ages into 3 and 5 classes. The random regression model that best fits the quail growth curve was then studied. Models were compared using the Akaike Information Criterion (AIC), Schwarz's Bayesian Information Criterion (BIC), the logarithm of the likelihood function (Log e L) and the likelihood ratio test (LRT) at the 1% level. The model with heterogeneous residual variance CL3 proved adequate for line UFV1, and model CL5 for line UFV2. A Legendre polynomial function of order 5 for the direct additive genetic effect and 5 for the animal permanent effect for line UFV1, and of order 3 for the direct additive genetic effect and 5 for the animal permanent effect for line UFV2, should be used in the genetic evaluation of the growth curve of meat-type quail.
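Random regression models of this kind typically regress on Legendre polynomials evaluated at ages rescaled to the interval [-1, 1]. The sketch below builds such a covariate matrix for the weighing ages listed in the abstract. It assumes that "order k" means k coefficients (polynomial degree k - 1) and uses unnormalised Legendre polynomials; both are assumptions, since conventions differ between software packages, and the function name `legendre_covariates` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_covariates(ages, order):
    """Return a matrix whose columns are Legendre polynomials P_0..P_{order-1}
    evaluated at ages rescaled linearly to [-1, 1]."""
    ages = np.asarray(ages, dtype=float)
    # Map the youngest age to -1 and the oldest age to +1.
    t = -1.0 + 2.0 * (ages - ages.min()) / (ages.max() - ages.min())
    # legvander(t, deg) returns columns for degrees 0..deg.
    return legendre.legvander(t, order - 1)


weighing_ages = [1, 7, 14, 21, 28, 35, 42, 77, 112, 147]  # days, from the abstract
Z = legendre_covariates(weighing_ages, order=5)  # one row per age, five columns
```

In a mixed-model fit, each animal's rows of such a matrix would multiply its random regression coefficients; that step depends on the software used and is not shown here.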

Regression: A Bibliography.

Science.gov (United States)

Pedrini, D. T.; Pedrini, Bonnie C.

Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Random regression analysis for body weights and main morphological traits in genetically improved farmed tilapia (Oreochromis niloticus).

Science.gov (United States)

He, Jie; Zhao, Yunfeng; Zhao, Jingli; Gao, Jin; Xu, Pao; Yang, Runqing

2018-02-01

To genetically analyse growth traits in genetically improved farmed tilapia (GIFT), the body weight (BWE) and main morphological traits, including body length (BL), body depth (BD), body width (BWI), head length (HL) and length of the caudal peduncle (CPL), were measured six times in growth duration on 1451 fish from 45 mixed families of full and half sibs. A random regression model (RRM) was used to model genetic changes of the growth traits with days of age and estimate the heritability for any growth point and genetic correlations between pairwise growth points. Using the covariance function based on optimal RRMs, the heritabilities were estimated to be from 0.102 to 0.662 for BWE, 0.157 to 0.591 for BL, 0.047 to 0.621 for BD, 0.018 to 0.577 for BWI, 0.075 to 0.597 for HL and 0.032 to 0.610 for CPL between 60 and 140 days of age. All genetic correlations exceeded 0.5 between pairwise growth points. Moreover, the traits at initial days of age showed less correlation with those at later days of age. With phenotypes observed repeatedly, the model choice showed that the optimal RRMs could more precisely predict breeding values at a specific growth time than repeatability models or multiple trait animal models, which enhanced the efficiency of selection for the BWE and main morphological traits.
A brief introduction to regression designs and mixed-effects modelling by a recent convert

OpenAIRE

Balling, Laura Winther

2008-01-01

This article discusses the advantages of multiple regression designs over the factorial designs traditionally used in many psycholinguistic experiments. It is shown that regression designs are typically more informative, statistically more powerful and better suited to the analysis of naturalistic tasks. The advantages of including both fixed and random effects are demonstrated with reference to linear mixed-effects models, and problems of collinearity, variable distribution and variable sele...
Genotype by environment interactions for growth in Red Angus.

Science.gov (United States)

Fennewald, D J; Weaber, R L; Lamberson, W R

2017-02-01

Accuracy of sire selection is limited by how well animals are characterized for their environment. The objective of this study was to evaluate the presence of genotype × environment interactions (G×E) for birth weight (BiW) and weaning weight (WW) for Red Angus in the United States. Adjusted weights were provided by the Red Angus Association of America. Environments were defined as 9 regions within the continental United States with similar temperature-humidity indices. Mean weights of calves were determined for each region and for each sire's progeny within each region. A reaction norm (RN) for each bull was estimated by regressing the sire means on the region means weighted for the number of progeny of each sire. The range for BiW and WW RN was -1.3 to 4.0 and -1.7 to 2.8, respectively. The heritabilities of BiW and WW RN were 0.40 and 0.39, respectively. Phenotypic and genetic correlations between BiW and WW RN were 0.19 and 0.54, respectively. The phenotypic correlation of the progeny mean to the RN was -0.20 (P < 0.05) and suggests that sires with higher means are more stable in progeny performance across environments. Weights in different regions were considered separate traits and genetic correlations were estimated between all pairs of regions as another method to determine G×E. Genetic correlations < 0.80 indicate G×E at a level for concern, but existed for only 2 of 36 estimates for BiW and 12 of 36 estimates for WW. Genetic correlations between different regions ranged from 0.74 to 0.96 for BiW and 0.62 to 0.99 for WW and indicate that sires tend to rank similarly across environments for these traits.
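The reaction norm described in this abstract is the slope from regressing a sire's regional progeny means on the overall region means, weighted by progeny numbers. A minimal sketch of that weighted least-squares slope follows; the function name `reaction_norm` and the example numbers are hypothetical, and the study's exact weighting and scaling may differ.

```python
import numpy as np


def reaction_norm(sire_means, region_means, n_progeny):
    """Weighted least-squares slope of one sire's regional progeny means
    on the region means, with progeny counts as weights."""
    y = np.asarray(sire_means, dtype=float)
    x = np.asarray(region_means, dtype=float)
    w = np.asarray(n_progeny, dtype=float)
    x_bar = np.average(x, weights=w)   # weighted mean of region means
    y_bar = np.average(y, weights=w)   # weighted mean of sire means
    return np.sum(w * (x - x_bar) * (y - y_bar)) / np.sum(w * (x - x_bar) ** 2)


# Made-up example: three regions, one sire.
slope = reaction_norm(
    sire_means=[461.0, 481.0, 501.0],
    region_means=[230.0, 240.0, 250.0],
    n_progeny=[12, 30, 8],
)
```

A slope near 1 would mean the sire's progeny track the regional averages, while larger or smaller slopes indicate more or less sensitivity to the environment.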
Advanced statistics: linear regression, part I: simple linear regression.

Science.gov (United States)

Marill, Keith A

2004-01-01

Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
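The method of least squares mentioned in this abstract has a closed-form solution in the one-predictor case. The sketch below is a generic illustration with hypothetical names and toy data, not material from the article.

```python
def simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data lying exactly on y = 1 + 2x.
a, b = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

The slope is the ratio of the cross-product sum to the sum of squares of x, and the intercept forces the line through the point of means.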
Early detection of structural changes in random signal

International Nuclear Information System (INIS)

Kuroda, Yoshiteru; Yokota, Katsuhiro

1981-01-01

Early detection of structural changes in an observed random signal is very important for system diagnosis. In this paper, the following procedures are applied to this problem and the results are compared: (1) fitting an auto-regressive model to the random signal to calculate the prediction error, i.e., the difference between observed and predicted values; (2) an auto-regressive method that calculates the sum of the prediction errors; (3) a method based on AIC (Akaike Information Criterion). These procedures are simulated, indicating their merits and demerits as diagnostic tools. (author)
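The first procedure in this abstract, monitoring the prediction error of an auto-regressive model, can be sketched as follows. The AR coefficients are fitted by ordinary least squares on a reference signal and then used to compute one-step prediction errors on new data. The function names and the toy signals are hypothetical, and the paper's actual estimation method is not stated in the abstract.

```python
import numpy as np


def lagged_matrix(x, p):
    """Rows are [x[t-1], ..., x[t-p]] for t = p .. len(x) - 1."""
    return np.column_stack([x[p - k : len(x) - k] for k in range(1, p + 1)])


def fit_ar(x, p):
    """Least-squares estimate of AR(p) coefficients (no intercept)."""
    x = np.asarray(x, dtype=float)
    coef, *_ = np.linalg.lstsq(lagged_matrix(x, p), x[p:], rcond=None)
    return coef


def prediction_errors(x, coef):
    """Observed minus one-step-ahead predicted values under the fitted model."""
    x = np.asarray(x, dtype=float)
    p = len(coef)
    return x[p:] - lagged_matrix(x, p) @ coef


# Reference signal follows x[t] = 0.5 * x[t-1]; the new signal decays more slowly.
reference = 0.5 ** np.arange(10)
changed = 0.9 ** np.arange(10)
coef = fit_ar(reference, p=1)
errors = prediction_errors(changed, coef)  # clearly non-zero after the change
```

Prediction errors that grow, or whose running sum drifts, relative to the reference period are the kind of signal such a diagnostic would look for.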
A Comparison of Advanced Regression Algorithms for Quantifying Urban Land Cover

Directory of Open Access Journals (Sweden)

Akpona Okujeni

2014-07-01

Quantitative methods for mapping sub-pixel land cover fractions are gaining increasing attention, particularly with regard to upcoming hyperspectral satellite missions. We evaluated five advanced regression algorithms combined with synthetically mixed training data for quantifying urban land cover from HyMap data at 3.6 and 9 m spatial resolution. Methods included support vector regression (SVR), kernel ridge regression (KRR), artificial neural networks (NN), random forest regression (RFR) and partial least squares regression (PLSR). Our experiments demonstrate that both kernel methods SVR and KRR yield high accuracies for mapping complex urban surface types, i.e., rooftops, pavements, grass- and tree-covered areas. SVR and KRR models proved to be stable with regard to the spatial and spectral differences between both images and effectively utilized the higher complexity of the synthetic training mixtures for improving estimates for coarser resolution data. Observed deficiencies mainly relate to known problems arising from spectral similarities or shadowing. The remaining regressors either revealed erratic (NN) or limited (RFR and PLSR) performances when comprehensively mapping urban land cover. Our findings suggest that the combination of kernel-based regression methods, such as SVR and KRR, with synthetically mixed training data is well suited for quantifying urban land cover from imaging spectrometer data at multiple scales.
Comparative evaluation of left ventricular mass regression after aortic valve replacement: a prospective randomized analysis

Directory of Open Access Journals (Sweden)

Kiessling Arndt H

2011-10-01

Background We assessed the hemodynamic performance of various prostheses and the clinical outcomes after aortic valve replacement, in different age groups. Methods One-hundred-and-twenty patients with isolated aortic valve stenosis were included in this prospective randomized trial and allocated in three age-groups to receive either pulmonary autograft (PA, n = 20) or mechanical prosthesis (MP, Edwards Mira n = 20) in group 1 (age 75. Clinical outcomes and hemodynamic performance were evaluated at discharge, six months and one year. Results In group 1, patients with PA had significantly lower mean gradients than the MP (2.6 vs. 10.9 mmHg, p = 0.0005) with comparable left ventricular mass regression (LVMR). Morbidity included 1 stroke in the PA population and 1 gastrointestinal bleeding in the MP subgroup. In group 2, mean gradients did not differ significantly between both populations (7.0 vs. 8.9 mmHg, p = 0.81). The rate of LVMR and EF were comparable at 12 months; each group with one mortality. Morbidity included 1 stroke and 1 gastrointestinal bleeding in the stentless and 3 bleeding complications in the MP group. In group 3, mean gradients did not differ significantly (7.8 vs 6.5 mmHg, p = 0.06). Postoperative EF and LVMR were comparable. There were 3 deaths in the stented group and no mortality in the stentless group. Morbidity included 1 endocarditis and 1 stroke in the stentless compared to 1 endocarditis, 1 stroke and one pulmonary embolism in the stented group. Conclusions Clinical outcomes justify valve replacement with either valve substitute in the respective age groups. The PA hemodynamically outperformed the MPs. Stentless valves however, did not demonstrate significantly superior hemodynamics or outcomes in comparison to stented bioprosthesis or MPs.
Replicating Experimental Impact Estimates Using a Regression Discontinuity Approach. NCEE 2012-4025

Science.gov (United States)

Gleason, Philip M.; Resch, Alexandra M.; Berk, Jillian A.

2012-01-01

This NCEE Technical Methods Paper compares the estimated impacts of an educational intervention using experimental and regression discontinuity (RD) study designs. The analysis used data from two large-scale randomized controlled trials--the Education Technology Evaluation and the Teach for America Study--to provide evidence on the performance of…
Probabilistic Signal Recovery and Random Matrices

Science.gov (United States)

2016-12-08

that classical methods for linear regression (such as Lasso) are applicable for non-linear data. This surprising finding has already found several...we studied the complexity of convex sets. In numerical linear algebra, we analyzed the fastest known randomized approximation algorithm for...and perfect matchings In numerical linear algebra, we studied the fastest known randomized approximation algorithm for computing the permanents of
Ordinary least square regression, orthogonal regression, geometric mean regression and their applications in aerosol science

International Nuclear Information System (INIS)

Leng Ling; Zhang Tianyi; Kleinman, Lawrence; Zhu Wei

2007-01-01

Regression analysis, especially the ordinary least squares method which assumes that errors are confined to the dependent variable, has seen a fair share of its applications in aerosol science. The ordinary least squares approach, however, could be problematic due to the fact that atmospheric data often does not lend itself to calling one variable independent and the other dependent. Errors often exist for both measurements. In this work, we examine two regression approaches available to accommodate this situation. They are orthogonal regression and geometric mean regression. Comparisons are made theoretically as well as numerically through an aerosol study examining whether the ratio of organic aerosol to CO would change with age
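The three fits compared in this abstract differ in how they treat error in x. As a rough illustration, the sketch below computes the three slopes from the same sums of squares: the OLS slope, the geometric mean regression slope, and the orthogonal regression slope under the assumption of equal error variances in x and y. These are standard textbook formulas rather than the paper's derivations, and the function name and toy data are hypothetical.

```python
import math


def regression_slopes(x, y):
    """Return (ols, geometric_mean, orthogonal) slopes for y versus x.
    Assumes the sample covariance of x and y is non-zero."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ols = sxy / sxx                                   # errors in y only
    gmr = math.copysign(math.sqrt(syy / sxx), sxy)    # geometric mean regression
    # Orthogonal regression, assuming equal error variances in x and y.
    orth = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return ols, gmr, orth


# Noisy toy data scattered around y = 2x.
slopes = regression_slopes([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
```

With noise in both variables the OLS slope is never larger in magnitude than the geometric mean slope, which is one way to see why the choice of method matters when neither variable is clearly the independent one.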
A comparison between Poisson and zero-inflated Poisson regression models with an application to number of black spots in Corriedale sheep

Directory of Open Access Journals (Sweden)

Rodrigues-Motta Mariana

2008-07-01

Dark spots in the fleece area are often associated with dark fibres in wool, which limits its competitiveness with other textile fibres. Field data from a sheep experiment in Uruguay revealed an excess number of zeros for dark spots. We compared the performance of four Poisson and zero-inflated Poisson (ZIP) models under four simulation scenarios. All models performed reasonably well under the same scenario for which the data were simulated. The deviance information criterion favoured a Poisson model with residual, while the ZIP model with a residual gave estimates closer to their true values under all simulation scenarios. Both Poisson and ZIP models with an error term at the regression level performed better than their counterparts without such an error. Field data from Corriedale sheep were analysed with Poisson and ZIP models with residuals. Parameter estimates were similar for both models. Although the posterior distribution of the sire variance was skewed due to a small number of rams in the dataset, the median of this variance suggested a scope for genetic selection. The main environmental factor was the age of the sheep at shearing. In summary, age related processes seem to drive the number of dark spots in this breed of sheep.
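The difference between the two count distributions compared in this abstract can be seen from their probability mass functions. The sketch below shows only the plain Poisson and zero-inflated Poisson distributions; the residual and sire effects used in the study's models are not represented, and the function names and parameter values are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi_zero):
    """P(Y = k) for a zero-inflated Poisson: with probability pi_zero the
    count is a structural zero, otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


# With the same lam, the ZIP model puts more probability on zero.
p0_poisson = poisson_pmf(0, lam=2.0)
p0_zip = zip_pmf(0, lam=2.0, pi_zero=0.3)
```

The extra mass at zero is what lets a ZIP model accommodate an excess of animals with no dark spots.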
Polynomial regression analysis and significance test of the regression function

International Nuclear Information System (INIS)

Gao Zhengming; Zhao Juan; He Shengping

2012-01-01

In order to analyze the decay heating power per kilogram of a certain radioactive isotope with the polynomial regression method, the paper first demonstrates the broad usage of polynomial functions and derives their parameters by ordinary least squares estimation. A significance test for the polynomial regression function is then derived from the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and a significance test of the polynomial function are applied to the decay heating power of the isotope per kilogram, in accordance with the authors' real work. (authors)
Reduced Rank Regression

DEFF Research Database (Denmark)

Johansen, Søren

2008-01-01

The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
Quantile Regression Methods

DEFF Research Database (Denmark)

Fitzenberger, Bernd; Wilke, Ralf Andreas

2015-01-01

Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based...
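Quantile regression is commonly formulated as minimising the check (pinball) loss rather than squared error. The sketch below evaluates that loss for constant predictions on toy data, where the minimiser at tau = 0.5 is the sample median. It is a generic illustration with hypothetical names, not code from the chapter.

```python
import numpy as np


def pinball_loss(y, prediction, tau):
    """Average check loss of a prediction at quantile level tau (0 < tau < 1)."""
    residual = np.asarray(y, dtype=float) - prediction
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return float(np.mean(np.maximum(tau * residual, (tau - 1.0) * residual)))


# Toy data with one extreme value.
y = [1, 2, 3, 4, 100]
losses = {c: pinball_loss(y, c, tau=0.5) for c in y}
best_constant = min(losses, key=losses.get)  # the candidate with the smallest loss
```

Because the loss grows linearly rather than quadratically, the extreme value pulls the fitted quantile far less than it would pull a mean.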
The Regression Analysis of Individual Financial Performance: Evidence from Croatia

OpenAIRE

Bahovec, Vlasta; Barbić, Dajana; Palić, Irena

2017-01-01

Background: A large body of empirical literature indicates that gender and financial literacy are significant determinants of individual financial performance. Objectives: The purpose of this paper is to recognize the impact of the variable financial literacy and the variable gender on the variation of the financial performance using the regression analysis. Methods/Approach: The survey was conducted using the systematically chosen random sample of Croatian financial consumers. The cross sect...
A brief introduction to regression designs and mixed-effects modelling by a recent convert

DEFF Research Database (Denmark)

Balling, Laura Winther

2008-01-01

This article discusses the advantages of multiple regression designs over the factorial designs traditionally used in many psycholinguistic experiments. It is shown that regression designs are typically more informative, statistically more powerful and better suited to the analysis of naturalistic tasks. The advantages of including both fixed and random effects are demonstrated with reference to linear mixed-effects models, and problems of collinearity, variable distribution and variable selection are discussed. The advantages of these techniques are exemplified in an analysis of a word...
Regression Phalanxes

OpenAIRE

Zhang, Hongyang; Welch, William J.; Zamar, Ruben H.

2017-01-01

Tomal et al. (2015) introduced the notion of "phalanxes" in the context of rare-class detection in two-class classification problems. A phalanx is a subset of features that work well for classification tasks. In this paper, we propose a different class of phalanxes for application in regression settings. We define a "Regression Phalanx" - a subset of features that work well together for prediction. We propose a novel algorithm which automatically chooses Regression Phalanxes from high-dimensi...
Ethanolic extract of Artemisia aucheri induces regression of aorta wall fatty streaks in hypercholesterolemic rabbits.

Science.gov (United States)

Asgary, S; Dinani, N Jafari; Madani, H; Mahzouni, P

2008-05-01

Artemisia aucheri is a native-growing plant which is widely used in Iranian traditional medicine. This study was designed to evaluate the effects of A. aucheri on regression of atherosclerosis in hypercholesterolemic rabbits. Twenty five rabbits were randomly divided into five groups of five each and treated 3-months as follows: 1: normal diet, 2: hypercholesterolemic diet (HCD), 3 and 4: HCD for 60 days and then normal diet and normal diet + A. aucheri (100 mg x kg(-1) x day(-1)) respectively for an additional 30 days (regression period). In the regression period dietary use of A. aucheri in group 4 significantly decreased total cholesterol, triglyceride and LDL-cholesterol, while HDL-cholesterol was significantly increased. The atherosclerotic area was significantly decreased in this group. Animals, which received only normal diet in the regression period showed no regression but rather progression of atherosclerosis. These findings suggest that A. aucheri may cause regression of atherosclerotic lesions.
Variances in the projections, resulting from CLIMEX, Boosted Regression Trees and Random Forests techniques

Science.gov (United States)

Shabani, Farzin; Kumar, Lalit; Solhjouy-fard, Samaneh

2017-08-01

The aim of this study was to compare and evaluate the capabilities of correlative and mechanistic modeling processes applied to the projection of future distributions of date palm in novel environments, and to establish a method of minimizing uncertainty in the projections of differing techniques. The study covers Middle Eastern countries. We compared the mechanistic model CLIMEX (CL) with the correlative models MaxEnt (MX), Boosted Regression Trees (BRT), and Random Forests (RF) to project current and future distributions of date palm (Phoenix dactylifera L.). The CSIRO-Mk3.0 (CS) Global Climate Model (GCM), run under the A2 emissions scenario, was selected for making projections. Both indigenous and alien distribution data of the species were utilized in the modeling process. The common areas predicted by MX, BRT, RF, and CL from the CS GCM were extracted and compared to ascertain projection uncertainty levels of each individual technique. The common areas identified by all four modeling techniques were used to produce a map indicating suitable and unsuitable areas for date palm cultivation in Middle Eastern countries, for the present and the year 2100. The four modeling approaches predict fairly different distributions. Projections from CL were more conservative than those from MX. BRT and RF were the most conservative methods in terms of projections for the current time. The combination of the final CL and MX projections for the present and 2100 provides higher certainty concerning those areas that will become highly suitable for future date palm cultivation. According to the four models, cold, hot, and wet stress, with differences on a regional basis, appear to be the major restrictions on future date palm distribution. The results demonstrate variances in the projections, resulting from different techniques. The assessment and interpretation of model projections requires reservations

Strengthening the Regression Discontinuity Design Using Additional Design Elements: A Within-Study Comparison

Science.gov (United States)

Wing, Coady; Cook, Thomas D.

2013-01-01

The sharp regression discontinuity design (RDD) has three key weaknesses compared to the randomized clinical trial (RCT). It has lower statistical power, it is more dependent on statistical modeling assumptions, and its treatment effect estimates are limited to the narrow subpopulation of cases immediately around the cutoff, which is rarely of…
Advanced statistics: linear regression, part II: multiple linear regression.

Science.gov (United States)

Marill, Keith A

2004-01-01

The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
Boosted beta regression.

Directory of Open Access Journals (Sweden)

Matthias Schmid

Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
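The relationship between a logit-type linear predictor and a Beta-distributed response can be sketched as follows. This assumes the common mean/precision parameterization (mean mu, precision phi), which the abstract does not spell out, and it shows only the model structure, not the boosting algorithm proposed in the paper. The coefficients and covariate value are hypothetical.

```python
import math


def logistic(eta):
    """Inverse logit link: maps a linear predictor to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def beta_shape_parameters(mu, phi):
    """Convert mean mu and precision phi to the usual Beta(a, b) shapes."""
    return mu * phi, (1.0 - mu) * phi


def beta_variance(mu, phi):
    """Variance of the response under the mean/precision parameterization."""
    return mu * (1.0 - mu) / (1.0 + phi)


# Hypothetical coefficients and covariate value, for illustration only.
intercept, slope, x = -1.0, 0.5, 2.0
mu = logistic(intercept + slope * x)  # linear predictor is 0 here
a, b = beta_shape_parameters(mu, phi=10.0)
var = beta_variance(mu, phi=10.0)
```

In this form the variance depends on both the mean and the precision, so letting covariates drive phi as well as mu is one way a model can accommodate overdispersion in percentage data.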
Regression to Causality : Regression-style presentation influences causal attribution

DEFF Research Database (Denmark)

Bordacconi, Mats Joe; Larsen, Martin Vinæs

2014-01-01

...models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results more likely. Our experiment drew on a sample of 235 university students from three different social science degree programs (political science, sociology and economics), all of whom had received substantial training in statistics. The subjects were asked to compare and evaluate the validity of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...
THE EFFECT OF PHANTOM GROUPS ON GENETIC TREND

African Journals Online (AJOL)

Helena Theron

In an animal model evaluation of breeding values it is assumed that the base animals are all at the .... these values over the daughters of a sire reduces the stochastic component and represents a genetic ... To test whether the genetic trend is over- or underestimated, a regression model is fitted to the DYD ... parameter.
Sample size adjustments for varying cluster sizes in cluster randomized trials with binary outcomes analyzed with second-order PQL mixed logistic regression.

Science.gov (United States)

Candel, Math J J M; Van Breukelen, Gerard J P

2010-06-30

Adjustments of sample size formulas are given for varying cluster sizes in cluster randomized trials with a binary outcome when testing the treatment effect with mixed effects logistic regression using second-order penalized quasi-likelihood estimation (PQL). Starting from first-order marginal quasi-likelihood (MQL) estimation of the treatment effect, the asymptotic relative efficiency of unequal versus equal cluster sizes is derived. A Monte Carlo simulation study shows this asymptotic relative efficiency to be rather accurate for realistic sample sizes, when employing second-order PQL. An approximate, simpler formula is presented to estimate the efficiency loss due to varying cluster sizes when planning a trial. In many cases sampling 14 per cent more clusters is sufficient to repair the efficiency loss due to varying cluster sizes. Since current closed-form formulas for sample size calculation are based on first-order MQL, planning a trial also requires a conversion factor to obtain the variance of the second-order PQL estimator. In a second Monte Carlo study, this conversion factor turned out to be 1.25 at most. (c) 2010 John Wiley & Sons, Ltd.
Modelling subject-specific childhood growth using linear mixed-effect models with cubic regression splines.

Science.gov (United States)

Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William

2016-01-01

Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19
Exploring reasons for the observed inconsistent trial reports on intra-articular injections with hyaluronic acid in the treatment of osteoarthritis: Meta-regression analyses of randomized trials.

Science.gov (United States)

Johansen, Mette; Bahrt, Henriette; Altman, Roy D; Bartels, Else M; Juhl, Carsten B; Bliddal, Henning; Lund, Hans; Christensen, Robin

2016-08-01

The aim was to identify factors explaining inconsistent observations concerning the efficacy of intra-articular hyaluronic acid compared to intra-articular sham/control, or non-intervention control, in patients with symptomatic osteoarthritis, based on randomized clinical trials (RCTs). A systematic review and meta-regression analyses of available randomized trials were conducted. The outcome, pain, was assessed according to a pre-specified hierarchy of potentially available outcomes. Hedges's standardized mean difference [SMD (95% CI)] served as effect size. Restricted Maximum Likelihood (REML) mixed-effects models were used to combine study results, and heterogeneity was calculated and interpreted as Tau-squared and I-squared, respectively. Overall, 99 studies (14,804 patients) met the inclusion criteria: Of these, only 71 studies (72%), including 85 comparisons (11,216 patients), had adequate data available for inclusion in the primary meta-analysis. Overall, compared with placebo, intra-articular hyaluronic acid reduced pain with an effect size of -0.39 [-0.47 to -0.31; P hyaluronic acid. Based on available trial data, intra-articular hyaluronic acid showed a better effect than intra-articular saline on pain reduction in osteoarthritis. Publication bias and the risk of selective outcome reporting suggest only small clinical effect compared to saline. Copyright © 2016 Elsevier Inc. All rights reserved.
Prediction of N2O emission from local information with Random Forest

International Nuclear Information System (INIS)

Philibert, Aurore; Loyce, Chantal; Makowski, David

2013-01-01

Nitrous oxide is a potent greenhouse gas, with a global warming potential 298 times greater than that of CO2. In agricultural soils, N2O emissions are influenced by a large number of environmental characteristics and crop management techniques that are not systematically reported in experiments. Random Forest (RF) is a machine learning method that can handle missing data and ranks input variables on the basis of their importance. We aimed to predict N2O emission on the basis of local information, to rank environmental and crop management variables according to their influence on N2O emission, and to compare the performances of RF with several regression models. RF outperformed the regression models for predictive purposes, and this approach led to the identification of three important input variables: N fertilization, type of crop, and experiment duration. This method could be used in the future for prediction of N2O emissions from local information. -- Highlights: ► Random Forest gave more accurate N2O predictions than regression. ► Missing data were well handled by Random Forest. ► The most important factors were nitrogen rate, type of crop and experiment duration. -- Random Forest, a machine learning method, outperformed the regression models for predicting N2O emissions and led to the identification of three important input variables
How a dependent's variable non-randomness affects taper equation ...

African Journals Online (AJOL)

In order to apply the least squares method in regression analysis, the values of the dependent variable Y should be random. In an example of regression analysis linear and nonlinear taper equations, which estimate the diameter of the tree dhi at any height of the tree hi, were compared. For each tree the diameter at the ...
Regression analysis with categorized regression calibrated exposure: some interesting findings

Directory of Open Access Journals (Sweden)

Hjartåker Anette

2006-07-01

Background: Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile) scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods: We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on a quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC). Results: In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion: Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a
Evaluation of inbreeding in laying hens by applying optimum genetic contribution and gene flow theory.

Science.gov (United States)

König, S; Tsehay, F; Sitzenstock, F; von Borstel, U U; Schmutz, M; Preisinger, R; Simianer, H

2010-04-01

Due to consistent increases of inbreeding of on average 0.95% per generation in layer populations, selection tools should consider both genetic gain and genetic relationships in the long term. The optimum genetic contribution theory using official estimated breeding values for egg production was applied for 3 different lines of a layer breeding program to find the optimal allocations of hens and sires. Constraints in different scenarios encompassed restrictions related to additive genetic relationships, the increase of inbreeding, the number of selected sires and hens, and the number of selected offspring per mating. All these constraints enabled higher genetic gain up to 10.9% at the same level of additive genetic relationships or in lower relationships at the same gain when compared with conventional selection schemes ignoring relationships. Increases of inbreeding and genetic gain were associated with the number of selected sires. For the lowest level of the allowed average relationship at 10%, the optimal number of sires was 70 and the estimated breeding value for egg production of the selected group was 127.9. At the highest relationship constraint (16%), the optimal number of sires decreased to 15, and the average genetic value increased to 139.7. Contributions from selected sires and hens were used to develop specific mating plans to minimize inbreeding in the following generation by applying a simulated annealing algorithm. The additional reduction of average additive genetic relationships for matings was up to 44.9%. An innovative deterministic approach to estimate kinship coefficients between and within defined selection groups based on gene flow theory was applied to compare increases of inbreeding from random matings with layer populations undergoing selection. Large differences in rates of inbreeding were found, and they underline the necessity to establish selection tools controlling long-term relationships. Furthermore, it was suggested to use
Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the I2 statistic.

Science.gov (United States)

Bowden, Jack; Del Greco M, Fabiola; Minelli, Cosetta; Davey Smith, George; Sheehan, Nuala A; Thompson, John R

2016-12-01

MR-Egger regression has recently been proposed as a method for Mendelian randomization (MR) analyses incorporating summary data estimates of causal effect from multiple individual variants, which is robust to invalid instruments. It can be used to test for directional pleiotropy and provides an estimate of the causal effect adjusted for its presence. MR-Egger regression provides a useful additional sensitivity analysis to the standard inverse variance weighted (IVW) approach that assumes all variants are valid instruments. Both methods use weights that consider the single nucleotide polymorphism (SNP)-exposure associations to be known, rather than estimated. We call this the `NO Measurement Error' (NOME) assumption. Causal effect estimates from the IVW approach exhibit weak instrument bias whenever the genetic variants utilized violate the NOME assumption, which can be reliably measured using the F-statistic. The effect of NOME violation on MR-Egger regression has yet to be studied. An adaptation of the $I^2$ statistic from the field of meta-analysis is proposed to quantify the strength of NOME violation for MR-Egger. It lies between 0 and 1, and indicates the expected relative bias (or dilution) of the MR-Egger causal estimate in the two-sample MR context. We call it $I^2_{GX}$. The method of simulation extrapolation is also explored to counteract the dilution. Their joint utility is evaluated using simulated data and applied to a real MR example. In simulated two-sample MR analyses we show that, when a causal effect exists, the MR-Egger estimate of causal effect is biased towards the null when NOME is violated, and the stronger the violation (as indicated by lower values of $I^2_{GX}$), the stronger the dilution. When additionally all genetic variants are valid instruments, the type I error rate of the MR-Egger test for pleiotropy is inflated and the causal effect underestimated. Simulation extrapolation is shown to substantially mitigate these adverse effects. We
Virtual machine consolidation enhancement using hybrid regression algorithms

Directory of Open Access Journals (Sweden)

Amany Abdelsamea

2017-11-01

Cloud computing data centers are growing rapidly in both number and capacity to meet the increasing demands for highly-responsive computing and massive storage. Such data centers consume enormous amounts of electrical energy, resulting in high operating costs and carbon dioxide emissions. The reason for this extremely high energy consumption is not just the quantity of computing resources and the power inefficiency of hardware, but rather lies in the inefficient usage of these resources. VM consolidation involves live migration of VMs, that is, the capability of transferring a VM between physical servers with close to zero downtime. It is an effective way to improve the utilization of resources and increase energy efficiency in cloud data centers. VM consolidation consists of host overload/underload detection, VM selection and VM placement. Most of the current VM consolidation approaches apply either heuristic-based techniques, such as static utilization thresholds, decision-making based on statistical analysis of historical data; or simply periodic adaptation of the VM allocation. Most of those algorithms rely on CPU utilization only for host overload detection. In this paper we propose using hybrid factors to enhance VM consolidation. Specifically, we developed a multiple regression algorithm that uses CPU utilization, memory utilization and bandwidth utilization for host overload detection. The proposed algorithm, Multiple Regression Host Overload Detection (MRHOD), significantly reduces energy consumption while ensuring a high level of adherence to Service Level Agreements (SLA), since it gives a real indication of host utilization based on three parameters (CPU, memory and bandwidth utilization) instead of one parameter only (CPU utilization). Through simulations we show that our approach reduces power consumption by a factor of 6 compared to single factor algorithms using random workload. Also using PlanetLab workload traces we show that MRHOD improves
Variable Selection for Regression Models of Percentile Flows

Science.gov (United States)

Fouad, G.

2017-12-01

Percentile flows describe the flow magnitude equaled or exceeded for a given percent of time, and are widely used in water resource management. However, these statistics are normally unavailable since most basins are ungauged. Percentile flows of ungauged basins are often predicted using regression models based on readily observable basin characteristics, such as mean elevation. The number of these independent variables is too large to evaluate all possible models. A subset of models is typically evaluated using automatic procedures, like stepwise regression. This ignores a large variety of methods from the field of feature (variable) selection and physical understanding of percentile flows. A study of 918 basins in the United States was conducted to compare an automatic regression procedure to the following variable selection methods: (1) principal component analysis, (2) correlation analysis, (3) random forests, (4) genetic programming, (5) Bayesian networks, and (6) physical understanding. The automatic regression procedure only performed better than principal component analysis. Poor performance of the regression procedure was due to a commonly used filter for multicollinearity, which rejected the strongest models because they had cross-correlated independent variables. Multicollinearity did not decrease model performance in validation because of a representative set of calibration basins. Variable selection methods based strictly on predictive power (numbers 2-5 from above) performed similarly, likely indicating a limit to the predictive power of the variables. Similar performance was also reached using variables selected based on physical understanding, a finding that substantiates recent calls to emphasize physical understanding in modeling for predictions in ungauged basins. The strongest variables highlighted the importance of geology and land cover, whereas widely used topographic variables were the weakest predictors. Variables suffered from a high
Accelerated convergence and robust asymptotic regression of the Gumbel scale parameter for gapped sequence alignment

International Nuclear Information System (INIS)

Park, Yonil; Sheetlin, Sergey; Spouge, John L

2005-01-01

Searches through biological databases provide the primary motivation for studying sequence alignment statistics. Other motivations include physical models of annealing processes or mathematical similarities to, e.g., first-passage percolation and interacting particle systems. Here, we investigate sequence alignment statistics, partly to explore two general mathematical methods. First, we model the global alignment of random sequences heuristically with Markov additive processes. In sequence alignment, the heuristic suggests a numerical acceleration scheme for simulating an important asymptotic parameter (the Gumbel scale parameter λ). The heuristic might apply to similar mathematical theories. Second, we extract the asymptotic parameter λ from simulation data with the statistical technique of robust regression. Robust regression is admirably suited to 'asymptotic regression' and deserves to be better known for it
Effect of folic acid on appetite in children: ordinal logistic and fuzzy logistic regressions.

Science.gov (United States)

Namdari, Mahshid; Abadi, Alireza; Taheri, S Mahmoud; Rezaei, Mansour; Kalantari, Naser; Omidvar, Nasrin

2014-03-01

Reduced appetite and low food intake are often a concern in preschool children, since it can lead to malnutrition, a leading cause of impaired growth and mortality in childhood. It is occasionally considered that folic acid has a positive effect on appetite enhancement and consequently growth in children. The aim of this study was to assess the effect of folic acid on the appetite of preschool children 3 to 6 y old. The study sample included 127 children ages 3 to 6 who were randomly selected from 20 preschools in the city of Tehran in 2011. Since appetite was measured by linguistic terms, a fuzzy logistic regression was applied for modeling. The obtained results were compared with a statistical ordinal logistic model. After controlling for the potential confounders, in a statistical ordinal logistic model, serum folate showed a significantly positive effect on appetite. A small but positive effect of folate was detected by fuzzy logistic regression. Based on fuzzy regression, the risk for poor appetite in preschool children was related to the employment status of their mothers. In this study, a positive association was detected between the levels of serum folate and improved appetite. For further investigation, a randomized controlled, double-blind clinical trial could be helpful to address causality. Copyright © 2014 Elsevier Inc. All rights reserved.
Genetic parameters for residual feed intake in a random population of Pekin duck

Directory of Open Access Journals (Sweden)

Yunsheng Zhang

2017-02-01

Objective: Feed intake (FI) and feed efficiency are economically important traits in ducks. To gain insight into these traits, we designed an experiment based on the residual feed intake (RFI) and feed conversion ratio (FCR) of a random population of Pekin ducks. Methods: A pedigreed random population of 2,020 Pekin ducks was established from 90 males mated to 450 females in two hatches. The traits analyzed were body weight at day 42 (BW42), average daily gain from 15 to 42 days (ADG), and FI, FCR, and RFI from 15 to 42 days, in order to assess their genetic inter-relationships. Genetic parameters for feed efficiency traits were estimated using restricted maximum likelihood (REML) methodology applied to a sire-dam model for all traits using the ASREML software. Results: Estimated heritabilities of BW42, ADG, FI, FCR, and RFI were 0.39, 0.38, 0.33, 0.38, and 0.41, respectively. The genetic correlation was high between RFI and FI (0.77) and moderate between RFI and FCR (0.54). Genetic correlations were high between FCR and ADG (−0.80) and moderate between FCR and BW42 (−0.64) and between FCR and FI (0.49). Conclusion: Selection on RFI is expected to improve feed efficiency and reduce FI. Selection on RFI thus improves the feed efficiency of animals without impairing their FI and increase growth rate.
Time-adaptive quantile regression

DEFF Research Database (Denmark)

Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik

2008-01-01

An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered.
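As a rough illustration of the optimization problem behind quantile regression, the sketch below minimizes the check (pinball) loss for a constant-only model, where the minimizer is a sample quantile. It is not the simplex-based updating algorithm of the record above; the function names and the toy data are assumptions made purely for illustration.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def constant_quantile_fit(y, tau):
    """Return the observation that minimises the total pinball loss.

    For a constant-only model a minimiser always exists among the
    observed values, so searching over them is sufficient.
    """
    return min(y, key=lambda q: sum(pinball_loss(v - q, tau) for v in y))


# Toy "wind power" values, purely illustrative.
power = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6, 5.0]
median_fit = constant_quantile_fit(power, 0.5)
upper_fit = constant_quantile_fit(power, 0.9)
```

With covariates, the same loss leads to a linear program, which is what makes a simplex-type solver applicable.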
Parameter estimation and statistical test of geographically weighted bivariate Poisson inverse Gaussian regression models

Science.gov (United States)

Amalia, Junita; Purhadi; Otok, Bambang Widjanarko

2017-11-01

The Poisson distribution is a discrete distribution for count data with a single parameter that defines both the mean and the variance. Poisson regression assumes that the mean and variance are equal (equidispersion). Nonetheless, some count data violate this assumption because the variance exceeds the mean (over-dispersion). Ignoring over-dispersion leads to underestimated standard errors and, in turn, to incorrect decisions in statistical tests. Paired count data are correlated and follow a bivariate Poisson distribution. If over-dispersion is present, simple bivariate Poisson regression is not sufficient for modeling paired count data. The Bivariate Poisson Inverse Gaussian Regression (BPIGR) model is a mixed Poisson regression for modeling over-dispersed paired count data. The BPIGR model produces a global model for all locations. On the other hand, each location has different geographic, social, cultural and economic conditions, so Geographically Weighted Regression (GWR) is needed. The weighting function of each location in GWR generates a different local model. The Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (GWBPIGR) model is used to handle over-dispersion and to generate local models. Parameters of the GWBPIGR model are estimated by Maximum Likelihood Estimation (MLE), and hypothesis testing is carried out with the Maximum Likelihood Ratio Test (MLRT).
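The equidispersion assumption described above can be checked informally with the variance-to-mean ratio of the counts. The following is a minimal sketch with made-up counts; the function name and both data sets are hypothetical and are not taken from the study.

```python
def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 under a Poisson model)."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


# Hypothetical count series with the same mean but very different spread.
calm = [2, 3, 1, 2, 4, 2, 3, 1, 2, 3]
bursty = [0, 0, 1, 0, 12, 0, 9, 0, 1, 0]

calm_ratio = dispersion_index(calm)
bursty_ratio = dispersion_index(bursty)
```

A ratio well above 1, as for the second series, is the over-dispersion that mixed Poisson models such as the Poisson inverse Gaussian are meant to absorb.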

East African Medical Journal

African Journals Online (AJOL)

2002-07-01

Jul 1, 2002 ... PREVALENCE OF VITAMIN A DEFICIENCY AMONG PRE-SCHOOL AND SCHOOL-AGED CHILDREN IN ARSSI ZONE. ETHIOPIA. YT Asrat, BSc. MSc ... in the “low” range (<20 µg/dl) in 51% of the children. Conclusion: The results ... of Arssi zone Dodotana Sire district was selected at random for this study.
Evaluation of Columbia, U.S. Meat Animal Research Center Composite, Suffolk, and Texel rams as terminal sires in an extensive rangeland production system: VI. Measurements of live-lamb and carcass shape and their relationship to carcass yield and value.

Science.gov (United States)

Notter, D R; Mousel, M R; Leeds, T D; Zerby, H N; Moeller, S J; Lewis, G S; Taylor, J B

2014-05-01

Linear measurements on live lambs and carcasses can be used to characterize sheep breeds and may have value for prediction of carcass yield and value. This study used 512 crossbred lambs produced over 3 yr by mating Columbia, U.S. Meat Animal Research Center (USMARC) Composite, Suffolk, and Texel rams to adult Rambouillet ewes to assess sire-breed differences in live-animal and carcass shape and to evaluate the value of shape measurements as predictors of chilled carcass weight (CCW), weight of high-value cuts (rack, loin, leg, and sirloin; HVW), weight of trimmed high-value cuts (trimmed rack and loin and trimmed, boneless leg and sirloin; TrHVW), and estimated carcass value before (CVal) and after trimming of high-value cuts (TrCVal). Lambs were produced under extensive rangeland conditions, weaned at an average age of 132 d, fed a concentrate diet in a drylot, and harvested in each year in 3 groups at target mean BW of 54, 61, and 68 kg. Canonical discriminant analysis indicated that over 93% of variation among sire breeds was accounted for by the contrast between tall, long, less-thickly muscled breeds with greater BW and CCW (i.e., the Columbia and Suffolk) compared with shorter, more thickly muscled breeds with smaller BW and CCW. After correcting for effects of year, harvest group, sire breed, and shipping BW, linear measurements on live lambs contributed little to prediction of CCW. Similarly, after accounting for effects of CCW, linear measurements on live animals further reduced residual SD (RSD) of dependent variables by 0.2 to 5.7%, with generally positive effects of increasing live leg width and generally negative effects of increasing heart girth. Carcass measurements were somewhat more valuable as predictors of carcass merit. After fitting effects of CCW, additional consideration of carcass shape reduced RSD by 2.1, 3.6, 9.5, and 2.2% for HVW, TrHVW, CVal, and TrCVal, respectively. Effects of increasing carcass leg width were positive for HVW, Tr
Dose-Dependent Effects of Statins for Patients with Aneurysmal Subarachnoid Hemorrhage: Meta-Regression Analysis.

Science.gov (United States)

To, Minh-Son; Prakash, Shivesh; Poonnoose, Santosh I; Bihari, Shailesh

2018-05-01

The study uses meta-regression analysis to quantify the dose-dependent effects of statin pharmacotherapy on vasospasm, delayed ischemic neurologic deficits (DIND), and mortality in aneurysmal subarachnoid hemorrhage. Prospective, retrospective observational studies, and randomized controlled trials (RCTs) were retrieved by a systematic database search. Summary estimates were expressed as absolute risk (AR) for a given statin dose or control (placebo). Meta-regression using inverse variance weighting and robust variance estimation was performed to assess the effect of statin dose on transformed AR in a random effects model. Dose-dependence of predicted AR with 95% confidence interval (CI) was recovered by using Miller's Freeman-Tukey inverse. The database search and study selection criteria yielded 18 studies (2594 patients) for analysis. These included 12 RCTs, 4 retrospective observational studies, and 2 prospective observational studies. Twelve studies investigated simvastatin, whereas the remaining studies investigated atorvastatin, pravastatin, or pitavastatin, with simvastatin-equivalent doses ranging from 20 to 80 mg. Meta-regression revealed dose-dependent reductions in Freeman-Tukey-transformed AR of vasospasm (slope coefficient -0.00404, 95% CI -0.00720 to -0.00087; P = 0.0321), DIND (slope coefficient -0.00316, 95% CI -0.00586 to -0.00047; P = 0.0392), and mortality (slope coefficient -0.00345, 95% CI -0.00623 to -0.00067; P = 0.0352). The present meta-regression provides weak evidence for dose-dependent reductions in vasospasm, DIND and mortality associated with acute statin use after aneurysmal subarachnoid hemorrhage. However, the analysis was limited by substantial heterogeneity among individual studies. Greater dosing strategies are a potential consideration for future RCTs. Copyright © 2018 Elsevier Inc. All rights reserved.
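For readers unfamiliar with the transformation named above, here is a small sketch of the Freeman-Tukey double-arcsine transform of a proportion and Miller's back-transformation. It assumes a single sample size n; meta-analyses commonly plug in a summary sample size instead (for example a harmonic mean), and that choice is not specified in this record. The function names are illustrative.

```python
import math


def freeman_tukey(events, n):
    """Double-arcsine transform of `events` successes out of `n`."""
    return (math.asin(math.sqrt(events / (n + 1)))
            + math.asin(math.sqrt((events + 1) / (n + 1))))


def miller_inverse(t, n):
    """Miller's back-transformation from the double-arcsine scale to a proportion."""
    s = math.sin(t)
    inner = 1.0 - (s + (s - 1.0 / s) / n) ** 2
    sign = 1.0 if math.cos(t) >= 0 else -1.0
    # max(..., 0.0) guards against tiny negative values from rounding.
    return 0.5 * (1.0 - sign * math.sqrt(max(inner, 0.0)))


t = freeman_tukey(10, 50)          # 10 events in 50 patients (made-up numbers)
recovered = miller_inverse(t, 50)  # close to 10 / 50
```

The round trip is approximate rather than exact, which is why the recovered value is only compared with the raw proportion to within a tolerance.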
Machine-learning techniques for family demography: an application of random forests to the analysis of divorce determinants in Germany

OpenAIRE

Arpino, Bruno; Le Moglie, Marco; Mencarini, Letizia

2018-01-01

Demographers often analyze the determinants of life-course events with parametric regression-type approaches. Here, we present a class of nonparametric approaches, broadly defined as machine learning (ML) techniques, and discuss advantages and disadvantages of a popular type known as random forest. We argue that random forests can be useful either as a substitute, or a complement, to more standard parametric regression modeling. Our discussion of random forests is intuitive and...
Regression analysis by example

CERN Document Server

Chatterjee, Samprit

2012-01-01

Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
Applied logistic regression

CERN Document Server

Hosmer, David W; Sturdivant, Rodney X

2013-01-01

A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
Statistical learning from a regression perspective

CERN Document Server

Berk, Richard A

2016-01-01

This textbook considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. This fully revised new edition includes important developments over the past 8 years. Consistent with modern data analytics, it emphasizes that a proper statistical learning data analysis derives from sound data collection, intelligent data management, appropriate statistical procedures, and an accessible interpretation of results. A continued emphasis on the implications for practice runs through the text. Among the statistical learning procedures examined are bagging, random forests, boosting, support vector machines and neural networks. Response variables may be quantitative or categorical. As in the first edition, a unifying theme is supervised learning that can be trea...
Normalization Ridge Regression in Practice I: Comparisons Between Ordinary Least Squares, Ridge Regression and Normalization Ridge Regression.

Science.gov (United States)

Bulcock, J. W.

The problem of model estimation when the data are collinear was examined. Though the ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…
Random effect selection in generalised linear models

DEFF Research Database (Denmark)

Denwood, Matt; Houe, Hans; Forkman, Björn

We analysed abattoir recordings of meat inspection codes with possible relevance to onfarm animal welfare in cattle. Random effects logistic regression models were used to describe individual-level data obtained from 461,406 cattle slaughtered in Denmark. Our results demonstrate that the largest...
Exploring the Influence of Neighborhood Characteristics on Burglary Risks: A Bayesian Random Effects Modeling Approach

Directory of Open Access Journals (Sweden)

Hongqiang Liu

2016-06-01

A Bayesian random effects modeling approach was used to examine the influence of neighborhood characteristics on burglary risks in Jianghan District, Wuhan, China. This random effects model is essentially spatial; a spatially structured random effects term and an unstructured random effects term are added to the traditional non-spatial Poisson regression model. Based on social disorganization and routine activity theories, five covariates extracted from the available data at the neighborhood level were used in the modeling. Three regression models were fitted and compared by the deviance information criterion to identify which model best fit our data. A comparison of the results from the three models indicates that the Bayesian random effects model is superior to the non-spatial models in fitting the data and estimating regression coefficients. Our results also show that neighborhoods with above average bar density and department store density have higher burglary risks. Neighborhood-specific burglary risks and posterior probabilities of neighborhoods having a burglary risk greater than 1.0 were mapped, indicating the neighborhoods that should warrant more attention and be prioritized for crime intervention and reduction. Implications and limitations of the study are discussed in our concluding section.
Vector regression introduced

Directory of Open Access Journals (Sweden)

Mok Tik

2014-06-01

This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable) is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables) and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables) also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to represent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
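The idea of representing 2-D vectors as complex numbers and estimating a vector-valued coefficient by least squares can be sketched as follows. This is a simplified, assumed setting (one explanatory vector variable, no intercept, complex arithmetic used directly instead of the real-valued isomorphic form described above); the function name and data are hypothetical.

```python
def fit_vector_coefficient(x, y):
    """Least-squares estimate of the complex coefficient a in y ≈ a * x.

    Minimising sum |y_i - a * x_i|^2 gives
    a = sum(conj(x_i) * y_i) / sum(|x_i|^2).
    """
    numerator = sum(xi.conjugate() * yi for xi, yi in zip(x, y))
    denominator = sum(abs(xi) ** 2 for xi in x)
    return numerator / denominator


# Each complex number encodes a 2-D vector (east + 1j * north), made-up values.
x = [complex(1, 0), complex(0, 2), complex(-1, 1)]
a_true = complex(2, 1)             # rotation-and-scaling coefficient
y = [a_true * xi for xi in x]      # noise-free responses for the demonstration

a_hat = fit_vector_coefficient(x, y)
```

Because the coefficient is itself complex, it scales and rotates the explanatory vector, which is the sense in which the coefficient is "also a vector".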
Learning Random Numbers: A Matlab Anomaly

Czech Academy of Sciences Publication Activity Database

Savický, Petr; Robnik-Šikonja, M.

2008-01-01

Vol. 22, No. 3 (2008), pp. 254-265 ISSN 0883-9514 R&D Projects: GA AV ČR 1ET100300517 Institutional research plan: CEZ:AV0Z10300504 Keywords: random numbers * machine learning * classification * attribute evaluation * regression Subject RIV: BA - General Mathematics Impact factor: 0.795, year: 2008
Applied linear regression

CERN Document Server

Weisberg, Sanford

2013-01-01

Praise for the Third Edition "...this is an excellent book which could easily be used as a course text..." -International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
Logistic quantile regression provides improved estimates for bounded avian counts: A case study of California Spotted Owl fledgling production

Science.gov (United States)

Cade, Brian S.; Noon, Barry R.; Scherer, Rick D.; Keane, John J.

2017-01-01

Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical conditional distribution of a bounded discrete random variable. The logistic quantile regression model requires that counts are randomly jittered to a continuous random variable, logit transformed to bound them between specified lower and upper values, then estimated in conventional linear quantile regression, repeating the 3 steps and averaging estimates. Back-transformation to the original discrete scale relies on the fact that quantiles are equivariant to monotonic transformations. We demonstrate this statistical procedure by modeling 20 years of California Spotted Owl fledgling production (0−3 per territory) on the Lassen National Forest, California, USA, as related to climate, demographic, and landscape habitat characteristics at territories. Spotted Owl fledgling counts increased nonlinearly with decreasing precipitation in the early nesting period, in the winter prior to nesting, and in the prior growing season; with increasing minimum temperatures in the early nesting period; with adult compared to subadult parents; when there was no fledgling production in the prior year; and when percentage of the landscape surrounding nesting sites (202 ha) with trees ≥25 m height increased. Changes in production were primarily driven by changes in the proportion of territories with 2 or 3 fledglings. Average variances of the discrete cumulative distributions of the estimated fledgling counts indicated that temporal changes in climate and parent age class explained 18% of the annual variance in owl fledgling production, which was 34% of the total variance. Prior fledgling production explained as much of
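The jitter, logit-transform and back-transform steps described above can be sketched in a few lines. The bounds, the seed, the counts and the helper names below are assumptions for illustration only, and the linear quantile regression step itself is replaced by a plain sample median so that the example stays self-contained.

```python
import math
import random
import statistics


def jitter(count, rng):
    """Add Uniform[0, 1) noise so that the count becomes continuous."""
    return count + rng.random()


def logit_bounded(u, lower, upper):
    """Logit transform of a value strictly between lower and upper."""
    return math.log((u - lower) / (upper - u))


def inverse_logit_bounded(z, lower, upper):
    """Map a logit-scale value back into the interval (lower, upper)."""
    return (upper * math.exp(z) + lower) / (1.0 + math.exp(z))


rng = random.Random(0)
counts = [0, 1, 1, 2, 3, 0, 2]        # made-up fledgling counts (0 to 3)
lower, upper = -0.001, 4.001          # hypothetical bounds just outside [0, 4)

jittered = [jitter(c, rng) for c in counts]
z = [logit_bounded(u, lower, upper) for u in jittered]

# Quantiles are equivariant to monotone transformations, so the median
# computed on the logit scale maps back to the median on the original scale.
median_back = inverse_logit_bounded(statistics.median(z), lower, upper)
```

In the full procedure the jittering and estimation would be repeated and the estimates averaged, as the abstract describes.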
Multitrait, Random Regression, or Simple Repeatability Model in High-Throughput Phenotyping Data Improve Genomic Prediction for Wheat Grain Yield.

Science.gov (United States)

Sun, Jin; Rutkoski, Jessica E; Poland, Jesse A; Crossa, José; Jannink, Jean-Luc; Sorrells, Mark E

2017-07-01

High-throughput phenotyping (HTP) platforms can be used to measure traits that are genetically correlated with wheat (Triticum aestivum L.) grain yield across time. Incorporating such secondary traits in the multivariate pedigree and genomic prediction models would be desirable to improve indirect selection for grain yield. In this study, we evaluated three statistical models, simple repeatability (SR), multitrait (MT), and random regression (RR), for the longitudinal data of secondary traits and compared the impact of the proposed models for secondary traits on their predictive abilities for grain yield. Grain yield and secondary traits, canopy temperature (CT) and normalized difference vegetation index (NDVI), were collected in five diverse environments for 557 wheat lines with available pedigree and genomic information. A two-stage analysis was applied for pedigree and genomic selection (GS). First, secondary traits were fitted by SR, MT, or RR models, separately, within each environment. Then, best linear unbiased predictions (BLUPs) of secondary traits from the above models were used in the multivariate prediction models to compare predictive abilities for grain yield. Predictive ability was substantially improved by 70%, on average, from multivariate pedigree and genomic models when including secondary traits in both training and test populations. Additionally, (i) predictive abilities slightly varied for MT, RR, or SR models in this data set, (ii) results indicated that including BLUPs of secondary traits from the MT model was the best in severe drought, and (iii) the RR model was slightly better than SR and MT models under drought environment. Copyright © 2017 Crop Science Society of America.
Understanding Poisson regression.

Science.gov (United States)

Hayat, Matthew J; Higgins, Melinda

2014-04-01

Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.
Alternative Methods of Regression

CERN Document Server

Birkes, David

2011-01-01

Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts "...an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models...highly recommend[ed]...for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
Virologic response to tipranavir-ritonavir or darunavir-ritonavir based regimens in antiretroviral therapy experienced HIV-1 patients: a meta-analysis and meta-regression of randomized controlled clinical trials.

Directory of Open Access Journals (Sweden)

Asres Berhan

The development of tipranavir and darunavir, second generation non-peptidic HIV protease inhibitors, with marked improved resistance profiles, has opened a new perspective on the treatment of antiretroviral therapy (ART) experienced HIV patients with poor viral load control. The aim of this study was to determine the virologic response in ART experienced patients to tipranavir-ritonavir and darunavir-ritonavir based regimens. A computer based literature search was conducted in the databases of HINARI (Health InterNetwork Access to Research Initiative), Medline and Cochrane library. Meta-analysis was performed by including randomized controlled studies that were conducted in ART experienced patients with plasma viral load above 1,000 copies HIV RNA/ml. The odds ratios and 95% confidence intervals (CI) for viral loads of <50 copies and <400 copies HIV RNA/ml at the end of the intervention were determined by the random effects model. Meta-regression, sensitivity analysis and funnel plots were done. The number of HIV-1 patients who were on either a tipranavir-ritonavir or darunavir-ritonavir based regimen and achieved viral load less than 50 copies HIV RNA/ml was significantly higher (overall OR = 3.4; 95% CI, 2.61-4.52) than the number of HIV-1 patients who were on investigator selected boosted comparator HIV-1 protease inhibitors (CPIs-ritonavir). Similarly, the number of patients with viral load less than 400 copies HIV RNA/ml was significantly higher in either the tipranavir-ritonavir or darunavir-ritonavir based regimen treated group (overall OR = 3.0; 95% CI, 2.15-4.11). Meta-regression showed that the viral load reduction was independent of baseline viral load, baseline CD4 count and duration of tipranavir-ritonavir or darunavir-ritonavir based regimen. Tipranavir and darunavir based regimens were more effective in patients who were ART experienced and had poor viral load control. Further studies are required to determine their consistent
Introduction to regression graphics

CERN Document Server

Cook, R Dennis

2009-01-01

Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava
Auto Regressive Moving Average (ARMA) Modeling Method for Gyro Random Noise Using a Robust Kalman Filter

Science.gov (United States)

Huang, Lei

2015-01-01

To solve the problem in which the conventional ARMA modeling methods for gyro random noise require a large number of samples and converge slowly, an ARMA modeling method using a robust Kalman filtering is developed. The ARMA model parameters are employed as state arguments. Unknown time-varying estimators of observation noise are used to achieve the estimated mean and variance of the observation noise. Using the robust Kalman filtering, the ARMA model parameters are estimated accurately. The developed ARMA modeling method has the advantages of a rapid convergence and high accuracy. Thus, the required sample size is reduced. It can be applied to modeling applications for gyro random noise in which a fast and accurate ARMA modeling method is required. PMID:26437409

Randomized Block Cubic Newton Method

KAUST Repository

Doikov, Nikita; Richtarik, Peter

2018-01-01

We study the problem of minimizing the sum of three convex functions: a differentiable, a twice-differentiable and a non-smooth term in a high dimensional setting. To this effect we propose and analyze a randomized block cubic Newton (RBCN) method, which in each iteration builds a model of the objective function formed as the sum of the natural models of its three components: a linear model with a quadratic regularizer for the differentiable term, a quadratic model with a cubic regularizer for the twice differentiable term, and perfect (proximal) model for the nonsmooth term. Our method in each iteration minimizes the model over a random subset of blocks of the search variable. RBCN is the first algorithm with these properties, generalizing several existing methods, matching the best known bounds in all special cases. We establish ${\cal O}(1/\epsilon)$, ${\cal O}(1/\sqrt{\epsilon})$ and ${\cal O}(\log (1/\epsilon))$ rates under different assumptions on the component functions. Lastly, we show numerically that our method outperforms the state-of-the-art on a variety of machine learning problems, including cubically regularized least-squares, logistic regression with constraints, and Poisson regression.
Randomized Block Cubic Newton Method

KAUST Repository

Doikov, Nikita

2018-02-12

We study the problem of minimizing the sum of three convex functions: a differentiable, a twice-differentiable and a non-smooth term in a high dimensional setting. To this effect we propose and analyze a randomized block cubic Newton (RBCN) method, which in each iteration builds a model of the objective function formed as the sum of the natural models of its three components: a linear model with a quadratic regularizer for the differentiable term, a quadratic model with a cubic regularizer for the twice differentiable term, and perfect (proximal) model for the nonsmooth term. Our method in each iteration minimizes the model over a random subset of blocks of the search variable. RBCN is the first algorithm with these properties, generalizing several existing methods, matching the best known bounds in all special cases. We establish ${\cal O}(1/\epsilon)$, ${\cal O}(1/\sqrt{\epsilon})$ and ${\cal O}(\log (1/\epsilon))$ rates under different assumptions on the component functions. Lastly, we show numerically that our method outperforms the state-of-the-art on a variety of machine learning problems, including cubically regularized least-squares, logistic regression with constraints, and Poisson regression.
Random forests of interaction trees for estimating individualized treatment effects in randomized trials.

Science.gov (United States)

Su, Xiaogang; Peña, Annette T; Liu, Lei; Levine, Richard A

2018-04-29

Assessing heterogeneous treatment effects is a growing interest in advancing precision medicine. Individualized treatment effects (ITEs) play a critical role in such an endeavor. Concerning experimental data collected from randomized trials, we put forward a method, termed random forests of interaction trees (RFIT), for estimating ITE on the basis of interaction trees. To this end, we propose a smooth sigmoid surrogate method, as an alternative to greedy search, to speed up tree construction. The RFIT outperforms the "separate regression" approach in estimating ITE. Furthermore, standard errors for the estimated ITE via RFIT are obtained with the infinitesimal jackknife method. We assess and illustrate the use of RFIT via both simulation and the analysis of data from an acupuncture headache trial. Copyright © 2018 John Wiley & Sons, Ltd.
Identification of System Parameters by the Random Decrement Technique

DEFF Research Database (Denmark)

Brincker, Rune; Kirkegaard, Poul Henning; Rytter, Anders

1991-01-01

The aim of this paper is to investigate and illustrate the possibilities of using correlation functions estimated by the Random Decrement Technique as a basis for parameter identification. A two-stage system identification system is used: first, the correlation functions are estimated by the Random Decrement Technique, and then the system parameters are identified from the correlation function estimates. Three different techniques are used in the parameter identification process: a simple non-parametric method, estimation of an Auto Regressive (AR) model by solving an overdetermined set of Yule-Walker equations and finally, least-square fitting of the theoretical correlation function. The results are compared to the results of fitting an Auto Regressive Moving Average (ARMA) model directly to the system output from a single-degree-of-freedom system loaded by white noise.
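To make the Yule-Walker step concrete, the sketch below recovers the coefficients of an AR(2) model from its first two autocorrelations. This is the exactly determined textbook case, not the overdetermined system used in the paper, and the function name and numerical values are illustrative assumptions.

```python
def yule_walker_ar2(rho1, rho2):
    """Solve the 2x2 Yule-Walker system for x_t = phi1*x_{t-1} + phi2*x_{t-2} + e_t.

    The system is
        rho1 = phi1 + phi2 * rho1
        rho2 = phi1 * rho1 + phi2
    """
    det = 1.0 - rho1 ** 2
    phi1 = rho1 * (1.0 - rho2) / det
    phi2 = (rho2 - rho1 ** 2) / det
    return phi1, phi2


# Theoretical autocorrelations of an AR(2) process with phi1 = 0.5, phi2 = 0.3.
rho1 = 0.5 / (1.0 - 0.3)
rho2 = 0.5 * rho1 + 0.3

phi1_hat, phi2_hat = yule_walker_ar2(rho1, rho2)
```

With estimated (noisy) correlation functions one would typically stack more lags than unknowns and solve the resulting system in the least-squares sense.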
Degummed crude canola oil, sire breed and gender effects on intramuscular long-chain omega-3 fatty acid properties of raw and cooked lamb meat.

Science.gov (United States)

Flakemore, Aaron Ross; Malau-Aduli, Bunmi Sherifat; Nichols, Peter David; Malau-Aduli, Aduli Enoch Othniel

2017-01-01

Omega-3 long-chain (≥C20) polyunsaturated fatty acids (ω3 LC-PUFA) confer important attributes to health-conscious meat consumers due to the significant role they play in brain development, prevention of coronary heart disease, obesity and hypertension. In this study, the ω3 LC-PUFA content of raw and cooked Longissimus thoracis et lumborum (LTL) muscle from genetically divergent Australian prime lambs supplemented with dietary degummed crude canola oil (DCCO) was evaluated. Samples of LTL muscle were sourced from 24 first cross ewe and wether lambs sired by Dorset, White Suffolk and Merino rams joined to Merino dams that were assigned to supplemental regimes of degummed crude canola oil (DCCO): a control diet at 0 mL/kg DM of DCCO (DCCOC); 25 mL/kg DM of DCCO (DCCOM) and 50 mL/kg DCCO (DCCOH). Lambs were individually housed and offered 1 kg/day/head for 42 days before being slaughtered. Samples for cooked analysis were prepared to a core temperature of 70 °C using conductive dry-heat. Within raw meats: DCCOH supplemented lambs had significantly ( P culinary preparation method can be used as effective management tools to deliver nutritionally improved ω3 LC-PUFA lamb to meat consumers.
Improved predictive mapping of indoor radon concentrations using ensemble regression trees based on automatic clustering of geological units

International Nuclear Information System (INIS)

Kropat, Georg; Bochud, Francois; Jaboyedoff, Michel; Laedermann, Jean-Pascal; Murith, Christophe; Palacios, Martha; Baechler, Sébastien

2015-01-01

Purpose: According to estimations around 230 people die as a result of radon exposure in Switzerland. This public health concern makes reliable indoor radon prediction and mapping methods necessary in order to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics and to develop mapping and predictive tools in order to improve local radon prediction. Method: About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). Results: The automated classification groups lithological units well in terms of their IRC characteristics. Especially the IRC differences in metamorphic rocks like gneiss are well revealed by this method. The maps produced by random forests soundly represent the regional difference of IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variations in IRC data with random forests. Additionally, the influence of a variable evaluated by random forests shows that building characteristics are less important predictors for IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. Conclusion: Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of radon properties of rock types. This study provides an important element for radon risk communication. Future approaches should consider taking into account further variables
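The clustering step in this abstract rests on pairwise Kolmogorov distances between the IRC distributions of lithological units. A minimal sketch of that distance (the largest gap between two empirical cumulative distribution functions) is given below. The unit names and radon readings are invented for illustration, and the k-medoids step itself is not reproduced; the resulting distance table is only what such a clustering would take as input.

```python
from bisect import bisect_right


def kolmogorov_distance(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    # The maximum gap is always attained at one of the observed values.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


# Hypothetical radon readings (Bq/m^3) for three made-up lithological units.
units = {
    "unit_A": [40, 55, 60, 80, 120],
    "unit_B": [45, 50, 65, 85, 110],
    "unit_C": [150, 200, 260, 310, 400],
}

# Pairwise distance table, the input a k-medoids routine would need.
names = sorted(units)
distances = {
    (p, q): kolmogorov_distance(units[p], units[q])
    for p in names
    for q in names
}

for pair in [("unit_A", "unit_B"), ("unit_A", "unit_C")]:
    print(pair, round(distances[pair], 3))
```

Units A and B overlap heavily and end up close together, while unit C shares no range with A and sits at the maximum distance of 1.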
Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.

Science.gov (United States)

Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko

2016-03-01

In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.
Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis.

Science.gov (United States)

Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

2015-01-01

Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.
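The comparison indicators named in this abstract (sensitivity, specificity and the area under the ROC curve) can be computed from any model's predicted probabilities. The sketch below shows one plain way to do so; the labels and scores are toy values, not data from the study, and the 0.5 threshold is an assumption.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) when score >= threshold means positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 1 = outcome present, 0 = outcome absent.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

print(sensitivity_specificity(toy_labels, toy_scores))
print(roc_auc(toy_labels, toy_scores))
```

The pairwise definition of the AUC used here is equivalent to the area under the empirical ROC curve, which makes it convenient for comparing models without choosing a threshold.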
Association between response rates and survival outcomes in patients with newly diagnosed multiple myeloma. A systematic review and meta-regression analysis.

Science.gov (United States)

Mainou, Maria; Madenidou, Anastasia-Vasiliki; Liakos, Aris; Paschos, Paschalis; Karagiannis, Thomas; Bekiari, Eleni; Vlachaki, Efthymia; Wang, Zhen; Murad, Mohammad Hassan; Kumar, Shaji; Tsapas, Apostolos

2017-06-01

We performed a systematic review and meta-regression analysis of randomized control trials to investigate the association between response to initial treatment and survival outcomes in patients with newly diagnosed multiple myeloma (MM). Response outcomes included complete response (CR) and the combined outcome of CR or very good partial response (VGPR), while survival outcomes were overall survival (OS) and progression-free survival (PFS). We used random-effect meta-regression models and conducted sensitivity analyses based on definition of CR and study quality. Seventy-two trials were included in the systematic review, 63 of which contributed data in meta-regression analyses. There was no association between OS and CR in patients without autologous stem cell transplant (ASCT) (regression coefficient: .02, 95% confidence interval [CI] -0.06, 0.10), in patients undergoing ASCT (-.11, 95% CI -0.44, 0.22) and in trials comparing ASCT with non-ASCT patients (.04, 95% CI -0.29, 0.38). Similarly, OS did not correlate with the combined metric of CR or VGPR, and no association was evident between response outcomes and PFS. Sensitivity analyses yielded similar results. This meta-regression analysis suggests that there is no association between conventional response outcomes and survival in patients with newly diagnosed MM. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Does higher education protect against obesity? Evidence using Mendelian randomization.

Science.gov (United States)

Böckerman, Petri; Viinikainen, Jutta; Pulkki-Råback, Laura; Hakulinen, Christian; Pitkänen, Niina; Lehtimäki, Terho; Pehkonen, Jaakko; Raitakari, Olli T

2017-08-01

The aim of this explorative study was to examine the effect of education on obesity using Mendelian randomization. Participants (N=2011) were from the on-going nationally representative Young Finns Study (YFS) that began in 1980 when six cohorts (aged 30, 33, 36, 39, 42 and 45 in 2007) were recruited. The average value of BMI (kg/m²) measurements in 2007 and 2011 and genetic information were linked to comprehensive register-based information on the years of education in 2007. We first used a linear regression (Ordinary Least Squares, OLS) to estimate the relationship between education and BMI. To identify a causal relationship, we exploited Mendelian randomization and used a genetic score as an instrument for education. The genetic score was based on 74 genetic variants that genome-wide association studies (GWASs) have found to be associated with the years of education. Because the genotypes are randomly assigned at conception, the instrument causes exogenous variation in the years of education and thus enables identification of causal effects. The years of education in 2007 were associated with lower BMI in 2007/2011 (regression coefficient (b)=-0.22; 95% Confidence Intervals [CI]=-0.29, -0.14) according to the linear regression results. The results based on Mendelian randomization suggest that there may be a negative causal effect of education on BMI (b=-0.84; 95% CI=-1.77, 0.09). The findings indicate that education could be a protective factor against obesity in advanced countries. Copyright © 2017 Elsevier Inc. All rights reserved.
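The contrast between the OLS estimate and the instrument-based estimate in this abstract can be illustrated with the simplest instrumental-variable estimator, the ratio cov(z, y) / cov(z, x). The sketch below uses a deliberately tiny, made-up data set in which an unobserved confounder `u` distorts the OLS slope while the instrument `z` recovers the true effect. All variable names and numbers are assumptions for illustration; this is not the study's model or data.

```python
from statistics import mean


def covariance(u, v):
    """Sample covariance of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def ols_slope(x, y):
    """Slope of the ordinary least-squares regression of y on x."""
    return covariance(x, y) / covariance(x, x)


def iv_ratio(z, x, y):
    """Instrumental-variable (ratio) estimate of the effect of x on y."""
    return covariance(z, y) / covariance(z, x)


# Toy data: z is the instrument (for example a genetic score),
# u is an unobserved confounder constructed to be uncorrelated with z.
z = [-1, -1, 1, 1]
u = [-1, 1, -1, 1]
x = [2 * zi + ui for zi, ui in zip(z, u)]         # exposure
y = [-0.5 * xi + 3 * ui for xi, ui in zip(x, u)]  # outcome, true effect -0.5

print("OLS slope:", ols_slope(x, y))   # pulled away from -0.5 by the confounder
print("IV ratio :", iv_ratio(z, x, y))  # recovers the true effect
```

In this construction the confounder even flips the sign of the OLS slope, which shows why an instrument that is independent of the confounder is needed to identify the causal effect.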
Interação genótipo-ambiente para a produção de leite em rebanhos da raça holandesa no Brasil: (I) modelo de touro [Genotype-environment interaction on milk production in Holstein in Brazil: (I) sire model]

Directory of Open Access Journals (Sweden)

Paulo Roberto Nogara Rorato

1999-12-01

To evaluate the effect of genotype-environment interaction on the productive performance of Holstein cows in Brazil, total first-lactation milk production records of 14,418 cows, sired by 324 bulls and distributed across 181 herds in different states from 1981 to 1991, were studied. The data were stratified into three levels (low, B; medium, M; high, A) according to average herd milk production. (Co)variance components were estimated by restricted maximum likelihood (REML) using two sire models. Sire variance components ranged from 116,879 to 274,871 and were larger at the higher levels; residual components ranged from 1,691,879 to 1,956,025, increasing with herd production level; and interaction components ranged from 66,854 to 149,972, with the highest value occurring at the extreme production levels. Heritabilities ranged from 0.22 to 0.49, and the genetic correlations were 0.22, 0.46 and 0.69 between levels B and A, B and M, and M and A, respectively.
MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data.

Science.gov (United States)

Yavorska, Olena O; Burgess, Stephen

2017-12-01

MendelianRandomization is a software package for the R open-source software environment that performs Mendelian randomization analyses using summarized data. The core functionality is to implement the inverse-variance weighted, MR-Egger and weighted median methods for multiple genetic variants. Several options are available to the user, such as the use of robust regression, fixed- or random-effects models and the penalization of weights for genetic variants with heterogeneous causal estimates. Extensions to these methods, such as allowing for variants to be correlated, can be chosen if appropriate. Graphical commands allow summarized data to be displayed in an interactive graph, or the plotting of causal estimates from multiple methods, for comparison. Although the main method of data entry is directly by the user, there is also an option for allowing summarized data to be incorporated from the PhenoScanner database of genotype-phenotype associations. We hope to develop this feature in future versions of the package. The R software environment is available for download from [https://www.r-project.org/]. The MendelianRandomization package can be downloaded from the Comprehensive R Archive Network (CRAN) within R, or directly from [https://cran.r-project.org/web/packages/MendelianRandomization/]. Both R and the MendelianRandomization package are released under GNU General Public Licenses (GPL-2|GPL-3). © The Author 2017. Published by Oxford University Press on behalf of the International Epidemiological Association.
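The inverse-variance weighted method mentioned in this abstract can be written down in a few lines. The sketch below is a plain fixed-effect version in Python, given only to show the arithmetic; it is not the package's R interface, and the function name, argument names and summary statistics are hypothetical.

```python
from math import sqrt


def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect inverse-variance weighted causal estimate.

    beta_x: variant-exposure associations
    beta_y: variant-outcome associations
    se_y:   standard errors of beta_y
    Returns (estimate, standard_error).
    """
    numerator = sum(
        bx * by / (s * s) for bx, by, s in zip(beta_x, beta_y, se_y)
    )
    denominator = sum(bx * bx / (s * s) for bx, s in zip(beta_x, se_y))
    return numerator / denominator, sqrt(1.0 / denominator)


# Made-up summary statistics for three variants whose ratio
# estimates beta_y / beta_x all equal 0.5.
estimate, std_error = ivw_estimate(
    beta_x=[0.1, 0.2, 0.3],
    beta_y=[0.05, 0.10, 0.15],
    se_y=[0.01, 0.02, 0.01],
)
print(estimate, std_error)
```

This is the same quantity as the slope of a weighted regression of the outcome associations on the exposure associations through the origin, with weights 1 / se_y².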
Regression and regression analysis time series prediction modeling on climate data of Quetta, Pakistan

International Nuclear Information System (INIS)

Jafri, Y.Z.; Kamal, L.

2007-01-01

Various statistical techniques were used on five-year data (1998-2002) of average humidity, rainfall, and maximum and minimum temperatures. The relationships to regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters, on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit for our polynomial regression analysis time series (PRATS). The correlations to multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, and indicated homoscedasticity in both cases. We also employed Bartlett's test for homogeneity of variances on the five-year rainfall and humidity data, which showed that the variances in the rainfall data were not homogeneous, while those in the humidity data were homogeneous. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)
Regression trees for predicting mortality in patients with cardiovascular disease: What improvement is achieved by using ensemble-based methods?

Science.gov (United States)

Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V

2012-01-01

In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
Linear regression in astronomy. I

Science.gov (United States)

Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh

1990-01-01

Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
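Three of the symmetric or one-sided slopes discussed in this abstract have short closed forms in terms of the sums of squares. The sketch below computes the OLS(Y|X) slope, the OLS(X|Y) slope expressed in the same Y-versus-X coordinates, their angle bisector, and the reduced major-axis slope. The bisector expression follows directly from the half-angle identity for the two OLS lines; the data are invented, and the uncertainty formulas from the paper are not reproduced here.

```python
from math import copysign, sqrt
from statistics import mean


def regression_slopes(x, y):
    """Slopes of several line fits to bivariate data (requires Sxy != 0)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))

    b_yx = sxy / sxx  # OLS of Y on X
    b_xy = syy / sxy  # OLS of X on Y, expressed as dY/dX
    # Line bisecting the angle between the two OLS lines.
    bisector = (
        b_yx * b_xy - 1.0 + sqrt((1.0 + b_yx ** 2) * (1.0 + b_xy ** 2))
    ) / (b_yx + b_xy)
    # Reduced major axis: geometric mean of the two OLS slopes.
    rma = copysign(sqrt(b_yx * b_xy), sxy)
    return {"ols_yx": b_yx, "ols_xy": b_xy, "bisector": bisector, "rma": rma}


# Made-up noisy data scattered around y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
for name, slope in regression_slopes(xs, ys).items():
    print(f"{name:9s} {slope:.4f}")
```

Every fitted line passes through the point of means, so the matching intercept is the mean of y minus the slope times the mean of x. For positively correlated data the bisector and reduced major-axis slopes always fall between the two OLS slopes.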
Logic regression and its extensions.

Science.gov (United States)

Schwender, Holger; Ruczinski, Ingo

2010-01-01

Logic regression is an adaptive classification and regression procedure, initially developed to reveal interacting single nucleotide polymorphisms (SNPs) in genetic association studies. In general, this approach can be used in any setting with binary predictors, when the interaction of these covariates is of primary interest. Logic regression searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome variable, and thus, reveals variables and interactions that are associated with the response and/or have predictive capabilities. The logic expressions are embedded in a generalized linear regression framework, and thus, logic regression can handle a variety of outcome types, such as binary responses in case-control studies, numeric responses, and time-to-event data. In this chapter, we provide an introduction to the logic regression methodology, list some applications in public health and medicine, and summarize some of the direct extensions and modifications of logic regression that have been proposed in the literature. Copyright © 2010 Elsevier Inc. All rights reserved.
Retinal microaneurysm count predicts progression and regression of diabetic retinopathy. Post-hoc results from the DIRECT Programme.

Science.gov (United States)

Sjølie, A K; Klein, R; Porta, M; Orchard, T; Fuller, J; Parving, H H; Bilous, R; Aldington, S; Chaturvedi, N

2011-03-01

To study the association between baseline retinal microaneurysm score and progression and regression of diabetic retinopathy, and response to treatment with candesartan in people with diabetes. This was a multicenter randomized clinical trial. The progression analysis included 893 patients with Type 1 diabetes and 526 patients with Type 2 diabetes with retinal microaneurysms only at baseline. For regression, 438 with Type 1 and 216 with Type 2 diabetes qualified. Microaneurysms were scored from yearly retinal photographs according to the Early Treatment Diabetic Retinopathy Study (ETDRS) protocol. Retinopathy progression and regression was defined as two or more step change on the ETDRS scale from baseline. Patients were normoalbuminuric, and normotensive with Type 1 and Type 2 diabetes or treated hypertensive with Type 2 diabetes. They were randomized to treatment with candesartan 32 mg daily or placebo and followed for 4.6 years. A higher microaneurysm score at baseline predicted an increased risk of retinopathy progression (HR per microaneurysm score 1.08, P diabetes; HR 1.07, P = 0.0174 in Type 2 diabetes) and reduced the likelihood of regression (HR 0.79, P diabetes; HR 0.85, P = 0.0009 in Type 2 diabetes), all adjusted for baseline variables and treatment. Candesartan reduced the risk of microaneurysm score progression. Microaneurysm counts are important prognostic indicators for worsening of retinopathy, thus microaneurysms are not benign. Treatment with renin-angiotensin system inhibitors is effective in the early stages and may improve mild diabetic retinopathy. Microaneurysm scores may be useful surrogate endpoints in clinical trials. © 2011 The Authors. Diabetic Medicine © 2011 Diabetes UK.
Tumor regression patterns in retinoblastoma

International Nuclear Information System (INIS)

Zafar, S.N.; Siddique, S.N.; Zaheer, N.

2016-01-01

To observe the types of tumor regression after treatment, and identify the common pattern of regression in our patients. Study Design: Descriptive study. Place and Duration of Study: Department of Pediatric Ophthalmology and Strabismus, Al-Shifa Trust Eye Hospital, Rawalpindi, Pakistan, from October 2011 to October 2014. Methodology: Children with unilateral and bilateral retinoblastoma were included in the study. Patients were referred to Pakistan Institute of Medical Sciences, Islamabad, for chemotherapy. After every cycle of chemotherapy, dilated fundus examination under anesthesia was performed to record response of the treatment. Regression patterns were recorded on RetCam II. Results: Seventy-four tumors were included in the study. Out of 74 tumors, 3 were ICRB group A tumors, 43 were ICRB group B tumors, 14 tumors belonged to ICRB group C, and remaining 14 were ICRB group D tumors. Type IV regression was seen in 39.1% (n=29) tumors, type II in 29.7% (n=22), type III in 25.6% (n=19), and type I in 5.4% (n=4). All group A tumors (100%) showed type IV regression. Seventeen (39.5%) group B tumors showed type IV regression. In group C, 5 tumors (35.7%) showed type II regression and 5 tumors (35.7%) showed type IV regression. In group D, 6 tumors (42.9%) regressed to type II non-calcified remnants. Conclusion: The response and success of the focal and systemic treatment, as judged by the appearance of different patterns of tumor regression, varies with the ICRB grouping of the tumor. (author)
Identification of System Parameters by the Random Decrement Technique

DEFF Research Database (Denmark)

Brincker, Rune; Kirkegaard, Poul Henning; Rytter, Anders

The aim of this paper is to investigate and illustrate the possibilities of using correlation functions estimated by the Random Decrement Technique as a basis for parameter identification. A two-stage system identification method is used: first the correlation functions are estimated by the Random Decrement technique and then the system parameters are identified from the correlation function estimates. Three different techniques are used in the parameter identification process: a simple non-parametric method, estimation of an Auto Regressive (AR) model by solving an overdetermined set of Yule-Walker equations and finally least square fitting of the theoretical correlation function. The results are compared to the results of fitting an Auto Regressive Moving Average (ARMA) model directly to the system output. All investigations are performed on the simulated output from a single degree-of-freedom system......
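The Yule-Walker step named in this abstract can be sketched for the smallest useful case. The example below fits an AR(2) model by least squares to an overdetermined set of Yule-Walker equations, rho[k] = phi1 * rho[k-1] + phi2 * rho[k-2] for k = 1..m. The model order, the helper names and the use of exact theoretical autocorrelations in place of Random Decrement estimates are all assumptions made to keep the sketch self-contained.

```python
def ar2_autocorrelations(phi1, phi2, max_lag):
    """Theoretical autocorrelations rho[0..max_lag] of a stationary AR(2) process."""
    rho = [1.0, phi1 / (1.0 - phi2)]
    for _ in range(2, max_lag + 1):
        rho.append(phi1 * rho[-1] + phi2 * rho[-2])
    return rho


def yule_walker_ar2(rho):
    """Least-squares AR(2) coefficients from autocorrelations rho[0..m], m >= 2."""
    # One equation per lag k = 1..m; rho[-1] equals rho[1] by symmetry.
    rows = [(rho[k - 1], rho[abs(k - 2)], rho[k]) for k in range(1, len(rho))]
    # Normal equations of the overdetermined system, solved as a 2x2 system.
    s11 = sum(a * a for a, _, _ in rows)
    s12 = sum(a * b for a, b, _ in rows)
    s22 = sum(b * b for _, b, _ in rows)
    t1 = sum(a * c for a, _, c in rows)
    t2 = sum(b * c for _, b, c in rows)
    det = s11 * s22 - s12 * s12
    phi1 = (s22 * t1 - s12 * t2) / det
    phi2 = (s11 * t2 - s12 * t1) / det
    return phi1, phi2


# Six lags give six equations for two unknowns.
rho = ar2_autocorrelations(0.5, -0.3, max_lag=6)
print(yule_walker_ar2(rho))
```

With noise-free autocorrelations the least-squares solution reproduces the generating coefficients; with estimated correlation functions the extra equations average out part of the estimation error.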
Combining Alphas via Bounded Regression

Directory of Open Access Journals (Sweden)

Zura Kakushadze

2015-11-01

We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted) regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.

riskRegression

DEFF Research Database (Denmark)

Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas

2017-01-01

In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface for predicting the covariate specific absolute risks, their confidence intervals, and their confidence bands based on right censored time to event data. We provide explicit formulas for our implementation of the estimator of the (stratified) baseline hazard function in the presence of tied event times. As a by...... functionals. The software presented here is implemented in the riskRegression package....
Parameter estimation of multivariate multiple regression model using bayesian with non-informative Jeffreys’ prior distribution

Science.gov (United States)

Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.

2018-05-01

The Bayesian method can be used to estimate the parameters of a multivariate multiple regression model. It involves two distributions, the prior and the posterior, and the posterior distribution is influenced by the choice of prior. Jeffreys’ prior is a non-informative prior, used when no information about the parameters is available. The non-informative Jeffreys’ prior is combined with the sample information to yield the posterior distribution, which is then used to estimate the parameters. The purpose of this research is to estimate the parameters of a multivariate regression model using the Bayesian method with the non-informative Jeffreys’ prior. The estimates of β and Σ are obtained as the expected values of the random variables under their marginal posterior distributions. The marginal posterior distributions for β and Σ are multivariate normal and inverse Wishart. However, calculating these expected values involves integrals that are difficult to evaluate. An approximation is therefore needed, obtained by generating random samples from the posterior distribution of each parameter using the Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.
A computational approach to compare regression modelling strategies in prediction research.

Science.gov (United States)

Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H

2016-08-25

It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
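The Brier score used in this abstract to assess the final models is the mean squared difference between predicted probabilities and observed binary outcomes, with lower values being better. A minimal sketch follows; the two "strategies" and their predictions are invented and only show how two candidate models would be compared on the same outcomes.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum(
        (p - y) ** 2 for p, y in zip(probabilities, outcomes)
    ) / len(outcomes)


# Hypothetical validation outcomes and predictions from two modelling strategies.
observed = [1, 0, 1, 0, 1]
strategy_a = [0.9, 0.2, 0.6, 0.1, 0.7]  # e.g. a shrunken model (made up)
strategy_b = [0.6, 0.5, 0.5, 0.4, 0.5]  # e.g. an unadjusted model (made up)

print("A:", brier_score(strategy_a, observed))
print("B:", brier_score(strategy_b, observed))
```

A model that always predicts 0.5 scores exactly 0.25, which is a convenient reference point when reading such numbers.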
An Optimal Sample Data Usage Strategy to Minimize Overfitting and Underfitting Effects in Regression Tree Models Based on Remotely-Sensed Data

Directory of Open Access Journals (Sweden)

Yingxin Gu

2016-11-01

Regression tree models have been widely used for remote sensing-based ecosystem mapping. Improper use of the sample data (model training and testing data) may cause overfitting and underfitting effects in the model. The goal of this study is to develop an optimal sampling data usage strategy for any dataset and identify an appropriate number of rules in the regression tree model that will improve its accuracy and robustness. Landsat 8 data and Moderate-Resolution Imaging Spectroradiometer-scaled Normalized Difference Vegetation Index (NDVI) were used to develop regression tree models. A Python procedure was designed to generate random replications of model parameter options across a range of model development data sizes and rule number constraints. The mean absolute difference (MAD) between the predicted and actual NDVI (scaled NDVI, value from 0–200) and its variability across the different randomized replications were calculated to assess the accuracy and stability of the models. In our case study, a six-rule regression tree model developed from 80% of the sample data had the lowest MAD (MADtraining = 2.5 and MADtesting = 2.4), which was suggested as the optimal model. This study demonstrates how the training data and rule number selections impact model accuracy and provides important guidance for future remote-sensing-based ecosystem modeling.
Relationships between the structure of wheat gluten and ACE inhibitory activity of hydrolysate: stepwise multiple linear regression analysis.

Science.gov (United States)

Zhang, Yanyan; Ma, Haile; Wang, Bei; Qu, Wenjuan; Wali, Asif; Zhou, Cunshan

2016-08-01

Ultrasound pretreatment of wheat gluten (WG) before enzymolysis can improve the angiotensin converting enzyme (ACE) inhibitory activity of the hydrolysates by altering the structure of substrate proteins. Establishment of a relationship between the structure of WG and ACE inhibitory activity of the hydrolysates to judge the end point of the ultrasonic pretreatment is vital. The results of stepwise multiple linear regression (MLR) showed that the contents of free sulfhydryl, α-helix, disulfide bond, surface hydrophobicity and random coil were significantly correlated to ACE inhibitory activity of the hydrolysate, with standard partial regression coefficients of 3.729, -0.676, -0.252, 0.022 and 0.156, respectively. The R² of this model was 0.970. External validation showed that the stepwise MLR model could well predict the ACE inhibitory activity of hydrolysate based on the content of free sulfhydryl, α-helix, disulfide bond, surface hydrophobicity and random coil of WG before hydrolysis. A stepwise multiple linear regression model describing the quantitative relationships between the structure of WG and the ACE inhibitory activity of the hydrolysates was established. This model can be used to predict the endpoint of the ultrasonic pretreatment. © 2015 Society of Chemical Industry.
Regression in autistic spectrum disorders.

Science.gov (United States)

Stefanatos, Gerry A

2008-12-01

A significant proportion of children diagnosed with Autistic Spectrum Disorder experience a developmental regression characterized by a loss of previously-acquired skills. This may involve a loss of speech or social responsivity, but often entails both. This paper critically reviews the phenomenon of regression in autistic spectrum disorders, highlighting the characteristics of regression, age of onset, temporal course, and long-term outcome. Important considerations for diagnosis are discussed and multiple etiological factors currently hypothesized to underlie the phenomenon are reviewed. It is argued that regressive autistic spectrum disorders can be conceptualized on a spectrum with other regressive disorders that may share common pathophysiological features. The implications of this viewpoint are discussed.
Understanding logistic regression analysis

OpenAIRE

Sperandei, Sandro

2014-01-01

Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using ex...
Determining clinical benefits of drug-eluting coronary stents according to the population risk profile: a meta-regression from 31 randomized trials.

Science.gov (United States)

Moreno, Raul; Martin-Reyes, Roberto; Jimenez-Valero, Santiago; Sanchez-Recalde, Angel; Galeote, Guillermo; Calvo, Luis; Plaza, Ignacio; Lopez-Sendon, Jose-Luis

2011-04-01

The use of drug-eluting stents (DES) in unfavourable patients has been associated with higher rates of clinical complications and stent thrombosis, and because of that concerns about the use of DES in high-risk settings have been raised. This study sought to demonstrate that the clinical benefit of DES increases as the risk profile of the patients increases. A meta-regression analysis from 31 randomized trials that compared DES and bare-metal stents, including overall 12,035 patients, was performed. The relationship between the clinical benefit of using DES (number of patients to treat [NNT] to prevent one episode of target lesion revascularization [TLR]), and the risk profile of the population (rate of TLR in patients allocated to bare-metal stents) in each trial was evaluated. The clinical benefit of DES increased as the risk profile of each study population increased: NNT for TLR=31.1-1.2 (TLR for bare-metal stents); prisk profile of each study population, since the effect of DES in mortality, myocardial infarction, and stent thrombosis, was not adversely affected by the risk profile of each study population (95% confidence interval for β value 0.09 to 0.11, -0.12 to 0.19, and -0.03 to-0.15 for mortality, myocardial infarction, and stent thrombosis, respectively). The clinical benefit of DES increases as the risk profile of the patients increases, without affecting safety. Copyright © 2009 Elsevier Ireland Ltd. All rights reserved.
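The meta-regression line quoted in the abstract above can be evaluated directly. The sketch below is only an illustration: it assumes the partly garbled expression reads NNT = 31.1 - 1.2 x (TLR rate in the bare-metal arm) and that the TLR rate is expressed in percent. The function name is hypothetical and does not come from the paper.

```python
def nnt_for_tlr(bms_tlr_percent: float) -> float:
    """Number needed to treat under the assumed reading NNT = 31.1 - 1.2 * TLR.

    bms_tlr_percent is the TLR rate (assumed to be in percent) among patients
    allocated to bare-metal stents.
    """
    return 31.1 - 1.2 * bms_tlr_percent


# Higher baseline risk gives a smaller NNT, i.e. a larger clinical benefit.
for rate in (5.0, 10.0, 20.0):
    print(f"BMS TLR {rate:4.1f}% -> NNT {nnt_for_tlr(rate):.1f}")
```

Under this reading the NNT falls as the baseline TLR rate rises, which matches the abstract's conclusion that the benefit of DES grows with the risk profile.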
Genetic variance in micro-environmental sensitivity for milk and milk quality in Walloon Holstein cattle.

Science.gov (United States)

Vandenplas, J; Bastin, C; Gengler, N; Mulder, H A

2013-09-01

Animals that are robust to environmental changes are desirable in the current dairy industry. Genetic differences in micro-environmental sensitivity can be studied through heterogeneity of residual variance between animals. However, residual variance between animals is usually assumed to be homogeneous in traditional genetic evaluations. The aim of this study was to investigate genetic heterogeneity of residual variance by estimating variance components in residual variance for milk yield, somatic cell score, contents in milk (g/dL) of 2 groups of milk fatty acids (i.e., saturated and unsaturated fatty acids), and the content in milk of one individual fatty acid (i.e., oleic acid, C18:1 cis-9), for first-parity Holstein cows in the Walloon Region of Belgium. A total of 146,027 test-day records from 26,887 cows in 747 herds were available. All cows had at least 3 records and a known sire. These sires had at least 10 cows with records and each herd × test-day had at least 5 cows. The 5 traits were analyzed separately based on fixed lactation curve and random regression test-day models for the mean. Variance components were estimated by iteratively running an expectation maximization-REML algorithm through the implementation of double hierarchical generalized linear models. Based on fixed lactation curve test-day mean models, heritability for residual variances ranged between 1.01×10^-3 and 4.17×10^-3 for all traits. The genetic standard deviation in residual variance (i.e., approximately the genetic coefficient of variation of residual variance) ranged between 0.12 and 0.17. Therefore, some genetic variance in micro-environmental sensitivity existed in the Walloon Holstein dairy cattle for the 5 studied traits. The standard deviations due to herd × test-day and permanent environment in residual variance ranged between 0.36 and 0.45 for herd × test-day effect and between 0.55 and 0.97 for permanent environmental effect. Therefore, nongenetic effects also
Predicting 30-day Hospital Readmission with Publicly Available Administrative Database. A Conditional Logistic Regression Modeling Approach.

Science.gov (United States)

Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P

2015-01-01

This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. The objective was to explore the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model. Furthermore, the developed CLR models tend to achieve better sensitivity of
Linear regression in astronomy. II

Science.gov (United States)

Feigelson, Eric D.; Babu, Gutti J.

1992-01-01

A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
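Class (1) above, an unweighted regression line with bootstrap resampling, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the data are synthetic, the function names are hypothetical, and the bootstrap resamples (x, y) pairs with replacement to estimate the standard error of the slope.

```python
import random


def ols_line(x, y):
    """Unweighted least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def bootstrap_slope_se(x, y, n_boot=2000, seed=0):
    """Standard error of the OLS slope from resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if max(xs) == min(xs):  # degenerate resample: slope undefined, draw again
            continue
        ys = [y[i] for i in idx]
        slopes.append(ols_line(xs, ys)[0])
    mean = sum(slopes) / n_boot
    return (sum((s - mean) ** 2 for s in slopes) / (n_boot - 1)) ** 0.5


# Synthetic data (not from the paper): y = 2x + 1 plus Gaussian scatter.
rng = random.Random(42)
x = [i / 2 for i in range(30)]
y = [2.0 * xi + 1.0 + rng.gauss(0.0, 1.0) for xi in x]
slope, intercept = ols_line(x, y)
slope_se = bootstrap_slope_se(x, y)
print(f"slope = {slope:.3f} +/- {slope_se:.3f}, intercept = {intercept:.3f}")
```

Resampling pairs rather than residuals keeps the sketch free of assumptions about the error distribution; a jackknife estimate would follow the same pattern with leave-one-out subsets.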
Feed efficiency of tropically adapted cattle when fed in winter or spring in a temperate location.

Science.gov (United States)

Coleman, S W; Chase, C C; Phillips, W A; Riley, D G

2018-04-16

Earlier work has shown that young, tropically adapted cattle do not gain as rapidly as temperately adapted cattle during the winter in Oklahoma. The objective for this study was to determine if efficiency of gains was also impacted in tropically adapted cattle and if efficiency was consistent over different seasons. Over 3 yr, 240 straightbred and crossbred steers (F1 and three-way crosses) of Angus, Brahman or Romosinuano breeding, born in Brooksville, FL were transported to El Reno, OK in October and fed in two phases to determine performance, individual intake and efficiency. Phase 1 (WIN) began in November after a 28 d recovery from shipping stress and Phase 2 (SS) began in March, 28 d following completion of WIN each year. The diet for WIN was a grower diet (14% CP, 1.10 Mcal NEg/kg) and that for the SS was a feedlot diet (12.8% CP; 1.33 Mcal NEg/kg). After a 14 d adjustment to diet and facilities, intake trials were conducted over a period of 56 to 162 d for determination of intake and gain for efficiency. Body weights were recorded at approximately 14 d intervals, and initial BW, median BW, and ADG were determined from individual animal regressions of BW on days on feed (DOF). Individual daily DMI was then regressed by phase on median BW and ADG, and residuals of regression were recorded as residual feed intake (RFI). Similarly, daily gain was regressed by phase on median BW and DMI, and errors of regression were recorded as residual gain (RADG). Gain to feed (G:F) was also calculated. The statistical model to evaluate ADG, DMI, and efficiency included fixed effects of dam age (3 to 4, 5, 6 to 10, and > 10 yr), harvest group (3 per year), age on test, and a nested term DT(ST x XB) where DT = proportion tropical breeding of dam (0, 0.5, or 1), ST = proportion tropical breeding of sire (1 or 0), and XB = whether the calf was straightbred or crossbred. Year of record, sire(ST x XB) and pen were random effects. Pre-weaning ADG and BW increased (P efficiency
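The residual feed intake (RFI) calculation described in this abstract, regressing daily DMI on median BW and ADG and keeping the residuals, can be sketched as follows. The numbers are made up for illustration and are not data from the study. The helper names are hypothetical, and the fit is a single ordinary least-squares regression for one phase, solved through the normal equations.

```python
def solve(a, b):
    """Solve the square system a @ coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        coef[r] = (m[r][n] - sum(m[r][c] * coef[c] for c in range(r + 1, n))) / m[r][r]
    return coef


def residual_feed_intake(dmi, median_bw, adg):
    """Residuals from the least-squares regression of DMI on median BW and ADG."""
    rows = [[1.0, bw, g] for bw, g in zip(median_bw, adg)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * d for r, d in zip(rows, dmi)) for i in range(3)]
    b0, b1, b2 = solve(xtx, xty)
    return [d - (b0 + b1 * bw + b2 * g) for d, bw, g in zip(dmi, median_bw, adg)]


# Made-up values for six steers (DMI kg/d, median BW kg, ADG kg/d); not study data.
dmi = [8.1, 9.4, 7.6, 10.2, 8.8, 9.0]
median_bw = [310.0, 355.0, 298.0, 380.0, 330.0, 342.0]
adg = [1.10, 1.35, 0.95, 1.40, 1.25, 1.15]
rfi = residual_feed_intake(dmi, median_bw, adg)
for animal, value in enumerate(rfi, start=1):
    print(f"steer {animal}: RFI = {value:+.3f} kg/d")
```

A negative RFI marks an animal that ate less than predicted for its size and gain. Residual gain (RADG) follows the same pattern with daily gain as the response and median BW and DMI as predictors.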
OPTIMIZATION OF AGE AT FIRST CALVING IN KARAN FRIES CATTLE

Directory of Open Access Journals (Sweden)

P.K.Panja

2012-01-01

Data on 571 Karan Fries cows (a cross of Tharparkar and Sahiwal cows with American Holstein Friesian sires) at NDRI, Karnal were studied to determine the optimum age at first calving (AFC). Least squares analysis (Harvey, 1975) was used to assess the effect of sire, period and season of calving, and the data were corrected for significant effects of non-genetic factors. The genetic and phenotypic parameters were estimated for sires that had five or more progeny. The relationship of age at first calving with other traits was studied using regression analysis and the class interval method. The least squares means of age at first calving (AFC), first lactation 305 days or less milk yield (FL305Y), first lactation total milk yield (FLTMY), milk yield per day of first lactation length (MY/FLL) and milk yield per day of first calving interval (MY/FCI) were estimated as 940.98 ± 44.24 days, 3199 ± 44.24 kg, 3599.06 ± 54.96 kg, 10.50 ± 0.14 kg and 7.52 ± 0.26 kg, respectively. The heritability estimates of these traits were moderate. The AFC had a significant and positive phenotypic correlation with FL305Y, FLTMY, MY/FLL and MY/FCI. The genetic correlation of AFC with FLTMY was positive. The relationship between AFC and first lactation production traits could not be explained through regression analysis; therefore, the class interval method, with eight classes of AFC, was used to find the relationship. Based on higher milk production and the number of animals in the various classes, the optimum AFC was identified as 26-36 months. In determining the optimum range of AFC, emphasis should be placed on maximizing profit rather than on maximizing milk production.
A Matlab program for stepwise regression

Directory of Open Access Journals (Sweden)

Yanhong Qi

2016-03-01

Stepwise linear regression is a multivariable regression method for identifying statistically significant variables in the linear regression equation. In the present study, we present a Matlab program for stepwise regression.
Kepler AutoRegressive Planet Search (KARPS)

Science.gov (United States)

Caceres, Gabriel

2018-01-01

One of the main obstacles in detecting faint planetary transits is the intrinsic stellar variability of the host star. The Kepler AutoRegressive Planet Search (KARPS) project implements statistical methodology associated with autoregressive processes (in particular, ARIMA and ARFIMA) to model stellar lightcurves in order to improve exoplanet transit detection. We also develop a novel Transit Comb Filter (TCF) applied to the AR residuals which provides a periodogram analogous to the standard Box-fitting Least Squares (BLS) periodogram. We train a random forest classifier on known Kepler Objects of Interest (KOIs) using select features from different stages of this analysis, and then use ROC curves to define and calibrate the criteria to recover the KOI planet candidates with high fidelity. These statistical methods are detailed in a contributed poster (Feigelson et al., this meeting). These procedures are applied to the full DR25 dataset of NASA’s Kepler mission. Using the classification criteria, a vast majority of known KOIs are recovered and dozens of new KARPS Candidate Planets (KCPs) discovered, including ultra-short period exoplanets. The KCPs will be briefly presented and discussed.
Alpins and Thibos vectorial astigmatism analyses: proposal of a linear regression model between methods

Directory of Open Access Journals (Sweden)

Giuliano de Oliveira Freitas

2013-10-01

PURPOSE: To determine linear regression models between Alpins descriptive indices and Thibos astigmatic power vectors (APV), assessing the validity and strength of such correlations. METHODS: This case series prospectively assessed 62 eyes of 31 consecutive cataract patients with preoperative corneal astigmatism between 0.75 and 2.50 diopters in both eyes. Patients were randomly assigned to one of two phacoemulsification groups: one assigned to receive an AcrySof® Toric intraocular lens (IOL) in both eyes and another assigned to have an AcrySof Natural IOL associated with limbal relaxing incisions, also in both eyes. All patients were reevaluated postoperatively at 6 months, when refractive astigmatism analysis was performed using both Alpins and Thibos methods. The ratio between Thibos postoperative APV and preoperative APV (APVratio) and its linear regression to Alpins percentage of success of astigmatic surgery, percentage of astigmatism corrected and percentage of astigmatism reduction at the intended axis were assessed. RESULTS: A significant negative correlation between the Thibos APVratio and the Alpins percentage of success (%Success) was found (Spearman's ρ = -0.93); the linear regression is given by the following equation: %Success = (-APVratio + 1.00) × 100. CONCLUSION: The linear regression we found between APVratio and %Success permits a validated mathematical inference concerning the overall success of astigmatic surgery.
A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression.

Science.gov (United States)

Delwiche, Stephen R; Reeves, James B

2010-01-01

In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an overreliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various
Multiple Time Series Forecasting Using Quasi-Randomized Functional Link Neural Networks

Directory of Open Access Journals (Sweden)

Thierry Moudiki

2018-03-01

We are interested in obtaining forecasts for multiple time series, by taking into account the potential nonlinear relationships between their observations. For this purpose, we use a specific type of regression model on an augmented dataset of lagged time series. Our model is inspired by dynamic regression models (Pankratz 2012), with the response variable’s lags included as predictors, and is known as a Random Vector Functional Link (RVFL) neural network. RVFL neural networks have been successfully applied in the past to solving regression and classification problems. The novelty of our approach is to apply an RVFL model to multivariate time series, under two separate regularization constraints on the regression parameters.
Quantile regression theory and applications

CERN Document Server

Davino, Cristina; Vistocco, Domenico

2013-01-01

A guide to the implementation and interpretation of Quantile Regression models. This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensive description of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as model validity and diagnostic tools. Each methodological aspect is explored and
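As a small illustration of the estimation idea behind quantile regression, the sketch below minimizes the check (pinball) loss over a single constant, which recovers an empirical quantile of the sample. It is not taken from the book: the data and function names are hypothetical, and a full quantile regression would minimize the same loss over regression coefficients rather than over one constant.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r otherwise."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant(values, tau):
    """Data value with the smallest total pinball loss, i.e. an empirical tau-quantile."""
    return min(values, key=lambda q: sum(pinball_loss(v - q, tau) for v in values))


data = [1.0, 2.0, 3.0, 4.0, 100.0]  # illustrative sample with one outlier
median_fit = best_constant(data, 0.5)
upper_fit = best_constant(data, 0.9)
print(f"tau = 0.5 -> {median_fit}, tau = 0.9 -> {upper_fit}")
```

With tau = 0.5 the loss is proportional to the absolute error, so the fit is the median and is unaffected by the outlier; raising tau penalizes under-prediction more heavily and pushes the fit toward the upper tail.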
Fungible weights in logistic regression.

Science.gov (United States)

Jones, Jeff A; Waller, Niels G

2016-06-01

In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

Improvement of Storm Forecasts Using Gridded Bayesian Linear Regression for Northeast United States

Science.gov (United States)

Yang, J.; Astitha, M.; Schwartz, C. S.

2017-12-01

Bayesian linear regression (BLR) is a post-processing technique in which regression coefficients are derived and used to correct raw forecasts based on pairs of observation-model values. This study presents the development and application of a gridded Bayesian linear regression (GBLR) as a new post-processing technique to improve numerical weather prediction (NWP) of rain and wind storm forecasts over northeast United States. Ten controlled variables produced from ten ensemble members of the National Center for Atmospheric Research (NCAR) real-time prediction system are used for a GBLR model. In the GBLR framework, leave-one-storm-out cross-validation is utilized to study the performances of the post-processing technique in a database composed of 92 storms. To estimate the regression coefficients of the GBLR, optimization procedures that minimize the systematic and random error of predicted atmospheric variables (wind speed, precipitation, etc.) are implemented for the modeled-observed pairs of training storms. The regression coefficients calculated for meteorological stations of the National Weather Service are interpolated back to the model domain. An analysis of forecast improvements based on error reductions during the storms will demonstrate the value of GBLR approach. This presentation will also illustrate how the variances are optimized for the training partition in GBLR and discuss the verification strategy for grid points where no observations are available. The new post-processing technique is successful in improving wind speed and precipitation storm forecasts using past event-based data and has the potential to be implemented in real-time.
Landslide susceptibility mapping on a global scale using the method of logistic regression

Directory of Open Access Journals (Sweden)

L. Lin

2017-08-01

This paper proposes a statistical model for mapping global landslide susceptibility based on logistic regression. After investigating explanatory factors for landslides in the existing literature, five factors were selected to model landslide susceptibility: relative relief, extreme precipitation, lithology, ground motion and soil moisture. When building the model, 70 % of landslide and nonlandslide points were randomly selected for logistic regression, and the others were used for model validation. To evaluate the accuracy of predictive models, this paper adopts several criteria including a receiver operating characteristic (ROC) curve method. Logistic regression experiments found all five factors to be significant in explaining landslide occurrence on a global scale. During the modeling process, the percentage correct in the confusion matrix of landslide classification was approximately 80 % and the area under the curve (AUC) was nearly 0.87. During the validation process, the above statistics were about 81 % and 0.88, respectively. Such a result indicates that the model has strong robustness and stable performance. This model found that at a global scale, soil moisture can be dominant in the occurrence of landslides and the topographic factor may be secondary.
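The evaluation workflow described above (a random 70 %/30 % split and an ROC-based accuracy criterion) can be sketched as follows. The labels and scores are hypothetical, the function names are illustrative, and the AUC is computed from its rank interpretation rather than by whatever software the authors used.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle and split records into 70 % training and 30 % validation subsets."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical susceptibility scores for landslide (1) and non-landslide (0) points.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.7, 0.1]
auc = roc_auc(labels, scores)
train, valid = split_70_30(list(zip(labels, scores)))
print(f"AUC = {auc:.4f}; {len(train)} training and {len(valid)} validation points")
```

In practice the model would be fitted on the training subset only and the AUC reported separately for each subset, as the abstract does for its modeling and validation stages.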
Factors associated with 56-day non-return rate in dairy cattle

Directory of Open Access Journals (Sweden)

Ramiro Fouz

2011-06-01

The objective of this work was to identify factors associated with the 56-day non-return rate (56-NRR) in dairy herds in the Galician region, Spain, and to estimate it for individual Holstein bulls. The experiment was carried out in herds from North-West Spain, from September 2008 to August 2009. Data on the 76,440 first inseminations performed during this period were gathered. Candidate factors were tested for their association with the 56-NRR by using a logistic model (binomial). Afterwards, 37 sires with a minimum of 150 first inseminations performed were individually evaluated. Logistic models were also estimated for each bull, and predicted individual 56-NRR values were calculated as a solution for the model parameters. Logistic regression found four major factors associated with 56-NRR in lactating cows: age at insemination, days from calving to insemination, milk production level at the time of insemination, and herd size. First-service conception rate, when a particular sire was used, was higher for heifers (0.71) than for lactating cows (0.52). Non-return rates were highly variable among bulls. A significant part of the herd-level variation of 56-NRR of Holstein cattle seems attributable to the service sire. A high level of correlation between observed and predicted 56-NRR was found.
Principal component regression analysis with SPSS.

Science.gov (United States)

Liu, R X; Kuang, J; Gong, Q; Hou, X L

2003-06-01

The paper introduces all indices of multicollinearity diagnoses, the basic principle of principal component regression and the method for determining the 'best' equation. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. Carrying out principal component regression analysis with SPSS makes the statistical analysis simpler, faster and accurate.
Logistic regression models

CERN Document Server

Hilbe, Joseph M

2009-01-01

This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect: great clarity. The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … .-Annette J. Dobson, Biometric...
Logistic regression applied to natural hazards: rare event logistic regression with replications

Science.gov (United States)

Guns, M.; Vanacker, V.

2012-06-01

Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
Bridging Weighted Rules and Graph Random Walks for Statistical Relational Models

Directory of Open Access Journals (Sweden)

Seyed Mehran Kazemi

2018-02-01

The aim of statistical relational learning is to learn statistical models from relational or graph-structured data. Three main statistical relational learning paradigms include weighted rule learning, random walks on graphs, and tensor factorization. These paradigms have been mostly developed and studied in isolation for many years, with few works attempting to understand the relationship among them or to combine them. In this article, we study the relationship between the path ranking algorithm (PRA), one of the most well-known relational learning methods in the graph random walk paradigm, and relational logistic regression (RLR), one of the recent developments in weighted rule learning. We provide a simple way to normalize relations and prove that relational logistic regression using normalized relations generalizes the path ranking algorithm. This result provides a better understanding of relational learning, especially for the weighted rule learning and graph random walk paradigms. It opens up the possibility of using the more flexible RLR rules within PRA models and even generalizing both by including normalized and unnormalized relations in the same model.
REGRES: A FORTRAN-77 program to calculate nonparametric and ``structural'' parametric solutions to bivariate regression equations

Science.gov (United States)

Rock, N. M. S.; Duffy, T. R.

REGRES allows a range of regression equations to be calculated for paired sets of data values in which both variables are subject to error (i.e. neither is the "independent" variable). Nonparametric regressions, based on medians of all possible pairwise slopes and intercepts, are treated in detail. Estimated slopes and intercepts are output, along with confidence limits and Spearman and Kendall rank correlation coefficients. Outliers can be rejected with user-determined stringency. Parametric regressions can be calculated for any value of λ (the ratio of the variances of the random errors for y and x), including: (1) major axis (λ = 1); (2) reduced major axis (λ = variance of y/variance of x); (3) Y on X (λ = infinity); or (4) X on Y (λ = 0) solutions. Pearson linear correlation coefficients also are output. REGRES provides an alternative to conventional isochron assessment techniques where bivariate normal errors cannot be assumed, or weighting methods are inappropriate.
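The nonparametric option described above, a line built from the medians of all pairwise slopes and intercepts, can be sketched in Python rather than FORTRAN-77. This is an illustrative reading of that one sentence and not a port of REGRES: the function name and data are hypothetical, and confidence limits and outlier rejection are omitted.

```python
from itertools import combinations
from statistics import median


def pairwise_median_line(x, y):
    """Median of all pairwise slopes and median of the matching pairwise intercepts."""
    slopes, intercepts = [], []
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if xi == xj:  # vertical pair: slope undefined, skip it
            continue
        s = (yj - yi) / (xj - xi)
        slopes.append(s)
        intercepts.append(yi - s * xi)
    return median(slopes), median(intercepts)


# Illustrative data lying near y = 2x, with the last point a gross outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 30.0]
slope, intercept = pairwise_median_line(x, y)
print(f"median slope = {slope:.3f}, median intercept = {intercept:.3f}")
```

Because medians ignore the extreme pairwise slopes produced by the outlier, the fitted slope stays near 2, whereas an ordinary least-squares fit to the same five points gives a slope of about 6.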
Understanding logistic regression analysis.

Science.gov (United States)

Sperandei, Sandro

2014-01-01

Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
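The link between a logistic regression coefficient and an odds ratio can be shown with a 2x2 table. The counts below are hypothetical and the function name is illustrative. For a model with a single binary explanatory variable, the fitted coefficient equals the natural logarithm of the sample odds ratio, so exponentiating the coefficient recovers it.

```python
import math


def odds_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Odds ratio from a 2x2 table of event counts."""
    odds_exposed = exposed_events / (exposed_total - exposed_events)
    odds_unexposed = unexposed_events / (unexposed_total - unexposed_events)
    return odds_exposed / odds_unexposed


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed subjects had the event.
or_value = odds_ratio(30, 100, 10, 100)
# Coefficient a logistic model with this one binary predictor would estimate.
beta = math.log(or_value)
print(f"odds ratio = {or_value:.3f}, log-odds coefficient = {beta:.3f}")
```

With several explanatory variables the same exponentiation applies to each coefficient, but each odds ratio is then adjusted for the other variables in the model, which is how the method deals with confounding.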
[Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].

Science.gov (United States)

Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao

2016-03-01

Leaf area index (LAI) is a dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively, and it can provide a reference for monitoring tree growth and estimating yield. Red Fuji apple trees in the full fruit-bearing period were the research objects. Canopy spectral reflectance and LAI values of ninety apple trees were measured with the ASD Fieldspec3 spectrometer and the LAI-2200 in thirty orchards over two consecutive years in the Qixia research area of Shandong Province. The optimal vegetation indices were selected by correlation analysis of the original spectral reflectance and vegetation indices. Models for predicting LAI were built with the multivariate regression methods of support vector machine (SVM) and random forest (RF). The new vegetation indices GNDVI527, NDVI676, RVI682, FD-NVI656 and GRVI517 and the two main existing vegetation indices, NDVI670 and NDVI705, were in accordance with LAI. In the RF regression model, the calibration set coefficient of determination (C-R2) of 0.920 and the validation set coefficient of determination (V-R2) of 0.889 were higher than those of the SVM regression model by 0.045 and 0.033, respectively. The root mean square error of the calibration set (C-RMSE) of 0.249 and of the validation set (V-RMSE) of 0.236 were lower than those of the SVM regression model by 0.054 and 0.058, respectively. The relative analysis errors of the calibration set (C-RPD) and validation set (V-RPD) reached 3.363 and 2.520, which were higher than those of the SVM regression model by 0.598 and 0.262, respectively. The slopes of the measured-versus-predicted scatterplot trend lines for the calibration and validation sets (C-S and V-S) were close to 1. The RF regression model gave better estimates than the SVM model and can be used to estimate the LAI of Red Fuji apple trees in the full fruit period.
Household water treatment in developing countries: comparing different intervention types using meta-regression.

Science.gov (United States)

Hunter, Paul R

2009-12-01

Household water treatment (HWT) is being widely promoted as an appropriate intervention for reducing the burden of waterborne disease in poor communities in developing countries. A recent study has raised concerns about the effectiveness of HWT, in part because of concerns over the lack of blinding and in part because of considerable heterogeneity in the reported effectiveness of randomized controlled trials. This study set out to attempt to investigate the causes of this heterogeneity and so identify factors associated with good health gains. Studies identified in an earlier systematic review and meta-analysis were supplemented with more recently published randomized controlled trials. A total of 28 separate studies of randomized controlled trials of HWT with 39 intervention arms were included in the analysis. Heterogeneity was studied using the "metareg" command in Stata. Initial analyses with single candidate predictors were undertaken and all variables significant at the P Risk and the parameter estimates from the final regression model. The overall effect size of all unblinded studies was relative risk = 0.56 (95% confidence intervals 0.51-0.63), but after adjusting for bias due to lack of blinding the effect size was much lower (RR = 0.85, 95% CI = 0.76-0.97). Four main variables were significant predictors of effectiveness of intervention in a multipredictor meta regression model: Log duration of study follow-up (regression coefficient of log effect size = 0.186, standard error (SE) = 0.072), whether or not the study was blinded (coefficient 0.251, SE 0.066) and being conducted in an emergency setting (coefficient -0.351, SE 0.076) were all significant predictors of effect size in the final model. Compared to the ceramic filter all other interventions were much less effective (Biosand 0.247, 0.073; chlorine and safe waste storage 0.295, 0.061; combined coagulant-chlorine 0.2349, 0.067; SODIS 0.302, 0.068). A Monte Carlo model predicted that over 12 months
Noise reduction by support vector regression with a Ricker wavelet kernel

International Nuclear Information System (INIS)

Deng, Xiaoying; Yang, Dinghui; Xie, Jing

2009-01-01

We propose a noise filtering technology based on the least-squares support vector regression (LS-SVR), to improve the signal-to-noise ratio (SNR) of seismic data. We modified it by using an admissible support vector (SV) kernel, namely the Ricker wavelet kernel, to replace the conventional radial basis function (RBF) kernel in seismic data processing. We investigated the selection of the regularization parameter for the LS-SVR and derived a concise selecting formula directly from the noisy data. We used the proposed method for choosing the regularization parameter which not only had the advantage of high speed but could also obtain almost the same effectiveness as an optimal parameter method. We conducted experiments using synthetic data corrupted by the random noise of different types and levels, and found that our method was superior to the wavelet transform-based approach and the Wiener filtering. We also applied the method to two field seismic data sets and concluded that it was able to effectively suppress the random noise and improve the data quality in terms of SNR
Noise reduction by support vector regression with a Ricker wavelet kernel

Science.gov (United States)

Deng, Xiaoying; Yang, Dinghui; Xie, Jing

2009-06-01

We propose a noise filtering technology based on the least-squares support vector regression (LS-SVR), to improve the signal-to-noise ratio (SNR) of seismic data. We modified it by using an admissible support vector (SV) kernel, namely the Ricker wavelet kernel, to replace the conventional radial basis function (RBF) kernel in seismic data processing. We investigated the selection of the regularization parameter for the LS-SVR and derived a concise selecting formula directly from the noisy data. We used the proposed method for choosing the regularization parameter which not only had the advantage of high speed but could also obtain almost the same effectiveness as an optimal parameter method. We conducted experiments using synthetic data corrupted by the random noise of different types and levels, and found that our method was superior to the wavelet transform-based approach and the Wiener filtering. We also applied the method to two field seismic data sets and concluded that it was able to effectively suppress the random noise and improve the data quality in terms of SNR.
Minimax Regression Quantiles

DEFF Research Database (Denmark)

Bache, Stefan Holst

A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is, however, a different and therefore new estimator. It allows for both linear and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question....
Regression with Sparse Approximations of Data

DEFF Research Database (Denmark)

Noorzad, Pardis; Sturm, Bob L.

2012-01-01

We propose sparse approximation weighted regression (SPARROW), a method for local estimation of the regression function that uses sparse approximation with a dictionary of measurements. SPARROW estimates the regression function at a point with a linear combination of a few regressands selected by a sparse approximation of the point in terms of the regressors. We show SPARROW can be considered a variant of k-nearest neighbors regression (k-NNR), and more generally, local polynomial kernel regression. Unlike k-NNR, however, SPARROW can adapt the number of regressors to use based...
Logistic regression applied to natural hazards: rare event logistic regression with replications

Directory of Open Access Journals (Sweden)

M. Guns

2012-06-01

Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
A simple approach to power and sample size calculations in logistic regression and Cox regression models.

Science.gov (United States)

Vaeth, Michael; Skovlund, Eva

2004-06-15

For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
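The equivalent two-sample idea described above can be sketched for logistic regression as follows. This is one reading of the abstract, not the authors' implementation: it assumes the two groups' log-odds differ by twice the slope times the covariate standard deviation, that the mean of the two response probabilities equals the overall response probability, and that the usual two-proportion sample size formula is then applied. The function names and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def equivalent_two_sample(beta, sd_x, p_bar):
    """Find (p1, p2) with mean p_bar whose log-odds differ by 2*|beta|*sd_x."""
    target = 2.0 * abs(beta) * sd_x
    lo = max(0.0, 2.0 * p_bar - 1.0)  # keeps p2 = 2*p_bar - p1 below 1
    hi = p_bar
    for _ in range(100):  # bisection on p1
        p1 = 0.5 * (lo + hi)
        p2 = 2.0 * p_bar - p1
        if logit(p2) - logit(p1) > target:
            lo = p1  # gap too wide: move p1 up toward p_bar
        else:
            hi = p1
    p1 = 0.5 * (lo + hi)
    return p1, 2.0 * p_bar - p1


def total_sample_size(beta, sd_x, p_bar, alpha=0.05, power=0.8):
    """Total N from the standard two-proportion formula, equal group sizes."""
    p1, p2 = equivalent_two_sample(beta, sd_x, p_bar)
    z_a = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_b = NormalDist().inv_cdf(power)
    num = (z_a * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
           + z_b * math.sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2)))
    n_per_group = math.ceil((num / (p2 - p1)) ** 2)
    return 2 * n_per_group


# Hypothetical inputs: log-odds slope 0.5 per unit, covariate SD 1.0,
# overall response probability 0.3.
print(total_sample_size(0.5, 1.0, 0.3))
```

The sketch does not guard against a zero slope, where the two groups coincide and no finite sample size exists.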
pKa prediction for acidic phosphorus-containing compounds using multiple linear regression with computational descriptors.

Science.gov (United States)

Yu, Donghai; Du, Ruobing; Xiao, Ji-Chang

2016-07-05

Ninety-six acidic phosphorus-containing molecules with pKa 1.88 to 6.26 were collected and divided into training and test sets by random sampling. Structural parameters were obtained by density functional theory calculation of the molecules. The relationship between the experimental pKa values and structural parameters was obtained by multiple linear regression fitting for the training set, and tested with the test set; the R² values were 0.974 and 0.966 for the training and test sets, respectively. This regression equation quantitatively describes the influence of structural parameters on pKa and can be used to predict pKa values of similar structures; it is significant for the design of new acidic phosphorus-containing extractants. © 2016 Wiley Periodicals, Inc.
A method for fitting regression splines with varying polynomial order in the linear mixed model.

Science.gov (United States)

Edwards, Lloyd J; Stewart, Paul W; MacDougall, James E; Helms, Ronald W

2006-02-15

The linear mixed model has become a widely used tool for longitudinal analysis of continuous variables. The use of regression splines in these models offers the analyst additional flexibility in the formulation of descriptive analyses, exploratory analyses and hypothesis-driven confirmatory analyses. We propose a method for fitting piecewise polynomial regression splines with varying polynomial order in the fixed effects and/or random effects of the linear mixed model. The polynomial segments are explicitly constrained by side conditions for continuity and some smoothness at the points where they join. By using a reparameterization of this explicitly constrained linear mixed model, an implicitly constrained linear mixed model is constructed that simplifies implementation of fixed-knot regression splines. The proposed approach is relatively simple, handles splines in one variable or multiple variables, and can be easily programmed using existing commercial software such as SAS or S-plus. The method is illustrated using two examples: an analysis of longitudinal viral load data from a study of subjects with acute HIV-1 infection and an analysis of 24-hour ambulatory blood pressure profiles.
Detecting spatio-temporal changes in agricultural land use in Heilongjiang province, China using MODIS time-series data and a random forest regression model

Science.gov (United States)

Hu, Q.; Friedl, M. A.; Wu, W.

2017-12-01

Accurate and timely information regarding the spatial distribution of crop types and their changes is essential for acreage surveys, yield estimation, water management, and agricultural production decision-making. In recent years, increasing population, dietary shifts and climate change have driven drastic changes in China's agricultural land use. However, no maps are currently available that document the spatial and temporal patterns of these agricultural land use changes. Because of its short revisit period, rich spectral bands and global coverage, MODIS time series data has been shown to have great potential for detecting the seasonal dynamics of different crop types. However, its inherently coarse spatial resolution limits the accuracy with which crops can be identified from MODIS in regions with small fields or complex agricultural landscapes. To evaluate this more carefully and specifically understand the strengths and weaknesses of MODIS data for crop-type mapping, we used MODIS time-series imagery to map the sub-pixel fractional crop area for four major crop types (rice, corn, soybean and wheat) at 500-m spatial resolution for Heilongjiang province, one of the most important grain-production regions in China where recent agricultural land use change has been rapid and pronounced. To do this, a random forest regression (RF-g) model was constructed to estimate the percentage of each sub-pixel crop type in 2006, 2011 and 2016. Crop type maps generated through expert visual interpretation of high spatial resolution images (i.e., Landsat and SPOT data) were used to calibrate the regression model. Five different time series of vegetation indices (155 features) derived from different spectral channels of MODIS land surface reflectance (MOD09A1) data were used as candidate features for the RF-g model. An out-of-bag strategy and backward elimination approach was applied to select the optimal spectra-temporal feature subset for each crop type. The resulting crop maps

Post-processing through linear regression

Science.gov (United States)

van Schaeybroeck, B.; Vannitsem, S.

2011-03-01

Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the Lorenz 63 system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread should be preferred.
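The "optimal variability" criterion mentioned above can be illustrated for a single predictor. The sketch below compares the OLS slope with the classical geometric-mean regression slope (the ratio of standard deviations, signed by the correlation); it is not the multi-predictor GM scheme or any other method from the paper, and the function name and data are hypothetical.

```python
import math


def ols_and_gm_fit(forecast, observed):
    """Single-predictor OLS and classical geometric-mean regression.

    Returns a dict with (slope, intercept) pairs and the correlation r.
    """
    n = len(forecast)
    mx = sum(forecast) / n
    my = sum(observed) / n
    sxx = sum((x - mx) ** 2 for x in forecast)
    syy = sum((y - my) ** 2 for y in observed)
    sxy = sum((x - mx) * (y - my) for x, y in zip(forecast, observed))
    r = sxy / math.sqrt(sxx * syy)
    ols_slope = sxy / sxx                              # equals r * sd_y / sd_x
    gm_slope = math.copysign(math.sqrt(syy / sxx), r)  # sign(r) * sd_y / sd_x
    return {
        "ols": (ols_slope, my - ols_slope * mx),
        "gm": (gm_slope, my - gm_slope * mx),
        "r": r,
    }


# Hypothetical forecast/observation pairs, for illustration only.
forecast = [1.0, 2.0, 3.0, 4.0, 5.0]
observed = [1.2, 1.9, 3.4, 3.6, 5.1]
fit = ols_and_gm_fit(forecast, observed)
print(fit)
```

Since the OLS slope is the GM slope multiplied by r, an OLS-corrected forecast has variance r² times that of the observations, while the GM-corrected forecast reproduces the observed variance exactly.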
Semiparametric regression during 2003–2007

KAUST Repository

Ruppert, David; Wand, M.P.; Carroll, Raymond J.

2009-01-01

Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.
Unbalanced Regressions and the Predictive Equation

DEFF Research Database (Denmark)

Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo

Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoreti...
Comparison of multinomial logistic regression and logistic regression: which is more efficient in allocating land use?

Science.gov (United States)

Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun

2014-12-01

Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.
Interpretation of commonly used statistical regression models.

Science.gov (United States)

Kasza, Jessica; Wolfe, Rory

2014-01-01

A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Linear regression

CERN Document Server

Olive, David J

2017-01-01

This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Regression modeling of ground-water flow

Science.gov (United States)

Cooley, R.L.; Naff, R.L.

1985-01-01

Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Regression relation for pure quantum states and its implications for efficient computing.

Science.gov (United States)

Elsayed, Tarek A; Fine, Boris V

2013-02-15

We obtain a modified version of the Onsager regression relation for the expectation values of quantum-mechanical operators in pure quantum states of isolated many-body quantum systems. We use the insights gained from this relation to show that high-temperature time correlation functions in many-body quantum systems can be controllably computed without complete diagonalization of the Hamiltonians, using instead the direct integration of the Schrödinger equation for randomly sampled pure states. This method is also applicable to quantum quenches and other situations describable by time-dependent many-body Hamiltonians. The method implies exponential reduction of the computer memory requirement in comparison with the complete diagonalization. We illustrate the method by numerically computing infinite-temperature correlation functions for translationally invariant Heisenberg chains of up to 29 spins 1/2. Thereby, we also test the spin diffusion hypothesis and find it in a satisfactory agreement with the numerical results. Both the derivation of the modified regression relation and the justification of the computational method are based on the notion of quantum typicality.
Post-processing through linear regression

Directory of Open Access Journals (Sweden)

B. Van Schaeybroeck

2011-03-01

Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified.

These techniques are applied in the context of the Lorenz 63 system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread should be preferred.
Logistic regression applied to natural hazards: rare event logistic regression with replications

OpenAIRE

Guns, M.; Vanacker, Veerle

2012-01-01

Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logisti...
Selection intensity for milk yield in 1970—1977 in the Finnish Ayrshire

Directory of Open Access Journals (Sweden)

U. B. Lindström

1978-12-01

Selection differentials for sires and dams of bulls taken into AI use in 1970—1977, as well as for sires used in AI, were combined with an estimate of the quality of dams of female replacements to calculate the (predicted) genetic change in milk yield in the Ayrshire breed. Over the period the average annual genetic gain was 0.97 % of the mean yield; in the last three years it was c. 1.1 %. The average generation interval was 6.8 years: 8.7 years for the bull sires, 7.4 years for the bull dams and 6.4 years for the cow sires. The bull sires accounted for 42 %, the bull dams for 37 % and the cow sires for only 12 % of the total genetic gain. A more rational use of progeny tested and young bulls, combined with a reduction of the generation interval of 15 %, could easily have increased the genetic progress by 20 %.
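The arithmetic behind combining selection paths can be sketched with the standard four-path formula, in which annual genetic gain is the sum of the selection differentials divided by the sum of the generation intervals. The abstract does not state that exactly this formula was used, so this is an assumption. The selection differentials below are hypothetical, and the cow-dam interval of 4.7 years is back-calculated on the further assumption that the reported 6.8-year average is a simple mean over four paths.

```python
def annual_genetic_gain(differentials, intervals):
    """Four-path formula: sum of selection differentials / sum of intervals."""
    return sum(differentials) / sum(intervals)


# Generation intervals in years: bull sires, bull dams, cow sires, cow dams.
# The first three are from the abstract; 4.7 is an assumed value (see above).
intervals = [8.7, 7.4, 6.4, 4.7]

# Hypothetical selection differentials, in % of mean yield per path.
differentials = [10.0, 8.0, 3.0, 2.0]

base = annual_genetic_gain(differentials, intervals)
shorter = annual_genetic_gain(differentials, [0.85 * L for L in intervals])

print(f"base gain:    {base:.3f} % per year")
print(f"15 % shorter: {shorter:.3f} % per year (x{shorter / base:.3f})")
```

Under this formula a 15 % shorter generation interval alone scales the annual gain by 1/0.85, about 1.18, so the remainder of the quoted 20 % would have to come from the more rational use of bulls.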
A Seemingly Unrelated Poisson Regression Model

OpenAIRE

King, Gary

1989-01-01

This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.
Modeling Information Content Via Dirichlet-Multinomial Regression Analysis.

Science.gov (United States)

Ferrari, Alberto

2017-01-01

Shannon entropy is being increasingly used in biomedical research as an index of complexity and information content in sequences of symbols, e.g. languages, amino acid sequences, DNA methylation patterns and animal vocalizations. Yet, distributional properties of information entropy as a random variable have seldom been the object of study, leading to researchers mainly using linear models or simulation-based analytical approaches to assess differences in information content when entropy is measured repeatedly in different experimental conditions. Here a method to perform inference on entropy in such conditions is proposed. Building on results coming from studies in the field of Bayesian entropy estimation, a symmetric Dirichlet-multinomial regression model, able to deal efficiently with the issue of mean entropy estimation, is formulated. Through a simulation study the model is shown to outperform linear modeling in a vast range of scenarios and to have promising statistical properties. As a practical example, the method is applied to a data set coming from a real experiment on animal communication.
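For reference, the quantity being modeled can be computed for a single sequence with the plug-in (observed-frequency) estimator sketched below. This is only the basic entropy calculation, not the Dirichlet-multinomial regression model proposed in the article; the function name and example sequences are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(symbols, base=2.0):
    """Plug-in Shannon entropy of a sequence, using observed frequencies."""
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())


# Hypothetical symbol sequences, for illustration only.
print(shannon_entropy("AABB"))  # two equally frequent symbols
print(shannon_entropy("ACGT"))  # four equally frequent symbols
print(shannon_entropy("AAAA"))  # a single repeated symbol
```

The plug-in estimator tends to underestimate entropy when sequences are short relative to the number of possible symbols, which is one reason model-based estimation of the kind described above is of interest.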
Population parameters to compare dog breeds : differences between five Dutch purebred populations

NARCIS (Netherlands)

Nielen, A.L.J.; Beek, van der S.; Ubbink, G.J.; Knol, B.W.

2001-01-01

Differences in five purebred dog populations born in 1994 in the Netherlands were evaluated using different parameters. Numerically, the Golden Retriever was the largest breed (840 litters of 234 sires) and the Kooiker Dog (101 litters of 41 sires) the smallest. The litter per sire ratio was largest
Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

Directory of Open Access Journals (Sweden)

Santana Isabel

2011-08-01

Background: Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results: Press' Q test showed that all classifiers performed better than chance alone (p ... Conclusions: When taking into account sensitivity, specificity and overall classification accuracy, Random Forests and Linear Discriminant Analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of dementia predictions from neuropsychological testing.
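The three headline criteria used to compare the classifiers can be computed from a confusion matrix as sketched below. The function name and the labels are hypothetical, and the sketch leaves out the cross-validation, ROC analysis and Press' Q test used in the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical labels: 1 = progressed to dementia, 0 = did not.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are unbalanced, because a classifier can reach high accuracy while missing most of the positive cases.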
Recursive Algorithm For Linear Regression

Science.gov (United States)

Varanasi, S. V.

1988-01-01

Order of model determined easily. Linear-regression algorithm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactorily.
Use of a Regression Model to Study Host-Genomic Determinants of Phage Susceptibility in MRSA

DEFF Research Database (Denmark)

Zschach, Henrike; Larsen, Mette V; Hasman, Henrik

2018-01-01

strains to 12 (nine monovalent) different therapeutic phage preparations and subsequently employed linear regression models to estimate the influence of individual host gene families on resistance to phages. Specifically, we used a two-step regression model setup with a preselection step based on gene family enrichment. We show that our models are robust and capture the data's underlying signal by comparing their performance to that of models built on randomized data. In doing so, we have identified 167 gene families that govern phage resistance in our strain set and performed functional analysis on them. This revealed genes of possible prophage or mobile genetic element origin, along with genes involved in restriction-modification and transcription regulators, though the majority were genes of unknown function. This study is a step in the direction of understanding the intricate host...
Applied regression analysis a research tool

CERN Document Server

Pantula, Sastry; Dickey, David

1998-01-01

Least squares estimation, when used appropriately, is a powerful research tool. A deeper understanding of the regression concepts is essential for achieving optimal benefits from a least squares analysis. This book builds on the fundamentals of statistical methods and provides appropriate concepts that will allow a scientist to use least squares as an effective research tool. Applied Regression Analysis is aimed at the scientist who wishes to gain a working knowledge of regression analysis. The basic purpose of this book is to develop an understanding of least squares and related statistical methods without becoming excessively mathematical. It is the outgrowth of more than 30 years of consulting experience with scientists and many years of teaching an applied regression course to graduate students. Applied Regression Analysis serves as an excellent text for a service course on regression for non-statisticians and as a reference for researchers. It also provides a bridge between a two-semester introduction to...
Standards for Standardized Logistic Regression Coefficients

Science.gov (United States)

Menard, Scott

2011-01-01

Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Effect of Spirulina supplementation on plasma metabolites in crossbred and purebred Australian Merino lambs

Directory of Open Access Journals (Sweden)

A.E.O. Malau-Aduli

2015-06-01

The effect of supplementing purebred and crossbred Merino lambs with Arthrospira platensis (Spirulina) on plasma metabolite concentrations under a pasture-based management system and the influences of sire breed and sex were investigated. A completely randomized experimental design balanced by 4 sire breeds (Merino, White Suffolk, Dorset and Black Suffolk), 3 Spirulina supplementation levels (0, 100 and 200 ml representing the control, low and high, respectively) and 2 sexes (ewe and wether lambs) was utilised. All lambs had ad libitum access to the basal diet of ryegrass pastures and barley. Lambs in the treatment groups were individually drenched daily with Spirulina prior to being released with the control group of lambs for grazing over a 6-week period following a 3-week adjustment phase. At the start and completion of the feeding trial, blood samples were centrifuged and plasma metabolites measured. Data were analysed with Spirulina supplementation level, sire breed, sex and their second-order interactions fitted as fixed effects and metabolite concentrations as dependent variables. Gamma-glutamyl transferase (GGT) concentrations decreased (from 79.40 to 69.25 UI) and glucose increased (from 3.81 to 4.19 mmol/L) as the level of Spirulina supplementation increased from 0 ml in the control to 200 ml in the high treatment groups (P < 0.05). Lambs supplemented at low Spirulina levels had the highest creatinine concentrations (61.75 μmol/L). Interactions between sex and supplementation level significantly affected glucose, aspartate aminotransferase (AST) and Mg concentrations (P < 0.05), while sire breed and supplementation level interactions influenced albumin to globulin (A/G) ratio, creatinine and GGT concentrations. It was demonstrated that Spirulina supplementation does not negatively impact lamb health and productivity.

Regression and local control rates after radiotherapy for jugulotympanic paragangliomas: Systematic review and meta-analysis

International Nuclear Information System (INIS)

Hulsteijn, Leonie T. van; Corssmit, Eleonora P.M.; Coremans, Ida E.M.; Smit, Johannes W.A.; Jansen, Jeroen C.; Dekkers, Olaf M.

2013-01-01

The primary treatment goal of radiotherapy for paragangliomas of the head and neck region (HNPGLs) is local control of the tumor, i.e. stabilization of tumor volume. Interestingly, regression of tumor volume has also been reported. Up to the present, no meta-analysis has been performed giving an overview of regression rates after radiotherapy in HNPGLs. The main objective was to perform a systematic review and meta-analysis to assess regression of tumor volume in HNPGL-patients after radiotherapy. A second outcome was local tumor control. Design of the study is systematic review and meta-analysis. PubMed, EMBASE, Web of Science, COCHRANE and Academic Search Premier and references of key articles were searched in March 2012 to identify potentially relevant studies. Considering the indolent course of HNPGLs, only studies with ⩾12 months follow-up were eligible. Main outcomes were the pooled proportions of regression and local control after radiotherapy as initial, combined (i.e. directly post-operatively or post-embolization) or salvage treatment (i.e. after initial treatment has failed) for HNPGLs. A meta-analysis was performed with an exact likelihood approach using a logistic regression with a random effect at the study level. Pooled proportions with 95% confidence intervals (CI) were reported. Fifteen studies were included, concerning a total of 283 jugulotympanic HNPGLs in 276 patients. Pooled regression proportions for initial, combined and salvage treatment were respectively 21%, 33% and 52% in radiosurgery studies and 4%, 0% and 64% in external beam radiotherapy studies. Pooled local control proportions for radiotherapy as initial, combined and salvage treatment ranged from 79% to 100%. Radiotherapy for jugulotympanic paragangliomas results in excellent local tumor control and therefore is a valuable treatment for these types of tumors. The effects of radiotherapy on regression of tumor volume remain ambiguous, although the data suggest that regression can
[Prediction model of health workforce and beds in county hospitals of Hunan by multiple linear regression].

Science.gov (United States)

Ling, Ru; Liu, Jiawang

2011-12-01

To construct a prediction model for health workforce and hospital beds in county hospitals of Hunan by multiple linear regression. We surveyed 16 counties in Hunan with stratified random sampling according to uniform questionnaires, and multiple linear regression analysis with 20 quotas selected by literature review was done. Independent variables in the multiple linear regression model on medical personnel in county hospitals included the counties' urban residents' income, crude death rate, medical beds, business occupancy, professional equipment value, the number of devices valued above 10 000 yuan, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, and utilization rate of hospital beds. Independent variables in the multiple linear regression model on county hospital beds included the population aged 65 and above in the counties, disposable income of urban residents, medical personnel of medical institutions in county area, business occupancy, the total value of professional equipment, fixed assets, long-term debt, medical income, medical expenses, outpatient and emergency visits, hospital visits, actual available bed days, utilization rate of hospital beds, and length of hospitalization. The prediction model shows good explanatory power and fit, and may be used for short- and mid-term forecasting.
Bayesian ARTMAP for regression.

Science.gov (United States)

Sasu, L M; Andonie, R

2013-10-01

Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. Copyright © 2013 Elsevier Ltd. All rights reserved.
Mechanisms of neuroblastoma regression

Science.gov (United States)

Brodeur, Garrett M.; Bagatell, Rochelle

2014-01-01

Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients

Science.gov (United States)

Gorgees, HazimMansoor; Mahdi, FatimahAssim

2018-05-01

This article is concerned with comparing the performance of different types of ordinary ridge regression estimators that have already been proposed to estimate the regression parameters when near exact linear relationships among the explanatory variables are present. For this situation we employ the data obtained from tagi gas filling company during the period (2008-2010). The main result we reached is that the method based on the condition number performs better than other methods since it has smaller mean square error (MSE) than the other stated methods.
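To make the role of the ridge parameter and the condition number concrete, here is a minimal sketch for two centered predictors and no intercept. It is not the estimator evaluated in the article: the data, the choice k = 1 and the function names `ridge_two_predictors` and `condition_number` are illustrative assumptions. It only shows that adding k to the diagonal of X'X lowers the condition number and shrinks the coefficients.

```python
from math import sqrt

def _cross_products(x1, x2, k):
    """Entries of the 2x2 matrix X'X + kI for two predictor columns."""
    a11 = sum(u * u for u in x1) + k
    a22 = sum(v * v for v in x2) + k
    a12 = sum(u * v for u, v in zip(x1, x2))
    return a11, a12, a22

def ridge_two_predictors(x1, x2, y, k):
    """Solve (X'X + kI) b = X'y in closed form; k = 0 gives ordinary least squares."""
    a11, a12, a22 = _cross_products(x1, x2, k)
    g1 = sum(u * t for u, t in zip(x1, y))
    g2 = sum(v * t for v, t in zip(x2, y))
    det = a11 * a22 - a12 * a12
    return (a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det

def condition_number(x1, x2, k=0.0):
    """Ratio of largest to smallest eigenvalue of X'X + kI."""
    a11, a12, a22 = _cross_products(x1, x2, k)
    trace, det = a11 + a22, a11 * a22 - a12 * a12
    root = sqrt(trace * trace - 4.0 * det)
    return (trace + root) / (trace - root)

# Hypothetical, nearly collinear predictors; y is exactly x1 + x2
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [u + v for u, v in zip(x1, x2)]

b_ols = ridge_two_predictors(x1, x2, y, k=0.0)
b_ridge = ridge_two_predictors(x1, x2, y, k=1.0)
print(condition_number(x1, x2), condition_number(x1, x2, k=1.0))
print(b_ols, b_ridge)
```

In this toy setting the condition number drops from roughly 1000 to about 20 when k goes from 0 to 1, at the cost of coefficients that are biased slightly toward zero.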
Genetic evaluation of growth traits in beef cattle using random ...

African Journals Online (AJOL)

Traits included in the analysis were birth- (BW), weaning- (WW), yearling- (YW), eighteen month- (FW) and three measurements of mature weight (MW). Linear polynomials with intercepts were fitted using random regression models. The direct heritability estimates were moderate and ranged from 0.13 to 0.25 while maternal ...
Multicollinearity and Regression Analysis

Science.gov (United States)

Daoud, Jamal I.

2017-12-01

In regression analysis a correlation between the response and the predictor(s) is expected, but correlation among predictors is undesirable. The number of predictors included in the regression model depends on many factors, among them historical data and experience. In the end, the selection of the most important predictors is left to the researcher. Multicollinearity is a phenomenon in which two or more predictors are correlated; if this happens, the standard errors of the coefficients will increase [8]. Increased standard errors mean that the coefficients for some or all independent variables may be found not to be significantly different from 0. In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on multicollinearity, its causes and its consequences for the reliability of the regression model.
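The inflation of standard errors described above is commonly summarized by the variance inflation factor (VIF). The sketch below is an illustration under simplifying assumptions, not material from the paper: it covers only the special case of exactly two predictors, where VIF = 1 / (1 - r^2) with r the Pearson correlation between them, and the data and function names are hypothetical.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / sqrt(suu * svv)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical predictors: x2 is nearly a multiple of x1, x3 is uncorrelated with x1
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.0, 6.0, 8.0, 11.0]
x3 = [1.0, -1.0, 0.0, -1.0, 1.0]

vif_collinear = vif_two_predictors(x1, x2)
vif_independent = vif_two_predictors(x1, x3)
print(vif_collinear, vif_independent)
```

The standard error of a coefficient grows with the square root of its VIF, so the nearly collinear pair here inflates standard errors by a factor of about 11 relative to the uncorrelated pair.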
Panel Smooth Transition Regression Models

DEFF Research Database (Denmark)

González, Andrés; Terasvirta, Timo; Dijk, Dick van

We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...
Credit Scoring Problem Based on Regression Analysis

OpenAIRE

Khassawneh, Bashar Suhil Jad Allah

2014-01-01

ABSTRACT: This thesis provides an explanatory introduction to the regression models of data mining and contains basic definitions of key terms in the linear, multiple and logistic regression models. Meanwhile, the aim of this study is to illustrate fitting models for the credit scoring problem using simple linear, multiple linear and logistic regression models and also to analyze the found model functions by statistical tools. Keywords: Data mining, linear regression, logistic regression....
Genetic analysis of calf and heifer losses in Danish Holstein

DEFF Research Database (Denmark)

Fuerst-Walti, B; Sørensen, Morten Kargo

2010-01-01

Mortality in dairy cattle is not only relevant with regard to economic losses but also to animal health and welfare. Thus, the aim of this investigation was to explore the genetic background of postnatal mortality in calves and replacement heifers in different age groups until first calving...... periods, whereas their records were kept for preceding periods. After further data editing, more than 840,000 calves and heifers born in the years 1998 to 2007 were investigated. Mortality rates were 3.23, 2.66, 0.97, 1.92, and 9.36% for the defined periods P1 to P5, respectively. For the estimation...... of genetic parameters, linear and threshold sire models were applied. Effects accounted for were the random effects herd × year × season and sire as well as the fixed effects year × month, number of dam's parity (parities >5 were set to 5), calf size, and calving ease. In total, the pedigree consisted of 4...
Design of Probabilistic Random Forests with Applications to Anticancer Drug Sensitivity Prediction.

Science.gov (United States)

Rahman, Raziur; Haider, Saad; Ghosh, Souparno; Pal, Ranadip

2015-01-01

Random forests consisting of an ensemble of regression trees with equal weights are frequently used for design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees in the form of probabilistic trees and analyzing the nature of heteroscedasticity. The probabilistic tree representation allows for analytical computation of confidence intervals (CIs), and the tree weight optimization is expected to provide stricter CIs with comparable performance in mean error. We approached the ensemble of probabilistic trees' prediction from the perspectives of a mixture distribution and as a weighted sum of correlated random variables. We applied our methodology to the drug sensitivity prediction problem on synthetic and cancer cell line encyclopedia dataset and illustrated that tree weights can be selected to reduce the average length of the CI without increase in mean error.
Effect of climatic variables on production and reproduction traits of colored broiler breeder poultry

Directory of Open Access Journals (Sweden)

G. D. Nayak

2015-04-01

Full Text Available Aim: The present study was conducted to investigate the important climatic variables affecting production and reproduction in a broiler breeder flock. Materials and Methods: The experiment was conducted for a period of 1 year on colored synthetic female line male and female poultry birds. 630 female progeny and 194 male progenies from 69 sires and 552 dams produced in four consecutive hatches at an interval of 10 days were used for the present study. Each of the seven, body weight and reproduction traits were regressed with nine environmental variables. Initially, the data were subjected to hatch effect and sire effect corrections through best linear unbiased estimator (BLUE method and, then, multiple linear regressions of environmental variables on each trait were applied. Result: The overall regression was significant (p<0.01 in all traits except 20 week age body weight of females. The R2 value ranged from 0.12 to 0.90 for the traits. Regression coefficient values (b values for maximum temperature and minimum temperature were significant (p<0.05 on 5th week age body weight of males. Similarly, evaporation and morning relative humidity (RH was significant (p<0.05 for 5th week age body weight of females. Almost all b values were significant (p<0.05 for egg production up to 40 week age. The b values representing rainfall, morning RH, afternoon RH, sunshine hours, and rainy days were significant (p<0.05 on bodyweight at 20 week age. All environmental variables except maximum temperature and minimum temperature were significant (p<0.05 on body weight of females at 20 weeks of age. Age at sexual maturity was regressed significantly (p<0.05 with evaporation, afternoon RH whereas, egg shape index was regressed significantly (p<0.05 with a maximum temperature, evaporation and afternoon RH. Conclusion: The result indicated that various environmental variables play a significant role in production and reproduction of breeder broiler poultry. Controlling
Unbalanced Regressions and the Predictive Equation

DEFF Research Database (Denmark)

Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo

Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoretical predictive equation by suggesting a data generating process, where returns are generated as linear functions of a lagged latent I(0) risk process. The observed predictor is a function of this latent I(0) process, but it is corrupted by a fractionally integrated noise. Such a process may arise due...... to aggregation or unexpected level shifts. In this setup, the practitioner estimates a misspecified, unbalanced, and endogenous predictive regression. We show that the OLS estimate of this regression is inconsistent, but standard inference is possible. To obtain a consistent slope estimate, we then suggest...
[From clinical judgment to linear regression model].

Science.gov (United States)

Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

2013-01-01

When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and normally distributed. Stated another way, regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bX, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "X" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of independent variables in the outcome.
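The quantities named in this abstract, the intercept "a", the slope "b" and the coefficient of determination, follow directly from the least-squares formulas. A minimal sketch on hypothetical data (the values and the function name `fit_line` are illustrative assumptions, not taken from the article):

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares fit of Y = a + bX; returns (a, b, r_squared)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # change in Y per one-unit change in X
    a = y_bar - b * x_bar              # value of Y when X equals 0
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot  # share of variance in Y explained by X

# Hypothetical measurements, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, r_squared = fit_line(x, y)
print(a, b, r_squared)
```

Here the fitted slope is about 1.99 and the intercept about 0.05, so each one-unit increase in X is associated with an increase of roughly two units in Y.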
Autistic Regression

Science.gov (United States)

Matson, Johnny L.; Kozlowski, Alison M.

2010-01-01

Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…
Ridge regression estimator: combining unbiased and ordinary ridge regression methods of estimation

Directory of Open Access Journals (Sweden)

Sharad Damodar Gore

2009-10-01

Statistical literature has several methods for coping with multicollinearity. This paper introduces a new shrinkage estimator, called modified unbiased ridge (MUR). This estimator is obtained from unbiased ridge regression (URR) in the same way that ordinary ridge regression (ORR) is obtained from ordinary least squares (OLS). Properties of MUR are derived. Results on its matrix mean squared error (MMSE) are obtained. MUR is compared with ORR and URR in terms of MMSE. These results are illustrated with an example based on data generated by Hoerl and Kennard (1975).
Dynamic Optimization for IPS2 Resource Allocation Based on Improved Fuzzy Multiple Linear Regression

Directory of Open Access Journals (Sweden)

Maokuan Zheng

2017-01-01

The study mainly focuses on resource allocation optimization for industrial product-service systems (IPS2). The development of IPS2 leads to sustainable economy by introducing cooperative mechanisms apart from commodity transaction. The randomness and fluctuation of service requests from customers lead to the volatility of IPS2 resource utilization ratio. Three basic rules for resource allocation optimization are put forward to improve system operation efficiency and cut unnecessary costs. An approach based on fuzzy multiple linear regression (FMLR) is developed, which integrates the strength and concision of multiple linear regression in data fitting and factor analysis and the merit of fuzzy theory in dealing with uncertain or vague problems, which helps reduce those costs caused by unnecessary resource transfer. The iteration mechanism is introduced in the FMLR algorithm to improve forecasting accuracy. A case study of human resource allocation optimization in construction machinery industry is implemented to test and verify the proposed model.
Customized sequential designs for random simulation experiments: Kriging metamodeling and bootstrapping

NARCIS (Netherlands)

Beers, van W.C.M.; Kleijnen, J.P.C.

2005-01-01

This paper proposes a novel method to select an experimental design for interpolation in random simulation, especially discrete event simulation. (Though the paper focuses on Kriging, this design approach may also apply to other types of metamodels such as linear regression models.) Assuming that
Discriminative Elastic-Net Regularized Linear Regression.

Science.gov (United States)

Zhang, Zheng; Lai, Zhihui; Xu, Yong; Shao, Ling; Wu, Jian; Xie, Guo-Sen

2017-03-01

In this paper, we aim at learning compact and discriminative linear regression models. Linear regression has been widely used in different problems. However, most of the existing linear regression methods exploit the conventional zero-one matrix as the regression targets, which greatly narrows the flexibility of the regression model. Another major limitation of these methods is that the learned projection matrix fails to precisely project the image features to the target space due to their weak discriminative capability. To this end, we present an elastic-net regularized linear regression (ENLR) framework, and develop two robust linear regression models which possess the following special characteristics. First, our methods exploit two particular strategies to enlarge the margins of different classes by relaxing the strict binary targets into a more feasible variable matrix. Second, a robust elastic-net regularization of singular values is introduced to enhance the compactness and effectiveness of the learned projection matrix. Third, the resulting optimization problem of ENLR has a closed-form solution in each iteration, which can be solved efficiently. Finally, rather than directly exploiting the projection matrix for recognition, our methods employ the transformed features as the new discriminative representations to make the final image classification. Compared with the traditional linear regression model and some of its variants, our method is much more accurate in image classification. Extensive experiments conducted on publicly available data sets demonstrate that the proposed framework can outperform the state-of-the-art methods. The MATLAB codes of our methods are available at http://www.yongxu.org/lunwen.html.
Extreme Sparse Multinomial Logistic Regression: A Fast and Robust Framework for Hyperspectral Image Classification

Science.gov (United States)

Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen

2017-12-01

Although the sparse multinomial logistic regression (SMLR) has provided a useful tool for sparse classification, it suffers from inefficacy in dealing with high dimensional features and manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. In order to tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected to a new feature space with randomly generated weight and bias. Second, an optimization model is established by the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR via minimizing the training error and the regressor value. Furthermore, the extended multi-attribute profiles (EMAPs) are utilized for extracting both the spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, the logistic regression via the variable splitting and the augmented Lagrangian (LORSAL) is adopted in the proposed framework for reducing the computational time. Experiments are conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, which have shown the fast and robust performance of the proposed ESMLR framework.

Assessment of Poisson, logit, and linear models for genetic analysis of clinical mastitis in Norwegian Red cows.

Science.gov (United States)

Vazquez, A I; Gianola, D; Bates, D; Weigel, K A; Heringstad, B

2009-02-01

Clinical mastitis is typically coded as presence/absence during some period of exposure, and records are analyzed with linear or binary data models. Because presence includes cows with multiple episodes, there is loss of information when a count is treated as a binary response. The Poisson model is designed for counting random variables, and although it is used extensively in epidemiology of mastitis, it has rarely been used for studying the genetics of mastitis. Many models have been proposed for genetic analysis of mastitis, but they have not been formally compared. The main goal of this study was to compare linear (Gaussian), Bernoulli (with logit link), and Poisson models for the purpose of genetic evaluation of sires for mastitis in dairy cattle. The response variables were clinical mastitis (CM; 0, 1) and number of CM cases (NCM; 0, 1, 2, ..). Data consisted of records on 36,178 first-lactation daughters of 245 Norwegian Red sires distributed over 5,286 herds. Predictive ability of models was assessed via a 3-fold cross-validation using mean squared error of prediction (MSEP) as the end-point. Between-sire variance estimates for NCM were 0.065 in Poisson and 0.007 in the linear model. For CM the between-sire variance was 0.093 in logit and 0.003 in the linear model. The ratio between herd and sire variances for the models with NCM response was 4.6 and 3.5 for Poisson and linear, respectively, and for model for CM was 3.7 in both logit and linear models. The MSEP for all cows was similar. However, within healthy animals, MSEP was 0.085 (Poisson), 0.090 (linear for NCM), 0.053 (logit), and 0.056 (linear for CM). For mastitic animals the MSEP values were 1.206 (Poisson), 1.185 (linear for NCM response), 1.333 (logit), and 1.319 (linear for CM response). The models for count variables had a better performance when predicting diseased animals and also had a similar performance between them. Logit and linear models for CM had better predictive ability for healthy
Visible and near infrared spectroscopy coupled to random forest to quantify some soil quality parameters

Science.gov (United States)

de Santana, Felipe Bachion; de Souza, André Marcelo; Poppi, Ronei Jesus

2018-02-01

This study evaluates the use of visible and near infrared spectroscopy (Vis-NIRS) combined with multivariate regression based on random forest to quantify some soil quality parameters. The parameters analyzed were soil cation exchange capacity (CEC), sum of exchange bases (SB), organic matter (OM), clay and sand present in the soils of several regions of Brazil. Current methods for evaluating these parameters are laborious and time-consuming and require various wet analytical methods that are not adequate for use in precision agriculture, where faster and automatic responses are required. The random forest regression models were statistically better than PLS regression models for CEC, OM, clay and sand, demonstrating resistance to overfitting, attenuating the effect of outlier samples and indicating the most important variables for the model. The methodology demonstrates the potential of Vis-NIR as an alternative for determination of CEC, SB, OM, sand and clay, making it possible to develop a fast and automatic analytical procedure.
Categorical regression dose-response modeling

Science.gov (United States)

The goal of this training is to provide participants with training on the use of the U.S. EPA’s Categorical Regression software (CatReg) and its application to risk assessment. Categorical regression fits mathematical models to toxicity data that have been assigned ord...
Abstract Expression Grammar Symbolic Regression

Science.gov (United States)

Korns, Michael F.

This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, and which is faster and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, aged layered populations, plus discrete and continuous differential evolution are used to produce an improved symbolic regression system. Nine base test cases from the literature are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial strength symbolic regression systems.
REGRESSIVE ANALYSIS OF BRAKING EFFICIENCY OF M1 CATEGORY VEHICLES WITH ANTI-BLOCKING BRAKE SYSTEM

Directory of Open Access Journals (Sweden)

О. Sarayev

2015-07-01

The problem of assessing the effectiveness of vehicle braking after a road accident is considered. For the first time for modern vehicle models equipped with anti-lock brakes, regression models were obtained that describe the relationship between the coefficient of traction and steady deceleration treated as a random variable. This is consistent with the stochastic nature of the physical process of vehicle braking, unlike the previously adopted method of formalizing this process with a deterministic function.
Estimating Unbiased Treatment Effects in Education Using a Regression Discontinuity Design

Directory of Open Access Journals (Sweden)

William C. Smith

2014-08-01

The ability of regression discontinuity (RD) designs to provide an unbiased treatment effect while avoiding the ethical concerns that plague Random Control Trials (RCTs) makes them a valuable and useful approach in education evaluation. RD is the only explicitly recognized quasi-experimental approach identified by the Institute of Education Statistics to meet the prerequisites of a causal relationship. Unfortunately, the statistical complexity of the RD design has limited its application in education research. This article provides a less technical introduction to RD for education researchers and practitioners. Using visual analysis to aid conceptual understanding, the article walks readers through the essential steps of a Sharp RD design using hypothetical, but realistic, district intervention data and provides additional resources for further exploration.
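As a rough illustration of the Sharp RD idea mentioned in this record, the sketch below fits a separate straight line on each side of a cutoff and takes the gap between the two fitted values at the cutoff as the treatment effect. It is not taken from the article: the function names, the bandwidth, the rule that units at or above the cutoff are treated, and the noise-free synthetic data are all assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def sharp_rd_effect(running, outcome, cutoff=0.0, bandwidth=1.0):
    """Estimate the jump in the outcome at the cutoff (hypothetical helper).

    Assumption: units with running >= cutoff are treated, all others are not.
    The running variable is centered so each intercept is the fitted value
    at the cutoff itself.
    """
    left = [(r - cutoff, y) for r, y in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, y) for r, y in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    _, left_at_cutoff = fit_line(*zip(*left))
    _, right_at_cutoff = fit_line(*zip(*right))
    return right_at_cutoff - left_at_cutoff


# Synthetic data: a linear trend plus a jump of 2.0 at the cutoff.
running = [i / 10 for i in range(-10, 11)]
outcome = [1.0 + 0.5 * r + (2.0 if r >= 0 else 0.0) for r in running]
effect = sharp_rd_effect(running, outcome)
print(f"estimated jump at cutoff: {effect:.3f}")
```

With real data the estimate depends on the bandwidth and on the functional form chosen on each side, so both would normally be varied as a robustness check.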
Comparison of Classical Linear Regression and Orthogonal Regression According to the Sum of Squares Perpendicular Distances

OpenAIRE

KELEŞ, Taliha; ALTUN, Murat

2016-01-01

Regression analysis is a statistical technique for investigating and modeling the relationship between variables. The purpose of this study was the trivial presentation of the equation for orthogonal regression (OR) and the comparison of classical linear regression (CLR) and OR techniques with respect to the sum of squared perpendicular distances. For that purpose, the analyses were shown by an example. It was found that the sum of squared perpendicular distances of OR is smaller. Thus, it wa...
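The comparison described in this record can be sketched in a few lines. The example below fits a classical least squares line and an orthogonal regression line to a small made-up data set and compares the sum of squared perpendicular distances for each. The data and function names are assumptions for illustration only, and the closed-form orthogonal slope used here is the standard one for equal error variances (valid when the covariance term is non-zero); it is not claimed to be the derivation given in the paper.

```python
import math


def centered_sums(xs, ys):
    """Return means and centered sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy


def fit_classical(xs, ys):
    """Classical linear regression: minimizes squared vertical distances."""
    mx, my, sxx, _, sxy = centered_sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_orthogonal(xs, ys):
    """Orthogonal regression: minimizes squared perpendicular distances."""
    mx, my, sxx, syy, sxy = centered_sums(xs, ys)
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


def perpendicular_ss(xs, ys, slope, intercept):
    """Sum of squared perpendicular distances from the points to the line."""
    vertical_ss = sum((y - slope * x - intercept) ** 2 for x, y in zip(xs, ys))
    return vertical_ss / (1 + slope ** 2)


# Made-up example data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9]

clr = fit_classical(xs, ys)
orth = fit_orthogonal(xs, ys)
ss_clr = perpendicular_ss(xs, ys, *clr)
ss_orth = perpendicular_ss(xs, ys, *orth)
print(f"CLR: slope={clr[0]:.4f}, perpendicular SS={ss_clr:.5f}")
print(f"OR:  slope={orth[0]:.4f}, perpendicular SS={ss_orth:.5f}")
```

Because orthogonal regression minimizes exactly this criterion, its perpendicular sum of squares cannot exceed that of the classical fit on the same data.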
Regression tree analysis for predicting body weight of Nigerian Muscovy duck (Cairina moschata)

Directory of Open Access Journals (Sweden)

Oguntunji Abel Olusegun

2017-01-01

Morphometric parameters and their indices are central to the understanding of the type and function of livestock. The present study was conducted to predict the body weight (BWT) of adult Nigerian Muscovy ducks from nine (9) morphometric parameters and seven (7) body indices, and also to identify the most important predictor of BWT among them using regression tree analysis (RTA). The experimental birds comprised 1,020 adult male and female Nigerian Muscovy ducks randomly sampled in the Rain Forest (203), Guinea Savanna (298) and Derived Savanna (519) agro-ecological zones. The RTA revealed that compactness, body girth and massiveness were the most important independent variables in predicting BWT, and these were used in constructing the regression tree. The combined effect of the three predictors was very high and explained 91.00% of the observed variation in the target variable (BWT). The optimal regression tree suggested that Muscovy ducks with compactness >5.765 would be fleshy and have the highest BWT. The results of the present study could be exploited by animal breeders and breeding companies in the selection and improvement of BWT of Muscovy ducks.
Pathological assessment of liver fibrosis regression

Directory of Open Access Journals (Sweden)

WANG Bingqiong

2017-03-01

Hepatic fibrosis is the common pathological outcome of chronic hepatic diseases. An accurate assessment of fibrosis degree provides an important reference for a definite diagnosis of diseases, treatment decision-making, treatment outcome monitoring, and prognostic evaluation. At present, many clinical studies have proven that regression of hepatic fibrosis and early-stage liver cirrhosis can be achieved by effective treatment, and a correct evaluation of fibrosis regression has become a hot topic in clinical research. Liver biopsy has long been regarded as the gold standard for the assessment of hepatic fibrosis, and thus it plays an important role in the evaluation of fibrosis regression. This article reviews the clinical application of current pathological staging systems in the evaluation of fibrosis regression from the perspectives of semi-quantitative scoring system, quantitative approach, and qualitative approach, in order to propose a better pathological evaluation system for the assessment of fibrosis regression.
Collocation methods for uncertainty quantification in PDE models with random data

KAUST Repository

Nobile, Fabio

2014-01-06

In this talk we consider Partial Differential Equations (PDEs) whose input data are modeled as random fields to account for their intrinsic variability or our lack of knowledge. After parametrizing the input random fields by finitely many independent random variables, we exploit the high regularity of the solution of the PDE as a function of the input random variables and consider sparse polynomial approximations in probability (Polynomial Chaos expansion) by collocation methods. We first address interpolatory approximations where the PDE is solved on a sparse grid of Gauss points in the probability space and the solutions thus obtained are interpolated by multivariate polynomials. We present recent results on optimized sparse grids in which the selection of points is based on a knapsack approach and relies on sharp estimates of the decay of the coefficients of the polynomial chaos expansion of the solution. Secondly, we consider regression approaches where the PDE is evaluated on randomly chosen points in the probability space and a polynomial approximation is constructed by the least squares method. We present recent theoretical results on the stability and optimality of the approximation under suitable conditions between the number of sampling points and the dimension of the polynomial space. In particular, we show that for uniform random variables, the number of sampling points has to scale quadratically with the dimension of the polynomial space to maintain the stability and optimality of the approximation. Numerical results show that this condition is sharp in the monovariate case but seems to be over-constraining in higher dimensions. The regression technique therefore seems to be attractive in higher dimensions.
Logistic Regression: Concept and Application

Science.gov (United States)

Cokluk, Omay

2010-01-01

The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Logistic quantile regression provides improved estimates for bounded avian counts: a case study of California Spotted Owl fledgling production

Science.gov (United States)

Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane

2017-01-01

Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...
Predictors of course in obsessive-compulsive disorder: logistic regression versus Cox regression for recurrent events.

Science.gov (United States)

Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M

2007-09-01

Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.
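The remission criterion in this record is a simple two-part rule, which the sketch below spells out. The function name, the default arguments and the example scores are hypothetical, and reading "a reduction of seven points" as "at least seven points" is an assumption. Applying the rule at every measurement shows why remission can be a recurrent event: a patient may remit, relapse and remit again.

```python
def is_remission(baseline, score, threshold=13, min_reduction=7):
    """Apply the abstract's rule: Y-BOCS below the threshold AND a drop
    of at least `min_reduction` points from baseline (assumed reading)."""
    return score < threshold and (baseline - score) >= min_reduction


# Hypothetical Y-BOCS course for one patient (baseline, then follow-ups).
baseline = 26
follow_up_scores = [24, 15, 11, 14, 10]
status = [is_remission(baseline, s) for s in follow_up_scores]
print(status)  # remission can come and go across measurements
```

A logistic regression would model one of these flags at a single chosen time point, whereas a recurrent events model uses the whole sequence.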
Sparse reduced-rank regression with covariance estimation

KAUST Repository

Chen, Lisha

2014-12-08

Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.
Sparse reduced-rank regression with covariance estimation

KAUST Repository

Chen, Lisha; Huang, Jianhua Z.

2014-01-01

Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.
Determinants of the probability of adopting quality protein maize (QPM) technology in Tanzania: A logistic regression analysis

OpenAIRE

Gregory, T.; Sewando, P.

2013-01-01

Adoption of technology is an important factor in economic development. The thrust of this study was to establish factors affecting adoption of QPM technology in Northern zone of Tanzania. Primary data was collected from a random sample of 120 smallholder maize farmers in four villages. Data collected were analysed using descriptive and quantitative methods. Logit model was used to determine factors that influence adoption of QPM technology. The regression results indicated that education of t...
The quantile regression approach to efficiency measurement: insights from Monte Carlo simulations.

Science.gov (United States)

Liu, Chunping; Laporte, Audrey; Ferguson, Brian S

2008-09-01

In the health economics literature there is an ongoing debate over approaches used to estimate the efficiency of health systems at various levels, from the level of the individual hospital - or nursing home - up to that of the health system as a whole. The two most widely used approaches to evaluating the efficiency with which various units deliver care are non-parametric data envelopment analysis (DEA) and parametric stochastic frontier analysis (SFA). Productivity researchers tend to have very strong preferences over which methodology to use for efficiency estimation. In this paper, we use Monte Carlo simulation to compare the performance of DEA and SFA in terms of their ability to accurately estimate efficiency. We also evaluate quantile regression as a potential alternative approach. A Cobb-Douglas production function, random error terms and a technical inefficiency term with different distributions are used to calculate the observed output. The results, based on these experiments, suggest that neither DEA nor SFA can be regarded as clearly dominant, and that, depending on the quantile estimated, the quantile regression approach may be a useful addition to the armamentarium of methods for estimating technical efficiency.
Online Censoring for Large-Scale Regressions with Application to Streaming Big Data.

Science.gov (United States)

Berberidis, Dimitris; Kekatos, Vassilis; Giannakis, Georgios B

2016-08-01

On par with data-intensive applications, the sheer size of modern linear regression problems creates an ever-growing demand for efficient solvers. Fortunately, a significant percentage of the data accrued can be omitted while maintaining a certain quality of statistical inference with an affordable computational budget. This work introduces means of identifying and omitting less informative observations in an online and data-adaptive fashion. Given streaming data, the related maximum-likelihood estimator is sequentially found using first- and second-order stochastic approximation algorithms. These schemes are well suited when data are inherently censored or when the aim is to save communication overhead in decentralized learning setups. In a different operational scenario, the task of joint censoring and estimation is put forth to solve large-scale linear regressions in a centralized setup. Novel online algorithms are developed enjoying simple closed-form updates and provable (non)asymptotic convergence guarantees. To attain desired censoring patterns and levels of dimensionality reduction, thresholding rules are investigated too. Numerical tests on real and synthetic datasets corroborate the efficacy of the proposed data-adaptive methods compared to data-agnostic random projection-based alternatives.
Harmonic regression of Landsat time series for modeling attributes from national forest inventory data

Science.gov (United States)

Wilson, Barry T.; Knight, Joseph F.; McRoberts, Ronald E.

2018-03-01

Imagery from the Landsat Program has been used frequently as a source of auxiliary data for modeling land cover, as well as a variety of attributes associated with tree cover. With ready access to all scenes in the archive since 2008 due to the USGS Landsat Data Policy, new approaches to deriving such auxiliary data from dense Landsat time series are required. Several methods have previously been developed for use with finer temporal resolution imagery (e.g. AVHRR and MODIS), including image compositing and harmonic regression using Fourier series. The manuscript presents a study, using Minnesota, USA during the years 2009-2013 as the study area and timeframe. The study examined the relative predictive power of land cover models, in particular those related to tree cover, using predictor variables based solely on composite imagery versus those using estimated harmonic regression coefficients. The study used two common non-parametric modeling approaches (i.e. k-nearest neighbors and random forests) for fitting classification and regression models of multiple attributes measured on USFS Forest Inventory and Analysis plots using all available Landsat imagery for the study area and timeframe. The estimated Fourier coefficients developed by harmonic regression of tasseled cap transformation time series data were shown to be correlated with land cover, including tree cover. Regression models using estimated Fourier coefficients as predictor variables showed a two- to threefold increase in explained variance for a small set of continuous response variables, relative to comparable models using monthly image composites. Similarly, the overall accuracies of classification models using the estimated Fourier coefficients were approximately 10-20 percentage points higher than the models using the image composites, with corresponding individual class accuracies between six and 45 percentage points higher.
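To make the idea of harmonic regression concrete, the sketch below fits a first-order Fourier series (a constant plus one cosine and one sine term with a one-year period) to irregularly spaced observations by ordinary least squares. It is an illustration only, not the authors' processing chain: the observation dates, the synthetic "index" values, the one-year period and all function names are assumptions. The fitted coefficients are the kind of quantity the study uses as predictor variables.

```python
import math


def design_row(t, period, n_harmonics):
    """Row of the design matrix: [1, cos, sin, cos(2w), sin(2w), ...]."""
    row = [1.0]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k * t / period
        row += [math.cos(w), math.sin(w)]
    return row


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_harmonic(times, values, period=365.25, n_harmonics=1):
    """Least squares Fourier coefficients via the normal equations."""
    rows = [design_row(t, period, n_harmonics) for t in times]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(p)]
    return solve(xtx, xty)


# Synthetic, irregularly spaced acquisition days and a noise-free signal.
times = [5, 21, 37, 69, 101, 117, 149, 181, 213, 229, 261, 293, 325, 357]
true_coef = [0.30, -0.15, 0.05]  # mean, cosine and sine amplitudes (made up)
values = [sum(c * v for c, v in zip(true_coef, design_row(t, 365.25, 1)))
          for t in times]

coef = fit_harmonic(times, values)
print([round(c, 4) for c in coef])
```

Because the model is linear in its coefficients, cloud gaps and irregular revisit times only change the rows of the design matrix, not the fitting procedure.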
Longitudinal beta regression models for analyzing health-related quality of life scores over time

Directory of Open Access Journals (Sweden)

Hunger Matthias

2012-09-01

Background: Health-related quality of life (HRQL) has become an increasingly important outcome parameter in clinical trials and epidemiological research. HRQL scores are typically bounded at both ends of the scale and often highly skewed. Several regression techniques have been proposed to model such data in cross-sectional studies; however, methods applicable in longitudinal research are less well researched. This study examined the use of beta regression models for analyzing longitudinal HRQL data using two empirical examples with distributional features typically encountered in practice. Methods: We used SF-6D utility data from a German older age cohort study and stroke-specific HRQL data from a randomized controlled trial. We described the conceptual differences between mixed and marginal beta regression models and compared both models to the commonly used linear mixed model in terms of overall fit and predictive accuracy. Results: At any measurement time, the beta distribution fitted the SF-6D utility data and stroke-specific HRQL data better than the normal distribution. The mixed beta model showed better likelihood-based fit statistics than the linear mixed model and respected the boundedness of the outcome variable. However, it tended to underestimate the true mean at the upper part of the distribution. Adjusted group means from the marginal beta model and the linear mixed model were nearly identical, but differences could be observed with respect to standard errors. Conclusions: Understanding the conceptual differences between mixed and marginal beta regression models is important for their proper use in the analysis of longitudinal HRQL data. Beta regression fits the typical distribution of HRQL data better than linear mixed models; however, if the focus is on estimating group mean scores rather than making individual predictions, the two methods might not differ substantially.

Regression models of reactor diagnostic signals

International Nuclear Information System (INIS)

Vavrin, J.

1989-01-01

The application of an autoregression model, as the simplest regression model of diagnostic signals, is described for the experimental analysis of diagnostic systems and for in-service monitoring and diagnostics of normal and anomalous conditions. The diagnostic method, which uses a regression-type diagnostic database and regression spectral diagnostics, is described, as is the diagnostics of neutron noise signals from anomalous modes in the experimental fuel assembly of a reactor. (author)
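As a minimal illustration of an autoregression model of a signal, the sketch below estimates a first-order model x[t] = phi * x[t-1] + e[t] by least squares and returns the residual variance, which is one simple quantity that could be tracked to flag a change in signal behaviour. The record does not state the model order or the estimation method, so the first-order choice, the function name and the simulated signal are all assumptions.

```python
import random


def fit_ar1(x):
    """Least squares estimate of phi in x[t] = phi * x[t-1] + e[t].

    Returns (phi, residual_variance).
    """
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    phi = num / den
    residuals = [x[t] - phi * x[t - 1] for t in range(1, len(x))]
    sigma2 = sum(r * r for r in residuals) / len(residuals)
    return phi, sigma2


# Simulated stand-in for a noise signal, with a known coefficient of 0.8.
rng = random.Random(0)
signal = [0.0]
for _ in range(2000):
    signal.append(0.8 * signal[-1] + rng.gauss(0.0, 1.0))

phi_hat, sigma2_hat = fit_ar1(signal)
print(f"phi estimate: {phi_hat:.3f}, residual variance: {sigma2_hat:.3f}")
```

Refitting the model on successive windows and comparing the coefficient or residual variance against values from a reference state is one way such a model can serve monitoring purposes.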
Price Sensitivity of Demand for Prescription Drugs: Exploiting a Regression Kink Design

DEFF Research Database (Denmark)

Simonsen, Marianne; Skipper, Lars; Skipper, Niels

This paper investigates price sensitivity of demand for prescription drugs using drug purchase records for a 20% random sample of the Danish population. We identify price responsiveness by exploiting exogenous variation in prices caused by kinked reimbursement schemes and implement a regression kink design. Thus, within a unifying framework we uncover price sensitivity for different subpopulations and types of drugs. The results suggest low average price responsiveness with corresponding price elasticities ranging from -0.08 to -0.25, implying that demand is inelastic. Individuals with lower education and income are, however, more responsive to the price. Also, essential drugs that prevent deterioration in health and prolong life have lower associated average price sensitivity.
A flexible mixed-effect negative binomial regression model for detecting unusual increases in MRI lesion counts in individual multiple sclerosis patients.

Science.gov (United States)

Kondo, Yumi; Zhao, Yinshan; Petkau, John

2015-06-15

We develop a new modeling approach to enhance a recently proposed method to detect increases of contrast-enhancing lesions (CELs) on repeated magnetic resonance imaging, which have been used as an indicator for potential adverse events in multiple sclerosis clinical trials. The method signals patients with unusual increases in CEL activity by estimating the probability of observing CEL counts as large as those observed on a patient's recent scans conditional on the patient's CEL counts on previous scans. This conditional probability index (CPI), computed based on a mixed-effect negative binomial regression model, can vary substantially depending on the choice of distribution for the patient-specific random effects. Therefore, we relax this parametric assumption to model the random effects with an infinite mixture of beta distributions, using the Dirichlet process, which effectively allows any form of distribution. To our knowledge, no previous literature considers a mixed-effect regression for longitudinal count variables where the random effect is modeled with a Dirichlet process mixture. As our inference is in the Bayesian framework, we adopt a meta-analytic approach to develop an informative prior based on previous clinical trials. This is particularly helpful at the early stages of trials when less data are available. Our enhanced method is illustrated with CEL data from 10 previous multiple sclerosis clinical trials. Our simulation study shows that our procedure estimates the CPI more accurately than parametric alternatives when the patient-specific random effect distribution is misspecified and that an informative prior improves the accuracy of the CPI estimates. Copyright © 2015 John Wiley & Sons, Ltd.
Regression and Sparse Regression Methods for Viscosity Estimation of Acid Milk From it’s Sls Features

DEFF Research Database (Denmark)

Sharifzadeh, Sara; Skytte, Jacob Lercke; Nielsen, Otto Højager Attermann

2012-01-01

Statistical solutions find widespread use in food and medicine quality control. We investigate the effect of different regression and sparse regression methods for a viscosity estimation problem using the spectro-temporal features from a new Sub-Surface Laser Scattering (SLS) vision system. From...... with sparse LAR, lasso and Elastic Net (EN) sparse regression methods. Due to the inconsistent measurement conditions, Locally Weighted Scatter plot Smoothing (Loess) has been employed to alleviate the undesired variation in the estimated viscosity. The experimental results of applying different methods show...
Testing discontinuities in nonparametric regression

KAUST Repository

Dai, Wenlin

2017-01-19

In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method of H.-G. Müller and U. Stadtmüller [Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337, doi: 10.1214/aos/1018031100]...
Testing discontinuities in nonparametric regression

KAUST Repository

Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun

2017-01-01

In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method of H.-G. Müller and U. Stadtmüller [Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337, doi: 10.1214/aos/1018031100]...
Parallel distribution of sexes within left and right uterine horns in Holstein dairy cows: evidence that the effect of side of pregnancy on sex ratio could be breed-specific in cattle.

Science.gov (United States)

Gharagozlou, F; Vojgani, M; Akbarinejad, V; Niasari-Naslaji, A; Hemmati, M; Youssefi, R

2013-11-30

Dissimilar distribution of male and female calves within left and right uterine horns has been observed in beef cows. A retrospective study was conducted to investigate the effect of side of pregnancy on secondary sex ratio in Holstein dairy cows. Data associated with sex of calves, side of pregnancy, sire, dam, parity number of dam, AI technician, season and year were retrieved from the database of a Holstein dairy farm. In total, data consisted of 6515 birth records from 3155 dams and 244 sires across years 2001-2010. Data were analyzed using logistic regression. There was no difference in proportion of male and female calves between left (52.9% and 47.1%, respectively) and right (53.2% and 46.8%, respectively) uterine horns (P>0.05). AI technician, year, season and parity of dam did not affect secondary sex ratio (P>0.05). Secondary sex ratio of left and right uterine horns, and consequently, overall secondary sex ratio (53.1%) were skewed toward males as compared with hypothetical secondary sex ratio of 50% (Pcows. Copyright © 2013 Elsevier B.V. All rights reserved.
Land surface temperature downscaling using random forest regression: primary result and sensitivity analysis

Science.gov (United States)

Pan, Xin; Cao, Chen; Yang, Yingbao; Li, Xiaolong; Shan, Liangliang; Zhu, Xi

2018-04-01

The land surface temperature (LST) derived from thermal infrared satellite images is a meaningful variable in many remote sensing applications. However, the spatial resolution of current satellite thermal infrared sensors is too coarse to meet application needs. In this study, an LST image was downscaled using a random forest model relating LST to multiple predictors in an arid region with an oasis-desert ecotone. The proposed downscaling approach was evaluated using LST derived from the MODIS LST product of Zhangye City in the Heihe Basin. The primary result showed that the distribution of the downscaled LST matched that of the oasis and desert ecosystems. According to the sensitivity analysis, the most sensitive factors for LST downscaling were modified normalized difference water index (MNDWI)/normalized multi-band drought index (NMDI), soil adjusted vegetation index (SAVI)/shortwave infrared reflectance (SWIR)/normalized difference vegetation index (NDVI), normalized difference building index (NDBI)/SAVI and SWIR/NDBI/MNDWI/NDWI for the water, vegetation, building and desert regions, with LST variations (at most) of 0.20/-0.22 K, 0.92/0.62/0.46 K, 0.28/-0.29 K and 3.87/-1.53/-0.64/-0.25 K under predictor perturbations of +/-0.02, respectively.
On Solving Lq-Penalized Regressions

Directory of Open Access Journals (Sweden)

Tracy Zhou Wu

2007-01-01

Lq-penalized regression arises in multidimensional statistical modelling where all or part of the regression coefficients are penalized to achieve both accuracy and parsimony of statistical models. There is often substantial computational difficulty except for the quadratic penalty case. The difficulty is partly due to the nonsmoothness of the objective function inherited from the use of the absolute value. We propose a new solution method for the general Lq-penalized regression problem based on space transformation and thus efficient optimization algorithms. The new method has immediate applications in statistics, notably in penalized spline smoothing problems. In particular, the LASSO problem is shown to be polynomial time solvable. Numerical studies show promise of our approach.
An Assessment of Polynomial Regression Techniques for the Relative Radiometric Normalization (RRN) of High-Resolution Multi-Temporal Airborne Thermal Infrared (TIR) Imagery

Directory of Open Access Journals (Sweden)

Mir Mustafizur Rahman

2014-11-01

Thermal Infrared (TIR) remote sensing images of urban environments are increasingly available from airborne and satellite platforms. However, limited access to high-spatial resolution (H-res: ~1 m) TIR satellite images requires the use of TIR airborne sensors for mapping large complex urban surfaces, especially at micro-scales. A critical limitation of such H-res mapping is the need to acquire a large scene composed of multiple flight lines and mosaic them together. This results in the same scene components (e.g., roads, buildings, green space and water) exhibiting different temperatures in different flight lines. To mitigate these effects, linear relative radiometric normalization (RRN) techniques are often applied. However, the Earth’s surface is composed of features whose thermal behaviour is characterized by complexity and non-linearity. Therefore, we hypothesize that non-linear RRN techniques should demonstrate increased radiometric agreement over similar linear techniques. To test this hypothesis, this paper evaluates four (linear and non-linear) RRN techniques: (i) histogram matching (HM); (ii) pseudo-invariant feature-based polynomial regression (PIF_Poly); (iii) no-change stratified random sample-based linear regression (NCSRS_Lin); and (iv) no-change stratified random sample-based polynomial regression (NCSRS_Poly); two of which (ii and iv) are newly proposed non-linear techniques. When applied over two adjacent flight lines (~70 km²) of TABI-1800 airborne data, visual and statistical results show that both new non-linear techniques improved radiometric agreement over the previously evaluated linear techniques, with the new fully-automated method, NCSRS-based polynomial regression, providing the highest improvement in radiometric agreement between the master and the slave images, at ~56%. This is ~5% higher than the best previously evaluated linear technique (NCSRS-based linear regression).
Boosted regression trees, multivariate adaptive regression splines and their two-step combinations with multiple linear regression or partial least squares to predict blood-brain barrier passage: a case study.

Science.gov (United States)

Deconinck, E; Zhang, M H; Petitet, F; Dubus, E; Ijjaali, I; Coomans, D; Vander Heyden, Y

2008-02-18

The use of some unconventional non-linear modeling techniques, i.e. classification and regression trees and multivariate adaptive regression splines-based methods, was explored to model the blood-brain barrier (BBB) passage of drugs and drug-like molecules. The data set contains BBB passage values for 299 structurally and pharmacologically diverse drugs, originating from a structured knowledge-based database. Models were built using boosted regression trees (BRT) and multivariate adaptive regression splines (MARS), as well as their respective combinations with stepwise multiple linear regression (MLR) and partial least squares (PLS) regression in two-step approaches. The best models were obtained using combinations of MARS with either stepwise MLR or PLS. It could be concluded that the use of combinations of a linear with a non-linear modeling technique results in some improved properties compared to the individual linear and non-linear models and that, when the use of such a combination is appropriate, combinations using MARS as the non-linear technique should be preferred over those with BRT, due to some serious drawbacks of the BRT approaches.
Testing Heteroscedasticity in Robust Regression

Czech Academy of Sciences Publication Activity Database

Kalina, Jan

2011-01-01

Vol. 1, No. 4 (2011), pp. 25-28 ISSN 2045-3345 Grant - others: GA ČR (CZ) GA402/09/0557 Institutional research plan: CEZ:AV0Z10300504 Keywords: robust regression * heteroscedasticity * regression quantiles * diagnostics Subject RIV: BB - Applied Statistics, Operational Research http://www.researchjournals.co.uk/documents/Vol4/06%20Kalina.pdf
Variances of the critical point of a quadratic regression equation

Directory of Open Access Journals (Sweden)

Ceile Cristina Ferreira Nunes

2004-04-01

...critical point calculated using the expression that takes the covariance into account gives more satisfactory results and does not follow a normal distribution, since it shows a frequency distribution with positive skewness and a leptokurtic shape. The aim of this paper is to determine variances for the analysis of the critical point of a second-degree regression equation in experimental situations with different variances, through Monte Carlo simulation. In many theoretical or applied studies, one finds situations involving ratios of random variables, most frequently normal variables. Examples are provided by variables that appear in research on the economic dose of nutrients in fertilization experiments, as well as in other problems in which the random variable of interest is the estimator of the critical point of the regression. Data from 536 cotton yield trials were used to study the distribution of the critical point of a quadratic regression equation by fitting a quadratic model. The parameters were estimated by the least squares method. From the estimates, a MATLAB routine was implemented to simulate two sets of five thousand random errors with normal distribution and zero mean for each of the theoretical variances: 0.1; 0.5; 1; 5; 10; 15; 20 and 50. The variance of the critical point was estimated by three methods: (a) the usual formula for the variance; (b) a formula obtained by differentiation of the critical point estimator; and (c) a formula for the variance of a quotient that takes into consideration the covariance between its terms. The averages obtained for the regression estimates, as well as their respective variances under the several theoretical residual variances adopted, show that the theoretical values are close to the real ones. Moreover, they tend to increase as the theoretical variance increases. It may
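As a rough illustration of the kind of calculation described in this abstract, the sketch below fits a quadratic by least squares and approximates the variance of the critical point by differentiating the estimator (the delta method), including the covariance between the linear and quadratic coefficients. It assumes the model y = b0 + b1*x + b2*x^2 with critical point -b1/(2*b2); the function name, the simulated data and the noise level are hypothetical and are not taken from the paper, whose own routine was written in MATLAB.

```python
import numpy as np

def critical_point_with_variance(x, y):
    """Fit y = b0 + b1*x + b2*x**2 and return the critical point
    -b1/(2*b2) together with a delta-method variance estimate."""
    X = np.column_stack([np.ones_like(x), x, x**2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)            # covariance of the estimates
    b1, b2 = beta[1], beta[2]
    xc = -b1 / (2.0 * b2)
    # Gradient of xc with respect to (b1, b2)
    grad = np.array([-1.0 / (2.0 * b2), b1 / (2.0 * b2**2)])
    # Quadratic form uses var(b1), var(b2) and cov(b1, b2)
    var_xc = grad @ cov[1:, 1:] @ grad
    return xc, var_xc

# Simulated example (assumed values): true critical point is 3 / (2 * 0.25) = 6
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 2.0 + 3.0 * x - 0.25 * x**2 + rng.normal(0.0, 0.5, x.size)
xc, var_xc = critical_point_with_variance(x, y)
print(f"critical point: {xc:.3f}, approximate variance: {var_xc:.5f}")
```

Repeating the simulation for several residual variances and comparing the empirical spread of the estimates with this approximation would mirror, in simplified form, the Monte Carlo comparison the abstract describes.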
Recognition of NEMP and LEMP signals based on auto-regression model and artificial neutral network

International Nuclear Information System (INIS)

Li Peng; Song Lijun; Han Chao; Zheng Yi; Cao Baofeng; Li Xiaoqiang; Zhang Xueqin; Liang Rui

2010-01-01

The auto-regression (AR) model, a power spectrum estimation method for stationary random signals, and an artificial neural network were adopted to recognize nuclear and lightning electromagnetic pulses. The self-correlation function and Burg algorithms were used to acquire the AR model coefficients as eigenvalues, and a BP artificial neural network was introduced as the classifier with different numbers of hidden layers and hidden layer nodes. The results show that the AR model is effective for feature extraction from those signals, and that the Burg algorithm is more effective than the self-correlation function algorithm. (authors)
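To make the feature-extraction step concrete, here is a minimal sketch of AR coefficient estimation by the autocorrelation (Yule-Walker) method, which corresponds to the "self-correlation function" approach mentioned above. It is not the authors' code, it does not implement the Burg algorithm, and the simulated AR(2) signal and the function name are assumptions chosen for illustration.

```python
import numpy as np

def ar_yule_walker(x, order):
    """Estimate AR coefficients from sample autocovariances (Yule-Walker)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    # Biased sample autocovariances r[0..order]
    r = np.array([x[: n - k] @ x[k:] / n for k in range(order + 1)])
    # Toeplitz system R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    noise_var = r[0] - a @ r[1:]
    return a, noise_var

# Simulated AR(2) signal: x[t] = 0.6 x[t-1] - 0.3 x[t-2] + e[t]
rng = np.random.default_rng(1)
n = 20000
e = rng.normal(size=n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + e[t]

coeffs, noise_var = ar_yule_walker(x, order=2)
print("AR coefficients:", coeffs, "noise variance:", noise_var)
```

In a recognition pipeline of the kind described, a coefficient vector like `coeffs` would be computed for each recorded pulse and passed to the classifier as its feature vector.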
Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features.

Science.gov (United States)

Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

2018-03-29

The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial basis kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification with CCR of 75% and 70%, respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin.
Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features

Directory of Open Access Journals (Sweden)

Mohammadmehdi Saberioon

2018-03-01

The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial basis kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification with CCR of 75% and 70%, respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet’s effects on fish skin.
Spontaneous regression of a congenital melanocytic nevus

Directory of Open Access Journals (Sweden)

Amiya Kumar Nath

2011-01-01

Congenital melanocytic nevus (CMN) may rarely regress, which may also be associated with a halo or vitiligo. We describe a 10-year-old girl who presented with CMN on the left leg since birth, which recently started to regress spontaneously with associated depigmentation in the lesion and at a distant site. Dermoscopy performed at different sites of the regressing lesion demonstrated loss of epidermal pigments first, followed by loss of dermal pigments. Histopathology and Masson-Fontana stain demonstrated lymphocytic infiltration and loss of pigment production in the regressing area. Immunohistochemistry staining (S100 and HMB-45), however, showed that nevus cells were present in the regressing areas.
Regression Analysis by Example. 5th Edition

Science.gov (United States)

Chatterjee, Samprit; Hadi, Ali S.

2012-01-01

Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Gaussian process regression analysis for functional data

CERN Document Server

Shi, Jian Qing

2011-01-01

Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically the methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables. Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high dime
Is past life regression therapy ethical?

Science.gov (United States)

Andrade, Gabriel

2017-01-01

Past life regression therapy is used by some physicians in cases of certain mental diseases. Anxiety disorders, mood disorders, and gender dysphoria have all been treated using past life regression therapy by some doctors on the assumption that they reflect problems in past lives. Although it is not supported by psychiatric associations, few medical associations have actually condemned it as unethical. In this article, I argue that past life regression therapy is unethical for two basic reasons. First, it is not evidence-based. Past life regression is based on the reincarnation hypothesis, but this hypothesis is not supported by evidence, and in fact, it faces some insurmountable conceptual problems. If patients are not fully informed about these problems, they cannot provide informed consent, and hence, the principle of autonomy is violated. Second, past life regression therapy carries a great risk of implanting false memories in patients, and thus of causing significant harm. This is a violation of the principle of non-maleficence, which is surely the most important principle in medical ethics.

Regression Models for Market-Shares

DEFF Research Database (Denmark)

Birch, Kristina; Olsen, Jørgen Kai; Tjur, Tue

2005-01-01

On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put on the interpretation of the parameters in relation to models for the total sales based on discrete choice models. Key words and phrases: MCI model, discrete choice model, market-shares, price elasticity, regression model.
The Crash Intensity Evaluation Using General Centrality Criterions and a Geographically Weighted Regression

Science.gov (United States)

Ghadiriyan Arani, M.; Pahlavani, P.; Effati, M.; Noori Alamooti, F.

2017-09-01

Road traffic crashes, especially those on highways, are a social problem that affects the lives of many people. This paper focuses on the highways of Atlanta, the capital and most populous city of the U.S. state of Georgia and the ninth largest metropolitan area in the United States. The analysis combines geographically weighted regression with general centrality criteria. In the first step, to estimate crash intensity, the dual graph is extracted from the street and highway network so that the general centrality criteria can be computed. The criteria computed on this graph are: Degree, PageRank, Random walk, Eccentricity, Closeness, Betweenness, Clustering coefficient, Eigenvector, and Straightness. The crash intensity of each highway is calculated by dividing the number of crashes on that highway by the total number of crashes. Then, the criteria and crash intensities were normalized and the correlation between them was calculated to determine the criteria that are not dependent on each other. The proposed hybrid approach suits regression problems because these effective measures lead to a more desirable output. The R2 value for geographically weighted regression was 0.539 using the Gaussian kernel and 0.684 using the tricube kernel. The results showed that the tricube kernel is better for modeling the crash intensity.
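The kernel comparison reported in this abstract can be pictured with a small sketch of geographically weighted regression: at each location a weighted least squares fit is computed, with weights that decay with distance. The kernel formulas below are the usual textbook Gaussian and tricube forms, assumed here rather than taken from the paper; the function names, the bandwidth and the synthetic data are hypothetical.

```python
import numpy as np

def gaussian_kernel(d, bandwidth):
    """Gaussian distance-decay weights."""
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def tricube_kernel(d, bandwidth):
    """Tricube weights: positive inside the bandwidth, zero outside."""
    u = np.clip(d / bandwidth, 0.0, 1.0)
    return (1.0 - u**3) ** 3

def gwr_fit(coords, X, y, bandwidth, kernel):
    """Return one local coefficient vector (intercept first) per location."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = kernel(d, bandwidth)
        XtW = Xd.T * w                      # weight each observation
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas

# Synthetic check: a spatially constant relation y = 1 + 2 x
rng = np.random.default_rng(2)
coords = rng.uniform(0.0, 1.0, size=(100, 2))
x = rng.uniform(0.0, 1.0, size=100)        # e.g. a normalized centrality score
y = 1.0 + 2.0 * x                          # e.g. a normalized crash intensity
betas_gauss = gwr_fit(coords, x, y, bandwidth=0.5, kernel=gaussian_kernel)
betas_tri = gwr_fit(coords, x, y, bandwidth=0.5, kernel=tricube_kernel)
print(betas_gauss[:3], betas_tri[:3])
```

With real data the local coefficients vary across space, and the two kernels can be compared by a goodness-of-fit measure such as R2, which is the comparison the abstract reports.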
Fast image interpolation via random forests.

Science.gov (United States)

Huang, Jun-Jie; Siu, Wan-Chi; Liu, Tian-Rui

2015-10-01

This paper proposes a two-stage framework for fast image interpolation via random forests (FIRF). The proposed FIRF method gives high accuracy while requiring little computation. The underlying idea of this work is to apply random forests to classify the natural image patch space into numerous subspaces and learn a linear regression model for each subspace to map the low-resolution image patch to a high-resolution image patch. The FIRF framework consists of two stages. Stage 1 of the framework removes most of the ringing and aliasing artifacts in the initial bicubic interpolated image, while Stage 2 further refines the Stage 1 interpolated image. By varying the number of decision trees in the random forests and the number of stages applied, the proposed FIRF method can realize computationally scalable image interpolation. Extensive experimental results show that the proposed FIRF(3, 2) method achieves more than 0.3 dB improvement in peak signal-to-noise ratio over the state-of-the-art nonlocal autoregressive modeling (NARM) method. Moreover, the proposed FIRF(1, 1) obtains results similar to or better than NARM while taking only 0.3% of its computational time.
Detection of epistatic effects with logic regression and a classical linear regression model.

Science.gov (United States)

Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata

2014-02-01

To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, methods such as Cockerham's model are usually applied. Within this framework, interactions are understood as the part of the joint effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (such as disease) is caused by Boolean combinations of genotypes of several QTLs, Cockerham's approach is often not capable of identifying them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though a larger number of models has to be considered with the logic regression approach (requiring more stringent multiple testing correction), the efficient representation of higher order logic interactions in logic regression models leads to a significant increase of power to detect such interactions as compared to Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with a simulation study and real data analysis.
A Comparison of Regression Techniques for Estimation of Above-Ground Winter Wheat Biomass Using Near-Surface Spectroscopy

Directory of Open Access Journals (Sweden)

Jibo Yue

2018-01-01

Above-ground biomass (AGB) provides a vital link between solar energy consumption and yield, so its correct estimation is crucial to accurately monitor crop growth and predict yield. In this work, we estimate AGB by using 54 vegetation indexes (e.g., Normalized Difference Vegetation Index, Soil-Adjusted Vegetation Index) and eight statistical regression techniques: artificial neural network (ANN), multivariable linear regression (MLR), decision-tree regression (DT), boosted binary regression tree (BBRT), partial least squares regression (PLSR), random forest regression (RF), support vector machine regression (SVM), and principal component regression (PCR), which are used to analyze hyperspectral data acquired by using a field spectrophotometer. The vegetation indexes (VIs) determined from the spectra were first used to train regression techniques for modeling and validation to select the best VI input, and then summed with white Gaussian noise to study how remote sensing errors affect the regression techniques. Next, the VIs were divided into groups of different sizes by using various sampling methods for modeling and validation to test the stability of the techniques. Finally, the AGB was estimated by using a leave-one-out cross validation with these powerful techniques. The results of the study demonstrate that, of the eight techniques investigated, PLSR and MLR perform best in terms of stability and are most suitable when high-accuracy and stable estimates are required from relatively few samples. In addition, RF is extremely robust against noise and is best suited to deal with repeated observations involving remote-sensing data (i.e., data affected by atmosphere, clouds, observation times, and/or sensor noise). Finally, the leave-one-out cross-validation method indicates that PLSR provides the highest accuracy (R2 = 0.89, RMSE = 1.20 t/ha, MAE = 0.90 t/ha, NRMSE = 0.07, CV(RMSE) = 0.18); thus, PLSR is best suited for works requiring high
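The leave-one-out evaluation mentioned in this abstract can be sketched as follows for the simplest of the listed techniques, multivariable linear regression. This is an illustrative outline only: the synthetic predictor and response stand in for vegetation indexes and biomass, the function name is hypothetical, and normalizing RMSE by the observed range (NRMSE) and by the observed mean (CV(RMSE)) are assumed definitions that the abstract does not spell out.

```python
import numpy as np

def loocv_linear(X, y):
    """Leave-one-out cross-validation of an ordinary least squares model."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i            # hold out sample i
        beta, *_ = np.linalg.lstsq(Xd[keep], y[keep], rcond=None)
        preds[i] = Xd[i] @ beta
    err = y - preds
    rmse = np.sqrt(np.mean(err**2))
    return {
        "R2": 1.0 - np.sum(err**2) / np.sum((y - y.mean()) ** 2),
        "RMSE": rmse,
        "MAE": np.mean(np.abs(err)),
        "NRMSE": rmse / (y.max() - y.min()),   # assumed: range-normalized
        "CV_RMSE": rmse / y.mean(),            # assumed: mean-normalized
    }

# Synthetic stand-in for one vegetation index and measured biomass
rng = np.random.default_rng(3)
vi = rng.uniform(0.2, 0.9, size=60)
agb = 3.0 + 2.0 * vi + rng.normal(0.0, 0.05, size=60)
metrics = loocv_linear(vi, agb)
print(metrics)
```

Swapping the least squares fit inside the loop for another estimator would give the same held-out metrics for any of the other techniques being compared.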
Prediction of hourly PM2.5 using a space-time support vector regression model

Science.gov (United States)

Yang, Wentao; Deng, Min; Xu, Feng; Wang, Hang

2018-05-01

Real-time air quality prediction has been an active field of research in atmospheric environmental science. The existing methods of machine learning are widely used to predict pollutant concentrations because of their enhanced ability to handle complex non-linear relationships. However, because pollutant concentration data, as typical geospatial data, also exhibit spatial heterogeneity and spatial dependence, they may violate the assumptions of independent and identically distributed random variables in most of the machine learning methods. As a result, a space-time support vector regression model is proposed to predict hourly PM2.5 concentrations. First, to address spatial heterogeneity, spatial clustering is executed to divide the study area into several homogeneous or quasi-homogeneous subareas. To handle spatial dependence, a Gauss vector weight function is then developed to determine spatial autocorrelation variables as part of the input features. Finally, a local support vector regression model with spatial autocorrelation variables is established for each subarea. Experimental data on PM2.5 concentrations in Beijing are used to verify whether the results of the proposed model are superior to those of other methods.
Poisson Mixture Regression Models for Heart Disease Prediction.

Science.gov (United States)

Mufudza, Chipo; Erol, Hamza

2016-01-01

Early heart disease control can be achieved through efficient disease prediction and diagnosis. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is addressed here under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model, owing to its low Bayesian Information Criterion value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart disease prediction over all models, as it both clusters individuals into a high or low risk category and predicts the rate of heart disease componentwise given the clusters available. It is deduced that heart disease prediction can be done effectively by identifying the major risks componentwise using a Poisson mixture regression model.
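The model comparison by Bayesian Information Criterion described above can be illustrated with a deliberately reduced example: a two-component Poisson mixture without covariates, fitted by EM and compared against a single Poisson. This is a simplification of the mixture regression models in the abstract (no covariates, no concomitant variables, no zero inflation), and the simulated counts, starting values and function names are assumptions.

```python
import numpy as np
from math import lgamma

def poisson_logpmf(y, lam):
    """Log of the Poisson probability mass function, elementwise."""
    log_fact = np.array([lgamma(v + 1.0) for v in y])
    return y * np.log(lam) - lam - log_fact

def fit_poisson_mixture(y, n_iter=200):
    """EM for a two-component Poisson mixture; returns weights, rates, loglik."""
    y = np.asarray(y, dtype=float)
    lam = np.array([0.5 * y.mean(), 1.5 * y.mean()])   # crude starting rates
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each count
        log_r = np.log(pi) + np.column_stack([poisson_logpmf(y, l) for l in lam])
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and component rates
        pi = resp.mean(axis=0)
        lam = (resp * y[:, None]).sum(axis=0) / resp.sum(axis=0)
    log_comp = np.log(pi) + np.column_stack([poisson_logpmf(y, l) for l in lam])
    m = log_comp.max(axis=1)
    loglik = np.sum(m + np.log(np.exp(log_comp - m[:, None]).sum(axis=1)))
    return pi, lam, loglik

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion; lower is better."""
    return n_params * np.log(n_obs) - 2.0 * loglik

# Simulated counts from a low-rate and a high-rate group
rng = np.random.default_rng(5)
y = np.concatenate([rng.poisson(2.0, 600), rng.poisson(12.0, 400)]).astype(float)

loglik_single = np.sum(poisson_logpmf(y, y.mean()))
pi, lam, loglik_mix = fit_poisson_mixture(y)
bic_single = bic(loglik_single, 1, y.size)   # one rate
bic_mix = bic(loglik_mix, 3, y.size)         # two rates + one free weight
print("rates:", lam, "BIC single:", bic_single, "BIC mixture:", bic_mix)
```

The responsibilities computed in the E-step are what allow each individual to be assigned to a high or low rate cluster, which is the clustering role the abstract attributes to the mixture components.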
Poisson Mixture Regression Models for Heart Disease Prediction

Science.gov (United States)

Erol, Hamza

2016-01-01

Early heart disease control can be achieved through efficient disease prediction and diagnosis. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is addressed here under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model, owing to its low Bayesian Information Criterion value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart disease prediction over all models, as it both clusters individuals into a high or low risk category and predicts the rate of heart disease componentwise given the clusters available. It is deduced that heart disease prediction can be done effectively by identifying the major risks componentwise using a Poisson mixture regression model. PMID:27999611
Effect of sequence of insemination after simultaneous thawing of multiple semen straws on conception rate to timed AI in suckled multiparous Nelore cows.

Science.gov (United States)

Oliveira, L Z; Arruda, R P; de Andrade, A F C; Santos, R M; Beletti, M E; Peres, R F G; Martins, J P N; de Lima, V F M Hossepian

2012-11-01

The objective was to determine the effect of sequence of insemination after simultaneous thawing of multiple 0.5 mL semen straws on conception rate in suckled multiparous Nelore cows. The effect of this thawing procedure on in vitro sperm characteristics was also evaluated. All cows (N = 944) received the same timed AI protocol. Ten straws (0.5 mL) of frozen semen from the same batch were simultaneously thawed at 36 °C, for a minimum of 30 sec. One straw per cow was used for timed AI. Frozen semen from three Angus bulls was used. Timed AI records included sequence of insemination (first to tenth) and time of semen removal from thawing bath. For laboratory analyses, the same semen batches used in the field experiment were evaluated. Ten frozen straws from the same batch were thawed simultaneously in a thawing unit identical to that used in the field experiment. The following sperm characteristics were analyzed: sperm motility parameters, sperm thermal resistance, plasma and acrosomal membrane integrity, lipid peroxidation, chromatin structure, and sperm morphometry. Based on logistic regression, there were no significant effects of breeding group, body condition score, AI technician, and sire on conception rate, but there was an interaction between sire and straw group (P = 0.002). Semen from only one bull had decreased (P conception rates at timed AI, depending on the sire used. Nevertheless, the effects of this thawing environment on in vitro sperm characteristics, remain to be further investigated. Copyright © 2012 Elsevier Inc. All rights reserved.
Regression analysis using dependent Polya trees.

Science.gov (United States)

Schörgendorfer, Angela; Branscum, Adam J

2013-11-30

Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
Early regression of severe left ventricular hypertrophy after transcatheter aortic valve replacement is associated with decreased hospitalizations.

Science.gov (United States)

Lindman, Brian R; Stewart, William J; Pibarot, Philippe; Hahn, Rebecca T; Otto, Catherine M; Xu, Ke; Devereux, Richard B; Weissman, Neil J; Enriquez-Sarano, Maurice; Szeto, Wilson Y; Makkar, Raj; Miller, D Craig; Lerakis, Stamatios; Kapadia, Samir; Bowers, Bruce; Greason, Kevin L; McAndrew, Thomas C; Lei, Yang; Leon, Martin B; Douglas, Pamela S

2014-06-01

This study sought to examine the relationship between left ventricular mass (LVM) regression and clinical outcomes after transcatheter aortic valve replacement (TAVR). LVM regression after valve replacement for aortic stenosis is assumed to be a favorable effect of LV unloading, but its relationship to improved clinical outcomes is unclear. Of 2,115 patients with symptomatic aortic stenosis at high surgical risk receiving TAVR in the PARTNER (Placement of Aortic Transcatheter Valves) randomized trial or continued access registry, 690 had both severe LV hypertrophy (left ventricular mass index [LVMi] ≥ 149 g/m² men, ≥ 122 g/m² women) at baseline and an LVMi measurement at 30-day post-TAVR follow-up. Clinical outcomes were compared for patients with greater than versus lesser than median percentage change in LVMi between baseline and 30 days using Cox proportional hazard models to evaluate event rates from 30 to 365 days. Compared with patients with lesser regression, patients with greater LVMi regression had a similar rate of all-cause mortality (14.1% vs. 14.3%, p = 0.99), but a lower rate of rehospitalization (9.5% vs. 18.5%, hazard ratio [HR]: 0.50, 95% confidence interval [CI]: 0.32 to 0.78; p = 0.002) and a lower rate of rehospitalization specifically for heart failure (7.3% vs. 13.6%, p = 0.01). The association with a lower rate of rehospitalization was consistent across subgroups and remained significant after multivariable adjustment (HR: 0.53, 95% CI: 0.34 to 0.84; p = 0.007). Patients with greater LVMi regression had lower B-type natriuretic peptide (p = 0.002) and a trend toward better quality of life (p = 0.06) at 1-year follow-up than did those with lesser regression. In high-risk patients with severe aortic stenosis and severe LV hypertrophy undergoing TAVR, those with greater early LVM regression had one-half the rate of rehospitalization over the subsequent year compared to those with lesser regression. Copyright © 2014 American College of
Applied Regression Modeling A Business Approach

CERN Document Server

Pardoe, Iain

2012-01-01

An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculus. Regression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a
Verification of helical tomotherapy delivery using autoassociative kernel regression

International Nuclear Information System (INIS)

Seibert, Rebecca M.; Ramsey, Chester R.; Garvey, Dustin R.; Wesley Hines, J.; Robison, Ben H.; Outten, Samuel S.

2007-01-01

Quality assurance (QA) is a topic of major concern in the field of intensity modulated radiation therapy (IMRT). The standard of practice for IMRT is to perform QA testing for individual patients to verify that the dose distribution will be delivered to the patient. The purpose of this study was to develop a new technique that could eventually be used to automatically evaluate helical tomotherapy treatments during delivery using exit detector data. This technique uses an autoassociative kernel regression (AAKR) model to detect errors in tomotherapy delivery. AAKR is a novel nonparametric model that is known to predict a group of correct sensor values when supplied a group of sensor values that is usually corrupted or contains faults such as machine failure. This modeling scheme is especially suited for the problem of monitoring the fluence values found in the exit detector data because it is able to learn the complex detector data relationships. This scheme still applies when detector data are summed over many frames with a low temporal resolution and a variable beam attenuation resulting from patient movement. Delivery sequences from three archived patients (prostate, lung, and head and neck) were used in this study. Each delivery sequence was modified by reducing the opening time for random individual multileaf collimator (MLC) leaves by random amounts. The error and error-free treatments were delivered with different phantoms in the path of the beam. Multiple autoassociative kernel regression (AAKR) models were developed and tested by the investigators using combinations of the stored exit detector data sets from each delivery. The models proved robust and were able to predict the correct or error-free values for a projection, which had a single MLC leaf decrease its opening time by less than 10 msec. The model also was able to determine machine output errors. The average uncertainty value for the unfaulted projections ranged from 0.4% to 1.8% of the detector
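The core idea of autoassociative kernel regression, as described above, is that a possibly corrupted vector of sensor readings is compared with a memory of fault-free vectors, and the prediction is a kernel-weighted average of those memory vectors. The sketch below shows that idea with a Gaussian kernel on Euclidean distance; the three synthetic channels, the bandwidth and the function name are assumptions for illustration and do not reproduce the tomotherapy detector data or the investigators' models.

```python
import numpy as np

def aakr_predict(memory, query, bandwidth):
    """Predict the fault-free version of `query` as a kernel-weighted
    average of the fault-free exemplar vectors stored in `memory`."""
    d = np.linalg.norm(memory - query, axis=1)
    w = np.exp(-(d**2) / (2.0 * bandwidth**2))
    return w @ memory / w.sum()

# Memory of fault-free observations for three correlated channels
s = np.linspace(0.0, 1.0, 51)
memory = np.column_stack([s, 1.0 - s, s])

clean = memory[25].copy()                      # (0.5, 0.5, 0.5)
faulted = clean + np.array([0.0, 0.0, 0.3])    # third channel reads too high

pred_clean = aakr_predict(memory, clean, bandwidth=0.05)
pred_faulted = aakr_predict(memory, faulted, bandwidth=0.05)
residual = faulted - pred_faulted              # large residual flags a fault
print("clean prediction:", pred_clean)
print("residual for faulted query:", residual)
```

Monitoring then amounts to comparing each residual with a threshold derived from the residuals seen on fault-free data; part of the fault also leaks into the other channels' residuals, which is why the threshold and bandwidth need care.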
Regression of environmental noise in LIGO data

International Nuclear Information System (INIS)

Tiwari, V; Klimenko, S; Mitselmakher, G; Necula, V; Drago, M; Prodi, G; Frolov, V; Yakushin, I; Re, V; Salemi, F; Vedovato, G

2015-01-01

We address the problem of noise regression in the output of gravitational-wave (GW) interferometers, using data from the physical environmental monitors (PEM). The objective of the regression analysis is to predict environmental noise in the GW channel from the PEM measurements. One of the most promising regression methods is based on the construction of Wiener–Kolmogorov (WK) filters. Using this method, the seismic noise cancellation from the LIGO GW channel has already been performed. In the presented approach the WK method has been extended, incorporating banks of Wiener filters in the time–frequency domain, multi-channel analysis and regulation schemes, which greatly enhance the versatility of the regression analysis. Also we present the first results on regression of the bi-coherent noise in the LIGO data. (paper)
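A much-reduced version of the regression idea in this abstract is a single-channel, time-domain Wiener filter: a finite impulse response filter is fitted by least squares so that the filtered witness (monitor) channel best predicts the target channel, and the prediction is then subtracted. The sketch below assumes one witness channel, a short causal filter and synthetic data; it does not include the time-frequency filter banks or multi-channel analysis described by the authors, and all names are hypothetical.

```python
import numpy as np

def fir_wiener_regress(target, witness, n_taps):
    """Least-squares FIR filter from witness to target; returns the
    cleaned target (residual) and the estimated filter taps."""
    n = len(target)
    # Column k holds the witness channel delayed by k samples
    W = np.column_stack(
        [np.concatenate([np.zeros(k), witness[: n - k]]) for k in range(n_taps)]
    )
    h, *_ = np.linalg.lstsq(W, target, rcond=None)
    return target - W @ h, h

# Synthetic example: a weak signal buried in noise coupled from the witness
rng = np.random.default_rng(6)
n = 5000
witness = rng.normal(size=n)
true_h = np.array([0.8, -0.4, 0.2])                 # assumed coupling
coupled_noise = np.convolve(witness, true_h)[:n]
signal = 0.1 * np.sin(2.0 * np.pi * np.arange(n) / 200.0)
target = signal + coupled_noise

cleaned, h = fir_wiener_regress(target, witness, n_taps=3)
print("estimated taps:", h)
print("variance before:", target.var(), "after:", cleaned.var())
```

Because the filter is fitted on the same data it cleans, a real analysis has to guard against subtracting part of the signal itself, which is one motivation for regularization when many witness channels are used.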
Simulation-based production planning for engineer-to-order systems with random yield

NARCIS (Netherlands)

Akcay, Alp; Martagan, Tugce

2018-01-01

We consider an engineer-to-order production system with unknown yield. We model the yield as a random variable which represents the percentage output obtained from one unit of production quantity. We develop a beta-regression model in which the mean value of the yield depends on the unique
Elevated serum urate is a potential factor in reduction of total bilirubin: a Mendelian randomization study

Science.gov (United States)

Zhang, Hui; Liu, Jing; Dong, Zheng; Ding, Yue; Qian, Qiaoxia; Zhou, Jingru; Ma, Yanyun; Mei, Zhendong; Chen, Xiangxiang; Li, Yuan; Yuan, Ziyu; Zhang, Juan; Yang, Yajun; Chen, Xingdong; Jin, Li; Zou, Hejian; Wang, Xiaofeng; Wang, Jiucun

2017-01-01

Aim A Mendelian randomization study (MRS) can be linked to a “natural” randomized controlled trial in order to avoid potential bias of observational epidemiology. We aimed to study the possible association between serum urate (SU) and total bilirubin (TBIL) using MRS. Materials and Methods An observational epidemiological study using ordinary least squares (OLS) regression and MRS using two-stage least square (TLS) regression was conducted to assess the effect of SU on TBIL. The comparison between the OLS regression and the TLS regression was analyzed by the Durbin-Hausman test. If the p value is significant, it suggests that the OLS regression cannot evaluate the relationship between exposure and outcome, and the TLS regression is precise; while if the p value is not significant, there would be no significant difference between the two regressions. Results A total of 3,753 subjects were analyzed. In OLS regression, there was no significant association between SU and TBIL in all subjects and subgroup analysis (all p > 0.05). However, MRS revealed a negative correlation between SU and TBIL after adjustment for confounders (beta = –0.021, p = 0.010). Further analysis was conducted in different SU subgroups, and results show that elevated SU was associated with a significant reduction in TBIL after adjustment for hyperuricemic subjects (beta = –0.053, p = 0.027). In addition, the results using the Durbin-Hausman test further confirmed a negative effect of SU on TBIL (p = 0.002 and 0.010, respectively). Conclusions This research shows for the first time that elevated SU was a potential causal factor in the reduction of TBIL and it provides strong evidence to resolve the controversial association between SU and TBIL. PMID:29262606
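The contrast between OLS and two-stage least squares described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are simulated (not the study's), there is a single genotype-like instrument `z`, an exposure `x`, an outcome `y`, an unobserved confounder `u`, and the helper names are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def ols_slope_intercept(x, y):
    """Simple least-squares fit of y on x; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def two_stage_least_squares(z, x, y):
    """2SLS with one instrument z for one exposure x."""
    # Stage 1: regress the exposure on the instrument and keep fitted values.
    gamma, alpha = ols_slope_intercept(z, x)
    x_hat = [alpha + gamma * zi for zi in z]
    # Stage 2: regress the outcome on the fitted exposure.
    beta, _ = ols_slope_intercept(x_hat, y)
    return beta

# Simulated data (assumption): the confounder u biases OLS but not 2SLS.
rng = random.Random(0)
n = 5000
true_effect = -0.5
z = [rng.choice([0, 1, 2]) for _ in range(n)]          # instrument
u = [rng.gauss(0, 1) for _ in range(n)]                # unobserved confounder
x = [0.8 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + 2.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

ols_estimate, _ = ols_slope_intercept(x, y)
tsls_estimate = two_stage_least_squares(z, x, y)
print("OLS estimate:", round(ols_estimate, 3))
print("2SLS estimate:", round(tsls_estimate, 3))
```

In this simulated setting the OLS slope is pulled away from the true effect by the confounder, while the 2SLS slope stays close to it. That is the kind of discrepancy a Durbin-Hausman comparison is meant to detect.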
Fasting Glucose and the Risk of Depressive Symptoms: Instrumental-Variable Regression in the Cardiovascular Risk in Young Finns Study.

Science.gov (United States)

Wesołowska, Karolina; Elovainio, Marko; Hintsa, Taina; Jokela, Markus; Pulkki-Råback, Laura; Pitkänen, Niina; Lipsanen, Jari; Tukiainen, Janne; Lyytikäinen, Leo-Pekka; Lehtimäki, Terho; Juonala, Markus; Raitakari, Olli; Keltikangas-Järvinen, Liisa

2017-12-01

Type 2 diabetes (T2D) has been associated with depressive symptoms, but the causal direction of this association and the underlying mechanisms, such as increased glucose levels, remain unclear. We used instrumental-variable regression with a genetic instrument (Mendelian randomization) to examine a causal role of increased glucose concentrations in the development of depressive symptoms. Data were from the population-based Cardiovascular Risk in Young Finns Study (n = 1217). Depressive symptoms were assessed in 2012 using a modified Beck Depression Inventory (BDI-I). Fasting glucose was measured concurrently with depressive symptoms. A genetic risk score for fasting glucose (with 35 single nucleotide polymorphisms) was used as an instrumental variable for glucose. Glucose was not associated with depressive symptoms in the standard linear regression (B = -0.04, 95% CI [-0.12, 0.04], p = .34), but the instrumental-variable regression showed an inverse association between glucose and depressive symptoms (B = -0.43, 95% CI [-0.79, -0.07], p = .020). The difference between the estimates of standard linear regression and instrumental-variable regression was significant (p = .026). CONCLUSION: Our results suggest that the association between T2D and depressive symptoms is unlikely to be caused by increased glucose concentrations. It seems possible that T2D might be linked to depressive symptoms due to low glucose levels.
Assessment of wastewater treatment facility compliance with decreasing ammonia discharge limits using a regression tree model.

Science.gov (United States)

Suchetana, Bihu; Rajagopalan, Balaji; Silverstein, JoAnn

2017-11-15

A regression tree-based diagnostic approach is developed to evaluate factors affecting US wastewater treatment plant compliance with ammonia discharge permit limits using Discharge Monthly Report (DMR) data from a sample of 106 municipal treatment plants for the period of 2004-2008. Predictor variables used to fit the regression tree are selected using random forests, and consist of the previous month's effluent ammonia, influent flow rates and plant capacity utilization. The tree models are first used to evaluate compliance with existing ammonia discharge standards at each facility and then applied assuming more stringent discharge limits, which are under consideration in many states. The model predicts that the ability to meet both current and future limits depends primarily on the previous month's treatment performance. With more stringent discharge limits, the predicted ammonia concentration relative to the discharge limit increases. In-sample validation shows that the regression trees can provide a median classification accuracy of >70%. The regression tree model is validated using ammonia discharge data from an operating wastewater treatment plant and is able to accurately predict the observed ammonia discharge category approximately 80% of the time, indicating that the regression tree model can be applied to predict compliance for individual treatment plants, providing practical guidance for utilities and regulators with an interest in controlling ammonia discharges. The proposed methodology is also used to demonstrate how to delineate reliable sources of demand and supply in a point source-to-point source nutrient credit trading scheme, as well as how planners and decision makers can set reasonable discharge limits in the future. Copyright © 2017 Elsevier B.V. All rights reserved.
Predictive model of Amorphophallus muelleri growth in some agroforestry in East Java by multiple regression analysis

Directory of Open Access Journals (Sweden)

BUDIMAN

2012-01-01

Budiman, Arisoesilaningsih E. 2012. Predictive model of Amorphophallus muelleri growth in some agroforestry in East Java by multiple regression analysis. Biodiversitas 13: 18-22. The aim of this research was to determine multiple regression models of vegetative and corm growth of Amorphophallus muelleri Blume across age variations and habitat conditions of agroforestry in East Java. Descriptive exploratory research was conducted by systematic random sampling at five agroforestries on four plantations in East Java: Saradan, Bojonegoro, Nganjuk and Blitar. In each agroforestry, we observed A. muelleri vegetative and corm growth at four growing ages (1, 2, 3 and 4 years old, respectively) as well as environmental variables such as altitude, vegetation, climate and soil conditions. Data were analyzed using descriptive statistics to compare A. muelleri habitat in the five agroforestries. Meanwhile, the influence and contribution of each environmental variable to A. muelleri vegetative and corm growth were determined using multiple regression analysis in SPSS 17.0. The multiple regression models of A. muelleri vegetative and corm growth, generated from characteristics of the agroforestries and plant age, showed high validity with R² = 88-99%. The regression models showed that age, monthly temperatures, percentage of radiation and soil calcium (Ca) content, either simultaneously or partially, determined A. muelleri vegetative and corm growth. Based on these models, the A. muelleri corm reaches optimal growth after four years of cultivation, when it is ready to be harvested. Additionally, the soil Ca content should reach 25.3 me.hg-1, as in the Sugihwaras agroforestry, with a maximal radiation of 60%.
Logistic Regression and Path Analysis Method to Analyze Factors influencing Students’ Achievement

Science.gov (United States)

Noeryanti, N.; Suryowati, K.; Setyawan, Y.; Aulia, R. R.

2018-04-01

Students' academic achievement cannot be separated from the influence of two kinds of factors, namely internal and external factors. The internal factors consist of intelligence (X1), health (X2), interest (X3), and motivation (X4). The external factors consist of family environment (X5), school environment (X6), and society environment (X7). The subjects of this research are eighth grade students of the 2016/2017 school year at SMPN 1 Jiwan Madiun, sampled by simple random sampling. Primary data are obtained by distributing questionnaires. The method used in this study is binary logistic regression analysis, which aims to identify the internal and external factors that affect students' achievement and the direction of their effects. Path analysis was used to determine the factors that influence students' achievement directly, indirectly or in total. Based on the results of binary logistic regression, the variables that affect students' achievement are interest and motivation. Based on the results of path analysis, the factors that have a direct impact on students' achievement are students' interest (59%) and students' motivation (27%), while the factors that have indirect influences on students' achievement are family environment (97%) and school environment (37).

Genetic analysis of somatic cell score in Danish dairy cattle using random regression test-day model

DEFF Research Database (Denmark)

Elsaid, Reda; Sabry, Ayman; Lund, Mogens Sandø

2011-01-01

,233 Danish Holstein cows, were extracted from the national milk recording database. Each data set was analyzed with random regression models using AI-REML. Fixed effects in all models were age at first calving, herd test day, days carrying calf, effects of germ plasm importation (e.g. additive breed effects......) and low between the beginning and the end of lactation. The estimated environmental correlations were lower than the genetic correlations, but the trends were similar. Based on test-day records, the accuracy of genetic evaluations for SCC should be improved when the variation in heritabilities...
Forecasting with Dynamic Regression Models

CERN Document Server

Pankratz, Alan

2012-01-01

Single equation regression models, one of the most widely used tools in statistical forecasting, are examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the autocorrelation patterns of regression disturbances. It also includes six case studies.
Data processing for potentiometric precipitation titration of mixtures of isovalent ions by linear regression analysis

International Nuclear Information System (INIS)

Mar'yanov, B.M.; Shumar, S.V.; Gavrilenko, M.A.

1994-01-01

A method for the computer processing of the curves of potentiometric differential titration using precipitation reactions is developed. This method is based on transformation of the titration curve into a line of multiphase regression, whose parameters determine the equivalence points and the solubility products of the formed precipitates. The computational algorithm is tested using experimental curves for the titration of solutions containing Hg(II) and Cd(II) with a solution of sodium diethyldithiocarbamate. The random errors (RSD) for the titration of 1×10⁻⁴ M solutions are in the range of 3-6%. 7 refs.; 2 figs.; 1 tab
Estimating Loess Plateau Average Annual Precipitation with Multiple Linear Regression Kriging and Geographically Weighted Regression Kriging

Directory of Open Access Journals (Sweden)

Qiutong Jin

2016-06-01

Estimating the spatial distribution of precipitation is an important and challenging task in hydrology, climatology, ecology, and environmental science. In order to generate a highly accurate distribution map of average annual precipitation for the Loess Plateau in China, multiple linear regression Kriging (MLRK) and geographically weighted regression Kriging (GWRK) methods were employed using precipitation data from the period 1980–2010 from 435 meteorological stations. The predictors in regression Kriging were selected by stepwise regression analysis from many auxiliary environmental factors, such as elevation (DEM), normalized difference vegetation index (NDVI), solar radiation, slope, and aspect. All predictor distribution maps had a 500 m spatial resolution. Validation precipitation data from 130 hydrometeorological stations were used to assess the prediction accuracies of the MLRK and GWRK approaches. Results showed that both prediction maps with a 500 m spatial resolution interpolated by MLRK and GWRK had a high accuracy and captured detailed spatial distribution data; however, MLRK produced a lower prediction error and a higher variance explanation than GWRK, although the differences were small, in contrast to conclusions from similar studies.
Encrypted data stream identification using randomness sparse representation and fuzzy Gaussian mixture model

Science.gov (United States)

Zhang, Hong; Hou, Rui; Yi, Lei; Meng, Juan; Pan, Zhisong; Zhou, Yuhuan

2016-07-01

The accurate identification of encrypted data streams helps to regulate illegal data, detect network attacks and protect users' information. In this paper, a novel encrypted data stream identification algorithm is introduced. The proposed method is based on the randomness characteristics of encrypted data streams. We use an l1-norm regularized logistic regression to improve the sparse representation of randomness features and a Fuzzy Gaussian Mixture Model (FGMM) to improve identification accuracy. Experimental results demonstrate that the method can be adopted as an effective technique for encrypted data stream identification.
Gibrat’s law and quantile regressions

DEFF Research Database (Denmark)

Distante, Roberta; Petrella, Ivan; Santoro, Emiliano

2017-01-01

The nexus between firm growth, size and age in U.S. manufacturing is examined through the lens of quantile regression models. This methodology allows us to overcome serious shortcomings entailed by linear regression models employed by much of the existing literature, unveiling a number of important...
ON REGRESSION REPRESENTATIONS OF STOCHASTIC-PROCESSES

NARCIS (Netherlands)

RUSCHENDORF, L; DEVALK

We construct a.s. nonlinear regression representations of general stochastic processes (X_n)_{n ∈ N}. As a consequence we obtain in particular special regression representations of Markov chains and of certain m-dependent sequences. For m-dependent sequences we obtain a constructive
Introduction to the use of regression models in epidemiology.

Science.gov (United States)

Bender, Ralf

2009-01-01

Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
MANCOVA for one way classification with homogeneity of regression coefficient vectors

Science.gov (United States)

Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.

2017-11-01

MANOVA and MANCOVA are the extensions of the univariate ANOVA and ANCOVA techniques to multidimensional or vector valued observations. The assumption of a Gaussian distribution has been replaced with the multivariate Gaussian distribution for the vector data and residual term variables in the statistical models of these techniques. The objective of MANCOVA is to determine if there are statistically reliable mean differences that can be demonstrated between groups later modifying the newly created variable. When random assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting dependent variables as if all subjects scored the same on the covariates. In this research article, the MANCOVA technique is extended to a larger number of covariates, and homogeneity of regression coefficient vectors is also tested.
From Rasch scores to regression

DEFF Research Database (Denmark)

Christensen, Karl Bang

2006-01-01

Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale propertie....... This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score....
Producing The New Regressive Left

DEFF Research Database (Denmark)

Crone, Christine

members, this thesis investigates a growing political trend and ideological discourse in the Arab world that I have called The New Regressive Left. On the premise that a media outlet can function as a forum for ideology production, the thesis argues that an analysis of this material can help to trace...... the contexture of The New Regressive Left. If the first part of the thesis lays out the theoretical approach and draws the contextual framework, through an exploration of the surrounding Arab media-and ideoscapes, the second part is an analytical investigation of the discourse that permeates the programmes aired...... becomes clear from the analytical chapters is the emergence of the new cross-ideological alliance of The New Regressive Left. This emerging coalition between Shia Muslims, religious minorities, parts of the Arab Left, secular cultural producers, and the remnants of the political,strategic resistance...
Long-term results of a randomized trial in locally advanced rectal cancer: no benefit from adding a brachytherapy boost

DEFF Research Database (Denmark)

Appelt, Ane L; Vogelius, Ivan R; Pløen, John

2014-01-01

PURPOSE/OBJECTIVE(S): Mature data on tumor control and survival are presented from a randomized trial of the addition of a brachytherapy boost to long-course neoadjuvant chemoradiation therapy (CRT) for locally advanced rectal cancer. METHODS AND MATERIALS: Between March 2005 and November 2008, 248...... patients with T3-4N0-2M0 rectal cancer were prospectively randomized to either long-course preoperative CRT (50.4 Gy in 28 fractions, per oral tegafur-uracil and L-leucovorin) alone or the same CRT schedule plus a brachytherapy boost (10 Gy in 2 fractions). The primary trial endpoint was pathologic...... on stratification for tumor regression grade and resection margin status indicated the presence of response migration. CONCLUSIONS: Despite increased pathologic tumor regression at the time of surgery, we observed no benefit on late outcome. Improved tumor regression does not necessarily lead to a relevant clinical...
Mixture of Regression Models with Single-Index

OpenAIRE

Xiang, Sijia; Yao, Weixin

2016-01-01

In this article, we propose a class of semiparametric mixture regression models with single-index. We argue that many recently proposed semiparametric/nonparametric mixture regression models can be considered special cases of the proposed model. However, unlike existing semiparametric mixture regression models, the new proposed model can easily incorporate multivariate predictors into the nonparametric components. Backfitting estimates and the corresponding algorithms have been proposed for...
Local bilinear multiple-output quantile/depth regression

Czech Academy of Sciences Publication Activity Database

Hallin, M.; Lu, Z.; Paindaveine, D.; Šiman, Miroslav

2015-01-01

Vol. 21, No. 3 (2015), pp. 1435-1466. ISSN 1350-7265. R&D Projects: GA MŠk(CZ) 1M06047. Institutional support: RVO:67985556. Keywords: conditional depth; growth chart; halfspace depth; local bilinear regression; multivariate quantile; quantile regression; regression depth. Subject RIV: BA - General Mathematics. Impact factor: 1.372 (2015). http://library.utia.cas.cz/separaty/2015/SI/siman-0446857.pdf
Do clinical and translational science graduate students understand linear regression? Development and early validation of the REGRESS quiz.

Science.gov (United States)

Enders, Felicity

2013-12-01

Although regression is widely used for reading and publishing in the medical literature, no instruments were previously available to assess students' understanding. The goal of this study was to design and assess such an instrument for graduate students in Clinical and Translational Science and Public Health. A 27-item REsearch on Global Regression Expectations in StatisticS (REGRESS) quiz was developed through an iterative process. Consenting students taking a course on linear regression in a Clinical and Translational Science program completed the quiz pre- and postcourse. Student results were compared to practicing statisticians with a master's or doctoral degree in statistics or a closely related field. Fifty-two students responded precourse, 59 postcourse, and 22 practicing statisticians completed the quiz. The mean (SD) score was 9.3 (4.3) for students precourse and 19.0 (3.5) postcourse (P REGRESS quiz was internally reliable (Cronbach's alpha 0.89). The initial validation is quite promising with statistically significant and meaningful differences across time and study populations. Further work is needed to validate the quiz across multiple institutions. © 2013 Wiley Periodicals, Inc.
Acupuncture for musculoskeletal pain: A meta-analysis and meta-regression of sham-controlled randomized clinical trials

Science.gov (United States)

Yuan, Qi-ling; Wang, Peng; Liu, Liang; Sun, Fu; Cai, Yong-song; Wu, Wen-tao; Ye, Mao-lin; Ma, Jiang-tao; Xu, Bang-bang; Zhang, Yin-gang

2016-01-01

The aims of this systematic review were to study the analgesic effect of real acupuncture and to explore whether sham acupuncture (SA) type is related to the estimated effect of real acupuncture for musculoskeletal pain. Five databases were searched. The outcome was pain or disability immediately (≤1 week) following an intervention. Standardized mean differences (SMDs) with 95% confidence intervals were calculated. Meta-regression was used to explore possible sources of heterogeneity. Sixty-three studies (6382 individuals) were included. Eight condition types were included. The pooled effect size was moderate for pain relief (59 trials, 4980 individuals, SMD −0.61, 95% CI −0.76 to −0.47; P acupuncture has a moderate effect (approximate 12-point reduction on the 100-mm visual analogue scale) on musculoskeletal pain. SA type did not appear to be related to the estimated effect of real acupuncture. PMID:27471137
The MIDAS Touch: Mixed Data Sampling Regression Models

OpenAIRE

Ghysels, Eric; Santa-Clara, Pedro; Valkanov, Rossen

2004-01-01

We introduce Mixed Data Sampling (henceforth MIDAS) regression models. The regressions involve time series data sampled at different frequencies. Technically speaking MIDAS models specify conditional expectations as a distributed lag of regressors recorded at some higher sampling frequencies. We examine the asymptotic properties of MIDAS regression estimation and compare it with traditional distributed lag models. MIDAS regressions have wide applicability in macroeconomics and finance.
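The idea of a distributed lag over higher-frequency regressors can be sketched as follows. The abstract does not specify a weighting scheme, so the exponential Almon lag polynomial used here is an assumption (it is a common parsimonious choice for MIDAS weights), and the function names and toy series are hypothetical.

```python
import math

def exp_almon_weights(theta1, theta2, n_lags):
    """Normalized exponential Almon weights for lags 1..n_lags (assumed scheme)."""
    raw = [math.exp(theta1 * k + theta2 * k * k) for k in range(1, n_lags + 1)]
    total = sum(raw)
    return [r / total for r in raw]

def midas_regressor(high_freq, weights, period_end):
    """Collapse the high-frequency observations ending at index period_end
    into one low-frequency regressor; weights[0] applies to the most recent one."""
    return sum(w * high_freq[period_end - k] for k, w in enumerate(weights))

# Toy example (assumption): 30 "daily" values aggregated into one "monthly" regressor.
weights = exp_almon_weights(0.1, -0.05, n_lags=6)
daily = [float(i) for i in range(30)]
x_low = midas_regressor(daily, weights, period_end=29)
print([round(w, 3) for w in weights], round(x_low, 3))
```

Because the whole lag profile is governed by two parameters, the low-frequency outcome can be regressed on `x_low` without estimating one coefficient per high-frequency lag. This is the main contrast with an unrestricted distributed lag model.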
Suppression Situations in Multiple Linear Regression

Science.gov (United States)

Shieh, Gwowen

2006-01-01

This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…
Significance testing in ridge regression for genetic data

Directory of Open Access Journals (Sweden)

De Iorio Maria

2011-09-01

Abstract Background Technological developments have increased the feasibility of large scale genetic association studies. Densely typed genetic markers are obtained using SNP arrays, next-generation sequencing technologies and imputation. However, SNPs typed using these methods can be highly correlated due to linkage disequilibrium among them, and standard multiple regression techniques fail with these data sets due to their high dimensionality and correlation structure. There has been increasing interest in using penalised regression in the analysis of high dimensional data. Ridge regression is one such penalised regression technique which does not perform variable selection, instead estimating a regression coefficient for each predictor variable. It is therefore desirable to obtain an estimate of the significance of each ridge regression coefficient. Results We develop and evaluate a test of significance for ridge regression coefficients. Using simulation studies, we demonstrate that the performance of the test is comparable to that of a permutation test, with the advantage of a much-reduced computational cost. We introduce the p-value trace, a plot of the negative logarithm of the p-values of ridge regression coefficients with increasing shrinkage parameter, which enables the visualisation of the change in p-value of the regression coefficients with increasing penalisation. We apply the proposed method to a lung cancer case-control data set from EPIC, the European Prospective Investigation into Cancer and Nutrition. Conclusions The proposed test is a useful alternative to a permutation test for the estimation of the significance of ridge regression coefficients, at a much-reduced computational cost. The p-value trace is an informative graphical tool for evaluating the results of a test of significance of ridge regression coefficients as the shrinkage parameter increases, and the proposed test makes its production computationally feasible.
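To make concrete how ridge regression keeps a coefficient for every predictor while shrinking them, here is a minimal sketch for two highly correlated, centered predictors. It shows only the ridge estimator, not the significance test or p-value trace proposed in the paper. The toy data and the function name are assumptions.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Solve (X'X + lam*I) beta = X'y for two centered predictors, no intercept."""
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12          # determinant of the 2x2 penalised matrix
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

# Toy data (assumption): x2 is almost a copy of x1, as with SNPs in strong LD.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.0, -2.1, 0.1, 1.9, 4.1]

for lam in (0.0, 1.0, 10.0):
    b1, b2 = ridge_two_predictors(x1, x2, y, lam)
    print(lam, round(b1, 3), round(b2, 3))
```

With `lam = 0` this reduces to ordinary least squares, which is unstable when the predictors are nearly collinear. Increasing `lam` shrinks both coefficients and spreads the signal across the correlated pair rather than dropping either predictor.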
Regression calibration with more surrogates than mismeasured variables

KAUST Repository

Kipnis, Victor; Midthune, Douglas; Freedman, Laurence S.; Carroll, Raymond J.

2012-06-29

In a recent paper (Weller EA, Milton DK, Eisen EA, Spiegelman D. Regression calibration for logistic regression with multiple surrogates for one exposure. Journal of Statistical Planning and Inference 2007; 137: 449-461), the authors discussed fitting logistic regression models when a scalar main explanatory variable is measured with error by several surrogates, that is, a situation with more surrogates than variables measured with error. They compared two methods of adjusting for measurement error using a regression calibration approximate model as if it were exact. One is the standard regression calibration approach consisting of substituting an estimated conditional expectation of the true covariate given observed data in the logistic regression. The other is a novel two-stage approach when the logistic regression is fitted to multiple surrogates, and then a linear combination of estimated slopes is formed as the estimate of interest. Applying estimated asymptotic variances for both methods in a single data set with some sensitivity analysis, the authors asserted superiority of their two-stage approach. We investigate this claim in some detail. A troubling aspect of the proposed two-stage method is that, unlike standard regression calibration and a natural form of maximum likelihood, the resulting estimates are not invariant to reparameterization of nuisance parameters in the model. We show, however, that, under the regression calibration approximation, the two-stage method is asymptotically equivalent to a maximum likelihood formulation, and is therefore in theory superior to standard regression calibration. However, our extensive finite-sample simulations in the practically important parameter space where the regression calibration model provides a good approximation failed to uncover such superiority of the two-stage method. We also discuss extensions to different data structures.

Using Structured Additive Regression Models to Estimate Risk Factors of Malaria: Analysis of 2010 Malawi Malaria Indicator Survey Data

Science.gov (United States)

Chirombo, James; Lowe, Rachel; Kazembe, Lawrence

2014-01-01

Background After years of implementing Roll Back Malaria (RBM) interventions, the changing landscape of malaria in terms of risk factors and spatial pattern has not been fully investigated. This paper uses the 2010 malaria indicator survey data to investigate if known malaria risk factors remain relevant after many years of interventions. Methods We adopted a structured additive logistic regression model that allowed for spatial correlation, to more realistically estimate malaria risk factors. Our model included child and household level covariates, as well as climatic and environmental factors. Continuous variables were modelled by assuming second order random walk priors, while spatial correlation was specified as a Markov random field prior, with fixed effects assigned diffuse priors. Inference was fully Bayesian resulting in an under five malaria risk map for Malawi. Results Malaria risk increased with increasing age of the child. With respect to socio-economic factors, the greater the household wealth, the lower the malaria prevalence. A general decline in malaria risk was observed as altitude increased. Minimum temperatures and average total rainfall in the three months preceding the survey did not show a strong association with disease risk. Conclusions The structured additive regression model offered a flexible extension to standard regression models by enabling simultaneous modelling of possible nonlinear effects of continuous covariates, spatial correlation and heterogeneity, while estimating usual fixed effects of categorical and continuous observed variables. Our results confirmed that malaria epidemiology is a complex interaction of biotic and abiotic factors, both at the individual, household and community level and that risk factors are still relevant many years after extensive implementation of RBM activities. PMID:24991915
Few crystal balls are crystal clear : eyeballing regression

International Nuclear Information System (INIS)

Wittebrood, R.T.

1998-01-01

The theory of regression and statistical analysis as it applies to reservoir analysis was discussed. It was argued that regression lines are not always the final truth. It was suggested that regression lines and eyeballed lines are often equally accurate. The many conditions that must be fulfilled to calculate a proper regression were discussed. Mentioned among these conditions were the distribution of the data, hidden variables, knowledge of how the data was obtained, the need for causal correlation of the variables, and knowledge of the manner in which the regression results are going to be used. 1 tab., 13 figs
THE CRASH INTENSITY EVALUATION USING GENERAL CENTRALITY CRITERIONS AND A GEOGRAPHICALLY WEIGHTED REGRESSION

Directory of Open Access Journals (Sweden)

M. Ghadiriyan Arani

2017-09-01

Road traffic crashes, especially those on highways, are a social problem that affects the lives of many people. This paper focuses on the highways of Atlanta, the capital and most populous city of the U.S. state of Georgia and the ninth largest metropolitan area in the United States. Geographically weighted regression and general centrality criteria are the tools used in this article. In the first step, to estimate crash intensity, the dual graph is extracted from the street and highway network so that general centrality criteria can be computed. The criteria computed on this graph are: Degree, PageRank, Random walk, Eccentricity, Closeness, Betweenness, Clustering coefficient, Eigenvector, and Straightness. The crash intensity of each highway is calculated by dividing the number of crashes on that highway by the total number of crashes. The criteria and crash intensities were then normalized, and the correlations between them were calculated to identify the criteria that are not dependent on each other. The proposed hybrid approach is well suited to regression problems because these effective measures lead to a more desirable output. The R2 value for geographically weighted regression was 0.539 using the Gaussian kernel and 0.684 using a triple-core cube kernel. The results showed that the triple-core cube kernel is better for modeling the crash intensity.
Predictors of long-term benzodiazepine abstinence in participants of a randomized controlled benzodiazepine withdrawal program.

NARCIS (Netherlands)

Oude Voshaar, R.C.; Gorgels, W.J.M.J.; Mol, A.J.J.; Balkom, A.J.L.M. van; Mulder, J.; Lisdonk, E.H. van de; Breteler, M.H.M.; Zitman, F.G.

2006-01-01

OBJECTIVE: To identify predictors of resumed benzodiazepine use after participation in a benzodiazepine discontinuation trial. METHOD: We performed multiple Cox regression analyses to predict the long-term outcome of a 3-condition, randomized, controlled benzodiazepine discontinuation trial in
Some effects of random dose measurement errors on analysis of atomic bomb survivor data

International Nuclear Information System (INIS)

Gilbert, E.S.

1985-01-01

The effects of random dose measurement errors on analyses of atomic bomb survivor data are described and quantified for several procedures. It is found that the ways in which measurement error is most likely to mislead are through downward bias in the estimated regression coefficients and through distortion of the shape of the dose-response curve. The magnitude of the bias with simple linear regression is evaluated for several dose treatments including the use of grouped and ungrouped data, analyses with and without truncation at 600 rad, and analyses which exclude doses exceeding 200 rad. Limited calculations have also been made for maximum likelihood estimation based on Poisson regression. 16 refs., 6 tabs
Regression methods for medical research

CERN Document Server

Tai, Bee Choo

2013-01-01

Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demand an understanding of more complex and sophisticated analytic procedures. The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
Should metacognition be measured by logistic regression?

Science.gov (United States)

Rausch, Manuel; Zehetleitner, Michael

2017-03-01

Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.
Prediction of retention indices for frequently reported compounds of plant essential oils using multiple linear regression, partial least squares, and support vector machine.

Science.gov (United States)

Yan, Jun; Huang, Jian-Hua; He, Min; Lu, Hong-Bing; Yang, Rui; Kong, Bo; Xu, Qing-Song; Liang, Yi-Zeng

2013-08-01

Retention indices for frequently reported compounds of plant essential oils on three different stationary phases were investigated. Multivariate linear regression, partial least squares, and support vector machine, combined with a new variable selection approach called random-frog recently proposed by our group, were employed to model quantitative structure-retention relationships. Internal and external validations were performed to ensure stability and predictive ability. All three methods could obtain an acceptable model, and the optimal results were obtained by support vector machine based on a small number of informative descriptors, with squared correlation coefficients for cross-validation of 0.9726, 0.9759, and 0.9331 on the dimethylsilicone stationary phase, the dimethylsilicone phase with 5% phenyl groups, and the PEG stationary phase, respectively. The performances of two variable selection approaches, random-frog and genetic algorithm, are compared. The importance of the variables was found to be consistent when estimated from correlation coefficients in multivariate linear regression equations and selection probability in model spaces. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data

Directory of Open Access Journals (Sweden)

Himmelreich Uwe

2009-07-01

Background Regularized regression methods such as principal component or partial least squares regression perform well in learning tasks on high dimensional spectral data, but cannot explicitly eliminate irrelevant features. The random forest classifier with its associated Gini feature importance, on the other hand, allows for an explicit feature elimination, but may not be optimally adapted to spectral data due to the topology of its constituent classification trees, which are based on orthogonal splits in feature space. Results We propose to combine the best of both approaches, and evaluated the joint use of a feature selection based on a recursive feature elimination using the Gini importance of random forests together with regularized classification methods on spectral data sets from medical diagnostics, chemotaxonomy, biomedical analytics, food science, and synthetically modified spectral data. Here, a feature selection using the Gini feature importance with a regularized classification by discriminant partial least squares regression performed as well as or better than a filtering according to different univariate statistical tests, or using regression coefficients in a backward feature elimination. It outperformed the direct application of the random forest classifier, or the direct application of the regularized classifiers on the full set of features. Conclusion The Gini importance of the random forest provided superior means for measuring feature relevance on spectral data, but, on an optimal subset of features, the regularized classifiers might be preferable over the random forest classifier, in spite of their limitation to model linear dependencies only. A feature selection based on Gini importance, however, may precede a regularized linear classification to identify this optimal subset of features, and to earn a double benefit of both dimensionality reduction and the elimination of noise from the classification task.
BOX-COX REGRESSION METHOD IN TIME SCALING

Directory of Open Access Journals (Sweden)

ATİLLA GÖKTAŞ

2013-06-01

The Box-Cox regression method with the power transformation λj, for j = 1, 2, ..., k, can be used when the dependent variable and error term of the linear regression model do not satisfy the continuity and normality assumptions. The case in which the optimum power transformation λj, for j = 1, 2, ..., k, of Y yields the smallest mean square error is discussed. The Box-Cox regression method is especially appropriate for adjusting for skewness or heteroscedasticity of the error terms when there is a nonlinear functional relationship between the dependent and explanatory variables. In this study, the advantages and disadvantages of the Box-Cox regression method are discussed in terms of differentiation and differential analysis of the time scale concept.
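As an illustration of the power transformation described in this record, the following is a minimal sketch, not the authors' procedure. It assumes a single λ, strictly positive data, and an intercept-only normal model, and it picks λ from a grid by maximizing the profile log-likelihood (the abstract instead speaks of minimizing mean square error). The function names, the sample, and the grid are all hypothetical.

```python
import math

def box_cox(y, lam):
    """Box-Cox power transform of strictly positive values for a given lambda."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]          # limiting case lambda -> 0
    return [(v ** lam - 1.0) / lam for v in y]

def profile_loglik(y, lam):
    """Profile log-likelihood of lambda (intercept-only normal model, constants dropped)."""
    n = len(y)
    z = box_cox(y, lam)
    mean_z = sum(z) / n
    var_z = sum((v - mean_z) ** 2 for v in z) / n   # maximum-likelihood variance
    jacobian = (lam - 1.0) * sum(math.log(v) for v in y)
    return -0.5 * n * math.log(var_z) + jacobian

def best_lambda(y, grid):
    """Grid value of lambda with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(y, lam))

# Hypothetical right-skewed sample: its logarithms are evenly spaced,
# so the log transform (lambda = 0) makes it symmetric.
y = [math.exp(0.5 * k) for k in range(1, 11)]
grid = [k / 10.0 for k in range(-20, 21)]           # lambda from -2.0 to 2.0
lam_hat = best_lambda(y, grid)
print("selected lambda:", lam_hat)
```

In a regression setting the variance term would be computed from the residuals of the fitted model on the transformed response rather than from the raw transformed values.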
Gaussian Process Regression Model in Spatial Logistic Regression

Science.gov (United States)

Sofro, A.; Oktaviarina, A.

2018-01-01

Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.
Regression Analysis and the Sociological Imagination

Science.gov (United States)

De Maio, Fernando

2014-01-01

Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Comparison of Linear and Non-linear Regression Analysis to Determine Pulmonary Pressure in Hyperthyroidism.

Science.gov (United States)

Scarneciu, Camelia C; Sangeorzan, Livia; Rus, Horatiu; Scarneciu, Vlad D; Varciu, Mihai S; Andreescu, Oana; Scarneciu, Ioan

2017-01-01

This study aimed to assess the incidence of pulmonary hypertension (PH) in newly diagnosed hyperthyroid patients and to find a simple model showing the complex functional relation between pulmonary hypertension in hyperthyroidism and the factors causing it. The 53 hyperthyroid patients (H-group) were evaluated mainly by echocardiography and compared with 35 euthyroid (E-group) and 25 healthy people (C-group). To identify the factors causing pulmonary hypertension, the statistical method of comparing arithmetic means was used. The functional relation between the two random variables (PAPs and each of the factors determining it within our research study) can be expressed by a linear or non-linear function. By applying the linear regression method described by a first-degree equation, the line of regression (linear model) was determined; by applying the non-linear regression method described by a second-degree equation, a parabola-type curve of regression (non-linear or polynomial model) was determined. We compared and validated these two models by calculating the determination coefficient (criterion 1), comparing residuals (criterion 2), applying the AIC criterion (criterion 3) and using the F-test (criterion 4). In the H-group, 47% had pulmonary hypertension that was completely reversible on reaching euthyroidism. The factors causing pulmonary hypertension were identified: previously known factors (level of free thyroxin, pulmonary vascular resistance, cardiac output) and new factors identified in this study (pretreatment period, age, systolic blood pressure). According to the four criteria and to clinical judgment, we consider that the polynomial model (graphically parabola-type) is better than the linear one. The better model showing the functional relation between the pulmonary hypertension in hyperthyroidism and the factors identified in this study is given by a polynomial equation of second
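The comparison of a first-degree and a second-degree model by determination coefficient, AIC and F-test, as described in this record, can be sketched as follows. This is a generic illustration on made-up numbers, not the study's data or code: the data, the function names, and the Gaussian AIC form n·ln(RSS/n) + 2k (additive constants dropped) are assumptions, and only the F statistic is computed, not its p-value.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def polyfit(x, y, degree):
    """Least-squares polynomial coefficients (lowest order first) via the normal equations."""
    k = degree + 1
    a = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(k)]
    return solve(a, b)

def rss(x, y, coef):
    """Residual sum of squares of a fitted polynomial."""
    return sum((yi - sum(c * xi ** p for p, c in enumerate(coef))) ** 2
               for xi, yi in zip(x, y))

def compare(x, y):
    """Fit linear and quadratic models and report R^2, AIC, and the nested-model F statistic."""
    n = len(y)
    mean_y = sum(y) / n
    tss = sum((v - mean_y) ** 2 for v in y)
    out = {}
    for name, degree in (("linear", 1), ("quadratic", 2)):
        coef = polyfit(x, y, degree)
        r = rss(x, y, coef)
        out[name] = {"coef": coef, "rss": r, "r2": 1.0 - r / tss,
                     "aic": n * math.log(r / n) + 2 * (degree + 1)}
    # One extra parameter in the quadratic model; n - 3 residual degrees of freedom.
    out["f_stat"] = ((out["linear"]["rss"] - out["quadratic"]["rss"])
                     / (out["quadratic"]["rss"] / (n - 3)))
    return out

# Hypothetical data: a curved trend plus small fixed perturbations.
x = [float(v) for v in range(10)]
noise = [0.2, -0.1, 0.3, -0.2, 0.1, -0.3, 0.2, -0.1, 0.1, -0.2]
y = [2.0 + 0.5 * xi + 0.3 * xi ** 2 + e for xi, e in zip(x, noise)]
result = compare(x, y)
for name in ("linear", "quadratic"):
    print(name, round(result[name]["r2"], 4), round(result[name]["aic"], 2))
print("F statistic:", round(result["f_stat"], 1))
```

A higher determination coefficient, a lower AIC, and a large F statistic all point toward the quadratic model on this synthetic example; on real data the criteria need not agree.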
A regressive methodology for estimating missing data in rainfall daily time series

Science.gov (United States)

Barca, E.; Passarella, G.

2009-04-01

The presence of gaps in environmental data time series is a very common but extremely critical problem, since it can produce biased results (Rubin, 1976). Missing data plague almost all surveys. The problem is how to deal with missing data once it has been deemed impossible to recover the actual missing values. Apart from the amount of missing data, another issue which plays an important role in the choice of any recovery approach is the evaluation of "missingness" mechanisms. When missingness is conditioned by some other variable observed in the data set (Schafer, 1997), the mechanism is called MAR (Missing at Random). Otherwise, when the missingness mechanism depends on the actual value of the missing data, it is called NMAR (Not Missing at Random). The latter is the most difficult condition to model. In the last decade, interest arose in the estimation of missing data by using regression (single imputation). More recently, multiple imputation has also become available, which returns a distribution of estimated values (Scheffer, 2002). In this paper an automatic methodology for estimating missing data is presented. In practice, given a gauging station affected by missing data (target station), the methodology checks the randomness of the missing data and classifies the "similarity" between the target station and the other gauging stations spread over the study area. Among the different methods useful for defining the degree of similarity, whose effectiveness strongly depends on the data distribution, the Spearman correlation coefficient was chosen. Once the similarity matrix was defined, a suitable nonparametric, univariate, regressive method was applied in order to estimate missing data in the target station: the Theil method (Theil, 1950). Even though the methodology proved to be rather reliable, an improvement of the missing data estimation can be achieved by a generalization. A first possible improvement consists in extending the univariate technique to
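The Theil step of this record can be illustrated with a short sketch. It assumes the most similar station (the "donor") has already been chosen, for example by Spearman correlation, and fits a Theil line (median of pairwise slopes) on the days where both stations have data. The rainfall values, the function names, the median-based intercept, and the clipping of estimates at zero are illustrative assumptions, not details taken from the paper.

```python
from statistics import median

def theil_fit(x, y):
    """Theil estimator: median of all pairwise slopes, with a median-based intercept."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(len(x)) for j in range(i + 1, len(x))
              if x[j] != x[i]]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept

def fill_gaps(target, donor):
    """Replace None entries in target with estimates from the Theil line on donor values."""
    pairs = [(d, t) for d, t in zip(donor, target)
             if d is not None and t is not None]
    slope, intercept = theil_fit([p[0] for p in pairs], [p[1] for p in pairs])
    filled = []
    for d, t in zip(donor, target):
        if t is not None:
            filled.append(t)                      # keep observed values untouched
        elif d is not None:
            filled.append(max(0.0, intercept + slope * d))  # rainfall cannot be negative
        else:
            filled.append(None)                   # no donor value either: gap remains
    return filled

# Hypothetical daily rainfall (mm); the 30.0 in the target is a deliberate outlier.
donor = [0.0, 2.0, 5.0, 1.0, 8.0, 3.0, 10.0]
target = [0.0, 4.0, None, 2.0, 30.0, None, 20.0]
filled = fill_gaps(target, donor)
print(filled)
```

Because the slope is a median rather than a least-squares estimate, the single outlying day has little influence on the imputed values, which is the usual motivation for this kind of nonparametric fit.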
An Additive-Multiplicative Cox-Aalen Regression Model

DEFF Research Database (Denmark)

Scheike, Thomas H.; Zhang, Mei-Jie

2002-01-01

Aalen model; additive risk model; counting processes; Cox regression; survival analysis; time-varying effects
Analysis of Palm Oil Production, Export, and Government Consumption to Gross Domestic Product of Five Districts in West Kalimantan by Panel Regression

Science.gov (United States)

Sulistianingsih, E.; Kiftiah, M.; Rosadi, D.; Wahyuni, H.

2017-04-01

Gross Domestic Product (GDP) is an indicator of economic growth in a region. GDP data are panel data, consisting of cross-section and time series data. Panel regression is a tool that can be used to analyse panel data. There are three models in panel regression, namely the Common Effect Model (CEM), Fixed Effect Model (FEM) and Random Effect Model (REM). The model is chosen based on the results of the Chow Test, Hausman Test and Lagrange Multiplier Test. This research uses panel regression to analyse the effect of palm oil production, export, and government consumption on the GDP of five districts in West Kalimantan, namely Sanggau, Sintang, Sambas, Ketapang and Bengkayang. Based on the results of the analyses, it is concluded that REM, with an adjusted coefficient of determination of 0.823, is the best model in this case. According to the results, only Export and Government Consumption influence the GDP of the districts.
Estimation of genetic effects in the presence of multicollinearity in multibreed beef cattle evaluation.

Science.gov (United States)

Roso, V M; Schenkel, F S; Miller, S P; Schaeffer, L R

2005-08-01

Breed additive, dominance, and epistatic loss effects are of concern in the genetic evaluation of a multibreed population. Multiple regression equations used for fitting these effects may show a high degree of multicollinearity among predictor variables. Typically, when strong linear relationships exist, the regression coefficients have large SE and are sensitive to changes in the data file and to the addition or deletion of variables in the model. Generalized ridge regression methods were applied to obtain stable estimates of direct and maternal breed additive, dominance, and epistatic loss effects in the presence of multicollinearity among predictor variables. Preweaning weight gains of beef calves in Ontario, Canada, from 1986 to 1999 were analyzed. The genetic model included fixed direct and maternal breed additive, dominance, and epistatic loss effects, fixed environmental effects of age of the calf, contemporary group, and age of the dam x sex of the calf, random additive direct and maternal genetic effects, and random maternal permanent environment effect. The degree and the nature of the multicollinearity were identified and ridge regression methods were used as an alternative to ordinary least squares (LS). Ridge parameters were obtained using two different objective methods: 1) generalized ridge estimator of Hoerl and Kennard (R1); and 2) bootstrap in combination with cross-validation (R2). Both ridge regression methods outperformed the LS estimator with respect to mean squared error of predictions (MSEP) and variance inflation factors (VIF) computed over 100 bootstrap samples. The MSEP of R1 and R2 were similar, and they were 3% less than the MSEP of LS. The average VIF of LS, R1, and R2 were equal to 26.81, 6.10, and 4.18, respectively. Ridge regression methods were particularly effective in decreasing the multicollinearity involving predictor variables of breed additive effects. Because of a high degree of confounding between estimates of maternal
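To make the multicollinearity problem in this record concrete, here is a minimal two-predictor sketch. It uses ordinary ridge regression with a single penalty k, which is a simplification of the generalized ridge estimators named in the abstract, and the two-predictor VIF formula 1 / (1 - r^2). The data, the value of k, and the function names are hypothetical.

```python
def center(v):
    """Subtract the mean so that no intercept term is needed."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def ridge_2d(x1, x2, y, k):
    """Ridge coefficients for two centered predictors; k = 0 gives least squares."""
    x1, x2, y = center(x1), center(x2), center(y)
    a, b, d = dot(x1, x1) + k, dot(x1, x2), dot(x2, x2) + k
    g1, g2 = dot(x1, y), dot(x2, y)
    det = a * d - b * b                       # determinant of X'X + kI
    return ((d * g1 - b * g2) / det, (a * g2 - b * g1) / det)

def vif_2d(x1, x2):
    """Variance inflation factor when there are exactly two predictors."""
    x1, x2 = center(x1), center(x2)
    r2 = dot(x1, x2) ** 2 / (dot(x1, x1) * dot(x2, x2))
    return 1.0 / (1.0 - r2)

# Hypothetical, nearly collinear predictors and a response built from both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.1, 6.8, 8.2]
y = [2.3, 3.8, 6.1, 7.9, 10.2, 12.0, 13.9, 16.1]

ls = ridge_2d(x1, x2, y, 0.0)     # ordinary least squares
rr = ridge_2d(x1, x2, y, 1.0)     # ridge with an arbitrary penalty of 1
print("VIF:", round(vif_2d(x1, x2), 1))
print("least squares:", ls)
print("ridge:", rr)
```

The penalty shrinks the coefficient vector toward zero, which trades a small bias for lower variance when predictors are strongly correlated; in practice k would be chosen by an objective rule such as the bootstrap and cross-validation approach mentioned in the abstract.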
Drug treatment rates with beta-blockers and ACE-inhibitors/angiotensin receptor blockers and recurrences in takotsubo cardiomyopathy: A meta-regression analysis.

Science.gov (United States)

Brunetti, Natale Daniele; Santoro, Francesco; De Gennaro, Luisa; Correale, Michele; Gaglione, Antonio; Di Biase, Matteo

2016-07-01

In a recent paper Singh et al. analyzed the effect of drug treatment on recurrence of takotsubo cardiomyopathy (TTC) in a comprehensive meta-analysis. The study found that recurrence rates were independent of clinical utilization of BB prescription, but inversely correlated with ACEi/ARB prescription; the authors therefore conclude that ACEi/ARB rather than BB may reduce risk of recurrence. We aimed to re-analyze data reported in the study, now weighted for population size, in a meta-regression analysis. After multiple meta-regression analysis, we found a significant regression between rates of prescription of ACEi and rates of recurrence of TTC; regression was not statistically significant for BBs. On the basis of our re-analysis, we confirm that rates of recurrence of TTC are lower in populations of patients with higher rates of treatment with ACEi/ARB. That does not necessarily imply that ACEi may prevent recurrence of TTC, but merely that, for example, rates of recurrence are lower in cohorts that are more compliant with therapy or more often prescribed ACEi because they are more carefully followed. Randomized prospective studies are surely warranted. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

Linear and logistic regression analysis

NARCIS (Netherlands)

Tripepi, G.; Jager, K. J.; Dekker, F. W.; Zoccali, C.

2008-01-01

In previous articles of this series, we focused on relative risks and odds ratios as measures of effect to assess the relationship between exposure to risk factors and clinical outcomes and on control for confounding. In randomized clinical trials, the random allocation of patients is hoped to
Model-based Quantile Regression for Discrete Data

KAUST Repository

Padellini, Tullia

2018-04-10

Quantile regression is a class of methods devoted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite the fact that this leads to a proper posterior for the regression coefficients, the resulting posterior variance is affected by an unidentifiable parameter, hence any inferential procedure besides point estimation is unreliable. We propose a model-based approach for quantile regression that considers quantiles of the generating distribution directly, and thus allows for a proper uncertainty quantification. We then create a link between quantile regression and generalised linear models by mapping the quantiles to the parameter of the response variable, and we exploit it to fit the model with R-INLA. We also extend it to the case of discrete responses, where there is no 1-to-1 relationship between quantiles and the distribution's parameter, by introducing continuous generalisations of the most common discrete variables (Poisson, Binomial and Negative Binomial) to be exploited in the fitting.
riskRegression

DEFF Research Database (Denmark)

Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas

2017-01-01

In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface......-product we obtain fast access to the baseline hazards (compared to survival::basehaz()) and predictions of survival probabilities, their confidence intervals and confidence bands. Confidence intervals and confidence bands are based on point-wise asymptotic expansions of the corresponding statistical...
Real estate value prediction using multivariate regression models

Science.gov (United States)

Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav

2017-11-01

The real estate market is one of the most competitive in terms of pricing, and prices tend to vary significantly based on many factors; hence it is a prime field in which to apply machine learning concepts to optimize and predict prices with high accuracy. In this paper, we therefore present various important features to use when predicting housing prices with good accuracy. We describe regression models that use various features to obtain a lower residual sum of squares error. When using features in a regression model, some feature engineering is required for better prediction. Often a set of features (multiple regression) or polynomial regression (applying various powers to the features) is used to obtain a better model fit. Because these models are expected to be susceptible to overfitting, ridge regression is used to reduce it. This paper thus points to the best application of regression models, in addition to other techniques, to optimize the result.
Personality disorders, violence, and antisocial behavior: a systematic review and meta-regression analysis.

Science.gov (United States)

Yu, Rongqin; Geddes, John R; Fazel, Seena

2012-10-01

The risk of antisocial outcomes in individuals with personality disorder (PD) remains uncertain. The authors synthesize the current evidence on the risks of antisocial behavior, violence, and repeat offending in PD, and they explore sources of heterogeneity in risk estimates through a systematic review and meta-regression analysis of observational studies comparing antisocial outcomes in personality disordered individuals with control groups. Fourteen studies examined risk of antisocial and violent behavior in 10,007 individuals with PD, compared with over 12 million general population controls. There was a substantially increased risk of violent outcomes in studies with all PDs (random-effects pooled odds ratio [OR] = 3.0, 95% CI = 2.6 to 3.5). Meta-regression revealed that antisocial PD and gender were associated with higher risks (p = .01 and .07, respectively). The odds of all antisocial outcomes were also elevated. Twenty-five studies reported the risk of repeat offending in PD compared with other offenders. The risk of a repeat offense was also increased (fixed-effects pooled OR = 2.4, 95% CI = 2.2 to 2.7) in offenders with PD. The authors conclude that although PD is associated with antisocial outcomes and repeat offending, the risk appears to differ by PD category, gender, and whether individuals are offenders or not.
Computing multiple-output regression quantile regions

Czech Academy of Sciences Publication Activity Database

Paindaveine, D.; Šiman, Miroslav

2012-01-01

Vol. 56, No. 4 (2012), pp. 840-853 ISSN 0167-9473 R&D Projects: GA MŠk(CZ) 1M06047 Institutional research plan: CEZ:AV0Z10750506 Keywords: halfspace depth * multiple-output regression * parametric linear programming * quantile regression Subject RIV: BA - General Mathematics Impact factor: 1.304, year: 2012 http://library.utia.cas.cz/separaty/2012/SI/siman-0376413.pdf
Evaluating disease management programme effectiveness: an introduction to the regression discontinuity design.

Science.gov (United States)

Linden, Ariel; Adams, John L; Roberts, Nancy

2006-04-01

Although disease management (DM) has been in existence for over a decade, there is still much uncertainty as to its effectiveness in improving health status and reducing medical cost. The main reason is that most programme evaluations typically follow weak observational study designs that are subject to bias, most notably selection bias and regression to the mean. The regression discontinuity (RD) design may be the best alternative to randomized studies for evaluating DM programme effectiveness. The most crucial element of the RD design is its use of a 'cut-off' score on a pre-test measure to determine assignment to intervention or control. A valuable feature of this technique is that the pre-test measure does not have to be the same as the outcome measure, thus maximizing the programme's ability to use research-based practice guidelines, survey instruments and other tools to identify those individuals in greatest need of the programme intervention. Similarly, the cut-off score can be based on clinical understanding of the disease process, empirically derived, or resource-based. In the RD design, programme effectiveness is determined by a change in the pre-post relationship at the cut-off point. While the RD design is uniquely suitable for DM programme evaluation, its success will depend, in large part, on fundamental changes being made in the way DM programmes identify and assign individuals to the programme intervention.
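The core idea of the RD design in this record, a change in the pre-post relationship at the cut-off, can be sketched with two straight-line fits. This is a minimal illustration under strong assumptions: a sharp design (everyone at or above the cut-off receives the programme), a linear pre-post relationship on each side, and made-up scores. The function names are hypothetical.

```python
def fit_line(x, y):
    """Simple least-squares line; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def rd_effect(pretest, outcome, cutoff):
    """Estimated jump in the outcome at the cut-off (programme minus control)."""
    # Centering the pre-test at the cut-off makes each intercept the fitted
    # outcome exactly at the cut-off score.
    treated = [(p - cutoff, o) for p, o in zip(pretest, outcome) if p >= cutoff]
    control = [(p - cutoff, o) for p, o in zip(pretest, outcome) if p < cutoff]
    at_cut_treated, _ = fit_line(*zip(*treated))
    at_cut_control, _ = fit_line(*zip(*control))
    return at_cut_treated - at_cut_control

# Hypothetical data: the outcome rises by 2 per pre-test point, and the programme
# (assigned at pre-test scores of 6 or more) lowers the outcome by 5.
pretest = [float(p) for p in range(1, 11)]
outcome = [10.0 + 2.0 * p - (5.0 if p >= 6 else 0.0) for p in pretest]
effect = rd_effect(pretest, outcome, cutoff=6.0)
print("estimated programme effect at the cut-off:", effect)
```

Fitting a separate line on each side allows the slope to differ between groups; a real evaluation would also check the functional form near the cut-off and report uncertainty around the estimated jump.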
Preface to Berk's "Regression Analysis: A Constructive Critique"

OpenAIRE

de Leeuw, Jan

2003-01-01

It is a pleasure to write a preface for the book "Regression Analysis" of my fellow series editor Dick Berk. And it is a pleasure in particular because the book is about regression analysis, the most popular and the most fundamental technique in applied statistics. And because it is critical of the way regression analysis is used in the sciences, in particular in the social and behavioral sciences. Although the book can be read as an introduction to regression analysis, it can also be read as a...
A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

Science.gov (United States)

Karabatsos, George

2017-02-01

Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected
Five cases of caudal regression with an aberrant abdominal umbilical artery: Further support for a caudal regression-sirenomelia spectrum.

Science.gov (United States)

Duesterhoeft, Sara M; Ernst, Linda M; Siebert, Joseph R; Kapur, Raj P

2007-12-15

Sirenomelia and caudal regression have sparked centuries of interest and recent debate regarding their classification and pathogenetic relationship. Specific anomalies are common to both conditions, but aside from fusion of the lower extremities, an aberrant abdominal umbilical artery ("persistent vitelline artery") has been invoked as the chief anatomic finding that distinguishes sirenomelia from caudal regression. This observation is important from a pathogenetic viewpoint, in that diversion of blood away from the caudal portion of the embryo through the abdominal umbilical artery ("vascular steal") has been proposed as the primary mechanism leading to sirenomelia. In contrast, caudal regression is hypothesized to arise from primary deficiency of caudal mesoderm. We present five cases of caudal regression that exhibit an aberrant abdominal umbilical artery similar to that typically associated with sirenomelia. Review of the literature identified four similar cases. Collectively, the series lends support for a caudal regression-sirenomelia spectrum with a common pathogenetic basis and suggests that abnormal umbilical arterial anatomy may be the consequence, rather than the cause, of deficient caudal mesoderm. (c) 2007 Wiley-Liss, Inc.
A main factors affecting average number of teats in pigs

Directory of Open Access Journals (Sweden)

Emil Krupa

2016-09-01

The influence of factors (breed, year and season of farrowing, herd, parity order, sire of litter, total number of born piglets - TNB, number of piglets born alive - NBA, number of weaned piglets - NW, and linear and quadratic regression) on the average number of teats was analysed. Teats were counted for all piglets in the litter up to ten days after birth, and the trait was expressed as the arithmetic mean for each litter (the sum of the teat counts of all piglets in the litter divided by the number of piglets in that litter), separately for first litters (ANT1) and for second and subsequent litters (ANT2+). The coefficient of determination was 0.46 and 0.33 for ANT1 and ANT2+, respectively. A highly significant influence (P<0.001) on ANT1 and ANT2+ was found for year and season of farrowing, herd, parity order (only for ANT2+) and sire of litter effects. An impact of breed was found only on ANT2+ (P<0.001). The remaining factors had negligible or no impact on the traits. Based on the data available for analysis, the results will serve as a relevant starting point for developing a model for genetic evaluation of these traits.
Quantitative trait loci for udder conformation and other udder traits in Finnish Ayrshire cattle

Directory of Open Access Journals (Sweden)

N.F. SCHULMAN

2008-12-01

Udder traits are important due to their correlation with clinical mastitis, which causes major economic losses to dairy farms. Chromosomal areas associated with udder conformation traits, milking speed and leakage could be used in breeding programs to improve both udder traits and mastitis resistance. Quantitative trait loci (QTL) mapping for udder traits was carried out on bovine chromosomes (BTA) 9, 11, 14, 18, 20, 23, and 29, where earlier studies have indicated QTL for mastitis. A granddaughter design with 12 Ayrshire sire families and 360 sons was used. The sires and sons were typed for 35 markers. The traits analysed were udder depth, fore udder attachment, central ligament, distance from udder to floor, body stature, fore teat length, udder balance, rear udder height, milking speed, and leakage. Associations between markers and traits were analysed with multiple marker regression. Five genome-wise significant QTL were detected: stature on BTA14 and 23, udder balance on BTA23, rear udder height on BTA11, and central ligament on BTA23. On BTA11 and 14 the suggested QTL positions for udder traits are at the same position as previously detected QTL for mastitis and somatic cell count.
Model-based Quantile Regression for Discrete Data

KAUST Repository

Padellini, Tullia; Rue, Haavard

2018-01-01

Quantile regression is a class of methods devoted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite
A Whole Genome Association Study on Meat Quality Traits Using High Density SNP Chips in a Cross between Korean Native Pig and Landrace

Directory of Open Access Journals (Sweden)

K.-T Lee

2012-11-01

A whole genome association (WGA) study was performed to detect significant polymorphisms for meat quality traits in an F2 cross population (N = 478) that was generated with Korean native pig sires and Landrace dams at the National Livestock Research Institute, Songwhan, Korea. The animals were genotyped using Illumina porcine 60k SNP beadchips, in which a set of 46,865 SNPs were available for the WGA analyses on ten carcass quality traits: live weight, crude protein, crude lipids, crude ash, water holding capacity, drip loss, shear force, CIE L, CIE a and CIE b. Phenotypes were regressed on additive and dominance effects for each SNP using a simple linear regression model, after adjusting for sex, sire and slaughter stage as fixed effects. With the significant SNPs for each trait (p<0.001), a stepwise regression procedure was applied to determine the best set of SNPs with the additive and/or dominance effects. A total of 106 SNPs, or quantitative trait loci (QTL), were detected, and about 32 to 66% of the total phenotypic variation was explained by the significant SNPs for each trait. The QTL were identified in most porcine chromosomes (SSCs), with the majority of the QTL detected in SSCs 1, 2, 12, 13, 14 and 16. Several QTL clusters were identified on SSCs 12, 16 and 17, and a cluster of QTL influencing crude protein, crude lipid, drip loss, shear force, CIE a and CIE b was located between 20 and 29 Mb of SSC12. A pleiotropic QTL for drip loss, CIE L and CIE b was also detected on SSC16. These QTL need to be validated in commercial pig populations for genetic improvement in meat quality via marker-assisted selection.
Linear Regression Analysis

CERN Document Server

Seber, George A F

2012-01-01

Concise, mathematically clear, and comprehensive treatment of the subject. Expanded coverage of diagnostics and methods of model fitting. Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models. More than 200 problems throughout the book plus outline solutions for the exercises. This revision has been extensively class-tested.
Moderation analysis using a two-level regression model.

Science.gov (United States)

Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

2014-10-01

Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
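As a rough illustration of the moderated multiple regression (MMR) baseline described in this abstract, the following Python sketch fits a model with a product term by least squares. The simulated data, the coefficient values and all variable names are assumptions made for illustration only; the sketch does not implement the two-level NML estimator or the R package mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)  # predictor
z = rng.normal(size=n)  # moderator

# Assumed data-generating model: the slope of y on x depends on z
# (slope = 1.0 + 0.5 * z), with homoscedastic errors.
y = 2.0 + (1.0 + 0.5 * z) * x + 0.3 * z + rng.normal(scale=0.5, size=n)

# MMR design matrix: intercept, x, z and the product term x * z.
X = np.column_stack([np.ones(n), x, z, x * z])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# The coefficient on x * z estimates the moderation effect.
print(dict(zip(["intercept", "x", "z", "x*z"], beta_hat.round(3))))
```

Because the simulated errors are homoscedastic, this is the setting in which the abstract reports that least squares and the two-level model give essentially the same results.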
Regression analysis utilizing subjective evaluation of emotional experience in PET studies on emotions.

Science.gov (United States)

Aalto, Sargo; Wallius, Esa; Näätänen, Petri; Hiltunen, Jaana; Metsähonkala, Liisa; Sipilä, Hannu; Karlsson, Hasse

2005-09-01

A methodological study on subject-specific regression analysis (SSRA) exploring the correlation between the neural response and the subjective evaluation of emotional experience in eleven healthy females is presented. The target emotions, i.e., amusement and sadness, were induced using validated film clips, regional cerebral blood flow (rCBF) was measured using positron emission tomography (PET), and the subjective intensity of the emotional experience during the PET scanning was measured using a category ratio (CR-10) scale. Reliability analysis of the rating data indicated that the subjects rated the intensity of their emotional experience fairly consistently on the CR-10 scale (Cronbach alphas 0.70-0.97). A two-phase random-effects analysis was performed to ensure the generalizability and inter-study comparability of the SSRA results. Random-effects SSRAs using Statistical non-Parametric Mapping 99 (SnPM99) showed that rCBF correlated with the self-rated intensity of the emotional experience mainly in the brain regions that were identified in the random-effects subtraction analyses using the same imaging data. Our results give preliminary evidence of a linear association between the neural responses related to amusement and sadness and the self-evaluated intensity of the emotional experience in several regions involved in the emotional response. SSRA utilizing subjective evaluation of emotional experience turned out to be a feasible and promising method of analysis. It allows versatile exploration of the neurobiology of emotions and the neural correlates of actual and individual emotional experience. Thus, SSRA might be able to catch the idiosyncratic aspects of the emotional response better than traditional subtraction analysis.
Independent contrasts and PGLS regression estimators are equivalent.

Science.gov (United States)

Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary

2012-05-01

We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether or not and how to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is from both frequentist and Bayesian perspectives.
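The equivalence stated in this abstract can be checked numerically on a toy case. The Python sketch below is an assumption-laden illustration, not the authors' proof or code: it hard-codes a hypothetical balanced four-taxon tree ((A:1,B:1):1,(C:1,D:1):1), made-up trait values, and the standard Felsenstein contrast recursion, then compares the through-origin slope on contrasts with the GLS slope under a Brownian-motion covariance matrix.

```python
import numpy as np

def contrast(x1, v1, x2, v2):
    """Return (standardized contrast, ancestral value, extra branch length)."""
    c = (x1 - x2) / np.sqrt(v1 + v2)
    anc = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
    extra = v1 * v2 / (v1 + v2)
    return c, anc, extra

def pics(t):
    """Contrasts for the fixed tree ((A:1,B:1):1,(C:1,D:1):1); t = [A, B, C, D]."""
    c1, ab, e1 = contrast(t[0], 1.0, t[1], 1.0)
    c2, cd, e2 = contrast(t[2], 1.0, t[3], 1.0)
    # Internal branches are lengthened by the extra term from each cherry.
    c3, _, _ = contrast(ab, 1.0 + e1, cd, 1.0 + e2)
    return np.array([c1, c2, c3])

# Hypothetical trait values for tips A, B, C, D.
x = np.array([1.0, 2.5, 4.0, 3.0])
y = np.array([0.5, 2.0, 3.5, 4.5])

# Ordinary least squares through the origin on the contrasts.
cx, cy = pics(x), pics(y)
slope_pic = (cx @ cy) / (cx @ cx)

# GLS with Brownian-motion covariance: entries are shared path lengths from the root.
C = np.array([[2.0, 1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0, 2.0]])
X = np.column_stack([np.ones(4), x])
Ci = np.linalg.inv(C)
beta_gls = np.linalg.solve(X.T @ Ci @ X, X.T @ Ci @ y)
slope_gls = beta_gls[1]

print(slope_pic, slope_gls)
```

Both slopes use the same branch lengths for the explanatory and response variables, which is the condition under which the abstract says the estimator remains BLUE.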
Demonstration of a Fiber Optic Regression Probe

Science.gov (United States)

Korman, Valentin; Polzin, Kurt A.

2010-01-01

The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics make it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers change, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for
Caudal regression syndrome : a case report

International Nuclear Information System (INIS)

Lee, Eun Joo; Kim, Hi Hye; Kim, Hyung Sik; Park, So Young; Han, Hye Young; Lee, Kwang Hun

1998-01-01

Caudal regression syndrome is a rare congenital anomaly, which results from a developmental failure of the caudal mesoderm during the fetal period. We present a case of caudal regression syndrome composed of a spectrum of anomalies including sirenomelia, dysplasia of the lower lumbar vertebrae, sacrum, coccyx and pelvic bones, genitourinary and anorectal anomalies, and dysplasia of the lung, as seen during infantography and MR imaging.

Caudal regression syndrome : a case report

Energy Technology Data Exchange (ETDEWEB)

Lee, Eun Joo; Kim, Hi Hye; Kim, Hyung Sik; Park, So Young; Han, Hye Young; Lee, Kwang Hun [Chungang Gil Hospital, Incheon (Korea, Republic of)

1998-07-01

Caudal regression syndrome is a rare congenital anomaly, which results from a developmental failure of the caudal mesoderm during the fetal period. We present a case of caudal regression syndrome composed of a spectrum of anomalies including sirenomelia, dysplasia of the lower lumbar vertebrae, sacrum, coccyx and pelvic bones, genitourinary and anorectal anomalies, and dysplasia of the lung, as seen during infantography and MR imaging.
Random matrices and random difference equations

International Nuclear Information System (INIS)

Uppuluri, V.R.R.

1975-01-01

Mathematical models leading to products of random matrices and random difference equations are discussed. A one-compartment model with random behavior is introduced, and it is shown how the average concentration in the discrete time model converges to the exponential function. This is of relevance to understanding how radioactivity gets trapped in bone structure in blood-bone systems. The ideas are then generalized to two-compartment models and mammillary systems, where products of random matrices appear in a natural way. The appearance of products of random matrices in applications in demography and control theory is considered. Then random sequences motivated from the following problems are studied: constant pulsing and random decay models, random pulsing and constant decay models, and random pulsing and random decay models.
Correlation and simple linear regression.

Science.gov (United States)

Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G

2003-06-01

In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
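To make the contrast between the two coefficients concrete, here is a minimal Python sketch on made-up data (not the computed tomography data set used in the article). It computes the Pearson coefficient, a Spearman rho obtained as the Pearson coefficient of the ranks (the helper assumes no ties), and the least-squares slope and intercept of a simple linear regression.

```python
import numpy as np

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

def ranks(a):
    """Ranks 1..n of the values in a; assumes there are no ties."""
    r = np.empty(len(a))
    r[np.argsort(a)] = np.arange(1, len(a) + 1)
    return r

def spearman(x, y):
    """Spearman rho as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Illustrative data: y increases monotonically with x, but not linearly.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.exp(x)

r = pearson(x, y)
rho = spearman(x, y)

# Simple linear regression y = a + b * x fitted by least squares.
b = ((x - x.mean()) @ (y - y.mean())) / ((x - x.mean()) @ (x - x.mean()))
a = y.mean() - b * x.mean()

print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}, slope = {b:.3f}, intercept = {a:.3f}")
```

On data like these the rank-based coefficient reaches 1 because the relationship is perfectly monotone, while the Pearson coefficient stays below 1 because the relationship is not a straight line. The fitted slope equals r multiplied by the ratio of the standard deviations of y and x.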
bayesQR: A Bayesian Approach to Quantile Regression

Directory of Open Access Journals (Sweden)

Dries F. Benoit

2017-01-01

After its introduction by Koenker and Bassett (1978), quantile regression has become an important and popular tool to investigate the conditional response distribution in regression. The R package bayesQR contains a number of routines to estimate quantile regression parameters using a Bayesian approach based on the asymmetric Laplace distribution. The package contains functions for the typical quantile regression with continuous dependent variable, but also supports quantile regression for binary dependent variables. For both types of dependent variables, an approach to variable selection using the adaptive lasso approach is provided. For the binary quantile regression model, the package also contains a routine that calculates the fitted probabilities for each vector of predictors. In addition, functions for summarizing the results, creating traceplots, posterior histograms and drawing quantile plots are included. This paper starts with a brief overview of the theoretical background of the models used in the bayesQR package. The main part of this paper discusses the computational problems that arise in the implementation of the procedure and illustrates the usefulness of the package through selected examples.
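The link between the asymmetric Laplace distribution and quantiles can be sketched in a few lines of Python. This is not the bayesQR package or its MCMC routines; it is an intercept-only illustration on simulated data, and the function names, the fixed scale sigma = 1 and the sample size are assumptions. It shows that the location minimizing the asymmetric Laplace negative log-likelihood is the sample quantile, because that likelihood depends on the location only through the Koenker-Bassett check loss.

```python
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (u < 0))

def ald_neg_loglik(mu, y, tau, sigma=1.0):
    """Negative log-likelihood of an asymmetric Laplace working likelihood."""
    n = len(y)
    return -n * np.log(tau * (1 - tau) / sigma) + check_loss((y - mu) / sigma, tau).sum()

rng = np.random.default_rng(1)
y = rng.normal(size=501)  # simulated responses, no covariates
tau = 0.25                # quantile level of interest

# The objective is piecewise linear in mu, so a minimizer lies at a data point.
candidates = np.sort(y)
nll = np.array([ald_neg_loglik(m, y, tau) for m in candidates])
mu_hat = candidates[np.argmin(nll)]

print(mu_hat, np.quantile(y, tau))
```

With covariates, the same objective is minimized over regression coefficients instead of a single location; a Bayesian approach additionally places priors on those coefficients and samples from the resulting posterior.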
Metabolic Profiling of Adiponectin Levels in Adults: Mendelian Randomization Analysis.

Science.gov (United States)

Borges, Maria Carolina; Barros, Aluísio J D; Ferreira, Diana L Santos; Casas, Juan Pablo; Horta, Bernardo Lessa; Kivimaki, Mika; Kumari, Meena; Menon, Usha; Gaunt, Tom R; Ben-Shlomo, Yoav; Freitas, Deise F; Oliveira, Isabel O; Gentry-Maharaj, Aleksandra; Fourkala, Evangelia; Lawlor, Debbie A; Hingorani, Aroon D

2017-12-01

Adiponectin, a circulating adipocyte-derived protein, has insulin-sensitizing, anti-inflammatory, antiatherogenic, and cardiomyocyte-protective properties in animal models. However, the systemic effects of adiponectin in humans are unknown. Our aims were to define the metabolic profile associated with higher blood adiponectin concentration and investigate whether variation in adiponectin concentration affects the systemic metabolic profile. We applied multivariable regression in ≤5909 adults and Mendelian randomization (using cis-acting genetic variants in the vicinity of the adiponectin gene as instrumental variables) for analyzing the causal effect of adiponectin in the metabolic profile of ≤37 545 adults. Participants were largely European from 6 longitudinal studies and 1 genome-wide association consortium. In the multivariable regression analyses, higher circulating adiponectin was associated with higher high-density lipoprotein lipids and lower very-low-density lipoprotein lipids, glucose levels, branched-chain amino acids, and inflammatory markers. However, these findings were not supported by Mendelian randomization analyses for most metabolites. Findings were consistent between sexes and after excluding high-risk groups (defined by age and occurrence of previous cardiovascular event) and 1 study with admixed population. Our findings indicate that blood adiponectin concentration is more likely to be an epiphenomenon in the context of metabolic disease than a key determinant. © 2017 The Authors.
Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

Science.gov (United States)

Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

2017-12-01

The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
Random effects coefficient of determination for mixed and meta-analysis models.

Science.gov (United States)

Demidenko, Eugene; Sargent, James; Onega, Tracy

2012-01-01

The key feature of a mixed model is the presence of random effects. We have developed a coefficient, called the random effects coefficient of determination, [Formula: see text], that estimates the proportion of the conditional variance of the dependent variable explained by random effects. This coefficient takes values from 0 to 1 and indicates how strong the random effects are. The difference from the earlier suggested fixed effects coefficient of determination is emphasized. If [Formula: see text] is close to 0, there is weak support for random effects in the model because the reduction of the variance of the dependent variable due to random effects is small; consequently, random effects may be ignored and the model simplifies to standard linear regression. A value of [Formula: see text] away from 0 is evidence of variance reduction in support of the mixed model. If the random effects coefficient of determination is close to 1, the variance of the random effects is very large and the random effects turn into free fixed effects, so the model can be estimated using the dummy variable approach. We derive explicit formulas for [Formula: see text] in three special cases: the random intercept model, the growth curve model, and the meta-analysis model. Theoretical results are illustrated with three mixed model examples: (1) travel time to the nearest cancer center for women with breast cancer in the U.S., (2) cumulative time watching alcohol related scenes in movies among young U.S. teens, as a risk factor for early drinking onset, and (3) the classic example of the meta-analysis model for combination of 13 studies on tuberculosis vaccine.
Property-Composition-Temperature Modeling of Waste Glass Melt Data Subject to a Randomization Restriction

International Nuclear Information System (INIS)

Piepel, Gregory F.; Heredia-Langner, Alejandro; Cooley, Scott K.

2008-01-01

Properties such as viscosity and electrical conductivity of glass melts are functions of melt temperature as well as glass composition. When measuring such a property for several glasses, the property is typically measured at several temperatures for one glass, then at several temperatures for the next glass, and so on. This data-collection process involves a restriction on randomization, which is referred to as a split-plot experiment. The split-plot data structure must be accounted for in developing property-composition-temperature models and the corresponding uncertainty equations for model predictions. Instead of ordinary least squares (OLS) regression methods, generalized least squares (GLS) regression methods using restricted maximum likelihood (REML) estimation must be used. This article describes the methodology for developing property-composition-temperature models and corresponding prediction uncertainty equations using the GLS/REML regression approach. Viscosity data collected on 197 simulated nuclear waste glasses are used to illustrate the GLS/REML methods for developing a viscosity-composition-temperature model and corresponding equations for model prediction uncertainties. The correct results using GLS/REML regression are compared to the incorrect results obtained using OLS regression.
Background stratified Poisson regression analysis of cohort data.

Science.gov (United States)

Richardson, David B; Langholz, Bryan

2012-03-01

Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.
Variable importance in latent variable regression models

NARCIS (Netherlands)

Kvalheim, O.M.; Arneberg, R.; Bleie, O.; Rajalahti, T.; Smilde, A.K.; Westerhuis, J.A.

2014-01-01

The quality and practical usefulness of a regression model are a function of both interpretability and prediction performance. This work presents some new graphical tools for improved interpretation of latent variable regression models that can also assist in improved algorithms for variable
A Gaussian mixture copula model based localized Gaussian process regression approach for long-term wind speed prediction

International Nuclear Information System (INIS)

Yu, Jie; Chen, Kuilin; Mori, Junichi; Rashid, Mudassir M.

2013-01-01

Optimizing wind power generation and controlling the operation of wind turbines to efficiently harness the renewable wind energy is a challenging task due to the intermittency and unpredictable nature of wind speed, which has significant influence on wind power production. A new approach for long-term wind speed forecasting is developed in this study by integrating GMCM (Gaussian mixture copula model) and localized GPR (Gaussian process regression). The time series of wind speed is first classified into multiple non-Gaussian components through the Gaussian mixture copula model and then Bayesian inference strategy is employed to incorporate the various non-Gaussian components using the posterior probabilities. Further, the localized Gaussian process regression models corresponding to different non-Gaussian components are built to characterize the stochastic uncertainty and non-stationary seasonality of the wind speed data. The various localized GPR models are integrated through the posterior probabilities as the weightings so that a global predictive model is developed for the prediction of wind speed. The proposed GMCM–GPR approach is demonstrated using wind speed data from various wind farm locations and compared against the GMCM-based ARIMA (auto-regressive integrated moving average) and SVR (support vector regression) methods. In contrast to GMCM–ARIMA and GMCM–SVR methods, the proposed GMCM–GPR model is able to well characterize the multi-seasonality and uncertainty of wind speed series for accurate long-term prediction. - Highlights: • A novel predictive modeling method is proposed for long-term wind speed forecasting. • Gaussian mixture copula model is estimated to characterize the multi-seasonality. • Localized Gaussian process regression models can deal with the random uncertainty. • Multiple GPR models are integrated through Bayesian inference strategy. • The proposed approach shows higher prediction accuracy and reliability
Mapping of the DLQI scores to EQ-5D utility values using ordinal logistic regression.

Science.gov (United States)

Ali, Faraz Mahmood; Kay, Richard; Finlay, Andrew Y; Piguet, Vincent; Kupfer, Joerg; Dalgard, Florence; Salek, M Sam

2017-11-01

The Dermatology Life Quality Index (DLQI) and the European Quality of Life-5 Dimension (EQ-5D) are separate measures that may be used to gather health-related quality of life (HRQoL) information from patients. The EQ-5D is a generic measure from which health utility estimates can be derived, whereas the DLQI is a specialty-specific measure to assess HRQoL. To reduce the burden of multiple measures being administered and to enable a more disease-specific calculation of health utility estimates, we explored an established mathematical technique known as ordinal logistic regression (OLR) to develop an appropriate model to map DLQI data to EQ-5D-based health utility estimates. Retrospective data from 4010 patients were randomly divided five times into two groups for the derivation and testing of the mapping model. Split-half cross-validation was utilized, resulting in a total of ten ordinal logistic regression models for each of the five EQ-5D dimensions against age, sex, and all ten items of the DLQI. Using Monte Carlo simulation, predicted health utility estimates were derived and compared against those observed. This method was repeated for both OLR and a previously tested mapping methodology based on linear regression. The model was shown to be highly predictive and its repeated fitting demonstrated a stable model using OLR as well as linear regression. The mean differences between OLR-predicted health utility estimates and observed health utility estimates ranged from 0.0024 to 0.0239 across the ten modeling exercises, with an average overall difference of 0.0120 (a 1.6% underestimate, not of clinical importance). The modeling framework developed in this study will enable researchers to calculate EQ-5D health utility estimates from a specialty-specific study population, reducing patient and economic burden.
Academic performance of children born preterm: a meta-analysis and meta-regression.

Science.gov (United States)

Twilhaar, E Sabrina; de Kieviet, Jorrit F; Aarnoudse-Moens, Cornelieke Sh; van Elburg, Ruurd M; Oosterlaan, Jaap

2017-08-28

Advances in neonatal healthcare have resulted in decreased mortality after preterm birth but have not led to parallel decreases in morbidity. Academic performance provides insight into the outcomes and specific difficulties and needs of preterm children. To study academic performance in preterm children born in the antenatal steroids and surfactant era and possible moderating effects of perinatal and demographic factors. PubMed, Web of Science and PsycINFO were searched for peer-reviewed articles. Cohort studies with a full-term control group reporting standardised academic performance scores of preterm children (Academic test scores and special educational needs of preterm and full-term children were analysed using random effects meta-analysis. Random effects meta-regressions were performed to explore the predictive role of perinatal and demographic factors for between-study variance in effect sizes. The 17 eligible studies included 2390 preterm children and 1549 controls. Preterm children scored 0.71 SD below full-term peers on arithmetic (pacademic performance (p=0.006). Preterm children born in the antenatal steroids and surfactant era show considerable academic difficulties. Preterm children with bronchopulmonary dysplasia are at particular risk for poor academic outcome. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Regression: The Apple Does Not Fall Far From the Tree.

Science.gov (United States)

Vetter, Thomas R; Schober, Patrick

2018-05-15

Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
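Two of the ideas named in this tutorial abstract, simple linear regression and regression toward the mean, can be illustrated with a short simulation. The sketch below is not taken from the article: the data are synthetic, and the names `fit_line`, `true_level`, `first` and `second` are made up for illustration. Each simulated subject has a stable underlying level that is measured twice with independent noise.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


rng = random.Random(42)  # fixed seed so the example is repeatable

# Synthetic data: a stable underlying level plus independent measurement noise.
true_level = [rng.gauss(0.0, 1.0) for _ in range(5000)]
first = [t + rng.gauss(0.0, 1.0) for t in true_level]
second = [t + rng.gauss(0.0, 1.0) for t in true_level]

# Regress the second measurement on the first.
slope, intercept = fit_line(first, second)
```

Because the noise in the two measurements is independent, the fitted slope is well below 1: subjects with extreme first measurements tend to have second measurements closer to the average, which is regression toward the mean and not a real change in the subjects.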
Multiple regression and beyond: an introduction to multiple regression and structural equation modeling

CERN Document Server

Keith, Timothy Z

2014-01-01

Multiple Regression and Beyond offers a conceptually oriented introduction to multiple regression (MR) analysis and structural equation modeling (SEM), along with analyses that flow naturally from those methods. By focusing on the concepts and purposes of MR and related methods, rather than the derivation and calculation of formulae, this book introduces material to students more clearly, and in a less threatening way. In addition to illuminating content necessary for coursework, the accessibility of this approach means students are more likely to be able to conduct research using MR or SEM--and more likely to use the methods wisely. Covers both MR and SEM, while explaining their relevance to one another. Also includes path analysis, confirmatory factor analysis, and latent growth modeling. Figures and tables throughout provide examples and illustrate key concepts and techniques. For additional resources, please visit: http://tzkeith.com/.
Daily Feed Intake, Energy Intake, Growth Rate and Measures of Dietary Energy Efficiency of Pigs from Four Sire Lines Fed Diets with High or Low Metabolizable and Net Energy Concentrations

Directory of Open Access Journals (Sweden)

A. P. Schinckel

2012-03-01

A trial was conducted to: (i) evaluate the BW growth, energy intakes and energetic efficiency of pigs fed high and low density diets from 27 to 141 kg BW, (ii) evaluate sire line and sex differences when fed both diets, and (iii) compare ME to NE as a predictor of pig performance. The experiment had a replicated factorial arrangement of treatments including four sire lines, two sexes (2,192 barrows and 2,280 gilts), two dietary energy densities and a light or heavy target BW, 118 and 131.5 kg in replicates 1 to 6 and 127 and 140.6 kg in replicates 7 to 10. Pigs were allocated to a series of low energy (LE, 3.27 Mcal ME/kg) corn-soybean meal based diets with 16% wheat midds or high energy diets (HE, 3.53 to 3.55 Mcal ME/kg) with 4.5 to 4.95% choice white grease. All diets contained 6% DDGS. The HE and LE diets of each of the four phases were formulated to have equal lysine:Mcal ME ratios. Pigs were weighed and pen feed intake (11 or 12 pigs/pen) recorded at 28-d intervals. The barrow and gilt daily feed (DFI), ME (MEI) and NE (NEI) intake data were fitted to a Bridges function of BW. The BW data of each sex were fitted to a generalized Michaelis-Menten function of days of age. ME and NE required for maintenance (Mcal/d) were predicted using functions of BW (0.255 and 0.179 BW^0.60, respectively). Pigs fed LE diets had lower ADG (915 vs. 945 g/d, p<0.001) than pigs fed HE diets. Overall, DFI was greater (p<0.001) for pigs fed the LE diets (2.62 vs. 2.45 kg/d). However, no diet differences were observed for MEI (8.76 vs. 8.78 Mcal/d, p = 0.49) or NEI (6.39 vs. 6.44 Mcal/d, p = 0.13), thereby indicating that the pigs compensated for the decreased energy content of the diet. Overall ADG:DFI (0.362 vs. 0.377) and ADG:Mcal MEI (0.109 vs. 0.113) were less (p<0.001) for pigs fed LE compared to HE diets. Pigs fed HE diets had 3.6% greater ADG:Mcal MEI above maintenance and only 1.3% greater ADG:Mcal NEI (0.152 versus 0.150), therefore NEI is a more accurate predictor of
Quasi-experimental evidence on tobacco tax regressivity.

Science.gov (United States)

Koch, Steven F

2018-01-01

Tobacco taxes are known to reduce tobacco consumption and to be regressive, such that tobacco control policy may have the perverse effect of further harming the poor. However, if tobacco consumption falls faster amongst the poor than the rich, tobacco control policy can actually be progressive. We take advantage of persistent and committed tobacco control activities in South Africa to examine the household tobacco expenditure burden. For the analysis, we make use of two South African Income and Expenditure Surveys (2005/06 and 2010/11) that span a series of such tax increases and have been matched across the years, yielding 7806 matched pairs of tobacco consuming households and 4909 matched pairs of cigarette consuming households. By matching households across the surveys, we are able to examine both the regressivity of the household tobacco burden, and any change in that regressivity, and since tobacco taxes have been a consistent component of tobacco prices, our results also relate to the regressivity of tobacco taxes. Like previous research into cigarette and tobacco expenditures, we find that the tobacco burden is regressive; thus, so are tobacco taxes. However, we find that over the five-year period considered, the tobacco burden has decreased, and, most importantly, falls less heavily on the poor. Consequently, the tobacco burden and the tobacco tax are less regressive in 2010/11 than in 2005/06. Thus, increased tobacco taxes can, in at least some circumstances, reduce the financial burden that tobacco places on households. Copyright © 2017 Elsevier Ltd. All rights reserved.
Polylinear regression analysis in radiochemistry

International Nuclear Information System (INIS)

Kopyrin, A.A.; Terent'eva, T.N.; Khramov, N.N.

1995-01-01

A number of radiochemical problems have been formulated in the framework of polylinear regression analysis, which permits the use of conventional mathematical methods for their solution. The authors have considered features of the use of polylinear regression analysis for estimating the contributions of various sources to atmospheric pollution, for studying irradiated nuclear fuel, for estimating concentrations from spectral data, for measuring neutron fields of a nuclear reactor, for estimating crystal lattice parameters from X-ray diffraction patterns, for interpreting data of X-ray fluorescence analysis, for estimating complex formation constants, and for analyzing results of radiometric measurements. The problem of estimating the target parameters can be ill-posed for certain properties of the system under study. The authors showed the possibility of regularization by adding a fictitious set of data "obtained" from the orthogonal design. To estimate only a part of the parameters under consideration, the authors used incomplete rank models. In this case, it is necessary to take into account the possibility of confounding estimates. An algorithm for evaluating the degree of confounding is presented, which is realized using standard software for regression analysis.
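The idea of regularizing an ill-posed regression by appending fictitious observations can be sketched with the standard data-augmentation form of ridge regression. This is a swapped-in, well-known technique and not necessarily the authors' procedure: the fictitious rows here are a scaled identity matrix (an orthogonal design) with zero responses, and the example data, the `strength` value and all function names are hypothetical.

```python
import math


def normal_equations(rows, targets):
    """Build X'X and X'y from a list of design rows and responses."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    return xtx, xty


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (a[i][n] - s) / a[i][i]
    return x


def augment(rows, targets, strength):
    """Append one fictitious row per parameter: sqrt(strength) * identity, response 0."""
    p = len(rows[0])
    scale = math.sqrt(strength)
    fake_rows = [[scale if i == j else 0.0 for j in range(p)] for i in range(p)]
    return rows + fake_rows, targets + [0.0] * p


# Hypothetical rank-deficient design: the second column is twice the first,
# so ordinary least squares has no unique solution.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
targets = [5.0, 10.0, 15.0]

aug_rows, aug_targets = augment(rows, targets, strength=0.1)
xtx, xty = normal_equations(aug_rows, aug_targets)
coefficients = solve(xtx, xty)
```

Least squares on the augmented data minimizes the residual sum of squares plus `strength` times the squared length of the coefficient vector, so the normal equations become solvable even though the original design is rank deficient. Calling `solve` on the normal equations of the original `rows` raises `ValueError`.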
Influence diagnostics in meta-regression model.

Science.gov (United States)

Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua

2017-09-01

This paper studies influence diagnostics in the meta-regression model, including case deletion diagnostics and local influence analysis. We derive the subset deletion formulae for the estimation of the regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residuals and leverage measures are defined. The local influence analysis based on the case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme is explored. We introduce a method that simultaneously perturbs responses, covariate, and within-variance to obtain the local influence measure, which has the advantage of being able to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.
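As a rough illustration of case deletion, the sketch below refits a random-effects model with each study left out in turn and records how the pooled estimate and the heterogeneity variance change. It is a simplification and not the paper's derivation: it covers only the intercept-only case (no covariates), it refits by brute force instead of using closed-form deletion formulae, it uses the usual DerSimonian and Laird moment estimator, and the study data and function names are invented.

```python
def dersimonian_laird(effects, variances):
    """Return the random-effects pooled estimate and the DL heterogeneity variance."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q statistic around the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # truncated at zero
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return pooled, tau2


def case_deletion(effects, variances):
    """For each study, the change in (pooled estimate, tau2) when it is removed."""
    full_pooled, full_tau2 = dersimonian_laird(effects, variances)
    changes = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        pooled, tau2 = dersimonian_laird(rest_effects, rest_variances)
        changes.append((full_pooled - pooled, full_tau2 - tau2))
    return changes


# Invented data: the fourth study reports a much larger effect than the rest.
effects = [0.10, 0.15, 0.12, 0.90, 0.08]
variances = [0.02, 0.03, 0.025, 0.02, 0.03]

changes = case_deletion(effects, variances)
most_influential = max(range(len(effects)), key=lambda i: abs(changes[i][0]))
```

Here `most_influential` is the index of the study whose removal moves the pooled estimate the most. Brute-force refitting is adequate for a handful of studies; closed-form deletion formulae such as those the paper derives avoid refitting.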
Random walk on random walks

NARCIS (Netherlands)

Hilário, M.; Hollander, den W.Th.F.; Sidoravicius, V.; Soares dos Santos, R.; Teixeira, A.

2014-01-01

In this paper we study a random walk in a one-dimensional dynamic random environment consisting of a collection of independent particles performing simple symmetric random walks in a Poisson equilibrium with density ρ ∈ (0,∞). At each step the random walk performs a nearest-neighbour jump, moving to
