Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2017-07-26
A standard approach in test evaluation is to compare results of the assay under validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), with R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays prior to implementation in the clinical molecular laboratory.
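The Bland-Altman computation the abstract refers to is simple to state: the bias is the mean of the paired differences and the 95% limits of agreement are bias ± 1.96 × SD of the differences. A minimal sketch (illustrative data, not from the paper):

```python
import math

def bland_altman(a, b):
    """Bland-Altman agreement statistics for paired measurements a, b.

    Returns (bias, lower_loa, upper_loa): the mean difference and the
    95% limits of agreement (bias +/- 1.96 * SD of the differences).
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical variant-allele-frequency values from two NGS assays
ref = [0.12, 0.25, 0.40, 0.55, 0.71]
new = [0.14, 0.26, 0.43, 0.56, 0.74]
bias, lo, hi = bland_altman(ref, new)
```

A non-zero bias points to constant error; a trend of the differences against the pairwise means (checked separately, e.g. by regressing the differences on the means) points to proportional error.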
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
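For the correlation tests mentioned above, the approximate power of a two-sided test of H0: ρ = 0 can be cross-checked with the textbook Fisher z approximation (a rough approximation, not G*Power's exact routine):

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def correlation_power(r, n, alpha_z=1.959963984540054):
    """Approximate power of a two-sided alpha = 0.05 test of rho = 0
    at sample size n, assuming the true correlation is r, using the
    Fisher z transform; the opposite tail is negligible and ignored."""
    noncentrality = math.atanh(r) * math.sqrt(n - 3)
    return normal_cdf(noncentrality - alpha_z)

p = correlation_power(0.3, 100)  # power for r = 0.3 with n = 100
```

With r = 0.3 and n = 100 this gives roughly 0.86, in line with the usual rule that detecting "medium" correlations needs on the order of a hundred observations.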
Symbolic regression of generative network models
Menezes, Telmo
2014-01-01
Networks are a powerful abstraction with applicability to a variety of scientific fields. Models explaining their morphology and growth processes permit a wide range of phenomena to be more systematically analysed and understood. At the same time, creating such models is often challenging and requires insights that may be counter-intuitive. Yet there currently exists no general method to arrive at better models. We have developed an approach to automatically detect realistic decentralised network growth models from empirical data, employing a machine learning technique inspired by natural selection and defining a unified formalism to describe such models as computer programs. As the proposed method is completely general and does not assume any pre-existing models, it can be applied "out of the box" to any given network. To validate our approach empirically, we systematically rediscover pre-defined growth laws underlying several canonical network generation models and credible laws for diverse real-world netwo...
The use of GLS regression in regional hydrologic analyses
Griffis, V. W.; Stedinger, J. R.
2007-09-01
To estimate flood quantiles and other statistics at ungauged sites, many organizations employ an iterative generalized least squares (GLS) regression procedure to estimate the parameters of a model of the statistic of interest as a function of basin characteristics. The GLS regression procedure accounts for differences in available record lengths and spatial correlation in concurrent events by using an estimator of the sampling covariance matrix of available flood quantiles. Previous studies by the US Geological Survey using the LP3 distribution have neglected the impact of uncertainty in the weighted skew on quantile precision. The needed relationship is developed here and its use is illustrated in a regional flood study with 162 sites from South Carolina. The performance of a pooled regression model is compared to separate models for each hydrologic region: statistical tests recommend an interesting hybrid of the two which is both surprising and hydrologically reasonable. The statistical analysis is augmented with new diagnostic metrics, including a condition number to check for multicollinearity, a new pseudo-R² appropriate for use with GLS regression, and two error variance ratios. GLS regression for the standard deviation demonstrates that again a hybrid model is attractive, and that GLS rather than an OLS or WLS analysis is appropriate for the development of regional standard deviation models.
Analysing inequalities in Germany a structured additive distributional regression approach
Silbersdorff, Alexander
2017-01-01
This book seeks new perspectives on the growing inequalities that our societies face, putting forward Structured Additive Distributional Regression as a means of statistical analysis that circumvents the common problem of analytical reduction to simple point estimators. This new approach allows the discrepancy between individuals' realities and the abstract representation of those realities that arises from using the arithmetic mean alone to be explicitly taken into consideration. In turn, the method is applied to the question of economic inequality in Germany.
USE OF THE SIMPLE LINEAR REGRESSION MODEL IN MACRO-ECONOMICAL ANALYSES
Constantin ANGHELACHE
2011-10-01
The article presents the fundamental aspects of linear regression as a toolbox for macroeconomic analyses. It describes the estimation of the parameters, the statistical tests used, and homoscedasticity and heteroskedasticity. The use of econometric instruments in macroeconomics is an important factor that guarantees the quality of the models, analyses, results and possible interpretations that can be drawn at this level.
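The parameter estimation described above can be sketched for the simple model y = a + bx + e with the usual closed-form least-squares formulas (a generic illustration, not the article's own data):

```python
def ols_simple(x, y):
    """Closed-form least-squares estimates for y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx          # slope
    a = my - b * mx        # intercept
    return a, b

# Noiseless data generated from y = 1.5 + 2*x, so the estimates
# should recover the true parameters exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.5 + 2.0 * xi for xi in xs]
a, b = ols_simple(xs, ys)
```

Under heteroskedasticity these point estimates remain unbiased, but the usual standard errors are wrong, which is why the diagnostic tests the article discusses matter.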
A Methodology for Generating Placement Rules that Utilizes Logistic Regression
Wurtz, Keith
2008-01-01
The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…
Analysing count data of Butterflies communities in Jasin, Melaka: A Poisson regression analysis
Afiqah Muhamad Jamil, Siti; Asrul Affendi Abdullah, M.; Kek, Sie Long; Nor, Maria Elena; Mohamed, Maryati; Ismail, Norradihah
2017-09-01
Count outcomes typically have distributions highly skewed to the right, as they are often characterized by a large number of zeros. The butterfly community data were taken from Jasin, Melaka and consist of 131 subject visits. In this paper, the count data of butterfly communities are analysed with Poisson regression, as it is assumed to be better suited to the counting process. This research paper is about analysing count data with zero observations for ecological inference on butterfly communities in Jasin, Melaka using Poisson regression analysis. Software for Poisson regression is readily available and is becoming more widely used in many fields of research; here the data were analysed using SAS software. The purpose of the analysis comprised the framework of identifying the concerns. In addition, by using Poisson regression analysis, the study determines the fitness of the data, assessing the reliability of using the count data. The findings indicate that the highest and lowest numbers of subjects come from the third (Nymphalidae) and fifth (Hesperidae) families, respectively, and that the Poisson distribution seems to fit the zero values.
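Poisson regression as used above models log E[y] as a linear function of covariates and is fitted by maximum likelihood. A minimal Newton-Raphson fit for the intercept-only case (an illustrative sketch, not the SAS procedure the authors used) looks like:

```python
import math

def poisson_fit_intercept(y, iters=50):
    """ML fit of the intercept-only Poisson model log(mu) = b0.

    Newton-Raphson on the log-likelihood; the analytic MLE for this
    model is log(mean(y)), so the fit can be checked against it.
    """
    b0 = 0.0
    for _ in range(iters):
        mu = math.exp(b0)
        score = sum(yi - mu for yi in y)  # d logL / d b0
        info = mu * len(y)                # Fisher information
        b0 += score / info
    return b0

counts = [0, 1, 2, 3, 0, 5, 1, 0]  # hypothetical butterfly counts
b0 = poisson_fit_intercept(counts)
```

The same Newton/IRLS scheme generalizes to covariates by replacing the scalar score and information with their vector and matrix analogues.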
Tu, Y-K; Kellett, M; Clerehugh, V; Gilthorpe, M S
2005-10-01
Multivariable analysis is a widely used statistical methodology for investigating associations amongst clinical variables. However, the problems of collinearity and multicollinearity, which can give rise to spurious results, have in the past frequently been disregarded in dental research. This article illustrates and explains the problems which may be encountered, in the hope of increasing awareness and understanding of these issues, thereby improving the quality of the statistical analyses undertaken in dental research. Three examples from different clinical dental specialties are used to demonstrate how to diagnose the problem of collinearity/multicollinearity in multiple regression analyses and to illustrate how collinearity/multicollinearity can seriously distort the model development process. Lack of awareness of these problems can give rise to misleading results and erroneous interpretations. Multivariable analysis is a useful tool for dental research, though only if its users thoroughly understand the assumptions and limitations of these methods. It would benefit evidence-based dentistry enormously if researchers were more aware both of the complexities involved in multiple regression when using these methods and of the need for expert statistical consultation in developing study design and selecting appropriate statistical methodologies.
Performance Evaluation of Button Bits in Coal Measure Rocks by Using Multiple Regression Analyses
Su, Okan
2016-02-01
Electro-hydraulic and jumbo drills are commonly used for underground coal mines and tunnel drives for the purpose of blasthole drilling and rock bolt installations. Not only machine parameters but also environmental conditions have significant effects on drilling. This study characterizes the performance of button bits during blasthole drilling in coal measure rocks by using multiple regression analyses. The penetration rate of jumbo and electro-hydraulic drills was measured in the field by employing bits in different diameters and the specific energy of the drilling was calculated at various locations, including highway tunnels and underground roadways of coal mines. Large block samples were collected from each location at which in situ drilling measurements were performed. Then, the effects of rock properties and machine parameters on the drilling performance were examined. Multiple regression models were developed for the prediction of the specific energy of the drilling and the penetration rate. The results revealed that hole area, impact (blow) energy, blows per minute of the piston within the drill, and some rock properties, such as the uniaxial compressive strength (UCS) and the drilling rate index (DRI), influence the drill performance.
Growth regression models at two generations of selected populations Alabio ducks
L Hardi Prasetyo
2007-12-01
A selection process to increase egg production of Alabio ducks was conducted at Balai Penelitian Ternak, Ciawi-Bogor. The selection aimed at increasing production; however, observation of the growth of the selected ducks was necessary, since the early growth stage (0-8 wks) determines performance during the laying period. This paper presents the growth models and coefficients of determination for two generations of selected Alabio ducks. Body weights were observed weekly on 363 ducks from F1 and 356 ducks from F2 between 0-8 weeks, and then fortnightly until 16 weeks. Growth curves were analysed using regression models relating age to body weight for each population. Selection of the best-fitting model was based on a large coefficient of determination (R²), a small MSE, and the significance level of the regression coefficients. Results showed that a cubic polynomial regression was the best fit for both populations: Y = 56.31 - 1.44X + 0.64X² - 0.005X³ for F1 and Y = 43.05 + 0.96X + 0.69X² - 0.0056X³ for F2. The values of R² were 0.9466 for F1 and 0.9243 for F2, and the values of MSE were 11.586 for F1 and 19.978 for F2. The growth of F1 was better during the starter period, but F2 was better during the grower period.
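Fitting and scoring such a cubic growth curve by R² can be sketched with numpy; here the synthetic weights are generated from the reported F1 equation, so the fit should be exact (an illustration, not the original duck data):

```python
import numpy as np

def fit_cubic_r2(age, weight):
    """Fit weight = b0 + b1*age + b2*age^2 + b3*age^3 by least squares.

    Returns (coefficients highest-power-first, R^2)."""
    coefs = np.polyfit(age, weight, 3)
    pred = np.polyval(coefs, age)
    ss_res = np.sum((weight - pred) ** 2)
    ss_tot = np.sum((weight - np.mean(weight)) ** 2)
    return coefs, 1.0 - ss_res / ss_tot

age = np.arange(0.0, 17.0)  # weeks 0-16
# Weights generated from the reported F1 curve (noiseless)
weight = 56.31 - 1.44 * age + 0.64 * age**2 - 0.005 * age**3
coefs, r2 = fit_cubic_r2(age, weight)
```

With real, noisy weights, R² and MSE from such fits are what the paper compares across candidate models.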
The number of subjects per variable required in linear regression analyses.
Austin, Peter C; Steyerberg, Ewout W
2015-06-01
To determine the number of independent variables that can be included in a linear regression model, we used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression coefficients and standard errors, on the empirical coverage of estimated confidence intervals, and on the accuracy of the estimated R² of the fitted model. A minimum of approximately two SPV tended to result in estimation of regression coefficients with relative bias of less than 10%. Furthermore, with this minimum number of SPV, the standard errors of the regression coefficients were accurately estimated and estimated confidence intervals had approximately the advertised coverage rates. A much higher number of SPV was necessary to minimize bias in estimating the model R², although adjusted R² estimates behaved well. The bias in estimating the model R² statistic was inversely proportional to the magnitude of the proportion of variation explained by the population regression model. Linear regression models require only two SPV for adequate estimation of regression coefficients, standard errors, and confidence intervals.
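The simulation design can be sketched as follows (a toy version with assumed settings, not the authors' full protocol): draw datasets with a given number of subjects per predictor, fit OLS, and compute the relative bias of a coefficient across replications.

```python
import numpy as np

def spv_relative_bias(spv, n_vars=5, n_sims=500, beta=1.0, seed=0):
    """Monte Carlo relative bias (%) of the first OLS coefficient when
    each simulated dataset has spv subjects per predictor variable."""
    rng = np.random.default_rng(seed)
    n = spv * n_vars
    estimates = []
    for _ in range(n_sims):
        X = rng.standard_normal((n, n_vars))
        y = beta * X[:, 0] + rng.standard_normal(n)
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        estimates.append(coef[0])
    return 100.0 * (np.mean(estimates) - beta) / beta

bias_pct = spv_relative_bias(10)  # 10 subjects per variable
```

Since OLS coefficients are unbiased, the relative bias stays small even at low SPV, consistent with the paper's finding that around two SPV already keeps it under 10%.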
Analysing the forward premium anomaly using a Logistic Smooth Transition Regression model.
Sofiane Amri
2008-01-01
Several researchers have suggested that exchange rates may be characterized by nonlinear behaviour. This paper examines these nonlinearities and asymmetries and estimates a Logistic Smooth Transition Regression (LSTR) version of the Fama regression with the risk-adjusted forward premium as the transition variable. Results confirm the existence of nonlinear dynamics in the relationship between the spot exchange rate differential and the forward premium for all the currencies of the sample and for all maturities (three and...
Li, Spencer D.
2011-01-01
Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…
Tybjærg-Hansen, Anne
2009-01-01
Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements of t...
Giuliano de Oliveira Freitas
2013-10-01
PURPOSE: To determine linear regression models between Alpins descriptive indices and Thibos astigmatic power vectors (APV), assessing the validity and strength of such correlations. METHODS: This case series prospectively assessed 62 eyes of 31 consecutive cataract patients with preoperative corneal astigmatism between 0.75 and 2.50 diopters in both eyes. Patients were randomly assorted between two phacoemulsification groups: one assigned to receive an AcrySof® Toric intraocular lens (IOL) in both eyes and another assigned to have an AcrySof Natural IOL associated with limbal relaxing incisions, also in both eyes. All patients were reevaluated postoperatively at 6 months, when refractive astigmatism analysis was performed using both the Alpins and Thibos methods. The ratio between Thibos postoperative APV and preoperative APV (APVratio) and its linear regression to the Alpins percentage of success of astigmatic surgery, percentage of astigmatism corrected, and percentage of astigmatism reduction at the intended axis were assessed. RESULTS: A significant negative correlation between the ratio of post- and preoperative Thibos APV (APVratio) and the Alpins percentage of success (%Success) was found (Spearman's ρ = -0.93); the linear regression is given by the following equation: %Success = (-APVratio + 1.00) × 100. CONCLUSION: The linear regression we found between APVratio and %Success permits a validated mathematical inference concerning the overall success of astigmatic surgery.
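The reported regression line is simple to apply; as a sketch (using the equation as printed, with the caveat that it is specific to this study's patients):

```python
def pct_success(apv_ratio):
    """Alpins %Success predicted from the Thibos APV ratio, per the
    study's fitted line: %Success = (-APVratio + 1.00) * 100."""
    return (-apv_ratio + 1.00) * 100.0

# An APV ratio of 0 (astigmatism fully eliminated) predicts 100% success;
# a ratio of 1 (no postoperative change) predicts 0%.
full = pct_success(0.0)
none = pct_success(1.0)
```

The line simply rescales the APV ratio, which is why the authors can treat it as a direct translation between the Thibos and Alpins summaries.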
Check-all-that-apply data analysed by Partial Least Squares regression
Rinnan, Åsmund; Giacalone, Davide; Frøst, Michael Bom
2015-01-01
Check-all-that-apply (CATA) data are analysed by multivariate techniques. CATA data can be analysed with the CATA responses set as either the X or the Y block. The former is the PLS-Discriminant Analysis (PLS-DA) version, while the latter is the ANOVA-PLS (A-PLS) version. We investigated the difference between these two approaches, concluding...
Scott, Neil W; Fayers, Peter M; Aaronson, Neil K
2010-01-01
Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application.
Analyses of Developmental Rate Isomorphy in Ectotherms: Introducing the Dirichlet Regression.
David S Boukal
Temperature drives development in insects and other ectotherms because their metabolic rate and growth depend directly on thermal conditions. However, relative durations of successive ontogenetic stages often remain nearly constant across a substantial range of temperatures. This pattern, termed 'developmental rate isomorphy' (DRI) in insects, appears to be widespread, and reported departures from DRI are generally very small. We show that these conclusions may be due to the caveats hidden in the statistical methods currently used to study DRI. Because the DRI concept is inherently based on proportional data, we propose that Dirichlet regression applied to individual-level data is an appropriate statistical method to critically assess DRI. As a case study we analyze data on five aquatic and four terrestrial insect species. We find that results obtained by Dirichlet regression are consistent with DRI violation in at least eight of the studied species, although standard analysis detects significant departure from DRI in only four of them. Moreover, the departures from DRI detected by Dirichlet regression are consistently much larger than previously reported. The proposed framework can also be used to infer whether observed departures from DRI reflect life history adaptations to size- or stage-dependent effects of varying temperature. Our results indicate that the concept of DRI in insects and other ectotherms should be critically re-evaluated and put in a wider context, including the concept of 'equiproportional development' developed for copepods.
B Gadžurić, Slobodan; O Podunavac Kuzmanović, Sanja; B Vraneš, Milan; Petrin, Marija; Bugarski, Tatjana; Kovačević, Strahinja Z
2016-01-01
The purpose of this work is to promote and facilitate forensic profiling and chemical analysis of illicit drug samples in order to determine their origin, methods of production and transfer through the country. The article is based on the gas chromatography analysis of heroin samples seized at three different locations in Serbia. A chemometric approach with appropriate statistical tools (multiple linear regression (MLR), hierarchical cluster analysis (HCA) and the Wald-Wolfowitz runs (WWR) test) was applied to the chromatographic data of the heroin samples in order to correlate and examine their geographic origin. The best MLR models were further validated by the leave-one-out technique as well as by the calculation of basic statistical parameters for the established models. To confirm the predictive power of the models, an external set of heroin samples was used. High agreement between experimental and predicted values of the acetylthebaol and diacetylmorphine peak ratio, obtained in the validation procedure, indicated the good quality of the derived MLR models. The WWR test showed which of the examined heroin samples come from the same population, and HCA was applied in order to give an overview of the similarities among the studied heroin samples.
Drop-Weight Impact Test on U-Shape Concrete Specimens with Statistical and Regression Analyses
Xue-Chao Zhu
2015-09-01
According to the principle and method of the drop-weight impact test, the impact resistance of concrete was measured using self-designed U-shape specimens and a newly designed drop-weight impact test apparatus. A series of drop-weight impact tests were carried out with four different masses of drop hammers (0.875, 0.8, 0.675 and 0.5 kg). The test results show that the impact resistance results fail to follow a normal distribution. As expected, U-shaped specimens can predetermine the location of the cracks very well, and it is also easy to record crack propagation during the test. The maximum coefficient of variation in this study is 31.2%, lower than the values obtained from the American Concrete Institute (ACI) impact tests in the literature. Regression analysis shows a good linear relationship between the first-crack and ultimate-failure impact resistance. It can be suggested that a minimum number of specimens is required to reliably measure the properties of the material, based on the observed levels of variation.
Forecasting municipal solid waste generation using prognostic tools and regression analysis.
Ghinea, Cristina; Drăgoi, Elena Niculina; Comăniţă, Elena-Diana; Gavrilescu, Marius; Câmpean, Teofil; Curteanu, Silvia; Gavrilescu, Maria
2016-11-01
For adequate planning of waste management systems, accurate forecasting of waste generation is an essential step, since various factors can affect waste trends. Predictive and prognostic models are useful tools and a reliable support for decision-making processes. In this paper, indicators such as number of residents, population age, urban life expectancy and total municipal solid waste were used as input variables in prognostic models in order to predict the amount of solid waste fractions. We applied the Waste Prognostic Tool, regression analysis and time series analysis to forecast municipal solid waste generation and composition, considering the Iasi, Romania case study. Regression equations were determined for six solid waste fractions (paper, plastic, metal, glass, biodegradable and other waste). Accuracy measures were calculated, and the results showed that the S-curve trend model is the most suitable for municipal solid waste (MSW) prediction.
Oka, Masayoshi; Wong, David W S
2016-06-01
Area-based measures of neighborhood characteristics simply derived from enumeration units (e.g., census tracts or block groups) ignore the potential of spatial spillover effects, and thus incorporating such measures into multilevel regression models may underestimate the neighborhood effects on health. To overcome this limitation, we describe the concept and method of areal median filtering to spatialize area-based measures of neighborhood characteristics for multilevel regression analyses. The areal median filtering approach provides a means to specify or formulate "neighborhoods" as meaningful geographic entities by removing enumeration unit boundaries as absolute barriers and by pooling information from neighboring enumeration units. This spatializing process accounts for the potential of spatial spillover effects and also converts aspatial measures of neighborhood characteristics into spatial measures. From a conceptual and methodological standpoint, incorporating the derived spatial measures into multilevel regression analyses allows us to more accurately examine the relationships between neighborhood characteristics and health. To promote and set the stage for informative research in the future, we provide a few important conceptual and methodological remarks, and discuss possible applications, inherent limitations, and practical solutions for using the areal median filtering approach in the study of neighborhood effects on health.
Buishand, T. A.; Klein Tank, A. M. G.
1996-05-01
The precipitation amounts on wet days at De Bilt (the Netherlands) are linked to temperature and surface air pressure through advanced regression techniques. Temperature is chosen as a covariate to use the model for generating synthetic time series of daily precipitation in a CO2 induced warmer climate. The precipitation-temperature dependence can partly be ascribed to the phenomenon that warmer air can contain more moisture. Spline functions are introduced to reproduce the non-monotonous change of the mean daily precipitation amount with temperature. Because the model is non-linear and the variance of the errors depends on the expected response, an iteratively reweighted least-squares technique is needed to estimate the regression coefficients. A representative rainfall sequence for the situation of a systematic temperature rise is obtained by multiplying the precipitation amounts in the observed record with a temperature dependent factor based on a fitted regression model. For a temperature change of 3°C (reasonable guess for a doubled CO2 climate according to the present-day general circulation models) this results in an increase in the annual average amount of 9% (20% in winter and 4% in summer). An extended model with both temperature and surface air pressure is presented which makes it possible to study the additional effects of a potential systematic change in surface air pressure on precipitation.
Buck, J. A.; Underhill, P. R.; Morelli, J.; Krause, T. W.
2016-02-01
Nuclear steam generators (SGs) are a critical component for ensuring safe and efficient operation of a reactor. Life management strategies are implemented in which SG tubes are regularly inspected by conventional eddy current testing (ECT) and ultrasonic testing (UT) technologies to size flaws, and safe operating life of SGs is predicted based on growth models. ECT, the more commonly used technique, due to the rapidity with which full SG tube wall inspection can be performed, is challenged when inspecting ferromagnetic support structure materials in the presence of magnetite sludge and multiple overlapping degradation modes. In this work, an emerging inspection method, pulsed eddy current (PEC), is being investigated to address some of these particular inspection conditions. Time-domain signals were collected by an 8 coil array PEC probe in which ferromagnetic drilled support hole diameter, depth of rectangular tube frets and 2D tube off-centering were varied. Data sets were analyzed with a modified principal components analysis (MPCA) to extract dominant signal features. Multiple linear regression models were applied to MPCA scores to size hole diameter as well as size rectangular outer diameter tube frets. Models were improved through exploratory factor analysis, which was applied to MPCA scores to refine selection for regression models inputs by removing nonessential information.
Feest, Uljana
2016-08-01
This paper revisits the debate between Harry Collins and Allan Franklin, concerning the experimenters' regress. Focusing my attention on a case study from recent psychology (regarding experimental evidence for the existence of a Mozart Effect), I argue that Franklin is right to highlight the role of epistemological strategies in scientific practice, but that his account does not sufficiently appreciate Collins's point about the importance of tacit knowledge in experimental practice. In turn, Collins rightly highlights the epistemic uncertainty (and skepticism) surrounding much experimental research. However, I will argue that his analysis of tacit knowledge fails to elucidate the reasons why scientists often are (and should be) skeptical of other researchers' experimental results. I will present an analysis of tacit knowledge in experimental research that not only answers to this desideratum, but also shows how such skepticism can in fact be a vital enabling factor for the dynamic processes of experimental knowledge generation.
Bou Kheir, Rania; Greve, Mogens Humlekrog; Deroin, Jean-Paul
2013-01-01
Soil contamination by heavy metals has become a widespread, dangerous problem in many parts of the world, including Mediterranean environments. This is closely related to increased irrigation with waste waters and to the uncontrolled application of sewage sludge, industrial effluents, pesticides... coastal area situated in northern Lebanon using a geographic information system (GIS) and regression-tree analysis. The chosen area represents a typical case study of a Mediterranean coastal landscape with a deteriorating environment. Fifteen environmental parameters (parent material, soil type, pH, hydraulic conductivity, organic matter, stoniness ratio, soil depth, slope gradient, slope aspect, slope curvature, land cover/use, distance to drainage line, proximity to roads, nearness to cities, and surroundings to waste areas) were generated from satellite imageries, Digital Elevation Models (DEMs...
Cashman, Kevin D; Ritz, Christian; Kiely, Mairead; Odin Collaborators
2017-05-08
Dietary Reference Values (DRVs) for vitamin D have a key role in the prevention of vitamin D deficiency. However, despite adopting similar risk assessment protocols, estimates from authoritative agencies over the last 6 years have been diverse. This may have arisen from diverse approaches to data analysis. Modelling strategies for pooling of individual subject data from cognate vitamin D randomized controlled trials (RCTs) are likely to provide the most appropriate DRV estimates. Thus, the objective of the present work was to undertake the first-ever individual participant data (IPD)-level meta-regression, which is increasingly recognized as best practice, from seven winter-based RCTs (with 882 participants ranging in age from 4 to 90 years) of the vitamin D intake-serum 25-hydroxyvitamin D (25(OH)D) dose-response. Our IPD-derived estimates of vitamin D intakes required to maintain 97.5% of 25(OH)D concentrations >25, 30, and 50 nmol/L across the population are 10, 13, and 26 µg/day, respectively. In contrast, standard meta-regression analyses with aggregate data (as used by several agencies in recent years) from the same RCTs estimated that a vitamin D intake requirement of 14 µg/day would maintain 97.5% of 25(OH)D >50 nmol/L. These first IPD-derived estimates offer improved dietary recommendations for vitamin D because the underpinning modeling captures the between-person variability in response of serum 25(OH)D to vitamin D intake.
Flam-Zalcman, Rosely; Mann, Robert E; Stoduto, Gina; Nochajski, Thomas H; Rush, Brian R; Koski-Jännes, Anja; Wickens, Christine M; Thomas, Rita K; Rehm, Jürgen
2013-03-01
Brief interventions effectively reduce alcohol problems; however, it is controversial whether longer interventions result in greater improvement. This study aims to determine whether an increase in treatment for people with more severe problems resulted in better outcomes. We employed regression-discontinuity analyses to determine whether drinking driver clients (n = 22,277) in Ontario benefited when they were assigned to a longer treatment program (8-hour versus 16-hour) based on assessed addiction severity criteria. Assignment to the longer 16-hour program was based on two addiction severity measures derived from the Research Institute on Addictions Self-Inventory (RIASI) (meeting criteria for assignment based on either the total RIASI score or the score on the recidivism subscale). The main outcome measure was self-reported number of days of alcohol use during the 90 days preceding the six-month follow-up interview. We found significant reductions of one or two self-reported drinking days at the point of assignment, depending on the severity criterion used. These data suggest that more intensive treatment for alcohol problems may improve results for individuals with more severe problems.
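A regression-discontinuity design like the one above estimates the jump in the outcome at the assignment cutoff while controlling for the trend in the assignment variable. A minimal sketch on simulated data (not the RIASI study; the cutoff, effect size, and noise are invented):

```python
import numpy as np

# Simulated RD setup: clients with severity >= cutoff get the longer program;
# we estimate the discontinuity in drinking days at the cutoff, allowing the
# slope on severity to differ on each side.
rng = np.random.default_rng(1)
severity = rng.uniform(0, 20, 5000)
cutoff = 10.0
treated = (severity >= cutoff).astype(float)
# true model: outcome rises with severity, drops by 1.5 days for treated
drinking_days = 5 + 0.8 * severity - 1.5 * treated + rng.normal(0, 3, 5000)

centered = severity - cutoff
X = np.column_stack([np.ones_like(severity), centered, treated, centered * treated])
beta, *_ = np.linalg.lstsq(X, drinking_days, rcond=None)
print(round(beta[2], 2))  # estimated treatment effect at the cutoff
```

The coefficient on the treatment indicator is the estimated effect at the cutoff, which is the quantity the paper reports as a reduction of one to two drinking days.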
WANG Tijian(王体健); K. S. LAM; C. W. TSANG; S. C. KOT
2004-01-01
This paper investigates the variability and correlation of surface ozone (O3) and carbon monoxide (CO) observed at Cape D'Aguilar in Hong Kong from 1 January 1994 to 31 December 1995. Statistical analysis shows that the average O3 and CO mixing ratios during the two years are 32±17 ppbv and 305±191 ppbv, respectively. The O3/CO ratio ranges from 0.05 to 0.6 ppbv/ppbv, with its frequency peaking at 0.15. The raw dataset is divided into six groups using backward trajectory and cluster analyses. For data assigned to the same trajectory type, three groups are further sorted out based on CO and NOx mixing ratios. The correlation coefficients and slopes of O3/CO for the 18 groups are calculated using linear regression analysis. Finally, five kinds of air masses with different chemical features are identified: continental background (CB), marine background (MB), regional polluted continental (RPC), perturbed marine (PM), and local polluted (LP) air masses. Further studies indicate that O3 and CO in the continental and marine background air masses (CB and MB) are positively correlated because they are well mixed during long-range transport before arriving at the site. The negative correlation between O3 and CO in air mass LP is believed to be associated with heavy anthropogenic influence, resulting from enhancement by local sources, as indicated by high CO and NOx, and depletion of O3 when mixed with fresh emissions. The positive correlation in the perturbed marine air mass PM is consistent with low photochemical production of O3. The negative correlation found in the regional polluted continental air mass RPC differs from the observations at Oki Island in Japan due to the more complex O3 chemistry at Cape D'Aguilar.
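The per-group analysis above reduces to fitting an O3-on-CO regression within each air-mass group and reading off the sign of the slope. A toy sketch with simulated data (two invented groups standing in for a background and a locally polluted air mass):

```python
import numpy as np

# For each air-mass group, the slope of O3 on CO summarises their coupling:
# positive for well-mixed background air, negative where fresh emissions
# deplete O3. All values below are simulated, not Cape D'Aguilar data.
rng = np.random.default_rng(2)

def o3_co_slope(co, o3):
    """Least-squares slope and correlation of O3 (ppbv) on CO (ppbv)."""
    slope = np.polyfit(co, o3, 1)[0]
    r = np.corrcoef(co, o3)[0, 1]
    return slope, r

co_bg = rng.normal(200, 40, 300)
o3_bg = 10 + 0.15 * co_bg + rng.normal(0, 5, 300)   # background-like: positive
co_lp = rng.normal(600, 120, 300)
o3_lp = 80 - 0.05 * co_lp + rng.normal(0, 5, 300)   # local polluted: negative

s_bg, r_bg = o3_co_slope(co_bg, o3_bg)
s_lp, r_lp = o3_co_slope(co_lp, o3_lp)
print(round(s_bg, 3), round(s_lp, 3))
```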
Željko V. Račić
2010-12-01
This paper aims to present the specifics of the application of a multiple linear regression model. The economic (financial) crisis is analyzed in terms of gross domestic product as a function of the foreign trade balance on one hand and of credit cards, i.e. indebtedness of the population on this basis, on the other hand, in the USA from 1999 to 2008. We used an extended application model which shows how the analyst should run the whole development process of a regression model. This process began with simple statistical features and the application of regression procedures, and ended with residual analysis, intended for the study of compatibility between the data and the model settings. This paper also analyzes the values of some standard statistics used in the selection of an appropriate regression model. Testing of the model is carried out with the PASW Statistics 17 program.
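The workflow described above (fit a two-predictor linear model, then inspect residuals) can be sketched in a few lines. The variable names mirror the paper's setting, but the numbers are simulated, not the 1999-2008 US data:

```python
import numpy as np

# Multiple linear regression of GDP on trade balance and consumer credit,
# followed by a simple residual analysis. All data are simulated.
rng = np.random.default_rng(3)
n = 40
trade_balance = rng.normal(-500, 100, n)
credit_debt = rng.normal(800, 150, n)
gdp = 10000 + 2.0 * trade_balance + 5.0 * credit_debt + rng.normal(0, 200, n)

X = np.column_stack([np.ones(n), trade_balance, credit_debt])
beta, *_ = np.linalg.lstsq(X, gdp, rcond=None)
fitted = X @ beta
residuals = gdp - fitted

# R^2 plus a basic residual check (mean ~ 0, no extreme standardized residual)
ss_res = np.sum(residuals**2)
ss_tot = np.sum((gdp - gdp.mean())**2)
r2 = 1 - ss_res / ss_tot
std_resid = residuals / residuals.std(ddof=3)
print(round(r2, 3), round(abs(std_resid).max(), 2))
```

The residual step is the part the paper stresses: a high R^2 alone does not confirm that the data are compatible with the model settings.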
Piyawat Wuttichaikitcharoen
2014-08-01
Predicting sediment yield is necessary for good land and water management in any river basin. However, sediment data are sometimes unavailable or sparse, which renders estimating sediment yield a daunting task. The present study investigates the factors influencing suspended sediment yield using principal component analysis (PCA). Additionally, regression relationships for estimating suspended sediment yield, based on the key factors selected from the PCA, are developed. The PCA shows six components of key factors that together explain at least 86.7% of the variation of all variables. The regression models show that basin size, channel network characteristics, land use, basin steepness and rainfall distribution are the key factors affecting sediment yield. Validation of the regression relationships for estimating suspended sediment yield shows estimation errors ranging from −55% to +315% for suspended sediment yield and from −59% to +259% for area-specific suspended sediment yield. The proposed relationships may be considered useful for predicting suspended sediment yield in ungauged basins of Northern Thailand that have geologic, climatic and hydrologic conditions similar to the study area.
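The two-stage approach above (PCA on basin descriptors, then regression on the leading component scores) can be sketched as follows. The descriptors and the sediment-yield relationship are simulated stand-ins, not the Thai basin dataset:

```python
import numpy as np

# Stage 1: standardise basin descriptors and extract principal components from
# the correlation matrix. Stage 2: regress (simulated) sediment yield on the
# leading component scores.
rng = np.random.default_rng(4)
n = 60
basin_area = rng.lognormal(5, 1, n)
drainage_density = rng.normal(2, 0.5, n)
slope = rng.normal(15, 5, n)
rainfall = rng.normal(1200, 300, n)
X = np.column_stack([basin_area, drainage_density, slope, rainfall])

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
corr = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained = eigvals / eigvals.sum()           # variance explained per component
scores = Z @ eigvecs[:, :2]                   # scores on the two leading PCs

sediment_yield = 100 + 30 * scores[:, 0] + rng.normal(0, 10, n)
A = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(A, sediment_yield, rcond=None)
print(np.round(explained.cumsum()[:2], 2), np.round(coef[1], 1))
```

Regressing on component scores rather than the raw descriptors avoids the collinearity that correlated basin attributes would otherwise introduce.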
Analyses of optimum generation scenarios for sustainable power generation in Ghana
Albert K. Awopone
2017-02-01
This study examines optimum generation scenarios for Ghana from 2010 to 2040. The Open Source Energy Modelling System (OSeMOSYS), an optimisation model for long-term energy planning that is integrated in the Long-range Energy Alternatives Planning (LEAP) tool, was applied to model the generation system. The developed model was applied to the reference scenario (OPT), which examines the least-cost development of the system without any shift in policy. Three groups of policy scenarios were developed based on possible future energy policy directions in Ghana: energy emission targets, carbon taxes, and transmission and distribution loss improvements. The model was then used to simulate the development of technologies in each scenario up to 2040, and the level of renewable generation was examined. Finally, cost-benefit analyses of the policy scenarios, as well as their greenhouse gas mitigation potential, are discussed. The results show that suitable policies for clean power generation have an important role in CO2 mitigation in Ghana. The introduction of carbon minimisation policies will also promote diversification of the generation mix with higher penetration of renewable energy technologies, thus reducing overall fossil fuel generation in Ghana. It further indicates that significant greenhouse emission savings are achieved with improvements in transmission and distribution losses.
Generation of Natural Runoff Monthly Series at Ungauged Sites Using a Regional Regressive Model
Dario Pumo
2016-05-01
Many hydrologic applications require reliable estimates of runoff in river basins to face the widespread lack of data, both in time and in space. A regional method for the reconstruction of monthly runoff series is here developed and applied to Sicily (Italy). A simple modeling structure is adopted, consisting of a regression-based rainfall-runoff model with four model parameters, calibrated through a two-step procedure. Monthly runoff estimates are based on precipitation and temperature, and exploit the autocorrelation with runoff at the previous month. Model parameters are assessed by specific regional equations as a function of easily measurable physical and climate basin descriptors. The first calibration step identifies the set of parameters optimizing model performance at the level of the single basin. Such "optimal" sets are used in the second step, a regional regression analysis, to establish the regional equations for assessing model parameters as a function of basin attributes. All the gauged watersheds across the region were analyzed, selecting 53 basins for model calibration and using the other six basins exclusively for validation. Performances, quantitatively evaluated by different statistical indexes, demonstrate the model's ability to reproduce the observed hydrological time series at both monthly and coarser time resolutions. The methodology, which is easily transferable to other arid and semi-arid areas, provides a reliable tool for filling and reconstructing runoff time series at any gauged or ungauged basin of a region.
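The model structure described above (monthly runoff regressed on precipitation, temperature, and the previous month's runoff) can be sketched directly. The coefficients and the synthetic climate series below are invented for illustration, not the Sicilian calibration:

```python
import numpy as np

# Regression-based monthly rainfall-runoff model with a lag-1 runoff term.
# Generate a synthetic decade of monthly data, then recover the parameters
# by least squares, mirroring the single-basin calibration step.
rng = np.random.default_rng(5)
months = 120
precip = rng.gamma(2.0, 40.0, months)        # mm/month (simulated)
temp = 18 + 8 * np.sin(2 * np.pi * np.arange(months) / 12) + rng.normal(0, 1, months)

runoff = np.zeros(months)
for t in range(1, months):
    runoff[t] = (0.3 * precip[t] - 0.8 * temp[t]
                 + 0.5 * runoff[t - 1] + rng.normal(0, 5))

# Calibrate the four parameters (intercept, precip, temp, lag-1) by least squares
y = runoff[1:]
X = np.column_stack([np.ones(months - 1), precip[1:], temp[1:], runoff[:-1]])
params, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(params[1:], 2))   # estimated precip, temp, lag-1 coefficients
```

In the regional step, each of these fitted parameters would in turn be regressed on basin descriptors so the model can be applied at ungauged sites.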
Qi, Danyi; Roe, Brian E
2016-01-01
We estimate models of consumer food waste awareness and attitudes using responses from a national survey of U.S. residents. Our models are interpreted through the lens of several theories that describe how pro-social behaviors relate to awareness, attitudes and opinions. Our analysis of patterns among respondents' food waste attitudes yields a model with three principal components: one that represents perceived practical benefits households may lose if food waste were reduced, one that represents the guilt associated with food waste, and one that represents whether households feel they could be doing more to reduce food waste. We find our respondents express significant agreement that some perceived practical benefits are ascribed to throwing away uneaten food, e.g., nearly 70% of respondents agree that throwing away food after the package date has passed reduces the odds of foodborne illness, while nearly 60% agree that some food waste is necessary to ensure meals taste fresh. We identify that these attitudinal responses significantly load onto a single principal component that may represent a key attitudinal construct useful for policy guidance. Further, multivariate regression analysis reveals a significant positive association between the strength of this component and household income, suggesting that higher income households most strongly agree with statements that link throwing away uneaten food to perceived private benefits.
Hewett, Timothy E; Webster, Kate E; Hurd, Wendy J
2017-08-16
The evolution of clinical practice and medical technology has yielded an increasing number of clinical measures and tests to assess a patient's progression and return-to-sport readiness after injury. The plethora of available tests may be burdensome to clinicians in the absence of evidence that demonstrates the utility of a given measurement. Thus, there is a critical need to identify a discrete number of metrics to capture during clinical assessment to effectively and concisely guide patient care. The data sources included PubMed and PubMed Central articles on the topic. We therefore present a systematic approach to injury risk analyses and how this concept may be used in algorithms for risk analyses for primary anterior cruciate ligament (ACL) injury in healthy athletes and patients after ACL reconstruction. In this article, we present the five-factor maximum model, which states that in any predictive model, a maximum of 5 variables will contribute in a meaningful manner to any risk factor analysis. We demonstrate how this model already exists for prevention of primary ACL injury, how it may guide development of a second ACL injury risk analysis, and how it may be applied across the injury spectrum for development of injury risk analyses.
Betsey Dexter Dyer
2008-01-01
Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results.
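A toy version of that CART analysis: classify thermophile vs. mesophile from tetranucleotide frequencies. The frequencies below are simulated so that two columns (stand-ins for GAGA and AGGA, not the real genomic signatures) carry the signal:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Simulated genomic signatures: 256 tetramer frequency columns, with two
# informative columns shifted upward for thermophiles.
rng = np.random.default_rng(6)
n = 200
is_thermophile = rng.integers(0, 2, n)
freqs = rng.normal(1 / 256, 5e-4, (n, 256))
freqs[:, 0] += 2e-3 * is_thermophile   # "GAGA" column (hypothetical stand-in)
freqs[:, 1] += 2e-3 * is_thermophile   # "AGGA" column (hypothetical stand-in)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(freqs, is_thermophile)
accuracy = tree.score(freqs, is_thermophile)
top_feature = int(np.argmax(tree.feature_importances_))
print(round(accuracy, 2), top_feature)
```

Inspecting `feature_importances_` is the analogue of the paper's observation that GAGA and AGGA dominate the discriminating splits.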
Xie, Heping; Wang, Fuxing; Hao, Yanbin; Chen, Jiaxue; An, Jing; Wang, Yuxin; Liu, Huashan
2017-01-01
Cueing facilitates retention and transfer of multimedia learning. From the perspective of cognitive load theory (CLT), cueing has a positive effect on learning outcomes because of the reduction in total cognitive load and avoidance of cognitive overload. However, this has not been systematically evaluated. Moreover, the direct relationship between cue-related cognitive load and learning outcomes remains ambiguous. A meta-analysis and two subsequent meta-regression analyses were conducted to explore these issues. Subjective total cognitive load (SCL) and scores on a retention test and transfer test were selected as dependent variables. Through a systematic literature search, 32 eligible articles encompassing 3,597 participants were included in the SCL-related meta-analysis. Among them, 25 articles containing 2,910 participants were included in the retention-related meta-analysis and the following retention-related meta-regression, while 29 articles containing 3,204 participants were included in the transfer-related meta-analysis and the transfer-related meta-regression. The meta-analysis revealed a statistically significant cueing effect on subjective ratings of cognitive load (d = -0.11, 95% CI = [-0.19, -0.02], p …). The results suggest that cues in multimedia materials can indeed reduce SCL and promote learning outcomes, and that the more SCL is reduced by cues, the better the retention and transfer of multimedia learning.
Mazurana, Dyan; Benelli, Prisca; Walker, Peter
2013-07-01
Humanitarian aid remains largely driven by anecdote rather than by evidence. The contemporary humanitarian system has significant weaknesses with regard to data collection, analysis, and action at all stages of response to crises involving armed conflict or natural disaster. This paper argues that humanitarian actors can best determine and respond to vulnerabilities and needs if they use sex- and age-disaggregated data (SADD) and gender and generational analyses to help shape their assessments of crises-affected populations. Through case studies, the paper shows how gaps in information on sex and age limit the effectiveness of humanitarian response in all phases of a crisis. The case studies serve to show how proper collection, use, and analysis of SADD enable operational agencies to deliver assistance more effectively and efficiently. The evidence suggests that the employment of SADD and gender and generational analyses assists in saving lives and livelihoods in a crisis.
Umezawa, Osamu [Department of Mechanical Engineering and Materials Science, Yokohama National University 79-5 Tokiwadai, Hodogaya, Yokohama, 240-8501 (Japan); Morita, Motoaki [Department of Mechanical Engineering and Materials Science, Yokohama National University 79-5 Tokiwadai, Hodogaya, Yokohama, 240-8501, Japan and Now Tokyo University of Marine Science and Technology, Koto-ku, Tokyo 135-8533 (Japan); Yuasa, Takayuki [Department of Mechanical Engineering and Materials Science, Yokohama National University 79-5 Tokiwadai, Hodogaya, Yokohama, 240-8501, Japan and Now Nippon Steel and Sumitomo Metal, Kashima, 314-0014 (Japan); Morooka, Satoshi [Department of Mechanical Engineering and Materials Science, Yokohama National University 79-5 Tokiwadai, Hodogaya, Yokohama, 240-8501, Japan and Now Tokyo Metropolitan University, Hino, Tokyo 191-0065 (Japan); Ono, Yoshinori; Yuri, Tetsumi; Ogata, Toshio [National Institute for Materials Science, 1-2-1 Sengen, Tsukuba, 305-0047 (Japan)
2014-01-27
Subsurface crack initiation in high-cycle fatigue has been detected as (0001) transgranular facets in titanium alloys at low temperature. The discussion on subsurface crack generation is reviewed, focusing on analyses by neutron diffraction and a full-constraints model under tension mode, as well as crystallographic identification of the facet. The accumulated tensile stress along <0001> may be responsible for initial microcracking on (0001) and for crack opening.
Barellini, A; Bogi, L; Licitra, G; Silvi, A M; Zari, A
2009-12-01
Air traffic control (ATC) primary radars are 'classical' radars that use echoes of radiofrequency (RF) pulses from aircraft to determine their position. High-power RF pulses radiated from radar antennas may produce high electromagnetic field levels in the surrounding area. Measurement of the electromagnetic fields produced by RF-pulsed radar by means of a swept-tuned spectrum analyser is investigated here. Measurements have been carried out both in the laboratory and in situ on signals generated by an ATC primary radar.
Long Cui; Emily Hoi-Man Wong; Guo Cheng; Manoel Firmato de Almeida; Man-Ting So; Pak-Chung Sham; Stacey S Cherny; Paul Kwong-Hang Tam; Maria-Mercè Garcia-Barceló
2013-01-01
We present the genetic analyses conducted on a three-generation family (14 individuals) with three members affected with isolated-Hirschsprung disease (HSCR) and one with HSCR and heterochromia iridum (syndromic-HSCR), a phenotype reminiscent of Waardenburg-Shah syndrome (WS4). WS4 is characterized by pigmentary abnormalities of the skin, eyes and/or hair, sensorineural deafness and HSCR. None of the members had sensorineural deafness. The family was screened for copy number variations (CNVs)...
Akın Avşaroğlu; Suphi URAL
2017-01-01
The purpose of this study is to analyse and reduce flow-accelerated corrosion in thermal plant heat recovery steam generators. The studies were performed in a newly established and a 16-year-old combined cycle power plant in Turkey. Corrosion cases were investigated based on mechanical outage reports at the power plants from 2011 to 2015. The flow-accelerated corrosion study focused on a specific zone related to the economizer low-pressure connection pipings, and a performance report was issued. Results and lessons learnt from these studies will be used in a preventive manner in all similar plants.
Simons, Monique; de Vet, Emely; Chinapaw, Mai Jm; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes
2014-04-04
Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games-active games-seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; Pgames (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; Pgaming and a little bit lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; Pgaming (OR 3.3, CI 1.46-7.53; P=.004), and a more positive image of a non-active gamer (OR 2, CI 1.07-3.75; P=.03). Various factors were significantly associated with active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non
Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.
Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T
2014-06-01
Due to the upcoming deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analysis tools, and efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements; therefore, biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tool integration), automatic deployment on the Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via the HTCondor scheduler), and support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as a performance evaluation are presented to validate the feasibility of the proposed approach.
Analyses of an air conditioning system with entropy generation minimization and entransy theory
Yan-Qiu, Wu; Li, Cai; Hong-Juan, Wu
2016-06-01
In this paper, based on the generalized heat transfer law, an air conditioning system is analyzed with entropy generation minimization and the entransy theory. Taking the coefficient of performance (COP) and the heat flow rate Q_out released into the room as the optimization objectives, we discuss the applicability of entropy generation minimization and entransy theory to the optimizations. Five numerical cases are presented. Combining the numerical results and theoretical analyses, we conclude that the optimization applicability of the two theories is conditional. If Q_out is the optimization objective, a larger entransy increase rate always leads to larger Q_out, while a smaller entropy generation rate does not. If we take COP as the optimization objective, neither entropy generation minimization nor the concept of entransy increase is always applicable. Furthermore, we find that the concept of entransy dissipation is not applicable for the discussed cases. Project supported by the Youth Programs of Chongqing Three Gorges University, China (Grant No. 13QN18).
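The two quantities being compared can be illustrated for the simplest case of heat flow across a finite temperature difference: entropy generation is Q(1/T_cold - 1/T_hot), while entransy dissipation is Q(T_hot - T_cold). The numbers below are arbitrary, not from the study:

```python
# Minimal numerical illustration of entropy generation vs. entransy
# dissipation for heat flow q (W) from t_hot to t_cold (temperatures in K).
def entropy_generation(q, t_hot, t_cold):
    """Entropy generation rate (W/K)."""
    return q * (1.0 / t_cold - 1.0 / t_hot)

def entransy_dissipation(q, t_hot, t_cold):
    """Entransy dissipation rate (W*K) for the same process."""
    return q * (t_hot - t_cold)

q, t_hot, t_cold = 1000.0, 320.0, 300.0
s_gen = entropy_generation(q, t_hot, t_cold)
g_diss = entransy_dissipation(q, t_hot, t_cold)
print(round(s_gen, 4), g_diss)  # both vanish as t_hot approaches t_cold
```

Both measures vanish for reversible (zero temperature difference) transfer, but they weight irreversibility differently, which is why optimising one need not optimise the other.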
Greene, LaVana; Elzey, Brianda; Franklin, Mariah; Fakayode, Sayo O.
2017-03-01
The negative health impact of polycyclic aromatic hydrocarbons (PAHs) and differences in the pharmacological activity of enantiomers of chiral molecules in humans highlight the need for analysis of PAHs and their chiral analogue molecules in humans. Herein, the first use of cyclodextrin guest-host inclusion complexation, fluorescence spectrophotometry, and a chemometric approach for analyses of a PAH (anthracene) and chiral-PAH analogue derivatives (1-(9-anthryl)-2,2,2-trifluoroethanol (TFE)) is reported. The binding constants (Kb), stoichiometry (n), and thermodynamic properties (Gibbs free energy (ΔG), enthalpy (ΔH), and entropy (ΔS)) of anthracene and enantiomers of TFE-methyl-β-cyclodextrin (Me-β-CD) guest-host complexes were also determined. Chemometric partial-least-squares (PLS) regression analysis of the emission spectra of Me-β-CD guest-host inclusion complexes was used to determine anthracene and TFE enantiomer concentrations in guest-host inclusion complex samples. The values of the calculated Kb and negative ΔG suggest the thermodynamic favorability of the anthracene-Me-β-CD and enantiomeric TFE-Me-β-CD inclusion complexation reactions. However, the anthracene-Me-β-CD and enantiomeric TFE-Me-β-CD inclusion complexations showed notable differences in binding affinity behaviors and thermodynamic properties. The PLS regression analysis resulted in squared correlation coefficients of 0.997530 or better and low LODs of 3.81 × 10^-7 M for anthracene and 3.48 × 10^-8 M for TFE enantiomers at physiological conditions. Most importantly, PLS regression accurately determined the anthracene and TFE enantiomer concentrations with average low errors of 2.31% for anthracene, 4.44% for R-TFE and 3.60% for S-TFE. The results of the study are highly significant because of the high sensitivity and accuracy for analysis of PAH and chiral PAH analogue derivatives without the need of an expensive chiral column, enantiomeric resolution, or use of a
Peluso, Marco E M; Munnia, Armelle; Ceppi, Marcello
2014-11-05
Exposures to bisphenol-A, a weak estrogenic chemical largely used for the production of plastic containers, can affect rodent behaviour. Thus, we examined the relationships between bisphenol-A and anxiety-like behaviour, spatial skills, and aggressiveness in 12 toxicity studies of rodent offspring from females orally exposed to bisphenol-A while pregnant and/or lactating, by median and linear spline analyses. Subsequently, meta-regression analysis was applied to quantify the behavioural changes. U-shaped, inverted U-shaped and J-shaped dose-response curves were found to describe the relationships between bisphenol-A and the behavioural outcomes. The occurrence of anxiogenic-like effects and spatial skill changes displayed U-shaped and inverted U-shaped curves, respectively, providing examples of effects that are observed at low doses. Conversely, a J-shaped dose-response relationship was observed for aggressiveness. When the proportion of rodents expressing certain traits, or the time that they took to manifest an attitude, was analysed, the meta-regression indicated that a borderline significant increment of anxiogenic-like effects was present at low doses regardless of sex (β = -0.8%, 95% C.I. -1.7/0.1, P = 0.076, at ≤120 μg bisphenol-A), whereas only bisphenol-A-exposed males exhibited a significant inhibition of spatial skills (β = 0.7%, 95% C.I. 0.2/1.2, P = 0.004, at ≤100 μg/day). A significant increment of aggressiveness was observed in both sexes (β = 67.9, C.I. 3.4/172.5, P = 0.038, at >4.0 μg). Bisphenol-A treatments also significantly abrogated spatial learning and ability in males (P < 0.001 vs. females). Overall, our study showed that developmental exposures to low doses of bisphenol-A, e.g. ≤120 μg/day, were associated with behavioural aberrations in offspring.
Jessica Jeyanthi James Antony
2015-12-01
This study was conducted to detect morphological, histological and molecular differences in the second generation of PVS2-cryopreserved Dendrobium Bobby Messina [DBM] plantlets (18-month-old cultures). Morphological analyses indicated similarities and differences between cryopreserved DBM plantlets and the control stock culture based on selected morphological criteria. Criteria such as root length, number of shoots per explant and shoot length displayed differences, while the other three criteria, leaf diameter, leaf length and PLB size, were similar in cryopreserved plantlets compared to the control stock culture plants. A higher amount of homogeneous cell population and denser cytoplasm were observed in cryopreserved PLBs compared to control stock culture PLBs based on histological analysis. This suggests that a somatic embryogenesis development mechanism takes place during the recovery and regeneration of the cryopreserved PLBs. However, RAPD analyses based on 10 primers indicated that cryopreserved DBM regenerated by the vitrification method generated a total of 20 to 39.9% polymorphic bands compared to the stock culture, indicating potential somaclonal variation. Hence, the percentage of polymorphic bands increased in cryopreserved plantlets 18 months post-cryopreservation, compared to the previously reported 10% polymorphic bands in cryopreserved DBM 3 months post-cryopreservation.
Khanfar, Mohammad A; Taha, Mutasem O
2013-10-28
The mammalian target of rapamycin (mTOR) has an important role in cell growth, proliferation, and survival. mTOR is frequently hyperactivated in cancer, and therefore, it is a clinically validated target for cancer therapy. In this study, we combined exhaustive pharmacophore modeling and quantitative structure-activity relationship (QSAR) analysis to explore the structural requirements for potent mTOR inhibitors employing 210 known mTOR ligands. Genetic function algorithm (GFA) coupled with k nearest neighbor (kNN) and multiple linear regression (MLR) analyses were employed to build self-consistent and predictive QSAR models based on optimal combinations of pharmacophores and physicochemical descriptors. Successful pharmacophores were complemented with exclusion spheres to optimize their receiver operating characteristic curve (ROC) profiles. Optimal QSAR models and their associated pharmacophore hypotheses were validated by identification and experimental evaluation of several new promising mTOR inhibitory leads retrieved from the National Cancer Institute (NCI) structural database. The most potent hit exhibited an IC50 value of 48 nM.
Riley, D G; Coleman, S W; Chase, C C; Olson, T A; Hammond, A C
2007-01-01
The objective of this research was to assess the genetic control of BW, hip height, and the ratio of BW to hip height (n = 5,055) in Brahman cattle through 170 d on feed using covariance function-random regression models. A progeny test of Brahman sires (n = 27) generated records of Brahman steers and heifers (n = 724) over 7 yr. Each year after weaning, calves were assigned to feedlot pens, where they were fed a high-concentrate grain diet. Body weights and hip heights were recorded every 28 d until cattle reached a targeted fatness level. All calves had records through 170 d on feed; subsequent records were excluded. Models included contemporary group (sex-pen-year combinations, n = 63) and age at the beginning of the feeding period as a covariate. The residual error structure was modeled as a random effect, with 2 levels corresponding to two 85-d periods on feed. Information criterion values indicated that linear, random regression coefficients on Legendre polynomials of days on feed were most appropriate to model additive genetic effects for all 3 traits. Cubic (hip height and BW:hip height ratio) or quartic (BW) polynomials best modeled permanent environmental effects. Estimates of heritability across the 170-d feeding period ranged from 0.31 to 0.53 for BW, from 0.37 to 0.53 for hip height, and from 0.23 to 0.6 for BW:hip height ratio. Estimates of the permanent environmental proportion of phenotypic variance ranged from 0.44 to 0.58 for BW, 0.07 to 0.26 for hip height, and 0.30 to 0.48 for BW:hip height ratio. Within-trait estimates of genetic correlation on pairs of days on feed (at 28-d intervals) indicated lower associations of BW:hip height ratio EBV early and late in the feeding period but large positive associations for BW or hip height EBV throughout. Estimates of genetic correlations among the 3 traits indicated almost no association of BW:hip height ratio and hip height EBV. The ratio of BW to hip height in cattle has previously been used as an
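The Legendre-polynomial covariates used in random regression models of this kind can be generated as follows; the standardization to the 0-170 d feeding period follows the abstract, but the code itself is a generic sketch, not the paper's model:

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(days, d_min=0.0, d_max=170.0, order=2):
    """Standardize days on feed to [-1, 1] and evaluate Legendre
    polynomials P_0..P_order, the covariates of a random regression model."""
    t = -1.0 + 2.0 * (np.asarray(days, dtype=float) - d_min) / (d_max - d_min)
    # column j holds P_j(t); legval with a unit coefficient picks out P_j
    return np.column_stack([legendre.legval(t, [0.0] * j + [1.0])
                            for j in range(order + 1)])

# Basis at the start, middle and end of the feeding period
Z = legendre_covariates([0.0, 85.0, 170.0], order=2)
# P_0 = 1 everywhere; P_1(t) = t; P_2(t) = (3t^2 - 1)/2
```

A linear (order-1) additive genetic effect, as selected by the information criteria above, would use only the first two columns; the quartic permanent environmental effect for BW would extend the basis to order 4.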
CHEOL HO PYEON
2013-02-01
Neutron spectrum analyses of spallation neutrons were conducted in the accelerator-driven system (ADS) facility at the Kyoto University Critical Assembly (KUCA). High-energy protons (100 MeV) obtained from the fixed-field alternating gradient accelerator are injected onto a tungsten target, whereby spallation neutrons are generated. To characterize the spallation neutrons, their reaction rates and continuous energy distribution are measured by the foil activation method and by an organic liquid scintillator, respectively. Numerical calculations are executed with MCNPX using the JENDL/HE-2007 and ENDF/B-VI libraries to evaluate the reaction rates of activation foils (bismuth and indium) set at the target and the continuous energy distribution of spallation neutrons in front of the target. For the reaction rates obtained by the foil activation method, the C/E values between the experiments and the calculations agree to within a relative difference of about 10%, except for some reactions. For the continuous energy distribution obtained by the organic liquid scintillator, spallation neutrons are observed up to 45 MeV. From these results, neutron spectrum information on the spallation neutrons generated at the target is attained successfully when injecting 100 MeV protons onto the tungsten target.
Liu, Zhanqi; Panousis, Con; Smyth, Fiona E; Murphy, Roger; Wirth, Veronika; Cartwright, Glenn; Johns, Terrance G; Scott, Andrew M
2003-08-01
The chimeric monoclonal antibody ch806 specifically targets the tumor-associated mutant epidermal growth factor receptor (de 2-7EGFR or EGFRVIII) and is currently under investigation for its potential use in cancer therapy. The humanized monoclonal antibody hu3S193 specifically targets the Lewis Y epithelial antigen and is currently in Phase I clinical trials in patients with advanced breast, colon, and ovarian carcinomas. To assist the clinical evaluation of ch806 and hu3S193, laboratory assays are required to monitor their serum pharmacokinetics and quantitate any immune responses to the antibodies. Mice immunized with ch806 or hu3S193 were used to generate hybridomas producing antibodies with specific binding to ch806 or hu3S193 and competitive for antigen binding. These anti-idiotype antibodies (designated Ludwig Melbourne Hybridomas, LMH) were investigated as reagents suitable for use as positive controls for HAHA or HACA analyses and for measuring hu3S193 or ch806 in human serum. Anti-idiotypes with the ability to concurrently bind two target antibody molecules were identified, which enabled the development of highly reproducible, sensitive, specific ELISA assays for determining serum concentrations of hu3S193 and ch806 with a 3 ng/mL limit of quantitation using LMH-3 and LMH-12, respectively. BIAcore analyses determined high apparent binding affinity for both idiotypes: LMH-3 binding immobilized hu3S193, Ka = 4.76 x 10(8) M(-1); LMH-12 binding immobilized ch806, Ka = 1.74 x 10(9) M(-1). HAHA or HACA analysis of serum samples using BIAcore could be established with LMH-3 and LMH-12 as positive controls for quantitation of immune responses to hu3S193 or ch806 in patient sera. These anti-idiotypes could also be used to study the penetrance and binding of ch806 or hu3S193 to tumor cells through immunohistochemical analysis of tumor biopsies. The generation of anti-idiotype antibodies capable of concurrently binding a target antibody on each variable
Novel Primer Sets for Next Generation Sequencing-Based Analyses of Water Quality.
Lee, Elvina; Khurana, Maninder S; Whiteley, Andrew S; Monis, Paul T; Bath, Andrew; Gordon, Cameron; Ryan, Una M; Paparini, Andrea
2017-01-01
Next generation sequencing (NGS) has rapidly become an invaluable tool for the detection, identification and relative quantification of environmental microorganisms. Here, we present two new 16S rDNA primer sets, which are compatible with NGS approaches and are primarily intended for water quality studies. Compared to universal 16S rRNA gene primers, in silico and experimental analyses demonstrated that the new primers showed increased specificity for the Cyanobacteria and Proteobacteria phyla, allowing increased sensitivity for the detection, identification and relative quantification of toxic bloom-forming microalgae, microbial water quality bioindicators and common pathogens. Significantly, cyanobacterial and proteobacterial sequences accounted for ca. 95% of all sequences obtained within NGS runs (compared to ca. 50% with standard universal NGS primers), providing higher sensitivity and greater phylogenetic resolution of key water quality microbial groups. The increased selectivity of the new primers also allows more samples to be sequenced in parallel, since fewer sequences are required to detect the target groups, potentially reducing NGS costs by 50% while still guaranteeing optimal coverage and species discrimination.
Petrenko, B.; Ignatov, A.; Kramar, M.; Kihai, Y.
2016-05-01
Multichannel regression algorithms are widely used to retrieve sea surface temperature (SST) from infrared observations with satellite radiometers. Their theoretical foundations were laid in the 1980s-1990s, during the era of the Advanced Very High Resolution Radiometer (AVHRR), flown onboard NOAA satellites since 1981. Consequently, the multichannel and nonlinear SST algorithms employ the bands centered at 3.7, 11 and 12 μm, similar to those available on the AVHRR. More recent radiometers carry new bands located in the windows near 4 μm, 8.5 μm and 10 μm, which may also be used for SST. Involving these bands in SST retrieval requires modifications to the regression SST equations. The paper describes a general approach to constructing SST regression equations for an arbitrary number of radiometric bands and explores the benefits of using the extended sets of bands available with: the Visible Infrared Imaging Radiometer Suite (VIIRS), flown onboard the Suomi National Polar-orbiting Partnership (SNPP) satellite and to be flown onboard the follow-on Joint Polar Satellite System (JPSS) satellites, J1-J4, to be launched from 2017-2031; the Moderate Resolution Imaging Spectroradiometers (MODIS), flown onboard the Aqua and Terra satellites; and the Advanced Himawari Imager (AHI), flown onboard the Japanese Himawari-8 satellite (in turn a close proxy of the Advanced Baseline Imager (ABI) to be flown onboard the future Geostationary Operational Environmental Satellites - R Series (GOES-R), planned for launch in October 2016).
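A regression SST equation for an arbitrary band set reduces, in its simplest linear form, to fitting one coefficient per brightness-temperature band plus an offset. The sketch below uses synthetic brightness temperatures and made-up coefficients purely for illustration; it is not the paper's retrieval equation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic brightness temperatures (K) for an arbitrary 4-band set
# (e.g. windows near 3.7, 8.5, 11 and 12 um -- illustrative only)
n, bands = 500, 4
BT = 280.0 + rng.normal(0.0, 5.0, size=(n, bands))

# "True" SST as a fixed linear combination of the bands plus noise
true_coefs = np.array([1.0, 0.8, -0.3, 1.2, -0.7])  # offset + one per band
X = np.column_stack([np.ones(n), BT])
sst = X @ true_coefs + rng.normal(0.0, 0.1, size=n)

# Least-squares regression coefficients for this band set
coefs, *_ = np.linalg.lstsq(X, sst, rcond=None)
```

Adding or removing a band only changes the width of the design matrix, which is the sense in which such equations generalize to any number of radiometric bands.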
FLORIN MARIUS PAVELESCU
2010-12-01
In econometric models, linear regressions with three explanatory variables are widely used. Examples include the Cobb-Douglas production function with three inputs (capital, labour and disembodied technical change), the Kmenta function used to approximate CES production function parameters, error-correction models, etc. In multiple linear regressions, the estimated parameter values and some statistical tests are influenced by collinearity between explanatory variables. In effect, collinearity acts as a noise that distorts the signal (the proper parameter values). This influence is emphasized by the values of the coefficients of alignment to collinearity hazard. These coefficients have some similarities with the signal-to-noise ratio and may consequently be used when the type of collinearity is determined. For these reasons, the main purpose of this paper is to identify all the modelling factors and quantify their impact on the above-mentioned indicator values in the context of linear regression with three explanatory variables.
Classification-JEL: C13, C20, C51, C52
Keywords: types of collinearity, coefficient of mediated correlation, rank of explanatory variable, order of attractor of collinearity, mediated collinearity, anticollinearity.
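Collinearity among explanatory variables is commonly quantified with variance inflation factors; the sketch below is a generic diagnostic, not the paper's coefficient of alignment to collinearity, and the three-variable data are invented for illustration:

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column: 1/(1 - R_j^2), where
    R_j^2 comes from regressing column j on the remaining columns."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ coef
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)   # nearly collinear with x1
vifs = vif(np.column_stack([x1, x2, x3]))
# vifs[0] and vifs[2] are large (collinear pair); vifs[1] stays near 1
```

A VIF far above 1 signals that collinearity is inflating the variance of that parameter estimate, which is the "noise over signal" effect the paper analyzes.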
Cui, Long; Wong, Emily Hoi-Man; Cheng, Guo; Firmato de Almeida, Manoel; So, Man-Ting; Sham, Pak-Chung; Cherny, Stacey S; Tam, Paul Kwong-Hang; Garcia-Barceló, Maria-Mercè
2013-01-01
We present the genetic analyses conducted on a three-generation family (14 individuals) with three members affected with isolated-Hirschsprung disease (HSCR) and one with HSCR and heterochromia iridum (syndromic-HSCR), a phenotype reminiscent of Waardenburg-Shah syndrome (WS4). WS4 is characterized by pigmentary abnormalities of the skin, eyes and/or hair, sensorineural deafness and HSCR. None of the members had sensorineural deafness. The family was screened for copy number variations (CNVs) using the Illumina-HumanOmni2.5-Beadchip and for coding sequence mutations in WS4 genes (EDN3, EDNRB, or SOX10) and in the main HSCR gene (RET). Confocal microscopy and immunoblotting were used to assess the functional impact of the mutations. A heterozygous A/G transition in EDNRB was identified in 4 affected and 3 unaffected individuals. While in EDNRB isoforms 1 and 2 (cellular receptor) the transition results in the abolishment of translation initiation (M1V), in isoform 3 (only in the cytosol) the replacement occurs at Met91 (M91V) and is predicted benign. Another heterozygous transition (c.-248G/A; predicted to affect translation efficiency) in the 5'-untranslated region of EDN3 (EDNRB ligand) was detected in all affected individuals but not in healthy carriers of the EDNRB mutation. Also, a de novo CNV encompassing DACH1 was identified in the patient with heterochromia iridum and HSCR. Since the EDNRB and EDN3 variants only coexist in affected individuals, HSCR could be due to the joint effect of mutations in genes of the same pathway. Iris heterochromia could be due to an independent genetic event and would account for the additional phenotype within the family.
Alexeeff, Stacey E; Schwartz, Joel; Kloog, Itai; Chudnovsky, Alexandra; Koutrakis, Petros; Coull, Brent A
2015-01-01
Many epidemiological studies use predicted air pollution exposures as surrogates for true air pollution levels. These predicted exposures contain exposure measurement error, yet simulation studies have typically found negligible bias in resulting health effect estimates. However, previous studies typically assumed a statistical spatial model for air pollution exposure, which may be oversimplified. We address this shortcoming by assuming a realistic, complex exposure surface derived from fine-scale (1 km × 1 km) remote-sensing satellite data. Using simulation, we evaluate the accuracy of epidemiological health effect estimates in linear and logistic regression when using spatial air pollution predictions from kriging and land use regression models. We examined chronic (long-term) and acute (short-term) exposure to air pollution. Results varied substantially across different scenarios. Exposure models with low out-of-sample R(2) yielded severe biases in the health effect estimates of some models, ranging from 60% upward bias to 70% downward bias. One land use regression exposure model with >0.9 out-of-sample R(2) yielded upward biases up to 13% for acute health effect estimates. Almost all models drastically underestimated the SEs. Land use regression models performed better in chronic effect simulations. These results can help researchers when interpreting health effect estimates in these types of studies.
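The downward bias from using an error-prone exposure surrogate can be illustrated with a minimal simulation of classical measurement error in linear regression (a deliberately simplified setup, not the study's satellite-derived exposure surface):

```python
import numpy as np

rng = np.random.default_rng(42)
n, beta_true = 20000, 1.0

x = rng.normal(0.0, 1.0, n)        # true exposure
w = x + rng.normal(0.0, 1.0, n)    # predicted exposure with classical error
y = beta_true * x + rng.normal(0.0, 1.0, n)

# Slope from regressing the outcome on the error-prone exposure
beta_hat = np.cov(w, y)[0, 1] / np.var(w, ddof=1)
# Classical error attenuates the estimate toward zero:
# E[beta_hat] = beta * var(x) / (var(x) + var(error)) = 0.5 here
```

Real exposure-prediction errors (e.g. Berkson-type error from kriging or land use regression) behave differently from this classical case, which is precisely why the study's simulated biases ranged from severe attenuation to upward bias.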
Land, Kenneth C.; And Others
1994-01-01
Advantages of using logistic and hazards regression techniques in assessing the overall impact of a treatment program and the differential impact on client subgroups are examined and compared using data from a juvenile court program for status offenders. Implications are drawn for management and effectiveness of intensive supervision programs.
Tamer Khatib
2015-01-01
This paper presents a model for predicting hourly solar radiation data from daily solar radiation averages. The proposed model is a generalized regression artificial neural network with three inputs, namely mean daily solar radiation, hour angle, and sunset hour angle; the output layer has one node, mean hourly solar radiation. The training and development of the proposed model were done using MATLAB and 43,800 records of hourly global solar radiation. The results show that the proposed model has better prediction accuracy than some empirical and statistical models. Two error statistics are used in this research to evaluate the proposed model, namely mean absolute percentage error and root mean square error; for the proposed model these values are 11.8% and −3.1%, respectively. Finally, the proposed model shows better ability to handle the stochastic nature of the solar radiation data.
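The two error statistics named above have standard definitions; a minimal sketch with illustrative numbers (not the study's radiation records):

```python
import numpy as np

def mape(obs, pred):
    """Mean absolute percentage error, in percent."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 100.0 * float(np.mean(np.abs((obs - pred) / obs)))

def rmse(obs, pred):
    """Root mean square error, in the units of the data."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

obs = np.array([100.0, 200.0, 400.0])
pred = np.array([110.0, 190.0, 400.0])
# mape: (10/100 + 10/200 + 0)/3 * 100 = 5.0
# rmse: sqrt((100 + 100 + 0)/3) = sqrt(200/3)
```

Note that RMSE as defined here is non-negative; a signed percentage such as the −3.1% reported above would correspond to a normalized or bias-style variant of the statistic.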
Quantum, classical and semiclassical analyses of photon statistics in harmonic generation
Bajer, J; Bajer, Jiri; Miranowicz, Adam
2001-01-01
In this review, we compare different descriptions of photon-number statistics in harmonic generation processes within quantum, classical and semiclassical approaches. First, we study the exact quantum evolution of harmonic generation by applying numerical methods, including those of Hamiltonian diagonalization and global characteristics. We show explicitly that harmonic generation can indeed serve as a source of nonclassical light. Then, we demonstrate that quasi-stationary sub-Poissonian light can be generated in these quantum processes under conditions corresponding to the so-called no-energy-transfer regime known in classical nonlinear optics. By applying the method of classical trajectories, we demonstrate that the analytical predictions of the Fano factors are in good agreement with the quantum results. On comparing second and higher harmonic generation in the no-energy-transfer regime, we show that the highest noise reduction is achieved in third-harmonic generation with the Fano factor of the ...
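The Fano factor used above to quantify noise reduction is the variance-to-mean ratio of the photon-number distribution; a minimal numerical illustration with classical random-number stand-ins (not the quantum simulation of the paper):

```python
import numpy as np

def fano(counts):
    """Fano factor F = Var(n) / <n>; F = 1 for Poissonian (coherent) light,
    F < 1 signals sub-Poissonian, i.e. nonclassical, photon statistics."""
    counts = np.asarray(counts, dtype=float)
    return float(counts.var() / counts.mean())

rng = np.random.default_rng(7)
poissonian = rng.poisson(50, size=100_000)          # coherent-light reference
sub_poissonian = rng.binomial(100, 0.5, 100_000)    # variance npq < mean np
# fano(poissonian) is close to 1; fano(sub_poissonian) is close to 0.5
```

A binomial distribution is used here only as a convenient classical example of variance below the mean; in the paper the sub-Poissonian statistics arise from the quantum dynamics of the harmonic generation itself.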
Kumaki, Masafumi, E-mail: masafumi.kumaki@riken.jp [Cooperative Major in Nuclear Energy, Waseda University, Shinjuku, Tokyo (Japan); RIKEN, Wako, Saitama (Japan); Ikeda, Shunsuke; Sekine, Megumi; Munemoto, Naoya [RIKEN, Wako, Saitama (Japan); Department of Energy Sciences, Tokyo Institute of Technology, Meguro, Tokyo (Japan); Fuwa, Yasuhiro [RIKEN, Wako, Saitama (Japan); Department of Physics and Astronomy, Kyoto University, Uji, Kyoto (Japan); Cinquegrani, David [American Nuclear Society, University of Michigan, Ann Arbor, Michigan 48109 (United States); Kanesue, Takeshi; Okamura, Masahiro [Collider-Accelerator Department, Brookhaven National Laboratory, Upton, New York 11973 (United States); Washio, Masakazu [Cooperative Major in Nuclear Energy, Waseda University, Shinjuku, Tokyo (Japan)
2014-02-15
At Brookhaven National Laboratory, a laser ion source has been developed to provide heavy ion beams using plasma generated by 1064 nm Nd:YAG laser irradiation of solid targets. The laser energy is transferred to the target material and creates a crater on the surface. However, only part of the material is turned into the plasma state; the remainder is considered to be merely vaporized. Since heat propagation in the target material requires more than the typical laser irradiation period, which is typically several ns, only a certain depth of the surface layers may contribute to forming the plasma. Our results indicate that this depth is more than 500 nm, because ions of the Al base material were detected. On the other hand, comparison of the different carbon thickness cases suggests that the surface carbon layer does not contribute to plasma generation.
Hao, Lingxin
2007-01-01
Quantile Regression, the first book of Hao and Naiman's two-book series, establishes the seldom recognized link between inequality studies and quantile regression models. Though separate methodological literature exists for each subject, the authors seek to explore the natural connections between this increasingly sought-after tool and research topics in the social sciences. Quantile regression as a method does not rely on assumptions as restrictive as those for the classical linear regression; though more traditional models such as least squares linear regression are more widely utilized, Hao
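Quantile regression rests on minimizing the check (pinball) loss rather than squared error, which is why it avoids the restrictive assumptions of classical least squares; a minimal sketch with illustrative data showing that the τ = 0.5 minimizer is the median and hence robust to an outlier:

```python
import numpy as np

def check_loss(u, tau):
    """Koenker-Bassett check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (u < 0))

data = np.array([1.0, 2.0, 3.0, 4.0, 100.0])   # one heavy outlier
tau = 0.5
grid = np.linspace(0.0, 110.0, 2201)           # candidate constants c
losses = [float(check_loss(data - c, tau).sum()) for c in grid]
best = grid[int(np.argmin(losses))]
# The minimizer is the sample median (3.0), unmoved by the outlier,
# whereas least squares would be pulled toward the mean (22.0)
```

Choosing other values of τ (e.g. 0.1 or 0.9) shifts the minimizer to the corresponding quantile, which is how quantile regression characterizes an entire conditional distribution rather than just its mean.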
Analyses of steam generator collector rupture for WWER-1000 using Relap5 code
Balabanov, E.; Ivanova, A. [Energoproekt, Sofia (Bulgaria)
1995-12-31
The paper presents some of the results of analyses of an accident with a LOCA from the primary to the secondary side of a WWER-1000/320 unit. The objective of the analyses is to estimate the primary coolant release to the atmosphere, to point out the necessity of a well-defined operator strategy for this type of accident, and to evaluate the possibility of diagnosing the accident and minimizing the radiological impact on the environment.
Gai Xueliang; Fan Zhimin; Liu Guojin; Jacques Brisson
1998-01-01
Objective: To compare five-year survival after surgery between the 116 breast cancer patients treated at the First Teaching Hospital (FTH) and the 866 breast cancer patients treated at Hopital du Saint-Sacrement (HSS). Methods: A Cox regression model was used, after adjusting for confounders, to compare the five-year average hazard rates between the two hospitals and among levels of prognostic factors. Results: A significant difference was found between the two hospitals for older patients (50 years old or more). Conclusion: Tumor size at pathology and involvement of lymph nodes were important prognostic factors.
Analysing generator matrices G of similar state but varying minimum determinants
Harun, H.; Razali, M. F.; Rahman, N. A. Abdul
2016-10-01
Since Tarokh discovered the Space-Time Trellis Code (STTC) in 1998, considerable effort has been devoted to improving the performance of the original STTC. One way of achieving enhancement is to focus on the generator matrix G, which represents the encoder structure for the STTC. Until now, researchers have only compared STTCs of different states when analyzing the performance of generator matrices G; no work has addressed different generator matrices G of the same state, because it is difficult to produce a wide variety of generator matrices G with diverse minimum determinants. In this paper a number of generator matrices G with minimum determinants of four (4), eight (8) and sixteen (16) for the same state (i.e., 4-PSK) have been successfully produced. The performance of the different generator matrices G, in terms of bit error rate versus signal-to-noise ratio in a Rayleigh fading environment, is compared and evaluated. MATLAB simulation shows that at low SNR (14) there is no significant difference between the BERs of these generator matrices G.
Scale/Analytical Analyses of Freezing and Convective Melting with Internal Heat Generation
Ali S. Siahpush; John Crepeau; Piyush Sabharwall
2013-07-01
Using a scale/analytical analysis approach, we model phase change (melting) for pure materials with constant internal heat generation at small Stefan numbers (approximately one). The analysis considers conduction in the solid phase and natural convection, driven by internal heat generation, in the liquid region. The model is applied to a constant surface temperature boundary condition, where the melting temperature is greater than the surface temperature, in a cylindrical geometry; a constant heat flux boundary condition in a cylindrical geometry is also considered. We show the time scales over which conduction and convection heat transfer dominate.
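For reference, the Stefan number governing such melting analyses is conventionally the ratio of sensible to latent heat; the symbols below are the standard ones, not necessarily the paper's notation:

```latex
\mathrm{Ste} = \frac{c_p\,\Delta T}{h_{sl}}
```

where $c_p$ is the specific heat of the material, $\Delta T$ a characteristic temperature difference, and $h_{sl}$ the latent heat of fusion; small Ste means latent heat dominates, so the melt front advances slowly relative to thermal diffusion.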
Nazarzadeh, Milad; Bidel, Zeinab; Mosavi Jarahi, Alireza; Esmaeelpour, Keihan; Menati, Walieh; Shakeri, Ali Asghar; Menati, Rostam; Kikhavani, Sattar; Saki, Kourosh
2015-09-01
Cannabis is the most widely used substance in the world. This study aimed to estimate the prevalence of cannabis lifetime use (CLU) among high school and college students in Iran and to determine factors related to changes in prevalence. A systematic review of the literature on cannabis use in Iran was conducted according to the MOOSE guideline. Domestic scientific databases, PubMed/Medline, ISI Web of Knowledge, and Google Scholar, relevant reference lists, and relevant journals were searched up to April, 2014. Prevalences were calculated using the variance-stabilizing double arcsine transformation, with confidence intervals (CIs) estimated using the Wilson method. Heterogeneity was assessed by Cochran's Q statistic and the I(2) index, and causes of heterogeneity were evaluated using a meta-regression model. The electronic database search retrieved 4,000 citations, yielding a total of 33 studies. CLU was reported with a random-effects pooled prevalence of 4.0% (95% CI = 3.0% to 5.0%). In the subgroups of high school and college students, prevalences were 5.0% (95% CI = 3.0% to 7.0%) and 2.0% (95% CI = 2.0% to 3.0%), respectively. The meta-regression model indicated that prevalence is higher in college students (β = 0.089, p < .001) and in males (β = 0.017, p < .001), and is lower in studies with sampling versus census studies (β = -0.096, p < .001). This study found that the prevalence of CLU among Iranian students is lower than in industrialized countries. In addition, gender, level of education, and method of sampling are strongly associated with changes in the prevalence of CLU across provinces.
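The Freeman-Tukey double arcsine transformation mentioned above stabilizes the variance of a proportion before pooling; a minimal sketch with illustrative counts (fixed-effect pooling shown for brevity; the review's random-effects model would add a between-study variance term):

```python
import numpy as np

def double_arcsine(events, n):
    """Freeman-Tukey double arcsine transform of a proportion events/n.
    Returns the transformed value t and its approximate variance 1/(n + 0.5)."""
    events, n = np.asarray(events, float), np.asarray(n, float)
    t = (np.arcsin(np.sqrt(events / (n + 1.0)))
         + np.arcsin(np.sqrt((events + 1.0) / (n + 1.0))))
    return t, 1.0 / (n + 0.5)

# Three hypothetical studies: lifetime-use counts and sample sizes
t, v = double_arcsine(np.array([8.0, 20.0, 5.0]), np.array([200.0, 500.0, 120.0]))

# Inverse-variance weighted pooled estimate on the transformed scale
w = 1.0 / v
t_pooled = float(np.sum(w * t) / np.sum(w))
```

The pooled value is then back-transformed to a proportion for reporting; the transform's appeal is that its variance depends only on n, not on the proportion itself.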
Simons, Monique; de Vet, Emely; Chinapaw, Mai Jm; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes
2014-01-01
BACKGROUND: Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games-active games-seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of
Multi-Objective Sensitivity Analyses for Power Generation Mix: Malaysia Case Study
Siti Mariam Mohd Shokri
2017-08-01
This paper presents an optimization framework to determine the long-term optimal generation mix for the Malaysia power sector using a Dynamic Programming (DP) technique. Several new candidate units with pre-defined MW capacities were included in the generation expansion planning model, drawn from coal, natural gas, hydro and renewable energy (RE). Four objective cases were considered: (1) economic cost, (2) environmental, (3) reliability, and (4) a multi-objective case combining the three. Results show that Malaysia's optimum generation mix in 2030 is: (1) for the economic case, 48% coal, 41% gas, 3% hydro and 8% RE; (2) for the environmental case, 19% coal, 58% gas, 11% hydro and 12% RE; (3) for the reliability case, 64% coal, 32% gas, 3% hydro and 1% RE; and (4) for the multi-objective case, 49% coal, 41% gas, 7% hydro and 3% RE. The finding of this paper is an optimum generation mix for Malaysia from 2013 to 2030 that is less expensive, substantially reduces carbon emissions, and is less risky.
Hui Wang
2014-01-01
Immunoglobulin A nephropathy (IgAN) is a complex trait regulated by the interaction among multiple physiologic regulatory systems and probably involving numerous genes, which leads to inconsistent findings in genetic studies. One possible reason for the failure to replicate some single-locus results is that the underlying genetics of IgAN is based on multiple genes with minor effects. To examine the association between 23 single nucleotide polymorphisms (SNPs) in 14 genes predisposing to chronic glomerular diseases and IgAN in Han males, the 23 SNP genotypes of 21 Han males were detected and analyzed with a BaiO gene chip, and their associations were analyzed with univariate analysis and multiple linear regression analysis. The analysis showed that CTLA4 rs231726 and CR2 rs1048971 revealed a significant association with IgAN. These findings support the multi-gene nature of the etiology of IgAN and propose a potential gene-gene interactive model for future studies.
Azadi, Sama; Karimi-Jashni, Ayoub
2016-02-01
Predicting the mass of solid waste generation plays an important role in integrated solid waste management plans. In this study, the performance of two predictive models, Artificial Neural Network (ANN) and Multiple Linear Regression (MLR), was evaluated for predicting the mean Seasonal Municipal Solid Waste Generation (SMSWG) rate. The accuracy of the proposed models is illustrated through a case study of 20 cities located in Fars Province, Iran. Four performance measures, MAE, MAPE, RMSE and R, were used to evaluate the performance of these models. The MLR, as a conventional model, showed poor prediction performance. On the other hand, the results indicated that the ANN model, as a non-linear model, has higher predictive accuracy when it comes to prediction of the mean SMSWG rate. As a result, in order to develop a more cost-effective strategy for waste management in the future, the ANN model could be used to predict the mean SMSWG rate.
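An MLR baseline of the kind compared here, together with the Pearson R performance measure, can be sketched as follows; the predictors and coefficients are hypothetical, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20  # mirrors the 20-city case study size

# Hypothetical city-level predictors of seasonal waste generation
pop = rng.uniform(10.0, 500.0, n)      # population (thousand inhabitants)
income = rng.uniform(0.5, 2.0, n)      # relative income index
waste = 0.9 * pop + 40.0 * income + rng.normal(0.0, 5.0, n)  # tonnes/season

# Multiple linear regression via least squares
X = np.column_stack([np.ones(n), pop, income])
coef, *_ = np.linalg.lstsq(X, waste, rcond=None)
pred = X @ coef

# Pearson R between observed and predicted, one of the four measures used
r = float(np.corrcoef(waste, pred)[0, 1])
```

When the true relationship is nonlinear, this linear baseline underfits, which is the usual reason an ANN outperforms MLR on such prediction tasks.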
Romanovsky, G.; Xydis, G.; Mutale, J.
2011-01-01
While there are presently different options for renewable and distributed generation (RES/DG) to participate in the UK electricity market, none of the market options is specifically tailored for such types of generation, in particular the smaller (up to 5 MW) RES/DG. This is because the UK has a number of specific historical, technical and economic reasons that significantly influenced the ability of smaller RES/DG to participate in the electricity market and in the provision of balancing services in accordance with UK National Grid requirements. This paper discusses some perspectives and approaches aiming to help stand-alone small-size and clustered RES and DG units to participate in the UK electricity market, drawing on relevant experience from Denmark.
Goran Bjelakovic
BACKGROUND AND AIMS: Evidence shows that antioxidant supplements may increase mortality. Our aims were to assess whether different doses of beta-carotene, vitamin A, and vitamin E affect mortality in primary and secondary prevention randomized clinical trials with low risk of bias. METHODS: The present study is based on our 2012 Cochrane systematic review analyzing beneficial and harmful effects of antioxidant supplements in adults. Using random-effects meta-analyses, meta-regression analyses, and trial sequential analyses, we examined the association between beta-carotene, vitamin A, and vitamin E and mortality according to their daily doses and doses below and above the recommended daily allowances (RDA). RESULTS: We included 53 randomized trials with low risk of bias (241,883 participants, aged 18 to 103 years, 44.6% women) assessing beta-carotene, vitamin A, and vitamin E. Meta-regression analysis showed that the dose of vitamin A was significantly positively associated with all-cause mortality. Beta-carotene in a dose above 9.6 mg significantly increased mortality (relative risk (RR) 1.06, 95% confidence interval (CI) 1.02 to 1.09, I² = 13%). Vitamin A in a dose above the RDA (>800 µg) did not significantly influence mortality (RR 1.08, 95% CI 0.98 to 1.19, I² = 53%). Vitamin E in a dose above the RDA (>15 mg) significantly increased mortality (RR 1.03, 95% CI 1.00 to 1.05, I² = 0%). Doses below the RDAs did not affect mortality, but data were sparse. CONCLUSIONS: Beta-carotene and vitamin E in doses higher than the RDA seem to significantly increase mortality, whereas we lack information on vitamin A. Dose of vitamin A was significantly associated with increased mortality in meta-regression. We lack information on doses below the RDA. BACKGROUND: Not all compounds essential to health can be synthesized in the body; these compounds must therefore be obtained through diet or in other ways [1]. Oxidative stress has been
Fang Aiping; Li Keji; Shi Haoyu; He Jingjing; Li He
2014-01-01
Background Chinese dietary reference intakes for calcium are largely based on foreign studies. We undertook meta-regression to estimate calcium requirements for Chinese adults derived from calcium balance data in Chinese adults. Methods We searched PubMed, Cochrane CENTRAL, and SinoMed from inception to March 5, 2014, by using a structured search strategy. The bibliographies of any relevant papers and journals were also screened for potentially eligible studies. We extracted a standardized data set from studies in Chinese adults that reported calcium balance data. The relationship between calcium intake and output was examined by an individual participant data (IPD) and aggregate data (AD) meta-regression. Results We identified 11 metabolic studies in Chinese adults within 18-60 years of age. One hundred and forty-one IPD (n=35) expressed as mg/d, 127 IPD (n=32) expressed as mg·kg body wt-1·d-1, and 44 AD (n=132) expressed as mg/d were collected. The models predicted a neutral calcium balance (defined as calcium output (Y) equal to calcium intake (C)) at intakes of 460 mg/d (Y=0.60C+183.98) and 8.27 mg·kg body wt-1·d-1 (Y=0.60C+3.33) for IPD, or 409 mg/d (Y=0.66C+139.00) for AD. Calcium requirements at upper intakes were higher than those at lower intakes in all these models. Conclusion The calcium requirement for Chinese adults 18-60 years of age ranges between approximately 400 mg/d and 500 mg/d when consuming traditional plant-based Chinese diets.
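The neutral-balance intakes quoted above follow from setting calcium output equal to intake in each regression line: for Y = bC + a, the condition Y = C gives C = a/(1 − b). A quick arithmetic check (using the rounded coefficients from the abstract, so results may differ slightly from the reported values):

```python
def neutral_balance(intercept, slope):
    """Intake at which predicted output equals intake.

    For a regression line Y = slope*C + intercept, setting Y = C gives
    C = intercept / (1 - slope)."""
    return intercept / (1.0 - slope)

# Regression lines quoted in the abstract (output Y as a function of intake C)
ipd_mg_per_day = neutral_balance(183.98, 0.60)  # ~460 mg/d, as reported
ad_mg_per_day = neutral_balance(139.00, 0.66)   # ~409 mg/d, as reported
ipd_mg_per_kg = neutral_balance(3.33, 0.60)     # ~8.3 mg/kg/d (abstract: 8.27;
                                                # the gap reflects rounded coefficients)
```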
Duda, David P.; Minnis, Patrick
2009-01-01
Previous studies have shown that probabilistic forecasting may be a useful method for predicting persistent contrail formation. A probabilistic forecast to accurately predict contrail formation over the contiguous United States (CONUS) is created by using meteorological data based on hourly meteorological analyses from the Advanced Regional Prediction System (ARPS) and from the Rapid Update Cycle (RUC) as well as GOES water vapor channel measurements, combined with surface and satellite observations of contrails. Two groups of logistic models were created. The first group of models (SURFACE models) is based on surface-based contrail observations supplemented with satellite observations of contrail occurrence. The second group of models (OUTBREAK models) is derived from a selected subgroup of satellite-based observations of widespread persistent contrails. The mean accuracies for both the SURFACE and OUTBREAK models typically exceeded 75 percent when based on the RUC or ARPS analysis data, but decreased when the logistic models were derived from ARPS forecast data.
Kahane, Leo H
2007-01-01
Using a friendly, nontechnical approach, the Second Edition of Regression Basics introduces readers to the fundamentals of regression. Accessible to anyone with an introductory statistics background, this book builds from a simple two-variable model to a model of greater complexity. Author Leo H. Kahane weaves four engaging examples throughout the text to illustrate not only the techniques of regression but also how this empirical tool can be applied in creative ways to consider a broad array of topics. New to the Second Edition Offers greater coverage of simple panel-data estimation:
Analyses of MYMIV-induced transcriptome in Vigna mungo as revealed by next generation sequencing.
Ganguli, Sayak; Dey, Avishek; Banik, Rahul; Kundu, Anirban; Pal, Amita
2016-03-01
Mungbean Yellow Mosaic Virus (MYMIV) is the viral pathogen that causes yellow mosaic disease in a number of legumes including Vigna mungo. VM84 is a recombinant inbred line resistant to MYMIV, developed in our laboratory through introgression of the resistance trait from V. mungo line VM-1. Here we present the quality-control-passed transcriptome data of mock-inoculated (control) and MYMIV-infected VM84, which have already been submitted to the Sequence Read Archive (SRX1032950, SRX1082731) of NCBI. QC reports of the FASTQ files were generated by the 'SeqQC V2.2' bioinformatics tool.
Duda, David P.; Minnis, Patrick
2009-01-01
Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
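The two accuracy measures named above have simple closed forms over a 2x2 verification table. The sketch below uses hypothetical forecast-verification counts, not the ARPS-derived data:

```python
def percent_correct(a, b, c, d):
    """PC from a 2x2 table: a=hits, b=false alarms, c=misses, d=correct negatives."""
    return (a + d) / (a + b + c + d)

def hanssen_kuipers(a, b, c, d):
    """Hanssen-Kuipers discriminant: hit rate minus false-alarm rate."""
    return a / (a + c) - b / (b + d)

# Hypothetical counts for dichotomized contrail-occurrence forecasts
a, b, c, d = 420, 80, 60, 440
pc = percent_correct(a, b, c, d)
hkd = hanssen_kuipers(a, b, c, d)
```

Changing the critical probability threshold reshuffles the counts a-d, which is why the PC and HKD scores above favor different thresholds.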
Ganter, J.H.
1996-02-01
This paper suggests that inexorable changes in society are presenting both challenges and a rich selection of technologies for responding to these challenges. The citizen is more demanding of environmental and personal protection, and of information. Simultaneously, the commercial and government information technology markets are providing new technologies like commercial off-the-shelf (COTS) software, common datasets, "open" GIS, recordable CD-ROM, and the World Wide Web. Thus one has the raw ingredients for creating new techniques and tools for spatial analysis, and these tools can support participative study and decision-making. By carrying out a strategy of thorough and demonstrably correct science, design, and development, we can move forward into a new generation of participative risk assessment and routing for radioactive and hazardous materials.
Pak Yoong
2004-01-01
One of the difficulties of conducting applied qualitative research on the applications of emerging technologies is finding available sources of relevant data for analysis. Because the adoption of emerging technologies is, by definition, new in many organizations, there is often a lack of experienced practitioners who have relevant background and are willing to provide useful information for the study. Therefore, it is necessary to design research approaches that can generate accessible and relevant data. This paper describes two case studies in which the researchers used a grounded action learning approach to study the nature of e-facilitation for face-to-face and for distributed electronic meetings. The grounded action learning approach combines two research methodologies, grounded theory and action learning, to produce a rigorous and flexible method for studying e-facilitation. The implications of this grounded action learning approach for practice and research will be discussed.
Thermodynamic analyses of a biomass-coal co-gasification power generation system.
Yan, Linbo; Yue, Guangxi; He, Boshu
2016-04-01
A novel chemical looping power generation system is presented based on the biomass-coal co-gasification with steam. The effects of different key operation parameters including biomass mass fraction (Rb), steam to carbon mole ratio (Rsc), gasification temperature (Tg) and iron to fuel mole ratio (Rif) on the system performances like energy efficiency (ηe), total energy efficiency (ηte), exergy efficiency (ηex), total exergy efficiency (ηtex) and carbon capture rate (ηcc) are analyzed. A benchmark condition is set, under which ηte, ηtex and ηcc are found to be 39.9%, 37.6% and 96.0%, respectively. Furthermore, detailed energy Sankey diagram and exergy Grassmann diagram are drawn for the entire system operating under the benchmark condition. The energy and exergy efficiencies of the units composing the system are also predicted.
How well do analyses capture dust-generating winds in the Sahara and Sahel?
Roberts, Alexander; Marsham, John; Knippertz, Peter; Parker, Douglas
2016-04-01
Airborne mineral dust is important for weather, climate and earth-system prediction. Uncertainty in winds, as well as in the land surface, is known to be key to model uncertainties for dust uplift. Recent research has shown that during the summer wet season in the Sahel, strong winds generated by the cold outflow from organized convective systems are an important dust storm mechanism (so-called haboobs), while over the inner Sahara nocturnal low-level jets forming on the pressure gradient around the heat low dominate. Together the Sahel and Sahara are the world's largest dust source. Until now there has been a severe shortage of data for evaluating models for this region. Here, we bring together new observations from the remote Sahara, made during the Fennec project, with Sahelian data from the African Monsoon Multidisciplinary Analysis (AMMA), to provide an unprecedented evaluation of dust-generating winds in the European Centre for Medium-Range Weather Forecasts ERA-Interim (ERA-I) reanalysis. Differences between observations and ERA-I are explored with specific attention to monsoon and non-monsoon influenced regions. The main results are: (1) high-speed winds are lacking in instantaneous ERA-I grid-box mean winds compared with time-averaged wind-speed observations; (2) agreement between ERA-I and observations is lower during the monsoon season, even in parts of the Sahara not directly affected by the monsoon; and (3) both the seasonal and diurnal variability are under-represented in ERA-I. ERA-I fails to capture the summertime maximum for monsoon-affected stations and, seasonally, correlations between daily-mean ERA-I and observed winds vary from 0.8 to 0.4, with lower correlations for 3-hourly data. These differences demonstrate that the model used in the production of the ERA-I reanalysis is unable to represent some important dust uplift processes, especially during the monsoon season when moist convection plays a key role, and that the product is not sufficiently
Burczyk, Jaroslaw; Koralewski, Tomasz E
2005-07-01
Assessment of contemporary pollen-mediated gene flow in plants is important for various aspects of plant population biology, genetic conservation and breeding. Here, through simulations we compare the two alternative approaches for measuring pollen-mediated gene flow: (i) the NEIGHBORHOOD model--a representative of parentage analyses, and (ii) the recently developed TWOGENER analysis of pollen pool structure. We investigate their properties in estimating the effective number of pollen parents (N(ep)) and the mean pollen dispersal distance (delta). We demonstrate that both methods provide very congruent estimates of N(ep) and delta when the methods' assumptions concerning the shape of the pollen dispersal curve and the mating system follow those used in the data simulations, although the NEIGHBORHOOD model exhibits generally lower variances of the estimates. Violations of the assumptions, especially increased selfing or long-distance pollen dispersal, affect the two methods to different degrees; however, they are still capable of providing comparable estimates of N(ep). The NEIGHBORHOOD model inherently allows estimation of both self-fertilization and outcrossing due to long-distance pollen dispersal; the TWOGENER method, however, is particularly sensitive to inflated selfing levels, which in turn may confound and suppress the effects of distant pollen movement. As a solution, we demonstrate that in the case of TWOGENER it is possible to extract the fraction of intraclass correlation that results from outcrossing only, which seems to be very relevant for measuring pollen-mediated gene flow. The two approaches differ in estimation precision and experimental effort, but they seem to be complementary depending on the main research focus and the type of population studied.
Software tool for analysing the family shopping basket without candidate generation
Roberto Carlos Naranjo Cuervo
2010-05-01
Tools are currently needed in the e-commerce environment for obtaining useful knowledge to support marketing decisions. A process is needed for this which uses a series of techniques for data processing; data mining is one such technique enabling automatic information discovery. This work presents association rules as a suitable technique for discovering how customers buy from a company offering business-to-consumer (B2C) e-business, aimed at supporting decision-making in supplying its customers or capturing new ones. Many algorithms such as Apriori, DHP, Partition, FP-Growth and Eclat are available for implementing association rules; the following criteria were defined for selecting the appropriate algorithm: database insert, computational cost, performance and execution time. The development of a software tool is also presented which involved the CRISP-DM approach; this software tool was formed by the following four sub-modules: data pre-processing, data mining, results analysis and results application. The application design used three-layer architecture: presentation logic, business logic and service logic. Data warehouse design and algorithm design were included in developing this data-mining software tool. It was tested by using a FoodMart company database; the tests included performance, functionality and results' validity, thereby allowing association rules to be found. The results led to concluding that using association rules as a data-mining technique facilitates analysing volumes of information for B2C e-business services, which represents a competitive advantage for those companies using the Internet as their sales medium.
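Association rules are ranked by support and confidence; Apriori and FP-Growth are efficient ways of finding the frequent itemsets from which the rules are built. As an illustration only (feasible for tiny baskets, with a hypothetical shopping list), a brute-force version is:

```python
from itertools import combinations

def frequent_itemsets(baskets, min_support):
    """Count all itemsets by brute force and keep those meeting min_support.
    (Apriori or FP-Growth reach the same result far more efficiently.)"""
    n = len(baskets)
    counts = {}
    for basket in baskets:
        for k in range(1, len(basket) + 1):
            for combo in combinations(sorted(basket), k):
                counts[combo] = counts.get(combo, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

# Hypothetical shopping baskets
baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
freq = frequent_itemsets(baskets, min_support=0.5)
# Rule bread -> milk: confidence = support({bread, milk}) / support({bread})
conf = freq[("bread", "milk")] / freq[("bread",)]
```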
Engel, Erwan; Ratel, Jérémy
2007-06-22
The objective of the work was to assess the relevance for the authentication of food of a novel chemometric method developed to correct mass spectrometry (MS) data for instrumental drifts, namely, the comprehensive combinatory standard correction (CCSC). Applied to gas chromatography (GC)-MS data, the method consists of analyzing a liquid sample with a mixture of n internal standards and using the best combination of standards to correct the MS signal provided by each compound. The paper focuses on the authentication of the type of feeding in farm animals based on the composition in volatile constituents of their adipose tissues. The first step of the work enabled, on one hand, ensuring the feasibility of the conversion of the adipose tissue sample into the liquid phase required for the use of the CCSC method and, on the other hand, determining the key parameters of the extraction of the volatile fraction from this liquid phase by dynamic headspace. The second step showed the relevance of the CCSC pre-processing of the MS fingerprints generated by dynamic headspace-MS analysis of lamb tissues for the discrimination of animals fed exclusively with pasture (n=8) or concentrate (n=8). When compared with filtering of raw data, internal normalization and correction by a single standard, the CCSC method increased by 17.1-, 3.3- and 1.3-fold, respectively, the number of mass fragments which discriminated the type of feeding. The final step confirmed the advantage of the CCSC pre-processing of dynamic headspace-gas chromatography-MS data for revealing molecular tracers of the type of feeding, whose number (n=72) was greater than the number of tracers obtained with raw data (n=42), internal normalization (n=63) and correction by a single standard (n=57). The relevance of the information gained by using the CCSC method is discussed.
Leake, Stanley A.; Macy, Jamie P.; Truini, Margot
2016-06-01
IntroductionThe U.S. Department of Interior’s Bureau of Reclamation, Lower Colorado Region (Reclamation) is preparing an environmental impact statement (EIS) for the Navajo Generating Station-Kayenta Mine Complex Project (NGS-KMC Project). The proposed project involves various Federal approvals that would facilitate continued operation of the Navajo Generating Station (NGS) from December 23, 2019 through 2044, and continued operation of the Kayenta Mine and support facilities (collectively called the Kayenta Mine Complex, or KMC) to supply coal to the NGS for this operational period. The EIS will consider several project alternatives that are likely to produce different effects on the Navajo (N) aquifer; the N aquifer is the principal water resource in the Black Mesa area used by the Navajo Nation, Hopi Tribe, and Peabody Western Coal Company (PWCC).The N aquifer is composed of three hydraulically connected formations—the Navajo Sandstone, the Kayenta Formation, and the Lukachukai Member of the Wingate Sandstone—that function as a single aquifer. The N aquifer is confined under most of Black Mesa, and the overlying stratigraphy limits recharge to this part of the aquifer. The N aquifer is unconfined in areas surrounding Black Mesa, and most recharge occurs where the Navajo Sandstone is exposed in the area near Shonto, Arizona. Overlying the N aquifer is the D aquifer, which includes the Dakota Sandstone, Morrison Formation, Entrada Sandstone, and Carmel Formation. The aquifer is named for the Dakota Sandstone, which is the primary water-bearing unit.The NGS is located near Page, Arizona on the Navajo Nation. The KMC, which delivers coal to NGS by way of a dedicated electric railroad, is located approximately 83 miles southeast of NGS (about 125 miles northeast of Flagstaff, Arizona). The Kayenta Mine permit area is located on about 44,073 acres of land leased within the boundaries of the Hopi and Navajo Indian Reservations. KMC has been conducting mining and
Al-Khatib, Issam A; Abu Fkhidah, Ismail; Khatib, Jumana I; Kontogianni, Stamatia
2016-03-01
Forecasting of hospital solid waste generation is a critical challenge for future planning. The composition and generation rate of hospital solid waste in hospital units was the field where the proposed methodology of the present article was applied in order to validate the results and secure the outcomes of the management plan in national hospitals. A set of three multiple-variable regression models has been derived for estimating the daily total hospital waste, general hospital waste, and total hazardous waste as a function of the number of inpatients, number of total patients, and number of beds. The application of several key indicators and validation procedures indicates the high significance and reliability of the developed models in predicting the hospital solid waste of any hospital. Methodology data were drawn from the existing scientific literature. Useful raw data were also retrieved from international organisations and the investigated hospitals' personnel. The primary generation outcomes are compared with those of other local hospitals and also with hospitals from other countries. The main outcomes, the developed model results, are presented and analysed thoroughly. The goal is for this model to act as leverage in the discussions among governmental authorities on the implementation of a national plan for safe hospital waste management in Palestine.
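A multiple-variable regression of the kind described above can be fitted by ordinary least squares. The sketch below uses made-up hospital records; the article's coefficients and data are not reproduced here:

```python
import numpy as np

# Hypothetical records: (inpatients, total patients, beds) -> total waste (kg/day)
X = np.array([[120, 300, 150],
              [80, 210, 100],
              [200, 520, 260],
              [150, 380, 190],
              [60, 160, 80]], dtype=float)
y = np.array([310.0, 215.0, 522.0, 388.0, 166.0])

# Add an intercept column and fit by ordinary least squares
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(inpatients, patients, beds):
    """Predicted daily total waste from the fitted linear model."""
    return coef[0] + coef[1] * inpatients + coef[2] * patients + coef[3] * beds
```

In practice the fitted model would be checked with the kinds of validation indicators the article mentions before being used for planning.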
Matson, Johnny L.; Kozlowski, Alison M.
2010-01-01
Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…
Nick, Todd G; Campbell, Kathleen M
2007-01-01
The Medical Subject Headings (MeSH) thesaurus used by the National Library of Medicine defines logistic regression models as "statistical models which describe the relationship between a qualitative dependent variable (that is, one which can take only certain discrete values, such as the presence or absence of a disease) and an independent variable." Logistic regression models are used to study the effects of predictor variables on categorical outcomes, and normally the outcome is binary, such as presence or absence of disease (e.g., non-Hodgkin's lymphoma), in which case the model is called a binary logistic model. When there are multiple predictors (e.g., risk factors and treatments), the model is referred to as a multiple or multivariable logistic regression model and is one of the most frequently used statistical models in medical journals. In this chapter, we examine both simple and multiple binary logistic regression models and present related issues, including interaction, categorical predictor variables, continuous predictor variables, and goodness of fit.
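As an illustration of the binary logistic model described above, the sketch below fits a single-predictor model by gradient ascent on the log-likelihood, using made-up data; a real analysis would use established statistical software and report confidence intervals:

```python
import math

def fit_logistic(xs, ys, lr=0.1, iters=5000):
    """Fit P(y=1|x) = 1/(1+exp(-(b0 + b1*x))) by gradient ascent
    on the log-likelihood, starting from b0 = b1 = 0."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += (y - p)        # gradient w.r.t. intercept
            g1 += (y - p) * x    # gradient w.r.t. slope
        b0 += lr * g0 / len(xs)
        b1 += lr * g1 / len(xs)
    return b0, b1

# Hypothetical data: risk-factor level x vs. disease presence y (0/1)
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(xs, ys)
```

Here exp(b1) is the odds ratio associated with a one-unit increase in the predictor, which is the quantity usually reported in medical journals.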
Freund, Rudolf J; Sa, Ping
2006-01-01
The book provides complete coverage of the classical methods of statistical analysis. It is designed to give students an understanding of the purpose of statistical analyses, to allow the student to determine, at least to some degree, the correct type of statistical analysis to be performed in a given situation, and to give some appreciation of what constitutes good experimental design.
Tai, Meng Wei; Chong, Zhen Feng; Asif, Muhammad Khan; Rahmat, Rabiah A; Nambiar, Phrabhakaran
2016-09-01
This study compared the suitability and precision of xerographic and computer-assisted methods for bite mark investigations. Eleven subjects were asked to bite on their forearm and the bite marks were photographically recorded. Alginate impressions of the subjects' dentition were taken and their casts were made using dental stone. The overlays generated by the xerographic method were obtained by photocopying the subjects' casts and transferring the incisal edge outlines onto a transparent sheet. The bite mark images were imported into Adobe Photoshop® software and printed to life-size. Bite mark analyses using xerographically generated overlays were done by manually comparing an overlay to the corresponding printed bite mark images. In the computer-assisted method, the subjects' casts were scanned into Adobe Photoshop®. Bite mark analyses using computer-assisted overlay generation were done by digitally matching an overlay and the corresponding bite mark images using Adobe Photoshop®. Another comparison method was superimposing the cast images on the corresponding bite mark images employing Adobe Photoshop® CS6 and GIF-Animator©. A score with a range of 0-3 was given during analysis to each precision-determining criterion, with higher scores for better matching. The Kruskal-Wallis H test showed a significant difference between the three sets of data (H=18.761); the superimposition method was the most precise, followed by computer-assisted overlay generation and lastly the xerographic method. The superior precision contributed by the digital methods is discernible despite the human skin being a poor recording medium for bite marks.
Bhamidipati, Ravi Kanth; Syed, Muzeeb; Mullangi, Ramesh; Srinivas, Nuggehally
2017-03-14
1. Dalbavancin, a lipoglycopeptide, is approved for treating gram-positive bacterial infections. The area under the plasma concentration versus time curve (AUCinf) of dalbavancin is a key parameter, and the AUCinf/MIC ratio is a critical pharmacodynamic marker. 2. Using the end-of-intravenous-infusion concentration (i.e. Cmax), the Cmax versus AUCinf relationship for dalbavancin was established by regression analyses (i.e. linear, log-log, log-linear and power models) using 21 pairs of subject data. 3. Predictions of AUCinf were performed by applying the regression equations to published Cmax data. The quotient of observed/predicted values rendered the fold difference. The mean absolute error (MAE)/root mean square error (RMSE) and correlation coefficient (r) were used in the assessment. 4. MAE and RMSE values for the various models were comparable. Cmax versus AUCinf exhibited excellent correlation (r > 0.9488). The internal data evaluation showed narrow confinement (0.84-1.14-fold difference). 5. Based on the regression models, a single-time-point strategy of using Cmax (i.e. end of 30-min infusion) is amenable as a prospective tool for predicting the AUCinf of dalbavancin in patients.
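As an illustration of the power-model option mentioned above, a relationship AUCinf = a·Cmax^b can be fitted as a straight line in log-log space. The concentration/AUC pairs below are hypothetical, not the 21 subject pairs from the study:

```python
import math

def fit_power(cmax, auc):
    """Fit AUC = a * Cmax**b by simple linear regression in log-log space."""
    lx = [math.log(x) for x in cmax]
    ly = [math.log(y) for y in auc]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
         / sum((x - mx) ** 2 for x in lx))
    a = math.exp(my - b * mx)
    return a, b

# Hypothetical Cmax (ug/mL) and AUCinf (ug*h/mL) pairs
cmax = [250.0, 280.0, 310.0, 340.0]
auc = [18000.0, 20500.0, 23200.0, 25900.0]
a, b = fit_power(cmax, auc)

# Predict AUCinf from a new Cmax and express agreement as a fold difference
predicted = a * 300.0 ** b
fold = 22000.0 / predicted  # observed/predicted
```

The fold difference plays the same role as the 0.84-1.14-fold confinement reported in the abstract: values near 1 indicate that the single-time-point prediction tracks the observed AUCinf.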
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Roest, Annelieke M; de Jonge, Peter; Williams, Craig D; de Vries, Ymkje Anna; Schoevers, Robert A; Turner, Erick H
2015-05-01
Studies have shown that the scientific literature has overestimated the efficacy of antidepressants for depression, but other indications for these drugs have not been considered. To examine reporting biases in double-blind, placebo-controlled trials on the pharmacologic treatment of anxiety disorders and quantify the extent to which these biases inflate estimates of drug efficacy. We included reviews obtained from the US Food and Drug Administration (FDA) for premarketing trials of 9 second-generation antidepressants in the treatment of anxiety disorders. A systematic search for matching publications (until December 19, 2012) was performed using PubMed, EMBASE, and the Cochrane Central Register of Controlled Trials. Double data extraction was performed for the FDA reviews and the journal articles. The Hedges g value was calculated as the measure of effect size. Reporting bias was examined and classified as study publication bias, outcome reporting bias, or spin (abstract conclusion not consistent with published results on primary end point). Separate meta-analyses were conducted for the 2 sources, and the effect of publication status on the effect estimates was examined using meta-regression. The findings of 41 of the 57 trials (72%) were positive according to the FDA, but 43 of the 45 published article conclusions (96%) were positive, evidence of reporting biases in the published literature on second-generation antidepressants for anxiety disorders. Although these biases did not significantly inflate estimates of drug efficacy, reporting biases led to significant increases in the number of positive findings in the literature.
石修权; 王增珍
2008-01-01
To explore the role and application of meta-regression and subgroup analyses in recognizing and handling heterogeneity in meta-analysis, meta-regression models were built from secondary data reported in the literature to screen for factors causing heterogeneity; subgroup analyses were then performed on the selected factors, and heterogeneity was compared before and after. The heterogeneity test on the meta-analysis data was significant (Q = 44.71, df = 27, P = 0.017). Among the candidate factors (study period, region, sample size, case/control ratio, etc.), meta-regression identified sample size as a source of heterogeneity (P = 0.012) and region as a possible source (P = 0.091). After subgroup analysis, heterogeneity decreased markedly (ΣQ fell from 44.71 to 32.11). Conclusion: meta-regression is a simple and reliable method for screening factors that affect heterogeneity, and subgroup analyses based on those factors can clearly reduce within-subgroup heterogeneity. When statistical heterogeneity exists but a pooled effect is still required, the combined use of the two methods is recommended; this correctly identifies and reduces heterogeneity and makes the results of the meta-analysis more robust and reasonable.
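The Q statistic used above to detect heterogeneity is mechanical to compute. The sketch below shows the arithmetic with invented effect sizes and variances (not the study's data), together with Higgins' I², a common companion measure.

```python
# Cochran's Q test for heterogeneity across k study effect sizes,
# using inverse-variance weights w_i = 1 / var_i.
effects = [0.42, 0.10, 0.55, 0.31, 0.90, 0.05]    # hypothetical study effects
variances = [0.04, 0.09, 0.05, 0.06, 0.04, 0.10]  # hypothetical within-study variances

weights = [1 / v for v in variances]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Q = sum_i w_i * (e_i - pooled)^2, compared against chi-square with df = k - 1.
q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
df = len(effects) - 1

# Higgins' I^2: the share of total variation beyond what chance would explain.
i2 = max(0.0, (q - df) / q)
print(round(q, 2), df, round(i2, 2))
```

A Q value well above its degrees of freedom, as in the ΣQ = 44.71 with df = 27 reported above, is the signal that meta-regression and subgroup analysis are worth pursuing.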
Regression Verification Using Impact Summaries
Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana
2013-01-01
versions [19]. These techniques compare two programs with a large degree of syntactic similarity to prove that portions of one program version are equivalent to the other. Regression verification can be used for guaranteeing backward compatibility, and for showing behavioral equivalence in programs with syntactic differences, e.g., when a program is refactored to improve its performance, maintainability, or readability. Existing regression verification techniques leverage similarities between program versions by using abstraction and decomposition techniques to improve scalability of the analysis [10, 12, 19]. The abstractions and decomposition in these techniques, e.g., summaries of unchanged code [12] or semantically equivalent methods [19], compute an over-approximation of the program behaviors. The equivalence checking results of these techniques are sound but not complete: they may characterize programs as not functionally equivalent when, in fact, they are equivalent. In this work we describe a novel approach that leverages the impact of the differences between two programs for scaling regression verification. We partition program behaviors of each version into (a) behaviors impacted by the changes and (b) behaviors not impacted (unimpacted) by the changes. Only the impacted program behaviors are used during equivalence checking. We then prove that checking equivalence of the impacted program behaviors is equivalent to checking equivalence of all program behaviors for a given depth bound. In this work we use symbolic execution to generate the program behaviors and leverage control- and data-dependence information to facilitate the partitioning of program behaviors. The impacted program behaviors are termed impact summaries. The dependence analyses that facilitate the generation of the impact summaries, we believe, could be used in conjunction with other abstraction and decomposition based approaches [10, 12] as a complementary reduction technique.
Quatela, Angelica; Callister, Robin; Patterson, Amanda; MacDonald-Wicks, Lesley
2016-10-25
This systematic review investigated the effects of differing energy intakes, macronutrient compositions, and eating patterns of meals consumed after an overnight fast on Diet Induced Thermogenesis (DIT). The initial search identified 2482 records; 26 papers remained once duplicates were removed and inclusion criteria were applied. Studies (n = 27) in the analyses were randomized crossover designs comparing the effects of two or more eating events on DIT. Higher energy intake increased DIT; in a mixed model meta-regression, for every 100 kJ increase in energy intake, DIT increased by 1.1 kJ/h (p < 0.001). Meals with a high protein or carbohydrate content had a higher DIT than high fat, although this effect was not always significant. Meals with medium chain triglycerides had a significantly higher DIT than long chain triglycerides (meta-analysis, p = 0.002). Consuming the same meal as a single bolus eating event compared to multiple small meals or snacks was associated with a significantly higher DIT (meta-analysis, p = 0.02). Unclear or inconsistent findings were found by comparing the consumption of meals quickly or slowly, and palatability was not significantly associated with DIT. These findings indicate that the magnitude of the increase in DIT is influenced by the energy intake, macronutrient composition, and eating pattern of the meal.
Douglas Blackiston
A deep understanding of cognitive processes requires functional, quantitative analyses of the steps leading from genetics and the development of nervous system structure to behavior. Molecularly-tractable model systems such as Xenopus laevis and planaria offer an unprecedented opportunity to dissect the mechanisms determining the complex structure of the brain and CNS. A standardized platform that facilitated quantitative analysis of behavior would make a significant impact on evolutionary ethology, neuropharmacology, and cognitive science. While some animal tracking systems exist, the available systems do not allow automated training (feedback to individual subjects in real time), which is necessary for operant conditioning assays. The lack of standardization in the field, and the numerous technical challenges that face the development of a versatile system with the necessary capabilities, comprise a significant barrier keeping molecular developmental biology labs from integrating behavior analysis endpoints into their pharmacological and genetic perturbations. Here we report the development of a second-generation system that is a highly flexible, powerful machine vision and environmental control platform. In order to enable multidisciplinary studies aimed at understanding the roles of genes in brain function and behavior, and aid other laboratories that do not have the facilities to undergo complex engineering development, we describe the device and the problems that it overcomes. We also present sample data using frog tadpoles and flatworms to illustrate its use. Having solved significant engineering challenges in its construction, the resulting design is a relatively inexpensive instrument of wide relevance for several fields, and will accelerate interdisciplinary discovery in pharmacology, neurobiology, regenerative medicine, and cognitive science.
Lee, Mi Hee; Lee, Soo Bong; Eo, Yang Dam; Kim, Sun Woong; Woo, Jung-Hun; Han, Soo Hee
2017-07-01
Landsat optical images have enough spatial and spectral resolution to analyze vegetation growth characteristics. However, clouds and water vapor often degrade image quality, which limits the availability of usable images for time-series vegetation vitality measurement. To overcome this shortcoming, simulated images are used as an alternative. In this study, the weighted average method, the spatial and temporal adaptive reflectance fusion model (STARFM) method, and multilinear regression analysis were tested to produce simulated Landsat normalized difference vegetation index (NDVI) images of the Korean Peninsula. The test results showed that the weighted average method produced the images most similar to the actual images, provided that images were available within 1 month before and after the target date. The STARFM method gives good results when the input image date is close to the target date; careful regional and seasonal consideration is required in selecting input images. During the summer season, clouds make it very difficult to get images close enough to the target date. Multilinear regression analysis gives meaningful results even when the input image date is not so close to the target date. Average R² values for the weighted average method, STARFM, and multilinear regression analysis were 0.741, 0.70, and 0.61, respectively.
Held, Elizabeth; Cape, Joshua; Tintle, Nathan
2016-01-01
Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.
Ramos, Dorel Soares; Negri, Jean Cesari; Kann, Zevi [Companhia Energetica de Sao Paulo, SP (Brazil); Pereira, Mario Veiga Ferraz [PSR Inc., Rio de Janeiro, RJ (Brazil)
1996-12-31
The paper describes the model SAEGET developed to analyse the thermal power plants generation expansion and its integration with the set of models currently used for planning of expansion purposes in the Brazilian electrical sector. Additionally some illustrative examples are presented and future developments of the model are proposed. (author) 3 refs., 2 figs., 2 tabs.
Time-adaptive quantile regression
Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik
2008-01-01
An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered.
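The pinball (check) loss that quantile regression minimizes can be illustrated without any simplex machinery. The sketch below, on invented data, only verifies the loss's key property for a constant model: the empirical quantile minimizes it. It is not the paper's time-adaptive algorithm, which solves the full linear program.

```python
def pinball(u, tau):
    # rho_tau(u) = u * (tau - 1[u < 0]): the check loss of quantile regression.
    return u * (tau - (1 if u < 0 else 0))

data = [1.0, 2.0, 2.5, 3.0, 4.0, 7.0, 9.0]  # invented observations
tau = 0.5

# Grid search for the constant c minimizing total check loss.
candidates = [i / 10 for i in range(0, 101)]
best = min(candidates, key=lambda c: sum(pinball(y - c, tau) for y in data))
print(best)  # 3.0, the sample median
```

Changing `tau` shifts the minimizer toward the corresponding empirical quantile, which is why a full quantile regression (with covariates) is naturally expressed as a linear program, as the abstract notes.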
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Py, Bernard
A progress report is presented of a study which applies a system of generative grammar to error analysis. The objective of the study was to reconstruct the grammar of students' interlanguage, using a systematic analysis of errors. (Interlanguage refers to the linguistic competence of a student who possesses a relatively systematic body of rules,…
Miyawaki, Shinjiro; Tawhai, Merryn H.; Hoffman, Eric A.; Lin, Ching-Long
2014-11-01
The authors have developed a method to automatically generate non-uniform CFD mesh for image-based human airway models. The sizes of generated tetrahedral elements vary in both radial and longitudinal directions to account for boundary layer and multiscale nature of pulmonary airflow. The proposed method takes advantage of our previously developed centerline-based geometry reconstruction method. In order to generate the mesh branch by branch in parallel, we used the open-source programs Gmsh and TetGen for surface and volume meshes, respectively. Both programs can specify element sizes by means of background mesh. The size of an arbitrary element in the domain is a function of wall distance, element size on the wall, and element size at the center of airway lumen. The element sizes on the wall are computed based on local flow rate and airway diameter. The total number of elements in the non-uniform mesh (10 M) was about half of that in the uniform mesh, although the computational time for the non-uniform mesh was about twice as long (170 min). The proposed method generates CFD meshes with fine elements near the wall and smooth variation of element size in the longitudinal direction, which are required, e.g., for simulations with high flow rate. NIH Grants R01-HL094315, U01-HL114494, and S10-RR022421. Computer time provided by XSEDE.
Gutschow, Christian; The ATLAS collaboration
2016-01-01
The Monte Carlo setups used by ATLAS to model boson+jets and multi-boson processes in 13 TeV pp collisions are described. Comparisons between data and several event generators are provided for key kinematic distributions at 7 TeV, 8 TeV and 13 TeV. Issues associated with sample normalisation and the evaluation of systematic uncertainties are also discussed.
Huri, Emre; Dogantekin, Engin; Hayran, Murvet; Malkan, Umit Yavuz; Ergun, Mine; Firat, Aysegul; Beyazit, Yavuz; Ustun, Huseyin; Kekilli, Murat; Dadali, Mumtaz; Astarci, Muzeyyen; Haznedaroglu, Ibrahim C
2016-01-01
Ankaferd Blood Stopper (ABS), a hemostatic agent of plant origin, has been registered for the prevention of clinical hemorrhages. Currently there are no data regarding the ultrastructural analysis of ABS at the tissue level. The aim of this study is to assess renal tissue effects via scanning electron microscopy (SEM) analyses for ABS and ABS nanohemostat (formed by the combination of self-assembling peptide amphiphile molecules and ABS). SEM experiments were performed with an FEI Nova NanoSEM 230, using the ETD detector in low vacuum mode with 30 keV beam energy. SEM analyses revealed that significant erythroid aggregation is present inside the capillary bed of the renal tissue. However, neither signs of necrosis nor any other sign of tissue damage is evident in the surrounding renal tissue supplied by the microcapillary vasculature. Our study is important for several reasons. Firstly, in our study we used ABS nanohemostat, which was recently developed. This study adds valuable information to the literature regarding ABS nanohemostat. Secondly, this study is the first ultrastructural analysis of ABS performed at the tissue level. Thirdly, we showed that ABS nanohemostat could induce vital erythroid aggregation at the renal tissue level as detected by SEM. Lastly, we detected that ABS nanohemostat causes no harm to the tissues, including necrosis or any other detrimental effects.
Bacher, P
2006-11-15
The author discusses the paper of Helmut Alt, published in the International Journal Energy Technology and Policy, in 2005. With more than 15,000 plants of different capacity ranges, Germany has become the world's No. 1 in generating electricity from wind power, covering around 4% of the electricity requirements at home. This is due on the one hand to the successful promotion by the Renewable Energy Act (EEG) and, on the other hand, to the tax breaks from loss allocation and depreciation. During low-load periods, load dispatchers today already have to balance power gradients of more than 10% of the respective network load per minute, which are increasingly covered by the provision of balancing power from conventional power plants. The legally fixed permanent subsidy burden of the electricity industry due to the high compensation fee for wind power alone currently amounts to €1.4 billion a year. If base load capacity from nuclear power plants is replaced in the medium term, this will not reduce but rather increase CO₂ emissions as generation from gas turbines will have to be increased temporarily in times of flagging winds. (A.L.B.)
Monniaux, D.
2009-06-15
Software operating critical systems (aircraft, nuclear power plants) should not fail, whereas most computerised systems of daily life (personal computer, ticket vending machines, cell phone) fail from time to time. This is not a simple engineering problem: it is known, since the works of Turing and Cook, that proving that programs work correctly is intrinsically hard. In order to solve this problem, one needs methods that are, at the same time, efficient (moderate costs in time and memory), safe (all possible failures should be found), and precise (few warnings about nonexistent failures). In order to reach a satisfactory compromise between these goals, one draws on fields as diverse as formal logic, numerical analysis, and 'classical' algorithmics. From 2002 to 2007 I participated in the development of the Astree static analyser. This suggested to me a number of side projects, both theoretical and practical (use of formal proof techniques, analysis of numerical filters...). More recently, I became interested in modular analysis of numerical properties and in the applications to program analysis of constraint solving techniques (semi-definite programming, SAT and SAT modulo theory). (author)
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights.
Regression analysis by example
Chatterjee, Samprit; Hadi, Ali S
2012-01-01
The emphasis continues to be on exploratory data analysis rather than statistical theory. The coverage offers in-depth treatment of regression diagnostics, transformation, multicollinearity, logistic regression, and robust regression.
Doddy Kastanya
2017-02-01
In any reactor physics analysis, the instantaneous power distribution in the core can be calculated when the actual bundle-wise burnup distribution is known. Considering that CANDU (Canada Deuterium Uranium) reactors utilize on-power refueling to compensate for the reduction of reactivity due to fuel burnup, in CANDU fuel management analysis snapshots of power and burnup distributions can be obtained by simulating and tracking reactor operation over an extended period using various tools, such as the *SIMULATE module of the Reactor Fueling Simulation Program (RFSP) code. However, for some studies, such as an evaluation of a conceptual design of a next-generation CANDU reactor, the preferred approach to obtain a snapshot of the power distribution in the core is based on the patterned-channel-age model implemented in the *INSTANTAN module of the RFSP code. The objective of this approach is to obtain a representative snapshot of core conditions quickly. At present, such patterns can be generated by a program called RANDIS, which is implemented within the *INSTANTAN module. In this work, we present an alternative approach to derive the patterned-channel-age model, in which a simulated-annealing-based algorithm is used to find patterns that produce reasonable power distributions.
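The simulated-annealing idea mentioned above can be sketched generically. The loop below minimizes an invented one-dimensional objective with local minima; it is not the RFSP/*INSTANTAN channel-age search, just the standard accept-or-reject-with-cooling pattern such a search would build on.

```python
import math
import random

random.seed(42)

# Invented objective with local minima; its global minimum lies near x = 2.2.
def cost(x):
    return (x - 2) ** 2 + 0.5 * math.sin(8 * x)

x = 10.0              # deliberately poor initial guess
t = 1.0               # initial temperature
best_x, best_c = x, cost(x)

for step in range(5000):
    # Propose a random neighbour; accept it if better, or with
    # Boltzmann probability exp(-delta / t) if worse.
    cand = x + random.uniform(-0.5, 0.5)
    delta = cost(cand) - cost(x)
    if delta < 0 or random.random() < math.exp(-delta / t):
        x = cand
        if cost(x) < best_c:
            best_x, best_c = x, cost(x)
    t *= 0.999        # geometric cooling schedule

print(round(best_x, 2), round(best_c, 3))
```

Early high-temperature steps let the search escape local minima; the cooling schedule then freezes it into a good basin, which is why annealing suits combinatorial searches such as channel-age pattern generation.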
Tan, Hui Teng; Lee, Keat Teong; Mohamed, Abdul Rahman
2010-07-01
Recently, second-generation bio-ethanol (SGB), which utilizes readily available lignocellulosic biomass has received much interest as another potential source of liquid biofuel comparable to biodiesel. Thus the aim of this paper is to determine the exergy efficiency and to compare the effectiveness of SGB and palm methyl ester (PME) processes. It was found that the production of bio-ethanol is more thermodynamically sustainable than that of biodiesel as the net exergy value (NExV) of SGB is 10% higher than that of PME. Contrarily, the former has a net energy value (NEV) which is 9% lower than the latter. Despite this, SGB is still strongly recommended as a potential biofuel because SGB production can help mitigate several detrimental impacts on the environment.
Rita Kéri
2016-12-01
The paper presents a field study that looked at teaching contexts as instances of joint knowledge construction. The study was part of a larger enterprise in the vein of grounded theory, exploring qualitative connections between communication dynamics and evolving cooperation patterns, aiming to provide feedback to theories on the overall relationship between communication and cooperation. This study also involved looking at the joint problem definition and planning in groups of adults with different sociocultural backgrounds. In the kinds of settings selected, participants are likely to start with diverging strategies and axioms used in articulating knowledge. Comparative analyses of formal and extracurricular teaching situations are presented in the paper, and their implications are explained in the conceptual framework of common ground, private experience, and public knowledge products. The focus is on the communicative context, the role that verbal contributions and interpersonal strategies play in jointly framing a problem: how different dimensions of communication complement or interfere with each other to serve the purposes of local and long-term coordination and knowledge production, and meanwhile shape the community. In the preliminary theoretical considerations governing the study, I aimed to develop a perspective that enables the exploration of the types of situations selected, and this has been refined to give meaningful analysis of such situations. I am presenting strategies that simultaneously shape cooperative potential and construct the means that enable joint action and limit its form, involving the creative mobilization of private worlds.
Rank regression: an alternative regression approach for data with outliers.
Chen, Tian; Tang, Wan; Lu, Ying; Tu, Xin
2014-10-01
Linear regression models are widely used in mental health and related health services research. However, the classic linear regression analysis assumes that the data are normally distributed, an assumption that is not met by the data obtained in many studies. One method of dealing with this problem is to use semi-parametric models, which do not require that the data be normally distributed. But semi-parametric models are quite sensitive to outlying observations, so the generated estimates are unreliable when study data includes outliers. In this situation, some researchers trim the extreme values prior to conducting the analysis, but the ad-hoc rules used for data trimming are based on subjective criteria so different methods of adjustment can yield different results. Rank regression provides a more objective approach to dealing with non-normal data that includes outliers. This paper uses simulated and real data to illustrate this useful regression approach for dealing with outliers and compares it to the results generated using classical regression models and semi-parametric regression models.
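The outlier resistance discussed above can be illustrated with the closely related Theil–Sen estimator (the median of all pairwise slopes). This is not the paper's rank regression method, but it shows the same rank/median-based robustness; the data below are invented.

```python
from statistics import median

# Invented data: y = 2x, with one gross outlier at the last point.
xs = list(range(10))
ys = [2 * x for x in xs]
ys[-1] = 100  # outlier

# Theil-Sen slope: the median of all pairwise slopes.
slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
          for i in range(len(xs)) for j in range(i + 1, len(xs))]
ts_slope = median(slopes)

# Ordinary least-squares slope for comparison.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
ols_slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))

print(ts_slope, round(ols_slope, 2))  # 2.0 vs. a badly inflated OLS slope
```

The median of pairwise slopes ignores the single corrupted point entirely and recovers the true slope of 2, while the squared-error criterion lets that one point drag the OLS slope far away, which is exactly the failure mode that motivates rank-based regression.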
Nomura, Kouji; Nakaji-Hirabayashi, Tadashi; Gemmei-Ide, Makoto; Kitano, Hiromi; Noguchi, Hidenori; Uosaki, Kohei
2014-09-01
Surfaces of both a cover glass and the flat plane of a semi-cylindrical quartz prism were modified with a mixture of positively and negatively charged silane coupling reagents (3-aminopropyltriethoxysilane (APTES) and 3-(trihydroxysilyl)propylmethylphosphonate (THPMP), respectively). The glass surface modified with a self-assembled monolayer (SAM) prepared at a mixing ratio of APTES:THPMP = 4:6 was electrically almost neutral and was resistant to non-specific adsorption of proteins, whereas fibroblasts gradually adhered to an amphoteric (mixed) SAM surface probably due to its stiffness, though the number of adhered cells was relatively small. Sum frequency generation (SFG) spectra indicated that total intensity of the OH stretching region (3000–3600 cm⁻¹) for the amphoteric SAM-modified quartz immersed in liquid water was smaller than those for the positively and negatively charged SAM-modified quartz prisms and a bare quartz prism in contact with liquid water. These results suggested that water molecules at the interface of water and an amphoteric SAM-modified quartz prism are not strongly oriented in comparison with those at the interface of a lopsidedly charged SAM-modified quartz prism and bare quartz. The importance of charge neutralization for the anti-biofouling properties of solid materials was strongly suggested.
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating e...
Gielen, Fabrice; Buryska, Tomas; Van Vliet, Liisa; Butz, Maren; Damborsky, Jiri; Prokop, Zbynek; Hollfelder, Florian
2015-01-06
Analysis of concentration dependencies is key to the quantitative understanding of biological and chemical systems. In experimental tests involving concentration gradients such as inhibitor library screening, the number of data points and the ratio between the stock volume and the volume required in each test determine the quality and efficiency of the information gained. Titerplate assays are currently the most widely used format, even though they require microlitre volumes. Compartmentalization of reactions in pico- to nanoliter water-in-oil droplets in microfluidic devices provides a solution for massive volume reduction. This work addresses the challenge of producing microfluidic-based concentration gradients in a way that every droplet represents one unique reagent combination. We present a simple microcapillary technique able to generate such series of monodisperse water-in-oil droplets (with a frequency of up to 10 Hz) from a sample presented in an open well (e.g., a titerplate). Time-dependent variation of the well content results in microdroplets that represent time capsules of the composition of the source well. By preserving the spatial encoding of the droplets in tubing, each reactor is assigned an accurate concentration value. We used this approach to record kinetic time courses of the haloalkane dehalogenase DbjA and analyzed 150 combinations of enzyme/substrate/inhibitor in less than 5 min, resulting in conclusive Michaelis-Menten and inhibition curves. Avoiding chips and merely requiring two pumps, a magnetic plate with a stirrer, tubing, and a pipet tip, this easy-to-use device rivals the output of much more expensive liquid handling systems using a fraction (∼100-fold less) of the reagents consumed in microwell format.
Liu, Chung-Wei; Chang, Shoou-Jinn; Brahma, Sanjaya; Hsiao, Chih-Hung; Chang, Feng Ming; Wang, Peng Han; Lo, Kuang-Yao
2015-02-01
We report a systematic study about the effect of cobalt concentration in the growth solution on the crystallization, growth, and optical properties of hydrothermally synthesized Zn1-xCoxO [0 ≤ x ≤ 0.40, x is the weight (wt.) % of Co in the growth solution] nanorods. A dilute Co concentration of 1 wt. % in the growth solution enhances the bulk crystal quality of ZnO nanorods, whereas high wt. % leads to distortion in the ZnO lattice that depresses the crystallization, growth, and surface structure quality of ZnO. Although the Co concentration in the growth solution varies from 1 to 40 wt. %, the actual doping concentration is limited to 0.28 at. %, due to the low growth temperature of 80 °C. The enhancement in the crystal quality of ZnO nanorods at dilute Co concentration in the solution is due to strain relaxation, which is significantly higher for ZnO nanorods prepared without, and with high wt. % of, Co in the growth solution. Second harmonic generation is used to investigate the net dipole distribution from these coatings, which provides detailed information about bulk and surface structure quality of ZnO nanorods at the same time. High quality ZnO nanorods are fabricated by a low-temperature (80 °C) hydrothermal synthesis method, and no post-synthesis treatment is needed for further crystallization. Therefore, this method is advantageous for the growth of high quality ZnO coatings on plastic substrates that may lead toward its application in flexible electronics.
Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-01
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). In contrast, the univariate CWT failed to simultaneously determine the quaternary mixture components; it was able to determine only PAR and PAP, and the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices, and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.
Schneider, Jesper Wiborg; Larsen, Birger; Ingwersen, Peter
2009-01-01
XML documents extracted from the IEEE collection. These data allow the construction of ad-hoc citation indexes, which enables us to carry out the hitherto largest all-author co-citation (all-ACA) study. Four ACA are made, combining the different units of analysis with the different matrix generation approaches … into groupings. Finally, the study also demonstrates the importance of sparse matrices and their potential problems in connection with factor analysis. Conclusion: We can confirm that inclusive all-ACA produce more coherent groupings of authors, whereas the present study cannot clearly confirm previous findings …
Logistic regression for circular data
Al-Daffaie, Kadhem; Khan, Shahjahan
2017-05-01
This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
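The linear-circular approach embeds the circular predictor through its cosine and sine, after which ordinary logistic regression applies. A minimal Python sketch under stated assumptions: the data are simulated rather than the Toowoomba weather records, and plain gradient ascent stands in for the paper's Newton-Raphson (the paper itself uses R).

```python
import math
import random

def fit_circular_logit(theta, y, steps=3000, lr=0.1):
    """Logistic regression with a circular predictor theta (in radians):
    logit P(y=1) = b0 + b1*cos(theta) + b2*sin(theta).
    Fitted by plain gradient ascent on the log-likelihood; the paper
    uses Newton-Raphson, which converges faster but needs the Hessian."""
    b = [0.0, 0.0, 0.0]
    feats = [(1.0, math.cos(t), math.sin(t)) for t in theta]
    n = len(y)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for f, yi in zip(feats, y):
            p = 1.0 / (1.0 + math.exp(-sum(bj * fj for bj, fj in zip(b, f))))
            for j in range(3):
                grad[j] += (yi - p) * f[j]
        for j in range(3):
            b[j] += lr * grad[j] / n
    return b

# Simulated data from a known model: logit p = -0.5 + 2*cos(theta).
random.seed(0)
theta = [random.uniform(0.0, 2.0 * math.pi) for _ in range(200)]
y = [1 if random.random() < 1.0 / (1.0 + math.exp(0.5 - 2.0 * math.cos(t)))
     else 0 for t in theta]
b = fit_circular_logit(theta, y)
```

With enough data the fitted coefficients land near the generating values; the cos/sin encoding is what makes the periodic predictor usable in an otherwise standard logistic model.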
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: ""This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable."" -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
Unitary Response Regression Models
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
Flexible survival regression modelling
Cortese, Giuliana; Scheike, Thomas H; Martinussen, Torben
2009-01-01
Regression analysis of survival data, and more generally event history data, is typically based on Cox's regression model. We here review some recent methodology, focusing on the limitations of Cox's regression model. The key limitation is that the model is not well suited to represent time-varying …
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights by m… The treatment of the topic is based on the perspective of applied researchers using quantile regression in their empirical work …
Chambers, David W
2005-01-01
Groups naturally promote their strengths and prefer values and rules that give them an identity and an advantage. This shows up as generational tensions across cohorts who share common experiences, including common elders. Dramatic cultural events in America since 1925 can help create an understanding of the differing value structures of the Silents, the Boomers, Gen Xers, and the Millennials. Differences in how these generations see motivation and values, fundamental reality, relations with others, and work are presented, as are some applications of these differences to the dental profession.
Naghshpour, Shahdad
2012-01-01
Regression analysis is the most commonly used statistical method in the world. Although few would characterize this technique as simple, regression is in fact both simple and elegant. The complexity that many attribute to regression analysis is often a reflection of their lack of familiarity with the language of mathematics. But regression analysis can be understood even without a mastery of sophisticated mathematical concepts. This book provides the foundation and will help demystify regression analysis using examples from economics and with real data to show the applications of the method. T
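As the abstract says, the core of regression analysis is conceptually simple. For a single predictor the least-squares fit has a closed form, shown here on invented data (not an example from the book):

```python
# Ordinary least squares for one predictor: the closed-form slope and
# intercept minimizing the sum of squared residuals.

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Data lying exactly on y = 2 + 3x are recovered exactly.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 + 3.0 * xi for xi in x]
a, b = ols_fit(x, y)
```

The same normal-equations idea generalizes to several predictors, which is where the balance of theory, empirical rules, and judgment that the book stresses comes in.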
Tokinaga, Shozo; Ikeda, Yoshikazu
In investments, it is not easy to identify traders' behavior from stock prices, and agent systems may help us. This paper deals with discriminant analyses of stock prices using the multifractality of time series generated via multi-agent systems and interpolation based on wavelet transforms. We assume five types of agents, some of which prefer forecast equations or production rules. It is then shown that the artificial stock price series behaves as a multifractal time series whose features are defined by the Hausdorff dimension D(h). As a result, we see the relationship between the reliability (reproducibility) of multifractality and D(h), given a sufficient number of time series data points. However, since sufficient samples are generally needed to estimate D(h), we use interpolations of multifractal time series based on the wavelet transform.
Autistic epileptiform regression.
Canitano, Roberto; Zappella, Michele
2006-01-01
Autistic regression is a well known condition that occurs in one third of children with pervasive developmental disorders, who, after normal development in the first year of life, undergo a global regression during the second year that encompasses language, social skills and play. In a portion of these subjects, epileptiform abnormalities are present with or without seizures, resembling, in some respects, other epileptiform regressions of language and behaviour such as Landau-Kleffner syndrome. In these cases, for a more accurate definition of the clinical entity, the term autistic epileptifom regression has been suggested. As in other epileptic syndromes with regression, the relationships between EEG abnormalities, language and behaviour, in autism, are still unclear. We describe two cases of autistic epileptiform regression selected from a larger group of children with autistic spectrum disorders, with the aim of discussing the clinical features of the condition, the therapeutic approach and the outcome.
Scaled Sparse Linear Regression
Sun, Tingni
2011-01-01
Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. It chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual squares and scaling the penalty in proportion to the estimated noise level. The iterative algorithm costs nearly nothing beyond the computation of a path of the sparse regression estimator for penalty levels above a threshold. For the scaled Lasso, the algorithm is a gradient descent in a convex minimization of a penalized joint loss function for the regression coefficients and noise level. Under mild regularity conditions, we prove that the method yields simultaneously an estimator for the noise level and an estimated coefficient vector in the Lasso path satisfying certain oracle inequalities for the estimation of the noise level, prediction, and the estimation of regression coefficients. These oracle inequalities provide sufficient conditions for the consistency and asymptotic...
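The alternating update the abstract describes (scale the penalty by the current noise estimate, re-estimate the noise from the mean residual squares) can be sketched in the special case of an orthonormal design, where each lasso step reduces to coordinatewise soft-thresholding. This is only an illustration of the fixed-point iteration, not the paper's path-based algorithm, and all numbers are invented:

```python
import math

def soft(z, t):
    """Soft-thresholding operator: the per-coordinate lasso solution
    under an orthonormal design."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def scaled_lasso(z, n, lam0, iters=50):
    """Alternate between (i) a lasso step with penalty proportional to
    the current noise estimate sigma and (ii) re-estimating sigma from
    the mean residual squares."""
    sigma = 1.0
    for _ in range(iters):
        beta = [soft(zj, lam0 * sigma) for zj in z]
        rss = sum((zj - bj) ** 2 for zj, bj in zip(z, beta))
        sigma = math.sqrt(rss / n)
    return beta, sigma

# Two strong signals plus 18 small "noise" coordinates (hypothetical data).
z = [3.0, -2.5] + [0.1] * 9 + [-0.2] * 9
beta, sigma = scaled_lasso(z, n=20, lam0=2.0)
```

The iteration settles at an equilibrium where only the two strong coordinates survive the thresholding and sigma reflects the residual noise, mirroring the joint estimation of coefficients and noise level described above.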
High-dimensional regression with unknown variance
Giraud, Christophe; Verzelen, Nicolas
2011-01-01
We review recent results for high-dimensional sparse linear regression in the practical case of unknown variance. Different sparsity settings are covered, including coordinate-sparsity, group-sparsity and variation-sparsity. The emphasis is put on non-asymptotic analyses and feasible procedures. In addition, a small numerical study compares the practical performance of three schemes for tuning the Lasso estimator, and some references are collected for some more general models, including multivariate regression and nonparametric regression.
Rolling Regressions with Stata
Kit Baum
2004-01-01
This talk will describe some work underway to add a "rolling regression" capability to Stata's suite of time series features. Although commands such as "statsby" permit analysis of non-overlapping subsamples in the time domain, they are not suited to the analysis of overlapping (e.g. "moving window") samples. Both moving-window and widening-window techniques are often used to judge the stability of time series regression relationships. We will present an implementation of a rolling regression...
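Stata interface aside, the computation a moving-window rolling regression performs is just OLS refit on each overlapping subsample, as in this sketch (toy data, not from the talk):

```python
def rolling_ols(x, y, window):
    """OLS (intercept, slope) over each overlapping moving window,
    i.e. the kind of analysis non-overlapping tools cannot do."""
    out = []
    for start in range(len(x) - window + 1):
        xs, ys = x[start:start + window], y[start:start + window]
        mx, my = sum(xs) / window, sum(ys) / window
        sxx = sum((v - mx) ** 2 for v in xs)
        sxy = sum((u - mx) * (v - my) for u, v in zip(xs, ys))
        slope = sxy / sxx
        out.append((my - slope * mx, slope))
    return out

# A stable relationship: every window recovers intercept 1, slope 2.
t = list(range(10))
res = rolling_ols(t, [2 * v + 1 for v in t], window=4)
```

Plotting the per-window coefficients against time is the usual way to judge the stability of the regression relationship; a widening-window variant simply fixes `start` at 0 and grows the window.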
Müller, Wolfgang; Kelley, Simon; Villa, Igor
2002-07-01
Three different geochronological techniques (stepwise-heating and laser-ablation 40Ar/39Ar, Rb-Sr microsampling) have been evaluated for dating fault-generated pseudotachylytes sampled along the Periadriatic Fault System (PAF) of the Alps. Because pseudotachylytes are whole-rock systems composed of melt, clast and alteration phases, chemical control from both Ar isotopes (Cl/K, Ca/K ratios) and EMPA analyses is crucial for their discrimination. When applied to stepwise-heating 40Ar/39Ar analyses, this approach yields accurate melt-related ages, even for complex age spectra. The spatial resolution of laser-ablation 40Ar/39Ar analyses is capable of contrasting melt, clast and alteration phases in situ, provided the clasts are not too fine grained; otherwise the result is integrated "mixed" ages carrying no geological information. Elevated Cl/K and Ca/K ratios were found to be an invaluable indicator of clast admixture or inherited 40Ar. Due to incomplete isotopic resetting during frictional melting, Rb-Sr microsampling dating did not furnish geologically meaningful ages. On the basis of isotopic disequilibria among pseudotachylyte matrix phases, and independent Rb-Sr microsampling dating of cogenetic (ultra)mylonites, the concordant 40Ar/39Ar pseudotachylyte ages are interpreted as formation ages. The investigated pseudotachylytes altogether reveal a Cretaceous to Miocene history for the entire PAF, consistent with independent geological evidence. Individual faults, however, consistently reveal narrower intervals of enhanced activity lasting a few million years. Electronic supplementary material to this paper can be obtained by using the Springer LINK server at http://dx.doi.org/10.1008/s00410-002-0381-6
Guijun YANG; Lu LIN; Runchu ZHANG
2007-01-01
Quasi-regression, motivated by problems arising in computer experiments, focuses mainly on speeding up evaluation. However, its theoretical properties have not been explored systematically. This paper shows that quasi-regression is unbiased, strongly convergent and asymptotically normal for parameter estimation, but biased for curve fitting. Furthermore, a new method called unbiased quasi-regression is proposed. In addition to retaining the above asymptotic behavior of the parameter estimates, unbiased quasi-regression is unbiased for curve fitting.
Introduction to regression graphics
Cook, R Dennis
2009-01-01
Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, written in the Xlisp-Stat language and called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava
Weisberg, Sanford
2005-01-01
Master linear regression techniques with a new edition of a classic text Reviews of the Second Edition: ""I found it enjoyable reading and so full of interesting material that even the well-informed reader will probably find something new . . . a necessity for all of those who do linear regression."" -Technometrics, February 1987 ""Overall, I feel that the book is a valuable addition to the now considerable list of texts on applied linear regression. It should be a strong contender as the leading text for a first serious course in regression analysis."" -American Scientist, May-June 1987
A Simulation Investigation of Principal Component Regression.
Allen, David E.
Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…
Gerber, Samuel [Univ. of Utah, Salt Lake City, UT (United States); Rubel, Oliver [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Bremer, Peer -Timo [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Whitaker, Ross T. [Univ. of Utah, Salt Lake City, UT (United States)
2012-01-19
This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.
Bordacconi, Mats Joe; Larsen, Martin Vinæs
2014-01-01
Humans are fundamentally primed for making causal attributions based on correlations. This implies that researchers must be careful to present their results in a manner that inhibits unwarranted causal attribution. In this paper, we present the results of an experiment that suggests regression models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results … of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression …
Matthias Schmid
Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established, yet unstable, approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
Hosmer, David W; Sturdivant, Rodney X
2013-01-01
A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
Weisberg, Sanford
2013-01-01
Praise for the Third Edition ""...this is an excellent book which could easily be used as a course text...""-International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
F.M.O. Borges
2003-12-01
One experiment was run with broiler chickens to obtain prediction equations for metabolizable energy (ME) based on feedstuff chemical analyses and on the determined ME of wheat grain and its by-products, using four different methodologies. Seven wheat grain by-products were used in five treatments: wheat grain, wheat germ, white wheat flour, dark wheat flour, wheat bran for human use, wheat bran for animal use and rough wheat bran. Based on chemical analyses of crude fiber (CF), ether extract (EE), crude protein (CP), ash (AS) and starch (ST) of the feeds and the determined values of apparent energy (MEA), true energy (MEV), apparent energy corrected by nitrogen balance (MEAn) and true energy corrected by nitrogen balance (MEVn) in five treatments, prediction equations were obtained using the stepwise procedure. CF showed the best relationship with metabolizable energy values; however, this variable alone was not enough for a good estimate of the energy values (R² below 0.80). When EE and CP were included in the equations, R² increased to 0.90 or higher in most estimates. When the equations were calculated with all treatments, the equations for MEA were less precise and R² decreased. When ME data from the traditional or force-feeding methods were used separately, the precision of the equations increased (R² higher than 0.85). For MEV and MEVn values, the best multiple linear equations included CF, EE and CP (R² > 0.90), independently of using all experimental data or separating by methodology. The estimates of MEVn values showed high precision, and the linear coefficients (a) of the equations were similar for all treatments and methodologies, which explains the small influence of the different methodologies on this parameter. NDF was not a better predictor of ME than CF.
Transductive Ordinal Regression
Seah, Chun-Wei; Ong, Yew-Soon
2011-01-01
Ordinal regression is commonly formulated as a multi-class problem with ordinal constraints. The challenge of designing accurate classifiers for ordinal regression generally increases with the number of classes involved, due to the large number of labeled patterns that are needed. Ordinal class labels, however, are often costly to calibrate or difficult to obtain. Unlabeled patterns, on the other hand, often exist in much greater abundance and are freely available. To benefit from the abundance of unlabeled patterns, we present a novel transductive learning paradigm for ordinal regression in this paper, namely Transductive Ordinal Regression (TOR). The key challenge of the present study lies in the precise estimation of both the ordinal class labels of the unlabeled data and the decision functions of the ordinal classes, simultaneously. The core elements of the proposed TOR include an objective function that caters to several commonly used loss functions cast in the transductive setting…
Nonparametric Predictive Regression
Ioannis Kasparis; Elena Andreou; Phillips, Peter C.B.
2012-01-01
A unifying framework for inference is developed in predictive regressions where the predictor has unknown integration properties and may be stationary or nonstationary. Two easily implemented nonparametric F-tests are proposed. The test statistics are related to those of Kasparis and Phillips (2012) and are obtained by kernel regression. The limit distribution of these predictive tests holds for a wide range of predictors including stationary as well as non-stationary fractional and near unit...
Multilingual speaker age recognition: regression analyses on the Lwazi corpus
Feld, M
2009-12-01
… towards improved understanding of multilingual speech processing, the current contribution investigates how an important paralinguistic aspect of speech, namely speaker age, depends on the language spoken. In particular, the authors study how certain …
Logistic regression: a brief primer.
Stoltzfus, Jill C
2011-10-01
Regression techniques are versatile in their application to medical research because they can measure associations, predict outcomes, and control for confounding variable effects. As one such technique, logistic regression is an efficient and powerful way to analyze the effect of a group of independent variables on a binary outcome by quantifying each independent variable's unique contribution. Using components of linear regression reflected in the logit scale, logistic regression iteratively identifies the strongest linear combination of variables with the greatest probability of detecting the observed outcome. Important considerations when conducting logistic regression include selecting independent variables, ensuring that relevant assumptions are met, and choosing an appropriate model building strategy. For independent variable selection, one should be guided by such factors as accepted theory, previous empirical investigations, clinical considerations, and univariate statistical analyses, with acknowledgement of potential confounding variables that should be accounted for. Basic assumptions that must be met for logistic regression include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers. Additionally, there should be an adequate number of events per independent variable to avoid an overfit model, with commonly recommended minimum "rules of thumb" ranging from 10 to 20 events per covariate. Regarding model building strategies, the three general types are direct/standard, sequential/hierarchical, and stepwise/statistical, with each having a different emphasis and purpose. Before reaching definitive conclusions from the results of any of these methods, one should formally quantify the model's internal validity (i.e., replicability within the same data set) and external validity (i.e., generalizability beyond the current sample). The resulting logistic regression model
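The iterative search for "the strongest linear combination" can be illustrated with a bare-bones single-predictor logistic fit; real analyses would use a statistics package with Newton-type fitting and the diagnostics discussed above. This sketch uses invented data and plain gradient ascent as a simplification:

```python
import math

def fit_logistic(x, y, steps=2000, lr=0.1):
    """Maximise the Bernoulli log-likelihood of logit p = b0 + b1*x
    by gradient ascent (packages typically use Newton-Raphson/IRLS)."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p
            g1 += (yi - p) * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Outcome becomes more likely as x grows, but is not perfectly separable
# (perfect separation would send the coefficients to infinity).
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(x, y)
```

With eight observations and four events this toy set is far below the 10-20 events-per-covariate rule of thumb quoted above; it is sized for illustration, not inference.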
Prediction, Regression and Critical Realism
Næss, Petter
2004-01-01
This paper considers the possibility of prediction in land use planning, and the use of statistical research methods in analyses of relationships between urban form and travel behaviour. Influential writers within the tradition of critical realism reject the possibility of predicting social phenomena. This position is fundamentally problematic to public planning. Without at least some ability to predict the likely consequences of different proposals, the justification for public sector intervention into market mechanisms will be frail. Statistical methods like regression analyses are commonly … of prediction necessary and possible in spatial planning of urban development. Finally, the political implications of positions within theory of science rejecting the possibility of predictions about social phenomena are addressed.
[Understanding logistic regression].
El Sanharawi, M; Naudet, F
2013-10-01
Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors susceptible to influence it (explanatory variables). The choice of explanatory variables to include in the logistic regression model is based on prior knowledge of the disease pathophysiology and on the statistical association between the variable and the event, as measured by the odds ratio. The main steps of the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.
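As a concrete instance of "the statistical association between the variable and the event, as measured by the odds ratio": for a single binary exposure the odds ratio comes straight from the 2x2 table, and its logarithm is what the corresponding logistic regression coefficient estimates. The counts below are invented for illustration:

```python
import math

def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a 2x2 table:
    a = exposed cases,   b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    return (a * d) / (b * c)

# Hypothetical cohort: exposure roughly triples the odds of the event.
oratio = odds_ratio(30, 70, 10, 90)
log_or = math.log(oratio)  # what the logistic coefficient would estimate
```

An odds ratio above 1 (positive log-odds) indicates that the exposure is associated with higher odds of the event, before any adjustment for confounders.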
Constrained Sparse Galerkin Regression
Loiseau, Jean-Christophe
2016-01-01
In this work, we demonstrate the use of sparse regression techniques from machine learning to identify nonlinear low-order models of a fluid system purely from measurement data. In particular, we extend the sparse identification of nonlinear dynamics (SINDy) algorithm to enforce physical constraints in the regression, leading to energy conservation. The resulting models are closely related to Galerkin projection models, but the present method does not require the use of a full-order or high-fidelity Navier-Stokes solver to project onto basis modes. Instead, the most parsimonious nonlinear model is determined that is consistent with observed measurement data and satisfies necessary constraints. The constrained Galerkin regression algorithm is implemented on the fluid flow past a circular cylinder, demonstrating the ability to accurately construct models from data.
Ferreira de Carvalho, J; Poulain, J; Da Silva, C; Wincker, P; Michon-Coudouel, S; Dheilly, A; Naquin, D; Boutte, J; Salmon, A; Ainouche, M
2013-02-01
Spartina species have a critical ecological role in salt marshes and represent an excellent system to investigate recurrent polyploid speciation. Using the 454 GS-FLX pyrosequencer, we assembled and annotated the first reference transcriptome (from roots and leaves) for two related hexaploid Spartina species that hybridize in Western Europe, the East American invasive Spartina alterniflora and the Euro-African S. maritima. The de novo read assembly generated 38 478 consensus sequences and 99% found an annotation using Poaceae databases, representing a total of 16 753 non-redundant genes. Spartina expressed sequence tags were mapped onto the Sorghum bicolor genome, where they were distributed among the subtelomeric arms of the 10 S. bicolor chromosomes, with high gene density correlation. Normalization of the complementary DNA library improved the number of annotated genes. Ecologically relevant genes were identified among GO biological function categories in salt and heavy metal stress response, C4 photosynthesis and in lignin and cellulose metabolism. Expression of some of these genes had been found to be altered by hybridization and genome duplication in a previous microarray-based study in Spartina. As these species are hexaploid, up to three duplicated homoeologs may be expected per locus. When analyzing sequence polymorphism at four different loci in S. maritima and S. alterniflora, we found up to four haplotypes per locus, suggesting the presence of two expressed homoeologous sequences with one or two allelic variants each. This reference transcriptome will allow analysis of specific Spartina genes of ecological or evolutionary interest, estimation of homoeologous gene expression variation using RNA-seq and further gene expression evolution analyses in natural populations.
Practical Session: Logistic Regression
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate the logistic regression. One investigates the different risk factors in the apparition of coronary heart disease. It has been proposed in Chapter 5 of the book of D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
Bache, Stefan Holst
A new and alternative quantile regression estimator is developed, and it is shown that the estimator is root-n-consistent and asymptotically normal. The estimator is based on a minimax 'deviance function' and has asymptotically equivalent properties to the usual quantile regression estimator. It is, however, a different and therefore new estimator. It allows for both linear and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice, but whether it has theoretical justification is still an open question.
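The usual quantile regression estimator that this new one is compared against minimises the Koenker-Bassett 'check' (pinball) loss; in the location-only case the minimiser is simply the sample quantile, which the following sketch verifies numerically. This illustrates the standard criterion, not the paper's minimax deviance construction:

```python
def check_loss(q, data, tau):
    """Pinball loss: residuals above q are weighted tau,
    residuals below q are weighted (1 - tau)."""
    return sum((tau if x >= q else tau - 1) * (x - q) for x in data)

data = list(range(1, 10))  # 1..9
# Grid-minimising over the data points recovers the empirical quantiles.
med = min(data, key=lambda q: check_loss(q, data, 0.5))
q90 = min(data, key=lambda q: check_loss(q, data, 0.9))
```

Replacing the constant q by a linear function of covariates turns this into full quantile regression, fitted by linear programming in practice.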
Ritz, Christian; Parmigiani, Giovanni
2009-01-01
R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. This book provides a coherent treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology.
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
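The stepwise idea the program abstract describes can be sketched as greedy forward selection. This Python sketch uses a simple relative-RSS stopping rule in place of the program's F-test at a user-chosen confidence level, and all data are simulated:

```python
import numpy as np

def forward_stepwise(X, y, min_improvement=1e-3):
    """Greedy forward selection: repeatedly add the predictor that most
    reduces the residual sum of squares, stopping when the relative
    improvement falls below `min_improvement`."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    rss = float(np.sum((y - y.mean()) ** 2))   # intercept-only fit
    while remaining:
        best_j, best_rss = None, rss
        for j in remaining:
            cols = np.column_stack([np.ones(n)] + [X[:, k] for k in selected + [j]])
            resid = y - cols @ np.linalg.lstsq(cols, y, rcond=None)[0]
            cand = float(resid @ resid)
            if cand < best_rss:
                best_j, best_rss = j, cand
        if best_j is None or (rss - best_rss) / rss < min_improvement:
            break
        selected.append(best_j)
        remaining.remove(best_j)
        rss = best_rss
    return selected

# Only predictors 2 and 4 actually drive y; the rest are noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
y = 3 * X[:, 2] - 2 * X[:, 4] + rng.normal(scale=0.5, size=100)
chosen = forward_stepwise(X, y)
```

The truly significant predictors are picked up first, the strongest one (index 2) before the weaker one (index 4).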
Adaptive metric kernel regression
Goutte, Cyril; Larsen, Jan
2000-01-01
regression by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Software Regression Verification
2013-12-11
of recursive procedures. Acta Informatica, 45(6):403-439, 2008. [GS11] Benny Godlin and Ofer Strichman. Regression verification. Technical Report...functions. Therefore, we need to redefine m-term. - Mutual termination. If either function f or function f′ (or both) is non-deterministic, then their
Seber, George A F
2012-01-01
Concise, mathematically clear, and comprehensive treatment of the subject.* Expanded coverage of diagnostics and methods of model fitting.* Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models.* More than 200 problems throughout the book plus outline solutions for the exercises.* This revision has been extensively class-tested.
Analyses of Generation and Release of Tritium in Nuclear Power Plant
黎辉; 梅其良; 付亚茹
2015-01-01
Tritium research, including tritium generation in the reactor core and in the primary coolant, release pathways, tritium chemical forms and release amounts, is a very important part of the environmental assessment of a nuclear power plant. Based on international operating experience, the primary coolant system, auxiliary systems, radwaste system and ventilation system were analysed, and the tritium release pathways and chemical forms were investigated. The results indicate that the theoretical calculation results agree with nuclear power plant operating data very well. The tritium contained in the primary coolant is mainly produced by ternary (three-fragment) fission, boron activation in the burnable poison rods, and activation of boron, lithium and deuterium as they pass through the core. The tritium released to the environment is mainly in the form of tritiated water, and the split between liquid and gaseous releases mainly depends on the leakage rate from the primary coolant to the reactor building and auxiliary building.
Low rank Multivariate regression
Giraud, Christophe
2010-01-01
We consider in this paper the multivariate regression problem, when the target regression matrix $A$ is close to a low rank matrix. Our primary interest is in the practical case where the variance of the noise is unknown. Our main contribution is to propose in this setting a criterion to select among a family of low rank estimators and prove a non-asymptotic oracle inequality for the resulting estimator. We also investigate the easier case where the variance of the noise is known and outline that the penalties appearing in our criteria are minimal (in some sense). These penalties involve the expected value of the Ky-Fan quasi-norm of some random matrices. These quantities can be evaluated easily in practice and upper bounds can be derived from recent results in random matrix theory.
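A minimal sketch of a low rank estimator of the kind the abstract considers, assuming the simplest possible construction: truncate the SVD of the OLS coefficient matrix to a given rank. The paper's actual contribution, a criterion for choosing the rank when the noise variance is unknown, is not implemented here.

```python
import numpy as np

def low_rank_ols(X, Y, rank):
    """Rank-constrained estimate of A in Y = X A + noise:
    compute the OLS solution, then keep only its top `rank`
    singular directions (hard SVD truncation)."""
    A_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    U, s, Vt = np.linalg.svd(A_ols, full_matrices=False)
    s[rank:] = 0.0                      # discard small singular values
    return U @ np.diag(s) @ Vt

# Simulated problem with a true rank-1 coefficient matrix.
rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))
A_true = np.outer(np.array([1.0, 0.5, -0.5, 0.2, 0.0]),
                  np.array([2.0, -1.0, 1.0]))
Y = X @ A_true + rng.normal(scale=0.05, size=(200, 3))
A_hat = low_rank_ols(X, Y, rank=1)
```

With low noise the truncated estimate is exactly rank 1 and close to the true matrix.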
Subset selection in regression
Miller, Alan
2002-01-01
Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the second edition: a separate chapter on Bayesian methods; a complete revision of the chapter on estimation; a major example from the field of near-infrared spectroscopy; more emphasis on cross-validation; greater focus on bootstrapping; stochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible; software available on the Internet for implementing many of the algorithms presented; and more examples. Subset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...
Classification and regression trees
Breiman, Leo; Olshen, Richard A; Stone, Charles J
1984-01-01
The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
Hansen, Henrik; Tarp, Finn
2001-01-01
There are, however, decreasing returns to aid, and the estimated effectiveness of aid is highly sensitive to the choice of estimator and the set of control variables. When investment and human capital are controlled for, no positive effect of aid is found. Yet, aid continues to impact on growth via investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regression is used for policy purposes.
Robust Nonstationary Regression
1993-01-01
This paper provides a robust statistical approach to nonstationary time series regression and inference. Fully modified extensions of traditional robust statistical procedures are developed which allow for endogeneities in the nonstationary regressors and serial dependence in the shocks that drive the regressors and the errors that appear in the equation being estimated. The suggested estimators involve semiparametric corrections to accommodate these possibilities and they belong to the same ...
TWO REGRESSION CREDIBILITY MODELS
Constanţa-Nicoleta BODEA
2010-03-01
In this communication we will discuss two regression credibility models from Non-Life Insurance Mathematics that can be solved by means of matrix theory. In the first regression credibility model, starting from a well-known representation formula of the inverse for a special class of matrices, a risk premium will be calculated for a contract with risk parameter θ. In the next regression credibility model, we will obtain a credibility solution in the form of a linear combination of the individual estimate (based on the data of a particular state) and the collective estimate (based on aggregate USA data). To illustrate the solution with the properties mentioned above, we shall need the well-known representation theorem for a special class of matrices, the properties of the trace for a square matrix, the scalar product of two vectors, the norm with respect to a positive definite matrix given in advance, and the complicated mathematical properties of conditional expectations and of conditional covariances.
Should metacognition be measured by logistic regression?
Rausch, Manuel; Zehetleitner, Michael
2017-03-01
Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent of rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as a measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.
Relationship between Multiple Regression and Selected Multivariable Methods.
Schumacker, Randall E.
The relationship of multiple linear regression to various multivariate statistical techniques is discussed. The importance of the standardized partial regression coefficient (beta weight) in multiple linear regression as it is applied in path, factor, LISREL, and discriminant analyses is emphasized. The multivariate methods discussed in this paper…
Panel data specifications in nonparametric kernel regression
Czekaj, Tomasz Gerard; Henningsen, Arne
parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...
Boolsen, Merete Watt
The book explains the fundamental steps of the research process and applies them to selected qualitative analyses: content analysis, Grounded Theory, argumentation analysis and discourse analysis.
Nonparametric Regression with Common Shocks
Eduardo A. Souza-Rodrigues
2016-09-01
This paper considers a nonparametric regression model for cross-sectional data in the presence of common shocks. Common shocks are allowed to be very general in nature; they do not need to be finite dimensional with a known (small) number of factors. I investigate the properties of the Nadaraya-Watson kernel estimator and determine how general the common shocks can be while still obtaining meaningful kernel estimates. Restrictions on the common shocks are necessary because kernel estimators typically manipulate conditional densities, and conditional densities do not necessarily exist in the present case. By appealing to disintegration theory, I provide sufficient conditions for the existence of such conditional densities and show that the estimator converges in probability to the Kolmogorov conditional expectation given the sigma-field generated by the common shocks. I also establish the rate of convergence and the asymptotic distribution of the kernel estimator.
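The Nadaraya-Watson estimator the paper studies has a compact form: a kernel-weighted average of the responses. A minimal sketch with a Gaussian kernel and a fixed bandwidth (no common shocks, which are the paper's actual subject):

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson kernel regression estimate: at each evaluation
    point, a locally weighted average of y_train, with weights decaying
    in the distance to each training point (Gaussian kernel)."""
    d = (x_eval[:, None] - x_train[None, :]) / bandwidth
    w = np.exp(-0.5 * d ** 2)             # kernel weights, shape (m, n)
    return (w @ y_train) / w.sum(axis=1)  # weighted average per eval point

# Recover a smooth function from noiseless samples.
x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)
est = nadaraya_watson(x, y, np.array([np.pi / 2]), bandwidth=0.2)
```

At the peak of the sine the estimate is close to 1, with the small downward bias (of order bandwidth squared) typical of kernel smoothing.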
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study considers indicators of predictive power for the widely used Generalized Linear Model (GLM), which often have some restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and in the presence of multicollinearity among the independent variables. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient based on Bias and the Root Mean Square Error (RMSE).
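The quantity being modified, the correlation between Y and E(Y|X), is easy to compute once a Poisson regression is fitted. A minimal sketch on simulated data (gradient-ascent fit of the log-link model; this is the traditional coefficient, not the authors' proposed modification):

```python
import numpy as np

def fit_poisson(X, y, lr=0.05, n_iter=5000):
    """Poisson regression with log link, fitted by gradient ascent
    on the log-likelihood; returns coefficients, intercept first."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        mu = np.exp(Xb @ beta)                 # fitted means E(Y|X)
        beta += lr * Xb.T @ (y - mu) / len(y)  # score-function step
    return beta

def regression_correlation(X, y, beta):
    """Pearson correlation between Y and its fitted mean E(Y|X)."""
    mu = np.exp(np.column_stack([np.ones(len(X)), X]) @ beta)
    return np.corrcoef(y, mu)[0, 1]

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, size=(300, 1))
y = rng.poisson(np.exp(0.5 + 1.5 * x[:, 0]))
r = regression_correlation(x, y, fit_poisson(x, y))
```

For this simulation the coefficient is moderate (roughly 0.6), reflecting that Poisson noise limits how well E(Y|X) can track Y even under the true model.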
Karim Hardani*
2012-05-01
A 10-month-old baby presented with developmental delay. He had flaccid paralysis on physical examination. An MRI of the spine revealed malformation of the ninth and tenth thoracic vertebral bodies with complete agenesis of the rest of the spine below that level. The thoracic spinal cord ends at the level of the fifth thoracic vertebra, with agenesis of the posterior arches of the eighth, ninth and tenth thoracic vertebral bodies. The roots of the cauda equina appear tightened down and backward and end in a subdermal fibrous fatty tissue at the level of the ninth and tenth thoracic vertebral bodies (closed meningocele). These findings are consistent with caudal regression syndrome.
Kim Farah Giuliani
2017-04-01
This article takes up both the scientific discourse about the Millennials (Generation Y) and the debate on how to impart information literacy and library didactics. Both topics are discussed in combination, with respect to successful didactics in libraries.
Mosher, Jennifer J; Bernberg, Erin L; Shevchenko, Olga; Kan, Jinjun; Kaplan, Louis A
2013-11-01
Longer sequences of the bacterial 16S rRNA gene could provide greater phylogenetic and taxonomic resolutions and advance knowledge of population dynamics within complex natural communities. We assessed the accuracy of a Pacific Biosciences (PacBio) single molecule, real time (SMRT) sequencing based on DNA polymerization, a promising 3rd generation high-throughput technique, and compared this to the 2nd generation Roche 454 pyrosequencing platform. Amplicons of the 16S rRNA gene from a known isolate, Shewanella oneidensis MR1, and environmental samples from two streambed habitats, rocks and sediments, and a riparian zone soil, were analyzed. On the PacBio we analyzed ~500 bp amplicons that covered the V1-V3 regions and the full 1500 bp amplicons of the V1-V9 regions. On the Roche 454 we analyzed the ~500 bp amplicons. Error rates associated with the isolate were lowest with the Roche 454 method (2%), increased by more than 2-fold for the 500 bp amplicons with the PacBio SMRT chip (4-5%), and by more than 8-fold for the full gene with the PacBio SMRT chip (17-18%). Higher error rates with the PacBio SMRT chip artificially inflated estimates of richness and lowered estimates of coverage for environmental samples. The 3rd generation sequencing technology we evaluated does not provide greater phylogenetic and taxonomic resolutions for studies of microbial ecology. © 2013.
Recursive Algorithm For Linear Regression
Varanasi, S. V.
1988-01-01
The order of the model is determined easily. The linear-regression algorithm includes recursive equations for the coefficients of a model of increased order. The algorithm eliminates duplicative calculations and facilitates the search for the minimum order of linear-regression model that fits a set of data satisfactorily.
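The order-search idea can be sketched as follows. This Python version refits from scratch at each order rather than using the recursive coefficient updates the abstract describes, and uses polynomial models on simulated data:

```python
import numpy as np

def minimum_order(x, y, max_order=8, tol=1e-6):
    """Fit polynomials of increasing order and return the smallest order
    beyond which the residual sum of squares stops improving appreciably
    (improvement measured relative to the total sum of squares)."""
    tss = float(np.sum((y - y.mean()) ** 2))
    prev_rss = tss
    for order in range(1, max_order + 1):
        V = np.vander(x, order + 1)                      # polynomial design matrix
        resid = y - V @ np.linalg.lstsq(V, y, rcond=None)[0]
        rss = float(resid @ resid)
        if prev_rss - rss < tol * tss:                   # no real gain: stop
            return order - 1
        prev_rss = rss
    return max_order

# An exact quadratic: the search should stop at order 2.
x = np.linspace(-1, 1, 50)
y = 2 * x ** 2 - x + 0.5
order = minimum_order(x, y)
```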
Tsai, Jen-Hsiung; Chen, Shui-Jen; Huang, Kuo-Lin; Lin, Wen-Yinn; Lee, Wen-Jhy; Lin, Chih-Chung; Hsieh, Lien-Te; Chiu, Juei-Yu; Kuo, Wen-Chien
2014-01-01
Biodiesel is one of the alternative energy sources that have been extensively discussed and studied. This research investigates the characteristics of particulate matter (PM), particulate carbon, and polycyclic aromatic hydrocarbons (PAHs) emitted from a generator fueled by waste-edible-oil-biodiesel with acetone and isopropyl alcohol (IPA) addition. The tested biodieselhols consisted of pure diesel oil (D100) with 1-3 vol.% pure acetone (denoted as A), 1-70 vol.% waste-edible-oil-biodiesel (denoted as W), and 1 vol.% pure isopropyl alcohol (the stabilizer, denoted as P). The results show that in comparison to W1D99, W3D97, W5D95, W10D90, and W20D80, the use of biodieselhols achieved additional reduction of PM and particulate organic carbon (OC) emission, and such reduction increased as the addition percentage of pure acetone increased. Regardless of the percentages of added waste-edible-oil-biodiesel, acetone, and isopropyl alcohol, the use of biodieselhol in place of D100 could reduce the emissions of Total-PAHs (by 6.13-42.5% (average = 24.1%)) and Total-BaPeq (by 16.6-74.8% (average = 53.2%)) from the diesel engine generator. Accordingly, the W/D blended fuels (W<20 vol.%) containing acetone (1-3 vol.%) and isopropyl alcohol (1 vol.%) are a potential alternative fuel for diesel engine generators because they substantially reduce emissions of PM, particulate OC, Total-PAHs, and Total-BaPeq. © 2013. Published by Elsevier B.V. All rights reserved.
Kandler, Anne; Shennan, Stephen
2015-12-06
Cultural change can be quantified by temporal changes in frequency of different cultural artefacts and it is a central question to identify what underlying cultural transmission processes could have caused the observed frequency changes. Observed changes, however, often describe the dynamics in samples of the population of artefacts, whereas transmission processes act on the whole population. Here we develop a modelling framework aimed at addressing this inference problem. To do so, we firstly generate population structures from which the observed sample could have been drawn randomly and then determine theoretical samples at a later time t2 produced under the assumption that changes in frequencies are caused by a specific transmission process. Thereby we also account for the potential effect of time-averaging processes in the generation of the observed sample. Subsequent statistical comparisons (e.g. using Bayesian inference) of the theoretical and observed samples at t2 can establish which processes could have produced the observed frequency data. In this way, we infer underlying transmission processes directly from available data without any equilibrium assumption. We apply this framework to a dataset describing pottery from settlements of some of the first farmers in Europe (the LBK culture) and conclude that the observed frequency dynamic of different types of decorated pottery is consistent with age-dependent selection, a preference for 'young' pottery types which is potentially indicative of fashion trends. © 2015 The Author(s).
Genetic Programming Transforms in Linear Regression Situations
Castillo, Flor; Kordon, Arthur; Villa, Carlos
The chapter summarizes the use of Genetic Programming (GP) in Multiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with an optimal trade-off between accuracy of prediction and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression, which has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transform selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing - robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.
Streamflow forecasting using functional regression
Masselot, Pierre; Dabo-Niang, Sophie; Chebana, Fateh; Ouarda, Taha B. M. J.
2016-07-01
Streamflow, as a natural phenomenon, is continuous in time, and so are the meteorological variables which influence its variability. In practice, it can be of interest to forecast the whole flow curve instead of points (daily or hourly). To this end, this paper introduces functional linear models and adapts them to hydrological forecasting. More precisely, functional linear models are regression models based on curves instead of single values. They allow one to consider the whole process instead of a limited number of time points or features. We apply these models to analyse the flow volume and the whole streamflow curve during a given period by using precipitation curves. The functional model is shown to lead to encouraging results. The potential of functional linear models to detect special features that would have been hard to see otherwise is pointed out. The functional model is also compared to the artificial neural network approach, and the advantages and disadvantages of both models are discussed. Finally, future research directions involving the functional model in hydrology are presented.
Regression in autistic spectrum disorders.
Stefanatos, Gerry A
2008-12-01
A significant proportion of children diagnosed with Autistic Spectrum Disorder experience a developmental regression characterized by a loss of previously acquired skills. This may involve a loss of speech or social responsivity, but often entails both. This paper critically reviews the phenomenon of regression in autistic spectrum disorders, highlighting the characteristics of regression, age of onset, temporal course, and long-term outcome. Important considerations for diagnosis are discussed and multiple etiological factors currently hypothesized to underlie the phenomenon are reviewed. It is argued that regressive autistic spectrum disorders can be conceptualized on a spectrum with other regressive disorders that may share common pathophysiological features. The implications of this viewpoint are discussed.
Combining Alphas via Bounded Regression
Zura Kakushadze
2015-11-01
We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted) regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or a skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.
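The core device, imposing bounds on regression weights, can be sketched with projected gradient descent. This is a generic illustration on simulated data, not the paper's algorithm or source code:

```python
import numpy as np

def bounded_regression(X, y, lower, upper, lr=0.01, n_iter=20000):
    """Least-squares weights subject to box bounds: take a gradient step
    on the squared-error loss, then clip back into [lower, upper]."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / len(y)
        w = np.clip(w - lr * grad, lower, upper)   # projection onto the box
    return w

# True weights [0.9, -0.5, 0.3]; a long-only bound [0, 1] forces the
# negative weight to the boundary.
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 3))
y = X @ np.array([0.9, -0.5, 0.3]) + rng.normal(scale=0.1, size=500)
w = bounded_regression(X, y, lower=0.0, upper=1.0)
```

The bounded solution pins the infeasible weight at its bound (exactly 0 here) while the unconstrained weights stay close to their true values.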
Competing Risks Quantile Regression at Work
Dlugosz, Stephan; Lo, Simon M. S.; Wilke, Ralf
2017-01-01
Despite its emergence as a frequently used method for the empirical analysis of multivariate data, quantile regression is yet to become a mainstream tool for the analysis of duration data. We present a pioneering empirical study on the grounds of a competing risks quantile regression model. We use large-scale maternity duration data with multiple competing risks derived from German linked social security records to analyse how public policies are related to the length of economic inactivity of young mothers after giving birth. Our results show that the model delivers detailed insights into the distribution of transitions out of maternity leave. It is found that cumulative incidences implied by the quantile regression model differ from those implied by a proportional hazards model. To foster the use of the model, we make an R-package (cmprskQR) available.
Linear regression in astronomy. I
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
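The five slope estimators compared in the paper have closed forms in the sample moments. A minimal sketch (the bisector and orthogonal formulas are as given by Isobe et al.; uncertainty estimates are omitted):

```python
import numpy as np

def regression_slopes(x, y):
    """Slopes of the five bivariate regression lines discussed in the
    paper: OLS(Y|X), OLS(X|Y), their bisector, orthogonal regression,
    and reduced major-axis (RMA) regression."""
    sxx = np.sum((x - x.mean()) ** 2)
    syy = np.sum((y - y.mean()) ** 2)
    sxy = np.sum((x - x.mean()) * (y - y.mean()))
    b1 = sxy / sxx                  # OLS of Y on X
    b2 = syy / sxy                  # OLS of X on Y, expressed as dY/dX
    # bisector of the two OLS lines
    b3 = (b1 * b2 - 1.0 + np.sqrt((1 + b1 ** 2) * (1 + b2 ** 2))) / (b1 + b2)
    # orthogonal regression: minimises perpendicular distances
    t = b2 - 1.0 / b1
    b4 = 0.5 * (t + np.sign(sxy) * np.sqrt(4.0 + t ** 2))
    # reduced major axis: geometric mean of b1 and b2
    b5 = np.sign(sxy) * np.sqrt(syy / sxx)
    return b1, b2, b3, b4, b5

# Sanity check: on an exact line all five estimators must agree.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
slopes = regression_slopes(x, 2 * x)
```

On noisy data the estimators separate: OLS(Y|X) is flattest, OLS(X|Y) steepest, and the RMA slope is their geometric mean, which is why the symmetric methods sit between the two OLS lines.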
de Brevern, Alexandre G; Meyniel, Jean-Philippe; Fairhead, Cécile; Neuvéglise, Cécile; Malpertuy, Alain
2015-01-01
Sequencing the human genome began in 1994, and 10 years of work were necessary in order to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they are faced with data management issues and analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way of managing and analysing data. We present how biologists are confronted with an abundance of methods, tools, and data formats. To overcome these problems, we focus on Big Data Information Technology innovations from the web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to the loss of interactivity with data during analysis due to high processing time, we describe solutions from Business Intelligence that allow one to regain interactivity whatever the volume of data is. We illustrate this point with a focus on the Amadea platform. Finally, we discuss visualization challenges posed by Big Data and present the latest innovations with JavaScript graphic libraries.
Knowledge and Awareness: Linear Regression
Monika Raghuvanshi
2016-12-01
Knowledge and awareness are factors guiding the development of an individual. These may seem simple and practicable, but in reality a proper combination of the two is a complex task. An economically driven state of development in younger generations is an impediment to the correct manner of development. As youths are at the learning phase, they can be molded to follow a correct lifestyle. Awareness and knowledge are important components of any formal or informal environmental education. The purpose of this study is to evaluate the relationship of these components among students of secondary/senior secondary schools who have undergone a formal study of the environment in their curricula. A suitable instrument was developed in order to measure the elements of awareness and knowledge among the participants of the study. Data were collected from various secondary and senior secondary school students in the age group 14 to 20 years, using a cluster sampling technique, from the city of Bikaner, India. Linear regression analysis was performed using the IBM SPSS 23 statistical tool. There exists a weak relationship between knowledge and awareness of environmental issues, caused by the mishandling of routine practices; hence one component can be complemented by the other for improvement in both. Knowledge and awareness are crucial factors and can provide huge opportunities in any field. Resource utilization for economic solutions may pave the way for eco-friendly products and practices. If green practices are inculcated at the learning phase, they may become normal routine. This will also help in the replenishment of the environment.
Fondeur, F. F.; Fink, S. D.
2011-12-07
A new solvent system, referred to as Next Generation Solvent or NGS, has been developed at Oak Ridge National Laboratory for the removal of cesium from alkaline solutions in the Caustic-Side Solvent Extraction process. The NGS is proposed for deployment at MCU and at the Salt Waste Processing Facility. This work investigated the chemical compatibility between NGS and 16 M, 8 M, and 3 M nitric acid from contact that may occur in handling of analytical samples from MCU or, for 3 M acid, during contactor cleaning operations at MCU. This work shows that reactions occurred between NGS components and high-molarity nitric acid. Reaction rates are much faster in 8 M and 16 M nitric acid than in 3 M nitric acid. In the case of 16 M and 8 M nitric acid, the acid reacts with the extractant to initially produce organo-nitrate species. The reaction also releases soluble fluorinated alcohols such as tetrafluoropropanol. With longer contact time, the modifier reacts to produce a tarry substance with evolved gases (NOx and possibly CO). Calorimetric analysis of the reaction product mixtures revealed that the organo-nitrate reaction products are not explosive and will not deflagrate.
Guo Li
BACKGROUND: Rapidly growing evidence suggests that microRNAs (miRNAs) are involved in a wide range of cancer malignant behaviours, including radioresistance. Therefore, the present study was designed to investigate miRNA expression patterns associated with radioresistance in NPC. METHODS: The differential expression profiles of miRNAs and mRNAs associated with NPC radioresistance were constructed. The predicted target mRNAs of miRNAs and their enriched signaling pathways were analyzed via bioinformatic algorithms. Finally, partial miRNAs and pathway-correlated target mRNAs were validated in two NPC radioresistant cell models. RESULTS: 50 known and 9 novel miRNAs with significant differences were identified, and their target mRNAs were narrowed down to 53 nasopharyngeal-/NPC-specific mRNAs. Subsequent KEGG analyses demonstrated that the 53 mRNAs were enriched in 37 signaling pathways. Further qRT-PCR assays confirmed 3 down-regulated miRNAs (miR-324-3p, miR-93-3p and miR-4501), 3 up-regulated miRNAs (miR-371a-5p, miR-34c-5p and miR-1323) and 2 novel miRNAs. Additionally, corresponding alterations of pathway-correlated target mRNAs were observed, including 5 up-regulated mRNAs (ICAM1, WNT2B, MYC, HLA-F and TGF-β1) and 3 down-regulated mRNAs (CDH1, PTENP1 and HSP90AA1). CONCLUSIONS: Our study provides an overview of the miRNA expression profile and the interactions between miRNAs and their target mRNAs, which will deepen our understanding of the important roles of miRNAs in NPC radioresistance.
Logistic regression applied to natural hazards: rare event logistic regression with replications
M. Guns
2012-06-01
Full Text Available Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulation into rare event logistic regression. This technique, called rare event logistic regression with replications, combines the strengths of probabilistic and statistical methods, and overcomes some of the limitations of previous developments through robust variable selection. The technique was developed here for the analysis of landslide controlling factors, but the concept is widely applicable to statistical analyses of natural hazards.
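The replication idea, refitting the rare-event logistic model on many resampled data sets and retaining only factors whose effects are stable, can be sketched as follows. This is an illustrative NumPy sketch on synthetic data, not the authors' implementation; the gradient-ascent fit and the sign-stability summary are assumptions made for the example.

```python
import numpy as np

def fit_logistic(X, y, iters=300, lr=0.1):
    # plain gradient-ascent logistic regression with an intercept term
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w += lr * Xb.T @ (y - p) / len(y)
    return w  # [intercept, coef_1, coef_2]

rng = np.random.default_rng(0)
n = 400
x1 = rng.normal(size=n)                              # a genuine controlling factor
x2 = rng.normal(size=n)                              # an irrelevant factor
p_true = 1.0 / (1.0 + np.exp(-(-2.5 + 1.5 * x1)))    # rare events: low base rate
y = (rng.uniform(size=n) < p_true).astype(float)

# replications: refit on bootstrap resamples and track coefficient sign stability
signs = []
for _ in range(200):
    idx = rng.integers(0, n, n)
    w = fit_logistic(np.column_stack([x1, x2])[idx], y[idx])
    signs.append(np.sign(w[1:]))
stability = (np.array(signs) > 0).mean(axis=0)       # fraction of positive signs
```

A factor whose coefficient keeps a consistent sign across replications would be retained as robust; an unstable factor would be discarded.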
Kasai, Chika; Sugimoto, Kazushi; Moritani, Isao; Tanaka, Junichiro; Oya, Yumi; Inoue, Hidekazu; Tameda, Masahiko; Shiraki, Katsuya; Ito, Masaaki; Takei, Yoshiyuki; Takase, Kojiro
2016-01-01
Colorectal cancer (CRC) is the third leading cause of cancer-related deaths in Japan. The etiology of CRC has been linked to numerous factors including genetic mutation, diet, life style, inflammation, and, recently, the gut microbiota. However, CRC-associated gut microbiota is still largely unexamined. This study used terminal restriction fragment length polymorphism (T-RFLP) and next-generation sequencing (NGS) to analyze and compare the gut microbiota of Japanese control subjects and Japanese patients with carcinoma in adenoma. Stool samples were collected from 49 control subjects, 50 patients with colon adenoma, and 9 patients with colorectal cancer (3/9 with invasive cancer and 6/9 with carcinoma in adenoma) immediately before colonoscopy; DNA was extracted from each stool sample. Based on T-RFLP analysis, 12 subjects (six control and six carcinoma-in-adenoma subjects) were selected; their samples were used for NGS and species-level analysis. T-RFLP analysis showed no significant differences in bacterial populations between the control, adenoma and cancer groups. However, NGS revealed that i) control and carcinoma-in-adenoma subjects had different gut microbiota compositions; ii) one bacterial genus (Slackia) was significantly associated with the control group and four bacterial genera (Actinomyces, Atopobium, Fusobacterium, and Haemophilus) were significantly associated with the carcinoma-in-adenoma group; and iii) several bacterial species were significantly associated with each group (control: Eubacterium coprostanoligenes; carcinoma in adenoma: Actinomyces odontolyticus, Bacteroides fragilis, Clostridium nexile, Fusobacterium varium, Haemophilus parainfluenzae, Prevotella stercorea, Streptococcus gordonii, and Veillonella dispar). Gut microbial properties differ between control subjects and carcinoma-in-adenoma patients in this Japanese population, suggesting that the gut microbiota is related to CRC prevention and development.
Smith Derek
2009-01-01
Full Text Available Abstract Background iTRAQ is a proteomics technique that uses isobaric tags for relative and absolute quantitation of tryptic peptides. In proteomics experiments, the detection and high-confidence annotation of proteins, and the significance of corresponding expression differences, can depend on the quality and the species specificity of the tryptic peptide map database used for analysis of the data. For species for which finished genome sequence data are not available, identification of proteins relies on similarity to proteins from other species, using comprehensive peptide map databases such as the MSDB. Results We were interested in characterizing ripening initiation ('veraison') in grape berries at the protein level in order to better define the molecular control of this important process for grape growers and wine makers. We developed a bioinformatic pipeline for processing EST data in order to produce a predicted tryptic peptide database specifically targeted to the wine grape cultivar Vitis vinifera cv. Cabernet Sauvignon, and lacking truncated N- and C-terminal fragments. By searching iTRAQ MS/MS data generated from berry exocarp and mesocarp samples at ripening initiation, we determined that implementation of the custom database afforded a large improvement in high-confidence peptide annotation in comparison to the MSDB. We used iTRAQ MS/MS in conjunction with custom peptide database searches to quantitatively characterize several important pathway components for berry ripening previously described at the transcriptional level, and confirmed expression patterns for these at the protein level. Conclusion We determined that a predicted peptide database for MS/MS applications can be derived from EST data using advanced clustering and trimming approaches and successfully implemented for quantitative proteome profiling. Quantitative shotgun proteome profiling holds great promise for characterizing biological processes such as fruit ripening
Linear regression in astronomy. II
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
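Case (1) above, an unweighted regression line with bootstrap resampling of the data points to estimate slope uncertainty, can be sketched in a few lines. This is a generic NumPy illustration on synthetic data, not the authors' astronomical analysis.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 80)
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.5, 80)   # true line: intercept 2, slope 0.5

def ols_line(x, y):
    # unweighted least-squares fit of y = a + b*x
    A = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # [intercept, slope]

a_hat, b_hat = ols_line(x, y)

# bootstrap: refit on resampled (x, y) pairs for an empirical slope uncertainty
slopes = []
for _ in range(1000):
    i = rng.integers(0, len(x), len(x))
    slopes.append(ols_line(x[i], y[i])[1])
slope_se = float(np.std(slopes))
```

The jackknife variant is the same loop with leave-one-out subsets in place of the resampled indices.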
Ferrini, Marcello [GeNERG - DIME/TEC, University of Genova, via all’Opera Pia 15/a, 16145 Genova (Italy); Borreani, Walter [Ansaldo Nucleare S.p.A., Corso F.M. Perrone 25, 16152 Genova (Italy); INFN, Via Dodecaneso 33, 16146 Genova (Italy); Lomonaco, Guglielmo, E-mail: guglielmo.lomonaco@unige.it [GeNERG - DIME/TEC, University of Genova, via all’Opera Pia 15/a, 16145 Genova (Italy); INFN, Via Dodecaneso 33, 16146 Genova (Italy); Magugliani, Fabrizio [Ansaldo Nucleare S.p.A., Corso F.M. Perrone 25, 16152 Genova (Italy)
2016-02-15
Lead-cooled fast reactors (LFR) have both a long history and a penchant for innovation. With early work related to their use for submarine propulsion dating to the 1950s, Russian scientists pioneered the development of reactors cooled by heavy liquid metals (HLM). More recently, there has been substantial interest in both critical and subcritical reactors cooled by lead (Pb) or lead–bismuth eutectic (LBE), not only in Russia, but also in Europe, Asia, and the USA. The growing knowledge of the thermal-fluid-dynamic properties of these fluids and the choice of the LFR as one of the six reactor types selected by the Generation IV International Forum (GIF) for further research and development have fostered the exploration of new geometries and new concepts aimed at optimizing the key components to be adopted in the Advanced Lead Fast Reactor European Demonstrator (ALFRED), the 300 MW{sub t} pool-type reactor aimed at proving the feasibility of the design concept adopted for the European Lead-cooled Fast Reactor (ELFR). In this paper, a theoretical and computational analysis is presented of a multi-blade screw pump, handling liquid lead, as the primary pump for the adopted reference conceptual design of ALFRED. The pump is first analyzed at design operating conditions from the theoretical point of view, to determine the optimal geometry according to the velocity triangles, and then modeled with a 3D CFD code (ANSYS CFX). The choice of a 3D simulation is dictated by the need to perform a detailed spatial simulation taking into account the peculiar geometry of the pump as well as the boundary-layer and turbulence effects of the flow, which are typically three-dimensional. The use of liquid lead significantly impacts the fluid dynamic design of the pump because of the key requirement to avoid any erosion effects, which have a major impact on the performance, reliability and lifespan of the pump. Although some erosion-related issues remain to be fully addressed, the results
Polynomial Regression on Riemannian Manifolds
Hinkle, Jacob; Fletcher, P Thomas; Joshi, Sarang
2012-01-01
In this paper we develop the theory of parametric polynomial regression in Riemannian manifolds and Lie groups. We show application of Riemannian polynomial regression to shape analysis in Kendall shape space. Results are presented, showing the power of polynomial regression on the classic rat skull growth data of Bookstein as well as the analysis of the shape changes associated with aging of the corpus callosum from the OASIS Alzheimer's study.
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
Gallart, Francesc; Llorens, Pilar; Pérez-Gallego, Nuria; Latron, Jérôme
2016-04-01
The Vallcebre research catchments are located in NE Spain, in a middle-mountain area with a Mediterranean sub-humid climate. Most of the bedrock consists of continental red lutites that are easily weathered into loamy soils. This area was intensely used for agriculture in the past, when most of the sunny gentle hillslopes were terraced. The land was progressively abandoned from the mid-20th century, and most of the fields were converted to meadows or were spontaneously forested. Early studies carried out in the terraced Cal Parisa catchment demonstrated the occurrence of two types of frequently saturated areas: ones situated in downslope locations with high topographic index values, and others located in the inner parts of many terraces, where the shallow water table usually outcrops due to the topographic modifications linked to terrace construction. Both the increased extent of saturated areas and the role of a man-made elementary drainage system designed for depleting water from the terraces suggested that terraced areas would induce an enhanced hydrological response during rainfall events compared with non-terraced hillslopes. The response of 3 sub-catchments, of increasing area and decreasing percentage of terraced area, during a set of major events collected over more than 15 years has been analysed. The results show that storm runoff depths were roughly proportional to precipitation above 30 mm, although the smallest catchment (Cal Parisa), with the highest percentage of terraces, was able to completely buffer rainfall events of 60 mm in one hour without any runoff when antecedent conditions were dry. Runoff coefficients depended on antecedent conditions, and peak discharges were weakly linked to rainfall intensities. Peak lag times, peak runoff rates and recession coefficients were similar in the 3 catchments; the values of the first variable were in the range between Hortonian and saturation overland flow, and the last two were in the range of
Analyzing industrial energy use through ordinary least squares regression models
Golden, Allyson Katherine
Extensive research has been performed using regression analysis and calibrated simulations to create baseline energy consumption models for residential buildings and commercial institutions. However, few attempts have been made to discuss the applicability of these methodologies to establish baseline energy consumption models for industrial manufacturing facilities. In the few studies of industrial facilities, the presented linear change-point and degree-day regression analyses illustrate ideal cases. It follows that there is a need in the established literature to discuss the methodologies and to determine their applicability for establishing baseline energy consumption models of industrial manufacturing facilities. The thesis determines the effectiveness of simple inverse linear statistical regression models when establishing baseline energy consumption models for industrial manufacturing facilities. Ordinary least squares change-point and degree-day regression methods are used to create baseline energy consumption models for nine different case studies of industrial manufacturing facilities located in the southeastern United States. The influence of ambient dry-bulb temperature and production on total facility energy consumption is observed. The energy consumption behavior of industrial manufacturing facilities is only sometimes sufficiently explained by temperature, production, or a combination of the two variables. This thesis also provides methods for generating baseline energy models that are straightforward and accessible to anyone in the industrial manufacturing community. The methods outlined in this thesis may be easily replicated by anyone that possesses basic spreadsheet software and general knowledge of the relationship between energy consumption and weather, production, or other influential variables. With the help of simple inverse linear regression models, industrial manufacturing facilities may better understand their energy consumption and
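A minimal version of the change-point idea, a baseline load plus a cooling slope above an unknown balance-point temperature, fitted by ordinary least squares over a grid of candidate change points, might look like the sketch below. The model form and grid search are standard practice, illustrated here on synthetic data rather than taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(2)
temp = rng.uniform(0.0, 35.0, 120)            # daily dry-bulb temperature (degC)
cp_true = 18.0                                # true balance-point temperature
energy = 500.0 + 12.0 * np.maximum(temp - cp_true, 0.0) + rng.normal(0.0, 8.0, 120)

def fit_change_point(t, e, grid):
    # for each candidate change point, fit OLS on max(t - cp, 0); keep the best SSE
    best = None
    for cp in grid:
        X = np.column_stack([np.ones_like(t), np.maximum(t - cp, 0.0)])
        coef, *_ = np.linalg.lstsq(X, e, rcond=None)
        sse = float(((e - X @ coef) ** 2).sum())
        if best is None or sse < best[0]:
            best = (sse, cp, coef)
    return best[1], best[2]

cp_hat, (base_hat, slope_hat) = fit_change_point(temp, energy,
                                                 np.linspace(5.0, 30.0, 101))
```

The fitted baseline, slope, and change point together form the baseline energy model against which later consumption can be compared.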
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of quantile regression models. This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensive description of the main issues concerning quantile regression: basic modeling, geometrical interpretation, estimation and inference, as well as the validity of the model and diagnostic tools. Each methodological aspect is explored and
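The basic estimation idea behind quantile regression, minimizing the asymmetric "pinball" (check) loss, can be sketched with plain subgradient descent. This is an illustrative NumPy sketch on synthetic data; production implementations use linear programming or interior-point methods instead.

```python
import numpy as np

def quantile_fit(x, y, tau, iters=8000, lr=0.02):
    # subgradient descent on the pinball (check) loss for a line a + b*x:
    # loss(u) = tau*u if u >= 0 else (tau-1)*u, with u = y - a - b*x
    X = np.column_stack([np.ones_like(x), x])
    w = np.zeros(2)
    for _ in range(iters):
        u = y - X @ w
        w += lr * X.T @ np.where(u >= 0, tau, tau - 1.0) / len(y)
    return w  # [intercept, slope]

rng = np.random.default_rng(4)
x = rng.uniform(0.0, 5.0, 300)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, 300)

w50 = quantile_fit(x, y, 0.5)   # median regression
w90 = quantile_fit(x, y, 0.9)   # upper conditional quantile
```

With homoskedastic noise the two fitted lines share a slope but the 0.9-quantile line sits above the median line, which is exactly the geometry the book's early chapters develop.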
Business applications of multiple regression
Richardson, Ronny
2015-01-01
This second edition of Business Applications of Multiple Regression describes the use of the statistical procedure called multiple regression in business situations, including forecasting and understanding the relationships between variables. The book assumes a basic understanding of statistics but reviews correlation analysis and simple regression to prepare the reader to understand and use multiple regression. The techniques described in the book are illustrated using both Microsoft Excel and a professional statistical program. Along the way, several real-world data sets are analyzed in deta
Augmenting Data with Published Results in Bayesian Linear Regression
de Leeuw, Christiaan; Klugkist, Irene
2012-01-01
In most research, linear regression analyses are performed without taking into account published results (i.e., reported summary statistics) of similar previous studies. Although the prior density in Bayesian linear regression could accommodate such prior knowledge, formal models for doing so are absent from the literature. The goal of this…
An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy
Merlo, Juan; Wagner, Philippe; Ghith, Nermin
2016-01-01
BACKGROUND AND AIM: Many multilevel logistic regression analyses of "neighbourhood and health" focus on interpreting measures of association (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that disting…
A. Alsaed
2004-11-18
''The Disposal Criticality Analysis Methodology Topical Report'' prescribes an approach to the methodology for performing postclosure criticality analyses within the monitored geologic repository at Yucca Mountain, Nevada. An essential component of the methodology is the ''Configuration Generator Model for In-Package Criticality'' that provides a tool to evaluate the probabilities of degraded configurations achieving a critical state. The configuration generator model is a risk-informed, performance-based process for evaluating the criticality potential of degraded configurations in the monitored geologic repository. The method uses event tree methods to define configuration classes derived from criticality scenarios and to identify configuration class characteristics (parameters, ranges, etc.). The probabilities of achieving the various configuration classes are derived in part from probability density functions for degradation parameters. The NRC has issued ''Safety Evaluation Report for Disposal Criticality Analysis Methodology Topical Report, Revision 0''. That report contained 28 open items that required resolution through additional documentation. Of the 28 open items, numbers 5, 6, 9, 10, 18, and 19 were concerned with a previously proposed software approach to the configuration generator methodology and, in particular, the k{sub eff} regression analysis associated with the methodology. However, the use of a k{sub eff} regression analysis is not part of the current configuration generator methodology and, thus, the referenced open items are no longer considered applicable and will not be further addressed.
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method of H.-G. Müller and U. Stadtmüller [13] (Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337, doi: 10.1214/aos/1018031100)
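The core difference-based observation is that a jump in the mean function shows up as an outlying first difference of the observations. A toy NumPy sketch (not the estimator from the paper, which is considerably more refined):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200
x = np.linspace(0.0, 1.0, n)
# smooth mean with a jump of height 1 at x = 0.5
m = np.where(x < 0.5, np.sin(2.0 * x), np.sin(2.0 * x) + 1.0)
y = m + rng.normal(0.0, 0.1, n)

d = y[1:] - y[:-1]              # first differences: smooth part contributes O(1/n)
k = int(np.argmax(np.abs(d)))   # candidate jump location
jump = float(d[k])              # rough estimate of the jump height
```

For smooth regions the differences are dominated by noise of known scale, which is what formal tests exploit to decide whether the largest difference is significantly larger than noise alone would produce.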
Logistic Regression: Concept and Application
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
A tutorial on Bayesian Normal linear regression
Klauenberg, Katy; Wübbeler, Gerd; Mickan, Bodo; Harris, Peter; Elster, Clemens
2015-12-01
Regression is a common task in metrology and often applied to calibrate instruments, evaluate inter-laboratory comparisons or determine fundamental constants, for example. Yet, a regression model cannot be uniquely formulated as a measurement function, and consequently the Guide to the Expression of Uncertainty in Measurement (GUM) and its supplements are not applicable directly. Bayesian inference, however, is well suited to regression tasks, and has the advantage of accounting for additional a priori information, which typically robustifies analyses. Furthermore, it is anticipated that future revisions of the GUM shall also embrace the Bayesian view. Guidance on Bayesian inference for regression tasks is largely lacking in metrology. For linear regression models with Gaussian measurement errors this tutorial gives explicit guidance. Divided into three steps, the tutorial first illustrates how a priori knowledge, which is available from previous experiments, can be translated into prior distributions from a specific class. These prior distributions have the advantage of yielding analytical, closed form results, thus avoiding the need to apply numerical methods such as Markov Chain Monte Carlo. Secondly, formulas for the posterior results are given, explained and illustrated, and software implementations are provided. In the third step, Bayesian tools are used to assess the assumptions behind the suggested approach. These three steps (prior elicitation, posterior calculation, and robustness to prior uncertainty and model adequacy) are critical to Bayesian inference. The general guidance given here for Normal linear regression tasks is accompanied by a simple, but real-world, metrological example. The calibration of a flow device serves as a running example and illustrates the three steps. It is shown that prior knowledge from previous calibrations of the same sonic nozzle enables robust predictions even for extrapolations.
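For the Normal linear model with known noise variance, the prior-to-posterior step described in the tutorial is available in closed form. A minimal NumPy sketch on synthetic data; the prior numbers are illustrative stand-ins for knowledge from a previous calibration, not values from the flow-device example.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-2.0, 2.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = X @ np.array([1.0, 3.0]) + rng.normal(0.0, 1.0, 50)   # true coefficients [1, 3]

sigma2 = 1.0                           # noise variance, taken as known here
m0 = np.array([0.8, 2.5])              # prior mean, e.g. from a previous calibration
S0 = np.diag([0.5 ** 2, 0.5 ** 2])     # prior covariance from reported uncertainties

# conjugate Normal posterior:
#   S_n = (S0^-1 + X'X / sigma2)^-1,   m_n = S_n (S0^-1 m0 + X'y / sigma2)
S0_inv = np.linalg.inv(S0)
Sn = np.linalg.inv(S0_inv + X.T @ X / sigma2)
mn = Sn @ (S0_inv @ m0 + X.T @ y / sigma2)
```

The posterior mean is a precision-weighted compromise between prior and data, and the posterior variance is strictly smaller than the prior variance, which is the "robustified" behaviour the tutorial emphasizes.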
Regression Testing Cost Reduction Suite
Mohamed Alaa El-Din
2014-08-01
Full Text Available The estimated cost of software maintenance exceeds 70 percent of total software costs [1], and a large portion of this maintenance expense is devoted to regression testing. Regression testing is an expensive and frequently executed maintenance activity used to revalidate modified software. Any reduction in the cost of regression testing would help to reduce the software maintenance cost. Test suites, once developed, are reused and updated frequently as the software evolves. As a result, some test cases in the test suite may become redundant when the software is modified over time, since the requirements covered by them are also covered by other test cases. Due to the resource and time constraints of re-executing large test suites, it is important to develop techniques that minimize available test suites by removing redundant test cases. In general, the test suite minimization problem is NP-complete. This paper focuses on proposing an effective approach for reducing the cost of the regression testing process. The proposed approach is applied to a real-time case study. It was found that the reduction in the cost of regression testing for each regression testing cycle is greatest for programs containing a high number of selected statements, which in turn maximizes the benefit of using the approach in regression testing of complex software systems. The reduction in the regression test suite size will reduce the effort and time required by the testing teams to execute the regression test suite. Since regression testing is done more frequently in the software maintenance phase, the overall software maintenance cost can be reduced considerably by applying the proposed approach.
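Because exact test-suite minimization is NP-complete, practical reducers fall back on heuristics such as the classic greedy set-cover approximation. A minimal Python sketch of that baseline idea (illustrative only; not the paper's proposed approach, and the coverage map is hypothetical):

```python
def minimize_suite(coverage):
    """Greedy set-cover: repeatedly keep the test covering the most
    still-uncovered requirements, until every requirement is covered."""
    remaining = set().union(*coverage.values())
    kept = []
    while remaining:
        best = max(coverage, key=lambda t: len(coverage[t] & remaining))
        kept.append(best)
        remaining -= coverage[best]
    return kept

# hypothetical coverage map: test case -> requirements it exercises
suite = {
    "t1": {1, 2, 3},
    "t2": {2, 3},      # redundant: already covered by t1
    "t3": {4},
    "t4": {3, 4},
}
reduced = minimize_suite(suite)
```

The greedy rule does not guarantee the smallest possible suite, but it is within a logarithmic factor of optimal and is cheap enough to run on every maintenance cycle.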
Slentoe, E.; Moeller, F.; Winther, M.; Hjort Mikkelsen, M.
2010-10-15
The report examines, in an integrated form, the energy, emissions and welfare economic implications of introducing Danish-produced biodiesel, i.e. rapeseed diesel (RME), and first and second generation wheat ethanol in two scenarios with low and high rates of blending with fossil-fuel-based automotive fuels. Within this project's analytical framework and assumptions, the welfare economic analysis shows that it would be beneficial for society to realize the biofuel scenarios to some extent at oil prices above $100 a barrel, while it will cause losses at oil prices of $65. In all cases, fossil fuel consumption and CO2-eq emissions are reduced, the effect of which is priced and included in the welfare economic analysis. The implementation of biofuels in Denmark will be dependent on market price. As it stands now, the market is not favorable for biofuels. The RME currently produced in Denmark is exported to other European countries where there are state subsidies. Subsidies would also be a significant factor in Denmark in achieving objectives for biofuel blending. (ln)
Who Will Win?: Predicting the Presidential Election Using Linear Regression
Lamb, John H.
2007-01-01
This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…
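The paper-and-pencil part of the activity, solving the derived normal equations, is a one-liner in NumPy. The poll/vote numbers below are hypothetical illustrations, not the article's data.

```python
import numpy as np

# least-squares line via the normal equations (X'X) beta = X'y;
# hypothetical data: poll support (x, %) vs. final vote share (y, %)
x = np.array([45.0, 48.0, 50.0, 52.0, 55.0])
y = np.array([44.0, 47.5, 50.5, 51.0, 56.0])

X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)    # [intercept, slope]

# prediction for a hypothetical candidate polling at 51%
pred = beta[0] + beta[1] * 51.0
```

This is exactly the computation students reproduce on the TI-83 Plus and in Excel, so all three tools can be checked against each other.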
Roberts, Steven; Martin, Michael
Most investigations of the adverse health effects of multiple air pollutants analyse the time series involved by simultaneously entering the multiple pollutants into a Poisson log-linear model. Concerns have been raised about this type of analysis, and it has been stated that new methodology or models should be developed for investigating the adverse health effects of multiple air pollutants. In this paper, we introduce the use of the lasso for this purpose and compare its statistical properties to those of ridge regression and the Poisson log-linear model. Ridge regression has been used in time series analyses on the adverse health effects of multiple air pollutants but its properties for this purpose have not been investigated. A series of simulation studies was used to compare the performance of the lasso, ridge regression, and the Poisson log-linear model. In these simulations, realistic mortality time series were generated with known air pollution mortality effects permitting the performance of the three models to be compared. Both the lasso and ridge regression produced more accurate estimates of the adverse health effects of the multiple air pollutants than those produced using the Poisson log-linear model. This increase in accuracy came at the expense of increased bias. Ridge regression produced more accurate estimates than the lasso, but the lasso produced more interpretable models. The lasso and ridge regression offer a flexible way of obtaining more accurate estimation of pollutant effects than that provided by the standard Poisson log-linear model.
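The contrast the paper exploits, the lasso zeroing out weak effects while ridge only shrinks them, can be reproduced in a few lines of NumPy: coordinate descent for the lasso and the closed form for ridge. This is an illustrative sketch on synthetic data, not the mortality time-series model.

```python
import numpy as np

def lasso_cd(X, y, alpha, sweeps=200):
    # cyclic coordinate descent for (1/2n)||y - Xw||^2 + alpha * ||w||_1
    n, p = X.shape
    w = np.zeros(p)
    col_ms = (X ** 2).sum(axis=0) / n
    for _ in range(sweeps):
        for j in range(p):
            r = y - X @ w + X[:, j] * w[j]           # residual ignoring feature j
            rho = X[:, j] @ r / n
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_ms[j]
    return w

def ridge(X, y, alpha):
    # closed-form ridge: (X'X + n*alpha*I)^-1 X'y
    n, p = X.shape
    return np.linalg.solve(X.T @ X + n * alpha * np.eye(p), X.T @ y)

rng = np.random.default_rng(6)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0.0, 1.0, 200)   # 3 null features

w_lasso = lasso_cd(X, y, alpha=0.5)
w_ridge = ridge(X, y, alpha=0.05)
```

The soft-threshold step sets small coefficients exactly to zero, which is why the lasso yields the more interpretable pollutant models noted in the abstract; ridge keeps every coefficient, merely shrunk.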
ORDINAL REGRESSION FOR INFORMATION RETRIEVAL
Anonymous
2008-01-01
This letter presents a new discriminative model for Information Retrieval (IR), referred to as the Ordinal Regression Model (ORM). ORM differs from most existing models in that it views IR as an ordinal regression problem (i.e. a ranking problem) rather than binary classification. It is noted that the task of IR is to rank documents according to the user's information need, so IR can be viewed as an ordinal regression problem. Two parameter learning algorithms for ORM are presented. One is a perceptron-based algorithm. The other is the ranking Support Vector Machine (SVM). The effectiveness of the proposed approach has been evaluated on the task of ad hoc retrieval using three English Text REtrieval Conference (TREC) sets and two Chinese TREC sets. Results show that ORM significantly outperforms the state-of-the-art language model approaches and the OKAPI system on all test sets, and that it is more appropriate to view IR as ordinal regression rather than binary classification.
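A perceptron-based ranking learner updates its weight vector whenever a pair of documents is ordered incorrectly. A minimal sketch of that pairwise training loop, on synthetic separable features rather than the TREC collections:

```python
import numpy as np

def ranking_perceptron(pairs, dim, epochs=100):
    # pairs: (x_better, x_worse); learn w so that w @ x_better > w @ x_worse
    w = np.zeros(dim)
    for _ in range(epochs):
        for xb, xw in pairs:
            if w @ (xb - xw) <= 0:      # mis-ordered pair: perceptron update
                w += xb - xw
    return w

rng = np.random.default_rng(8)
docs = rng.normal(size=(30, 2))                  # toy document feature vectors
true_score = docs @ np.array([2.0, -1.0])        # hidden relevance scores

# preference pairs with a margin, so the training data are linearly separable
pairs = [(docs[i], docs[j]) for i in range(30) for j in range(30)
         if true_score[i] > true_score[j] + 0.5]

w = ranking_perceptron(pairs, dim=2)
violations = sum(1 for xb, xw in pairs if w @ (xb - xw) <= 0)
```

On separable pairs the perceptron convergence theorem guarantees a weight vector that orders every training pair correctly; the ranking SVM replaces this update with a maximum-margin objective over the same pairwise constraints.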
Multiple Regression and Its Discontents
Snell, Joel C.; Marsh, Mitchell
2012-01-01
Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.
Regression methods for medical research
Tai, Bee Choo
2013-01-01
Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demands an understanding of more complex and sophisticated analytic procedures.The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
Forecasting with Dynamic Regression Models
Pankratz, Alan
2012-01-01
One of the most widely used tools in statistical forecasting, single equation regression models is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the auto correlation patterns of regression disturbance. It also includes six case studies.
Wrong Signs in Regression Coefficients
McGee, Holly
1999-01-01
When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
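The multicollinearity cause is easy to reproduce: when two predictors are nearly identical, their individual coefficients are poorly determined (and can flip sign) even though their sum stays stable. A NumPy illustration on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(0.0, 0.05, n)        # nearly collinear with x1
y = 1.0 * x1 + 1.0 * x2 + rng.normal(0.0, 1.0, n)   # both true effects positive

X = np.column_stack([x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# variance inflation factor for x1: 1 / (1 - R^2 of x1 regressed on x2)
r = np.corrcoef(x1, x2)[0, 1]
vif = 1.0 / (1.0 - r ** 2)
```

A large VIF warns that either coefficient alone may carry the "wrong" sign purely by chance, which is one of the diagnostic checks the paper recommends before trusting a cost-estimating relationship.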
From Rasch scores to regression
Christensen, Karl Bang
2006-01-01
Rasch models provide a framework for measurement and for modelling latent variables. Having measured a latent variable in a population, a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale properties. This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables, and latent regression models based on the distribution of the score.
Ridge regression estimator: combining unbiased and ordinary ridge regression methods of estimation
Sharad Damodar Gore
2009-10-01
Full Text Available Statistical literature has several methods for coping with multicollinearity. This paper introduces a new shrinkage estimator, called modified unbiased ridge (MUR). This estimator is obtained from unbiased ridge regression (URR) in the same way that ordinary ridge regression (ORR) is obtained from ordinary least squares (OLS). Properties of MUR are derived. Results on its matrix mean squared error (MMSE) are obtained. MUR is compared with ORR and URR in terms of MMSE. These results are illustrated with an example based on data generated by Hoerl and Kennard (1975).
A Matlab program for stepwise regression
Yanhong Qi
2016-03-01
Full Text Available Stepwise linear regression is a multi-variable regression technique for identifying statistically significant variables in the linear regression equation. In the present study, we present a Matlab program for stepwise regression.
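A bare-bones forward stepwise selector, in the same spirit as such a program but written here as a NumPy sketch: it ranks variables by the drop in residual sum of squares and omits the F-test entry/exit criteria a full stepwise procedure would use.

```python
import numpy as np

def forward_stepwise(X, y, max_vars):
    # greedily add the variable giving the largest drop in residual SSE
    n, p = X.shape
    selected, remaining = [], list(range(p))
    while remaining and len(selected) < max_vars:
        best_sse, best_j = None, None
        for j in remaining:
            A = np.column_stack([np.ones(n)] + [X[:, c] for c in selected + [j]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            sse = float(((y - A @ coef) ** 2).sum())
            if best_sse is None or sse < best_sse:
                best_sse, best_j = sse, j
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

rng = np.random.default_rng(10)
X = rng.normal(size=(150, 6))
y = 5.0 * X[:, 0] + 3.0 * X[:, 3] + rng.normal(0.0, 1.0, 150)  # only cols 0 and 3 matter

chosen = forward_stepwise(X, y, max_vars=2)
```

In a full implementation each candidate addition (and removal) would be gated by a partial F-test or p-value threshold rather than a fixed variable count.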
XRA image segmentation using regression
Jin, Jesse S.
1996-04-01
Segmentation is an important step in image analysis, and thresholding is one of the most important approaches to it. There are several difficulties in segmentation, such as automatically selecting a threshold, dealing with intensity distortion, and removing noise. We have developed an adaptive segmentation scheme by applying the Central Limit Theorem in regression. A Gaussian regression is used to separate the distribution of background from foreground in a single-peak histogram, and the separation helps to determine the threshold automatically. A small 3 by 3 window is applied, and the mode of the local histogram is used to overcome noise. Thresholding is based on local weighting, where regression is used again for parameter estimation. A connectivity test is applied to the final results to remove impulse noise. We have applied the algorithm to x-ray angiogram images to extract brain arteries. The algorithm works well for single-peak distributions where there is no valley in the histogram. The regression provides a method to apply knowledge in clustering. Extending regression to multiple-level segmentation needs further investigation.
Wheeler, David; Tiefelsdorf, Michael
2005-06-01
Present methodological research on geographically weighted regression (GWR) focuses primarily on extensions of the basic GWR model, while ignoring well-established diagnostics tests commonly used in standard global regression analysis. This paper investigates multicollinearity issues surrounding the local GWR coefficients at a single location and the overall correlation between GWR coefficients associated with two different exogenous variables. Results indicate that the local regression coefficients are potentially collinear even if the underlying exogenous variables in the data generating process are uncorrelated. Based on these findings, applied GWR research should practice caution in substantively interpreting the spatial patterns of local GWR coefficients. An empirical disease-mapping example is used to motivate the GWR multicollinearity problem. Controlled experiments are performed to systematically explore coefficient dependency issues in GWR. These experiments specify global models that use eigenvectors from a spatial link matrix as exogenous variables.
Montgomery, Katherine L; Vaughn, Michael G; Thompson, Sanna J; Howard, Matthew O
2013-11-01
Research on juvenile offenders has largely treated this population as a homogeneous group. However, recent findings suggest that this at-risk population may be considerably more heterogeneous than previously believed. This study compared mixture regression analyses with standard regression techniques in an effort to explain how known factors such as distress, trauma, and personality are associated with drug abuse among juvenile offenders. Researchers recruited 728 juvenile offenders from Missouri juvenile correctional facilities for participation in this study. Researchers investigated past-year substance use in relation to the following variables: demographic characteristics (gender, ethnicity, age, familial use of public assistance), antisocial behavior, and mental illness symptoms (psychopathic traits, psychiatric distress, and prior trauma). Results indicated that standard and mixed regression approaches identified significant variables related to past-year substance use among this population; however, the mixture regression methods provided greater specificity in results. Mixture regression analytic methods may help policy makers and practitioners better understand and intervene with the substance-related subgroups of juvenile offenders.
Biplots in Reduced-Rank Regression
Braak, ter C.J.F.; Looman, C.W.N.
1994-01-01
Regression problems with a number of related response variables are typically analyzed by separate multiple regressions. This paper shows how these regressions can be visualized jointly in a biplot based on reduced-rank regression. Reduced-rank regression combines multiple regression and principal component analysis …
Interpretation of Standardized Regression Coefficients in Multiple Regression.
Thayer, Jerome D.
The extent to which standardized regression coefficients (beta values) can be used to determine the importance of a variable in an equation was explored. The beta value and the part correlation coefficient--also called the semi-partial correlation coefficient and reported in squared form as the incremental "r squared"--were compared for…
Inferential Models for Linear Regression
Zuoyi Zhang
2011-09-01
Linear regression is arguably one of the most widely used statistical methods in applications. However, important problems, especially variable selection, remain a challenge for classical modes of inference. This paper develops a recently proposed framework of inferential models (IMs) in the linear regression context. In general, an IM is able to produce meaningful probabilistic summaries of the statistical evidence for and against assertions about the unknown parameter of interest, and, moreover, these summaries are shown to be properly calibrated in a frequentist sense. Here we demonstrate, using simple examples, that the IM framework is promising for linear regression analysis, including model checking, variable selection, and prediction, and for uncertain inference in general.
[Is regression of atherosclerosis possible?].
Thomas, D; Richard, J L; Emmerich, J; Bruckert, E; Delahaye, F
1992-10-01
Experimental studies have shown regression of atherosclerosis in animals given a cholesterol-rich diet and then given a normal diet or hypolipidemic therapy. Despite favourable results of clinical trials of primary prevention modifying the lipid profile, the concept of atherosclerosis regression in man remains very controversial. The methodological approach is difficult: it is based on angiographic data and requires strict standardisation of angiographic views and reliable quantitative techniques of analysis, which are available with image processing. Several methodologically acceptable clinical coronary studies have shown not only stabilisation but also regression of atherosclerotic lesions, with reductions of about 25% in total cholesterol levels and of about 40% in LDL cholesterol levels. These reductions were obtained either by drugs, as in CLAS (Cholesterol Lowering Atherosclerosis Study), FATS (Familial Atherosclerosis Treatment Study) and SCOR (Specialized Center of Research Intervention Trial); by profound modifications in dietary habits, as in the Lifestyle Heart Trial; or by surgery (ileo-caecal bypass), as in POSCH (Program On the Surgical Control of the Hyperlipidemias). On the other hand, trials with non-lipid-lowering drugs such as the calcium antagonists (INTACT, MHIS) have not shown significant regression of existing atherosclerotic lesions, but only a decrease in the number of new lesions. The clinical benefits of these regression studies are difficult to demonstrate given the limited period of observation, the relatively small population numbers, and the fact that in some cases the subjects were asymptomatic. The decrease in the number of cardiovascular events therefore seems relatively modest and concerns essentially subjects who were symptomatic initially. The clinical repercussion of studies of prevention involving a single lipid factor is probably partially due to the reduction in progression and anatomical regression of the atherosclerotic plaque.
Nonparametric regression with filtered data
Linton, Oliver; Nielsen, Jens Perch; Van Keilegom, Ingrid; 10.3150/10-BEJ260
2011-01-01
We present a general principle for estimating a regression function nonparametrically, allowing for a wide variety of data filtering, for example, repeated left truncation and right censoring. Both the mean and the median regression cases are considered. The method works by first estimating the conditional hazard function or conditional survivor function and then integrating. We also investigate improved methods that take account of model structure such as independent errors and show that such methods can improve performance when the model structure is true. We establish the pointwise asymptotic normality of our estimators.
Quasi-least squares regression
Shults, Justine
2014-01-01
Drawing on the authors' substantial expertise in modeling longitudinal and clustered data, Quasi-Least Squares Regression provides a thorough treatment of quasi-least squares (QLS) regression, a computational approach for the estimation of correlation parameters within the framework of generalized estimating equations (GEEs). The authors present a detailed evaluation of QLS methodology, demonstrating the advantages of QLS in comparison with alternative methods. They describe how QLS can be used to extend the application of the traditional GEE approach to the analysis of unequally spaced longitudinal …
Multivariate differential analyses of adolescents' experiences of ...
and second order factor analyses, correlations, multiple regression, MANOVA, ... This does not mean that the high levels of violence, crime and abuse that are aggravated by socio economic factors such as poverty, unemployment, corruption, ...
Mohamad Amin Pourhoseingholi
2008-03-01
Background: Logistic regression is one of the most widely used models to analyze the relation between one or more explanatory variables and a categorical response in the fields of epidemiology, health and medicine. When there is strong correlation among explanatory variables, i.e. multicollinearity, the efficiency of the model is reduced considerably. The objective of this research was to employ latent variables to reduce the effect of multicollinearity in the analysis of a case-control study of breast cancer risk factors.
Methods: The data belonged to a case-control study in which 300 women with breast cancer were compared to the same number of controls. To assess the effect of multicollinearity, five highly correlated quantitative variables were selected. Ordinary logistic regression on the collinear data was compared to two models containing latent variables generated using either factor analysis or principal components analysis. Estimated standard errors of the parameters were used to compare the efficiency of the models. We also conducted a simulation study in order to compare the efficiency of models with and without latent factors. All analyses were carried out using S-plus.
Results: Logistic regression based on the five primary variables showed unusual odds ratios for age at first pregnancy (OR=67960, 95%CI: 10184-453503) and for total length of breast feeding (OR=0). On the other hand, the parameters estimated for logistic regression on latent variables generated by both factor analysis and principal components analysis were statistically significant (P<0.003). Their standard errors were smaller than those of ordinary logistic regression on the original variables. The simulation showed that, in the case of normal errors and 58% reliability, logistic regression based on latent variables is more efficient than the model for the collinear variables.
Conclusions: This research
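The dimension-reduction strategy described in this study can be sketched as follows (simulated data, numpy only; the actual study used S-plus, real case-control data, and factor analysis as well as principal components): collinear predictors driven by one latent factor are replaced by their first principal-component score before fitting a logistic regression.

```python
import numpy as np

# Assumed simulation: 5 collinear predictors share one latent factor z,
# and the binary outcome depends on z through a logistic link.
rng = np.random.default_rng(3)
n = 600
z = rng.normal(size=n)                                  # latent factor
X = z[:, None] + rng.normal(scale=0.2, size=(n, 5))     # 5 collinear predictors
p_true = 1.0 / (1.0 + np.exp(-0.8 * z))
y = (rng.random(n) < p_true).astype(float)

# PCA via SVD on centred predictors; keep the first component score.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[0]

def fit_logistic(x, y, iters=50):
    """Newton-Raphson for a one-predictor logistic model with intercept."""
    X1 = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X1 @ beta))
        W = p * (1 - p)
        H = X1.T @ (X1 * W[:, None])                    # observed information
        beta = beta + np.linalg.solve(H, X1.T @ (y - p))
    return beta

beta = fit_logistic(scores, y)
# The slope on the PC score is well behaved; its sign depends only on
# the arbitrary sign convention of the SVD.
print(beta)
```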
A Regression Approach to Generate Aircraft Predictor Information
1976-07-01
Regression of lumbar disk herniation
G. Yu Evzikov
2015-01-01
Compression of a spinal nerve root, giving rise to pain and to sensory and motor disorders in the area of its innervation, is the most vivid manifestation of a herniated intervertebral disk. Different treatment modalities for these conditions, including neurosurgery, are discussed. There is recent evidence that disk herniation can regress spontaneously. The paper describes a female patient with a large lateralized disc extrusion that caused compression of the S1 nerve root, leading to obvious myotonic and radicular syndromes. Magnetic resonance imaging showed that the clinical manifestations of discogenic radiculopathy, as well as the myotonic syndrome and the morphological changes, had completely regressed 8 months later. The likely mechanism is inflammation-induced resorption of the large herniated disk fragment, which agrees with the data available in the literature. A decision to perform neurosurgery, for which the patient had indications, was made during her first consultation. After regression of the discogenic radiculopathy, only moderate pain remained, caused by musculoskeletal disorders (facet syndrome, piriformis syndrome) that were successfully eliminated by minimally invasive techniques.
Heteroscedasticity checks for regression models
无
2001-01-01
For checking heteroscedasticity in regression models, a unified approach is proposed for constructing test statistics in parametric and nonparametric regression models. For nonparametric regression, the test is not sensitive to the choice of the smoothing parameters involved in estimating the nonparametric regression function; the limiting null distribution of the test statistic remains the same over a wide range of smoothing parameters. When the covariate is one-dimensional, the tests are, under some conditions, asymptotically distribution-free. In the high-dimensional cases, the validity of bootstrap approximations is investigated. It is shown that a variant of the wild bootstrap is consistent while the classical bootstrap is not in the general case, although the classical bootstrap is applicable if an extra assumption on the conditional variance of the squared error is imposed. A simulation study is performed to provide evidence of how the tests work and to compare them with tests that have appeared in the literature. The approach may readily be extended to handle partial linear and linear autoregressive models.
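The paper's unified statistic is not reproduced here, but a minimal Breusch-Pagan-style check (simulated data, numpy only) illustrates what a heteroscedasticity test for a regression model does: regress the squared residuals on the covariate and refer n·R² to a chi-square distribution.

```python
import numpy as np

# Assumed data-generating process: the error standard deviation grows
# with x, so the homoscedasticity null is false by construction.
rng = np.random.default_rng(4)
n = 500
x = rng.uniform(0, 1, size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=0.2 + 2.0 * x, size=n)

# Fit the mean model and form squared residuals.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Auxiliary regression of squared residuals on the covariate.
u = resid ** 2
g, *_ = np.linalg.lstsq(X, u, rcond=None)
u_hat = X @ g
r2 = 1.0 - np.sum((u - u_hat) ** 2) / np.sum((u - u.mean()) ** 2)
lm = n * r2          # approximately chi-square(1) under homoscedasticity
print(lm)            # large here, since the variance clearly depends on x
```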
Cactus: An Introduction to Regression
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," he thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates, he learned to use multiple linear regression software and suddenly it all clicked into…
Growth Regression and Economic Theory
Elbers, Chris; Gunning, Jan Willem
2002-01-01
In this note we show that the standard, loglinear growth regression specification is consistent with one and only one model in the class of stochastic Ramsey models. This model is highly restrictive: it requires a Cobb-Douglas technology and a 100% depreciation rate, and it implies that risk does not af…
Correlation Weights in Multiple Regression
Waller, Niels G.; Jones, Jeff A.
2010-01-01
A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps in developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting…
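A small numpy sketch of the comparison described above (simulated data; not from the paper): with standardized, nearly uncorrelated predictors, correlation weights predict almost as well as OLS weights, which is one of the favorable cases such a theory has to characterize.

```python
import numpy as np

# Assumed setup: three uncorrelated standardized predictors, so the
# population correlation weights coincide with the OLS weights.
rng = np.random.default_rng(7)
n = 1000
X = rng.normal(size=(n, 3))
y = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(scale=1.0, size=n)

# Standardise so both weighting schemes are on the same scale.
Xs = (X - X.mean(0)) / X.std(0)
ys = (y - y.mean()) / y.std()

b_ols, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
b_corr = Xs.T @ ys / n          # zero-order correlations as weights

def r2(w):
    """Squared correlation between the weighted composite and the criterion."""
    pred = Xs @ w
    return np.corrcoef(pred, ys)[0, 1] ** 2

print(r2(b_ols), r2(b_corr))    # near-tie when predictors are uncorrelated
```

OLS weights maximise this squared correlation by construction, so correlation weights can only tie or lose; the interesting question studied above is how small the loss is in various population structures.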
Ridge Regression for Interactive Models.
Tate, Richard L.
1988-01-01
An exploratory study of the value of ridge regression for interactive models is reported. Assuming that the linear terms in a simple interactive model are centered to eliminate non-essential multicollinearity, a variety of common models, representing both ordinal and disordinal interactions, are shown to have "orientations" that are favorable to…
Satellite rainfall retrieval by logistic regression
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistic model is the probability that the rain rate of a satellite pixel is above a certain threshold. By varying the threshold, a rain-rate histogram can be obtained, from which the mean and the variance can be estimated. A logistic model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement deduced from a microwave temperature-rain-rate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistic model, simulated rain fields generated by rain-field models with prescribed parameters are needed. A stringent test of the logistic model is its ability to recover the prescribed parameters of simulated rain fields. A rain-field simulation model which preserves the fractional rain area and the lognormality of rain rates as found in GATE is developed. A stochastic regression model of branching and immigration, whose solutions are lognormally distributed in some asymptotic limits, has also been developed.
Meaney, Christopher; Moineddin, Rahim
2014-01-24
response data are generated from a discrete multinomial distribution with support on (0,1). The linear regression model, the variable-dispersion beta regression model and the fractional logit regression model all perform well across the simulation experiments under consideration. When employing beta regression to estimate covariate effects on (0,1) response data, researchers should ensure their dispersion sub-model is properly specified, else inferential errors could arise.
Inferring gene regression networks with model trees
Aguilar-Ruiz Jesus S
2010-10-01
Background: Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes by building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful for determining whether two genes have a strong global similarity, but they do not detect local similarities. Results: We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph of all the relationships among output and input genes is built, taking into account whether each pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: a Saccharomyces cerevisiae and an E. coli data set. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results to those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth- and first-order correlation-based methods. Conclusions: REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques for estimating the numerical values of the target genes by linear regression functions. They are very often more precise than linear regression models because they can add just different linear…
Downscaling Wind Forecasts via Clustering and Regression
Lee, H. S.; Zhang, Y.; Liu, Y.; Wu, L.; He, Y.; Schaake, J. C.
2016-12-01
Wind is an important weather variable and a key determinant of evaporation, snowfall and coastal flooding. At present, wind information from medium-range weather forecasts is of limited accuracy, and the associated resolution is often too coarse to be used directly for hydrologic prediction purposes. This work presents a statistical post-processing framework that will be used to generate fine-scale wind products to serve NOAA's National Water Model effort. The prototype of this framework consists of two components: a) a cluster analysis module that classifies Automated Surface Observing System (ASOS) stations into multiple groups based on elevation and/or surface roughness lengths derived from the National Land Cover Database 2011 (NLCD2011), and b) a regression module based on the Heteroscedastic Extended Logistic Regression (HXLR) technique that statistically downscales GEFS wind hindcasts to the location of the closest station within each identified cluster. The efficacy of the framework is assessed for a region that is roughly the service area of NOAA's Middle Atlantic River Forecast Center (MARFC). For this region, wind hindcasts generated from the Global Ensemble Forecast System (GEFS) are downscaled and corrected using a digital elevation model and the National Land Cover Database; observations from ASOS serve both as the predictands for establishing the relationship and as the reference for validation. Our results showed that this framework considerably enhances the quality of wind forecasts, with the Nash-Sutcliffe efficiency of the downscaled wind speed improved by 0.2-0.4 relative to the raw GEFS forecast.
Rao, V V; Yuan, T
1980-01-01
It has been traditional in demographic research to undertake studies based on cross country regression analyses of crude birth rate (CBR), its correlates, or even marital fertility rates (MFR), on various socioeconomic indicators. The general conclusion to emerge from these studies has been that there exists a relationship between fertility and certain significant socioeconomic correlates. This conclusion does not go much beyond observations based on demographic transition theory or differential fertility studies. These multiple regression studies do not come close to the dynamics and underlying processes that generate the actual observations. It seems that cross country regression analyses of the prevalence of family planning may be more useful for policy purposes. Certain correlates of the level of family planning practice have been identified: foremost among these are per capita income, adult literacy, and the period of family planning advocacy. From a policy standpoint, the literacy of the population seems to be the most amenable to intervention by policy making bodies interested in achieving optimal demographic and socioeconomic conditions within a society.
Polynomial Regressions and Nonsense Inference
Daniel Ventosa-Santaulària
2013-11-01
Polynomial specifications are widely used, not only in applied economics, but also in epidemiology, physics, political analysis and psychology, just to mention a few examples. In many cases, the data employed to estimate such specifications are time series that may exhibit stochastic nonstationary behavior. We extend Phillips' results (Phillips, P. Understanding spurious regressions in econometrics. J. Econom. 1986, 33, 311-340) by proving that inference drawn from polynomial specifications, under stochastic nonstationarity, is misleading unless the variables cointegrate. We use a generalized polynomial specification as a vehicle to study its asymptotic and finite-sample properties. Our results, therefore, lead to a call to be cautious whenever practitioners estimate polynomial regressions.
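The nonsense-inference phenomenon extended above can be reproduced in a few lines (assumed simulation design, numpy only): regressing one random walk on an independent one rejects the no-relationship null far more often than the nominal 5% level.

```python
import numpy as np

def spurious_t(rng, n=500):
    """t-statistic for the slope in a regression of one random walk on
    another, even though the two walks are generated independently."""
    y = np.cumsum(rng.normal(size=n))
    x = np.cumsum(rng.normal(size=n))
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    se = np.sqrt(resid @ resid / (n - 2) / np.sum((x - x.mean()) ** 2))
    return beta[1] / se

rng = np.random.default_rng(5)
t_stats = np.array([spurious_t(rng) for _ in range(200)])
reject = float(np.mean(np.abs(t_stats) > 1.96))
print(reject)   # far above the nominal 5% rejection rate
```

Under stationarity the rejection rate would be about 5%; with independent random walks it climbs toward 1 as the sample grows, which is exactly why nonstationary polynomial regressions mislead unless the variables cointegrate.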
Producing The New Regressive Left
Crone, Christine
… to be a committed artist, and how that translates into supporting al-Assad's rule in Syria; the Ramadan programme Harrir Aqlak's attempt to relaunch an intellectual renaissance and to promote religious pluralism; and finally, al-Mayadeen's cooperation with the pan-Latin American TV station TeleSur and its ambitions… What becomes clear from the analytical chapters is the emergence of the new cross-ideological alliance of The New Regressive Left. This emerging coalition between Shia Muslims, religious minorities, parts of the Arab Left, secular cultural producers, and the remnants of the political, strategic resistance… coalition (Iran, Hizbollah, Syria), capitalises on a series of factors that bring them together in spite of their otherwise diverse worldviews and agendas. The New Regressive Left is united by resistance against the growing influence of Saudi Arabia in the religious, cultural, political, economic…
Quantile Regression With Measurement Error
Wei, Ying
2009-08-27
Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. © 2009 American Statistical Association.
Heteroscedasticity checks for regression models
ZHU; Lixing
2001-01-01
[1] Carroll, R. J., Ruppert, D., Transformation and Weighting in Regression, New York: Chapman and Hall, 1988. [2] Cook, R. D., Weisberg, S., Diagnostics for heteroscedasticity in regression, Biometrika, 1983, 70: 1-10. [3] Davidian, M., Carroll, R. J., Variance function estimation, J. Amer. Statist. Assoc., 1987, 82: 1079-1091. [4] Bickel, P., Using residuals robustly I: Tests for heteroscedasticity, Ann. Statist., 1978, 6: 266-291. [5] Carroll, R. J., Ruppert, D., On robust tests for heteroscedasticity, Ann. Statist., 1981, 9: 205-209. [6] Eubank, R. L., Thomas, W., Detecting heteroscedasticity in nonparametric regression, J. Roy. Statist. Soc., Ser. B, 1993, 55: 145-155. [7] Diblasi, A., Bowman, A., Testing for constant variance in a linear model, Statist. and Probab. Letters, 1997, 33: 95-103. [8] Dette, H., Munk, A., Testing heteroscedasticity in nonparametric regression, J. R. Statist. Soc. B, 1998, 60: 693-708. [9] Müller, H. G., Zhao, P. L., On a semi-parametric variance function model and a test for heteroscedasticity, Ann. Statist., 1995, 23: 946-967. [10] Stute, W., Manteiga, G., Quindimil, M. P., Bootstrap approximations in model checks for regression, J. Amer. Statist. Assoc., 1998, 93: 141-149. [11] Stute, W., Thies, G., Zhu, L. X., Model checks for regression: An innovation approach, Ann. Statist., 1998, 26: 1916-1939. [12] Shorack, G. R., Wellner, J. A., Empirical Processes with Applications to Statistics, New York: Wiley, 1986. [13] Efron, B., Bootstrap methods: Another look at the jackknife, Ann. Statist., 1979, 7: 1-26. [14] Wu, C. F. J., Jackknife, bootstrap and other resampling methods in regression analysis, Ann. Statist., 1986, 14: 1261-1295. [15] Härdle, W., Mammen, E., Comparing nonparametric versus parametric regression fits, Ann. Statist., 1993, 21: 1926-1947. [16] Liu, R. Y., Bootstrap procedures under some non-i.i.d. models, Ann. Statist., 1988, 16: 1696-1708. [17] …
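The wild-bootstrap references above (Wu 1986; Liu 1988) underlie the resampling scheme used in the heteroscedasticity checks. A minimal sketch of the idea for a heteroscedastic linear model, using Rademacher (random-sign) weights and simulated data, is:

```python
import numpy as np

# Assumed model: linear mean with error variance growing in x, so the
# usual homoscedastic standard errors would be wrong.
rng = np.random.default_rng(6)
n = 200
x = rng.uniform(0, 1, size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1 + x, size=n)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Wild bootstrap: multiply each residual by an independent random sign,
# so each bootstrap error keeps that observation's own variance.
B = 500
slopes = np.empty(B)
for b in range(B):
    signs = rng.choice([-1.0, 1.0], size=n)   # Rademacher weights
    y_star = X @ beta + resid * signs
    bb, *_ = np.linalg.lstsq(X, y_star, rcond=None)
    slopes[b] = bb[1]

print(slopes.std())   # heteroscedasticity-robust spread of the slope estimate
```

Unlike the classical residual bootstrap, this scheme does not pool residuals across observations, which is why it remains consistent under heteroscedasticity.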
Clustered regression with unknown clusters
Barman, Kishor
2011-01-01
We consider a collection of prediction experiments, which are clustered in the sense that groups of experiments exhibit similar relationships between the predictor and response variables. The experiment clusters as well as the regression relationships are unknown. The regression relationships define the experiment clusters, and in general, the predictor and response variables may not exhibit any clustering. We call this prediction problem clustered regression with unknown clusters (CRUC), and in this paper we focus on linear regression. We study and compare several methods for CRUC, demonstrate their applicability to the Yahoo Learning-to-rank Challenge (YLRC) dataset, and investigate an associated mathematical model. CRUC is at the crossroads of many prior works, and we study several prediction algorithms with diverse origins: an adaptation of the expectation-maximization algorithm, an approach inspired by K-means clustering, the singular value thresholding approach to matrix rank minimization u...
Robust nonlinear regression in applications
Lim, Changwon; Sen, Pranab K.; Peddada, Shyamal D.
2013-01-01
Robust statistical methods, such as M-estimators, are needed for nonlinear regression models because of the presence of outliers/influential observations and heteroscedasticity. Outliers and influential observations are commonly observed in many applications, especially in toxicology and agricultural experiments. For example, dose response studies, which are routinely conducted in toxicology and agriculture, sometimes result in potential outliers, especially in the high dose gr...
Astronomical Methods for Nonparametric Regression
Steinhardt, Charles L.; Jermyn, Adam
2017-01-01
I will discuss commonly used techniques for nonparametric regression in astronomy. We find that several of them, particularly running averages and running medians, are generically biased, asymmetric between dependent and independent variables, and perform poorly in recovering the underlying function, even when errors are present only in one variable. We then examine less commonly used techniques such as Multivariate Adaptive Regression Splines and Boosted Trees and find them superior in bias, asymmetry, and variance, both theoretically and in practice, under a wide range of numerical benchmarks. In this context the chief advantage of the common techniques is runtime, which even for large datasets is now measured in microseconds, compared with milliseconds for the more statistically robust techniques. This points to a tradeoff between bias, variance, and computational resources which in recent years has shifted heavily in favor of the more advanced methods, primarily driven by Moore's Law. Along these lines, we also propose a new algorithm which has better overall statistical properties than all techniques examined thus far, at the cost of significantly worse runtime, in addition to providing guidance on choosing the nonparametric regression technique most suitable to any specific problem. We then examine the more general problem of errors in both variables and provide a new algorithm which performs well in most cases and lacks the clear asymmetry of existing nonparametric methods, which fail to account for errors in both variables.
Genetics Home Reference: caudal regression syndrome
Caudal regression syndrome is a disorder that impairs the development …
Qiutong Jin
2016-06-01
Estimating the spatial distribution of precipitation is an important and challenging task in hydrology, climatology, ecology, and environmental science. In order to generate a highly accurate distribution map of average annual precipitation for the Loess Plateau in China, multiple linear regression Kriging (MLRK) and geographically weighted regression Kriging (GWRK) methods were employed using precipitation data from the period 1980-2010 from 435 meteorological stations. The predictors in regression Kriging were selected by stepwise regression analysis from many auxiliary environmental factors, such as elevation (DEM), normalized difference vegetation index (NDVI), solar radiation, slope, and aspect. All predictor distribution maps had a 500 m spatial resolution. Validation precipitation data from 130 hydrometeorological stations were used to assess the prediction accuracies of the MLRK and GWRK approaches. Results showed that both prediction maps, interpolated at a 500 m spatial resolution by MLRK and GWRK, had high accuracy and captured detailed spatial distribution patterns; however, MLRK produced a lower prediction error and a higher variance explanation than GWRK, although the differences were small, in contrast to conclusions from similar studies.
Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis
Kim, Rae Seon
2011-01-01
When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…
On Weighted Support Vector Regression
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...
Sensfuss, F.; Ragwitz, M.
2007-06-18
The authors analyse the impact of EEG power generation on the electricity price. Three effects have to be distinguished here: the market-value effect, the CO2 effect, and the merit-order effect on the power market. This contribution presents a detailed analysis of the merit-order effect. The priority feed-in of EEG power reduces the demand for conventional power; according to the merit order, the most expensive power plants are then no longer needed to cover demand, and the spot-market price falls correspondingly. Because spot-market prices are at the same time the most important indicator for the whole power market, the EEG should not only reduce prices at the spot market but also yield savings for all customers (leverage effect). This effect was quantified on the basis of a detailed model of the power market (PowerACE).
Multiatlas segmentation as nonparametric regression.
Awate, Suyash P; Whitaker, Ross T
2014-09-01
This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator's convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems.
Interpreting Multiple Linear Regression: A Guidebook of Variable Importance
Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim
2012-01-01
Multiple regression (MR) analyses are commonly employed in social science fields. Interpretation of results, however, typically reflects an overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…
Estimating the exceedance probability of rain rate by logistic regression
Chiu, Long S.; Kedem, Benjamin
1990-01-01
Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.
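A logistic regression of threshold exceedance like the one described can be sketched with plain gradient ascent on the log-likelihood. The data, learning rate, and covariate (an area-mean rain rate) below are synthetic stand-ins, not the radiometer data of the study.

```python
import numpy as np

def fit_logistic(x, y, lr=0.1, steps=5000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    X = np.column_stack([np.ones_like(x), x])
    b = np.zeros(2)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        b += lr * X.T @ (y - p) / len(y)             # score-function step
    return b

def exceedance_prob(b, x):
    """Estimated conditional probability that rain rate exceeds the threshold."""
    return 1.0 / (1.0 + np.exp(-(b[0] + b[1] * x)))

# Synthetic pixels: exceedance of a rain-rate threshold rises with the area-mean rate.
rs = np.random.default_rng(1)
area_mean = rs.uniform(0, 10, 2000)                  # illustrative covariate
p_true = 1.0 / (1.0 + np.exp(-(area_mean - 5.0)))
exceeds = (rs.uniform(size=2000) < p_true).astype(float)
b = fit_logistic(area_mean, exceeds)
```

The fitted curve recovers the monotone link between the covariate and the exceedance probability; classifying a pixel as "exceeding" when the estimated probability passes 0.5 is the pixel-classification use mentioned in the abstract.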
Does the absence of cointegration explain the typical findings in long horizon regressions?
Berben, R-P.; Dijk, Dick van
1998-01-01
textabstractOne of the stylized facts in financial and international economics is that of increasing predictability of variables such as exchange rates and stock returns at longer horizons. This fact is based upon applications of long horizon regressions, from which the typical findings are that the point estimates of the regression parameter, the associated t-statistic, and the regression R^2 all tend to increase as the horizon increases. Such long horizon regression analyses implicitly assu...
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups according to a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two previously published examples using CART analysis in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
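The repeated partitioning that CART performs can be illustrated with a minimal regression tree (RT) grown by squared-error (variance-reduction) splits. The data and the outcome below are synthetic and purely illustrative; a real analysis would use a full CART implementation with pruning and cross-validation.

```python
import numpy as np

def grow_rt(X, y, min_leaf=5, depth=0, max_depth=3):
    """Grow a regression tree (RT) by recursive partitioning on squared error."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return float(y.mean())                        # leaf: predict subgroup mean
    best = None
    for j in range(X.shape[1]):                       # try every covariate ...
        for t in np.unique(X[:, j])[1:]:              # ... and every cut point
            left = X[:, j] < t
            if min_leaf <= left.sum() <= len(y) - min_leaf:
                sse = ((y[left] - y[left].mean()) ** 2).sum() \
                    + ((y[~left] - y[~left].mean()) ** 2).sum()
                if best is None or sse < best[0]:
                    best = (sse, j, t, left)
    if best is None:
        return float(y.mean())
    _, j, t, left = best
    return (j, t,
            grow_rt(X[left], y[left], min_leaf, depth + 1, max_depth),
            grow_rt(X[~left], y[~left], min_leaf, depth + 1, max_depth))

def rt_predict(node, x):
    while isinstance(node, tuple):                    # descend to a leaf
        j, t, lo, hi = node
        node = lo if x[j] < t else hi
    return node

# Hypothetical continuous outcome with a step in covariate 0 (invented numbers).
rs = np.random.default_rng(2)
X = rs.uniform(0, 1, size=(200, 2))
y = np.where(X[:, 0] < 0.5, 2.0, 8.0) + rs.normal(0, 0.3, 200)
tree = grow_rt(X, y)
```

The first split lands at the step in covariate 0, mirroring how CART surfaces the dominant partitioning factor before any weaker ones.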
Use of probabilistic weights to enhance linear regression myoelectric control
Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.
2015-12-01
Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05). Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
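The probability-weighting idea — Gaussian class models gating a linear-regression output — can be sketched as follows. The feature values, class means, and regression coefficients are all invented for illustration; the study used multi-channel intramuscular EMG features with per-DOF models.

```python
import numpy as np

def gauss_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

# Per-class Gaussian models of a single EMG feature (all numbers invented):
CLASSES = {"no_move": (0.1, 0.05), "flex": (0.5, 0.1), "extend": (0.9, 0.1)}

def movement_probability(feature):
    """P(any intended movement | feature), assuming equal class priors."""
    like = {c: gauss_pdf(feature, mu, sd) for c, (mu, sd) in CLASSES.items()}
    return 1.0 - like["no_move"] / sum(like.values())

def weighted_output(feature, slope=2.0, intercept=0.0):
    """Scale a (hypothetical) linear-regression velocity command by the
    probability that the user intended to move, suppressing spurious output."""
    return movement_probability(feature) * (slope * feature + intercept)
```

Near the rest state (feature ≈ 0.1) the weight is close to zero, so the regression command is suppressed; near a movement-class mean the raw output passes through almost unchanged — which is how the weighting helps isolate individual DOFs.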
Gouronnec, A.M. [Institut de Radioprotection et de Surete Nucleaire (IRSN), 92 - Clamart (France)
2004-06-15
The olfactometric analyses presented here are applied to industrial odors that can generate harmful effects for people. The aim of olfactometric analysis is to quantify odors, to qualify them, or to assign a pleasant or unpleasant character to them (hedonic tone). This work first presents the different measurements carried out, the different measurement methods used, and the current applications of each method. (O.M.)
Practical Session: Multiple Linear Regression
Clausel, M.; Grégoire, G.
2014-12-01
Three exercises are proposed to illustrate linear regression. The first investigates the influence of several factors on atmospheric pollution; it was proposed by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data from 20 U.S. cities. Exercise 2 is an introduction to model selection, whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 were proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).
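A multiple linear regression of the kind used in the first exercise can be sketched with ordinary least squares. The data below are a synthetic stand-in for the pollution data set, with made-up covariates and coefficients.

```python
import numpy as np

# Synthetic stand-in for the pollution exercise: SO2 concentration explained by
# temperature and number of manufacturing firms (invented coefficients and data).
rs = np.random.default_rng(3)
temp = rs.uniform(40, 80, 20)
firms = rs.uniform(100, 3000, 20)
so2 = 120.0 - 1.0 * temp + 0.01 * firms + rs.normal(0, 2, 20)

X = np.column_stack([np.ones(20), temp, firms])      # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, so2, rcond=None)       # OLS estimates
fitted = X @ beta
r2 = 1.0 - ((so2 - fitted) ** 2).sum() / ((so2 - so2.mean()) ** 2).sum()
```

With only 20 observations, as in the exercise, the coefficient estimates carry visible sampling noise, which is exactly what the model-selection discussion in Exercise 2 builds on.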
Lumbar herniated disc: spontaneous regression
Yüksel, Kasım Zafer
2017-01-01
Background: Low back pain is a frequent condition that results in substantial disability and causes admission of patients to neurosurgery clinics. The aim was to evaluate and present the therapeutic outcomes in lumbar disc hernia (LDH) patients treated by means of a conservative approach consisting of bed rest and medical therapy. Methods: This retrospective cohort was carried out in the neurosurgery departments of hospitals in Kahramanmaraş city; 23 patients diagnosed with LDH at the levels of L3−L4, L4−L5 or L5−S1 were enrolled. Results: The average age was 38.4 ± 8.0 years and the chief complaint was low back pain and sciatica radiating to one or both lower extremities. Conservative treatment was administered. Neurological examination findings, durations of treatment and intervals until symptomatic recovery were recorded. Lasègue tests and neurosensory examination revealed mild neurological deficits in 16 of our patients. Previously, 5 patients had received physiotherapy and 7 patients had been on medical treatment. The numbers of patients with LDH at the levels L3−L4, L4−L5, and L5−S1 were 1, 13, and 9, respectively. All patients reported benefit from medical treatment and bed rest, and radiologic improvement was observed simultaneously on MRI scans. The average interval until symptomatic recovery and/or regression of LDH symptoms was 13.6 ± 5.4 months (range: 5−22). Conclusions: It should be kept in mind that lumbar disc hernias can regress with medical treatment and rest, without surgery, and that these patients can recover radiologically. This must be taken into account during decision making for surgical intervention in LDH patients without indications for emergent surgery. PMID:28119770
Credit Scoring Problem Based on Regression Analysis
Khassawneh, Bashar Suhil Jad Allah
2014-01-01
ABSTRACT: This thesis provides an explanatory introduction to the regression models of data mining and contains basic definitions of key terms in the linear, multiple and logistic regression models. The aim of this study is to illustrate fitting models for the credit scoring problem using simple linear, multiple linear and logistic regression models, and to analyse the fitted model functions with statistical tools. Keywords: Data mining, linear regression, logistic regression....
Hildebrandt, Ian M; Marks, Bradley P; Juneja, Vijay K; Osoria, Marangeli; Hall, Nicole O; Ryser, Elliot T
2016-07-01
Isothermal inactivation studies are commonly used to quantify thermal inactivation kinetics of bacteria. Meta-analyses and comparisons utilizing results from multiple sources have revealed large variations in reported thermal resistance parameters for Salmonella, even when in similar food materials. Different laboratory or regression methodologies likely are the source of methodology-specific artifacts influencing the estimated parameters; however, such effects have not been quantified. The objective of this study was to evaluate the effects of laboratory and regression methodologies on thermal inactivation data generation, interpretation, modeling, and inherent error, based on data generated in two independent laboratories. The overall experimental design consisted of a cross-laboratory comparison using two independent laboratories (Michigan State University and U.S. Department of Agriculture, Agricultural Research Service, Eastern Regional Research Center [ERRC] laboratories), both conducting isothermal Salmonella inactivation studies (55, 60, 62°C) in ground beef, and each using two methodologies reported in prior studies. Two primary models (log-linear and Weibull) with one secondary model (Bigelow) were fitted to the resultant data using three regression methodologies (two two-step regressions and a one-step regression). Results indicated that laboratory methodology impacted the estimated D60°C- and z-values (α = 0.05), with the ERRC methodology yielding parameter estimates ∼25% larger than the Michigan State University methodology, regardless of the laboratory. Regression methodology also impacted the model and parameter error estimates. Two-step regressions yielded root mean square error values on average 40% larger than the one-step regressions. The Akaike Information Criterion indicated the Weibull as the more correct model in most cases; however, caution should be used to confirm model robustness in application to real-world data. Overall, the
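The two-step regression route described (a log-linear primary model, then a Bigelow secondary model) can be sketched as follows. All survivor counts and D-values below are invented for illustration, not the study's ground-beef data.

```python
import numpy as np

# Hypothetical survivor curve at 60 °C (log10 CFU/g; all numbers invented).
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])        # heating time, min
logN = np.array([7.0, 6.1, 4.9, 4.0, 3.1, 1.9])

# Step 1 (primary model, log-linear): log10 N(t) = log10 N0 - t / D.
slope = np.polyfit(t, logN, 1)[0]
D60 = -1.0 / slope                                   # D-value at 60 °C, min

# Step 2 (secondary model, Bigelow): log10 D(T) = log10 D_ref - (T - T_ref) / z.
temps = np.array([55.0, 60.0, 62.0])
logD = np.array([1.0, 0.0, -0.4])                    # invented log10 D-values
z = -1.0 / np.polyfit(temps, logD, 1)[0]             # z-value, °C
```

A one-step regression would instead fit the primary and secondary parameters jointly to all survivor data at once, which is the source of the error differences the abstract reports between the two approaches.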
Fuzzy rule-based support vector regression system
Ling WANG; Zhichun MU; Hui GUO
2005-01-01
In this paper, we design a fuzzy rule-based support vector regression system. The proposed system utilizes the advantages of the fuzzy model and support vector regression to extract support vectors and generate fuzzy if-then rules from the training data set. Based on the first-order linear Takagi-Sugeno (TS) model, the structure of the rules is identified by support vector regression, and the consequent parameters of the rules are then tuned by the global least squares method. Our model is applied to a real-world regression task. The simulation results give promising performance in terms of a set of fuzzy rules, which can be easily interpreted by humans.
Comparing parametric and nonparametric regression methods for panel data
Czekaj, Tomasz Gerard; Henningsen, Arne
We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb-Douglas and Translog, are unsuitable for analysing the optimal firm size. We show that the Translog functional form implies an implausible linear relationship between the (logarithmic) firm size and the elasticity of scale, where the slope is artificially related to the substitutability between the inputs. The practical applicability of the parametric and non-parametric regression methods is scrutinised and compared by an empirical example: we analyse the production technology and investigate the optimal size of Polish crop farms based on a firm-level balanced panel data set. A nonparametric specification test rejects both the Cobb-Douglas and the Translog functional form, while a recently developed nonparametric kernel regression method with a fully nonparametric panel data specification delivers plausible results. On average, the nonparametric regression results are similar to results that are obtained from…
Varying-coefficient functional linear regression
Wu, Yichao; Müller, Hans-Georg; 10.3150/09-BEJ231
2011-01-01
Functional linear regression analysis aims to model regression relations which include a functional predictor. The analog of the regression parameter vector or matrix in conventional multivariate or multiple-response linear regression models is a regression parameter function in one or two arguments. If, in addition, one has scalar predictors, as is often the case in applications to longitudinal studies, the question arises how to incorporate these into a functional regression model. We study a varying-coefficient approach where the scalar covariates are modeled as additional arguments of the regression parameter function. This extension of the functional linear regression model is analogous to the extension of conventional linear regression models to varying-coefficient models and shares its advantages, such as increased flexibility; however, the details of this extension are more challenging in the functional case. Our methodology combines smoothing methods with regularization by truncation at a finite numb...
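In one common notation (illustrative symbols, not necessarily the authors'), the varying-coefficient extension lets a scalar covariate z enter as an extra argument of the regression parameter function:

```latex
% Functional linear model with a varying coefficient in the scalar covariate z:
\[
  \mathbb{E}\bigl[\,Y \mid X, Z = z\,\bigr]
    \;=\; \beta_0(z) \;+\; \int X(s)\,\beta(s, z)\,\mathrm{d}s .
\]
```

Holding β₀ and β constant in z recovers the ordinary functional linear model, just as holding coefficients constant recovers ordinary linear regression from varying-coefficient models.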
Score normalization using logistic regression with expected parameters
Aly, Robin
2014-01-01
State-of-the-art score normalization methods use generative models that rely on sometimes unrealistic assumptions. We propose a novel parameter estimation method for score normalization based on logistic regression. Experiments on the Gov2 and CluewebA collection indicate that our method is consiste
Functional Regression for Quasar Spectra
Ciollaro, Mattia; Freeman, Peter; Genovese, Christopher; Lei, Jing; O'Connell, Ross; Wasserman, Larry
2014-01-01
The Lyman-alpha forest is a portion of the observed light spectrum of distant galactic nuclei which allows us to probe remote regions of the Universe that are otherwise inaccessible. The observed Lyman-alpha forest of a quasar light spectrum can be modeled as a noisy realization of a smooth curve that is affected by a `damping effect' which occurs whenever the light emitted by the quasar travels through regions of the Universe with higher matter concentration. To decode the information conveyed by the Lyman-alpha forest about the matter distribution, we must be able to separate the smooth `continuum' from the noise and the contribution of the damping effect in the quasar light spectra. To predict the continuum in the Lyman-alpha forest, we use a nonparametric functional regression model in which both the response and the predictor variable (the smooth part of the damping-free portion of the spectrum) are function-valued random variables. We demonstrate that the proposed method accurately predicts the unobserv...
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces the indices used for multicollinearity diagnosis, the basic principle of principal component regression, and the method of determining the 'best' equation. An example is used to describe how to perform principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable, and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance of multicollinearity. A simplified, faster and accurate statistical analysis is achieved through principal component regression with SPSS.
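Principal component regression itself is straightforward to sketch outside SPSS: standardize the predictors, project them onto the leading principal components, and regress the response on the scores. The collinear toy data below are illustrative.

```python
import numpy as np

def pcr(X, y, k):
    """Principal component regression: standardize, project onto k PCs, then OLS."""
    Xc = (X - X.mean(0)) / X.std(0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T                                # component scores
    g, *_ = np.linalg.lstsq(np.column_stack([np.ones(len(y)), Z]), y, rcond=None)
    predict = lambda Xn: np.column_stack(
        [np.ones(len(Xn)), ((Xn - X.mean(0)) / X.std(0)) @ Vt[:k].T]) @ g
    return g, predict

# Two nearly collinear predictors: OLS on X is unstable, but one PC suffices.
rs = np.random.default_rng(4)
x1 = rs.normal(size=100)
x2 = x1 + rs.normal(scale=0.01, size=100)
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rs.normal(scale=0.1, size=100)
g, predict = pcr(X, y, k=1)
```

Regressing on the single dominant component sidesteps the near-singular design matrix that direct OLS would face, which is the multicollinearity "disturbance" the abstract refers to.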
Spontaneous Regression of an Incidental Spinal Meningioma
Yilmaz, Ali; Kizilay, Zahir; Sair, Ahmet; Avcil, Mucahit; Ozkul, Ayca
2015-01-01
AIM: The regression of meningioma has been reported in the literature before. Although regression may be induced by haemorrhage, calcification or the withdrawal of some drugs, it is rarely observed spontaneously. CASE REPORT...
Common pitfalls in statistical analysis: Logistic regression.
Ranganathan, Priya; Pramesh, C S; Aggarwal, Rakesh
2017-01-01
Logistic regression analysis is a statistical technique to evaluate the relationship between various predictor variables (either categorical or continuous) and an outcome which is binary (dichotomous). In this article, we discuss logistic regression analysis and the limitations of this technique.
Prediction, Regression and Critical Realism
Næss, Petter
2004-01-01
This paper considers the possibility of prediction in land use planning, and the use of statistical research methods in analyses of relationships between urban form and travel behaviour. Influential writers within the tradition of critical realism reject the possibility of predicting social...... seen as necessary in order to identify aggregate level effects of policy measures, but are questioned by many advocates of critical realist ontology. Using research into the relationship between urban structure and travel as an example, the paper discusses relevant research methods and the kinds...... of prediction necessary and possible in spatial planning of urban development. Finally, the political implications of positions within theory of science rejecting the possibility of predictions about social phenomena are addressed....
Semiparametric regression during 2003–2007
Ruppert, David
2009-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.
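The low-rank penalized splines at the heart of semiparametric regression can be sketched with a truncated-line basis and a ridge penalty on the knot coefficients. This is a minimal version: the mixed-model and Bayesian machinery the review discusses (including automatic choice of the penalty) is omitted, and the penalty value here is fixed by hand.

```python
import numpy as np

def penalized_spline(x, y, n_knots=20, lam=0.1):
    """Low-rank penalized spline: truncated-line basis with a ridge penalty on
    the knot coefficients (intercept and slope left unpenalized)."""
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])
    basis = lambda xx: np.column_stack(
        [np.ones_like(xx), xx, np.maximum(xx[:, None] - knots[None, :], 0.0)])
    C = basis(x)
    P = np.diag([0.0, 0.0] + [1.0] * n_knots)        # penalize only knot terms
    coef = np.linalg.solve(C.T @ C + lam * P, C.T @ y)
    return lambda xx: basis(xx) @ coef

rs = np.random.default_rng(5)
x = np.sort(rs.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + rs.normal(0, 0.1, 200)
fhat = penalized_spline(x, y)
```

In the mixed-model formulation the review describes, the penalized knot coefficients are treated as random effects and lam becomes a variance ratio estimated by (RE)ML rather than a tuning constant.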
Unbalanced Regressions and the Predictive Equation
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness…
Standards for Standardized Logistic Regression Coefficients
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Synthesizing Regression Results: A Factored Likelihood Method
Wu, Meng-Jia; Becker, Betsy Jane
2013-01-01
Regression methods are widely used by researchers in many fields, yet methods for synthesizing regression results are scarce. This study proposes using a factored likelihood method, originally developed to handle missing data, to appropriately synthesize regression models involving different predictors. This method uses the correlations reported…
Regression Analysis by Example. 5th Edition
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Regression with Sparse Approximations of Data
Noorzad, Pardis; Sturm, Bob L.
2012-01-01
We propose sparse approximation weighted regression (SPARROW), a method for local estimation of the regression function that uses sparse approximation with a dictionary of measurements. SPARROW estimates the regression function at a point with a linear combination of a few regressands selected by a sparse approximation of the point in terms of the regressors. We show SPARROW can be considered a variant of k-nearest neighbors regression (k-NNR), and more generally, local polynomial kernel regression. Unlike k-NNR, however, SPARROW can adapt the number of regressors to use based…
Land Walker H
2011-01-01
Background: When investigating covariate interactions and group associations with standard regression analyses, the relationship between the response variable and exposure may be difficult to characterize. When the relationship is nonlinear, linear modeling techniques do not capture the nonlinear information content. Statistical learning (SL) techniques with kernels are capable of addressing nonlinear problems without making parametric assumptions. However, these techniques do not produce findings relevant for epidemiologic interpretations. A simulated case-control study was used to contrast the information-embedding characteristics and separation boundaries produced by a specific SL technique with logistic regression (LR) modeling representing a parametric approach. The SL technique comprised a kernel mapping in combination with a perceptron neural network. Because the LR model has an important epidemiologic interpretation, the SL method was modified to produce the analogous interpretation and generate odds ratios for comparison. Results: The SL approach is capable of generating odds ratios for main effects and risk-factor interactions that better capture nonlinear relationships between exposure variables and outcome in comparison with LR. Conclusions: The integration of SL methods in epidemiology may improve both the understanding and interpretation of complex exposure/disease relationships.
Dessens, Jos A. G.; Jansen, Wim; Ganzeboom, Harry B. G.; Heijden, Peter G. M. van der
2003-01-01
This paper brings together the virtues of linear regression models for status attainment models formulated by second-generation social mobility researchers and the strengths of log-linear models formulated by third-generation researchers, into fourth-generation social mobility models, by using condi
A Bayesian approach to linear regression in astronomy
Sereno, Mauro
2015-01-01
Linear regression is common in astronomical analyses. I discuss a Bayesian hierarchical modeling of data with heteroscedastic and possibly correlated measurement errors and intrinsic scatter. The method fully accounts for time evolution. The slope, the normalization, and the intrinsic scatter of the relation can evolve with the redshift. The intrinsic distribution of the independent variable is approximated using a mixture of Gaussian distributions whose means and standard deviations depend on time. The method can address scatter in the measured independent variable (a kind of Eddington bias), selection effects in the response variable (Malmquist bias), and departure from linearity in form of a knee. I tested the method with toy models and simulations and quantified the effect of biases and inefficient modeling. The R-package LIRA (LInear Regression in Astronomy) is made available to perform the regression.
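A stripped-down conjugate version of Bayesian linear regression conveys the basic computation (posterior mean and covariance under a Gaussian prior with known noise). This is a sketch only: LIRA additionally models intrinsic scatter, measurement errors in both variables, Gaussian-mixture distributions for the independent variable, selection effects, and redshift evolution, none of which appear here.

```python
import numpy as np

def bayes_linreg(x, y, sigma, tau=10.0):
    """Posterior of (a, b) in y = a + b*x + N(0, sigma^2), prior N(0, tau^2 I)."""
    X = np.column_stack([np.ones_like(x), x])
    prec = X.T @ X / sigma**2 + np.eye(2) / tau**2   # posterior precision
    cov = np.linalg.inv(prec)                        # posterior covariance
    mean = cov @ (X.T @ y) / sigma**2                # posterior mean
    return mean, cov

# Synthetic "scaling relation" with known Gaussian noise (invented numbers).
rs = np.random.default_rng(6)
x = rs.uniform(0, 5, 200)
y = 1.0 + 2.0 * x + rs.normal(0, 0.5, 200)
mean, cov = bayes_linreg(x, y, sigma=0.5)
```

The posterior covariance quantifies joint uncertainty in slope and normalization; the hierarchical extensions in LIRA replace this closed form with MCMC sampling.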
A Regression Approach for Forecasting Vendor Revenue in Telecommunication Industries
Aida Mustapha
2014-12-01
In many telecommunication companies, the Entrepreneur Development Unit (EDU) is responsible for managing a large group of vendors that hold contracts with the company. This unit assesses the vendors' performance in terms of revenue and profitability on a yearly basis and uses the information to arrange suitable development training. The main challenge faced by this unit, however, is obtaining the annual revenue data from the vendors due to time constraints. This paper presents a regression approach to predict the vendors' annual revenues based on their previous records so that the assessment exercise can be expedited. Three regression methods were investigated: linear regression, the sequential minimal optimization algorithm, and M5rules. The results were analysed and discussed.
Network class superposition analyses.
Carl A B Pearson
Networks are often used to understand a whole system by modeling the interactions among its pieces. Examples include biomolecules in a cell interacting to provide some primary function, or species in an environment forming a stable community. However, these interactions are often unknown; instead, the pieces' dynamic states are known, and network structure must be inferred. Because observed function may be explained by many different networks (e.g., ≈ 10^30 for the yeast cell cycle process), considering dynamics beyond this primary function means picking a single network or suitable sample: measuring over all networks exhibiting the primary function is computationally infeasible. We circumvent that obstacle by calculating the network class ensemble. We represent the ensemble by a stochastic matrix T, which is a transition-by-transition superposition of the system dynamics for each member of the class. We present concrete results for T derived from boolean time series dynamics on networks obeying the Strong Inhibition rule, by applying T to several traditional questions about network dynamics. We show that the distribution of the number of point attractors can be accurately estimated with T. We show how to generate Derrida plots based on T. We show that T-based Shannon entropy outperforms other methods at selecting experiments to further narrow the network structure. We also outline an experimental test of predictions based on T. We motivate all of these results in terms of a popular molecular biology boolean network model for the yeast cell cycle, but the methods and analyses we introduce are general. We conclude with open questions for T, for example, application to other models, computational considerations when scaling up to larger systems, and other potential analyses.
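The superposition step described in this abstract can be miniaturized: average the deterministic transition matrices of each class member into one row-stochastic matrix T. A minimal sketch, assuming two invented two-state boolean systems stand in for a network class (this is not the paper's yeast model):

```python
# Toy version of network-class superposition: T is the entry-wise
# average of each member's deterministic transition matrix, so each
# row of T remains a probability distribution over next states.

def transition_matrix(next_state, n_states):
    # Encode deterministic dynamics as a 0/1 row-stochastic matrix.
    return [[1.0 if next_state[s] == t else 0.0 for t in range(n_states)]
            for s in range(n_states)]

# Hypothetical class members on states {0, 1}:
# member A maps 0 -> 1, 1 -> 1; member B maps 0 -> 1, 1 -> 0.
A = transition_matrix({0: 1, 1: 1}, 2)
B = transition_matrix({0: 1, 1: 0}, 2)

# Superpose: both members agree on state 0, disagree on state 1.
T = [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
```

Row 0 of T is concentrated where the class members agree, while row 1 splits probability between the two possible successors, which is exactly the ensemble information the abstract exploits.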
Assumptions of Multiple Regression: Correcting Two Misconceptions
Matt N. Williams
2013-09-01
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in PARE. This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression assumptions". While Osborne and Waters' efforts in raising awareness of the need to check assumptions when using regression are laudable, we note that the original article contained at least two fairly important misconceptions about the assumptions of multiple regression: firstly, that multiple regression requires the assumption of normally distributed variables; and secondly, that measurement errors necessarily cause underestimation of simple regression coefficients. In this article, we clarify that multiple regression models estimated using ordinary least squares require the assumption of normally distributed errors in order for trustworthy inferences, at least in small samples, but not the assumption of normally distributed response or predictor variables. Secondly, we point out that regression coefficients in simple regression models will be biased (toward zero) estimates of the relationships between variables of interest when measurement error is uncorrelated across those variables, but that when correlated measurement error is present, regression coefficients may be either upwardly or downwardly biased. We conclude with a brief corrected summary of the assumptions of multiple regression when using ordinary least squares.
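The attenuation claim above (uncorrelated measurement error biases a simple-regression slope toward zero) is easy to check by simulation. A minimal sketch with invented parameters (true slope 2.0, unit-variance predictor, unit-variance error):

```python
import random

# Monte-Carlo check of attenuation: adding error to X that is
# uncorrelated with the true predictor shrinks the fitted slope
# toward zero by roughly var(X) / (var(X) + var(error)).
random.seed(0)
n, true_beta = 5000, 2.0
x_true = [random.gauss(0, 1) for _ in range(n)]
y = [true_beta * x + random.gauss(0, 0.5) for x in x_true]
x_obs = [x + random.gauss(0, 1) for x in x_true]  # unit error variance

def slope(xs, ys):
    # Simple-regression slope: covariance / variance.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((u - mx) * (v - my) for u, v in zip(xs, ys)) / \
           sum((u - mx) ** 2 for u in xs)

b_true = slope(x_true, y)  # close to the true slope, 2.0
b_obs = slope(x_obs, y)    # attenuated toward 2.0 * 1/(1+1) = 1.0
```

With equal true and error variances the reliability ratio is 0.5, so the fitted slope on the noisy predictor lands near half the true slope.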
Functional linear regression via canonical analysis
He, Guozhong; Wang, Jane-Ling; Yang, Wenjing; 10.3150/09-BEJ228
2011-01-01
We study regression models for the situation where both dependent and independent variables are square-integrable stochastic processes. Questions concerning the definition and existence of the corresponding functional linear regression models and some basic properties are explored for this situation. We derive a representation of the regression parameter function in terms of the canonical components of the processes involved. This representation establishes a connection between functional regression and functional canonical analysis and suggests alternative approaches for the implementation of functional linear regression analysis. A specific procedure for the estimation of the regression parameter function using canonical expansions is proposed and compared with an established functional principal component regression approach. As an example of an application, we present an analysis of mortality data for cohorts of medflies, obtained in experimental studies of aging and longevity.
Regression in children with autism spectrum disorders.
Malhi, Prahbhjot; Singhi, Pratibha
2012-10-01
To understand the characteristics of autistic regression and to compare the clinical and developmental profile of children with autism spectrum disorders (ASD) in whom parents report developmental regression with age-matched ASD children in whom no regression is reported. Participants were 35 (mean age = 3.57 y, SD = 1.09) children with ASD in whom parents reported developmental regression before age 3 y and a group of 35 age- and IQ-matched ASD children in whom parents did not report regression. All children were recruited from the outpatient Child Psychology Clinic of the Department of Pediatrics of a tertiary care teaching hospital in North India. Multi-disciplinary evaluations including neurological, diagnostic, cognitive, and behavioral assessments were done. Parents were asked in detail about the age at onset of regression, type of regression, milestones lost, and event, if any, related to the regression. In addition, the Childhood Autism Rating Scale (CARS) was administered to assess symptom severity. The mean age at regression was 22.43 mo (SD = 6.57), and a large majority (66.7%) of the parents reported regression between 12 and 24 mo. Most (75%) of the parents of the regression-autistic group reported regression in the language domain, particularly in the expressive language sector, usually between 18 and 24 mo of age. Regression of language was not an isolated phenomenon; regression in other domains was also reported, including social skills (75%) and cognition (31.25%). In the majority of cases (75%) the regression reported was slow and subtle. There were no significant differences in the motor, social, self-help, and communication functioning between the two groups as measured by the DP II. There were also no significant differences between the two groups on the total CARS score and total number of DSM IV symptoms endorsed. However, the regressed children had significantly (t = 2.36, P = .021) more social deficits as per the DSM IV as
Anke Hüls
2017-05-01
Antimicrobial resistance in livestock is a matter of general concern. To develop hygiene measures and methods for resistance prevention and control, epidemiological studies on a population level are needed to detect factors associated with antimicrobial resistance in livestock holdings. In general, regression models are used to describe these relationships between environmental factors and resistance outcome. Besides the study design, the correlation structures of the different outcomes of antibiotic resistance, and structural zero measurements on the resistance outcome as well as on the exposure side, are challenges for the epidemiological model building process. The use of appropriate regression models that acknowledge these complexities is essential to assure valid epidemiological interpretations. The aims of this paper are (i) to explain the model building process comparing several competing models for count data (negative binomial model, quasi-Poisson model, zero-inflated model, and hurdle model) and (ii) to compare these models using data from a cross-sectional study on antibiotic resistance in animal husbandry. These goals are essential to evaluate which model is most suitable to identify potential prevention measures. The dataset used as an example in our analyses was generated initially to study the prevalence and associated factors for the appearance of cefotaxime-resistant Escherichia coli in 48 German fattening pig farms. For each farm, the outcome was the count of samples with resistant bacteria. There was almost no overdispersion and only moderate evidence of excess zeros in the data. Our analyses show that it is essential to evaluate regression models in studies analyzing the relationship between environmental factors and antibiotic resistances in livestock. After model comparison based on evaluation of model predictions, Akaike information criterion, and Pearson residuals, here the hurdle model was judged to be the most appropriate
Deep Human Parsing with Active Template Regression.
Liang, Xiaodan; Liu, Si; Shen, Xiaohui; Yang, Jianchao; Liu, Luoqi; Dong, Jian; Lin, Liang; Yan, Shuicheng
2015-12-01
In this work, the human parsing task, namely decomposing a human image into semantic fashion/body regions, is formulated as an active template regression (ATR) problem, where the normalized mask of each fashion/body item is expressed as the linear combination of the learned mask templates, and then morphed to a more precise mask with the active shape parameters, including position, scale and visibility of each semantic region. The mask template coefficients and the active shape parameters together can generate the human parsing results, and are thus called the structure outputs for human parsing. A deep Convolutional Neural Network (CNN) is utilized to build the end-to-end relation between the input human image and the structure outputs for human parsing. More specifically, the structure outputs are predicted by two separate networks. The first CNN network, with max-pooling, is designed to predict the template coefficients for each label mask, while the second CNN network, without max-pooling, preserves sensitivity to label mask position and accurately predicts the active shape parameters. For a new image, the structure outputs of the two networks are fused to generate the probability of each label for each pixel, and super-pixel smoothing is finally used to refine the human parsing result. Comprehensive evaluations on a large dataset well demonstrate the significant superiority of the ATR framework over other state-of-the-art methods for human parsing. In particular, the F1-score reaches 64.38 percent by our ATR framework, significantly higher than 44.76 percent based on the state-of-the-art algorithm [28].
Nishanee Rampersad
2017-01-01
Background: Assessment of intraocular pressure (IOP) is an important test in glaucoma. In addition, anterior segment variables may be useful in screening for glaucoma risk. Studies have investigated the associations between IOP and anterior segment variables using traditional statistical methods. The classification and regression tree (CART) method provides another dimension to detect important variables in a relationship automatically. Aim: To identify the critical factors that influence IOP using a regression tree. Methods: A quantitative cross-sectional research design was used. Anterior segment variables were measured in 700 participants using the iVue100 optical coherence tomographer, Oculus Keratograph and Nidek US-500 ultrasonographer. A Goldmann applanation tonometer was used to measure IOP. Data from only the right eyes were analysed because of high levels of interocular symmetry. A regression tree model was generated with the CART method, and Pearson's correlation coefficients were used to assess the relationships between the ocular variables. Results: The mean IOP for the entire sample was 14.63 mmHg ± 2.40 mmHg. The CART method selected three anterior segment variables in the regression tree model. Central corneal thickness was the most important variable, with a cut-off value of 527 µm. The other important variables included average paracentral corneal thickness and axial anterior chamber depth. Corneal thickness measurements increased towards the periphery and were significantly correlated with IOP (r ≥ 0.50, p ≤ 0.001). Conclusion: The CART method identified the anterior segment variables that influenced IOP. Understanding the relationship between IOP and anterior segment variables may help to clinically identify patients with ocular risk factors associated with elevated IOPs.
Gravitational Wave Emulation Using Gaussian Process Regression
Doctor, Zoheyr; Farr, Ben; Holz, Daniel
2017-01-01
Parameter estimation (PE) for gravitational wave signals from compact binary coalescences (CBCs) requires reliable template waveforms which span the parameter space. Waveforms from numerical relativity are accurate but computationally expensive, so approximate templates are typically used for PE. These `approximants', while quick to compute, can introduce systematic errors and bias PE results. We describe a machine learning method for generating CBC waveforms and uncertainties using existing accurate waveforms as a training set. Coefficients of a reduced order waveform model are computed and each treated as arising from a Gaussian process. These coefficients and their uncertainties are then interpolated using Gaussian process regression (GPR). As a proof of concept, we construct a training set of approximant waveforms (rather than NR waveforms) in the two-dimensional space of chirp mass and mass ratio and interpolate new waveforms with GPR. We demonstrate that the mismatch between interpolated waveforms and approximants is below the 1% level for an appropriate choice of training set and GPR kernel hyperparameters.
管军; 杨兴易; 赵良; 林兆奋; 郭昌星; 李文放
2003-01-01
Objective: To investigate the incidence, crude mortality and independent risk factors of ventilator-associated pneumonia (VAP) in a comprehensive ICU in China. Methods: The clinical and microbiological data of all 97 patients receiving mechanical ventilation (>48 h) in our comprehensive ICU from January 1999 to December 2000 were retrospectively collected and analysed. First, several statistically significant risk factors were screened out with univariate analysis; then independent risk factors were determined with multivariate stepwise logistic regression analysis. Results: The incidence of VAP was 54.64% (15.60 cases per 1000 ventilation days), and the crude mortality was 47.42%. The interval between establishment of the artificial airway and diagnosis of VAP was 6.9 ± 4.3 d. Univariate analysis suggested that an indwelling naso-gastric tube, corticosteroids, acid inhibitors, third-generation cephalosporin/imipenem, non-infectious lung disease, and extrapulmonary infection were statistically significant risk factors of
Using Regression Mixture Analysis in Educational Research
Cody S. Ding
2006-11-01
Conventional regression analysis is typically used in educational research. Usually such an analysis implicitly assumes that a common set of regression parameter estimates captures the population characteristics represented in the sample. In some situations, however, this implicit assumption may not be realistic, and the sample may contain several subpopulations, such as high math achievers and low math achievers. In these cases, conventional regression models may provide biased estimates, since the parameter estimates are constrained to be the same across subpopulations. This paper advocates the application of regression mixture models, also known as latent class regression analysis, in educational research. Regression mixture analysis is more flexible than conventional regression analysis in that latent classes in the data can be identified and regression parameter estimates can vary within each latent class. An illustration of regression mixture analysis is provided based on an authentic dataset. The strengths and limitations of regression mixture models are discussed in the context of educational research.
Technological progress and regress in pre-industrial times
Aiyar, Shekhar; Dalgaard, Carl-Johan Lars; Moav, Omer
2008-01-01
This paper offers micro-foundations for the dynamic relationship between technology and population in the pre-industrial world, accounting for both technological progress and the hitherto neglected but common phenomenon of technological regress. A positive feedback between population and the adoption of new techniques that increase the division of labor explains technological progress. A transient shock to productivity or population induces the neglect of some techniques rendered temporarily unprofitable, which are therefore not transmitted to the next generation. Productivity remains constrained by the smaller stock of knowledge and technology has thereby regressed. A slow process of rediscovery is required for the economy to reach its previous level of technological sophistication and population size. The model is employed to analyze specific historical examples of technological regress...
Modelling multimodal photometric redshift regression with noisy observations
Kügler, S D
2016-01-01
In this work, we extend existing photometric redshift regression models from modelling pure photometric data back to the spectra themselves. To that end, we developed a PCA that is capable of describing the input uncertainty (including missing values) in a dimensionality reduction framework. With this "spectrum generator" at hand, we can treat the redshift regression problem in a fully Bayesian framework, returning a posterior distribution over the redshift. This approach therefore handles the multimodal regression problem in an adequate fashion. In addition, input uncertainty on the magnitudes can be included quite naturally, and lastly, the proposed algorithm allows, in principle, predictions outside the training values, which makes it a fascinating opportunity for the detection of high-redshift quasars.
Association between regression and self injury among children with autism.
Lance, Eboni I; York, Janet M; Lee, Li-Ching; Zimmerman, Andrew W
2014-02-01
Self injurious behaviors (SIBs) are challenging clinical problems in individuals with autism spectrum disorders (ASDs). This study is one of the first and largest to utilize inpatient data to examine the associations between autism, developmental regression, and SIBs. Medical records of 125 neurobehavioral hospitalized patients with diagnoses of ASDs and SIBs between 4 and 17 years of age were reviewed. Data were collected from medical records on the type and frequency of SIBs and a history of language, social, or behavioral regression during development. The children with a history of any type of developmental regression (social, behavioral, or language) were more likely to have a diagnosis of autistic disorder than other ASD diagnoses. There were no significant differences in the occurrence of self injurious or other problem behaviors (such as aggression or disruption) between children with and without regression. Regression may influence the diagnostic considerations in ASDs but does not seem to influence the clinical phenotype with regard to behavioral issues. Additional data analyses explored the frequencies and subtypes of SIBs and other medical diagnoses in ASDs, with intellectual disability and disruptive behavior disorder found most commonly. Copyright © 2013 Elsevier Ltd. All rights reserved.
Quantile regression provides a fuller analysis of speed data.
Hewson, Paul
2008-03-01
Considerable interest already exists in assessing percentiles of speed distributions; for example, monitoring the 85th percentile speed is a common feature of the investigation of many road safety interventions. However, unlike the mean, where t-tests and ANOVA can be used to provide evidence of a statistically significant change, inference on these percentiles is much less common. This paper examines the potential role of quantile regression for modelling the 85th percentile, or any other quantile. Given that crash risk may increase disproportionately with increasing relative speed, it may be argued that these quantiles are of more interest than the conditional mean. In common with the more usual linear regression, quantile regression admits a simple test of whether the 85th percentile speed has changed following an intervention, in an analogous way to using the t-test to determine if the mean speed has changed, by considering the significance of parameters fitted to a design matrix. Having briefly outlined the technique and examined an application with a widely published dataset concerning speed measurements taken around the introduction of signs in Cambridgeshire, this paper demonstrates the potential for quantile regression modelling by examining recent data from Northamptonshire collected in conjunction with a "community speed watch" programme. Freely available software is used to fit these models, and it is hoped that the potential benefits of using quantile regression methods when examining and analysing speed data are demonstrated.
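The loss function behind quantile regression is the asymmetric pinball ("check") loss; for an intercept-only model its minimizer is the empirical tau-quantile, which is why the method targets the 85th percentile directly. A minimal sketch with invented speed readings, using a grid search over observed values as a simplification of a real quantile-regression fit:

```python
# Pinball loss for quantile tau: under-predictions cost tau per unit,
# over-predictions cost (1 - tau) per unit.
def pinball_loss(c, ys, tau):
    return sum(tau * (y - c) if y >= c else (1 - tau) * (c - y) for y in ys)

speeds = [28, 30, 31, 33, 34, 35, 36, 38, 41, 45]  # mph, hypothetical
tau = 0.85

# The constant minimizing the tau = 0.85 pinball loss over the
# observed speeds is the empirical 85th-percentile speed.
best = min(speeds, key=lambda c: pinball_loss(c, speeds, tau))
```

Replacing the constant c with a linear predictor (e.g. an intervention indicator) and minimizing the same loss gives the quantile-regression test the abstract describes.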
Evaluation of Linear Regression Simultaneous Myoelectric Control Using Intramuscular EMG.
Smith, Lauren H; Kuiken, Todd A; Hargrove, Levi J
2016-04-01
The objective of this study was to evaluate the ability of linear regression models to decode patterns of muscle coactivation from intramuscular electromyogram (EMG) and provide simultaneous myoelectric control of a virtual 3-DOF wrist/hand system. Performance was compared to the simultaneous control of conventional myoelectric prosthesis methods using intramuscular EMG (parallel dual-site control)-an approach that requires users to independently modulate individual muscles in the residual limb, which can be challenging for amputees. Linear regression control was evaluated in eight able-bodied subjects during a virtual Fitts' law task and was compared to performance of eight subjects using parallel dual-site control. An offline analysis also evaluated how different types of training data affected prediction accuracy of linear regression control. The two control systems demonstrated similar overall performance; however, the linear regression method demonstrated improved performance for targets requiring use of all three DOFs, whereas parallel dual-site control demonstrated improved performance for targets that required use of only one DOF. Subjects using linear regression control could more easily activate multiple DOFs simultaneously, but often experienced unintended movements when trying to isolate individual DOFs. Offline analyses also suggested that the method used to train linear regression systems may influence controllability. Linear regression myoelectric control using intramuscular EMG provided an alternative to parallel dual-site control for 3-DOF simultaneous control at the wrist and hand. The two methods demonstrated different strengths in controllability, highlighting the tradeoff between providing simultaneous control and the ability to isolate individual DOFs when desired.
Regression modeling methods, theory, and computation with SAS
Panik, Michael
2009-01-01
Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs.The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,
Beta blockers & left ventricular hypertrophy regression.
George, Thomas; Ajit, Mullasari S; Abraham, Georgi
2010-01-01
Left ventricular hypertrophy (LVH) particularly in hypertensive patients is a strong predictor of adverse cardiovascular events. Identifying LVH not only helps in the prognostication but also in the choice of therapeutic drugs. The prevalence of LVH is age linked and has a direct correlation to the severity of hypertension. Adequate control of blood pressure, most importantly central aortic pressure and blocking the effects of cardiomyocyte stimulatory growth factors like Angiotensin II helps in regression of LVH. Among the various antihypertensives ACE-inhibitors and angiotensin receptor blockers are more potent than other drugs in regressing LVH. Beta blockers especially the newer cardio selective ones do still have a role in regressing LVH albeit a minor one. A meta-analysis of various studies on LVH regression shows many lacunae. There have been no consistent criteria for defining LVH and documenting LVH regression. This article reviews current evidence on the role of Beta Blockers in LVH regression.
Applied regression analysis a research tool
Pantula, Sastry; Dickey, David
1998-01-01
Least squares estimation, when used appropriately, is a powerful research tool. A deeper understanding of the regression concepts is essential for achieving optimal benefits from a least squares analysis. This book builds on the fundamentals of statistical methods and provides appropriate concepts that will allow a scientist to use least squares as an effective research tool. Applied Regression Analysis is aimed at the scientist who wishes to gain a working knowledge of regression analysis. The basic purpose of this book is to develop an understanding of least squares and related statistical methods without becoming excessively mathematical. It is the outgrowth of more than 30 years of consulting experience with scientists and many years of teaching an applied regression course to graduate students. Applied Regression Analysis serves as an excellent text for a service course on regression for non-statisticians and as a reference for researchers. It also provides a bridge between a two-semester introduction to...
Regression calibration with heteroscedastic error variance.
Spiegelman, Donna; Logan, Roger; Grove, Douglas
2011-01-01
The problem of covariate measurement error with heteroscedastic measurement error variance is considered. Standard regression calibration assumes that the measurement error has a homoscedastic measurement error variance. An estimator is proposed to correct regression coefficients for covariate measurement error with heteroscedastic variance. Point and interval estimates are derived. Validation data containing the gold standard must be available. This estimator is a closed-form correction of the uncorrected primary regression coefficients, which may be of logistic or Cox proportional hazards model form, and is closely related to the version of regression calibration developed by Rosner et al. (1990). The primary regression model can include multiple covariates measured without error. The use of these estimators is illustrated in two data sets, one taken from occupational epidemiology (the ACE study) and one taken from nutritional epidemiology (the Nurses' Health Study). In both cases, although there was evidence of moderate heteroscedasticity, there was little difference in estimation or inference using this new procedure compared to standard regression calibration. It is shown theoretically that unless the relative risk is large or measurement error severe, standard regression calibration approximations will typically be adequate, even with moderate heteroscedasticity in the measurement error model variance. In a detailed simulation study, standard regression calibration performed either as well as or better than the new estimator. When the disease is rare and the errors normally distributed, or when measurement error is moderate, standard regression calibration remains the method of choice.
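The standard (homoscedastic) regression calibration that this abstract generalizes has a one-line form in the single-covariate case: the naive slope is attenuated by the reliability ratio, so dividing by that ratio corrects it. A minimal sketch; the variances and slope below are illustrative numbers, not values from the ACE or Nurses' Health Study data:

```python
# Simplest regression calibration: one covariate, homoscedastic
# measurement error, reliability estimated from validation data.

var_true_x = 4.0   # variance of the gold-standard covariate (validation data)
var_error = 1.0    # measurement-error variance (validation data)
beta_naive = 1.2   # slope estimated by regressing Y on the noisy X

# lambda = var(true X) / var(observed X); the naive slope is
# attenuated by lambda, so dividing recovers the corrected slope.
reliability = var_true_x / (var_true_x + var_error)  # lambda = 0.8
beta_corrected = beta_naive / reliability            # 1.2 / 0.8 = 1.5
```

The heteroscedastic estimator in the abstract replaces the single error variance with subject-specific variances, but the correction has the same divide-by-reliability structure.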
Enhanced piecewise regression based on deterministic annealing
ZHANG JiangShe; YANG YuQian; CHEN XiaoWen; ZHOU ChengHu
2008-01-01
Regression is one of the important problems in statistical learning theory. This paper proves the global convergence of the piecewise regression algorithm based on deterministic annealing and the continuity of the global minimum of free energy w.r.t. temperature, and derives a new simplified formula to compute the initial critical temperature. A new enhanced piecewise regression algorithm using "migration of prototypes" is proposed to eliminate "empty cells" in the annealing process. Numerical experiments on several benchmark datasets show that the new algorithm can remove redundancy and improve generalization of the piecewise regression model.
Geodesic least squares regression on information manifolds
Verdoolaege, Geert, E-mail: geert.verdoolaege@ugent.be [Department of Applied Physics, Ghent University, Ghent, Belgium and Laboratory for Plasma Physics, Royal Military Academy, Brussels (Belgium)
2014-12-05
We present a novel regression method targeted at situations with significant uncertainty on both the dependent and independent variables or with non-Gaussian distribution models. Unlike the classic regression model, the conditional distribution of the response variable suggested by the data need not be the same as the modeled distribution. Instead they are matched by minimizing the Rao geodesic distance between them. This yields a more flexible regression method that is less constrained by the assumptions imposed through the regression model. As an example, we demonstrate the improved resistance of our method against some flawed model assumptions and we apply this to scaling laws in magnetic confinement fusion.
[From clinical judgment to linear regression model].
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated another way, regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R(2)) indicates the importance of the independent variables in the outcome.
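The quantities named in this abstract (intercept a, slope b, and R(2)) can all be computed directly from their definitions. A minimal sketch on made-up data that roughly follows y = 1 + 2x:

```python
# Ordinary least-squares fit of Y = a + b*x and the coefficient
# of determination R^2, from first principles.

def linear_fit(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope b = covariance(x, y) / variance(x); intercept a = my - b*mx.
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    # R^2 = 1 - SS_residual / SS_total.
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, b, 1 - ss_res / ss_tot

xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]   # hypothetical, roughly y = 1 + 2x
a, b, r2 = linear_fit(xs, ys)
```

Here b estimates the change in Y per unit change in x, a is the predicted Y at x = 0, and r2 near 1 indicates the line explains almost all of the variation in Y.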
Logistic Regression for Evolving Data Streams Classification
YIN Zhi-wu; HUANG Shang-teng; XUE Gui-rong
2007-01-01
Logistic regression is a fast classifier and can achieve higher accuracy on small training data. Moreover, it can work on both discrete and continuous attributes with nonlinear patterns. Based on these properties of logistic regression, this paper proposes an algorithm, called the evolutionary logistic regression classifier (ELRClass), to solve the classification of evolving data streams. This algorithm applies logistic regression repeatedly to a sliding window of samples in order to update the existing classifier, to keep this classifier if its performance is deteriorated by bursting noise, or to construct a new classifier if a major concept drift is detected. Intensive experimental results demonstrate the effectiveness of this algorithm.
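The core of the windowed approach (refitting a logistic model on the most recent samples of the stream) can be sketched in a few lines. The one-dimensional window below and the plain batch gradient-descent solver are illustrative stand-ins, not the paper's actual ELRClass algorithm:

```python
import math

def fit_logistic(window, steps=2000, lr=0.5):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent."""
    w = b = 0.0
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in window:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x   # gradient of log-loss w.r.t. w
            gb += (p - y)       # gradient of log-loss w.r.t. b
        w -= lr * gw / len(window)
        b -= lr * gb / len(window)
    return w, b

# Most recent window of (feature, label) samples from a hypothetical
# stream; a stream classifier would refit as this window slides.
window = [(-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1)]
w, b = fit_logistic(window)

def predict(x):
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

p_hi = predict(2.0)    # near 1 for a clearly positive sample
p_lo = predict(-2.0)   # near 0 for a clearly negative sample
```

Under concept drift, refitting on the newest window lets the decision boundary follow the stream, which is the behaviour the abstract's update/keep/rebuild policy manages.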
New ridge parameters for ridge regression
A.V. Dorugade
2014-04-01
Hoerl and Kennard (1970a) introduced the ridge regression estimator as an alternative to the ordinary least squares (OLS) estimator in the presence of multicollinearity. In ridge regression, the ridge parameter plays an important role in parameter estimation. In this article, a new method for estimating ridge parameters in both ordinary ridge regression (ORR) and generalized ridge regression (GRR) is proposed. A simulation study evaluates the performance of the proposed estimator based on the mean squared error (MSE) criterion and indicates that under certain conditions the proposed estimators perform well compared to OLS and other well-known estimators reviewed in this article.
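The ORR estimator itself has a simple closed form, beta = (X'X + kI)^(-1) X'y, with k = 0 recovering OLS. A minimal sketch; the data, seed, and ridge value k are invented, and this is not the article's proposed parameter-selection method:

```python
import numpy as np

# Ridge estimator beta = (X'X + k I)^(-1) X'y; k = 0 gives OLS.
def ridge(X, y, k):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 2] = X[:, 0] + 0.01 * rng.normal(size=50)    # near-collinear column
y = X @ np.array([1.0, 2.0, 0.0]) + 0.1 * rng.normal(size=50)

beta_ols = ridge(X, y, 0.0)
beta_ridge = ridge(X, y, 1.0)   # ridge shrinks the coefficients toward zero
```

The shrinkage is visible in the coefficient norms: the ridge solution's norm decreases monotonically as k grows.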
Bulcock, J. W.
The problem of model estimation when the data are collinear was examined. Though ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem-free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…
Tong, Fuhui
2006-01-01
Background: An extensive body of research has favored the use of regression over other parametric analyses that are based on OVA. In the case of noteworthy regression results, researchers tend to explore the magnitude of beta weights for the respective predictors. Purpose: The purpose of this paper is to examine both beta weights and structure…
Associative Regressive Decision Rule Mining for Predicting Customer Satisfactory Patterns
P. Suresh
2016-04-01
Opinion mining, also known as sentiment analysis, involves customer satisfaction patterns, sentiments, and attitudes toward entities, products, services, and their attributes. With the rapid development of the Internet, potential customers provide a substantial volume of product/service reviews. The high volume of customer reviews was developed for products/reviews through taxonomy-aware processing, but it was difficult to identify the best reviews. In this paper, an Associative Regressive Decision Rule Mining (ARDRM) technique is developed to predict patterns for service providers and to improve customer satisfaction based on review comments. Associative Regressive Decision Rule Mining performs two steps to improve the customer satisfaction level. Initially, a Machine Learning Bayes Sentiment Classifier (MLBSC) is used to classify the class labels for each service review. After that, the regressive factor of the opinion words and the class labels are checked for association between the words using various probabilistic rules. Based on the probabilistic rules, the effect of opinions and sentiments on customer reviews is analyzed to arrive at the specific set of services preferred by the customers, together with their review comments. The Associative Regressive Decision Rule helps the service provider make decisions for improving the customer satisfaction level. The experimental results reveal that the Associative Regressive Decision Rule Mining (ARDRM) technique improved performance in terms of true positive rate, associative regression factor, regressive decision rule generation time, and review detection accuracy of similar patterns.
The benefits of using quantile regression for analysing the effect of weeds on organic winter wheat
Casagrande, M.; Makowski, D.; Jeuffroy, M.H.; Valantin-Morison, M.; David, C.
2010-01-01
In organic farming, weeds are one of the threats that limit crop yield. An early prediction of the effect of weeds on yield loss and of the size of late weed populations could help farmers and advisors improve weed management. Numerous studies predicting the effect of weeds on yield have already been…
Gilstrap, Donald L.
2013-01-01
In addition to qualitative methods presented in chaos and complexity theories in educational research, this article addresses quantitative methods that may show potential for future research studies. Although much in the social and behavioral sciences literature has focused on computer simulations, this article explores current chaos and…
Optimization of Regression Models of Experimental Data Using Confirmation Points
Ulbrich, N.
2010-01-01
A new search metric is discussed that may be used to better assess the predictive capability of different math term combinations during the optimization of a regression model of experimental data. The new search metric can be determined for each tested math term combination if the given experimental data set is split into two subsets. The first subset consists of data points that are only used to determine the coefficients of the regression model. The second subset consists of confirmation points that are exclusively used to test the regression model. The new search metric value is assigned after comparing two values that describe the quality of the fit of each subset. The first value is the standard deviation of the PRESS residuals of the data points. The second value is the standard deviation of the response residuals of the confirmation points. The greater of the two values is used as the new search metric value. This choice guarantees that both standard deviations are always less than or equal to the value that is used during the optimization. Experimental data from the calibration of a wind tunnel strain-gage balance are used to illustrate the application of the new search metric. The new search metric ultimately yields an optimized regression model that has already been tested at model-independent confirmation points before it is ever used to predict an unknown response from a set of regressors.
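The metric described above can be sketched directly. This is a hypothetical illustration with invented data, using the standard leave-one-out (PRESS) residual formula e_i / (1 - h_ii) from the hat matrix:

```python
import numpy as np

# Sketch of the confirmation-point search metric: fit a candidate model on
# one subset, then take the larger of (i) the std of the PRESS residuals of
# the fitting points and (ii) the std of the residuals at the confirmation
# points. Data and model terms are invented for illustration.
def search_metric(X_fit, y_fit, X_conf, y_conf):
    beta, *_ = np.linalg.lstsq(X_fit, y_fit, rcond=None)
    H = X_fit @ np.linalg.pinv(X_fit.T @ X_fit) @ X_fit.T   # hat matrix
    e = y_fit - X_fit @ beta
    press = e / (1.0 - np.diag(H))          # leave-one-out (PRESS) residuals
    e_conf = y_conf - X_conf @ beta         # confirmation-point residuals
    return max(np.std(press), np.std(e_conf))

# Exactly linear data: both residual stds, and hence the metric, are ~0.
X_fit = np.column_stack([np.ones(10), np.arange(10.0)])
y_fit = 2.0 + 3.0 * np.arange(10.0)
X_conf = np.column_stack([np.ones(4), np.arange(10.0, 14.0)])
y_conf = 2.0 + 3.0 * np.arange(10.0, 14.0)
metric = search_metric(X_fit, y_fit, X_conf, y_conf)
```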
Fault Isolation for Nonlinear Systems Using Flexible Support Vector Regression
Yufang Liu
2014-01-01
While support vector regression is widely used both as a function approximation tool and as a residual generator for nonlinear system fault isolation, a drawback of the method is the freedom in selecting model parameters. Moreover, for samples with differing distributional complexities, the selection of reasonable parameters may even be impossible. To alleviate this problem we introduce the method of flexible support vector regression (F-SVR), which is especially suited to modelling complicated sample distributions, as it is free from parameter selection. Reasonable parameters for F-SVR are generated automatically given a sample distribution. Lastly, we apply this method to the fault isolation of high-frequency power supplies, where satisfactory results have been obtained.
Incremental Net Effects in Multiple Regression
Lipovetsky, Stan; Conklin, Michael
2005-01-01
A regular problem in regression analysis is estimating the comparative importance of the predictors in the model. This work considers the 'net effects', or shares of the predictors in the coefficient of the multiple determination, which is a widely used characteristic of the quality of a regression model. Estimation of the net effects can be a…
Regression Analysis and the Sociological Imagination
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Dealing with Outliers: Robust, Resistant Regression
Glasser, Leslie
2007-01-01
Least-squares linear regression is the best of statistics and it is the worst of statistics. The reasons for this paradoxical claim, arising from possible inapplicability of the method and the excessive influence of "outliers", are discussed and substitute regression methods based on median selection, which is both robust and resistant, are…
Competing Risks Quantile Regression at Work
Dlugosz, Stephan; Lo, Simon M. S.; Wilke, Ralf
2017-01-01
Despite its emergence as a frequently used method for the empirical analysis of multivariate data, quantile regression is yet to become a mainstream tool for the analysis of duration data. We present a pioneering empirical study on the grounds of a competing risks quantile regression model. We use...
Implementing Variable Selection Techniques in Regression.
Thayer, Jerome D.
Variable selection techniques in stepwise regression analysis are discussed. In stepwise regression, variables are added or deleted from a model in sequence to produce a final "good" or "best" predictive model. Stepwise computer programs are discussed and four different variable selection strategies are described. These…
Regression Model With Elliptically Contoured Errors
Arashi, M; Tabatabaey, S M M
2012-01-01
For the regression model where the errors follow the elliptically contoured distribution (ECD), we consider the least squares (LS), restricted LS (RLS), preliminary test (PT), Stein-type shrinkage (S) and positive-rule shrinkage (PRS) estimators for the regression parameters. We compare the quadratic risks of the estimators to determine the relative dominance properties of the five estimators.
Atherosclerotic plaque regression: fact or fiction?
Shanmugam, Nesan; Román-Rego, Ana; Ong, Peter; Kaski, Juan Carlos
2010-08-01
Coronary artery disease is the major cause of death in the western world. The formation and rapid progression of atheromatous plaques can lead to serious cardiovascular events in patients with atherosclerosis. The better understanding, in recent years, of the mechanisms leading to atheromatous plaque growth and disruption and the availability of powerful HMG CoA-reductase inhibitors (statins) has permitted the consideration of plaque regression as a realistic therapeutic goal. This article reviews the existing evidence underpinning current therapeutic strategies aimed at achieving atherosclerotic plaque regression. In this review we also discuss imaging modalities for the assessment of plaque regression, predictors of regression and whether plaque regression is associated with a survival benefit.
Pathological assessment of liver fibrosis regression
WANG Bingqiong
2017-03-01
Hepatic fibrosis is the common pathological outcome of chronic liver diseases. An accurate assessment of the degree of fibrosis provides an important reference for a definite diagnosis, treatment decision-making, monitoring of treatment outcome, and prognostic evaluation. At present, many clinical studies have proven that regression of hepatic fibrosis and early-stage liver cirrhosis can be achieved with effective treatment, and the correct evaluation of fibrosis regression has become a hot topic in clinical research. Liver biopsy has long been regarded as the gold standard for the assessment of hepatic fibrosis, and thus it plays an important role in the evaluation of fibrosis regression. This article reviews the clinical application of current pathological staging systems in the evaluation of fibrosis regression from the perspectives of semi-quantitative scoring systems, quantitative approaches, and qualitative approaches, in order to propose a better pathological evaluation system for the assessment of fibrosis regression.
Epidemiology of CKD Regression in Patients under Nephrology Care.
Borrelli, Silvio; Leonardis, Daniela; Minutolo, Roberto; Chiodini, Paolo; De Nicola, Luca; Esposito, Ciro; Mallamaci, Francesca; Zoccali, Carmine; Conte, Giuseppe
2015-01-01
Chronic Kidney Disease (CKD) regression is considered an infrequent renal outcome, limited to early stages and associated with higher mortality. However, the prevalence, prognosis and clinical correlates of CKD regression remain undefined in the setting of nephrology care. This is a multicenter prospective study in 1418 patients with established CKD (eGFR: 60-15 ml/min/1.73 m²) under nephrology care in 47 outpatient clinics in Italy for at least one year. We defined CKD regressors by a ΔGFR ≥0 ml/min/1.73 m² per year. ΔGFR was estimated as the absolute difference between eGFR measured at baseline and at the follow-up visit after 18-24 months. Outcomes were End Stage Renal Disease (ESRD) and all-cause mortality. 391 patients (27.6%) were identified as regressors, as they showed an eGFR increase between the baseline visit in the renal clinic and the follow-up visit. In multivariate regression analyses, regressor status was not associated with CKD stage. Low proteinuria was the main factor associated with CKD regression, accounting per se for 48% of the likelihood of this outcome. Lower systolic blood pressure, higher BMI and absence of autosomal polycystic kidney disease (PKD) were additional predictors of CKD regression. In regressors, ESRD risk was 72% lower (HR: 0.28; 95% CI 0.14-0.57), independently of CKD stage. CKD regression occurs in about one-fourth of patients receiving renal care in nephrology units and correlates with low proteinuria, blood pressure and the absence of PKD. This condition portends a better renal prognosis, mostly in earlier CKD stages, with no excess risk of mortality.
Fitting Additive Binomial Regression Models with the R Package blm
Stephanie Kovalchik
2013-09-01
The R package blm provides functions for fitting a family of additive regression models to binary data. The included models are the binomial linear model, in which all covariates have additive effects, and the linear-expit (lexpit) model, which allows some covariates to have additive effects and other covariates to have logistic effects. Additive binomial regression is a model of event probability, and the coefficients of linear terms estimate covariate-adjusted risk differences. Thus, in contrast to logistic regression, additive binomial regression puts the focus on absolute risk and risk differences. In this paper, we give an overview of the methodology we have developed to fit the binomial linear and lexpit models to binary outcomes from cohort and population-based case-control studies. We illustrate the blm package's methods for additive model estimation, diagnostics, and inference with risk association analyses of a bladder cancer nested case-control study in the NIH-AARP Diet and Health Study.
Quantile regression applied to spectral distance decay
Rocchini, D.; Cade, B.S.
2008-01-01
Remotely sensed imagery has long been recognized as a powerful support for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance allows us to quantitatively estimate the amount of turnover in species composition with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological data sets are characterized by a high number of zeroes that add noise to the regression model. Quantile regressions can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this letter, we used ordinary least squares (OLS) and quantile regressions to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero, with higher species similarity where habitats are more similar. We demonstrated the power of using quantile regressions applied to spectral distance decay to reveal species diversity patterns otherwise lost or underestimated by OLS regression. © 2008 IEEE.
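Quantile regression rests on the asymmetric "pinball" loss, whose minimizer for a constant-only model is the sample quantile (the median at tau = 0.5), which is what makes it robust to the zero-inflated noise described above. A toy illustration with invented data, not the letter's remote-sensing analysis:

```python
# Pinball (quantile) loss: penalizes under- and over-prediction
# asymmetrically according to the target quantile tau.
def pinball(residual, tau):
    return tau * residual if residual >= 0 else (tau - 1) * residual

def best_constant(ys, tau, candidates):
    # Brute-force search for the constant minimizing total pinball loss.
    return min(candidates, key=lambda c: sum(pinball(y - c, tau) for y in ys))

ys = [1.0, 2.0, 3.0, 4.0, 100.0]      # heavy right tail (one outlier)
med = best_constant(ys, 0.5, ys)      # tau = 0.5 -> the median, robust to the outlier
```

Unlike the mean (22.0 here), the tau = 0.5 minimizer ignores the outlier's magnitude.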
Hypotheses testing for fuzzy robust regression parameters
Kula, Kamile Sanli [Ahi Evran University, Department of Mathematics, 40200 Kirsehir (Turkey)], E-mail: sanli2004@hotmail.com; Apaydin, Aysen [Ankara University, Department of Statistics, 06100 Ankara (Turkey)], E-mail: apaydin@science.ankara.edu.tr
2009-11-30
The classical least squares (LS) method is widely used in regression analysis because computing its estimate is easy and traditional. However, LS estimators are very sensitive to outliers and to other deviations from the basic assumptions of normal theory [Huynh H. A comparison of four approaches to robust regression. Psychol Bull 1982;92:505-12; Stephenson D. 2000. Available from: (http://folk.uib.no/ngbnk/kurs/notes/node38.html); Xu R, Li C. Multidimensional least-squares fitting with a fuzzy model. Fuzzy Sets and Systems 2001;119:215-23]. If outliers exist in the data set, robust methods are preferred for estimating parameter values. We proposed a fuzzy robust regression method using fuzzy numbers, for the case where x is crisp and Y is a triangular fuzzy number; when outliers are present in the data set, a weight matrix is defined by the membership function of the residuals. In fuzzy robust regression, fuzzy sets and fuzzy regression analysis are used in ranking the residuals and in estimating the regression parameters, respectively [Sanli K, Apaydin A. Fuzzy robust regression analysis based on the ranking of fuzzy sets. Internat J Uncertainty Fuzziness and Knowledge-Based Syst 2008;16:663-81]. In this study, standard deviation estimates are obtained for the parameters by means of the defined weight matrix. Moreover, we propose another point of view on hypothesis testing for the parameters.
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Relative risk regression analysis of epidemiologic data.
Prentice, R L
1985-11-01
Relative risk regression methods are described. These methods provide a unified approach to a range of data analysis problems in environmental risk assessment and in the study of disease risk factors more generally. Relative risk regression methods are most readily viewed as an outgrowth of Cox's regression and life model. They can also be viewed as a regression generalization of more classical epidemiologic procedures, such as that due to Mantel and Haenszel. In the context of an epidemiologic cohort study, relative risk regression methods extend conventional survival data methods and binary response (e.g., logistic) regression models by taking explicit account of the time to disease occurrence while allowing arbitrary baseline disease rates, general censorship, and time-varying risk factors. This latter feature is particularly relevant to many environmental risk assessment problems wherein one wishes to relate disease rates at a particular point in time to aspects of a preceding risk factor history. Relative risk regression methods also adapt readily to time-matched case-control studies and to certain less standard designs. The uses of relative risk regression methods are illustrated and the state of development of these procedures is discussed. It is argued that asymptotic partial likelihood estimation techniques are now well developed in the important special case in which the disease rates of interest have interpretations as counting process intensity functions. Estimation of relative risk processes corresponding to disease rates falling outside this class has, however, received limited attention. The general area of relative risk regression model criticism has, as yet, not been thoroughly studied, though a number of statistical groups are studying such features as tests of fit, residuals, diagnostics and graphical procedures. Most such studies have been restricted to exponential-form relative risks, as have simulation studies of relative risk estimation.
Saro, Lee; Woo, Jeon Seong; Kwan-Young, Oh; Moung-Jin, Lee
2016-02-01
The aim of this study is to predict landslide susceptibility using spatial analysis and a GIS-based statistical methodology. Logistic regression models along with an artificial neural network were applied and validated to analyze landslide susceptibility in Inje, Korea. Landslide occurrence areas in the study were identified based on interpretations of optical remote sensing data (aerial photographs) followed by field surveys. A spatial database considering forest, geophysical, soil and topographic data was built for the study area using a Geographical Information System (GIS). These factors were analysed using artificial neural network (ANN) and logistic regression models to generate a landslide susceptibility map. The study validates the landslide susceptibility map by comparing it with landslide occurrence areas. The locations of landslide occurrence were divided randomly into a training set (50%) and a test set (50%). The training set was used to analyse the landslide susceptibility map using the artificial neural network and logistic regression models, and the test set was retained to validate the prediction map. The validation results revealed that the artificial neural network model (with an accuracy of 80.10%) was better at predicting landslides than the logistic regression model (with an accuracy of 77.05%). Of the weights used in the artificial neural network model, 'slope' yielded the highest weight value (1.330), and 'aspect' yielded the lowest value (1.000). This research applied two statistical analysis methods in a GIS and compared their results. Based on the findings, we were able to derive a more effective method for analyzing landslide susceptibility.
Wolff, Marc
2011-10-14
This work is devoted to the construction of numerical methods that allow the accurate simulation of inertial confinement fusion (ICF) implosion processes by taking self-generated magnetic field terms into account. In the sequel, we first derive a two-temperature resistive magnetohydrodynamics model and describe the considered closure relations. The resulting system of equations is then split in several subsystems according to the nature of the underlying mathematical operator. Adequate numerical methods are then proposed for each of these subsystems. Particular attention is paid to the development of finite volume schemes for the hyperbolic operator which actually is the hydrodynamics or ideal magnetohydrodynamics system depending on whether magnetic fields are considered or not. More precisely, a new class of high-order accurate dimensionally split schemes for structured meshes is proposed using the Lagrange re-map formalism. One of these schemes' most innovative features is that they have been designed in order to take advantage of modern massively parallel computer architectures. This property can for example be illustrated by the dimensionally split approach or the use of artificial viscosity techniques and is practically highlighted by sequential performance and parallel efficiency figures. Hyperbolic schemes are then combined with finite volume methods for dealing with the thermal and resistive conduction operators and taking magnetic field generation into account. In order to study the characteristics and effects of self-generated magnetic field terms, simulation results are finally proposed with the complete two-temperature resistive magnetohydrodynamics model on a test problem that represents the state of an ICF capsule at the beginning of the deceleration phase. (author)
Variable and subset selection in PLS regression
Høskuldsson, Agnar
2001-01-01
The purpose of this paper is to present some useful methods for introductory analysis of variables and subsets in relation to PLS regression. We present methods that are efficient in finding the appropriate variables or subset to use in the PLS regression. The general conclusion is that variable selection is important for successful analysis of chemometric data. An important aspect of the results presented is that lack of variable selection can spoil the PLS regression, and that cross-validation measures using a test set can show larger variation, when we use different subsets of X, than…
Applied Regression Modeling A Business Approach
Pardoe, Iain
2012-01-01
An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculus. Regression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a…
Regressive language in severe head injury.
Thomsen, I V; Skinhoj, E
1976-09-01
In a follow-up study of 50 patients with severe head injuries, three patients had echolalia. One patient with initially global aphasia had echolalia for some weeks when he started talking. Another patient, with severe diffuse brain damage, dementia, and emotional regression, had echolalia; the dysfunction was considered a detour performance. In the third patient, echolalia and palilalia were details in a total pattern of regression lasting for months. This patient, who had extensive frontal atrophy secondary to a very severe head trauma, presented an extreme state of regression, returning to a foetal body pattern and behaving like a baby.
Regression of altitude-produced cardiac hypertrophy.
Sizemore, D. A.; Mcintyre, T. W.; Van Liere, E. J.; Wilson , M. F.
1973-01-01
The rate of regression of cardiac hypertrophy with time has been determined in adult male albino rats. The hypertrophy was induced by intermittent exposure to simulated high altitude. The percentage hypertrophy was much greater (46%) in the right ventricle than in the left (16%). The regression could be adequately fitted to a single exponential function with a half-time of 6.73 plus or minus 0.71 days (90% CI). There was no significant difference in the rates of regression for the two ventricles.
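The single-exponential fit described in the abstract can be sketched as a log-linear regression: fit log(y) against t by least squares and recover the half-time as ln(2)/k. The data points below are synthetic, chosen only to mimic a half-time near 6.73 days:

```python
import math

# Fit y = A * exp(-k t) via OLS on log(y) vs t, then recover t_1/2 = ln(2)/k.
# Data are synthetic, not from the study.
def half_time(ts, ys):
    logs = [math.log(y) for y in ys]
    n = len(ts)
    mt, ml = sum(ts) / n, sum(logs) / n
    k = -sum((t - mt) * (l - ml) for t, l in zip(ts, logs)) / \
        sum((t - mt) ** 2 for t in ts)       # decay rate (negated slope)
    return math.log(2) / k

ts = [0, 2, 4, 6, 8]                          # days
ys = [46 * math.exp(-0.103 * t) for t in ts]  # ~46% hypertrophy decaying
t_half = half_time(ts, ys)                    # close to ln(2)/0.103 ≈ 6.73 days
```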
A linear regression solution to the spatial autocorrelation problem
Griffith, Daniel A.
The Moran Coefficient spatial autocorrelation index can be decomposed into orthogonal map pattern components. This decomposition relates it directly to standard linear regression, in which corresponding eigenvectors can be used as predictors. This paper reports comparative results between these linear regressions and their auto-Gaussian counterparts for the following georeferenced data sets: Columbus (Ohio) crime, Ottawa-Hull median family income, Toronto population density, southwest Ohio unemployment, Syracuse pediatric lead poisoning, and Glasgow standard mortality rates, and a small remotely sensed image of the High Peak district. This methodology is extended to auto-logistic and auto-Poisson situations, with selected data analyses including the percentage of urban population across Puerto Rico and the frequency of SIDS cases across North Carolina. These data analytic results suggest that this approach to georeferenced data analysis offers considerable promise.
An introduction to using Bayesian linear regression with clinical data.
Baldwin, Scott A; Larson, Michael J
2017-11-01
Statistical training in psychology focuses on frequentist methods. Bayesian methods are an alternative to standard frequentist methods. This article provides researchers with an introduction to fundamental ideas in Bayesian modeling. We use data from an electroencephalogram (EEG) and anxiety study to illustrate Bayesian models. Specifically, the models examine the relationship between error-related negativity (ERN), a particular event-related potential, and trait anxiety. Methodological topics covered include: how to set up a regression model in a Bayesian framework, specifying priors, examining convergence of the model, visualizing and interpreting posterior distributions, interval estimates, expected and predicted values, and model comparison tools. We also discuss situations where Bayesian methods can outperform frequentist methods as well as how to specify more complicated regression models. Finally, we conclude with recommendations about reporting guidelines for those using Bayesian methods in their own research. We provide data and R code for replicating our analyses. Copyright © 2017 Elsevier Ltd. All rights reserved.
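A minimal conjugate-prior sketch of Bayesian linear regression (zero-mean Gaussian prior on the coefficients, known noise variance) illustrates how prior and data combine into a posterior distribution over the coefficients. The data, prior variance, and function name are invented; this is not the article's EEG analysis:

```python
import numpy as np

# Conjugate Bayesian linear regression with known noise variance:
# posterior precision = X'X/sigma^2 + I/tau^2, mean = cov @ X'y / sigma^2.
def posterior(X, y, noise_var=1.0, prior_var=10.0):
    p = X.shape[1]
    prec = X.T @ X / noise_var + np.eye(p) / prior_var   # posterior precision
    cov = np.linalg.inv(prec)                            # posterior covariance
    mean = cov @ (X.T @ y) / noise_var                   # posterior mean
    return mean, cov

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(100), rng.normal(size=100)])  # intercept + predictor
y = X @ np.array([0.5, 2.0]) + rng.normal(size=100)        # true slope = 2.0
mean, cov = posterior(X, y)   # posterior mean shrinks slightly toward the prior
```

With a diffuse prior the posterior mean stays close to the OLS estimate; tightening `prior_var` pulls it toward zero.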
Illustrating Bayesian evaluation of informative hypotheses for regression models
Anouck eKluytmans
2012-01-01
In the present paper we illustrate the Bayesian evaluation of informative hypotheses for regression models. This approach allows psychologists to test their theories more directly than they would using conventional statistical analyses. Throughout this paper, both real-world data and simulated datasets are introduced and evaluated to investigate the pragmatic as well as the theoretical qualities of the approach. We pave the way from forming informative hypotheses in the context of regression models to interpreting the Bayes factors that express the support for the hypotheses being evaluated. In doing so, the present approach goes beyond p-values and uninformative null hypothesis testing, moving on to informative testing and quantification of model support in a way that is accessible to everyday psychologists.
In precision agriculture, regression has been used widely to quantify the relationship between soil attributes and other environmental variables. However, spatial correlation existing in soil samples usually makes the regression model suboptimal. In this study, a regression-kriging method was attemp...
le Fevre Jakobsen, Bjarne
The publication contains exercise materials, texts, PowerPoint presentations, and handouts for the course Sproglig Metode og Analyse (Linguistic Method and Analysis) in the BA and elective programme in Danish/Nordic studies, 2010-2011.
A Multi-objective Procedure for Efficient Regression Modeling
Sinha, Ankur; Kuosmanen, Timo
2012-01-01
Variable selection is recognized as one of the most critical steps in statistical modeling. The problems encountered in engineering and social sciences are commonly characterized by an over-abundance of explanatory variables, non-linearities and unknown interdependencies between the regressors. An added difficulty is that the analysts may have little or no prior knowledge of the relative importance of the variables. To provide a robust method for model selection, this paper introduces a technique called the Multi-objective Genetic Algorithm for Variable Selection (MOGA-VS) which provides the user with an efficient set of regression models for a given data set. The algorithm treats the regression problem as a two-objective task, where the purpose is to choose those models which have fewer regression coefficients and better goodness of fit. In MOGA-VS, the model selection procedure is implemented in two steps. First, we generate the frontier of all efficient or non-dominated regression m...
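The non-dominated filtering step described above (fewer coefficients versus better goodness of fit) can be sketched directly; the candidate models below are hypothetical, and this is a generic Pareto filter, not the MOGA-VS genetic algorithm itself:

```python
def pareto_efficient(models):
    """Keep models not dominated in (number of coefficients, goodness of fit).

    Model a dominates model b if a has no more coefficients AND at least as
    good a fit, with at least one strict improvement.
    """
    def dominates(a, b):
        return (a[0] <= b[0] and a[1] >= b[1]) and (a[0] < b[0] or a[1] > b[1])
    return [m for m in models
            if not any(dominates(o, m) for o in models if o is not m)]

# Hypothetical candidate models: (number of regression coefficients, R^2).
candidates = [(1, 0.40), (2, 0.55), (2, 0.50), (3, 0.70), (4, 0.70), (5, 0.72)]
frontier = pareto_efficient(candidates)
```

The model (2, 0.50) is dominated by (2, 0.55), and (4, 0.70) by (3, 0.70), so only the four efficient trade-offs survive for the user to choose among.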
Multiple Instance Regression with Structured Data
Wagstaff, Kiri L.; Lane, Terran; Roper, Alex
2008-01-01
This slide presentation reviews the use of multiple instance regression with structured data from multiple and related data sets. It applies the concept to a practical problem, that of estimating crop yield using remote sensed country wide weekly observations.
Prediction of Dynamical Systems by Symbolic Regression
Quade, Markus; Shafi, Kamran; Niven, Robert K; Noack, Bernd R
2016-01-01
We study the modeling and prediction of dynamical systems based on conventional models derived from measurements. Such algorithms are highly desirable in situations where the underlying dynamics are hard to model from physical principles or simplified models need to be found. We focus on symbolic regression methods as a part of machine learning. These algorithms are capable of learning an analytically tractable model from data, a highly valuable property. Symbolic regression methods can be considered as generalized regression methods. We investigate two particular algorithms, the so-called fast function extraction which is a generalized linear regression algorithm, and genetic programming which is a very general method. Both are able to combine functions in a certain way such that a good model for the prediction of the temporal evolution of a dynamical system can be identified. We illustrate the algorithms by finding a prediction for the evolution of a harmonic oscillator based on measurements, by detecting a...
Some Simple Computational Formulas for Multiple Regression
Aiken, Lewis R., Jr.
1974-01-01
Short-cut formulas are presented for direct computation of the beta weights, the standard errors of the beta weights, and the multiple correlation coefficient for multiple regression problems involving three independent variables and one dependent variable. (Author)
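As a companion to such formulas, the three-predictor case can also be computed directly from the normal equations; the sketch below uses plain Python with a small Gaussian-elimination solver rather than the article's short-cut formulas, and the data are made up for illustration:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small linear system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def multiple_regression(X, y):
    """OLS weights (intercept first) and R^2 via the normal equations."""
    Z = [[1.0] + list(row) for row in X]      # prepend an intercept column
    p = len(Z[0])
    XtX = [[sum(z[a] * z[b] for z in Z) for b in range(p)] for a in range(p)]
    Xty = [sum(z[a] * yi for z, yi in zip(Z, y)) for a in range(p)]
    beta = solve(XtX, Xty)
    yhat = [sum(b * v for b, v in zip(beta, z)) for z in Z]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

# Exact data generated from y = 2 + x1 - 3*x2 + 0.5*x3.
X = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1], [2, 1, 0], [1, 2, 3]]
y = [3.0, -1.0, 2.5, 0.5, 1.0, -1.5]
beta, r2 = multiple_regression(X, y)
```

On noise-free data the recovered weights match the generating coefficients and the multiple correlation is exact (R^2 = 1).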
Spontaneous Regression of an Incidental Spinal Meningioma
Ali Yilmaz
2015-12-01
Full Text Available AIM: The regression of meningioma has been reported in the literature before. Although regression may be induced by hemorrhage, calcification or withdrawal of some drugs, it is rarely observed spontaneously. CASE REPORT: We report a 17-year-old man with a cervical meningioma which was detected incidentally. His cervical MRI showed an extradural, cranio-caudally contrast-enhanced lesion at the C2-C3 levels of the cervical spinal cord. Despite slight compression of the spinal cord, he had no symptoms and refused any kind of surgical approach. The meningioma was followed by control MRI and spontaneously regressed within six months. There were no signs of hemorrhage or calcification. CONCLUSION: Although it is a rare condition, clinicians should consider that meningiomas, especially those diagnosed incidentally, may regress spontaneously.
Spontaneous Regression of an Incidental Spinal Meningioma.
Yilmaz, Ali; Kizilay, Zahir; Sair, Ahmet; Avcil, Mucahit; Ozkul, Ayca
2016-03-15
The regression of meningioma has been reported in the literature before. Although regression may be induced by hemorrhage, calcification or withdrawal of some drugs, it is rarely observed spontaneously. We report a 17-year-old man with a cervical meningioma which was detected incidentally. His cervical MRI showed an extradural, cranio-caudally contrast-enhanced lesion at the C2-C3 levels of the cervical spinal cord. Despite slight compression of the spinal cord, he had no symptoms and refused any kind of surgical approach. The meningioma was followed by control MRI and spontaneously regressed within six months. There were no signs of hemorrhage or calcification. Although it is a rare condition, clinicians should consider that meningiomas, especially those diagnosed incidentally, may regress spontaneously.
Vectors, a tool in statistical regression theory
Corsten, L.C.A.
1958-01-01
Using linear algebra this thesis developed linear regression analysis including analysis of variance, covariance analysis, special experimental designs, linear and fertility adjustments, analysis of experiments at different places and times. The determination of the orthogonal projection, yielding e
Patterns of Regression in Rett Syndrome
J Gordon Millichap
2002-10-01
Full Text Available Patterns and features of regression in a case series of 53 girls and women with Rett syndrome were studied at the Institute of Child Health and Great Ormond Street Children’s Hospital, London, UK.
A new bivariate negative binomial regression model
Faroughi, Pouya; Ismail, Noriszura
2014-12-01
This paper introduces a new form of bivariate negative binomial (BNB-1) regression which can be fitted to bivariate and correlated count data with covariates. The BNB regression discussed in this study can be fitted to bivariate and overdispersed count data with positive, zero or negative correlations. The joint p.m.f. of the BNB-1 distribution is derived from the product of two negative binomial marginals with a multiplicative factor parameter. Several testing methods were used to check overdispersion and goodness-of-fit of the model. Application of BNB-1 regression is illustrated on a Malaysian motor insurance dataset. The results indicated that BNB-1 regression has a better fit than the bivariate Poisson and BNB-2 models with regard to the Akaike information criterion.
Heteroscedastic regression analysis method for mixed data
FU Hui-min; YUE Xiao-rui
2011-01-01
A heteroscedastic regression model was established and a heteroscedastic regression analysis method presented for mixed data composed of complete data, type-I censored data and type-II censored data from the location-scale distribution. The best unbiased estimations of the regression coefficients, as well as the confidence limits of the location parameter and scale parameter, were given. Furthermore, the point estimations and confidence limits of percentiles were obtained. Thus, the traditional multiple regression analysis method, which is only suitable for complete data from the normal distribution, can be extended to heteroscedastic mixed data and the location-scale distribution. The presented method therefore has a broad range of promising applications.
The best of both worlds: Phylogenetic eigenvector regression and mapping
José Alexandre Felizola Diniz Filho
2015-09-01
Full Text Available Eigenfunction analyses have been widely used to model patterns of autocorrelation in time, space and phylogeny. In a phylogenetic context, Diniz-Filho et al. (1998) proposed what they called Phylogenetic Eigenvector Regression (PVR), in which pairwise phylogenetic distances among species are submitted to a Principal Coordinate Analysis, and eigenvectors are then used as explanatory variables in regression, correlation or ANOVAs. More recently, a new approach called Phylogenetic Eigenvector Mapping (PEM) was proposed, with the main advantage of explicitly incorporating a model-based warping of phylogenetic distance, in which an Ornstein-Uhlenbeck (O-U) process is fitted to the data before eigenvector extraction. Here we compared PVR and PEM with respect to estimated phylogenetic signal, correlated evolution under alternative evolutionary models, and phylogenetic imputation, using simulated data. Despite the similarity between the two approaches, PEM has a slightly higher prediction ability and is more general than the original PVR. Even so, in a conceptual sense, PEM may provide a technique in the best of both worlds, combining the flexibility of data-driven, empirical eigenfunction analyses with the sound insights provided by evolutionary models well known in comparative analyses.
Superquantile Regression: Theory, Algorithms, and Applications
2014-12-01
The thesis illustrates how the least-squares and quantile regression models adjust to changes in the data set: when an observation is moved upwards, the least-squares fit shifts noticeably while the quantile regression model hardly changes, and moving that observation even further upwards produces no further change in the quantile regression function.
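The robustness property described in that excerpt can be illustrated with an intercept-only model, where the least-squares fit reduces to the mean and the 0.5-quantile regression fit reduces to the median; this toy example is ours, not from the thesis:

```python
def mean(xs):
    return sum(xs) / len(xs)

def median(xs):
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

y = [1.0, 2.0, 3.0, 4.0, 5.0]
base_mean, base_med = mean(y), median(y)

# Move the largest observation far upwards, as in the thesis's illustration.
y_moved = y[:-1] + [50.0]
moved_mean, moved_med = mean(y_moved), median(y_moved)
```

The least-squares fit (the mean) jumps from 3 to 12, while the quantile fit (the median) stays at 3; pushing the outlier even higher changes the median not at all.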
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.
2010-08-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been made, a relatively simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
Boosted regression tree, table, and figure data
Spreadsheets are included here to support the manuscript Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations and Biological Condition. This dataset is associated with the following publication: Golden, H., C. Lane, A. Prues, and E. D'Amico. Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations and Biological Condition. JAWRA. American Water Resources Association, Middleburg, VA, USA, 52(5): 1251-1274, (2016).
Fuzzy multiple linear regression: A computational approach
Juang, C. H.; Huang, X. H.; Fleming, J. W.
1992-01-01
This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.
Discriminative Elastic-Net Regularized Linear Regression.
Zhang, Zheng; Lai, Zhihui; Xu, Yong; Shao, Ling; Wu, Jian; Xie, Guo-Sen
2017-03-01
In this paper, we aim at learning compact and discriminative linear regression models. Linear regression has been widely used in different problems. However, most of the existing linear regression methods exploit the conventional zero-one matrix as the regression targets, which greatly narrows the flexibility of the regression model. Another major limitation of these methods is that the learned projection matrix fails to precisely project the image features to the target space due to their weak discriminative capability. To this end, we present an elastic-net regularized linear regression (ENLR) framework, and develop two robust linear regression models which possess the following special characteristics. First, our methods exploit two particular strategies to enlarge the margins of different classes by relaxing the strict binary targets into a more feasible variable matrix. Second, a robust elastic-net regularization of singular values is introduced to enhance the compactness and effectiveness of the learned projection matrix. Third, the resulting optimization problem of ENLR has a closed-form solution in each iteration, which can be solved efficiently. Finally, rather than directly exploiting the projection matrix for recognition, our methods employ the transformed features as the new discriminative representations for final image classification. Compared with the traditional linear regression model and some of its variants, our method is much more accurate in image classification. Extensive experiments conducted on publicly available data sets well demonstrate that the proposed framework can outperform the state-of-the-art methods. The MATLAB code for our methods is available at http://www.yongxu.org/lunwen.html.
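For readers unfamiliar with the elastic-net penalty itself (as distinct from the authors' ENLR framework, which regularizes singular values and has its own closed-form updates), a minimal coordinate-descent sketch of plain elastic-net regression is shown below; the data and penalty settings are illustrative assumptions:

```python
def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the coordinate-descent update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def elastic_net(X, y, lam=0.1, alpha=0.5, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*||b||_2^2)."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j's own contribution added back in.
            r = [y[i] - sum(b[k] * X[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n)) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
    return b

# Toy data: y depends only on the first feature; the second is irrelevant.
X = [[1, 1], [-1, 1], [2, -1], [-2, -1], [0.5, 0], [-0.5, 0]]
y = [3 * row[0] for row in X]
b = elastic_net(X, y, lam=0.1, alpha=0.5)
```

The first coefficient is shrunk slightly below its true value of 3 by the penalty, while the irrelevant coefficient is driven exactly to zero, which is the combined ridge/lasso behavior that motivates elastic-net regularizers like the one in ENLR.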
Spontaneous regression of metastatic Merkel cell carcinoma.
Hassan, S J
2010-01-01
Merkel cell carcinoma is a rare aggressive neuroendocrine carcinoma of the skin predominantly affecting elderly Caucasians. It has a high rate of local recurrence and regional lymph node metastases. It is associated with a poor prognosis. Complete spontaneous regression of Merkel cell carcinoma has been reported but is a poorly understood phenomenon. Here we present a case of complete spontaneous regression of metastatic Merkel cell carcinoma demonstrating a markedly different pattern of events from those previously published.
The Infinite Hierarchical Factor Regression Model
Rai, Piyush
2009-01-01
We propose a nonparametric Bayesian factor regression model that accounts for uncertainty in the number of factors, and the relationship between factors. To accomplish this, we propose a sparse variant of the Indian Buffet Process and couple this with a hierarchical model over factors, based on Kingman's coalescent. We apply this model to two problems (factor analysis and factor regression) in gene-expression data analysis.
Marginal longitudinal semiparametric regression via penalized splines.
Kadiri, M Al; Carroll, R J; Wand, M P
2010-08-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been made, a relatively simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
曹兴; 杜文静; 程林
2012-01-01
A numerical simulation of a heat exchanger with continuous helical baffles was carried out using the commercial code ANSYS CFX 12.0. The study focuses on the effects of helix angle on flow and heat transfer characteristics, and heat exchanger performance is evaluated by the entropy generation number based on a second-law thermodynamic analysis. The results show that both the shell-side heat transfer coefficient and the pressure drop decrease with increasing helix angle at a given mass flow rate, the latter more quickly than the former. The tangential velocity distribution on the shell-side cross section is more uniform with continuous helical baffles than with segmental baffles. The axial velocity at a given radial position decreases as the helix angle increases in the inner region near the central dummy tube, whereas it increases with helix angle in the outer region near the shell. The heat exchange quantity distribution in tubes at different radial positions is more uniform at larger helix angles. At equal shell-side mass flow rates, the proportion of the total entropy generation contributed by heat transfer increases, and the entropy generation number decreases, as the helix angle increases.
Estelles-Lopez, Lucia; Ropodi, Athina; Pavlidis, Dimitris; Fotopoulou, Jenny; Gkousari, Christina; Peyrodie, Audrey; Panagou, Efstathios; Nychas, George-John; Mohareb, Fady
2017-09-01
Over the past decade, analytical approaches based on vibrational spectroscopy, hyperspectral/multispectral imaging and biomimetic sensors started gaining popularity as rapid and efficient methods for assessing food quality, safety and authentication, as a sensible alternative to the expensive and time-consuming conventional microbiological techniques. Due to the multi-dimensional nature of the data generated from such analyses, the output needs to be coupled with a suitable statistical approach or machine-learning algorithms before the results can be interpreted. Choosing the optimum pattern recognition or machine learning approach for a given analytical platform is often challenging and involves a comparative analysis between various algorithms in order to achieve the best possible prediction accuracy. In this work, "MeatReg", a web-based application, is presented, able to automate the procedure of identifying the best machine learning method for comparing data from several analytical techniques, to predict the counts of microorganisms responsible for meat spoilage regardless of the packaging system applied. In particular, up to seven regression methods were applied: ordinary least squares regression, stepwise linear regression, partial least squares regression, principal component regression, support vector regression, random forest and k-nearest neighbours. "MeatReg" was tested with minced beef samples stored under aerobic and modified atmosphere packaging and analysed with electronic nose, HPLC, FT-IR, GC-MS and multispectral imaging instruments. Populations of total viable count, lactic acid bacteria, pseudomonads, Enterobacteriaceae and B. thermosphacta were predicted. As a result, recommendations were obtained of which analytical platforms are suitable to predict each type of bacteria and which machine learning methods to use in each case. The developed system is accessible via the link: www.sorfml.com. Copyright © 2017 Elsevier Ltd. All rights
Multiple-Instance Regression with Structured Data
Wagstaff, Kiri L.; Lane, Terran; Roper, Alex
2008-01-01
We present a multiple-instance regression algorithm that models internal bag structure to identify the items most relevant to the bag labels. Multiple-instance regression (MIR) operates on a set of bags with real-valued labels, each containing a set of unlabeled items, in which the relevance of each item to its bag label is unknown. The goal is to predict the labels of new bags from their contents. Unlike previous MIR methods, MI-ClusterRegress can operate on bags that are structured in that they contain items drawn from a number of distinct (but unknown) distributions. MI-ClusterRegress simultaneously learns a model of the bag's internal structure, the relevance of each item, and a regression model that accurately predicts labels for new bags. We evaluated this approach on the challenging MIR problem of crop yield prediction from remote sensing data. MI-ClusterRegress provided predictions that were more accurate than those obtained with non-multiple-instance approaches or MIR methods that do not model the bag structure.
[Iris movement mediates pupillary membrane regression].
Morizane, Yuki
2007-11-01
In the course of mammalian lens development, a transient capillary meshwork called the pupillary membrane (PM) forms. It is located in the pupil area to nourish the anterior surface of the lens, and then regresses to clear the optical path. Although the involvement of an apoptotic process has been reported in PM regression, the initiating factor remains unknown. We initially found that regression of the PM coincided with the development of iris motility, and that iris movement caused cessation and resumption of blood flow within the PM. Therefore, we investigated whether the development of the capacity of the iris to constrict and dilate can function as an essential signal that induces apoptosis in the PM. Continuous inhibition of iris movement with mydriatic agents suppressed apoptosis of the PM and resulted in persistence of the PM in rats. The distribution of apoptotic cells in the regressing PM was diffuse and showed no apparent localization. These results indicated that iris movement induced regression of the PM by changing the blood flow within it. This study suggests the importance of physiological interactions between tissues (in this case, the iris and the PM) as a signal to advance vascular regression during organ development.
Post-processing through linear regression
B. Van Schaeybroeck
2011-03-01
Full Text Available Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast, and multicollinearity. The regression schemes under consideration include the ordinary least-squares (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-squares method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified.
These techniques are applied in the context of the Lorenz '63 system, whose model version is affected by both initial-condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best-member OLS with noise). At long lead times, the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread should be preferred.
Post-processing through linear regression
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast, and multicollinearity. The regression schemes under consideration include the ordinary least-squares (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-squares method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the Lorenz '63 system, whose model version is affected by both initial-condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best-member OLS with noise). At long lead times, the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread should be preferred.
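The simplest scheme in this comparison, ordinary least-squares post-processing, fits observations against forecasts on a training period and then applies the fitted line as the correction; the sketch below uses synthetic biased forecasts of our own invention, not the Lorenz experiments:

```python
def fit_ols(f, o):
    """Fit o ≈ a + b*f by ordinary least squares (one predictor)."""
    n = len(f)
    fbar, obar = sum(f) / n, sum(o) / n
    b = sum((fi - fbar) * (oi - obar) for fi, oi in zip(f, o)) / \
        sum((fi - fbar) ** 2 for fi in f)
    a = obar - b * fbar
    return a, b

# Toy training data: forecasts carry a constant bias (+2) and a scale error (x1.5).
obs       = [1.0, 2.0, 3.0, 4.0, 5.0]
forecasts = [2 + 1.5 * o for o in obs]

a, b = fit_ols(forecasts, obs)
corrected = [a + b * f for f in forecasts]
```

Because the forecast error here is purely affine, the regression removes both the constant and the proportional component exactly; with real noisy data the correction would only be partial, which is where the variability criteria discussed in the abstract come in.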
Identifying predictors of physics item difficulty: A linear regression approach
Vanes Mesic
2011-06-01
Full Text Available Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge.
Identifying predictors of physics item difficulty: A linear regression approach
Mesic, Vanes; Muratovic, Hasnija
2011-06-01
Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge
Afifah, Rawyanil; Andriyana, Yudhie; Jaya, I. G. N. Mindra
2017-03-01
Geographically Weighted Regression (GWR) is a development of Ordinary Least Squares (OLS) regression which is quite effective for estimating spatially non-stationary data. In GWR models, regression parameters are generated locally; each observation has a unique regression coefficient. The parameter estimation process in GWR uses Weighted Least Squares (WLS). But when there are outliers in the data, parameter estimation with WLS produces estimators which are not efficient. Hence, this study uses a robust method called Least Absolute Deviation (LAD) to estimate the parameters of the GWR model in a case study of poverty on Java Island. This study concludes that the GWR model with the LAD method has better performance.
Time series regression model for infectious disease and weather.
Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro
2015-10-01
Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
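One of the SIR-motivated modifications described, using the logarithm of lagged case counts as a predictor for true contagion, can be sketched on synthetic data; the multiplicative epidemic below is a made-up illustration (not the Bangladesh or Tokyo data), and a real analysis would use quasi-Poisson or negative binomial regression with weather covariates:

```python
import math

# Simulate a noise-free multiplicative contagion process:
# log(cases_t) = 1.0 + 0.8 * log(cases_{t-1})
cases = [100.0]
for _ in range(15):
    cases.append(math.exp(1.0 + 0.8 * math.log(cases[-1])))

# Regress log(cases_t) on log(cases_{t-1}) -- the lagged-log-count term that
# stands in for dependence on the currently infectious population.
x = [math.log(c) for c in cases[:-1]]
y = [math.log(c) for c in cases[1:]]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
intercept = ybar - slope * xbar
```

On this noise-free series the regression recovers the generating coefficients exactly (slope 0.8, intercept 1.0); in practice the lagged-log-count term would appear alongside weather predictors and seasonality terms in the model.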
Zordok, Wael A.
2014-08-01
The new solid complexes [VO(CIP)2L]SO4·nH2O, where L = aniline (An), dimethylformamide (DMF), pyridine (Py) or triethylamine (Et3N), were prepared and characterized from the reaction of ciprofloxacin (CIP) with VO(SO4)2·2H2O in ethanol. The isolated complexes have been characterized by their melting points, elemental analysis, IR spectroscopy, magnetic properties, conductance measurements, UV-Vis and 1H NMR spectroscopic methods, and thermal analyses. The results supported the formation of the complexes and indicated that ciprofloxacin acts as a bidentate ligand bound to the vanadium ion through the pyridone oxygen and one carboxylato oxygen. The activation energies, E*; entropies, ΔS*; enthalpies, ΔH*; and Gibbs free energies, ΔG*, of the thermal decomposition reactions have been derived from thermogravimetric (TGA) and differential thermogravimetric (DTG) curves, using the Coats-Redfern and Horowitz-Metzger methods. The lowest-energy model structure of each complex has been proposed using density functional theory (DFT) at the B3LYP/CEP-31G level of theory. The ligand and its metal complexes were also evaluated for their antibacterial activity against several bacterial species, such as Bacillus subtilis (B. subtilis), Staphylococcus aureus (S. aureus), Neisseria gonorrhoeae (N. gonorrhoeae), Pseudomonas aeruginosa (P. aeruginosa) and Escherichia coli (E. coli).
Adaptive support vector regression for UAV flight control.
Shin, Jongho; Jin Kim, H; Kim, Youdan
2011-01-01
This paper explores an application of support vector regression for adaptive control of an unmanned aerial vehicle (UAV). Unlike neural networks, support vector regression (SVR) generates global solutions, because SVR basically solves quadratic programming (QP) problems. With this advantage, the input-output feedback-linearized inverse dynamic model and the compensation term for the inversion error are identified off-line, which we call I-SVR (inversion SVR) and C-SVR (compensation SVR), respectively. In order to compensate for the inversion error and the unexpected uncertainty, an online adaptation algorithm for the C-SVR is proposed. Then, the stability of the overall error dynamics is analyzed by the uniformly ultimately bounded property in the nonlinear system theory. In order to validate the effectiveness of the proposed adaptive controller, numerical simulations are performed on the UAV model.
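As a rough illustration of the off-line identification step, an epsilon-SVR can be fitted to a toy state-to-control mapping. This sketch uses scikit-learn's SVR on an invented two-state function, not the paper's UAV dynamics:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
# toy inverse-dynamics-style mapping: two-dimensional state -> control input
X = rng.uniform(-1, 1, size=(300, 2))
y = np.sin(np.pi * X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.normal(size=300)

# epsilon-SVR solves a convex QP, so the fitted model is a global optimum
model = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X, y)
pred = model.predict(X)
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
print(round(rmse, 3))
```

The convexity of the underlying QP is what the abstract contrasts with neural network training, which can be trapped in local minima.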
Cancer Regression in Patients After Transfer of Genetically Engineered Lymphocytes
Morgan, Richard A.; Dudley, Mark E.; Wunderlich, John R.; Hughes, Marybeth S.; Yang, James C.; Sherry, Richard M.; Royal, Richard E.; Topalian, Suzanne L.; Kammula, Udai S.; Restifo, Nicholas P.; Zheng, Zhili; Nahvi, Azam; de Vries, Christiaan R.; Rogers-Freezer, Linda J.; Mavroukakis, Sharon A.; Rosenberg, Steven A.
2006-10-01
Through the adoptive transfer of lymphocytes after host immunodepletion, it is possible to mediate objective cancer regression in human patients with metastatic melanoma. However, the generation of tumor-specific T cells in this mode of immunotherapy is often limiting. Here we report the ability to specifically confer tumor recognition by autologous lymphocytes from peripheral blood by using a retrovirus that encodes a T cell receptor. Adoptive transfer of these transduced cells in 15 patients resulted in durable engraftment at levels exceeding 10% of peripheral blood lymphocytes for at least 2 months after the infusion. We observed high sustained levels of circulating, engineered cells at 1 year after infusion in two patients who both demonstrated objective regression of metastatic melanoma lesions. This study suggests the therapeutic potential of genetically engineered cells for the biologic therapy of cancer.
Interpret with caution: multicollinearity in multiple regression of cognitive data.
Morrison, Catriona M
2003-08-01
Shibihara and Kondo in 2002 reported a reanalysis of the 1997 Kanji picture-naming data of Yamazaki, Ellis, Morrison, and Lambon-Ralph in which independent variables were highly correlated. Their addition of the variable visual familiarity altered the previously reported pattern of results, indicating that visual familiarity, but not age of acquisition, was important in predicting Kanji naming speed. The present paper argues that caution should be taken when drawing conclusions from multiple regression analyses in which the independent variables are so highly correlated, as such multicollinearity can lead to unreliable output.
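A standard way to screen for the multicollinearity discussed above is the variance inflation factor (VIF), computed by regressing each predictor on the others. A small numpy sketch with invented, deliberately collinear data:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of the predictor matrix X."""
    out = []
    n, p = X.shape
    for j in range(p):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(2)
a = rng.normal(size=500)
b = a + 0.05 * rng.normal(size=500)   # nearly collinear with a
c = rng.normal(size=500)
X = np.column_stack([a, b, c])
v = vif(X)
print(np.round(v, 1))  # a and b get large VIFs, c stays near 1
```

VIFs above roughly 5-10 are the conventional warning sign that individual regression coefficients, such as those in the reanalysis described above, cannot be interpreted reliably.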
Takeshi Takeda
2016-01-01
Two tests related to a new safety system for a pressurized water reactor were performed with the ROSA/LSTF (rig of safety assessment/large scale test facility). The tests simulated cold-leg small-break loss-of-coolant accidents with a 2-inch diameter break, using early steam generator (SG) secondary-side depressurization with or without release of nitrogen gas dissolved in accumulator (ACC) water. The SG depressurization was initiated by fully opening the depressurization valves in both SGs immediately after a safety injection signal. The pressure difference between the primary and SG secondary sides after the actuation of the ACC system was larger in the test with the dissolved-gas release than in the test without it. No core uncovery or heatup took place, because of the ACC coolant injection and two-phase natural circulation. Long-term core cooling was ensured by the actuation of the low-pressure injection system. The RELAP5 code predicted most of the overall trends of the major thermal-hydraulic responses after adjusting a break discharge coefficient for two-phase discharge flow, under the assumption that all the dissolved gas is released at the vessel upper plenum.
林剑; 张向前
2013-01-01
The quality of the competence of successors is a key factor affecting the development of family businesses. This paper first constructs a competency model for successors of family businesses, named the KAP (Knowledge, Ability and Personality) model, through qualitative study; it then analyses data collected through on-site research, using SPSS statistical software and the AMOS structural equation modeling tool for verification. The results show that the empirical findings basically tally with the hypothesized theoretical model. They also indicate that the components of competence differ between successors and the founders of family businesses. Finally, the paper puts forward concrete suggestions for raising the competence of family business successors, covering preparations before succession, trials during succession, and innovations after succession.
Sample size determination for logistic regression on a logit-normal distribution.
Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance
2017-06-01
Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample-size calculation for multiple logistic regression requires additional information, such as the coefficient of determination ([Formula: see text]) of a covariate of interest with the other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution, which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over existing ones: (i) no need for [Formula: see text] in multiple logistic regression, (ii) the availability of interim or group-sequential designs, and (iii) a much smaller required sample size.
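Where closed-form formulas are unavailable, simulation can estimate the sample-size/power trade-off directly. The sketch below is not the authors' logit-normal method; it is a generic Monte Carlo power check for the Wald test in simple logistic regression, with an invented effect size and sample size, illustrating the quantity such sample-size methods target:

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Newton-Raphson fit; returns coefficients and their standard errors."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        W = p * (1 - p)
        H = X.T @ (W[:, None] * X)          # observed information
        beta += np.linalg.solve(H, X.T @ (y - p))
    p = 1 / (1 + np.exp(-X @ beta))
    cov = np.linalg.inv(X.T @ ((p * (1 - p))[:, None] * X))
    return beta, np.sqrt(np.diag(cov))

def power(n, beta1, sims=300, seed=3):
    """Empirical power of the two-sided Wald test for one normal covariate."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(sims):
        x = rng.normal(size=n)
        X = np.column_stack([np.ones(n), x])
        pr = 1 / (1 + np.exp(-beta1 * x))   # true intercept 0
        y = (rng.uniform(size=n) < pr).astype(float)
        b, se = fit_logistic(X, y)
        hits += abs(b[1] / se[1]) > 1.96
    return hits / sims

pw = power(400, 0.3)
print(round(pw, 2))
```

In practice one would sweep `n` until the estimated power reaches the desired level (e.g. 0.8), which is exactly what an analytic sample-size formula short-circuits.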
Analyses on the Actual Power-generating Capability of Solar Panels in Wenzhou
华晓玲; 梁步猛; 吴桂初
2014-01-01
This paper briefly introduces the advantages and disadvantages of thin-film, monocrystalline silicon and polycrystalline silicon solar panels, and summarizes several methods for achieving maximum power point tracking (MPPT). Considering the solar radiation conditions in the Wenzhou area, a large amount of data was acquired using the perturb-and-observe algorithm, and the actual power-generating capability of the three types of solar panels was compared using specific power as the criterion. The results indicate that monocrystalline silicon solar panels generate the most power under both strong and weak light. Between the other two types, thin-film solar panels have the advantage when light intensity is high, while polycrystalline silicon panels generate more power when light intensity is weak.
Regression Test Selection for C# Programs
Nashat Mansour
2009-01-01
We present a regression test selection technique for C# programs. C# is fairly new and is often used within the Microsoft .NET framework to give programmers a solid base for developing a variety of applications. Regression testing is done after modifying a program. Regression test selection refers to selecting a suitable subset of test cases from the original test suite to be rerun; it aims to provide confidence that the modifications are correct and did not affect other, unmodified parts of the program. The regression test selection technique presented in this paper accounts for C#/.NET-specific features. Our technique comprises three phases: the first phase builds an Affected Class Diagram consisting of classes that are affected by the change in the source code. The second phase builds a C# Interclass Graph (CIG) from the affected class diagram based on C#-specific features; in this phase, we reduce the number of selected test cases. The third phase involves further reduction and a new metric for assigning weights to test cases in order to prioritize the selected ones. We have empirically validated the proposed technique using case studies. The empirical results show the usefulness of the proposed regression testing technique for C#/.NET programs.
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
Hyperglycemia impairs atherosclerosis regression in mice.
Gaudreault, Nathalie; Kumar, Nikit; Olivas, Victor R; Eberlé, Delphine; Stephens, Kyle; Raffai, Robert L
2013-12-01
Diabetic patients are known to be more susceptible to atherosclerosis and its associated cardiovascular complications. However, the effects of hyperglycemia on atherosclerosis regression remain unclear. We hypothesized that hyperglycemia impairs atherosclerosis regression by modulating the biological function of lesional macrophages. HypoE (Apoe(h/h)Mx1-Cre) mice express low levels of apolipoprotein E (apoE) and develop atherosclerosis when fed a high-fat diet. Atherosclerosis regression occurs in these mice upon plasma lipid lowering induced by a change in diet and the restoration of apoE expression. We examined the morphological characteristics of regressed lesions and assessed the biological function of lesional macrophages isolated with laser-capture microdissection in euglycemic and hyperglycemic HypoE mice. Hyperglycemia induced by streptozotocin treatment impaired lesion size reduction (36% versus 14%) and lipid loss (38% versus 26%) after the reversal of hyperlipidemia. However, decreases in lesional macrophage content and remodeling in both groups of mice were similar. Gene expression analysis revealed that hyperglycemia impaired cholesterol transport by modulating ATP-binding cassette A1, ATP-binding cassette G1, scavenger receptor class B family member (CD36), scavenger receptor class B1, and wound healing pathways in lesional macrophages during atherosclerosis regression. Hyperglycemia impairs both reduction in size and loss of lipids from atherosclerotic lesions upon plasma lipid lowering without significantly affecting the remodeling of the vascular wall.
Detecting overdispersion in count data: A zero-inflated Poisson regression analysis
Afiqah Muhamad Jamil, Siti; Asrul Affendi Abdullah, M.; Kek, Sie Long; Nor, Maria Elena; Mohamed, Maryati; Ismail, Norradihah
2017-09-01
This study focuses on analysing count data of butterfly communities in Jasin, Melaka. For count-valued dependent variables, the Poisson regression model is the benchmark for regression analysis. Continuing from previous literature that used Poisson regression, this study applies zero-inflated Poisson (ZIP) regression to analyse the count data of butterfly communities in Jasin, Melaka, more precisely. When extra zeros are present, Poisson regression should be abandoned in favour of count-data models that account for those zeros explicitly; one of the most popular such models is the ZIP regression model. The data, collected in Jasin, Melaka, consist of 131 recorded subjects spanning five butterfly families, which represent the five variables in the analysis. The ZIP analysis used the SAS overdispersion procedure to handle zero values, and the main purpose of extending the previous study is to compare which model performs better when zero values exist in the count data. The comparison used AIC, BIC and the Vuong test at the 5% significance level. The findings indicate the presence of overdispersion when analysing zero values, and that the ZIP regression model outperforms the Poisson regression model when zero values exist.
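The Poisson-vs-ZIP comparison can be sketched for the intercept-only case: fit both models by maximum likelihood and compare AIC. The mixing proportion, Poisson mean and sample size below are invented, and scipy's generic optimizer stands in for the SAS procedure used in the study:

```python
import numpy as np
from math import lgamma
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n = 1000
true_pi, true_lam = 0.4, 2.5   # invented zero-inflation probability and Poisson mean
y = np.where(rng.uniform(size=n) < true_pi, 0, rng.poisson(true_lam, size=n))

logfact = np.array([lgamma(k + 1) for k in y])   # log(y!) terms

def zip_nll(theta):
    pi = 1 / (1 + np.exp(-theta[0]))    # logit-parameterized zero-inflation
    lam = np.exp(theta[1])              # log-parameterized Poisson mean
    p0 = np.log(pi + (1 - pi) * np.exp(-lam))                 # P(Y = 0)
    pk = np.log(1 - pi) - lam + y * np.log(lam) - logfact     # P(Y = k), k > 0
    return -np.sum(np.where(y == 0, p0, pk))

def pois_nll(theta):
    lam = np.exp(theta[0])
    return -np.sum(-lam + y * np.log(lam) - logfact)

zip_fit = minimize(zip_nll, [0.0, 0.0])
poi_fit = minimize(pois_nll, [0.0])
aic_zip = 2 * 2 + 2 * zip_fit.fun
aic_poi = 2 * 1 + 2 * poi_fit.fun
print(aic_zip < aic_poi)  # ZIP wins when excess zeros are present
```

With covariates, the same likelihood gains a regression structure on both the Poisson mean and the zero-inflation probability, which is what the ZIP model in the study fits.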
SMOOTH TRANSITION LOGISTIC REGRESSION MODEL TREE
RODRIGO PINTO MOREIRA
2008-01-01
The main objective of this work is to adapt the STR-Tree model, which combines a Smooth Transition Regression model with Classification and Regression Trees (CART), for use in classification. To this end, some changes were made to its structural form and its estimation. Because we are classifying binary dependent variables, techniques from logistic regression are required; accordingly, the estimation of the pa...
Unsupervised K-Nearest Neighbor Regression
Kramer, Oliver
2011-01-01
In many scientific disciplines structures in high-dimensional data have to be found, e.g., in stellar spectra, in genome data, or for face recognition tasks. In this work we present a novel approach to non-linear dimensionality reduction. It is based on fitting K-nearest neighbor regression to the unsupervised regression framework for learning of low-dimensional manifolds. Similar to related approaches that are mostly based on kernel methods, unsupervised K-nearest neighbor (UKNN) regression optimizes latent variables w.r.t. the data space reconstruction error employing the K-nearest neighbor heuristic. The problem of optimizing latent neighborhoods is difficult to solve, but the UKNN formulation allows an efficient strategy of iteratively embedding latent points to fixed neighborhood topologies. The approaches will be tested experimentally.
LINEAR REGRESSION WITH R AND HADOOP
Bogdan OANCEA
2015-07-01
In this paper we present a way to solve the linear regression model with R and Hadoop using the RHadoop library. We show how the linear regression model can be solved even for very large models that require special technologies. We used Hadoop for storing the data and R for computation; the interface between them is the open-source RHadoop library. We present the main features of the Hadoop and R software systems and the way of interconnecting them. We then show how the least-squares solution of the linear regression problem can be expressed in terms of the map-reduce programming paradigm and how it can be implemented using the RHadoop library.
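The map-reduce formulation of least squares reduces to summing per-block cross-products X'X and X'y, then solving the normal equations once. A numpy stand-in for the RHadoop pipeline (block size and data are invented):

```python
import numpy as np

def map_block(Xb, yb):
    """Map step: each data block contributes its cross-product sums."""
    return Xb.T @ Xb, Xb.T @ yb

def reduce_blocks(parts):
    """Reduce step: sum the per-block statistics, then solve the normal equations."""
    XtX = sum(p[0] for p in parts)
    Xty = sum(p[1] for p in parts)
    return np.linalg.solve(XtX, Xty)

rng = np.random.default_rng(5)
X = np.column_stack([np.ones(10_000), rng.normal(size=(10_000, 2))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + 0.1 * rng.normal(size=10_000)

# split into HDFS-style blocks and solve "distributively"
parts = [map_block(X[i:i + 1000], y[i:i + 1000]) for i in range(0, 10_000, 1000)]
beta = reduce_blocks(parts)
print(np.round(beta, 2))  # ≈ [1.0, 2.0, -0.5]
```

Because X'X is only p-by-p, the reduce step is tiny regardless of how many rows the blocks hold, which is what makes the approach scale to data that will not fit on one machine.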
Rapidly Regressive Unilateral Fetal Pleural Effusion
Tuncay Yuce
2015-03-01
Intrauterine pleural effusion of the fetal lungs rarely regresses without intervention. We treated a woman at 32 weeks of gestation whose pregnancy was complicated by fetal pleural effusion and polyhydramnios. A therapeutic thoracocentesis was planned, and she received two courses of betamethasone prior to the procedure. On the day of the planned procedure, a substantial regression of the pleural effusion was observed and the procedure was postponed. During antenatal follow-up, complete regression of the pleural effusion was observed, and after delivery the effusion did not relapse. These findings hint at a possible role for antenatal steroids in the treatment of fetal pleural effusion, which is known to be resistant to treatment modalities in both the antenatal and postnatal periods. [Cukurova Med J 2015; 40(Suppl 1): 25-28]
Uncertainty quantification in DIC with Kriging regression
Wang, Dezhi; DiazDelaO, F. A.; Wang, Weizhuo; Lin, Xiaoshan; Patterson, Eann A.; Mottershead, John E.
2016-03-01
A Kriging regression model is developed as a post-processing technique for the treatment of measurement uncertainty in classical subset-based Digital Image Correlation (DIC). Regression is achieved by regularising the sample-point correlation matrix using a local, subset-based, assessment of the measurement error with assumed statistical normality and based on the Sum of Squared Differences (SSD) criterion. This leads to a Kriging-regression model in the form of a Gaussian process representing uncertainty on the Kriging estimate of the measured displacement field. The method is demonstrated using numerical and experimental examples. Kriging estimates of displacement fields are shown to be in excellent agreement with 'true' values for the numerical cases and in the experimental example uncertainty quantification is carried out using the Gaussian random process that forms part of the Kriging model. The root mean square error (RMSE) on the estimated displacements is produced and standard deviations on local strain estimates are determined.
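The core of such a Kriging regression, a Gaussian-process posterior with a noise (nugget) term representing measurement error, can be sketched in a few lines of numpy. The kernel choice, length scale and "displacement" data below are invented stand-ins, not the DIC setup of the paper:

```python
import numpy as np

def sqexp(a, b, ell=0.5, sf=1.0):
    """Squared-exponential (Gaussian) covariance, a classic Kriging kernel."""
    return sf**2 * np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

rng = np.random.default_rng(6)
x = np.linspace(0.0, 2.0, 40)
noise = 0.05                                    # assumed measurement std (the "nugget")
y = np.sin(np.pi * x) + noise * rng.normal(size=x.size)   # noisy "displacement" samples

xs = np.linspace(0.0, 2.0, 200)
K = sqexp(x, x) + noise**2 * np.eye(x.size)     # regularised sample covariance matrix
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
Ks = sqexp(xs, x)
mean = Ks @ alpha                               # Kriging estimate of the field
v = np.linalg.solve(L, Ks.T)
var = np.clip(sqexp(xs, xs).diagonal() - np.sum(v**2, axis=0), 0.0, None)
std = np.sqrt(var)                              # pointwise predictive uncertainty

rmse = float(np.sqrt(np.mean((mean - np.sin(np.pi * xs)) ** 2)))
print(round(rmse, 3))
```

The `std` array is the per-point uncertainty quantification the abstract describes: it is what allows error bars to accompany the smoothed displacement (and hence strain) estimates.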
KINERJA JACKKNIFE RIDGE REGRESSION DALAM MENGATASI MULTIKOLINEARITAS
HANY DEVITA
2015-02-01
Ordinary least squares (OLS) is a parameter estimation method that minimizes the residual sum of squares. If multicollinearity is present in the data, an unbiased estimator with minimum variance cannot be obtained. Multicollinearity is a linear correlation between the independent variables in a model. Jackknife Ridge Regression (JRR), an extension of Generalized Ridge Regression (GRR), addresses multicollinearity. GRR overcomes the estimator bias caused by the presence of multicollinearity by adding a different bias parameter for each independent variable to the least-squares equation after transforming the data into orthogonal form; in addition, JRR can reduce the bias of the ridge estimator. The results show that the JRR model outperforms the GRR model.
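The variance reduction that ridge-type estimators buy under multicollinearity is easy to demonstrate by simulation. This numpy sketch uses plain ridge regression (not the JRR/GRR estimators of the paper) on an invented, nearly collinear design:

```python
import numpy as np

rng = np.random.default_rng(7)

def one_fit(lam):
    """Draw a collinear dataset; return OLS (lam=0) or ridge slope estimates."""
    x1 = rng.normal(size=100)
    x2 = x1 + 0.01 * rng.normal(size=100)     # severe multicollinearity
    X = np.column_stack([x1, x2])
    y = x1 + x2 + 0.5 * rng.normal(size=100)
    return np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

ols = np.array([one_fit(0.0) for _ in range(500)])
ridge = np.array([one_fit(1.0) for _ in range(500)])
print(ols.std(axis=0).round(2), ridge.std(axis=0).round(3))
# ridge trades a little bias for a large drop in estimator variance
```

The jackknife step in JRR then targets exactly the bias that this shrinkage introduces, which is why it can improve on plain ridge/GRR estimates.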
On Solving Lq-Penalized Regressions
Tracy Zhou Wu
2007-01-01
Lq-penalized regression arises in multidimensional statistical modelling where all or part of the regression coefficients are penalized to achieve both accuracy and parsimony of statistical models. There is often substantial computational difficulty except for the quadratic penalty case. The difficulty is partly due to the nonsmoothness of the objective function inherited from the use of the absolute value. We propose a new solution method for the general Lq-penalized regression problem based on space transformation and thus efficient optimization algorithms. The new method has immediate applications in statistics, notably in penalized spline smoothing problems. In particular, the LASSO problem is shown to be polynomial time solvable. Numerical studies show promise of our approach.
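For the LASSO case (q = 1) highlighted above, cyclic coordinate descent with soft-thresholding is a standard solver. A self-contained numpy sketch on invented sparse data — this is the textbook algorithm, not the space-transformation method proposed in the paper:

```python
import numpy as np

def lasso_cd(X, y, lam, iters=200):
    """Cyclic coordinate descent for the LASSO (q = 1): soft-thresholding updates."""
    n, p = X.shape
    beta = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(iters):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]            # partial residual
            rho = X[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0) / col_ss[j]
    return beta

rng = np.random.default_rng(8)
X = rng.normal(size=(200, 10))
beta_true = np.array([3.0, -2.0] + [0.0] * 8)   # sparse ground truth
y = X @ beta_true + 0.1 * rng.normal(size=200)

beta = lasso_cd(X, y, lam=20.0)
print(np.round(beta, 1))  # first two coefficients large, rest driven to exactly zero
```

The nonsmooth absolute-value penalty is what produces exact zeros here, and also what makes generic smooth optimizers inapplicable, the computational difficulty the abstract addresses.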
Principal component regression for crop yield estimation
Suryanarayana, T M V
2016-01-01
This book highlights the estimation of crop yield in Central Gujarat, particularly the development of multiple regression models and principal component regression (PCR) models that use climatological parameters as independent variables and crop yield as the dependent variable. It then compares the multiple linear regression (MLR) and PCR results and discusses the significance of PCR for crop yield estimation. In this context, the book also covers principal component analysis (PCA), a statistical procedure used to reduce a number of correlated variables to a smaller number of uncorrelated variables called principal components (PCs). The book will be helpful to students and researchers starting work on climate and agriculture, mainly focusing on estimation models. The flow of chapters takes readers along a smooth path, from understanding climate and weather and the impact of climate change, gradually proceeding to downscaling techniques, and finally to the development of ...
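PCR as described, PCA to decorrelate the predictors followed by OLS on the leading component scores, can be sketched via the SVD. The "climate" predictors, latent structure and coefficients below are invented for illustration:

```python
import numpy as np

def pcr(X, y, k):
    """Principal component regression: project onto the top-k PCs, then OLS."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:k].T                  # top-k principal component scores
    G = np.column_stack([np.ones(len(y)), scores])
    gamma, *_ = np.linalg.lstsq(G, y, rcond=None)
    beta = Vt[:k].T @ gamma[1:]             # map back to the original variables
    intercept = gamma[0] - X.mean(axis=0) @ beta
    return intercept, beta

rng = np.random.default_rng(9)
base = rng.normal(size=(300, 3))
# six correlated "climate" predictors built from three latent factors
X = np.column_stack([base, base + 0.1 * rng.normal(size=(300, 3))])
y = X @ np.array([1.0, 0.5, -1.0, 0.2, 0.3, 0.1]) + 0.2 * rng.normal(size=300)

b0, b = pcr(X, y, k=3)
pred = b0 + X @ b
r2 = float(np.corrcoef(pred, y)[0, 1] ** 2)
print(round(r2, 3))
```

Because the six predictors share three latent factors, three components suffice; this dimensionality reduction is what stabilizes PCR relative to MLR when climatological variables are strongly intercorrelated.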
Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate
Minh, Vu Trieu; Katushin, Dmitri; Antonov, Maksim; Veinthal, Renno
2017-03-01
This paper presents statistical analyses of rock engineering properties and the measured penetration rate of a tunnel boring machine (TBM), based on data from an actual project. The aim of this study is to analyze the influence of rock engineering properties, including uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), rock brittleness index (BI), the distance between planes of weakness (DPW), and the alpha angle (Alpha) between the tunnel axis and the planes of weakness, on the TBM rate of penetration (ROP). Four statistical regression models (two linear and two nonlinear) are built to predict the ROP of the TBM. Finally, a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimates and can be applied to predict TBM performance. The fuzzy logic model achieves the highest R-squared value (R2), 0.714, compared with 0.667 for the runner-up, the multiple-variable nonlinear regression model.
(Author not listed)
2007-01-01
Detecting plant health conditions plays a key role in farm pest management and crop protection. In this study, hyperspectral leaf reflectance in rice (Oryza sativa L.) was measured on groups of healthy leaves and leaves infected by the fungus Bipolaris oryzae (Helminthosporium oryzae Breda de Haan) over the wavelength range from 350 to 2,500 nm. The percentage of leaf surface covered by lesions was estimated and defined as the disease severity. Statistical methods including multiple stepwise regression, principal component analysis and partial least-squares regression were used to estimate the severity of rice brown spot at the leaf level. Our results revealed that multiple stepwise linear regression could efficiently estimate disease severity with three wavebands in seven steps. The root mean square errors (RMSEs) for the training (n=210) and testing (n=53) datasets were 6.5% and 5.8%, respectively. Principal component analysis showed that the first principal component could explain approximately 80% of the variance of the original hyperspectral reflectance. The regression model with the first two principal components predicted disease severity with RMSEs of 16.3% and 13.9% for the training and testing datasets, respectively. Partial least-squares regression with seven extracted factors predicted disease severity most effectively among the statistical methods compared, with RMSEs of 4.1% and 2.0% for the training and testing datasets, respectively. Our research demonstrates that it is feasible to estimate the severity of rice brown spot using hyperspectral reflectance data at the leaf level.
Removing Malmquist bias from linear regressions
Verter, Frances
1993-01-01
Malmquist bias is present in all astronomical surveys where sources are observed above an apparent brightness threshold. Those sources which can be detected at progressively larger distances are progressively more limited to the intrinsically luminous portion of the true distribution. This bias does not distort any of the measurements, but distorts the sample composition. We have developed the first treatment to correct for Malmquist bias in linear regressions of astronomical data. A demonstration of the corrected linear regression, which is computed in four steps, is presented.
Multicollinearity in cross-sectional regressions
Lauridsen, Jørgen; Mur, Jesùs
2006-10-01
The paper examines robustness of results from cross-sectional regression paying attention to the impact of multicollinearity. It is well known that the reliability of estimators (least-squares or maximum-likelihood) gets worse as the linear relationships between the regressors become more acute. We resolve the discussion in a spatial context, looking closely into the behaviour shown, under several unfavourable conditions, by the most outstanding misspecification tests when collinear variables are added to the regression. A Monte Carlo simulation is performed. The conclusions point to the fact that these statistics react in different ways to the problems posed.
Federico A. Sturzeneger
1992-03-01
Currency Substitution and the Regressivity of Inflationary Taxation. The purpose of this paper is to show that in the presence of financial adaptation or currency substitution, the inflation tax is extremely regressive. This regressivity arises from the existence of a fixed cost of switching to inflation-proof transaction technologies. This fixed cost makes it optimal only for agents with sufficiently high incomes to switch out of domestic currency. The effects are illustrated and quantified for a particular case.
Development of Super-Ensemble techniques for ocean analyses: the Mediterranean Sea case
Pistoia, Jenny; Pinardi, Nadia; Oddo, Paolo; Collins, Matthew; Korres, Gerasimos; Drillet, Yann
2017-04-01
Short-term ocean analyses for sea surface temperature (SST) in the Mediterranean Sea can be improved by a statistical post-processing technique called super-ensemble. This technique consists of a multi-linear regression algorithm applied to a Multi-Physics Multi-Model Super-Ensemble (MMSE) dataset: a collection of different operational forecasting analyses together with ad-hoc simulations produced by modifying selected numerical model parameterizations. A new linear regression algorithm based on Empirical Orthogonal Function filtering techniques can prevent overfitting problems, although the best performances are achieved when correlation is added to the super-ensemble structure using a simple spatial filter applied after the linear regression. Our outcomes show that super-ensemble performance depends on the selection of an unbiased operator and the length of the learning period, but the quality of the generating MMSE dataset has the largest impact on the MMSE analysis Root Mean Square Error (RMSE), evaluated with respect to observed satellite SST. The lowest RMSE estimates result from the following choices: a 15-day training period, an overconfident MMSE dataset (a subset with the higher-quality ensemble members), and the least-squares algorithm filtered a posteriori.
Myron P. Zalucki; Michael J. Furlong
2005-01-01
Long-term forecasts of pest pressure are central to the effective management of many agricultural insect pests. In the eastern cropping regions of Australia, serious infestations of Helicoverpa punctigera (Wallengren) and H. armigera (Hübner) (Lepidoptera: Noctuidae) are experienced annually. Regression analyses of a long series of light-trap catches of adult moths were used to describe the seasonal dynamics of both species. The size of the spring generation in eastern cropping zones could be related to rainfall in putative source areas in inland Australia. Subsequent generations could be related to the abundance of various crops in agricultural areas, rainfall and the magnitude of the spring population peak. As rainfall figured prominently as a predictor variable, and can itself be predicted using the Southern Oscillation Index (SOI), trap catches were also related to this variable. The geographic distribution of each species was modelled in relation to climate, and CLIMEX was used to predict temporal variation in abundance at given putative source sites in inland Australia using historical meteorological data. These predictions were then correlated with subsequent pest abundance data in a major cropping region. The regression-based and bioclimatic approaches to predicting pest abundance are compared, and their utility in predicting and interpreting pest dynamics is discussed.
Demonstration of a Fiber Optic Regression Probe
Korman, Valentin; Polzin, Kurt A.
2010-01-01
The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics makes it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers changes, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for
Nielsen, Peter Carøe; Hansen, Hans Nørgaard; Olsen, Flemming Ove
2007-01-01
The quantitative and qualitative description of laser beam characteristics is important for process implementation and optimisation. In particular, a need for quantitative characterisation of beam diameter was identified when using fibre lasers for micro manufacturing. Here the beam diameter limits the obtainable features in direct laser machining as well as heat-affected zones in welding processes. This paper describes the development of a measuring unit capable of analysing the beam shape and diameter of lasers to be used in manufacturing processes. The analyser is based on the principle of a rotating mechanical wire being swept through the laser beam at varying Z-heights. The reflected signal is analysed and the resulting beam profile determined. The development comprised the design of a flexible fixture capable of providing both rotation and Z-axis movement, and control software including data capture ...
Tax System in Poland – Progressive or Regressive?
Jacek Tomkiewicz
2016-03-01
Purpose: To analyse the impact of the Polish fiscal regime on the general revenue of the country, and specifically to establish whether the cumulative tax burden borne by Polish households is progressive or regressive. Methodology: On the basis of Eurostat and OECD data, the author has analysed fiscal regimes in EU Member States and in OECD countries. The tax burden of households within different income groups has also been examined pursuant to applicable fiscal laws and data pertaining to the revenue and expenditure of households published by the Central Statistical Office (CSO). Conclusions: The fiscal regime in Poland is regressive; that is, the relative fiscal burden decreases as the taxpayer's income increases. Research Implications: The article contributes to the ongoing discussion on social cohesion, in particular with respect to economic policy instruments aimed at the redistribution of income within the economy. Originality: The author presents an analysis of data pertaining to fiscal policies in EU Member States and OECD countries and assesses the impact of the legal environment (fiscal regime and social security system) in Poland on income distribution within the economy. The impact of the total tax burden (direct and indirect taxes, social security contributions) on the economic situation of households from different income groups has been calculated using an original formula.
Epidemiology of CKD Regression in Patients under Nephrology Care.
Silvio Borrelli
Chronic Kidney Disease (CKD) regression is considered an infrequent renal outcome, limited to early stages and associated with higher mortality. However, the prevalence, prognosis and clinical correlates of CKD regression remain undefined in the setting of nephrology care. This is a multicenter prospective study in 1418 patients with established CKD (eGFR: 60-15 ml/min/1.73m²) under nephrology care in 47 outpatient clinics in Italy for at least one year. We defined CKD regressors by a ΔGFR ≥0 ml/min/1.73m² per year, where ΔGFR was estimated as the absolute difference between eGFR measured at baseline and at a follow-up visit after 18-24 months. Outcomes were End Stage Renal Disease (ESRD) and all-cause mortality. 391 patients (27.6%) were identified as regressors, as they showed an eGFR increase between the baseline visit in the renal clinic and the follow-up visit. In multivariate regression analyses the regressor status was not associated with CKD stage. Low proteinuria was the main factor associated with CKD regression, accounting per se for 48% of the likelihood of this outcome. Lower systolic blood pressure, higher BMI and absence of autosomal polycystic kidney disease (PKD) were additional predictors of CKD regression. In regressors, ESRD risk was 72% lower (HR: 0.28; 95% CI 0.14-0.57; p<0.0001), while mortality risk did not differ from that in non-regressors (HR: 1.16; 95% CI 0.73-1.83; p = 0.540). Spline models showed that the reduction of ESRD risk associated with positive ΔGFR was attenuated in advanced CKD stages. CKD regression occurs in about one-fourth of patients receiving renal care in nephrology units and correlates with low proteinuria, lower blood pressure and the absence of PKD. This condition portends a better renal prognosis, mostly in earlier CKD stages, with no excess risk of mortality.
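The regressor definition above reduces to a simple computation. The sketch below is illustrative only: the function names are invented, and the per-year annualization of the 18-24 month eGFR difference is an assumption, not a procedure stated in the abstract.

```python
# Sketch of the study's regressor criterion: a patient whose eGFR did not
# decline (Delta-GFR >= 0 ml/min/1.73m^2 per year) counts as a "CKD regressor".
# Names and the annualization step are hypothetical illustrations.
def annualized_delta_gfr(egfr_baseline, egfr_followup, months_between):
    # absolute eGFR change between the two visits, scaled to a per-year rate
    return (egfr_followup - egfr_baseline) * 12.0 / months_between

def is_regressor(egfr_baseline, egfr_followup, months_between):
    return annualized_delta_gfr(egfr_baseline, egfr_followup, months_between) >= 0.0

# eGFR rose from 32 to 35 over 24 months -> classified as a regressor
flag = is_regressor(32.0, 35.0, 24)
```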
A Skew-Normal Mixture Regression Model
Liu, Min; Lin, Tsung-I
2014-01-01
A challenge associated with traditional mixture regression models (MRMs), which rest on the assumption of normally distributed errors, is determining the number of unobserved groups. Specifically, even slight deviations from normality can lead to the detection of spurious classes. The current work aims to (a) examine how sensitive the commonly…
Predicting Social Trust with Binary Logistic Regression
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
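As a generic illustration of binary logistic regression (not the GSS analysis itself; the one-predictor toy data and the gradient-descent fitting routine below are invented for the sketch, whereas applied work would use a package such as statsmodels or R's glm):

```python
import math

# Minimal binary logistic regression fit by batch gradient descent on the
# average log-loss. Toy data only; illustrative, not the cited analysis.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=2000):
    w = [0.0] * (len(X[0]) + 1)              # intercept first
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = sigmoid(z) - yi            # derivative of the log-loss
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(y) for wj, g in zip(w, grad)]
    return w

def predict(w, xi):
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

# Outcome depends positively on the single predictor.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
w = fit_logistic(X, y)
```

The fitted probabilities should be near 1 for large predictor values and near 0 for small ones.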
The Geometry of Enhancement in Multiple Regression.
Waller, Niels G
2011-10-01
In linear multiple regression, "enhancement" is said to occur when R² = b′r > r′r, where b is a p×1 vector of standardized regression coefficients and r is a p×1 vector of correlations between a criterion y and a set of standardized regressors, x. When p = 1 then b ≡ r and enhancement cannot occur. When p = 2, for all full-rank Rxx ≠ I, Rxx = E[xx′] = VΛV′ (where VΛV′ denotes the eigendecomposition of Rxx; λ1 > λ2), the set [Formula: see text] contains four vectors; the set [Formula: see text] contains an infinite number of vectors. When p ≥ 3 (and λ1 > λ2 > ⋯ > λp), both sets contain an uncountably infinite number of vectors. Geometrical arguments demonstrate that B1 occurs at the intersection of two hyper-ellipsoids in ℝᵖ. Equations are provided for populating the sets B1 and B2 and for demonstrating that maximum enhancement occurs when b is collinear with the eigenvector associated with λp (the smallest eigenvalue of the predictor correlation matrix). These equations are used to illustrate the logic and the underlying geometry of enhancement in population multiple-regression models. R code for simulating population regression models that exhibit enhancement of any degree and any number of predictors is included in Appendices A and B.
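The enhancement condition R² = b′r > r′r is easy to verify numerically. The following sketch uses a hypothetical two-predictor example (not taken from the article): one predictor correlates 0.5 with the criterion, the other correlates 0, yet adding the second predictor raises R² above r′r.

```python
import numpy as np

# Numerical check of enhancement: with predictor correlation matrix Rxx and
# criterion correlations r, the squared multiple correlation is
# R2 = r' Rxx^{-1} r, and enhancement means R2 > r'r.
Rxx = np.array([[1.0, 0.5],
                [0.5, 1.0]])          # predictor intercorrelation
r = np.array([0.5, 0.0])              # x1 correlates with y, x2 does not
b = np.linalg.solve(Rxx, r)           # standardized regression weights
R2 = b @ r                            # equals r' Rxx^{-1} r = 1/3 here
```

Here r′r = 0.25 but R² = 1/3: the uncorrelated predictor acts as a suppressor and enhances prediction.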
Regression Segmentation for M³ Spinal Images.
Wang, Zhijie; Zhen, Xiantong; Tay, KengYeow; Osman, Said; Romano, Walter; Li, Shuo
2015-08-01
Clinical routine often requires analyzing spinal images of multiple anatomic structures in multiple anatomic planes from multiple imaging modalities (M³). Unfortunately, existing methods for segmenting spinal images are still limited to one specific structure, in one specific plane or from one specific modality (S³). In this paper, we propose a novel approach, Regression Segmentation, that is for the first time able to segment M³ spinal images in one single unified framework. This approach formulates the segmentation task innovatively as a boundary regression problem: modeling a highly nonlinear mapping function from substantially diverse M³ images directly to desired object boundaries. Leveraging the advancement of sparse kernel machines, regression segmentation is fulfilled by a multi-dimensional support vector regressor (MSVR) which operates in an implicit, high-dimensional feature space where M³ diversity and specificity can be systematically categorized, extracted, and handled. The proposed regression segmentation approach was thoroughly tested on images from 113 clinical subjects including both disc and vertebral structures, in both sagittal and axial planes, and from both MRI and CT modalities. The overall result reaches a high dice similarity index (DSI) of 0.912 and a low boundary distance (BD) of 0.928 mm. With our unified and extensible framework, an efficient clinical tool for M³ spinal image segmentation can be easily achieved, and will substantially benefit the diagnosis and treatment of spinal diseases.
A Spline Regression Model for Latent Variables
Harring, Jeffrey R.
2014-01-01
Spline (or piecewise) regression models have been used in the past to account for patterns in observed data that exhibit distinct phases. The changepoint or knot marking the shift from one phase to the other, in many applications, is an unknown parameter to be estimated. As an extension of this framework, this research considers modeling the…
Assessing risk factors for periodontitis using regression
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we assess the relevance, as risk factors for periodontitis, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking Status and Plaque Index. The multiple linear regression model was built to evaluate the influence of the IVs on mean Attachment Loss (AL), yielding the regression coefficients along with the p-values from the respective significance tests. The case (individual) classification adopted in the logistic model was the extent of destruction of periodontal tissues, defined as an Attachment Loss greater than or equal to 4 mm in at least 25% (AL≥4mm/≥25%) of the sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
Selecting a Regression Saturated by Indicators
Hendry, David F.; Johansen, Søren; Santos, Carlos
We consider selecting a regression model, using a variant of Gets, when there are more variables than observations, in the special case that the variables are impulse dummies (indicators) for every observation. We show that the setting is unproblematic if tackled appropriately, and obtain...
Functional data analysis of generalized regression quantiles
Guo, Mengmeng
2013-11-05
Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.
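The asymmetric (pinball) loss that regression quantiles minimize can be illustrated with a minimal sketch, using toy data and a constant-only model rather than the paper's functional data method: the constant that minimizes the pinball loss at level τ is the τ-quantile, which is why high τ captures the tail.

```python
# Pinball (check) loss for a constant fit c at quantile level tau.
# Its minimizer over c is the tau-quantile of the data.
def pinball(c, data, tau):
    return sum(tau * (y - c) if y >= c else (1 - tau) * (c - y) for y in data)

data = [1.0, 2.0, 3.0, 4.0, 100.0]

# Scanning the data points as candidate constants: tau = 0.5 picks the
# median (robust to the outlying 100), while tau = 0.9 moves into the tail.
best_median = min(data, key=lambda c: pinball(c, data, 0.5))
best_upper = min(data, key=lambda c: pinball(c, data, 0.9))
```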
Nonparametric regression with martingale increment errors
Delattre, Sylvain
2010-01-01
We consider the problem of adaptive estimation of the regression function in a framework where we replace ergodicity assumptions (such as independence or mixing) by another structural assumption on the model. Namely, we propose adaptive upper bounds for kernel estimators with data-driven bandwidth (Lepski's selection rule) in a regression model where the noise is an increment of martingale. It includes, as very particular cases, the usual i.i.d. regression and auto-regressive models. The cornerstone tool for this study is a new result for self-normalized martingales, called "stability", which is of independent interest. In a first part, we only use the martingale increment structure of the noise. We give an adaptive upper bound using a random rate, that involves the occupation time near the estimation point. Thanks to this approach, the theoretical study of the statistical procedure is disconnected from usual ergodicity properties like mixing. Then, in a second part, we make a link with the usual minimax th...
Finite Algorithms for Robust Linear Regression
Madsen, Kaj; Nielsen, Hans Bruun
1990-01-01
The Huber M-estimator for robust linear regression is analyzed. Newton type methods for solution of the problem are defined and analyzed, and finite convergence is proved. Numerical experiments with a large number of test problems demonstrate efficiency and indicate that this kind of approach may...
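The Huber M-estimator is commonly computed by iteratively reweighted least squares rather than the Newton-type methods the paper analyzes; the sketch below (tuning constant, toy data, and the MAD-based scale estimate are illustrative choices, not taken from the paper, and numpy is assumed available) shows how one gross outlier is downweighted.

```python
import numpy as np

# Sketch of Huber M-estimation for a straight line via iteratively
# reweighted least squares (IRLS). Illustrative only.
def huber_line(x, y, k=1.345, iters=50):
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # OLS starting point
    for _ in range(iters):
        res = y - X @ beta
        s = np.median(np.abs(res)) / 0.6745 + 1e-12    # robust scale (MAD)
        u = res / (s * k)
        # Huber weights: 1 inside the threshold, decreasing outside it
        w = np.where(np.abs(u) <= 1.0, 1.0, 1.0 / np.abs(u))
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

x = np.arange(10.0)
y = 2.0 * x + 1.0          # true line: intercept 1, slope 2
y[9] += 50.0               # one gross outlier
beta = huber_line(x, y)    # should stay close to (1, 2) despite the outlier
```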
Modeling confounding by half-sibling regression
Schölkopf, Bernhard; Hogg, David W; Wang, Dun
2016-01-01
We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as "half-sibling regression," is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both...
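A minimal sketch of the half-sibling idea on synthetic data (variable names are invented; the cited work's actual application involves astronomical light curves): two observed channels share a systematic confounder, and regressing the target channel on its "sibling" and keeping the residual recovers the latent signal.

```python
import numpy as np

# Half-sibling regression sketch: x and y share a systematic confounder;
# the residual of y regressed on x reconstructs y's own signal.
rng = np.random.default_rng(1)
n = 500
confounder = rng.normal(size=n)              # shared systematic effect
signal = rng.normal(size=n)                  # latent quantity of interest
x = confounder + 0.1 * rng.normal(size=n)    # sibling channel: confounder only
y = signal + confounder                      # target channel: signal + confounder

# Simple no-intercept least squares of y on x, then take residuals.
beta = (x @ y) / (x @ x)
reconstruction = y - beta * x
```

The residual correlates with the latent signal far more strongly than the raw target does.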
Structural Break Tests Robust to Regression Misspecification
Abi Morshed, Alaa; Andreou, E.; Boldea, Otilia
2016-01-01
Structural break tests developed in the literature for regression models are sensitive to model misspecification. We show - analytically and through simulations - that the sup Wald test for breaks in the conditional mean and variance of a time series process exhibits severe size distortions when the
The M Word: Multicollinearity in Multiple Regression.
Morrow-Howell, Nancy
1994-01-01
Notes that existence of substantial correlation between two or more independent variables creates problems of multicollinearity in multiple regression. Discusses multicollinearity problem in social work research in which independent variables are usually intercorrelated. Clarifies problems created by multicollinearity, explains detection of…
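One standard diagnostic for the multicollinearity problem discussed here is the variance inflation factor, VIF_j = 1/(1 − R²_j), where R²_j comes from regressing predictor j on the remaining predictors. The sketch below assumes numpy is available and uses synthetic data, not a social work dataset.

```python
import numpy as np

# Variance inflation factors: VIF_j = 1 / (1 - R2_j), with R2_j from
# regressing predictor j (with intercept) on the other predictors.
# VIF well above 1 flags multicollinearity.
def vif(X):
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef = np.linalg.lstsq(Z, y, rcond=None)[0]
        resid = y - Z @ coef
        tss = (y - y.mean()) @ (y - y.mean())
        r2 = 1.0 - (resid @ resid) / tss
        out.append(1.0 / (1.0 - r2))
    return out

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)   # nearly a copy of x1
x3 = rng.normal(size=200)               # independent predictor
v = vif(np.column_stack([x1, x2, x3]))  # v[0], v[1] large; v[2] near 1
```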
Macroeconomic Forecasting Using Penalized Regression Methods
Smeekes, Stephan; Wijler, Etiënne
2016-01-01
We study the suitability of lasso-type penalized regression techniques when applied to macroeconomic forecasting with high-dimensional datasets. We consider the performance of the lasso-type methods when the true DGP is a factor model, contradicting the sparsity assumption underlying penalized regression.
Deriving the Regression Line with Algebra
Quintanilla, John A.
2017-01-01
Exploration with spreadsheets and reliance on previous skills can lead students to determine the line of best fit. To perform linear regression on a set of data, students in Algebra 2 (or, in principle, Algebra 1) do not have to settle for using the mysterious "black box" of their graphing calculators (or other classroom technologies).…
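The least-squares line the article derives by algebra follows directly from the normal equations; a minimal sketch (any spreadsheet or calculator routine would give the same coefficients):

```python
# Least-squares line y = m*x + b from the normal equations --
# only algebra, no black-box calculator routine.
def fit_line(xs, ys):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope
    b = (sy - m * sx) / n                           # intercept
    return m, b

# Points lying exactly on y = 2x + 1 recover slope 2 and intercept 1.
m, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```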
Prediction of dynamical systems by symbolic regression
Quade, Markus; Abel, Markus; Shafi, Kamran; Niven, Robert K.; Noack, Bernd R.
2016-07-01
We study the modeling and prediction of dynamical systems based on conventional models derived from measurements. Such algorithms are highly desirable in situations where the underlying dynamics are hard to model from physical principles or simplified models need to be found. We focus on symbolic regression methods as a part of machine learning. These algorithms are capable of learning an analytically tractable model from data, a highly valuable property. Symbolic regression methods can be considered as generalized regression methods. We investigate two particular algorithms, the so-called fast function extraction which is a generalized linear regression algorithm, and genetic programming which is a very general method. Both are able to combine functions in a certain way such that a good model for the prediction of the temporal evolution of a dynamical system can be identified. We illustrate the algorithms by finding a prediction for the evolution of a harmonic oscillator based on measurements, by detecting an arriving front in an excitable system, and as a real-world application, the prediction of solar power production based on energy production observations at a given site together with the weather forecast.
Diagnostic profiles of acute abdominal pain with multinomial logistic regression
Ohmann, Christian
2007-07-01
Purpose: Application of multinomial logistic regression for diagnostic support of acute abdominal pain, a diagnostic problem with many differential diagnoses. Methods: The analysis is based on a prospective database of 2280 patients with acute abdominal pain, characterized by 87 variables from history and clinical examination and 12 differential diagnoses. Associations between single variables from history and clinical examination and the final diagnoses were investigated with multinomial logistic regression. Results: Exemplarily, the results are presented for the variable rigidity. A statistically significant association was observed between generalized rigidity and the diagnoses appendicitis, bowel obstruction, pancreatitis, perforated ulcer, multiple and other diagnoses, and between localized rigidity and appendicitis, diverticulitis, biliary disease and perforated ulcer. Diagnostic profiles were generated by summarizing the statistically significant associations. As an example the diagnostic profile of acute appendicitis is presented. Conclusions: Compared to alternative approaches (e.g. independent Bayes, loglinear models) there are advantages for multinomial logistic regression in supporting complex differential diagnostic problems, provided potential traps are avoided (e.g. α-error, interpretation of odds ratios).
Hendriks, M.A.; Luyten, J.W.; Scheerens, J.; Sleegers, P.J.C.; Scheerens, J.
2014-01-01
In this chapter results of a research synthesis and quantitative meta-analyses of three facets of time effects in education are presented, namely time at school during regular lesson hours, homework, and extended learning time. The number of studies for these three facets of time that could be used
Contesting Citizenship: Comparative Analyses
Siim, Birte; Squires, Judith
2007-01-01
... Comparative citizenship analyses need to be considered in relation to multiple inequalities and their intersections and to multiple governance and trans-national organising. This, in turn, suggests that comparative citizenship analysis needs to consider new spaces in which struggles for equal citizenship occur...
Wavelet Analyses and Applications
Bordeianu, Cristian C.; Landau, Rubin H.; Paez, Manuel J.
2009-01-01
It is shown how a modern extension of Fourier analysis known as wavelet analysis is applied to signals containing multiscale information. First, a continuous wavelet transform is used to analyse the spectrum of a nonstationary signal (one whose form changes in time). The spectral analysis of such a signal gives the strength of the signal in each…
Veldman, M.; Schelvis-Smit, A.A.M.
2005-01-01
On behalf of a client of Animal Sciences Group, different varieties of veal were analyzed by both instrumental and sensory analyses. The sensory evaluation was performed with a sensory analytical panel in the period of 13th of May and 31st of May, 2005. The three varieties of veal were: young bull,
Quantile regression for the statistical analysis of immunological data with many non-detects
Eilers Paul HC; Röder Esther; Savelkoul Huub FJ; van Wijk Roy
2012-01-01
Background: Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Methods and results: Quantile regression, a genera...
The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis
Czekaj, Tomasz Gerard
This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...... to avoid this problem. The main objective is to investigate the applicability of the nonparametric kernel regression method in applied production analysis. The focus of the empirical analyses included in this thesis is the agricultural sector in Poland. Data on Polish farms are used to investigate...... practically and politically relevant problems and to illustrate how nonparametric regression methods can be used in applied microeconomic production analysis both in panel data and cross-section data settings. The thesis consists of four papers. The first paper addresses problems of parametric...
Geodesic least squares regression for scaling studies in magnetic confinement fusion
Verdoolaege, Geert [Department of Applied Physics, Ghent University, Ghent, Belgium and Laboratory for Plasma Physics, Royal Military Academy, Brussels (Belgium)]
2015-01-13
In regression analyses for deriving scaling laws that occur in various scientific disciplines, usually standard regression methods have been applied, of which ordinary least squares (OLS) is the most popular. However, concerns have been raised with respect to several assumptions underlying OLS in its application to scaling laws. We here discuss a new regression method that is robust in the presence of significant uncertainty on both the data and the regression model. The method, which we call geodesic least squares regression (GLS), is based on minimization of the Rao geodesic distance on a probabilistic manifold. We demonstrate the superiority of the method using synthetic data and we present an application to the scaling law for the power threshold for the transition to the high confinement regime in magnetic confinement fusion devices.
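The conventional OLS baseline that the paper questions is a straight-line fit in log-log space, which recovers a power-law scaling y = C·xᵅ; a minimal sketch (GLS itself is not reproduced here, and the data are synthetic):

```python
import math

# OLS on log-transformed data recovers a power law y = C * x**alpha.
# This is only the conventional baseline that geodesic least squares
# is designed to improve upon; illustrative data.
def fit_power_law(xs, ys):
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    alpha = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
             / sum((a - mx) ** 2 for a in lx))
    C = math.exp(my - alpha * mx)
    return C, alpha

# Noise-free data from y = 3 * x**1.5 is recovered exactly.
xs = [1.0, 2.0, 4.0, 8.0]
C, alpha = fit_power_law(xs, [3.0 * x ** 1.5 for x in xs])
```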
The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis
Czekaj, Tomasz Gerard
This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...... function. However, the a priori specification of a functional form involves the risk of choosing one that is not similar to the “true” but unknown relationship between the regressors and the dependent variable. This problem, known as parametric misspecification, can result in biased parameter estimates...... and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences...
Progression and regression of the atherosclerotic plaque.
de Feyter, P J; Vos, J; Deckers, J W
1995-08-01
In animals in which atherosclerosis was induced experimentally (by a high-cholesterol diet), regression of the atherosclerotic lesion was demonstrated after serum cholesterol was reduced by cholesterol-lowering drugs or a low-fat diet. Whether regression of advanced coronary artery lesions also takes place in humans after a similar intervention remains conjectural. However, several randomized studies, primarily employing lipid-lowering intervention or comprehensive changes in lifestyle, have demonstrated, using serial angiograms, that it is possible to achieve less progression, arrest or even (small) regression of atherosclerotic lesions. The lipid-lowering trials (NHLBI, CLAS, POSCH, FATS, SCOR and STARS) studied 1240 symptomatic patients, mostly men, with moderately elevated cholesterol levels and moderately severe angiographically proven coronary artery disease. A variety of lipid-lowering drugs, in addition to a diet, were used over an intervention period ranging from 2 to 3 years. In all but one study (NHLBI), the progression of coronary atherosclerosis was less in the treated group, but regression was induced in only a few patients. The overall relative risks of progression and of regression of coronary atherosclerosis were 0.62 and 2.13, respectively. The induced angiographic differences were small and did not produce any significant haemodynamic benefit. The most important result was that the disease process could be stabilized in the majority of patients. Three comprehensive lifestyle change trials (the Lifestyle Heart study, STARS and the Heidelberg Study) studied 183 patients, who were subjected to stress management and/or intensive exercise, in addition to a low-fat diet, over a period ranging from 1 to 3 years. All three trials demonstrated less progression, and more regression, with overall relative risks of 0.40 and 2.35 respectively, in the intervention groups. Angiographic trials demonstrated that retardation or arrest of coronary atherosclerosis was possible
Meyer, Karin
2007-11-01
WOMBAT is a software package for quantitative genetic analyses of continuous traits, fitting a linear mixed model; estimates of covariance components and the resulting genetic parameters are obtained by restricted maximum likelihood. A wide range of models, comprising numerous traits, multiple fixed and random effects, selected genetic covariance structures, random regression models and reduced rank estimation, are accommodated. WOMBAT employs up-to-date numerical and computational methods. Together with the use of efficient compilers, this generates fast executable programs, suitable for large-scale analyses. Use of WOMBAT is illustrated for a bivariate analysis. The package consists of the executable program, available for LINUX and WINDOWS environments, a manual and a set of worked examples, and can be downloaded free of charge from http://agbu.une.edu.au/~kmeyer/wombat.html.
Logistic regression when binary predictor variables are highly correlated.
Barker, L; Brown, C
Standard logistic regression can produce estimates having large mean square error when predictor variables are multicollinear. Ridge regression and principal components regression can reduce the impact of multicollinearity in ordinary least squares regression. Generalizations of these, applicable in the logistic regression framework, are alternatives to standard logistic regression. It is shown that estimates obtained via ridge and principal components logistic regression can have smaller mean square error than estimates obtained through standard logistic regression. Recommendations for choosing among standard, ridge and principal components logistic regression are developed. Published in 2001 by John Wiley & Sons, Ltd.
Geiser, Achim
2015-01-01
A variety of possible future analyses of HERA data in the context of the HERA data preservation programme is collected, motivated, and commented. The focus is placed on possible future analyses of the existing $ep$ collider data and their physics scope. Comparisons to the original scope of the HERA programme are made, and cross references to topics also covered by other participants of the workshop are given. This includes topics on QCD, proton structure, diffraction, jets, hadronic final states, heavy flavours, electroweak physics, and the application of related theory and phenomenology topics like NNLO QCD calculations, low-x related models, nonperturbative QCD aspects, and electroweak radiative corrections. Synergies with other collider programmes are also addressed. In summary, the range of physics topics which can still be uniquely covered using the existing data is very broad and of considerable physics interest, often matching the interest of results from colliders currently in operation. Due to well-e...
Analysing Access Control Specifications
Probst, Christian W.; Hansen, René Rydhof
2009-01-01
When prosecuting crimes, the main question to answer is often who had a motive and the possibility to commit the crime. When investigating cyber crimes, the question of possibility is often hard to answer, as in a networked system almost any location can be accessed from almost anywhere. The most... Recent events have revealed intimate knowledge of surveillance and control systems on the side of the attacker, making it often impossible to deduce the identity of an inside attacker from logged data. In this work we present an approach that analyses the access control configuration to identify the set... of credentials needed to reach a certain location in a system. This knowledge allows one to identify a set of (inside) actors who have the possibility to commit an insider attack at that location. This has immediate applications in analysing log files, but also nontechnical applications such as identifying possible...
Wilen, C.; Moilanen, A.; Kurkela, E. [VTT Energy, Espoo (Finland). Energy Production Technologies
1996-12-31
The overall objectives of the project 'Feasibility of electricity production from biomass by pressurized gasification systems' within the EC Research Programme JOULE II were to evaluate the potential of advanced power production systems based on biomass gasification and to study the technical and economic feasibility of these new processes with different types of biomass feed stocks. This report was prepared as part of this R and D project. The objectives of this task were to perform fuel analyses of potential woody and herbaceous biomasses with specific regard to the gasification properties of the selected feed stocks. The analyses of 15 Scandinavian and European biomass feed stocks included density, proximate and ultimate analyses, trace compounds, ash composition and fusion behaviour in oxidizing and reducing atmospheres. The wood-derived fuels, such as whole-tree chips, forest residues, bark and to some extent willow, can be expected to have good gasification properties. Difficulties caused by ash fusion and sintering in straw combustion and gasification are generally known. The ash and alkali metal contents of the European biomasses harvested in Italy resembled those of the Nordic straws, and it is expected that they behave to a great extent as straw in gasification. No direct relation between the ash fusion behaviour (determined according to the standard method) and, for instance, the alkali metal content was found in the laboratory determinations. A more profound characterisation of the fuels would require gasification experiments in a thermobalance and a PDU (Process Development Unit) rig. (orig.) (10 refs.)
Geiser, Achim
2015-12-15
A variety of possible future analyses of HERA data in the context of the HERA data preservation programme is collected, motivated, and commented. The focus is placed on possible future analyses of the existing ep collider data and their physics scope. Comparisons to the original scope of the HERA programme are made, and cross references to topics also covered by other participants of the workshop are given. This includes topics on QCD, proton structure, diffraction, jets, hadronic final states, heavy flavours, electroweak physics, and the application of related theory and phenomenology topics like NNLO QCD calculations, low-x related models, nonperturbative QCD aspects, and electroweak radiative corrections. Synergies with other collider programmes are also addressed. In summary, the range of physics topics which can still be uniquely covered using the existing data is very broad and of considerable physics interest, often matching the interest of results from colliders currently in operation. Due to well-established data and MC sets, calibrations, and analysis procedures the manpower and expertise needed for a particular analysis is often very much smaller than that needed for an ongoing experiment. Since centrally funded manpower to carry out such analyses is not available any longer, this contribution not only targets experienced self-funded experimentalists, but also theorists and master-level students who might wish to carry out such an analysis.
Model selection in kernel ridge regression
Exterkate, Peter
2013-01-01
Kernel ridge regression is a technique to perform ridge regression with a potentially infinite number of nonlinear transformations of the independent variables as regressors. This method is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts. ... The influence of the choice of kernel and the setting of tuning parameters on forecast accuracy is investigated. Several popular kernels are reviewed, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. The latter two kernels are interpreted in terms of their smoothing properties ..., and the tuning parameters associated with all these kernels are related to smoothness measures of the prediction function and to the signal-to-noise ratio. Based on these interpretations, guidelines are provided for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study ...
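The dual form of kernel ridge regression referred to in this abstract can be illustrated with a minimal, self-contained sketch. This is not the paper's code; the Gaussian-kernel choice, the toy sine data, and all parameter values below are hypothetical, chosen only to show the mechanics (build the kernel matrix K, solve (K + λI)α = y, predict with weighted kernel evaluations):

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    # Gaussian (RBF) kernel for scalar inputs: exp(-gamma * (x - z)^2)
    return math.exp(-gamma * (x - z) ** 2)

def solve(A, b):
    # Gaussian elimination with partial pivoting for a dense linear system
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def kernel_ridge_fit(xs, ys, lam=0.1, gamma=1.0):
    # dual solution: alpha = (K + lam * I)^{-1} y
    n = len(xs)
    K = [[rbf_kernel(xs[i], xs[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, ys)

def kernel_ridge_predict(xs, alpha, x_new, gamma=1.0):
    # prediction is a kernel-weighted sum over the training points
    return sum(a * rbf_kernel(x, x_new, gamma) for a, x in zip(alpha, xs))

xs = [i / 10 for i in range(21)]       # hypothetical inputs on [0, 2]
ys = [math.sin(2 * x) for x in xs]     # noiseless toy target
alpha = kernel_ridge_fit(xs, ys, lam=1e-3, gamma=10.0)
pred = kernel_ridge_predict(xs, alpha, 0.55, gamma=10.0)
```

In this sketch, `lam` and `gamma` play the roles of the tuning parameters the abstract discusses: in practice they would be chosen from a small grid by cross-validation rather than fixed as here.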
Online support vector regression for reinforcement learning
Yu Zhenhua; Cai Yuanli
2007-01-01
The goal in reinforcement learning is to learn the value of each state-action pair in order to maximize the total reward. For continuous states and actions in the real world, the representation of value functions is critical. Furthermore, the samples in value functions are obtained sequentially. Therefore, an online support vector regression (OSVR) method is developed as a function approximator to estimate value functions in reinforcement learning. OSVR updates the regression function by analyzing the possible variation of support vector sets after new samples are inserted into the training set. To evaluate the OSVR learning ability, it is applied to the mountain-car task. The simulation results indicate that OSVR has a good convergence speed and can solve continuous problems that are infeasible using a lookup table.
Constrained regression models for optimization and forecasting
P.J.S. Bruwer
2003-12-01
Linear regression models and the interpretation of such models are investigated. In practice, problems often arise with the interpretation and use of a given regression model, in spite of the fact that researchers may be quite "satisfied" with the model. In this article, methods are proposed which overcome these problems. This is achieved by constructing a model in which the "area of experience" of the researcher is taken into account. This area of experience is represented as the convex hull of the available data points. With the aid of a linear programming model, it is shown how conclusions can be drawn in a practical way regarding aspects such as optimal levels of decision variables and forecasting.
Controlling attribute effect in linear regression
Calders, Toon
2013-12-01
In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then we analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or residuals of the models. Experiments with discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques are successful in controlling attribute effects in linear regression models. © 2013 IEEE.
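The core idea of that abstract, minimizing squared error subject to a linear constraint that neutralizes an attribute's effect, can be sketched generically. This is not the authors' propensity-based method; it is a plain equality-constrained least-squares example (solved via the KKT system) that forces equal mean predictions across the two levels of a binary attribute, on invented data:

```python
def solve(A, b):
    # Gaussian elimination with partial pivoting
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def constrained_ols(X, y, a):
    # minimize ||y - X b||^2 subject to a . b = 0, via the KKT system
    # [[2 X'X, a], [a', 0]] [b; lam] = [2 X'y; 0]
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    K = [[2.0 * XtX[i][j] for j in range(p)] + [a[i]] for i in range(p)]
    K.append(list(a) + [0.0])
    return solve(K, [2.0 * v for v in Xty] + [0.0])[:p]

# toy data: columns are intercept, feature x, binary attribute s; s biases y
X = [[1.0, x, s] for x, s in [(1, 0), (2, 0), (3, 0), (4, 0),
                              (2, 1), (3, 1), (4, 1), (5, 1)]]
y = [1.0 + 2.0 * r[1] + 3.0 * r[2] for r in X]

# constraint vector = difference of group-mean rows, so a . b = 0 forces
# equal mean predictions for the s = 1 and s = 0 groups
g1 = [r for r in X if r[2] == 1]; g0 = [r for r in X if r[2] == 0]
a = [sum(r[j] for r in g1) / len(g1) - sum(r[j] for r in g0) / len(g0)
     for j in range(3)]

beta = constrained_ols(X, y, a)
gap = sum(a[j] * beta[j] for j in range(3))  # mean-prediction gap, ~0
```

The paper's actual constraints (on mean outcomes or residuals, after factoring out propensity-justified bias) are richer than this single equality, but the Lagrangian machinery is the same.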
Multiple Kernel Spectral Regression for Dimensionality Reduction
Bing Liu
2013-01-01
Traditional manifold learning algorithms, such as locally linear embedding, Isomap, and Laplacian eigenmap, only provide the embedding results of the training samples. To solve the out-of-sample extension problem, spectral regression (SR) learns an embedding function by establishing a regression framework, which can avoid eigen-decomposition of dense matrices. Motivated by the effectiveness of SR, we incorporate multiple kernel learning (MKL) into SR for dimensionality reduction. The proposed approach (termed MKL-SR) seeks an embedding function in the Reproducing Kernel Hilbert Space (RKHS) induced by the multiple base kernels. An MKL-SR algorithm is proposed to further improve the performance of kernel-based SR (KSR). Furthermore, the proposed MKL-SR algorithm can be performed in supervised, unsupervised, and semi-supervised settings. Experimental results on supervised and semi-supervised classification demonstrate the effectiveness and efficiency of our algorithm.
A Gibbs Sampler for Multivariate Linear Regression
Mantz, Adam B
2015-01-01
Kelly (2007, hereafter K07) described an efficient algorithm, using Gibbs sampling, for performing linear regression in the fairly general case where non-zero measurement errors exist for both the covariates and response variables, where these measurements may be correlated (for the same data point), where the response variable is affected by intrinsic scatter in addition to measurement error, and where the prior distribution of covariates is modeled by a flexible mixture of Gaussians rather than assumed to be uniform. Here I extend the K07 algorithm in two ways. First, the procedure is generalized to the case of multiple response variables. Second, I describe how to model the prior distribution of covariates using a Dirichlet process, which can be thought of as a Gaussian mixture where the number of mixture components is learned from the data. I present an example of multivariate regression using the extended algorithm, namely fitting scaling relations of the gas mass, temperature, and luminosity of dynamica...
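For readers unfamiliar with Gibbs sampling for regression, the mechanics can be sketched in a deliberately minimal form. The sketch below is not the K07 algorithm or its extension: it handles one covariate, no measurement errors, no intrinsic-scatter decomposition and no Gaussian-mixture or Dirichlet-process prior, just the two conjugate-style full conditionals for a slope and a noise variance, on synthetic data with an arbitrary seed:

```python
import math
import random

random.seed(42)

# synthetic data for a no-intercept model y = beta * x + noise
true_beta, sigma = 2.5, 0.5
xs = [i / 10 for i in range(1, 51)]
ys = [true_beta * x + random.gauss(0.0, sigma) for x in xs]

n = len(xs)
Sxx = sum(x * x for x in xs)
Sxy = sum(x * y for x, y in zip(xs, ys))

sig2 = 1.0
draws = []
for it in range(3000):
    # full conditional of beta given sig2 (flat prior): N(Sxy/Sxx, sig2/Sxx)
    beta = random.gauss(Sxy / Sxx, math.sqrt(sig2 / Sxx))
    # full conditional of sig2 given beta (Jeffreys-type prior):
    # Inv-Gamma(n/2, SSR/2), sampled as (SSR/2) / Gamma(n/2, scale=1)
    ssr = sum((y - beta * x) ** 2 for x, y in zip(xs, ys))
    sig2 = (ssr / 2.0) / random.gammavariate(n / 2.0, 1.0)
    if it >= 500:  # discard burn-in
        draws.append(beta)

post_mean = sum(draws) / len(draws)
```

The appeal of Gibbs sampling, here as in K07, is that each conditional draw is from a standard distribution, so no tuning of proposal distributions is needed.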
Wavelet Scattering Regression of Quantum Chemical Energies
Hirn, Matthew; Poilvert, Nicolas
2016-01-01
We introduce multiscale invariant dictionaries to estimate quantum chemical energies of organic molecules from training databases. Molecular energies are invariant to isometric atomic displacements, and are Lipschitz continuous to molecular deformations. Similarly to density functional theory (DFT), the molecule is represented by an electronic density function. A multiscale invariant dictionary is calculated with wavelet scattering invariants. It cascades a first wavelet transform which separates scales, with a second wavelet transform which computes interactions across scales. Sparse scattering regressions give state-of-the-art results over two databases of organic planar molecules. On these databases, the regression error is of the order of the error produced by DFT codes, but at a fraction of the computational cost.
Are increases in cigarette taxation regressive?
Borren, P; Sutton, M
1992-12-01
Using the latest published data from Tobacco Advisory Council surveys, this paper re-evaluates the question of whether or not increases in cigarette taxation are regressive in the United Kingdom. The extended data set shows no evidence of increasing price elasticity by social class, as found in a major previous study. On the contrary, there appears to be no clear pattern in the price responsiveness of smoking behaviour across different social classes. Increases in cigarette taxation, while reducing smoking levels in all groups, fall most heavily on men and women in the lowest social class. Men and women in social class five can expect to pay eight and eleven times more of a tax increase, respectively, than their social class one counterparts. Taken as a proportion of relative incomes, the regressive nature of increases in cigarette taxation is even more pronounced.
Nonexistence in Reciprocal and Logarithmic Regression
Josef Bukac
2003-01-01
Fitting logarithmic b ln(c+x) and a+b ln(c+x) or reciprocal b/(c+x) and a+b/(c+x) regression models to data by the least squares method requires determining the closure of the set of each type of these functions defined on a finite domain. It follows that a minimal solution may not exist, but it does exist when the closure is considered.
Logistic regression a self-learning text
Kleinbaum, David G
1994-01-01
This textbook provides students and professionals in the health sciences with a presentation of the use of logistic regression in research. The text is self-contained and designed to be used either in class or as a tool for self-study. It arises from the author's many years of experience teaching this material, and the notes on which it is based have been extensively used throughout the world.
Curvatures for Parameter Subsets in Nonlinear Regression
1986-01-01
The relative curvature measures of nonlinearity proposed by Bates and Watts (1980) are extended to an arbitrary subset of the parameters in a normal, nonlinear regression model. In particular, the subset curvatures proposed indicate the validity of linearization-based approximate confidence intervals for single parameters. The derivation produces the original Bates-Watts measures directly from the likelihood function. When the intrinsic curvature is negligible, the Bates-Watts parameter-effec...
Realization of Ridge Regression in MATLAB
Dimitrov, S.; Kovacheva, S.; Prodanova, K.
2008-10-01
The least squares estimator (LSE) of the coefficients in the classical linear regression model is unbiased. In the case of multicollinearity among the columns of the design matrix, however, the LSE has a very large variance, i.e., the estimator is unstable. A more stable (but biased) estimator can be constructed using a ridge estimator (RE). In this paper the basic methods of obtaining ridge estimators and numerical procedures for their realization in MATLAB are considered. An application to a pharmacokinetics problem is presented.
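The ridge estimator discussed in this abstract, b(k) = (X'X + kI)^(-1) X'y, is compact enough to sketch directly. The example below is in Python rather than MATLAB, uses an invented nearly collinear two-column design, and shows the point of the method: k = 0 reproduces the (unstable) LSE, while a small positive k stabilizes and shrinks the estimate:

```python
def ridge(X, y, k):
    # ridge estimator (X'X + k I)^{-1} X'y for a two-column design,
    # using the closed-form inverse of a 2x2 matrix
    s11 = sum(r[0] * r[0] for r in X)
    s12 = sum(r[0] * r[1] for r in X)
    s22 = sum(r[1] * r[1] for r in X)
    t1 = sum(r[0] * yi for r, yi in zip(X, y))
    t2 = sum(r[1] * yi for r, yi in zip(X, y))
    a, b, d = s11 + k, s12, s22 + k
    det = a * d - b * b
    return [(d * t1 - b * t2) / det, (a * t2 - b * t1) / det]

# nearly collinear columns: x2 = x1 + a tiny perturbation
X = [[x, x + 0.001 * ((i % 3) - 1)]
     for i, x in enumerate([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])]
y = [2.0 * r[0] + 1.0 * r[1] for r in X]  # true coefficients (2, 1)

ols = ridge(X, y, 0.0)  # k = 0: ordinary least squares
rr  = ridge(X, y, 0.5)  # small ridge penalty
```

With noiseless data the LSE still recovers (2, 1), but its variance under noise would be enormous because det(X'X) is tiny; the ridge fit instead splits the (shared) signal almost evenly between the two near-identical columns and shrinks the total slightly, which is exactly the bias-for-variance trade the abstract describes.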
Bayesian Inference of a Multivariate Regression Model
Marick S. Sinay
2014-01-01
We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior: a multivariate normal distribution for the regression coefficients and an inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such a structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation among the elements of the covariance structure. The posterior moments of all relevant parameters of interest are calculated via a Markov chain Monte Carlo procedure. A Metropolis-Hastings-within-Gibbs algorithm is invoked, with a proposal density constructed to closely match the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Leukemia prediction using sparse logistic regression.
Tapio Manninen
We describe a supervised prediction method for diagnosis of acute myeloid leukemia (AML) from patient samples based on flow cytometry measurements. We use a data-driven approach with machine learning methods to train a computational model that takes in flow cytometry measurements from a single patient and gives a confidence score of the patient being AML-positive. Our solution is based on an [Formula: see text] regularized logistic regression model that aggregates AML test statistics calculated from individual test tubes with different cell populations and fluorescent markers. The model construction is entirely data driven and no prior biological knowledge is used. The described solution scored 100% classification accuracy in the DREAM6/FlowCAP2 Molecular Classification of Acute Myeloid Leukaemia Challenge against a gold standard consisting of 20 AML-positive and 160 healthy patients. Here we perform a more extensive validation of the prediction model's performance and further improve and simplify our original method, showing that statistically equal results can be obtained by using simple average marker intensities as features in the logistic regression model. In addition to the logistic regression based model, we also present other classification models and compare their performance quantitatively. The key benefit of our prediction method compared to other solutions with similar performance is that our model only uses a small fraction of the flow cytometry measurements, making our solution highly economical.
Outlier Detection Using Nonconvex Penalized Regression
She, Yiyuan
2010-01-01
This paper studies the outlier detection problem from the point of view of penalized regressions. Our regression model adds one mean shift parameter for each of the $n$ data points. We then apply a regularization favoring a sparse vector of mean shift parameters. The usual $L_1$ penalty yields a convex criterion, but we find that it fails to deliver a robust estimator. The $L_1$ penalty corresponds to soft thresholding. We introduce a thresholding (denoted by $\\Theta$) based iterative procedure for outlier detection ($\\Theta$-IPOD). A version based on hard thresholding correctly identifies outliers on some hard test problems. We find that $\\Theta$-IPOD is much faster than iteratively reweighted least squares for large data because each iteration costs at most $O(np)$ (and sometimes much less) avoiding an $O(np^2)$ least squares estimate. We describe the connection between $\\Theta$-IPOD and $M$-estimators. Our proposed method has one tuning parameter with which to both identify outliers and estimate regression...
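The alternating structure of a hard-thresholding Θ-IPOD-style procedure can be sketched for simple linear regression. This is an illustrative reconstruction on invented data, not the authors' code, and the fixed threshold below stands in for the paper's properly tuned parameter: the mean-shift vector γ absorbs large residuals, and the regression is refit on the shift-corrected responses until the two steps agree.

```python
def fit_simple(xs, ys):
    # ordinary least squares for y = a + b x
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def hard_threshold(r, t):
    # keep a mean-shift only for residuals larger than the threshold
    return r if abs(r) > t else 0.0

def ipod(xs, ys, t=2.0, iters=50):
    # alternate: (1) OLS on shift-corrected responses, (2) re-threshold residuals
    gamma = [0.0] * len(xs)
    for _ in range(iters):
        a, b = fit_simple(xs, [y - g for y, g in zip(ys, gamma)])
        resid = [y - (a + b * x) for x, y in zip(xs, ys)]
        gamma = [hard_threshold(r, t) for r in resid]
    return a, b, gamma

xs = [float(i) for i in range(10)]
ys = [1.0 + 0.5 * x for x in xs]
ys[3] += 6.0  # plant a gross outlier at index 3

a, b, gamma = ipod(xs, ys)
outliers = [i for i, g in enumerate(gamma) if g != 0.0]
```

Each iteration costs only a least-squares fit plus an elementwise threshold, which mirrors the abstract's point that the procedure avoids repeated expensive reweighted solves; the soft-thresholding (L1) variant would use `max(abs(r) - t, 0.0) * sign(r)` in place of the hard rule and, as the paper notes, is less robust.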
General regression and representation model for classification.
Jianjun Qian
Recently, regularized coding-based classification methods (e.g. SRC and CRC) have shown great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has the advantages of CRC, but also makes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients) and the specific information (the weight matrix of image pixels) to enhance classification performance. GRR uses generalized Tikhonov regularization and K nearest neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: the basic general regression and representation classifier (B-GRR) and the robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of the proposed methods over state-of-the-art algorithms.
A Dirty Model for Multiple Sparse Regression
Jalali, Ali; Sanghavi, Sujay
2011-01-01
Sparse linear regression, finding an unknown vector from linear measurements, is now known to be possible with fewer samples than variables, via methods like the LASSO. We consider the multiple sparse linear regression problem, where several related vectors, with partially shared support sets, have to be recovered. A natural question in this setting is whether one can use the sharing to further decrease the overall number of samples required. A line of recent research has studied the use of \ell_1/\ell_q norm block-regularizations with q>1 for such problems; however, these could actually perform worse in sample complexity, vis-à-vis solving each problem separately and ignoring sharing, depending on the level of sharing. We present a new method for multiple sparse linear regression that can leverage support and parameter overlap when it exists, but does not pay a penalty when it does not. The idea is very simple: we decompose the parameters into two components and regularize these differently. We show both theore...
Polat, Esra; Gunay, Suleyman
2013-10-01
One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes overestimation of the regression parameters and increases their variance. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed. The SIMPLS algorithm is the leading PLSR algorithm because of its speed and efficiency, and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; the dependent variables are then regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of the RPCR and RSIMPLS methods on an econometric data set, comparing the two methods on an inflation model of Turkey. The methods are compared in terms of predictive ability and goodness of fit, using a robust Root Mean Squared Error of Cross-Validation (R-RMSECV), a robust R² value and the Robust Component Selection (RCS) statistic.
Simone Becker Lopes
2014-04-01
Considering the importance of spatial issues in transport planning, the main objective of this study was to analyze the results obtained from different approaches to spatial regression models. In the presence of spatial autocorrelation, spatial dependence patterns should be incorporated in the models, since that dependence may affect their predictive power. The results obtained with the spatial regression models were also compared with the results of a multiple linear regression model of the kind typically used in trip generation estimation. The findings support the hypothesis that the inclusion of spatial effects in regression models is important, since the best results were obtained with the alternative models (spatial regression models or models with spatial variables included). This was observed in a case study carried out in the city of Porto Alegre, in the state of Rio Grande do Sul, Brazil, in the stages of specification and calibration of the models, with two distinct datasets.
Improving customer generation by analysing website visitor behaviour
Ramlall, Shalini
2011-01-01
This dissertation describes the creation of a new integrated Information Technology (IT) system that assisted in the collection of data about the behaviour of website visitors as well as sales and marketing data for those visitors who turned into customers. A key contribution to knowledge was the creation of a method to predict the outcome of visits to a website from visitors’ browsing behaviour. A new Online Tracking Module (OTM) was created that monitored visitors’ behaviour while they brow...
A Bibliography of Generative-Based Grammatical Analyses of Spanish.
Nuessel, Frank H.
One hundred sixty-eight books, articles, and dissertations written between 1960 and 1973 are listed in this bibliography of linguistic studies of the Spanish language within the grammatical theory originated by Noam Chomsky in his "Syntactic Structures" (1957). The present work is divided into two general categories: (1) phonology and (2) syntax…
Jacob, Benjamin G; Griffith, Daniel; Muturi, Ephantus; Caamano, Erick X; Shililu, Josephat; Githure, John I; Novak, Robert J
2009-01-01
This research illustrates a geostatistical approach for modeling the spatial distribution patterns of Anopheles arabiensis Patton aquatic habitats in two riceland environments. QuickBird 0.61 m data, encompassing the visible bands and the near-infrared (NIR) band, were selected to synthesize images of An. arabiensis aquatic habitats. These bands and field-sampled data were used to determine ecological parameters associated with riceland larval habitat development. SAS was used to calculate univariate statistics, correlations and Poisson regression models. Global autocorrelation statistics were generated in ArcGIS from georeferenced Anopheles aquatic habitats in the study sites. The geographic distribution of Anopheles gambiae s.l. aquatic habitats in the study sites exhibited weak positive autocorrelation; similar numbers of log-larval count habitats tend to cluster in space. Individual riceland habitat data were further evaluated in terms of their covariation with spatial autocorrelation, by regressing them on candidate spatial filter eigenvectors. Each eigenvector generated from a geographically weighted matrix, for both study sites, revealed a distinctive spatial pattern. The spatial autocorrelation components suggest the presence of roughly 14-30% redundant information in the aquatic habitat larval count samples. Synthetic map pattern variables furnish a method of capturing spatial dependency effects in the mean response term in regression analyses of riceland An. arabiensis aquatic habitat data.
A general framework for the use of logistic regression models in meta-analysis.
Simmonds, Mark C; Higgins, Julian Pt
2016-12-01
Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy.
An empirical study using permutation-based resampling in meta-regression
Gagnier Joel J
2012-02-01
Background: In meta-regression, as the number of trials in the analyses decreases, the risk of false positives or false negatives increases. This is partly due to the assumption of normality, which may not hold in small samples. Creating a distribution from the observed trials using permutation methods to calculate P values may allow for less spurious findings. Permutation has not been empirically tested in meta-regression. The objective of this study was to perform an empirical investigation to explore the differences in results for meta-analyses on a small number of trials using standard large-sample approaches versus permutation-based methods for meta-regression. Methods: We isolated a sample of randomized controlled clinical trials (RCTs) for interventions that have a small number of trials (herbal medicine trials). Trials were then grouped by herbal species and condition and assessed for methodological quality using the Jadad scale, and data were extracted for each outcome. Finally, we performed meta-analyses on the primary outcome of each group of trials and meta-regression for methodological quality subgroups within each meta-analysis. We used large-sample methods and permutation methods in our meta-regression modeling, and then compared final models and final P values between methods. Results: We collected 110 trials across 5 intervention/outcome pairings, with 5 to 10 trials per covariate. When applying large-sample methods and permutation-based methods in our backwards stepwise regression, the covariates in the final models were identical in all cases. The P values for the covariates in the final model were larger under permutation in 78% (7/9) of the cases and identical in 22% (2/9) of the cases. Conclusions: We present empirical evidence that permutation-based resampling may not change final models when using backwards stepwise regression, but may increase P values in meta-regression of multiple covariates for relatively small amount of ...
Severe accident recriticality analyses (SARA)
Frid, W.; Højerup, C.F.; Lindholm, I.
2001-01-01
... three computer codes and to further develop and adapt them for the task. The codes were SIMULATE-3K, APROS and RECRIT. Recriticality analyses were carried out for a number of selected reflooding transients for the Oskarshamn 3 plant in Sweden with SIMULATE-3K and for the Olkiluoto 1 plant in Finland with all three codes. The core initial and boundary conditions prior to recriticality have been studied with the severe accident codes SCDAP/RELAP5, MELCOR and MAAP4. The results of the analyses show that all three codes predict recriticality (both super-prompt power bursts and quasi steady-state power generation) for the range of parameters studied, i.e. with core uncovering and heat-up to maximum core temperatures of approximately 1800 K, and water flow rates of 45-2000 kg s⁻¹ injected into the downcomer. Since recriticality takes place in a small fraction of the core, the power densities are high ...
Lawson, E.M. [Australian Nuclear Science and Technology Organisation, Lucas Heights, NSW (Australia). Physics Division]
1998-03-01
The major use of ANTARES is Accelerator Mass Spectrometry (AMS), with ¹⁴C being the most commonly analysed radioisotope; presently about 35% of the available beam time on ANTARES is used for ¹⁴C measurements. The accelerator measurements are supported by, and dependent on, a strong sample preparation section. The ANTARES AMS facility supports a wide range of investigations into fields such as global climate change, ice cores, oceanography, dendrochronology, anthropology, and classical and Australian archaeology. Described here are some examples of the ways in which AMS has been applied to support research into the archaeology, prehistory and culture of this continent's indigenous Aboriginal peoples.
Interpreting parameters in the logistic regression model with random effects
Larsen, Klaus; Petersen, Jørgen Holm; Budtz-Jørgensen, Esben
2000-01-01
Keywords: interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects
Mission assurance increased with regression testing
Lang, R.; Spezio, M.
Knowing what to test is an important attribute in any testing campaign, especially when it has to be right or the mission could be in jeopardy. The New Horizons mission, developed and operated by the Johns Hopkins University Applied Physics Laboratory, received a planned major upgrade to its Mission Operations and Control (MOC) ground system architecture. Early in the mission planning it was recognized that the ground system platform would require an upgrade to assure continued support of technology used for spacecraft operations. With the planned update of the six-year-old operational ground architecture from Solaris 8 to Solaris 10, it was critical that the new architecture maintain critical operations and control functions. The New Horizons spacecraft is heading to its historic rendezvous with Pluto in July 2015 and then proceeding into the Kuiper Belt. This paper discusses the Independent Software Acceptance Testing (ISAT) Regression test campaign that played a critical role in assuring the continued success of the New Horizons mission. The New Horizons ISAT process was designed to assure that all the requirements were being met for the ground software functions developed to support the mission objectives. The ISAT team developed a test plan with a series of test case designs. The test objectives were to verify that the software developed from the requirements functioned as expected in the operational environment. As the test cases were developed and executed, a regression test suite was identified at the functional level. This regression test suite would serve as a crucial resource in assuring the operational system continued to function as required with such a large-scale change being introduced. Some of the New Horizons ground software changes required modifications to the most critical functions of the operational software. Of particular concern was that the new MOC architecture (Solaris 10) is Intel based and little endian, while the legacy architecture (Solaris 8) was SPARC
Costs of sea dikes - regressions and uncertainty estimates
Lenk, Stephan; Rybski, Diego; Heidrich, Oliver; Dawson, Richard J.; Kropp, Jürgen P.
2017-05-01
Failure to consider the costs of adaptation strategies can be seen by decision makers as a barrier to implementing coastal protection measures. In order to validate adaptation strategies to sea-level rise in the form of coastal protection, a consistent and repeatable assessment of the costs is necessary. This paper significantly extends current knowledge on cost estimates by developing - and implementing using real coastal dike data - probabilistic functions of dike costs. Data from Canada and the Netherlands are analysed and related to published studies from the US, UK, and Vietnam in order to provide a reproducible estimate of typical sea dike costs and their uncertainty. We plot the costs divided by dike length as a function of height and test four different regression models. Our analysis shows that a linear function without intercept is sufficient to model the costs, i.e. fixed costs and higher-order contributions such as that due to the volume of core fill material are less significant. We also characterise the spread around the regression models which represents an uncertainty stemming from factors beyond dike length and height. Drawing an analogy with project cost overruns, we employ log-normal distributions and calculate that the range between 3x and x/3 contains 95 % of the data, where x represents the corresponding regression value. We compare our estimates with previously published unit costs for other countries. We note that the unit costs depend not only on the country and land use (urban/non-urban) of the sites where the dikes are being constructed but also on characteristics included in the costs, e.g. property acquisition, utility relocation, and project management. This paper gives decision makers an order of magnitude on the protection costs, which can help to remove potential barriers to developing adaptation strategies. Although the focus of this research is sea dikes, our approach is applicable and transferable to other adaptation measures.
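The two modelling choices described in this abstract, a linear cost function without intercept and a log-normal multiplicative spread around the fit, can be sketched on hypothetical numbers. The data below are invented and are not the paper's Canadian or Dutch dike costs; the point is only the mechanics of a through-the-origin fit and a multiplicative 95% band:

```python
import math

# hypothetical length-normalised dike costs: (height in m, cost per unit length)
data = [(1.0, 2.1), (2.0, 3.6), (3.0, 6.8), (4.0, 7.4),
        (5.0, 11.0), (6.0, 10.9), (7.0, 15.2), (8.0, 17.1)]

# linear regression through the origin: slope = sum(h*c) / sum(h^2)
slope = sum(h * c for h, c in data) / sum(h * h for h, _ in data)

# multiplicative spread: log-ratios of observed to fitted cost
log_ratios = [math.log(c / (slope * h)) for h, c in data]
mu = sum(log_ratios) / len(log_ratios)
sd = math.sqrt(sum((r - mu) ** 2 for r in log_ratios) / (len(log_ratios) - 1))

# under a log-normal model, ~95% of costs fall within a factor of
# exp(1.96 * sd) of the regression value x (cf. the paper's 3x to x/3 band)
upper_factor = math.exp(1.96 * sd)
```

On real data the estimated factor would be far larger than on this tidy toy set; the paper reports that the 95% band spans roughly 3x down to x/3, i.e. `upper_factor` near 3.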
Mapping geogenic radon potential by regression kriging.
Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos
2016-02-15
Radon ((222)Rn) gas is produced in the radioactive decay chain of uranium ((238)U), an element that is naturally present in soils. Radon is transported mainly by diffusion and convection through the soil, depending chiefly on physical and meteorological soil parameters, and can enter and accumulate in buildings. The health risk originating from indoor radon concentration can be attributed to natural factors and is characterized by the geogenic radon potential (GRP). Identification of areas with high health risk requires spatial modelling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in determining GRP. In order to compile a reliable GRP map for a model area in Central Hungary, spatial auxiliary information representing the environmental factors that form GRP was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were sought to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation, using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. First, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which is interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by leave-one-out cross-validation. Furthermore, the spatial reliability of the resultant map is estimated by calculating the 90% prediction interval of the local prediction values. The applicability of the method, as well as that of the map, is discussed briefly. Copyright © 2015 Elsevier B.V. All rights reserved.
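A minimal numpy sketch of the two-step regression kriging workflow described above, assuming a single auxiliary covariate and an exponential covariance model with arbitrarily chosen parameters (the data here are simulated, not the Hungarian GRP measurements):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setting: a target variable measured at a few sites, with one
# auxiliary covariate (e.g. a soil property) known everywhere.
n = 30
coords = rng.uniform(0, 10, size=(n, 2))                # measurement locations
covariate = rng.uniform(0, 1, size=n)                   # auxiliary variable
grp = 5.0 + 3.0 * covariate + rng.normal(0, 0.3, n)     # simulated target

# Step 1: deterministic component via linear regression on the covariate.
X = np.column_stack([np.ones(n), covariate])
beta, *_ = np.linalg.lstsq(X, grp, rcond=None)
residuals = grp - X @ beta

# Step 2: ordinary kriging of the residuals with an assumed exponential
# covariance (sill and range are illustrative, not fitted variogram values).
def exp_cov(h, sill=0.1, range_param=3.0):
    return sill * np.exp(-h / range_param)

def krige_residual(target):
    d = np.linalg.norm(coords - target, axis=1)
    D = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)
    # Ordinary kriging system with a Lagrange multiplier for unbiasedness.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_cov(D)
    A[n, n] = 0.0
    b = np.append(exp_cov(d), 1.0)
    w = np.linalg.solve(A, b)
    return w[:n] @ residuals

# Final RK prediction = regression trend + kriged residual.
def predict(target, target_cov):
    trend = beta[0] + beta[1] * target_cov
    return trend + krige_residual(np.asarray(target))

print(round(predict([5.0, 5.0], 0.5), 3))
```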
An Application on Multinomial Logistic Regression Model
Abdalla M El-Habil
2012-03-01
This study aims to present an application of the multinomial logistic regression model, one of the important methods for categorical data analysis. This model deals with one nominal or ordinal response variable that has more than two categories. It has been applied to data analysis in many areas, for example health, social, behavioural and educational research. To illustrate the model in a practical way, we used real data on physical violence against children from the Survey of Youth 2003, which was conducted by the Palestinian Central Bureau of Statistics (PCBS). A segment of the population of children aged 10-14 years residing in Gaza governorate, of size 66,935, was selected, and the response variable consisted of four categories. Eighteen explanatory variables were used to build the primary multinomial logistic regression model. The model was tested through a set of statistical tests to ensure its appropriateness for the data. The model was also tested by randomly selecting two observations from the data and predicting the group into which each observation would be classified, given the values of its explanatory variables. We conclude that with the multinomial logistic regression model we are able to define accurately the relationship between the group of explanatory variables and the response variable, to identify the effect of each variable, and to predict the classification of any individual case.
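The multinomial logit model at the heart of the study can be sketched with a small softmax regression fitted by gradient ascent; the four response categories and three explanatory variables below are simulated stand-ins, not the PCBS survey data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: a response with 4 categories and 3 explanatory variables.
n, p, k = 500, 3, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])   # with intercept
true_W = rng.normal(size=(p + 1, k))
logits = X @ true_W
y = np.array([rng.choice(k, p=np.exp(l - l.max()) / np.exp(l - l.max()).sum())
              for l in logits])

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

# Fit by gradient ascent on the multinomial log-likelihood; the gradient
# with respect to the coefficient matrix W is X'(Y - P).
W = np.zeros((p + 1, k))
Y = np.eye(k)[y]                      # one-hot encoding of the response
for _ in range(2000):
    P = softmax(X @ W)
    W += 0.1 * X.T @ (Y - P) / n

accuracy = (softmax(X @ W).argmax(axis=1) == y).mean()
print(round(accuracy, 3))
```

Each row of the fitted probabilities sums to one, so the model directly predicts the classification of any individual case, as described in the abstract.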
Hierarchical linear regression models for conditional quantiles
TIAN Maozai; CHEN Gemai
2006-01-01
Quantile regression has several useful features and is therefore gradually developing into a comprehensive approach to the statistical analysis of linear and nonlinear response models, but it cannot deal effectively with data that have a hierarchical structure. In practice, the existence of such data hierarchies is neither accidental nor ignorable; it is a common phenomenon. Ignoring this hierarchical data structure risks overlooking the importance of group effects, and may also render invalid many of the traditional statistical analysis techniques used for studying data relationships. On the other hand, hierarchical models take a hierarchical data structure into account and have many applications in statistics, ranging from overdispersion to constructing min-max estimators. However, hierarchical models are essentially mean regression and therefore cannot be used to characterize the entire conditional distribution of a dependent variable given high-dimensional covariates. Furthermore, the estimated coefficient vector (marginal effects) is sensitive to outlier observations on the dependent variable. In this article, a new approach is developed that is based on Gauss-Seidel iteration and takes full advantage of both quantile regression and hierarchical models. On the theoretical front, we also consider the asymptotic properties of the new method, obtaining simple conditions for n^(1/2)-convergence and asymptotic normality. We also illustrate the use of the technique with real hierarchical educational data and show how the results can be explained.
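The quantile regression building block can be illustrated with a plain subgradient-descent fit of the pinball (check) loss on synthetic heteroscedastic data; this sketch is ordinary, not hierarchical, quantile regression:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic heteroscedastic data: the noise scale grows with x, so upper
# and lower conditional quantiles have different slopes, which mean
# regression alone cannot capture.
n = 1000
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1 + 0.2 * x)
X = np.column_stack([np.ones(n), x])

def fit_quantile(tau, steps=20000):
    """Linear quantile regression via subgradient descent on the check loss."""
    beta = np.zeros(2)
    for t in range(steps):
        r = y - X @ beta
        # Subgradient of the mean check loss rho_tau(r) with respect to beta.
        g = -X.T @ (tau - (r < 0)) / n
        beta -= 0.5 / np.sqrt(t + 1.0) * g
    return beta

b10, b50, b90 = (fit_quantile(t) for t in (0.1, 0.5, 0.9))
print(b50.round(2), round(b90[1] - b10[1], 2))
```

With the noise scale increasing in x, the fitted upper-quantile slope exceeds the lower-quantile slope, illustrating how quantile regression characterizes the whole conditional distribution.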
The seam offset identification based on support vector regression machines
Zeng Songsheng; Shi Yonghua; Wang Guorong; Huang Guoxing
2009-01-01
The principle of the support vector regression machine (SVR) is first analysed. Then a new data-dependent kernel function is constructed from an information geometry perspective. The current waveforms change regularly with the horizontal offset when the rotational frequency of the high-speed rotational arc sensor is in the range from 15 Hz to 30 Hz. The welding current data are pretreated by wavelet filtering, mean filtering and normalization. The SVR model is constructed by exploiting these regularities; the decision function is obtained by training the SVR, and the seam offset can then be identified. The experimental results show that the precision of the offset identification can be greatly improved by modifying the SVR and applying mean filtering in the longitudinal direction.
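An epsilon-SVR mapping a pretreated waveform feature to seam offset can be sketched with scikit-learn; the sinusoidal feature model and all parameter values below are invented for illustration, not the paper's arc-sensor data or kernel:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)

# Simulated stand-in for the welding experiment: a single pretreated
# current-waveform feature (an arbitrary smooth function of the offset
# plus noise) is mapped back to the horizontal seam offset.
n = 200
offset = rng.uniform(-1.0, 1.0, n)                   # seam offset (mm)
feature = np.sin(offset) + 0.05 * rng.normal(size=n)
X = feature.reshape(-1, 1)

# Epsilon-SVR with an RBF kernel; C and epsilon are illustrative choices,
# not the paper's data-dependent kernel.
model = SVR(kernel='rbf', C=10.0, epsilon=0.01).fit(X, offset)
rmse = np.sqrt(np.mean((model.predict(X) - offset) ** 2))
print(round(rmse, 3))
```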
[Refractive regression after intraocular lens implantation].
Ma, Z Z; Momose, A
1991-05-01
Study of refractive changes after IOL implantation in 147 eyes revealed that astigmatism tended to increase, and the natural regressive course followed a negative exponential function, with the steep phase within 3 weeks for spherical errors and 5 weeks for cylindrical errors. One week after surgery, the axis of astigmatism was predominantly with the rule. Two months after the operation, patients with preoperative with-the-rule astigmatism (WRA) had changed into various astigmatic axial directions, while 76.4% of the patients with preoperative against-the-rule astigmatism (ARA) had reverted to ARA. Those eyes in which the astigmatic axis was not horizontal 1 week after the operation ended with stronger astigmatism at 2 months.
Learning regulatory programs by threshold SVD regression.
Ma, Xin; Xiao, Luo; Wong, Wing Hung
2014-11-04
We formulate a statistical model for the regulation of global gene expression by multiple regulatory programs and propose a thresholding singular value decomposition (T-SVD) regression method for learning such a model from data. Extensive simulations demonstrate that this method offers improved computational speed and higher sensitivity and specificity over competing approaches. The method is used to analyze microRNA (miRNA) and long noncoding RNA (lncRNA) data from The Cancer Genome Atlas (TCGA) consortium. The analysis yields previously unidentified insights into the combinatorial regulation of gene expression by noncoding RNAs, as well as findings that are supported by evidence from the literature.
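The core idea of thresholding the singular values of a multivariate regression coefficient estimate can be sketched as follows; the dimensions, the hard threshold and the low-rank setup are illustrative choices, not the TCGA analysis:

```python
import numpy as np

rng = np.random.default_rng(4)

# Multivariate regression Y = X B + E with a rank-2 coefficient matrix B.
n, p, q, r = 200, 10, 8, 2
X = rng.normal(size=(n, p))
B = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))   # rank-2 truth
Y = X @ B + 0.1 * rng.normal(size=(n, q))

# Ordinary least squares estimate of B, column by column.
B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Hard-threshold the singular values of the estimate to enforce low rank:
# small singular values are attributed to noise and set to zero.
U, s, Vt = np.linalg.svd(B_ols, full_matrices=False)
thresh = 0.5                     # would be tuned in practice
s_thr = np.where(s > thresh, s, 0.0)
B_tsvd = U @ np.diag(s_thr) @ Vt

rank_hat = int((s_thr > 0).sum())
print(rank_hat)
```

The thresholded estimate recovers the low-rank structure of the regulatory relationship while discarding noise directions, which is the intuition behind the sensitivity and specificity gains reported in the abstract.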
Privacy Preserving Linear Regression on Distributed Databases
Fida K. Dankar
2015-04-01
Studies that combine data from multiple sources can tremendously improve the outcome of the statistical analysis. However, combining data from these various sources for analysis poses privacy risks. A number of protocols have been proposed in the literature to address the privacy concerns, but they do not fully deliver on either privacy or complexity. In this paper, we present a (theoretical) privacy-preserving linear regression model for the analysis of data owned by several sources. The protocol uses a semi-trusted third party and delivers on both privacy and complexity.
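One way to sketch the flavour of such a protocol is additive secret sharing of the sufficient statistics X'X and X'y across two aggregators, so the pooled regression coefficients are recovered without any single party seeing raw site data. This is a toy illustration, not the paper's actual semi-trusted third-party protocol:

```python
import numpy as np

rng = np.random.default_rng(5)

# Three sites each hold private data (X_i, y_i) from the same linear model.
p = 3
beta_true = np.array([1.0, -2.0, 0.5])
sites = []
for _ in range(3):
    X = rng.normal(size=(100, p))
    y = X @ beta_true + 0.1 * rng.normal(size=100)
    sites.append((X.T @ X, X.T @ y))       # local sufficient statistics

# Each site splits each statistic into two random additive shares, one per
# aggregator; an aggregator alone sees only masked matrices.
shares_a, shares_b = [], []
for XtX, Xty in sites:
    mask_m = rng.normal(size=XtX.shape)
    mask_v = rng.normal(size=Xty.shape)
    shares_a.append((XtX - mask_m, Xty - mask_v))   # to server A
    shares_b.append((mask_m, mask_v))               # to server B

# Each server sums its shares; individual site statistics stay hidden.
sum_a = [sum(s[i] for s in shares_a) for i in (0, 1)]
sum_b = [sum(s[i] for s in shares_b) for i in (0, 1)]

# Recombining the two aggregate sums recovers the pooled statistics only.
XtX_total = sum_a[0] + sum_b[0]
Xty_total = sum_a[1] + sum_b[1]
beta_hat = np.linalg.solve(XtX_total, Xty_total)
print(beta_hat.round(2))
```

The recombined estimate equals the ordinary least-squares fit on the pooled data, while each aggregator observes only randomly masked matrices.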
Cyclodextrin promotes atherosclerosis regression via macrophage reprogramming
2016-01-01
Atherosclerosis is an inflammatory disease linked to elevated blood cholesterol concentrations. Despite ongoing advances in the prevention and treatment of atherosclerosis, cardiovascular disease remains the leading cause of death worldwide. Continuous retention of apolipoprotein B...... that increases cholesterol solubility in preventing and reversing atherosclerosis. We showed that CD treatment of murine atherosclerosis reduced atherosclerotic plaque size and CC load and promoted plaque regression even with a continued cholesterol-rich diet. Mechanistically, CD increased oxysterol production...... of CD as well as for augmented reverse cholesterol transport. Because CD treatment in humans is safe and CD beneficially affects key mechanisms of atherogenesis, it may therefore be used clinically to prevent or treat human atherosclerosis....
Neutrosophic Correlation and Simple Linear Regression
A. A. Salama
2014-09-01
Since the world is full of indeterminacy, the neutrosophics have found their place in contemporary research. The fundamental concepts of the neutrosophic set were introduced by Smarandache. Recently, Salama et al. introduced the concept of the correlation coefficient of neutrosophic data. In this paper, we introduce and study the concepts of correlation and the correlation coefficient of neutrosophic data in probability spaces and study some of their properties. We also introduce and study the neutrosophic simple linear regression model. Possible applications to data processing are touched upon.
Paraneoplastic pemphigus regression after thymoma resection
Stergiou Eleni
2008-08-01
Background: Among human neoplasms, thymomas are associated with the highest frequency of paraneoplastic autoimmune diseases. Case presentation: A case of a 42-year-old woman with paraneoplastic pemphigus as the first manifestation of thymoma is reported. Transsternal complete thymoma resection achieved pemphigus regression. The clinical correlations between pemphigus and thymoma are presented. Conclusion: Our case report provides further evidence for the important role of autoantibodies in the pathogenesis of paraneoplastic skin diseases in thymoma patients. It also documents the improvement of the associated pemphigus after radical treatment of the thymoma.
Spectral density regression for bivariate extremes
Castro Camilo, Daniela
2016-05-11
We introduce a density regression model for the spectral density of a bivariate extreme value distribution, that allows us to assess how extremal dependence can change over a covariate. Inference is performed through a double kernel estimator, which can be seen as an extension of the Nadaraya–Watson estimator where the usual scalar responses are replaced by mean constrained densities on the unit interval. Numerical experiments with the methods illustrate their resilience in a variety of contexts of practical interest. An extreme temperature dataset is used to illustrate our methods. © 2016 Springer-Verlag Berlin Heidelberg
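The scalar-response Nadaraya-Watson estimator that the double kernel method generalizes can be sketched in a few lines (synthetic data, Gaussian kernel, fixed illustrative bandwidth):

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic covariate/response pairs; in the paper the scalar responses are
# replaced by mean-constrained densities on the unit interval.
x = rng.uniform(0, 1, 300)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=300)

def nw(x0, h=0.05):
    """Nadaraya-Watson estimate at x0: kernel-weighted local average."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)      # Gaussian kernel weights
    return np.sum(w * y) / np.sum(w)

print(round(nw(0.25), 3))   # close to sin(pi/2) = 1
```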
Stability Analysis for Regularized Least Squares Regression
Rudin, Cynthia
2005-01-01
We discuss stability for a class of learning algorithms with respect to noisy labels. The algorithms we consider are for regression, and they involve the minimization of regularized risk functionals such as L(f) := (1/N) sum_i (f(x_i) - y_i)^2 + lambda ||f||_H^2. We shall call the algorithm 'stable' if, when y_i is a noisy version of f*(x_i) for some function f* in H, the output of the algorithm converges to f* as the regularization term and noise simultaneously vanish. We consider two flavors of...
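For a linear kernel, the minimizer of the quoted functional is ordinary ridge regression, which makes the stability claim easy to probe numerically; the data and the (lambda, noise) schedule below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

# Linear-kernel case: minimizing (1/N) sum (w.x_i - y_i)^2 + lambda ||w||^2
# has the closed-form ridge solution below. Data are synthetic.
N, p = 200, 5
X = rng.normal(size=(N, p))
w_star = rng.normal(size=p)          # plays the role of f* in the abstract

def ridge(lam, noise):
    y = X @ w_star + noise * rng.normal(size=N)
    # w = (X'X/N + lam I)^(-1) (X'y/N)
    return np.linalg.solve(X.T @ X / N + lam * np.eye(p), X.T @ y / N)

# As regularization and label noise vanish together, the estimate
# approaches w_star: the stability notion discussed in the abstract.
errs = [np.linalg.norm(ridge(lam, noise) - w_star)
        for lam, noise in [(1.0, 1.0), (0.1, 0.3), (0.001, 0.01)]]
print([round(e, 4) for e in errs])
```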
Parametric Regression Models Using Reversed Hazard Rates
Asokan Mulayath Variyath
2014-01-01
Proportional hazard regression models are widely used in survival analysis to understand and exploit the relationship between survival time and covariates. For left-censored survival times, reversed hazard rate functions are more appropriate. In this paper, we develop a parametric proportional reversed hazard rates model using an inverted Weibull distribution. The estimation and construction of confidence intervals for the parameters are discussed. We assess the performance of the proposed procedure with a large number of Monte Carlo simulations. We illustrate the proposed method using a real case example.
Bayesian regression of piecewise homogeneous Poisson processes
Diego Sevilla
2015-12-01
In this paper, a Bayesian method for piecewise regression is adapted to handle count data distributed as Poisson. A numerical code in Mathematica is developed and tested by analysing simulated data. The resulting method is valuable for detecting break points in the count rate of time series for Poisson processes. Received: 2 November 2015; Accepted: 27 November 2015; Edited by: R. Dickman; Reviewed by: M. Hutter, Australian National University, Canberra, Australia; DOI: http://dx.doi.org/10.4279/PIP.070018; Cite as: D J R Sevilla, Papers in Physics 7, 070018 (2015).
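A frequentist cousin of this method, a maximum likelihood scan for a single break point in a Poisson count rate, can be sketched as follows; the Bayesian version would place priors on the rates and the break position, and the data here are simulated:

```python
import numpy as np

rng = np.random.default_rng(8)

# Simulated counts with a rate change from 2.0 to 6.0 at index 60.
counts = np.concatenate([rng.poisson(2.0, 60), rng.poisson(6.0, 40)])
n = len(counts)

def loglik(seg):
    """Poisson log-likelihood of a segment at its MLE rate (up to constants)."""
    lam = seg.mean()
    return np.sum(seg * np.log(lam) - lam) if lam > 0 else -np.inf

# Profile log-likelihood over candidate break points (short edges excluded).
scores = [loglik(counts[:k]) + loglik(counts[k:]) for k in range(5, n - 5)]
k_hat = int(np.argmax(scores)) + 5
print(k_hat)    # estimated break point; the true break is at 60
```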