1

Importance of Diagnostics in Multiple Regression Analysis

Directory of Open Access Journals (Sweden)

The aim of this study was to obtain useful information from different diagnostics in Multiple Regression Analysis (MRA). The sample data set consisted of live weights at different periods (birth weight (X1) and live weights on the 30th (X2), 45th (X3), 60th (X4) and 75th (Y) days) of 18 Hamdani breed single-male lambs born in early March 2001. According to the MRA results, although the independent variables included in the model together explained approximately 92% of the variation in the dependent variable Y, only X4 had a significant effect on Y (p<0.01). Residual analysis indicated that the assumptions of normality and homogeneity of the error terms were met. Because the Durbin-Watson statistic equaled 2.31, there was no serial correlation among the error terms; that is, the assumption that the error terms are independent of each other was satisfied. According to the leverage and influence diagnostics calculated for the observations in the sample data set, only two observations (the 2nd and 16th), which were both outliers and potentially influential observations, should be examined carefully. It was concluded that diagnostics are important statistics for researchers because they indicate whether the basic assumptions required for a reliable MRA are met and give information about the data set and goodness of fit.

E. Eyduran; T. Ozdemir; E. Alarslan

2005-01-01
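
The two kinds of diagnostics named in the abstract above (the Durbin-Watson statistic and leverage) can be sketched in a few lines. This is a minimal illustration, not the authors' computation: the function names and the sample numbers are hypothetical, and the leverage formula shown is the textbook one for a straight-line fit with a single predictor, whereas the study used four predictors.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def leverages(x):
    """Hat-matrix diagonal for a straight-line fit with intercept:
    h_i = 1/n + (x_i - mean)^2 / Sxx."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((xi - mean) ** 2 for xi in x)
    return [1 / n + (xi - mean) ** 2 / sxx for xi in x]


# Hypothetical values for illustration only, not the lamb data.
example_residuals = [1.0, -1.0, 1.0, -1.0]
example_x = [1.0, 2.0, 3.0, 4.0, 10.0]

print(durbin_watson(example_residuals))
print(leverages(example_x))
```

A Durbin-Watson value near 2 is usually read as no first-order serial correlation, which is how the abstract interprets 2.31. The leverages of a fit always sum to the number of fitted parameters (2 here), so a single value close to 1 marks an observation that dominates the fit.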

2

Multiple Decision Procedures in Analysis of Variance and Regression Analysis.

We consider testing of the homogeneity hypothesis in the one-way ANOVA model and testing for the significance of regression in the multiple linear regression model. Unlike in the classical approach, there is no alternative hypothesis to accept when the nu...

S. S. Gupta; D. Y. Huang; S. Panchapakesan

1996-01-01

3

MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM

Directory of Open Access Journals (Sweden)

This paper analyzes the relationship between GDP in the hotels and restaurants sector (the dependent variable) and the following independent variables: overnight stays in tourist reception establishments, arrivals in tourist reception establishments, and investments in the hotels and restaurants sector, over the period 1995-2007. Using multiple regression analysis, I found that investments and tourist arrivals are significant predictors of GDP. Based on these results, I identified the components of the marketing mix that, in my opinion, require investment and could contribute to the positive development of tourist arrivals in tourist reception establishments.

Erika KULCSÁR

2009-01-01

4

Pulsars and interstellar medium - multiple regression analysis of related parameters

Energy Technology Data Exchange (ETDEWEB)

The relationship between pulsars and the interstellar-medium electron density (IED) is investigated by performing multiple stepwise regression analysis on the parameters for which linear correlations with the galactic continuum background temperature at 408 MHz (T408) have been established in 325 pulsars by Fracassini et al. (1983) (dispersion measure, radio luminosity, heliocentric distance, and galactocentric radius). An empirical relation is derived, the results are presented in a table and graph, and the pulsars are classified as peculiar, normal, or standard on the basis of their O-C values. Standard pulsars are shown to have actual T408 values equal to those calculated statistically and to confirm theoretically based T408-IED relationships, whereas normal pulsars are indicators of regions in which the observed and calculated T408 are in agreement only on average, and peculiar pulsars are associated with regions in which T408 is significantly above or below the average and IED is depleted (by accretion phenomena) or increased (by H II regions or star formation in front of the pulsar). 30 references.

Antonello, E.; Fracassini, M.

1985-01-01

5

Quantitative electron microscope autoradiography: application of multiple linear regression analysis

Energy Technology Data Exchange (ETDEWEB)

A new method for the analysis of high resolution EM autoradiographs is described. It identifies labelled cell organelle profiles in sections on a strictly statistical basis and provides accurate estimates for their radioactivity without the need to make any assumptions about their size, shape and spatial arrangement.

Markov, D.V.

1986-10-01

6

Modeling the energy content of municipal solid waste using multiple regression analysis

Energy Technology Data Exchange (ETDEWEB)

In this research multiple regression analysis was used to develop predictive models of the energy content of municipal solid waste (MSW). The scope of work included collecting waste samples in Kaohsiung City, Taiwan, characterizing the waste, and performing a stepwise forward selection procedure for isolating variables. Two regression models were developed to correlate the energy content with variables derived from physical composition and ultimate analysis. The performance of these models for this particular waste was superior to that of equations developed by other researchers (e.g., Dulong, Steuer) for estimating energy content. Attempts at developing regression models from proximate analysis data were not successful. 6 refs., 8 figs., 2 tabs.

Liu, J.I. [Kaohsiung Department of Environmental Protection (Taiwan, Province of China)]; Paode, R.D.; Holsen, T.M. [Illinois Inst. of Technology, Chicago, IL (United States)]

1996-07-01

7

Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…

Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.

2006-01-01

8

Energy Technology Data Exchange (ETDEWEB)

In this paper, multiple nonlinear regression models for estimating the higher heating value (HHV) of coals are developed using proximate analysis data obtained mainly from low rank coal samples on an as-received basis. In this modeling study, three main model structures are first categorized according to the number of proximate analysis parameters used as independent variables (moisture, ash, volatile matter and fixed carbon). Second, sub-model structures with different arrangements of the independent variables are considered. Each sub-model structure is analyzed with a number of model equations in order to find the best-fitting model using the multiple nonlinear regression method. Based on the results of the nonlinear regression analysis, the best model for each sub-structure is determined. Among them, the models giving the highest correlation for the three main structures are selected. Although all three selected models predict HHV rather accurately, the model involving four independent variables provides the most accurate estimate of HHV. Additionally, when the chosen model with four independent variables and a literature model are tested with additional proximate analysis data, the model developed in this study gives more accurate predictions of the HHV of coals. It can be concluded that the developed model is an effective tool for HHV estimation of low rank coals. (author)

Akkaya, Ali Volkan [Department of Mechanical Engineering, Yildiz Technical University, 34349 Besiktas, Istanbul (Turkey)]

2009-02-15

9

REVAAM Model to determine a company's value by multiple valuation and linear regression analysis

Directory of Open Access Journals (Sweden)

This paper presents an alternative to the widely used method of multiple valuation (or relative valuation) for calculating the value of a company using the Price Earnings (PE) multiple and/or the Enterprise Value to Earnings Before Interest, Taxes, Depreciation and Amortization (EV/EBITDA) multiple. When calculating multiples, analysts tend to take average multiples within an industry and apply them directly to the target company; however, we believe that this practice ignores differences among the companies being compared, even though they belong to the same sector or industry. The REVAAM Model uses linear regression to calculate adjusted PE and EV/EBITDA multiples, taking into consideration profitability factors for each multiple in order to differentiate the companies in the samples. Calculations are based on public data for US companies, but could be extended to other markets. Not only does the REVAAM Model provide a better estimate for relative valuation analysis than simply using average multiples, but it can also be used to compare under- and overvalued companies or sectors and to analyze changes in multiples over time as the intrinsic fundamentals change.

Luis G. Acosta-Calzado; Carlos Acosta-Calzado; Humberto Murrieta-Romo

2010-01-01

10

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

Directory of Open Access Journals (Sweden)

The growing volume of data creates a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, namely factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two techniques is compared on the following parameters: mean square error (MSE), R-square, adjusted R-square, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters were computed using tools such as SPSS, XLstat, Stata, and MS-Excel. For all the given datasets, factor analysis outperforms multiple linear regression. However, the absolute value of prediction accuracy varied between the three datasets, indicating that the data distribution and data characteristics play a major role in choosing the correct prediction technique.

Abhishek Taneja; R.K. Chauhan

2011-01-01

11

The book provides complete coverage of the classical methods of statistical analysis. It is designed to give students an understanding of the purpose of statistical analyses, to allow the student to determine, at least to some degree, the correct type of statistical analyses to be performed in a given situation, and have some appreciation of what constitutes good experimental design

Freund, Rudolf J; Sa, Ping

2006-01-01

12

UK PubMed Central (United Kingdom)

The purpose of this study was to quantitatively evaluate Akahori's preoperative classification of cubital tunnel syndrome. We analyzed the results for 57 elbows that were treated by a simple decompression procedure from 1997 to 2004. The relationship between each item of Akahori's preoperative classification and clinical stage was investigated based on the parameter distribution. We evaluated Akahori's classification system using multiple regression analysis, and investigated the association between the stage and treatment results. The usefulness of the regression equation was evaluated by analysis of variance of the expected and observed scores. In the parameter distribution, each item of Akahori's classification was mostly associated with the stage, but it was difficult to judge the severity of palsy. In the mathematical evaluation, the most effective item in determining the stage was sensory conduction velocity. It was demonstrated that the established regression equation was highly reliable (R = 0.922). Akahori's preoperative classification can also be used in postoperative classification, and this classification was correlated with postoperative prognosis. Our results indicate that Akahori's preoperative classification is a suitable system. It is reliable, reproducible and well-correlated with the postoperative prognosis. In addition, the established prediction formula is useful to reduce the diagnostic complexity of Akahori's classification.

Watanabe M; Arita S; Hashizume H; Honda M; Nishida K; Ozaki T

2013-01-01

13

The composting process of food wastes and tree cuttings was examined for four composting types composed from two kinds of systems and an added mixture of microorganisms. The time courses of 32 parameters in each composting type were observed. The static aerated reactor system was found to be more efficient than the turning pile system. Using multiple regression analysis of all the data (159 samples) obtained in this study, some parameters were selected to predict the germination index (GI) value, which was adopted as a marker of compost maturity. For example, using the regression model generated from pH, NH(4)(+) concentration, acid phosphatase activity, and esterase activity of water extracts of the compost, the GI value was expressed by a multi-linear regression equation (p<0.0001). High correlations between the measured and predicted GI values were obtained for each type of compost. These observations suggest that compost maturity might be predicted simply by sensing the water extract at the composting site, without any need for large equipment or special skill, and that this prediction system could contribute to the production of stable compost in widespread use for the recycling market. PMID:16289625

Chikae, Miyuki; Ikeda, Ryuzoh; Kerman, Kagan; Morita, Yasutaka; Tamiya, Eiichi

2005-11-09

14

UK PubMed Central (United Kingdom)

The composting process of food wastes and tree cuttings was examined for four composting types composed from two kinds of systems and an added mixture of microorganisms. The time courses of 32 parameters in each composting type were observed. The static aerated reactor system was found to be more efficient than the turning pile system. Using multiple regression analysis of all the data (159 samples) obtained in this study, some parameters were selected to predict the germination index (GI) value, which was adopted as a marker of compost maturity. For example, using the regression model generated from pH, NH(4)(+) concentration, acid phosphatase activity, and esterase activity of water extracts of the compost, the GI value was expressed by a multi-linear regression equation (p<0.0001). High correlations between the measured and predicted GI values were obtained for each type of compost. These observations suggest that compost maturity might be predicted simply by sensing the water extract at the composting site, without any need for large equipment or special skill, and that this prediction system could contribute to the production of stable compost in widespread use for the recycling market.

Chikae M; Ikeda R; Kerman K; Morita Y; Tamiya E

2006-11-01

15

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

The growing volume of data creates a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, namely factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two techniques is compared on the following parameters: mean square error (MSE), R-square, adjusted R-square, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters were computed using tools such as SPSS, XLstat, Stata, and MS-Excel. For all the given datasets, factor analysis outperforms multiple linear re...

Taneja, Abhishek

2011-01-01

16

UK PubMed Central (United Kingdom)

Polycyclic aromatic hydrocarbons (PAHs) are contaminants that reside mainly in surface soils. Dietary intake of plant-based foods can make a major contribution to total PAH exposure. Little information is available on the relationship between root morphology and plant uptake of PAHs. An understanding of plant root morphologic and compositional factors that affect root uptake of contaminants is important and can inform both agricultural (chemical contamination of crops) and engineering (phytoremediation) applications. Five crop plant species are grown hydroponically in solutions containing the PAH phenanthrene. Measurements are taken for 1) phenanthrene uptake, 2) root morphology--specific surface area, volume, surface area, tip number and total root length and 3) root tissue composition--water, lipid, protein and carbohydrate content. These factors are compared through Pearson's correlation and multiple linear regression analysis. The major factors which promote phenanthrene uptake are specific surface area and lipid content.

Zhan X; Liang X; Xu G; Zhou L

2013-08-01

17

A Multiple Regression Analysis on Influencing Factors of Urban Services Growth in China

Directory of Open Access Journals (Sweden)

The indicator of urban success is the success of its urban services. Although much research on services has been done, there is a major gap with regard to regional services, especially urban services within a country. There is little research on the factors influencing urban services and their effect on regional growth. In response, the government intends to accelerate the development of urban services and the regional economy in the present Twelfth Five-Year Plan (2011-2015). Thus, the main purpose of this paper is to investigate the factors that influence urban services growth from the demand, supply, institutional environment and spatial agglomeration sides. Using cross-section multiple regression analysis, the study examines the factors influencing urban services growth in China. The model indicated that, except for urbanization and division of labor, the independent variables have contributed positively to urban services growth in China.

Yuan Gao; ABDUL Razak bin Chik

2013-01-01

18

PUMA: a unified framework for penalized multiple regression analysis of GWAS data.

UK PubMed Central (United Kingdom)

Penalized Multiple Regression (PMR) can be used to discover novel disease associations in GWAS datasets. In practice, proposed PMR methods have not been able to identify well-supported associations in GWAS that are undetectable by standard association tests and thus these methods are not widely applied. Here, we present a combined algorithmic and heuristic framework for PUMA (Penalized Unified Multiple-locus Association) analysis that solves the problems of previously proposed methods including computational speed, poor performance on genome-scale simulated data, and identification of too many associations for real data to be biologically plausible. The framework includes a new minorize-maximization (MM) algorithm for generalized linear models (GLM) combined with heuristic model selection and testing methods for identification of robust associations. The PUMA framework implements the penalized maximum likelihood penalties previously proposed for GWAS analysis (i.e. Lasso, Adaptive Lasso, NEG, MCP), as well as a penalty that has not been previously applied to GWAS (i.e. LOG). Using simulations that closely mirror real GWAS data, we show that our framework has high performance and reliably increases power to detect weak associations, while existing PMR methods can perform worse than single marker testing in overall performance. To demonstrate the empirical value of PUMA, we analyzed GWAS data for type 1 diabetes, Crohn's disease, and rheumatoid arthritis, three autoimmune diseases from the original Wellcome Trust Case Control Consortium. Our analysis replicates known associations for these diseases and we discover novel etiologically relevant susceptibility loci that are invisible to standard single marker tests, including six novel associations implicating genes involved in pancreatic function, insulin pathways and immune-cell function in type 1 diabetes; three novel associations implicating genes in pro- and anti-inflammatory pathways in Crohn's disease; and one novel association implicating a gene involved in apoptosis pathways in rheumatoid arthritis. We provide software for applying our PUMA analysis framework.

Hoffman GE; Logsdon BA; Mezey JG

2013-06-01

19

Using Multiple Regression to Interpret Chi-Square Contingency Table Analysis.

Statistics such as chi-square, phi, and Cramer's V are related to the R squared statistic of regression analysis. It is shown that the proportion of variance accounted for can be computed from many contingency table situations. (JKS)

Leitner, Dennis W.
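
The link between chi-square and R squared stated in the abstract above can be checked numerically in the simplest case. The sketch below assumes a 2x2 table, where phi squared (chi-square divided by N) equals the R squared from regressing one 0/1 variable on the other; the function names and cell counts are hypothetical and are not taken from the paper.

```python
from math import isclose


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def r_squared_binary(a, b, c, d):
    """R^2 from regressing the column indicator (y) on the row
    indicator (x), computed from the expanded 0/1 observations."""
    pairs = [(0, 0)] * a + [(0, 1)] * b + [(1, 0)] * c + [(1, 1)] * d
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy ** 2 / (sxx * syy)


# Hypothetical cell counts for illustration only.
a, b, c, d = 20, 10, 5, 15
n = a + b + c + d
phi_squared = chi_square_2x2(a, b, c, d) / n
print(phi_squared, r_squared_binary(a, b, c, d))
print(isclose(phi_squared, r_squared_binary(a, b, c, d)))
```

For larger tables the corresponding quantity is built from Cramer's V rather than phi; that case is not covered by this sketch.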

20

Regression analysis of multiple-source longitudinal outcomes: a "Stirling County" depression study.

UK PubMed Central (United Kingdom)

Epidemiologic studies of psychiatric disorders have increasingly relied on multiple sources of information to improve the validity of diagnoses and repeated assessments over time to provide a longitudinal perspective. In this paper, the authors present a general multivariate logistic regression method for the simultaneous analysis of discrete outcomes that exhibit such features. This approach permits risk factor and agreement analyses within a unified framework and appropriately uses data from subjects who may be missing some outcomes. The authors use this approach to analyze data from a "Stirling County" study of depression. During a 3- to 4-year period in the early 1990s, 631 subjects were assessed in two separate interviews, on each occasion with two diagnostic schedules (the DePression and AnXiety schedule (DPAX) and the Diagnostic Interview Schedule (DIS)). The female:male ratio of depression was found to be different for the DPAX and the DIS (0.8 and 1.6, respectively). Education was inversely associated with depression, while the effects of time, the subject's age, and the interviewer's sex were essentially null. With respect to the outcomes' association, agreement between the DPAX and the DIS was low. In addition, stability of the DPAX over time was significantly higher than that of the DIS. No covariates were found to affect significantly the association between outcomes.

Daskalakis C; Laird NM; Murphy JM

2002-01-01

21

Energy Technology Data Exchange (ETDEWEB)

Details of a study of measurements of pneumoconioses, emphysema and chronic bronchitis are provided from post-mortem examinations of 186 deceased coal workers. Clinical, radiological and physiological measurements of respiratory disease are provided for the five years before death in each case. These data have been analyzed by multiple regression analysis and the major findings are presented. (16 refs.)

Leigh, J.; Outhred, K.G.; McKenzie, H.I.; Wiles, A.N.

1982-01-01

22

Regression analysis by example

Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association. Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

Chatterjee, Samprit

2012-01-01

23

UK PubMed Central (United Kingdom)

How complex tactile sensations are encoded by populations of afferent mechanoreceptors is currently not well understood. While much is known about how individual afferents respond to prescribed stimuli, their behavior as a population distributed across the fingertip has not been well described. In this study, tactile afferent mechanoreceptors in monkey fingertips were mechanically stimulated, using a flat disc shaped probe, with several magnitudes of normal force (1.8, 2.2 and 2.5 N) and torque (2.0 and 3.5 mNm), in clockwise and anticlockwise directions. Afferent nerve responses were acquired from 58 slowly-adapting (SA) type-I and 25 fast-adapting (FA) type-I isolated single cutaneous mechanoreceptive afferents, recorded from the median nerve. At 10 ms time intervals after the application of torque begins, a multiple regression model was trained and evaluated to estimate the magnitude of the applied normal force and torque. Averaged results over the 200 ms period after the torque reaches its maximum indicate that SA-I and FA-I afferents can both estimate the applied torque value. FA-I afferents gave the lowest estimation error mean and standard deviation of -0.051 ± 0.334 mNm for a target torque of 2.0 mNm, and 0.003 ± 0.414 mNm for a target torque of 3.5 mNm. However, while SA-I afferents could estimate normal force well, there was no significant difference (ANOVA, p=0.173) in the FA-I estimates of normal force, as this force had already been held constant for one second before the torque loading phase under analysis began.

Fu J; Birznieks I; Goodwin AW; Khamis H; Redmond SJ

2012-01-01

24

Error analysis of dimensionless scaling experiments with multiple points using linear regression

International Nuclear Information System (INIS)

A general method of error estimation in the case of multiple point dimensionless scaling experiments, using linear regression and standard error propagation, is proposed. The method reduces to the previous result of Cordey (2009 Nucl. Fusion 49 052001) in the case of a two-point scan. On the other hand, if the points follow a linear trend, it explains how the estimated error decreases as more points are added to the scan. Based on the analytical expression that is derived, it is argued that for a low number of points, adding points to the ends of the scanned range, rather than the middle, results in a smaller error estimate. (letter)

2010-01-01
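
The claim in the abstract above, that extra points placed at the ends of the scanned range reduce the error more than points placed in the middle, can be illustrated with the textbook standard error of a fitted slope, sigma / sqrt(Sxx). This is a sketch under that assumption only; it is not the analytical expression derived in the letter, and the function name and scan positions are hypothetical.

```python
from math import sqrt


def slope_standard_error(xs, sigma=1.0):
    """Standard error of the least-squares slope when every point has
    the same measurement error sigma: sigma / sqrt(Sxx)."""
    mean = sum(xs) / len(xs)
    sxx = sum((x - mean) ** 2 for x in xs)
    return sigma / sqrt(sxx)


# Hypothetical scans over the range [0, 1].
two_point = [0.0, 1.0]
extra_at_ends = [0.0, 0.0, 1.0, 1.0]
extra_in_middle = [0.0, 0.5, 0.5, 1.0]

print(slope_standard_error(two_point))
print(slope_standard_error(extra_at_ends))
print(slope_standard_error(extra_in_middle))
```

Points added at the mean of the scan contribute nothing to Sxx, so in this simplified picture they leave the slope error unchanged, while points at the ends contribute the most.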

25

We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the regression equation, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental. PMID:21287104

Lorenzo-Seva, Urbano; Ferrando, Pere J

2011-03-01

26

UK PubMed Central (United Kingdom)

We provide an SPSS program that implements currently recommended techniques and recent developments for selecting variables in multiple linear regression analysis via the relative importance of predictors. The approach consists of: (1) optimally splitting the data for cross-validation, (2) selecting the final set of predictors to be retained in the regression equation, and (3) assessing the behavior of the chosen model using standard indices and procedures. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.

Lorenzo-Seva U; Ferrando PJ

2011-03-01

27

Relating turbulence to wind turbine blade loads: Parametric study with multiple regression analysis

Energy Technology Data Exchange (ETDEWEB)

Different wind parameters are studied to find a set that is most useful in estimating fatigue loads on wind turbine blades. The histograms of rainflow counted stress ranges are summarized through their first three statistical moments and regression analysis is used to estimate these moments in various wind conditions. A systematic method of comparing the ability of different wind parameters to estimate the moments is described and results are shown for flapwise loads on three HAWTs. In the case of two of these turbines, the stress ranges are shown to be highly correlated with a turbulence measure obtained by removing a portion of the low-frequency content of the wind.

Kashef, T.; Winterstein, S.R.

1999-08-01

28

Directory of Open Access Journals (Sweden)

This article presents the possibility of using multiple regression analysis (MRA) and dynamic neural networks (DNN) for predicting the stability of Hydrocortisone 100 mg (in the form of hydrocortisone sodium succinate) freeze-dried powder for injection packed into a dual chamber container. Degradation products of hydrocortisone sodium succinate, namely free hydrocortisone and related substances (impurities A, B, C, D and E; unspecified impurities and total impurities), were followed during stress and formal stability studies. All data obtained during the stability studies were used for in silico modeling with both multiple regression models and dynamic neural networks, in order to compare predicted and observed results. High values of the coefficient of determination (0.95-0.99) were obtained using MRA and DNN, so both methods are powerful tools for in silico stability studies, but the superiority of DNN over mathematical modeling of degradation was also confirmed.

Solomun Ljiljana N.; Ibrić Svetlana R.; Pejanović Vjera M.; Đuriš Jelena D.; Jocković Jelena M.; Stankovic Predrag D.; Vujić Zorica B.

2012-01-01

29

Multiple Regressions in Analysing House Price Variations

Directory of Open Access Journals (Sweden)

The application of rigorous statistical analysis in aiding investment decision making is gaining momentum in the United States of America as well as the United Kingdom. In Malaysia, however, the response from local academics has been rather slow, and slower still among practitioners. This paper illustrates how Multiple Regression Analysis (MRA) and its extension, Hedonic Regression Analysis, have been used to explain price variation for selected houses in Malaysia. Each attribute theoretically identified as a price determinant is priced, and the perceived contribution of each is explicitly shown. The paper demonstrates how statistical analysis is capable of analyzing property investment by considering multiple determinants. This more rigorous consideration of various characteristics enables better investment decision making.

Aminah Md Yusof; Syuhaida Ismail

2012-01-01

30

Introduction to regression analysis

This book is an introduction to regression analysis for upper division and graduate students in science, engineering, social science and medicine. The emphasis is on the classical linear model using least squares estimation and inference. In addition, topics of current interest, such as regression diagnostics, ridge and logistic regression are treated as well. In contrast to other books at this level, the theoretical foundation of the subject is presented in some detail based on extensive use of matrix algebra. Throughout the text model building and evaluation are emphasised and illustrated wi

GOLBERG, M

2003-01-01

31

Bayesian logistic regression analysis

In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuisance parameters, the Jacobian transformation is an essential added ingredient. The application of the product rule gives the posterior of the unknown logistic regression coefficients. The Jacobian transformation then maps the posterior of these regression coefficients to the posterior of the corresponding probability of some event and some nuisance parameters. Finally, by way of the sum rule, the nuisance parameters are integrated out.

van Erp, N.; van Gelder, P.

2013-08-01
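
The role of the Jacobian in the record above can be seen in the simplest one-dimensional case. The sketch below is a standard change-of-variables result, not the paper's derivation; the symbols $\eta$ (linear predictor), $p$ (event probability) and $\pi$ (posterior density) are notation assumed here for illustration.

```latex
% Logistic link between the linear predictor \eta and the event probability p
p = \frac{1}{1 + e^{-\eta}}
\quad\Longleftrightarrow\quad
\eta = \log\frac{p}{1-p},
\qquad
\frac{d\eta}{dp} = \frac{1}{p\,(1-p)} .

% Change of variables: posterior of p obtained from the posterior of \eta
\pi_{p}(p \mid \text{data})
  = \pi_{\eta}\!\left(\log\frac{p}{1-p} \,\Big|\, \text{data}\right)
    \left|\frac{d\eta}{dp}\right|
  = \frac{\pi_{\eta}\!\left(\log\frac{p}{1-p} \mid \text{data}\right)}{p\,(1-p)},
\qquad 0 < p < 1 .
```

Without the factor $1/(p(1-p))$ the transformed density would not integrate to one, which is why the transformation cannot be skipped.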

32

Diagnostics for multiple regression problems

Energy Technology Data Exchange (ETDEWEB)

In the last 10 to 15 years there has been much work done in trying to improve linear regression results. Individuals have analyzed the susceptibility of least-squares results to values far removed from the center of the independent variable observations. They have studied the problem of heavy-tailed residuals, and they have studied the problem of collinearity. From these studies have come ridge regression techniques, robust regression techniques, regression on principal components, etc. However, many practitioners view these methods with suspicion (and ignorance), and prefer to continue using the usual least-squares procedures to fit their models, even though their results might not be answering the question they think. In reaction to this, statisticians are spending more time analyzing how the individual observations affect the least squares results. In the last few years approximately 10 papers and one text have appeared that address the problem of how to study the influence of the individual observations. This report is a study of the recent work done in linear regression diagnostics. It is concerned with analyzing the effect of one case at a time, since the methods to analyze this situation are relatively straight-forward and are not prohibitive computationally.

Daly, J.C.

1982-03-01
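
To make the "one case at a time" idea in the record above concrete, the following sketch computes Cook's distance directly from its definition: refit the model with each observation deleted and measure how far the fitted values move. It assumes a single predictor for brevity; the data and function names are hypothetical.

```python
# Case-deletion influence sketch for a straight-line fit (one predictor).
# Data and function names are hypothetical illustrations.

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def cooks_distance_by_deletion(x, y):
    """Cook's D_i = sum_j (yhat_j - yhat_j(i))^2 / (p * s^2), with p = 2."""
    n, p = len(x), 2
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    s2 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - p)
    distances = []
    for i in range(n):
        # Refit without case i, then predict at all n original x values.
        a0, a1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        shift = sum((fi - (a0 + a1 * xj)) ** 2 for fi, xj in zip(fitted, x))
        distances.append(shift / (p * s2))
    return distances

# Seven points close to y = x plus one remote point that is off the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.1, 12.0]

cooks = cooks_distance_by_deletion(x, y)
most_influential = max(range(len(cooks)), key=lambda i: cooks[i])
```

Explicit refitting costs n extra fits but shows exactly what the closed-form diagnostics summarize; in this toy data the remote last point dominates the fit.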

33

Concise, mathematically clear, and comprehensive treatment of the subject. Expanded coverage of diagnostics and methods of model fitting. Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models. More than 200 problems throughout the book plus outline solutions for the exercises. This revision has been extensively class-tested.

Seber, George A F

2012-01-01

34

Commonality Analysis: A Method for Decomposing Explained Variance in Multiple Regression Analyses.

Offers a brief explication of commonality analysis; a step-by-step discussion of how communication researchers may perform commonality analyses using output from a computer-assisted statistical analysis program; and provides an extended example illustrating a commonality analysis. (JMF)

McPhee, Robert D.; Seibold, David R.

1979-01-01

35

Directory of Open Access Journals (Sweden)

In this article the authors show the influence of technological parameters and modification treatment on the structural properties of closed skeleton castings. The approach aimed at maximal refinement of the structure and minimal structural diversification. Skeleton castings were manufactured in accordance with the elaborated production technology. Experimental castings were manufactured under variable technological conditions: pouring temperature 953 to 1013 K, mould temperature 293 to 373 K, and height of the gating system above casting level 105 to 175 mm. Analysis of metallographic specimens, quantitative analysis of silicon crystals, and analysis of the secondary dendrite-arm spacing of solution α were performed. Average values of stereological parameters for all castings were determined, as were the (B/L) and (P/A) factors. On the basis of the microstructural analysis the authors compare the samples. The aim of the analysis was to select the samples with the least diversification of the refinement degree of the structure and the least silicon crystals. On the basis of the microstructural analysis the authors state that sample 5 (AlSi11, Tpour 1013 K, Tmould 333 K, h = 265 mm) has the best structural properties (least diversification of refinement degree of structure and the least refinement of silicon crystals). A statistical analysis of the results of the structural analysis was then carried out. On the basis of the statistical analysis the authors state that the best structural properties are obtained for the technological parameters Tpour = 1013 K, Tmould = 373 K and h = 230 mm [4]. The results of statistical analysis are the prerequisite for optimization studies.

M. Cholewa; M. Dziuba-Kałuża

2011-01-01

36

Analysis of Experimental Data via Poisson Regression.

This paper provides an exposition on simple and complex comparisons within the framework of Poisson regression. Poisson regression is well suited for the analysis of event count outcomes. Since simple and complex comparisons with both analysis of variance (ANOVA) and multiple linear regression (MLR) are common, the similarities between Poisson…

Griffin, Bryan W.

37

This paper analyzes the relationship between ultraviolet erythemal radiation (UVER) measured in Badajoz (Spain) and ozone, cloudiness and aerosols. Initially, the values of transmissivity of UVER are related with three parameters (ozone amount, reflectivity and aerosol index) estimated by the satellite instrument TOMS. The relative importance and dependence of each variable is analyzed by means of a multiple regression analysis with an expression derived from the Lambert-Bouguer-Beer law. The results indicate that the aerosol index is not a statistically significant factor for the initial expression. Then, a partial model with only ozone and reflectivity as regressors is proposed and coefficients are obtained using UVER measurements of the year 2001. Finally the model is validated by comparing its prediction for 2002 with UVER measurements at ground level. The agreement between both data sets is reasonably good, suggesting that UVER estimations can be successfully derived from observations of other atmospheric variables, thus providing the basis to obtain spatially distributed maps of UV variations.

Antón, M.; Cancillo, M. L.; Serrano, A.; García, J. A.

2005-01-01

38

UK PubMed Central (United Kingdom)

Alang-Sosiya is the largest ship-scrapping yard in the world, established in 1982. Every year an average of 171 ships, with a mean weight of 2.10 × 10^6 (± 7.82 × 10^5) light dead weight tonnage (LDT), are scrapped. Apart from scrapped metals, this yard generates a massive amount of combustible solid waste in the form of waste wood, plastic, insulation material, paper, glass wool, thermocol pieces (polyurethane foam material), sponge, oiled rope, cotton waste, rubber, etc. In this study multiple regression analysis was used to develop predictive models for the energy content of combustible ship-scrapping solid wastes. The scope of work comprised qualitative and quantitative estimation of solid waste samples and performing a sequential selection procedure for isolating variables. Three regression models were developed to correlate the energy content (net calorific values (LHV)) with variables derived from material composition, proximate and ultimate analyses. The performance of these models for this particular waste agrees well with the equations developed by other researchers (Dulong, Steuer, Scheurer-Kestner and Bento) for estimating the energy content of municipal solid waste.

Reddy MS; Basha S; Joshi HV; Sravan Kumar VG; Jha B; Ghosh PK

2005-01-01

39

Directory of Open Access Journals (Sweden)

After much exertion and care to run an experiment in social science, the analysis of data should not be ruined by an improper analysis. Often, classical methods, like the mean, the usual simple and multiple linear regressions, and the ANOVA require normality and absence of outliers, which rarely occurs in data coming from experiments. To mitigate this problem, researchers often use some ad-hoc methods like the detection and deletion of outliers. In this tutorial, we will show the shortcomings of such an approach. In particular, we will show that outliers can sometimes be very difficult to detect and that the full inferential procedure is somewhat distorted by such a procedure. A more appropriate and modern approach is to use a robust procedure that provides estimation, inference and testing that are not influenced by outlying observations but describes correctly the structure for the bulk of the data. It can also give a diagnostic of the distance of any point or subject relative to the central tendency. Robust procedures can also be viewed as methods to check the appropriateness of the classical methods. To provide a step-by-step tutorial, we present descriptive analyses that allow researchers to make an initial check on the conditions of application of the data. Next, we compare classical and robust alternatives to ANOVA and regression and discuss their advantages and disadvantages. Finally, we present indices and plots that are based on the residuals of the analysis and can be used to determine if the conditions of application of the analyses are respected. Examples on data from psychological research illustrate each of these points, and for each analysis and plot, R code is provided to allow the readers to apply the techniques presented throughout the article.

Delphine S. Courvoisier; Olivier Renaud

2010-01-01

40

Multiple Linear Regression Models in Outlier Detection

Directory of Open Access Journals (Sweden)

Identifying anomalous values in real-world databases is important both for improving the quality of original data and for reducing the impact of anomalous values in the process of knowledge discovery in databases. Such anomalous values give useful information to the data analyst in discovering useful patterns. Through isolation, these data may be separated and analyzed. The analysis of outliers and influential points is an important step of the regression diagnostics. In this paper, our aim is to detect the points which are very different from the other points. They do not seem to belong to a particular population and behave differently. If these influential points are removed, a different model will result. The distinction between these points is not always obvious and clear. Hence several indicators are used for identifying and analyzing outliers. Existing methods of outlier detection are based on manual inspection of graphically represented data. In this paper, we present a new approach to automating the process of detecting and isolating outliers. The impact of anomalous values on the dataset has been established by using two indicators, DFFITS and Cook's D. The process is based on modeling the human perception of exceptional values by using multiple linear regression analysis.

S.M.A.Khaleelur Rahman; M.Mohamed Sathik; K.Senthamarai Kannan

2012-01-01
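
One of the two indicators named in the record above, DFFITS, can be computed in closed form and compared against a cutoff, which is one simple way to automate flagging. The sketch below assumes a single predictor and the commonly quoted rule-of-thumb cutoff 2·sqrt(p/n); the paper's own procedure, data and thresholds are not reproduced here, and all names are hypothetical.

```python
import math

# DFFITS sketch for a straight-line fit (p = 2 parameters).
# Data, function names and the cutoff choice are illustrative assumptions.

def dffits_simple_regression(x, y):
    """DFFITS_i = t_i * sqrt(h_i / (1 - h_i)), t_i externally studentized."""
    n, p = len(x), 2
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)
    values = []
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - mx) ** 2 / sxx  # leverage of this case
        # Residual variance with this case deleted (no refit needed).
        # Note: this is zero if the remaining points fit perfectly.
        s2_deleted = (sse - e * e / (1.0 - h)) / (n - p - 1)
        t = e / math.sqrt(s2_deleted * (1.0 - h))
        values.append(t * math.sqrt(h / (1.0 - h)))
    return values

def flag_influential(x, y):
    """Indices whose |DFFITS| exceeds the rule-of-thumb 2 * sqrt(p / n)."""
    n, p = len(x), 2
    cutoff = 2.0 * math.sqrt(p / n)
    scores = dffits_simple_regression(x, y)
    return [i for i, d in enumerate(scores) if abs(d) > cutoff]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.1, 12.0]
flagged = flag_influential(x, y)
```

A fixed cutoff is only a screening device; flagged cases still deserve inspection before any decision to remove them.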

41

UK PubMed Central (United Kingdom)

The main objective of this work was to establish a mathematical function that correlates pesticide residue levels in apple juice with the levels of the pesticides applied on the raw fruit, taking into account some of their physicochemical properties such as water solubility, the octanol/water partition coefficient, the organic carbon partition coefficient, vapour pressure and density. A mixture of 12 pesticides was applied to an apple tree; apples were collected 10 days after application. After harvest, apples were treated with a mixture of three post-harvest pesticides and the fruits were then processed in order to obtain apple juice following a routine industrial process. The pesticide residue levels in the apple samples were analysed using two multi-residue methods based on LC-MS/MS and GC-MS/MS. The concentration of pesticides was determined in samples derived from the different steps of processing. The processing factors (the coefficient between residue level in the processed commodity and the residue level in the commodity to be processed) obtained for the full juicing process were found to vary among the different pesticides studied. In order to investigate the relationships between the levels of pesticide residue found in apple juice samples and their physicochemical properties, principal component analysis (PCA) was performed using two sets of samples (one of them using experimental data obtained in this work and the other including the data taken from the literature). In both cases a correlation was found between processing factors of pesticides in the apple juice and the negative logarithms (base 10) of the water solubility, octanol/water partition coefficient and organic carbon partition coefficient. The linear correlation between these physicochemical properties and the processing factor was established using a multiple linear regression technique.

Martin L; Mezcua M; Ferrer C; Gil Garcia MD; Malato O; Fernandez-Alba AR

2013-01-01

42

Regression Analysis A Constructive Critique

Regression Analysis: A Constructive Critique identifies a wide variety of problems with regression analysis as it is commonly used and then provides a number of ways in which practice could be improved. Regression is most useful for data reduction, leading to relatively simple but rich and precise descriptions of patterns in a data set. The emphasis on description provides readers with an insightful rethinking from the ground up of what regression analysis can do, so that readers can better match regression analysis with useful empirical questions and improved policy-related research. "An

Berk, Richard A

2003-01-01

43

Finite mixture models have come to play a very prominent role in modelling data. The finite mixture model is predicated on the assumption that distinct latent groups exist in the population. The finite mixture model therefore is based on a categorical latent variable that distinguishes the different groups. Often in practice distinct sub-populations do not actually exist. For example, disease severity (e.g. depression) may vary continuously and therefore, a distinction of diseased and not-diseased may not be based on the existence of distinct sub-populations. Thus, what is needed is a generalization of the finite mixture's discrete latent predictor to a continuous latent predictor. We cast the finite mixture model as a regression model with a latent Bernoulli predictor. A latent regression model is proposed by replacing the discrete Bernoulli predictor by a continuous latent predictor with a beta distribution. Motivation for the latent regression model arises from applications where distinct latent classes do not exist, but instead individuals vary according to a continuous latent variable. The shapes of the beta density are very flexible and can approximate the discrete Bernoulli distribution. Examples and a simulation are provided to illustrate the latent regression model. In particular, the latent regression model is used to model placebo effect among drug treated subjects in a depression study. PMID:20625443

Tarpey, Thaddeus; Petkova, Eva

2010-07-01

44

A Dirty Model for Multiple Sparse Regression

Sparse linear regression -- finding an unknown vector from linear measurements -- is now known to be possible with fewer samples than variables, via methods like the LASSO. We consider the multiple sparse linear regression problem, where several related vectors -- with partially shared support sets -- have to be recovered. A natural question in this setting is whether one can use the sharing to further decrease the overall number of samples required. A line of recent research has studied the use of \ell_1/\ell_q norm block-regularizations with q>1 for such problems; however these could actually perform worse in sample complexity -- vis-à-vis solving each problem separately ignoring sharing -- depending on the level of sharing. We present a new method for multiple sparse linear regression that can leverage support and parameter overlap when it exists, but not pay a penalty when it does not. A very simple idea: we decompose the parameters into two components and regularize these differently. We show both theore...

Jalali, Ali; Sanghavi, Sujay

2011-01-01

45

UK PubMed Central (United Kingdom)

BACKGROUND: Patient and visitor violence (PVV) is the most dangerous occupational hazard that health professionals must contend with. Staff training is recommended to prevent and manage PVV. There is minimal research focusing on risk factors associated with PVV in general hospital settings. Therefore, staff training is mostly based upon expert knowledge and knowledge from psychiatric and emergency settings. OBJECTIVES: This study investigates health professionals' experiences with PVV in order to describe risk factors related to PVV that occur in general hospital settings. DESIGN: A retrospective cross-sectional survey was conducted in 2007. SETTING: A university general hospital in Switzerland. PARTICIPANTS: 2495 out of 4845 health professionals participated (58.0% nurses & midwives, 19.2% medical doctors, 3.6% physical therapists, occupational therapists & nutritionists, 6.1% ward secretaries, medical & radiology assistants, 6.3% nursing assistants or less qualified nursing staff and 5.1% other staff). All had direct patient contact and 82% were female. METHODS: Data were collected via questionnaires using the Survey of Violence Experienced by Staff German-Version-Revised, the German version of the shortened Perception of Aggression Scale and the Perception of Importance of Intervention Skills Scale. Descriptive statistics and multiple logistic regression analyses were used. RESULTS: Risk factors associated with PVV depend upon the form of violence. Those trained in aggression management and/or those who work predominantly with patients over 65 years of age experience twice as much PVV as others. Health professionals working in emergency rooms, outpatient units, intensive care units, recovery rooms, anesthesia, intermediate care and step-down units also experience PVV more often. 
When health professionals are older in age, are from the medical profession, are students, or when they have an attitude rating preventive measures as being less important and aggression as emotionally letting off steam, they experience less PVV. CONCLUSION: Training could change the perception and the recognition of PVV, and could therefore increase the risk of experiencing PVV. The health professionals' specific occupation along with attitude and age, the patients' age, the communication and the workplace are all relevant risk factors. Further studies should investigate the impact of aggression management training and other measures that would reduce PVV.

Hahn S; Müller M; Hantikainen V; Kok G; Dassen T; Halfens RJ

2013-03-01

46

Scientific Electronic Library Online (English)

This paper joins the main properties of joint regression analysis (JRA), a model based on the Finlay-Wilkinson regression to analyse multi-environment trials, and of the additive main effects and multiplicative interaction (AMMI) model. The study compares JRA and AMMI with particular focus on robustness with increasing amounts of randomly selected missing data. The application is made using a data set from a breeding program of durum wheat (Triticum turgidum L., Durum Group) conducted in Portugal. The results of the two models result in similar dominant cultivars (JRA) and winner of mega-environments (AMMI) for the same environments. However, JRA had more stable results with the increase in the incidence rates of missing values.

Rodrigues, Paulo Canas; Pereira, Dulce Gamito Santinhos; Mexia, João Tiago

2011-12-01

47

Directory of Open Access Journals (Sweden)

This paper joins the main properties of joint regression analysis (JRA), a model based on the Finlay-Wilkinson regression to analyse multi-environment trials, and of the additive main effects and multiplicative interaction (AMMI) model. The study compares JRA and AMMI with particular focus on robustness with increasing amounts of randomly selected missing data. The application is made using a data set from a breeding program of durum wheat (Triticum turgidum L., Durum Group) conducted in Portugal. The results of the two models result in similar dominant cultivars (JRA) and winner of mega-environments (AMMI) for the same environments. However, JRA had more stable results with the increase in the incidence rates of missing values.

Paulo Canas Rodrigues; Dulce Gamito Santinhos Pereira; João Tiago Mexia

2011-01-01

48

Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). 
Furthermore, examples are given of how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis, and it also saves significant time and cost. PMID:23917407

Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi

2013-07-17
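
The comparison described in the record above can be sketched as follows: fit one calibration line to all levels and another through only the lowest and highest levels, then back-calculate the same quality-control response with both. The concentrations and responses are hypothetical, and the sketch uses unweighted least squares for simplicity, whereas regulated assays often use weighted regression.

```python
# Two-concentration vs. multi-concentration calibration sketch.
# All numbers and function names are hypothetical; fits are unweighted.

def least_squares_line(conc, resp):
    """(intercept, slope) of the ordinary least-squares calibration line."""
    n = len(conc)
    mc, mr = sum(conc) / n, sum(resp) / n
    sxx = sum((c - mc) ** 2 for c in conc)
    sxy = sum((c - mc) * (r - mr) for c, r in zip(conc, resp))
    slope = sxy / sxx
    return mr - slope * mc, slope

def two_point_line(conc, resp):
    """(intercept, slope) through the lowest and highest calibrators only."""
    lo = min(range(len(conc)), key=lambda i: conc[i])
    hi = max(range(len(conc)), key=lambda i: conc[i])
    slope = (resp[hi] - resp[lo]) / (conc[hi] - conc[lo])
    return resp[lo] - slope * conc[lo], slope

def back_calculate(response, intercept, slope):
    """Concentration implied by an instrument response."""
    return (response - intercept) / slope

# Eight calibration levels with slightly noisy responses.
conc = [1.0, 2.0, 5.0, 10.0, 50.0, 100.0, 500.0, 1000.0]
resp = [0.0302, 0.0398, 0.0705, 0.1190, 0.5230, 1.0150, 5.0300, 10.0100]

qc_response = 2.52  # response of a hypothetical mid-range QC sample
qc_multi = back_calculate(qc_response, *least_squares_line(conc, resp))
qc_two = back_calculate(qc_response, *two_point_line(conc, resp))
relative_difference = abs(qc_two - qc_multi) / qc_multi
```

When the response is truly linear the two lines nearly coincide, so the back-calculated concentrations differ only slightly; the intermediate levels then mainly serve to check linearity rather than to define the line.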

49

UK PubMed Central (United Kingdom)

Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). 
Furthermore, examples are given of how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis, and it also saves significant time and cost.

Musuku A; Tan A; Awaiye K; Trabelsi F

2013-09-01

50

UK PubMed Central (United Kingdom)

Dissociation is defined as the disruption of the usually integrated functions of consciousness, such as memory, identity, and perceptions of the environment. Causes include various psychological, neurological and neurobiological mechanisms, none of which have been consistently supported. To our knowledge, the role of gene-environment interactions in dissociative experiences in obsessive-compulsive disorder (OCD) has not previously been investigated. Eighty-three Caucasian patients (29 male, 54 female) with a principal diagnosis of OCD were included. The Dissociative Experiences Scale was used to assess dissociation. The role of childhood trauma (assessed with the Childhood Trauma Questionnaire), and a functional 44-bp insertion/deletion polymorphism in the promoter region of the serotonin transporter, or 5-HTT, in mediating dissociation, was investigated using multiple regression analysis and path analysis using the partial least squares model. Both analyses indicated that an interaction between physical neglect and the S/S genotype of the 5-HTT gene significantly predicted dissociation in patients with OCD. Dissociation may be a predictor of poorer treatment outcome in patients with OCD; therefore, a better understanding of the mechanisms that underlie this phenomenon may be useful. Here, two different but related statistical techniques (multiple regression and partial least squares), confirmed that physical neglect and the 5-HTT genotype jointly play a role in predicting dissociation in OCD.

Lochner C; Seedat S; Hemmings SM; Moolman-Smook JC; Kidd M; Stein DJ

2007-01-01

51

Multiple Retrieval Models and Regression Models for Prior Art Search

This paper presents the system called PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS) realized for the IP track of CLEF 2009. Our approach presents three main characteristics: 1. The usage of multiple retrieval models (KL, Okapi) and term index definitions (lemma, phrase, concept) for the three languages considered in the present track (English, French, German) producing ten different sets of ranked results. 2. The merging of the different results based on multiple regression models using an additional validation set created from the patent collection. 3. The exploitation of patent metadata and of the citation structures for creating restricted initial working sets of patents and for producing a final re-ranking regression model. As we exploit specific metadata of the patent documents and the citation relations only at the creation of initial working sets and during the final post ranking step, our architecture remains generic and easy to extend.

Lopez, Patrice

2009-01-01

52

UK PubMed Central (United Kingdom)

Surgical risk is defined as the occurrence of complications arising in the individual as a result of surgical stress. The ability to forecast these consequences is an important factor in the surgeon's decision making. Several attempts have been made to quantify postsurgical prospects, but until now no overall solution has been found. This paper attempts to define a multifactorial risk index for adults subjected to surgery, with respect to immediate and early per- and post-surgical complications. A total of 1182 adult patients, aged 14 years or more, who underwent non-urgent surgery during 1985 in six Italian centres were prospectively studied in order to derive a multivariate prognostic index of post-surgical mortality. A stepwise logistic regression model was applied to a set of preoperative and operative factors, five of which were found to be significantly correlated with death: nutritional status, renal failure, reintervention, bacterial contamination during surgery, and age greater than 70 years. From the regression coefficients, scores were derived for the levels of the significant variables, allowing patients to be grouped into four risk classes: low (less than 1%), medium (between 1% and 10%), high (between 10% and 50%), and extremely high risk (greater than 50%).

Terracciano CA; Iannuzzi C; Schiavone G; Di Blasio V; Gallo C

1992-03-01
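
The step from logistic regression coefficients to the four risk classes in the record above can be sketched as below. The intercept and coefficient values are invented for illustration, since the abstract does not report them, and the treatment of values falling exactly on a class boundary is an assumption.

```python
import math

# Risk-class sketch: logistic model -> predicted mortality -> risk class.
# Coefficients are hypothetical; only the class limits come from the abstract.

FACTORS = ["poor_nutrition", "renal_failure", "reintervention",
           "bacterial_contamination", "age_over_70"]
HYPOTHETICAL_INTERCEPT = -5.0
HYPOTHETICAL_COEFS = [1.2, 1.5, 1.0, 0.9, 1.1]

def predicted_risk(present):
    """Predicted probability of death given 0/1 indicators for each factor."""
    eta = HYPOTHETICAL_INTERCEPT + sum(
        c * flag for c, flag in zip(HYPOTHETICAL_COEFS, present))
    return 1.0 / (1.0 + math.exp(-eta))

def risk_class(probability):
    """Map a probability to the four classes named in the abstract."""
    if probability < 0.01:
        return "low"
    if probability < 0.10:
        return "medium"
    if probability <= 0.50:
        return "high"
    return "extremely high"

no_factors = risk_class(predicted_risk([0, 0, 0, 0, 0]))
all_factors = risk_class(predicted_risk([1, 1, 1, 1, 1]))
```

Summing per-factor scores and then classifying is what lets such an index be used at the bedside without evaluating the logistic function by hand.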

53

The relationship between maceral content plus mineral matter and gross calorific value (GCV) for a wide range of West Virginia coal samples (from 6518 to 15330 BTU/lb; 15.16 to 35.66 MJ/kg) has been investigated by multivariable regression and adaptive neuro-fuzzy inference system (ANFIS). The stepwise least squares mathematical method comparison between liptinite, vitrinite, plus mineral matter as input data sets with measured GCV reported a nonlinear correlation coefficient (R2) of 0.83. Using the same data set the correlation between the predicted GCV from the ANFIS model and the actual GCV reported a R2 value of 0.96. It was determined that the GCV-based prediction methods, as used in this article, can provide a reasonable estimation of GCV. Copyright © Taylor & Francis Group, LLC.

Chelgani, S. C.; Hart, B.; Grady, W. C.; Hower, J. C.

2011-01-01

54

UK PubMed Central (United Kingdom)

"The purpose of the paper is to compare results of estimation and inference concerning covariate effects as obtained from two approaches to the analysis of survival data with multiple causes of failure. The first approach involves a dynamic model for the cause-specific hazard rate. The second is based on a static logistic regression model for the conditional probability of having had an event of interest. The influence of sociodemographic characteristics on the rate of family initiation and, more importantly, on the choice between marriage and cohabitation as a first union, is examined. We found that results, generally, are similar across the methods considered. Some issues in relation to censoring mechanisms and independence among causes of failure are discussed."

Ghilagaber G

1998-08-01

55

A Constrained Linear Estimator for Multiple Regression

"Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…

Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.

2010-01-01

56

The use of multiple linear regression in property valuation

Directory of Open Access Journals (Sweden)

Property appraisal is of great importance for a country and its economy. Nowadays, a successful land management system cannot be imagined without a subsystem related to the market economy. Having information about land and its value offers broad possibilities for the market economy and strongly influences the development of the real estate market. Special attention should be paid to mass appraisal methods and their use in developing the tax system and a framework for an appropriate property appraisal system. Multiple regression analysis is just one of the methods used for this purpose, and this article focuses on its characteristics and advantages in mass appraisal system development.

Branko Božić; Dragana Milićević; Marko Pejić; Stevan Marošan

2013-01-01

57

Principal component regression analysis with SPSS.

The paper introduces all indices for diagnosing multicollinearity, the basic principle of principal component regression, and a method for determining the 'best' equation. It uses an example to describe how to carry out principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable, and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. With SPSS, the analysis becomes simpler, faster, and accurate. PMID:12758135

Liu, R X; Kuang, J; Gong, Q; Hou, X L

2003-06-01

58

Principal component regression analysis with SPSS.

UK PubMed Central (United Kingdom)

The paper introduces all indices for diagnosing multicollinearity, the basic principle of principal component regression, and a method for determining the 'best' equation. It uses an example to describe how to carry out principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable, and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. With SPSS, the analysis becomes simpler, faster, and accurate.

Liu RX; Kuang J; Gong Q; Hou XL

2003-06-01

59

Fuzzy Multiple Regression Model for Estimating Software Development Time

Directory of Open Access Journals (Sweden)

As software becomes more complex and its scope increases dramatically, research on methods for estimating software development time has become steadily more important, and accurate estimation is the main goal of software managers seeking to reduce project risk. The purpose of this article is to introduce a new Fuzzy Multiple Regression approach, which is more accurate than other estimation methods. Furthermore, we compare the Fuzzy Multiple Regression model with the Fuzzy Logic model and the Multiple Regression model on the basis of their accuracy.

Venus Marza; Mir Ali Seyyedi

2009-01-01

60

UK PubMed Central (United Kingdom)

The simultaneous contribution of 11 occlusal factors, dental attrition severity, orthodontic history, trauma (motor vehicle accident [MVA] and non-MVA), and age in defining two independent large populations of females diagnosed with five mutually exclusive temporomandibular disorders was tested through multiple stepwise logistic regression analysis. Non-MVA trauma was significant in both groups in defining disc displacement (DD) with and without reduction, and osteoarthrosis (OA) (both primary and following DD). Anterior open bite was also a significant factor in defining OA in both groups. Much smaller contributions were also made by missing teeth in one of the populations with OA following DD, and by retruded contact position-intercuspal position slide lengths and overjet in one of the primary OA populations. Motor vehicle accident trauma was significant in defining myofascial pain (MP) in both populations, and laterotrusive attrition mildly defined MP in one population. Only a minority of total variance was explained: 6% to 8% of DD with reduction; 10% to 14% of DD without reduction; 11% to 20% of OA following DD; 17% to 38% of primary OA; and 4% to 10% of MP. Non-MVA trauma was the major defining feature of the temporomandibular joint intracapsular disorders, and MVA trauma explained a very small percentage of the MP patients. Implications are discussed and recommendations are made for future research.

Seligman DA; Pullinger AG

1996-01-01

61

Regression Commonality Analysis: A Technique for Quantitative Theory Building

When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…

Nimon, Kim; Reio, Thomas G., Jr.

2011-01-01
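The commonality analysis named in the abstract above can be illustrated for the simplest case of two predictors. The sketch below is a minimal illustration, not the authors' procedure: it uses the standard two-predictor decomposition of R² into a part unique to each predictor and a part common to both. The function names and the sample data are assumptions introduced here.

```python
from statistics import mean


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = (sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b)) ** 0.5
    return num / den


def r2_two_predictors(y, x1, x2):
    """R^2 of y regressed on x1 and x2, computed from pairwise correlations."""
    ry1, ry2, r12 = corr(y, x1), corr(y, x2), corr(x1, x2)
    return (ry1 ** 2 + ry2 ** 2 - 2 * ry1 * ry2 * r12) / (1 - r12 ** 2)


def commonality(y, x1, x2):
    """Split the full-model R^2 into unique and common components."""
    full = r2_two_predictors(y, x1, x2)
    r2_x1 = corr(y, x1) ** 2  # R^2 with x1 alone
    r2_x2 = corr(y, x2) ** 2  # R^2 with x2 alone
    return {
        "unique_x1": full - r2_x2,      # what x1 adds beyond x2
        "unique_x2": full - r2_x1,      # what x2 adds beyond x1
        "common": r2_x1 + r2_x2 - full,  # variance shared by both
        "total": full,
    }


# Hypothetical illustrative data (not from the article).
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
y = [1.2, 1.9, 3.4, 3.6, 5.5, 5.4, 7.8, 7.1]
parts = commonality(y, x1, x2)
```

The three components always sum to the full-model R², which is what makes the decomposition a useful complement to beta weights when predictors are correlated.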

62

Energy Technology Data Exchange (ETDEWEB)

The first round of the LTA-e (long-term agreement on energy for the flower bulb sector) ran from 1995 through 2006. Every year, 300 to 450 businesses participated. For each participating business, the energy use (e.g. electricity, natural gas, and sometimes fuel oil), the crop acreage, and the number of forced bulbs were registered. The aim of this analysis is to derive the use of natural gas and electricity per hectare and/or per 1000 forced bulbs for a number of bulb crops, which can serve as a reference and starting point for the second round of the LTA-e.

Wildschut, J. [Praktijkonderzoek Plant en Omgeving PPO, Sector Bloembollen, Lisse (Netherlands)]

2008-02-15

63

Steganalysis of LSB Image Steganography using Multiple Regression and Auto Regressive (AR) Model

Directory of Open Access Journals (Sweden)

The staggering growth in communication technology and the use of public-domain channels (i.e. the Internet) have greatly facilitated the transfer of data. However, such open communication channels are more vulnerable to security threats that lead to unauthorized information access. Traditionally, encryption is used to achieve communication security. However, important information is not protected once decoded. Steganography is the art and science of communicating in a way that hides the existence of the communication. Important information is first hidden in host data, such as a digital image, text, video, or audio, and then transmitted secretly to the receiver. Steganalysis is another important topic in information hiding: it is the art of detecting the presence of steganography. In this paper, a novel technique for the steganalysis of images is presented. The proposed technique uses an auto-regressive model to detect the presence of hidden messages, as well as to estimate the relative length of the embedded messages. Various auto-regressive parameters are used to classify cover images and stego images with the help of an SVM classifier. Multiple regression analysis of the cover carrier together with the stego carrier is carried out in order to detect even a negligible amount of the secret message. Experimental results demonstrate the effectiveness and accuracy of the proposed technique.

Souvik Bhattacharyya; Gautam Sanyal

2011-01-01

64

On cluster-wise fuzzy regression analysis.

Since Tanaka et al. (1982) proposed a study of linear regression analysis with a fuzzy model, fuzzy regression analysis has been widely studied and applied in a variety of substantive areas. Regression analysis in the case of heterogeneity of observations is commonly presented in practice. The authors' main goal is to apply fuzzy clustering techniques to fuzzy regression analysis. Fuzzy clustering is used to overcome the heterogeneous problem in the fuzzy regression model. They present the cluster-wise fuzzy regression analysis in two approaches: the two-stage weighted fuzzy regression and the one-stage generalized fuzzy regression. The two-stage procedure extends the results of Jajuga (1986) and Diamond (1988). The one-stage approach is created by embedding fuzzy clusterings into the fuzzy regression model fitting at each step of procedure. This kind of embedding in the one-stage procedure is more effective since the structure of regression line shape encountered in the data set is taken into account at each iteration of the algorithm. Numerical results give evidence that the one-stage procedure can be highly recommended in cluster-wise fuzzy regression analysis. PMID:18255835

Yang, M S; Ko, C H

1997-01-01

65

On cluster-wise fuzzy regression analysis.

UK PubMed Central (United Kingdom)

Since Tanaka et al. (1982) proposed a study of linear regression analysis with a fuzzy model, fuzzy regression analysis has been widely studied and applied in a variety of substantive areas. Regression analysis in the case of heterogeneity of observations is commonly presented in practice. The authors' main goal is to apply fuzzy clustering techniques to fuzzy regression analysis. Fuzzy clustering is used to overcome the heterogeneous problem in the fuzzy regression model. They present the cluster-wise fuzzy regression analysis in two approaches: the two-stage weighted fuzzy regression and the one-stage generalized fuzzy regression. The two-stage procedure extends the results of Jajuga (1986) and Diamond (1988). The one-stage approach is created by embedding fuzzy clusterings into the fuzzy regression model fitting at each step of procedure. This kind of embedding in the one-stage procedure is more effective since the structure of regression line shape encountered in the data set is taken into account at each iteration of the algorithm. Numerical results give evidence that the one-stage procedure can be highly recommended in cluster-wise fuzzy regression analysis.

Yang MS; Ko CH

1997-01-01

66

Significant Tests of Coefficient Multiple Regressions by using Permutation Methods

Directory of Open Access Journals (Sweden)

Tests of significance of a single partial regression coefficient in a multiple regression model are often made in situations where the standard assumptions underlying the probability calculation (for example, normality of the random error term) do not hold. When the random error term fails to fulfill some of these assumptions, one needs to resort to nonparametric methods to carry out statistical inference. Permutation methods are one branch of nonparametric methods. Using simulation, this study compared the empirical type I error of different permutation strategies that have been proposed for testing the nullity of a partial regression coefficient in a multiple regression model, and showed that the type I error of the Freedman and Lane strategy is lower than that of the other methods.

Ali Shadrokh

2011-01-01
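The Freedman and Lane strategy mentioned in the abstract above permutes the residuals of the reduced model (the model without the predictor under test) rather than the raw responses. The sketch below is a minimal illustration under stated assumptions, not the study's simulation code: it handles one predictor of interest `x` and one nuisance covariate `z`, and it uses the absolute partial correlation as the test statistic, which is a monotone function of the absolute t statistic for the coefficient of `x`. All function names and the sample data are hypothetical.

```python
import random
from statistics import mean


def residualize(y, z):
    """Simple OLS of y on z with intercept; return (fitted, residuals)."""
    zb, yb = mean(z), mean(y)
    szz = sum((zi - zb) ** 2 for zi in z)
    slope = sum((zi - zb) * (yi - yb) for zi, yi in zip(z, y)) / szz
    fitted = [yb + slope * (zi - zb) for zi in z]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, resid


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = (sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b)) ** 0.5
    return num / den


def partial_stat(y, x, z):
    """Absolute partial correlation of y and x, controlling for z."""
    _, ry = residualize(y, z)
    _, rx = residualize(x, z)
    return abs(corr(ry, rx))


def freedman_lane_pvalue(y, x, z, n_perm=999, seed=0):
    """Permutation p-value for the partial effect of x on y given z."""
    rng = random.Random(seed)
    observed = partial_stat(y, x, z)
    fitted, resid = residualize(y, z)  # reduced model: y on z only
    hits = 0
    for _ in range(n_perm):
        shuffled = resid[:]
        rng.shuffle(shuffled)  # permute reduced-model residuals
        y_star = [f + r for f, r in zip(fitted, shuffled)]
        if partial_stat(y_star, x, z) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical deterministic data with a strong partial effect of x.
n = 40
z = [float(i % 7) for i in range(n)]
x = [0.5 * z[i] + ((i * 13) % 17 - 8) / 4 for i in range(n)]
y = [1 + 2 * x[i] + z[i] + ((i * 37) % 11 - 5) / 10 for i in range(n)]
p_value = freedman_lane_pvalue(y, x, z)
```

Counting the observed statistic among the permutations (the `+ 1` terms) keeps the p-value strictly positive, which is a common convention for permutation tests.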

67

Overdispersed logistic regression for SAGE: Modelling multiple groups and covariates

Directory of Open Access Journals (Sweden)

Background: Two major identifiable sources of variation in data derived from the Serial Analysis of Gene Expression (SAGE) are within-library sampling variability and between-library heterogeneity within a group. Most published methods for identifying differential expression focus on just the sampling variability. In recent work, the problem of assessing differential expression between two groups of SAGE libraries has been addressed by introducing a beta-binomial hierarchical model that explicitly deals with both of the above sources of variation. This model leads to a test statistic analogous to a weighted two-sample t-test. When the number of groups involved is more than two, however, a more general approach is needed. Results: We describe how logistic regression with overdispersion supplies this generalization, carrying with it the framework for incorporating other covariates into the model as a byproduct. This approach has the advantage that logistic regression routines are available in several common statistical packages. Conclusions: The described method provides an easily implemented tool for analyzing SAGE data that correctly handles multiple types of variation and allows for more flexible modelling.

Baggerly Keith A; Deng Li; Morris Jeffrey S; Aldaz C Marcelo

2004-01-01

68

Directory of Open Access Journals (Sweden)

In the last few decades, techniques such as Artificial Neural Networks and Fuzzy Inference Systems have been used to develop predictive models for estimating required parameters. Recently, Soft Computing techniques have been used as an alternative statistical tool. Determining the nature of financial time series data is difficult, expensive, and time consuming, and involves complex tests. In this paper, we use the Multi Layer Perceptron and Radial Basis Functions of Artificial Neural Networks and the Adaptive Neuro Fuzzy Inference System to predict S% (Financial Stress percent) of financial time series data, and compare them with the traditional statistical tool of Multiple Regression. The accuracies of the Artificial Neural Network and Adaptive Neuro Fuzzy Inference System techniques are found to be relatively similar. The Radial Basis Functions constructed exhibit higher performance than the Multi Layer Perceptron, Adaptive Neuro Fuzzy Inference System, and Multiple Regression for predicting S%. The performance comparison shows that the Soft Computing paradigm is a promising tool for minimizing uncertainties in financial time series data. Furthermore, Soft Computing also minimizes the potential inconsistency of correlations.

Arindam Chaudhuri

2012-01-01

69

The analysis of regression residuals and detection of outliers are discussed, with emphasis on determining how deviant an individual data point must be to be considered an outlier and the impact that multiple suspected outlier data points have on the process of outlier determination and treatment. Only bivariate (one dependent and one…

Hecht, Jeffrey B.

70

UK PubMed Central (United Kingdom)

BACKGROUND: When applying the recommended standard doses of recombinant human thyrotropin (rhTSH) in the diagnostic/therapeutic management of patients with differentiated thyroid cancer (DTC), the resulting peak TSH levels vary extensively. Previous studies applying multivariate statistics identified patient-inherent variables influencing the rhTSH/peak TSH relation. However, those results were inconclusive and partly conflicting. Notably, no independent role of renal function was substantiated, despite the fact that the kidneys are known to play a prominent role in TSH clearance from blood. Therefore, the study's aim was to investigate the impact of renal function on the peak TSH concentration after the standard administration of rhTSH used in the management of thyroid cancer. The second objective was to calculate a ranking regarding the effect sizes of the selected variables on the peak TSH. METHODS: There were 286 patients with DTC included in the study. Univariate and multivariate analyses were performed, testing the correlation of serum creatinine and glomerular filtration rate (GFR) as surrogate parameters of renal function, age, sex, weight, height, and body surface area (BSA) with the peak TSH level. In six additional patients, the subsequent TSH pharmacokinetics after the TSH peak were measured and qualitatively compared. RESULTS: By univariate analyses, TSH correlated negatively with BSA, GFR, weight, and height, and positively with age, female sex, and serum creatinine (p<0.001). On the multivariate analysis, the stepwise forward multiple linear regression revealed BSA and renal function as the two most influential independent variables, followed by age, sex, and height. The pharmacokinetic datasets indicated that these identified parameters also influence the TSH decline over time. CONCLUSION: Identifying those patients with a favorable combination of parameters predicting a high-peak TSH is the first step toward an individualization of rhTSH dosing. 
Additionally, the subsequent TSH decrease over time needs to be taken into account. A complete understanding of the interrelation of the identified key parameters and both the TSH peak and subsequent TSH pharmacokinetics might allow for a more personalized rhTSH dosage strategy to achieve sufficient TSH levels instead of the fixed dosing procedure used at present.

Hautzel H; Pisar E; Lindner D; Schott M; Grandt R; Müller HW

2013-06-01

71

On relationship between regression models and interpretation of multiple regression coefficients

In this paper, we consider the problem of interpreting the coefficients of a linear regression equation in the case of correlated predictors. It is shown that, in general, there is no natural way of interpreting these coefficients comparable to the single-predictor case. Nevertheless, we suggest linear transformations of the predictors that reduce multiple regression to simple regression while retaining the coefficient of the variable of interest. The new variable can be treated as the part of the old variable that has no linear statistical dependence on the other variables in the model.

Varaksin, A N

2012-01-01
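The reduction described in the abstract above, from a multiple regression to a simple one that keeps the coefficient of the variable of interest, can be checked numerically. The sketch below is an illustration under assumptions, not the paper's own construction: it uses two predictors, takes the "new variable" to be the residual of `x1` after regressing it on `x2`, and confirms that the simple-regression slope on that residual equals the multiple-regression coefficient of `x1`. Function names and data are hypothetical.

```python
from statistics import mean


def centered(v):
    """Subtract the mean from every element."""
    m = mean(v)
    return [a - m for a in v]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def multiple_slope_x1(y, x1, x2):
    """Coefficient of x1 in the OLS fit of y on x1 and x2 (with intercept)."""
    yc, a, b = centered(y), centered(x1), centered(x2)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    return (s22 * dot(a, yc) - s12 * dot(b, yc)) / (s11 * s22 - s12 ** 2)


def residual_of_x1(x1, x2):
    """Part of x1 with no linear dependence on x2 (OLS residual)."""
    a, b = centered(x1), centered(x2)
    gamma = dot(a, b) / dot(b, b)
    return [p - gamma * q for p, q in zip(a, b)]


def simple_slope(y, x):
    """Slope of the simple OLS regression of y on x."""
    yc, xc = centered(y), centered(x)
    return dot(xc, yc) / dot(xc, xc)


# Hypothetical correlated predictors and response.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 12.0]

b_multiple = multiple_slope_x1(y, x1, x2)
b_simple = simple_slope(y, residual_of_x1(x1, x2))
```

The two slopes agree up to floating-point rounding, which is the sense in which the transformed variable retains the coefficient of interest.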

72

Directory of Open Access Journals (Sweden)

This study explores the relationship between student performance and instructional design. The research was conducted at the E-Learning School at a university in Turkey. A list of design factors that had potential influence on student success was created through a review of the literature and interviews with relevant experts. From this, the five most important design factors were chosen. The experts scored 25 university courses on the extent to which they demonstrated the chosen design factors. Multiple-regression and supervised artificial neural network (ANN) models were used to examine the relationship between student grade point averages and the scores on the five design factors. The results indicated that there is no statistical difference between the two models. Both models identified the use of examples and applications as the most influential factor. The ANN model provided more information and was used to predict the course-specific factor values required for a desired level of success.

Halil Ibrahim Cebeci; Harun Resit Yazgan; Abdulkadir Geyik

2009-01-01

73

UK PubMed Central (United Kingdom)

BACKGROUND: Regression is a common statistical tool for prediction in neuroscience. However, linear regression is by far the most common form of regression used, with regression trees receiving comparatively little attention. NEW METHOD: In this study, the results of conventional multiple linear regression (MLR) were compared with those of random forest regression (RFR), in the prediction of the concentrations of 9 neurochemicals in the vestibular nucleus complex and cerebellum that are part of the L-arginine biochemical pathway (agmatine, putrescine, spermidine, spermine, L-arginine, L-ornithine, L-citrulline, glutamate and γ-aminobutyric acid (GABA)). RESULTS: The R² values for the MLRs were higher than the proportion of variance explained values for the RFRs: 6/9 of them were ≥ 0.70 compared to 4/9 for RFRs. Even the variables that had the lowest R² values for the MLRs, e.g. ornithine (0.50) and glutamate (0.61), had much lower proportion of variance explained values for the RFRs (0.27 and 0.49, respectively). The RSE values for the MLRs were lower than those for the RFRs in all but two cases. COMPARISON WITH EXISTING METHODS: In general, MLRs seemed to be superior to the RFRs in terms of predictive value and error. CONCLUSION: In the case of this data set, MLR appeared to be superior to RFR in terms of its explanatory value and error. This result suggests that MLR may have advantages over RFR for prediction in neuroscience with this kind of data set, but that RFR can still have good predictive value in some cases.

Smith PF; Ganesh S; Liu P

2013-09-01

74

Energy Technology Data Exchange (ETDEWEB)

The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy.

Hung, J.; Chaitman, B.R.; Lam, J.; Lesperance, J.; Dupras, G.; Fines, P.; Cherkaoui, O.; Robert, P.; Bourassa, M.G.

1985-08-01

75

International Nuclear Information System (INIS)

The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy

1985-01-01

76

A Solution to Separation and Multicollinearity in Multiple Logistic Regression.

UK PubMed Central (United Kingdom)

In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often suffer from serious bias, or may not exist at all, because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models, which was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem that the other addresses. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.

Shen J; Gao S

2008-10-01

77

Two SPSS programs for interpreting multiple regression results.

When multiple regression is used in explanation-oriented designs, it is very important to determine both the usefulness of the predictor variables and their relative importance. Standardized regression coefficients are routinely provided by commercial programs. However, they generally function rather poorly as indicators of relative importance, especially in the presence of substantially correlated predictors. We provide two user-friendly SPSS programs that implement currently recommended techniques and recent developments for assessing the relevance of the predictors. The programs also allow the user to take into account the effects of measurement error. The first program, MIMR-Corr.sps, uses a correlation matrix as input, whereas the second program, MIMR-Raw.sps, uses the raw data and computes bootstrap confidence intervals of different statistics. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from http://brm.psychonomic-journals.org/content/supplemental. PMID:20160283

Lorenzo-Seva, Urbano; Ferrando, Pere J; Chico, Eliseo

2010-02-01

78

Two SPSS programs for interpreting multiple regression results.

UK PubMed Central (United Kingdom)

When multiple regression is used in explanation-oriented designs, it is very important to determine both the usefulness of the predictor variables and their relative importance. Standardized regression coefficients are routinely provided by commercial programs. However, they generally function rather poorly as indicators of relative importance, especially in the presence of substantially correlated predictors. We provide two user-friendly SPSS programs that implement currently recommended techniques and recent developments for assessing the relevance of the predictors. The programs also allow the user to take into account the effects of measurement error. The first program, MIMR-Corr.sps, uses a correlation matrix as input, whereas the second program, MIMR-Raw.sps, uses the raw data and computes bootstrap confidence intervals of different statistics. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from http://brm.psychonomic-journals.org/content/supplemental.

Lorenzo-Seva U; Ferrando PJ; Chico E

2010-02-01

79

Digital Repository Infrastructure Vision for European Research (DRIVER)

Both linear neural network and multiple linear regression models can be used for multi-factor analysis and forecasting, but the data of the multiple linear regression model are required to meet such conditions as independence and normality, while the data of the linear neural network are only requir...

Guoli Wang; Jianhui Wu; Jianhua Wu; Xiaohong Wang

80

Integrated formation evaluation with regression analysis

Energy Technology Data Exchange (ETDEWEB)

A comprehensive formation evaluation method is presented which integrates log, core and well test data on a reservoir using regression analysis. The method employs a capillary pressure curve model correlating four reservoir variables: Porosity, water saturation, permeability and capillary pressure. Porosity and saturation are estimated by conventional log analysis, permeability is obtained from empirical correlations with logs (usually porosity) and capillary pressure is directly related to height above the water level. Depth profiles of the four variables are adjusted by regression analysis constrained by the capillary pressure curve model. Results of the regression include: (1) An estimate of the water level; (2) Improved profiles of the four variables which are consistent with both log analysis and capillary pressure theory; (3) Adjusted log analysis parameters; (4) Complete synthetic capillary pressure curves for each depth level; (5) Relative permeability curves (drainage) generated from the capillary pressure curves; and (6) Estimated effective permeability to hydrocarbons and to water opposite the wellbore. These results are then integrated with well test data by comparing effective permeabilities from 6 above with the test results. If there is a mismatch, it may be necessary to rerun the regression in order to honor all of the datasets. The method is illustrated with log, core and test data from an offshore Gulf of Mexico well.

Hawkins, J.M.

1994-12-31

81

Regression analysis using dependent Polya trees.

UK PubMed Central (United Kingdom)

Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.

Schörgendorfer A; Branscum AJ

2013-11-01

82

Determining School Effectiveness Following a Regression Analysis.

Three methods that can be used subsequent to a regression analysis to determine the relative effectiveness of schools are Dyer's Performance Indices, Scheffe's hyperbolic confidence bands, and Gafarian's linear confidence bands. These methods were applied to data from 54 hypothetical schools randomly generated from a multivariate normal…

Convey, John J.

83

Multiple Linear Regression for Extracting Phrase Translation Pairs

Directory of Open Access Journals (Sweden)

Phrase translation pairs are very useful for bilingual lexicography, machine translation systems, cross-lingual information retrieval, and many other applications in natural language processing. Phrase translation pairs are always extracted from bilingual sentence pairs. In this paper, we extract phrase translation pairs based on the word alignment results of Chinese-English bilingual sentence pairs and the parse trees of Chinese sentences, in order to reduce the influence of grammatical disagreement between Chinese and English. Discriminative features for evaluating the extracted phrase translation pairs are proposed, including translation literality, phrase alignment probability, and phrase length difference. A multiple linear regression model combined with an N-best strategy is employed to filter phrase translation pairs, in order to improve evaluation and filtering performance. Experimental results indicate that, of the three kinds of discriminative features, phrase alignment probability gives the best filtering performance for Chinese-English phrase translation pairs. With the multiple linear regression model combined with the N-best strategy, F1 reaches 86.24%.

Chun-Xiang Zhang; Ming-Yuan Ren; Zhi-Mao Lu; Ying-Hong Liang; Da-Song Sun; Yong Liu

2011-01-01

84

Directory of Open Access Journals (Sweden)

Full Text Available In ordinary statistical methods, multiple outliers in multiple linear regression model are detected sequentially one after another, where smearing and masking effects give misleading results. If the potential multiple outliers can be detected simultaneously, smearing and masking effects can be avoided. Such multiple-case outlier detection is of combinatorial nature and 2^N-N-1 sets of possible outliers need to be tested, where N is the number of data points. This exhaustive search is practically impossible. In this paper, we have used quantum-inspired evolutionary algorithm (QEA) for multiple-case outlier detection in multiple linear regression model. A Bayesian information criterion based fitness function incorporating extra penalty for number of potential outliers has been used for identifying the most appropriate set of potential outliers. Experimental results with 10 widely referred datasets from statistical literature show that the QEA overcomes the effect of smearing and masking and effectively detects the most appropriate set of outliers.

Salena Akter; Mozammel H A Khan

2010-01-01
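
To give a feel for the search-space size quoted in the abstract above, the following minimal sketch evaluates 2^N - N - 1 for a few sample sizes. The count equals the number of subsets of N points that contain at least two points, which the second function checks by summing binomial coefficients. The function names are hypothetical and nothing here reproduces the QEA itself.

```python
from math import comb


def candidate_outlier_sets(n: int) -> int:
    """Count quoted in the abstract: 2^N - N - 1 candidate sets for N points."""
    return 2 ** n - n - 1


def subsets_of_size_at_least_two(n: int) -> int:
    """Same count, obtained by summing C(n, k) over subset sizes k = 2..n."""
    return sum(comb(n, k) for k in range(2, n + 1))


if __name__ == "__main__":
    # The count roughly doubles with every added data point.
    for n in (5, 10, 20, 50):
        print(n, candidate_outlier_sets(n))
```

Because the count roughly doubles with each additional observation, exhaustive testing stops being practical at quite modest N, which is the motivation the abstract gives for a heuristic search.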

85

Numerical analysis of robust regression methods

International Nuclear Information System (INIS)

The robust estimates of linear regression parameters are considered. Numerical analysis of these methods and their modifications for the problems of particle track recognition is performed. For the cases when disbalance points can appear, a special method of studentizing the residual vector is proposed. This method is investigated by theoretical and graphical means. The numerical characteristics of the methods obtained by Monte Carlo simulation in various models are summarized in tables. Some recommendations for using robust regression methods in mass data processing are given.

1985-01-01

86

Directory of Open Access Journals (Sweden)

Full Text Available Problem statement: This study presents a novel method for the determination of average winding temperature rise of transformers under its predetermined field operating conditions. Rise in the winding temperature was determined from the estimated values of winding resistance during the heat run test conducted as per IEC standard. Approach: The estimation of hot resistance was modeled using Multiple Variable Regression (MVR), Multiple Polynomial Regression (MPR) and soft computing techniques such as Artificial Neural Network (ANN) and Adaptive Neuro Fuzzy Inference System (ANFIS). The modeled hot resistance will help to find the load losses at any load situation without using complicated measurement set up in transformers. Results: These techniques were applied for the hot resistance estimation for dry type transformer by using the input variables cold resistance, ambient temperature and temperature rise. The results are compared and they show a good agreement between measured and computed values. Conclusion: According to our experiments, the proposed methods are verified using experimental results, which have been obtained from temperature rise test performed on a 55 kVA dry-type transformer.

M. Srinivasan; A. Krishnan

2012-01-01

87

Stukel's Extended Logistic Regression Analysis with R

Directory of Open Access Journals (Sweden)

Full Text Available Objective: For a logistic regression model, the degree to which predicted probabilities agree with actual outcomes can be expressed as a classification table. Being crucial in model adequacy checking, such tables may be slightly different when the same data are modeled with different statistical packages. The underlying reason is that when classifying a set of binary data, if the observations used to fit the model are also used to estimate the classification error, the resulting error-count estimate is biased. In order to cope with this problem, SAS suggests an algorithm, whereas the software is not publicly available. R is a free downloadable programme which is particularly designed for statistical computation, including the logistic regression analysis. The purpose of this study is to present a new function in R which carries out an extended logistic regression analysis of a binary data from the construction of its reduced-biased classification table, to the inference of its model parameters by calling the lrm(.) function under the Design package where necessary. Material and Methods: The performance of ext.logreg(.) is evaluated in terms of the accuracy of estimates and computational cost. Results: From the results of two binary datasets, it is observed that ext.logreg(.) via R estimates the model parameters and constructs the unbiased classification table as accurate as SAS programme under PROC logistic function without losing the computational demand. Conclusion: The free downloadable ext.logreg(.) function can be seen as an alternative computational tool in the analysis of logistic regression when the validation of predicted probabilities is essential.

Vilda PURUTÇUOĞLU; Saygın KARAGÜLLE

2011-01-01

88

Functional linear regression analysis for longitudinal data

We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may be modeled as a discrete random number and, accordingly, only a finite and asymptotically nonincreasing number of measurements are available for each subject or experimental unit. We propose a functional regression approach for this situation, using functional principal component analysis, where we estimate the functional principal component scores through conditional expectations. This allows the prediction of an unobserved response trajectory from sparse measurements of a predictor trajectory. The resulting technique is flexible and allow...

Yao, Fang; Müller, Hans-Georg; Wang, Jane-Ling

2005-01-01

89

Directory of Open Access Journals (Sweden)

Full Text Available Research Aim: This study examined how credit card holders' negative and positive perceptions of their credit cards affect their satisfaction and exit behavior. Method: Customers' attitudes toward credit card use were first assessed by Exploratory Factor Analysis; the effects of the seven factors identified in that step on credit card satisfaction and on the intention not to use credit cards in the future were then examined by Multiple Regression Analysis. Findings and Result: The perception that credit cards give the customer confidence was the factor with the largest increasing effect on satisfaction, and a positive attitude toward credit card use was the factor with the largest decreasing effect on exit behavior. Key Words: Credit Card, Customer Satisfaction, Exit Behavior, Exploratory Factor Analysis, Multiple Regression Analysis

Veysel YILMAZ; Cengiz AKTAŞ; M. S. Talha ARSLAN

2009-01-01

90

Regression analysis for central air conditioning data

Energy Technology Data Exchange (ETDEWEB)

A study was conducted to estimate the unit energy consumption (UEC) for central air-conditioning for Canadian houses. Residential energy use accounts for 19 per cent of secondary energy use in Canada. Air-conditioning contributes about 0.4 per cent of the total energy used in an average home. Factors such as house volume, design cooling dry and wet bulb temperature, design cooling degree, and number of occupants were taken into consideration when estimating the energy consumption of air-conditioners. In this study, data were obtained from the original HOT2000 output for SHUE-Archetype projects. The statistical software program SPSS was used for the regression analysis. The dependent variable FULCON is first regressed on the volume of the house, design cooling dry bulb temperature, design cooling wet bulb temperature, cooling degree days, number of adults, number of children, A/C capacity, A/C COP, and the annual electricity baseload. The average, maximum and minimum FULCON values for Atlantic, Central, Prairies and Western Canada are presented.

NONE

1998-03-01

91

Multiple Regression in a Two-Way Layout.

This paper discusses Bayesian m-group regression where the groups are arranged in a two-way layout into m rows and n columns, there still being a regression of y on the x's within each group. The mathematical model is then provided as applied to the case where the rows correspond to high schools and the columns to colleges: the predictor…

Lindley, Dennis V.

92

There are many studies about cuffless and continuous blood pressure estimation using pulse transit time (PTT). In this study, we proposed the modeling method which could estimate systolic BP (SBP) conveniently and indirectly using PTT and some biometric parameters. 45 people participated in this study and we measured PTT using photoplethysmography (PPG) and electrocardiogram (ECG) signals and biometric parameters such as weight, height, body mass index (BMI), length of arm and circumference of arm. Before modeling, we selected variables as predictors using statistical analysis. With these parameters, we compared artificial neural network (ANN) with multiple regressions as an estimating method of BP. We evaluated the mean differences and standard deviations between estimated value and reference value, acquired from a KEDA-approved device. The results showed that the ANN had better accuracy than the multiple regression. ANN's estimation satisfied AAMI standard as a BP device. PMID:17281871

Yi Kim, Jung; Hwan Cho, Baek; Mi Im, Soo; Ju Jeon, Myoung; Young Kim, In; Kim, Sun

2005-01-01

93

UK PubMed Central (United Kingdom)

There are many studies about cuffless and continuous blood pressure estimation using pulse transit time (PTT). In this study, we proposed the modeling method which could estimate systolic BP (SBP) conveniently and indirectly using PTT and some biometric parameters. 45 people participated in this study and we measured PTT using photoplethysmography (PPG) and electrocardiogram (ECG) signals and biometric parameters such as weight, height, body mass index (BMI), length of arm and circumference of arm. Before modeling, we selected variables as predictors using statistical analysis. With these parameters, we compared artificial neural network (ANN) with multiple regressions as an estimating method of BP. We evaluated the mean differences and standard deviations between estimated value and reference value, acquired from a KEDA-approved device. The results showed that the ANN had better accuracy than the multiple regression. ANN's estimation satisfied AAMI standard as a BP device.

Yi Kim J; Hwan Cho B; Mi Im S; Ju Jeon M; Young Kim I; Kim S

2005-01-01

94

Forecasting Gold Prices Using Multiple Linear Regression Method

Directory of Open Access Journals (Sweden)

Full Text Available Problem statement: Forecasting is a management function that assists decision making. It is also described as the process of estimation in unknown future situations. More generally it is known as prediction, which refers to the estimation of time series or longitudinal data. Gold is a precious yellow commodity once used as money. It was made illegal in USA 41 years ago, but is now once again accepted as a potential currency. The demand for this commodity is on the rise. Approach: The objective of this study was to develop a forecasting model for predicting gold prices based on economic factors such as inflation, currency price movements and others. Following the melt-down of the US dollar, investors are putting their money into gold because gold plays an important role as a stabilizing influence for investment portfolios. Due to the increase in demand for gold in Malaysia and other parts of the world, it is necessary to develop a model that reflects the structure and pattern of the gold market and forecasts the movement of the gold price. The most appropriate approach to the understanding of gold prices is the Multiple Linear Regression (MLR) model. MLR is a study of the relationship between a single dependent variable and one or more independent variables, in this case with gold price as the single dependent variable. The fitted MLR model will be used to predict future gold prices. A naive model known as "forecast-1" was considered as a benchmark model in order to evaluate the performance of the model. Results: Many factors determine the price of gold and, based on "a hunch of experts", several economic factors were identified as having an influence on gold prices. Variables such as the Commodity Research Bureau future index (CRB); USD/Euro Foreign Exchange Rate (EUROUSD); Inflation rate (INF); Money Supply (M1); New York Stock Exchange (NYSE); Standard and Poor 500 (SPX); Treasury Bill (T-BILL) and US Dollar index (USDX) were considered to have an influence on the prices. Parameter estimation for the MLR was carried out using the Statistical Package for the Social Sciences (SPSS), with Mean Square Error (MSE) as the fitness function to determine the forecast accuracy. Conclusion: Two models were considered. The first model considered all possible independent variables. The model appeared to be useful for predicting the price of gold, with 85.2% of sample variation in monthly gold prices explained by the model. The second model considered the following four independent variables to be significant: CRB lagged one, EUROUSD lagged one, INF lagged two and M1 lagged two. In terms of prediction, the second model achieved a high level of predictive accuracy. The amount of variance explained was about 70%, and the regression coefficients also provide a means of assessing the relative importance of individual variables in the overall prediction of gold price.

Z. Ismail; A. Yahya; A. Shabri

2009-01-01
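
The gold-price abstract above scores its regression model against a naive benchmark using mean square error. The sketch below shows how such a comparison can be set up. It assumes that the naive model simply carries the previous period's price forward, which the abstract does not spell out, and the price series is made up purely for illustration. The function names are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(series):
    """Assumed naive benchmark: the forecast for period t is the value at t-1."""
    return list(series[:-1])


if __name__ == "__main__":
    prices = [100.0, 102.0, 101.0, 105.0, 107.0]  # made-up monthly prices
    actual = prices[1:]                           # periods that have a forecast
    print(mean_squared_error(actual, naive_forecast(prices)))
```

A fitted regression model would be judged useful under this scheme only if its MSE over the same periods is lower than the benchmark's.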

95

Empirical Bayesian LASSO-logistic regression for multiple binary trait locus mapping.

UK PubMed Central (United Kingdom)

BACKGROUND: Complex binary traits are influenced by many factors, including the main effects of many quantitative trait loci (QTLs), the epistatic effects involving more than one QTL, environmental effects and the effects of gene-environment interactions. Although a number of QTL mapping methods for binary traits have been developed, there is still no efficient and powerful method that can handle both main and epistatic effects of a relatively large number of possible QTLs. RESULTS: In this paper, we use a Bayesian logistic regression model as the QTL model for binary traits that includes both main and epistatic effects. Our logistic regression model employs hierarchical priors for regression coefficients similar to the ones used in the Bayesian LASSO linear model for multiple QTL mapping for continuous traits. We develop efficient empirical Bayesian algorithms to infer the logistic regression model. Our simulation study shows that our algorithms can easily handle a QTL model with a large number of main and epistatic effects on a personal computer, and outperform five other methods examined, including the LASSO, HyperLasso, BhGLM, RVM and the single-QTL mapping method based on logistic regression, in terms of power of detection and false positive rate. The utility of our algorithms is also demonstrated through analysis of a real data set. A software package implementing the empirical Bayesian algorithms in this paper is freely available upon request. CONCLUSIONS: The EBLASSO logistic regression method can handle a large number of effects, possibly including the main and epistatic QTL effects, environmental effects and the effects of gene-environment interactions. It will be a very useful tool for multiple QTL mapping for complex binary traits.

Huang A; Xu S; Cai X

2013-01-01

96

MULTIPLE LOGISTIC REGRESSION MODEL TO PREDICT RISK FACTORS OF ORAL HEALTH DISEASES

Directory of Open Access Journals (Sweden)

Full Text Available Purpose: To analyze the dependence of oral health diseases, i.e. dental caries and periodontal disease, on a number of risk factors through the application of a logistic regression model. Method: The cross-sectional study involves a systematic random sample of 1760 permanent dentition aged between 18-40 years in Dharwad, Karnataka, India. Dharwad is situated in North Karnataka. The mean age was 34.26±7.28. The risk factors of dental caries and periodontal disease were established by a multiple logistic regression model using SPSS statistical software. Results: Factors such as frequency of brushing, timing of cleaning teeth and type of toothpaste are significant persistent predictors of dental caries and periodontal disease. For dental caries, the log likelihood of the full model is -1013.1364 and its Akaike's Information Criterion (AIC) is 1.1752, compared with -1019.8106 and 1.1748 for the reduced model. For periodontal disease, the log likelihood of the full model is -1085.7876 and its AIC is 1.2577, compared with -1019.8106 and 1.1748 for the reduced model. The area under the Receiver Operating Characteristic (ROC) curve for dental caries is 0.7509 (full model) and 0.7447 (reduced model); for periodontal disease it is 0.6128 (full model) and 0.5821 (reduced model). Conclusions: The frequency of brushing, timing of cleaning teeth and type of toothpaste are the main significant risk factors of dental caries and periodontal disease. The reduced logistic regression model fits slightly better than the full logistic regression model in identifying these risk factors for both dichotomous dental caries and periodontal disease.

Shivalingappa B. Javali; Parameshwar V. Pandit

2012-01-01
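
The AIC values in the oral health abstract above are much smaller than -2 times the reported log likelihoods, so they appear to be scaled by the sample size. The sketch below works through that reading under a clearly labeled assumption: AIC per observation = (-2 log L + 2k) / n, with n = 1760. The log likelihoods and AIC values are taken from the abstract; the scaling and the implied parameter counts k are inferences, not statements from the source, and the function names are hypothetical.

```python
def aic_per_observation(log_likelihood, n_params, n_obs):
    """Assumed scaling: (-2 * logL + 2 * k) / n."""
    return (-2.0 * log_likelihood + 2.0 * n_params) / n_obs


def implied_parameter_count(log_likelihood, aic_per_obs, n_obs):
    """Solve (-2 * logL + 2 * k) / n = aic_per_obs for k."""
    return (aic_per_obs * n_obs + 2.0 * log_likelihood) / 2.0


if __name__ == "__main__":
    n = 1760  # sample size reported in the abstract
    # Dental caries models: (log likelihood, reported AIC)
    for label, log_l, aic in (("full", -1013.1364, 1.1752),
                              ("reduced", -1019.8106, 1.1748)):
        print(label, implied_parameter_count(log_l, aic, n))
```

Under this assumption both dental caries models give a k close to a whole number, which is at least consistent with the per-observation reading. It also illustrates why the reduced model can have the lower AIC despite its lower log likelihood: the penalty term shrinks by more than the fit term grows.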

97

Multiple regression approach to optimize drilling operations in the Arabian Gulf area

Energy Technology Data Exchange (ETDEWEB)

This paper reports a successful application of multiple regression analysis, supported by a detailed statistical study to verify the Bourgoyne and Young model. The model estimates the optimum penetration rate (ROP), weight on bit (WOB), and rotary speed under the effect of controllable and uncontrollable factors. Field data from three wells in the Arabian Gulf were used and emphasized the validity of this model. The model coefficients are sensitive to the number of points included. The correlation coefficients and multicollinearity sensitivity of each drilling parameter on the ROP are studied.

Al-Betairi, E.A.; Moussa, M.M.; Al-Otaibi, S.

1988-03-01

98

We consider regression analysis when covariate variables are the underlying regression coefficients of another linear mixed model. A naive approach is to use each subject's repeated measurements, which are assumed to follow a linear mixed model, and obtain subject-specific estimated coefficients to replace the covariate variables. However, directly replacing the unobserved covariates in the primary regression by these estimated coefficients may result in a significantly biased estimator. The aforementioned problem can be evaluated as a generalization of the classical additive error model where repeated measures are considered as replicates. To correct for these biases, we investigate a pseudo-expected estimating equation (EEE) estimator, a regression calibration (RC) estimator, and a refined version of the RC estimator. For linear regression, the first two estimators are identical under certain conditions. However, when the primary regression model is a nonlinear model, the RC estimator is usually biased. We thus consider a refined regression calibration estimator whose performance is close to that of the pseudo-EEE estimator but does not require numerical integration. The RC estimator is also extended to the proportional hazards regression model. In addition to the distribution theory, we evaluate the methods through simulation studies. The methods are applied to analyze a real dataset from a child growth study. PMID:10877308

Wang, C Y; Wang, N; Wang, S

2000-06-01

99

UK PubMed Central (United Kingdom)

We consider regression analysis when covariate variables are the underlying regression coefficients of another linear mixed model. A naive approach is to use each subject's repeated measurements, which are assumed to follow a linear mixed model, and obtain subject-specific estimated coefficients to replace the covariate variables. However, directly replacing the unobserved covariates in the primary regression by these estimated coefficients may result in a significantly biased estimator. The aforementioned problem can be evaluated as a generalization of the classical additive error model where repeated measures are considered as replicates. To correct for these biases, we investigate a pseudo-expected estimating equation (EEE) estimator, a regression calibration (RC) estimator, and a refined version of the RC estimator. For linear regression, the first two estimators are identical under certain conditions. However, when the primary regression model is a nonlinear model, the RC estimator is usually biased. We thus consider a refined regression calibration estimator whose performance is close to that of the pseudo-EEE estimator but does not require numerical integration. The RC estimator is also extended to the proportional hazards regression model. In addition to the distribution theory, we evaluate the methods through simulation studies. The methods are applied to analyze a real dataset from a child growth study.

Wang CY; Wang N; Wang S

2000-06-01

100

Joint regression analysis for discrete longitudinal data.

UK PubMed Central (United Kingdom)

We introduce an approximation to the Gaussian copula likelihood of Song, Li, and Yuan (2009, Biometrics 65, 60-68) used to estimate regression parameters from correlated discrete or mixed bivariate or trivariate outcomes. Our approximation allows estimation of parameters from response vectors of length much larger than three, and is asymptotically equivalent to the Gaussian copula likelihood. We estimate regression parameters from the toenail infection data of De Backer et al. (1996, British Journal of Dermatology 134, 16-17), which consist of binary response vectors of length seven or less from 294 subjects. Although maximizing the Gaussian copula likelihood yields estimators that are asymptotically more efficient than generalized estimating equation (GEE) estimators, our simulation study illustrates that for finite samples, GEE estimators can actually be as much as 20% more efficient.

Madsen L; Fang Y

2011-09-01

101

UK PubMed Central (United Kingdom)

A first-hitting-time (FHT) survival model postulates a health status process for a patient that gradually declines until the patient dies when the level first reaches a critical threshold. Threshold regression (TR) is a new regression methodology that incorporates the effects of covariates on the threshold and process parameters of this FHT model. In this study, we use TR to analyze data from a randomized clinical trial of treatment for multiple myeloma. The trial compares VELCADE and high-dose dexamethasone, the former a new therapy and the latter an established therapy for this disease. Patients are switched between the two drugs based on patient response. The novel contribution of this work is the modeling of this clinical trial design using a mixture of TR models. Specifically, we propose a mixture FHT model to fit the survival distribution. The model includes a composite time scale that differentiates the rate of disease progression before and after switching. The analysis shows significant benefit from initial treatment by VELCADE. A comparison is made with a Cox proportional hazards regression analysis of the same data.

Lee ML; Chang M; Whitmore GA

2008-01-01

102

Application of Partial Least-Squares Regression Model on Temperature Analysis and Prediction of RCCD

Directory of Open Access Journals (Sweden)

Full Text Available This study, based on temperature monitoring data from the Jiangya RCCD, uses the principle and method of partial least-squares regression to analyze and predict the temperature variation of the RCCD. The partial least-squares regression model overcomes the multiple correlation among independent variables and combines multiple linear regression with canonical correlation analysis. Compared with the general least-squares regression model, it is more advanced and accurate and has a more practical explanation. It proved feasible and practical, so it can be used to predict concrete temperature. The calculated results show that rock temperature is the most important factor affecting RCCD temperature. RCCD temperature is decreasing with rock temperature. We suggest that rock temperature should be the emphasis of future monitoring; this can provide a scientific basis for temperature control and for preventing RCCD cracking.

Yuqing Zhao; Zhenxian Xing

2013-01-01

103

Semiparametric regression analysis of interval-censored data.

UK PubMed Central (United Kingdom)

We propose a semiparametric approach to the proportional hazards regression analysis of interval-censored data. An EM algorithm based on an approximate likelihood leads to an M-step that involves maximizing a standard Cox partial likelihood to estimate regression coefficients and then using the Breslow estimator for the unknown baseline hazards. The E-step takes a particularly simple form because all incomplete data appear as linear terms in the complete-data log likelihood. The algorithm of Turnbull (1976, Journal of the Royal Statistical Society, Series B 38, 290-295) is used to determine times at which the hazard can take positive mass. We found multiple imputation to yield an easily computed variance estimate that appears to be more reliable than asymptotic methods with small to moderately sized data sets. In the right-censored survival setting, the approach reduces to the standard Cox proportional hazards analysis, while the algorithm reduces to the one suggested by Clayton and Cuzick (1985, Applied Statistics 34, 148-156). The method is illustrated on data from the breast cancer cosmetics trial, previously analyzed by Finkelstein (1986, Biometrics 42, 845-854) and several subsequent authors.

Goetghebeur E; Ryan L

2000-12-01

104

Directory of Open Access Journals (Sweden)

Full Text Available Both linear neural network and multiple linear regression models can be used for multi-factor analysis and forecasting, but the data of the multiple linear regression model are required to meet such conditions as independence and normality, while the data of the linear neural network are only required to have a linear relationship. This article uses the same set of data to establish respectively a linear neural network model and a multiple linear regression model, compares the abilities of fitting and forecasting of the two kinds of models, and consequently, comes to the conclusion that the linear neural network method has a stronger fitting ability and a more stable ability of prediction so that it can be further applied and promoted in the analyzing and forecasting of continuous data factors.

Guoli Wang; Jianhui Wu; Jianhua Wu; Xiaohong Wang

2011-01-01

105

UK PubMed Central (United Kingdom)

1. There are two reasons for wanting to compare measurers or methods of measurement. One is to calibrate one method or measurer against another; the other is to detect bias. Fixed bias is present when one method gives higher (or lower) values across the whole range of measurement. Proportional bias is present when one method gives values that diverge progressively from those of the other. 2. Linear regression analysis is a popular method for comparing methods of measurement, but the familiar ordinary least squares (OLS) method is rarely acceptable. The OLS method requires that the x values are fixed by the design of the study, whereas it is usual that both y and x values are free to vary and are subject to error. In this case, special regression techniques must be used. 3. Clinical chemists favour techniques such as major axis regression ('Deming's method'), the Passing-Bablok method or the bivariate least median squares method. Other disciplines, such as allometry, astronomy, biology, econometrics, fisheries research, genetics, geology, physics and sports science, have their own preferences. 4. Many Monte Carlo simulations have been performed to try to decide which technique is best, but the results are almost uninterpretable. 5. I suggest that pharmacologists and physiologists should use ordinary least products regression analysis (geometric mean regression, reduced major axis regression): it is versatile, can be used for calibration or to detect bias and can be executed by hand-held calculator or by using the loss function in popular, general-purpose, statistical software.

Ludbrook J

2010-07-01
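
The method-comparison abstract above recommends ordinary least products regression (also called geometric mean or reduced major axis regression). A minimal sketch of that fit follows, using the standard formulas: the slope is the ratio of the standard deviations of y and x with the sign of their covariance, and the line passes through the two means. The function name and the example readings are hypothetical; confidence intervals, which a real bias assessment would need, are not computed here.

```python
from math import copysign, sqrt


def least_products_fit(x, y):
    """Return (slope, intercept) of the ordinary least products line."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0 or sxy == 0:
        raise ValueError("slope is undefined for constant or uncorrelated data")
    # Magnitude sd(y)/sd(x); sign taken from the covariance.
    slope = copysign(sqrt(syy / sxx), sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    method_a = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up readings from one method
    method_b = [2.0, 1.0, 4.0, 3.0, 6.0]  # made-up readings from the other
    print(least_products_fit(method_a, method_b))
```

In terms of the abstract's definitions, an intercept away from 0 would point toward fixed bias and a slope away from 1 toward proportional bias. Unlike the OLS slope, this slope treats x and y symmetrically, which suits the case where both are measured with error.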

106

Geary on inference in multiple regression and on closeness and the Taxi problem

Digital Repository Infrastructure Vision for European Research (DRIVER)

for discussion — (a) his work with Leser on "paradoxical" situations in multiple regression and (b) his work on estimation of the unknown upper bound, N, of a uniform distribution, based on a sample of n values from...

Spencer, John E.; Largey, Ann; Geary, R. C.

107

Analysis of surfaces using constrained regression models.

We present a study of the relationship between the changes in the shape of the human ear due to jaw movement and acoustical feedback (AF) in hearing aids. In particular, we analyze the deformation field of the outer ear associated with the movement of the mandible (jaw bone) to understand its effect on AF and identify local regions that play a significant role. Our data contains ear impressions of 42 hearing aid users, in two different positions: open and closed mouth, and survey data including information about experienced discomfort due to AF. We use weighted support vector machines (WSVM) to investigate the separation between the presence and lack of AF and achieve classification accuracy of 80% based on the deformation field. To robustly localize the regions of the deformation field that significantly contribute to AF we employ logistic regression penalized with elastic net (EN). By visualizing the selected variables on the mean surface, we provide clinical interpretations of the results. PMID:18979824

Darkner, Sune; Sabuncu, Mert R; Golland, Polina; Paulsen, Rasmus R; Larsen, Rasmus

2008-01-01

108

Directory of Open Access Journals (Sweden)

Full Text Available Landslides are a natural hazard that causes much damage to the environment. Depending on the landform, several factors can cause a landslide. This research addresses a methodology for landslide susceptibility mapping using multiple regression analysis and GIS tools. Based on the initial hypothesis, ten factors were recognized as effective elements in landslides: geology, slope, aspect, distance from roads, faults and drainage network, soil capability, land use and rainfall. Crossing the investigated parameters with the observed landslides indicated that three factors (distance from channel network, distance from faults and rainfall) have no major effect on the observed landslides in the Tajan area. To quantify the parameters in the form of weighting factors, the coverage of landslides in different observations was determined. A stepwise method was then used for the statistical analysis. Slope, aspect, distance from roads and soil capability were found to be the most effective factors in landslides, in that order.

Somayeh Mashari; Karim Solaimani; Ebrahim Omidvar

2012-01-01

109

Generalized regression neural networks with multiple-bandwidth sharing and hybrid optimization.

UK PubMed Central (United Kingdom)

This paper proposes a novel algorithm for function approximation that extends the standard generalized regression neural network. Instead of a single bandwidth for all the kernels, we employ a multiple-bandwidth configuration. However, unlike previous works that use clustering of the training data for the reduction of the number of bandwidths, we propose a distinct scheme that manages a dramatic bandwidth reduction while preserving the required model complexity. In this scheme, the algorithm partitions the training patterns to groups, where all patterns within each group share the same bandwidth. Grouping relies on the analysis of the local nearest neighbor distance information around the patterns and the principal component analysis with fuzzy clustering. Furthermore, we use a hybrid optimization procedure combining a very efficient variant of the particle swarm optimizer and a quasi-Newton method for global optimization and locally optimal fine-tuning of the network bandwidths. Training is based on the minimization of a flexible adaptation of the leave-one-out validation error that enhances the network generalization. We test the proposed algorithm with real and synthetic datasets, and results show that it exhibits competitive regression performance compared to other techniques.

Goulermas JY; Zeng XJ; Liatsis P; Ralph JF

2007-12-01
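
The multiple-bandwidth idea in the entry above can be illustrated with a minimal sketch. It assumes a one-dimensional input and a plain Gaussian-kernel weighted average (the usual generalized regression neural network predictor), with each training pattern carrying the bandwidth of its group. The function name, the data and the two-group assignment are hypothetical; the paper's grouping scheme and hybrid optimizer are not reproduced here.

```python
import math

def grnn_predict(x, train_x, train_y, bandwidths):
    """Predict y at scalar x as a Gaussian-kernel weighted average of train_y.

    bandwidths[i] is the kernel width attached to training pattern i, so
    patterns in the same group simply carry the same value.
    """
    weights = [
        math.exp(-((x - xi) ** 2) / (2.0 * h ** 2))
        for xi, h in zip(train_x, bandwidths)
    ]
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed; fall back to the plain mean of the targets.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Hypothetical data: four patterns split into two groups that share a bandwidth.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [0.0, 1.0, 4.0, 9.0]
group_bandwidth = {0: 0.3, 1: 0.8}   # assumed values, not taken from the paper
group_of_pattern = [0, 0, 1, 1]
bandwidths = [group_bandwidth[g] for g in group_of_pattern]
prediction = grnn_predict(1.5, train_x, train_y, bandwidths)
```

Sharing one bandwidth per group keeps the number of free parameters equal to the number of groups rather than the number of training patterns, which is the reduction the abstract describes.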

110

Estimating changes in river faecal coliform loading using nonparametric multiplicative regression.

UK PubMed Central (United Kingdom)

Faecal coliform (FC) concentration was monitored weekly in the Tangipahoa River over an eight year period. Available USGS discharge and precipitation data were used to construct a nonparametric multiplicative regression (NPMR) model for both forecasting and backcasting of FC density. NPMR backcasting and forecasting of FC allowed for estimation of concentration for any flow regime. During this study a remediation effort was undertaken to improve disinfection systems of contributing municipal waste water treatment plants in the watershed. Time-series analysis of FC concentrations demonstrated a drop in FC levels coinciding with remediation efforts. The NPMR model suggested the reduction in FC levels was not due to climate variance (i.e. discharge and precipitation changes) alone. Use of the NPMR method circumvented the need for construction of a more complex physical watershed model to estimate FC loading in the river. This method can be used to detect and estimate new discharge impacts, or forecast daily FC estimates.

Schulz CJ; Childers GW

2011-03-01

111

Latent class regression: inference and estimation with two-stage multiple imputation.

Latent class regression (LCR) is a popular method for analyzing multiple categorical outcomes. While nonresponse to the manifest items is a common complication, inferences of LCR can be evaluated using maximum likelihood, multiple imputation, and two-stage multiple imputation. Under similar missing data assumptions, the estimates and variances from all three procedures are quite close. However, multiple imputation and two-stage multiple imputation can provide additional information: estimates for the rates of missing information. The methodology is illustrated using an example from a study on racial and ethnic disparities in breast cancer severity. PMID:23712802

Harel, Ofer; Chung, Hwan; Miglioretti, Diana

2013-05-26
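
As a rough illustration of how ordinary multiple imputation yields a rate of missing information, the sketch below pools per-imputation estimates with Rubin's rules. It uses the simple large-sample ratio, without the small-sample degrees-of-freedom correction, and does not cover the two-stage procedure studied in the entry above. The function name and the numbers are hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their variances with Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)            # combined point estimate
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    # Share of the total variance attributable to the missing data.
    missing_info = (1.0 + 1.0 / m) * between / total
    return pooled, total, missing_info

# Hypothetical coefficient estimates from three imputed data sets.
pooled, total_var, missing_info = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
```

For these illustrative inputs the within and between variances are both 0.04, so a little over half of the total variance is attributed to missingness.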

112

Latent class regression: inference and estimation with two-stage multiple imputation.

UK PubMed Central (United Kingdom)

Latent class regression (LCR) is a popular method for analyzing multiple categorical outcomes. While nonresponse to the manifest items is a common complication, inferences of LCR can be evaluated using maximum likelihood, multiple imputation, and two-stage multiple imputation. Under similar missing data assumptions, the estimates and variances from all three procedures are quite close. However, multiple imputation and two-stage multiple imputation can provide additional information: estimates for the rates of missing information. The methodology is illustrated using an example from a study on racial and ethnic disparities in breast cancer severity.

Harel O; Chung H; Miglioretti D

2013-07-01

113

UK PubMed Central (United Kingdom)

In the logistic regression analysis of a small-sized, case-control study on Alzheimer's disease, some of the risk factors exhibited missing values, motivating the use of multiple imputation. Usually, Rubin's rules (RR) for combining point estimates and variances would then be used to estimate (symmetric) confidence intervals (CIs), on the assumption that the regression coefficients were distributed normally. Yet, rarely is this assumption tested, with or without transformation. In analyses of small, sparse, or nearly separated data sets, such symmetric CI may not be reliable. Thus, RR alternatives have been considered, for example, Bayesian sampling methods, but not yet those that combine profile likelihoods, particularly penalized profile likelihoods, which can remove first order biases and guarantee convergence of parameter estimation. To fill the gap, we consider the combination of penalized likelihood profiles (CLIP) by expressing them as posterior cumulative distribution functions (CDFs) obtained via a chi-squared approximation to the penalized likelihood ratio statistic. CDFs from multiple imputations can then easily be averaged into a combined CDF_c, allowing confidence limits for a parameter β at level 1 - α to be identified as those β* and β** that satisfy CDF_c(β*) = α/2 and CDF_c(β**) = 1 - α/2. We demonstrate that the CLIP method outperforms RR in analyzing both simulated data and data from our motivating example. CLIP can also be useful as a confirmatory tool, should it show that the simpler RR are adequate for extended analysis. We also compare the performance of CLIP to Bayesian sampling methods using Markov chain Monte Carlo. CLIP is available in the R package logistf. Copyright © 2013 John Wiley & Sons, Ltd.

Heinze G; Ploner M; Beyea J

2013-07-01
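
The averaging-and-inversion step described in the entry above can be sketched as follows. This is only an illustration under a loud assumption: normal CDFs stand in for the per-imputation CDFs, which in the paper come from penalized likelihood profiles. The helper names and numbers are hypothetical, and this is not the logistf implementation.

```python
from statistics import NormalDist

def combined_cdf(value, cdfs):
    """Average the per-imputation CDFs at one parameter value."""
    return sum(cdf(value) for cdf in cdfs) / len(cdfs)

def invert_combined_cdf(cdfs, target, low, high, tol=1e-10):
    """Find the value where the combined CDF equals target, by bisection.

    Assumes the combined CDF is increasing and that [low, high] brackets target.
    """
    while high - low > tol:
        mid = (low + high) / 2.0
        if combined_cdf(mid, cdfs) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

# Stand-in CDFs for three imputed data sets (assumed normal, for illustration only).
imputation_cdfs = [
    NormalDist(mu, sigma).cdf for mu, sigma in [(0.9, 0.5), (1.1, 0.6), (1.0, 0.4)]
]
alpha = 0.05
lower_limit = invert_combined_cdf(imputation_cdfs, alpha / 2.0, -10.0, 10.0)
upper_limit = invert_combined_cdf(imputation_cdfs, 1.0 - alpha / 2.0, -10.0, 10.0)
```

Because the limits are read off the averaged CDF rather than built as estimate plus or minus a multiple of a standard error, the resulting interval need not be symmetric around the point estimate.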

114

A method for the analysis of capillary column polychlorinated biphenyl (PCB) data using regression analysis with outlier checking and elimination, COMSTAR, is presented and evaluated. This algorithm determines the best combination of the commercial PCB mixtures which best fits the...

115

Comparison of Fuzzy Inference System and Multiple Regression to Predict Synthetic Envelopes Clogging

Directory of Open Access Journals (Sweden)

Full Text Available Geo-synthetic materials are being used with acceptable performance in soil and water projects worldwide. Geotextiles are one of the categories of geo-synthetics used in drainage systems. The first generation of geotextiles was used in the late 1950s as an alternative to gravel envelopes. In this research, two methods (multiple regression and a fuzzy inference system) are evaluated for predicting synthetic envelope clogging. In the multiple regression method, the correlation coefficients for PP450, PP700 and PP900 are 62.66%, 79.37% and 90.62%, respectively. The results of the fuzzy inference system with a decision tree showed that this method has high potential in comparison with multiple regression: the total classification accuracy values for PP450, PP700 and PP900 are 98.6%, 97.3% and 98%, respectively. The final results of this research showed that fuzzy inference systems using a decision tree have high potential for predicting clogging in envelopes.

Bakhtiar Karimi; Farhad Mirzaei; Mohammad Javad Nahvinia; Behnam Ababaei

2010-01-01

116

Directory of Open Access Journals (Sweden)

Full Text Available Polyethylene glycol (PEG) is the most common preservative in use for bulking and maintaining structural integrity in waterlogged wood. Conservators therefore have a need to be able to determine PEG concentrations in wood in a non-destructive manner. We present a study highlighting the application of infrared spectroscopy coupled with multivariate analysis techniques to predict the concentration of polyethylene glycol 400 (PEG-400) and water simultaneously. This technique uses attenuated total reflectance (ATR) spectroscopy and unconstrained stepwise multiple linear regression (SMLR) analysis for prediction of multiple components in archaeological wood. Using this model we have calculated the concentration of PEG-400 and water in treated archaeological waterlogged wood samples.

Rohan PATEL; Jessica BINGHAM; Shanna DANIEL; Sarah WATKINS-KENNEY; Anthony KENNEDY

2012-01-01

117

Egg hatchability prediction by multiple linear regression and artificial neural networks

Directory of Open Access Journals (Sweden)

Full Text Available An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. The network was trained with the backpropagation algorithm on data obtained from industrial incubations. The ANN model was chosen because its output fit the experimental data better than that of the multiple linear regression model, whose coefficients were determined by the least squares method. The simulation results of these approaches indicate that this ANN can be used for incubation performance prediction.

AC Bolzan; RAF Machado; JCZ Piaia

2008-01-01

118

Linear regression analysis of survival data with missing censoring indicators.

Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial. PMID:20559722

Wang, Qihua; Dinse, Gregg E

2010-06-18

119

Linear regression analysis of survival data with missing censoring indicators.

UK PubMed Central (United Kingdom)

Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.

Wang Q; Dinse GE

2011-04-01

120

Directory of Open Access Journals (Sweden)

Full Text Available This paper introduces a statistical model using statistical methods in a 2G GSM communication system. A multiple regression formula is used to calculate path loss. It is assumed that hb, W and ? are three statistical variables. We use the Nakagami distribution to model hb and W, and the uniform distribution to model ?.

Meenal Sharma; Rakesh Mohan

2011-01-01

121

This paper represents an attempt to test and validate a previously developed model for predicting trihalomethane (THM) formation in chlorinated waters containing THM precursors. The original model, in the form of a nonlinear multiple regression equation, did not prove to be extrem...

122

Ratio Versus Regression Analysis: Some Empirical Evidence in Brazil

Digital Repository Infrastructure Vision for European Research (DRIVER)

This work compares the traditional methodology for ratio analysis, applied to a sample of Brazilian firms, with the alternative of regression analysis, for both cross-industry and intra-industry samples. The structural validity of the traditional methodology was tested through a model that repr...

José Paulo de Lucca Ramos; Newton Carneiro Affonso da Costa Jr.

123

Comparison of Artificial Neural Networks and Logistic Regression Analysis

Directory of Open Access Journals (Sweden)

Full Text Available Objectives: The factors that affect students’ alcohol use behaviors were examined by logistic regression analysis and artificial neural networks and the efficiency of these methods in identifying alcohol users and non-users was compared using the receiver operating characteristics (ROC) curve method. Study Design: Graduate students of 1-4 years in Trakya University Medical Faculty (2003-2004) were administered a questionnaire to predict their alcohol use behaviors and were assessed with the Frontal Lobe Personality Scale. Results: Logistic regression analysis showed that the following variables highly affected alcohol use behaviors of the students: visiting bars, discos or cafes in their spare time (OR=1.920; p<0.05), the importance of religion (OR=0.454; p<0.001), the number of alcohol-user friends (OR=2.441; p<0.001), insistence of friends on taking alcohol (OR=1.557; p<0.01), and impulsiveness (OR=1.826; p<0.001). Comparison between logistic regression analysis and artificial neural networks showed no differences in terms of the areas under the ROC curves of hyperbolic tangent-hyperbolic tangent function and hyperbolic tangent-logistic function artificial neural networks, but these models showed statistically larger areas than the other models. Conclusion: It may be necessary to take into account the advantages and disadvantages of artificial neural networks and logistic regression in classification and modelling, and to use artificial neural networks to eliminate insignificant variables of logistic regression analysis.

Imran KURT; Mevlut TURE

2005-01-01
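
The area under the ROC curve used in the entry above to compare the classifiers can be computed directly from predicted scores. The sketch below uses the pairwise (Mann-Whitney) form: the fraction of user/non-user pairs in which the user receives the higher score, counting ties as one half. The scores are hypothetical and not taken from the study.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve as the fraction of correctly ordered pairs."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5      # a tie counts as half a correct ordering
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical predicted probabilities of alcohol use from some fitted model.
users = [0.9, 0.8, 0.4]
non_users = [0.7, 0.3, 0.2]
auc = roc_auc(users, non_users)
```

An area of 0.5 corresponds to random ordering and 1.0 to perfect separation, so two models can be compared on the same students by comparing these areas.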

124

Ratio Versus Regression Analysis: Some Empirical Evidence in Brazil

Directory of Open Access Journals (Sweden)

Full Text Available This work compares the traditional methodology for ratio analysis, applied to a sample of Brazilian firms, with the alternative of regression analysis, for both cross-industry and intra-industry samples. The structural validity of the traditional methodology was tested through a model that represents its analogous regression format. The data are from 156 Brazilian public companies in nine industrial sectors for the year 1997. The results provide weak empirical support for the traditional ratio methodology, as the validity of this methodology was found to differ between ratios.

José Paulo de Lucca Ramos; Newton Carneiro Affonso da Costa Jr.

2004-01-01

125

UK PubMed Central (United Kingdom)

A simultaneous confidence band provides useful information on the plausible range of the unknown regression model, and different confidence bands can often be constructed for the same regression model. For a simple regression line, it is proposed in Liu and Hayter (2007) to use the area of the confidence set that corresponds to a confidence band as an optimality criterion in comparison of confidence bands; the smaller is the area of the confidence set, the better is the corresponding confidence band. This minimum area confidence set (MACS) criterion can clearly be generalized to the minimum volume confidence set (MVCS) criterion in study of confidence bands for a multiple linear regression model. In this paper the hyperbolic and constant width confidence bands for a multiple linear regression model over a particular ellipsoidal region of the predictor variables are compared under the MVCS criterion. It is observed that whether one band is better than the other depends on the magnitude of one particular angle that determines the size of the predictor variable region. When the angle and so the size of the predictor variable region is small, the constant width band is better than the hyperbolic band but only marginally. When the angle and so the size of the predictor variable region is large the hyperbolic band can be substantially better than the constant width band.

Liu W; Hayter AJ; Piegorsch WW

2009-08-01

126

A simultaneous confidence band provides useful information on the plausible range of the unknown regression model, and different confidence bands can often be constructed for the same regression model. For a simple regression line, it is proposed in Liu and Hayter (2007) to use the area of the confidence set that corresponds to a confidence band as an optimality criterion in comparison of confidence bands; the smaller is the area of the confidence set, the better is the corresponding confidence band. This minimum area confidence set (MACS) criterion can clearly be generalized to the minimum volume confidence set (MVCS) criterion in study of confidence bands for a multiple linear regression model. In this paper the hyperbolic and constant width confidence bands for a multiple linear regression model over a particular ellipsoidal region of the predictor variables are compared under the MVCS criterion. It is observed that whether one band is better than the other depends on the magnitude of one particular angle that determines the size of the predictor variable region. When the angle and so the size of the predictor variable region is small, the constant width band is better than the hyperbolic band but only marginally. When the angle and so the size of the predictor variable region is large the hyperbolic band can be substantially better than the constant width band. PMID:20368761

Liu, W; Hayter, A J; Piegorsch, W W

2009-08-01

127

Energy Technology Data Exchange (ETDEWEB)

Increases were noted in the concentrations of total suspended particulate matter and several metals in New York City air in late 1968. This shift in the air quality persisted until late 1980. We have found that the change in air quality affects the multiple regression technique for apportioning sources of the suspended particulate matter. Separate regression models are required for coarse and fine particles. The shift appears related mainly to increased coarse particle emissions possibly related to changing sources of residual oil for use in boilers used for heating systems.

Kneip, T.J. (New York Univ. Medical Center, New York); Mallon, R.P.; Kleinman, M.T.

1983-01-01

128

MONEY DEMAND IN ROMANIAN ECONOMY, USING MULTIPLE REGRESSION METHOD AND UNRESTRICTED VAR MODEL

Directory of Open Access Journals (Sweden)

Full Text Available The paper describes money demand in the Romanian economy using two econometric models. The first model consists of a multiple regression relating money demand to the monthly inflation rate, the industrial production index and the RON/Euro foreign exchange rate. The second model (an unrestricted vector autoregressive model) is applied to the same variables used in the first model. Identifying a statistically strong model, capable of stable estimations of the money demand function in Romania's economy, constitutes a prerequisite for the application of an efficient monetary policy.

Mariana KAZNOVSKY

2008-01-01

129

Directory of Open Access Journals (Sweden)

Full Text Available Productivity is generally interpreted as the relation between input and output, that is, a comparison between the inputs and the resulting output. The measurement of productivity is one of the major indicators for assessing a company's competitiveness. PT Taman Batu Alam is a natural stone company that, as it grows, continually strives to increase productivity by making improvements in production. The measurement and performance analysis of the transformation process are carried out using multiple regression analysis. This model was selected because its form is simple and easy to understand. It directly depicts the performance measures, namely the efficiency index and the production function, which shows the elasticity of each input used to produce the output. The calculation results show that the efficiency index is 5.57 for 2007 and 1094.44 for 2008. Returns to scale were increasing in 2007 and decreasing in 2008. The input elasticities for 2007 are 0.39 for raw material, 0.22 for labour and 0.42 for overhead expense; for 2008 they are 0.39 for raw material, 0.165 for labour and 0.237 for overhead expense.

Yuliastuti Ramadhani

2011-01-01
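
The returns-to-scale result reported in the entry above follows from the elasticities if one assumes a Cobb-Douglas production function, output = A * raw^a * labour^b * overhead^c, where A is the efficiency index. The abstract does not name the functional form, so this is an assumption. Under it, returns to scale are increasing when a + b + c exceeds 1 and decreasing when the sum is below 1. The sketch below checks this with the elasticities quoted in the abstract; the function names are hypothetical.

```python
import math

def returns_to_scale(elasticities):
    """Classify returns to scale from the sum of the input elasticities."""
    total = sum(elasticities.values())
    if math.isclose(total, 1.0, abs_tol=1e-9):
        label = "constant"
    elif total > 1.0:
        label = "increasing"
    else:
        label = "decreasing"
    return total, label

def cobb_douglas_output(efficiency_index, inputs, elasticities):
    """Output of an assumed Cobb-Douglas function: A times each input ** elasticity."""
    output = efficiency_index
    for name, quantity in inputs.items():
        output *= quantity ** elasticities[name]
    return output

# Elasticities as reported in the abstract.
elasticities_2007 = {"raw_material": 0.39, "labour": 0.22, "overhead": 0.42}
elasticities_2008 = {"raw_material": 0.39, "labour": 0.165, "overhead": 0.237}
total_2007, label_2007 = returns_to_scale(elasticities_2007)
total_2008, label_2008 = returns_to_scale(elasticities_2008)
```

The sums are about 1.03 for 2007 and 0.792 for 2008, which agrees with the increasing and decreasing returns to scale stated in the abstract.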

130

Multiple regression as a preventive tool for determining the risk of Legionella spp.

Directory of Open Access Journals (Sweden)

Full Text Available Objective. To determine the interrelationship between health and hygiene conditions for the prevention of legionellosis, the composition of materials used in water distribution systems, the water origin and Legionella pneumophila risk. Material and methods. A descriptive study and multiple regression analysis were carried out on a sample of golf course sprinkler irrigation systems (n=31) belonging to hotels located on the Costa del Sol (Malaga, Spain). The study was carried out in 2009. Results. A significant linear relation was found, with all the independent variables contributing significantly (p<0.05) to the model's fit. The relationship between water type and the risk of Legionella, as well as between the material composition and that risk, is linear and positive. In contrast, the relationship between health-hygiene conditions and Legionella risk is linear and negative. Conclusion. The characterization of Legionella pneumophila concentration, as defined by the risk in water and through use of the predictive method, can contribute to the consideration of new influence variables in the development of the agent, resulting in improved control and prevention of the disease.

Enrique Gea-Izquierdo

2012-01-01

131

International Nuclear Information System (INIS)

Fundamental parameters calculations are used for the analysis of europium in the concentration range of 0.1 wt% to 30.0 wt% in the oxidic catalyst supports alumina, calcia, magnesia, lanthania, and thoria. The precision and accuracy of this method depend on how the sample matrix is defined in the fundamental parameters program and on the number and concentration of the standards used. Results comparable to the multiple regression method are obtained when the matrix stoichiometry is defined as Eu2O3 and the catalyst oxide (i.e. Al2O3, etc.). It is also necessary to use standards which bracket the europium concentration in the samples. When these conditions are met, the results are comparable to those obtained from a ten point multiple regression calibration curve but with a considerable saving of standard preparation time. The precision is better than + or - 2% relative. The % relative difference between the fundamental parameters and multiple regression results is also 2%. Data are presented which illustrate the effect of defining the sample stoichiometry in the XRF11 computer program.

1984-08-03

132

The current study investigates the possibility of obtaining the anthropometric dimensions, critical to school furniture design, without measuring all of them. The study first selects some anthropometric dimensions that are easy to measure. Two methods are then used to check if these easy-to-measure dimensions can predict the dimensions critical to the furniture design. These methods are multiple linear regression and neural networks. Each dimension that is deemed necessary to ergonomically design school furniture is expressed as a function of some other measured anthropometric dimensions. Results show that out of the five dimensions needed for chair design, four can be related to other dimensions that can be measured while children are standing. Therefore, the method suggested here would definitely save time and effort and avoid the difficulty of dealing with students while measuring these dimensions. In general, it was found that neural networks perform better than multiple linear regression in the current study. PMID:22365329

Agha, Salah R; Alnahhal, Mohammed J

2012-02-25

133

UK PubMed Central (United Kingdom)

The current study investigates the possibility of obtaining the anthropometric dimensions, critical to school furniture design, without measuring all of them. The study first selects some anthropometric dimensions that are easy to measure. Two methods are then used to check if these easy-to-measure dimensions can predict the dimensions critical to the furniture design. These methods are multiple linear regression and neural networks. Each dimension that is deemed necessary to ergonomically design school furniture is expressed as a function of some other measured anthropometric dimensions. Results show that out of the five dimensions needed for chair design, four can be related to other dimensions that can be measured while children are standing. Therefore, the method suggested here would definitely save time and effort and avoid the difficulty of dealing with students while measuring these dimensions. In general, it was found that neural networks perform better than multiple linear regression in the current study.

Agha SR; Alnahhal MJ

2012-11-01

134

Affine Invariant Descriptors of 3D Object Using Multiple Regression Model

Directory of Open Access Journals (Sweden)

Full Text Available In this work, a new invariant method [1,2,3] for 3D objects is proposed using a multiple regression model. This method consists of extracting an invariant vector using the multiple linear parameters model applied to the 3D object; it is invariant against affine transformation of this object. The 3D objects concerned are transformations of 3D objects by one element of the overall transformation. The set of transformations considered in this work is the general affine group.

M. Elhachloufi; A. El Oirrak; D. Aboutajdine; M.N. Kaddioui

2011-01-01

135

Statistical studies of mortality and air pollution multiple regression analyses by cause of death

Energy Technology Data Exchange (ETDEWEB)

Multiple regression analyses relating community air quality, socioeconomic variables, and mortality rates for all cancers, respiratory system cancer, respiratory disease, and external causes, for U.S. cities during 1969-71 are presented. Socioeconomic variables included an index of cigarette smoking, which was highly significant. Most air pollution variables were not significant, with the exception of the trace metal manganese, which was associated with cancers and respiratory disease. (33 references, 8 tables)

Lipfert, F.W.

1980-10-01

136

An Algorithm to Estimate Continuous-time Traffic Speed Using Multiple Regression Model

Directory of Open Access Journals (Sweden)

Full Text Available In this study we present a novel algorithm to estimate continuous-time traffic speed data using multiple regression based on the correlated speed and then compare its results to other baseline missing speed prediction methods with real freeway traffic speed data. Since this approach has greater generalization ability for given real speed data, it is believed that this model will also perform well for all time-series missing data estimation fields.

Xin Jin; Suk-Kyo Hong; Qiang Ma

2006-01-01

137

Least Squares Adjustment: Linear and Nonlinear Weighted Regression Analysis

DEFF Research Database (Denmark)

This note primarily describes the mathematics of least squares regression analysis as it is often used in geodesy, including land surveying and satellite positioning applications. In these fields regression is often termed adjustment. The note also contains a couple of typical land surveying and satellite positioning application examples. In these application areas we are typically interested in the parameters in the model, typically 2- or 3-D positions, and not in predictive modelling, which is often the main concern in other regression analysis applications. Adjustment is often used to obtain estimates of relevant parameters in an over-determined system of equations which may arise from deliberately carrying out more measurements than actually needed to determine the set of desired parameters. An example may be the determination of a geographical position based on information from a number of Global Navigation Satellite System (GNSS) satellites, also known as space vehicles (SV). It takes at least four SVs to determine the position (and the clock error) of a GNSS receiver. Often more than four SVs are used and we use adjustment to obtain a better estimate of the geographical position (and the clock error) and to obtain estimates of the uncertainty with which the position is determined. Regression analysis is used in many other fields of application in the natural, the technical and the social sciences. Examples may be curve fitting, calibration, establishing relationships between different variables in an experiment or in a survey, etc. Regression analysis is probably one of the most used statistical techniques around. Dr. Anna B. O. Jensen provided insight and data for the Global Positioning System (GPS) example. Matlab code and sections that are considered as either traditional land surveying material or as advanced material are typeset with smaller fonts. Comments in general or on for example unavoidable typos, shortcomings and errors are most welcome.

Nielsen, Allan Aasbjerg

2007-01-01
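
The adjustment of an over-determined system described in the entry above can be sketched for the smallest useful case: a weighted straight-line fit, where five measurements determine two parameters. The sketch solves the 2x2 weighted normal equations in closed form. The note itself uses Matlab; Python is used here, and the data and weights are hypothetical.

```python
def weighted_line_fit(t, y, w):
    """Solve the weighted normal equations (A^T W A) x = A^T W y for y = a + b*t."""
    s = sum(w)
    st = sum(wi * ti for wi, ti in zip(w, t))
    stt = sum(wi * ti * ti for wi, ti in zip(w, t))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sty = sum(wi * ti * yi for wi, ti, yi in zip(w, t, y))
    det = s * stt - st * st
    if det == 0.0:
        raise ValueError("design is singular: need at least two distinct t values")
    intercept = (stt * sy - st * sty) / det
    slope = (s * sty - st * sy) / det
    return intercept, slope

# Hypothetical measurements; the third one is trusted more (larger weight).
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1]
w = [1.0, 1.0, 4.0, 1.0, 1.0]
intercept, slope = weighted_line_fit(t, y, w)
```

With more measurements than parameters, the redundant observations are what make it possible to estimate the uncertainty of the adjusted parameters, which is the motivation the abstract gives for deliberately over-measuring.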

138

Eigenspectra, a robust regression method for multiplexed Raman spectra analysis.

UK PubMed Central (United Kingdom)

With the latest development of Surface Enhanced Raman Scattering (SERS) nanoparticles, Raman spectroscopy now can be extended to bioimaging and biosensing. In this study, we demonstrate the ability of Raman spectroscopy to separate multiple spectral fingerprints using Raman nanotags. A machine learning method is proposed to estimate the mixing ratios of sources from mixture signals. It decomposes the mixture signals into components for both best representation and most relating to mixing ratios. Then regression coefficients are calculated for the prediction. The robustness of the method was compared with least squares and weighted least squares methods.

Li S; Nyagilo JO; Dave DP; Zhang B; Gao J

2013-01-01

139

Regression analysis of a chemical reaction fouling model

International Nuclear Information System (INIS)

A previously reported mathematical model for the initial chemical reaction fouling of a heated tube is critically examined in the light of the experimental data for which it was developed. A regression analysis of the model with respect to that data shows that the reference point upon which the two adjustable parameters of the model were originally based was well chosen, albeit fortuitously. (author). 3 refs., 2 tabs., 2 figs.

1996-01-01

140

Early cost estimating for road construction projects using multiple regression techniques

Directory of Open Access Journals (Sweden)

Full Text Available The objective of this study is to develop early cost estimating models for road construction projects using multiple regression techniques, based on 131 sets of data collected in the West Bank in Palestine. As cost estimates are required at the early stages of a project, consideration was given to the fact that the input data for the required regression model should be easily extracted from sketches or the scope definition of the project. Eleven regression models are developed to estimate the total cost of a road construction project in US dollars; 5 of them include bid quantities as input variables and 6 include road length and road width. The coefficient of determination r² for the developed models ranges from 0.92 to 0.98, which indicates that the values predicted by the models fit the real-life data well. The mean absolute percentage error (MAPE) of the developed regression models ranges from 13% to 31%; these results compare favorably with past research, which has shown that estimate accuracy in the early stages of a project is between ±25% and ±50%.

Ibrahim Mahamid

2011-01-01

141

Directory of Open Access Journals (Sweden)

Full Text Available Abstract A random regression model for daily feed intake and a conventional multiple trait animal model for the four traits average daily gain on test (ADG), feed conversion ratio (FCR), carcass lean content and meat quality index were combined to analyse data from 1 449 castrated male Large White pigs performance tested in two French central testing stations in 1997. Group housed pigs fed ad libitum with electronic feed dispensers were tested from 35 to 100 kg live body weight. A quadratic polynomial in days on test was used as a regression function for weekly means of daily feed intake and to describe its residual variance. The same fixed (batch) and random (additive genetic, pen and individual permanent environmental) effects were used for regression coefficients of feed intake and single measured traits. Variance components were estimated by means of a Bayesian analysis using Gibbs sampling. Four Gibbs chains were run for 550 000 rounds each, of which the first 50 000 rounds were discarded as the burn-in period. Estimates of posterior means of covariance matrices were calculated from the remaining two million samples. Low heritabilities of linear and quadratic regression coefficients and their unfavourable genetic correlations with other performance traits reveal that altering the shape of the feed intake curve by direct or indirect selection is difficult.

Schnyder Urs; Hofer Andreas; Labroue Florence; Künzi Niklaus

2002-01-01

142

The use of weighted multiple linear regression to estimate QTL-by-QTL epistatic effects

Directory of Open Access Journals (Sweden)

Full Text Available Knowledge of the nature and magnitude of gene effects, as well as their contribution to the control of metric traits, is important in formulating efficient breeding programs for the improvement of plant genetics. Information concerning a genetic parameter such as the additive-by-additive epistatic effect can be useful in traditional breeding. This report describes the results obtained by applying weighted multiple linear regression to estimate the parameter connected with an additive-by-additive epistatic interaction. Three weight variants were used: (1) standard weights based on estimated variances, (2) different weights for minimal, maximal and other lines, and (3) different weights for extreme and other lines. The approach described here combines two methods of estimation, one based on phenotypic observations and the other using molecular marker data. The comparison was done using Monte Carlo simulations. The results show that the application of weighted regression to the marker data yielded estimates similar to those obtained by phenotypic methods.
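The weighting idea can be sketched for the simplest case of a single predictor. This is a simplification: the study uses weighted multiple regression with its own weight variants, and the data and weights below are hypothetical. Each observation contributes to the squared-error criterion in proportion to its weight, for example the reciprocal of an estimated variance.

```python
def weighted_line_fit(x, y, w):
    """Return (intercept, slope) minimizing sum(w_i * (y_i - b0 - b1 * x_i) ** 2)."""
    sw = sum(w)
    # Weighted means of the predictor and the response.
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted cross-product and weighted sum of squares about those means.
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical data: the last point is far off the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 20.0]
unweighted = weighted_line_fit(x, y, [1.0, 1.0, 1.0, 1.0])
downweighted = weighted_line_fit(x, y, [1.0, 1.0, 1.0, 0.0])
```

Giving the extreme observation zero weight recovers the line through the other three points, which illustrates how the choice of weights for extreme lines can change the estimate.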

Jan Bocianowski

2012-01-01

143

UK PubMed Central (United Kingdom)

Data analysis is an essential tenet of analytical chemistry, extending the possible information obtained from the measurement of chemical phenomena. Chemometric methods have grown considerably in recent years, but their wide use is hindered because some still consider them too complicated. The purpose of this review is to describe a multivariate chemometric method, principal component regression, in a simple manner from the point of view of an analytical chemist, to demonstrate the need for proper quality-control (QC) measures in multivariate analysis and to advocate the use of residuals as a proper QC method.

Keithley RB; Heien ML; Wightman RM

2009-10-01

144

Multiple Regression Methods Show Great Potential for Rare Variant Association Tests

The investigation of associations between rare genetic variants and diseases or phenotypes has two goals. Firstly, the identification of which genes or genomic regions are associated, and secondly, discrimination of associated variants from background noise within each region. Over the last few years, many new methods have been developed which associate genomic regions with phenotypes. However, classical methods for high-dimensional data have received little attention. Here we investigate whether several classical statistical methods for high-dimensional data: ridge regression (RR), principal components regression (PCR), partial least squares regression (PLS), a sparse version of PLS (SPLS), and the LASSO are able to detect associations with rare genetic variants. These approaches have been extensively used in statistics to identify the true associations in data sets containing many predictor variables. Using genetic variants identified in three genes that were Sanger sequenced in 1998 individuals, we simulated continuous phenotypes under several different models, and we show that these feature selection and feature extraction methods can substantially outperform several popular methods for rare variant analysis. Furthermore, these approaches can identify which variants are contributing most to the model fit, and therefore both goals of rare variant analysis can be achieved simultaneously with the use of regression regularization methods. These methods are briefly illustrated with an analysis of adiponectin levels and variants in the ADIPOQ gene.

Xu, ChangJiang; Ladouceur, Martin; Dastani, Zari; Richards, J. Brent

2012-01-01

145

Regression analysis of radiological parameters in nuclear power plants

International Nuclear Information System (INIS)

Indian Pressurized Heavy Water Reactors (PHWRs) have now attained maturity in their operations. Indian PHWR operation started in the year 1972. At present there are 12 operating PHWRs collectively producing nearly 2400 MWe. Sufficient radiological data are available for analysis to draw inferences which may be utilised for better understanding of the radiological parameters influencing the collective internal dose. Tritium is the main contributor to the occupational internal dose originating in PHWRs. An attempt has been made to establish the relationship between radiological parameters, which may be useful to draw inferences about the internal dose. Regression analysis has been done to find the relationship, if it exists, among the following variables: A. Specific tritium activity of heavy water (Moderator and PHT) and tritium concentration in air at various work locations. B. Internal collective occupational dose and tritium release to the environment through the air route. C. Specific tritium activity of heavy water (Moderator and PHT) and collective internal occupational dose. For this purpose multivariate regression analysis has been carried out. D. Tritium concentration in air at various work locations and tritium release to the environment through the air route. For this purpose multivariate regression analysis has been carried out. This analysis reveals that the collective internal dose correlates very well with the tritium activity released to the environment through the air route, whereas no correlation has been found between specific tritium activity in the heavy water systems and collective internal occupational dose. A good correlation has been found in case D, and the F test reveals that it is not due to chance. (author)

2003-01-01

146

UK PubMed Central (United Kingdom)

Relative survival is the standard measure of excess mortality due to cancer in population-based cancer survival studies. In relative survival analysis, the observed hazard for cancer patients is the sum of the expected hazard for the general cancer-free population and the excess hazard associated with a cancer diagnosis. Previous models for relative survival analysis have assumed that the excess hazard rate is related to covariates by additive or multiplicative regression models. In this paper, a transformation covariate regression model is developed for estimation of the excess hazard rate, which includes both the additive and the multiplicative regression models as special cases. The baseline excess hazard rate and time-dependent hazard ratios can be approximated by means of regression splines, and the parameter estimates can be obtained using a standard statistical package. As is demonstrated through simulation, the proposed transformation hazards model provides a reasonably good fit to typical relative survival data. For illustration purposes, the sex difference in relative survival for lung and bronchus cancer patients is examined using data from population-based cancer registries (1973-2003).
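The hazard decomposition described in this abstract can be written out as follows. The symbols are assumptions introduced here for illustration: $\lambda_0(t)$ is a baseline excess hazard, $z$ a covariate vector and $\beta$ its coefficients. The two special cases are shown in the forms in which additive and multiplicative excess-hazard models are commonly written; the paper's own transformation family is not reproduced.

```latex
% Observed hazard = expected (general population) hazard + excess hazard
\lambda_{\mathrm{obs}}(t \mid z) = \lambda_{\mathrm{exp}}(t) + \lambda_{\mathrm{exc}}(t \mid z)

% Additive excess-hazard model (assumed common form)
\lambda_{\mathrm{exc}}(t \mid z) = \lambda_0(t) + \beta^{\top} z

% Multiplicative excess-hazard model (assumed common form)
\lambda_{\mathrm{exc}}(t \mid z) = \lambda_0(t)\, \exp\!\left(\beta^{\top} z\right)
```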

Yu B

2013-04-01

147

International Nuclear Information System (INIS)

Highlights: We obtained models for estimation of the cetane number of biodiesel. Twenty-four neural networks using two topologies were evaluated. The best neural network for predicting the cetane number was selected. The best accuracy was obtained for the selected neural network. Abstract: Models for estimating the cetane number of biodiesel from its fatty acid methyl ester composition using multiple linear regression and artificial neural networks were obtained in this work. To obtain the models, experimental data from literature reports covering 48 and 15 biodiesels were used in the modeling-training step and the validation step, respectively. Twenty-four neural networks using two topologies and different algorithms for the second training step were evaluated. The model obtained using multiple regression was compared with two other models from the literature and was able to predict the cetane number with 89% accuracy, with one outlier observed. A model to predict the cetane number using an artificial neural network was obtained with accuracy better than 92%, except for one outlier. The best neural network to predict the cetane number was a backpropagation network (11:5:1) using the Levenberg–Marquardt algorithm for the second step of network training, showing R = 0.9544 for the validation data.

2013-01-01

148

Metabolite profile analysis: from raw data to regression and classification.

Successful metabolic profile analysis will aid in the fundamental understanding of physiology. Here, we present a possible analysis workflow. Initially, the procedure to transform raw data into a data matrix containing relative metabolite levels for each sample is described. Because experimental issues in the technical equipment mean that the levels of some metabolites cannot be universally determined, and because different experiments may need to be compared, missing value estimation and normalization are presented as helpful preprocessing steps. Regression methods are presented in this review as tools to relate metabolite levels to other physiological properties such as biomass and gene expression. As the number of measured metabolites often exceeds the number of samples, dimensionality reduction methods are required. Two of these methods are discussed in detail in this review. Throughout this article, practical examples illustrating the application of the aforementioned methods are given. We focus on uncovering the relationship between metabolism and growth-related properties. PMID:18251857

Steinfath, Matthias; Groth, Detlef; Lisec, Jan; Selbig, Joachim

2008-02-01

149

Spontaneous regression of multiple pulmonary metastatic nodules of hepatocarcinoma: a case report

International Nuclear Information System (INIS)

Although rare, spontaneous regression of primary or metastatic malignant tumors in the absence of therapy, or with inadequate therapy, has been well documented. Since the earliest days of this century various malignant tumors have been reported to disappear spontaneously or to be arrested in their growth, but cases of hepatocarcinoma have been very rare. From the literature, we were able to find 5 previously reported cases of hepatocarcinoma which showed spontaneous regression at the primary site. Recently we have seen a case of multiple pulmonary metastatic nodules of hepatocarcinoma which completely regressed spontaneously, and this forms the basis of the present case report. The patient was a 55-year-old male admitted to St. Mary's Hospital, Catholic Medical College because of a hard palpable mass in the epigastrium on April 26, 1978. The admission PA chest roentgenogram revealed multiple small nodular densities scattered throughout both lung fields, especially in the lower zones and toward the periphery. A hepatoscintigram revealed a large cold area involving the left lobe and intermediate zone of the liver. Alpha-fetoprotein and hepatitis B serum antigen tests were positive, whereas many other standard liver function tests turned out to be negative. A needle biopsy of the tumor revealed well-differentiated hepatocellular carcinoma. The patient was put on chemotherapy, which consisted of 5-FU 500 mg intravenously for 6 days from April 28 to May 3, 1978. The patient was discharged after this single course of 5-FU treatment and was on a herbal medicine, the nature and quantity of which are obscure. No other specific treatment was given. The second admission took place on Dec. 3, 1980 because of irregularity in bowel habits and dyspepsia. A follow-up PA chest roentgenogram obtained on the second admission revealed complete disappearance of the previously noted multiple pulmonary nodular lesions (Fig. 3). A follow-up liver scan revealed persistence of the cold area in the left lobe with a slight decrease in size. The patient was discharged again without any specific prescription after negative results of various clinical studies, including an upper GI series and colon study, were confirmed. At the time of finishing this paper the patient is doing well without apparent medical problems.

1981-01-01

150

Spontaneous regression of multiple pulmonary metastatic nodules of hepatocarcinoma: a case report

Energy Technology Data Exchange (ETDEWEB)

Although rare, spontaneous regression of primary or metastatic malignant tumors in the absence of therapy, or with inadequate therapy, has been well documented. Since the earliest days of this century various malignant tumors have been reported to disappear spontaneously or to be arrested in their growth, but cases of hepatocarcinoma have been very rare. From the literature, we were able to find 5 previously reported cases of hepatocarcinoma which showed spontaneous regression at the primary site. Recently we have seen a case of multiple pulmonary metastatic nodules of hepatocarcinoma which completely regressed spontaneously, and this forms the basis of the present case report. The patient was a 55-year-old male admitted to St. Mary's Hospital, Catholic Medical College because of a hard palpable mass in the epigastrium on April 26, 1978. The admission PA chest roentgenogram revealed multiple small nodular densities scattered throughout both lung fields, especially in the lower zones and toward the periphery. A hepatoscintigram revealed a large cold area involving the left lobe and intermediate zone of the liver. Alpha-fetoprotein and hepatitis B serum antigen tests were positive, whereas many other standard liver function tests turned out to be negative. A needle biopsy of the tumor revealed well-differentiated hepatocellular carcinoma. The patient was put on chemotherapy, which consisted of 5-FU 500 mg intravenously for 6 days from April 28 to May 3, 1978. The patient was discharged after this single course of 5-FU treatment and was on a herbal medicine, the nature and quantity of which are obscure. No other specific treatment was given. The second admission took place on Dec. 3, 1980 because of irregularity in bowel habits and dyspepsia. A follow-up PA chest roentgenogram obtained on the second admission revealed complete disappearance of the previously noted multiple pulmonary nodular lesions (Fig. 3). A follow-up liver scan revealed persistence of the cold area in the left lobe with a slight decrease in size. The patient was discharged again without any specific prescription after negative results of various clinical studies, including an upper GI series and colon study, were confirmed. At the time of finishing this paper the patient is doing well without apparent medical problems.

Bahk, Yong Whee; Park, Seog Hee; Kim, Sun Moo [St. Mary's Hospital, Catholic Medical College, Seoul (Korea, Republic of)]

1981-09-15

151

Directory of Open Access Journals (Sweden)

Full Text Available Software estimation techniques present an inclusive set of directives for software project developers, project managers and management in order to produce more accurate estimates or predictions for future developments. The estimates also facilitate the allocation of resources for software development and smooth the process of re-planning, prioritizing, classification and reuse of projects. Various estimation models are widely used in industry as well as for research purposes. Several comparative studies have been carried out on them, but choosing the best technique is difficult. Estimation by Analogy (EbA) is the method of making estimates based on the outcomes of the k most analogous projects. The projects closest in distance to the reference project in the project repository are potentially similar to it. This method has been widely accepted and is popular because it imitates the inherent human skill of judging by analogy with similar projects. In this paper, Grey Relational Analysis (GRA) is used as the method for feature selection and also for locating the closest analogous projects to the reference project in the set of projects. The closest k projects are then used to build regression models. Regression techniques such as Multiple Linear Regression, Stepwise Regression and Robust Regression are used to estimate the effort from the closest projects.

Geeta Nagpal; Moin Uddin; Arvinder Kaur

2012-01-01

152

Logistic Regression and Discriminant Analysis by Ordinary Least Squares.

If the observations for fitting a polytomous logistic regression model satisfy certain normality assumptions, the maximum likelihood estimates of the regression coefficients are the discriminant function estimates. This paper shows that these estimates, t...

G. W. Haggstrom

1982-01-01

153

UK PubMed Central (United Kingdom)

Relations among academic stress, depression, and suicidal ideation were examined in 1,108 Asian adolescents 12-18 years old from a secondary school in Singapore. Using Baron and Kenny's [J Pers Soc Psychol 51:1173-1192, 1986] framework, this study tested the prediction that adolescent depression mediated the relationship between academic stress and suicidal ideation in a four-step process. The previously significant relationship between academic stress and suicidal ideation was significantly reduced in magnitude when depression was included in the model providing evidence in this sample that adolescent depression was a partial mediator. The applied and practical implications for intervention and prevention work in schools are discussed. The present investigation also served as a demonstration to illustrate how multiple regression analyses can be used as one possible method for testing mediation effects within child psychology and psychiatry.
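The four-step regression logic of the Baron and Kenny framework can be sketched as below. This is an illustration under stated assumptions: the variable names and scores are hypothetical, the data are not the study's, and no significance tests are performed. Step 1 regresses the outcome on the predictor (total effect c), step 2 regresses the mediator on the predictor (a), and steps 3 and 4 regress the outcome on both (b for the mediator, c' for the direct effect).

```python
def _cov(u, v):
    """Sum of cross-products about the means (unnormalized covariance)."""
    ubar, vbar = sum(u) / len(u), sum(v) / len(v)
    return sum((p - ubar) * (q - vbar) for p, q in zip(u, v))


def slope(x, y):
    """Slope of the simple regression of y on x."""
    return _cov(x, y) / _cov(x, x)


def two_predictor_slopes(x, m, y):
    """Slopes (for x, for m) in the least squares regression of y on x and m."""
    sxx, smm, sxm = _cov(x, x), _cov(m, m), _cov(x, m)
    sxy, smy = _cov(x, y), _cov(m, y)
    det = sxx * smm - sxm ** 2  # zero only if x and m are perfectly collinear
    return (smm * sxy - sxm * smy) / det, (sxx * smy - sxm * sxy) / det


def mediation_paths(x, m, y):
    c = slope(x, y)                             # step 1: total effect of X on Y
    a = slope(x, m)                             # step 2: effect of X on mediator M
    c_prime, b = two_predictor_slopes(x, m, y)  # steps 3 and 4: M and X together
    return {"c": c, "a": a, "b": b, "c_prime": c_prime}


# Hypothetical scores constructed so that most of the effect runs through the mediator.
stress = [float(i) for i in range(10)]
wiggle = [1.0, -1.0, 0.5, -0.5, 1.0, -1.0, 0.5, -0.5, 1.0, -1.0]
depression = [2.0 * s + e for s, e in zip(stress, wiggle)]
ideation = [0.5 * s + 3.0 * d for s, d in zip(stress, depression)]
paths = mediation_paths(stress, depression, ideation)
```

In this constructed example the direct effect c' (0.5) is much smaller than the total effect c, and c equals c' plus the product a × b. A direct effect that shrinks but stays non-zero is the pattern read as partial mediation.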

Ang RP; Huan VS

2006-01-01

154

Estimation of Saturation Percentage of Soil Using Multiple Regression, ANN, and ANFIS Techniques

Directory of Open Access Journals (Sweden)

Full Text Available The saturation percentage (SP) of soils is an important index in hydrological studies. In this paper, artificial neural networks (ANNs), multiple regression (MR), and an adaptive neural-based fuzzy inference system (ANFIS) were used to estimate the saturation percentage of soils collected from the Boukan region in the northwestern part of Iran. Percent clay, silt, sand and organic carbon (OC) were used to develop the applied methods. In addition, the contribution of each input variable to the estimation of the SP index was assessed. Two performance functions, namely the root mean square error (RMSE) and the determination coefficient (R2), were used to evaluate the adequacy of the models. The ANFIS method was found to be superior to the other methods. It is therefore proposed that the ANFIS model can be used for reasonable estimation of SP values of soils.
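The two performance functions named in this abstract can be sketched as follows. The R2 shown is the common form, one minus the ratio of residual to total sum of squares; the abstract does not state which definition was used, so that choice is an assumption, and the values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Determination coefficient computed as 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical saturation percentages (illustration only).
observed_sp = [40.0, 50.0, 60.0, 70.0]
predicted_sp = [42.0, 48.0, 61.0, 69.0]
model_rmse = rmse(observed_sp, predicted_sp)
model_r2 = r_squared(observed_sp, predicted_sp)
```

For these values the squared errors sum to 10 and the total sum of squares is 500, giving an RMSE of about 1.58 and an R2 of 0.98.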

Khaled Ahmad Aali; Masoud Parsinejad; Bizhan Rahmani

2009-01-01

155

International Nuclear Information System (INIS)

We report two cases of spontaneous regression of multiple pulmonary metastases occurring after radiofrequency ablation (RFA) of a single lung metastasis. To the best of our knowledge, these are the first such cases reported. These two patients presented with lung metastases progressive despite treatment with interleukin-2, interferon, or sorafenib but were safely ablated with percutaneous RFA under computed tomography guidance. Percutaneous RFA allowed control of the targeted tumors for >1 year. Distant lung metastases presented an objective response despite the fact that they received no targeted local treatment. Local ablative techniques, such as RFA, induce the release of tumor-degradation product, which is probably responsible for an immunologic reaction that is able to produce a response in distant tumors.

2011-01-01

156

Evaluation of dominance-based ordinal multiple regression for variables with few categories.

UK PubMed Central (United Kingdom)

Dominance-based ordinal multiple regression (DOR) is designed to answer ordinal questions about relationships among ordinal variables. Only one parameter per predictor is estimated, and the number of parameters is constant for any number of outcome levels. The majority of existing simulation evaluations of DOR use predictors that are continuous or ordinal with many categories, so the performance of the method is not well understood for ordinal variables with few categories. This research evaluates DOR in simulations using three-category ordinal variables for the outcome and predictors, with a comparison to the cumulative logits proportional odds model (POC). Although ordinary least squares (OLS) regression is inapplicable for theoretical reasons, it was also included in the simulations because of its popularity in the social sciences. Most simulation outcomes indicated that DOR performs well for variables with few categories, and is preferable to the POC for smaller samples and when the proportional odds assumption is violated. Nevertheless, confidence interval coverage for DOR was not flawless and possibilities for improvement are suggested.

Woods CM

2013-02-01

157

UK PubMed Central (United Kingdom)

In the present work, an attempt is made to formulate multiple regression equations using the all possible regressions method for groundwater quality assessment of the Ajmer-Pushkar railway line region in the pre- and post-monsoon seasons. Correlation studies revealed the existence of linear relationships (r > 0.7) for electrical conductivity (EC), total hardness (TH) and total dissolved solids (TDS) with other water quality parameters. The highest correlation was found between EC and TDS (r = 0.973). EC showed a highly significant positive correlation with Na, K, Cl, TDS and total solids (TS). TH showed the highest correlation with Ca and Mg. TDS showed a significant correlation with Na, K, SO4, PO4 and Cl. The study indicated that most of the contamination present was water soluble or ionic in nature. Mg was present as MgCl2; K mainly as KCl and K2SO4; and Na was present as the salts of Cl, SO4 and PO4. On the other hand, F and NO3 showed no significant correlations. The r2 values and F values (at the 95% confidence limit, alpha = 0.05) for the modelled equations indicated a high degree of linearity between the independent and dependent variables. The percentage error between calculated and experimental values was also contained within ±15%.

Mathur P; Sharma S; Soni B

2010-01-01

158

UK PubMed Central (United Kingdom)

The nickel removal efficiency of powdered activated carbons from coconut oilcake, neem oilcake and commercial carbon was investigated using an artificial neural network. The effective parameters for the removal of nickel (%R) by the adsorption process were investigated; they included the pH, contact time (T), distinctiveness of activated carbon (Cn), amount of activated carbon (Cw) and initial concentration of nickel (Co). The Levenberg-Marquardt (LM) back-propagation algorithm was used to train the network. The network topology was optimized by varying the number of hidden layers and the number of neurons in the hidden layer. The model was developed by dividing the experimental data into training, validation and test subsets containing 60%, 20% and 20% of the total experimental data, respectively. A multiple regression equation was developed for the nickel adsorption system and its output was compared with both the simulated and the experimental outputs. The standard deviation (SD) with respect to the experimental output was considerably higher for the regression model than for the ANN model. The experimental data were best fitted by the artificial neural network.
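A 60%/20%/20% division of the experimental records into training, validation and test subsets can be sketched as below. The abstract does not say how records were assigned to subsets, so the random shuffle with a fixed seed is an assumption, and the function name is hypothetical.

```python
import random


def split_60_20_20(records, seed=0):
    """Shuffle a copy of the records and cut it into 60/20/20 subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n = len(shuffled)
    n_train = (6 * n) // 10  # integer arithmetic avoids float rounding surprises
    n_val = (2 * n) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder, about 20%
    return train, val, test


# Hypothetical record identifiers standing in for adsorption experiments.
records = list(range(50))
train_set, val_set, test_set = split_60_20_20(records)
```

The validation subset is typically used to decide when to stop training and to compare topologies, while the test subset is held back for the final comparison.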

Hema M; Srinivasan K

2011-07-01

159

UK PubMed Central (United Kingdom)

The aim of this work is to obtain an expression using multiple linear regression (MLR) to evaluate environmental soil quality. We used four forest soils from Alicante province (SE Spain), comprising three Mollisols and one Entisol, developed under natural vegetation with minimum human disturbance and considered as reference soils of high quality. We carried out MLR integrating different soil physical, chemical and biochemical properties, and we searched for regressions with Kjeldahl nitrogen (Nk), soil organic carbon (SOC) or microbial biomass carbon (MBC) as the predicted parameter. We observed that Mollisols and Entisols presented different relationships among their properties, so we searched for different equations for the two groups of soils. The selected equation for Mollisols was Nk = 0.448 (P) + 0.017 (water holding capacity) + 0.410 (phosphatase) - 0.567 (urease) + 0.001 (MBC) + 0.410 (beta-glucosidase) - 0.980, and for the Entisol SOC = 4.247 (P) + 8.183 (beta-glucosidase) - 7.949 (urease) + 17.333. The equations were applied to samples from two forest soils in an advanced degree of degradation, one for Mollisols and the other for the Entisol. We observed a clear deviation of the predicted parameter values from the measured ones. The results show that MLR is a good tool for soil quality evaluation, because it seems to be capable of reflecting the balance among soil properties, as well as deviations from it.
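The two selected equations can be evaluated directly, as in the sketch below, which uses the coefficients quoted in the abstract. The function and argument names are hypothetical, the units of each property are not given in the abstract and must match those used when the equations were fitted, and the relative-deviation helper is one assumed way of expressing how far a measured value lies from the prediction.

```python
def predicted_kjeldahl_n_mollisol(p, whc, phosphatase, urease, mbc, beta_glucosidase):
    """Kjeldahl nitrogen predicted by the Mollisol equation quoted in the abstract."""
    return (0.448 * p + 0.017 * whc + 0.410 * phosphatase
            - 0.567 * urease + 0.001 * mbc + 0.410 * beta_glucosidase - 0.980)


def predicted_soc_entisol(p, beta_glucosidase, urease):
    """Soil organic carbon predicted by the Entisol equation quoted in the abstract."""
    return 4.247 * p + 8.183 * beta_glucosidase - 7.949 * urease + 17.333


def relative_deviation(measured, predicted):
    """Signed deviation of a measured value from its prediction, as a fraction."""
    return (measured - predicted) / predicted
```

A reference-quality soil would be expected to give a relative deviation near zero, while a degraded soil would show a clear departure, which is the pattern the abstract reports.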

Zornoza R; Mataix-Solera J; Guerrero C; Arcenegui V; García-Orenes F; Mataix-Beneyto J; Morugán A

2007-05-01

160

Logistic regression analysis on the risk factors of radiation pneumonitis

International Nuclear Information System (INIS)

Objective: To identify the risk factors of radiation pneumonitis (RP). Methods: A retrospective study was conducted on 101 patients with radiation pneumonitis using SPSS 8.0 software. Factors evaluated included: gender, age, pathology, clinical stage, irradiation dose, irradiation field size, history of smoking, cardiovascular disease, bronchitis, surgery, chemotherapy, lung infection, atelectasis, obstructive infection and pleural effusion. Univariate analysis was performed using the Chi-Square test and multivariate analysis was performed using a logistic regression model. Results: Univariate analysis revealed a significant relationship between radiation pneumonitis and 10 factors, including pulmonary infection, atelectasis, obstructive infection, cardiovascular disease, bronchitis, chemotherapy, irradiation dose, number of days of radiation and irradiation field size. Multivariate analysis showed that 9 factors were independent factors: pulmonary infection, obstructive infection, atelectasis, pleural effusion, bronchitis, cardiovascular disease, chemotherapy, irradiation dose, and irradiation field size. Conclusion: Comprehensive consideration of accompanying disease, chemotherapy, dose, field size, etc. during the planning of radiotherapy can minimize the possibility of developing radiation pneumonitis.
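The univariate step can be illustrated with the Pearson chi-square statistic for a 2x2 table of one binary factor against the outcome. This sketch uses the standard formula without continuity correction, and the counts are hypothetical rather than taken from the study.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shortcut form: n * (ad - bc)^2 divided by the product of the four margins.
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts: rows = factor present / absent, columns = outcome yes / no.
statistic = chi_square_2x2(10, 20, 30, 40)
# With 1 degree of freedom, the critical value at alpha = 0.05 is about 3.841.
is_significant = statistic > 3.841
```

For these counts the statistic is about 0.79, well below the critical value, so this hypothetical factor would not be carried forward as significant.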

2003-01-01

161

Low-Cost Housing in Sabah, Malaysia: A Regression Analysis

Directory of Open Access Journals (Sweden)

Full Text Available Low-cost housing plays a vital role in the development process, especially in providing accommodation to the less fortunate and the lower income group. This effort is also a step towards overcoming the squatter problem, which could cripple the competitive drive of the local community, especially in the state of Sabah, Malaysia. This article looks into the factors influencing low-cost housing in Sabah, namely the government's budget (allocation) for low-cost housing projects and Sabah's total population. At the same time, this study attempts to show the implications of the development and economic crises which occurred during the period 1971 to 2000 for the provision of low-cost houses in Sabah. Empirical analyses were conducted using the multiple linear regression method, the stepwise method and the dummy variable approach to demonstrate the link. The empirical results show that the government's budget for low-cost housing is the main contributor to the provision of low-cost housing in Sabah. The results also suggest that economic growth, measured by Gross Domestic Product (GDP), did not have a significant effect on low-cost housing in Sabah. However, almost all major crises that have beset Malaysia's economy had a significant and consistent effect on low-cost housing in Sabah, especially the financial crisis which occurred in mid-1997.

Dullah Mulok; Mori Kogid

2009-01-01

162

An improved regression algorithm for automated well-test analysis

Energy Technology Data Exchange (ETDEWEB)

Automated well-test analysis is a familiar technique, but the procedure still depends on the speed and robustness of the regression algorithms used. This paper examines a modification of the Cholesky factorization (CF) method for the solution of nonlinear-parameter-estimation problems and shows that this modification is a very beneficial adaptation of the Gauss-Newton method. The new algorithm, the modified Gauss-Cholesky (MGC) method, is more robust under unfavorable conditions than two of the most reliable existing algorithms (the Gauss-Marquardt (GM) and the Newton-Barua (NB) methods). This robustness seems to depend on the algorithm's ability to control the rapid change of ill-defined parameters and not on computational singularity or higher-order representation.
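For orientation, the sketch below shows a plain Gauss-Newton iteration whose normal equations are solved by Cholesky factorization, with simple step halving. It is not the paper's modified Gauss-Cholesky (MGC) method, whose modification is not described in the abstract. The exponential-decay model, the data and all function names are assumptions chosen only to keep the example small.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def cholesky_solve(a, rhs):
    """Solve a x = rhs using the Cholesky factor of a."""
    low = cholesky(a)
    n = len(rhs)
    z = [0.0] * n
    for i in range(n):  # forward substitution: L z = rhs
        z[i] = (rhs[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: L^T x = z
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def sse(params, ts, ys):
    """Sum of squared errors of the model a * exp(-b * t)."""
    a, b = params
    return sum((a * math.exp(-b * t) - y) ** 2 for t, y in zip(ts, ys))


def gauss_newton(ts, ys, start, iterations=50):
    params = list(start)
    for _ in range(iterations):
        a, b = params
        jtj = [[0.0, 0.0], [0.0, 0.0]]
        jtr = [0.0, 0.0]
        for t, y in zip(ts, ys):
            e = math.exp(-b * t)
            row = [e, -a * t * e]  # partial derivatives with respect to a and b
            r = a * e - y          # residual at the current parameters
            for i in range(2):
                jtr[i] += row[i] * r
                for j in range(2):
                    jtj[i][j] += row[i] * row[j]
        step = cholesky_solve(jtj, [-g for g in jtr])  # (J^T J) step = -J^T r
        # Halve the step until the sum of squared errors does not increase.
        scale, current = 1.0, sse(params, ts, ys)
        while scale > 1e-8:
            trial = [p + scale * s for p, s in zip(params, step)]
            if sse(trial, ts, ys) <= current:
                params = trial
                break
            scale /= 2.0
    return params


# Hypothetical noise-free data generated from a = 2.0, b = 0.5.
times = [float(t) for t in range(8)]
values = [2.0 * math.exp(-0.5 * t) for t in times]
fitted = gauss_newton(times, values, [1.5, 0.4])
```

The Cholesky factorization requires J^T J to be positive definite, which fails when a parameter is poorly determined by the data. That is the kind of unfavorable condition the abstract says the MGC method handles more robustly.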

Nanba, T.; Horne, R.N. (Stanford Univ. (US))

1992-03-01

163

Node-Mapping EIT Method Based on Regression Analysis

Directory of Open Access Journals (Sweden)

Full Text Available Medical imaging shows the morphology and function of the body's internal organs intuitively. Electrical Impedance Tomography (EIT) is an emerging medical imaging technology. It has the advantages of simple structure, low cost, no radiological hazard and non-invasiveness. EIT can not only use the impedance differences between different tissues to reconstruct anatomical images, but can also achieve functional imaging of tissues and organs from impedance changes in different physiological and pathological states, and it is suitable for long-term monitoring. The solution is approximate because of the ill-posedness of the inverse problem. Because image accuracy and computation speed are in conflict, EIT is still unable to meet the requirements of practical application. By using a regression analysis algorithm, the Node-Mapping Method calculates only the node potentials. The speed of operation and the reconstructed image quality are greatly improved.

Jianjun Zhang; Guizhi Xu; Weili Yan; Shuai Zhang

2012-01-01

164

A Quantile Regression Analysis of Micro-lending's Poverty Impact

Directory of Open Access Journals (Sweden)

Full Text Available This paper aims to evaluate the impact of a microlending program on ameliorating measured poverty within its client population, with the aim of improving that impact. We analyze over 18,000 women micro-finance clients of the Negros Women for Tomorrow Foundation (NWTF), a database using the Progress out of Poverty (PPI) Scorecard as a measure of poverty. Analysis using both OLS and quantile multivariate regression models shows how observable borrower attributes affect the ability of clients to reduce their measured poverty. Loan size, duration, and the economic activity supported all have strongly identifiable effects. Moreover, estimates suggest which among the poor are receiving the greatest effective help by the program. Results offer specific advice to the NWTF and other micro-lenders: impact is greatest with fewer, larger loans in particular economic sectors (sari-sari, service and trade) but require patience as each additional year increases the client’s average change in poverty score.
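The difference between OLS and quantile regression comes down to the loss being minimized. The sketch below shows the check (pinball) loss for the simplest case of fitting a constant, where the minimizer is a sample quantile instead of the mean. The values and function names are hypothetical, and a full quantile regression with covariates would minimize the same loss over regression coefficients.

```python
def pinball_loss(residual, tau):
    """Check loss for one residual (observed minus fitted) at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def total_loss(values, candidate, tau):
    """Pinball loss summed over all values for a constant fitted value."""
    return sum(pinball_loss(v - candidate, tau) for v in values)


def best_constant(values, tau):
    """Constant minimizing the pinball loss; a minimizer always lies at a data point."""
    return min(sorted(values), key=lambda c: total_loss(values, c, tau))


# Hypothetical changes in poverty score, with one extreme client.
changes = [1.0, 2.0, 3.0, 4.0, 100.0]
mean_change = sum(changes) / len(changes)  # what a squared-error fit targets
median_fit = best_constant(changes, 0.5)   # the tau = 0.5 quantile fit
upper_fit = best_constant(changes, 0.9)    # the tau = 0.9 quantile fit
```

Here the mean is 22 while the median fit is 3, so a single extreme client dominates the squared-error summary but not the median one. Fitting several quantile levels shows how effects differ across the distribution of outcomes.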

Stephen W. Polk; Daniel K.N. Johnson

2012-01-01
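
The microlending abstract above contrasts OLS with quantile regression. As a minimal sketch of what distinguishes the two, the following shows the check ("pinball") loss that quantile regression minimizes, applied to the simplest case of fitting a constant. The poverty-score changes are made-up numbers for illustration only, not data from the NWTF study, and the function names are hypothetical.

```python
def pinball_loss(q, tau, values):
    """Check loss of the constant fit q at quantile level tau."""
    return sum(tau * (v - q) if v >= q else (1.0 - tau) * (q - v)
               for v in values)

def best_constant(tau, values):
    """Constant minimizing the pinball loss; a minimizer always lies at a data point."""
    return min(sorted(set(values)), key=lambda q: pinball_loss(q, tau, values))

# Hypothetical changes in poverty score for seven clients (illustrative only).
changes = [-3.0, -1.0, 0.0, 2.0, 4.0, 7.0, 15.0]

median_fit = best_constant(0.5, changes)   # tau = 0.5 recovers the sample median
upper_fit = best_constant(0.9, changes)    # tau = 0.9 tracks the upper tail
```

With covariates, the constant is replaced by a linear predictor and the same loss is minimized, which is why quantile models can show effects that differ between the least and most improved clients while OLS reports only the mean effect.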

165

A Logistic Regression Analysis of the Ischemic Heart Disease Risk

Directory of Open Access Journals (Sweden)

Full Text Available The main objective of the present study is to investigate factors that contribute significantly to enhancing the risk of ischemic heart disease. The dependent variable of the study is diagnosis: whether the patient has the disease or does not have the disease. Logistic regression analysis is applied to explore the factors affecting the disease. The results of the study show that the factors that contribute significantly to enhancing the risk of ischemic heart disease are the use of banaspati ghee, living in an urban area, a high cholesterol level, and the age group of 51 to 60 years. Other significant factors are Apo Protein A, Apo Protein B, cholesterol level, high-density lipoprotein, low-density lipoprotein, phospholipids, total lipid and uric acid.

Irfana P. Bhatti; Heman D. Lohano; Zafar A. Pirzado; Imran A. Jafri

2006-01-01

166

Energy Technology Data Exchange (ETDEWEB)

A method for the analysis of capillary column polychlorinated biphenyl (PCB) data using regression analysis with outlier checking and elimination, COMSTAR, is presented and evaluated. This algorithm determines the combination of the commercial PCB mixtures which best fits the chromatographic fingerprint of the sample by excluding weathered and contaminated PCB components from its final determination. Subsequently, significance testing on the final determination is performed. The extra sum of squares test is used for outlier testing. The chief advantage of COMSTAR over other PCB analysis methods is its ability to discern more accurately the amount of PCB present in a sample when weathered and contaminated or enriched PCB components exist in the chromatographic data.

Burkhard, L.P.; Weininger, D.

1987-04-15

167

Energy Technology Data Exchange (ETDEWEB)

A method for the analysis of capillary column polychlorinated biphenyl (PCB) data using regression analysis with outlier checking and elimination, COMSTAR, is presented and evaluated. The algorithm determines the best combination of the commercial PCB mixtures which best fits the chromatographic fingerprint of the sample by excluding weathered and contaminated PCB components from its final determination. Subsequently, significance testing on the final determination is performed. The extra sum of squares test is used for outlier testing. The chief advantage of COMSTAR over other PCB analysis methods is its ability to discern more accurately the amount of PCB present in a sample when weathered and contaminated or enriched PCB components exist in the chromatographic data.

Burkhard, L.P.; Wininger, D.

1987-01-01

168

Prognostic factors in childhood asthma: a logistic regression analysis.

Thirty-two factors related to childhood asthma were studied in 200 asthmatic children. Remission of asthma was defined as a period of at least 2 years free of asthma while receiving no treatment. Univariate analysis showed significant associations between persistence of asthma and perennial symptoms with an odds ratio (OR) of 2.5 (95% confidence intervals 1.2 to 5.5); sensitization to house dust mites OR 3.5 (1.2 to 9.6); sensitization to molds, OR 7.9 (2.9 to 21.6); sensitization to pollen, OR 4.8 (1.4 to 16.3); and sensitization to milk protein, OR 5.4 (1.8 to 15.9). There was a positive association of remission of asthma with good treatment compliance, OR 12.1 (1.6 to 91.6). A stepwise logistic regression analysis selected the variables: perennial symptoms (S: 0 = no, 1 = yes), length of follow-up time (T: number of months), treatment compliance (C: 0 = poor, 1 = good), sensitization to fungi (F: 0 to 4), and sensitization to milk protein (M: 0 to 4). The analysis yielded the following formula for calculation of probability of asthma remission P(R): [formula: see text] where S indicates perennial symptoms; T, follow-up time in months; C, compliance; F, allergy to fungi; and M, allergy to milk. PMID:8179234

Mazon, A; Nieto, A; Nieto, F J; Menendez, R; Boquete, M; Brines, J

1994-05-01

169

MULTINOMIAL LOGISTIC REGRESSION: USAGE AND APPLICATION IN RISK ANALYSIS

Directory of Open Access Journals (Sweden)

Full Text Available The objective of the article was to explore the usage of multinomial logistic regression (MLR) in risk analysis. In this regard, performing MLR on risk analysis data corrected for the non-linear nature of binary response and addressed the violation of the equal variance and normality assumptions. Additionally, use of maximum likelihood (-2log) estimation provided a means of working with binary response data. The relationship of independent and dependent variables was also addressed. The data used included a cohort of one hundred risk analysts of a historically black South African university. In this analysis, the findings revealed that the probability of the model chi-square (17.142) was 0.005, less than the level of significance of 0.05 (i.e. p<0.05), suggesting that there was a statistically significant relationship between the independent variable, risk planning (Rp), and the dependent variable, control mechanism (control mecs) (p<0.05). Also, there was a statistically significant relationship between key risks assigned (KSA) and time spent on risk mitigation. For each unit increase in confidence in control mecs, the odds of being in the group of survey respondents who thought the institution spent too little time on Rp decreased by 74.7%. Moreover, the findings revealed that survey respondents who had less confidence in control mecs were less likely to be in the group of survey respondents who thought the institution spent about the right amount of time on risk planning.

Anass BAYAGA

2010-01-01
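
The multinomial logistic regression abstract above reports that a one-unit increase in a predictor decreased the odds by 74.7%. A minimal sketch of the arithmetic linking such a percentage to a logit coefficient follows; the abstract does not report the coefficient itself, so the value computed here is only what the stated percentage implies, and the function names are hypothetical.

```python
import math

def percent_change_in_odds(beta):
    """Percent change in the odds per one-unit increase, given a logit coefficient."""
    return (math.exp(beta) - 1.0) * 100.0

def coefficient_from_percent_change(pct):
    """Inverse mapping: logit coefficient implied by a percent change in odds."""
    return math.log(1.0 + pct / 100.0)

# A 74.7% decrease corresponds to an odds ratio of 1 - 0.747 = 0.253.
beta = coefficient_from_percent_change(-74.7)
odds_ratio = math.exp(beta)
```

The implied coefficient is negative (about -1.37), consistent with the odds falling as confidence in control mechanisms rises.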

170

Scientific Electronic Library Online (English)

Full Text Available The aim of this study was to estimate multiple linear regression equations with the other traits evaluated in maize experiments as explanatory variables and, as response variables, the least significant difference as a percentage of the mean (LSD%) and the mean square of the error (MSe) for grain weight. Data came from 610 experiments of the National Competition of Maize Cultivars trial network, conducted from 1986 to 1996 (522 experiments) and in 1997 (88 experiments). Two regression equations were estimated with the 522 experiments and validated with the remaining 88 by simple regression between the observed values and those estimated by the equations. For LSD%, the equation did not estimate the same value as the original formula; for MSe, the equation could be used for estimation. The Lilliefors test showed that the MSe values followed the normal distribution. A classification table of MSe values was built from the values observed in the analysis of variance of the experiments and those estimated by the regression equation.

Lúcio, Alessandro Dal'Col; Banzatto, David Ariovaldo; Storck, Lindolfo; Martin, Thomas Newton; Lorentz, Leandro Homrich

2001-12-01

171

Mixed-effects Poisson regression analysis of adverse event reports

SUMMARY A new statistical methodology is developed for the analysis of spontaneous adverse event (AE) reports from post-marketing drug surveillance data. The method involves both empirical Bayes (EB) and fully Bayes estimation of rate multipliers for each drug within a class of drugs, for a particular AE, based on a mixed-effects Poisson regression model. Both parametric and semiparametric models for the random-effect distribution are examined. The method is applied to data from Food and Drug Administration (FDA)’s Adverse Event Reporting System (AERS) on the relationship between antidepressants and suicide. We obtain point estimates and 95 per cent confidence (posterior) intervals for the rate multiplier for each drug (e.g. antidepressants), which can be used to determine whether a particular drug has an increased risk of association with a particular AE (e.g. suicide). Confidence (posterior) intervals that do not include 1.0 provide evidence for either significant protective or harmful associations of the drug and the adverse effect. We also examine EB, parametric Bayes, and semiparametric Bayes estimators of the rate multipliers and associated confidence (posterior) intervals. Results of our analysis of the FDA AERS data revealed that newer antidepressants are associated with lower rates of suicide adverse event reports compared with older antidepressants. We recommend improvements to the existing AERS system, which are likely to improve its public health value as an early warning system.

Gibbons, Robert D.; Segawa, Eisuke; Karabatsos, George; Amatya, Anup K.; Bhaumik, Dulal K.; Brown, C. Hendricks; Kapur, Kush; Marcus, Sue M.; Hur, Kwan; Mann, J. John

2008-01-01

172

Energy Technology Data Exchange (ETDEWEB)

In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and approximately 200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs.

Janssen, I.; Stebbings, J.H.

1990-01-01
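
The CART abstract above states that the tree-structured approach does not depend on the scale of the independent variables. A minimal sketch of why: a single regression-tree split depends only on the ordering of the predictor, so a monotone transformation such as a logarithm leaves the chosen partition unchanged. The data below are invented for illustration (they are not the Pennsylvania radon measurements), and `best_split` is a hypothetical helper, not the CART software used by the authors.

```python
import math

def best_split(x, y):
    """Return (sse, left_indices) for the single split on x that minimizes
    the summed squared error of y around each side's mean."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None
    for k in range(1, len(order)):
        if x[order[k - 1]] == x[order[k]]:
            continue  # cannot split between tied predictor values
        left, right = order[:k], order[k:]
        sse = 0.0
        for side in (left, right):
            mean = sum(y[i] for i in side) / len(side)
            sse += sum((y[i] - mean) ** 2 for i in side)
        if best is None or sse < best[0]:
            best = (sse, frozenset(left))
    return best

# Hypothetical skewed predictor spanning several orders of magnitude.
x = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0, 1000.0, 3000.0]
y = [2.1, 1.9, 2.3, 2.0, 7.8, 8.1, 8.4, 7.9]

raw = best_split(x, y)
logged = best_split([math.log10(v) for v in x], y)
same_partition = raw[1] == logged[1]
```

Only the numeric threshold changes under the transformation; the cases sent left and right are the same, which is the sense in which transformations of predictors are unnecessary for tree methods.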

173

International Nuclear Information System (INIS)

In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and approximately 200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs.

1990-01-01

174

An Introduction to Logistic Regression Analysis and Reporting.

Provides guidelines for what to expect in an article using logistic regression techniques, discussing tables, figures, and charts to be included to comprehensively assess results and assumptions to be verified; demonstrating the preferred pattern for applying logistic methods, with an illustration of logistic regression applied to a data set; and…

Peng, Chao-Ying Joanne; Lee, Kuk Lida; Ingersoll, Gary M.

2002-01-01

175

Regression tree analysis for predicting slaughter weight in broilers

Directory of Open Access Journals (Sweden)

Full Text Available In this study, Regression Tree Analysis (RTA) was used to predict and to determine the most important variables in predicting the slaughter weight of Ross 308 broiler chickens. Data for this study came from 224 chickens raised during three different seasons, namely spring (n=66), summer (n=66), and winter (n=92). Second week body weight, shank length, shank width, breast bone length, breast width, breast circumference and body length were used to predict the slaughter weight. Results of RTA showed that among the seven independent variables only four were selected, namely body weight, breast bone length, shank width, and breast circumference. These selected independent variables were more efficient than the others in predicting the slaughter weight. RTA indicated that the birds which had values of second week body weight >295.95 g, breast bone length >55.82 mm and breast circumference >14.18 cm, or that of body weight ≤295.95 g, breast bone length >60.26 mm and shank width >8.32 mm, could be expected to have higher slaughter weights.

Mehmet Mendeş; Erkut Akkartal

2010-01-01
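
The two branches reported in the broiler abstract above can be written as a small decision rule. This is a sketch only: it assumes the second branch applies to birds at or below 295.95 g of second-week body weight, it encodes just the two high-weight leaves named in the abstract (the full tree is not given there), and the function and parameter names are hypothetical.

```python
def higher_slaughter_weight_expected(body_weight_g, breast_bone_mm,
                                     breast_circ_cm, shank_width_mm):
    """True when a bird falls in one of the two high-weight leaves
    described in the abstract; thresholds are the reported cut points."""
    if body_weight_g > 295.95:
        # Heavier birds at week two: split on breast bone length, then circumference.
        return breast_bone_mm > 55.82 and breast_circ_cm > 14.18
    # Lighter birds (assumed <= 295.95 g): split on breast bone length, then shank width.
    return breast_bone_mm > 60.26 and shank_width_mm > 8.32

# Illustrative measurements, not birds from the study.
heavy_bird = higher_slaughter_weight_expected(310.0, 58.0, 14.5, 8.0)
light_bird = higher_slaughter_weight_expected(280.0, 58.0, 14.5, 8.5)
```

The second bird fails its branch because a breast bone length of 58.0 mm is below the 60.26 mm cut point that applies to lighter birds.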

176

A simplified procedure of linear regression in a preliminary analysis

Directory of Open Access Journals (Sweden)

Full Text Available The analysis of a large statistical data set can be guided by the study of a particularly interesting variable Y (the regressed variable) and an explanatory variable X, chosen among the remaining, jointly observed variables. The study gives a simplified procedure to obtain the functional link between the variables, y=y(x), by partitioning the data set into m subsets, in which the observations are summarized by location indices (mean or median) of X and Y. Polynomial models of order r for y(x) are considered to verify the characteristics of the given procedure; in particular, we assume r=1 and r=2. The distributions of the parameter estimators are obtained by simulation, when the fitting is done for m=r+1. The results are compared, in terms of distribution and efficiency, with those obtained by the ordinary least squares method. The study also gives some considerations on the consistency of the estimated parameters obtained by the given procedure.

Silvia Facchinetti; Umberto Magagnoli

2010-01-01
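
For the linear case (r=1, m=2) of the simplified procedure described above, the fit reduces to the line through two group means. The sketch below assumes the partition is made by sorting on X and cutting the sample in half, and uses means as the location index; the abstract does not specify the partition rule, so that choice and the function name are assumptions.

```python
def two_group_line(x, y):
    """Fit y = a + b*x through the (mean x, mean y) points of two subsets
    obtained by sorting on x and splitting at the midpoint."""
    pairs = sorted(zip(x, y))
    half = len(pairs) // 2
    points = []
    for group in (pairs[:half], pairs[half:]):
        mean_x = sum(p[0] for p in group) / len(group)
        mean_y = sum(p[1] for p in group) / len(group)
        points.append((mean_x, mean_y))
    (x1, y1), (x2, y2) = points
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return intercept, slope

# Noise-free illustrative data on the line y = 1 + 2x.
xs = [float(i) for i in range(10)]
ys = [1.0 + 2.0 * v for v in xs]
intercept, slope = two_group_line(xs, ys)
```

With m = r + 1 points the polynomial interpolates the group summaries exactly, so no least-squares step is needed; the cost relative to ordinary least squares is the efficiency loss that the study quantifies by simulation.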

177

Background: Lymphatic drainage to multiple basins (MLBD) is frequently observed in patients with primary melanoma located in the trunk. Conflicting data regarding the prognostic impact of MLBD are reported. Objective and methods: We reviewed our case series of 352 patients with trunk melanoma to evaluate the pattern of basin drainage and to analyse whether different basin drainages may have different significance in negative sentinel lymph node (SLN) patients. The presence of single/multiple basin drainage, the status of SLN, the presence of melanoma regression, Breslow thickness, ulceration and type of melanoma were recorded for each patient and correlated to Disease Free Survival (DFS) and Overall Survival (OS). Results: MLBD occurred in 77 patients (21.9%) and single basin lymphatic drainage (SLBD) occurred in 275 patients (79.1%). The presence of metastases in SLN was not significantly different in patients with MLBD compared to those with SLBD (26% vs. 19.6%). No differences in OS and DFS were found in SLBD/MLBD independently from SLN status. However, DFS was higher in patients with MLBD and negative SLN (P = 0.0001); in addition, in patients with negative SLN and SLBD, disease recurrence was 19%, while it was only 7% in patients with negative SLN obtained from MLBD (P = 0.03). Multivariate analysis showed that Breslow thickness <2 mm, MLBD pattern and regression of melanoma were favourable variables for DFS of patients with negative SLN. Conclusions: An accurate study of the drainage basin and of all the SLNs obtained from MLBD is recommended because of the impact on prognosis of melanoma of the trunk. PMID:22998598

Ribero, S; Quaglino, P; Osella-Abate, S; Sanlorenzo, M; Senetta, R; Macrì, L; Savoia, P; Macripò, G; Sapino, A; Bernengo, M G

2012-09-23

178

UK PubMed Central (United Kingdom)

Background: Lymphatic drainage to multiple basins (MLBD) is frequently observed in patients with primary melanoma located in the trunk. Conflicting data regarding the prognostic impact of MLBD are reported. Objective and methods: We reviewed our case series of 352 patients with trunk melanoma to evaluate the pattern of basin drainage and to analyse whether different basin drainages may have different significance in negative sentinel lymph node (SLN) patients. The presence of single/multiple basin drainage, the status of SLN, the presence of melanoma regression, Breslow thickness, ulceration and type of melanoma were recorded for each patient and correlated to Disease Free Survival (DFS) and Overall Survival (OS). Results: MLBD occurred in 77 patients (21.9%) and single basin lymphatic drainage (SLBD) occurred in 275 patients (79.1%). The presence of metastases in SLN was not significantly different in patients with MLBD compared to those with SLBD (26% vs. 19.6%). No differences in OS and DFS were found in SLBD/MLBD independently from SLN status. However, DFS was higher in patients with MLBD and negative SLN (P = 0.0001); in addition, in patients with negative SLN and SLBD, disease recurrence was 19%, while it was only 7% in patients with negative SLN obtained from MLBD (P = 0.03). Multivariate analysis showed that Breslow thickness <2 mm, MLBD pattern and regression of melanoma were favourable variables for DFS of patients with negative SLN. Conclusions: An accurate study of the drainage basin and of all the SLNs obtained from MLBD is recommended because of the impact on prognosis of melanoma of the trunk.

Ribero S; Quaglino P; Osella-Abate S; Sanlorenzo M; Senetta R; Macrì L; Savoia P; Macripò G; Sapino A; Bernengo MG

2013-09-01

179

Penalized-regression-based multimarker genotype analysis of Genetic Analysis Workshop 17 data

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Testing for association between multiple markers and a phenotype can not only capture untyped causal variants in weak linkage disequilibrium with nearby typed markers but also identify the effect of a combination of markers. We propose a sliding window approach that uses multimarker genotypes as variables in a penalized regression. We investigate a penalty with three separate components: (1) a group least absolute shrinkage and selection operator (LASSO) that selects multimarker genotypes in a gene to be included in or excluded from the model, (2) an allele-sharing penalty that encourages multimarker genotypes with similar alleles to have similar coefficients, and (3) a penalty that shrinks the size of coefficients while performing model selection. The penalized likelihood is minimized with a cyclic coordinate descent algorithm, allowing quick coefficient estimation for a large number of markers. We compare our method to single-marker analysis and a gene-based sparse group LASSO on the Genetic Analysis Workshop 17 data for quantitative trait Q2. We found that all of the methods were underpowered to detect the simulated rare causal variants at the low false-positive rates desired in association studies. However, the sparse group LASSO on multi-marker genotypes seems to provide some advantage over the sparse group LASSO applied to single SNPs within genes, giving further evidence that there may be an advantage to modeling combinations of rare variant alleles over modeling them individually.

Ayers Kristin L; Mamasoula Chrysovalanto; Cordell Heather J

2011-01-01
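
The penalized-regression abstract above minimizes its objective with cyclic coordinate descent. As a minimal sketch of that optimizer, the code below applies it to the plain LASSO objective (1/(2n))·||y − Xb||² + λ·||b||₁. This is deliberately not the authors' three-part penalty (group LASSO, allele-sharing, and shrinkage terms are omitted), and the data and function names are illustrative assumptions.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, sweeps=200):
    """Cyclic coordinate descent for (1/(2n))*||y - Xb||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb, kept up to date after every update
    for _ in range(sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            norm = sum(v * v for v in col) / n
            if norm == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that excludes j's own fit.
            rho = sum(col[i] * (resid[i] + col[i] * b[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / norm
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= col[i] * delta
                b[j] = new
    return b

# Tiny illustrative design: the response depends only on the first column.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.0, 2.0, -2.0, -2.0]
coefs = lasso_cd(X, y, lam=0.5)
```

Each coordinate update has a closed form, which is what makes the approach quick for a large number of markers; the irrelevant second coefficient is set exactly to zero, illustrating the selection behaviour the abstract relies on.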

180

Introduction to Fuzzy Regression Analysis with a Ballistic Application.

Regression models are frequently used for predicting, but quite often the value assumed by the independent variable is somewhat vague. That problem is overcome in this paper by utilizing a combination of fuzzy set methodology and conventional statistical ...

W. E. Baker

1984-01-01

181

M-quantile regression analysis of temporal gene expression data.

UK PubMed Central (United Kingdom)

In this paper, we explore the use of M-quantile regression and M-quantile coefficients to detect statistical differences between temporal curves that belong to different experimental conditions. In particular, we consider the application of temporal gene expression data. Here, the aim is to detect genes whose temporal expression is significantly different across a number of biological conditions. We present a new method to approach this problem. Firstly, the temporal profiles of the genes are modelled by a parametric M-quantile regression model. This model is particularly appealing for small-sample gene expression data, as it is very robust against outliers and it does not make any assumption on the error distribution. Secondly, we further increase the robustness of the method by summarising the M-quantile regression models for a large range of quantile values into an M-quantile coefficient. Finally, we fit a polynomial M-quantile regression model to the M-quantile coefficients over time and employ a Hotelling T²-test to detect significant differences of the temporal M-quantile coefficient profiles across conditions. Extensive simulations show the increased power and robustness of M-quantile regression methods over standard regression methods and over some of the previously published methods. We conclude by applying the method to detect differentially expressed genes from time-course microarray data on muscular dystrophy.

Vinciotti V; Yu K

2009-01-01

182

Regression analysis of technical parameters affecting nuclear power plant performances

Energy Technology Data Exchange (ETDEWEB)

Since the 1980s many studies have been conducted to explain good and bad performances of commercial nuclear power plants (NPPs), but no defined correlation has yet been found to be fully representative of plant operational experience. In early works, data availability and the number of operating power stations were both limited; therefore, results showed that specific technical characteristics of NPPs were supposed to be the main causal factors for successful plant operation. Although these aspects continue to play a significant role, later studies and observations showed that other factors concerning management and organization of the plant could instead be predominant when comparing utilities' operational and economic results. Utility quality, in a word, can be used to summarize all the managerial and operational aspects that seem to be effective in determining plant performance. In this paper operational data of a consistent sample of commercial nuclear power stations, out of the total 433 operating NPPs, are analyzed, mainly focusing on the last decade of operational experience. The sample consists of PWR and BWR technology, operated by utilities located in different countries, including the U.S., Japan, France, Germany and Finland. Multivariate regression is performed using Unit Capability Factor (UCF) as the dependent variable; this factor reflects the effectiveness of plant programs and practices in maximizing the available electrical generation and consequently provides an overall indication of how well plants are operated and maintained. Aspects that may not be real causal factors but which can have a consistent impact on the UCF, such as technology design, supplier, size and age, are included in the analysis as independent variables. (authors)

Ghazy, R.; Ricotti, M. E.; Trueco, P. [Politecnico di Milano, Via La Masa, 34, 20156 Milano (Italy)]

2012-07-01

183

Regression analysis of technical parameters affecting nuclear power plant performances

International Nuclear Information System (INIS)

Since the 1980s many studies have been conducted to explain good and bad performances of commercial nuclear power plants (NPPs), but no defined correlation has yet been found to be fully representative of plant operational experience. In early works, data availability and the number of operating power stations were both limited; therefore, results showed that specific technical characteristics of NPPs were supposed to be the main causal factors for successful plant operation. Although these aspects continue to play a significant role, later studies and observations showed that other factors concerning management and organization of the plant could instead be predominant when comparing utilities' operational and economic results. Utility quality, in a word, can be used to summarize all the managerial and operational aspects that seem to be effective in determining plant performance. In this paper operational data of a consistent sample of commercial nuclear power stations, out of the total 433 operating NPPs, are analyzed, mainly focusing on the last decade of operational experience. The sample consists of PWR and BWR technology, operated by utilities located in different countries, including the U.S., Japan, France, Germany and Finland. Multivariate regression is performed using Unit Capability Factor (UCF) as the dependent variable; this factor reflects the effectiveness of plant programs and practices in maximizing the available electrical generation and consequently provides an overall indication of how well plants are operated and maintained. Aspects that may not be real causal factors but which can have a consistent impact on the UCF, such as technology design, supplier, size and age, are included in the analysis as independent variables. (authors)

2012-01-01

184

Use of generalized regression models for the analysis of stress-rupture data

International Nuclear Information System (INIS)

The design of components for operation in an elevated-temperature environment often requires a detailed consideration of the creep and creep-rupture properties of the construction materials involved. Techniques for the analysis and extrapolation of creep data have been widely discussed. The paper presents a generalized regression approach to the analysis of such data. This approach has been applied to multiple heat data sets for types 304 and 316 austenitic stainless steel, ferritic 2 1/4 Cr-1 Mo steel, and the high-nickel austenitic alloy 800H. Analyses of data for single heats of several materials are also presented. All results appear good. The techniques presented represent a simple yet flexible and powerful means for the analysis and extrapolation of creep and creep-rupture data.

1978-06-29

185

Buffalos milk yield analysis using random regression models

Directory of Open Access Journals (Sweden)

Full Text Available Data comprising 1,719 milk yield records from 357 females (predominantly Murrah breed), daughters of 110 sires, with births from 1974 to 2004, obtained from the Programa de Melhoramento Genético de Bubalinos (PROMEBUL) and from records of the EMBRAPA Amazônia Oriental (EAO) herd, located in Belém, Pará, Brazil, were used to compare random regression models for estimating variance components and predicting breeding values of the sires. The data were analyzed by different models using Legendre polynomial functions from second to fourth order. The random regression models included the effects of herd-year and month of parity date of the control; regression coefficients for age of females (in order to describe the fixed part of the lactation curve); and random regression coefficients related to the direct genetic and permanent environment effects. The comparisons among the models were based on the Akaike Information Criterion. The random effects regression model using third-order Legendre polynomials with four classes of the environmental effect was the one that best described the additive genetic variation in milk yield. The heritability estimates varied from 0.08 to 0.40. The genetic correlation between milk yields at younger ages was close to unity, but at older ages it was low.

C.V. de Araújo; A. Amorim Ramos; S. Inoe Araújo; L. Celi Chaves; A.S. Schierholt

2010-01-01
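
The buffalo milk-yield abstract above builds its random regression models on Legendre polynomials of age. A minimal sketch of how such covariates are formed follows: age is rescaled to [-1, 1] and the polynomials are evaluated by Bonnet's recurrence. The age range used here is invented for illustration, the unnormalized polynomials are shown (animal-breeding software often uses a normalized variant), and the function names are hypothetical.

```python
def standardize(t, t_min, t_max):
    """Map t from [t_min, t_max] onto [-1, 1], the domain of Legendre polynomials."""
    return 2.0 * (t - t_min) / (t_max - t_min) - 1.0

def legendre_basis(x, order):
    """Return [P_0(x), ..., P_order(x)] using the recurrence
    (n + 1) * P_{n+1}(x) = (2n + 1) * x * P_n(x) - n * P_{n-1}(x)."""
    values = [1.0, x]
    for n in range(1, order):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[:order + 1]

# Hypothetical age range in months (not taken from the PROMEBUL data).
x = standardize(90.0, 30.0, 110.0)   # maps to 0.5
basis = legendre_basis(x, 3)         # covariates for a third-order model
```

Each animal's genetic and permanent environment effects are then modelled as random coefficients on these covariates, so raising the polynomial order adds one covariance function parameter set per effect, which is what the Akaike criterion penalizes when models are compared.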

186

Digital Repository Infrastructure Vision for European Research (DRIVER)

The multiple linear regression formula of the probability of the averaged daily solar energy reaching a specific location on the earth's surface in a calendar month was obtained with the assumption that the arrival process of clouds and solar energy during the day follows the exponential distributio...

Mohammed Mohammed El Genidy

187

Scientific Electronic Library Online (English)

Full Text Available OBJECTIVE: To systematically review randomized controlled trials comparing the effect of supplementation with multiple micronutrients versus iron and folic acid on pregnancy outcomes in developing countries. METHODS: MEDLINE and EMBASE were searched. Outcomes of interest were birth weight, low birth weight, small size for gestational age, perinatal mortality and neonatal mortality. Pooled relative risks (RRs) were estimated by random effects models. Sources of heterogeneity were explored through subgroup meta-analyses and meta-regression. FINDINGS: Multiple micronutrient supplementation was more effective than iron and folic acid supplementation at reducing the risk of low birth weight (RR: 0.86; 95% confidence interval, CI: 0.79-0.93) and of small size for gestational age (RR: 0.85; 95% CI: 0.78-0.93). Micronutrient supplementation had no overall effect on perinatal mortality (RR: 1.05; 95% CI: 0.90-1.22), although substantial heterogeneity was evident (I²=58%; P for heterogeneity=0.008). Subgroup and meta-regression analyses suggested that micronutrient supplementation was associated with a lower risk of perinatal mortality in trials in which >50% of mothers had formal education (RR: 0.93; 95% CI: 0.82-1.06) or in which supplementation was initiated after a mean of 20 weeks of gestation (RR: 0.88; 95% CI: 0.80-0.97). CONCLUSION: Maternal education or gestational age at initiation of supplementation may have contributed to the observed heterogeneous effects on perinatal mortality. The safety, efficacy and effective delivery of maternal micronutrient supplementation require further research.
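
The abstract above reports pooled relative risks from random effects models together with an I² heterogeneity statistic. As a rough illustration of how such numbers can be obtained, the following sketch pools log relative risks with the DerSimonian-Laird estimator. The choice of that estimator is an assumption (the abstract does not name one), the function name `pool_random_effects` is hypothetical, and the trial values are invented purely for demonstration.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_random_effects(rrs, ci_lows, ci_highs):
    """Pool relative risks with a DerSimonian-Laird random-effects model.

    Standard errors are recovered from the reported 95% confidence limits.
    """
    logs = [math.log(r) for r in rrs]
    ses = [(math.log(hi) - math.log(lo)) / (2 * Z95)
           for lo, hi in zip(ci_lows, ci_highs)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / s ** 2 for s in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, logs)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logs))
    df = len(logs) - 1

    # Between-trial variance (tau^2) and the I^2 statistic in percent.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-trial variance.
    w_star = [1 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_star, logs)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - Z95 * se_pooled),
               math.exp(pooled + Z95 * se_pooled)),
        "tau2": tau2,
        "i2": i2,
    }


# Invented trial-level results, not taken from the review.
trial_rrs = [0.80, 0.95, 1.10, 0.88]
trial_lows = [0.65, 0.80, 0.90, 0.70]
trial_highs = [0.98, 1.13, 1.34, 1.11]
result = pool_random_effects(trial_rrs, trial_lows, trial_highs)
print(result)
```

Working on the log scale keeps the confidence interval symmetric around the pooled estimate before it is transformed back to a relative risk.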

Kawai, Kosuke; Spiegelman, Donna; Shankar, Anuraj H; Fawzi, Wafaie W

2011-06-01

188

Regression Analysis with Block Missing Values and Variables Selection

Directory of Open Access Journals (Sweden)

Full Text Available We consider a regression model in which a block of observations is missing, i.e. there is one group of observations with all the explanatory variables or covariates observed and another set of observations with only a block of the variables observed. We propose an estimator of the regression coefficients that is a combination of two estimators, one based on the observations with no missing variables, and the other based on all observations after deleting the block of variables with missing values. The proposed combined estimator is compared with the uncombined estimators. If the experimenter suspects that the variables with missing values may be deleted, a preliminary test is performed to resolve the uncertainty. If the preliminary test accepts the null hypothesis that the regression coefficients of the variables with missing values are equal to zero, then only the data with no missing values are used for estimating the regression coefficients. Otherwise the combined estimator is used. This gives a preliminary test estimator. The properties of the preliminary test estimator and comparisons of the estimators are studied by a Monte Carlo study.

Chien-Pai Han; Yan Li

2011-01-01

189

Bayesian analysis of logistic regression with an unknown change point

Digital Repository Infrastructure Vision for European Research (DRIVER)

We discuss Bayesian estimation of a logistic regression model with an unknown threshold limiting value (TLV). In these models it is assumed that there is no effect of a covariate on the response under a certain unknown TLV. The estimation of these models with a focus on the TLV in a Bayesian contex...

Gössl, Christoff; Küchenhoff, Helmut

190

A high-resolution analysis of process improvement: use of quantile regression for wait time.

UK PubMed Central (United Kingdom)

OBJECTIVE: Apply quantile regression for a high-resolution analysis of changes in wait time to treatment and assess its applicability to quality improvement data compared with least-squares regression. DATA SOURCE: Addiction treatment programs participating in the Network for the Improvement of Addiction Treatment. METHODS: We used quantile regression to estimate wait time changes at 5, 50, and 95 percent and compared the results with mean trends by least-squares regression. PRINCIPAL FINDINGS: Quantile regression analysis found statistically significant changes in the 5 and 95 percent quantiles of wait time that were not identified using least-squares regression. CONCLUSIONS: Quantile regression enabled estimating changes specific to different percentiles of the wait time distribution. It provided a high-resolution analysis that was more sensitive to changes in quantiles of the wait time distributions.
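
To make the contrast between quantile and least-squares estimates concrete, here is a minimal sketch of the check (pinball) loss that quantile regression minimizes, reduced to the intercept-only case. The wait times are invented, and the helper names `pinball_loss` and `fit_quantile` are hypothetical; a real analysis would add covariates and use a dedicated solver.

```python
def pinball_loss(q, values, tau):
    """Mean check (pinball) loss of a candidate quantile q at level tau."""
    total = 0.0
    for v in values:
        diff = v - q
        # Under-predictions are weighted by tau, over-predictions by 1 - tau.
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(values)


def fit_quantile(values, tau):
    """Intercept-only quantile regression.

    The check loss is piecewise linear, so a minimizer is always attained
    at one of the observed values; searching over them is sufficient.
    """
    return min(sorted(values), key=lambda q: pinball_loss(q, values, tau))


# Hypothetical wait times in days for ten clients.
wait_days = [1, 2, 2, 3, 4, 5, 7, 9, 14, 30]

mean_wait = sum(wait_days) / len(wait_days)   # what least squares targets
p05 = fit_quantile(wait_days, 0.05)
p50 = fit_quantile(wait_days, 0.50)
p95 = fit_quantile(wait_days, 0.95)
print(mean_wait, p05, p50, p95)
```

In this toy sample the mean is pulled upward by the single long wait, whereas the 5, 50 and 95 percent quantiles describe the short, typical and long waits separately.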

Choi D; Hoffman KA; Kim MO; McCarty D

2013-02-01

191

Correlation and regression are statistical methods that help us determine relationships between variables. Both are used in the statistical analysis of basic and clinical research. Correlation (r) is a measure of the linear relationship between two numerical measurements made on the same set of subjects and is represented by the correlation coefficient. Values of the correlation coefficient range between -1 and 1. Pearson's and Spearman's coefficients are the most often used correlation coefficients. Correlation can be linear and non-linear. The significance of the correlation coefficient is assessed by calculating a P value. Regression is a statistical method that allows us to predict values of one variable from another. The simplest regression is linear regression. The quality of a regression equation is assessed by analysis of residuals. Multiple regression is used to predict one variable from several known variables. PMID:16526309
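
The quantities named in this abstract can be computed directly from their definitions. The sketch below calculates Pearson's r, fits a simple least-squares line and returns the residuals; the paired measurements are invented and the function names are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(x, y, slope, intercept):
    """Observed minus fitted values; inspected to judge the fit."""
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


# Invented paired measurements on five subjects.
dose = [1, 2, 3, 4, 5]
response = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(dose, response)
slope, intercept = fit_line(dose, response)
res = residuals(dose, response, slope, intercept)
print(r, slope, intercept, res)
```

With an intercept in the model the residuals sum to zero by construction, so it is their pattern (trend, spread, outliers) rather than their total that indicates how well the line describes the data.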

Azman, Josip; Frković, Vedran; Bilić-Zulle, Lidija; Petrovecki, Mladen

2006-01-01

192

UK PubMed Central (United Kingdom)

Derailments are the most common type of freight-train accident in the United States. Derailments cause damage to infrastructure and rolling stock, disrupt services, and may cause casualties and harm the environment. Accordingly, derailment analysis and prevention has long been a high priority in the rail industry and government. Despite the low probability of a train derailment, the potential for severe consequences justifies the need to better understand the factors influencing train derailment severity. In this paper, a zero-truncated negative binomial (ZTNB) regression model is developed to estimate the conditional mean of train derailment severity. Recognizing that the mean is not the only statistic describing data distribution, a quantile regression (QR) model is also developed to estimate derailment severity at different quantiles. The two regression models together provide a better understanding of train derailment severity distribution. Results of this work can be used to estimate train derailment severity under various operational conditions and by different accident causes. This research is intended to provide insights regarding development of cost-efficient train safety policies.
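
A zero-truncated negative binomial distribution describes counts that cannot be zero, which suits a severity measure that is only recorded once a derailment has happened. The sketch below assumes the common mean/dispersion (NB2) parameterization with mean `mu` and dispersion `alpha`; the abstract does not state the parameterization or the severity measure used, so both the parameter values and the function names are illustrative.

```python
import math


def nb_pmf(y, mu, alpha):
    """Negative binomial pmf with mean mu and variance mu + alpha * mu**2."""
    r = 1.0 / alpha          # shape parameter
    p = r / (r + mu)         # success probability
    log_pmf = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
               + r * math.log(p) + y * math.log(1.0 - p))
    return math.exp(log_pmf)


def ztnb_pmf(y, mu, alpha):
    """Zero-truncated pmf: renormalize over the counts y = 1, 2, 3, ..."""
    if y < 1:
        return 0.0
    return nb_pmf(y, mu, alpha) / (1.0 - nb_pmf(0, mu, alpha))


def ztnb_mean(mu, alpha):
    """Conditional mean E[Y | Y > 0] of the truncated distribution."""
    return mu / (1.0 - nb_pmf(0, mu, alpha))


# Illustrative parameter values only.
mu, alpha = 8.0, 0.5
print(ztnb_pmf(1, mu, alpha), ztnb_mean(mu, alpha))
```

In a regression setting `mu` would typically be linked to covariates, for example through a log link; that step is omitted here.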

Liu X; Saat MR; Qin X; Barkan CP

2013-10-01

193

Derailments are the most common type of freight-train accident in the United States. Derailments cause damage to infrastructure and rolling stock, disrupt services, and may cause casualties and harm the environment. Accordingly, derailment analysis and prevention has long been a high priority in the rail industry and government. Despite the low probability of a train derailment, the potential for severe consequences justifies the need to better understand the factors influencing train derailment severity. In this paper, a zero-truncated negative binomial (ZTNB) regression model is developed to estimate the conditional mean of train derailment severity. Recognizing that the mean is not the only statistic describing data distribution, a quantile regression (QR) model is also developed to estimate derailment severity at different quantiles. The two regression models together provide a better understanding of train derailment severity distribution. Results of this work can be used to estimate train derailment severity under various operational conditions and by different accident causes. This research is intended to provide insights regarding development of cost-efficient train safety policies. PMID:23770389

Liu, Xiang; Saat, M Rapik; Qin, Xiao; Barkan, Christopher P L

2013-05-22

194

REGRESSION ANALYSIS OF PRODUCTIVITY USING MIXED EFFECT MODEL

Directory of Open Access Journals (Sweden)

Full Text Available Production plants of a company are located in several areas spread across Middle and East Java. As the production process employs mostly manpower, we suspected that each location has different characteristics affecting productivity. Thus, the production data may have a spatial and hierarchical structure. Fitting a linear regression using the ordinary techniques requires some assumptions about the nature of the residuals, i.e. that they are independent, identically and normally distributed. However, these assumptions are rarely fulfilled, especially for data that have a spatial and hierarchical structure. We worked out the problem using a mixed effect model. This paper discusses the construction of a model of productivity and several characteristics of the production line, taking location as a random effect. A simple model with high utility that satisfies the necessary regression assumptions was built using the free statistical software R, version 2.6.1.

Siana Halim; Indriati N Bisono

2007-01-01

195

HIGH RESOLUTION FOURIER ANALYSIS WITH AUTO-REGRESSIVE LINEAR PREDICTION

Energy Technology Data Exchange (ETDEWEB)

Auto-regressive linear prediction is adapted to double the resolution of Angle-Resolved Photoemission Extended Fine Structure (ARPEFS) Fourier transforms. Even with the optimal taper (weighting function), the commonly used taper-and-transform Fourier method has limited resolution: it assumes the signal is zero beyond the limits of the measurement. By seeking the Fourier spectrum of an infinite extent oscillation consistent with the measurements but otherwise having maximum entropy, the errors caused by finite data range can be reduced. Our procedure developed to implement this concept adapts auto-regressive linear prediction to extrapolate the signal in an effective and controllable manner. Difficulties encountered when processing actual ARPEFS data are discussed. A key feature of this approach is the ability to convert improved measurements (signal-to-noise or point density) into improved Fourier resolution.
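
The core idea, extending a measured oscillation beyond the data range with an auto-regressive predictor instead of assuming it is zero there, can be illustrated with a deliberately small example. The sketch below fits a second-order predictor by ordinary least squares and uses it to extrapolate a noise-free sinusoid. This is a simplification chosen for brevity, not the maximum-entropy procedure described in the abstract, and the signal, model order and function names are all assumptions.

```python
import math


def fit_ar2(x):
    """Least-squares fit of x[n] ~ a1 * x[n-1] + a2 * x[n-2]."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for n in range(2, len(x)):
        u, v, t = x[n - 1], x[n - 2], x[n]
        s11 += u * u
        s12 += u * v
        s22 += v * v
        b1 += u * t
        b2 += v * t
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (s11 * b2 - s12 * b1) / det
    return a1, a2


def extrapolate(x, a1, a2, steps):
    """Append `steps` predicted samples to a copy of x."""
    out = list(x)
    for _ in range(steps):
        out.append(a1 * out[-1] + a2 * out[-2])
    return out


# Synthetic measurement: 40 samples of a single oscillation.
omega, phase = 0.3, 0.4
signal = [math.cos(omega * n + phase) for n in range(40)]

a1, a2 = fit_ar2(signal)
extended = extrapolate(signal, a1, a2, 20)
print(a1, a2, len(extended))
```

For a single undamped sinusoid the fitted coefficients approach 2·cos(ω) and -1, so the predicted samples continue the oscillation; a Fourier transform of the extended record then has a longer effective data range than the measurement alone.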

Barton, J.; Shirley, D.A.

1984-04-01

196

A quantile regression analysis of ambulance response time.

UK PubMed Central (United Kingdom)

BACKGROUND: Shorter ambulance response time (ART) contributes to improved clinical outcomes. Various methods have been used to analyze ART. OBJECTIVES: We aimed to compare the use of quantile regression with the standard ordinary least squares (OLS) model for identifying factors associated with ART in Singapore. A secondary aim was to determine the relative importance of patient-level (e.g., gender and ethnicity) versus system-level (e.g., call volumes within the last one hour) factors contributing to longer ART. METHODS: We conducted a retrospective review of data electronically captured from ambulance dispatch records and patient case notes of emergency calls to the national ambulance service from January to May 2006 (n = 30,687). The primary outcome was ART, defined as the time taken for an ambulance to arrive at the scene upon receiving an emergency call, and modeled as a function of patient- and system-level factors. We used a quantile regression model to account for potential heterogeneous effects of explanatory variables on ART across different quantiles of the ART distribution, and compared estimates derived with the corresponding OLS estimates. RESULTS: Quantile regression estimates suggested that the call volume in the previous one hour predicted increased ART, with the effect being more pronounced in higher ART quantiles. At the 90th and 50th percentiles of ART, each additional call in the last one hour was predicted to increase ART to the next call from the same area by 93 and 57 seconds, respectively. The corresponding OLS estimate was 58 seconds. Patient factors had little effect on ART. CONCLUSION: The quantile regression model is more useful than the OLS model for estimating ART, revealing that in Singapore, ART is influenced heterogeneously by the volume of emergency calls in the past one hour.

Do YK; Foo K; Ng YY; Ong ME

2013-04-01

197

A Nonmonotone Line Search Method for Regression Analysis

Directory of Open Access Journals (Sweden)

Full Text Available In this paper, we propose a nonmonotone line search combined with the search direction of G. L. Yuan and Z. X. Wei (New Line Search Methods for Unconstrained Optimization, Journal of the Korean Statistical Society, 38 (2009), pp. 29-39) for regression problems. The global convergence of the given method is established under suitable conditions. Numerical results show that the presented algorithm is more competitive than the normal methods.

Gonglin Yuan; Zengxin Wei

2009-01-01

198

Analysis of some methods for reduced rank Gaussian process regression

DEFF Research Database (Denmark)

While there is strong motivation for using Gaussian Processes (GPs) due to their excellent performance in regression and classification problems, their computational complexity makes them impractical when the size of the training set exceeds a few thousand cases. This has motivated the recent proliferation of a number of cost-effective approximations to GPs, both for classification and for regression. In this paper we analyze one popular approximation to GPs for regression: the reduced rank approximation. While generally GPs are equivalent to infinite linear models, we show that Reduced Rank Gaussian Processes (RRGPs) are equivalent to finite sparse linear models. We also introduce the concept of degenerate GPs and show that they correspond to inappropriate priors. We show how to modify the RRGP to prevent it from being degenerate at test time. Training RRGPs consists of learning both the covariance function hyperparameters and the support set. We propose a method for learning hyperparameters for a given support set. We also review the Sparse Greedy GP (SGGP) approximation (Smola and Bartlett, 2001), which is a way of learning the support set for given hyperparameters based on approximating the posterior. We propose an alternative method to the SGGP that has better generalization capabilities. Finally, we conduct experiments to compare the different ways of training an RRGP. We provide some Matlab code for learning RRGPs.

Quinonero-Candela, J.; Rasmussen, Carl Edward

2005-01-01

199

Seasonal Regression Models for Electricity Consumption Characteristics Analysis

Directory of Open Access Journals (Sweden)

Full Text Available This paper presents seasonal regression models of demand to investigate electricity consumption characteristics. Electricity consumption in commercial areas in Japan is analyzed using meteorological variables, namely temperature and relative humidity. A dummy variable for holidays is also considered. We have developed models for two levels of period to analyze demand characteristics, that is, half-year models and seasonal models. Several options for each model are calculated and validated by statistical tests to obtain better models. As a result, the half-year and seasonal models present explicit information about how the variables affect the demand differently in each period. This specific information helps in analyzing the characteristics of the studied commercial demand.

Yusri Syam Akil; Hajime Miyauchi

2013-01-01

200

Energy Technology Data Exchange (ETDEWEB)

The adsorption of 55 organic compounds is carried out onto a recently discovered adsorbent, activated carbon cloth. Isotherms are modeled using the classical Freundlich model, and the large database generated allows qualitative assumptions about the adsorption mechanism. However, to confirm these assumptions, a quantitative structure-property relationship methodology is used to assess the correlations between an adsorbability parameter (expressed using the Freundlich parameter K) and topological indices related to the compounds' molecular structure (molecular connectivity indices, MCI). This correlation is set up by means of two different statistical tools, multiple linear regression (MLR) and neural network (NN). A principal component analysis is carried out to generate new and uncorrelated variables. It enables the relations between the MCI to be analyzed, but the multiple linear regression assessed using the principal components (PCs) has a poor statistical quality and introduces high-order PCs, too inaccurate for an explanation of the adsorption mechanism. The correlations are thus set up using the original variables (MCI), and both statistical tools, multiple linear regression and neural network, are compared from a descriptive and predictive point of view. To compare the predictive ability of both methods, a test database of 10 organic compounds is used.

Brasquet, C.; Bourges, B.; Le Cloirec, P.

1999-12-01

201

UK PubMed Central (United Kingdom)

BACKGROUND: There have been few published studies on spirometric reference values for healthy children in China. We hypothesize that there would have been changes in lung function that would not have been precisely predicted by the existing spirometric reference equations. The objective of the study was to develop more accurate predictive equations for spirometric reference values for children aged 9 to 15 years in Northeast China. METHODOLOGY/PRINCIPAL FINDINGS: Spirometric measurements were obtained from 3,922 children, including 1,974 boys and 1,948 girls, who were randomly selected from five cities of Liaoning province, Northeast China, using the ATS (American Thoracic Society) and ERS (European Respiratory Society) standards. The data were then randomly split into a training subset containing 2078 cases and a validation subset containing 1844 cases. Predictive equations used multiple linear regression techniques with three predictor variables: height, age and weight. Model goodness of fit was examined using the coefficient of determination (R²) and adjusted R². The predicted values were compared with those obtained from the existing spirometric reference equations. The results showed the prediction equations using linear regression analysis performed well for most spirometric parameters. Paired t-tests were used to compare the predicted values obtained from the developed and existing spirometric reference equations based on the validation subset. The t-test for males was not statistically significant (p>0.01). The predictive accuracy of the developed equations was higher than the existing equations and the predictive ability of the model was also validated. CONCLUSION/SIGNIFICANCE: We developed prediction equations using linear regression analysis of spirometric parameters for children aged 9-15 years in Northeast China. These equations represent the first attempt at predicting lung function for Chinese children following the ATS/ERS Task Force 2005 guidelines on spirometry standardization.
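
The two goodness-of-fit measures mentioned in this abstract differ only by a penalty for the number of predictors. A minimal sketch follows; the sample size (2078 training cases) and predictor count (height, age, weight) come from the abstract, while the R² value of 0.80 is a made-up placeholder because the abstract reports no figure.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R^2 for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# 2078 training cases and 3 predictors as in the abstract; R^2 is hypothetical.
adj = adjusted_r_squared(0.80, 2078, 3)
print(adj)
```

With a training subset this large and only three predictors, the adjustment is tiny, so R² and adjusted R² would be expected to be nearly identical.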

Ma YN; Wang J; Dong GH; Liu MM; Wang D; Liu YQ; Zhao Y; Ren WH; Lee YL; Zhao YD; He QC

2013-01-01

202

UK PubMed Central (United Kingdom)

Two main issues regarding stormwater quality models have been investigated: i) the effect of calibration dataset size and characteristics on calibration and validation results; ii) the optimal split of available data into calibration and validation subsets. Data from 13 catchments have been used for three pollutants: BOD, COD and SS. Three multiple regression models were calibrated and validated. The use of different data sets and different models reveals general trends. The main finding is that multiple regression models are sensitive to the calibration data: using too few data for calibration leads to poor predictions despite good calibration results. It was also found that a random split of the available data into halves for calibration and validation is not optimal. More data should be allocated to calibration. The proportion of data to be used for validation increases with the number of available data (N) and reaches about 35% for N around 55 measured events.

Mourad M; Bertrand-Krajewski JL; Chebbo G

2005-01-01

203

Regression analysis study on the carbon dioxide capture process

Energy Technology Data Exchange (ETDEWEB)

Research on amine-based carbon dioxide (CO₂) capture has mainly focused on improving the effectiveness and efficiency of the CO₂ capture process. The objective of our work is to explore relationships among key parameters that affect the CO₂ production rate. From a survey of relevant literature, we observed that the significant parameters influencing the CO₂ production rate include the reboiler heat duty, solvent concentration, solvent circulation rate, and CO₂ lean loading. While it is widely recognized that these parameters are related, the exact nature of the relationships is unknown. This paper presents a regression study conducted with data collected at the International Test Center for CO₂ capture (ITC) located at the University of Regina, Saskatchewan, Canada. The regression technique was applied to a data set covering 113 days of operation of the CO₂ capture plant, and four mathematical models of the key parameters have been developed. The models can be used for predicting the performance of the plant when changes occur in the process. By manipulation of the parameter values, the efficiency of the CO₂ capture process can be improved.

Zhou, Q.; Chan, C.W.; Tontiwachiwuthikul, P. [University of Regina, Regina, SK (Canada). Faculty of Engineering

2008-07-15

204

This article proposes an efficient two-channel time delay estimation method for tracking a moving speaker in noisy and reverberant environments. Unlike conventional linear regression model-based methods, the proposed multiple linear regression model designed in the expanded phase domain shows high estimation accuracy in adverse conditions because its Gaussian assumption on the phase distribution is valid. Therefore, the least-squares-based time delay estimator using the proposed multiple linear regression model becomes an ideal estimator that does not require a complicated phase unwrapping process. In addition, the proposed method is extended to a two-stage recursive estimation approach, which can be used for a moving source tracking scenario. The performance of the proposed method is compared with that of conventional cross-correlation and linear regression-based methods in noisy and reverberant environments. Experimental results verify that the proposed algorithm significantly decreases estimation anomalies and improves the accuracy of time delay estimation. Finally, the tracking performance of the proposed method for both slow and fast moving speakers is confirmed in adverse environments.

Yang, Jae-Mo; Kang, Hong-Goo

2012-12-01

205

Logistic regression analysis of cadmium-induced renal abnormalities

Energy Technology Data Exchange (ETDEWEB)

Cases of renal dysfunction associated with cadmium exposure have been reported in Belgium, Great Britain, Japan, the United States, and Sweden. Indirect estimates of body burden were often based on the measurement of environmental exposure conditions or on tissue concentrations in urine, blood, saliva, or hair clippings. More recently, however, the direct in vivo assessment of liver and kidney cadmium burden in humans has provided additional data. Sufficient data on humans does exist, however, to make reasonable estimates of the increased risk for cadmium-induced renal dysfunction. In the present paper, a linear logistic regression model has been developed on the basis of liver and kidney cadmium burden. These relationships are discussed with respect to the concept of a critical concentration for the renal cortex. 14 refs., 3 figs., 2 tabs.

Ellis, K.J.; Yuen, K.; Cohn, S.H.

1986-02-01

206

UK PubMed Central (United Kingdom)

Many machine learning and pattern classification methods have been applied to the diagnosis of Alzheimer's disease (AD) and its prodromal stage, i.e., mild cognitive impairment (MCI). Recently, rather than predicting categorical variables as in classification, several pattern regression methods have also been used to estimate continuous clinical variables from brain images. However, most existing regression methods focus on estimating multiple clinical variables separately and thus cannot utilize the intrinsic useful correlation information among different clinical variables. On the other hand, in those regression methods, only a single modality of data (usually only the structural MRI) is often used, without considering the complementary information that can be provided by different modalities. In this paper, we propose a general methodology, namely multi-modal multi-task (M3T) learning, to jointly predict multiple variables from multi-modal data. Here, the variables include not only the clinical variables used for regression but also the categorical variable used for classification, with different tasks corresponding to prediction of different variables. Specifically, our method contains two key components, i.e., (1) a multi-task feature selection which selects the common subset of relevant features for multiple variables from each modality, and (2) a multi-modal support vector machine which fuses the above-selected features from all modalities to predict multiple (regression and classification) variables. To validate our method, we perform two sets of experiments on ADNI baseline MRI, FDG-PET, and cerebrospinal fluid (CSF) data from 45 AD patients, 91 MCI patients, and 50 healthy controls (HC). In the first set of experiments, we estimate two clinical variables, namely the Mini Mental State Examination (MMSE) and Alzheimer's Disease Assessment Scale-Cognitive Subscale (ADAS-Cog) scores, as well as one categorical variable (with value of 'AD', 'MCI' or 'HC'), from the baseline MRI, FDG-PET, and CSF data. In the second set of experiments, we predict the 2-year changes of MMSE and ADAS-Cog scores and also the conversion of MCI to AD from the baseline MRI, FDG-PET, and CSF data. The results on both sets of experiments demonstrate that our proposed M3T learning scheme can achieve better performance on both regression and classification tasks than the conventional learning methods.

Zhang D; Shen D

2012-01-01

207

A Bayesian Quantile Regression Analysis of Potential Risk Factors for Violent Crimes in USA

Directory of Open Access Journals (Sweden)

Full Text Available Bayesian quantile regression has recently drawn more attention in a wide range of applications. Yu and Moyeed (2001) proposed an asymmetric Laplace distribution to provide a likelihood-based mechanism for Bayesian inference in quantile regression models. In this work, the primary objective is to evaluate the performance of Bayesian quantile regression compared with simple regression and quantile regression, through simulation and through application to a crime dataset from the 50 U.S. states, for assessing the effect of potential risk factors on the violent crime rate. This paper also explores improper priors and conducts a sensitivity analysis on the parameter estimates. The data analysis reveals that the percentage of the population who are single parents always has a significant positive influence on the occurrence of violent crime, and that Bayesian quantile regression provides a more comprehensive statistical description of this association.
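
For readers unfamiliar with the likelihood mentioned here, the asymmetric Laplace density and its link to the quantile regression check loss can be written out as follows. The symbols (location \mu, scale \sigma, quantile level \tau, coefficients \beta) are standard notation chosen for this note and do not appear in the abstract.

```latex
% Check (pinball) loss at quantile level \tau \in (0,1)
\rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr)

% Asymmetric Laplace density with location \mu and scale \sigma > 0
f(y \mid \mu, \sigma, \tau)
  = \frac{\tau(1-\tau)}{\sigma}
    \exp\!\left\{-\rho_\tau\!\left(\frac{y-\mu}{\sigma}\right)\right\}

% With \mu_i = x_i^{\top}\beta and \sigma fixed, maximizing the likelihood
% is equivalent to minimizing the quantile regression objective
\hat{\beta}_\tau = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\bigl(y_i - x_i^{\top}\beta\bigr)
```

Placing a prior on \beta and combining it with this likelihood gives a posterior for the \tau-th conditional quantile.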

Ming Wang; Lijun Zhang

2012-01-01

208

Additive Intensity Regression Models in Corporate Default Analysis

DEFF Research Database (Denmark)

We consider additive intensity (Aalen) models as an alternative to the multiplicative intensity (Cox) models for analyzing the default risk of a sample of rated, nonfinancial U.S. firms. The setting allows for estimating and testing the significance of time-varying effects. We use a variety of model checking techniques to identify misspecifications. In our final model, we find evidence of time-variation in the effects of distance-to-default and short-to-long term debt. Also we identify interactions between distance-to-default and other covariates, and the quick ratio covariate is significant. None of our macroeconomic covariates are significant.
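
The difference between the two model classes named in this abstract is easiest to see from the form of the default intensity for firm i. The notation below (covariates x_{ij}(t), baseline \lambda_0(t), coefficient functions \beta_j(t)) is the conventional textbook form and is an assumption here, since the abstract gives no formulas.

```latex
% Multiplicative intensity (Cox): covariates scale a common baseline
\lambda_i(t) = \lambda_0(t)\,\exp\!\Bigl(\sum_{j=1}^{p} \beta_j\, x_{ij}(t)\Bigr)

% Additive intensity (Aalen): covariates shift the intensity,
% and each effect \beta_j(t) may vary with time
\lambda_i(t) = \beta_0(t) + \sum_{j=1}^{p} \beta_j(t)\, x_{ij}(t)
```

Because each \beta_j(t) is a function of time in the additive form, testing whether it is constant is a direct way to assess time-varying effects of the kind the abstract reports.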

Lando, David; Medhat, Mamdouh

2013-01-01

209

Quantile regression analysis of body mass and wages.

UK PubMed Central (United Kingdom)

Using the National Longitudinal Survey of Youth 1979, we explore the relationship between body mass and wages. We use quantile regression to provide a broad description of the relationship across the wage distribution. We also allow the relationship to vary by the degree of social skills involved in different jobs. Our results find that for female workers body mass and wages are negatively correlated at all points in their wage distribution. The strength of the relationship is larger at higher-wage levels. For male workers, the relationship is relatively constant across wage distribution but heterogeneous across ethnic groups. When controlling for the endogeneity of body mass, we find that additional body mass has a negative causal impact on the wages of white females earning more than the median wages and of white males around the median wages. Among these workers, the wage penalties are larger for those employed in jobs that require extensive social skills. These findings may suggest that labor markets reward white workers for good physical shape differently, depending on the level of wages and the type of job a worker has.

Johar M; Katayama H

2012-05-01

210

Combinatorial Analysis of Multiple Networks

The study of complex networks has been historically based on simple graph data models representing relationships between individuals. However, often reality cannot be accurately captured by a flat graph model. This has led to the development of multi-layer networks. These models have the potential of becoming the reference tools in network data analysis, but require the parallel development of specific analysis methods explicitly exploiting the information hidden in-between the layers and the availability of a critical mass of reference data to experiment with the tools and investigate the real-world organization of these complex systems. In this work we introduce a real-world layered network combining different kinds of online and offline relationships, and present an innovative methodology and related analysis tools suggesting the existence of hidden motifs traversing and correlating different representation layers. We also introduce a notion of betweenness centrality for multiple networks. While some preli...

Magnani, Matteo; Rossi, Luca

2013-01-01

211

Nonlinear Robust Regression Using Kernel Principal Component Analysis and R-Estimators

Digital Repository Infrastructure Vision for European Research (DRIVER)

In recent years, many algorithms based on kernel principal component analysis (KPCA) have been proposed including kernel principal component regression (KPCR). KPCR can be viewed as a non-linearization of principal component regression (PCR) which uses the ordinary least squares (OLS) for estimatin...

Antoni Wibowo; Mohammad Ishak Desa

212

UK PubMed Central (United Kingdom)

BACKGROUND: Polytomous logistic regression models are commonly used in case-control studies of cancer to directly compare the risks associated with an exposure variable across multiple cancer subtypes. However, the validity, accuracy, and efficiency of this approach for prospective cohort studies have not been formally evaluated. METHODS: We investigated the performance of the polytomous logistic regression model and compared it with an alternative approach based on a joint Cox proportional hazards model using simulation studies. We then applied both methods to a prospective cohort study to assess whether the association of breast cancer with body size differs according to estrogen and progesterone receptor-defined subtypes. RESULTS: Our simulations showed that the polytomous logistic regression model but not the joint Cox regression model yielded biased results in comparing exposure and disease subtype associations when the baseline hazards for different disease subtypes are nonproportional. For this reason, an analysis of a real data set was based on the joint Cox proportional hazards model and showed that body size has a significantly greater association with estrogen- and progesterone-positive breast cancer than with other subtypes. CONCLUSIONS: Because of the limitations of the polytomous logistic regression model for the comparison of exposure-disease associations across disease subtypes, the joint Cox proportional hazards model is recommended over the polytomous logistic regression model in prospective cohort studies. Impact: The article will promote the use of the joint Cox model in a prospective cohort study. Examples of SAS and S-plus programming codes are provided to facilitate use by nonstatisticians.

Xue X; Kim MY; Gaudet MM; Park Y; Heo M; Hollenbeck AR; Strickler HD; Gunter MJ

2013-02-01

213

Multivariate linear regression of high-dimensional fMRI data with multiple target variables.

UK PubMed Central (United Kingdom)

Multivariate regression is increasingly used to study the relation between fMRI spatial activation patterns and experimental stimuli or behavioral ratings. With linear models, informative brain locations are identified by mapping the model coefficients. This is a central aspect in neuroimaging, as it provides the sought-after link between the activity of neuronal populations and subject's perception, cognition or behavior. Here, we show that mapping of informative brain locations using multivariate linear regression (MLR) may lead to incorrect conclusions and interpretations. MLR algorithms for high dimensional data are designed to deal with targets (stimuli or behavioral ratings, in fMRI) separately, and the predictive map of a model integrates information deriving from both neural activity patterns and experimental design. Not accounting explicitly for the presence of other targets whose associated activity spatially overlaps with the one of interest may lead to predictive maps of troublesome interpretation. We propose a new model that can correctly identify the spatial patterns associated with a target while achieving good generalization. For each target, the training is based on an augmented dataset, which includes all remaining targets. The estimation on such datasets produces both maps and interaction coefficients, which are then used to generalize. The proposed formulation is independent of the regression algorithm employed. We validate this model on simulated fMRI data and on a publicly available dataset. Results indicate that our method achieves high spatial sensitivity and good generalization and that it helps disentangle specific neural effects from interaction with predictive maps associated with other targets. Hum Brain Mapp, 2013. © 2013 Wiley Periodicals, Inc.

Valente G; Castellanos AL; Vanacore G; Formisano E

2013-07-01

214

Advanced GIS Exercise: Predicting Rainfall Erosivity Index Using Regression Analysis

Graduate students from a variety of agricultural and natural resource fields are incorporating geographic information systems (GIS) analysis into their graduate research, creating a need for teaching methodologies that help students understand advanced GIS topics for use in their own research. Graduate-level GIS exercises help students understand…

Post, Christopher J.; Goddard, Megan A.; Mikhailova, Elena A.; Hall, Steven T.

2006-01-01

215

Directory of Open Access Journals (Sweden)

Full Text Available Objective: To explore the risk factors for acute renal failure (ARF) as a complication of war injuries of the limbs. Methods: The clinical data of 352 patients with limb injuries admitted to 303 Hospital of PLA from 1968 to 2002 were retrospectively analyzed. The patients were divided into an ARF group (n=9) and a non-ARF group (n=343) according to the occurrence of ARF, and a case-control study was carried out. Ten factors which might lead to death were analyzed by logistic regression to screen the risk factors for ARF: cause of trauma, shock after injury, time to hospital admission after injury, injured site, combined trauma, number of surgical procedures, presence of foreign matter, features of fractures, amputation, and tourniquet time. Results: Fifteen of the 352 patients died (4.3%); among them, 7 patients (46.7%) died of ARF, 3 (20.0%) of pulmonary embolism, 3 (20.0%) of gas gangrene, and 2 (13.3%) of multiple organ failure. Univariate analysis revealed that shock, time before admission to hospital, amputation, and tourniquet time were risk factors for ARF in the wounded with limb injuries, while the logistic regression analysis showed that only amputation was a risk factor for ARF (P < 0.05). Conclusion: ARF is the primary cause of death in the wounded with limb injury. Prompt and accurate treatment and optimal timing of amputation may help decrease the incidence and mortality of ARF in the wounded with severe limb injury and ischemic necrosis.

Chang-zhi CHENG; Dong-hai ZHAO; Quan-yue LI; Hai-yan QU; Bo-cheng CHEN; Zhou-dan LIN

2011-01-01

216

Directory of Open Access Journals (Sweden)

Full Text Available The multiple linear regression formula for the probability of the averaged daily solar energy reaching a specific location on the earth's surface in a calendar month was obtained under the assumption that the arrival process of clouds and solar energy during the day follows the exponential distribution. This formula enables a user to obtain required information such as the maximum probability for the averaged daily solar energy and the corresponding amount of cloud. In addition, the cumulative distribution function of this probability was obtained.

Mohammed Mohammed El Genidy

2012-01-01

217

UK PubMed Central (United Kingdom)

Cells use common signaling molecules for the selective control of downstream gene expression and cell-fate decisions. The relationship between signaling molecules and downstream gene expression and cellular phenotypes is a multiple-input and multiple-output (MIMO) system and is difficult to understand due to its complexity. For example, it has been reported that, in PC12 cells, different types of growth factors activate MAP kinases (MAPKs) including ERK, JNK, and p38, and CREB, for selective protein expression of immediate early genes (IEGs) such as c-FOS, c-JUN, EGR1, JUNB, and FOSB, leading to cell differentiation, proliferation and cell death; however, how multiple-inputs such as MAPKs and CREB regulate multiple-outputs such as expression of the IEGs and cellular phenotypes remains unclear. To address this issue, we employed a statistical method called partial least squares (PLS) regression, which involves a reduction of the dimensionality of the inputs and outputs into latent variables and a linear regression between these latent variables. We measured 1,200 data points for MAPKs and CREB as the inputs and 1,900 data points for IEGs and cellular phenotypes as the outputs, and we constructed the PLS model from these data. The PLS model highlighted the complexity of the MIMO system and growth factor-specific input-output relationships of cell-fate decisions in PC12 cells. Furthermore, to reduce the complexity, we applied a backward elimination method to the PLS regression, in which 60 input variables were reduced to 5 variables, including the phosphorylation of ERK at 10 min, CREB at 5 min and 60 min, AKT at 5 min and JNK at 30 min. The simple PLS model with only 5 input variables demonstrated a predictive ability comparable to that of the full PLS model. The 5 input variables effectively extracted the growth factor-specific simple relationships within the MIMO system in cell-fate decisions in PC12 cells.

Akimoto Y; Yugi K; Uda S; Kudo T; Komori Y; Kubota H; Kuroda S

2013-01-01

218

Regression analysis of ESR/TL dose-response data

International Nuclear Information System (INIS)

Methods are described for the analysis of ESR (electron spin resonance) or TL (thermoluminescence) dose-response data. When fitting data to a straight line, an expression is derived which allows the error in the accumulated dose, AD, to be estimated. For fitting data to a saturating exponential, the simplex algorithm with quadratic convergence is proposed. This allows the errors in the parameters, including the AD, to be estimated. An alternative method for estimating the parameter errors, using analytical expressions for the required partial derivatives, is also described. These techniques are more satisfactory than jackknifing for estimating uncertainties in ADs. (author).

1992-01-01
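
The straight-line case in the record above can be illustrated with a minimal sketch. It assumes an additive-dose layout in which the signal is fitted as `signal = a + b * dose` and the accumulated dose is the extrapolated value AD = a / b. The error estimate uses ordinary first-order error propagation for an extrapolated intercept, which is not necessarily the expression derived by the author. The function name and the data are hypothetical.

```python
import math


def accumulated_dose(dose, signal):
    """Fit signal = a + b * dose by least squares and return (AD, standard error).

    AD = a / b is the distance from zero added dose to the point where the
    fitted line reaches zero signal. The standard error is the first-order
    (delta-method) approximation; it is an assumption of this sketch.
    """
    n = len(dose)
    mean_d = sum(dose) / n
    mean_s = sum(signal) / n
    sxx = sum((d - mean_d) ** 2 for d in dose)
    b = sum((d - mean_d) * (s - mean_s) for d, s in zip(dose, signal)) / sxx
    a = mean_s - b * mean_d
    ad = a / b
    # Residual variance with n - 2 degrees of freedom.
    s2 = sum((s - (a + b * d)) ** 2 for d, s in zip(dose, signal)) / (n - 2)
    # The line is extrapolated to dose = -AD, so the distance from the mean
    # dose is (mean_d + ad).
    se = math.sqrt(s2 * (1.0 / n + (mean_d + ad) ** 2 / sxx)) / abs(b)
    return ad, se


# Hypothetical dose-response readings (arbitrary units).
doses = [0.0, 10.0, 20.0, 30.0, 40.0]
signals = [5.1, 9.8, 15.3, 19.9, 25.0]
ad, se = accumulated_dose(doses, signals)
print(f"AD = {ad:.2f} +/- {se:.2f}")
```

The error grows as the extrapolation point moves away from the mean added dose, which is why a small AD relative to the dose range is estimated more precisely.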

219

Random Decrement and Regression Analysis of Traffic Responses of Bridges

DEFF Research Database (Denmark)

The topic of this paper is the estimation of modal parameters from ambient data by applying the Random Decrement technique. Data from the Queensborough Bridge over the Fraser River in Vancouver, Canada have been used. The loads producing the dynamic response are ambient, e.g. wind, traffic and small ground motion. The Random Decrement technique is used to estimate the correlation function or the free decays from the ambient data. From these functions, the modal parameters are extracted using the Ibrahim Time Domain method. The possible influence of the traffic mass load on the bridge is investigated by assuming that the response level of the bridge is dependent on the mass of the vehicle load. The eigenfrequencies of the bridge are estimated as a function of the response level. This indicates the degree of influence of the mass load on the estimated eigenfrequencies. The results of the analysis using the Random Decrement technique are compared with results from an analysis based on fast Fourier transformations.

Asmussen, J. C.; Ibrahim, S. R.

1996-01-01
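
The averaging step behind the Random Decrement technique named in the record above can be sketched as follows. This is a minimal illustration with a level up-crossing trigger, which is one common trigger condition and not necessarily the one used by the authors. The function name and the test signal are hypothetical, and a noise-free sinusoid stands in for a measured ambient response.

```python
import math


def random_decrement(signal, level, length):
    """Average all segments of `length` samples that start where the signal
    crosses `level` from below. Returns (signature, number of segments)."""
    segments = [
        signal[i:i + length]
        for i in range(1, len(signal) - length + 1)
        if signal[i - 1] < level <= signal[i]
    ]
    if not segments:
        raise ValueError("no trigger points found")
    n = len(segments)
    signature = [sum(seg[k] for seg in segments) / n for k in range(length)]
    return signature, n


# Hypothetical response: a sinusoid with a period of 20 samples.
response = [math.sin(2.0 * math.pi * i / 20.0) for i in range(200)]
signature, n_segments = random_decrement(response, level=0.5, length=20)
```

For a randomly excited structure, averaging many triggered segments cancels the random part of the response, and the remaining signature approximates a free decay or correlation function from which modal parameters can then be extracted.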

220

PREDICTION OF GROUND VIBRATIONS IN OPENCAST MINE USING NONLINEAR REGRESSION ANALYSIS

Directory of Open Access Journals (Sweden)

Full Text Available The present work deals with the prediction of ground vibrations in an opencast mine using nonlinear regression analysis. It is very important to control the influence of various blast design parameters in the prediction of ground vibrations. Predictions from nonlinear regression analysis have been compared with actual values observed in the field and are very close to the field values. Three cases have been considered and the ground vibrations predicted. In the second case, the obtained results matched the measured field values very closely. Thus the nonlinear regression model can be applied to the prediction of ground vibrations in opencast mines.

Dr.Y.SEETHARAMA RAO

2012-01-01

221

UK PubMed Central (United Kingdom)

OBJECTIVES: To develop a novel statistical method for analysis of longitudinal DTI data in individual subjects. MATERIALS AND METHODS: The proposed SPatial REgression Analysis of Diffusion tensor imaging (SPREAD) method incorporates a spatial regression fitting of DTI data among neighboring voxels and a resampling method among data at different times. Both numerical simulations and real DTI data from healthy volunteers and multiple sclerosis (MS) patients were used in the study to evaluate this method. RESULTS: Statistical inference based on SPREAD was shown to perform well through both group comparisons among simulated DTI data of individuals (especially when the group size is smaller than 5) and longitudinal comparisons of human DTI data within the same individual. CONCLUSIONS: When pathological changes of neurodegenerative diseases are heterogeneous in a population, SPREAD provides a unique way to assess abnormality during disease progression at the individual level. Consequently, it has the potential to shed light on how the brain has changed as a result of disease or injury.

Zhu T; Hu R; Tian W; Ekholm S; Schifitto G; Qiu X; Zhong J

2013-10-01

222

A Prediction on the Pollution Level of Outdoor Insulator with Regression Analysis

Energy Technology Data Exchange (ETDEWEB)

The degree of contamination on an outdoor insulator is one of the most important factors in determining the pollution level of outdoor insulation, and sea salt is known as the most dangerous pollutant. As shown in the preceding study, the generation of salt pollutant and the pollution degree of outdoor insulators are closely related to meteorological conditions, such as wind velocity, wind direction, precipitation and so forth. In this paper, we therefore investigated a prediction method, a statistical estimation technique for the equivalent salt deposit density of outdoor insulators based on multiple linear regression analysis. From the results of the analysis, we demonstrated the superiority of the prediction method, in which the variables had a very close correlation (coefficient of about 0.9). The results could be applied to establish a Pollution Prediction System for power utilities, and the system could provide invaluable information for the design and maintenance of outdoor insulation systems. (author). 14 refs., 11 figs., 2 tabs.

Choi, N.H.; Han, S.O. [Chungnam National University, Daejeon (Korea); Koo, K.W. [Youngdong University, Youngdong (Korea)

2003-03-01

223

This study encompasses air surface temperature (AST) modeling in the lower atmosphere. Data on four atmospheric pollutant gases (CO, O3, CH4, and H2O), retrieved from the National Aeronautics and Space Administration Atmospheric Infrared Sounder (AIRS) for 2003 to 2008, were employed to develop a model to predict AST values in the Malaysian peninsula using the multiple regression method. For the entire period, the pollutants were highly correlated (R=0.821) with predicted AST. Comparisons among five stations in 2009 showed close agreement between the predicted AST and the observed AST from AIRS, especially in the southwest monsoon (SWM) season, within 1.3 K, and for in situ data, within 1 to 2 K. The validation of predicted AST against AST from AIRS showed high correlation coefficients (R=0.845 to 0.918), indicating the model's efficiency and accuracy. Statistical analysis in terms of β showed that H2O (0.565 to 1.746) tended to contribute significantly to high AST values during the northeast monsoon season. Generally, these results clearly indicate the advantage of using the satellite AIRS data and a correlation analysis study to investigate the impact of atmospheric greenhouse gases on AST over the Malaysian peninsula. A model was developed that is capable of retrieving the Malaysian peninsular AST in all weather conditions, with total uncertainties ranging between 1 and 2 K.

Rajab, Jasim Mohammed; Jafri, Mohd. Zubir Mat; Lim, Hwee San; Abdullah, Khiruddin

2012-10-01

224

DEFF Research Database (Denmark)

This paper proposes a simple method for estimating emissions and predicted environmental concentrations (PECs) in water and air for organic chemicals that are used in household products and industrial processes. The method has been tested on existing data for 63 organic high-production volume chemicals available in the European Chemicals Bureau risk assessment reports (RARs). The method suggests a simple linear relationship between Henry's Law constant, octanol-water coefficient, use and production volumes, and emissions and PECs on a regional scale in the European Union. Emissions and PECs are a result of a complex interaction between chemical properties, production and use patterns and geographical characteristics. A linear relationship cannot capture these complexities; however, it may be applied at a cost-efficient screening level for suggesting critical chemicals that are candidates for an in-depth risk assessment. Uncertainty measures are not available for the RAR data; however, uncertainties for the applied regression models are given in the paper. Evaluation of the methods reveals that between 79% and 93% of all emission and PEC estimates are within one order of magnitude of the reported RAR values. Bearing in mind that the domain of the method comprises organic industrial high-production volume chemicals, four chemicals, prioritized in the Water Framework Directive and the Stockholm Convention on Persistent Organic Pollutants, were used to test the method for estimated emissions and PECs, with corresponding uncertainty intervals, in air and water at regional EU level.

Fauser, Patrik; Thomsen, Marianne

2010-01-01
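
The evaluation criterion quoted in the record above (the share of estimates within one order of magnitude of the reported values) can be computed as in the following minimal sketch. The function name and the numbers are hypothetical and are not taken from the risk assessment reports.

```python
import math


def fraction_within_factor(estimated, reported, factor=10.0):
    """Share of estimates whose ratio to the reported value lies within
    [1 / factor, factor]. All values must be strictly positive."""
    limit = math.log10(factor)
    hits = sum(
        1
        for est, rep in zip(estimated, reported)
        if abs(math.log10(est / rep)) <= limit
    )
    return hits / len(reported)


# Hypothetical emission estimates and reference values (same units).
estimates = [1.0, 5.0, 200.0, 0.02]
references = [2.0, 60.0, 100.0, 1.0]
share = fraction_within_factor(estimates, references)
print(f"{share:.0%} of estimates are within one order of magnitude")
```

Working on the log10 scale treats over- and underestimation symmetrically, which suits quantities that span several orders of magnitude.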

225

Directory of Open Access Journals (Sweden)

Full Text Available This study was done in the Forest Park of Noor. To determine the tree-ring response to climatic variations, 35 cores were taken from a dominant natural stand of common ash (Fraxinus excelsior L.). The aim of this study was to find, using multiple regression models, which climatic variables affect the ring-width growth of ash in the current growing year and in previous years (one, two and three years before the current growing year) in the north of Iran. In total, 85 annual, monthly, seasonal and growing-season climatic variables of precipitation, temperature, heat index, evapotranspiration and water balance were analyzed. The best multiple regression models explained 83 percent of the total variance of the growth of common ash. The results show that the growth of common ash was related more to the previous years' climatic variations than to those of the current year. The most effective role of climatic variations was due to the first and second preceding years (55%). Evapotranspiration of July and September, and precipitation of May in the second and precipitation of March in the third previous years, all positively affected the growth of this species. This study revealed that ash favors warmer conditions early and in the middle of the growing season when moisture is available, and precipitation in the months of the early growing season (Ordibehesht-Khordad of the two previous years).

H. Jalilvand

2008-01-01

226

Directory of Open Access Journals (Sweden)

Full Text Available Many research groups have been studying the contribution of tropical forests to the global carbon cycle, and the climatic consequences of replacing forests with pastures. Considering that soil CO2 efflux is the largest component of the carbon cycle of the biosphere, this work found an equation for estimating the soil CO2 efflux of an area of the Transition Forest, using a multiple regression model for time series data of temperature and soil moisture. The study was carried out in the northwest of Mato Grosso, Brazil (11°24.75’S; 55°19.50’W), in a transition forest between cerrado and Amazon Forest, 50 km from Sinop county. Each month, throughout one year, soil CO2 efflux, temperature and soil moisture were measured. The annual average soil CO2 efflux was 7.5 ± 0.6 (mean ± SE) μmol m-2 s-1, and the annual mean soil temperature was 25.06 ± 0.12 (mean ± SE) ºC. The study indicated that the humidity had a high influence on soil CO2 efflux; however, the results were more significant using a multiple regression model that estimated the logarithm of soil CO2 efflux, considering time, soil moisture and the interaction between time duration and the inverse of soil temperature.

Carla Maria Abido Valentini; Mariano Martínez Espinosa; Sérgio Roberto de Paulo

2008-01-01

227

The aim of the study reported here was to develop a regression equation for predicting oral clearance of various kinds of drugs in humans using experimental data from rats and dogs and molecular structural parameters. The data concerning the oral clearance of 87 drugs in rats, dogs, and humans were obtained from the literature. The compounds have various structures, pharmacological activities, and pharmacokinetic characteristics. In addition, the molecular weight, calculated partition coefficient (c log P), and the number of hydrogen bond acceptors were used as possible descriptors related to oral clearance in humans. Multivariate regression analyses, multiple linear regression analysis, and the partial least squares (PLS) method were used to predict oral clearance in humans, and the predictive performances of these techniques were compared with those of allometric approaches, which have been used in interspecies scaling. Interaction terms were also introduced into the regression analysis to evaluate the nonlinear relationship. For the data set used in this study, the PLS model with the tertiary term descriptors gave the best predictive performance, and the value of the squared cross-validated correlation coefficient (q²) was 0.694. This PLS model, using animal oral clearance data for only two species and easily calculated molecular structural parameters, can generally predict oral clearance in humans better than the allometric approaches. In addition, the molecular structural parameters and the interaction term descriptors were useful for predicting oral clearance in humans by PLS. Another advantage of this PLS model is that it can be applied to drugs with various characteristics. PMID:14603488

Wajima, Toshihiro; Fukumura, Kazuya; Yano, Yoshitaka; Oguma, Takayoshi

2003-12-01
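
The squared cross-validated correlation coefficient reported in the record above can be illustrated with a leave-one-out calculation. The sketch below uses a single-predictor least-squares line as a stand-in for the PLS model, and it assumes the common definition q² = 1 - PRESS / SS_total with the total sum of squares taken about the mean of all observations. The function names and the data are hypothetical.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def q_squared(x, y):
    """Leave-one-out q^2: each point is predicted from a model fitted
    to all the other points."""
    press = 0.0
    for i in range(len(x)):
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (intercept + slope * x[i])) ** 2
    mean_y = sum(y) / len(y)
    ss_total = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_total


# Hypothetical log-scale animal clearance (x) and human clearance (y).
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_obs = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
q2 = q_squared(x_obs, y_obs)
```

Unlike the ordinary R², q² penalizes models that fit the training points well but predict held-out points poorly, so it can be much lower than R² and can even be negative.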

228

Energy Technology Data Exchange (ETDEWEB)

In this paper, the most relevant multiple regression models for sales forecasting of gas stations, developed over the past ten years, are reviewed. The most significant variables related to gas station sales, the types of multiple regression models (linear or non-linear), their most common uses in supporting decision making and their limits are presented. The predictive power of each model and its impact on decision-making, such as sensitivity analysis and confidence intervals for independent variables, are also discussed. Four models are presented, based on studies conducted in South Africa, Portugal and Brazil. In conclusion, suggestions for future developments are presented based on past developments. (author)

Wanke, Peter [Universidade Federal do Rio de Janeiro (UFRJ), RJ (Brazil). Instituto de Pesquisa e Pos-Graduacao em Administracao de Empresas (COPPEAD). Centro de Estudos em Logistica

2004-07-01

229

Regression Models for Demand Reduction based on Cluster Analysis of Load Profiles

Energy Technology Data Exchange (ETDEWEB)

This paper provides new regression models for the demand reduction of Demand Response programs, for the purpose of ex ante evaluation of the programs and screening for recruiting customer enrollment into the programs. The proposed regression models employ load sensitivity to outside air temperature and a representative load pattern derived from cluster analysis of customer baseline load as explanatory variables. The performance of the proposed models was examined from the viewpoint of the validity of the explanatory variables and the fit of the regressions, using actual load profile data of Pacific Gas and Electric Company's commercial and industrial customers who participated in the 2008 Critical Peak Pricing program, including Manual and Automated Demand Response.

Yamaguchi, Nobuyuki; Han, Junqiao; Ghatikar, Girish; Piette, Mary Ann; Asano, Hiroshi; Kiliccote, Sila

2009-06-28

230

Covariate imbalance and adjustment for logistic regression analysis of clinical trial data.

UK PubMed Central (United Kingdom)

In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This article uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be prespecified. Unplanned adjusted analyses should be considered secondary. Results suggest that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored.

Ciolino JD; Martin RH; Zhao W; Jauch EC; Hill MD; Palesch YY

2013-01-01

231

Energy Technology Data Exchange (ETDEWEB)

Relationships of ultimate and proximate analysis of 4540 US coal samples from 25 states with gross calorific value (GCV) have been investigated by regression and artificial neural network (ANN) methods. Three sets of inputs, (a) volatile matter, ash and moisture; (b) C, H, N, O, S and ash; (c) C, H (exclusive of moisture), N, O (exclusive of moisture), S, moisture and ash, were used for the prediction of GCV by regression and ANNs. The multivariable regression studies have shown that model (c) is the most suitable estimator of GCV. Running the best-arranged ANN structures for models (a) to (c) and assessing the errors have shown that the ANNs are not better than or much different from regression, as a common and understood technique, in the prediction of uncomplicated relationships between proximate and ultimate analysis and coal GCV. (author)

Mesroghli, Sh.; Jorjani, E.; Chehreh Chelgani, S. [Department of Mining Engineering, Science and Research Branch, Islamic Azad University, Poonak, Hesarak, Tehran (Iran)

2009-07-01

232

Directory of Open Access Journals (Sweden)

Full Text Available Logistic Regression (LR) is a well known classification method in the field of statistical learning. It allows probabilistic classification and shows promising results on several benchmark problems. Logistic regression enables us to investigate the relationship between a categorical outcome and a set of explanatory variables. Artificial Neural Networks (ANNs) are popularly used as universal non-linear inference models and have gained extensive popularity in recent years. Research activities are considerable and literature is growing. The goal of this research work is to compare the performance of logistic regression and neural network models on publicly available medical datasets. The evaluation process of the model is as follows. The logistic regression and neural network methods with sensitivity analysis have been evaluated for the effectiveness of the classification. The classification accuracy is used to measure the performance of both the models. From the experimental results it is confirmed that the neural network model with sensitivity analysis gives a more efficient result.

Raghavendra B.K. & S.K. Srivatsa

2011-01-01

233

THE THEORY AND APPLICATION OF REGRESSION ANALYSIS AND THE LEAST-SQUARES PRINCIPLE

Directory of Open Access Journals (Sweden)

Full Text Available The theory and practice of regression analysis, and the principle of least squares on which it is based, are frequently encountered in Mathematics and particularly Statistical Mathematics, but less well known are some very useful applications in a military environment. It is therefore the aim of this article firstly to give a general description of the theory of regression analysis, and secondly to highlight some military applications of the theory.

P. De Viliers

2012-01-01

234

UK PubMed Central (United Kingdom)

OBJECTIVE: The objective of the study is to determine whether Alcoholics Anonymous (AA) participation leads to reduced drinking and problems related to drinking within Project MATCH (Matching Alcoholism Treatments to Client Heterogeneity), an existing national alcoholism treatment data set. METHOD: The method used is structural equation modeling of panel data with cross-lagged partial regression coefficients. The main advantage of this technique for the analysis of AA outcomes is that potential reciprocal causation between AA participation and drinking behavior can be explicitly modeled through the specification of finite causal lags. RESULTS: For the outpatient subsample (n = 952), the results strongly support the hypothesis that AA attendance leads to increases in alcohol abstinence and reduces drinking/problems, whereas a causal effect in the reverse direction is unsupported. For the aftercare subsample (n = 774), the results are not as clear but also suggest that AA attendance leads to better outcomes. CONCLUSIONS: Although randomized controlled trials are the surest means of establishing causal relations between interventions and outcomes, such trials are rare in AA research for practical reasons. The current study successfully exploited the multiple data waves in Project MATCH to examine evidence of causality between AA participation and drinking outcomes. The study obtained unique statistical results supporting the effectiveness of AA primarily in the context of primary outpatient treatment for alcoholism.

Magura S; Cleland CM; Tonigan JS

2013-05-01

235

UK PubMed Central (United Kingdom)

Recently, some individual differences in rats have been shown to be related to stress-induced physiological responses. As yet there has been no attempt to incorporate measurement techniques from the psychometric field into this line of research. The present study was conducted to examine the utility of applying such methods to animal research. Physiological responses to cold-restraint stress in 26 male rats were investigated using a factor analytic-multiple regression procedure. Nineteen behavioral and physiological measures obtained during open-field testing, motor activity monitoring, and passive avoidance learning were subjected to a principal components factor analysis. Five factors were extracted which reflected exploratory behavior, general activity level, metabolic rate, behavioral reactivity, and autonomic reactivity. The obtained factor scores were used to predict physiological responses to four hours of supine cold-restraint using a step-wise multiple regression procedure. General activity level was the best predictor of adrenal weight and temperature loss. Autonomic reactivity was the best predictor of ulcer incidence and severity. Applications of these statistical procedures are discussed.

Ossenkopp KP; Mazmanian DS

1985-06-01

236

Energy Technology Data Exchange (ETDEWEB)

The purpose of this study was to examine the relationship between turnover and productivity. Turnover was classified into two distinct categories, functional and dysfunctional. Functional turnover was defined as the loss of a non-valued employee and dysfunctional turnover as the loss of a valued employee. It was hypothesized that functional turnover would have positive results. Also, it was hypothesized that dysfunctional turnover would produce negative results, that is, productivity would decrease. The hypotheses were tested using hierarchical multiple regression at the 0.05 level of significance. Out of 476 regression equations associated with functional turnover, less than 3% resulted in rejection of the null hypothesis. In addition, out of 329 regression equations associated with dysfunctional turnover, less than 1% resulted in rejection of the null hypothesis. Implications for organizations are discussed with emphasis on the value of incorporating employee evaluations in their turnover reporting procedures.

Lawrence, J.E.

1985-01-01

237

We proposed a passive method to distinguish Cyanobacteria (Cyanophyceae) in a mixed population and to estimate their density by measuring the spectral absorption of sample waters, based on a two-step linear regression analysis. Natural freshwater usually contains a few species of algae and dissolved organic carbon (DOC). In the experiment, we picked out four typical algal groups characterized by their own colors: Cyanophyceae or blue-green algae, Chlorophyceae or green algae, and Bacillariophyceae and Dinophyceae or brown algae. In the first step, for each of the pure sample waters, which contained only one of these elemental substances, the dependence of the spectral characteristic on its density was determined using simple linear regression analysis. The resultant spectral characteristics, which we call gradient vectors, were used to estimate the spectral absorption of mixed sample waters containing the four elementary algae and DOC by multiple linear regression analysis. This method offers new perspectives for the identification and density estimation of blue-green algae and other unialgal species in a mixed population.

Lokuhewage, Asha Udayamali M.; Naiki, Yasuhiro; Toyooka, Satoru
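
The two-step procedure described in the record above can be sketched as follows, under simplifying assumptions: only two components instead of four algal groups plus DOC, three wavelength bands, noise-free data, and no intercept in the second step. All names and numbers are hypothetical.

```python
def slope(x, y):
    """Slope of the simple least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Step 1: pure samples. For each component, absorbance per wavelength band
# is measured at several known densities; the per-band slope of absorbance
# on density gives that component's gradient vector.
densities = [1.0, 2.0, 3.0]
pure_spectra = {
    "cyanobacteria": [[0.2, 0.5, 0.1], [0.4, 1.0, 0.2], [0.6, 1.5, 0.3]],
    "green_algae": [[0.3, 0.1, 0.4], [0.6, 0.2, 0.8], [0.9, 0.3, 1.2]],
}
gradient = {
    name: [slope(densities, [s[band] for s in spectra]) for band in range(3)]
    for name, spectra in pure_spectra.items()
}


def unmix(mixed, g1, g2):
    """Step 2: least-squares regression of a mixed spectrum on two gradient
    vectors (normal equations solved by Cramer's rule)."""
    a11 = sum(a * a for a in g1)
    a12 = sum(a * b for a, b in zip(g1, g2))
    a22 = sum(b * b for b in g2)
    b1 = sum(a * m for a, m in zip(g1, mixed))
    b2 = sum(b * m for b, m in zip(g2, mixed))
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


# A synthetic mixture: 2.0 density units of the first component and 1.5 of
# the second.
g_cyano, g_green = gradient["cyanobacteria"], gradient["green_algae"]
mixed_spectrum = [2.0 * a + 1.5 * b for a, b in zip(g_cyano, g_green)]
d_cyano, d_green = unmix(mixed_spectrum, g_cyano, g_green)
```

The second step only works when the gradient vectors are not proportional to one another; components with nearly identical spectral shapes make the normal equations ill-conditioned and the density estimates unstable.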

238

Nonlinear Robust Regression Using Kernel Principal Component Analysis and R-Estimators

Directory of Open Access Journals (Sweden)

Full Text Available In recent years, many algorithms based on kernel principal component analysis (KPCA) have been proposed, including kernel principal component regression (KPCR). KPCR can be viewed as a non-linearization of principal component regression (PCR), which uses ordinary least squares (OLS) for estimating its regression coefficients. We use PCR to remove the negative effects of multicollinearity in regression models. However, it is well known that the main disadvantage of OLS is its sensitivity to the presence of outliers. Therefore, KPCR can be inappropriate for data sets containing outliers. In this paper, we propose a novel nonlinear robust technique using a hybridization of KPCA and R-estimators. The proposed technique is compared to KPCR and gives better results than KPCR.

Antoni Wibowo; Mohammad Ishak Desa

2011-01-01

239

Categorical Variables in Regression Analysis: A Comparison of Dummy and Effect Coding

Directory of Open Access Journals (Sweden)

Full Text Available The use of categorical variables in regression involves the application of coding methods. The purpose of this paper is to describe how categorical independent variables can be incorporated into regression by virtue of two coding methods: dummy and effect coding. The paper discusses the uses, interpretations, and underlying assumptions of each method. In general, overall results of the regression are unaffected by the methods used for coding the categorical independent variables. In any of the methods, the analysis tests whether group membership is related to the dependent variables. Both methods yield identical R2 and F. However, the interpretations of the intercept and regression coefficients depend on what coding method has been applied and whether the groups have equal sample sizes.

Hussain Alkharusi

2012-01-01
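The claim in the abstract above, that dummy and effect coding give the same overall fit but different intercepts, can be checked on a toy example. The sketch below is an illustration only: the three groups and their values are hypothetical, the groups have equal sizes, and the helper functions are written for this example rather than taken from the paper.

```python
# Dummy vs effect coding on a hypothetical balanced three-group data set.

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def least_squares(X, y):
    """Ordinary least squares via the normal equations."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve_linear(XtX, Xty)

def r_squared(X, y, beta):
    """Coefficient of determination of the fitted model."""
    mean_y = sum(y) / len(y)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst

groups = ["A"] * 3 + ["B"] * 3 + ["C"] * 3   # C is the reference group
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

def dummy_row(g):
    # Reference group C is coded 0 on both indicator columns.
    return [1.0, float(g == "A"), float(g == "B")]

def effect_row(g):
    # Reference group C is coded -1 on both columns.
    if g == "C":
        return [1.0, -1.0, -1.0]
    return [1.0, float(g == "A"), float(g == "B")]

X_dummy = [dummy_row(g) for g in groups]
X_effect = [effect_row(g) for g in groups]
beta_dummy = least_squares(X_dummy, y)
beta_effect = least_squares(X_effect, y)
r2_dummy = r_squared(X_dummy, y, beta_dummy)
r2_effect = r_squared(X_effect, y, beta_effect)

print("dummy :", beta_dummy, r2_dummy)    # intercept = mean of reference group
print("effect:", beta_effect, r2_effect)  # intercept = mean of the group means
```

In this balanced case the dummy-coded intercept is the mean of the reference group and the effect-coded intercept is the unweighted mean of the group means; with unequal group sizes the latter no longer equals the overall sample mean, which is the caveat the abstract raises.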

240

UK PubMed Central (United Kingdom)

The work that is reported here concerns a method that allows the simultaneous determination of cadmium (II) and zinc (II) in aqueous solution by molecular fluorescence spectroscopy using 9-(1',4',7',10',13'-pentaazacyclopentadecyl)-methylanthracene. For this chemosensor, the fluorophore pi-system is insulated from an azacrown donor by one methylene group. A self-quenching mechanism, resulting from an electron transfer from the nitrogens of the azacrown to the excited aromatic system, essentially precludes fluorescence emission. Fluorescence is restored when cadmium (II) or zinc (II) are chelated by the macrocycle. The difference between the emission spectra profiles of the free chemosensor, the cadmium and the zinc chelates is such that the concentration determination of the two metals and the remaining free chemosensor is possible at the nanomolar scale in only one experiment using a multiple linear regression algorithm. Usefulness and convenience of this simple method is proven by steady state and kinetic quantitative determination experiments.

Yunus S; Charles S; Dubois F; Vander Donckt E

2008-03-01

241

Directory of Open Access Journals (Sweden)

Full Text Available Habitat degradation and loss have been widely recognized as the main cause for the decline of wildlife populations. Evaluating the quality of wildlife habitat can provide essential information for wildlife refuge design and management. The purpose of this study was to produce georeferenced ecological information about suitable habitats available for muntjac, Muntiacus muntjak, in Chandoli tiger reserve, India (17° 04' 00" N to 17° 19' 54" N and 73° 40' 43" E to 73° 53' 09" E). Habitats were evaluated using multiple logistic regression integrated with remote sensing and geographic information system. Satellite imagery of LISS-III of IRS-P6 of the study area was digitally processed. To generate collateral data, topographic maps were analysed in a GIS framework. Layers of different variables such as land use/land cover, forest density, proximity to disturbances and water resources, and a digital terrain model were created from satellite and topographic sheets. These layers, along with GPS locations of muntjac presence/absence and multiple logistic regression (MLR) techniques, were integrated in a GIS environment to model the habitat suitability index of muntjac. The results indicate that approximately 222.39 km² (75.4%) of the forest of the tiger reserve was least suitable for muntjac, whereas 29.53 km² (10.02%) was moderately suitable, 22.12 km² (7.5%) suitable and 20.70 km² (7.0%) highly suitable. The accuracy level of this model was 97.6%. The model can be considered as potent enough to advocate that forests of this area are most appropriate for declaring it as a reserve for muntjac conservation, ultimately to provide prey base for tiger.

Imam EKWAL; Hussain TAHIR; Mary TAHIR

2012-01-01

242

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background During community epidemics, infections may be imported within hospital and transmitted to hospitalized patients. Hospital outbreaks of communicable diseases have been increasingly reported during the last decades and have had significant consequences in terms of patient morbidity, mortality, and associated costs. Quantitative studies are thus needed to estimate the risks of communicable diseases among hospital patients, taking into account the epidemiological process outside, hospital and host-related risk factors of infection and the role of other patients and healthcare workers as sources of infection. Methods We propose a multiplicative hazard regression model to analyze the risk of acquiring a communicable disease by patients at hospital. This model derives from epidemiological data on communicable disease epidemics in the community, hospital ward, patient susceptibility to infection, and exposure of patients to infection at hospital. The model estimates the relative effect of each of these factors on a patient's risk of communicable disease. Results Using individual data on patients and health care workers in a teaching hospital during the 2004-2005 influenza season in Lyon (France), we show the ability of the model to assess the risk of influenza-like illness among hospitalized patients. The significant effects on the risk of influenza-like illness were those of old age, exposure to infectious patients or health care workers, and a stay in a medical care unit. Conclusions The proposed multiplicative hazard regression model could be an interesting epidemiological tool to quantify the risk of communicable disease at hospital during community epidemics and the uncertainty inherent in such quantification. Furthermore, key epidemiological, environmental, host, or exposure factors that influence this risk can be identified.

Voirin Nicolas; Roche Sylvain; Vanhems Philippe; Giard Marine; David-Tchouda Sandra; Barret Béatrice; Ecochard René

2011-01-01

243

Seasonal forecasting of Bangladesh summer monsoon rainfall using simple multiple regression model

In this paper, the development of a statistical forecasting method for summer monsoon rainfall over Bangladesh is described. Predictors for Bangladesh summer monsoon (June-September) rainfall were identified from the large-scale ocean-atmospheric circulation variables (i.e., sea-surface temperature, surface air temperature and sea level pressure). The predictors exhibited a significant relationship with Bangladesh summer monsoon rainfall during the period 1961-2007. After a detailed analysis of various global climate datasets, three predictors were selected. The model performance was evaluated during the period 1977-2007. The model showed good performance in hindcasting seasonal monsoon rainfall over Bangladesh. The RMSE and Heidke skill score for the 31 years were 8.13 and 0.37, respectively, and the correlation between the predicted and observed rainfall was 0.74. The BIAS of the forecasts (% of long period average, LPA) was -0.85 and the Hit score was 58%. The experimental forecasts for the 2008 summer monsoon rainfall based on the model were also found to be in good agreement with the observations.

Rahman, Md Mizanur; Rafiuddin, M.; Alam, Md Mahbub

2013-04-01

244

Application of appropriate models to approximate the performance function warrants more precise prediction and helps to make the best decisions in the poultry industry. This study reevaluated the factors affecting hatchability in laying hens from 29 to 56 wk of age. Twenty-eight data lines representing 4 inputs consisting of egg weight, eggshell thickness, egg sphericity, and yolk/albumin ratio and 1 output, hatchability, were obtained from the literature and used to train an artificial neural network (ANN). The prediction ability of ANN was compared with that of fuzzy logic to evaluate the fitness of these 2 methods. The models were compared using R², mean absolute deviation (MAD), mean squared error (MSE), mean absolute percentage error (MAPE), and bias. The developed model was used to assess the relative importance of each variable on the hatchability by calculating the variable sensitivity ratio. The statistical evaluations showed that the ANN-based model predicted hatchability more accurately than fuzzy logic. The ANN-based model had a higher coefficient of determination (R² = 0.99) and lower residual distribution (MAD = 0.005; MSE = 0.00004; MAPE = 0.732; bias = 0.0012) than fuzzy logic (R² = 0.87; MAD = 0.014; MSE = 0.0004; MAPE = 2.095; bias = 0.0046). The sensitivity analysis revealed that the most important variable in the ANN-based model of hatchability was egg weight (variable sensitivity ratio, VSR = 283.11), followed by yolk/albumin ratio (VSR = 113.16), eggshell thickness (VSR = 16.23), and egg sphericity (VSR = 3.63). The results of this research showed that the universal approximation capability of ANN made it a powerful tool to approximate complex functions such as hatchability in the incubation process. PMID:23472039

Mehri, M

2013-04-01
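The goodness-of-fit measures named in the abstract above (R², MAD, MSE, MAPE, bias) can be computed as in the following sketch. The definitions used here are common textbook ones and are an assumption, since the abstract does not spell them out; in particular, bias is taken as the mean of predicted minus observed. The observed and predicted values are hypothetical.

```python
# Common fit metrics for comparing model predictions with observations.
# Definitions are assumed (standard textbook forms), not taken from the paper.

def fit_metrics(observed, predicted):
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    sse = sum(e ** 2 for e in errors)
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "R2": 1.0 - sse / sst,                       # coefficient of determination
        "MAD": sum(abs(e) for e in errors) / n,      # mean absolute deviation
        "MSE": sse / n,                              # mean squared error
        "MAPE": 100.0 * sum(abs(e) / abs(o)          # mean absolute % error
                            for e, o in zip(errors, observed)) / n,
        "bias": sum(errors) / n,                     # mean signed error
    }

# Hypothetical values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = fit_metrics(observed, predicted)
for name, value in metrics.items():
    print(name, round(value, 4))
```

MAPE is undefined when an observed value is zero, so this form only suits strictly positive responses such as hatchability proportions.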

245

UK PubMed Central (United Kingdom)

Application of appropriate models to approximate the performance function warrants more precise prediction and helps to make the best decisions in the poultry industry. This study reevaluated the factors affecting hatchability in laying hens from 29 to 56 wk of age. Twenty-eight data lines representing 4 inputs consisting of egg weight, eggshell thickness, egg sphericity, and yolk/albumin ratio and 1 output, hatchability, were obtained from the literature and used to train an artificial neural network (ANN). The prediction ability of ANN was compared with that of fuzzy logic to evaluate the fitness of these 2 methods. The models were compared using R², mean absolute deviation (MAD), mean squared error (MSE), mean absolute percentage error (MAPE), and bias. The developed model was used to assess the relative importance of each variable on the hatchability by calculating the variable sensitivity ratio. The statistical evaluations showed that the ANN-based model predicted hatchability more accurately than fuzzy logic. The ANN-based model had a higher coefficient of determination (R² = 0.99) and lower residual distribution (MAD = 0.005; MSE = 0.00004; MAPE = 0.732; bias = 0.0012) than fuzzy logic (R² = 0.87; MAD = 0.014; MSE = 0.0004; MAPE = 2.095; bias = 0.0046). The sensitivity analysis revealed that the most important variable in the ANN-based model of hatchability was egg weight (variable sensitivity ratio, VSR = 283.11), followed by yolk/albumin ratio (VSR = 113.16), eggshell thickness (VSR = 16.23), and egg sphericity (VSR = 3.63). The results of this research showed that the universal approximation capability of ANN made it a powerful tool to approximate complex functions such as hatchability in the incubation process.

Mehri M

2013-04-01

246

DEFF Research Database (Denmark)

This paper analyses multivariate high frequency financial data using realized covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis, and covariance. It will be based on a fixed interval of time (e.g., a day or week), allowing the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions, and covariances change through time. In particular we provide confidence intervals for each of these quantities.

Barndorff-Nielsen, Ole Eiler; Shephard, N.

2004-01-01

247

UK PubMed Central (United Kingdom)

Self-help treatments have the potential to increase the availability and affordability of evidence-based treatments for anxiety disorders. Although promising, previous research results are heterogeneous, indicating a need to identify factors that moderate treatment outcome. The present article reviews the literature on self-help treatment for anxiety disorders among adults, with a total sample of 56 articles with 82 comparisons. When self-help treatment was compared to wait-list or placebo, a meta-analysis indicated a moderate to large effect size (g=0.78). When self-help treatment was compared to face-to-face treatment, results indicated a small effect that favored the latter (g=-0.20). When self-help was compared to wait-list or placebo, subgroup analyses indicated that self-help treatment format, primary anxiety diagnosis and procedures for recruitment of subjects were related to treatment outcome in bivariate analyses, but only recruitment procedures remained significant in a multiple meta-regression analysis. When self-help was compared to face-to-face treatment, a multiple meta-regression indicated that the type of comparison group, treatment format and gender were significantly related to outcome. We conclude that self-help is effective in the treatment of anxiety disorders, and should be offered as part of stepped care treatment models in community services. Implications of the results and future directions are discussed.

Haug T; Nordgreen T; Öst LG; Havik OE

2012-07-01

248

UK PubMed Central (United Kingdom)

Minamata disease (MD) was caused by ingestion of seafood from the methylmercury-contaminated areas. Although 50 years have passed since the discovery of MD, there have been only a few studies on the temporal profile of neurological findings in certified MD patients. Thus, we evaluated changes in neurological symptoms and signs of MD using discriminants by multiple logistic regression analysis. The severity of predictive index declined in 25 years in most of the patients. Only a few patients showed aggravation of neurological findings, which was due to complications such as spino-cerebellar degeneration. Patients with chronic MD aged over 45 years had several concomitant diseases so that their clinical pictures were complicated. It was difficult to differentiate chronic MD using statistically established discriminants based on sensory disturbance alone. In conclusion, the severity of MD declined in 25 years along with the modification by age-related concomitant disorders.

Uchino M; Hirano T; Satoh H; Arimura K; Nakagawa M; Wakamiya J

2005-01-01

249

Minamata disease (MD) was caused by ingestion of seafood from the methylmercury-contaminated areas. Although 50 years have passed since the discovery of MD, there have been only a few studies on the temporal profile of neurological findings in certified MD patients. Thus, we evaluated changes in neurological symptoms and signs of MD using discriminants by multiple logistic regression analysis. The severity of predictive index declined in 25 years in most of the patients. Only a few patients showed aggravation of neurological findings, which was due to complications such as spino-cerebellar degeneration. Patients with chronic MD aged over 45 years had several concomitant diseases so that their clinical pictures were complicated. It was difficult to differentiate chronic MD using statistically established discriminants based on sensory disturbance alone. In conclusion, the severity of MD declined in 25 years along with the modification by age-related concomitant disorders. PMID:15635274

Uchino, Makoto; Hirano, Teruyuki; Satoh, Hiroshi; Arimura, Kimiyoshi; Nakagawa, Masanori; Wakamiya, Jyunji

2005-01-01

250

The validation of an analytical procedure means the evaluation of some performance criteria such as accuracy, sensitivity, linear range, capability of detection, selectivity, calibration curve, etc. This implies the use of different statistical methodologies, some of them related with statistical regression techniques, which may be robust or not. The presence of outlier data has a significant effect on the determination of sensitivity, linear range or capability of detection amongst others, when these figures of merit are evaluated with non-robust methodologies. In this paper some of the robust methods used for calibration in analytical chemistry are reviewed: the Huber M-estimator; the Andrews, Tukey and Welsh GM-estimators; the fuzzy estimators; the constrained M-estimators, CM; the least trimmed squares, LTS. The paper also shows that the mathematical properties of the least median squares (LMS) regression can be of great interest in the detection of outlier data in chemical analysis. A comparative analysis is made of the results obtained by applying these regression methods to synthetic and real data. There is also a review of some applications where this robust regression works in a suitable and simple way that proves very useful to secure an objective detection of outliers. The use of a robust regression is recommended in ISO 5725-5. PMID:18970799

Ortiz, M Cruz; Sarabia, Luis A; Herrero, Ana

2006-02-10
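One of the robust methods listed in the abstract above, the Huber M-estimator, is commonly computed by iteratively reweighted least squares (IRLS). The sketch below applies that idea to a straight-line calibration and is only an illustration: the tuning constant 1.345, the residual scale based on the median absolute residual, and the synthetic data with one planted outlier are all assumptions of this example, not details from the paper.

```python
# Huber M-estimation of a calibration line by IRLS (illustrative sketch).
from statistics import median

def weighted_line(x, y, w):
    """Weighted least-squares line; returns (intercept, slope)."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def huber_line(x, y, c=1.345, iterations=50):
    """Start from OLS, then repeatedly downweight large residuals."""
    a, b = weighted_line(x, y, [1.0] * len(x))
    for _ in range(iterations):
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale estimate from the median absolute residual (assumed).
        scale = median([abs(r) for r in residuals]) / 0.6745
        if scale == 0.0:
            break  # perfect fit for at least half the points; nothing to reweight
        weights = [1.0 if abs(r) <= c * scale else c * scale / abs(r)
                   for r in residuals]
        a, b = weighted_line(x, y, weights)
    return a, b

# Synthetic calibration data: y = 1 + 2x with small alternating noise,
# and a single gross outlier at the last point.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi + (0.1 if i % 2 == 0 else -0.1) for i, xi in enumerate(x)]
y[9] = 50.0

ols_intercept, ols_slope = weighted_line(x, y, [1.0] * len(x))
huber_intercept, huber_slope = huber_line(x, y)
print("OLS slope  :", ols_slope)
print("Huber slope:", huber_slope)
```

Points with small final weights are natural candidates for outlier inspection, which is the practical use the abstract emphasizes. Estimators such as LTS or LMS resist outliers at high-leverage positions better than Huber's estimator does, but they need a different algorithm.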

251

Quantile regression for the statistical analysis of immunological data with many non-detects

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Methods and results Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an implementation to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Conclusion Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.

Eilers Paul HC; Röder Esther; Savelkoul Huub FJ; van Wijk Roy

2012-01-01
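Why quantiles tolerate non-detects, as the abstract above states, can be seen in the simplest case of a single group with no covariates, where quantile regression reduces to a sample quantile. The sketch below is an illustration with hypothetical numbers: a detection limit of 1.0, six non-detects and four detected values. The function name `upper_quantile` is invented for this example.

```python
# A sample quantile above the detection limit does not depend on the value
# substituted for the non-detects, whereas the mean does (hypothetical data).

def upper_quantile(values, pct):
    """Smallest value with at least pct percent of the data at or below it."""
    ordered = sorted(values)
    k = -(-pct * len(ordered) // 100)  # ceil(pct * n / 100) in integer arithmetic
    return ordered[max(k, 1) - 1]

LOD = 1.0                       # hypothetical detection limit
detected = [1.5, 2.0, 3.0, 4.0]
n_nondetects = 6                # more than half of the ten observations

def with_substitute(value):
    """Data set with every non-detect replaced by the given value."""
    return [value] * n_nondetects + detected

q80_zero = upper_quantile(with_substitute(0.0), 80)
q80_half = upper_quantile(with_substitute(LOD / 2), 80)
mean_zero = sum(with_substitute(0.0)) / 10
mean_half = sum(with_substitute(LOD / 2)) / 10

print("80th percentile:", q80_zero, q80_half)  # identical
print("mean           :", mean_zero, mean_half)  # depends on the substitution
```

The same invariance holds for any quantile that lies above the detection limit, which is why a high percentile can still be modeled when most observations are non-detects.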

252

Regression and residual analysis in linear models with interval censored data

Digital Repository Infrastructure Vision for European Research (DRIVER)

This work consists of two parts, both related with regression analysis for interval censored data. Interval censored data x have the property that their value cannot be observed exactly but only the respective interval [xL,xR] which contains the true value x with probability one. In the first part of...

Topp, Rebekka

253

What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

Thomas, Emily H.; Galambos, Nora

2004-01-01

254

Application of logistic regression in an analysis of Polish households’ financial problems

Directory of Open Access Journals (Sweden)

Full Text Available This article attempted to identify the socio-economic and demographic factors influencing the problems with arrears in Polish households. The micro data from Social Diagnosis were used. In order to achieve the main goal the logistic regression analysis was used.

Zbigniew Gołaś; Paulina Anioła

2012-01-01

255

UK PubMed Central (United Kingdom)

Net ecosystem exchange of CO₂ (NEE) over a temperate peatland in northwestern Turkey was directly measured using the eddy covariance (EC) method for 590 days. Both the response variables of diurnal and nocturnal NEE (NEEd and Reco) and the explanatory variables of latent heat (LE), relative humidity (RH), and atmospheric CO₂ and H₂O concentrations (AtmCO₂ and AtmH₂O) were denoised with discrete wavelet transform (DWT) using coiflet (coif10-6). Denoised NEE fluxes and their temporal components were modeled using multiple linear regression (MLR), polynomial regression (PR) and artificial neural network (ANN) models as a function of LE, RH, AtmCO₂, AtmH₂O, air temperature (Tair), day of year (DOY), and local time. Peak NEEd flux and peak Reco efflux were −0.37 mg CO₂ m⁻² s⁻¹ in late July and 0.27 mg CO₂ m⁻² s⁻¹ in mid-August. Mean annual NEE was estimated at −1157 g CO₂ m⁻², which is in agreement with previous results of peatland studies. The use of DWT-augmented ANN, MLR and PR models significantly increased predictive power and reduced uncertainties in predicting the temporal dynamics of the biosphere–atmosphere CO₂ exchange, relative to the models without DWT denoising. Out of 28 DWT-augmented ANNs, multilayer perceptron (MLP) and recurrent network (RN) models were the best diurnal and nocturnal ones, respectively, based on accuracy metrics derived from training, cross-validation and independent validation. Among the DWT-based ANN, MLR and PR models, diurnal MLP and nocturnal MLR outperformed the others. Wavelet-augmented ANN and MLR models appear to be a promising tool to quantify diurnal and nocturnal NEE dynamics, respectively.

Evrendilek F

2013-04-01

256

[Regression analysis in the assessment of the state of the immune system]

UK PubMed Central (United Kingdom)

Optimal assessment of the immune status is one of the key problems in modern clinical immunology. Mathematical methods of diagnosis help select the minimal number of the most informative tests. Regression analysis of linear relationship permitted determining the regression coefficients and correlation pairs of equations which may be used for predicting, with few reference monoclonal sera (CD3 and CD5), the lacking cellular phenotype in immunogram. The resultant correlation pairs of equations were tried in a large number of examinees at laboratory of ecological immunology and are recommended for clinical diagnostical laboratories.

Shchegoleva LS; Dobrodeeva LK; Poskotinova LV; Diuzhikova EM

1999-03-01

257

Automatic regression analysis for use in a complex system of evaluation of plant genetic resources

Directory of Open Access Journals (Sweden)

Full Text Available In accordance with the general requirements regarding computerization in gene banks and germplasm research, a computer program has been compiled for the analysis of univariate response in crop germplasm evaluation. The program is written in COBOL and runs on a FELIX C-256 computer. The different modules of the program allow for: (1) data control and error listing; (2) computation of the regression function; (3) listing of the difference between the values measured and computed; (4) sorting of the individual samples; (5) construction of scattergrams in two dimensions for measured values with the simultaneous representation of the regression line; (6) listing of examined samples in a sequence required in evaluation.

Cs. ARKOSSY; Attila T. SZABO

1984-01-01

258

UK PubMed Central (United Kingdom)

Reliable detection of circadian phase in humans using noninvasive ambulatory measurements in real-life conditions is challenging and still an unsolved problem. The masking effects of everyday behavior and environmental input such as physical activity and light on the measured variables need to be considered critically. Here, we aimed at developing techniques for estimating circadian phase with the lowest subject burden possible, that is, without the need of constant routine (CR) laboratory conditions or without measuring the standard circadian markers, (rectal) core body temperature (CBT), and melatonin levels. In this validation study, subjects (N = 16) wore multi-channel ambulatory monitoring devices and went about their daily routine for 1 week. The devices measured a large number of physiological, behavioral, and environmental variables, including CBT, skin temperatures, cardiovascular and respiratory function, movement/posture, ambient temperature, and the spectral composition and intensity of light received at eye level. Sleep diaries were logged electronically. After the ambulatory phase, subjects underwent a 32-h CR procedure in the laboratory for measuring unmasked circadian phase based on the "midpoint" of the salivary melatonin profile. To overcome the complex masking effects of confounding variables during ambulatory measurements, multiple regression techniques were applied in combination with the cross-validation approach to subject-independent prediction of circadian phase. The most accurate estimate of circadian phase was achieved using skin temperatures, irradiance for ambient light in the blue spectral band, and motion acceleration as predictors with lags of up to 24 h. 
Multiple regression showed statistically significant improvement of variance of prediction error over the traditional approaches to determining circadian phase based on single predictors (motion acceleration or sleep log), although CBT was intentionally not included as the predictor. Compared to CBT alone, our method resulted in a 40% smaller range of prediction errors and a nonsignificant reduction of error variance. The proposed noninvasive measurement method could find applications in sleep medicine or in other domains where knowing the exact endogenous circadian phase is important (e.g., for the timing of light therapy).

Kolodyazhniy V; Späti J; Frey S; Götz T; Wirz-Justice A; Kräuchi K; Cajochen C; Wilhelm FH

2011-02-01

259

UK PubMed Central (United Kingdom)

Since the impoundment of the Three Gorges Reservoir (TGR) in 2003, eutrophication has occurred and has become severe in Daning River. To predict chlorophyll-a (Chl-a) levels, the relationships between Chl-a and 11/13 routine monitoring data on water quality and hydrodynamics in Daning River were studied using principal component scores in a multiple linear regression model (principal component regression (PCR) model). In order to determine the effect of hydrodynamics on simulation accuracy, two 0-day ahead prediction models were established: model A without hydrodynamic factors as variables, and model B with hydrodynamic factors (surface water velocity and water residence time) as variables. Based on the results of correlation analysis, scores 1 and 2, with significant loads of phosphorus and nitrogen nutrients, were omitted in developing model A (R² = 0.355), while score 2, with significant loads of nitrogen, was omitted in developing model B (R² = 0.777). The results of validation using a new dataset showed that model B achieved a better fitted relationship between the predicted and observed values of Chl-a. This indicated that hydrodynamics play an important role in limiting algal growth. The results suggested that a PCR model incorporating hydrodynamic processes is suitable for Chl-a concentration simulation and algal bloom prediction in Daning River of TGR.

Liping W; Binghui Z

2013-01-01

260

Linear regression analysis of the gamma dose in fast neutron beams

International Nuclear Information System (INIS)

The dual dosimeter technique for determining both the absorbed dose of neutrons and photons in a mixed field has been applied to multiple dosimeter use. The data were analyzed by a linear regression method which yields the neutron dose from the slope and the photon dose from the intercept and an estimation of the uncertainty of the photon dose can also be obtained. Measurements were made on a high energy neutron beam and the photon dose obtained both as a function of field size and depth in a tissue equivalent phantom.

1980-01-01

261

Linear regression analysis of the gamma dose in fast neutron beams

Energy Technology Data Exchange (ETDEWEB)

The dual dosimeter technique for determining both the absorbed dose of neutrons and photons in a mixed field has been applied to multiple dosimeter use. The data were analyzed by a linear regression method which yields the neutron dose from the slope and the photon dose from the intercept and an estimation of the uncertainty of the photon dose can also be obtained. Measurements were made on a high energy neutron beam and the photon dose obtained both as a function of field size and depth in a tissue equivalent phantom.

Almond, P.R.; Rosanky, S.L.

1980-07-01
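The regression described in the two records above can be sketched under one common response model, which is an assumption of this example rather than something stated in the abstract: each dosimeter i reads R_i = D_photon + k_i * D_neutron, where k_i is its relative neutron sensitivity and its photon sensitivity is taken as 1. Regressing the readings on k_i then gives the neutron dose as the slope and the photon dose as the intercept. All sensitivities and readings below are hypothetical.

```python
# Neutron and photon dose from several dosimeters by simple linear regression.
# Assumed response model: reading_i = D_photon + k_i * D_neutron.
import math

def fit_line(x, y):
    """Least-squares line; returns (slope, intercept, standard error of intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    residual_variance = rss / (n - 2)
    se_intercept = math.sqrt(residual_variance * (1.0 / n + mx ** 2 / sxx))
    return slope, intercept, se_intercept

# Hypothetical relative neutron sensitivities and readings (arbitrary units),
# generated from D_neutron = 100 and D_photon = 8 without noise.
k = [0.1, 0.4, 0.7, 1.0]
readings = [18.0, 48.0, 78.0, 108.0]

neutron_dose, photon_dose, photon_dose_se = fit_line(k, readings)
print("neutron dose (slope)    :", neutron_dose)
print("photon dose (intercept) :", photon_dose)
print("std. error of photon dose:", photon_dose_se)
```

With measured data the standard error of the intercept is non-zero and gives the kind of uncertainty estimate for the photon dose that the abstract mentions.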

262

International Nuclear Information System (INIS)

Predicting the amount of hospital waste produced is helpful for managing its storage, transportation and disposal. Based on this fact, two predictor models, artificial neural networks (ANNs) and multiple linear regression (MLR), were applied to predict the rate of medical waste generation, both in total and by type (sharps, infectious and general waste). In this study, a 5-fold cross-validation procedure on a database containing a total of 50 hospitals of Fars province (Iran) was used to verify the performance of the models. Three performance measures, MAR, RMSE and R², were used to evaluate the performance of the models. MLR, as a conventional model, obtained poor prediction performance measure values. However, MLR distinguished hospital capacity and bed occupancy as the more significant parameters. On the other hand, ANNs, as a more powerful model that had not previously been introduced for predicting the rate of medical waste generation, showed high performance measure values, especially an R² value of 0.99, confirming the good fit of the data. Such satisfactory results could be attributed to the non-linear nature of ANNs in problem solving, which provides the opportunity for relating independent variables to dependent ones non-linearly. In conclusion, the obtained results showed that our ANN-based model approach is very promising and may play a useful role in developing a better cost-effective strategy for waste management in future.

2009-01-01
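The evaluation scheme in the abstract above, 5-fold cross-validation of a multiple linear regression scored by RMSE and R², might look roughly like the sketch below. The data are synthetic stand-ins for 50 hospitals: the predictors are named after the capacity and bed-occupancy variables the abstract mentions, but every value and coefficient is hypothetical, and folds are assigned by index rather than at random to keep the example deterministic.

```python
# 5-fold cross-validation of a multiple linear regression (synthetic data).
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def least_squares(X, y):
    """Ordinary least squares via the normal equations."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve_linear(XtX, Xty)

# Hypothetical data: waste rate as a linear function of capacity and
# occupancy plus a small deterministic perturbation.
rows = []
for i in range(50):
    capacity = 50.0 + (37 * i) % 251
    occupancy = 0.4 + ((17 * i) % 50) / 100.0
    noise = ((i * i) % 7 - 3) * 0.1
    waste = 2.0 + 0.5 * capacity + 3.0 * occupancy + noise
    rows.append(([1.0, capacity, occupancy], waste))

# Each hospital is predicted by a model fitted without its own fold.
predictions = [0.0] * len(rows)
for fold in range(5):
    train = [r for i, r in enumerate(rows) if i % 5 != fold]
    beta = least_squares([x for x, _ in train], [y for _, y in train])
    for i, (x, _) in enumerate(rows):
        if i % 5 == fold:
            predictions[i] = sum(b * v for b, v in zip(beta, x))

observed = [y for _, y in rows]
mean_obs = sum(observed) / len(observed)
sse = sum((o - p) ** 2 for o, p in zip(observed, predictions))
sst = sum((o - mean_obs) ** 2 for o in observed)
rmse = math.sqrt(sse / len(observed))
r2 = 1.0 - sse / sst
print("cross-validated RMSE:", rmse)
print("cross-validated R2  :", r2)
```

Because the synthetic response is linear by construction, the regression scores well here; the abstract reports the opposite on the real data, where the non-linear ANN fitted better.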

263

UK PubMed Central (United Kingdom)

A set of 38 mineral base oils was characterized by a number of chemical (i.e., overall chemical composition) and physical parameters used routinely in industry. Their primary biodegradability was evaluated using the CEC L-33-A-93 test. Multiple (stepwise) linear regression (MLR) analyses were performed to describe the relationships between the biodegradability values and the chemical or physical properties of oils. Chemical, physical, and both types of parameters were successively used as independent variables. Using chemical descriptors as variables, a four-variable model equation was obtained that explained only 68.2% (adjusted R-squared statistic=68.2%) of the variability in biodegradability. The fitting was improved by using either the physical or the whole parameters as variables. MLR analyses led to three-descriptor model equations involving kinematic viscosity (as log), Noack volatility (as log) and either the viscosity index (pure physical model) or the paraffinic carbon percentage (mixed chemical-physical model). These two models displayed very similar adjusted R-squared statistics, of approximately 91%. Their predicting ability was verified using 25 additional base oils or oil blends. For 80% of oils on a total of 63, the absolute percentage error on biodegradability predicted by either model was lower than 20%. Kinematic viscosity was by far the most influential parameter in the two models.

Haus F; Boissel O; Junter GA

2003-02-01

264

Directory of Open Access Journals (Sweden)

Full Text Available In the present work, support vector machines (SVMs) and multiple linear regression (MLR) techniques were used for quantitative structure–property relationship (QSPR) studies of retention time (tR) in standardized liquid chromatography–UV–mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins) based on molecular descriptors calculated from the optimized 3D structures. By applying missing value, zero and multicollinearity tests with a cutoff value of 0.95, and a genetic algorithm method of variable selection, the most relevant descriptors were selected to build QSPR models. MLR and SVM methods were employed to build the QSPR models. The robustness of the QSPR models was characterized by statistical validation and the applicability domain (AD). The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability, measured by r² and q², are 0.931 and 0.932, respectively, for SVM and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using a Williams plot. The effects of different descriptors on the retention times are described.

Roya Khosrokhavar; Jahan Bakhsh Ghasemi; Fereshteh Shiri

2010-01-01
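
The record above reports a fitted r2 and a cross-validated q2. As a rough illustration of how the two differ, the sketch below computes both for a one-descriptor least-squares line using leave-one-out cross-validation. It assumes a single descriptor (the study used several), uses the full-data mean in the total sum of squares, and all names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r2_and_q2(xs, ys):
    """Return (r2, q2): fitted R^2 and leave-one-out cross-validated Q^2."""
    n = len(ys)
    my = sum(ys) / n
    ss_tot = sum((y - my) ** 2 for y in ys)
    slope, icpt = fit_line(xs, ys)
    rss = sum((y - (icpt + slope * x)) ** 2 for x, y in zip(xs, ys))
    press = 0.0  # predictive residual sum of squares
    for i in range(n):
        # Refit without observation i, then predict it.
        s, c = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (c + s * xs[i])) ** 2
    return 1.0 - rss / ss_tot, 1.0 - press / ss_tot


# Hypothetical descriptor values and retention times.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
retention = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
print(r2_and_q2(descriptor, retention))
```

For a least-squares fit, q2 computed this way cannot exceed r2, so a large gap between the two is a common warning sign of overfitting.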

265

UK PubMed Central (United Kingdom)

Ozone is a highly unpredictable pollutant which severely affects living conditions in urban and surrounding areas in the Mediterranean basin. This secondary pollutant periodically reaches extremely high concentrations, damaging human health. Multiple linear regression has been widely used in previous works due to the fact that it is a simple and versatile method for forecasting ozone concentrations. However, these models usually prove their validity using fulfillment of statistical constraints, ignoring other intrinsic characteristics existing in the time series, such as the temporal scaling behavior and the data distribution over different time scales. In previous works, it has been demonstrated that observed ozone time series are of a multifractal nature, meaning that the data distribution can be described by using the multifractal spectrum. This work focuses on the capacity of a forecasting model to reproduce the scaling features existing in an observed time series when several chemical and meteorological explanatory variables are introduced following the stepwise procedure. A comparison between the observed spectrum and the simulated ones for each step is used to check which explanatory variables better reproduce the multifractal nature in real ozone time series. It has been confirmed that a model with few explanatory variables allows reproducing the multifractal nature in the simulated time series with an acceptable accuracy without compromising the values of the coefficient of determination and root-mean-squared error, which were used as performance indicators.

Pavón-Domínguez P; Jiménez-Hornero FJ; de Ravé EG

2013-05-01

266

Directory of Open Access Journals (Sweden)

Full Text Available Background: Recently there has been an explosion in the availability of bacterial genomic sequences, making possible an analysis of genomic signatures across more than 800 different bacterial chromosomes from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable, with AT content, phyla, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors. Results: Many significant factors were associated with the genomic signature, most notably AT content. Phyla was also an important factor, although considerably less so than AT content. Small improvements to the regression model, although significant, were also obtained from factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. Conclusion: The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.

Bohlin Jon; Skjerve Eystein; Ussery David W

2009-01-01

267

Application of Binary Regression Analysis in the Prescription Pattern of Antidepressants

Directory of Open Access Journals (Sweden)

Full Text Available Background: In Nepal, several research studies are reported using percentages or cross-tabulation methods, but the use of logistic regression methodology lags behind among researchers. Objectives: The main objective of this study was to find the role of logistic regression analysis in the prescription pattern of antidepressants among hospitalized patients in a tertiary care center of Western Nepal. Methods: A hospital-based study was done between 1st October 2009 and 31st March 2010 at the Psychiatry Ward of Manipal Teaching Hospital, Nepal. Z test, Chi square test and binary logistic regression were used for the analysis. We calculated odds ratios (OR) and their 95% confidence intervals (95% CI) P-value 10000, 2.63 times more in Hindus and 1.197 times more in Brahmins than any other ethnic groups. There was a 9.179 times greater tendency to prescribe antidepressants by trade names for unemployed patients compared with employed patients in Nepal. Conclusion: Binary logistic regression plays an important role in understanding the drug utilization pattern of mood elevators in Western Nepal.

Dr.Indrajit Banerjee, MBBS, MD; Dr.Indraneel Banerjee, MBBS, MS, MRCS; Bedanta Roy; Dr.Brijesh Sathian MD(AM), PhD.

2013-01-01
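
The record above reports odds ratios with 95% confidence intervals. For a single binary predictor, the odds ratio from a logistic regression matches the one from a 2x2 table, so a simplified sketch can use the table directly with the Woolf (log-based) interval. The function name and the counts are hypothetical and are not data from the study.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Woolf confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
print(odds_ratio_ci(20, 10, 10, 20))
```

An interval that excludes 1 corresponds to a predictor that is significant at roughly the 5% level.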

268

DEFF Research Database (Denmark)

Recently there has been an explosion in the availability of bacterial genomic sequences, making possible an analysis of genomic signatures across more than 800 different bacterial chromosomes from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable, with AT content, phyla, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors. Many significant factors were associated with the genomic signature, most notably AT content. Phyla was also an important factor, although considerably less so than AT content. Small improvements to the regression model, although significant, were also obtained from factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.

Ussery, David; Bohlin, Jon

2009-01-01

269

UK PubMed Central (United Kingdom)

Searching for background factors associated with falls in people with dementia is difficult because the population is heterogeneous. The aim of this study was to compare the efficacies of three statistical methods for analysis of fall predictors in people with dementia. NBR, RT and PLSR analyses were compared. Data used for the comparison were from a prospective cohort study of 192 patients at a psychogeriatric ward, specializing in patients with cognitive impairment and related behavioral and psychological symptoms. Seventy-eight of these patients fell a total of 238 times. PLSR and RT analyses are directed at finding patterns among predictor variables related to outcome, whereas an NBR model is directed at finding predictor variables that, independent of other variables, are related to the outcome. The NBR analysis explained an additional 10-15% variation compared with the PLSR and RT analyses. The results of PLSR and RT show a similar plausible pattern of predictor variables. However, none of these techniques appears to be sufficient in itself. In order to gain patterns of explanatory variables, RT would be a good complement to NBR for analysis of fall predictors.

Eriksson S; Lundquist A; Gustafson Y; Lundin-Olsson L

2009-11-01

270

The information that is gained through various analyses of the residual scores yielded by the least squares regression model is explored. In fact, the most widely used methods for detecting data that do not fit this model are based on an analysis of residual scores. First, graphical methods of residual analysis are discussed, followed by a review…

Serdahl, Eric

271

Analysis of the Evolution of the Gross Domestic Product by Means of Cyclic Regressions

Directory of Open Access Journals (Sweden)

Full Text Available In this article, we carry out an analysis of the regularity of the Gross Domestic Product of a country, in our case the United States. The analysis is based on a new method, cyclic regressions based on the Fourier series of a function. Another point of view is to consider, instead of the growth rate of GDP, the speed of variation of this rate, computed as a numerical derivative. The obtained results show a cycle of 71 years for this indicator, the mean square error being 0.93%. The method described allows a prognosis of short-term trends in GDP.

Catalin Angelo Ioan; Gina Ioan

2011-01-01
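
The record above describes regressions built on the Fourier series of a function. The sketch below is one simple way to fit such a series and is not the authors' procedure. It assumes evenly spaced samples that cover exactly one full cycle, in which case the sine and cosine regressors are orthogonal and the least-squares coefficients reduce to projections. All names and the synthetic series are hypothetical.

```python
import math


def fourier_coefficients(y, n_harmonics):
    """Least-squares Fourier coefficients for one full cycle of samples.

    Returns (a0, [(a_1, b_1), ..., (a_K, b_K)]) for the model
    y_t ~ a0 + sum_k a_k cos(2 pi k t / n) + b_k sin(2 pi k t / n).
    """
    n = len(y)
    if n <= 2 * n_harmonics:
        raise ValueError("need more than 2 * n_harmonics samples")
    a0 = sum(y) / n
    coeffs = []
    for k in range(1, n_harmonics + 1):
        a_k = 2 / n * sum(v * math.cos(2 * math.pi * k * t / n) for t, v in enumerate(y))
        b_k = 2 / n * sum(v * math.sin(2 * math.pi * k * t / n) for t, v in enumerate(y))
        coeffs.append((a_k, b_k))
    return a0, coeffs


def fourier_predict(a0, coeffs, t, n):
    """Evaluate the fitted series at time index t for cycle length n."""
    return a0 + sum(
        a * math.cos(2 * math.pi * k * t / n) + b * math.sin(2 * math.pi * k * t / n)
        for k, (a, b) in enumerate(coeffs, start=1)
    )


# Synthetic series with a known mean and two harmonics.
n = 60
series = [2 + 3 * math.cos(2 * math.pi * t / n) - 1.5 * math.sin(4 * math.pi * t / n) for t in range(n)]
a0, coeffs = fourier_coefficients(series, n_harmonics=2)
print(a0, coeffs)
```

When the cycle length is unknown or the record does not span a whole number of cycles, the projections no longer coincide with least squares and a general linear solver would be needed instead.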

272

UK PubMed Central (United Kingdom)

Surface-enhanced Raman spectroscopy (SERS) has been a routine method used as an analytical tool for the quantitative analysis of materials. The difficulties mainly come from the inherently unstable backgrounds of Raman signals, which unexpectedly increase the intensities of Raman spectra, and from the high-dimension, small-sample-number problem of Raman data sets, which demands feature-extraction ability from the regression methods. Aiming to remove the unstable background while extracting the Raman peaks, and to make full use of the information in the Raman peaks to extract features, we design a new framework that combines new continuum regression (NCR) with continuous wavelet transform (CWT) for the quantitative analysis of Raman spectra. The experimental results show that its performance beats the state-of-the-art methods.

Li S; Kang M; Nyagilo JO; Zhang B; Wu X; Dave DP; Gao J

2013-07-01

273

Directory of Open Access Journals (Sweden)

Full Text Available Credit scoring is a vital topic for Banks since there is a need to use limited financial sources more effectively. There are several credit scoring methods that are used by Banks. One of them is to estimate whether a credit demanding customer’s repayment order will be regular or not. In this study, artificial neural networks and logistic regression analysis have been used to provide a support to the Banks’ credit risk prediction and to estimate whether a credit demanding customers’ repayment order will be regular or not. The results of the study showed that artificial neural networks method is more reliable than logistic regression analysis while estimating a credit demanding customer’s repayment order.

Hüseyin BUDAK; Semra ERPOLAT

2012-01-01

274

A logistic normal multinomial regression model for microbiome compositional data analysis.

UK PubMed Central (United Kingdom)

Changes in human microbiome are associated with many human diseases. Next generation sequencing technologies make it possible to quantify the microbial composition without the need for laboratory cultivation. One important problem of microbiome data analysis is to identify the environmental/biological covariates that are associated with different bacterial taxa. Taxa count data in microbiome studies are often over-dispersed and include many zeros. To account for such an over-dispersion, we propose to use an additive logistic normal multinomial regression model to associate the covariates to bacterial composition. The model can naturally account for sampling variabilities and zero observations and also allow for a flexible covariance structure among the bacterial taxa. In order to select the relevant covariates and to estimate the corresponding regression coefficients, we propose a group ℓ1 penalized likelihood estimation method for variable selection and estimation. We develop a Monte Carlo expectation-maximization algorithm to implement the penalized likelihood estimation. Our simulation results show that the proposed method outperforms the group ℓ1 penalized multinomial logistic regression and the Dirichlet multinomial regression models in variable selection. We demonstrate the methods using a data set that links human gut microbiome to micro-nutrients in order to identify the nutrients that are associated with the human gut microbiome enterotype.

Xia F; Chen J; Fung WK; Li H

2013-10-01

275

Robust estimation for homoscedastic regression in the secondary analysis of case-control data.

UK PubMed Central (United Kingdom)

Primary analysis of case-control studies focuses on the relationship between disease D and a set of covariates of interest (Y, X). A secondary application of the case-control study, which is often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated owing to the case-control sampling, where the regression of Y on X is different from what it is in the population. Previous work has assumed a parametric distribution for Y given X and derived semiparametric efficient estimation and inference without any distributional assumptions about X. We take up the issue of estimation of a regression function when Y given X follows a homoscedastic regression model, but otherwise the distribution of Y is unspecified. The semiparametric efficient approaches can be used to construct semiparametric efficient estimates, but they suffer from a lack of robustness to the assumed model for Y given X. We take an entirely different approach. We show how to estimate the regression parameters consistently even if the assumed model for Y given X is incorrect, and thus the estimates are model robust. For this we make the assumption that the disease rate is known or well estimated. The assumption can be dropped when the disease is rare, which is typically so for most case-control studies, and the estimation algorithm simplifies. Simulations and empirical examples are used to illustrate the approach.

Wei J; Carroll RJ; Müller UU; Van Keilegom I; Chatterjee N

2013-01-01

276

Directory of Open Access Journals (Sweden)

Full Text Available OBJECTIVES: Recent guidelines recommend that all cirrhotic patients should undergo endoscopic screening for esophageal varices. Identifying cirrhotic patients with esophageal varices by noninvasive predictors would allow endoscopy to be restricted to patients with a high risk of having varices. This study aimed to develop a decision model based on classification and regression tree analysis for the prediction of large esophageal varices in cirrhotic patients. METHODS: 309 cirrhotic patients (training sample, 187 patients; test sample, 122 patients) were included. Within the training sample, classification and regression tree analysis was used to identify predictors and a prediction model of large esophageal varices. The prediction model was then further evaluated in the test sample and in different Child-Pugh classes. RESULTS: The prevalence of large esophageal varices in cirrhotic patients was 50.8%. A tree model consisting of spleen width, portal vein diameter and prothrombin time, developed by classification and regression tree analysis, achieved a diagnostic accuracy of 84% for prediction of large esophageal varices. When reconstructed into two groups, the rate of varices was 83.2% for the high-risk group and 15.2% for the low-risk group. Accuracy of the tree model was maintained in the test sample and in different Child-Pugh classes. CONCLUSIONS: A decision tree model that consists of spleen width, portal vein diameter and prothrombin time may be useful for prediction of large esophageal varices in cirrhotic patients.

Wan-dong Hong; Le-mei Dong; Zen-cai Jiang; Qi-huai Zhu; Shu-Qing Jin

2011-01-01

277

DEFF Research Database (Denmark)

Background: The next fifty years will see a drastic increase in the older population. Among other effects, ageing causes a decrease in strength. It is necessary to provide safe and comfortable environments for the elderly. To achieve this, digital human modelling has proved to be a useful and valuable ergonomic tool. Objective: To investigate age and gender effects on the torque-producing ability in the knee and elbow in older adults. To create strength scaled equations based on age, gender, upper/lower limb lengths and masses using multiple linear regression. To reduce the number of dependent parameters based on statistical redundancies, and then validate these equations. Methods: 283 subjects (141 males, 142 females) aged 50-59 years (54.9 +/- 2.9), 60-69 years (65.4 +/- 2.9) and 70-79 years (73.7 +/- 2.7) were tested for maximal voluntary isometric torque of right knee extensors and elbow flexors. Results: Males were significantly stronger than females across all age groups. Elbow peak torque (EPT) was better preserved from the 60s to the 70s, whereas knee peak torque (KPT) reduced significantly (P<0.05) across all age groups. This held true for males and females. Gender, thigh mass and age best predicted KPT (R2=0.60). Gender, forearm mass and age best predicted EPT (R2=0.75). Good cross-validation was established for both elbow and knee models. Conclusion: This cross-sectional study of muscle strength created and validated strength scaled equations of EPT and KPT using only gender, segment mass and age.

D'Souza, Sonia; Rasmussen, John

2012-01-01

278

Directory of Open Access Journals (Sweden)

Full Text Available Organophosphorus compounds are a well-known class of toxic chemicals that find their way into the ecosystem due to their widespread use. Their detection, identification and quantification are a cause of concern the world over. In environmental samples these compounds are detected and estimated through the gas chromatographic response factor. This prompted us to study the quantitative structure-response relationships (QSRR) of the gas chromatographic response factor of organophosphonate esters. In this study, attempts have been made to rationalize the gas chromatographic response factor of twenty-eight organophosphonates in terms of their physicochemical and electronic descriptors. Combinatorial Protocol in Multiple Linear Regression (CP-MLR), a 'filter' based variable selection procedure for model development in structure-activity or property relationship studies, has been used for the variable selection and identification of diverse QSRR models of the GC response factor of organophosphonates. The study has resulted in the identification of ten models (equations), having two or three descriptors each, to account for the response factor of organophosphonates (cross-validated R2 or Q2 is 0.88 to 0.95). The response factor of the compounds is strongly correlated with the total refractivity (TREF), molecular weight (MW) and thermodynamic properties, e.g., enthalpy of vaporization (ENTH). In the study, alkyl groups of these compounds have shown a two-fold influence (namely, steric and branching effects) on the response factor. Also, the study suggests that the polarization of the (d-p)π bond of P=Oa in these compounds plays a critical role in the formation of the responding species. The steric and electronic properties of organophosphonates play a determining role in the predictive aspect of their gas chromatographic response factor. The study also suggested a mechanism for the formation of the responding species.

Yenamandra S. Prabhakar

2004-01-01

279

UK PubMed Central (United Kingdom)

1. Central questions of behavioural and evolutionary ecology are what factors influence the reproductive success of dominant breeders and subordinate nonbreeders within animal societies? A complete understanding of any society requires that these questions be answered for all individuals. 2. The clown anemonefish, Amphiprion percula, forms simple societies that live in close association with sea anemones, Heteractis magnifica. Here, we use data from a well-studied population of A. percula to determine the major predictors of reproductive success of dominant pairs in this species. 3. We analyse the effect of multiple predictors on four components of reproductive success, using a relatively new technique from the field of statistical learning: boosted regression trees (BRTs). BRTs have the potential to model complex relationships in ways that give powerful insight. 4. We show that the reproductive success of dominant pairs is unrelated to the presence, number or phenotype of nonbreeders. This is consistent with the observation that nonbreeders do not help or hinder breeders in any way, confirming and extending the results of a previous study. 5. Primarily, reproductive success is negatively related to male growth and positively related to breeding experience. It is likely that these effects are interrelated because males that grow a lot have little breeding experience. These effects are indicative of a trade-off between male growth and parental investment. 6. Secondarily, reproductive success is positively related to female growth and size. In this population, female size is positively related to group size and anemone size, also. These positive correlations among traits likely are caused by variation in site quality and are suggestive of a silver-spoon effect. 7. Notably, whereas reproductive success is positively related to female size, it is unrelated to male size. This observation provides support for the size advantage hypothesis for sex change: both individuals maximize their reproductive success when the larger individual adopts the female tactic. 8. This study provides the most complete picture to date of the factors that predict the reproductive success of dominant pairs of clown anemonefish and illustrates the utility of BRTs for analysis of complex behavioural and evolutionary ecology data.

Buston PM; Elith J

2011-05-01

280

Scientific Electronic Library Online (English)

Full Text Available It is necessary to have long records of annual hydrological data to get a truer picture of their variability, as well as reliable estimates of their statistical properties. To obtain these records it is common to use additional sources of data and transfer techniques. One technique is multiple linear regression, whose numerical application implies the optimum selection of nearby long records (regressors) so that the extension of the short record is a reliable estimate. This selection process involves three analyses: 1) how to define the best estimates, 2) what regression equations should be investigated, and 3) which model has the better predictive ability. For the first analysis, four criteria based on the sums of the squares of the residuals are presented; for the second, all possible regressions are investigated, since in problems of hydrological information transfer there will be five regressors at most; for the third, selecting the best predictive model, residual analysis and cross-validation are used. The numerical application described is an extension of the annual runoff volume record at the Platón Sánchez hydrometric station of the Tempoal river system in Hydrological Region No. 26 (Pánuco, México). Here, four regressors are used, which are the records of the other gauging stations in that system. It is concluded that even in problems with multicollinearity, the selection criteria and analyses presented lead to consistent results and make it possible to obtain the best regression equations. The similarity of the results obtained with the selected regression models generates confidence in the estimates adopted.

Campos-Aranda, Daniel F.

2011-12-01

281

Scientific Electronic Library Online (English)

Full Text Available This work was conducted with the objective of developing a methodology for classification of the ionic composition of irrigation water using multiple linear regression. A stepwise regression analysis model was tested, using electrical conductivity as the dependent variable and the analyzed ions calcium, sodium, potassium, carbonate, bicarbonate and chlorides as the independent variables in all tested models. All water samples were collected by the farmers of the region where this work was conducted. The regression models were adjusted using the water analysis database from the ESAM's Analysis Laboratory (Laboratório de Análises de Água e Fertilidade do Solo da Escola Superior de Agricultura de Mossoró - LAAFS/ESAM). The linear model, adjusted using the stepwise regression procedure, shows that the degree of adjustment of the tested model depends upon the geological formation of the watersheds and on whether the water is collected in a river or in tubular wells. The water in the calcareous region of the Chapada do Apodi is classified as calcic-sodic, calcic or chloride if the source was a tubular well, a piezometric well (drilled in unconfined water, known in the region as poço amazonas) or surface water from rivers and lagoons, respectively. In the Baixo Açu region, the waters were classified as sodic, magnesian-sodic or sodic depending on whether the source is a tubular well (drilled in the Açu sedimentary geological formation), a piezometric well or surface water, respectively.

Maia, Celsemy E.; Morais, Elís R.C. de; Oliveira, Maurício de

2001-04-01

282

A regression analysis of the effect of energy use in agriculture

Energy Technology Data Exchange (ETDEWEB)

This study investigates the impacts of energy use on the productivity of Turkey's agriculture. It reports the results of a regression analysis of the relationship between energy use and agricultural productivity. The study is based on the analysis of yearbook data for the period 1971-2003. Agricultural productivity was specified as a function of its energy consumption (TOE) and gross additions of fixed assets during the year. Least squares (LS) was employed to estimate the equation parameters. The data of this study come from the State Institute of Statistics (SIS) and the Ministry of Energy of Turkey. (Author)

Karkacier, Osman [Gaziosmanpasa Univ., Dept. of Business Administration, Tokat (Turkey); Goktolga, Z. Gokalp; Cicek, Adnan [Gaziosmanpasa Univ., Dept. of Agricultural Economics, Tokat (Turkey)

2006-12-15

283

Energy Technology Data Exchange (ETDEWEB)

This paper describes the results obtained with multiple regression analysis for coke quality prediction at COSIPA Steelworks. The analysed data cover the period from July 1993, when the use of soft coal in the blend was increased, to December 1994. Some neural networks were also designed and built to compare the results of this technique with the statistical model predictions. (author) 5 refs., 9 figs., 2 tabs.

Lia, Luiz R.B.; Maranha, Silvio P.D. [Companhia Siderurgica Paulista (COSIPA), Cubatao, SP (Brazil)

1996-12-31

284

Correlation Study and Regression Analysis of Drinking Water Quality in Kashan City, Iran

Directory of Open Access Journals (Sweden)

Full Text Available Chemical and statistical regression analysis on drinking water samples at five fields (21 sampling wells) with hot and dry climate in Kashan city, central Iran was carried out. Samples were collected during October 2006 to May 2007 (25 - 30 °C). Comparing the results with drinking water quality standards issued by World Health Organization (WHO), it is found that some of the water samples are not potable. Hydrochemical facies using a Piper diagram indicate that in most parts of the city, the chemical character of water is dominated by NaCl. All samples showed sulfate and sodium ion higher and K+ and F- content lower than the permissible limit. A strongly positive correlation is observed between TDS and EC (R = 0.995) and Ca2+ and TH (R = 0.948). The results showed that regression relations have the same correlation coefficients: (I) pH -TH, EC -TH (R = 0.520), (II) NO3- -pH, TH-pH (R = 0.520), (III) Ca2+-SO42-, TH-SO42-, Cl- -SO42- (R = 0.630). The results revealed that systematic calculations of correlation coefficients between water parameters and regression analysis provide a useful means for rapid monitoring of water quality.

Mohammad Mehdi HEYDARI; Ali ABBASI; Seyed Mohammad ROHANI; Seyed Mohammad Ali HOSSEINI

2013-01-01
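
The correlation coefficients in the record above (for example between TDS and EC) are Pearson coefficients. A minimal sketch of the computation follows; the function name and the sample values are hypothetical and are not measurements from the Kashan study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have equal length of at least 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical TDS (mg/L) and EC (uS/cm) readings for five wells.
tds = [850.0, 1200.0, 1430.0, 990.0, 1710.0]
ec = [1320.0, 1890.0, 2210.0, 1540.0, 2650.0]
print(round(pearson_r(tds, ec), 3))
```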

285

Directory of Open Access Journals (Sweden)

Full Text Available In order to evaluate the skin surface temperature (SSST) estimation accuracy with MODIS data, 84 MODIS scenes together with the match-up data of NCEP/GDAS are used. Through regression analysis, it is found that an RMSE of 0.305 to 0.417 K can be achieved. Furthermore, it is also found that band 29 is effective for atmospheric correction (30.6 to 38.8% improvement in estimation accuracy). If a single coefficient set for the regression equation is used for all cases, the SSST estimation accuracy is around 1.969 K, so a specific coefficient set has to be set for each of the five different cases.

Kohei Arai

2012-01-01

286

Dynamic Seemingly Unrelated Cointegrating Regression

UK PubMed Central (United Kingdom)

Multiple cointegrating regressions are frequently encountered in empirical work as, for example, in the analysis of panel data. When the equilibrium errors are correlated across equations, the seemingly unrelated regression estimation strategy can be applied to cointegrating regressions to obtain asymptotically efficient estimators. While non-parametric methods for seemingly unrelated cointegrating regressions have been proposed in the literature, they are not computationally straightforward. We propose Dynamic Seemingly Unrelated Regression (SUR) Estimators which can be made fully parametric and are computationally straightforward to use. We study the asymptotic and small sample properties of the dynamic SUR estimators for both heterogeneous and homogenous cointegrating vectors. The estimation techniques are then applied to analyze two long-standing problems. The first revisits whether the forward exchange rate is an unbiased predictor of the future spot rate. The second prob...

Nelson C. Mark; Masao Ogaki; Donggyu Sul

287

DEFF Research Database (Denmark)

The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating eigenvalues and eigenvectors. We give a number of different applications to regression and time series analysis, and show how the reduced rank regression estimator can be derived as a Gaussian maximum likelihood estimator. We briefly mention asymptotic results.

Johansen, Søren

2008-01-01
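
A minimal sketch of a reduced rank regression estimator may help make the abstract above concrete. It assumes the simplest, identity-weighted least-squares variant (fit ordinary least squares, then project onto the leading right singular vectors of the fitted values), not the Gaussian maximum likelihood derivation the abstract refers to. The function name and the simulated data are hypothetical.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Least-squares fit of Y ~ X @ C subject to rank(C) <= rank (identity weighting)."""
    # Unrestricted ordinary least-squares coefficients.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ B_ols
    # The right singular vectors of the fitted values are the eigenvectors of
    # fitted.T @ fitted; the leading ones span the retained response directions.
    _, _, Vt = np.linalg.svd(fitted, full_matrices=False)
    V_r = Vt[:rank].T
    return B_ols @ V_r @ V_r.T

# Simulated example: 4 predictors, 3 responses, true coefficient matrix of rank 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
C_true = rng.normal(size=(4, 1)) @ rng.normal(size=(1, 3))
Y = X @ C_true + 0.01 * rng.normal(size=(50, 3))

C_hat = reduced_rank_regression(X, Y, rank=1)
print(np.linalg.matrix_rank(C_hat))
```

The singular value decomposition is where the eigenvalue computation mentioned in the abstract enters; a weighted version would first transform the responses by an estimate of the error covariance.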

288

Directory of Open Access Journals (Sweden)

Full Text Available Polycyclic aromatic hydrocarbons (PAHs) are ubiquitous contaminants found in the environment. Immunoassays represent useful analytical methods to complement traditional analytical procedures for PAHs. Cross-reactivity (CR) is a very useful characteristic for evaluating the extent of cross-reaction of a cross-reactant in immunoreactions and immunoassays. The quantitative relationships between the molecular properties and the CR of PAHs were established by stepwise multiple linear regression, principal component regression and partial least squares regression, using the data of two commercial enzyme-linked immunosorbent assay (ELISA) kits. The objective is to find the most important molecular properties that affect the CR, and to predict the CR by multiple regression methods. The results show that the physicochemical, electronic and topological properties of the PAH molecules have an integrated effect on the CR properties for the two ELISAs, among which molar solubility (Sm) and valence molecular connectivity index (³χv) are the most important factors. The obtained regression equations for RisC kit are all statistically significant (p p > 0.05) and not suitable for predicting. This is probably because the RisC immunoassay employs a monoclonal antibody, while the RaPID kit is based on a polyclonal antibody. Considering the important effect of solubility on the CR values, cross-reaction potential (CRP) is calculated and used as a complement to CR for the evaluation of cross-reactions in immunoassays. Only compounds with both high CR and high CRP can cause intense cross-reactions in immunoassays.

Yan-Feng Zhang; Li Zhang; Zhi-Xian Gao; Shu-Gui Dai

2012-01-01

289

A sensitivity analysis of a distributed hydrologic model with a large number of parameters is essential for understanding the model structure and simplifying model calibration efforts. It is also useful for guiding future field data collection and sampling efforts. Global sensitivity analysis methods are widely recognized today as superior to local or one-at-a-time methods because they are not limited by model linearity requirements and have a more extensive coverage of the parameter space. In this study, two global sensitivity analysis methods, the variance-based Sobol method and a Latin Hypercube Sampling based Multiple Linear Regression (LHS-MLR) approach, are employed to evaluate the effect of model parameter variability on simulated stages in the Everglades National Park (ENP) in Florida, USA. Both methods provide robust estimates of model parameter sensitivity. However, due to the distinctive characteristics of the two methods, they provide unique insights regarding model parameter sensitivities. These observations are compared in detail in this study. The simulated stage results from the distributed-parameter Regional Simulation Model (RSM), developed by the South Florida Water Management District, are used for this comparison. The parameters considered for sensitivity analysis consist of several model parameters that influence overland and groundwater flows as well as evapotranspiration within the ENP. Their relative sensitivities are assessed under dry, wet and average hydrologic conditions existing in the ENP watershed. The use of a variety of hydrologic conditions allows the robust assessment of parameter sensitivities obtained using the two global sensitivity analysis methods.

Dessalegne, T.; Senarath, S. U.; Novoa, R. J.

2010-12-01
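
The LHS-MLR approach named in the abstract above can be sketched as follows: draw a Latin Hypercube sample of the parameters, run the model, and rank parameters by standardized regression coefficients from a multiple linear regression. This is a sketch under stated assumptions, not the authors' implementation; `toy_model` is a hypothetical stand-in for an expensive simulator, and all names and sample sizes are illustrative.

```python
import numpy as np

def latin_hypercube(n_samples, n_params, rng):
    """One point per equal-probability stratum in each dimension, on [0, 1)."""
    strata = np.column_stack(
        [rng.permutation(n_samples) for _ in range(n_params)]
    )
    return (strata + rng.random((n_samples, n_params))) / n_samples

def standardized_regression_coefficients(X, y):
    """Multiple linear regression on z-scored inputs and output."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    coef, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return coef

def toy_model(X):
    # Hypothetical linear response: strong, weak and no dependence on the inputs.
    return 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.0 * X[:, 2]

rng = np.random.default_rng(0)
X = latin_hypercube(200, 3, rng)
src = standardized_regression_coefficients(X, toy_model(X))
print(np.round(src, 3))
```

Larger absolute coefficients indicate parameters to which the output is more sensitive. Unlike a variance-based Sobol analysis, this ranking is only as trustworthy as the linear fit, which is one reason the two methods can give different insights.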

290

Directory of Open Access Journals (Sweden)

Full Text Available Researchers in the health field often deal with the problem of incomplete databases. Complete Case Analysis (CCA), which restricts the analysis to subjects with complete data, reduces the sample size and may result in biased estimates. Based on statistical grounds, Multiple Imputation (MI) uses all collected data and is recommended as an alternative to CCA. Data from the study Saúde em Beagá, attended by 4,048 adults from two of nine health districts in the city of Belo Horizonte, Minas Gerais State, Brazil, in 2008-2009, were used to evaluate CCA and different MI approaches in the context of logistic models with incomplete covariate data. Peculiarities in some variables in this study allowed analyzing a situation in which the missing covariate data are recovered and thus the results before and after recovery are compared. Based on the analysis, even the more simplistic MI approach performed better than CCA, since it was closer to the post-recovery results.

Vitor Passos Camargos; Cibele Comini César; Waleska Teixeira Caiaffa; Cesar Coelho Xavier; Fernando Augusto Proietti

2011-01-01

291

A refined method for multivariate meta-analysis and meta-regression.

UK PubMed Central (United Kingdom)

Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.

Jackson D; Riley RD

2013-08-01

292

Regression analysis of current-status data: an application to breast-feeding.

UK PubMed Central (United Kingdom)

"Although techniques for calculating mean survival time from current-status data are well known, their use in multiple regression models is somewhat troublesome. Using data on current breast-feeding behavior, this article considers a number of techniques that have been suggested in the literature, including parametric, nonparametric, and semiparametric models as well as the application of standard schedules. Models are tested in both proportional-odds and proportional-hazards frameworks....I fit [the] models to current status data on breast-feeding from the Demographic and Health Survey (DHS) in six countries: two African (Mali and Ondo State, Nigeria), two Asian (Indonesia and Sri Lanka), and two Latin American (Colombia and Peru)."

Grummer-strawn LM

1993-09-01

293

Scientific Electronic Library Online (English)

Full Text Available In the QSRR discipline, a novel, easy-to-use parameter (Vc) was designed to evaluate classical topological indices (W, ¹χ, Z, MTI) and two new-generation ones (Xu, ¹χh). Regression between Vc and ¹χh gave a correlation coefficient (r) of 0.9992, a surprisingly high value in comparison with those commonly found in the QSPR/QSAR discipline. Through the Vc parameter, an idea for treating multiple regression with three independent variables is presented. A set of 35 saturated hydrocarbons was used as the model.

CORNWELL, E

2006-03-01

294

Directory of Open Access Journals (Sweden)

Full Text Available Aim: The study aimed to determine the factors associated with periodontal disease (different levels of severity) by using different regression models for ordinal data. Design: A cross-sectional design was employed using clinical examination and a "questionnaire with interview" method. Materials and Methods: The study was conducted from June 2008 to October 2008 in Dharwad, Karnataka, India. It involved a systematic random sample of 1760 individuals aged 18-40 years. The periodontal disease examination was conducted by using the Community Periodontal Index for Treatment Needs (CPITN). Statistical Analysis Used: Regression models for ordinal data with different built-in link functions were used to determine the factors associated with periodontal disease. Results: The study findings indicated that the ordinal regression models with four built-in link functions (logit, probit, Clog-log and nlog-log) displayed similar results, with negligible differences in the significant factors associated with periodontal disease. Factors such as religion, caste, sources of drinking water, timings of sweet consumption, timings of cleaning or brushing the teeth and materials used for brushing teeth were significantly associated with periodontal disease in all ordinal models. Conclusions: The ordinal regression model with Clog-log is a better fit for determining the significant factors associated with periodontal disease than models with logit, probit and nlog-log built-in link functions. Factors such as caste and time of sweet consumption are negatively associated with periodontal disease, whereas religion, sources of drinking water, timings of cleaning or brushing the teeth and materials used for brushing teeth are significantly and positively associated with periodontal disease.

Javali Shivalingappa; Pandit Parameshwar

2010-01-01

295

Analyzing temporal trends in health outcomes can provide a more comprehensive picture of the burden of a disease like cancer and generate new insights about the impact of various interventions. In the United States such an analysis is increasingly conducted using joinpoint regression outside a spatial framework, which overlooks the existence of significant variation among U.S. counties and states with regard to the incidence of cancer. This paper presents several innovative ways to account for space in joinpoint regression: (1) prior filtering of noise in the data by binomial kriging and use of the kriging variance as a measure of reliability in weighted least-squares regression, (2) detection of significant boundaries between adjacent counties based on tests of parallelism of time trends and confidence intervals of annual percent change of rates, and (3) creation of spatially compact groups of counties with similar temporal trends through the application of hierarchical cluster analysis to the results of boundary analysis. The approach is illustrated using time series of proportions of prostate cancer late-stage cases diagnosed yearly in every county of Florida since the 1980s. The annual percent change (APC) in late-stage diagnosis and the onset years for significant declines vary greatly across Florida. Most counties with non-significant average APC are located in the north-western part of Florida, known as the Panhandle, which is more rural than other parts of Florida. The number of significant boundaries peaked in the early 1990s when the prostate-specific antigen (PSA) test became widely available, a temporal trend that suggests the existence of geographical disparities in the implementation and/or impact of the new screening procedure, in particular as it became available.

Goovaerts, Pierre

2013-06-01

296

KINETIC ANALYSIS OF HIGH-NITROGEN ENERGETIC MATERIALS USING MULTIVARIATE NONLINEAR REGRESSION

Energy Technology Data Exchange (ETDEWEB)

New high-nitrogen energetic materials were synthesized by Hiskey and Naud. J. Opfermann reported a new tool for finding the probable model of complex reactions using multivariate non-linear regression analysis of DSC and TGA data from several measurements run at different heating rates. The aim of this study is to take the kinetic parameters from the different steps and determine which reaction step is responsible for the runaway reaction by comparing predicted results from the Frank-Kamenetskii equation with the critical temperature found experimentally using the modified Henkin test.

Campbell, M. S. (Mary Stinecipher); Rabie, R. L. (Ronald L.); Diaz-Acosta, I. (Irina); Pulay, P. (Peter)

2001-01-01

297

Analysis of reactor noise by multi-variate auto-regressive model

International Nuclear Information System (INIS)

The multi-variate auto-regressive model has recently been applied to the noise analysis of nuclear reactor systems. From such a standpoint a system identification study was performed at the Japan Power Demonstration Reactor-2 (JPDR-2), 45 MWt, using pseudo-random signals. The aim of this paper is to further extend and refine this identification problem based on the measured data. Emphasis is on the fact that the results obtained by the non-parametric method can be justified by the parametric one. Elucidation of the feedback map is also made by estimating the noise contribution rate. Results of computation show the effectiveness of the procedure. (author)

1981-01-01

298

Analysis of reactor noise by multi-variate auto-regressive model

Energy Technology Data Exchange (ETDEWEB)

The multi-variate auto-regressive model has recently been applied to the noise analysis of nuclear reactor systems. From such a standpoint a system identification study was performed at the Japan Power Demonstration Reactor-2 (JPDR-2), 45 MWt, using pseudo-random signals. The aim of this paper is to further extend and refine this identification problem based on the measured data. Emphasis is on the fact that the results obtained by the non-parametric method can be justified by the parametric one. Elucidation of the feedback map is also made by estimating the noise contribution rate. Results of computation show the effectiveness of the procedure.

Kuroda, Y.; Yokota, K. (Tokai Univ., Tokyo (Japan). Faculty of Engineering)

1981-03-01

299

LINEAR REGRESSION MODEL IN THE ANALYSIS OF THE GROSS DOMESTIC PRODUCT

Directory of Open Access Journals (Sweden)

Full Text Available As we ascertain the evolutionary trend of the global economy, it becomes evident that a strict analysis of the evolution of a single micro- or macro-economic indicator is no longer enough to describe the corresponding phenomenon, as the emphasis shifts towards the analysis of the correlations existing between two or more indicators, which can offer much stronger insight into the economic phenomenon. We propose to use the simple linear regression model, a relatively easy and very effective way to establish the correlation between two economic indicators. Measuring the factor's influence on the indicator will most surely offer additional information on the phenomenon they describe.

Constantin ANGHELACHE; Mario PAGLIACCI

2011-01-01

300

Energy Technology Data Exchange (ETDEWEB)

The experimental data for ammonium exchange by natural Bigadic clinoptilolite were evaluated using nonlinear regression analysis. Three two-parameter isotherm models (Langmuir, Freundlich and Temkin) and three three-parameter isotherm models (Redlich-Peterson, Sips and Khan) were used to analyse the equilibrium data. The fit of the isotherm models was assessed using the sum of normalized errors (SNE) procedure and the coefficient of determination (R²). The HYBRID error function provided the lowest sum of normalized errors, and the Khan model performed best in modeling the equilibrium data. Thermodynamic investigation indicated that ammonium removal by clinoptilolite was favorable at lower temperatures and exothermic in nature.

Gunay, Ahmet [Department of Environmental Engineering, Faculty of Engineering and Architecture, Balikesir University (Turkey)], E-mail: ahmetgunay2@gmail.com

2007-09-30
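
As an illustration of fitting an isotherm by nonlinear regression, the sketch below fits only the Langmuir model, q = q_max·K·c / (1 + K·c), and reports R². The Gauss-Newton solver with step halving is a simple stand-in chosen for this example; the abstract does not state which optimizer or software the study used, and the data here are synthetic values generated from known parameters, not measurements.

```python
import numpy as np

def langmuir(c, q_max, k):
    """Langmuir isotherm: uptake q at equilibrium concentration c."""
    return q_max * k * c / (1.0 + k * c)

def fit_langmuir(c, q, q_max0, k0, max_iter=100, tol=1e-12):
    """Nonlinear least squares by Gauss-Newton with step halving."""
    c, q = np.asarray(c, dtype=float), np.asarray(q, dtype=float)
    theta = np.array([q_max0, k0], dtype=float)
    sse = np.sum((q - langmuir(c, *theta)) ** 2)
    for _ in range(max_iter):
        q_max, k = theta
        denom = 1.0 + k * c
        # Partial derivatives of the model with respect to q_max and k.
        jac = np.column_stack([k * c / denom, q_max * c / denom ** 2])
        step, *_ = np.linalg.lstsq(jac, q - langmuir(c, q_max, k), rcond=None)
        scale = 1.0
        while scale > 1e-8:
            trial = theta + scale * step
            trial_sse = np.sum((q - langmuir(c, *trial)) ** 2)
            if trial_sse <= sse:
                break
            scale *= 0.5          # shrink the step until the fit improves
        else:
            break                 # no improving step found; stop iterating
        theta, sse = trial, trial_sse
        if np.max(np.abs(scale * step)) < tol:
            break
    return theta

def r_squared(q, q_fit):
    """Coefficient of determination."""
    q, q_fit = np.asarray(q, dtype=float), np.asarray(q_fit, dtype=float)
    return 1.0 - np.sum((q - q_fit) ** 2) / np.sum((q - q.mean()) ** 2)

# Synthetic check: data generated from q_max = 20, K = 0.1.
c_demo = np.array([1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0])
q_demo = langmuir(c_demo, 20.0, 0.1)
q_max_hat, k_hat = fit_langmuir(c_demo, q_demo, q_max0=q_demo.max(), k0=0.2)
print(q_max_hat, k_hat, r_squared(q_demo, langmuir(c_demo, q_max_hat, k_hat)))
```

The same pattern extends to the other isotherms by swapping the model function and its derivatives, and other error functions can replace the sum of squared errors as the quantity being minimized.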

301

Estimating the causes of traffic accidents using logistic regression and discriminant analysis.

UK PubMed Central (United Kingdom)

Factors that affect traffic accidents have been analysed in various ways. In this study, we use the methods of logistic regression and discriminant analysis to determine the damages due to injury and non-injury accidents in the Eskisehir Province. Data were obtained from the accident reports of the General Directorate of Security in Eskisehir; 2552 traffic accidents between January and December 2009 were investigated regarding whether they resulted in injury. According to the results, the effects of traffic accidents were reflected in the variables. These results provide a wealth of information that may aid future measures toward the prevention of undesired results.

Karacasu M; Ergül B; Altin Yavuz A

2013-07-01

302

Soil colour and spectral analysis employing linear regression models I. Effect of organic matter

Directory of Open Access Journals (Sweden)

Full Text Available This work comprises an investigation into whether soil reflectance spectral analysis which is employed to calculate the colour characteristics (hue, value, chroma) of soil can be carried out using linear regression models, so that comparison of colour characteristics subsequently becomes possible, and also statistically documented. To this end the colour of soil samples was calculated through spectrum reflectance in the visible region of dry smooth-rubbed soil samples smaller than 250 mm. The colour parameters of the CIE system assessed by analysis of the spectrum reflectance were converted into Munsell colour system characteristics. Regression in accordance with the piecewise linear model was then applied to the spectrum data. The processing indicated that this model is capable of making satisfactory predictions - above all of the value and secondarily of the chroma of the soil samples. Detection of statistically significant differences in the colour characteristics of horizons of the same profile was effected through the application of the nested model. These differences cannot be detected using the tables of the Munsell colour system. Finally, in each region of the spectrum, qualitative analysis of the effect of the organic matter on the soil colour characteristics was performed, demonstrating its active role in determining the readings for value and chroma.

Barouchas P.E.; Moustakas N.K.

2004-01-01

303

International Nuclear Information System (INIS)

Purpose: The goal of this study was to maximize the discrimination between benign and malignant masses in patients with sonographically indeterminate ovarian lesions by means of unenhanced and contrast-enhanced MR imaging, and to develop a computer-assisted diagnosis system. Material and Methods: Findings in precontrast and Gd-DTPA contrast-enhanced MR images of 104 patients with 115 sonographically indeterminate ovarian masses were analyzed, and the results were correlated with histopathological findings. Of 115 lesions, 65 were benign (23 cystadenomas, 13 complex cysts, 11 teratomas, 6 fibrothecomas, 12 others) and 50 were malignant (32 ovarian carcinomas, 7 metastatic tumors of the ovary, 4 carcinomas of the fallopian tubes, 7 others). A logistic regression analysis was performed to discriminate between benign and malignant lesions, and a model of a computer-assisted diagnosis was developed. This model was prospectively tested in 75 cases of ovarian tumors found at other institutions. Results: From the univariate analysis, the following parameters were selected as significant for predicting malignancy (p≤0.05): A solid or cystic mass with a large solid component or wall thickness greater than 3 mm; complex internal architecture; ascites; and bilaterality. Based on these parameters, a model of a computer-assisted diagnosis system was developed with the logistic regression analysis. To distinguish benign from malignant lesions, the maximum cut-off point was obtained between 0.47 and 0.51. In a prospective application of this model, 87% of the lesions were accurately identified as benign or malignant. (orig.)

1997-01-01

304

Energy Technology Data Exchange (ETDEWEB)

Purpose: The goal of this study was to maximize the discrimination between benign and malignant masses in patients with sonographically indeterminate ovarian lesions by means of unenhanced and contrast-enhanced MR imaging, and to develop a computer-assisted diagnosis system. Material and Methods: Findings in precontrast and Gd-DTPA contrast-enhanced MR images of 104 patients with 115 sonographically indeterminate ovarian masses were analyzed, and the results were correlated with histopathological findings. Of 115 lesions, 65 were benign (23 cystadenomas, 13 complex cysts, 11 teratomas, 6 fibrothecomas, 12 others) and 50 were malignant (32 ovarian carcinomas, 7 metastatic tumors of the ovary, 4 carcinomas of the fallopian tubes, 7 others). A logistic regression analysis was performed to discriminate between benign and malignant lesions, and a model of a computer-assisted diagnosis was developed. This model was prospectively tested in 75 cases of ovarian tumors found at other institutions. Results: From the univariate analysis, the following parameters were selected as significant for predicting malignancy (p≤0.05): A solid or cystic mass with a large solid component or wall thickness greater than 3 mm; complex internal architecture; ascites; and bilaterality. Based on these parameters, a model of a computer-assisted diagnosis system was developed with the logistic regression analysis. To distinguish benign from malignant lesions, the maximum cut-off point was obtained between 0.47 and 0.51. In a prospective application of this model, 87% of the lesions were accurately identified as benign or malignant. (orig.).

Yamashita, Y. [Dept. of Radiology, Kumamoto Univ. School of Medicine (Japan); Hatanaka, Y. [Dept. of Radiology, Kumamoto Univ. School of Medicine (Japan); Torashima, M. [Dept. of Radiology, Kumamoto Univ. School of Medicine (Japan); Takahashi, M. [Dept. of Radiology, Kumamoto Univ. School of Medicine (Japan); Miyazaki, K. [Dept. of Obstetrics and Gynecology, Kumamoto Univ. School of Medicine (Japan); Okamura, H. [Dept. of Obstetrics and Gynecology, Kumamoto Univ. School of Medicine (Japan)

1997-07-01

305

Neck-focused panic attacks among Cambodian refugees; a logistic and linear regression analysis.

UK PubMed Central (United Kingdom)

Consecutive Cambodian refugees attending a psychiatric clinic were assessed for the presence and severity of current--i.e., at least one episode in the last month--neck-focused panic. Among the whole sample (N=130), in a logistic regression analysis, the Anxiety Sensitivity Index (ASI; odds ratio=3.70) and the Clinician-Administered PTSD Scale (CAPS; odds ratio=2.61) significantly predicted the presence of current neck panic (NP). Among the neck panic patients (N=60), in the linear regression analysis, NP severity was significantly predicted by NP-associated flashbacks (beta=.42), NP-associated catastrophic cognitions (beta=.22), and CAPS score (beta=.28). Further analysis revealed the effect of the CAPS score to be significantly mediated (Sobel test [Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182]) by both NP-associated flashbacks and catastrophic cognitions. In the care of traumatized Cambodian refugees, NP severity, as well as NP-associated flashbacks and catastrophic cognitions, should be specifically assessed and treated.

Hinton DE; Chhean D; Pich V; Um K; Fama JM; Pollack MH

2006-01-01
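
The Sobel test of mediation cited in the abstract above can be written out in a few lines. The sketch uses the standard first-order form, z = a·b / sqrt(b²·se_a² + a²·se_b²), where a is the path from predictor to mediator and b the path from mediator to outcome. The coefficients and standard errors in the example call are hypothetical, since the abstract does not report them.

```python
from math import erfc, sqrt

def sobel_test(a, se_a, b, se_b):
    """First-order Sobel z statistic and two-sided normal p-value for the indirect effect a*b."""
    z = (a * b) / sqrt(b * b * se_a * se_a + a * a * se_b * se_b)
    p = erfc(abs(z) / sqrt(2.0))   # equals 2 * (1 - Phi(|z|))
    return z, p

# Hypothetical path estimates, for illustration only.
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"z = {z:.3f}, p = {p:.4f}")
```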

306

Neck-focused panic attacks among Cambodian refugees; a logistic and linear regression analysis.

Consecutive Cambodian refugees attending a psychiatric clinic were assessed for the presence and severity of current--i.e., at least one episode in the last month--neck-focused panic. Among the whole sample (N=130), in a logistic regression analysis, the Anxiety Sensitivity Index (ASI; odds ratio=3.70) and the Clinician-Administered PTSD Scale (CAPS; odds ratio=2.61) significantly predicted the presence of current neck panic (NP). Among the neck panic patients (N=60), in the linear regression analysis, NP severity was significantly predicted by NP-associated flashbacks (beta=.42), NP-associated catastrophic cognitions (beta=.22), and CAPS score (beta=.28). Further analysis revealed the effect of the CAPS score to be significantly mediated (Sobel test [Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182]) by both NP-associated flashbacks and catastrophic cognitions. In the care of traumatized Cambodian refugees, NP severity, as well as NP-associated flashbacks and catastrophic cognitions, should be specifically assessed and treated. PMID:16464700

Hinton, Devon E; Chhean, Dara; Pich, Vuth; Um, Khin; Fama, Jeanne M; Pollack, Mark H

2006-01-01

307

International Nuclear Information System (INIS)

The records of three earthquakes which had induced significant earthquake response to the piping system were obtained with the earthquake observation system. In the present paper, first, the eigenvalue analysis results for the natural piping system based on the piping support (boundary) conditions are described and second, the frequency and the damping factor evaluation results for each vibrational mode are described. In the present study, the Auto Regressive (AR) analysis method is used in the evaluation of natural frequencies and damping factors. The AR analysis applied here has a capability of direct evaluation of natural frequencies and damping factors from earthquake records observed on a piping system without any information on the input motions to the system. (orig./HP)

1985-01-01
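
How an AR model yields natural frequencies and damping factors can be sketched for the simplest case. The abstract does not give the AR order or the estimation procedure, so this example assumes a single vibration mode described by an AR(2) model whose coefficients have already been estimated, and assumes the damped frequency lies below the Nyquist frequency. All function names are hypothetical.

```python
import cmath
import math

def modal_parameters_from_ar2(a1, a2, dt):
    """Natural frequency (Hz) and damping ratio from x[n] = a1*x[n-1] + a2*x[n-2] + e[n]."""
    # Poles are the roots of z**2 - a1*z - a2 = 0; take the one with positive imaginary part.
    z = (a1 + cmath.sqrt(a1 * a1 + 4.0 * a2)) / 2.0
    # Map the discrete-time pole to a continuous-time pole s, with z = exp(s * dt).
    s = cmath.log(z) / dt
    omega_n = abs(s)
    return omega_n / (2.0 * math.pi), -s.real / omega_n

def ar2_from_modal(freq_hz, zeta, dt):
    """AR(2) coefficients of an underdamped mode sampled every dt seconds (for checking)."""
    omega_n = 2.0 * math.pi * freq_hz
    omega_d = omega_n * math.sqrt(1.0 - zeta * zeta)
    radius = math.exp(-zeta * omega_n * dt)
    return 2.0 * radius * math.cos(omega_d * dt), -radius * radius

# Round trip: a 2.5 Hz mode with 3% damping, sampled at 100 Hz.
a1, a2 = ar2_from_modal(2.5, 0.03, 0.01)
freq, zeta = modal_parameters_from_ar2(a1, a2, 0.01)
print(freq, zeta)
```

Because the poles depend only on the AR coefficients of the response record, no knowledge of the input motion is needed, which matches the capability the abstract highlights. A multi-mode system would need a higher model order, with one pole pair per mode.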

308

International Nuclear Information System (INIS)

A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structures of 150 organic drug compounds to their n-octanol-water partition coefficients (log Po/w). Molecular descriptors were derived solely from the 3D structures of the drug molecules. A genetic algorithm was applied as a variable selection tool in the QSPR analysis. The models were constructed using 110 molecules as the training set, and their predictive ability was tested using 40 compounds. Modeling of log Po/w of these compounds as a function of the theoretically derived descriptors was established by multiple linear regression (MLR). Four descriptors for these compounds are taken as inputs for the model: molecular volume (MV) (geometrical), hydrophilic-lipophilic balance (HLB) (constitutional), hydrogen bond forming ability (HB) (electronic) and polar surface area (PSA) (electrostatic). The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows for the estimation of log Po/w for molecules not yet synthesized. Application of the developed model to a testing set of 40 organic drug compounds demonstrates that the model is reliable, with good predictive accuracy and a simple formulation. The prediction results are in good agreement with the experimental values. The root mean square error of prediction (RMSEP) and squared correlation coefficient (R2) for the MLR model were 0.22 and 0.99 for the prediction set log Po/w.

2007-12-05

309

Regression analysis of growth responses to water depth in three wetland plant species.

UK PubMed Central (United Kingdom)

BACKGROUND AND AIMS: Plant species composition in wetlands and on lakeshores often shows dramatic zonation, which is frequently ascribed to differences in flooding tolerance. This study compared the growth responses to water depth of three species (Phormium tenax, Carex secta and Typha orientalis) differing in depth preferences in wetlands, using non-linear and quantile regression analyses to establish how flooding tolerance can explain field zonation. METHODOLOGY: Plants were established for 8 months in outdoor cultures in waterlogged soil without standing water, and then randomly allocated to water depths from 0 to 0.5 m. Morphological and growth responses to depth were followed for 54 days before harvest, and then analysed by repeated-measures analysis of covariance, and non-linear and quantile regression analysis (QRA), to compare flooding tolerances. PRINCIPAL RESULTS: Growth responses to depth differed between the three species, and were non-linear. Phormium tenax growth decreased rapidly in standing water >0.25 m depth, C. secta growth increased initially with depth but then decreased at depths >0.30 m, accompanied by increased shoot height and decreased shoot density, and T. orientalis was unaffected by the 0- to 0.50-m depth range. In P. tenax the decrease in growth was associated with a decrease in the number of leaves produced per ramet and in C. secta the effect of water depth was greatest for the tallest shoots. Allocation patterns were unaffected by depth. CONCLUSIONS: The responses are consistent with the principle that zonation in the field is primarily structured by competition in shallow water and by physiological flooding tolerance in deep water. Regression analyses, especially QRA, proved to be powerful tools in distinguishing genuine phenotypic responses to water depth from non-phenotypic variation due to size and developmental differences.

Sorrell BK; Tanner CC; Brix H

2012-01-01

310

Regression analysis of growth responses to water depth in three wetland plant species

DEFF Research Database (Denmark)

Background and aims Plant species composition in wetlands and on lakeshores often shows dramatic zonation which is frequently ascribed to differences in flooding tolerance. This study compared the growth responses to water depth of three species (Phormium tenax, Carex secta, Typha orientalis) differing in depth preferences in wetlands, using non-linear and quantile regression analyses to establish how flooding tolerance can explain field zonation. Methodology Plants were established for 8 months in outdoor cultures in waterlogged soil without standing water, and then randomly allocated to water depths from 0 – 0.5 m. Morphological and growth responses to depth were followed for 54 days before harvest, and then analysed by repeated measures analysis of covariance, and non-linear and quantile regression analysis (QRA), to compare flooding tolerances. Principal results Growth responses to depth differed between the three species, and were non-linear. P. tenax growth rapidly decreased in standing water > 0.25 m depth, C. secta growth increased initially with depth but then decreased at depths > 0.30 m, accompanied by increased shoot height and decreased shoot density, and T. orientalis was unaffected by the 0 – 0.50 m depth range. In P. tenax the decrease in growth was associated with a decrease in the number of leaves produced per ramet and in C. secta the effect of water depth was greatest for the tallest shoots. Allocation patterns were unaffected by depth. Conclusions The responses are consistent with the principle that zonation in the field is primarily structured by competition in shallow water and by physiological flooding tolerance in deep water. Regression analyses, especially QRA, proved to be powerful tools in distinguishing genuine phenotypic responses to water depth from non-phenotypic variation due to size and developmental differences.

Sorrell, Brian K; Tanner, Chris C

2012-01-01

311

International Nuclear Information System (INIS)

The observation of the equipment and piping systems installed in an operating nuclear power plant during earthquakes is very important for evaluating and confirming the adequacy and the safety margin expected at the design stage. By analyzing observed earthquake records, valuable data on the behavior of these systems during earthquakes can be obtained, and information about their aseismatic design parameters can be extracted. From these viewpoints, an earthquake observation system was installed in a reactor building in an operating plant. Up to now, the records of three earthquakes have been obtained with this system. In this paper, an example of the analysis of earthquake records is shown; the main purpose of the analysis was to evaluate the vibration mode, natural frequency and damping factor of this piping system. Prior to the earthquake record analysis, an eigenvalue analysis of this piping system was performed. Auto-regressive analysis was applied to the observed acceleration time history obtained from a piping system installed in an operating BWR. The results of the earthquake record analysis agreed well with the results of the eigenvalue analysis. (Kako, I.)

1986-01-01

312

UK PubMed Central (United Kingdom)

Multivariate distance matrix regression (MDMR) analysis is a statistical technique that allows researchers to relate P variables to an additional M factors collected on N individuals, where P ≫ N. The technique can be applied to a number of research settings involving high-dimensional data types such as DNA sequence data, gene expression microarray data, and imaging data. MDMR analysis involves computing the distance between all pairs of individuals with respect to P variables of interest and constructing an N × N matrix whose elements reflect these distances. Permutation tests can be used to test linear hypotheses that consider whether or not the M additional factors collected on the individuals can explain variation in the observed distances between and among the N individuals as reflected in the matrix. Despite its appeal and utility, properties of the statistics used in MDMR analysis have not been explored in detail. In this paper we consider the level accuracy and power of MDMR analysis assuming different distance measures and analysis settings. We also describe the utility of MDMR analysis in assessing hypotheses about the appropriate number of clusters arising from a cluster analysis.

Zapala MA; Schork NJ

2012-01-01
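
The distance-based test described in the abstract above can be illustrated with a small sketch. It computes a pseudo-F statistic directly from an N × N distance matrix for a single categorical factor and assesses it by permutation. The function names, the toy one-dimensional data, and the restriction to one grouping factor are assumptions made for illustration; the general formulation with M factors used in the paper is not reproduced here.

```python
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """Pseudo-F for one categorical factor, using pairwise distances only."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: the same quantity inside each group.
    ss_within = 0.0
    for label in labels:
        idx = [i for i in range(n) if groups[i] == label]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permutation_test(dist, groups, n_perm=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the factor labels, keep distances fixed
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: six individuals measured on one variable.
values = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
groups = ["a", "a", "a", "b", "b", "b"]
dist = [[abs(x - y) for y in values] for x in values]
f_stat, p_value = permutation_test(dist, groups)
```

With only six individuals the permutation p-value cannot become very small, however large the statistic is; this is one reason level accuracy and power depend on the analysis setting.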

313

Multivariate distance matrix regression (MDMR) analysis is a statistical technique that allows researchers to relate P variables to an additional M factors collected on N individuals, where P ≫ N. The technique can be applied to a number of research settings involving high-dimensional data types such as DNA sequence data, gene expression microarray data, and imaging data. MDMR analysis involves computing the distance between all pairs of individuals with respect to P variables of interest and constructing an N × N matrix whose elements reflect these distances. Permutation tests can be used to test linear hypotheses that consider whether or not the M additional factors collected on the individuals can explain variation in the observed distances between and among the N individuals as reflected in the matrix. Despite its appeal and utility, properties of the statistics used in MDMR analysis have not been explored in detail. In this paper we consider the level accuracy and power of MDMR analysis assuming different distance measures and analysis settings. We also describe the utility of MDMR analysis in assessing hypotheses about the appropriate number of clusters arising from a cluster analysis. PMID:23060897

Zapala, Matthew A; Schork, Nicholas J

2012-09-27

314

A least trimmed square regression method for second level FMRI effective connectivity analysis.

UK PubMed Central (United Kingdom)

We present a least trimmed square (LTS) robust regression method to combine different runs/subjects for second/high level effective connectivity analysis. The basic idea of this method is to treat the extreme nonlinear model variability as outliers if they exceed a certain threshold. A bootstrap method for the LTS estimation is employed to detect model outliers. We compared the LTS robust method with a non-robust method using simulated and real datasets. The difference between LTS and the non-robust method for second level effective connectivity analysis is significant, suggesting the conventional non-robust method is easily affected by the model variability from the first level analysis. In addition, after these outliers are detected and excluded for the high level analysis, the model coefficients of the second level are combined within the framework of a mixed model. The variance of the mixed model is estimated using the Newton-Raphson (NR) type Levenberg-Marquardt algorithm. Three sets of real data are adopted to compare conventional methods which do not include random effects in the analysis with a mixed model for second level effective connectivity analysis. The results show that the conventional method is significantly different from the mixed model when greater model variability exists, suggesting there is a strong random effect, and the mixed model should be employed for the second level effective connectivity analysis.

Li X; Coyle D; Maguire L; McGinnity TM

2013-01-01
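
To make the least trimmed squares idea in the abstract above concrete, here is a minimal sketch for a simple straight-line fit. It is a generic illustration under stated assumptions, not the authors' fMRI pipeline: candidate lines are taken through every pair of points, the one with the smallest sum of the h smallest squared residuals is kept, and an ordinary least squares refit is done on the h best-fitting points. All names and the toy data are hypothetical.

```python
from itertools import combinations


def ols(points):
    """Ordinary least squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def trimmed_loss(points, slope, intercept, h):
    """Sum of the h smallest squared residuals."""
    squared = sorted((y - (slope * x + intercept)) ** 2 for x, y in points)
    return sum(squared[:h])


def lts_line(points, h):
    """Approximate LTS fit: search lines through point pairs, then refit."""
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # vertical line, no finite slope
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = trimmed_loss(points, slope, intercept, h)
        if best is None or loss < best[0]:
            best = (loss, slope, intercept)
    _, slope, intercept = best
    # Keep the h points closest to the best candidate and refit by OLS.
    kept = sorted(points, key=lambda p: (p[1] - (slope * p[0] + intercept)) ** 2)[:h]
    return ols(kept)


# Hypothetical data: y = 2x + 1 with two gross outliers at x = 8 and x = 9.
points = [(float(x), 2.0 * x + 1.0) for x in range(8)] + [(8.0, 60.0), (9.0, -40.0)]
ols_slope, ols_intercept = ols(points)
lts_slope, lts_intercept = lts_line(points, h=8)
```

On this toy set the non-robust slope is pulled far from 2 by the two outliers, while the trimmed fit recovers the underlying line.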

315

Poisson regression analysis of mortality among male workers at a thorium-processing plant

Energy Technology Data Exchange (ETDEWEB)

Analyses of mortality among a cohort of 3119 male workers employed between 1915 and 1973 at a thorium-processing plant were updated to the end of 1982. Of the whole group, 761 men were deceased and 2161 men were still alive, while 197 men were lost to follow-up. A total of 250 deaths was added to the 511 deaths observed in the previous study. The standardized mortality ratio (SMR) for all causes of death was 1.12 with 95% confidence interval (CI) of 1.05-1.21. The SMRs were also significantly increased for all malignant neoplasms (SMR = 1.23, 95% CI = 1.04-1.43) and lung cancer (SMR = 1.36, 95% CI = 1.02-1.78). Poisson regression analysis was employed to evaluate the joint effects of job classification, duration of employment, time since first employment, age and year at first employment on mortality of all malignant neoplasms and lung cancer. A comparison of internal and external analyses with the Poisson regression model was also conducted and showed no obvious difference in fitting the data on lung cancer mortality of the thorium workers. The results of the multivariate analysis showed that there was no significant effect of all the study factors on mortality due to all malignant neoplasms and lung cancer. Therefore, further study is needed for the former thorium workers.

Liu, Zhiyuan; Lee, Tze-San; Kotek, T.J.

1991-12-31
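
The standardized mortality ratio quoted in the abstract above is the ratio of observed to expected deaths. The sketch below shows that arithmetic with an approximate 95% interval on the log scale. The expected count is not given in the abstract; the value used here is a hypothetical figure chosen only so that the ratio comes out near 1.12, and the interval method is a common approximation that may differ from the one the authors used.

```python
import math


def smr_with_ci(observed, expected, z=1.96):
    """SMR = observed / expected, with an approximate log-scale interval.

    Assumes the observed count is Poisson, so the standard error of
    log(SMR) is roughly 1 / sqrt(observed).
    """
    smr = observed / expected
    half_width = z / math.sqrt(observed)
    return smr, smr * math.exp(-half_width), smr * math.exp(half_width)


observed_deaths = 761    # total deaths reported in the abstract
expected_deaths = 679.5  # hypothetical: not reported in the abstract
smr, lower, upper = smr_with_ci(observed_deaths, expected_deaths)
```

An interval that excludes 1, as here, is what the abstract means by a significantly increased SMR.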

316

COPD mortality rates in Andalusia, Spain, 1975-2010: a joinpoint regression analysis.

UK PubMed Central (United Kingdom)

OBJECTIVES: To describe chronic obstructive pulmonary disease (COPD) mortality rates in Andalusia, Spain, between 1975 and 2010 using a joinpoint regression analysis. DESIGN: Mortality figures for Andalusian residents aged >40 years from 1975 to 2010 were obtained from the National Institute of Statistics. Causes of death were classified based on the 8th, 9th and 10th revisions of the International Classification of Diseases. Crude, standardised (SMR) and 40- to 70-year truncated mortality rates were calculated. Trends were analysed using joinpoint regression analysis to identify significant trend changes, and an annual percentage of change (APC) was computed from each trend. RESULTS: Mortality rates showed a downward trend for both sexes. The SMR ranged from 109.9 to 98.0 deaths/100 000 males, and between 35.8 and 12.0 deaths/100 000 females. An increase in the average age at death for men and women with COPD was also observed. Both sexes experienced an increase in SMR in the early 1980s, although female mortality rates began to decline in 1985 (APC -5.8% thereafter), whereas those for males remained high until 1998 (APC -4% thereafter). CONCLUSIONS: COPD mortality remains higher in male than female inhabitants of Andalusia. These rates have decreased following different sex- and age-dependent patterns.

López-Campos JL; Ruiz-Ramos M; Soriano JB

2013-01-01
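
The annual percentage of change mentioned in the abstract above is conventionally derived from a log-linear trend within one segment: if ln(rate) = a + b·year, then APC = (exp(b) - 1) × 100. The sketch below shows that single-segment calculation only; locating the joinpoints themselves is a separate search that is not attempted here. The years and rates are synthetic values constructed to fall exactly 4% per year.

```python
import math


def annual_percent_change(years, rates):
    """APC from an OLS fit of ln(rate) on calendar year."""
    logs = [math.log(r) for r in rates]
    n = len(years)
    mean_year = sum(years) / n
    mean_log = sum(logs) / n
    slope = sum((y - mean_year) * (v - mean_log) for y, v in zip(years, logs)) / sum(
        (y - mean_year) ** 2 for y in years
    )
    return (math.exp(slope) - 1.0) * 100.0


# Synthetic segment: a rate of 100 per 100 000 falling by 4% each year.
years = list(range(1998, 2008))
rates = [100.0 * 0.96 ** (year - 1998) for year in years]
apc = annual_percent_change(years, rates)
```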

317

Statistical learning method in regression analysis of simulated positron spectral data

International Nuclear Information System (INIS)

Positron lifetime spectroscopy is a non-destructive tool for detection of radiation induced defects in nuclear reactor materials. This work concerns the applicability of the support vector machines method for the input data compression in the neural network analysis of positron lifetime spectra. It has been demonstrated that the SVM technique can be successfully applied to regression analysis of positron spectra. A substantial data compression of about 50 % and 8 % of the whole training set with two and three spectral components respectively has been achieved including a high accuracy of the spectra approximation. However, some parameters in the SVM approach such as the insensitivity zone ε and the penalty parameter C have to be chosen carefully to obtain a good performance. (author)

2005-01-01

318

Multivariate Regression Analysis of Panel Data with Binary Outcomes applied to Unemployment Data

UK PubMed Central (United Kingdom)

In panel studies binary outcome measures together with time stationary and time-varying explanatory variables are collected over time on the same individual. Therefore, a regression analysis for this type of data must allow for the correlation among the outcomes of an individual. The multivariate probit model of Ashford and Sowdon (1970) was the first regression model for multivariate binary responses. However, a likelihood analysis of the multivariate probit model for higher dimensions is intractable due to the maximization over high dimensional integrals, thus severely restricting its applicability so far. Czado (1996) developed a Markov Chain Monte Carlo (MCMC) algorithm to overcome this difficulty. In this paper we present an application of this algorithm to unemployment data from the Panel Study of Income Dynamics involving 11 waves of the panel study. In addition we adapt Bayesian model checking techniques based on the posterior predictive distribution (see for example Gelman et a...

Claudia Czado

319

Death rates from lung cancer in men are higher in Andalusia than in other Spanish regions. This study describes lung cancer mortality rates and their trends in Andalusia from 1975 through 2008. Data on lung cancer mortality were obtained from the Death Registry of Andalusia. For each gender, age group-specific and standardized (overall and truncated) rates were calculated by the direct method using the world standard population. Joinpoint regression analysis was used to identify points where a significant change in trends occurred. In men, short-term trends for age-standardized mortality rates (ASMRs) declined significantly from 2004 through 2008 for each age group < 80 years old. In women, the segmented joinpoint analysis showed a decrease from 1975 through 1998 in ASMRs (overall) (-0.6%, P < 0.05), followed by a marked increase (4.6%, P < 0.05). A decrease in male versus female mortality due to lung cancer is evident in Andalusia (Spain). PMID:21678025

Cayuela, Aurelio; Rodríguez-Domínguez, Susana; Jara-Palomares, Luis; Otero-Candelera, Remedios; López-Campos, Jose Luis; Vigil, Eduardo

2011-06-16
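
The direct method of standardization referred to in the abstract above weights each age-specific rate by the share of that age group in a standard population. A minimal sketch of the calculation follows. The three age bands, the death counts, the person-years and the weights are all hypothetical; in particular, the weights are not the actual world standard population.

```python
def directly_standardized_rate(deaths, person_years, standard_weights, per=100_000):
    """Weighted average of age-specific rates, expressed per `per` persons."""
    total_weight = sum(standard_weights)
    rate = 0.0
    for d, py, w in zip(deaths, person_years, standard_weights):
        rate += (d / py) * (w / total_weight)
    return rate * per


# Hypothetical age bands (young, middle, old).
deaths = [10, 50, 200]
person_years = [100_000, 100_000, 100_000]
standard_weights = [50, 30, 20]  # hypothetical standard population shares
asr = directly_standardized_rate(deaths, person_years, standard_weights)
```

Here the age-specific rates are 10, 50 and 200 per 100 000, so the standardized rate is 0.5 × 10 + 0.3 × 50 + 0.2 × 200 = 60 per 100 000.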

320

UK PubMed Central (United Kingdom)

Death rates from lung cancer in men are higher in Andalusia than in other Spanish regions. This study describes lung cancer mortality rates and their trends in Andalusia from 1975 through 2008. Data on lung cancer mortality were obtained from the Death Registry of Andalusia. For each gender, age group-specific and standardized (overall and truncated) rates were calculated by the direct method using the world standard population. Joinpoint regression analysis was used to identify points where a significant change in trends occurred. In men, short-term trends for age-standardized mortality rates (ASMRs) declined significantly from 2004 through 2008 for each age group < 80 years old. In women, the segmented joinpoint analysis showed a decrease from 1975 through 1998 in ASMRs (overall) (-0.6%, P < 0.05), followed by a marked increase (4.6%, P < 0.05). A decrease in male versus female mortality due to lung cancer is evident in Andalusia (Spain).

Cayuela A; Rodríguez-Domínguez S; Jara-Palomares L; Otero-Candelera R; López-Campos JL; Vigil E

2012-09-01

321

Directory of Open Access Journals (Sweden)

Full Text Available In this study, the economic factors affecting the import of forest industry products were investigated. Data used in the time series analysis covered a period of 18 years from 1985 to 2002. A double-log linear function was used to analyze the import model. The imported forest industry products in Turkey were considered to be a function of domestic production value, domestic prices, national income per capita, lagged import value (t-1), exchange rate (TL/$) and export values. The parameters were estimated using regression analysis. The results indicated that imports of forest industry products in Turkey have largely been affected by national income per capita, domestic prices, export values and the exchange rate.

Metin Akay; Orhan Gunduz; Kemal Esengun

2006-01-01

322

An automated event tree analysis system for estimating the probability of short term volcanic activity is presented. The algorithm is driven by a suite of empirical statistical models that are derived through logistic regression. Each model is constructed from a multidisciplinary dataset that was assembled from a collection of historic volcanic unrest episodes. The dataset consists of monitoring measurements (e.g. InSAR, seismic), source modeling results, and historic eruption activity. This provides a simple mechanism for simultaneously accounting for the geophysical changes occurring within the volcano and the historic behavior of analog volcanoes. The algorithm is extensible and can be easily recalibrated to include new or additional monitoring, modeling, or historic information. Standard cross validation techniques are employed to optimize its forecasting capabilities. Analysis results from several recent volcanic unrest episodes are presented.

Junek, W. N.; Jones, W. L.; Woods, M. T.

2011-12-01

323

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background The genetic association analysis using haplotypes as basic genetic units is anticipated to be a powerful strategy towards the discovery of genes predisposing to complex human diseases. In particular, the increasing availability of high-resolution genetic markers such as the single-nucleotide polymorphisms (SNPs) has made haplotype-based association analysis an attractive alternative to single marker analysis. Results We consider haplotype association analysis under the population-based case-control study design. A multinomial logistic model is proposed for haplotype analysis with unphased genotype data, which can be decomposed into a prospective logistic model for disease risk as well as a model for the haplotype-pair distribution in the control population. Environmental factors can be readily incorporated and hence the haplotype-environment interaction can be assessed in the proposed model. The maximum likelihood estimation with unphased genotype data can be conveniently implemented in the proposed model by applying the EM algorithm to a prospective multinomial logistic regression model and ignoring the case-control design. We apply the proposed method to the hypertriglyceridemia study and identify 3 haplotypes in the apolipoprotein A5 gene that are associated with increased risk for hypertriglyceridemia. A haplotype-age interaction effect is also identified. Simulation studies show that the proposed estimator has satisfactory finite-sample performance. Conclusion Our results suggest that the proposed method can serve as a useful alternative to existing methods and a reliable tool for the case-control haplotype-based association analysis.

Chen Yi-Hau; Kao Jau-Tsuen

2006-01-01

324

The Impact of Outliers on Net-Benefit Regression Model in Cost-Effectiveness Analysis.

UK PubMed Central (United Kingdom)

Ordinary least squares (OLS) regression has been widely used to analyze patient-level data in cost-effectiveness analysis (CEA). However, the estimates, inference and decision making in an economic evaluation based on OLS estimation may be biased by the presence of outliers. Robust estimation, in contrast, can remain unaffected and provide results that are resistant to outliers. The objective of this study is to explore the impact of outliers on net-benefit regression (NBR) in CEA using OLS and to propose a potential solution by using robust estimations, i.e. Huber M-estimation, Hampel M-estimation, Tukey's bisquare M-estimation, MM-estimation and least trimmed squares estimation. Simulations under different outlier-generating scenarios and an empirical example were used to obtain the regression estimates of NBR by OLS and five robust estimations. Empirical size and empirical power of both OLS and robust estimations were then compared in the context of hypothesis testing. Simulations showed that the five robust approaches, compared with OLS estimation, led to lower empirical sizes and achieved higher empirical powers in testing cost-effectiveness. Using a real example of antiplatelet therapy, the estimated incremental net-benefit by OLS estimation was lower than those by robust approaches because of outliers in cost data. Robust estimations demonstrated a higher probability of cost-effectiveness compared to OLS estimation. The presence of outliers can bias the results of NBR and its interpretations. The use of robust estimation in NBR is recommended as an appropriate method to avoid such biased decision making.

Wen YW; Tsai YW; Wu DB; Chen PF

2013-01-01
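
Net-benefit regression, as described in the abstract above, first converts each patient's cost and effect into a net benefit NB = λ·E - C for a chosen willingness-to-pay λ, then regresses NB on a treatment indicator; the OLS coefficient on the indicator is the incremental net benefit. The sketch below shows this with plain OLS and how a single cost outlier can change the estimate. All patient data and the value of λ are hypothetical, and the robust estimators compared in the study are not implemented here.

```python
def net_benefits(effects, costs, wtp):
    """Patient-level net benefit: willingness-to-pay times effect, minus cost."""
    return [wtp * e - c for e, c in zip(effects, costs)]


def ols_slope(x, y):
    """Slope of the OLS regression of y on a single regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical trial: three treated (1) and three control (0) patients.
treatment = [1, 1, 1, 0, 0, 0]
effects = [1.0, 1.2, 0.8, 0.5, 0.6, 0.4]
costs = [500.0, 600.0, 400.0, 200.0, 300.0, 100.0]
wtp = 1000.0  # hypothetical willingness-to-pay per unit of effect

inb = ols_slope(treatment, net_benefits(effects, costs, wtp))

# Same data with one extreme cost in the treated arm.
costs_with_outlier = list(costs)
costs_with_outlier[2] = 5400.0
inb_outlier = ols_slope(treatment, net_benefits(effects, costs_with_outlier, wtp))
```

In this toy example a single extreme cost is enough to flip the sign of the OLS incremental net benefit.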

325

UK PubMed Central (United Kingdom)

The major limitation of using existing vegetation indices for crop biomass estimation is that they approach a saturation level asymptotically over a certain range of biomass. In order to resolve this problem, band depth analysis and partial least squares regression (PLSR) were combined to establish a winter wheat biomass estimation model in the present study. The models based on the combination of band depth analysis and PLSR were then compared with models based on common vegetation indices in terms of estimation accuracy. Band depth analysis was conducted in the visible spectral domain (550-750 nm). Band depth, band depth ratio (BDR), normalized band depth index, and band depth normalized to area were utilized to represent band depth information. Among the calibrated estimation models, the models based on the combination of band depth analysis and PLSR reached higher accuracy than those based on the vegetation indices. Among them, the combination of BDR and PLSR achieved the highest accuracy (R² = 0.792, RMSE = 0.164 kg·m⁻²). The results indicated that the combination of band depth analysis and PLSR could overcome the saturation problem and improve the biomass estimation accuracy when winter wheat biomass is large.

Fu YY; Wang JH; Yang GJ; Song XY; Xu XG; Feng HK

2013-05-01

326

Functional MRI studies have revealed changes in default-mode and salience networks in neurodegenerative dementias, especially in Alzheimer's disease (AD). The purpose of this study was to analyze the whole brain cortex resting state networks (RSNs) in patients with behavioral variant frontotemporal dementia (bvFTD) by using resting state functional MRI (rfMRI). The group specific RSNs were identified by high model order independent component analysis (ICA) and a dual regression technique was used to detect between-group differences in the RSNs with p < 0.05 threshold corrected for multiple comparisons. A y-concatenation method was used to correct for multiple comparisons for multiple independent components, gray matter differences as well as the voxel level. We found increased connectivity in several networks within patients with bvFTD compared to the control group. The most prominent enhancement was seen in the right frontotemporal area and insula. A significant increase in functional connectivity was also detected in the left dorsal attention network (DAN), in anterior paracingulate-a default mode sub-network as well as in the anterior parts of the frontal pole. Notably the increased patterns of connectivity were seen in areas around atrophic regions. The present results demonstrate abnormal increased connectivity in several important brain networks including the DAN and default-mode network (DMN) in patients with bvFTD. These changes may be associated with decline in executive functions and attention as well as apathy, which are the major cognitive and neuropsychiatric defects in patients with frontotemporal dementia. PMID:23986673

Rytty, Riikka; Nikkinen, Juha; Paavola, Liisa; Abou Elseoud, Ahmed; Moilanen, Virpi; Visuri, Annina; Tervonen, Osmo; Renton, Alan E; Traynor, Bryan J; Kiviniemi, Vesa; Remes, Anne M

2013-08-26

327

UK PubMed Central (United Kingdom)

Random regression models allow for analysis of longitudinal data, which together with the use of genomic information are expected to increase accuracy of selection, when compared with analyzing average or total production with pedigree information. The objective of this study was to estimate variance components for egg production over time in a commercial brown egg layer population using genomic relationship information. A random regression reduced animal model with a marker-based relationship matrix was used to estimate genomic breeding values of 3,908 genotyped animals from 6 generations. The first 5 generations were used for training, and predictions were validated in generation 6. Daily egg production up to 46 wk in lay was accumulated into 85,462 biweekly (every 2 wk) records for training, of which 17,570 were recorded on genotyped hens and the remaining on their nongenotyped progeny. The effect of adding additional egg production data of 2,167 nongenotyped sibs of selection candidates [16,037 biweekly (every 2 wk) records] to the training data was also investigated. The model included a 5th order Legendre polynomial nested within hatch-week as fixed effects and random terms for coefficients of quadratic polynomials for genetic and permanent environmental components. Residual variance was assumed heterogeneous among 2-wk periods. Models using pedigree and genomic relationships were compared. Estimates of residual variance were very similar under both models, but the model with genomic relationships resulted in a larger estimate of genetic variance. Heritability estimates increased with age up to mid production and decreased afterward, resulting in an average heritability of 0.20 and 0.33 for pedigree and genomic models. Prediction of total egg number was more accurate with the genomic than with the pedigree-based random regression model (correlation in validation 0.26 vs. 0.16). The genomic model outperformed the pedigree model in most of the 2-wk periods. 
Thus, results of this study show that random regression reduced animal models can be used in breeding programs using genomic information and can result in substantial improvements in the accuracy of selection for trajectory traits.

Wolc A; Arango J; Settar P; Fulton JE; O'Sullivan NP; Preisinger R; Fernando R; Garrick DJ; Dekkers JC

2013-06-01
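
The random regression model in the abstract above describes each trajectory through coefficients on Legendre polynomials of age. The sketch below shows how such covariates can be built: age is rescaled to the interval [-1, 1] and the polynomials are evaluated with Bonnet's recursion. The age range of 2 to 46 weeks and the use of unnormalized polynomials are assumptions made for illustration; breeding software often uses a normalized variant.

```python
def standardize_age(age, age_min, age_max):
    """Map an age in [age_min, age_max] onto [-1, 1]."""
    return 2.0 * (age - age_min) / (age_max - age_min) - 1.0


def legendre_basis(t, order):
    """Legendre polynomials P_0(t) .. P_order(t) via Bonnet's recursion:
    (n + 1) P_{n+1}(t) = (2n + 1) t P_n(t) - n P_{n-1}(t).
    """
    values = [1.0, t]
    for n in range(1, order):
        values.append(((2 * n + 1) * t * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]


# Covariates for a record at 24 weeks, assuming records span 2 to 46 weeks.
t_mid = standardize_age(24.0, 2.0, 46.0)
fixed_row = legendre_basis(t_mid, 5)   # 5th order curve, as for the fixed effects
random_row = legendre_basis(t_mid, 2)  # quadratic, as for the random terms
```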

328

Directory of Open Access Journals (Sweden)

Full Text Available Prediction of water consumption structure on the basis of the relationship between water consumption structure and industrial structure is essential to the exploitation and utilization of water resources. Based on the symmetrical logratio transformation and partial least-squares regression, a linear regression model for water consumption structure and industrial structure in Fujian Province is developed in this study. Analysis of the model showed that the compositional data of water consumption structure and industrial structure in Fujian Province had an obvious linear relationship. The model fit the data very well with high accuracy and can be used to predict water consumption structure. Agricultural water use was highly correlated with primary industry, as was industrial water use with secondary industry. Agricultural water use showed a significantly negative correlation with secondary industry and tertiary industry. The variation of domestic water use had an insignificant correlation with industrial structure. The capacity of the industrial structure factors to explain water consumption structure was in the order of primary industry > secondary industry > tertiary industry.

???; ???; ???; ??

2012-01-01

329

An innovative land use regression model incorporating meteorology for exposure analysis.

UK PubMed Central (United Kingdom)

The advent of spatial analysis and geographic information systems (GIS) has led to studies of chronic exposure and health effects based on the rationale that intra-urban variations in ambient air pollution concentrations are as great as inter-urban differences. Such studies typically rely on local spatial covariates (e.g., traffic, land use type) derived from circular areas (buffers) to predict concentrations/exposures at receptor sites, as a means of averaging the annual net effect of meteorological influences (i.e., wind speed, wind direction and insolation). This is the approach taken in the now popular land use regression (LUR) method. However, spatial studies of chronic exposures and temporal studies of acute exposures have not been adequately integrated. This paper presents an innovative LUR method implemented in a GIS environment that reflects both temporal and spatial variability and considers the role of meteorology. The new source area LUR integrates wind speed, wind direction and cloud cover/insolation to estimate hourly nitric oxide (NO) and nitrogen dioxide (NO2) concentrations from land use types (i.e., road network, commercial land use); these concentrations are then used as covariates to regress against NO and NO2 measurements at various receptor sites across the Vancouver region and compared directly with estimates from a regular LUR. The results show that, when variability in seasonal concentration measurements is present, the source area LUR or SA-LUR model is a better option for concentration estimation.

Su JG; Brauer M; Ainslie B; Steyn D; Larson T; Buzzelli M

2008-02-01

330

Directory of Open Access Journals (Sweden)

Full Text Available The complexity of submerged arc welding variables, together with the widespread use of the process in producing sensitive and expensive parts, makes precise control of its adjusting parameters especially important. In general, creating high-quality joints in welding processes requires precise control of three of the many variables: welding current, voltage and speed. In this study, these three variables were therefore taken as the adjusting parameters, and weld bead geometry (bead height, width and penetration) was taken as the criterion for the quality of the weld joints. Equations for estimating the weld bead height, width and penetration from the input parameters were developed by regression analysis and by a neural network. Based on the results, the designed neural network is markedly more accurate than the regression equations, but both models are well suited to optimizing the parameters of submerged arc welding and to predicting the weld bead geometry for a set of input values.

Hossein Towsyfyan; Gholamreza Davoudi; Bahram Heidarian Dehkordy; Ahmad Kariminasab

2013-01-01

331

The Use of Logistic Regression in the Analysis of Data Concerning Good Medical Practice

Directory of Open Access Journals (Sweden)

Full Text Available Logistic regression is one of the most commonly used models of explanatory multivariate analysis in epidemiology. Its use, which has become easier with modern statistical software, allows researchers to control for confounding bias. It estimates the odds ratio, a measure of the association between a given event, represented by a dichotomous variable, and the factors likely to influence it, represented by explanatory variables. The choice of explanatory variables included in the model is based on prior knowledge of the study subject and is aimed at controlling for confounding factors that have already been identified. The authors explain the fundamental principles of logistic regression and the steps involved in its application. By using two examples (the quality of the follow-up care given to diabetics and in-hospital mortality after acute myocardial infarction), they demonstrate the value this statistical tool can have in studies performed by the medical service of the national health care fund, particularly in studies designed to evaluate professional practice.

Aminot I; Damon MN

2002-01-01
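
The link between logistic regression and the odds ratio described in the abstract above can be shown with a small worked case. For a model with one binary explanatory variable, the maximum likelihood coefficients have a closed form, and exp(β1) equals the odds ratio from the 2 × 2 table. The counts below are hypothetical, and the single-covariate model leaves out the adjustment for confounders that motivates the multivariate use.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 x 2 table.

    a: exposed with event,    b: exposed without event
    c: unexposed with event,  d: unexposed without event
    """
    return (a * d) / (b * c)


def logit(p):
    return math.log(p / (1.0 - p))


def single_binary_logistic(a, b, c, d):
    """Closed-form MLE for logit P(event) = beta0 + beta1 * exposed."""
    p_exposed = a / (a + b)
    p_unexposed = c / (c + d)
    beta0 = logit(p_unexposed)
    beta1 = logit(p_exposed) - beta0
    return beta0, beta1


# Hypothetical counts: 30/100 events among exposed, 10/100 among unexposed.
table_or = odds_ratio(30, 70, 10, 90)
beta0, beta1 = single_binary_logistic(30, 70, 10, 90)
model_or = math.exp(beta1)
```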

332

UK PubMed Central (United Kingdom)

The risk of antisocial outcomes in individuals with personality disorder (PD) remains uncertain. The authors synthesize the current evidence on the risks of antisocial behavior, violence, and repeat offending in PD, and they explore sources of heterogeneity in risk estimates through a systematic review and meta-regression analysis of observational studies comparing antisocial outcomes in personality disordered individuals with control groups. Fourteen studies examined risk of antisocial and violent behavior in 10,007 individuals with PD, compared with over 12 million general population controls. There was a substantially increased risk of violent outcomes in studies with all PDs (random-effects pooled odds ratio [OR] = 3.0, 95% CI = 2.6 to 3.5). Meta-regression revealed that antisocial PD and gender were associated with higher risks (p = .01 and .07, respectively). The odds of all antisocial outcomes were also elevated. Twenty-five studies reported the risk of repeat offending in PD compared with other offenders. The risk of a repeat offense was also increased (fixed-effects pooled OR = 2.4, 95% CI = 2.2 to 2.7) in offenders with PD. The authors conclude that although PD is associated with antisocial outcomes and repeat offending, the risk appears to differ by PD category, gender, and whether individuals are offenders or not.

Yu R; Geddes JR; Fazel S

2012-10-01
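The fixed-effects pooled odds ratio reported in the abstract above is commonly obtained by inverse-variance weighting on the log scale. The sketch below shows that general technique only; it is not the authors' code, the function name is hypothetical, and it assumes each study reports an odds ratio with a symmetric 95% interval on the log scale.

```python
import math


def pooled_odds_ratio(odds_ratios, ci_lows, ci_highs, z=1.96):
    """Fixed-effect (inverse-variance) pooling of study-level odds ratios.

    Returns the pooled odds ratio and its approximate 95% interval.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for odds_ratio, low, high in zip(odds_ratios, ci_lows, ci_highs):
        # Recover the standard error of log(OR) from the reported interval.
        standard_error = (math.log(high) - math.log(low)) / (2 * z)
        weight = 1.0 / standard_error ** 2
        weighted_sum += weight * math.log(odds_ratio)
        total_weight += weight
    log_pooled = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(log_pooled),
        math.exp(log_pooled - z * pooled_se),
        math.exp(log_pooled + z * pooled_se),
    )
```

A random-effects estimate, such as the one quoted for violent outcomes, additionally adds a between-study variance term to each study's variance before weighting; that step is omitted here.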

333

International Nuclear Information System (INIS)

Fast neutron induced gamma spectrometry is based on the inelastic scattering and capture of fast neutrons in the nuclei of various elements and the subsequent detection of the characteristic gamma rays emitted. It is a useful technique for online, nondestructive elemental analysis of the composition of various compounds. In this technique, fast neutrons, typically of 14 MeV, are made incident on the sample, and the inelastic and capture gamma rays are collected. The elements present in the sample can be identified through the peaks at their characteristic energies in the collected spectrum, and the peak heights contain information about the abundance of the elements in the sample. Analyzing this gamma spectrum gives the quantitative composition of the sample. A two-step method consisting of spectrum evaluation and calibration is currently used for quantitative abundance analysis. In field applications such as explosive detection and cancer diagnostics, where real-time composition analysis is required, this method is inconvenient and impractical. In this work, a new single-step method based on partial least squares (PLS) regression is proposed. The gamma energy spectra of various compounds are collected and used to calibrate the correlation between peak height and element quantity. Based on this analysis, the unknown composition of any compound having similar elements can be predicted with comparatively higher accuracy. Monte Carlo simulations have been carried out to verify the proposed method and to predict the quantity of various elements present in some unknown compounds. (author)

2011-01-01

334

Characterization of breast masses by dynamic enhanced MR imaging. A logistic regression analysis

International Nuclear Information System (INIS)

To identify features useful for differentiation between malignant and benign breast neoplasms using multivariate analysis of findings by MR imaging. In a retrospective analysis, 61 patients with 64 breast masses underwent MR imaging and the time-signal intensity curves for precontrast dynamic postcontrast images were quantitatively analyzed. Statistical analysis was performed using a logistic regression model, which was prospectively tested in another 34 patients with suspected breast masses. Univariate analysis revealed that the reliable indicators for malignancy were first the appearance of the tumor border, followed by the washout ratio, internal architecture after contrast enhancement, and peak time. The factors significantly associated with malignancy were irregular tumor border, followed by washout ratio, internal architecture, and peak time. For differentiation between benignity and malignancy, the maximum cut-off point was to be found between 0.47 and 0.51. In a prospective application of this model, 91% of the lesions were accurately discriminated as benign or malignant lesions. Combination of contrast-enhanced dynamic and postcontrast-enhanced MR imaging provided accurate data for the diagnosis of malignant neoplasms of the breast. The model had an accuracy of 91% (sensitivity 90%, specificity 93%)

1999-01-01
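The accuracy, sensitivity, and specificity figures quoted in the abstract above follow from thresholding the model's predicted probability and counting agreements with the true diagnosis. The sketch below assumes a cut-off of 0.5 (the abstract places the cut-off between 0.47 and 0.51); the function names are hypothetical and no study data are reproduced.

```python
def classify_malignant(predicted_probability, cutoff=0.5):
    """Label a lesion malignant when the model probability reaches the cut-off."""
    return predicted_probability >= cutoff


def diagnostic_metrics(true_positive, false_negative, true_negative, false_positive):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = true_positive / (true_positive + false_negative)
    specificity = true_negative / (true_negative + false_positive)
    total = true_positive + false_negative + true_negative + false_positive
    accuracy = (true_positive + true_negative) / total
    return sensitivity, specificity, accuracy
```

For illustrative counts of 9 true positives, 1 false negative, 8 true negatives, and 2 false positives, the function returns a sensitivity of 0.9, a specificity of 0.8, and an accuracy of 0.85.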

335

Characterization of breast masses by dynamic enhanced MR imaging. A logistic regression analysis

Energy Technology Data Exchange (ETDEWEB)

Purpose: To identify features useful for differentiation between malignant and benign breast neoplasms using multivariate analysis of findings by MR imaging. Material and Methods: In a retrospective analysis, 61 patients with 64 breast masses underwent MR imaging and the time-signal intensity curves for precontrast dynamic postcontrast images were quantitatively analyzed. Statistical analysis was performed using a logistic regression model, which was prospectively tested in another 34 patients with suspected breast masses. Results: Univariate analysis revealed that the reliable indicators for malignancy were first the appearance of the tumor border, followed by the washout ratio, internal architecture after contrast enhancement, and peak time. The factors significantly associated with malignancy were irregular tumor border, followed by washout ratio, internal architecture, and peak time. For differentiation between benignity and malignancy, the maximum cut-off point was to be found between 0.47 and 0.51. In a prospective application of this model, 91% of the lesions were accurately discriminated as benign or malignant lesions. Conclusion: Combination of contrast-enhanced dynamic and postcontrast-enhanced MR imaging provided accurate data for the diagnosis of malignant neoplasms of the breast. The model had an accuracy of 91% (sensitivity 90%, specificity 93%). (orig.)

Ikeda, O.; Morishita, S.; Kido, T.; Kitajima, M. [Dept. of Radiology, Kumamoto Rousai Hospital, Yatsushiro City (Japan); Yamashita, Y.; Takahashi, M. [Dept. of Radiology, Kumamoto Univ. School of Medicine, Kumamoto (Japan); Okamura, K. [Dept. of Surgery, Kumamoto Rousai Hospital, Yatsushiro City (Japan); Fukuda, S. [Dept. of Pathology, Kumamoto Rousai Hospital, Yatsushiro City (Japan)

1999-11-01

336

UK PubMed Central (United Kingdom)

We have considered a Bayesian approach for the nonlinear regression model by replacing the normal distribution on the error term by some skewed distributions, which account for both skewness and heavy tails or skewness alone. The type of data considered in this paper concerns repeated measurements taken in time on a set of individuals. Such multiple observations on the same individual generally produce serially correlated outcomes. Thus, additionally, our model does allow for a correlation between observations made from the same individual. We have illustrated the procedure using a data set to study the growth curves of a clinic measurement of a group of pregnant women from an obstetrics clinic in Santiago, Chile. Parameter estimation and prediction were carried out using appropriate posterior simulation schemes based on Markov Chain Monte Carlo methods. Besides the deviance information criterion (DIC) and the conditional predictive ordinate (CPO), we suggest the use of proper scoring rules based on the posterior predictive distribution for comparing models. For our data set, all these criteria chose the skew-t model as the best model for the errors. These DIC and CPO criteria are also validated, for the model proposed here, through a simulation study. As a conclusion of this study, the DIC criterion is not trustworthy for this kind of complex model.

De la Cruz R; Branco MD

2009-08-01

337

We have considered a Bayesian approach for the nonlinear regression model by replacing the normal distribution on the error term by some skewed distributions, which account for both skewness and heavy tails or skewness alone. The type of data considered in this paper concerns repeated measurements taken in time on a set of individuals. Such multiple observations on the same individual generally produce serially correlated outcomes. Thus, additionally, our model does allow for a correlation between observations made from the same individual. We have illustrated the procedure using a data set to study the growth curves of a clinic measurement of a group of pregnant women from an obstetrics clinic in Santiago, Chile. Parameter estimation and prediction were carried out using appropriate posterior simulation schemes based on Markov Chain Monte Carlo methods. Besides the deviance information criterion (DIC) and the conditional predictive ordinate (CPO), we suggest the use of proper scoring rules based on the posterior predictive distribution for comparing models. For our data set, all these criteria chose the skew-t model as the best model for the errors. These DIC and CPO criteria are also validated, for the model proposed here, through a simulation study. As a conclusion of this study, the DIC criterion is not trustworthy for this kind of complex model. PMID:19629998

De la Cruz, Rolando; Branco, Márcia D

2009-08-01

338

Two quantitative structure-activity relationships (QSAR) models for predicting 95 compounds inhibiting Acyl-coenzyme A: cholesterol acyltransferase2 (ACAT2) were developed. The whole data set was randomly split into a training set including 72 compounds and a test set including 23 compounds. The molecules were represented by 11 descriptors calculated by software ADRIANA.Code. Then the inhibitory activity of ACAT2 inhibitors was predicted using multilinear regression (MLR) analysis and support vector machine (SVM) method, respectively. The correlation coefficients of the models for the test sets were 0.90 for MLR model, and 0.91 for SVM model. Y-randomization was employed to ensure the robustness of the SVM model. The atom charge and electronegativity related descriptors were important for the interaction between the inhibitors and ACAT2. PMID:23711921

Zhong, Min; Xuan, Shouyi; Wang, Ling; Hou, Xiaoli; Wang, Maolin; Yan, Aixia; Dai, Bin

2013-05-09

339

UK PubMed Central (United Kingdom)

Two quantitative structure-activity relationships (QSAR) models for predicting 95 compounds inhibiting Acyl-coenzyme A: cholesterol acyltransferase2 (ACAT2) were developed. The whole data set was randomly split into a training set including 72 compounds and a test set including 23 compounds. The molecules were represented by 11 descriptors calculated by software ADRIANA.Code. Then the inhibitory activity of ACAT2 inhibitors was predicted using multilinear regression (MLR) analysis and support vector machine (SVM) method, respectively. The correlation coefficients of the models for the test sets were 0.90 for MLR model, and 0.91 for SVM model. Y-randomization was employed to ensure the robustness of the SVM model. The atom charge and electronegativity related descriptors were important for the interaction between the inhibitors and ACAT2.

Zhong M; Xuan S; Wang L; Hou X; Wang M; Yan A; Dai B

2013-07-01

340

As changes in the medical environment and policies on national health insurance coverage have triggered tremendous impacts on the business performance and financial management of medical institutions, effective management becomes increasingly crucial for hospitals to enhance competitiveness and to strive for sustainable development. The study accordingly aims at evaluating hospital operational efficiency for better resource allocation and cost effectiveness. Several data envelopment analysis (DEA)-based models were first compared, and the DEA-artificial neural network (ANN) model was identified as more capable than the DEA and DEA-assurance region (AR) models of measuring operational efficiency and recognizing the best-performing hospital. The classification and regression tree (CART) efficiency model was then utilized to extract rules for improving resource allocation of medical institutions. PMID:20878210

Chuang, Chun-Ling; Chang, Peng-Chan; Lin, Rong-Ho

2010-09-28

341

UK PubMed Central (United Kingdom)

As changes in the medical environment and policies on national health insurance coverage have triggered tremendous impacts on the business performance and financial management of medical institutions, effective management becomes increasingly crucial for hospitals to enhance competitiveness and to strive for sustainable development. The study accordingly aims at evaluating hospital operational efficiency for better resource allocation and cost effectiveness. Several data envelopment analysis (DEA)-based models were first compared, and the DEA-artificial neural network (ANN) model was identified as more capable than the DEA and DEA-assurance region (AR) models of measuring operational efficiency and recognizing the best-performing hospital. The classification and regression tree (CART) efficiency model was then utilized to extract rules for improving resource allocation of medical institutions.

Chuang CL; Chang PC; Lin RH

2011-10-01

342

Focusing on the socio-geographical factors that influence local vulnerability to dengue at the village level, spatial regression methods were applied to analyse, over a 5-year period, the village-specific, cumulative incidence of all reported dengue cases among 437 villages in Prachuap Khiri Khan, a semi-urban province of Thailand. The K-order nearest neighbour method was used to define the range of neighbourhoods. Analysis showed a significant neighbourhood effect (? = 0.405, P geographical proximity shared a similar level of vulnerability to dengue. The two independent social factors, associated with a higher incidence of dengue, were a shorter distance to the nearest urban area (? = -0.133, P <0.05) and a smaller average family size (? = -0.102, P <0.05). These results indicate that the trend of increasing dengue occurrence in rural Thailand arose in areas under stronger urban influence rather than in remote rural areas. PMID:21590669

Tipayamongkholgul, Mathuros; Lisakulruk, Sunisa

2011-05-01

343

International Nuclear Information System (INIS)

Measurements of excited and backscattered fluorescence radiation intensity were applied for ash content determination in coal samples. An Si(Li) detector and low energy X- and gamma ray sources 55Fe, 109Cd, 238Pu, 241Am were used. The measurement facility, consisting of an argon filled proportional counter and a 238Pu radiation source, was tested and compared with other radioanalytical methods for ash content determination. The evaluation of results was based on the Snedecor F test and the analysis of the root mean square of estimate. The best results were obtained when the 55Fe source was used. In the multivariate linear regression, the SiK?, CaK? and backscattered radiation intensities were selected as the independent variables best related to ash content in coal. (author)

1988-01-01

344

Unraveling plant-animal diversity relationships: a meta-regression analysis.

In the face of unprecedented loss of biodiversity, cross-taxon correlates have been proposed as a means of obtaining quantitative estimates of biodiversity for identifying habitats of important conservation value. Habitat type, animal trophic level, and the spatial extent of studies would be expected to influence the strength of such correlations. We investigated these effects by carrying out a meta-analysis of 320 case studies of correlations between plant and animal species richnesses. The diversity of arthropods, herps, birds, and mammals significantly increased with plant diversity regardless of species habitat. However, correlations were stronger when plant and animal species richnesses were compared between habitats (gamma diversity) than within single habitats (alpha diversity). For arthropods, both the coefficient of correlation and the slope of the regression line were also greater for primary than for secondary consumers. These findings substantiate the use of plant species richness as an indicator of the diversity of animal taxa over space. PMID:23094383

Castagneyrol, Bastien; Jactel, Hervé

2012-09-01

345

Unraveling plant-animal diversity relationships: a meta-regression analysis.

UK PubMed Central (United Kingdom)

In the face of unprecedented loss of biodiversity, cross-taxon correlates have been proposed as a means of obtaining quantitative estimates of biodiversity for identifying habitats of important conservation value. Habitat type, animal trophic level, and the spatial extent of studies would be expected to influence the strength of such correlations. We investigated these effects by carrying out a meta-analysis of 320 case studies of correlations between plant and animal species richnesses. The diversity of arthropods, herps, birds, and mammals significantly increased with plant diversity regardless of species habitat. However, correlations were stronger when plant and animal species richnesses were compared between habitats (gamma diversity) than within single habitats (alpha diversity). For arthropods, both the coefficient of correlation and the slope of the regression line were also greater for primary than for secondary consumers. These findings substantiate the use of plant species richness as an indicator of the diversity of animal taxa over space.

Castagneyrol B; Jactel H

2012-09-01

346

Energy Technology Data Exchange (ETDEWEB)

The monitoring of detailed 3-dimensional (3D) reactor core power distribution is a prerequisite in the operation of nuclear power reactors to ensure that various safety limits imposed on the LPD and DNBR are not violated during nuclear power reactor operation. The LPD and DNBR should be calculated in order to perform the two major functions of the core protection calculator system (CPCS) and the core operation limit supervisory system (COLSS). The LPD at the hottest part of a hot fuel rod, which is related to the power peaking factor (PPF, F_q), is more important than the LPD at any other position in a reactor core. The LPD needs to be estimated accurately to prevent nuclear fuel rods from melting. In this study, support vector regression (SVR) and uncertainty analysis have been applied to estimation of reactor core power peaking factor.

Bae, In Ho; Naa, Man Gyun [Chosun Univ., Gwangju (Korea, Republic of); Lee, Yoon Joon [Cheju National Univ., Jeju-do (Korea, Republic of); Park, Goon Cherl [Seoul National Univ., Seoul (Korea, Republic of)

2009-05-15

347

International Nuclear Information System (INIS)

The monitoring of detailed 3-dimensional (3D) reactor core power distribution is a prerequisite in the operation of nuclear power reactors to ensure that various safety limits imposed on the LPD and DNBR are not violated during nuclear power reactor operation. The LPD and DNBR should be calculated in order to perform the two major functions of the core protection calculator system (CPCS) and the core operation limit supervisory system (COLSS). The LPD at the hottest part of a hot fuel rod, which is related to the power peaking factor (PPF, F_q), is more important than the LPD at any other position in a reactor core. The LPD needs to be estimated accurately to prevent nuclear fuel rods from melting. In this study, support vector regression (SVR) and uncertainty analysis have been applied to estimation of reactor core power peaking factor.

2009-01-01

348

Research of NiMH Battery Modeling and Simulation Based on Linear Regression Analysis Method

Directory of Open Access Journals (Sweden)

Battery State-of-Charge estimation is one of the core issues in the development of battery management systems for electric vehicles, and accurate State-of-Charge estimation requires a more accurate battery model. Accurate battery modeling and simulation were therefore studied here. A Thevenin equivalent circuit model of the NiMH battery was established to address the poor accuracy of the traditional model. Based on data from hybrid pulse cycling tests of a 6 V, 6 Ah NiMH battery, the Thevenin model parameters were identified by linear regression analysis. The equivalent circuit simulation model of the battery was then built in the MATLAB/Simulink environment. The simulation and experimental results showed that the model has better accuracy and can be used to guide battery State-of-Charge estimation.

Chang-hao Piao; Qing-yong Qin; Yong-sheng Zhang; Qian Zhang

2013-01-01
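As a rough illustration of identifying equivalent-circuit parameters by linear regression, the sketch below fits the simplified static relation V = OCV - R0 * I by ordinary least squares. This is an assumption-laden simplification: it ignores the polarization (RC) branch of a full Thevenin model, treats discharge current as positive, and uses a hypothetical function name. It is not the identification procedure of the cited paper, whose details the abstract does not give.

```python
def fit_ocv_and_r0(currents, voltages):
    """Least-squares fit of V = OCV - R0 * I.

    Returns (open_circuit_voltage, ohmic_resistance).
    Assumes discharge current is positive and at least two distinct currents.
    """
    n = len(currents)
    mean_i = sum(currents) / n
    mean_v = sum(voltages) / n
    sxx = sum((i - mean_i) ** 2 for i in currents)
    sxy = sum((i - mean_i) * (v - mean_v) for i, v in zip(currents, voltages))
    slope = sxy / sxx                      # dV/dI, equal to -R0 in this model
    intercept = mean_v - slope * mean_i    # voltage at zero current, i.e. OCV
    return intercept, -slope
```

Synthetic points generated from an open-circuit voltage of 6.0 V and a resistance of 0.05 ohm are recovered by the fit up to rounding. Real pulse-test data would also need the time-dependent polarization voltage to be modeled.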

349

Directory of Open Access Journals (Sweden)

This paper uses data envelopment analysis to investigate the extent to which universities in the United States have undergone productivity and efficiency changes, partly due to managerial performance, during the 2005-09 academic years. Using panel data for 133 research and doctoral universities, the focus is on the primary drivers of U.S. publicly controlled higher education. DEA efficiency and returns to scale estimates are provided. In addition, university total factor productivity changes via the Malmquist index are decomposed into component parts. Results suggest that U.S. universities experienced average productivity regress. On an annual basis, this regress was present prior to the global financial crisis. However, productivity gains appeared in concert with the crisis. Managerial efficiency tended to hamper productivity gains but, on the positive side, showed slight improvements over time. Decreasing returns to scale prevailed, but from a policy perspective a return to economy-wide growth may automatically correct some overproduction.

G. Thomas Sav

2012-01-01

350

Ordinal Logistic Regression for the Estimate of the Response Functions in the Conjoint Analysis

Directory of Open Access Journals (Sweden)

In the Conjoint Analysis (COA) model proposed here, a new approach that estimates more than one response function and extends the traditional COA, the polytomous response variable (i.e. the evaluation of the overall desirability of alternative product profiles) is described by a sequence of binary variables. To link the categories of overall evaluation to the factor levels, we adopt, at the aggregate level, an ordinal logistic regression based on a main-effects experimental design. The model provides several overall desirability functions (aggregated part-worth sets), as many as there are ordered overall categories, unlike the traditional metric and non-metric COA, which gives only one response function. We provide an application of the model and an interpretation of the main effects.

Amedeo De Luca

2011-01-01

351

Directory of Open Access Journals (Sweden)

In this paper we give a comparative analysis of the performance of face recognition based on a feed-forward neural network and on a generalized regression neural network. We use a different number of inner epochs for different input patterns according to their difficulty of recognition. We run our system for different numbers of training patterns and test its performance in terms of recognition rate and training time. We run our algorithm for a face recognition application using Principal Component Analysis (PCA) and both neural networks. PCA is used for feature extraction and the neural network is used as a classifier to identify the faces. We use the ORL database for all the experiments.

Amit Kumar; Mr. Mahesh Singh

2012-01-01

352

Hybrid fuzzy regression with trapezoidal fuzzy data

This research deals with a method for hybrid fuzzy least-squares regression. The extension of symmetric triangular fuzzy coefficients to asymmetric trapezoidal fuzzy coefficients is considered an effective measure for removing unnecessary fuzziness from the linear fuzzy model. First, a trapezoidal fuzzy variable is applied to derive a bivariate regression model. Next, normal equations are formulated to solve for the four parts of the hybrid regression coefficients. The model is also extended to multiple regression analysis. Finally, the method is compared with Y.-H. O. Chang's model.

Razzaghnia, T.; Danesh, S.; Maleki, A.

2011-12-01

353

Analysis of Dynamic Multiplicity Fluctuations at PHOBOS

This paper presents the analysis of the dynamic fluctuations in the inclusive charged particle multiplicity measured by PHOBOS for Au+Au collisions at sqrt(s_NN) = 200 GeV within the pseudo-rapidity range of -3

Chai, Z; Baker, M D; Ballintijn, M; Barton, D S; Betts, R R; Bickley, A A; Bindel, R; Budzanowski, A; Busza, W; Carroll, A; Chai, Z; Decowski, M P; García, E; George, N; Gulbrandsen, K H; Gushue, S; Halliwell, C; Hamblen, J; Heintzelman, G A; Henderson, C; Hofman, D J; Hollis, R S; Holynski, R; Holzman, B; Iordanova, A; Johnson, E; Kane, J L; Katzy, J; Khan, N; Kucewicz, W; Kulinich, P; Kuo, C M; Lin, W T; Manly, S; McLeod, D; Mignerey, A C; Nouicer, R; Olszewski, A; Pak, R; Park, I C; Pernegger, H; Reed, C; Remsberg, L P; Reuter, M; Rolan, C; Roland, G; Rosenberg, L J; Sagerer, J; Sarin, P; Sawicki, P; Skulski, W; Steinberg, P; Stephans, G S F; Sukhanov, A; Tang, J L; Trzupek, A; Vale, C; van Nieuwenhuizen, G J; Verdier, R; Wolfs, F L H; Wosiek, B; Wozniak, K; Wuosmaa, A H; Wyslouch, B; Chai, Zhengwei

2005-01-01

354

Poisson regression analysis of the mortality among a cohort of World War II nuclear industry workers

International Nuclear Information System (INIS)

A historical cohort mortality study was conducted among 28,008 white male employees who had worked for at least 1 month in Oak Ridge, Tennessee, during World War II. The workers were employed at two plants that were producing enriched uranium and a research and development laboratory. Vital status was ascertained through 1980 for 98.1% of the cohort members and death certificates were obtained for 96.8% of the 11,671 decedents. A modified version of the traditional standardized mortality ratio (SMR) analysis was used to compare the cause-specific mortality experience of the World War II workers with the U.S. white male population. An SMR and a trend statistic were computed for each cause-of-death category for the 30-year interval from 1950 to 1980. The SMR for all causes was 1.11, and there was a significant upward trend of 0.74% per year. The excess mortality was primarily due to lung cancer and diseases of the respiratory system. Poisson regression methods were used to evaluate the influence of duration of employment, facility of employment, socioeconomic status, birth year, period of follow-up, and radiation exposure on cause-specific mortality. Maximum likelihood estimates of the parameters in a main-effects model were obtained to describe the joint effects of these six factors on cause-specific mortality of the World War II workers. We show that these multivariate regression techniques provide a useful extension of conventional SMR analysis and illustrate their effective use in a large occupational cohort study

1990-01-01
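The standardized mortality ratio (SMR) used in the abstract above is the ratio of observed deaths to the deaths expected if the cohort had experienced reference-population rates. The sketch below shows the conventional calculation over strata such as age group and calendar period; the function names and the numbers in the usage note are hypothetical, and the "modified version" of the SMR analysis that the authors mention is not reproduced.

```python
def expected_deaths(person_years_by_stratum, reference_rates_by_stratum):
    """Expected deaths: cohort person-years times reference death rates, summed over strata."""
    return sum(
        person_years * rate
        for person_years, rate in zip(person_years_by_stratum, reference_rates_by_stratum)
    )


def standardized_mortality_ratio(observed_deaths, person_years_by_stratum, reference_rates_by_stratum):
    """SMR = observed deaths / expected deaths."""
    return observed_deaths / expected_deaths(person_years_by_stratum, reference_rates_by_stratum)
```

For example, 55 observed deaths against strata of 1000 and 2000 person-years with reference rates of 0.01 and 0.02 deaths per person-year give 50 expected deaths and an SMR of 1.1. Poisson regression extends this by modeling the observed counts with the logarithm of the expected deaths (or person-years) as an offset, so that several factors can be assessed jointly.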

355

Poisson regression analysis of the mortality among a cohort of World War II nuclear industry workers

Energy Technology Data Exchange (ETDEWEB)

A historical cohort mortality study was conducted among 28,008 white male employees who had worked for at least 1 month in Oak Ridge, Tennessee, during World War II. The workers were employed at two plants that were producing enriched uranium and a research and development laboratory. Vital status was ascertained through 1980 for 98.1% of the cohort members and death certificates were obtained for 96.8% of the 11,671 decedents. A modified version of the traditional standardized mortality ratio (SMR) analysis was used to compare the cause-specific mortality experience of the World War II workers with the U.S. white male population. An SMR and a trend statistic were computed for each cause-of-death category for the 30-year interval from 1950 to 1980. The SMR for all causes was 1.11, and there was a significant upward trend of 0.74% per year. The excess mortality was primarily due to lung cancer and diseases of the respiratory system. Poisson regression methods were used to evaluate the influence of duration of employment, facility of employment, socioeconomic status, birth year, period of follow-up, and radiation exposure on cause-specific mortality. Maximum likelihood estimates of the parameters in a main-effects model were obtained to describe the joint effects of these six factors on cause-specific mortality of the World War II workers. We show that these multivariate regression techniques provide a useful extension of conventional SMR analysis and illustrate their effective use in a large occupational cohort study.

Frome, E.L.; Cragle, D.L.; McLain, R.W. (Oak Ridge National Laboratory, TN (USA))

1990-08-01

356

A multivariate regression analysis of panniculectomy outcomes: does plastic surgery training matter?

UK PubMed Central (United Kingdom)

BACKGROUND: Panniculectomy can improve quality of life in morbidly obese patients, but its functional benefits are counterbalanced by relatively high complication rates. The authors endeavored to determine the impact of plastic surgery training on panniculectomy outcomes. METHODS: A retrospective review was performed of the prospectively maintained American College of Surgeons National Surgical Quality Improvement Program database for all patients undergoing panniculectomy from 2006 to 2010. Patient demographic details, surgeon specialty training, and 30-day outcomes were assessed. RESULTS: A total of 954 panniculectomies meeting inclusion criteria were identified. Plastic surgeons performed 694 (72.7 percent) of the procedures, and 260 (27.3 percent) were performed by nonplastic surgeons. Nonplastic surgeons had significantly higher rates of overall complications (23.08 percent versus 8.65 percent; p < 0.001) and wound infections (12.69 percent versus 5.33 percent; p < 0.001) than plastic surgeons. Average operative time for plastic surgeons was significantly longer than that for nonplastic surgeons (3.00 ± 1.48 hours versus 1.88 ± 0.93 hours; p < 0.001). Risk-adjusted multivariate regression showed that undergoing a panniculectomy by a nonplastic surgeon was a significant predictor of overall postoperative complications (odds ratio, 2.09; 95 percent CI, 1.35 to 3.23) and wound infection (odds ratio, 1.73; 95 percent CI, 1.004 to 2.98). Subgroup analysis of propensity-matched samples supported this finding. CONCLUSION: Multivariate regression analysis of National Surgical Quality Improvement Program data showed that panniculectomy performed by plastic surgeons results in lower rates of overall postoperative complications compared with that performed by nonplastic surgeons.

Mioton LM; Buck DW 2nd; Gart MS; Hanwright PJ; Wang E; Kim JY

2013-04-01

357

The Analysis of Internet Addiction Scale Using Multivariate Adaptive Regression Splines

Directory of Open Access Journals (Sweden)

Background: Determining the real effects on internet dependency requires an unbiased and robust statistical method. MARS is a new non-parametric method used in the literature for parameter estimation in cause-and-effect research. MARS can both obtain legible model curves and make unbiased parametric predictions. Methods: To examine the performance of MARS, its findings are compared with those of Classification and Regression Tree (C&RT) analysis, which is considered in the literature to be efficient in revealing correlations between variables. The data set for the study is taken from "The Internet Addiction Scale" (IAS), which attempts to reveal the addiction levels of individuals. The study population consists of 754 secondary school students (301 female, 443 male students with 10 missing data). The MARS 2.0 trial version was used for the MARS analysis, and the C&RT analysis was done in SPSS. Results: MARS obtained six basis functions for the model. The regression equation of the model was obtained from these six functions. MARS showed that average daily Internet-use time, the purpose of Internet use, the students' grade, and the mothers' occupations had a significant effect on the predicted variable (P < 0.05). In this comparative study, MARS obtained different findings from C&RT in dependency level prediction. Conclusion: This study observed that MARS revealed the extent to which a variable considered significant changes the character of the model.

M Kayri

2010-01-01

358

UK PubMed Central (United Kingdom)

This paper gives optimal algorithms for determining real-valued univariate unimodal regressions, that is, for determining the optimal regression which is increasing and then decreasing. Such regressions arise in a wide variety of applications. They are shape-constrained nonparametric regressions, closely related to isotonic regression. For unimodal regression on n weighted points our algorithm for the L2 metric requires only Θ(n) time, while for the L1 metric it requires Θ(n log n) time. For unweighted points our algorithm for the L1 metric also requires only Θ(n) time. Previous algorithms were for the L2 metric and required) time. All previous algorithms used multiple calls to isotonic regression, and our major contribution is to organize these into a prefix isotonic regression, determining the regression on all initial segments. The prefix approach reduces the total time required by utilizing the solution for one initial segment to solve the next.

Quentin F. Stout
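
The prefix idea in the abstract above can be illustrated for the L2 case with a short sketch. This is not the paper's implementation: the function names are hypothetical, weights are assumed positive, and only the fitting error and the split point are returned. A pool-adjacent-violators pass keeps a stack of blocks, so the error of the best nondecreasing fit to every initial segment is available after one sweep; running the same sweep on the reversed data gives the nonincreasing fits to every final segment, and the best split minimises the sum of the two.

```python
def _sse(block):
    # block = [weight, weighted sum, weighted sum of squares]
    return block[2] - block[1] * block[1] / block[0]


def prefix_isotonic_errors(y, w):
    """errors[k] = weighted squared error of the best nondecreasing fit to y[:k]."""
    blocks = []          # stack of pooled blocks for the current prefix
    total = 0.0          # sum of squared errors over all blocks on the stack
    errors = [0.0]
    for yi, wi in zip(y, w):
        blk = [wi, wi * yi, wi * yi * yi]
        # Pool with earlier blocks while their mean exceeds the new block's mean.
        while blocks and blocks[-1][1] / blocks[-1][0] > blk[1] / blk[0]:
            prev = blocks.pop()
            total -= _sse(prev)
            blk = [prev[0] + blk[0], prev[1] + blk[1], prev[2] + blk[2]]
        blocks.append(blk)
        total += _sse(blk)
        errors.append(total)
    return errors


def unimodal_fit_error(y, w=None):
    """Return (k, error): increasing fit on y[:k], decreasing fit on y[k:]."""
    y = list(y)
    w = [1.0] * len(y) if w is None else list(w)
    n = len(y)
    inc = prefix_isotonic_errors(y, w)
    # A nonincreasing fit to a suffix is a nondecreasing fit to the reversed data.
    dec = prefix_isotonic_errors(y[::-1], w[::-1])
    k = min(range(n + 1), key=lambda i: inc[i] + dec[n - i])
    return k, inc[k] + dec[n - k]
```

For example, `unimodal_fit_error([1, 3, 2, 5, 4, 1])` pools the pair 3, 2 into 2.5 and leaves the rest untouched. Each point is pushed once and popped at most once, so each sweep is linear in the number of points once the data are ordered by the predictor.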

359

Using multiple tracers for 13C metabolic flux analysis.

UK PubMed Central (United Kingdom)

(13)C-Metabolic flux analysis ((13)C-MFA) is a powerful technique for quantifying intracellular metabolic fluxes in living cells. These in vivo fluxes provide important information on the physiology of cells in culture, which can be used for metabolic engineering purposes and serve as inputs for systems biology modeling. The (13)C-MFA technique consists of several steps: (1) selecting appropriate tracers for a given system of interest, (2) performing isotopic labeling experiments, (3) measuring isotopic labeling distributions in metabolic products, (4) estimating metabolic fluxes using least-squares regression, and (5) evaluating the goodness of fit and computing confidence intervals for estimated fluxes. In this chapter, we provide guidelines for performing (13)C-MFA studies using multiple isotopic tracers, a technique that is especially useful for elucidating fluxes in complex biological systems where multiple carbon sources are present. Here, as an example, we describe key steps and decision points for designing (13)C-MFA studies for microbes grown on mixtures of glucose and xylose. The general concepts described in this chapter are applicable to many other biological systems. For example, the same procedures can be applied to design (13)C-MFA studies in mammalian cells, which are generally grown in complex media containing multiple substrates such as glucose and amino acids.

Antoniewicz MR

2013-01-01

360

UK PubMed Central (United Kingdom)

BACKGROUND: This study aimed to develop artificial neural network (ANN) and multivariable logistic regression (LR) analyses for prediction modeling of cardiovascular autonomic (CA) dysfunction in the general population, and to compare the prediction models obtained with the two approaches. METHODS AND MATERIALS: We analyzed a previous dataset based on a Chinese population sample consisting of 2,092 individuals aged 30-80 years. The prediction models were derived from an exploratory set using ANN and LR analysis, and were tested in the validation set. The performances of these prediction models were then compared. RESULTS: Univariate analysis indicated that 14 risk factors showed a statistically significant association with the prevalence of CA dysfunction (P<0.05). The mean area under the receiver-operating curve was 0.758 (95% CI 0.724-0.793) for LR and 0.762 (95% CI 0.732-0.793) for ANN analysis, but a noninferiority result was found (P<0.001). Similar results were found in comparisons of sensitivity, specificity, and predictive values between the LR and ANN prediction models. CONCLUSION: Prediction models for CA dysfunction were developed using ANN and LR. ANN and LR are two effective tools for developing prediction models based on our dataset.

Tang ZH; Liu J; Zeng F; Li Z; Yu X; Zhou L

2013-01-01
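
The area under the receiver-operating curve reported above can be computed without drawing the curve, as the proportion of (case, non-case) pairs in which the case receives the higher predicted score, with ties counted as one half. The following is a minimal sketch of that pairwise definition; the function name and the toy scores are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (labels are 0 or 1)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # case ranked above non-case
            elif p == q:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks from two models on the same four subjects.
labels = [1, 1, 0, 0]
auc_model_a = auc([0.9, 0.2, 0.8, 0.1], labels)
auc_model_b = auc([0.9, 0.8, 0.3, 0.1], labels)
```

The double loop is quadratic in the sample size, which is acceptable for a few thousand subjects; a rank-based version would be preferable for larger data.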

361

Data Minning Application into Potential Voters Trends in Usa Elections with Regression Analysis

Directory of Open Access Journals (Sweden)

Full Text Available Background: Data mining techniques are very useful for bringing out hidden information that can help solve a particular problem. Objective: The aim of this paper is to provide a basic model which relates potential voters in USA elections to periods of registration. Method: SPSS (Statistical Package for Social Sciences) was the chosen software and was used to perform the analysis with data mining techniques. The raw data from 1932 to 2010 were refined, and twenty years of the data were selected for the analysis. Using linear regression analysis, a mathematical model was obtained that relates voter registration to two-year periods. Result: Based on this model, a relationship was found between potential voters or participants and years of registration. Conclusion: Based on the findings, the voting trend in USA elections depends on the population of voters, and the year or period also plays a significant role, because as the year increases the population also increases.

Olagunju; Mukaila; Tomori; Adekola Rasheed

2012-01-01

362

In this study, genetic parameters for test-day milk, fat, and protein yield were estimated for the first lactation. The data analyzed consisted of 1,433 first lactations of Murrah buffaloes, daughters of 113 sires from 12 herds in the state of São Paulo, Brazil, with calvings from 1985 to 2007. Ten monthly classes of lactation days were considered for the test-day yields. The (co)variance components for the 3 traits were estimated using regression analyses by Bayesian inference, applying an animal model by Gibbs sampling. The contemporary groups were defined as herd-year-month of the test day. In the model, the random effects were additive genetic, permanent environment, and residual. The fixed effects were contemporary group and number of milkings (1 or 2), the linear and quadratic effects of the covariable age of the buffalo at calving, as well as the mean lactation curve of the population, which was modeled by orthogonal Legendre polynomials of fourth order. The random effects for the traits studied were modeled by Legendre polynomials of third and fourth order for additive genetic and permanent environment, respectively; the residual variances were modeled considering 4 residual classes. The heritability estimates for the traits were moderate (from 0.21 to 0.38), with higher estimates in the intermediate lactation phase. The genetic correlation estimates within and among the traits varied from 0.05 to 0.99. The results indicate that selection for any trait test day will result in an indirect genetic gain for milk, fat, and protein yield in all periods of the lactation curve. The accuracy associated with estimated breeding values obtained using multi-trait random regression was slightly higher (around 8%) compared with single-trait random regression. This difference may be due to the greater amount of information available per animal. PMID:23831097

Borquis, Rusbel Raul Aspilcueta; Neto, Francisco Ribeiro de Araujo; Baldi, Fernando; Hurtado-Lugo, Naudin; de Camargo, Gregório M F; Muñoz-Berrocal, Milthon; Tonhati, Humberto

2013-07-05
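
The Legendre covariates used in random regression test-day models such as the one above can be generated with the standard three-term recurrence. The sketch below rests on several assumptions that the abstract does not state: the time variable is taken to be the monthly test-day class (1 to 10), "fourth order" is read as polynomial degree 4, and the polynomials are scaled by sqrt((2k + 1) / 2), a normalisation commonly used in this literature. The function names are hypothetical.

```python
import math


def standardize(t, t_min, t_max):
    """Map a time point in [t_min, t_max] onto [-1, 1]."""
    return -1.0 + 2.0 * (t - t_min) / (t_max - t_min)


def legendre_basis(x, degree):
    """Normalised Legendre polynomials of degree 0..degree evaluated at x."""
    p = [1.0, x]
    for k in range(1, degree):
        # Recurrence: (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        p.append(((2 * k + 1) * x * p[k] - k * p[k - 1]) / (k + 1))
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(degree + 1)]


# One row of covariates per monthly test-day class (assumed classes 1..10).
design = [legendre_basis(standardize(m, 1, 10), 4) for m in range(1, 11)]
```

Each row of `design` holds the five covariates for one test-day class; multiplying a row by an animal's vector of random regression coefficients gives that animal's deviation from the mean lactation curve at that point.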

363

UK PubMed Central (United Kingdom)

STUDY DESIGN: Retrospective analysis of prospective registry data. OBJECTIVE: To determine the patient characteristics, risk factors, and fracture patterns associated with vertebral artery injury (VAI) in patients with blunt cervical spine injury. SUMMARY OF BACKGROUND DATA: VAI associated with cervical spine trauma has the potential for catastrophic clinical sequelae. The patterns of cervical spine injury and patient characteristics associated with VAI remain to be determined. METHODS: A retrospective review of prospectively collected data from the American College of Surgeons trauma registries at 3 level-1 trauma centers identified all patients with a cervical spine injury on multidetector computed tomographic scan during a 3-year period (January 1, 2007, to January 1, 2010). Fracture pattern and patient characteristics were recorded. Logistic multivariate regression analysis of independent predictors for VAI and subgroup analysis of neurological events related to VAI were performed. RESULTS: Twenty-one percent of 1204 patients with cervical injuries (n = 253) underwent screening for VAI by multidetector computed tomography angiogram. VAI was diagnosed in 17% (42 of 253), unilateral in 15% (38 of 253), and bilateral in 1.6% (4 of 253) and was associated with a lower Glasgow coma scale (P < 0.001), a higher injury severity score (P < 0.01), and a higher mortality (P < 0.001). VAI was associated with ankylosing spondylitis/diffuse idiopathic skeletal hyperostosis (crude odds ratio [OR] = 8.04; 95% confidence interval [CI], 1.30-49.68; P = 0.034) and occipitocervical dissociation (P < 0.001) by univariate analysis, and with fracture displacement into the transverse foramen of 1 mm or more (adjusted OR = 3.29; 95% CI, 1.15-9.41; P = 0.026) and basilar skull fracture (adjusted OR = 4.25; 95% CI, 1.25-14.47; P = 0.021) by the multivariate regression model. Neurological events secondary to VAI occurred in 14% (6 of 42), and the stroke-related mortality rate was 4.8% (2 of 42). Neurological events were associated with male sex (P = 0.024), facet subluxation/dislocation (crude OR = 9.00; 95% CI, 1.51-53.74; P = 0.004) and the diagnosis of ankylosing spondylitis/diffuse idiopathic skeletal hyperostosis (OR = 40.67; 95% CI, 5.27-313.96; P < 0.001). CONCLUSION: VAI associated with blunt cervical spine injury is a marker for more severely injured patients. High-risk patients with basilar skull fractures, occipitocervical dissociation, fracture displacement into the transverse foramen of more than 1 mm, ankylosing spondylitis/diffuse idiopathic skeletal hyperostosis, and facet subluxation/dislocation deserve focused consideration for VAI screening.

Lebl DR; Bono CM; Velmahos G; Metkar U; Nguyen J; Harris MB

2013-07-01

364

The influence of network structure on the emergence of collective dynamical behavior is an important research topic that is not yet fully understood. In the current work, it is shown how statistical regression analysis can be used to address this issue. The proposed regression model suggests that the average shortest path length is the network property that most influences the degree of synchronization of Kuramoto oscillators. Moreover, the model proved to be very accurate, with the predicted and measured values of synchronization being highly correlated. Therefore, regression modeling allows the values of the dynamic variable to be predicted in terms of network structure.

de Arruda, Guilherme F.; Dal'Maso Peron, Thomas Kauê; de Andrade, Marinho Gomes; Achcar, Jorge Alberto; Rodrigues, Francisco Aparecido

2013-08-01

365

Energy Technology Data Exchange (ETDEWEB)

Mixed dissociation constants of four drug acids, i.e. silychristin, silybin, silydianin and mycophenolate, at various ionic strengths I in the range 0.01 to 0.30 and at temperatures of 25 and 37 °C were determined using the SQUAD(84) regression analysis program applied to pH-spectrophotometric titration data. The proposed strategy of efficient experimentation in a protonation constants determination, followed by a computational strategy for the chemical model with a protonation constants determination, is presented on the protonation equilibria of silychristin. The thermodynamic dissociation constant $pK_{a}^{T}$ was estimated by non-linear regression of {$pK_{a}$, I} data at 25 and 37 °C: for silychristin $pK_{a,1}^{T}$ = 6.52(16) and 6.62(1), $pK_{a,2}^{T}$ = 7.22(13) and 7.41(5), $pK_{a,3}^{T}$ = 8.96(9) and 8.94(9), $pK_{a,4}^{T}$ = 10.17(7) and 10.03(8), $pK_{a,5}^{T}$ = 11.89(4) and 11.63(7); for silybin $pK_{a,1}^{T}$ = 7.00(4) and 6.86(5), $pK_{a,2}^{T}$ = 8.77(11) and 8.77(3), $pK_{a,3}^{T}$ = 9.57(8) and 9.62(1), $pK_{a,4}^{T}$ = 11.66(3) and 11.38(1); for silydianin $pK_{a,1}^{T}$ = 6.64(7) and 7.10(6), $pK_{a,2}^{T}$ = 7.78(5) and 8.93(1), $pK_{a,3}^{T}$ = 9.66(9) and 10.06(11), $pK_{a,4}^{T}$ = 10.71(7) and 10.77(7), $pK_{a,5}^{T}$ = 12.26(5) and 12.14(5); for mycophenolate $pK_{a}^{T}$ = 8.32(1) and 8.14(1). Goodness-of-fit tests for various regression diagnostics enabled the reliability of parameter estimates to be found.

Meloun, Milan; Burkonova, Dominika; Syrovy, Tomas; Vrana, Ales

2003-06-11

366

International Nuclear Information System (INIS)

Mixed dissociation constants of four drug acids, i.e. silychristin, silybin, silydianin and mycophenolate, at various ionic strengths I in the range 0.01 to 0.30 and at temperatures of 25 and 37 °C were determined using the SQUAD(84) regression analysis program applied to pH-spectrophotometric titration data. The proposed strategy of efficient experimentation in a protonation constants determination, followed by a computational strategy for the chemical model with a protonation constants determination, is presented on the protonation equilibria of silychristin. The thermodynamic dissociation constant $pK_{a}^{T}$ was estimated by non-linear regression of {$pK_{a}$, I} data at 25 and 37 °C: for silychristin $pK_{a,1}^{T}$ = 6.52(16) and 6.62(1), $pK_{a,2}^{T}$ = 7.22(13) and 7.41(5), $pK_{a,3}^{T}$ = 8.96(9) and 8.94(9), $pK_{a,4}^{T}$ = 10.17(7) and 10.03(8), $pK_{a,5}^{T}$ = 11.89(4) and 11.63(7); for silybin $pK_{a,1}^{T}$ = 7.00(4) and 6.86(5), $pK_{a,2}^{T}$ = 8.77(11) and 8.77(3), $pK_{a,3}^{T}$ = 9.57(8) and 9.62(1), $pK_{a,4}^{T}$ = 11.66(3) and 11.38(1); for silydianin $pK_{a,1}^{T}$ = 6.64(7) and 7.10(6), $pK_{a,2}^{T}$ = 7.78(5) and 8.93(1), $pK_{a,3}^{T}$ = 9.66(9) and 10.06(11), $pK_{a,4}^{T}$ = 10.71(7) and 10.77(7), $pK_{a,5}^{T}$ = 12.26(5) and 12.14(5); for mycophenolate $pK_{a}^{T}$ = 8.32(1) and 8.14(1). Goodness-of-fit tests for various regression diagnostics enabled the reliability of parameter estimates to be found.

2003-06-11

367

Directory of Open Access Journals (Sweden)

Full Text Available Aim: A study was undertaken to develop a forecasting model for predicting bluetongue outbreaks in the North-west agroclimatic zone of Tamil Nadu, India. Materials and Methods: Eleven bluetongue outbreaks were characterised by active and passive surveillance over a period of twelve years and used in this study. Meteorological data comprising maximum and minimum temperatures, relative humidity, rainfall and wind speed were collected and used as the predictor variables in a multiple linear regression model. Results: A multiple linear regression model was developed for the North-west zone of Tamil Nadu. Values of the dependent variable less than one indicated a remote chance of bluetongue outbreaks, and values greater than one indicated a greater chance. Monthly mean maximum and minimum temperatures, relative humidity at 8.30 h and at 17.00 h IST, wind speed, and monthly total rainfall of 29.1 - 31.0°C, 20.1 - 22.0°C, 80.1 - 85.0%, 65.1 - 70.0%, 3.1 - 5.0 km/h and < 200 mm respectively were identified as the ideal climatic conditions for increased numbers of bluetongue outbreaks in this zone. Conclusion: Based on the values obtained from the prediction model, stakeholders can be warned in time through the media to institute suitable prophylactic measures against bluetongue and so avoid economic losses due to the disease. [Vet World 2013; 6(6): 321-324]

G. Selvaraju; A. Balasubramaniam; D. Rajendran; D. Kannan; M. Geetha

2013-01-01

368

Directory of Open Access Journals (Sweden)

Full Text Available Objective: To explore the relationships between traditional Chinese medicine (TCM) constitutional types and overweight or obesity so as to provide evidence for adjusting constitutional bias and for preventing and treating obesity. Methods: The data come from a cross-sectional survey on TCM constitution of 18 805 subjects aged above 18 in Beijing and 8 provinces (Jiangsu, Anhui, Gansu, Qinghai, Fujian, Jilin, Jiangxi and Henan) in China. The survey of TCM constitution was performed with the standardized Constitution in Chinese Medicine Questionnaire (CCMQ). Discriminant analysis was used to judge each individual’s constitutional type (gentleness type, qi-deficiency type, yang-deficiency type, yin-deficiency type, phlegm-dampness type, dampness-heat type, blood-stasis type, qi-depression type and special diathesis type). The relationships between TCM constitution types and overweight or obesity were investigated by logistic regression analysis. Results: Compared with the gentleness type, the risk of overweight (OR, 2.05; 95% CI, 1.79-2.35) and obesity (OR, 4.34; 95% CI, 3.52-5.36) in the phlegm-dampness type is significantly increased; the risk of obesity (OR, 1.60; 95% CI, 1.30-1.98) in the qi-deficiency type is significantly higher; the risk of overweight and obesity in the yang-deficiency type, blood-stasis type, and qi-depression type is significantly lower. Conclusion: Phlegm-dampness type and qi-deficiency type are the main constitutional risk factors for overweight or obesity.

Yan-bo Zhu; Qi Wang

2010-01-01
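
Odds ratios from a logistic regression, such as those reported above, are the exponentiated regression coefficients, and a Wald 95% confidence interval is the exponentiated coefficient plus or minus 1.96 standard errors. The sketch below shows the conversion in both directions. The function names are hypothetical, and recovering a standard error from a published interval assumes the interval is a Wald interval, which the abstract does not state.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a logistic coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_or_ci(odds_ratio, lower, upper, z=1.96):
    """Back out the coefficient and its standard error from an OR and its CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Overweight in the phlegm-dampness type: OR 2.05 (95% CI 1.79-2.35).
beta, se = coef_from_or_ci(2.05, 1.79, 2.35)
or_back, lo_back, hi_back = odds_ratio_ci(beta, se)
```

Round-tripping the reported interval reproduces it to within rounding, which is consistent with an interval that is symmetric on the log-odds scale.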

369

Inhibition deficits in individuals with intellectual disability: a meta-regression analysis.

UK PubMed Central (United Kingdom)

BACKGROUND: Individuals with intellectual disabilities (ID) are characterised by inhibition deficits; however, the magnitude of these deficits is still subject to debate. This meta-analytic study therefore has two aims: first to assess the magnitude of inhibition deficits in ID, and second to investigate inhibition type, age, IQ and the presence/absence of comorbid problems as potential moderators of effect sizes. METHOD: Twenty-eight effect sizes comparing ID and age-matched normal controls on inhibition tasks were included in a random effects meta-regression. Moderators were age, IQ, inhibition type and presence/absence of comorbid disorder. RESULTS: The analysis showed a medium to large inhibition deficit in ID. Inhibition type significantly moderated effect size, whereas age and comorbid disorder did not. IQ significantly moderated effect size, indicating increasing effect size with decreasing IQ, but only in studies that included a sample of ID participants with mean IQ > 70. The analysis indicated comparable deficits in behavioural inhibition and interference control, but no significant deficits in cognitive inhibition and motivational inhibition. CONCLUSIONS: These results indicate that ID is characterised by a medium to large inhibition deficit. ID seems not to be characterised by deficits in cognitive and motivational inhibition, which might indicate that distinct processes underlie distinct inhibition capacities.

Bexkens A; Ruzzano L; Collot D' Escury-Koenigs AM; Van der Molen MW; Huizenga HM

2013-07-01

370

Cigarette Smoking Habits among Men and Women in Turkey: A Meta Regression Analysis

Directory of Open Access Journals (Sweden)

Full Text Available Smoking has become more prevalent in Turkey than in western countries during the past decade. This study was conducted to make parameter estimations on gender-related smoking habits with the minimum of variance. Of the ninety-two studies related to smoking habits conducted from 1981 to 2003 in Turkey, 60 were deemed appropriate for the application of meta-analysis and meta-regression analysis. The proportions of men and women smoking cigarettes were 0.51 and 0.35, respectively. The proportion of men smoking cigarettes in 1996 and the years before it was 0.52, and that of women was 0.35. However, the figures for the years following 1996 were 0.41 for men and 0.32 for women. In the results of the DerSimonian and Laird random effects model, the odds ratio, which shows the tendency of men to smoke compared to women, was found to be 1.894 for the period 1981-2003. A heterogeneous distribution between the studies was apparent (Q=1560.91, P<0.001), as it was for the Tau-square test (x2=0.55, z=6.29, P<0.001). We propose that effective precautions should be considered, especially with regard to the introduction of laws to minimize the smoking habit for both sexes, with particular attention to women.

F Sahin Mutlu; U Ayranci; K Ozdamar

2006-01-01
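
The DerSimonian and Laird random effects model named above can be sketched in a few lines. The sketch takes per-study effects (for example log odds ratios) and their within-study variances, computes Cochran's Q under the fixed-effect model, derives the method-of-moments estimate of the between-study variance tau-square, and re-weights the studies. The function name and the inputs in the usage line are hypothetical; no data from the study are reproduced here.

```python
import math


def dersimonian_laird(effects, variances):
    """Return (pooled effect, standard error, tau-square, Q) for k >= 2 studies."""
    k = len(effects)
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)               # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2, q


# Two hypothetical studies with log odds ratios 0 and 2, each with variance 0.5.
pooled, se, tau2, q = dersimonian_laird([0.0, 2.0], [0.5, 0.5])
```

When the effects are log odds ratios, `math.exp(pooled)` gives the pooled odds ratio. A Q value far above its degrees of freedom (k - 1), as in the abstract, yields a positive tau-square and pulls the study weights closer to equal.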

371

UK PubMed Central (United Kingdom)

PROBLEM: Successful pregnancy is the result of multiple genetic and non-genetic factors. Associations of various SNPs described in this study have not revealed any conclusive results. We have analyzed 47 SNPs using statistical tools like multidimensional regression, classification regression tree, and logistic regression. METHOD OF STUDY: Two hundred women with at least three consecutive unexplained spontaneous abortions before the 20th week of gestation and 300 control women without any history of recurrent miscarriages (RM) were genotyped using PCR, RFLP and sequencing. RESULTS: Our results revealed that Leptin 2549 C/A (rs7799039) and TNF-α 238 (rs361525) may play an important role in the maintenance of pregnancy. TNF-α 238 may act as a protective SNP and Leptin 2549 C/A as a susceptible marker among women with RM. CONCLUSIONS: The present study demonstrated an association of Leptin 2549 C/A (rs7799039) and TNF-α (rs361525) gene polymorphisms with RM.

Parveen F; Agrawal S

2013-07-01

372

DEFF Research Database (Denmark)

The estimation of technical efficiency comprises a vast literature in the field of applied production economics. There are two predominant approaches: the non-parametric and non-stochastic Data Envelopment Analysis (DEA) and the parametric Stochastic Frontier Analysis (SFA). The DEA is criticised because it cannot account for statistical noise such as random production shocks and measurement errors, which are inherent in more or less all production data sets. In contrast, the SFA is criticised because it requires the specification of a functional form, which involves the risk of specifying an unsuitable functional form and thus of model misspecification and biased parameter estimates. Given these problems of the DEA and the SFA, Fan, Li and Weersink (1996) proposed a semi-parametric stochastic frontier model that estimates the production function (frontier) by non-parametric regression based on kernel estimators. This approach combines the virtues of the DEA and the SFA while avoiding their drawbacks: it avoids the specification of a functional form and at the same time accounts for statistical noise. More recently, this approach was used by Henderson and Simar (2005), Kumbhakar et al. (2007), and Henningsen and Kumbhakar (2009). The aim of this paper and its main contribution to the existing literature is the estimation of semi-parametric stochastic frontier models using a different non-parametric estimation technique: spline regression (Ma et al. 2011). We apply this approach to the Polish dairy sector and use a panel data set of Polish dairy farms from the years 2004-2010. The Polish dairy sector has changed considerably since the integration of Poland into the European Union: the number of dairy producers decreased by one third and the average herd size increased from 3.8 to 5.7 cows per farm within the period 2004-2010. It is expected that farms with small herds (fewer than 30 dairy cows) will quit and that the number of large farms (with more than 100 dairy cows) will increase. Therefore, a thorough empirical study of the technical efficiency and scale efficiency of Polish dairy farms contributes to the insight into this dynamic process. Furthermore, we compare and evaluate the results of this spline-based semi-parametric stochastic frontier model with results of other semi-parametric stochastic frontier models and of traditional parametric stochastic frontier models. References: Fan, Y., Li, Q., Weersink, A. (1996), Semiparametric Estimation of Stochastic Production Frontier Models, Journal of Business and Economic Statistics. Henderson, D. J., Simar, L. (2005), A Fully Nonparametric Stochastic Frontier Model for Panel Data, University of New York. Henningsen, A., Kumbhakar, S. C. (2009), Semiparametric Stochastic Frontier Analysis: An Application to Polish Farms During Transition, Paper presented at the (EWEPA) in Pisa, Italy. Kumbhakar, S. C., Park, B. U., Simar, L., Tsionas, E. G. (2007), Nonparametric Stochastic Frontiers: A Local Maximum Likelihood Approach, Journal of Econometrics. Ma, S., Racine, J. S., Yang, L. (2011), Spline regression in the presence of categorical predictors, Working Paper.

Czekaj, Tomasz Gerard; Henningsen, Arne

373

Energy Technology Data Exchange (ETDEWEB)

The ChemCam instrument on the Mars Science Laboratory (MSL) will include a laser-induced breakdown spectrometer (LIBS) to quantify major and minor elemental compositions. The traditional analytical chemistry approach to calibration curves for these data regresses a single diagnostic peak area against concentration for each element. This approach contrasts with a new multivariate method in which elemental concentrations are predicted by step-wise multiple regression analysis based on areas of a specific set of diagnostic peaks for each element. The method is tested on LIBS data from igneous and metamorphosed rocks. Between 4 and 13 partial regression coefficients are needed to describe each elemental abundance accurately (i.e., with a regression line of R² > 0.9995 for the relationship between predicted and measured elemental concentration) for all major and minor elements studied. Validation plots suggest that the method is limited at present by the small data set, and will work best for prediction of concentration when a wide variety of compositions and rock types has been analyzed.

Clegg, Samuel M [Los Alamos National Laboratory; Barefield, James E [Los Alamos National Laboratory; Wiens, Roger C [Los Alamos National Laboratory; Dyar, Melinda D [MT HOLYOKE COLLEGE; Schafer, Martha W [LSU; Tucker, Jonathan M [MT HOLYOKE COLLEGE

2008-01-01

374

Partial least squares (PLS) regression and its application to coal analysis

Directory of Open Access Journals (Sweden)

Full Text Available Instrumental methods of chemical analysis make use of the relationships between the signal obtained and a property of the system under study (generally a concentration). Advances in electronics and computing have enabled rapid progress in data acquisition, transmission and processing. The application of various mathematical methods to the calculation of concentrations and other properties from instrumental data is known as chemometrics, an area of intense activity owing to its wide applications in the chemical and process industries and in environmental studies. One of the most widely used methods in chemometrics is partial least squares (PLS). This method, which is closely related to principal components regression (PCR), has theoretical and computational advantages that have led to a great number of applications. Tens of thousands of references can be found on the Internet for linear PLS alone, and Internet sites referring to PLS number in the hundreds of thousands. This article explains the fundamentals of the method and shows an application to the prediction of coal properties from mid-infrared data, with the aim of developing fast, non-destructive methods of analysis for these materials.

Carlos E Alciaturi; Marcos E Escobar; Carlos De La Cruz; Carlos Rincón

2003-01-01

375

Digital Repository Infrastructure Vision for European Research (DRIVER)

Background and Aim: The purpose of this study was to assess the accuracy of the bootstrap method in logistic regression and to explore the methods use in logistic regression models in cases where the sample size is insufficient. Materials and Methods: We use data from 150 patients who had undergone ...

M Baniasadi; GH.R BaBaie; H Zeraati; F Memari

376

Detecting outliers in fuzzy regression analysis with asymmetric trapezoidal fuzzy data

The existence of outliers in a set of experimental data can cause incorrect interpretation of fuzzy linear regression results. This paper introduces some limitations on the constraints of fuzzy linear regression models for determining fuzzy parameters in the presence of outliers with trapezoidal fuzzy data.

Maleki, A.; Pasha, E.; Yari, Gh.; Razzaghnia, T.

2011-12-01

377

This article considers the problem of estimating dynamic linear regression models when the data are generated from a finite mixture probability density function where the mixture components are characterized by different dynamic regression model parameters. Specifically, conventional linear models assume that the data are generated by a single…

Kaplan, David

2005-01-01

378

Joint analysis of multiple metagenomic samples.

UK PubMed Central (United Kingdom)

The availability of metagenomic sequencing data, generated by sequencing DNA pooled from multiple microbes living jointly, has increased sharply in the last few years with developments in sequencing technology. Characterizing the contents of metagenomic samples is a challenging task, which has been extensively attempted by both supervised and unsupervised techniques, each with its own limitations. Common to practically all the methods is the processing of single samples only; when multiple samples are sequenced, each is analyzed separately and the results are combined. In this paper we propose to perform a combined analysis of a set of samples in order to obtain a better characterization of each of the samples, and provide two applications of this principle. First, we use an unsupervised probabilistic mixture model to infer hidden components shared across metagenomic samples. We incorporate the model in a novel framework for studying association of microbial sequence elements with phenotypes, analogous to the genome-wide association studies performed on human genomes: We demonstrate that stratification may result in false discoveries of such associations, and that the components inferred by the model can be used to correct for this stratification. Second, we propose a novel read clustering (also termed "binning") algorithm which operates on multiple samples simultaneously, leveraging on the assumption that the different samples contain the same microbial species, possibly in different proportions. We show that integrating information across multiple samples yields more precise binning on each of the samples. Moreover, for both applications we demonstrate that given a fixed depth of coverage, the average per-sample performance generally increases with the number of sequenced samples as long as the per-sample coverage is high enough.

Baran Y; Halperin E

2012-01-01

379

UK PubMed Central (United Kingdom)

OBJECTIVE: The objectives of the study are to evaluate which clinicopathologic factors influenced overall survival (OS) in endometrial carcinoma and to determine whether the surgical effort to assess para-aortic (PA) lymph nodes (LNs) at initial staging surgery impacts OS. METHODS: All patients diagnosed with endometrial cancer from 1/1993 to 12/2011 who had LNs excised were included. PALN assessment was defined by the identification of one or more PALNs on final pathology. A multivariate analysis was performed to assess the effect of PALNs on OS. A form of recursive partitioning called classification and regression tree (CART) analysis was implemented. Variables included: age, stage, tumor subtype, grade, myometrial invasion, total LNs removed, evaluation of PALNs, and adjuvant chemotherapy. RESULTS: The cohort included 1920 patients, with a median age of 62 years. The median number of LNs removed was 16 (range, 1-99). The removal of PALNs was not associated with OS (P=0.450). Using the CART hierarchically, stage I vs. stages II-IV and grades 1-2 vs. grade 3 emerged as predictors of OS. If the tree was allowed to grow, further branching was based on age and myometrial invasion. Total number of LNs removed and assessment of PALNs as defined in this study were not predictive of OS. CONCLUSION: This innovative CART analysis emphasized the importance of proper stage assignment and a binary grading system in impacting OS. Notably, the total number of LNs removed and specific evaluation of PALNs as defined in this study were not important predictors of OS.

Barlin JN; Zhou Q; St Clair CM; Iasonos A; Soslow RA; Alektiar KM; Hensley ML; Leitao MM Jr; Barakat RR; Abu-Rustum NR

2013-09-01
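
The recursive partitioning described in the abstract above can be illustrated with a single split search. This is only a minimal sketch: the abstract does not state the software or the splitting criterion used, so Gini impurity on a categorical outcome is assumed here as a simplification of a survival analysis, and the function names (`gini`, `best_split`) and the toy rows are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) of the best binary split.

    If no split improves on the unsplit node, feature_index and threshold are None.
    """
    n = len(rows)
    best = (None, None, gini(labels))
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best


# Hypothetical toy data, not from the study: each row is (stage, grade).
rows = [(1, 1), (1, 2), (1, 3), (2, 1), (3, 2), (4, 3)]
labels = ["alive", "alive", "alive", "dead", "dead", "dead"]
split = best_split(rows, labels)
print(split)
```

On this toy data the search picks feature 0 (stage) with a threshold between 1 and 2, which mirrors the "stage I vs. stages II-IV" form of split reported in the abstract. A full tree would apply `best_split` recursively to each resulting subset.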

380

International Nuclear Information System (INIS)

The report presents an analysis of a possible calculation procedure for automated data fitting. The problem is defined in the first part, and it is shown that the solution demands optimisation under invariance conditions (stochastic errors), which is part of the theory of planning optimal experiments. A rough review of knowledge in this field is given. In the second part of the report, some statistical and optimisation methods are analysed in more detail in order to be used for automated fitting. An evaluation of a possible relevant calculation procedure is presented.

1988-01-01

381

UK PubMed Central (United Kingdom)

Breast density (the percentage of fibroglandular tissue in the breast) has been suggested to be a useful surrogate marker for breast cancer risk. It is conventionally measured using screen-film mammographic images by a labor-intensive histogram segmentation method (HSM). We have adapted and modified the HSM for measuring breast density from raw digital mammograms acquired by full-field digital mammography. Multiple regression model analyses showed that many of the instrument parameters for acquiring the screening mammograms (e.g. breast compression thickness, radiological thickness, radiation dose, compression force, etc.) and image pixel intensity statistics of the imaged breasts were strong predictors of the observed threshold values (model R² = 0.93) and %-density (R² = 0.84). The intra-class correlation coefficient of the %-density for duplicate images was estimated to be 0.80, using the regression model-derived threshold values, and 0.94 if estimated directly from the parameter estimates of the %-density prediction regression model. Therefore, with additional research, these mathematical models could be used to compute breast density objectively, automatically bypassing the HSM step, and could greatly facilitate breast cancer research studies.

Lu LJ; Nishino TK; Khamapirad T; Grady JJ; Leonard MH Jr; Brunder DG

2007-08-01
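
The intra-class correlation coefficient reported for duplicate images in the abstract above can be computed from paired measurements. The sketch below assumes the one-way random-effects form, ICC(1,1); the abstract does not say which ICC variant the authors used, and the function name `icc_oneway` and the sample values are hypothetical.

```python
def icc_oneway(groups):
    """One-way random-effects ICC(1,1) for n subjects, each measured k times.

    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the
    between-subject and within-subject mean squares of a one-way ANOVA.
    """
    n = len(groups)
    k = len(groups[0])
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    subject_means = [sum(g) / k for g in groups]
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, subject_means) for x in g
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical %-density readings from duplicate images of four breasts.
duplicates = [(12.0, 14.0), (25.0, 23.0), (40.0, 41.0), (18.0, 20.0)]
icc = icc_oneway(duplicates)
print(round(icc, 3))
```

Values near 1 indicate that most of the variation lies between subjects rather than between repeat measurements of the same subject, which is the sense in which the abstract's 0.80 and 0.94 describe reproducibility.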

382