WorldWideScience
 
 
1

Multiple linear regression analysis  

Science.gov (United States)

Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

Edwards, T. R.

1980-01-01
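
The stepwise idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the FORTRAN IV program: it implements only forward selection with a partial F test, and the names `solve`, `ols_sse`, `forward_stepwise` and the fixed `f_to_enter=4.0` threshold are assumptions made here (the original program takes a user-specified confidence level instead).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_sse(columns, y):
    """Fit y on the given predictor columns (plus intercept); return the residual sum of squares."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def forward_stepwise(columns, y, f_to_enter=4.0):
    """Add predictors one at a time while the best partial F statistic reaches f_to_enter.

    f_to_enter is a hypothetical fixed threshold, not a value from the source.
    """
    n = len(y)
    selected = []
    sse_current = ols_sse([], y)  # intercept-only model
    while len(selected) < len(columns):
        best = None
        for j in range(len(columns)):
            if j in selected:
                continue
            sse_new = ols_sse([columns[k] for k in selected + [j]], y)
            dof = n - (len(selected) + 1) - 1  # residual degrees of freedom
            if dof <= 0 or sse_new <= 0:
                continue
            f_stat = (sse_current - sse_new) / (sse_new / dof)
            if best is None or f_stat > best[0]:
                best = (f_stat, j, sse_new)
        if best is None or best[0] < f_to_enter:
            break
        selected.append(best[1])
        sse_current = best[2]
    return selected
```

Calling `forward_stepwise([x1, x2, x3], y)` returns the indices of the retained predictors in the order they entered. A full stepwise procedure would also re-test already entered variables for removal, which this sketch omits.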

2

MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper analyzes the relationship between GDP in the hotels and restaurants sector (the dependent variable) and the following independent variables: overnight stays in tourist reception establishments, arrivals in tourist reception establishments, and investments in the hotels and restaurants sector, over the period 1995-2007. Using multiple regression analysis, I found that investments and tourist arrivals are significant predictors of the GDP dependent variable. Ba...

Kulcsár, Erika

2009-01-01

3

MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM  

Directory of Open Access Journals (Sweden)

This paper analyzes the relationship between GDP in the hotels and restaurants sector (the dependent variable) and the following independent variables: overnight stays in tourist reception establishments, arrivals in tourist reception establishments, and investments in the hotels and restaurants sector, over the period 1995-2007. Using multiple regression analysis, I found that investments and tourist arrivals are significant predictors of the GDP dependent variable. Based on these results, I identified those components of the marketing mix which, in my opinion, require investment and which could contribute to the positive development of tourist arrivals in tourist reception establishments.

Erika KULCSÁR

2009-12-01

4

Basic Multiple Regression  

Science.gov (United States)

This page will perform basic multiple regression analysis for the case where there are several independent predictor variables, X1, X2, etc., and one dependent or criterion variable, Y. Requires import of data from a spreadsheet.

Lowry, Richard, 1940-

2008-06-25

5

Multiple Regression Analysis Using ANCOVA in University Model  

Directory of Open Access Journals (Sweden)

The government of the UAE is promoting Dubai as an academic hub. Dubai International Academic City (DIAC) is a free zone with many national and international universities promoting higher education in almost all disciplines. Every graduating student aspires to a good placement, and Dubai offers diverse job opportunities in national and multinational organizations. The objective of the paper is to review the placement opportunities in Dubai for universities offering programs in engineering. The paper studies the effect of three independent variables, namely cumulative grade point average (CGPA), engineering discipline and the type of job offered to graduating students, on the dependent variable, salary. The engineering disciplines under study are Mechanical, Electronics and Communication, Computer Science, and Electrical and Electronics Engineering. The types of jobs considered are marketing, technical marketing, design and logistics. The concepts of analysis of covariance (ANCOVA) and multiple regression are used to review placement opportunities vis-à-vis the salary structure.

Maneesha

2013-09-01

6

Pulsars and interstellar medium - multiple regression analysis of related parameters  

Energy Technology Data Exchange (ETDEWEB)

The relationship between pulsars and the interstellar-medium electron density (IED) is investigated by performing multiple stepwise regression analysis on the parameters for which linear correlations with the galactic continuum background temperature at 408 MHz (T408) have been established in 325 pulsars by Fracassini et al. (1983) (dispersion measure, radio luminosity, heliocentric distance, and galactocentric radius). An empirical relation is derived; the results are presented in a table and graph; and the pulsars are classified as peculiar, normal, or standard on the basis of their O-C values. Standard pulsars are shown to have actual T408 values equal to those calculated statistically and to confirm theoretically based T408-IED relationships, whereas normal pulsars are indicators of regions in which the observed and calculated T408 are in agreement only on average, and peculiar pulsars are associated with regions in which T408 is significantly above or below the average and IED is depleted (by accretion phenomena) or increased (by H II regions or star formation in front of the pulsar). 30 references.

Antonello, E.; Fracassini, M.

1985-01-01

7

Pulsars and interstellar medium - Multiple regression analysis of related parameters  

Science.gov (United States)

The relationship between pulsars and the interstellar-medium electron density (IED) is investigated by performing multiple stepwise regression analysis on the parameters for which linear correlations with the galactic continuum background temperature at 408 MHz (T408) have been established in 325 pulsars by Fracassini et al. (1983) (dispersion measure, radio luminosity, heliocentric distance, and galactocentric radius). An empirical relation is derived; the results are presented in a table and graph; and the pulsars are classified as peculiar, normal, or standard on the basis of their O-C values. Standard pulsars are shown to have actual T408 values equal to those calculated statistically and to confirm theoretically based T408-IED relationships, whereas normal pulsars are indicators of regions in which the observed and calculated T408 are in agreement only on average, and peculiar pulsars are associated with regions in which T408 is significantly above or below the average and IED is depleted (by accretion phenomena) or increased (by H II regions or star formation in front of the pulsar).

Antonello, E.; Fracassini, M.

1985-01-01

8

Robust Multiple Linear Regression.  

Science.gov (United States)

An extensive Monte Carlo analysis is conducted to determine the performance of robust linear regression techniques with and without outliers. Thirteen methods of regression are compared including least squares and minimum absolute deviation. The classical...

A. M. M. Sultan

1982-01-01
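
To illustrate the contrast between least squares and minimum absolute deviation named in the abstract above, here is a minimal sketch for a single predictor. It approximates the minimum-absolute-deviation line by iteratively reweighted least squares, which is one common approach and not necessarily the one used in the report; the function names, the iteration count and the `eps` floor are assumptions made for this example.

```python
def weighted_line(x, y, w):
    """Weighted least-squares line through (x, y); returns (intercept, slope)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def lad_line(x, y, iterations=50, eps=1e-8):
    """Approximate the minimum-absolute-deviation line by reweighting.

    Starts from ordinary least squares (all weights 1). Each pass weights
    every point by 1 / |residual|, so large residuals lose influence.
    eps (an assumed floor) avoids division by zero for exact fits.
    With iterations=0 the plain least-squares line is returned.
    """
    intercept, slope = weighted_line(x, y, [1.0] * len(x))
    for _ in range(iterations):
        residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        weights = [1.0 / max(abs(r), eps) for r in residuals]
        intercept, slope = weighted_line(x, y, weights)
    return intercept, slope


# Hypothetical data: points on y = 1 + 2x with one gross outlier at x = 3.
x = [0, 1, 2, 3, 4, 5, 6]
y = [1, 3, 5, 37, 9, 11, 13]
ols_fit = lad_line(x, y, iterations=0)  # intercept pulled upward by the outlier
lad_fit = lad_line(x, y)                # close to intercept 1, slope 2
```

On this toy data the least-squares intercept is dragged toward the outlier while the reweighted fit recovers the line through the remaining points, which is the kind of behavior a Monte Carlo comparison of robust methods is designed to quantify.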

9

On Stepwise Multiple Linear Regression.  

Science.gov (United States)

Stepwise multiple linear regression has proved to be an extremely useful computational technique in data analysis problems. This procedure has been implemented in numerous computer programs and overcomes the acute problem that often exists with the classi...

H. J. Breaux

1967-01-01

10

An improved multiple linear regression and data analysis computer program package  

Science.gov (United States)

NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

Sidik, S. M.

1972-01-01

11

Multiple Linear Regression  

Science.gov (United States)

This site, created by Michelle Lacey of Yale University, gives an explanation, a definition and an example of multiple linear regression. Topics include: confidence intervals, tests of significance, and squared multiple correlation. While brief, this is still a valuable site for anyone interested in statistics.

Lacey, Michelle

2009-11-30

12

Analysis of γ spectra in airborne radioactivity measurements using multiple linear regressions  

International Nuclear Information System (INIS)

This paper describes the calculation of the net peak counts of the nuclide 137Cs at 662 keV in γ spectra from airborne radioactivity measurements using multiple linear regression. A mathematical model is built by analyzing every factor that contributes to the Cs peak counts in the spectra, and a multiple linear regression function is established. The calculation uses stepwise regression, and insignificant factors are eliminated by an F test. The regression results and their uncertainties are calculated by least-squares estimation, from which the Cs net peak counts and their uncertainty are obtained. Analysis results for an experimental spectrum are presented, and the influence of energy shift and energy resolution on the results is discussed. Compared with the spectrum-stripping method, the multiple linear regression method does not require stripping ratios, its result depends only on the counts in the Cs peak, and the calculation uncertainty is reduced. (authors)

2004-11-06

13

Quantitative electron microscope autoradiography: application of multiple linear regression analysis  

International Nuclear Information System (INIS)

A new method for the analysis of high resolution EM autoradiographs is described. It identifies labelled cell organelle profiles in sections on a strictly statistical basis and provides accurate estimates for their radioactivity without the need to make any assumptions about their size, shape and spatial arrangement. (author)

1986-01-01

14

A multiple regression analysis for accurate background subtraction in 99Tcm-DTPA renography  

International Nuclear Information System (INIS)

A technique for accurate background subtraction in 99Tcm-DTPA renography is described. The technique is based on a multiple regression analysis of the renal curves and separate heart and soft tissue curves which together represent background activity. It is compared, in over 100 renograms, with a previously described linear regression technique. Results show that the method provides accurate background subtraction, even in very poorly functioning kidneys, thus enabling relative renal filtration and excretion to be accurately estimated. (author)

1989-01-01

15

Multiple Linear Regression Analysis of Scintillation Gamma-Ray Spectra: Theoretical and Practical Considerations.  

Science.gov (United States)

Application of the method of multiple linear regression as a data-analysis technique for gamma-ray scintillation spectrometer data requires knowledge of (1) the response matrix of the spectrometer and (2) the covariance matrix of the unknown spectrum (or ...

D. F. Covell; M. Brown; S. Yamamoto

1969-01-01

16

Computational Tools for Probing Interactions in Multiple Linear Regression, Multilevel Modeling, and Latent Curve Analysis  

Science.gov (United States)

Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…

Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.

2006-01-01

17

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The growing volume of data creates a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, viz. factor analysis and multiple linear regression, for different sample sizes on three uniqu...

2011-01-01

18

Fungible Weights in Multiple Regression  

Science.gov (United States)

Every set of alternate weights (i.e., non-least-squares weights) in a multiple regression analysis with three or more predictors is associated with an infinite class of weights. All members of a given class can be deemed "fungible" because they yield identical "SSE" (sum of squared errors) and R² values. Equations for generating…

Waller, Niels G.

2008-01-01

19

Multiple regression analysis of Jominy hardenability data for boron treated steels  

International Nuclear Information System (INIS)

The relations between the chemical composition and the hardenability of boron treated steels have been investigated using a multiple regression analysis method. A linear regression model was chosen. The free boron content that is effective for the hardenability was calculated using a model proposed by Jansson. The regression analysis for 1261 steel heats provided equations that were statistically significant at the 95% level. All heats met the specification according to the Nordic countries producers classification. The variation in chemical composition explained typically 80 to 90% of the variation in the hardenability. In the regression analysis, elements which did not significantly contribute to the calculated hardness according to the F test were eliminated. Carbon, silicon, manganese, phosphorus and chromium were of importance at all Jominy distances; nickel, vanadium, boron and nitrogen at distances above 6 mm. After the regression analysis it was demonstrated that very few outliers, i.e. data points outside four times the standard deviation, were present in the data set. The model has successfully been used in industrial practice, replacing some of the necessary Jominy tests. (orig.)

1997-03-01

20

High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.  

Science.gov (United States)

One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary. PMID:24229385

Laurens, L M L; Wolfrum, E J

2013-12-18

 
 
 
 
21

Beyond Multiple Regression: Using Commonality Analysis to Better Understand R² Results  

Science.gov (United States)

Multiple regression is one of the most common statistical methods used in quantitative educational research. Despite the versatility and easy interpretability of multiple regression, it has some shortcomings in the detection of suppressor variables and for somewhat arbitrarily assigning values to the structure coefficients of correlated…

Warne, Russell T.

2011-01-01

22

QSPR study of molar diamagnetic susceptibility of diverse organic compounds using multiple linear regression analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Multiple linear regression (MLR) was used to build a linear quantitative structure-property relationship (QSPR) model for the prediction of the molar diamagnetic susceptibility (χm) of 140 diverse organic compounds, using three significant descriptors calculated from the molecular structures alone and selected by the stepwise regression method. Stepwise regression was employed to develop a regression equation based on 100 training compounds, and predictive ability was tested on 40 co...

2012-01-01

23

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis  

Directory of Open Access Journals (Sweden)

The growing volume of data creates a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, viz. factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two techniques is compared on the following parameters: mean square error (MSE), R-square, adjusted R-square, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters were computed using various tools such as SPSS, XLstat, Stata, and MS-Excel. For all the given datasets, factor analysis outperforms multiple linear regression. However, the absolute value of prediction accuracy varied between the three datasets, indicating that the data distribution and data characteristics play a major role in choosing the correct prediction technique.

R.K.Chauhan

2011-04-01
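
Several of the comparison parameters listed in the abstract above have simple closed forms. The sketch below computes MSE, RMSE, R-square and adjusted R-square from observed and predicted values. The function name and the sample numbers are hypothetical, and MSE is taken here as SSE divided by n, which is an assumption (some packages divide by the residual degrees of freedom instead).

```python
import math


def fit_metrics(actual, predicted, n_predictors):
    """Return MSE, RMSE, R-square and adjusted R-square for a fitted model."""
    n = len(actual)
    mean_y = sum(actual) / n
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    sst = sum((a - mean_y) ** 2 for a in actual)                # total sum of squares
    mse = sse / n  # assumed convention: divide by n
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": r2, "adj_r2": adj_r2}


# Hypothetical observed and predicted values from a one-predictor model.
metrics = fit_metrics([1, 2, 3, 4, 5], [1.5, 2, 3, 4, 4.5], n_predictors=1)
```

Adjusted R-square penalizes each additional predictor, which matters when two techniques retain different numbers of variables, as in the comparison described above.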

24

Estimate of Compressive Strength for Concrete using Ultrasonics by Multiple Regression Analysis Method  

International Nuclear Information System (INIS)

Various types of ultrasonic techniques have been used to estimate the compressive strength of concrete structures. However, the conventional ultrasonic velocity method, which uses only the longitudinal wave, cannot determine the compressive strength of concrete structures accurately. In this paper, multiple parameters are introduced (e.g. velocity of shear wave, velocity of longitudinal wave, attenuation coefficient of shear wave, attenuation coefficient of longitudinal wave, combination condition, age and preservation method), and multiple regression analysis is applied to determine the compressive strength of concrete structures. The experimental results show that the shear wave velocity estimates the compressive strength of concrete more accurately than the longitudinal wave velocity, and that the estimation error for the compressive strength of concrete structures can be kept within approximately ±10%.

1991-12-01

25

ANALYSIS OF THE FINANCIAL PERFORMANCES OF THE FIRM, BY USING THE MULTIPLE REGRESSION MODEL  

Directory of Open Access Journals (Sweden)

The information obtained through simple linear regression is not always enough to characterize the evolution of an economic phenomenon and, furthermore, to identify its possible future evolution. To remedy these drawbacks, the specialized literature includes multiple regression models, in which the evolution of the dependent variable is defined in terms of two or more factorial variables.

Constantin Anghelache

2011-11-01

26

ANALYSIS OF THE FINANCIAL PERFORMANCES OF THE FIRM, BY USING THE MULTIPLE REGRESSION MODEL  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The information obtained through simple linear regression is not always enough to characterize the evolution of an economic phenomenon and, furthermore, to identify its possible future evolution. To remedy these drawbacks, the specialized literature includes multiple regression models, in which the evolution of the dependent variable is defined in terms of two or more factorial variables.

2011-01-01

27

Prediction of Persian Gulf Sea Surface Temperature Using Multiple Regressions and Principal Components Analysis  

Directory of Open Access Journals (Sweden)

Since fluctuations of the Persian Gulf Sea Surface Temperature (PGSST) have a significant effect on the winter precipitation, water resources and agricultural production of the southwestern parts of Iran, the possibility of predicting the winter SST was evaluated with a multiple regression model. The time series of PGSSTs for all seasons during 1947-1992 were considered as predictors, and the time series of MSSTs during 1948-1993 as the predictand. Principal components analysis was applied for data reduction and extraction of principal components. Only the scores of the first four PCs (PC1 to PC4), which accounted for the total variance in the predictor field, were used as input to the regression analysis. To find the dependency of each principal component on the original PGSST time series, Varimax rotation was applied. The results indicated that PC1 to PC4 are indicators of temperature changes during winter, autumn, spring and summer, respectively. According to the regression model, PC1, PC2 and PC4 were significant at the 5% level, whereas PC3 was not. The significant variables account for 33.5% of the total variance in the winter PGSSTs. For the prediction of the winter PGSST, the PGSST during the previous winter is of particular importance, followed by the autumn and summer temperatures.

A. Shirvani

2005-10-01

28

Variables Associated with Communicative Participation in People with Multiple Sclerosis: A Regression Analysis  

Science.gov (United States)

Purpose: To explore variables associated with self-reported communicative participation in a sample (n = 498) of community-dwelling adults with multiple sclerosis (MS). Method: A battery of questionnaires was administered online or on paper per participant preference. Data were analyzed using multiple linear backward stepwise regression. The…

Baylor, Carolyn; Yorkston, Kathryn; Bamer, Alyssa; Britton, Deanna; Amtmann, Dagmar

2010-01-01

29

Multiple Regression and Its Discontents  

Science.gov (United States)

Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.

Snell, Joel C.; Marsh, Mitchell

2012-01-01

30

A COMPARISON OF STEPWISE AND FUZZY MULTIPLE REGRESSION ANALYSIS TECHNIQUES FOR MANAGING SOFTWARE PROJECT RISKS: ANALYSIS PHASE  

Directory of Open Access Journals (Sweden)

Risk is not always avoidable, but it is controllable. The aim of this study is to identify whether those techniques are effective in reducing software failure. This motivates the authors to continue the effort to enrich the management of software project risks by considering mining and quantitative approaches with a large data set. In this study, two new techniques are introduced, namely stepwise multiple regression analysis and fuzzy multiple regression, to manage software risks. Two evaluation measures, MMRE and Pred(25), are used to compare the accuracy of the techniques. The model's accuracy is slightly better with stepwise multiple regression than with fuzzy multiple regression. This study will guide software managers in applying software risk management practices in real-world software development organizations and in verifying the effectiveness of the new techniques and approaches on a software project. The study was conducted on a group of software projects using a survey questionnaire. It is hoped that this will enable software managers to improve their decisions and increase the probability of software project success.

Abdelrafe Elzamly

2014-01-01
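
The two accuracy measures named in the abstract above are easy to state in code. The sketch below uses the usual definitions: the magnitude of relative error for one case is |actual - predicted| / |actual|, MMRE is its mean, and Pred(25) is the share of cases whose relative error is at most 25%. The function name and the sample values are hypothetical, not taken from the study.

```python
def mmre_and_pred(actual, predicted, level=0.25):
    """Return (MMRE, Pred(level)) for paired actual and predicted values.

    Actual values must be non-zero, since each error is relative to the actual value.
    """
    mre = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    mmre = sum(mre) / len(mre)
    pred = sum(1 for e in mre if e <= level) / len(mre)
    return mmre, pred


# Hypothetical values: relative errors are 0.10, 0.25, 0.00 and 0.60.
mmre, pred25 = mmre_and_pred([100, 200, 400, 50], [110, 150, 400, 80])
```

A lower MMRE and a higher Pred(25) both indicate a more accurate model, so the two measures are typically reported together.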

31

Empirical equations for the content rate of scattered radiation by the way of multiple regression analysis  

International Nuclear Information System (INIS)

The content rate of scattered radiation is affected by tube voltage, object thickness, size of the radiation field and the presence or absence of a grid. We tried to formalize the relationship between the content rate and these factors. As the tube voltage, object thickness, radiation field and grid ratio were changed, the film density of the radiograph varied; from this we calculated the content rate of scattered radiation and derived two approximate equations by multiple regression analysis. One equation used the raw values X of the explanatory variables; the other used the square-root values √X for all explanatory variables except tube voltage. The latter was more accurate. With this approximate equation, when each explanatory variable is within its range, the error does not exceed 10 percent, and most errors are within 5 percent. (author)

1981-01-01

32

Analysis of longitudinal clinical trials with missing data using multiple imputation in conjunction with robust regression.  

Science.gov (United States)

In a typical randomized clinical trial, a continuous variable of interest (e.g., bone density) is measured at baseline and fixed postbaseline time points. The resulting longitudinal data, often incomplete due to dropouts and other reasons, are commonly analyzed using parametric likelihood-based methods that assume multivariate normality of the response vector. If the normality assumption is deemed untenable, then semiparametric methods such as (weighted) generalized estimating equations are considered. We propose an alternate approach in which the missing data problem is tackled using multiple imputation, and each imputed dataset is analyzed using robust regression (M-estimation; Huber, 1973, Annals of Statistics 1, 799-821.) to protect against potential non-normality/outliers in the original or imputed dataset. The robust analysis results from each imputed dataset are combined for overall estimation and inference using either the simple Rubin (1987, Multiple Imputation for Nonresponse in Surveys, New York: Wiley) method, or the more complex but potentially more accurate Robins and Wang (2000, Biometrika 87, 113-124.) method. We use simulations to show that our proposed approach performs at least as well as the standard methods under normality, but is notably better under both elliptically symmetric and asymmetric non-normal distributions. A clinical trial example is used for illustration. PMID:22994905

Mehrotra, Devan V; Li, Xiaoming; Liu, Jiajun; Lu, Kaifeng

2012-12-01
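
The "simple Rubin (1987) method" mentioned in the abstract above combines the per-dataset results as follows: the pooled estimate is the mean of the m estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The sketch below shows only this pooling step; the function name and the numbers are hypothetical, and the robust regression that would produce each estimate is not shown.

```python
def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules.

    Requires m >= 2. Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                                   # pooled point estimate
    within = sum(variances) / m                                  # mean within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical treatment-effect estimates from three imputed datasets.
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the extra uncertainty due to the missing data into the final standard error.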

33

Determination of useful ranges of mixing conditions for glycerin Fatty Acid ester by multiple regression analysis.  

Science.gov (United States)

The interaction of the effects of the triglycerin full behenate (TR-FB) concentration and the mixing time on lubrication and tablet properties were analyzed under a two-factor central composite design, and compared with those of magnesium stearate (Mg-St). Various amounts of lubricant (0.07-3.0%) were added to granules and mixed for 1-30 min. A multiple linear regression analysis was performed to identify the effect of the mixing conditions on each physicochemical property. The mixing conditions did not significantly affect the lubrication properties of TR-FB. For tablet properties, tensile strength decreased and disintegration time increased when the lubricant concentration and the mixing time were increased for Mg-St. The direct interaction of the Mg-St concentration and the mixing time had a significant negative effect on the disintegration time. In contrast, any mixing conditions of TR-FB did not affect the tablet properties. In addition, the range of mixing conditions which satisfied the lubrication and tablet property criteria was broader for TR-FB than that for Mg-St, suggesting that TR-FB allows tablets with high quality attributes to be produced consistently. Therefore, TR-FB is a potential lubricant alternative to Mg-St. PMID:24189302

Uchimoto, Takeaki; Iwao, Yasunori; Hattori, Hiroaki; Noguchi, Shuji; Itai, Shigeru

2013-01-01

34

Multiple regression analysis in modeling of columnar ozone in Peninsular Malaysia.  

Science.gov (United States)

This study aimed to predict monthly columnar ozone (O3) in Peninsular Malaysia by using data on the concentration of environmental pollutants. Data (2003-2008) on five atmospheric pollutant gases (CO2, O3, CH4, NO2, and H2O vapor) retrieved from the satellite Scanning Imaging Absorption Spectrometer for Atmospheric Chartography (SCIAMACHY) were employed to develop a model that predicts columnar ozone through multiple linear regression. In the entire period, the pollutants were highly correlated (R = 0.811 for the southwest monsoon, R = 0.803 for the northeast monsoon) with predicted columnar ozone. The results of the validation of columnar ozone with column ozone from SCIAMACHY showed a high correlation coefficient (R = 0.752-0.802), indicating the model's accuracy and efficiency. Statistical analysis was utilized to determine the effects of each atmospheric pollutant on columnar ozone. A model that can retrieve columnar ozone in Peninsular Malaysia was developed to provide air quality information. These results are encouraging and accurate and can be used in early warning of the population to comply with air quality standards. PMID:24599658

Tan, K C; Lim, H S; Mat Jafri, M Z

2014-06-01

35

Influence of plant root morphology and tissue composition on phenanthrene uptake: Stepwise multiple linear regression analysis  

International Nuclear Information System (INIS)

Polycyclic aromatic hydrocarbons (PAHs) are contaminants that reside mainly in surface soils. Dietary intake of plant-based foods can make a major contribution to total PAH exposure. Little information is available on the relationship between root morphology and plant uptake of PAHs. An understanding of plant root morphologic and compositional factors that affect root uptake of contaminants is important and can inform both agricultural (chemical contamination of crops) and engineering (phytoremediation) applications. Five crop plant species are grown hydroponically in solutions containing the PAH phenanthrene. Measurements are taken for 1) phenanthrene uptake, 2) root morphology – specific surface area, volume, surface area, tip number and total root length and 3) root tissue composition – water, lipid, protein and carbohydrate content. These factors are compared through Pearson's correlation and multiple linear regression analysis. The major factors which promote phenanthrene uptake are specific surface area and lipid content. -- Highlights: •There is no correlation between phenanthrene uptake and total root length, and water. •Specific surface area and lipid are the most crucial factors for phenanthrene uptake. •The contribution of specific surface area is greater than that of lipid. -- The contribution of specific surface area is greater than that of lipid in the two most important root morphological and compositional factors affecting phenanthrene uptake

2013-08-01

36

Investigations upon the indefinite rolls quality assurance in multiple regression analysis  

Energy Technology Data Exchange (ETDEWEB)

The quality of rolling-mill rolls has been enhanced mainly through improvements in the chemical composition of the roll materials. Achieving an optimal chemical composition can be a technically efficient way to assure the exploitation properties, since the material from which the rolls are manufactured is of high importance in this respect. This paper continues to present the scientific results of our experimental research in the area of rolling rolls. The basic research contains concrete elements of immediate practical utility in metallurgical enterprises for improving the quality of rolls, with the ultimate aim of increasing durability and safety in exploitation. This paper presents an analysis of the chemical composition and its influence upon the mechanical properties of indefinite cast iron rolls. We present some mathematical correlations and graphical interpretations between the hardness (on the working surface and on the necks) and the chemical composition. Using double and triple correlations is really helpful in foundry practice, as it allows us to determine variation boundaries for the chemical composition with a view to obtaining optimal values of the hardness. We suggest a mathematical interpretation of the influence of the chemical composition on the hardness of these indefinite rolling rolls. To this end we use multiple regression analysis, which can be an important statistical tool for investigating relationships between variables. The mathematical modeling results can be described through a number of multi-component equations determined for spaces with 3 and 4 dimensions. Also, the regression surfaces, level curves and volumes of variation can be represented and interpreted by technologists, who can treat them as correlation diagrams between the analyzed variables.
The results of this research can therefore be used by engineering teams in foundries and rolling mills for quality assurance of rolls, from the production phase through to exploitation, which inevitably leads to quality assurance of the rolled products. (Author) 16 refs.

Kiss, I.

2012-11-01

37

Simulation of maritime transport and distribution by sea-going barges: an application of multiple regression analysis and factor screening  

Energy Technology Data Exchange (ETDEWEB)

This paper presents an application of multiple regression analysis and factor screening to the study of transport by pusher barges compared to sea-going ships. A simulation model is described and mathematical techniques are used to simplify the parameters involved to such a level that the model can be investigated with a minimum of simulation runs. (9 refs.)

Rooda, J.E.; van der Schilden, N.

1982-12-01

38

Simulation of maritime transport and distribution by sea-going barges: An application of multiple regression analysis and factor screening  

Energy Technology Data Exchange (ETDEWEB)

The authors present an application of multiple regression analysis and factor screening to the study of transport by pusher barges compared to sea-going ships. A simulation model is described and mathematical techniques are used to simplify the parameters involved to such a level that the model can be investigated with a minimum of simulation runs.

Rooda, J.E.; van der Schilden, N.

1982-12-01

39

Oral health-related risk behaviours and attitudes among Croatian adolescents--multiple logistic regression analysis.  

Science.gov (United States)

The aim of this study was to explore the patterns of oral health-related risk behaviours in relation to dental status, attitudes, motivation and knowledge among Croatian adolescents. The assessment was conducted in a sample of 750 male subjects, military recruits aged 18-28 in Croatia, using a questionnaire and clinical examination. Mean number of decayed, missing and filled teeth (DMFT) and Significant Caries Index (SiC) were calculated. Multiple logistic regression models were created for analysis. Although models of risk behaviours were statistically significant, their explanatory values were quite low. Five of them (rarely toothbrushing, not using hygiene auxiliaries, rarely visiting dentist, toothache as a primary reason to visit dentist, and demand for tooth extraction due to toothache) had the highest explanatory values, ranging from 21-29%, and correctly classified 73-89% of subjects. Toothache as a primary reason to visit dentist, extraction as preferable therapy when toothache occurs, not having brushing education in school and frequent gingival bleeding were significantly related to the population with high caries experience (DMFT ≥ 14 according to SiC), producing odds ratios of 1.6 (95% CI 1.07-2.46), 2.1 (95% CI 1.29-3.25), 1.8 (95% CI 1.21-2.74) and 2.4 (95% CI 1.21-2.74) respectively. The DMFT ≥ 14 model had a low explanatory value of 6.5% and correctly classified 83% of subjects. It can be concluded that oral health-related risk behaviours are interrelated. Poor association was seen between attitudes concerning oral health and oral health-related risk behaviours, indicating insufficient motivation to change lifestyle and habits. Self-reported oral hygiene habits were not strongly related to dental status. PMID:24851627

Spalj, Stjepan; Spalj, Vedrana Tudor; Ivanković, Luida; Plancak, Darije

2014-03-01

40

Error analysis of dimensionless scaling experiments with multiple points using linear regression  

International Nuclear Information System (INIS)

A general method of error estimation in the case of multiple point dimensionless scaling experiments, using linear regression and standard error propagation, is proposed. The method reduces to the previous result of Cordey (2009 Nucl. Fusion 49 052001) in the case of a two-point scan. On the other hand, if the points follow a linear trend, it explains how the estimated error decreases as more points are added to the scan. Based on the analytical expression that is derived, it is argued that for a low number of points, adding points to the ends of the scanned range, rather than the middle, results in a smaller error estimate. (letter)

2010-02-01
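The point made in the abstract of entry 40 above, that points added at the ends of a scan reduce the error estimate more than points added in the middle, can be illustrated with the textbook standard error of an ordinary least-squares slope. This is only a sketch under stated assumptions: equal, independent errors `sigma` on every point, and the standard formula sigma / sqrt(sum((x - mean(x))^2)); it is not the analytical expression derived in the paper, and the function name and scan positions are hypothetical.

```python
import numpy as np

def slope_standard_error(x, sigma):
    """Standard error of an OLS slope, assuming every y value carries
    the same independent error `sigma` (textbook formula)."""
    x = np.asarray(x, dtype=float)
    return sigma / np.sqrt(np.sum((x - x.mean()) ** 2))

sigma = 0.1  # hypothetical measurement error on each point

# Two-point scan, then the same scan with one extra point.
two_point = slope_standard_error([0.0, 1.0], sigma)
add_middle = slope_standard_error([0.0, 0.5, 1.0], sigma)  # extra point mid-range
add_end = slope_standard_error([0.0, 1.0, 1.0], sigma)     # extra point at an end

print(two_point, add_middle, add_end)
```

Under these assumptions the mid-range point leaves the slope error unchanged, because it sits at the mean of the scanned variable, while the end point lowers it.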

 
 
 
 
41

Analysis of aromatic constituents in multicomponent hydrocarbon mixtures by infrared spectroscopy using multiple linear regression  

Science.gov (United States)

Absorption spectra of multicomponent hydrocarbon mixtures based on n-heptane and isooctane with addition of benzene (up to 1%) and toluene and o-xylene (up to 20%) were investigated experimentally in the region of the first overtones of the hydrocarbon groups (λ = 1620-1780 nm). It was shown that their concentrations could be determined separately by using a multiple linear regression method. The optimum result was obtained by including four wavelengths at 1671, 1680, 1685, and 1695 nm, which took into account absorption of CH groups in benzene, toluene, and o-xylene and CH3 groups, respectively.

Vesnin, V. L.; Muradov, V. G.

2012-09-01

42

Regression analysis by example  

CERN Document Server

Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association. Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

Chatterjee, Samprit

2012-01-01

43

In silico methods in stability testing of hydrocortisone, powder for injections: Multiple regression analysis versus dynamic neural network  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This article presents the possibility of using multiple regression analysis (MRA) and dynamic neural network (DNN) for prediction of the stability of Hydrocortisone 100 mg (in the form of hydrocortisone sodium succinate) freeze-dried powder for injection packed into a dual chamber container. Degradation products of hydrocortisone sodium succinate: free hydrocortisone and related substances (impurities A, B, C, D and E; unspecified impurities and total impurities) were followed during stres...

2012-01-01

44

In silico methods in stability testing of hydrocortisone, powder for injections: Multiple regression analysis versus dynamic neural network  

Directory of Open Access Journals (Sweden)

This article presents the possibility of using multiple regression analysis (MRA) and dynamic neural network (DNN) for prediction of the stability of Hydrocortisone 100 mg (in the form of hydrocortisone sodium succinate) freeze-dried powder for injection packed into a dual chamber container. Degradation products of hydrocortisone sodium succinate, namely free hydrocortisone and related substances (impurities A, B, C, D and E; unspecified impurities and total impurities), were followed during stress and formal stability studies. All data obtained during stability studies were used for in silico modeling, with both multiple regression models and dynamic neural networks, in order to compare predicted and observed results. High values of the coefficient of determination (0.95-0.99) were obtained using MRA and DNN, so both methods are powerful tools for in silico stability studies, but the superiority of DNN over mathematical modeling of degradation was also confirmed.

Vujić, Zorica B.

2012-01-01

45

Detecting Nitrogen Content in Lettuce Leaves Based on Hyperspectral Imaging and Multiple Regression Analysis  

Directory of Open Access Journals (Sweden)

This study was carried out to detect nitrogen content in lettuce leaves rapidly and non-destructively using visible and near infrared (VIS-NIR) hyperspectral imaging technology. Principal Component Analysis (PCA) was performed on the average spectra to reduce the spectral dimensionality, and the principal components (PCs) were extracted as the input vectors of prediction models. Partial Least Square Regression (PLSR), Back Propagation Artificial Neural Network (BP-ANN), Extreme Learning Machine (ELM) and Support Vector Machine Regression (SVR) were each applied to relate the nitrogen content to the corresponding PCs to build the prediction models of nitrogen content. R2p of the PLSR model for nitrogen content was 0.91 and RMSEP was 0.32. The BP model of structure 5-2-1 with R2p of 0.92 and RMSEP of 0.21, the ELM model of structure 5-10-1 with R2p of 0.95 and RMSEP of 0.19, and the SVR model with R2p of 0.96 and RMSEP of 0.18 all achieved good prediction performance. Compared with the other three models, the SVR model has the best performance for predicting nitrogen content in lettuce leaves. This work demonstrated that the hyperspectral imaging technique coupled with PCA-SVR shows considerable promise for nondestructive detection of nitrogen content in lettuce leaves.

Sun Jun

2013-01-01

46

Multiple Regressions in Analysing House Price Variations  

Directory of Open Access Journals (Sweden)

The application of rigorous statistical analysis in aiding investment decision making is gaining momentum in the United States of America as well as the United Kingdom. Nonetheless, in Malaysia the response from local academicians is rather slow, and the rate is even slower as far as practitioners are concerned. This paper illustrates how Multiple Regression Analysis (MRA) and its extension, Hedonic Regression Analysis, have been used in explaining price variation for selected houses in Malaysia. Each attribute that is theoretically identified as a price determinant is priced, and the perceived contribution of each is explicitly shown. The paper demonstrates how the statistical analysis is capable of analyzing property investment by considering multiple determinants. The more rigorous consideration of various characteristics enables better investment decision making.

Aminah Md Yusof

2012-03-01

47

Multiple regression analysis of factors that may influence middle school science scores  

Science.gov (United States)

The purpose of this quantitative multiple regression study was to determine whether a relationship existed between Maryland State Assessment (MSA) reading scores, MSA math scores, gender, ethnicity, age, and MSA science scores. Also examined was if MSA reading scores, MSA math scores, gender, ethnicity, and age can be used in combination or alone to predict a passing score on the MSA science test and which variable, if any, had the most influence on science MSA scores. Both math and reading MSA scores were positively correlated with science MSA scores. Ethnicity was correlated with science MSA scores, but may have been confounded by socio-economic status. Age and gender were not correlated with science MSA scores. When the variables were combined, results showed that math MSA scores followed by reading MSA scores had the most predictive influence upon science MSA scores. Ethnicity, gender, and age had the least predictive influence. The findings of this study may serve as a catalyst for improving student achievement in science through changes in instructional methodology and curriculum design thereby increasing the number of students pursuing science careers.

Glover, Judith

48

Understanding logistic regression analysis.  

Science.gov (United States)

Logistic regression is used to obtain odds ratios in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed. PMID:24627710

Sperandei, Sandro

2014-01-01
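The link between logistic regression coefficients and odds ratios described in entry 48 above can be sketched in a few lines. This is an illustration only, not the article's own example: it uses a single binary explanatory variable for brevity (the article concerns several), a made-up 2x2 table, and a hand-written Newton-Raphson fit (`fit_logistic` is a hypothetical helper, not a library function).

```python
import numpy as np

def fit_logistic(x, y, n_iter=25):
    """Fit y ~ intercept + x by Newton-Raphson on the log-likelihood."""
    X = np.column_stack([np.ones(len(x)), x])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # predicted probabilities
        grad = X.T @ (y - p)                   # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Made-up data: 30/100 events among exposed, 10/100 among unexposed.
counts = [30, 70, 10, 90]
exposure = np.repeat([1.0, 1.0, 0.0, 0.0], counts)
outcome = np.repeat([1.0, 0.0, 1.0, 0.0], counts)

beta = fit_logistic(exposure, outcome)
odds_ratio = np.exp(beta[1])                   # exp(coefficient) = odds ratio
table_odds_ratio = (30 * 90) / (70 * 10)       # cross-product ratio of the table

print(odds_ratio, table_odds_ratio)
```

With one binary predictor the exponentiated coefficient reproduces the cross-product odds ratio of the table; with several predictors each exponentiated coefficient is an odds ratio adjusted for the others.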

49

Multiple Regression Analysis for Grading and Prognosis of Cubital Tunnel Syndrome: Assessment of Akahori’s Classification  

Directory of Open Access Journals (Sweden)

The purpose of this study was to quantitatively evaluate Akahori's preoperative classification of cubital tunnel syndrome. We analyzed the results for 57 elbows that were treated by a simple decompression procedure from 1997 to 2004. The relationship between each item of Akahori's preoperative classification and clinical stage was investigated based on the parameter distribution. We evaluated Akahori's classification system using multiple regression analysis, and investigated the association between the stage and treatment results. The usefulness of the regression equation was evaluated by analysis of variance of the expected and observed scores. In the parameter distribution, each item of Akahori's classification was mostly associated with the stage, but it was difficult to judge the severity of palsy. In the mathematical evaluation, the most effective item in determining the stage was sensory conduction velocity. It was demonstrated that the established regression equation was highly reliable (R = 0.922). Akahori's preoperative classification can also be used in postoperative classification, and this classification was correlated with postoperative prognosis. Our results indicate that Akahori's preoperative classification is a suitable system. It is reliable, reproducible and well-correlated with the postoperative prognosis. In addition, the established prediction formula is useful to reduce the diagnostic complexity of Akahori's classification.

Nishida, Keiichiro

2013-02-01

50

Multiple Instance Regression with Structured Data  

Science.gov (United States)

This slide presentation reviews the use of multiple instance regression with structured data from multiple and related data sets. It applies the concept to a practical problem, that of estimating crop yield using remote sensed country wide weekly observations.

Wagstaff, Kiri L.; Lane, Terran; Roper, Alex

2008-01-01

51

Introduction to regression analysis  

CERN Document Server

This book is an introduction to regression analysis for upper division and graduate students in science, engineering, social science and medicine. The emphasis is on the classical linear model using least squares estimation and inference. In addition, topics of current interest, such as regression diagnostics, ridge and logistic regression are treated as well. In contrast to other books at this level, the theoretical foundation of the subject is presented in some detail based on extensive use of matrix algebra. Throughout the text model building and evaluation are emphasised and illustrated wi

GOLBERG, M

2003-01-01

52

Bayesian logistic regression analysis  

Science.gov (United States)

In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuisance parameters, the Jacobian transformation is an essential added ingredient. The application of the product rule gives the posterior of the unknown logistic regression coefficients. The Jacobian transformation then maps the posterior of these regression coefficients to the posterior of the corresponding probability of some event and some nuisance parameters. Finally, by way of the sum rule the nuisance parameters are integrated out.

van Erp, N.; van Gelder, P.

2013-08-01

53

Reliability and Regression Analysis  

Science.gov (United States)

This applet, by David M. Lane of Rice University, demonstrates how the reliability of X and Y affect various aspects of the regression of Y on X. Java 1.1 is required and a full set of instructions is given in order to get the full value from the applet. Exercises and definitions to key terms are also given to help students understand reliability and regression analysis.

Lane, David M.

2009-02-17

54

Bayesian logistic regression analysis :  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuisance parameters, the Jacobian transformation is an essential added ingredient. The application of the product rule gives the posterior of the unknown logistic regression coefficients. The Jacobian transformation then maps the posterior of these re...

Erp, H. R. N.; Gelder, P. H. A. J. M.

2012-01-01

55

Diagnostics for multiple regression problems  

Energy Technology Data Exchange (ETDEWEB)

In the last 10 to 15 years there has been much work done in trying to improve linear regression results. Individuals have analyzed the susceptibility of least-squares results to values far removed from the center of the independent variable observations. They have studied the problem of heavy-tailed residuals, and they have studied the problem of collinearity. From these studies have come ridge regression techniques, robust regression techniques, regression on principal components, etc. However, many practitioners view these methods with suspicion (and ignorance), and prefer to continue using the usual least-squares procedures to fit their models, even though their results might not be answering the question they think. In reaction to this, statisticians are spending more time analyzing how the individual observations affect the least squares results. In the last few years approximately 10 papers and one text have appeared that address the problem of how to study the influence of the individual observations. This report is a study of the recent work done in linear regression diagnostics. It is concerned with analyzing the effect of one case at a time, since the methods to analyze this situation are relatively straight-forward and are not prohibitive computationally.

Daly, J.C.

1982-03-01
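The one-case-at-a-time diagnostics that entry 55 above surveys are commonly summarised by leverage (the diagonal of the hat matrix) and Cook's distance. The sketch below uses those two standard quantities on made-up data; it is an assumption that these are representative of the report's methods, and `leverage_and_cooks` is a hypothetical helper name.

```python
import numpy as np

def leverage_and_cooks(x, y):
    """Return hat-matrix leverages and Cook's distances for y ~ intercept + x."""
    X = np.column_stack([np.ones(len(x)), x])
    n, p = X.shape
    H = X @ np.linalg.pinv(X)          # hat matrix: fitted = H @ y
    h = np.diag(H)                     # leverage of each observation
    resid = y - H @ y
    s2 = resid @ resid / (n - p)       # residual variance estimate
    cooks = resid**2 * h / (p * s2 * (1.0 - h) ** 2)
    return h, cooks

# Made-up data: the last case lies far from the centre of the x values.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 8.0])

h, cooks = leverage_and_cooks(x, y)
print(np.round(h, 3))
print(np.round(cooks, 3))
```

Here the remote case dominates both measures, which is the kind of susceptibility of least squares to values far from the centre of the independent-variable observations that the abstract mentions.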

56

A multiple linear regression analysis of hot corrosion attack on a series of nickel base turbine alloys  

Science.gov (United States)

Multiple linear regression analysis was used to determine an equation for estimating hot corrosion attack for a series of Ni base cast turbine alloys. The U transform (i.e., sin^-1 of (% A/100) to the 1/2) was shown to give the best estimate of the dependent variable, y. A complete second degree equation is described for the "centered" weight chemistries for the elements Cr, Al, Ti, Mo, W, Cb, Ta, and Co. In addition, linear terms for the minor elements C, B, and Zr were added for a basic 47 term equation. The best reduced equation was determined by the stepwise selection method with essentially 13 terms. The Cr term was found to be the most important, accounting for 60 percent of the explained variability in hot corrosion attack.

Barrett, C. A.

1985-01-01

57

Comparison of a neural network with multiple linear regression for quantitative analysis in ICP-atomic emission spectroscopy  

International Nuclear Information System (INIS)

A two layer perceptron with backpropagation of error is used for quantitative analysis in ICP-AES. The network was trained by emission spectra of two interfering lines of Cd and As and the concentrations of both elements were subsequently estimated from mixture spectra. The spectra of the Cd and As lines were also used to perform multiple linear regression (MLR) via the calculation of the pseudoinverse S+ of the sensitivity matrix S. In the present paper it is shown that there exist close relations between the operation of the perceptron and the MLR procedure. These are most clearly apparent in the correlation between the weights of the backpropagation network and the elements of the pseudoinverse. Using MLR, the confidence intervals over the predictions are exploited to correct for the optical device of the wavelength shift. (orig.)

1992-10-01
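The MLR step in entry 57 above, estimating concentrations through the pseudoinverse S+ of the sensitivity matrix S, can be sketched with NumPy. Everything numeric here is an assumption for illustration: the two overlapping Gaussian line shapes stand in for the interfering Cd and As lines, and the mixture is noise-free.

```python
import numpy as np

# Hypothetical sensitivity matrix S: one row per spectral channel,
# one column per element (signal per unit concentration).
channels = np.linspace(-1.0, 1.0, 21)
line_cd = np.exp(-((channels + 0.2) / 0.3) ** 2)   # stand-in for the Cd line
line_as = np.exp(-((channels - 0.2) / 0.3) ** 2)   # stand-in for the As line
S = np.column_stack([line_cd, line_as])

# Synthetic mixture spectrum from known (made-up) concentrations.
true_conc = np.array([2.0, 0.5])
mixture = S @ true_conc

# Least-squares concentration estimate via the pseudoinverse S+.
S_pinv = np.linalg.pinv(S)
estimated = S_pinv @ mixture

print(estimated)
```

The rows of S+ play the role of linear weights applied to the measured spectrum, which is the quantity the abstract compares with the weights of the trained network.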

58

Commonality Analysis: A Method for Decomposing Explained Variance in Multiple Regression Analyses.  

Science.gov (United States)

Offers a brief explication of commonality analysis; a step-by-step discussion of how communication researchers may perform commonality analyses using output from a computer-assisted statistical analysis program; and provides an extended example illustrating a commonality analysis. (JMF)

McPhee, Robert D.; Seibold, David R.

1979-01-01

59

Statistical analysis using multiple regression of stereological parameters for skeleton castings microstructure  

Directory of Open Access Journals (Sweden)

In this article the authors show the influence of technological parameters and modification treatment on the structural properties of closed skeleton castings. The approach aimed at maximal refinement of the structure and minimal structure diversification. Skeleton castings were manufactured in accordance with the elaborated production technology. Experimental castings were manufactured under variable technological conditions: pouring temperature in the range 953 to 1013 K, mould temperature 293 to 373 K and height of the gating system above casting level 105 to 175 mm. Analysis of metallographic specimens, quantitative analysis of silicon crystals and analysis of the secondary dendrite-arm spacing of solution α were performed. Average values of stereological parameters for all castings were determined. (B/L) and (P/A) factors were determined. On the basis of the results of microstructural analysis the authors compare the samples. The aim of the analysis was to select the samples with the least diversification of the refinement degree of the structure and the least silicon crystals. On the basis of microstructural analysis the authors state that sample 5 (AlSi11, Tpour 1013 K, Tmould 333 K, h = 265 mm) has the best structural properties (least diversification of the refinement degree of the structure and the least refinement of silicon crystals). A statistical analysis of the results of the structural analysis was then carried out. On the basis of the statistical analysis the authors state that the best structural properties are obtained for the technological parameters Tpour = 1013 K, Tmould = 373 K and h = 230 mm [4]. The results of the statistical analysis are the prerequisite for optimization studies.

M. Cholewa

2011-07-01

60

Linear Regression Analysis  

CERN Document Server

Concise, mathematically clear, and comprehensive treatment of the subject.* Expanded coverage of diagnostics and methods of model fitting.* Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models.* More than 200 problems throughout the book plus outline solutions for the exercises.* This revision has been extensively class-tested.

Seber, George A F

2012-01-01

 
 
 
 
61

MLREG, Stepwise Multiple Linear Regression Program.  

Science.gov (United States)

This program is written in FORTRAN for an IBM computer and performs multiple linear regressions according to a stepwise procedure. The program transforms and combines old variables into new variables, prints input and transformed data, sums, raw sums or s...

J. H. Carder

1981-01-01

62

MLRP, Multiple Linear Regression Program (for Microcomputers).  

Science.gov (United States)

The Multiple Linear Regression computer program follows the procedures of Statistical Methods in Hydrology (Beard, 1962). Major features of the program are automatic deletion of independent variables (according to importance), combination of variables to ...

H. Kubik

1986-01-01

63

On Investment Efficiency of China's Tourism Listed Companies Based on Multiple Regression Analysis  

Directory of Open Access Journals (Sweden)

This paper investigates the conditions of efficient investment for China's tourism listed companies and examines how other factors affect the level of investment of these companies, in order to establish a basis for further study of the effect of executive compensation incentives on the investment efficiency of tourism listed companies. Fifteen tourism listed companies from 2002 to 2010 are selected as study samples. On the basis of an analysis of the literature, the paper builds a capital investment model for tourism listed companies, using the Richardson expected investment model for reference, and then uses it to process and analyze the data with SPSS 17.0 and EXCEL 2010. It is found that the mean residual of the capital investment model for the fifteen tourism listed companies is -0.000 000 744, with the mean residuals of seven companies less than zero and those of eight companies greater than zero. The minimum and maximum of the mean residuals are -0.040 181 25 (Beijing Capital Tourism Co., Ltd) and 0.036 942 5 (Shenzhen Overseas Chinese Town Co., Ltd), respectively. ROAi,t-1 (return on assets, p<0.10) and INVi,t-1 (scale of investment, p<0.01) each have a significant positive correlation with INVi,t, and Agei,t-1 (p<0.05) has a significant negative correlation with INVi,t. This suggests that the fifteen tourism listed companies from 2003 to 2010 under-invest on the whole, with seven companies under-investing and eight over-investing. In addition, the total return on assets and the level of investment of tourism listed companies significantly advance the level of investment of the company in the following year, and the listing age significantly inhibits the level of investment of the company in the following year.

WEI Wei

2013-09-01

64

Multiple regression analysis of reading performance data from twin pairs with reading difficulties and nontwin siblings: the augmented model.  

Science.gov (United States)

The augmented multiple regression model for the analysis of data from selected twin pairs was extended to facilitate analyses of data from twin pairs and nontwin siblings. Fitting this extended model to data from both selected twin pairs and siblings yields direct estimates of heritability (h2) and the difference between environmental influences shared by members of twin pairs and those of sib or twin-sib pairs (i.e., c2(t) - c2 (s)). When this model was fitted to reading performance data from 293 monozygotic and 436 dizygotic pairs selected for reading difficulties, and 291 of their nontwin siblings, h2 = .48 ± .22, p = .03, and c2 (t) - c2 (s) = .22 ± .12, p = .06. Although the test for differential shared environmental influences is only marginally significant, the results of this analysis suggest that environmental influences on reading performance that are shared by members of twin pairs (.36) may be substantially greater than those for less contemporaneous twin-sibling pairs (.14). PMID:22784461

Wadsworth, Sally J; Olson, Richard K; Willcutt, Erik G; DeFries, John C

2012-02-01

65

Crude oil price forecasting based on hybridizing wavelet multiple linear regression model, particle swarm optimization techniques, and principal component analysis.  

Science.gov (United States)

Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scales. Then, principal component analysis (PCA) is used in processing the subseries data in MLR for crude oil price forecasting. Particle swarm optimization (PSO) is used to select the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily crude oil market, West Texas Intermediate (WTI), has been used as the case study. The time series prediction performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting the crude oil price series. PMID:24895666

Shabri, Ani; Samsudin, Ruhaidah

2014-01-01

66

Multiple Outliers Detection Procedures in Linear Regression  

Directory of Open Access Journals (Sweden)

This paper describes a procedure for identifying multiple outliers in linear regression. This procedure uses a robust fit, least trimmed squares (LTS), and the single linkage clustering method to obtain the potential outliers. Then multiple-case diagnostics are used to obtain the outliers from these potential outliers. The performance of this procedure is also compared to Serbert’s method. Monte Carlo simulations are used in determining which procedure performed best in all of the linear regression scenarios.

Robiah Adnan

2003-06-01

67

Multiple Outliers Detection Procedures in Linear Regression  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper describes a procedure for identifying multiple outliers in linear regression. This procedure uses a robust fit, least trimmed squares (LTS), and the single linkage clustering method to obtain the potential outliers. Then multiple-case diagnostics are used to obtain the outliers from these potential outliers. The performance of this procedure is also compared to Serbert’s method. Monte Carlo simulations are used in determining which procedure performed best in all...

2003-01-01

68

Regression Analysis A Constructive Critique  

CERN Document Server

Regression Analysis: A Constructive Critique identifies a wide variety of problems with regression analysis as it is commonly used and then provides a number of ways in which practice could be improved. Regression is most useful for data reduction, leading to relatively simple but rich and precise descriptions of patterns in a data set. The emphasis on description provides readers with an insightful rethinking from the ground up of what regression analysis can do, so that readers can better match regression analysis with useful empirical questions and improved policy-related research. "An

Berk, Richard A

2003-01-01

69

Salience Assignment for Multiple-Instance Regression  

Science.gov (United States)

We present a Multiple-Instance Learning (MIL) algorithm for determining the salience of each item in each bag with respect to the bag's real-valued label. We use an alternating-projections constrained optimization approach to simultaneously learn a regression model and estimate all salience values. We evaluate this algorithm on a significant real-world problem, crop yield modeling, and demonstrate that it provides more extensive, intuitive, and stable salience models than Primary-Instance Regression, which selects a single relevant item from each bag.

Wagstaff, Kiri L.; Lane, Terran

2007-01-01

70

The use of artificial neural network analysis and multiple regression for trap quality evaluation: a case study of the Northern Kuqa Depression of Tarim Basin in western China  

Energy Technology Data Exchange (ETDEWEB)

Artificial neural network analysis is found to be far superior to multiple regression when applied to the evaluation of trap quality in the Northern Kuqa Depression, a gas-rich depression of Tarim Basin in western China. This is because this technique can correlate the complex and non-linear relationship between trap quality and related geological factors, whereas multiple regression can only describe a linear relationship. However, multiple regression can work as an auxiliary tool, as it is suited to high-speed calculations and can indicate the degree of dependence between the trap quality and its related geological factors which artificial neural network analysis cannot. For illustration, we have investigated 30 traps in the Northern Kuqa Depression. For each of the traps, the values of 14 selected geological factors were all known. While geologists were also able to assign individual trap quality values to 27 traps, they were less certain about the values for the other three traps. Multiple regression and artificial neural network analysis were, therefore, respectively used to ascertain these values. Data for the 27 traps were used as known sample data, while the three traps were used as prediction candidates. Predictions from artificial neural network analysis are found to agree with exploration results: where simulation predicted high trap quality, commercial quality flows were afterwards found, and where low trap quality is indicated, no such discoveries have yet been made. On the other hand, multiple regression results indicate the order of dependence of the trap quality on geological factors, which reconciles with what geologists have commonly recognized. We can conclude, therefore, that the application of artificial neural network analysis with the aid of multiple regression to trap evaluation in the Northern Kuqa Depression has been quite successful. 
To ensure the precision of the above mentioned geological factors and their related parameters for each trap, a study of the petroleum system in Kuqa Depression was conducted, which included the partitioning and mechanisms of the Kuqa petroleum system. Three migration models are presented. (author)

Guangren Shi; Xingxi Zhou; Guangya Zhang; Xiaofeng Shi; Honghui Li [Research Institute of Petroleum Exploration and Development, Beijing (China)]

2004-03-01

71

Multiple-Regression Hidden Markov Model  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper proposes a new class of hidden Markov model (HMM) called multiple-regression HMM (MR-HMM) that utilizes auxiliary features such as fundamental frequency (F0) and speaking styles that affect spectral parameters to better model the acoustic features of phonemes. Though such auxiliary features are considered to be the factors that degrade the performance of speech recognizers, the proposed MR-HMM adapts its model parameters, i.e. mean vectors of output probabili...

Fujinaga, Katsuhisa; Nakai, Mitsuru; Shimodaira, Hiroshi; Sagayama, Shigeki

2001-01-01

72

Flexible Model Selection Criterion for Multiple Regression  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Predictors of a multiple linear regression equation selected by GCV (Generalized Cross Validation) may include undesirable predictors that have no linear functional relationship with the target variable and are chosen only by accident. This is because GCV estimates prediction error, but does not control the probability of selecting irrelevant predictors of the target variable. To take this possibility into account, a new statistic “GCVf

2012-01-01

73

Application of multiple linear regression analysis to predict antifungal activity of some benzimidazole derivatives using ADME parameters  

Directory of Open Access Journals (Sweden)

In this study we investigated the relationship between the antifungal activity of some benzimidazole derivatives and some absorption, distribution, metabolism and excretion (ADME) parameters. The antifungal activity of the studied compounds against Saccharomyces cerevisiae was expressed as the minimal inhibitory concentration (MIC). A statistically significant quantitative structure-activity relationship (QSAR) model for predicting the antifungal activity of the investigated benzimidazole derivatives against Saccharomyces cerevisiae was obtained by multiple linear regression (MLR) using ADME parameters. The quality of the MLR model was validated by the leave-one-out (LOO) technique, as well as by the calculation of the statistical parameters for the developed model, and the results are discussed based on the statistical data. [Project of the Ministry of Science of the Republic of Serbia, Nos. 172012 and 172014]

Kalajdžija Nataša D.

2013-01-01

74

A Dirty Model for Multiple Sparse Regression  

CERN Document Server

Sparse linear regression -- finding an unknown vector from linear measurements -- is now known to be possible with fewer samples than variables, via methods like the LASSO. We consider the multiple sparse linear regression problem, where several related vectors -- with partially shared support sets -- have to be recovered. A natural question in this setting is whether one can use the sharing to further decrease the overall number of samples required. A line of recent research has studied the use of $\ell_1/\ell_q$ norm block-regularizations with q>1 for such problems; however these could actually perform worse in sample complexity -- vis-à-vis solving each problem separately ignoring sharing -- depending on the level of sharing. We present a new method for multiple sparse linear regression that can leverage support and parameter overlap when it exists, but not pay a penalty when it does not. A very simple idea: we decompose the parameters into two components and regularize these differently. We show both theore...

Jalali, Ali; Sanghavi, Sujay

2011-01-01

75

Prediction of coal grindability based on petrography, proximate and ultimate analysis using multiple regression and artificial neural network models  

Energy Technology Data Exchange (ETDEWEB)

The effects of proximate and ultimate analysis, maceral content, and coal rank (R_max) for a wide range of Kentucky coal samples from calorific value of 4320 to 14960 (BTU/lb) (10.05 to 34.80 MJ/kg) on Hardgrove Grindability Index (HGI) have been investigated by multivariable regression and artificial neural network (ANN) methods. The stepwise least square mathematical method shows that the relationship between (a) moisture, ash, volatile matter, and total sulfur; (b) ln (total sulfur), hydrogen, ash, ln ((oxygen + nitrogen)/carbon) and moisture; (c) ln (exinite), semifusinite, micrinite, macrinite, resinite, and R_max input sets with HGI in linear condition can achieve the correlation coefficients (R²) of 0.77, 0.75, and 0.81, respectively. The ANN, which adequately recognized the characteristics of the coal samples, can predict HGI with correlation coefficients of 0.89, 0.89 and 0.95 respectively in testing process. It was determined that ln (exinite), semifusinite, micrinite, macrinite, resinite, and R_max can be used as the best predictor for the estimation of HGI on multivariable regression (R² = 0.81) and also artificial neural network methods (R² = 0.95). The ANN based prediction method, as used in this paper, can be further employed as a reliable and accurate method in Hardgrove Grindability Index prediction. (author)

Chelgani, S. Chehreh; Jorjani, E.; Mesroghli, Sh.; Bagherieh, A.H. [Department of Mining Engineering, Research and Science Campus, Islamic Azad University, Poonak, Hesarak Tehran (Iran); Hower, James C. [Center for Applied Energy Research, University of Kentucky, 2540 Research Park Drive, Lexington, KY 40511 (United States)

2008-01-15

76

Flexible Model Selection Criterion for Multiple Regression  

Directory of Open Access Journals (Sweden)

Predictors of a multiple linear regression equation selected by GCV (Generalized Cross Validation) may include undesirable predictors that have no linear functional relationship with the target variable and are chosen only by accident. This is because GCV estimates prediction error, but does not control the probability of selecting irrelevant predictors of the target variable. To take this possibility into account, a new statistic “GCVf” (“f” stands for “flexible”) is suggested. The rigidness in accepting predictors by GCVf is adjustable; GCVf is a natural generalization of GCV. For example, GCVf is designed so that the probability of erroneously identifying linear relationships is 5 percent when none of the predictors has a linear relationship with the target variable. Predictors of the multiple linear regression equation selected by this method are highly likely to have linear relationships with the target variable.

Kunio Takezawa

2012-10-01
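The GCV criterion discussed in this entry can be illustrated with a small sketch. For an ordinary least-squares fit with n observations and p fitted parameters, a commonly used form is GCV = (RSS/n) / (1 − p/n)². The Python below is a minimal illustration under that assumption; the data and all function names are hypothetical, and it does not implement the GCVf statistic, whose definition is not given in the abstract.

```python
def gcv(rss, n, p):
    """GCV score for a least-squares fit: (RSS/n) / (1 - p/n)^2."""
    return (rss / n) / (1.0 - p / n) ** 2


def fit_line(x, y):
    """Closed-form simple linear regression; returns (intercept, slope, rss)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, rss


# Hypothetical data with a clear linear trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
n = len(y)

# Candidate 1: intercept-only model (p = 1).
mean_y = sum(y) / n
rss_null = sum((yi - mean_y) ** 2 for yi in y)
gcv_null = gcv(rss_null, n, 1)

# Candidate 2: intercept plus one predictor (p = 2).
_, _, rss_line = fit_line(x, y)
gcv_line = gcv(rss_line, n, 2)

# The model with the smaller GCV score is preferred.
best = "line" if gcv_line < gcv_null else "intercept-only"
print(best, gcv_null, gcv_line)
```

Because GCV only estimates prediction error, a predictor that lowers the score slightly by chance would still be selected here, which is the weakness the abstract describes.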

77

A comparison on parameter-estimation methods in multiple regression analysis with existence of multicollinearity among independent variables  

Directory of Open Access Journals (Sweden)

The objective of this research is to compare methods for estimating multiple regression coefficients when multicollinearity exists among the independent variables. The estimation methods are the Ordinary Least Squares (OLS), Restricted Least Squares (RLS), Restricted Ridge Regression (RRR) and Restricted Liu (RL) methods, compared both when the restrictions are true and when they are not true. The study used Monte Carlo simulation, with the experiment repeated 1,000 times under each situation. The results are as follows. CASE 1: The restrictions are true. In all cases, the RRR and RL methods have a smaller Average Mean Square Error (AMSE) than the OLS and RLS methods, respectively. The RRR method provides the smallest AMSE when the level of correlation is high, and also for all levels of correlation and all sample sizes when the standard deviation is equal to 5. However, the RL method provides the smallest AMSE when the level of correlation is low or medium, except that with a standard deviation equal to 3 and small sample sizes the RRR method provides the smallest AMSE. The AMSE varies, from most to least, with the level of correlation, the standard deviation and the number of independent variables, and inversely with the sample size. CASE 2: The restrictions are not true. In all cases the RRR method provides the smallest AMSE, except when the standard deviation is equal to 1 and the error of restrictions is equal to 5%: there the OLS method provides the smallest AMSE when the level of correlation is low or medium and the sample size is large, whereas for small sample sizes the RL method provides the smallest AMSE. In addition, when the error of restrictions is increased, the OLS method provides the smallest AMSE for all levels of correlation and all sample sizes, except when the level of correlation is high and the sample size is small.
Moreover, in the cases where the OLS method provides the smallest AMSE, the RLS method mostly has a smaller AMSE than the RRR and RL methods when the level of correlation is low or medium and sample sizes are large. The AMSE varies, from most to least, with the error of restrictions, the level of correlation, the standard deviation and the number of independent variables, and inversely with the sample size, except that the error of restrictions does not affect the AMSE of the OLS method.

Hukharnsusatrue, A.

2005-11-01
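To make the comparison in this entry more concrete, the sketch below contrasts OLS with a plain ridge estimator for two nearly collinear predictors. It is only an illustration: it uses the ordinary (unrestricted) ridge estimator b = (X'X + kI)⁻¹X'y on centered data, not the restricted RRR or RL estimators studied in the paper, and the data, the ridge constant k, and the function names are all hypothetical.

```python
from math import sqrt


def solve_sym_2x2(a11, a12, a22, b1, b2):
    """Solve [[a11, a12], [a12, a22]] @ [u, v] = [b1, b2] by Cramer's rule."""
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def ridge_two_predictors(x1, x2, y, k=0.0):
    """Slopes of a two-predictor ridge fit on centered data; k = 0 gives OLS."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * t for a, t in zip(c1, cy))
    s2y = sum(b * t for b, t in zip(c2, cy))
    # Adding k to the diagonal stabilises the nearly singular system.
    return solve_sym_2x2(s11 + k, s12, s22 + k, s1y, s2y)


def coef_norm(b):
    """Euclidean length of a coefficient pair."""
    return sqrt(b[0] ** 2 + b[1] ** 2)


# Hypothetical data: x2 is almost a copy of x1 (strong multicollinearity).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
y = [2.0, 4.1, 6.1, 7.9, 10.2, 11.9]

b_ols = ridge_two_predictors(x1, x2, y, k=0.0)
b_ridge = ridge_two_predictors(x1, x2, y, k=1.0)
print("OLS  :", b_ols, coef_norm(b_ols))
print("ridge:", b_ridge, coef_norm(b_ridge))
```

The ridge coefficients are shrunk relative to OLS; a simulation study such as the one described would repeat fits like these over many generated data sets and compare the resulting mean square errors.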

78

Estimating the input function non-invasively for FDG-PET quantification with multiple linear regression analysis: simulation and verification with in vivo data  

International Nuclear Information System (INIS)

A novel statistical method, namely Regression-Estimated Input Function (REIF), is proposed in this study for the purpose of non-invasive estimation of the input function for fluorine-18 2-fluoro-2-deoxy-d-glucose positron emission tomography (FDG-PET) quantitative analysis. We collected data from 44 patients who had undergone a blood sampling procedure during their FDG-PET scans. First, we generated tissue time-activity curves of the grey matter and the whole brain with a segmentation technique for every subject. Summations of different intervals of these two curves were used as a feature vector, which also included the net injection dose. Multiple linear regression analysis was then applied to find the correlation between the input function and the feature vector. After a simulation study with in vivo data, the data of 29 patients were applied to calculate the regression coefficients, which were then used to estimate the input functions of the other 15 subjects. Comparing the estimated input functions with the corresponding real input functions, the averaged error percentages of the area under the curve and the cerebral metabolic rate of glucose (CMRGlc) were 12.13±8.85 and 16.60±9.61, respectively. Regression analysis of the CMRGlc values derived from the real and estimated input functions revealed a high correlation (r=0.91). No significant difference was found between the real CMRGlc and that derived from our regression-estimated input function (Student's t test, P>0.05). The proposed REIF method demonstrated good abilities for input function and CMRGlc estimation, and represents a reliable replacement for the blood sampling procedures in FDG-PET quantification. (orig.)

2004-05-01

79

A comparison between Joint Regression Analysis and the Additive Main and Multiplicative Interaction model: the robustness with increasing amounts of missing data  

Scientific Electronic Library Online (English)

This paper joins the main properties of joint regression analysis (JRA), a model based on the Finlay-Wilkinson regression to analyse multi-environment trials, and of the additive main effects and multiplicative interaction (AMMI) model. The study compares JRA and AMMI with particular focus on robustness with increasing amounts of randomly selected missing data. The application is made using a data set from a breeding program of durum wheat (Triticum turgidum L., Durum Group) conducted in Portugal. The two models yield similar dominant cultivars (JRA) and winners of mega-environments (AMMI) for the same environments. However, JRA had more stable results with the increase in the incidence rates of missing values.

Paulo Canas, Rodrigues; Dulce Gamito Santinhos, Pereira; João Tiago, Mexia.

80

Multiple Regression Analysis of Reading Performance Data from Twin Pairs with Reading Difficulties and Non-twin Siblings: The Augmented Model  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The augmented multiple regression model for the analysis of data from selected twin pairs was extended to facilitate analyses of data from twin pairs and non-twin siblings. Fitting this extended model to data from both selected twin pairs and siblings yields direct estimates of heritability (h²) and the difference between environmental influences shared by members of twin pairs and those of sib or twin/sib pairs [i.e., c²(t) − c²(s)]. When this model was fitted to reading performance data f...

Wadsworth, S. J.; Olson, R. K.; Willcutt, E. G.; Defries, J. C.

2012-01-01

81

Comparison of two-concentration with multi-concentration linear regressions: Retrospective data analysis of multiple regulated LC-MS bioanalytical projects.  

Science.gov (United States)

Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). 
Furthermore, examples are given of how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis, and it saves significant time and cost as well. PMID:23917407

Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi

2013-09-01
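The comparison described in this entry can be sketched as follows: fit one calibration line through all concentration levels and another through only the lowest and highest levels, then back-calculate a sample concentration from each. This is a minimal illustration, assuming unweighted least squares (the abstract does not state the weighting used), with hypothetical calibration data and function names.

```python
def least_squares_line(conc, resp):
    """Unweighted least-squares calibration line; returns (slope, intercept)."""
    n = len(conc)
    mc = sum(conc) / n
    mr = sum(resp) / n
    sxy = sum((c - mc) * (r - mr) for c, r in zip(conc, resp))
    sxx = sum((c - mc) ** 2 for c in conc)
    slope = sxy / sxx
    return slope, mr - slope * mc


def two_point_line(conc, resp):
    """Line through the lowest and highest calibration levels only."""
    lo = min(range(len(conc)), key=lambda i: conc[i])
    hi = max(range(len(conc)), key=lambda i: conc[i])
    slope = (resp[hi] - resp[lo]) / (conc[hi] - conc[lo])
    return slope, resp[lo] - slope * conc[lo]


def back_calculate(response, slope, intercept):
    """Convert an instrument response to a concentration."""
    return (response - intercept) / slope


# Hypothetical eight-level calibration curve (concentration, response).
conc = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0, 200.0]
resp = [0.52, 1.01, 2.48, 5.05, 9.90, 25.3, 49.6, 100.4]

multi = least_squares_line(conc, resp)
two = two_point_line(conc, resp)

# Back-calculate a hypothetical QC sample response with both lines.
qc_response = 40.0
qc_multi = back_calculate(qc_response, *multi)
qc_two = back_calculate(qc_response, *two)
print(qc_multi, qc_two, 100.0 * (qc_two - qc_multi) / qc_multi)
```

The last printed value is the percentage difference between the two back-calculated concentrations, the kind of quantity the retrospective analysis summarises across projects.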

82

Commonality Analysis for the Regression Case.  

Science.gov (United States)

Commonality analysis is a procedure for decomposing the coefficient of determination (R²) in multiple regression analyses into the percent of variance in the dependent variable associated with each independent variable uniquely, and the proportion of explained variance associated with the common effects of predictors in various…

Murthy, Kavita
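For the two-predictor case, the decomposition described in this entry can be written out directly: the unique part of X1 is R²(X1,X2) − R²(X2), the unique part of X2 is R²(X1,X2) − R²(X1), and the common part is R²(X1) + R²(X2) − R²(X1,X2). The Python below is a minimal sketch of that case only, using the standard two-predictor formula for R² in terms of pairwise correlations; the data and function names are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


def commonality_two_predictors(y, x1, x2):
    """Split the full-model R^2 into unique and common components."""
    r_y1 = pearson(y, x1)
    r_y2 = pearson(y, x2)
    r_12 = pearson(x1, x2)
    r2_1 = r_y1 ** 2  # R^2 with x1 alone
    r2_2 = r_y2 ** 2  # R^2 with x2 alone
    # R^2 of the two-predictor model from the pairwise correlations.
    r2_full = (r2_1 + r2_2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)
    return {
        "unique_x1": r2_full - r2_2,
        "unique_x2": r2_full - r2_1,
        "common": r2_1 + r2_2 - r2_full,
        "r2_full": r2_full,
    }


# Hypothetical data with two correlated predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.0, 3.5, 7.2, 7.1, 11.3, 10.8]

parts = commonality_two_predictors(y, x1, x2)
for name, value in parts.items():
    print(name, round(value, 4))
```

The three components always sum to the full-model R²; the common component can be negative when suppression is present, so it should not be read as a proportion in every case.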

83

Suppression Situations in Multiple Linear Regression  

Science.gov (United States)

This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…

Shieh, Gwowen

2006-01-01

84

Exploring the equity of GP practice prescribing rates for selected coronary heart disease drugs: a multiple regression analysis with proxies of healthcare need  

Directory of Open Access Journals (Sweden)

Background: There is a small but growing body of literature highlighting inequities in GP practice prescribing rates for many drug therapies. The aim of this paper is to further explore the equity of prescribing for five major CHD drug groups and to quantify the amount of variation in GP practice prescribing rates that can be explained by a range of healthcare needs indicators (HCNIs). Methods: The study involved a cross-sectional secondary analysis in four primary care trusts (PCTs 1–4) in the North West of England, including 132 GP practices. Prescribing rates (average daily quantities per registered patient aged over 35 years) and HCNIs were developed for all GP practices. Analysis was undertaken using multiple linear regression. Results: Between 22–25% of the variation in prescribing rates for statins, beta-blockers and bendrofluazide was explained in the multiple regression models. Slightly more variation was explained for ACE inhibitors (31.6%) and considerably more for aspirin (51.2%). Prescribing rates were positively associated with CHD hospital diagnoses and procedures for all drug groups other than ACE inhibitors. The proportion of patients aged 55–74 years was positively related to all prescribing rates other than aspirin, where they were positively related to the proportion of patients aged >75 years. However, prescribing rates for statins and ACE inhibitors were negatively associated with the proportion of patients aged >75 years in addition to the proportion of patients from minority ethnic groups. Prescribing rates for aspirin, bendrofluazide and all CHD drugs combined were negatively associated with deprivation. Conclusion: Although around 25–50% of the variation in prescribing rates was explained by HCNIs, this varied markedly between PCTs and drug groups.
Prescribing rates were generally characterised by both positive and negative associations with HCNIs, suggesting possible inequities in prescribing rates on the basis of ethnicity, deprivation and the proportion of patients aged over 75 years (for statins and ACE inhibitors, but not for aspirin).

St Leger Antony S

2005-02-01

85

Analysis and estimative of schistosomiasis prevalence for the state of Minas Gerais, Brazil, using multiple regression with social and environmental spatial data  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The aim of this work is to establish a relationship between schistosomiasis prevalence and social-environmental variables, in the state of Minas Gerais, Brazil, through multiple linear regression. The final regression model was established, after a variables selection phase, with a set of spatial variables which contains the summer minimum temperature, human development index, and vegetation type variables. Based on this model, a schistosomiasis risk map was built for Minas Gerais.

2006-01-01

86

Fuzzy multiple linear regression: A computational approach  

Science.gov (United States)

This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.

Juang, C. H.; Huang, X. H.; Fleming, J. W.

1992-01-01

87

Retail sales forecasting with application the multiple regression  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The article begins with a formulation for predictive learning called the multiple regression model. The theoretical approach to constructing regression models is described. The key information of the article is the mathematical formulation of the linear forecast equation that estimates the multiple regression model. Calculation of the quantitative value of the dependent variable forecast under the influence of independent variables is explained. This paper presents the retail sales forecasting with mult...

2012-01-01

88

Retail sales forecasting with application the multiple regression  

Directory of Open Access Journals (Sweden)

The article begins with a formulation for predictive learning called the multiple regression model. The theoretical approach to constructing regression models is described. The key information of the article is the mathematical formulation of the linear forecast equation that estimates the multiple regression model. Calculation of the quantitative value of the dependent variable forecast under the influence of independent variables is explained. This paper presents retail sales forecasting with multiple model estimation. Some of the most important decisions a retailer makes can be informed by multiple regression. Recently, the retail environment has been changing as a result of expected consumer income and advertising costs. Checks of the model's goodness of fit and statistical significance are explored in the article. Finally, the quantitative value of the retail sales forecast based on the multiple regression model is calculated.

Kuzhda, Tetyana

2012-05-01

89

Synthesis analysis of regression models with a continuous outcome  

Digital Repository Infrastructure Vision for European Research (DRIVER)

To estimate the multivariate regression model from multiple individual studies, it would be challenging to obtain results if the input from individual studies only provide univariate or incomplete multivariate regression information. Samsa et al. (J. Biomed. Biotechnol. 2005; 2:113–123) proposed a simple method to combine coefficients from univariate linear regression models into a multivariate linear regression model, a method known as synthesis analysis. However, the validity of this meth...

2009-01-01

90

Non-destructive evaluation of chlorophyll content in quinoa and amaranth leaves by simple and multiple regression analysis of RGB image components.  

Science.gov (United States)

Leaf chlorophyll content provides valuable information about physiological status of plants; it is directly linked to photosynthetic potential and primary production. In vitro assessment by wet chemical extraction is the standard method for leaf chlorophyll determination. This measurement is expensive, laborious, and time consuming. Over the years alternative methods, rapid and non-destructive, have been explored. The aim of this work was to evaluate the applicability of a fast and non-invasive field method for estimation of chlorophyll content in quinoa and amaranth leaves based on RGB components analysis of digital images acquired with a standard SLR camera. Digital images of leaves from different genotypes of quinoa and amaranth were acquired directly in the field. Mean values of each RGB component were evaluated via image analysis software and correlated to leaf chlorophyll provided by standard laboratory procedure. Single and multiple regression models using RGB color components as independent variables have been tested and validated. The performance of the proposed method was compared to that of the widely used non-destructive SPAD method. Sensitivity of the best regression models for different genotypes of quinoa and amaranth was also checked. Color data acquisition of the leaves in the field with a digital camera was quick, more effective, and lower cost than SPAD. The proposed RGB models provided better correlation (highest R²) and prediction (lowest RMSEP) of the true value of foliar chlorophyll content and had a lower amount of noise in the whole range of chlorophyll studied compared with SPAD and other leaf image processing based models when applied to quinoa and amaranth. PMID:24442792

Riccardi, M; Mele, G; Pulvento, C; Lavini, A; d'Andria, R; Jacobsen, S-E

2014-06-01

91

Multiple Regression Analysis of Reading Performance Data from Twin Pairs with Reading Difficulties and Non-twin Siblings: The Augmented Model  

Science.gov (United States)

The augmented multiple regression model for the analysis of data from selected twin pairs was extended to facilitate analyses of data from twin pairs and non-twin siblings. Fitting this extended model to data from both selected twin pairs and siblings yields direct estimates of heritability (h²) and the difference between environmental influences shared by members of twin pairs and those of sib or twin/sib pairs [i.e., c²(t) − c²(s)]. When this model was fitted to reading performance data from 293 MZ and 436 DZ pairs selected for reading difficulties, and 291 of their non-twin siblings, h² = .48 ± .22, p = .03, and c²(t) − c²(s) = .22 ± .12, p = .06. Although the test for differential shared environmental influences is only marginally significant, the results of this analysis suggest that environmental influences on reading performance that are shared by members of twin pairs (.36) may be substantially greater than those for less contemporaneous twin/sibling pairs (.14).

Wadsworth, S.J.; Olson, R. K.; Willcutt, E.G.; DeFries, J.C.

2011-01-01

92

The role of chemometrics in single and sequential extraction assays: a review. Part II. Cluster analysis, multiple linear regression, mixture resolution, experimental design and other techniques.  

Science.gov (United States)

Single and sequential extraction procedures are used for studying element mobility and availability in solid matrices, like soils, sediments, sludge, and airborne particulate matter. In the first part of this review we reported an overview on these procedures and described the applications of chemometric uni- and bivariate techniques and of multivariate pattern recognition techniques based on variable reduction to the experimental results obtained. The second part of the review deals with the use of chemometrics not only for the visualization and interpretation of data, but also for the investigation of the effects of experimental conditions on the response, the optimization of their values and the calculation of element fractionation. We will describe the principles of the multivariate chemometric techniques considered, the aims for which they were applied and the key findings obtained. The following topics will be critically addressed: pattern recognition by cluster analysis (CA), linear discriminant analysis (LDA) and other less common techniques; modelling by multiple linear regression (MLR); investigation of spatial distribution of variables by geostatistics; calculation of fractionation patterns by a mixture resolution method (Chemometric Identification of Substrates and Element Distributions, CISED); optimization and characterization of extraction procedures by experimental design; other multivariate techniques less commonly applied. PMID:21334477

Giacomino, Agnese; Abollino, Ornella; Malandrino, Mery; Mentasti, Edoardo

2011-03-01

93

A Software Tool for Regression Analysis and its Assumptions  

Directory of Open Access Journals (Sweden)

Regression analysis is currently among the most important forecasting methods. Its aim is to estimate the population regression model as accurately as possible on the basis of the sample regression function. Its results are valid under certain assumptions, and violations of these assumptions invalidate some properties of the estimators. In this study, a new object-oriented program focused solely on regression analysis and its assumptions has been developed in Java, to carry out this analysis more easily and in a shorter time. The program provides regression model selection, regression and correlation analysis with the least squares method, one test for every assumption, and solution methods. All the results of the analysis are illustrated using a multiple regression example.

Sona Mardikyan

2006-01-01

94

A multiple covariance approach to PLS regression with several predictor groups: Structural Equation Exploratory Regression  

Digital Repository Infrastructure Vision for European Research (DRIVER)

A variable group Y is assumed to depend upon R thematic variable groups X_1, ..., X_R. We assume that components in Y depend linearly upon components in the X_r's. In this work, we propose a multiple covariance criterion which extends that of PLS regression to this multiple predictor groups situation. On this criterion, we build a PLS-type exploratory method - Structural Equation Exploratory Regression (SEER) - that allows one to simultaneously perform dimension reduction in groups and investigat...

2008-01-01

95

Regression analysis with categorized regression calibrated exposure: some interesting findings  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile) scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC). Results In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis.
Conclusion Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a percentile scale. Relating back to the original scale of the exposure solves the problem. The conclusion regards all regression models.

Hjartåker Anette

2006-07-01

96

Clearness index in cloudy days estimated with meteorological information by multiple regression analysis; Kisho joho wo riyoshita kaiki bunseki ni yoru dontenbi no seiten shisu no suitei  

Energy Technology Data Exchange (ETDEWEB)

Work is under way on more accurate prediction of solar radiation in order to improve the efficiency of solar energy utilization. Using a technique that roughly estimates the day's clearness index from the weather forecast, the forecast (made up of weather conditions such as 'clear' and 'cloudy' and modifiers such as 'afterward', 'temporary', and 'intermittent') has been quantified relative to the clearness index. This quantity is called the 'weather index' in this article. The error rate of the weather index is high on cloudy days, that is, when the weather index falls in the range 0.2-0.5. It has also been found that the clearness index is highly correlated with the north-south wind direction component. A multiple regression analysis was therefore carried out to estimate the clearness index from the maximum temperature and the north-south wind direction component. Compared with estimation of the clearness index from the weather index alone, estimation using the weather index and maximum temperature achieves a 3% improvement throughout the year. Estimation using the weather index and the north-south wind direction component gives a 2% improvement in summer and an improvement of 5% or more in winter. 2 refs., 6 figs., 4 tabs.

Nakagawa, S. [Maizuru National College of Technology, Kyoto (Japan); Kenmoku, Y.; Sakakibara, T. [Toyohashi University of Technology, Aichi (Japan); Kawamoto, T. [Shizuoka University, Shizuoka (Japan). Faculty of Engineering

1996-10-27

97

A multiple stepwise logistic regression analysis of trauma history and 16 other history and dental cofactors in females with temporomandibular disorders.  

Science.gov (United States)

The simultaneous contribution of 11 occlusal factors, dental attrition severity, orthodontic history, trauma (motor vehicle accident [MVA] and non-MVA), and age in defining two independent large populations of females diagnosed with five mutually exclusive temporomandibular disorders was tested through multiple stepwise logistic regression analysis. Non-MVA trauma was significant in both groups in defining disc displacement (DD) with and without reduction, and osteoarthrosis (OA) (both primary and following DD). Anterior open bite was also a significant factor in defining OA in both groups. Much smaller contributions were also made by missing teeth in one of the populations with OA following DD, and by retruded contact position-intercuspal position slide lengths and overjet in one of the primary OA populations. Motor vehicle accident trauma was significant in defining myofascial pain (MP) in both populations, and laterotrusive attrition mildly defined MP in one population. Only a minority of total variance was explained: 6% to 8% of DD with reduction; 10% to 14% of DD without reduction; 11% to 20% of OA following DD; 17% to 38% of primary OA; and 4% to 10% of MP. Non-MVA trauma was the major defining feature of the temporomandibular joint intracapsular disorders, and MVA trauma explained a very small percentage of the MP patients. Implications are discussed and recommendations are made for future research. PMID:9161240

Seligman, D A; Pullinger, A G

1996-01-01

98

Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression  

Science.gov (United States)

The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…

Beckstead, Jason W.

2012-01-01

99

Regression Commonality Analysis: A Technique for Quantitative Theory Building  

Science.gov (United States)

When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…

Nimon, Kim; Reio, Thomas G., Jr.

2011-01-01

100

Sample Sizes when Using Multiple Linear Regression for Prediction  

Science.gov (United States)

When using multiple regression for prediction purposes, the issue of minimum required sample size often needs to be addressed. Using a Monte Carlo simulation, models with varying numbers of independent variables were examined and minimum sample sizes were determined for multiple scenarios at each number of independent variables. The scenarios…

Knofczynski, Gregory T.; Mundfrom, Daniel

2008-01-01

 
 
 
 
101

Using Quantile Regression for Duration Analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Quantile regression methods are emerging as a popular technique in econometrics and biometrics for exploring the distribution of duration data. This paper discusses quantile regression for duration analysis allowing for a flexible specification of the functional relationship and of the error distribution. Censored quantile regression addresses the issue of right censoring of the response variable, which is common in duration analysis. We compare quantile regression to standard duration models. Q...

Fitzenberger, Bernd; Wilke, Ralf A.

2005-01-01

102

Regression and regression analysis time series prediction modeling on climate data of Quetta, Pakistan  

International Nuclear Information System (INIS)

Various statistical techniques were used on five-year data (1998-2002) of average humidity, rainfall, and maximum and minimum temperatures. The relationships to regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters, on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit to our polynomial regression analysis time series (PRATS). The correlations to multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, and indicated homoscedasticity in both. We also employed Bartlett's test for homogeneity of variances on the five-year rainfall and humidity data, which showed that the variances in the rainfall data were not homogeneous, while those in the humidity data were. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)

2007-01-01

103

Significant Tests of Coefficient Multiple Regressions by using Permutation Methods  

Directory of Open Access Journals (Sweden)

Full Text Available Tests of significance of a single partial regression coefficient in a multiple regression model are often made in situations where the standard assumptions underlying the probability calculation (for example, the assumption of normality of the random error term) do not hold. When the random error term fails to fulfill some of these assumptions, one needs to resort to nonparametric methods to carry out statistical inference. Permutation methods are a branch of nonparametric methods. This study compared, by simulation, the empirical type one error of different permutation strategies that have been proposed for testing the nullity of a partial regression coefficient in a multiple regression model, and shows that the type one error of the Freedman and Lane strategy is lower than that of the other methods.

Ali Shadrokh

2011-01-01
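
Illustrative note on entry 103: the abstract does not give the details of the permutation strategies it compares, so the following is only a minimal sketch of one common formulation of the Freedman and Lane procedure (permuting the residuals of the reduced model that omits the tested predictor). The function names `t_stat` and `freedman_lane_pvalue`, the simulated data, the heavy-tailed error distribution, and the number of permutations are all assumptions made for illustration, not taken from the paper.

```python
import numpy as np

def t_stat(X, y, j):
    """t statistic of coefficient j in the least-squares fit y ~ X."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)
    return beta[j] / np.sqrt(sigma2 * XtX_inv[j, j])

def freedman_lane_pvalue(y, x, Z, n_perm=999, seed=0):
    """Two-sided permutation p-value for the coefficient of x in y ~ Z + x.

    Z holds the nuisance predictors (including the intercept column).
    Residuals of the reduced model y ~ Z are permuted and added back to
    the reduced-model fit; the full model is refitted on each pseudo-response.
    """
    rng = np.random.default_rng(seed)
    X_full = np.column_stack([Z, x])
    j = X_full.shape[1] - 1              # index of the tested coefficient
    t_obs = t_stat(X_full, y, j)

    gamma, *_ = np.linalg.lstsq(Z, y, rcond=None)
    fitted = Z @ gamma                   # reduced-model fit
    resid = y - fitted                   # reduced-model residuals

    count = 0
    for _ in range(n_perm):
        y_star = fitted + rng.permutation(resid)
        if abs(t_stat(X_full, y_star, j)) >= abs(t_obs):
            count += 1
    return (count + 1) / (n_perm + 1)

# Hypothetical data with non-normal (Student t, 3 df) errors.
rng = np.random.default_rng(42)
n = 100
z = rng.normal(size=n)
x = 0.5 * z + rng.normal(size=n)         # tested predictor, correlated with z
Z = np.column_stack([np.ones(n), z])
errors = rng.standard_t(df=3, size=n)
y_null = 1.0 + 0.7 * z + errors          # true coefficient of x is 0
y_effect = y_null + 2.0 * x              # true coefficient of x is 2

p_null = freedman_lane_pvalue(y_null, x, Z)
p_effect = freedman_lane_pvalue(y_effect, x, Z)
print(p_null, p_effect)
```

Counting the observed statistic among the permutations (the `+ 1` in numerator and denominator) keeps the p-value strictly positive, which is a common convention for Monte Carlo permutation tests.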

104

Local Constant and Local Bilinear Multiple-Output Quantile Regression  

Digital Repository Infrastructure Vision for European Research (DRIVER)

A new quantile regression concept, based on a directional version of Koenker and Bassett’s traditional single-output one, has been introduced in [Hallin, Paindaveine and Šiman, Annals of Statistics 2010, 635-703] for multiple-output regression problems. The polyhedral contours provided by the empirical counterpart of that concept, however, cannot adapt to nonlinear and/or heteroskedastic dependencies. This paper therefore introduces local constant and local linear versions of those contou...

Hallin, Marc; Lu, Zudi; Paindaveine, Davy; Siman, Miroslav

2012-01-01

105

Significant Tests of Coefficient Multiple Regressions by using Permutation Methods  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Tests of significance of a single partial regression coefficient in a multiple regression model are often made in situations where the standard assumptions underlying the probability calculation (for example, the assumption of normality of the random error term) do not hold. When the random error term fails to fulfill some of these assumptions, one needs to resort to nonparametric methods to carry out statistical inference. Permutation methods are a branch of nonparametric methods. This study c...

Ali Shadrokh

2011-01-01

106

Relative risk regression analysis of epidemiologic data.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Relative risk regression methods are described. These methods provide a unified approach to a range of data analysis problems in environmental risk assessment and in the study of disease risk factors more generally. Relative risk regression methods are most readily viewed as an outgrowth of Cox's regression and life model. They can also be viewed as a regression generalization of more classical epidemiologic procedures, such as that due to Mantel and Haenszel. In the context of an epidemiolog...

Prentice, R. L.

1985-01-01

107

Overdispersed logistic regression for SAGE: Modelling multiple groups and covariates  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Two major identifiable sources of variation in data derived from the Serial Analysis of Gene Expression (SAGE) are within-library sampling variability and between-library heterogeneity within a group. Most published methods for identifying differential expression focus on just the sampling variability. In recent work, the problem of assessing differential expression between two groups of SAGE libraries has been addressed by introducing a beta-binomial hierarchical model that explicitly deals with both of the above sources of variation. This model leads to a test statistic analogous to a weighted two-sample t-test. When the number of groups involved is more than two, however, a more general approach is needed. Results We describe how logistic regression with overdispersion supplies this generalization, carrying with it the framework for incorporating other covariates into the model as a byproduct. This approach has the advantage that logistic regression routines are available in several common statistical packages. Conclusions The described method provides an easily implemented tool for analyzing SAGE data that correctly handles multiple types of variation and allows for more flexible modelling.

Morris Jeffrey S

2004-10-01

108

Application of Partial Least-Squares Regression Model on Temperature Analysis and Prediction of RCCD  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This study, based on the temperature monitoring data of the Jiangya RCCD, uses the principle and method of partial least-squares regression to analyze and predict the temperature variation of the RCCD. By establishing a partial least-squares regression model, the multiple correlation among the independent variables is overcome, and an organic combination of multiple linear regression and canonical correlation analysis is achieved. Compared with general least-squares regression model result, it is more ...

2013-01-01

109

Functional linear regression via canonical analysis  

CERN Document Server

We study regression models for the situation where both dependent and independent variables are square-integrable stochastic processes. Questions concerning the definition and existence of the corresponding functional linear regression models and some basic properties are explored for this situation. We derive a representation of the regression parameter function in terms of the canonical components of the processes involved. This representation establishes a connection between functional regression and functional canonical analysis and suggests alternative approaches for the implementation of functional linear regression analysis. A specific procedure for the estimation of the regression parameter function using canonical expansions is proposed and compared with an established functional principal component regression approach. As an example of an application, we present an analysis of mortality data for cohorts of medflies, obtained in experimental studies of aging and longevity.

He, Guozhong; Wang, Jane-Ling; Yang, Wenjing (DOI: 10.3150/09-BEJ228)

2011-01-01

110

Forecasting Financial Time Series Using Multiple Regression, Multi Layer Perception, Radial Basis Function and Adaptive Neuro Fuzzy Inference System Models: A Comparative Analysis  

Directory of Open Access Journals (Sweden)

Full Text Available In the last few decades, techniques such as Artificial Neural Networks and Fuzzy Inference Systems have been used to develop predictive models for estimating the required parameters. More recently, Soft Computing techniques have come into use as an alternative statistical tool. Determining the nature of financial time series data is difficult, expensive, and time consuming, and involves complex tests. In this paper, we use Multi Layer Perception and Radial Basis Functions of Artificial Neural Networks and the Adaptive Neuro Fuzzy Inference System to predict S% (Financial Stress percent) of financial time series data, and compare them with the traditional statistical tool of Multiple Regression. The accuracies of the Artificial Neural Network and Adaptive Neuro Fuzzy Inference System techniques are evaluated as relatively similar. It is found that the Radial Basis Functions constructed exhibit higher performance than Multi Layer Perception, Adaptive Neuro Fuzzy Inference System and Multiple Regression for predicting S%. The performance comparison shows that the Soft Computing paradigm is a promising tool for minimizing uncertainties in financial time series data. Further, Soft Computing also minimizes the potential inconsistency of correlations.

Arindam Chaudhuri

2012-09-01

111

AN EFFECTIVE TECHNIQUE OF MULTIPLE IMPUTATION IN NONPARAMETRIC QUANTILE REGRESSION  

Directory of Open Access Journals (Sweden)

Full Text Available In this study, we consider the nonparametric quantile regression model with the covariates Missing at Random (MAR). Multiple imputation is becoming an increasingly popular approach for analyzing missing data, but its combination with quantile regression is not well-developed. We propose an effective and accurate two-stage multiple imputation method for the model based on the quantile regression, which consists of initial imputation in the first stage and multiple imputation in the second stage. The estimation procedure makes full use of the entire dataset to achieve increased efficiency and we show the proposed two-stage multiple imputation estimator to be asymptotically normal. In a simulation study, we compare the performance of the proposed imputation estimator with the Complete Case (CC) estimator and other imputation estimators, e.g., the regression imputation estimator and k-Nearest-Neighbor imputation estimator. We conclude that the proposed estimator is robust to the initial imputation and shows more desirable performance than the other methods compared. We also apply the proposed multiple imputation method to an AIDS clinical trial data set to show its practical application.

Yanan Hu

2014-01-01

112

Modeling Oil Palm Yield Using Multiple Linear Regression and Robust M-regression  

Directory of Open Access Journals (Sweden)

Full Text Available This study shows how a multiple linear regression model can be used to model palm oil yield. The methods are illustrated by examining time series data with foliar nutrient composition as one of the independent variables and fresh fruit bunch as the dependent variable. Other independent variables include the nutrient balance ratio and major nutrient composition. This modeling approach is capable of identifying the significant contribution of each independent variable in improving the modeling performance. We find that the quantile-quantile plot demonstrates the existence of outliers, and this directs us to use robust M-regression to remove the negative impact of the outliers. Results show that robust regression in this case gives better results than conventional regression in modeling oil palm yield.

Azme Khamis

2006-01-01
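
Illustrative note on entry 112: the abstract does not say which M-estimator or fitting algorithm was used, so the sketch below assumes Huber weights fitted by iteratively reweighted least squares (IRLS) with a MAD-based scale estimate. The function name `huber_irls`, the tuning constant 1.345, and the simulated data with ten contaminated points are assumptions made for illustration and do not reproduce the oil palm data.

```python
import numpy as np

def huber_irls(X, y, k=1.345, n_iter=50, tol=1e-8):
    """Huber M-regression coefficients via iteratively reweighted least squares.

    X must already contain an intercept column. The residual scale is
    re-estimated at every iteration from the median absolute deviation (MAD).
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)      # start from ordinary least squares
    for _ in range(n_iter):
        resid = y - X @ beta
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        scale = max(scale, 1e-12)                     # guard against a zero scale
        u = np.abs(resid) / scale
        # Huber weights: 1 for small residuals, k/|u| for large ones
        w = np.where(u <= k, 1.0, k / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Hypothetical data: true line y = 2 + 3x, with the ten largest-x points shifted upward.
rng = np.random.default_rng(1)
n = 100
x = rng.uniform(0.0, 10.0, size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=n)
outliers = np.argsort(x)[-10:]
y[outliers] += 40.0

X = np.column_stack([np.ones(n), x])
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_huber = huber_irls(X, y)
print("OLS slope:", beta_ols[1], "Huber slope:", beta_huber[1])
```

Because the Huber weights bound the influence of large residuals, the robust slope should stay closer to the generating value than the least-squares slope in this contaminated sample.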

113

Interpreting Multiple Linear Regression: A Guidebook of Variable Importance  

Science.gov (United States)

Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to typically reflect overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…

Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim

2012-01-01

114

Moderated Multiple Regression, Spurious Interaction Effects, and IRT  

Science.gov (United States)

Two Monte Carlo studies were conducted to explore the Type I error rates in moderated multiple regression (MMR) of observed scores and estimated latent trait scores from a two-parameter logistic item response theory (IRT) model. The results of both studies showed that MMR Type I error rates were substantially higher than the nominal alpha levels…

Kang, Sun-Mee; Waller, Niels G.

2005-01-01

115

Regression Analysis and the Sociological Imagination  

Science.gov (United States)

Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.

De Maio, Fernando

2014-01-01

116

On relationship between regression models and interpretation of multiple regression coefficients  

CERN Document Server

In this paper, we consider the problem of interpreting the coefficients of a linear regression equation in the case of correlated predictors. It is shown that in general there is no natural way of interpreting these coefficients similar to the case of a single predictor. Nevertheless, we suggest linear transformations of the predictors that reduce the multiple regression to a simple one while retaining the coefficient of the variable of interest. The new variable can be treated as the part of the old variable that has no linear statistical dependence on the other variables present.

Varaksin, A N

2012-01-01
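
Illustrative note on entry 116: one standard transformation with the property described in the abstract is residualization (the Frisch-Waugh-Lovell result), in which the predictor of interest is replaced by its residual after regression on the other predictors. Whether this is exactly the transformation proposed in the paper is an assumption; the simulated data and variable names below are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)            # x1 is correlated with x2
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

ones = np.ones(n)

# Full multiple regression: y ~ 1 + x1 + x2
beta_full = ols(np.column_stack([ones, x1, x2]), y)

# Part of x1 with no linear dependence on the other predictors (intercept and x2)
Z = np.column_stack([ones, x2])
x1_resid = x1 - Z @ ols(Z, x1)

# Simple regression of y on the residualized predictor; x1_resid has mean
# zero by construction, so the slope is a plain ratio of inner products.
slope_simple = (x1_resid @ y) / (x1_resid @ x1_resid)

print("multiple-regression coefficient of x1:", beta_full[1])
print("simple-regression slope on residualized x1:", slope_simple)
```

The two printed values agree up to floating-point error, which is the sense in which the multiple regression is reduced to a simple one without changing the coefficient of interest.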

117

A comparative analysis of the effects of instructional design factors on student success in e-learning: multiple-regression versus neural networks  

Directory of Open Access Journals (Sweden)

Full Text Available This study explores the relationship between student performance and instructional design. The research was conducted at the E-Learning School at a university in Turkey. A list of design factors that had potential influence on student success was created through a review of the literature and interviews with relevant experts. From this, the five most important design factors were chosen. The experts scored 25 university courses on the extent to which they demonstrated the chosen design factors. Multiple-regression and supervised artificial neural network (ANN) models were used to examine the relationship between student grade point averages and the scores on the five design factors. The results indicated that there is no statistical difference between the two models. Both models identified the use of examples and applications as the most influential factor. The ANN model provided more information and was used to predict the course-specific factor values required for a desired level of success.

Halil Ibrahim Cebeci

2009-12-01

118

Fundamental Analysis of the Linear Multiple Regression Technique for Quantification of Water Quality Parameters from Remote Sensing Data. Ph.D. Thesis - Old Dominion Univ.  

Science.gov (United States)

Constituents with linear radiance gradients with concentration may be quantified from signals which contain nonlinear atmospheric and surface reflection effects for both homogeneous and non-homogeneous water bodies, provided accurate data can be obtained and nonlinearities are constant with wavelength. Statistical parameters must be used which give an indication of bias as well as total squared error to ensure that an equation with an optimum combination of bands is selected. It is concluded that the effect of error in upwelled radiance measurements is to reduce the accuracy of the least square fitting process and to increase the number of points required to obtain a satisfactory fit. The problem of obtaining a multiple regression equation that is extremely sensitive to error is discussed.

Whitlock, C. H., III

1977-01-01

119

Switching regressions and activity analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We study the use of switching regression models to characterize the coefficients in linear production technologies with a finite number of activities. Maximum likelihood-based methods are proposed and different switching specifications are discussed. The viability of these newly proposed techniques is established. The methods developed combine the advantages of the two major approaches to frontier estimation: the functional flexibility of the linear programing-nonparametric and nonstatistica...

Ley, Eduardo

1992-01-01

120

A Comparison between the Linear Neural Network Method and the Multiple Linear Regression Method in the Modeling of Continuous Data  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Both linear neural network and multiple linear regression models can be used for multi-factor analysis and forecasting, but the data of the multiple linear regression model are required to meet such conditions as independence and normality, while the data of the linear neural network are only required to have a linear relationship. This article uses the same set of data to establish respectively a linear neural network model and a multiple linear regression model, compares the abilities of fi...

2011-01-01

 
 
 
 
121

Regression analysis of cytopathological data  

Energy Technology Data Exchange (ETDEWEB)

Epithelial cells from the human body are frequently labelled according to one of several ordered levels of abnormality, ranging from normal to malignant. The label of the most abnormal cell in a specimen determines the score for the specimen. This paper presents a model for the regression of specimen scores against continuous and discrete variables, as in host exposure to carcinogens. Application to data and tests for adequacy of model fit are illustrated using sputum specimens obtained from a cohort of former asbestos workers.

Whittemore, A.S.; McLarty, J.W.; Fortson, N.; Anderson, K.

1982-12-01

122

Modeling Oil Palm Yield Using Multiple Linear Regression and Robust M-regression  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This study shows how a multiple linear regression model can be used to model palm oil yield. The methods are illustrated by examining time series data with foliar nutrient composition as one of the independent variables and fresh fruit bunch as the dependent variable. Other independent variables include the nutrient balance ratio and major nutrient composition. This modeling approach is capable of identifying the significant contribution of each independent variable in improving the modelin...

2006-01-01

123

Linear regression analysis theory and computing  

CERN Document Server

This volume presents in detail the fundamental theories of linear regression analysis and diagnosis, as well as the relevant statistical computing techniques so that readers are able to actually model the data using the methods and techniques described in the book. It covers the fundamental theories in linear regression analysis and is extremely useful for future research in this area. Examples of regression analysis using the Statistical Analysis System (SAS) are also included. This book is suitable for graduate students who are either majoring in statistics/biostatistics or using line

Yan, Xin

2009-01-01

124

Logistic regression analysis of multiple noninvasive tests for the prediction of the presence and extent of coronary artery disease in men  

International Nuclear Information System (INIS)

The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy

1985-01-01

125

Logistic regression analysis of multiple noninvasive tests for the prediction of the presence and extent of coronary artery disease in men  

Energy Technology Data Exchange (ETDEWEB)

The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy.

Hung, J.; Chaitman, B.R.; Lam, J.; Lesperance, J.; Dupras, G.; Fines, P.; Cherkaoui, O.; Robert, P.; Bourassa, M.G.

1985-08-01

126

Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis  

Science.gov (United States)

When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…

Kim, Rae Seon

2011-01-01

127

Using multiple regression analysis to estimate the contributions of engine radiated noise components; Jukaiki bunseki wo mochiita engine hoshaon no kiyo suitei  

Energy Technology Data Exchange (ETDEWEB)

In reducing noise from direct-injection diesel engines, it is important to rank noise-reduction measures after identifying the contributions of combustion noise and mechanical noise. Conventionally, attention has focused on combustion noise, and its reduction has often been attempted by reducing the combustion vibration force and improving transmission systems. During the development of direct-injection diesel engines, however, cases were noted in which the correlation between engine noise and combustion vibration force was not high under some driving conditions. This paper shows, by means of a withdrawal experiment, that not only combustion noise but also mechanical noise, especially noise with the injection pump as a vibration source, contributes strongly to engine noise. It was also made clear that the noise power is nearly proportional to engine load. The paper then describes a regression analysis, used as a simple method for identifying these contributions, with engine noise as the criterion (dependent) variable and cylinder pressure and engine load as explanatory variables. It also reports estimates of the contributions of combustion noise, mechanical noise, and load-dependent noise, together with a verification of their accuracy. 3 refs., 11 figs., 2 tabs.

Hirano, I.; Kondo, M.; Uraki, Y.; Asahara, Y. [Nissan Motor Co. Ltd. Tokyo (Japan)]

1998-05-01

128

Forecasting relativistic electron flux using dynamic multiple regression models  

Directory of Open Access Journals (Sweden)

Full Text Available The forecast of high energy electron fluxes in the radiation belts is important because the exposure of modern spacecraft to high energy particles can result in significant damage to onboard systems. A comprehensive physical model of processes related to electron energisation that can be used for such a forecast has not yet been developed. In the present paper a systems identification approach is exploited to deduce a dynamic multiple regression model that can be used to predict the daily maximum of high energy electron fluxes at geosynchronous orbit from data. It is shown that the model developed provides reliable predictions.

H.-L. Wei

2011-02-01

129

Multiple-Case Outlier Detection in Multiple Linear Regression Model Using Quantum-Inspired Evolutionary Algorithm  

Directory of Open Access Journals (Sweden)

Full Text Available In ordinary statistical methods, multiple outliers in a multiple linear regression model are detected sequentially, one after another, and smearing and masking effects can give misleading results. If the potential multiple outliers can be detected simultaneously, smearing and masking effects can be avoided. Such multiple-case outlier detection is combinatorial in nature: 2^N-N-1 sets of possible outliers need to be tested, where N is the number of data points. This exhaustive search is practically impossible. In this paper, we have used a quantum-inspired evolutionary algorithm (QEA) for multiple-case outlier detection in the multiple linear regression model. A fitness function based on the Bayesian information criterion, incorporating an extra penalty for the number of potential outliers, has been used to identify the most appropriate set of potential outliers. Experimental results with 10 widely referenced datasets from the statistical literature show that the QEA overcomes the effects of smearing and masking and effectively detects the most appropriate set of outliers.

Salena Akter

2010-12-01
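
Note on entry 129: the count 2^N-N-1 quoted in the abstract equals the number of subsets of N points that contain at least two points (all 2^N subsets, minus the empty set, minus the N single-point sets). The sketch below only illustrates that arithmetic and why exhaustive search becomes impractical; it is not the authors' QEA, and the function names are hypothetical.

```python
from itertools import combinations


def count_multi_case_subsets(n):
    """Number of candidate outlier sets with at least two points: 2^n - n - 1."""
    return 2 ** n - n - 1


def enumerate_multi_case_subsets(n):
    """Explicitly list every subset of size >= 2 (feasible only for tiny n)."""
    points = range(n)
    return [subset
            for size in range(2, n + 1)
            for subset in combinations(points, size)]


if __name__ == "__main__":
    # Brute-force enumeration agrees with the closed form for a small n.
    print(len(enumerate_multi_case_subsets(5)), count_multi_case_subsets(5))
    # Already at n = 30 the number of candidate sets exceeds one billion.
    print(count_multi_case_subsets(30))
```

The growth of this count with N is the reason the abstract turns to a heuristic search rather than testing every candidate set.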

130

A multiple regression model for the Ft. Calhoun reactor coolant pump system  

Energy Technology Data Exchange (ETDEWEB)

Multiple regression analysis is one of the most widely used of all statistical tools. In this research paper, we introduce an application of fitting a multiple regression model to reactor coolant pump (RCP) data. The primary purpose of this research is to correlate the results obtained by Design of Experiments (DOE) and regression model fitting. The idea behind using a regression model is also to gain more detailed information from the RCP data than is provided by DOE. In engineering science, statistical quality control techniques have traditionally been applied to control manufacturing processes. An application to commercial nuclear power plant maintenance and control is presented that can greatly improve plant safety and reliability. The results obtained show that six out of ten parameters are within control specification limits and four parameters are not in a state of statistical control. The four parameters that are out of control adversely affect the regression model fit, and the final prediction equation therefore does not accurately predict future responses. The analysis concludes that, in order to fit the best regression model, one has to remove all out-of-control points from the data set, including dropping a variable from the model, to obtain better prediction of the response variable. (author)

Patel, B.; Heising, C.D. [Iowa State Univ. of Science and Technology, Ames, IA (United States)]

1996-10-01

131

Regression analysis using dependent Polya trees.  

Science.gov (United States)

Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. PMID:23839794

Schörgendorfer, Angela; Branscum, Adam J

2013-11-30

132

Regression  

Science.gov (United States)

“Our daughter made great progress with ... a few steps back in her learning process. Regression during toilet training, a child’s sudden neglect of ...

133

Hot Resistance Estimation for Dry Type Transformer Using Multiple Variable Regression, Multiple Polynomial Regression and Soft Computing Techniques  

Directory of Open Access Journals (Sweden)

Full Text Available Problem statement: This study presents a novel method for determining the average winding temperature rise of transformers under predetermined field operating conditions. The rise in winding temperature was determined from the estimated values of winding resistance during the heat run test conducted as per the IEC standard. Approach: The estimation of hot resistance was modeled using Multiple Variable Regression (MVR), Multiple Polynomial Regression (MPR) and soft computing techniques such as the Artificial Neural Network (ANN) and the Adaptive Neuro Fuzzy Inference System (ANFIS). The modeled hot resistance helps to find the load losses at any load condition without a complicated measurement setup in transformers. Results: These techniques were applied to hot resistance estimation for a dry-type transformer, using cold resistance, ambient temperature and temperature rise as input variables. The results are compared and show good agreement between measured and computed values. Conclusion: The proposed methods are verified using experimental results obtained from a temperature rise test performed on a 55 kVA dry-type transformer.

M. Srinivasan

2012-01-01

134

Outlier Detection for Multivariate Multiple Regression in Y-direction  

Directory of Open Access Journals (Sweden)

Full Text Available This study focuses on outlier detection for multivariate multiple regression in the Y-direction and proposes an alternative method based on the squared distances of the residuals. The proposed method relies on robust estimates of the location and covariance matrices derived from the squared distances of the residuals. It is compared with the Mahalanobis Distance method, the Minimum Covariance Determinant method and the Minimum Volume Ellipsoid method, which are used to detect multivariate outliers. An advantage of the proposed method is that it offers an alternative to the complicated resampling algorithms used for detecting multivariate outliers in the Y-direction when the sample size is large and the dependent variables are correlated.

Paweena Tangjuang

2014-01-01

135

Precipitation interpolation in mountainous regions using multiple linear regression  

Science.gov (United States)

Multiple linear regression (MLR) was used to spatially interpolate precipitation for simulating runoff in the Animas River basin of southwestern Colorado. MLR equations were defined for each time step using measured precipitation as dependent variables. Explanatory variables used in each MLR were derived for the dependent variable locations from a digital elevation model (DEM) using a geographic information system. The same explanatory variables were defined for a 5 × 5 km grid of the DEM. For each time step, the best MLR equation was chosen and used to interpolate precipitation onto the 5 × 5 km grid. The gridded values of precipitation provide a physically-based estimate of the spatial distribution of precipitation and result in reliable simulations of daily runoff in the Animas River basin.

Hay, L.; Viger, R.; McCabe, G.

1998-01-01
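
Note on entry 135: the interpolation step described in the abstract (fit an MLR at station locations, then apply it to grid cells that carry the same explanatory variables) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the station values, the two explanatory variables (elevation and easting) and all function names are hypothetical, and the per-time-step selection of the "best" equation is not reproduced.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular or nearly so")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Apply a fitted equation to one location's explanatory variables."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Hypothetical stations: (elevation in km, easting in km) and precipitation
# for one time step, constructed as precip = 2 + 3*elevation + 0.5*easting.
stations = [(1.0, 0.0), (2.0, 1.0), (1.5, 3.0), (3.0, 2.0), (2.5, 4.0)]
precip = [5.0, 8.5, 8.0, 12.0, 11.5]

coefs = fit_mlr(stations, precip)
grid_cell = (2.0, 2.0)  # a hypothetical grid cell with the same variables
grid_estimate = predict(coefs, grid_cell)
```

In practice a library least-squares routine would replace the hand-written solver; the standard-library version is used here only to keep the sketch self-contained.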

136

Müşterilerin Kredi Kartına Olan Tutumlarının Çoklu Regresyon ve Faktör Analizi İle İncelenmesi - Consumer Attitude towards the Credit Card Assessed By Means Of Multiple Regression and Factor Analysis  

Directory of Open Access Journals (Sweden)

Full Text Available Research Aim: This study investigated the negative and positive attitudes of credit card holders towards the credit cards they use, and the effect of these perceptions on their satisfaction and exit behaviors. Method: The attitudes of customers towards credit cards were first assessed by means of Exploratory Factor Analysis; the effects of the 7 factors thus identified on credit card satisfaction and on the attitude of not using credit cards in the future were then assessed by means of Multiple Regression Analysis. Findings and Result: The perception that credit cards give the customer confidence was found to be the factor with the largest increasing effect on the Satisfaction variable, while a positive perception of credit card use was found to be the factor with the largest decreasing effect on the Exit variable. Key Words: Credit Card, Customer Satisfaction, Exit Behavior, Exploratory Factor Analysis, Multiple Regression Analysis

M. S. Talha ARSLAN

2009-12-01

137

Multiple predictor smoothing methods for sensitivity analysis.  

Energy Technology Data Exchange (ETDEWEB)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (1) locally weighted regression (LOESS), (2) additive models, (3) projection pursuit regression, and (4) recursive partitioning regression. The indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present.

Helton, Jon Craig; Storlie, Curtis B.

2006-08-01

138

Multiple predictor smoothing methods for sensitivity analysis  

International Nuclear Information System (INIS)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (1) locally weighted regression (LOESS), (2) additive models, (3) projection pursuit regression, and (4) recursive partitioning regression. The indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

2006-01-01

139

Logistic regression analysis with standardized markers  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Two different approaches to analysis of data from diagnostic biomarker studies are commonly employed. Logistic regression is used to fit models for probability of disease given marker values, while ROC curves and risk distributions are used to evaluate classification performance. In this paper we present a method that simultaneously accomplishes both tasks. The key step is to standardize markers relative to the nondiseased population before including them in the logistic reg...

Huang, Ying; Pepe, Margaret S.; Feng, Ziding

2013-01-01

140

Regression based process energy analysis system  

Energy Technology Data Exchange (ETDEWEB)

The results of an investigation are presented to determine which weather, production, and time-related parameters exert significant influence on installation energy consumption for the U.S. Army Armament, Munitions and Chemical Command (AMCCOM) using regression analysis methods. Based on data gathered at AMCCOM HQ, potentially significant weather and production/mission parameters are identified, and Process Energy Analysis Systems are developed for each installation using regression analysis methods on a monthly data base for the period FY75 through FY82 (October 1974 through September 1982). The regression model for AMCCOM shows that aggregate energy consumption in general depends on heating degree-days, production level, and labor force strength. At individual installations, additional important parameters include cooling degree-days and facility changes over time. The model was applied to actual FY84 data and predicted total energy consumption to within 3% to 6% of actual consumption. Results of this effort will be used to forecast energy consumption and establish energy guidelines throughout AMCCOM.

Leslie, N.P.; Aveta, G.A.; Sliwinski, B.J.

1986-01-01

141

Stukel's Extended Logistic Regression Analysis with R  

Directory of Open Access Journals (Sweden)

Full Text Available Objective: For a logistic regression model, the degree to which predicted probabilities agree with actual outcomes can be expressed as a classification table. Such tables are crucial in checking model adequacy, yet they may differ slightly when the same data are modeled with different statistical packages. The underlying reason is that, when classifying a set of binary data, if the observations used to fit the model are also used to estimate the classification error, the resulting error-count estimate is biased. SAS offers an algorithm to cope with this problem, but the software is not freely available. R is a freely downloadable program designed for statistical computation, including logistic regression analysis. The purpose of this study is to present a new function in R that carries out an extended logistic regression analysis of binary data, from the construction of its reduced-bias classification table to inference on its model parameters, calling the lrm(.) function of the Design package where necessary. Material and Methods: The performance of ext.logreg(.) is evaluated in terms of the accuracy of estimates and computational cost. Results: From the results for two binary datasets, it is observed that ext.logreg(.) in R estimates the model parameters and constructs the reduced-bias classification table as accurately as the SAS PROC LOGISTIC procedure, without sacrificing computational efficiency. Conclusion: The freely downloadable ext.logreg(.) function can be seen as an alternative computational tool for logistic regression analysis when validation of the predicted probabilities is essential.

Vilda PURUTÇUOĞLU

2011-01-01

142

Assessing the binding affinity of a selected class of DPP4 inhibitors using chemical descriptor-based multiple linear regression  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The activity of a selected class of DPP4 inhibitors was preliminarily assessed using chemical descriptors derived from AM1-optimized geometries. Using a multiple linear regression model, it was found that ΔE0, LUMO energy, area, molecular weight and ΔH0 are the significant descriptors that can adequately assess the binding affinity of the compounds. The derived multiple linear regression (MLR) model was validated using rigorous statistical analysis. The preliminary model suggests that bul...

2011-01-01

143

Assessing the binding affinity of a selected class of DPP4 inhibitors using chemical descriptor-based multiple linear regression  

Directory of Open Access Journals (Sweden)

Full Text Available The activity of a selected class of DPP4 inhibitors was preliminarily assessed using chemical descriptors derived from AM1-optimized geometries. Using a multiple linear regression model, it was found that ΔE0, LUMO energy, area, molecular weight and ΔH0 are the significant descriptors that can adequately assess the binding affinity of the compounds. The derived multiple linear regression (MLR) model was validated using rigorous statistical analysis. The preliminary model suggests that bulky and electrophilic inhibitors are desired.

Jose Isagani Janairo

2011-08-01

144

A multiple regression model for urban traffic noise in Hong Kong  

Science.gov (United States)

This article describes the roadside traffic noise surveys conducted in heavily built-up urban areas in Hong Kong. Noise measurements were carried out along 18 major roads in 1999. The measurement data included L10, L50, L90, Leq, Lmax, the number of light vehicles, the number of heavy vehicles, the total traffic flow, and the average speed of vehicles. Statistical analysis using the analysis of variance (ANOVA) and Tukey test (p […] heavy vehicles are the most significant factors of urban traffic noise. Multiple regression was used to derive a set of empirical formulas for predicting L10 noise level due to road traffic. The accuracy of these empirical formulas is quantified and compared to that of another widely used prediction model in Hong Kong, the Calculation of Road Traffic Noise. The applicability of the selected multiple regression model is validated by the noise measurements performed in the winter of 2000. Copyright 2002 Acoustical Society of America.

To, W. M.; Ip, Rodney C. W.; Lam, Gabriel C. K.; Yau, Chris T. H.

2002-08-01

145

Multiple regression models for energy use in air-conditioned office buildings in different climates  

International Nuclear Information System (INIS)

An attempt was made to develop multiple regression models for office buildings in the five major climates in China: severe cold, cold, hot summer and cold winter, mild, and hot summer and warm winter. A total of 12 key building design variables were identified through parametric and sensitivity analysis and used as inputs to the regression models. The coefficient of determination R² varies from 0.89 in Harbin to 0.97 in Kunming, indicating that 89-97% of the variation in annual building energy use can be explained by changes in the 12 parameters. A pseudo-random number generator based on three simple multiplicative congruential generators was employed to generate random designs for evaluation of the regression models. The differences between regression-predicted and DOE-simulated annual building energy use are largely within 10%. It is envisaged that the regression models developed can be used to estimate the likely energy savings/penalty during the initial design stage, when different building schemes and design concepts are being considered.

2010-12-01
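
Note on entry 145: the reading of R² as "share of variation explained" follows from its usual definition, R² = 1 - SS_res / SS_tot. The sketch below computes it for hypothetical numbers (not data from the study); the function name is illustrative.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: variation the model leaves unexplained.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: variation of the observations around their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical annual energy-use values and regression predictions.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(observed, fitted)  # SS_res = 0.10, SS_tot = 5.0
```

With these numbers the unexplained share is 0.10 / 5.0, so about 98% of the variation is accounted for, the same kind of statement the abstract makes with its 0.89 to 0.97 range.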

146

Landslide Susceptibility Mapping Using Multiple Regression and GIS Tools in Tajan Basin, North of Iran  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Landslides are a natural hazard that causes considerable damage to the environment. Depending on the landform, several factors can cause a landslide. This research addresses the methodology for landslide susceptibility mapping using multiple regression analysis and GIS tools. Based on the initial hypothesis, ten factors were recognized as influencing landslides, namely geology, slope, aspect, distance from roads, faults and drainage network, soil capability, land use and rainfall. Crossin...

somayeh mashari; Karim Solaimani; Ebrahim Omidvar

2012-01-01

147

Functional linear regression analysis for longitudinal data  

CERN Multimedia

We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may be modeled as a discrete random number and, accordingly, only a finite and asymptotically nonincreasing number of measurements are available for each subject or experimental unit. We propose a functional regression approach for this situation, using functional principal component analysis, where we estimate the functional principal component scores through conditional expectations. This allows the prediction of an unobserved response trajectory from sparse measurements of a predictor trajectory. The resulting technique is flexible and allow...

Yao, F; Wang, J L; Yao, Fang; Müller, Hans-Georg; Wang, Jane-Ling

2005-01-01

148

Determination of the Regression Coefficients and Their Associated Standard Errors in Hierarchical Regression Analysis.  

Science.gov (United States)

The regression coefficients and the associated standard errors in hierarchical regression, when a theoretical basis for the analysis exists, are determined for four regression models. Each reflects different controlling or partialling of the variates. An illustration is presented using data from the Berkeley Growth Study. (SLD)

Tisak, John

1994-01-01

149

Dynamic Population Structure based PSO with Granular Computing for Unified Multiple Linear Regression  

Directory of Open Access Journals (Sweden)

Full Text Available Unified Multiple Linear Regression (UMLR) is a nonlinear programming model that unifies all kinds of multiple linear regression models, such as Principal Components Regression, Ridge Regression, Robust Regression and constrained regression. Although UMLR has exhibited excellent performance in some real applications, its optimization procedure is not yet satisfactory. This study proposes a novel Granular Computing-Particle Swarm Optimization (Grc-PSO) algorithm, which introduces granular computing into standard PSO and is used to optimize the UMLR model. The experimental results show that the solution obtained by the Grc-PSO algorithm fits the real situation much better than those of other state-of-the-art algorithms.

Chen Su-Fen

2013-01-01

150

Regression Analysis for the Social Sciences  

CERN Multimedia

The book provides graduate students in the social sciences with the basic skills that they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly-available social science data or drawn from the literature; thorough integration of teaching statistical theory with teaching data processing and analysis; teaching of both SAS and Stata "side-by-side" and use of chapter exercises in which students practice programming

Gordon, Rachel A A

2012-01-01

151

Exploring the equity of GP practice prescribing rates for selected coronary heart disease drugs: a multiple regression analysis with proxies of healthcare need  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Background: There is a small but growing body of literature highlighting inequities in GP practice prescribing rates for many drug therapies. The aim of this paper is to further explore the equity of prescribing for five major CHD drug groups and to explain the amount of variation in GP practice prescribing rates that can be explained by a range of healthcare needs indicators (HCNIs). Methods: The study involved a cross-sectional secondary analysis in fo...

Ward Paul R; Noyce Peter R; St Leger Antony S

2005-01-01

152

An Effect Size for Regression Predictors in Meta-Analysis  

Science.gov (United States)

A new effect size representing the predictive power of an independent variable from a multiple regression model is presented. The index, denoted as r[subscript sp], is the semipartial correlation of the predictor with the outcome of interest. This effect size can be computed when multiple predictor variables are included in the regression model…

Aloe, Ariel M.; Becker, Betsy Jane

2012-01-01
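
Note on entry 152: for a model with only two predictors, the semipartial correlation of predictor 1 with the outcome has the standard closed form r_sp = (r_y1 - r_y2 * r_12) / sqrt(1 - r_12^2), and its square equals the gain in R² from adding predictor 1 to a model that already contains predictor 2. The sketch below checks that identity on hypothetical correlations; it illustrates the textbook two-predictor case only and is not the meta-analytic estimator developed in the article.

```python
from math import sqrt


def semipartial_r(r_y1, r_y2, r_12):
    """Semipartial correlation of predictor 1 with y, controlling predictor 1
    (but not y) for predictor 2. Two-predictor case only."""
    return (r_y1 - r_y2 * r_12) / sqrt(1.0 - r_12 ** 2)


def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 of the regression of y on both predictors, from correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)


# Hypothetical correlations: y with x1, y with x2, and x1 with x2.
r_y1, r_y2, r_12 = 0.5, 0.3, 0.6

r_sp = semipartial_r(r_y1, r_y2, r_12)
# Increase in R^2 when x1 is added to a model that already contains x2.
r2_gain = r_squared_two_predictors(r_y1, r_y2, r_12) - r_y2 ** 2
```

When the two predictors are uncorrelated (r_12 = 0), the semipartial correlation reduces to the simple correlation r_y1.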

153

Assessing the multisite binding properties of multiple sources of dissolved organic matter at nanomolar copper concentrations using piecewise linear regression and parallel factor analysis of fluorescence quenching.  

Science.gov (United States)

This study reports on the development and application of a piecewise linear model for the determination of copper-binding parameters at concentrations in the nanomolar range using fluorescence quenching. L-Tyrosine, Suwannee River natural organic matter, and two leaf leachates with similar fluorescence signatures were used as test compounds, and results were compared with those of the standard Ryan-Weber model. The piecewise model was also applied to and compared with data from an earlier study. Parallel factor analysis (PARAFAC) was used to identify three to five independent fluorophores in each test compound, and copper-binding parameters were estimated for one to three binding sites for each fluorophore. The binding properties of similar and different fluorophores were also compared. The conditional binding strengths (log K') estimated using the piecewise approach were similar to those obtained using the Ryan-Weber approach (p > 0.05); however, the piecewise linear model provided superior results compared to models based on the Ryan-Weber equation in several ways, including (1) capable of distinguishing more binding sites for a single fluorophore, (2) capable of extracting binding parameters at environmentally relevant, nanomolar concentrations of copper, where fluorescence changes are often observed as enhancement, (3) greater precision over repeated titrations, and (4) no severe underestimation of complexing capacities. Finally, the copper-binding properties of PARAFAC components with similar optical signatures were found to be similar, both in sources with dramatically different and similar total fluorescence signatures. PMID:24327077

Cuss, C W; Guéguen, C

2014-01-01

154

Forecasting Gold Prices Using Multiple Linear Regression Method  

Directory of Open Access Journals (Sweden)

Full Text Available Problem statement: Forecasting is a management function that assists decision making. It is also described as the process of estimation in unknown future situations. In more general terms it is commonly known as prediction, which refers to the estimation of time series or longitudinal data. Gold is a precious yellow commodity once used as money. It was made illegal in USA 41 years ago, but is now once again accepted as a potential currency. The demand for this commodity is on the rise. Approach: The objective of this study was to develop a forecasting model for predicting gold prices based on economic factors such as inflation, currency price movements and others. Following the melt-down of the US dollar, investors are putting their money into gold because gold plays an important role as a stabilizing influence for investment portfolios. Due to the increase in demand for gold in Malaysia and other parts of the world, it is necessary to develop a model that reflects the structure and pattern of the gold market and forecasts the movement of the gold price. The most appropriate approach to the understanding of gold prices is the Multiple Linear Regression (MLR) model. MLR is a study of the relationship between a single dependent variable and one or more independent variables, in this case with the gold price as the single dependent variable. The fitted MLR model will be used to predict future gold prices. A naive model known as "forecast-1" was taken as a benchmark model in order to evaluate the performance of the model. Results: Many factors determine the price of gold and, based on "a hunch of experts", several economic factors were identified as having an influence on gold prices. Variables such as the Commodity Research Bureau future index (CRB); USD/Euro Foreign Exchange Rate (EUROUSD); Inflation rate (INF); Money Supply (M1); New York Stock Exchange (NYSE); Standard and Poor 500 (SPX); Treasury Bill (T-BILL) and US Dollar index (USDX) were considered to have an influence on the prices. Parameter estimation for the MLR was carried out using the Statistical Package for the Social Sciences (SPSS), with Mean Square Error (MSE) as the fitness function to determine the forecast accuracy. Conclusion: Two models were considered. The first model considered all possible independent variables. The model appeared to be useful for predicting the price of gold, with 85.2% of the sample variation in monthly gold prices explained by the model. The second model found the following four independent variables to be significant: CRB lagged one, EUROUSD lagged one, INF lagged two and M1 lagged two. In terms of prediction, the second model achieved a high level of predictive accuracy. The amount of variance explained was about 70%, and the regression coefficients also provide a means of assessing the relative importance of individual variables in the overall prediction of the gold price.

Z. Ismail

2009-01-01
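
The benchmark comparison described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not define the naive "forecast-1" model, so it is taken here to predict each month's price as the previous month's price. The price series and all function names are hypothetical and are not data or code from the study.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(series):
    """Assumed naive benchmark: the forecast for period t is the value at t-1."""
    return series[:-1]


def lag(series, k):
    """Return the series shifted by k periods, aligned with series[k:]."""
    return series[:len(series) - k] if k > 0 else list(series)


# Hypothetical monthly prices, for illustration only.
prices = [900.0, 910.0, 905.0, 920.0, 940.0]

# Benchmark error: compare each price with the previous month's price.
benchmark_mse = mse(prices[1:], naive_forecast(prices))

# A predictor "lagged two" pairs the value at t-2 with the price at t.
lagged_two = lag(prices, 2)
```

Under this reading, a fitted regression model would be judged useful for forecasting only if its MSE on the same periods falls below `benchmark_mse`.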

155

Testing mediation using multiple regression and structural equation modeling analyses in secondary data.  

Science.gov (United States)

Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the respective circumstances in which the two methods are most useful. One study examines the mediated effect of parents' social capital on parent involvement in Head Start programs through parent-child bond. The other study assesses the mediating effects of structured routine activities, delinquent association, and prosocial belief on the relationship between religiosity and juvenile delinquency. PMID:21917711

Li, Spencer D

2011-06-01

156

On connectivity of fibers with positive marginals in multiple logistic regression  

CERN Document Server

In this paper we consider exact tests of a multiple logistic regression, where the levels of covariates are equally spaced, via Markov bases. In the usual application of multiple logistic regression, the sample size is positive for each combination of levels of the covariates. In this case we do not need a whole Markov basis, which guarantees connectivity of all fibers. We first give an explicit Markov basis for multiple Poisson regression. By the Lawrence lifting of this basis, in the case of bivariate logistic regression, we show a simple subset of the Markov basis which connects all fibers with a positive sample size for each combination of levels of covariates.

Hara, Hisayuki; Yoshida, Ruriko

2008-01-01

157

Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity  

Directory of Open Access Journals (Sweden)

Full Text Available While multicollinearity may increase the difficulty of interpreting multiple regression results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret multiple regression effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret multiple regression effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses.

Amanda Kraha

2012-03-01

158

Throughput Prediction of Fishing Goods Based on the Grey Multiple Linear Regression Method  

Directory of Open Access Journals (Sweden)

Full Text Available Based on the grey prediction method and the multiple linear regression method, the grey multiple linear regression method was presented. This method was applied to the throughput prediction of fishing goods using actual throughput data from five fishing ports. Comparing its results with those of the one-dimensional time-series linear regression method and the grey prediction method showed that the proposed method was more effective and achieved higher forecasting precision.

Changping Chen

2014-06-01

159

Using Dominance Analysis to Determine Predictor Importance in Logistic Regression  

Science.gov (United States)

This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R² analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

Azen, Razia; Traxel, Nicole

2009-01-01

160

Forecasting Electrical Load using ANN Combined with Multiple Regression Method  

Directory of Open Access Journals (Sweden)

Full Text Available This paper combined artificial neural network and regression modeling methods to predict electrical load. We propose an approach for specific day, week and/or month load forecasting for electrical companies, taking into account the historical load. Therefore, a modified technique, based on an artificial neural network (ANN) combined with linear regression, is applied to the KSA electrical network, using its historical data, to forecast electrical load demand up to the year 2020. This technique was compared with extrapolation of trend curves as a traditional method (linear regression models). Application results show that the proposed method is feasible and effective. The application of neural network prediction shows the capability and efficiency of the proposed techniques in predicting load demand up to the year 2020.

Saeed M. Badran

2012-04-01

161

MULTIPLE LOGISTIC REGRESSION MODEL TO PREDICT RISK FACTORS OF ORAL HEALTH DISEASES  

Directory of Open Access Journals (Sweden)

Full Text Available Purpose: To analyze the dependence of oral health diseases, i.e. dental caries and periodontal disease, on a number of risk factors through the application of a logistic regression model. Method: The cross-sectional study involves a systematic random sample of 1760 subjects with permanent dentition aged between 18-40 years in Dharwad, Karnataka, India. Dharwad is situated in North Karnataka. The mean age was 34.26±7.28. The risk factors of dental caries and periodontal disease were established by a multiple logistic regression model using SPSS statistical software. Results: Factors such as frequency of brushing, timing of cleaning teeth and type of toothpaste are significant persistent predictors of dental caries and periodontal disease. For dental caries, the log likelihood value of the full model is -1013.1364 and Akaike's Information Criterion (AIC) is 1.1752, compared with -1019.8106 and 1.1748 respectively for the reduced regression model. For periodontal disease, the log likelihood value of the full model is -1085.7876 and the AIC is 1.2577, compared with -1019.8106 and 1.1748 respectively for the reduced regression model. The area under the Receiver Operating Characteristic (ROC) curve for dental caries is 0.7509 (full model) and 0.7447 (reduced model); for periodontal disease it is 0.6128 (full model) and 0.5821 (reduced model). Conclusions: The frequency of brushing, timing of cleaning teeth and type of toothpaste are the main significant risk factors of dental caries and periodontal disease. The reduced logistic regression model fits slightly better than the full logistic regression model in identifying these risk factors for both dichotomous dental caries and periodontal disease.

Parameshwar V. Pandit

2012-06-01
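
The dental caries figures in the abstract above are consistent with an AIC reported per observation, (-2 ln L + 2k) / n, rather than the unscaled form. The sketch below checks that arithmetic. The sample size n = 1760 and the log likelihoods come from the abstract; the parameter counts k = 21 (full) and k = 14 (reduced) are not stated anywhere in it and are assumptions chosen here only because they reproduce the reported values.

```python
def aic_per_observation(log_likelihood, n_params, n_obs):
    """AIC scaled by sample size: (-2 * lnL + 2 * k) / n."""
    return (-2.0 * log_likelihood + 2.0 * n_params) / n_obs


N_OBS = 1760  # sample size stated in the abstract

# Parameter counts are assumptions (not given in the abstract).
full = aic_per_observation(-1013.1364, n_params=21, n_obs=N_OBS)
reduced = aic_per_observation(-1019.8106, n_params=14, n_obs=N_OBS)

# The model with the lower AIC is preferred.
preferred = "reduced" if reduced < full else "full"
```

With these assumed counts the reduced model has the marginally lower AIC, which matches the abstract's conclusion that it fits slightly better.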

162

Self-concordant analysis for logistic regression  

CERN Document Server

Most of the non-asymptotic theoretical work in regression is carried out for the square loss, where estimators can be obtained through closed-form expressions. In this paper, we use and extend tools from the convex optimization literature, namely self-concordant functions, to provide simple extensions of theoretical results for the square loss to the logistic loss. We apply the extension techniques to logistic regression with regularization by the $\ell_2$-norm and regularization by the $\ell_1$-norm, showing that new results for binary classification through logistic regression can be easily derived from corresponding results for least-squares regression.

Bach, Francis

2009-01-01
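
For orientation, the two losses contrasted in the abstract above can be written as follows. This is a standard textbook formulation, not notation taken from the paper: the labels $y_i \in \{-1, 1\}$ for the logistic case, the regularization parameter $\lambda$ and its scaling are all assumptions.

```latex
\begin{align*}
\text{square loss:}   \quad & \ell(y_i, w^\top x_i) = \tfrac{1}{2}\,(y_i - w^\top x_i)^2 \\
\text{logistic loss:} \quad & \ell(y_i, w^\top x_i) = \log\bigl(1 + e^{-y_i\, w^\top x_i}\bigr) \\
\ell_2\text{-regularized:} \quad & \min_{w}\ \frac{1}{n}\sum_{i=1}^{n} \ell(y_i, w^\top x_i) + \frac{\lambda}{2}\,\|w\|_2^2 \\
\ell_1\text{-regularized:} \quad & \min_{w}\ \frac{1}{n}\sum_{i=1}^{n} \ell(y_i, w^\top x_i) + \lambda\,\|w\|_1
\end{align*}
```

With the square loss and the $\ell_2$ penalty the minimizer has a closed form (ridge regression); with the logistic loss it does not, which is the gap the abstract says self-concordance helps to bridge.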

163

Multiple Logistic Regression Analysis of Risk Factors Associated with Denture Plaque and Staining in Chinese Removable Denture Wearers over 40 Years Old in Xi'an - a Cross-Sectional Study  

Science.gov (United States)

Background Removable dentures are subject to plaque and/or staining problems. Denture hygiene habits and risk factors differ among countries and regions. The aims of this study were to assess hygiene habits and denture plaque and staining risk factors in Chinese removable denture wearers aged >40 years in Xi’an through multiple logistic regression analysis (MLRA). Methods Questionnaires were administered to 222 patients whose removable dentures were examined clinically to assess wear status and levels of plaque and staining. Univariate analyses were performed to identify potential risk factors for denture plaque/staining. MLRA was performed to identify significant risk factors. Results Brushing (77.93%) was the most prevalent cleaning method in the present study. Only 16.4% of patients regularly used commercial cleansers. Most (81.08%) patients removed their dentures overnight. MLRA indicated that potential risk factors for denture plaque were the duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 4.155, P = 0.001; >5 years: OR = 7.238, P5 years: OR = 27.226, P<0.001), and cleaning method (reference, chemical cleanser; running water: OR = 29.184, P<0.001; brushing: OR = 4.236, P = 0.007). Conclusion Denture hygiene habits need further improvement. An understanding of the risk factors for denture plaque and staining may provide the basis for preventive efforts.

Chai, Zhiguo; Chen, Jihua; Zhang, Shaofeng

2014-01-01

164

Regression Discontinuity Designs with Multiple Rating-Score Variables  

Science.gov (United States)

In the absence of a randomized control trial, regression discontinuity (RD) designs can produce plausible estimates of the treatment effect on an outcome for individuals near a cutoff score. In the standard RD design, individuals with rating scores higher than some exogenously determined cutoff score are assigned to one treatment condition; those…

Reardon, Sean F.; Robinson, Joseph P.

2012-01-01

165

Multiple Regression Model for Compressive Strength Prediction of High Performance Concrete  

Directory of Open Access Journals (Sweden)

Full Text Available A mathematical model for the prediction of the compressive strength of high performance concrete was developed using statistical analysis of the concrete data obtained from experimental work done in this study. The multiple non-linear regression model yielded an excellent correlation coefficient for the prediction of compressive strength at different ages (3, 7, 14, 28 and 91 days). The coefficient of correlation was 99.99% for each strength (at each age). Also, the model gives high correlation for strength prediction of concrete with different types of curing.

M. F. M. Zain

2009-01-01

166

Spatial regression analysis on 32 years total column ozone data  

Science.gov (United States)

Multiple regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1° × 1.5° resolution. The total ozone data consists of the MSR (Multi Sensor Reanalysis; 1979-2008) and two years of assimilated SCIAMACHY ozone data (2009-2010). The two-dimensionality in this data-set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on non-seasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Nino (ENSO) and stratospheric alternative halogens (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to those of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at high and mid-latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high Northern latitudes, the effect of QBO is positive and negative at the tropics and mid to high-latitudes respectively and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific. 
The contribution of explanatory variables describing seasonal ozone variation is generally large at mid to high latitudes. We observe ozone contributing effects for potential vorticity and day length, negative effect on ozone for geopotential height and variable ozone effects due to the polar vortex at regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. In particular the recovery rates over Antarctica might not be statistically significant. Furthermore, the results show that there is no spatial homogeneous pattern which regression model and explanatory variables provide the best fit to the data and the most accurate estimates of the recovery rates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.

Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.

2014-02-01

167

Spatial regression analysis on 32 years total column ozone data  

Directory of Open Access Journals (Sweden)

Full Text Available Multiple regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1° × 1.5° resolution. The total ozone data consists of the MSR (Multi Sensor Reanalysis; 1979–2008) and two years of assimilated SCIAMACHY ozone data (2009–2010). The two-dimensionality in this data-set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on non-seasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Nino (ENSO) and stratospheric alternative halogens (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to those of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at high and mid-latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high Northern latitudes, the effect of QBO is positive and negative at the tropics and mid to high-latitudes respectively and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific. 
The contribution of explanatory variables describing seasonal ozone variation is generally large at mid to high latitudes. We observe ozone contributing effects for potential vorticity and day length, negative effect on ozone for geopotential height and variable ozone effects due to the polar vortex at regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. In particular the recovery rates over Antarctica might not be statistically significant. Furthermore, the results show that there is no spatial homogeneous pattern which regression model and explanatory variables provide the best fit to the data and the most accurate estimates of the recovery rates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.

J. S. Knibbe

2014-02-01

168

Application of Partial Least-Squares Regression Model on Temperature Analysis and Prediction of RCCD  

Directory of Open Access Journals (Sweden)

Full Text Available This study, based on the temperature monitoring data of the Jiangya RCCD, uses the principle and method of partial least-squares regression to analyze and predict the temperature variation of the RCCD. Building a partial least-squares regression model overcomes multicollinearity among the independent variables and combines multiple linear regression with canonical correlation analysis. Compared with the result of a general least-squares regression model, it is more advanced and accurate and has a more practical interpretation. It is shown to be feasible and practical, so it can be used to predict concrete temperature. The calculations show that rock temperature is the most important factor affecting RCCD temperature. RCCD temperature decreases with rock temperature. We suggest that rock temperature monitoring be emphasized in the future; this can provide a scientific basis for temperature control and for preventing cracking of the RCCD.

Yuqing Zhao

2013-06-01

169

A Comparison between the Linear Neural Network Method and the Multiple Linear Regression Method in the Modeling of Continuous Data  

Directory of Open Access Journals (Sweden)

Full Text Available Both linear neural network and multiple linear regression models can be used for multi-factor analysis and forecasting, but the data of the multiple linear regression model are required to meet such conditions as independence and normality, while the data of the linear neural network are only required to have a linear relationship. This article uses the same set of data to establish respectively a linear neural network model and a multiple linear regression model, compares the abilities of fitting and forecasting of the two kinds of models, and consequently, comes to the conclusion that the linear neural network method has a stronger fitting ability and a more stable ability of prediction so that it can be further applied and promoted in the analyzing and forecasting of continuous data factors.

Guoli Wang

2011-10-01

170

X-ray spectrometric determination of Europium(III) in various oxides: comparison of fundamental parameter and multiple regression methods  

International Nuclear Information System (INIS)

The determination of Eu(III) doping levels in various oxide matrices was carried out through x-ray fluorescence analysis. The use of fundamental parameters calculations was investigated as a potentially fast and accurate method, and comparison was made to results obtained by using an intensity model multiple regression method. By use of the fundamental parameters method, results were obtained that differed by less than ±2% relative to those obtained through multiple regression. The fundamental parameters method worked well with the use of only two concentration standards (which bracketed the unknown concentrations) and when the sample stoichiometry and matrix composition were specified. The fundamental parameters method is far easier to use than the multiple regression method, since one can obtain accurate results with the use of significantly fewer concentration standards. 20 references, 3 figures, 2 tables

1986-01-01

171

Analysis of Multiple Phenotypes  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The complex etiology of common diseases like cardiovascular disease, diabetes, hypertension, and rheumatoid arthritis has led investigators to focus on the genetics of correlated phenotypes and risk factors. Joint analysis of multiple disease-related phenotypes may reveal genes of pleiotropic effect and increase analytical power, but at the cost of increased analytical and computational complexity. All three data sets provided for analysis at the Genetic Analysis Workshop 16 offered multiple ...

Kent, Jack W.

2009-01-01

172

SOME STATISTICAL ISSUES RELATED TO MULTIPLE LINEAR REGRESSION MODELING OF BEACH BACTERIA CONCENTRATIONS  

Science.gov (United States)

As a fast and effective technique, the multiple linear regression (MLR) method has been widely used in modeling and prediction of beach bacteria concentrations. Among previous works on this subject, however, several issues were insufficiently or inconsistently addressed. Those is...

173

Prediction of flow characteristics using multiple regression and neural networks: A case study in Zimbabwe  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The feasibility of predicting flow characteristics from basin descriptors using multiple regression and neural networks has been investigated on 52 basins in Zimbabwe. Flow characteristics considered were average annual runoff, base flow index, flow duration curve, and average monthly runoff. Mean annual runoff is predicted using linear equations from mean annual precipitation, basin slope, and proportion of a basin underlain by granite and gneiss. A multiple regression equation is derived t...

Mazvimavi, D.; Meijerink, A. M. J.; Savenije, H. H. G.; Stein, A.

2005-01-01

174

Egg hatchability prediction by multiple linear regression and artificial neural networks  

Digital Repository Infrastructure Vision for European Research (DRIVER)

An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. Network trainings were made by the backpropagation algorithm based on data obtained from industrial incubations. The ANN model was chosen as it produced data that fit better the experimental data as compared to the multiple linear regression model, which used coefficients determi...

2008-01-01

175

Using multiple logistic regression and GIS technology to predict landslide hazard in northeast Kansas, USA  

Science.gov (United States)

Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and aspect ratio were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression. Soil types were highly correlated with the geologic units, and no significant relationships existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.

Ohlmacher, G. C.; Davis, J. C.

2003-01-01
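
The per-cell probability mapping described in the abstract above can be sketched as a logistic function of slope and geologic unit. This is a minimal illustration only: the coefficient values, the unit names used as dictionary keys, and the function name are hypothetical and are not taken from the study.

```python
import math


def landslide_probability(slope_deg, geology_unit, coefs):
    """Logistic model: p = 1 / (1 + exp(-(b0 + b_slope * slope + b_unit)))."""
    z = (coefs["intercept"]
         + coefs["slope"] * slope_deg
         + coefs["geology"].get(geology_unit, 0.0))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for illustration; not values from the study.
COEFS = {
    "intercept": -6.0,
    "slope": 0.25,
    "geology": {"shale": 1.0, "limestone": 0.0},
}

# Probability for each cell of a tiny example grid of (slope, geologic unit).
cells = [(5.0, "limestone"), (15.0, "shale"), (30.0, "shale")]
hazard = [landslide_probability(s, g, COEFS) for s, g in cells]
```

Treating the geologic unit as a categorical offset keeps the sketch close to the abstract's description, in which slope and geologic unit are the only predictors retained in the final model.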

176

Multiple regression technique for Pth degree polynominals with and without linear cross products  

Science.gov (United States)

A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs were developed for the case without linear cross products and for the case incorporating such cross products which evaluate the various constants in the model regression equation. In addition, the significance of the estimated regression equation is considered and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent of error evaluated. The computer programs and their manner of utilization are described. Sample problems are included to illustrate the use and capability of the technique which show the output formats and typical plots comparing computer results to each set of input data.

Davis, J. W.

1973-01-01
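
The two model forms described in the abstract above can be illustrated by building the regression terms for one observation. This sketch assumes that "linear cross products" means the pairwise products x_i * x_j of distinct independent variables; the report's own programs are not reproduced here, and the function name is hypothetical.

```python
from itertools import combinations


def polynomial_terms(x, degree, cross_products=False):
    """Build one regression row: 1, x_i^p for p = 1..degree, and
    optionally the linear cross products x_i * x_j for i < j."""
    row = [1.0]
    for value in x:
        row.extend(value ** p for p in range(1, degree + 1))
    if cross_products:
        row.extend(a * b for a, b in combinations(x, 2))
    return row


# Two independent variables, second-degree polynomial.
without_cross = polynomial_terms([2.0, 3.0], degree=2)
with_cross = polynomial_terms([2.0, 3.0], degree=2, cross_products=True)
```

For N variables and degree P this yields 1 + N*P terms without cross products and N*(N-1)/2 more with them; stacking such rows gives the design matrix for an ordinary least-squares fit.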

177

Prediction of groundwater table and salinity fluctuations with a time series multiple regression technique  

Science.gov (United States)

Time series techniques have been extensively applied to research works of many academic disciplines, particularly those concerned with economics and environment. This paper presents application of a time series multiple linear regression technique to a groundwater system to predict groundwater level and salinity fluctuations in a saline area in the northeastern part of Thailand. Surface and groundwater interaction is the major mechanism controlling the shallow subsurface system and salinity of the area. The basic technique is based on the lagged correlation between hydrologic, and hydrogeological and environmental parameters. As a result of a large irrigation project in the area, several regulating gates have been installed to control flooding to the downstream rivers and to provide the upstream areas with sufficient irrigating water. From the lagged correlation analysis, the shallow groundwater and groundwater salinity fluctuation in the irrigating area are shown to be dependent upon the surface water levels at the installed regulated gates and prior rainfall. A set of multiple linear regression equations with lagged time dependent function are then formulated. The dependent variables are groundwater level and groundwater salinity while the independent variables are rainfall rates and water levels measured at the regulating gates. After calibration and verification, the model, as an alternative to the conventional method which requires detailed and continuous variables and is costlier, can be used to forecast and manage future groundwater systems.

Seeboonruang, U.

2013-12-01
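
The lagged correlation step described in the abstract above can be sketched as follows: correlate a driver series at time t - k with a response series at time t, and pick the lag with the strongest correlation. The data are synthetic and the function names hypothetical; the study's actual variables and lags are not given in the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlations(driver, response, max_lag):
    """Correlation between driver[t - k] and response[t] for k = 0..max_lag."""
    result = {}
    for k in range(max_lag + 1):
        result[k] = pearson(driver[:len(driver) - k], response[k:])
    return result


# Hypothetical data: the response repeats the driver two steps later.
rainfall = [0.0, 5.0, 1.0, 8.0, 2.0, 7.0, 3.0, 9.0, 4.0, 6.0]
level = [1.0, 1.0] + rainfall[:-2]

corr = lagged_correlations(rainfall, level, max_lag=3)
best_lag = max(corr, key=lambda k: abs(corr[k]))
```

The lag selected this way would then determine which shifted copy of each independent variable enters the multiple linear regression equation.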

178

Analysis of genome-wide association data by large-scale Bayesian logistic regression  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Single-locus analysis is often used to analyze genome-wide association (GWA) data, but such analysis is subject to severe multiple comparisons adjustment. Multivariate logistic regression is proposed to fit a multi-locus model for case-control data. However, when the sample size is much smaller than the number of single-nucleotide polymorphisms (SNPs) or when correlation among SNPs is high, traditional multivariate logistic regression breaks down. To accommodate the scale o...

Wang Yuanjia; Sha Nanshi; Fang Yixin

2009-01-01

179

Joint regression analysis for discrete longitudinal data.  

Science.gov (United States)

We introduce an approximation to the Gaussian copula likelihood of Song, Li, and Yuan (2009, Biometrics 65, 60-68) used to estimate regression parameters from correlated discrete or mixed bivariate or trivariate outcomes. Our approximation allows estimation of parameters from response vectors of length much larger than three, and is asymptotically equivalent to the Gaussian copula likelihood. We estimate regression parameters from the toenail infection data of De Backer et al. (1996, British Journal of Dermatology 134, 16-17), which consist of binary response vectors of length seven or less from 294 subjects. Although maximizing the Gaussian copula likelihood yields estimators that are asymptotically more efficient than generalized estimating equation (GEE) estimators, our simulation study illustrates that for finite samples, GEE estimators can actually be as much as 20% more efficient. PMID:21039391

Madsen, L; Fang, Y

2011-09-01

180

Neutron multiplicity analysis tool  

International Nuclear Information System (INIS)

I describe the capabilities of the EXCOM (EXcel based COincidence and Multiplicity) calculation tool which is used to analyze experimental data or simulated neutron multiplicity data. The input to the program is the count-rate data (including the multiplicity distribution) for a measurement, the isotopic composition of the sample and relevant dates. The program carries out deadtime correction and background subtraction and then performs a number of analyses. These are: passive calibration curve, known alpha and multiplicity analysis. The latter is done with both the point model and with the weighted point model. In the current application EXCOM carries out the rapid analysis of Monte Carlo calculated quantities and allows the user to determine the magnitude of sample perturbations that lead to systematic errors. Neutron multiplicity counting is an assay method used in the analysis of plutonium for safeguards applications. It is widely used in nuclear material accountancy by international (IAEA) and national inspectors. The method uses the measurement of the correlations in a pulse train to extract information on the spontaneous fission rate in the presence of neutrons from (α,n) reactions and induced fission. The measurement is relatively simple to perform and gives results very quickly (≤ 1 hour). By contrast, destructive analysis techniques are extremely costly and time consuming (several days). By improving the achievable accuracy of neutron multiplicity counting, a nondestructive analysis technique, it could be possible to reduce the use of destructive analysis measurements required in safeguards applications. The accuracy of a neutron multiplicity measurement can be affected by a number of variables such as density, isotopic composition, chemical composition and moisture in the material. In order to determine the magnitude of these effects on the measured plutonium mass a calculational tool, EXCOM, has been produced using VBA within Excel. 
This program was developed to help speed the analysis of Monte Carlo neutron transport simulation (MCNP) data, and only requires the count-rate data to calculate the mass of material using INCC's analysis methods instead of the full neutron multiplicity distribution required to run analysis in INCC. This paper describes what is implemented within EXCOM, including the methods used, how the program corrects for deadtime, and how uncertainty is calculated. This paper also describes how to use EXCOM within Excel.

2010-07-11

 
 
 
 
181

Forest Loss Triggers in Cameroon: A Quantitative Assessment Using Multiple Linear Regression Approach  

Directory of Open Access Journals (Sweden)

The triggers of forest area loss in Cameroon have not been properly understood. The measures used to curb forest area loss have been simplistic and generalized, with no clear-cut knowledge of the specific role of different potential factors. This study investigates the hypothesis that population growth is the main cause of forest area loss and identifies which factors are more significant in the causal equation. The open-source R software was used to produce multiple linear regression models. The correlation between the dependent variable and the independent variables was established by a correlation matrix, and the strength of the models was tested by power analysis. The results support the hypothesis that population growth is the dominant cause of deforestation in Cameroon, while arable production and permanent crop land and the arable production per capita index are second and third, respectively.

Epule Terence Epule

2011-08-01

182

Partitioning Predicted Variance into Constituent Parts: A Primer on Regression Commonality Analysis.  

Science.gov (United States)

Commonality analysis is a method of decomposing the R squared in a multiple regression analysis into the proportion of explained variance of the dependent variable associated with each independent variable uniquely and the proportion of explained variance associated with the common effects of one or more independent variables in various…

Amado, Alfred J.
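
For the commonality analysis record above, the decomposition can be illustrated for the simplest case of two predictors. The following is a minimal sketch, not the primer's own code: the function names (`r_squared`, `commonality_two`) and the toy data are hypothetical, and the partition used is the standard two-predictor one (unique effect of a predictor = full-model R squared minus the R squared of the model without it; common effect = sum of the single-predictor R squared values minus the full-model R squared).

```python
def r_squared(y, predictors):
    """R^2 of an ordinary least-squares fit of y on the predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in p] for p in predictors]
    k = len(cols)
    # Augmented normal equations: (X'X | X'y)
    a = [[sum(ci[t] * cj[t] for t in range(n)) for cj in cols]
         + [sum(ci[t] * y[t] for t in range(n))] for ci in cols]
    # Gauss-Jordan elimination with partial pivoting
    for i in range(k):
        pivot = max(range(i, k), key=lambda row: abs(a[row][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for row in range(k):
            if row != i:
                f = a[row][i] / a[i][i]
                a[row] = [u - f * v for u, v in zip(a[row], a[i])]
    b = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(b[j] * cols[j][t] for j in range(k)) for t in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y)
    return 1.0 - ss_res / ss_tot


def commonality_two(y, x1, x2):
    """Partition the full-model R^2 into two unique parts and one common part."""
    r2_full = r_squared(y, [x1, x2])
    r2_x1 = r_squared(y, [x1])
    r2_x2 = r_squared(y, [x2])
    return {
        "unique_x1": r2_full - r2_x2,      # variance explained only by x1
        "unique_x2": r2_full - r2_x1,      # variance explained only by x2
        "common": r2_x1 + r2_x2 - r2_full,  # variance shared by x1 and x2
        "total": r2_full,
    }


# Toy data, invented purely for illustration
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3.1, 2.9, 7.2, 6.8, 11.1, 10.9]
parts = commonality_two(y, x1, x2)
print(parts)
```

By construction the three components sum to the full-model R squared; with more than two predictors the number of components grows as 2^k - 1, which this sketch does not cover.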

183

Confidence intervals after multiple imputation: combining profile likelihood information from logistic regressions.  

Science.gov (United States)

In the logistic regression analysis of a small-sized, case-control study on Alzheimer's disease, some of the risk factors exhibited missing values, motivating the use of multiple imputation. Usually, Rubin's rules (RR) for combining point estimates and variances would then be used to estimate (symmetric) confidence intervals (CIs), on the assumption that the regression coefficients were distributed normally. Yet, rarely is this assumption tested, with or without transformation. In analyses of small, sparse, or nearly separated data sets, such symmetric CI may not be reliable. Thus, RR alternatives have been considered, for example, Bayesian sampling methods, but not yet those that combine profile likelihoods, particularly penalized profile likelihoods, which can remove first order biases and guarantee convergence of parameter estimation. To fill the gap, we consider the combination of penalized likelihood profiles (CLIP) by expressing them as posterior cumulative distribution functions (CDFs) obtained via a chi-squared approximation to the penalized likelihood ratio statistic. CDFs from multiple imputations can then easily be averaged into a combined CDF_c, allowing confidence limits for a parameter β at level 1 - α to be identified as those β* and β** that satisfy CDF_c(β*) = α/2 and CDF_c(β**) = 1 - α/2. We demonstrate that the CLIP method outperforms RR in analyzing both simulated data and data from our motivating example. CLIP can also be useful as a confirmatory tool, should it show that the simpler RR are adequate for extended analysis. We also compare the performance of CLIP to Bayesian sampling methods using Markov chain Monte Carlo. CLIP is available in the R package logistf. PMID:23873477

Heinze, Georg; Ploner, Meinhard; Beyea, Jan

2013-12-20
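
The final step described in the CLIP record above (averaging per-imputation CDFs and solving for the two confidence limits) can be sketched as follows. This is only an illustration under stated assumptions: normal CDFs stand in for the CDFs that CLIP derives from penalized likelihood profiles, the means and standard deviations are invented, and the function names are hypothetical. It is not the logistf implementation.

```python
import math
from functools import partial


def normal_cdf(x, mean, sd):
    """Stand-in CDF; CLIP itself would use a profile-likelihood-based CDF here."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def combined_cdf(x, cdfs):
    """Average the per-imputation CDFs at the point x."""
    return sum(f(x) for f in cdfs) / len(cdfs)


def solve_quantile(cdfs, p, lo, hi, tol=1e-10):
    """Bisection for combined_cdf(x) = p; assumes [lo, hi] brackets the solution."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if combined_cdf(mid, cdfs) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clip_interval(cdfs, alpha, lo, hi):
    """Limits b_lower, b_upper with CDF_c(b_lower) = alpha/2 and CDF_c(b_upper) = 1 - alpha/2."""
    return (solve_quantile(cdfs, alpha / 2.0, lo, hi),
            solve_quantile(cdfs, 1.0 - alpha / 2.0, lo, hi))


# Three hypothetical imputed-data analyses (invented values)
cdfs = [partial(normal_cdf, mean=m, sd=s)
        for m, s in [(0.8, 0.40), (1.0, 0.50), (1.3, 0.45)]]
lower, upper = clip_interval(cdfs, alpha=0.05, lo=-10.0, hi=10.0)
print(lower, upper)
```

Because the combined CDF is a mixture, the resulting interval need not be symmetric around any single point estimate, which is the property the abstract contrasts with Rubin's rules.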

184

Maximum likelihood, multiple imputation and regression calibration for measurement error adjustment  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In epidemiologic studies of exposure-disease association, often only a surrogate measure of exposure is available for the majority of the sample. A validation sub-study may be conducted to estimate the relation between the surrogate measure and true exposure levels. In this article, we discuss three methods of estimation for such a main study / validation study design: (i) maximum likelihood (ML), (ii) multiple imputation (MI) and (iii) regression calibration (RC). For logistic regression, we...

Messer, Karen; Natarajan, Loki

2008-01-01

185

The Determination of Polyethylene Glycol and Water in Archaeological Wood using Infrared Spectroscopy and Stepwise Multiple Linear Regression  

Directory of Open Access Journals (Sweden)

Polyethylene glycol (PEG) is the most common preservative in use for bulking and maintaining structural integrity in waterlogged wood. Conservators therefore need to be able to determine PEG concentrations in wood in a non-destructive manner. We present a study highlighting the application of infrared spectroscopy coupled with multivariate analysis techniques to predict the concentration of polyethylene glycol 400 (PEG-400) and water simultaneously. This technique uses attenuated total reflectance (ATR) spectroscopy and unconstrained stepwise multiple linear regression (SMLR) analysis for prediction of multiple components in archaeological wood. Using this model we have calculated the concentration of PEG-400 and water in treated archaeological waterlogged wood samples.

Rohan PATEL

2012-03-01

186

Multiple linear and principal component regressions for modelling ecotoxicity bioassay response.  

Science.gov (United States)

The ecotoxicological response of the living organisms in an aquatic system depends on the physical, chemical and bacteriological variables, as well as the interactions between them. An important challenge to scientists is to understand the interaction and behaviour of factors involved in a multidimensional process such as the ecotoxicological response. With this aim, multiple linear regression (MLR) and principal component regression were applied to the ecotoxicity bioassay response of Chlorella vulgaris and Vibrio fischeri in water collected at seven sites of Leça river during five monitoring campaigns (February, May, June, August and September of 2006). The river water characterization included the analysis of 22 physicochemical and 3 microbiological parameters. The model that best fitted the data was MLR, which shows: (i) a negative correlation with dissolved organic carbon, zinc and manganese, and a positive one with turbidity and arsenic, regarding C. vulgaris toxic response; (ii) a negative correlation with conductivity and turbidity and a positive one with phosphorus, hardness, iron, mercury, arsenic and faecal coliforms, concerning V. fischeri toxic response. This integrated assessment may allow the evaluation of the effect of future pollution abatement measures over the water quality of Leça River. PMID:24645478

Gomes, Ana I; Pires, José C M; Figueiredo, Sónia A; Boaventura, Rui A R

2014-01-01

187

Multi-Feature Fusion via Hierarchical Regression for Multimedia Analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Multimedia data are usually represented by multiple features. In this paper, we propose a new algorithm, namely Multi-feature Learning via Hierarchical Regression for multimedia semantics understanding, where two issues are considered. First, labeling large amount of training data is labor-intensive. It is meaningful to effectively leverage unlabeled data to facilitate multimedia semantics understanding. Second, given that multimedia data can be represented by multiple features, it is advanta...

Yang, Yi; Song, Jingkuan; Huang, Zi; Ma, Zhigang; Sebe, Nicu; Hauptmann, Alexander G.

2013-01-01

188

Egg hatchability prediction by multiple linear regression and artificial neural networks  

Directory of Open Access Journals (Sweden)

An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. The network was trained with the backpropagation algorithm on data obtained from industrial incubations. The ANN model was chosen because it fit the experimental data better than the multiple linear regression model, whose coefficients were determined by the least squares method. The simulation results of these approaches indicate that this ANN can be used for incubation performance prediction.

AC Bolzan

2008-06-01

189

Inferring gene expression dynamics via functional regression analysis  

Directory of Open Access Journals (Sweden)

Background: Temporal gene expression profiles characterize the time-dynamics of expression of specific genes and are increasingly collected in current gene expression experiments. In the analysis of experiments where gene expression is obtained over the life cycle, it is of interest to relate temporal patterns of gene expression associated with different developmental stages to each other to study patterns of long-term developmental gene regulation. We use tools from functional data analysis to study dynamic changes by relating temporal gene expression profiles of different developmental stages to each other. Results: We demonstrate that functional regression methodology can pinpoint relationships that exist between temporal gene expression profiles for different life cycle phases and incorporates dimension reduction as needed for these high-dimensional data. By applying these tools, gene expression profiles for pupa and adult phases are found to be strongly related to the profiles of the same genes obtained during the embryo phase. Moreover, one can distinguish between gene groups that exhibit relationships with positive and others with negative associations between later life and embryonal expression profiles. Specifically, we find a positive relationship in expression for muscle development related genes, and a negative relationship for strictly maternal genes for Drosophila, using temporal gene expression profiles. Conclusion: Our findings point to specific reactivation patterns of gene expression during the Drosophila life cycle which differ in characteristic ways between various gene groups. Functional regression emerges as a useful tool for relating gene expression patterns from different developmental stages, and avoids the problems with large numbers of parameters and multiple testing that affect alternative approaches.

Leng Xiaoyan

2008-01-01

190

Ratio Versus Regression Analysis: Some Empirical Evidence in Brazil  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This work compares the traditional methodology for ratio analysis, applied to a sample of Brazilian firms, with the alternative of regression analysis, applied to both cross-industry and intra-industry samples. The structural validity of the traditional methodology was tested through a model that represents its analogous regression format. The data are from 156 Brazilian public companies in nine industrial sectors for the year 1997. The results provide weak empirical support for the traditional...

2004-01-01

191

Multiple predictor smoothing methods for sensitivity analysis: Example results  

International Nuclear Information System (INIS)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described in the first part of this presentation: (i) locally weighted regression (LOESS), (ii) additive models, (iii) projection pursuit regression, and (iv) recursive partitioning regression. In this, the second and concluding part of the presentation, the indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

2008-01-01

192

Affine Invariant Descriptors of 3D Object Using Multiple Regression Model  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this work, a new invariant method [1,2,3] for 3D objects is proposed using a multiple regression model. This method consists of extracting an invariant vector using the multiple linear parameter model applied to the 3D object; the vector is invariant against affine transformations of this object. The 3D objects concerned are transformations of 3D objects by one element of the overall transformation. The set of transformations considered in this work is the general affine group.

2011-01-01

193

Marginal Regression Models with a Time to Event Outcome and Discrete Multiple Source Predictors  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Information from multiple informants is frequently used to assess psychopathology. We consider marginal regression models with multiple informants as discrete predictors and a time to event outcome. We fit these models to data from the Stirling County Study; specifically, the models predict mortality from self report of psychiatric disorders and also predict mortality from physician report of psychiatric disorders. Previously, Horton et al. found little relationship between self and physician...

2006-01-01

194

Marine Geodatabase and Multiple Regressive Pattern Recognition Technique: A New Approach to Marine Placer Resource Assessment.  

Science.gov (United States)

The ultramafic rocks of the Red Mountain in the Goodnews Bay area of southwest Alaska have been the commercial source of onshore placer Pt since 1926. The proximity of the Red Mountain to the Bering Sea, our geophysical survey revealing the possibility of drowned ultramafic and paleo-drainage channels offshore, and the platinum samples collected by various agencies suggest the availability of a significant quantity of marine Pt accumulations in this region. We have created a comprehensive geodatabase for future Pt prospecting and possible exploration in the offshore regions of Goodnews Bay. Offshore exploration needs a preliminary assessment of the marine Pt resource. We have used several regression techniques such as inverse distance weighting, kriging, radial basis functions, support vector machines (SVM) and relevance vector machines for our assessment. None of these techniques individually was able to capture the entire Pt data variability obtained from the sampled data. The reason could be the limitation of the method used or the complexity of the governing processes that influence the accumulation of marine Pt, such as glaciations, littoral currents, bathymetry, sea-level transgression, or paleo-drainage processes, which are difficult to include quantitatively in the assessment. To obtain improved accuracy of assessment, we propose a new method called the Multiple Regressive Pattern Recognition Technique (MRPRT). We hypothesize that by using the outputs of the different individual regression techniques as the input for a pattern recognition technique, such as the SVM, we will be able to overcome the shortcomings of the regression methods discussed above. The performance of MRPRT was evaluated using the coefficient of correlation (CC) and the coefficient of efficiency (CE). With MRPRT, the CC of our prediction improved from 0.57 to 0.77 and the CE from 0.28 to 0.43. Post-comparative analysis of the predicted marine Pt resource with the different governing processes of accumulation of modern Pt in the offshore Goodnews Bay region revealed that the littoral currents significantly influenced such accumulations.

Oommen, T.; Misra, D.; Prakash, A.; Bandopadhyay, S.; Naidu, S.; Kelley, J. J.

2006-12-01
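
The two evaluation metrics named in the MRPRT record above can be written out as a short sketch. The abstract does not define them, so this assumes the usual conventions: CC as the Pearson correlation between observed and predicted values, and CE as the Nash-Sutcliffe form (one minus the ratio of squared prediction error to the variance of the observations about their mean). The function names and sample values are hypothetical.

```python
import math


def coefficient_of_correlation(observed, predicted):
    """Pearson correlation between observed and predicted values (assumed definition of CC)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def coefficient_of_efficiency(observed, predicted):
    """Nash-Sutcliffe style efficiency (assumed definition of CE).

    1.0 means perfect prediction; 0.0 means no better than predicting the mean.
    """
    mean_o = sum(observed) / len(observed)
    ss_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_err / ss_tot


# Invented values, for illustration only
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.2, 1.7, 3.4, 3.9]
print(coefficient_of_correlation(observed, predicted))
print(coefficient_of_efficiency(observed, predicted))
```

Under these definitions CE penalizes bias and scale errors that CC ignores, which is one reason the two are often reported together.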

195

Calculation of U, Ra, Th and K contents in uranium ore by multiple linear regression method  

International Nuclear Information System (INIS)

A multiple linear regression method was used to compute γ spectra of uranium ore samples and to calculate contents of U, Ra, Th, and K. In comparison with the inverse matrix method, its advantage is that no standard samples of pure U, Ra, Th and K are needed for obtaining response coefficients.

1991-01-01

196

INTRODUCTION TO A COMBINED MULTIPLE LINEAR REGRESSION AND ARMA MODELING APPROACH FOR BEACH BACTERIA PREDICTION  

Science.gov (United States)

Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...

197

A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants  

Science.gov (United States)

A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…

Cooper, Paul D.

2010-01-01

198

Early cost estimating for road construction projects using multiple regression techniques  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The objective of this study is to develop early cost estimating models for road construction projects using multiple regression techniques, based on 131 sets of data collected in the West Bank in Palestine. As the cost estimates are required at early stages of a project, considerations were given to the fact that t...

Ibrahim Mahamid

2011-01-01

199

A STATISTICAL MODEL FOR THE 2G, GSM COMMUNICATION SYSTEM IN UTTARAKHAND USING MULTIPLE REGRESSION TECHNIQUE  

Directory of Open Access Journals (Sweden)

This paper introduces a statistical model for the 2G GSM communication system. A multiple regression formula is used to calculate path loss. It is assumed that hb, W and ? are three statistical variables. We use the Nakagami distribution to model hb and W, and the uniform distribution to model ?.

Meenal Sharma

2011-07-01

200

A Spreadsheet Tool for Learning the Multiple Regression F-Test, T-Tests, and Multicollinearity  

Science.gov (United States)

This note presents a spreadsheet tool that allows teachers the opportunity to guide students towards answering on their own questions related to the multiple regression F-test, the t-tests, and multicollinearity. The note demonstrates approaches for using the spreadsheet that might be appropriate for three different levels of statistics classes,…

Martin, David

2008-01-01

 
 
 
 
201

Test Cycle Optimization using Regression Analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Industrial robots make up an important part in today’s industry and are assigned to a range of different tasks. Needless to say, businesses need to rely on their machine park to function as planned, avoiding stops in production due to machine failures. This is where fault detection methods play a very important part. In this thesis a specific fault detection method based on signal analysis will be considered. When testing a robot for fault(s), a specific test cycle (trajectory) is executed ...

Meless, Dejen

2010-01-01

202

Applying Multiple Linear Regression and Neural Network to Predict Bank Performance  

Directory of Open Access Journals (Sweden)

Globalization and technological advancement have created a highly competitive market in the banking and finance industry. Performance of the industry depends heavily on the accuracy of the decisions made at managerial level. This study uses multiple linear regression and a feedforward artificial neural network to predict bank performance, and then evaluates the two techniques with the goal of finding a powerful tool for predicting bank performance. Data from thirteen banks for the period 2001-2006 were used in the study. ROA was used as the measure of bank performance and hence is the dependent variable for the multiple linear regression. Seven variables (liquidity, credit risk, cost to income ratio, size, concentration ratio, inflation and GDP) were used as independent variables. Under supervised learning, the dependent variable, ROA, was used as the target output for the artificial neural network. Seven inputs corresponding to the seven predictor variables were used for pattern recognition at the training phase. Experimental results from the multiple linear regression show that two variables, credit risk and cost to income ratio, are significant in determining bank performance. These two variables were found to explain about 60.9 percent of the total variation in the data with a mean square error (MSE) of 0.330. The artificial neural network was found to give optimal results with thirteen hidden neurons. Testing results show that the seven inputs explain about 66.9 percent of the total variation in the data with a very low MSE of 0.00687. Performance of both methods is measured by the mean square prediction error (MSPR) at the validation stage. The MSPR value for the neural network is lower than the MSPR value for multiple linear regression (0.0061 against 0.6190). The study concludes that the artificial neural network is the more powerful tool for predicting bank performance.

Nor Mazlina Abu Bakar

2009-09-01

203

Regression Analysis of Censored Data with Applications in Perimetry  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This thesis treats regression analysis when either the dependent or the independent variable is censored. We deal with quantile regression when the dependent variable is censored. Using the independence between the true values and the censoring limits the quantile function for the true values can be rewritten as another quantile function of the observed, censored values, where the quantile value itself is a function of the censoring distribution. The quantile value is estimated non-parametric...

Lindgren, Anna

1999-01-01

204

Regression Model Optimization for the Analysis of Experimental Data  

Science.gov (United States)

A candidate math model search algorithm was developed at Ames Research Center that determines a recommended math model for the multivariate regression analysis of experimental data. The search algorithm is applicable to classical regression analysis problems as well as wind tunnel strain gage balance calibration analysis applications. The algorithm compares the predictive capability of different regression models using the standard deviation of the PRESS residuals of the responses as a search metric. This search metric is minimized during the search. Singular value decomposition is used during the search to reject math models that lead to a singular solution of the regression analysis problem. Two threshold dependent constraints are also applied. The first constraint rejects math models with insignificant terms. The second constraint rejects math models with near-linear dependencies between terms. The math term hierarchy rule may also be applied as an optional constraint during or after the candidate math model search. The final term selection of the recommended math model depends on the regressor and response values of the data set, the user's function class combination choice, the user's constraint selections, and the result of the search metric minimization. A frequently used regression analysis example from the literature is used to illustrate the application of the search algorithm to experimental data.

Ulbrich, N.

2009-01-01
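
The search metric in the record above, the standard deviation of the PRESS residuals, can be illustrated with a small sketch. This is not the Ames algorithm: it omits the singular value decomposition check, the significance and near-linear-dependency constraints, and the hierarchy rule. It computes PRESS residuals by brute-force leave-one-out refitting rather than the usual hat-matrix shortcut, uses the sample standard deviation as an assumed reading of the metric, and compares just two hypothetical candidate models on synthetic data.

```python
import statistics


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations via Gauss-Jordan elimination."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yt for r, yt in zip(rows, y))] for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda row: abs(a[row][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for row in range(k):
            if row != i:
                f = a[row][i] / a[i][i]
                a[row] = [u - f * v for u, v in zip(a[row], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


def press_residuals(rows, y):
    """Residual of each point when predicted by a model fitted without that point."""
    residuals = []
    for i in range(len(y)):
        coef = fit_ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        prediction = sum(c * x for c, x in zip(coef, rows[i]))
        residuals.append(y[i] - prediction)
    return residuals


# Synthetic data generated from an exactly quadratic response (illustration only)
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0 + 2.0 * t + 0.5 * t * t for t in x]

# Two hypothetical candidate math models, given as design-matrix rows
candidates = {
    "linear": [[1.0, t] for t in x],
    "quadratic": [[1.0, t, t * t] for t in x],
}
scores = {name: statistics.stdev(press_residuals(rows, y))
          for name, rows in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Because each PRESS residual is computed for a point the model has not seen, the metric rewards predictive capability rather than goodness of fit alone, which is why it can be minimized over candidate models without always favoring the largest one.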

205

Sintering equation: determination of its coefficients by experiments - using multiple regression  

International Nuclear Information System (INIS)

Sintering is a method for volume compression (or volume contraction) of powdered or grained material by applying high temperature (below the melting point of the material). Maekipirtti tried to find an equation which describes the process of sintering by its main parameters: sintering time, sintering temperature and volume contraction. Such an equation is called a sintering equation. It also contains some coefficients which characterise the behaviour of the material during the process of sintering. These coefficients have to be determined by experiments. Here we show that some linear regressions will produce wrong coefficients, but multiple regression results in a useful sintering equation. (orig.)

1999-09-01

206

A new synthesis analysis method for building logistic regression prediction models.  

Science.gov (United States)

Synthesis analysis refers to a statistical method that integrates multiple univariate regression models and the correlation between each pair of predictors into a single multivariate regression model. The practical application of such a method could be developing a multivariate disease prediction model where a dataset containing the disease outcome and every predictor of interest is not available. In this study, we propose a new version of synthesis analysis that is specific to binary outcomes. We show that our proposed method possesses desirable statistical properties. We also conduct a simulation study to assess the robustness of the proposed method and compare it to a competing method. Copyright © 2014 John Wiley & Sons, Ltd. PMID:24634227

Sheng, Elisa; Zhou, Xiao Hua; Chen, Hua; Hu, Guizhou; Duncan, Ashlee

2014-07-10

207

Fundamental parameters vs. multiple regression calculations for the determination of europium in oxide catalyst supports by XRF  

International Nuclear Information System (INIS)

Fundamental parameters calculations are used for the analysis of europium in the concentration range of 0.1 wt% to 30.0 wt% in the oxidic catalyst supports alumina, calcia, magnesia, lanthania, and thoria. The precision and accuracy of this method depend on how the sample matrix is defined in the fundamental parameters program and on the number and concentration of the standards used. Results comparable to the multiple regression method are obtained when the matrix stoichiometry is defined as Eu2O3 and the catalyst oxide (i.e. Al2O3 etc). It is also necessary to use standards which bracket the europium concentration in the samples. When these conditions are met, the results are comparable to those obtained from a ten point multiple regression calibration curve but with a considerable saving of standard preparation time. The precision is better than ±2% relative. The % relative difference between the fundamental parameters and multiple regression results is also 2%. Data are presented which illustrate the effect of defining the sample stoichiometry in the XRF11 computer program.

1984-08-03

208

Ratio Versus Regression Analysis: Some Empirical Evidence in Brazil  

Directory of Open Access Journals (Sweden)

This work compares the traditional methodology for ratio analysis, applied to a sample of Brazilian firms, with the alternative of regression analysis, applied to both cross-industry and intra-industry samples. The structural validity of the traditional methodology was tested through a model that represents its analogous regression format. The data are from 156 Brazilian public companies in nine industrial sectors for the year 1997. The results provide weak empirical support for the traditional ratio methodology, as the validity of this methodology was found to differ between ratios.

Newton Carneiro Affonso da Costa Jr.

2004-06-01

209

Analysis of Sting Balance Calibration Data Using Optimized Regression Models  

Science.gov (United States)

Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm's two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm's term selection process.

Ulbrich, N.; Bader, Jon B.

2010-01-01

210

Multiple regression method to determine aerosol optical depth in atmospheric column in Penang, Malaysia  

Science.gov (United States)

Aerosol optical depth (AOD) from AERONET data has a very fine resolution, but air pollution index (API), visibility and relative humidity from ground truth measurements are coarse. To obtain the local AOD in the atmosphere, the relationship between AOD and these three parameters was determined using multiple regression analysis. Data from the southwest monsoon period (August to September 2012) taken in Penang, Malaysia, were used to establish a quantitative relationship in which the AOD is modeled as a function of API, relative humidity, and visibility. The most highly correlated model was used to predict AOD values during the southwest monsoon period. When aerosol is not uniformly distributed in the atmosphere, the predicted AOD can deviate strongly from the measured values. These deviating data can be removed by comparing the predicted AOD values with the actual AERONET data, which helps to investigate whether the non-uniform source of the aerosol is at the ground surface or at a higher altitude. The model can accurately predict AOD only if the aerosol is uniformly distributed in the atmosphere. Further study is needed to determine whether this model is suitable for predicting AOD not only in Penang but also in other states of Malaysia or even globally.

Tan, F.; Lim, H. S.; Abdullah, K.; Yoon, T. L.; Zubir Matjafri, M.; Holben, B.

2014-02-01

211

Groundwater-level prediction using multiple linear regression and artificial neural network techniques: a comparative assessment  

Science.gov (United States)

The potential of multiple linear regression (MLR) and artificial neural network (ANN) techniques in predicting transient water levels over a groundwater basin was compared. MLR and ANN modeling was carried out at 17 sites in Japan, considering all significant inputs: rainfall, ambient temperature, river stage, 11 seasonal dummy variables, and influential lags of rainfall, ambient temperature, river stage and groundwater level. Seventeen site-specific ANN models were developed, using multi-layer feed-forward neural networks trained with Levenberg-Marquardt backpropagation algorithms. The performance of the models was evaluated using statistical and graphical indicators. Comparison of the goodness-of-fit statistics of the MLR models with those of the ANN models indicated that there is better agreement between the ANN-predicted groundwater levels and the observed groundwater levels at all the sites, compared to the MLR. This finding was supported by the graphical indicators and the residual analysis. Thus, it is concluded that the ANN technique is superior to the MLR technique in predicting spatio-temporal distribution of groundwater levels in a basin. However, considering the practical advantages of the MLR technique, it is recommended as an alternative and cost-effective groundwater modeling tool.

Sahoo, Sasmita; Jha, Madan K.

2013-12-01

212

Optimization of fixture layouts of glass laser optics using multiple kernel regression.  

Science.gov (United States)

We aim to build an integrated fixturing model to describe the structural properties and thermal properties of the support frame of glass laser optics. Therefore, (a) a near global optimal set of clamps can be computed to minimize the surface shape error of the glass laser optic based on the proposed model, and (b) a desired surface shape error can be obtained by adjusting the clamping forces under various environmental temperatures based on the model. To construct the model, we develop a new multiple kernel learning method and call it multiple kernel support vector functional regression. The proposed method uses two layer regressions to group and order the data sources by the weights of the kernels and the factors of the layers. Because of that, the influences of the clamps and the temperature can be evaluated by grouping them into different layers. PMID:24922017

Su, Jianhua; Cao, Enhua; Qiao, Hong

2014-05-10

213

Estimation of Parameters in Heteroscedastic Multiple Regression Model using Leverage Based Near-Neighbors  

Directory of Open Access Journals (Sweden)

Full Text Available In this study, we propose a Leverage Based Near-Neighbor (LBNN) method where prior information on the structure of the heteroscedastic error is not required. In the proposed LBNN method, weights are determined not from the near-neighbor values of the explanatory variables, but from their corresponding leverage values so that it can be readily applied to a multiple regression model. Both the empirical and Monte Carlo simulation results show that the LBNN method offers substantial improvement over the existing methods. The LBNN has significantly reduced the standard errors of the estimates and also the standard errors of residuals for both simple and multiple linear regression models. Hence, the LBNN can be established as one reliable alternative approach to other existing methods that deal with heteroscedastic errors when the form of heteroscedasticity is unknown.

H. Midi

2009-01-01

214

User's Guide to the Weighted-Multiple-Linear Regression Program (WREG version 1.0)  

Science.gov (United States)

Streamflow is not measured at every location in a stream network. Yet hydrologists, State and local agencies, and the general public still seek to know streamflow characteristics, such as mean annual flow or flood flows with different exceedance probabilities, at ungaged basins. The goals of this guide are to introduce and familiarize the user with the weighted multiple-linear regression (WREG) program, and to also provide the theoretical background for program features. The program is intended to be used to develop a regional estimation equation for streamflow characteristics that can be applied at an ungaged basin, or to improve the corresponding estimate at continuous-record streamflow gages with short records. The regional estimation equation results from a multiple-linear regression that relates the observable basin characteristics, such as drainage area, to streamflow characteristics.

Eng, Ken; Chen, Yin-Yu; Kiang, Julie E.

2009-01-01

215

Multiple linear regression MOS for short-term wind power forecast  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Short-term (0 - 36 h ahead) wind power forecast is a central issue for the correct management of a grid connected wind farm. A combination of physical and statistical treatments to post-process Numerical Weather Predictions (NWP) outputs is needed for successful short-term wind power forecasts. One of the most promising and effective approaches for statistical treatment is the Model Output Statistics (MOS) technique. In this study a MOS based on multiple linear regression is proposed: the mod...

Ranaboldo, Matteo

2011-01-01

216

Multiple polynomial regression method for determination of biomedical optical properties from integrating sphere measurements  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We present a new, to our knowledge, method for extracting optical properties from integrating sphere measurements on thin biological samples. The method is based on multivariate calibration techniques involving Monte Carlo simulations, multiple polynomial regression, and a Newton-Raphson algorithm for solving nonlinear equation systems. Prediction tests with simulated data showed that the mean relative prediction error of the absorption and the reduced scattering coefficients within typical b...

Dam, J. S.; Dalgaard, T.; Fabricius, P. E.; Andersson-engels, Stefan

2000-01-01

217

The use of weighted multiple linear regression to estimate QTL-by-QTL epistatic effects  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Knowledge of the nature and magnitude of gene effects, as well as their contribution to the control of metric traits, is important in formulating efficient breeding programs for the improvement of plant genetics. Information concerning a genetic parameter such as the additive-by-additive epistatic effect can be useful in traditional breeding. This report describes the results obtained by applying weighted multiple linear regression to estimate the parameter connected with an additive-by-addit...

2012-01-01

218

General regression neural network in energy cost analysis  

Energy Technology Data Exchange (ETDEWEB)

Previous research by the authors on energy cost evaluation in industrial processes used variance analysis techniques (MANOVA). The results were satisfactory, and the codes developed using these techniques on process computers were able to take account of various factors. Nevertheless, either many hypotheses had to be made on the analytical form of the regression surfaces, or a pure MANOVA model had to be used, losing information on the possible interpolation. Moreover, the regression approach was hardly extensible to on-line acquisition of new data. In order to achieve this goal and to simplify the processing of data, we adopted neural network techniques. We tested various types of networks and found empirical evidence that the General Regression Neural Network (GRNN) structure could behave consistently better than back-propagation algorithms.

Tucci, M.; Rinaldi, R.; Romoli, S. [Florence Univ. (Italy). Dept. of Energy Engineering]

1996-11-01

219

Regression Analysis between Properties of Subgrade Lateritic Soil  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper presents the results of a study that used regression analysis to examine possible correlations between index properties and the California Bearing Ratio (CBR) of some lateritic soils within Osogbo town in South Western Nigeria. For an appreciable conclusion to be established, lateritic soil samples were collected from eight (8) different borrow pits within the town and various laboratory tests including Atterberg Limits, Gradation analysis, California Bearing Ratio, Compaction...

2012-01-01

220

Isolated Area Load Forecasting using Linear Regression Analysis: Practical Approach  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper presents an analysis to forecast the loads of an isolated area where the load history is not available or may not represent the realistic demand for electricity. The analysis is done through linear regression and is based on the identification of the factors on which electrical load growth depends. To determine the identification factors, areas are selected whose histories of load growth rate are known and whose load growth deciding factors are similar to those of the isolated are...

2011-01-01

 
 
 
 
221

Least Squares Adjustment: Linear and Nonlinear Weighted Regression Analysis  

DEFF Research Database (Denmark)

This note primarily describes the mathematics of least squares regression analysis as it is often used in geodesy, including land surveying and satellite positioning applications. In these fields regression is often termed adjustment. The note also contains a couple of typical land surveying and satellite positioning application examples. In these application areas we are typically interested in the parameters in the model, typically 2- or 3-D positions, and not in predictive modelling, which is often the main concern in other regression analysis applications. Adjustment is often used to obtain estimates of relevant parameters in an over-determined system of equations, which may arise from deliberately carrying out more measurements than actually needed to determine the set of desired parameters. An example may be the determination of a geographical position based on information from a number of Global Navigation Satellite System (GNSS) satellites, also known as space vehicles (SV). It takes at least four SVs to determine the position (and the clock error) of a GNSS receiver. Often more than four SVs are used, and we use adjustment to obtain a better estimate of the geographical position (and the clock error) and to obtain estimates of the uncertainty with which the position is determined. Regression analysis is used in many other fields of application in the natural, the technical and the social sciences. Examples may be curve fitting, calibration, establishing relationships between different variables in an experiment or in a survey, etc. Regression analysis is probably one of the most used statistical techniques around. Dr. Anna B. O. Jensen provided insight and data for the Global Positioning System (GPS) example. Matlab code and sections that are considered as either traditional land surveying material or as advanced material are typeset with smaller fonts. Comments in general or on, for example, unavoidable typos, shortcomings and errors are most welcome.

Nielsen, Allan Aasbjerg

2007-01-01
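
The adjustment described in the entry above can be illustrated with a minimal weighted least squares sketch. It assumes a linear observation model with independent observations of known standard deviation; the function name `weighted_least_squares`, the toy data and the straight-line model are hypothetical and are not taken from the note, which uses Matlab.

```python
import numpy as np


def weighted_least_squares(A, y, sigma):
    """Estimate x in the over-determined system A x ~ y.

    sigma holds the assumed standard deviation of each observation;
    the weights are 1 / sigma**2 (a diagonal weight matrix).
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2

    # Normal equations: (A^T P A) x = A^T P y, with P = diag(w).
    normal = A.T @ (w[:, None] * A)
    x_hat = np.linalg.solve(normal, A.T @ (w * y))

    # Variance factor from the weighted residuals, then parameter covariance.
    residuals = y - A @ x_hat
    dof = A.shape[0] - A.shape[1]
    s0_sq = np.sum(w * residuals**2) / dof
    cov = s0_sq * np.linalg.inv(normal)
    return x_hat, cov


# Hypothetical toy data: five observations of a straight line a + b * t.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
A = np.column_stack([np.ones_like(t), t])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
sigma = np.array([0.1, 0.1, 0.2, 0.2, 0.1])

x_hat, cov = weighted_least_squares(A, y, sigma)
print("estimates:", x_hat)
print("standard errors:", np.sqrt(np.diag(cov)))
```

With more observations than unknowns, the redundancy is what makes the uncertainty estimate possible; with exactly as many observations as parameters the degrees of freedom would be zero and the variance factor undefined.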

222

Multiple Time Scales and Longitudinal Measurements in Event History Analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

A general time-to-event data analysis known as event history analysis is considered. The focus is on the analysis of time-to-event data using Cox's regression model when the time to the event may be measured from different origins giving several observable time scales and when longitudinal measurements are involved. For the multiple time scales problem, procedures to choose a basic time scale in Cox's regression model are proposed. The connections between piecewise constant hazards, time-depe...

2005-01-01

223

Early cost estimating for road construction projects using multiple regression techniques  

Directory of Open Access Journals (Sweden)

Full Text Available The objective of this study is to develop early cost estimating models for road construction projects using multiple regression techniques, based on 131 sets of data collected in the West Bank in Palestine. As the cost estimates are required at early stages of a project, consideration was given to the fact that the input data for the required regression model could be easily extracted from sketches or the scope definition of the project. Eleven regression models are developed to estimate the total cost of a road construction project in US dollars; 5 of them include bid quantities as input variables and 6 include road length and road width. The coefficient of determination r² for the developed models ranges from 0.92 to 0.98, which indicates that the values predicted by the forecast models fit the real-life data. The mean absolute percentage error (MAPE) of the developed regression models ranges from 13% to 31%; the results compare favorably with past research, which has shown that the estimate accuracy in the early stages of a project is between ±25% and ±50%.

Ibrahim Mahamid

2011-12-01
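
The two accuracy measures quoted in the entry above, the coefficient of determination and the MAPE, can be computed as in the following sketch. The function names and the cost figures are hypothetical and serve only to show the definitions; they are not data from the study.

```python
import numpy as np


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))


# Hypothetical project costs and model predictions, in thousands of US dollars.
actual_cost = [100.0, 200.0, 400.0]
predicted_cost = [110.0, 180.0, 400.0]

print("r^2 :", r_squared(actual_cost, predicted_cost))
print("MAPE:", mape(actual_cost, predicted_cost), "%")
```

The two measures answer different questions: r² describes how much of the variation in cost the model explains, while MAPE describes the typical relative size of an individual estimate's error, which is why a model can have a high r² and still a MAPE of tens of percent.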

224

Eigenspectra, a robust regression method for multiplexed Raman spectra analysis.  

Science.gov (United States)

With the latest development of Surface Enhanced Raman Scattering (SERS) nanoparticles, Raman spectroscopy now can be extended to bioimaging and biosensing. In this study, we demonstrate the ability of Raman spectroscopy to separate multiple spectral fingerprints using Raman nanotags. A machine learning method is proposed to estimate the mixing ratios of sources from mixture signals. It decomposes the mixture signals into components for both best representation and most relating to mixing ratios. Then regression coefficients are calculated for the prediction. The robustness of the method was compared with least squares and weighted least squares methods. PMID:23798222

Li, Shuo; Nyagilo, James O; Dave, Digant P; Zhang, Baoju; Gao, Jean

2013-01-01

225

Multiple regression and Artificial Neural Network for long-term rainfall forecasting using large scale climate modes  

Science.gov (United States)

In this study, the application of Artificial Neural Networks (ANN) and Multiple regression analysis (MR) to forecast long-term seasonal spring rainfall in Victoria, Australia was investigated using lagged El Nino Southern Oscillation (ENSO) and Indian Ocean Dipole (IOD) as potential predictors. The use of dual (combined lagged ENSO-IOD) input sets for calibrating and validating ANN and MR models is proposed to investigate the simultaneous effect of past values of these two major climate modes on long-term spring rainfall prediction. The MR models that did not violate the limits of statistical significance and multicollinearity were selected for future spring rainfall forecast. The ANN was developed in the form of a multilayer perceptron using the Levenberg-Marquardt algorithm. Both MR and ANN modelling were assessed statistically using mean square error (MSE), mean absolute error (MAE), Pearson correlation (r) and Willmott index of agreement (d). The developed MR and ANN models were tested on out-of-sample test sets; the MR models showed very poor generalisation ability for east Victoria with correlation coefficients of -0.99 to -0.90 compared to ANN with correlation coefficients of 0.42-0.93; ANN models also showed better generalisation ability for central and west Victoria with correlation coefficients of 0.68-0.85 and 0.58-0.97 respectively. The ability of multiple regression models to forecast out-of-sample sets is comparable to that of ANN for Daylesford in central Victoria and Kaniva in west Victoria (r = 0.92 and 0.67 respectively). The errors of the testing sets for ANN models are generally lower compared to multiple regression models. The statistical analysis suggests the potential of ANN over MR models for rainfall forecasting using large scale climate modes.

Mekanik, F.; Imteaz, M. A.; Gato-Trinidad, S.; Elmahdi, A.

2013-10-01
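
The four skill measures named in the entry above can be computed as in this sketch. It uses the original (1981) form of the Willmott index of agreement, which is an assumption since the abstract does not say which variant was used; the function name and the rainfall values are hypothetical.

```python
import numpy as np


def forecast_skill(observed, predicted):
    """Return MSE, MAE, Pearson r and the Willmott index of agreement d."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = p - o

    mse = np.mean(err**2)
    mae = np.mean(np.abs(err))
    r = np.corrcoef(o, p)[0, 1]

    # Willmott (1981): d = 1 - sum((P - O)^2) / sum((|P - Obar| + |O - Obar|)^2)
    o_bar = o.mean()
    d = 1.0 - np.sum(err**2) / np.sum((np.abs(p - o_bar) + np.abs(o - o_bar)) ** 2)
    return mse, mae, r, d


# Hypothetical seasonal rainfall totals (mm): observed versus forecast.
obs = [120.0, 95.0, 150.0, 80.0, 110.0]
fcst = [110.0, 100.0, 140.0, 90.0, 105.0]

mse, mae, r, d = forecast_skill(obs, fcst)
print(f"MSE={mse:.2f}  MAE={mae:.2f}  r={r:.3f}  d={d:.3f}")
```

Pearson r is insensitive to a constant bias or a scale error in the forecast, whereas d penalises both, so the two are usually reported together.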

226

The use of weighted multiple linear regression to estimate QTL-by-QTL epistatic effects  

Directory of Open Access Journals (Sweden)

Full Text Available Knowledge of the nature and magnitude of gene effects, as well as their contribution to the control of metric traits, is important in formulating efficient breeding programs for the improvement of plant genetics. Information concerning a genetic parameter such as the additive-by-additive epistatic effect can be useful in traditional breeding. This report describes the results obtained by applying weighted multiple linear regression to estimate the parameter connected with an additive-by-additive epistatic interaction. Three weight variants were used: (1) standard weights based on estimated variances, (2) different weights for minimal, maximal and other lines, and (3) different weights for extreme and other lines. The approach described here combines two methods of estimation, one based on phenotypic observations and the other using molecular marker data. The comparison was done using Monte Carlo simulations. The results show that the application of weighted regression to the marker data yielded estimates similar to those obtained by phenotypic methods.

Jan Bocianowski

2012-01-01

227

Multivariate quantiles and multiple-output regression quantiles: From $L_1$ optimization to halfspace depth  

CERN Multimedia

A new multivariate concept of quantile, based on a directional version of Koenker and Bassett's traditional regression quantiles, is introduced for multivariate location and multiple-output regression problems. In their empirical version, those quantiles can be computed efficiently via linear programming techniques. Consistency, Bahadur representation and asymptotic normality results are established. Most importantly, the contours generated by those quantiles are shown to coincide with the classical halfspace depth contours associated with the name of Tukey. This relation does not only allow for efficient depth contour computations by means of parametric linear programming, but also for transferring from the quantile to the depth universe such asymptotic results as Bahadur representations. Finally, linear programming duality opens the way to promising developments in depth-related multivariate rank-based inference.

Hallin, Marc; Šiman, Miroslav; 10.1214/09-AOS723

2010-01-01

228

The Study on Technology Innovation of Chinese Enterprises by Regression Analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Using China Science and Technology data from recent years, we apply multiple regression to analyze the factors influencing technology innovation, and identify the significant and non-significant factors among China’s investment expenditures and related policies for technological innovation, so as to provide guidance and reference for enhancing China's technological innovation capability and promoting domestic economic development.

ZIYAN ZHANG; xungang zheng

2011-01-01

229

Comparison of Artificial Neural Networks and Logistic Regression Analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Objectives: The factors that affect students’ alcohol use behaviors were examined by logistic regression analysis and artificial neural networks and the efficiency of these methods in identifying alcohol users and non-users was compared using the receiver operating characteristics (ROC) curve method. Study Design: Graduate students of 1-4 years in Trakya University Medical Faculty (2003-2004) were administered a questionnaire to predict their alcohol use behaviors and were assessed with the...

2005-01-01

230

Entrepreneurship programs in developing countries: A meta regression analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper provides a synthetic and systematic review on the effectiveness of various entrepreneurship programs in developing countries. We adopt a meta-regression analysis using 37 impact evaluation studies that were in the public domain by March 2012, and draw out several lessons on the design of the programs. We observe a wide variation in program effectiveness across different interventions depending on outcomes, types of beneficiaries, and country context. Overall, entrepreneurship progr...

Cho, Yoonyoung; Trionfi Honorati, Maddalena

2013-01-01

231

A Logistic Regression Analysis of the Ischemic Heart Disease Risk  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The main objective of the present study is to investigate factors that contribute significantly to enhancing the risk of ischemic heart disease. The dependent variable of the study is diagnosis - whether the patient has the disease or does not have the disease. Logistic regression analysis is applied for exploring the factors affecting the disease. The results of the study show that the factors that contribute significantly to enhancing the risk of ischemic heart disease are the use of banaspati gh...

Bhatti, Irfana P.; Lohano, Heman D.; Pirzado, Zafar A.; Jafri, Imran A.

2006-01-01

232

Multiple Linear Regression of Maximum Queue Length Probability Function for Infected Fish in the Fish Farms  

Directory of Open Access Journals (Sweden)

Full Text Available This study deals with the problem of obtaining important information about infected fish in fish farms, where it has always been difficult to handle the mathematical formulas obtained in related research, such as the maximum number of infected fish in a fish farm under a queue system, because many computational procedures are required. The multiple linear regression formula for the probability function of the maximum queue length of infected fish during a finite time, measured in days, and the cumulative distribution function were obtained.

Mohammed Mohammed El Genidy

2014-01-01

233

Poisson Regression Analysis of Illness and Injury Surveillance Data  

Energy Technology Data Exchange (ETDEWEB)

The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra-Poisson variation. The R open source software environment for statistical computing and graphics is used for analysis. Additional details about R and the data that were used in this report are provided in an Appendix. Information on how to obtain R and utility functions that can be used to duplicate results in this report are provided.

Frome E.L., Watkins J.P., Ellis E.D.

2012-12-12
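
The log-linear rate model and the check for extra-Poisson variation described in the entry above can be sketched as follows. This is an illustration in Python rather than the report's R code: the fit uses plain iteratively reweighted least squares with log person-time as an offset, and the dispersion is estimated by the Pearson chi-square divided by the residual degrees of freedom, which is one common moment estimator and may differ from the report's procedure. The function names and the stratum data are hypothetical.

```python
import numpy as np


def fit_poisson_rate(X, counts, person_time, n_iter=50, tol=1e-10):
    """Fit log(mu) = X beta + log(person_time) by IRLS; return beta and fitted means."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(counts, dtype=float)
    offset = np.log(np.asarray(person_time, dtype=float))

    # Starting at beta = 0 sets the fitted mean to the person-time itself,
    # which is assumed here to be at least as large as the counts.
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta + offset
        mu = np.exp(eta)
        z = (eta - offset) + (y - mu) / mu   # working response for the log link
        XtW = X.T * mu                       # IRLS weights equal the fitted means
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, np.exp(X @ beta + offset)


def pearson_dispersion(counts, mu, n_params):
    """Pearson chi-square over residual degrees of freedom; values well above 1 suggest over-dispersion."""
    y = np.asarray(counts, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return np.sum((y - mu) ** 2 / mu) / (len(y) - n_params)


# Hypothetical strata: intercept plus one indicator (e.g. an occupational group).
X = np.array([[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]], dtype=float)
counts = np.array([4, 7, 5, 12, 20, 9])
person_time = np.array([200.0, 300.0, 250.0, 220.0, 310.0, 240.0])

beta, mu = fit_poisson_rate(X, counts, person_time)
print("rate ratio for the indicator:", np.exp(beta[1]))
print("dispersion estimate:", pearson_dispersion(counts, mu, X.shape[1]))
```

In a quasi-likelihood adjustment of this kind the coefficient estimates are unchanged and their standard errors are multiplied by the square root of the dispersion estimate.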

234

Research of quality indices for cold-smoked salmon using a stepwise multiple regression of microbiological counts and physico-chemical parameters  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Aims: The aim of the study was to assess the relationships between the remaining shelf-life (RSL) of cold-smoked salmon and various microbiological and physico-chemical parameters, using a multivariate data analysis in the form of stepwise forward multiple regression.

2001-01-01

235

Using the Coefficient of Determination "R"[superscript 2] to Test the Significance of Multiple Linear Regression  

Science.gov (United States)

This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)

Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.

2013-01-01
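
As background to the entry above, a standard textbook identity links the coefficient of determination to the overall F statistic. It is stated here as a sketch under the usual assumptions (a model with an intercept, k predictors, n observations and independent normal errors); the symbols n, k and F are introduced for illustration and are not taken from the article, whose own derivation may differ.

```latex
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)} \sim F_{k,\, n-k-1},
\qquad
R^2 \sim \mathrm{Beta}\!\left(\tfrac{k}{2},\ \tfrac{n-k-1}{2}\right)
\quad \text{under } H_0:\ \beta_1 = \dots = \beta_k = 0 .
```

Because F is a monotone increasing function of R^2, rejecting the null hypothesis for large R^2 using the beta distribution is equivalent to the usual overall F test.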

236

Bayesian residual analysis for beta-binomial regression models  

Science.gov (United States)

The beta-binomial regression model is an alternative model for the sum of any sequence of equicorrelated binary variables with common probability of success p. In this work a Bayesian perspective of this model is presented, considering different link functions and different correlation structures. A general Bayesian residual analysis for this model, an issue which is often neglected in Bayesian analysis, is presented in order to check the assumptions in the model, using the residuals based on the predicted values obtained by the conditional predictive ordinate [1], the residuals based on the posterior distribution of the model parameters [2] and the Bayesian deviance residual [3].

Pires, Rubiane Maria; Diniz, Carlos Alberto Ribeiro

2012-10-01

237

Regression analysis of radiological parameters in nuclear power plants  

International Nuclear Information System (INIS)

Indian Pressurized Heavy Water Reactors (PHWRs) have now attained maturity in their operations. Indian PHWR operation started in the year 1972. At present there are 12 operating PHWRs collectively producing nearly 2400 MWe. Sufficient radiological data are available for analysis to draw inferences which may be utilised for better understanding of radiological parameters influencing the collective internal dose. Tritium is the main contributor to the occupational internal dose originating in PHWRs. An attempt has been made to establish the relationship between radiological parameters, which may be useful to draw inferences about the internal dose. Regression analysis has been done to find out the relationship, if it exists, among the following variables: A. Specific tritium activity of heavy water (Moderator and PHT) and tritium concentration in air at various work locations. B. Internal collective occupational dose and tritium release to environment through air route. C. Specific tritium activity of heavy water (Moderator and PHT) and collective internal occupational dose. For this purpose multivariate regression analysis has been carried out. D. Tritium concentration in air at various work locations and tritium release to environment through air route. For this purpose multivariate regression analysis has been carried out. This analysis reveals that collective internal dose has a very good correlation with the tritium activity release to the environment through the air route, whereas no correlation has been found between specific tritium activity in the heavy water systems and collective internal occupational dose. A good correlation has been found in case D, and the F test reveals that it is not by chance. (author)

2003-03-05

238

Arch Height: A Regression Analysis of Different Measuring Parameters  

Directory of Open Access Journals (Sweden)

Full Text Available Rationale: To measure the height of the arch of the foot, either the standing navicular height or the talar height of the medial longitudinal arch was accepted in earlier days, whereas modern authors take the ‘standing normalised navicular height’ as the yardstick. Because these measurements are troublesome and time consuming, they are in practice not used in a busy OPD schedule; the arch height is measured in the supine posture instead. Objectives: This study aimed to derive the regression between the standing arch-height values and their supine counterparts, so that the former can be predicted easily from the latter. Methodology: The study was carried out among 103 adult subjects within the purview of North Bengal Medical College & Hospital. The navicular and talar heights were determined from x-ray films of their feet in supine and standing posture, and the records were analysed. Result: Statistically significant correlation followed by regression analysis yielded simple linear regression equations for predicting the standing arch-height values from the supine values, derived separately for males and females. Conclusion: From a known supine arch-height value, the respective standing arch height, as well as the ‘standing normalised navicular height’, can be derived indirectly, avoiding the entire troublesome maneuver in regular practice. The present study therefore recommends this method in clinical fields as a more rational and ideal approach to estimating arch height.

Hironmoy Roy

2011-07-01

239

Efficient regression analysis with ranked-set sampling.  

Science.gov (United States)

This article is motivated by a lung cancer study where a regression model is involved and the response variable is too expensive to measure but the predictor variable can be measured easily with relatively negligible cost. This situation occurs quite often in medical studies, quantitative genetics, and ecological and environmental studies. In this article, by using the idea of ranked-set sampling (RSS), we develop sampling strategies that can reduce cost and increase efficiency of the regression analysis for the above-mentioned situation. The developed method is applied retrospectively to a lung cancer study. In the lung cancer study, the interest is to investigate the association between smoking status and three biomarkers: polyphenol DNA adducts, micronuclei, and sister chromatid exchanges. Optimal sampling schemes with different optimality criteria such as A-, D-, and integrated mean square error (IMSE)-optimality are considered in the application. With set size 10 in RSS, the improvement of the optimal schemes over simple random sampling (SRS) is great. For instance, by using the optimal scheme with IMSE-optimality, the IMSEs of the estimated regression functions for the three biomarkers are reduced to about half of those incurred by using SRS. PMID:15606420

Chen, Zehua; Wang, You-Gan

2004-12-01
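
The basic idea of ranked-set sampling mentioned in the entry above can be sketched as follows. The sketch implements plain balanced RSS, in which units are ranked on the cheap predictor and only one unit per set has its expensive response measured; it does not implement the optimal (A-, D- or IMSE-) schemes developed in the article. The function name, the synthetic population and the parameter values are hypothetical.

```python
import numpy as np


def ranked_set_sample(x, set_size, n_cycles, rng):
    """Return indices of units selected by balanced ranked-set sampling on x.

    In each cycle, for every rank 1..set_size, a fresh set of set_size units
    is drawn at random, ordered by the cheap predictor x, and the unit holding
    that rank is kept. Only the kept units need the expensive response.
    """
    x = np.asarray(x, dtype=float)
    chosen = []
    for _ in range(n_cycles):
        for rank in range(set_size):
            candidates = rng.choice(len(x), size=set_size, replace=False)
            ordered = candidates[np.argsort(x[candidates])]
            chosen.append(int(ordered[rank]))
    return np.array(chosen)


# Hypothetical population: a cheap predictor x for 1000 subjects.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)

idx = ranked_set_sample(x, set_size=10, n_cycles=3, rng=rng)
print("units to measure:", len(idx))
print("spread of x in the RSS sample:", x[idx].std())
```

Because each rank is represented equally often, the selected predictor values cover the range of x more evenly than a simple random sample of the same size. Sets are drawn independently here, so a unit may occasionally be selected more than once.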

240

Augmented multiple instance regression for inferring object contours in bounding boxes.  

Science.gov (United States)

In this paper, we address the problem of the high annotation cost of acquiring training data for semantic segmentation. Most modern approaches to semantic segmentation are based upon graphical models, such as the conditional random fields, and rely on sufficient training data in form of object contours. To reduce the manual effort on pixel-wise annotating contours, we consider the setting in which the training data set for semantic segmentation is a mixture of a few object contours and an abundant set of bounding boxes of objects. Our idea is to borrow the knowledge derived from the object contours to infer the unknown object contours enclosed by the bounding boxes. The inferred contours can then serve as training data for semantic segmentation. To this end, we generate multiple contour hypotheses for each bounding box with the assumption that at least one hypothesis is close to the ground truth. This paper proposes an approach, called augmented multiple instance regression (AMIR), that formulates the task of hypothesis selection as the problem of multiple instance regression (MIR), and augments information derived from the object contours to guide and regularize the training process of MIR. In this way, a bounding box is treated as a bag with its contour hypotheses as instances, and the positive instances refer to the hypotheses close to the ground truth. The proposed approach has been evaluated on the Pascal VOC segmentation task. The promising results demonstrate that AMIR can precisely infer the object contours in the bounding boxes, and hence provide effective alternatives to manually labeled contours for semantic segmentation. PMID:24808342

Kuang-Jui Hsu; Yen-Yu Lin; Yung-Yu Chuang

2014-04-01

 
 
 
 
241

Multivariate study and regression analysis of gluten-free granola  

Scientific Electronic Library Online (English)

This study developed a gluten-free granola and evaluated it during storage with the application of multivariate and regression analysis of the sensory and instrumental parameters. The physicochemical, sensory, and nutritional characteristics of a product containing quinoa, amaranth and linseed were evaluated. The crude protein and lipid contents were 97.49 and 122.72 g kg-1 of food, respectively. The polyunsaturated/saturated and n-6:n-3 fatty acid ratios were 2.82 and 2.59:1, respectively. Granola had the best alpha-linolenic acid content, nutritional indices in the lipid fraction, and mineral content. There were good hygienic and sanitary conditions during storage, probably due to the low water activity of the formulation, which helped inhibit microbial growth. The sensory attributes ranged from 'like very much' to 'like slightly', and the regression models were highly fitted and correlated during the storage period. A reduction in the sensory attribute levels and in the product physical stabilisation was verified by principal component analysis. The use of the affective acceptance test and instrumental analysis combined with statistical methods allowed us to obtain promising results about the characteristics of gluten-free granola.

Pagamunici, Lilian Maria; Souza, Aloisio Henrique Pereira de; Gohara, Aline Kirie; Silvestre, Alline Aparecida Freitas; Visentainer, Jesuí Vergílio; Souza, Nilson Evelázio de; Gomes, Sandra Terezinha Marques; Matsushita, Makoto.

242

Spontaneous regression of multiple pulmonary metastatic nodules of hepatocarcinoma: a case report  

International Nuclear Information System (INIS)

Although rare, spontaneous regression of primary or metastatic malignant tumors in the absence of therapy, or with inadequate therapy, has been well documented. Since the earliest days of this century various malignant tumors have been reported to disappear spontaneously or to have their growth arrested, but such cases of hepatocarcinoma have been very rare. In the literature we were able to find 5 previously reported cases of hepatocarcinoma that showed spontaneous regression at the primary site. Recently we saw a case of multiple pulmonary metastatic nodules of hepatocarcinoma that regressed completely and spontaneously, and this forms the basis of the present case report. The patient was a 55-year-old male admitted to St. Mary's Hospital, Catholic Medical College, on April 26, 1978 because of a hard palpable mass in the epigastrium. The admission PA chest roentgenogram revealed multiple small nodular densities scattered throughout both lung fields, especially in the lower zones and toward the periphery. A hepatoscintigram revealed a large cold area involving the left lobe and intermediate zone of the liver. Alpha-fetoprotein and hepatitis B serum antigen tests were positive, whereas many other standard liver function tests turned out to be negative. A needle biopsy of the tumor revealed well-differentiated hepatocellular carcinoma. The patient was given chemotherapy consisting of 5 FU 500 mg intravenously for 6 days, from April 28 to May 3, 1978. The patient was discharged after this single course of 5 FU treatment and was on a herbal medicine, the nature and quantity of which are obscure. No other specific treatment was given. The second admission took place on Dec. 3, 1980 because of irregularity in bowel habits and dyspepsia. A follow-up PA chest roentgenogram obtained on the second admission revealed complete disappearance of the previously noted multiple pulmonary nodular lesions (Fig. 3).
A follow-up liver scan revealed persistence of the cold area in the left lobe with a slight decrease in size. The patient was discharged again without any specific prescription after various clinical studies, including an upper GI series and a colon study, gave negative results. At the time of finishing this paper the patient is doing well without apparent medical problems.

1981-09-01

243

Isolated Area Load Forecasting using Linear Regression Analysis: Practical Approach  

Directory of Open Access Journals (Sweden)

This paper presents an analysis to forecast the loads of an isolated area where the load history is not available or may not represent the realistic demand for electricity. The analysis is done through linear regression and is based on identifying the factors on which electrical load growth depends. To determine these factors, areas are selected whose load growth histories are known and whose load growth deciding factors are similar to those of the isolated area. The proposed analysis is applied to an isolated area of Bangladesh, called Swandip, where no past history of electrical load demand is available and there is no possibility of connecting the area to the mainland grid system.

M. A. Mahmud

2011-09-01

244

The use of weighted multiple linear regression to estimate QTL-by-QTL epistatic effects  

Scientific Electronic Library Online (English)

Knowledge of the nature and magnitude of gene effects, as well as their contribution to the control of metric traits, is important in formulating efficient breeding programs for the improvement of plant genetics. Information concerning a genetic parameter such as the additive-by-additive epistatic effect can be useful in traditional breeding. This report describes the results obtained by applying weighted multiple linear regression to estimate the parameter connected with an additive-by-additive epistatic interaction. Three weight variants were used: (1) standard weights based on estimated variances, (2) different weights for minimal, maximal and other lines, and (3) different weights for extreme and other lines. The approach described here combines two methods of estimation, one based on phenotypic observations and the other using molecular marker data. The comparison was done using Monte Carlo simulations. The results show that the application of weighted regression to the marker data yielded estimates similar to those obtained by phenotypic methods.

Jan, Bocianowski.

245

Multiple regression models for the prediction of the maximum obtainable thermal efficiency of organic Rankine cycles  

DEFF Research Database (Denmark)

Much attention is focused on increasing the energy efficiency to decrease fuel costs and CO2 emissions throughout industrial sectors. The ORC (organic Rankine cycle) is a relatively simple but efficient process that can be used for this purpose by converting low and medium temperature waste heat to power. In this study we propose four linear regression models to predict the maximum obtainable thermal efficiency for simple and recuperated ORCs. A previously derived methodology is able to determine the maximum thermal efficiency among many combinations of fluids and processes, given the boundary conditions of the process. Hundreds of optimised cases with varied design parameters are used as observations in four multiple regression analyses. We analyse the model assumptions, prediction abilities and extrapolations, and compare the results with recent studies in the literature. The models are in agreement with the literature, and they present an opportunity for accurate prediction of the potential of an ORC to convert heat sources with temperatures from 80 to 360 °C, without detailed knowledge or need for simulation of the process. © 2013 Elsevier Ltd. All rights reserved.

Larsen, Ulrik; Pierobon, Leonardo

2014-01-01

246

A multiple regression equation for calculated k_eff bias errors by criticality code system  

International Nuclear Information System (INIS)

Some 500 cases of benchmark calculations on criticality problems for homogeneous experimental systems have been made with the KENO-IV Monte Carlo calculation code using the MGCL cross-section data library. The calculation results have been analyzed to classify the experimental systems so as to make the variance of the calculated k_eff bias as small as possible in each classified system. The trends of bias are identified and shown to be optimally expressed by a multiple variable regression equation in terms of several variables, which adequately correlate with the bias value of k_eff calculated for the experiments. The uncertainty accompanying bias correction for calculated k_eff is clearly determined, and the margin set aside for the experimental error is assessed. Finally, a procedure to estimate nuclear criticality safety is proposed.

1984-01-01

247

Spontaneous Regression of Multiple Pulmonary Metastases After Radiofrequency Ablation of a Single Metastasis  

International Nuclear Information System (INIS)

We report two cases of spontaneous regression of multiple pulmonary metastases occurring after radiofrequency ablation (RFA) of a single lung metastasis. To the best of our knowledge, these are the first such cases reported. These two patients presented with lung metastases progressive despite treatment with interleukin-2, interferon, or sorafenib but were safely ablated with percutaneous RFA under computed tomography guidance. Percutaneous RFA allowed control of the targeted tumors for >1 year. Distant lung metastases presented an objective response despite the fact that they received no targeted local treatment. Local ablative techniques, such as RFA, induce the release of tumor-degradation product, which is probably responsible for an immunologic reaction that is able to produce a response in distant tumors.

2011-04-01

248

Evaluating the Sustainable Development of Agriculture Based on Multiple Linear Regression  

Directory of Open Access Journals (Sweden)

Agriculture is the base of the national economy, the rural area is the basic community, and sustainable agricultural development is the base of sustainable development for society as a whole. Studying an evaluation index system for the level of sustainable agricultural development and constructing a reasonable evaluation model are significant for path selection and level promotion. An evaluation index system based on input and output was built using multiple regression. The interrelation between agricultural investment in fixed assets and related output indexes of sustainable agricultural development, their degree of closeness and their changing patterns were analyzed to find the mode of interrelation among the indexes, and a set of comprehensive evaluation methods for sustainable agricultural development was constructed. The evaluation method was used to evaluate the level of sustainable agricultural development in China’s 31 provinces; it can help local governments understand that level scientifically and provide a scientific basis for decision-making on sustainable agricultural development.

Li Qing-xue

2013-01-01

249

Equação de regressão linear múltipla para estimativa do erro experimental / Multiple linear regression equation as an estimation of experimental error  

Scientific Electronic Library Online (English)

The aims of this study were to estimate multiple linear regression equations that use the other traits evaluated in maize experiments as explanatory variables and, as dependent variables, the least significant difference as a percentage of the mean (LSD%) and the mean square of the error (MSe) for grain yield. Data came from 610 experiments of the National Competition of Maize Cultivars: 522 conducted from 1986 to 1996 and 88 conducted in 1997. Two regression equations were estimated from the 522 experiments and validated, using the remaining 88, by simple regression between the actual values and those predicted by the equations. For LSD% the equation did not reproduce the value given by the original formula, but for MSe the equation can be used for estimation. The Lilliefors test indicated that the MSe values followed the normal distribution. A classification table of MSe values was built from the values observed in the analysis of variance of the experiments and the values estimated by the regression equation.

Lúcio, Alessandro Dal'Col; Banzatto, David Ariovaldo; Storck, Lindolfo; Martin, Thomas Newton; Lorentz, Leandro Homrich.

250

Regression of uveal malignant melanomas following cobalt-60 plaque. Correlates between acoustic spectrum analysis and tumor regression  

International Nuclear Information System (INIS)

Parameters derived from computer analysis of digital radio-frequency (rf) ultrasound scan data of untreated uveal malignant melanomas were examined for correlations with tumor regression following cobalt-60 plaque. Parameters included tumor height, normalized power spectrum and acoustic tissue type (ATT). Acoustic tissue type was based upon discriminant analysis of tumor power spectra, with spectra of tumors of known pathology serving as a model. Results showed ATT to be correlated with tumor regression during the first 18 months following treatment. Tumors with ATT associated with spindle cell malignant melanoma showed over twice the percentage reduction in height as those with ATT associated with mixed/epithelioid melanomas. Pre-treatment height was only weakly correlated with regression. Additionally, significant spectral changes were observed following treatment. Ultrasonic spectrum analysis thus provides a noninvasive tool for classification, prediction and monitoring of tumor response to cobalt-60 plaque

1985-01-01

251

ANOVA for Regression  

Science.gov (United States)

This site, created by Michelle Lacey of Yale University, gives an explanation, a definition and an example of ANOVA for regression. Topics include analysis of variance calculations for simple and multiple regression, and F-statistics. This is a great overview of this topic.

Lacey, Michelle

2009-01-05

252

Regression analysis exploring teacher impact on student FCI post scores  

Science.gov (United States)

High School Modeling Workshops are designed to improve high school physics teachers' understanding of physics and how to teach using the Modeling method. The basic assumption is that the teacher plays a critical role in their students' physics education. This study investigated teacher impacts on students' Force Concept Inventory (FCI) scores, with the hope of identifying quantitative differences between teachers. This study examined student FCI scores from 18 teachers with at least a year of teaching high school physics. These data were then evaluated using a General Linear Model (GLM), which allowed for a regression equation to be fitted to the data. This regression equation was used to predict student post FCI scores, based on: teacher ID, student pre FCI score, gender, and representation. The results show 12 out of 18 teachers significantly impact their student post FCI scores. The GLM further revealed that of the 12 teachers only five have a positive impact on student post FCI scores. Given these differences among teachers it is our intention to extend our analysis to investigate pedagogical differences between them.

Mahadeo, Jonathan V.; Manthey, Seth; Brewe, Eric

2013-07-17

253

Accounting for data errors discovered from an audit in multiple linear regression.  

Science.gov (United States)

A data coordinating team performed onsite audits and discovered discrepancies between the data sent to the coordinating center and that recorded at sites. We present statistical methods for incorporating audit results into analyses. This can be thought of as a measurement error problem, where the distribution of errors is a mixture with a point mass at 0. If the error rate is nonzero, then even if the mean of the discrepancy between the reported and correct values of a predictor is 0, naive estimates of the association between two continuous variables will be biased. We consider scenarios where there are (1) errors in the predictor, (2) errors in the outcome, and (3) possibly correlated errors in the predictor and outcome. We show how to incorporate the error rate and magnitude, estimated from a random subset (the audited records), to compute unbiased estimates of association and proper confidence intervals. We then extend these results to multiple linear regression where multiple covariates may be incorrect in the database and the rate and magnitude of the errors may depend on study site. We study the finite sample properties of our estimators using simulations, discuss some practical considerations, and illustrate our methods with data from 2815 HIV-infected patients in Latin America, of whom 234 had their data audited using a sequential auditing plan. PMID:21281274

Shepherd, Bryan E; Yu, Chang

2011-09-01

254

Multiple regression equations to estimate the content of breast muscles, meat, and fat in Muscovy ducks.  

Science.gov (United States)

The aim of the present study was to derive multiple regression equations for in vivo estimation of the carcass lean and fat content in Muscovy ducks. The experimental materials consisted of 240 White Muscovy ducklings (120 male and 120 female). One hundred sixteen females aged 10 wk and 112 males aged 12 wk were slaughtered. Before slaughter the ducks were weighed, and the following body measurements were taken: humerus length, drumstick length, chest girth, breast-bone crest length, width between the humeral bones, chest depth, and breast muscle thickness. The coefficients of simple correlation between carcass tissue components and body measurements were calculated. It was found that live body weight was highly correlated with the weights of all tissue components (r = 0.701 to 0.857). In males a significant interrelation was found between breast muscle weight and all body measurements, whereas in females breast muscle weight was correlated with breast-bone crest length, chest girth, width between the humeral bones, chest depth, and breast muscle thickness only. In both males and females the carcass lean content was closely correlated with drumstick length, breast-bone crest length, chest girth, and width between the humeral bones. In drakes the carcass fat content was closely correlated with all body measurements, whereas in hens significant correlations were observed between the carcass fat content and chest girth, width between the humeral bones, and chest depth only. The coefficients of simple correlation between the percentages of carcass tissue components and body measurements were generally low and statistically nonsignificant. Twelve multiple regression equations formulated based on the body measurements of live ducks were verified with respect to the accuracy of estimation of the content of breast muscles, meat, and fat with skin in the carcass. 
These equations give small SE of the estimate (Sy = 23.3 to 83.8 g), high values of coefficients of multiple correlation between the dependent variable and the set of independent variables, and high values of determination coefficients. PMID:16830875

Kleczek, K; Wawro, K; Wilkiewicz-Wawro, E; Makowski, W

2006-07-01

255

Sensitivity analysis and optimization of system dynamics models: Regression analysis and statistical design of experiments  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for coal transportation, taken from Wolstenholme's book "System Enquiry: a System Dynamics Approach" (1990). The regression analysis uses the least squares algorithm. DOE uses classic desig...

1995-01-01

256

ROC curve regression analysis: the use of ordinal regression models for diagnostic test assessment.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Diagnostic tests commonly are characterized by their true positive (sensitivity) and true negative (specificity) classification rates, which rely on a single decision threshold to classify a test result as positive. A more complete description of test accuracy is given by the receiver operating characteristic (ROC) curve, a graph of the false positive and true positive rates obtained as the decision threshold is varied. A generalized regression methodology, which uses a class of ordinal regre...

Tosteson, A. N.; Weinstein, M. C.; Wittenberg, J.; Begg, C. B.

1994-01-01

257

Experimental and regression analysis for multi cylinder diesel engine operated with hybrid fuel blends  

Directory of Open Access Journals (Sweden)

The purpose of this research work is to build a multiple linear regression model for the characteristics of a multicylinder diesel engine using multicomponent blends (diesel-pungamia methyl ester-ethanol) as fuel. Nine blends were tested by varying diesel (100 to 10% by vol.) and biodiesel (80 to 10% by vol.) while keeping ethanol constant at 10%. The brake thermal efficiency, smoke, oxides of nitrogen, carbon dioxide, maximum cylinder pressure, angle of maximum pressure, and angles of 5% and 90% mass burning were predicted based on load, speed, and diesel and biodiesel percentage. To validate this regression model, another multicomponent fuel comprising diesel-palm methyl ester-ethanol was used in the same engine. Statistical analysis was carried out between predicted and experimental data for both fuels. The performance, emission and combustion characteristics of a multicylinder diesel engine using similar fuel blends can thus be predicted without the expense of experimentation.

Gopal Rajendiran

2014-01-01

258

Multivariate Regression Analysis of Gravitational Waves from Rotating Core Collapse  

CERN Multimedia

We present a new multivariate regression model for analysis and parameter estimation of gravitational waves observed from well but not perfectly modeled sources such as core-collapse supernovae. Our approach is based on a principal component decomposition of simulated waveform catalogs. Instead of reconstructing waveforms by direct linear combination of physically meaningless principal components, we solve via least squares for the relationship that encodes the connection between chosen physical parameters and the principal component basis. Although our approach is linear, the waveforms' parameter dependence may be non-linear. For the case of gravitational waves from rotating core collapse, we show, using statistical hypothesis testing, that our method is capable of identifying the most important physical parameters that govern waveform morphology in the presence of simulated detector noise. We also demonstrate our method's ability to predict waveforms from a principal component basis given a set of physical ...

Engels, William J; Ott, Christian D

2014-01-01

259

A Quantile Regression Analysis of Micro-lending's Poverty Impact  

Directory of Open Access Journals (Sweden)

This paper aims to evaluate the impact of a microlending program on ameliorating measured poverty within its client population, with the aim of improving that impact. We analyze over 18,000 women micro-finance clients of the Negros Women for Tomorrow Foundation (NWTF), a database using the Progress out of Poverty (PPI) Scorecard as a measure of poverty. Analysis using both OLS and quantile multivariate regression models shows how observable borrower attributes affect the ability of clients to reduce their measured poverty. Loan size, duration, and the economic activity supported all have strongly identifiable effects. Moreover, estimates suggest which among the poor are receiving the greatest effective help from the program. Results offer specific advice to the NWTF and other micro-lenders: impact is greatest with fewer, larger loans in particular economic sectors (sari-sari, service and trade) but requires patience, as each additional year increases the client’s average change in poverty score.

Stephen W. Polk

2012-07-01

260

A Logistic Regression Analysis of the Ischemic Heart Disease Risk  

Directory of Open Access Journals (Sweden)

The main objective of the present study is to investigate factors that contribute significantly to enhancing the risk of ischemic heart disease. The dependent variable of the study is the diagnosis: whether or not the patient has the disease. Logistic regression analysis is applied to explore the factors affecting the disease. The results of the study show that the factors contributing significantly to the risk of ischemic heart disease are the use of banaspati ghee, living in an urban area, a high cholesterol level, and being in the age group of 51 to 60 years. Other significant factors are Apo Protein A, Apo Protein B, cholesterol level, high density Lipo protein, low density Lipo protein, phospholipids, total lipid and uric acid.

Irfana P. Bhatti

2006-01-01

 
 
 
 
261

A Visual Analytics Approach for Correlation, Classification, and Regression Analysis  

Energy Technology Data Exchange (ETDEWEB)

New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. This chapter provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

Steed, Chad A [ORNL]; Swan II, J. Edward [Mississippi State University (MSU)]; Fitzpatrick, Patrick J. [Mississippi State University (MSU)]; Jankun-Kelly, T.J. [Mississippi State University (MSU)]

2013-01-01

262

A Visual Analytics Approach for Correlation, Classification, and Regression Analysis  

Energy Technology Data Exchange (ETDEWEB)

New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

Steed, Chad A [ORNL]; Swan II, J. Edward [Mississippi State University (MSU)]; Fitzpatrick, Patrick J. [Mississippi State University (MSU)]; Jankun-Kelly, T.J. [Mississippi State University (MSU)]

2012-02-01

263

Logistic regression analysis on the risk factors of radiation pneumonitis  

International Nuclear Information System (INIS)

Objective: To identify the risk factors of radiation pneumonitis (RP). Methods: A retrospective study was conducted on 101 patients with radiation pneumonitis using SPSS 8.0 software. Factors evaluated included: gender, age, pathology, clinical stage, irradiation dose, irradiation field size, history of smoking, cardiovascular disease, bronchitis, surgery, chemotherapy, lung infection, atelectasis, obstructive infection and pleural effusion. Univariate analysis was performed using the Chi-Square test and multivariate analysis was performed using a logistic regression model. Results: Univariate analysis revealed that 10 factors, including pulmonary infection, atelectasis, obstructive infection, cardiovascular disease, bronchitis, chemotherapy, irradiation dose, number of days of radiation and irradiation field size, were significantly related to radiation pneumonitis. Multivariate analysis showed that 9 factors were independent factors: pulmonary infection, obstructive infection, atelectasis, pleural effusion, bronchitis, cardiovascular disease, chemotherapy, irradiation dose, and irradiation field size. Conclusion: Comprehensive consideration of accompanying disease, chemotherapy, dose, field size, etc. during the planning of radiotherapy can minimize the possibility of developing radiation pneumonitis.

2003-02-01

264

Development of Multiple Regression and Neural Network Models for Assessment of Blasting Dust at a Large Surface Coal Mine  

Directory of Open Access Journals (Sweden)

[...] developed for prediction of particulate matter. The performance of the multiple regression models was assessed. For the development of neural network models, a feed-forward network with a back-propagation learning algorithm was used. The performance of the neural network was determined in terms of correlation coefficient (R) and Mean Square Error (MSE). The optimum number of hidden neurons was determined so as to obtain the lowest value of MSE and the highest value of R. The results indicated that the network can predict particulate concentrations better than multiple regression models.

T.A. Renaldy

2011-01-01

265

The role of multiple regression and exploratory data analysis in the development of leukemia incidence risk models for comparison of radionuclide air stack emissions from nuclear and coal power industries  

International Nuclear Information System (INIS)

Risk associated with power generation must be identified to make intelligent choices between alternate power technologies. Radionuclide air stack emissions for a single coal plant and a single nuclear plant are used to compute the single plant leukemia incidence risk and total industry leukemia incidence risk. Leukemia incidence is the response variable as a function of radionuclide bone dose for the six proposed dose response curves considered. During normal operation a coal plant has higher radionuclide emissions than a nuclear plant and the coal industry has a higher leukemia incidence risk than the nuclear industry, unless a nuclear accident occurs. Variation of nuclear accident size allows quantification of the impact of accidents on the total industry leukemia incidence risk comparison. The leukemia incidence risk is quantified as the number of accidents of a given size required for the nuclear industry leukemia incidence risk to equal the coal industry leukemia incidence risk. The general linear model is used to develop equations that relate the accident frequency required for equal industry risks to the magnitude of the nuclear emission. Exploratory data analysis revealed that the relationship between the natural log of accident number and the natural log of accident size is linear. (Author)

1995-01-01
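
The log-log linearity reported in the abstract above can be illustrated with a minimal sketch. The data are synthetic, and the function name `fit_log_log` and the chosen exponent are assumptions made for illustration; the record does not give the study's data or software.

```python
import numpy as np

def fit_log_log(size, count):
    """Fit ln(count) = intercept + slope * ln(size) by ordinary least squares."""
    x = np.log(np.asarray(size, dtype=float))
    y = np.log(np.asarray(count, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 fit returns [slope, intercept]
    return slope, intercept

# Synthetic (made-up) data following count = 1000 * size**-1.5 exactly
sizes = np.array([1.0, 2.0, 5.0, 10.0, 50.0])
counts = 1000.0 * sizes ** -1.5
slope, intercept = fit_log_log(sizes, counts)
```

A straight line in log-log space corresponds to a power law in the original units, so the fitted slope plays the role of the exponent.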

266

Semiparametric regression analysis under imputation for missing response data  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We develop inference tools in a semiparametric regression model with missing response data. A semiparametric regression imputation estimator and an empirical likelihood based one for the mean of the response variable are defined. Both the estimators are proved to be asymptotically normal, with asymptotic variances estimated with Jackknife method. The empirical likelihood method is developed. It is shown that when missing responses are imputed using the semiparametric regression method the emp...

Wang, Qihua; Härdle, Wolfgang; Linton, Oliver

2002-01-01

267

Prediction of radiation levels in residences: A methodological comparison of CART [Classification and Regression Tree Analysis] and conventional regression  

International Nuclear Information System (INIS)

In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and ?200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs

1990-07-09

268

Dental malocclusion and body posture in young subjects: A multiple regression study  

Directory of Open Access Journals (Sweden)

OBJECTIVES: Controversial results have been reported on potential correlations between the stomatognathic system and body posture. We investigated whether malocclusal traits correlate with body posture alterations in young subjects to determine possible clinical applications. METHODS: A total of 122 subjects, including 86 males and 36 females (age range of 10.8-16.3 years), were enrolled. All subjects tested negative for temporomandibular disorders or other conditions affecting the stomatognathic system, except malocclusion. A dental occlusion assessment included phase of dentition, molar class, overjet, overbite, anterior and posterior crossbite, scissorbite, mandibular crowding and dental midline deviation. In addition, body posture was recorded through static posturography using a vertical force platform. Recordings were performed under two conditions, namely, (i) mandibular rest position (RP) and (ii) dental intercuspidal position (ICP). Posturographic parameters included the projected sway area and velocity and the antero-posterior and right-left load differences. Multiple regression models were run for both recording conditions to evaluate associations between each malocclusal trait and posturographic parameters. RESULTS: All of the posturographic parameters had large variability and were very similar between the two recording conditions. Moreover, a limited number of weakly significant correlations were observed, mainly for overbite and dentition phase, when using multivariate models. CONCLUSION: Our current findings, particularly with regard to the use of posturography as a diagnostic aid for subjects affected by dental malocclusion, do not support the existence of clinically relevant correlations between malocclusal traits and body posture.

Giuseppe Perinetti

2010-01-01

269

Supply and Demand of Jeneberang River Aggregate Using Multiple Regression Model  

Directory of Open Access Journals (Sweden)

Aggregate plays an important role in developing infrastructure because it is the major raw material used in construction such as roads, hospitals, schools, factories, homes and other buildings. Sand and gravel are essential sources of aggregate and are often exploited from the active channels of river systems. Jeneberang River is one of the main rivers in South Sulawesi Province; it is located in Gowa Regency and is mined in order to fulfill the aggregate demand of Gowa Regency and Makassar City. Supply and demand are economic occurrences that are affected by several factors, so this research aims to (1) determine the factors influencing aggregate supply and demand and (2) develop a supply and demand model. Data were obtained from the Central Bureau of Statistics of Gowa Regency and Makassar City, and the Department of Mines and Energy, Gowa Regency, for eleven years (2001 – 2011). In this research, aggregate supply and demand were modeled using the multiple regression method. First, the relationship between supply and its influencing factors was established, followed by demand and its factors. Second, the supply and demand model was established using SPSS. The result of this research showed that the model can be used to estimate the supply and demand of aggregate accurately using the established relationship among the influencing factors. Supply of aggregate was affected by several factors including price, number of trucks, number of mining companies and mining permit area, while price, GDP, income per capita, length of road, number of buildings and economic growth had a high influence on the demand rate.

Aryanti Virtanti Anas

2013-07-01

270

[Clinical research XX. From clinical judgment to multiple logistic regression model].  

Science.gov (United States)

The complexity of the causality phenomenon in clinical practice implies that the result of a maneuver is not solely caused by the maneuver, but by the interaction among the maneuver and other baseline factors or variables occurring during the maneuver. This requires methodological designs that allow the evaluation of these variables. When the outcome is a binary variable, we use the multiple logistic regression model (MLRM). This multivariate model is useful when we want to predict or explain, adjusting for the effect of several risk factors, the effect of a maneuver or exposure on the outcome. In order to perform an MLRM, the outcome or dependent variable must be a binary variable and both categories must be mutually exclusive (e.g., alive/dead, healthy/ill); on the other hand, independent variables or risk factors may be either qualitative or quantitative. The effect measure obtained from this model is the odds ratio (OR) with 95 % confidence intervals (CI), from which we can estimate the proportion of the outcome's variability explained through the risk factors. For these reasons, the MLRM is used in clinical research, since one of the main objectives in clinical practice comprises the ability to predict or explain an event where different risk or prognostic factors are taken into account. PMID:24758859

Berea-Baltierra, Ricardo; Rivas-Ruiz, Rodolfo; Pérez-Rodríguez, Marcela; Palacios-Cruz, Lino; Moreno, Jorge; Talavera, Juan O

2014-01-01
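
How an odds ratio and its 95 % confidence interval come out of a logistic regression fit, as described in the abstract above, can be sketched as follows. This is a minimal illustration that assumes a Newton-Raphson fit written with NumPy; the function name `fit_logistic` and the 2x2 counts are hypothetical and are not taken from the article.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson; return coefficients and standard errors."""
    A = np.column_stack([np.ones(len(y)), X])   # prepend an intercept column
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))     # predicted probabilities
        w = p * (1.0 - p)                       # observation weights
        hessian = A.T @ (A * w[:, None])        # observed information matrix
        gradient = A.T @ (y - p)                # score vector
        beta = beta + np.linalg.solve(hessian, gradient)
    se = np.sqrt(np.diag(np.linalg.inv(hessian)))
    return beta, se

# Hypothetical binary exposure and binary outcome:
# exposed: 30 events, 10 non-events; unexposed: 20 events, 40 non-events
group_sizes = [30, 10, 20, 40]
x = np.repeat([1, 1, 0, 0], group_sizes)
y = np.repeat([1, 0, 1, 0], group_sizes)

beta, se = fit_logistic(x, y)
odds_ratio = np.exp(beta[1])
ci_low = np.exp(beta[1] - 1.96 * se[1])
ci_high = np.exp(beta[1] + 1.96 * se[1])
```

With a single binary predictor the fitted odds ratio coincides with the cross-product ratio of the 2x2 table, which gives a quick sanity check on the fit.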

271

Locating multiple interacting quantitative trait loci with the zero-inflated generalized Poisson regression.  

Science.gov (United States)

We consider the problem of locating multiple interacting quantitative trait loci (QTL) influencing traits measured in counts. In many applications the distribution of the count variable has a spike at zero. Zero-inflated generalized Poisson regression (ZIGPR) allows for an additional probability mass at zero and hence an improvement in the detection of significant loci. Classical model selection criteria often overestimate the QTL number. Therefore, modified versions of the Bayesian Information Criterion (mBIC and EBIC) were successfully used for QTL mapping. We apply these criteria based on ZIGPR as well as simpler models. An extensive simulation study shows their good power detecting QTL while controlling the false discovery rate. We illustrate how the inability of the Poisson distribution to account for over-dispersion leads to an overestimation of the QTL number and hence strongly discourages its application for identifying factors influencing count data. The proposed method is used to analyze the mice gallstone data of Lyons et al. (2003). Our results suggest the existence of a novel QTL on chromosome 4 interacting with another QTL previously identified on chromosome 5. We provide the corresponding code in R. PMID:20597852

Erhardt, Vinzenz; Bogdan, Malgorzata; Czado, Claudia

2010-01-01

272

Regression Analysis between Properties of Subgrade Lateritic Soil  

Directory of Open Access Journals (Sweden)

The results of a study on the use of regression analysis to relate index properties to the California Bearing Ratio (CBR) of some lateritic soils within Osogbo town in South Western Nigeria are presented. To support the conclusions, lateritic soil samples were collected from eight (8) different borrow pits within the town, and various laboratory tests including Atterberg Limits, Gradation analysis, California Bearing Ratio, Compaction and Specific Gravity were performed on the soil samples. Various linear relationships between index properties and CBR of the samples were investigated, and predictive equations estimating CBR from the experimental index values were developed. The findings indicate that good correlation exists between the two groups (i.e., index properties and CBR values). However, the CBR values computed from the models are to be used only for preliminary purposes, in view of simplicity and economy, and are not acceptable alternatives to laboratory testing because of the anisotropic nature of lateritic soil and its heterogeneity.

Afeez Adefemi BELLO

2012-12-01

273

Risk factors for temporomandibular disorder: Binary logistic regression analysis  

Science.gov (United States)

Objectives: To analyze the influence of socioeconomic and demographic factors (gender, economic class, age and marital status) on the occurrence of temporomandibular disorder. Study Design: One hundred individuals from urban areas in the city of Recife (Brazil) registered at Family Health Units were examined using Axis I of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD), which addresses myofascial pain and joint problems (disc displacement, arthralgia, osteoarthritis and osteoarthrosis). The Brazilian Economic Classification Criteria (CCEB) was used for the collection of socioeconomic and demographic data. Participants were then categorized as Class A (high social class), Classes B/C (middle class) and Classes D/E (very poor social class). The results were analyzed using Pearson’s chi-square test for proportions, Fisher’s exact test, nonparametric Mann-Whitney test and binary logistic regression analysis. Results: None of the participants belonged to Class A, 72% belonged to Classes B/C and 28% belonged to Classes D/E. The multivariate analysis revealed that participants from Classes D/E had a 4.35-fold greater chance of exhibiting myofascial pain and an 11.3-fold greater chance of exhibiting joint problems. Conclusions: Poverty is an important condition for exhibiting myofascial pain and joint problems. Key words: Temporomandibular joint disorders, risk factors, prevalence.

Magalhaes, Bruno G.; de-Sousa, Stephanie T.; de Mello, Victor V C.; da-Silva-Barbosa, Andre C.; de-Assis-Morais, Mariana P L.; Barbosa-Vasconcelos, Marcia M V.

2014-01-01

274

Risk factors for temporomandibular disorder: Binary logistic regression analysis.  

Science.gov (United States)

Objective: To analyze the influence of socioeconomic and demographic factors (gender, economic class, age and marital status) on the occurrence of temporomandibular disorder. Study Design: One hundred individuals from urban areas in the city of Recife (Brazil) registered at Family Health Units were examined using Axis I of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD), which addresses myofascial pain and joint problems (disc displacement, arthralgia, osteoarthritis and osteoarthrosis). The Brazilian Economic Classification Criteria (CCEB) was used for the collection of socioeconomic and demographic data. Participants were then categorized as Class A (high social class), Classes B/C (middle class) and Classes D/E (very poor social class). The results were analyzed using Pearson's chi-square test for proportions, Fisher's exact test, nonparametric Mann-Whitney test and binary logistic regression analysis. Results: None of the participants belonged to Class A, 72% belonged to Classes B/C and 28% belonged to Classes D/E. The multivariate analysis revealed that participants from Classes D/E had a 4.35-fold greater chance of exhibiting myofascial pain and an 11.3-fold greater chance of exhibiting joint problems. Conclusion: Poverty is an important condition for exhibiting myofascial pain and joint problems. PMID:24316706

Magalhães, B-G; de-Sousa, S-T; de Mello, V-V-C; da-Silva-Barbosa, A-C; de-Assis-Morais, M-P-L; Barbosa-Vasconcelos, M-M-V; Caldas-Júnior, A-F

2014-01-01

275

Integrated analysis of incidence, progression, regression and disappearance probabilities  

Directory of Open Access Journals (Sweden)

Background: Age-related maculopathy (ARM) is a leading cause of vision loss in people aged 65 or older. ARM is distinctive in that it is a disease which can transition through incidence, progression, regression and disappearance. The purpose of this study is to develop methodologies for studying the relationship of risk factors with different transition probabilities. Methods: Our framework for studying this relationship includes two different analytical approaches. In the first approach, one can define, model and estimate the relationship between each transition probability and risk factors separately. This approach is similar to constraining a population to a certain disease status at the baseline, and then analyzing the probability of the constrained population to develop a different status. While this approach is intuitive, one risks losing available information while at the same time running into the problem of insufficient sample size. The second approach specifies a transition model for analyzing such a disease. This model provides the conditional probability of a current disease status based upon a previous status, and can therefore jointly analyze all transition probabilities. Throughout the paper, an analysis to determine the birth cohort effect on ARM is used as an illustration. Results and conclusion: This study has found parallel separate and joint analyses to be more enlightening than any analysis in isolation. By implementing both approaches, one can obtain more reliable and more efficient results.

Huang Guan-Hua

2008-06-01

276

Relationship of push-ups and sit-ups tests to selected anthropometric variables and performance results: a multiple regression study.  

Science.gov (United States)

The purpose of this study was to explore whether selected anthropometric measures such as specific skinfold sites, along with weight, height, body mass index (BMI), waist and hip circumferences, and waist/hip ratio (WHR) were associated with sit-ups (SU) and push-ups (PU) performance, and to build a regression model for SU and PU tests. One hundred apparently healthy adults (40 men and 60 women) served as the subjects for test validation. The subjects performed 60-second SU and PU tests. The variables analyzed via multiple regression included weight, height, BMI, hip and waist circumferences, WHR, skinfolds at the abdomen (SFAB), thigh (SFTH), and subscapularis (SFSS), and sex. An additional cohort of 40 subjects (17 men and 23 women) was used to cross-validate the regression models. Validity was confirmed by correlation and paired t-tests. The regression analysis yielded a four-variable (PU, height, SFAB, and SFTH) multiple regression equation for estimating SU (R2 = 0.64, SEE = 7.5 repetitions). For PU, only SU was loaded into the regression equation (R2 = 0.43, SEE = 9.4 repetitions). Thus, the variables in the regression models accounted for 64% and 43% of the variation in SU and PU, respectively. The cross-validation sample elicited a high correlation for SU (r = 0.87) and PU (r = 0.79) scores. Moreover, paired-samples t-tests revealed that there were no significant differences between actual and predicted SU and PU scores. Therefore, this study shows that there are a number of selected, health-related anthropometric variables that account significantly for, and are predictive of, SU and PU tests. PMID:18824933

Esco, Michael R; Olson, Michele S; Williford, Henry

2008-11-01
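
The R2 and SEE figures quoted in the abstract above are standard outputs of a multiple regression fit. The following minimal sketch shows how they can be computed; the data are synthetic, and the function name `ols_fit` and the coefficient values are assumptions for illustration only, unrelated to the study's measurements.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares with an intercept; returns coefficients, R^2 and SEE."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = resid @ resid                      # residual sum of squares
    ss_tot = ((y - y.mean()) ** 2).sum()        # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                  # share of variation explained
    dof = len(y) - A.shape[1]                   # n minus number of fitted parameters
    see = np.sqrt(ss_res / dof)                 # standard error of the estimate
    return coef, r2, see

# Synthetic data: two predictors, 100 observations, known coefficients plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.5, size=100)
coef, r2, see = ols_fit(X, y)
```

An R2 of 0.64, as reported for the sit-ups model, would mean that 64% of the variation in the response is accounted for by the predictors, while the SEE expresses the typical prediction error in the units of the response.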

277

The Analysis of Bootstrap Method in Linear Regression Effect  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper combines the least squares estimate, least absolute deviation estimate and least median estimate with the bootstrap method. When the overall error distribution is unknown or is not normal, we estimate the regression coefficient and the confidence interval of the coefficient and, through data simulation, obtain a bootstrap method which can improve the stability of the regression coefficient and reduce the length of the confidence interval.

Jiehan Zhu; Ping Jing

2010-01-01
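
A bootstrap confidence interval for a regression coefficient, of the kind discussed in the abstract above, can be sketched as follows. This minimal example assumes a pairs bootstrap with a percentile interval and covers only the least squares estimate; the function name `bootstrap_slope_ci`, the synthetic data and the resampling scheme are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def bootstrap_slope_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the least squares slope (pairs resampling)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)               # resample (x, y) pairs with replacement
        slopes[b] = np.polyfit(x[idx], y[idx], 1)[0]   # slope of the refitted line
    lower, upper = np.percentile(slopes, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper

# Synthetic data with non-normal (Laplace) errors
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=40)
y = 1.0 + 2.0 * x + rng.laplace(0.0, 1.0, size=40)

slope_hat = np.polyfit(x, y, 1)[0]
lower, upper = bootstrap_slope_ci(x, y)
```

Because the interval is read off the empirical distribution of refitted slopes, no normality assumption on the errors is needed.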

278

The Analysis of Bootstrap Method in Linear Regression Effect  

Directory of Open Access Journals (Sweden)

This paper combines the least squares estimate, least absolute deviation estimate and least median estimate with the bootstrap method. When the overall error distribution is unknown or is not normal, we estimate the regression coefficient and the confidence interval of the coefficient and, through data simulation, obtain a bootstrap method which can improve the stability of the regression coefficient and reduce the length of the confidence interval.

Jiehan Zhu

2010-10-01

279

Teaching Quantitative Literacy through a Regression Analysis of Exam Performance  

Science.gov (United States)

Quantitative literacy is increasingly essential for both informed citizenship and a variety of careers. Though regression is one of the most common methods in quantitative sociology, it is rarely taught until late in students' college careers. In this article, the author describes a classroom-based activity introducing students to regression…

Lindner, Andrew M.

2012-01-01

280

Use of a neural network and a multiple regression model to predict histologic grade of astrocytoma from MRI appearances  

International Nuclear Information System (INIS)

Several MRI features of supratentorial astrocytomas are associated with high histologic grade by statistically significant p values. We sought to apply this information prospectively to a group of astrocytomas in the prediction of tumor grade. We used 10 MRI features of fibrillary astrocytomas from 52 patient studies to develop neural network and multiple linear regression models for practical use in predicting tumor grade. The models were tested prospectively on MR images from 29 patient studies. The performance of the models was compared against that of a radiologist. Neural network accuracy was 61 % in distinguishing between low and high grade tumors. Multiple linear regression achieved an accuracy of 59 %. Assessment of the images by a radiologist yielded 57 % accuracy. We conclude that while certain MRI parameters may be statistically related to astrocytoma histologic grade, neural network and linear regression models cannot reliably use them to predict tumor grade. (orig.)

1995-02-01

281

The Overall Odds Ratio as an Intuitive Effect Size Index for Multiple Logistic Regression: Examination of Further Refinements  

Science.gov (United States)

This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

Le, Huy; Marcus, Justin

2012-01-01

282

Regression tree analysis for predicting slaughter weight in broilers  

Directory of Open Access Journals (Sweden)

In this study, Regression Tree Analysis (RTA) was used to predict and to determine the most important variables in predicting the slaughter weight of Ross 308 broiler chickens. Data for this study came from 224 chickens raised during three different seasons, namely spring (n=66), summer (n=66) and winter (n=92). Second week body weight, shank length, shank width, breast bone length, breast width, breast circumference and body length were used to predict the slaughter weight. Results of RTA showed that among the seven independent variables only four were selected, namely body weight, breast bone length, shank width, and breast circumference. These selected independent variables were more efficient than the others in predicting the slaughter weight. RTA indicated that the birds which had values of second week body weight >295.95 g, breast bone length >55.82 mm and breast circumference >14.18 cm, or that of body weight ≤295.95 g, breast bone length >60.26 mm and shank width >8.32 mm, could be expected to have higher slaughter weights.

Erkut Akkartal

2010-01-01
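
Thresholds such as those quoted in the abstract above are the product of a split search at each node of a regression tree. The following minimal sketch shows one such search for a single predictor, choosing the cut point that minimizes the within-node sum of squared errors in the CART style. The function name `best_split` and the toy values are hypothetical; the record does not state which software or splitting criterion the study used.

```python
import numpy as np

def best_split(x, y):
    """Return (threshold, sse) for the binary split of x that minimizes squared error in y."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_threshold, best_sse = None, np.inf
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue                                   # no cut point between equal values
        left, right = ys[:i], ys[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_threshold = (xs[i - 1] + xs[i]) / 2   # midpoint between neighbours
            best_sse = sse
    return best_threshold, best_sse

# Toy data: the response jumps once the predictor exceeds a cut point
x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([5.0, 5.0, 5.0, 20.0, 20.0, 20.0])
threshold, sse = best_split(x, y)
```

A full tree repeats this search over every candidate predictor and then recurses into the two resulting subsets.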

283

Nonlinear Robust Regression Using Kernel Principal Component Analysis and R-Estimators  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In recent years, many algorithms based on kernel principal component analysis (KPCA) have been proposed including kernel principal component regression (KPCR). KPCR can be viewed as a non-linearization of principal component regression (PCR) which uses the ordinary least squares (OLS) for estimating its regression coefficients. We use PCR to dispose of the negative effects of multicollinearity in regression models. However, it is well known that the main disadvantage of OLS is its sensitiveness ...

Antoni Wibowo; Mohammad Ishak Desa

2011-01-01

284

The univariation and multiple linear regression analyses for seventeen SNPs in thirteen cardiovascular disease-predisposing genes and blood pressure in Chinese Han males.  

Science.gov (United States)

Blood pressure (BP) is a complex trait regulated by the interaction among multiple physiologic regulatory systems, likely involving numerous genes that lead to inconsistent findings in genetic studies. One possibility of failure to replicate some single-locus results is that the underlying genetics of hypertension is based on multiple genes with minor effects. To learn the association between 17 single nucleotide polymorphisms (SNPs) in 13 cardiovascular disease-predisposing genes and blood pressure of Han males, the 17 SNPs genotypes of 375 Han males were detected and analyzed with BaiO gene chip. The relationship between the SNPs and blood pressure was analyzed with variance analysis and multiple linear regression analysis. Variance analysis and/or multiple linear regression showed that: systolic blood pressure (SBP) was increasing with the elevation of year; AGT(235)M, ApoE(112,158)E4, and SerpinA3(rs4934)A were relative to the increase of SBP; AGT(235)M, ET-2(985)G, ApoC3(3206)T, and ApoE(112,158)E4 may have had some relation with diastolic blood pressure (DBP) elevation; and ApoB(Xba) + was associated with the increase of pulse pressure (PP). These findings support the multigenic nature of the etiology of essential hypertension and propose a potential gene-gene interactive model for future studies. PMID:18855268

Tang, Min; Dai, Yong; Huang, Yuanshuai; Cai, Xiaozhong; Tian, Xiaoyuan; Tu, Zhiguang

2008-10-01

285

Improved performance of a two-element TLD badge for determining gamma and beta doses using multiple linear regression  

International Nuclear Information System (INIS)

The gamma/beta TLD badge used by OPPD consists of two TLD-700 chips (Harshaw G7 card), one of which (chip #2) is shielded by a 0.102 cm-thick aluminum filter, and the other (chip #1) is unshielded, as shown in Fig. 1. Standard procedure had been to determine the beta dose to the badge by subtracting the response of chip #2 from that of chip #1 and then dividing by a calibrated beta-sensitivity factor; the gamma dose was taken to be the response of chip #2 divided by the chip's gamma-sensitivity factor followed by the subtraction of the background dose. A problem with this procedure is penetration of energetic beta particles through the aluminum filter on chip #2, which causes an over-response. Due to the technique used to obtain the beta dose, this also results in an under-estimate of the beta dose. This problem has been corrected through application of multiple linear regression analysis on a large data base of pure gamma (137Cs), pure beta (90Sr), and mixed exposures. The outcome of the analysis is an algorithm that automatically corrects for penetration effects. Performance tests using the ANSI N13.11 standard are presented to show the improvement.

1988-11-01
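
The contrast between the standard two-chip procedure and a regression-based correction described in the abstract above can be sketched as follows. The subtraction formulas follow the abstract; everything else is an assumption made for illustration, including the function names, the sensitivity values, the linear response model with beta penetration through the filter, and the omission of background. The record does not give the actual algorithm or calibration data.

```python
import numpy as np

def standard_doses(r1, r2, beta_sens, gamma_sens, background=0.0):
    """Standard procedure: beta from the chip difference, gamma from the shielded chip."""
    beta = (r1 - r2) / beta_sens
    gamma = r2 / gamma_sens - background
    return gamma, beta

def fit_dose_regression(responses, doses):
    """Least squares map from chip readings (n, 2) to known (gamma, beta) doses (n, 2)."""
    coef, *_ = np.linalg.lstsq(responses, doses, rcond=None)
    return coef                                  # doses are approximated by responses @ coef

# Hypothetical response per unit dose: rows = (gamma, beta), columns = (chip 1, chip 2).
# The 0.1 entry models beta particles penetrating the aluminum filter on chip 2.
sensitivity = np.array([[1.0, 1.0],
                        [0.8, 0.1]])

# Hypothetical calibration exposures: pure gamma, pure beta and mixed fields
calib_doses = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5], [0.5, 2.0]])
calib_readings = calib_doses @ sensitivity
coef = fit_dose_regression(calib_readings, calib_doses)

# A new mixed exposure evaluated with both methods
true_dose = np.array([1.5, 3.0])                 # (gamma, beta)
reading = true_dose @ sensitivity
regression_estimate = reading @ coef
std_gamma, std_beta = standard_doses(reading[0], reading[1], beta_sens=0.8, gamma_sens=1.0)
```

Under this assumed response model the standard procedure over-estimates the gamma dose and under-estimates the beta dose, the pattern the abstract describes, while the regression coefficients absorb the penetration term.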

286

Regional flood frequency analysis in eastern Australia: Bayesian GLS regression-based methods within fixed region and ROI framework - Quantile Regression vs. Parameter Regression Technique  

Science.gov (United States)

Summary: In this article, an approach using Bayesian Generalised Least Squares (BGLS) regression in a region-of-influence (ROI) framework is proposed for regional flood frequency analysis (RFFA) for ungauged catchments. Using the data from 399 catchments in eastern Australia, the BGLS-ROI is constructed to regionalise the flood quantiles (Quantile Regression Technique (QRT)) and the first three moments of the log-Pearson type 3 (LP3) distribution (Parameter Regression Technique (PRT)). This scheme firstly develops a fixed region model to select the best set of predictor variables for use in the subsequent regression analyses using an approach that minimises the model error variance while also satisfying a number of statistical selection criteria. The identified optimal regression equation is then used in the ROI experiment where the ROI is chosen for a site in question as the region that minimises the predictive uncertainty. To evaluate the overall performances of the quantiles estimated by the QRT and PRT, a one-at-a-time cross-validation procedure is applied. Results of the proposed method indicate that both the QRT and PRT in a BGLS-ROI framework lead to more accurate and reliable estimates of flood quantiles and moments of the LP3 distribution when compared to a fixed region approach. Also the BGLS-ROI can deal reasonably well with the heterogeneity in Australian catchments as evidenced by the regression diagnostics. Based on the evaluation statistics it was found that both BGLS-QRT and PRT-ROI perform similarly well, which suggests that the PRT is a viable alternative to QRT in RFFA. The RFFA methods developed in this paper are based on the database available in eastern Australia. It is expected that availability of a more comprehensive database (in terms of both quality and quantity) will further improve the predictive performance of both the fixed and ROI based RFFA methods presented in this study, which however needs to be investigated in the future when such a database is available.

Haddad, Khaled; Rahman, Ataur

2012-04-01

287

Hierarchical Regression for Multiple Comparisons in a Case-Control Study of Occupational Risks for Lung Cancer  

Science.gov (United States)

Background: Occupational studies often involve multiple comparisons and therefore suffer from false positive findings. Semi-Bayes adjustment methods have sometimes been used to address this issue. Hierarchical regression is a more general approach, including Semi-Bayes adjustment as a special case, that aims at improving the validity of standard maximum-likelihood estimates in the presence of multiple comparisons by incorporating similarities between the exposures of interest in a second-stage model. Methodology/Principal Findings: We re-analysed data from an occupational case-control study of lung cancer, applying hierarchical regression. In the second-stage model, we included the exposure to three known lung carcinogens (asbestos, chromium and silica) for each occupation, under the assumption that occupations entailing similar carcinogenic exposures are associated with similar risks of lung cancer. Hierarchical regression estimates had smaller confidence intervals than maximum-likelihood estimates. The shrinkage toward the null was stronger for extreme, less stable estimates (e.g., “specialised farmers”: maximum-likelihood OR: 3.44, 95%CI 0.90–13.17; hierarchical regression OR: 1.53, 95%CI 0.63–3.68). Unlike Semi-Bayes adjustment toward the global mean, hierarchical regression did not shrink all the ORs towards the null (e.g., “Metal smelting, converting and refining furnacemen”: maximum-likelihood OR: 1.07, Semi-Bayes OR: 1.06, hierarchical regression OR: 1.26). Conclusions/Significance: Hierarchical regression could be a valuable tool in occupational studies in which disease risk is estimated for a large number of occupations when we have information available on the key carcinogenic exposures involved in each occupation. With the constant progress in exposure assessment methods in occupational settings and the availability of Job Exposure Matrices, it should become easier to apply this approach.

Corbin, Marine; Richiardi, Lorenzo; Vermeulen, Roel; Kromhout, Hans; Merletti, Franco; Peters, Susan; Simonato, Lorenzo; Steenland, Kyle; Pearce, Neil; Maule, Milena

2012-01-01
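
The shrinkage of unstable odds ratios described in the abstract above can be illustrated with the simpler Semi-Bayes special case, in which each log odds ratio is pulled toward a fixed prior mean by inverse-variance weighting. This is a minimal sketch only: the function name `semi_bayes_shrink`, the prior mean of 0 and the prior variance of 0.25 are assumptions, and the study's hierarchical model instead derives the prior mean from a second-stage regression on carcinogen exposures. The input figures are the maximum-likelihood estimate for “specialised farmers” quoted in the abstract.

```python
import math

def semi_bayes_shrink(or_ml, ci_low, ci_high, prior_mean=0.0, prior_var=0.25, z=1.96):
    """Shrink a maximum-likelihood odds ratio toward exp(prior_mean) on the log scale."""
    b = math.log(or_ml)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)   # SE recovered from the 95% CI
    v = se ** 2
    precision = 1 / v + 1 / prior_var                       # posterior precision
    w = (1 / v) / precision                                 # weight on the data estimate
    post_mean = w * b + (1 - w) * prior_mean
    post_sd = math.sqrt(1 / precision)
    return (math.exp(post_mean),
            math.exp(post_mean - z * post_sd),
            math.exp(post_mean + z * post_sd))

# Maximum-likelihood estimate quoted in the abstract: OR 3.44, 95% CI 0.90-13.17
or_sb, low_sb, high_sb = semi_bayes_shrink(3.44, 0.90, 13.17)
```

Estimates with wide confidence intervals receive little weight and move furthest toward the prior mean, which is why extreme, imprecise odds ratios shrink the most.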

288

Regression analysis of technical parameters affecting nuclear power plant performances  

International Nuclear Information System (INIS)

Since the 1980s many studies have been conducted in order to explain good and bad performances of commercial nuclear power plants (NPPs), but no defined correlation has yet been found to be totally representative of plant operational experience. In early works, data availability and the number of operating power stations were both limited; therefore, results showed that specific technical characteristics of NPPs were supposed to be the main causal factors for successful plant operation. Although these aspects continue to play a significant role, later studies and observations showed that other factors concerning management and organization of the plant could instead be predominant when comparing utilities' operational and economic results. Utility quality, in a word, can be used to summarize all the managerial and operational aspects that seem to be effective in determining plant performance. In this paper operational data of a consistent sample of commercial nuclear power stations, out of the total 433 operating NPPs, are analyzed, mainly focusing on the last decade of operational experience. The sample consists of PWR and BWR technology, operated by utilities located in different countries, including the U.S., Japan, France, Germany and Finland. Multivariate regression is performed using Unit Capability Factor (UCF) as the dependent variable; this factor reflects the effectiveness of plant programs and practices in maximizing the available electrical generation and consequently provides an overall indication of how well plants are operated and maintained. Aspects that may not be real causal factors but which can have a consistent impact on the UCF, such as technology design, supplier, size and age, are included in the analysis as independent variables. (authors)

2012-06-24

289

Regression analysis of technical parameters affecting nuclear power plant performances  

Energy Technology Data Exchange (ETDEWEB)

Since the 1980s many studies have sought to explain the good and bad performance of commercial nuclear power plants (NPPs), yet no defined correlation has been found that fully represents plant operational experience. In early works, data availability and the number of operating power stations were both limited; results therefore suggested that specific technical characteristics of NPPs were the main causal factors for successful plant operation. Although these aspects still play a significant role, later studies and observations showed that other factors concerning management and organization of the plant could instead be predominant when comparing utilities' operational and economic results. Utility quality, in a word, can be used to summarize all the managerial and operational aspects that seem to be effective in determining plant performance. In this paper operational data of a consistent sample of commercial nuclear power stations, out of the total 433 operating NPPs, are analyzed, mainly focusing on the last decade of operational experience. The sample consists of PWR and BWR technology, operated by utilities located in different countries, including the U.S., Japan, France, Germany and Finland. Multivariate regression is performed using Unit Capability Factor (UCF) as the dependent variable; this factor reflects the effectiveness of plant programs and practices in maximizing the available electrical generation and consequently provides an overall indication of how well plants are operated and maintained. Aspects that may not be real causal factors but which can have a consistent impact on the UCF, such as technology design, supplier, size and age, are included in the analysis as independent variables. (authors)

Ghazy, R.; Ricotti, M. E.; Trueco, P. [Politecnico di Milano, Via La Masa, 34, 20156 Milano (Italy)]

2012-07-01

290

Analysis of Success in General Chemistry Based on Diagnostic Testing Using Logistic Regression  

Science.gov (United States)

Several chemistry diagnostic and placement exams are used to help place chemistry students in an appropriate course or to determine strengths and weaknesses for specific topics in chemistry or math. The purpose of obtaining pre-course measurements is to increase students' academic success. Often these tests are used to predict the chance a student has in passing a course. This paper discusses the statistical methods of logistic regression applied to predicting the probability of passing a course, based on the scores on the California Chemistry Diagnostic Test at two different institutions with two different instructors over multiple years. This technique describes the relation of a test score (a continuous variable) to the probability of passing the class (a binary variable). Many papers in the Journal of Chemical Education have used a simple linear regression technique to correlate placement test scores with the proportion of students passing a course. The model assumptions are difficult to satisfy when using simple linear regression. Simple linear regression is useful when continuous predictor variables predict a continuous response, whereas logistic regression is useful when continuous predictor variables predict a binary response. Differences between simple linear regression and logistic regression and methods for evaluating linear regression model assumptions are discussed in detail. The fundamental concepts behind regression are described, with the caveats of using regression equations for predictions. By using logistic regression, instructors will be able to provide students with an estimate of their probability of passing the course.
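The relation described in this abstract, a continuous test score mapped to a probability of passing, can be illustrated with a minimal sketch. The function names and the Newton-iteration fitting routine below are illustrative assumptions, not the authors' procedure, and any scores passed in are hypothetical rather than data from the California Chemistry Diagnostic Test.

```python
import math

def fit_logistic(scores, passed, iterations=25):
    """Fit P(pass) = 1 / (1 + exp(-(b0 + b1 * score))) by Newton's method.

    scores: list of numeric test scores (hypothetical data)
    passed: list of 0/1 outcomes, 1 meaning the student passed
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the gradient (g) and the negative Hessian (h)
        # of the log-likelihood at the current coefficients.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(scores, passed):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: solve the 2x2 system h * delta = g.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def pass_probability(score, b0, b1):
    """Estimated probability of passing for a given test score."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * score)))
```

Unlike a straight-line fit to pass proportions, the fitted curve stays between 0 and 1 for any score, which is the practical reason the abstract gives for preferring logistic regression with a binary response. The sketch assumes the outcomes overlap across scores; with perfectly separated data the maximum-likelihood estimate does not exist and the iteration would not settle.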

Legg, Margaret J.; Legg, Jason C.; Greenbowe, Thomas J.

2001-08-01

291

Buffalos milk yield analysis using random regression models  

Directory of Open Access Journals (Sweden)

Full Text Available Data comprising 1,719 milk yield records from 357 females (predominantly Murrah breed), daughters of 110 sires, with births from 1974 to 2004, obtained from the Programa de Melhoramento Genético de Bubalinos (PROMEBUL) and from records of the EMBRAPA Amazônia Oriental - EAO herd, located in Belém, Pará, Brazil, were used to compare random regression models for estimating variance components and predicting breeding values of the sires. The data were analyzed by different models using Legendre polynomial functions of second to fourth order. The random regression models included the effects of herd-year, month of parity date of the control; regression coefficients for age of females (in order to describe the fixed part of the lactation curve) and random regression coefficients related to the direct genetic and permanent environment effects. The comparisons among the models were based on the Akaike Information Criterion. The random effects regression model using third order Legendre polynomials with four classes of the environmental effect was the one that best described the additive genetic variation in milk yield. The heritability estimates varied from 0.08 to 0.40. The genetic correlation between milk yields at younger ages was close to unity, but at older ages it was low.

A.S. Schierholt

2010-02-01

292

Combinatorial Analysis of Multiple Networks  

CERN Document Server

The study of complex networks has been historically based on simple graph data models representing relationships between individuals. However, often reality cannot be accurately captured by a flat graph model. This has led to the development of multi-layer networks. These models have the potential of becoming the reference tools in network data analysis, but require the parallel development of specific analysis methods explicitly exploiting the information hidden in-between the layers and the availability of a critical mass of reference data to experiment with the tools and investigate the real-world organization of these complex systems. In this work we introduce a real-world layered network combining different kinds of online and offline relationships, and present an innovative methodology and related analysis tools suggesting the existence of hidden motifs traversing and correlating different representation layers. We also introduce a notion of betweenness centrality for multiple networks. While some preli...

Magnani, Matteo; Rossi, Luca

2013-01-01

293

Two-stage source tracking method using a multiple linear regression model in the expanded phase domain  

Science.gov (United States)

This article proposes an efficient two-channel time delay estimation method for tracking a moving speaker in noisy and reverberant environments. Unlike conventional linear regression model-based methods, the proposed multiple linear regression model designed in the expanded phase domain shows high estimation accuracy in adverse conditions because its Gaussian assumption on the phase distribution is valid. Therefore, the least-square-based time delay estimator using the proposed multiple linear regression model becomes an ideal estimator that does not require a complicated phase unwrapping process. In addition, the proposed method is extended to a two-stage recursive estimation approach, which can be used for a moving source tracking scenario. The performance of the proposed method is compared with that of conventional cross-correlation and linear regression-based methods in noisy and reverberant environments. Experimental results verify that the proposed algorithm significantly decreases estimation anomalies and improves the accuracy of time delay estimation. Finally, the tracking performance of the proposed method for both slow and fast moving speakers is confirmed in adverse environments.

Yang, Jae-Mo; Kang, Hong-Goo

2012-12-01

294

Multivariate Regression Approach To Integrate Multiple Satellite And Tide Gauge Data For Real Time Sea Level Prediction  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The Sea Level Thematic Assembly Center in the EU FP7 MyOcean project aims to build a sea level service for multiple satellite sea level observations at a European level for GMES marine applications. It aims to improve the sea level related products to guarantee the sustainability and the quality of the GMES marine core service. One such added value will be a multivariate regression model of sea level variability of multisatellite and in-situ tide gauge observations with the...

Cheng, Yongcun; Andersen, Ole Baltazar; Knudsen, Per

2011-01-01

295

Multiple trait model combining random regressions for daily feed intake with single measured performance traits of growing pigs  

Digital Repository Infrastructure Vision for European Research (DRIVER)

A random regression model for daily feed intake and a conventional multiple trait animal model for the four traits average daily gain on test (ADG), feed conversion ratio (FCR), carcass lean content and meat quality index were combined to analyse data from 1 449 castrated male Large White pigs performance tested in two French central testing stations in 1997. Group housed pigs fed ad libitum with electronic feed dispensers were tested from 35 to 100 kg live body weight. A quadratic polynomial...

Schnyder, Urs; Hofer, Andreas; Labroue, Florence; Künzi, Niklaus

2002-01-01

296

Multiple Linear Regression Formula for the Probability of the Average Daily Solar Energy using the Queue System  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The multiple linear regression formula of the probability of the averaged daily solar energy reaching a specific location on the earth's surface in a calendar month was obtained with the assumption that the arrival process of clouds and solar energy during the day follows the exponential distribution. This formula enables any user to find out some of the required information such as knowing the maximum probability for the averaged daily solar energy and the amount of the corresponding clouds....

2012-01-01

297

Application of a multiple least-squares regression program to dual energy NaI-CsI(Tl) measurements  

International Nuclear Information System (INIS)

In conjunction with the development of an optimum background subtraction routine, a multiple least-squares regression program for simultaneous utilization of both the NaI(Tl) and CsI(Tl) energy ranges of a dual anti-coincidence detection system was applied. To experimentally evaluate the program for whole body counting purposes, an Am-241 contaminated subject was measured in the whole body counter using the standard three phoswich detector array surrounding the head.

1982-01-01

298

Development of Multiple Regression and Neural Network Models for Assessment of Blasting Dust at a Large Surface Coal Mine  

Digital Repository Infrastructure Vision for European Research (DRIVER)

oped for prediction of particulate matter. The performance of the multiple regression models was assessed. For the development of neural network models, a feed forward with back propagation learning algorithm was used to train the network. The performance of neural network was determined in terms of correlation coefficient (R) and Mean Square Error (MSE). The optimum number of hidden neurons was found out for obtaining the lowest value of MSE and the highest value of R. The results indicated ...

Roy, S.; Adhikari, G. R.; Renaldy, T. A.; Jha, A. K.

2011-01-01

299

Prediction of the Rock Mass Diggability Index by Using Fuzzy Clustering-Based, ANN and Multiple Regression Methods  

Science.gov (United States)

Rock mass classification systems are one of the most common ways of determining rock mass excavatability and related equipment assessment. However, the strength and weak points of such rating-based classifications have always been questionable. Such classification systems assign quantifiable values to predefined classified geotechnical parameters of rock mass. This causes particular ambiguities, leading to the misuse of such classifications in practical applications. Recently, intelligence system approaches such as artificial neural networks (ANNs) and neuro-fuzzy methods, along with multiple regression models, have been used successfully to overcome such uncertainties. The purpose of the present study is the construction of several models by using an adaptive neuro-fuzzy inference system (ANFIS) method with two data clustering approaches, including fuzzy c-means (FCM) clustering and subtractive clustering, an ANN and non-linear multiple regression to estimate the basic rock mass diggability index. A set of data from several case studies was used to obtain the real rock mass diggability index and compared to the predicted values by the constructed models. In conclusion, it was observed that ANFIS based on the FCM model shows higher accuracy and correlation with actual data compared to that of the ANN and multiple regression. As a result, one can use the assimilation of ANNs with fuzzy clustering-based models to construct such rigorous predictor tools.

Saeidi, Omid; Torabi, Seyed Rahman; Ataei, Mohammad

2014-03-01

300

Performance of Multiple Linear Regression Model for Long-term PM10 Concentration Prediction Based on Gaseous and Meteorological Parameters  

Directory of Open Access Journals (Sweden)

Full Text Available The aim of this study was to investigate the performance of the Multiple Linear Regression (MLR) method in predicting future (next day, next 2 days and next 3 days) PM10 concentration levels in Seberang Perai, Malaysia. The developed model was compared to multiple linear regression models. The model used gaseous (NO2, SO2, CO, PM10) and meteorological parameters (temperature, relative humidity and wind speed) as predictors. Performance indicators such as Prediction Accuracy (PA), Coefficient of Determination (R2), Index of Agreement (IA), Normalized Absolute Error (NAE) and Root Mean Square Error (RMSE) were used to measure the accuracy of the models. The performance indicators were: next day (RMSE = 11.211, NAE = 0.124, PA = 0.927, IA = 0.960, R2 = 0.858), next 2-day (RMSE = 14.652, NAE = 0.155, PA = 0.881, IA = 0.925, R2 = 0.775) and next 3-day (RMSE = 15.611, NAE = 0.167, PA = 0.849, IA = 0.912, R2 = 0.720). Assessment of model performance indicated that multiple linear regression method can be used for long term PM10 concentration prediction with next day for next day.
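Three of the performance indicators named in this abstract can be sketched as follows. The abstract does not give its formulas, so the definitions below are commonly used forms and should be read as assumptions: RMSE divides by n (some papers use n - 1), NAE is the summed absolute error over the summed observations, and IA is Willmott's index of agreement. PA and R2 are left out because their definitions vary between papers.

```python
import math

def rmse(obs, pred):
    """Root mean square error, dividing by n (an assumed convention)."""
    n = len(obs)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / n)

def nae(obs, pred):
    """Normalized absolute error: total absolute error over total observed."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / sum(obs)

def index_of_agreement(obs, pred):
    """Willmott's index of agreement; 1.0 means perfect agreement."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(obs, pred))
    return 1.0 - num / den
```

Under these definitions, lower RMSE and NAE and higher IA indicate a better fit, which matches the direction of the values reported for the next-day, 2-day and 3-day predictions.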

NorAzam Ramli

2012-01-01

 
 
 
 
301

HIGH RESOLUTION FOURIER ANALYSIS WITH AUTO-REGRESSIVE LINEAR PREDICTION  

Energy Technology Data Exchange (ETDEWEB)

Auto-regressive linear prediction is adapted to double the resolution of Angle-Resolved Photoemission Extended Fine Structure (ARPEFS) Fourier transforms. Even with the optimal taper (weighting function), the commonly used taper-and-transform Fourier method has limited resolution: it assumes the signal is zero beyond the limits of the measurement. By seeking the Fourier spectrum of an infinite extent oscillation consistent with the measurements but otherwise having maximum entropy, the errors caused by finite data range can be reduced. Our procedure developed to implement this concept adapts auto-regressive linear prediction to extrapolate the signal in an effective and controllable manner. Difficulties encountered when processing actual ARPEFS data are discussed. A key feature of this approach is the ability to convert improved measurements (signal-to-noise or point density) into improved Fourier resolution.

Barton, J.; Shirley, D.A.

1984-04-01

302

Arch Height: A Regression Analysis of Different Measuring Parameters  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Rationale: For measuring the height of the arch of the foot, either the standing navicular height or the talar height of the medial longitudinal arch was accepted in earlier days, whereas the ‘standing normalised navicular height’ is taken by modern-day authors as a yardstick. But as these are troublesome and time consuming, we practically do not opt for them in a busy OPD schedule; rather, we measure the arch height in supine posture. Objectives: So this study was aimed to derive the regression between...

2011-01-01

303

A Nonmonotone Line Search Method for Regression Analysis  

Directory of Open Access Journals (Sweden)

Full Text Available In this paper, we propose a nonmonotone line search combined with the search direction of G. L. Yuan and Z. X. Wei (New Line Search Methods for Unconstrained Optimization, Journal of the Korean Statistical Society, 38 (2009), pp. 29-39) for regression problems. The global convergence of the given method is established under suitable conditions. Numerical results show that the presented algorithm is more competitive than the normal methods.

Gonglin Yuan

2009-03-01

304

Analysis of some methods for reduced rank Gaussian process regression  

DEFF Research Database (Denmark)

While there is strong motivation for using Gaussian Processes (GPs) due to their excellent performance in regression and classification problems, their computational complexity makes them impractical when the size of the training set exceeds a few thousand cases. This has motivated the recent proliferation of a number of cost-effective approximations to GPs, both for classification and for regression. In this paper we analyze one popular approximation to GPs for regression: the reduced rank approximation. While generally GPs are equivalent to infinite linear models, we show that Reduced Rank Gaussian Processes (RRGPs) are equivalent to finite sparse linear models. We also introduce the concept of degenerate GPs and show that they correspond to inappropriate priors. We show how to modify the RRGP to prevent it from being degenerate at test time. Training RRGPs consists both in learning the covariance function hyperparameters and the support set. We propose a method for learning hyperparameters for a given support set. We also review the Sparse Greedy GP (SGGP) approximation (Smola and Bartlett, 2001), which is a way of learning the support set for given hyperparameters based on approximating the posterior. We propose an alternative method to the SGGP that has better generalization capabilities. Finally we make experiments to compare the different ways of training a RRGP. We provide some Matlab code for learning RRGPs.

Quinonero-Candela, J.; Rasmussen, Carl Edward

2005-01-01

305

BRGLM, Interactive Linear Regression Analysis by Least Square Fit  

International Nuclear Information System (INIS)

1 - Description of program or function: BRGLM is an interactive program written to fit general linear regression models by least squares and to provide a variety of statistical diagnostic information about the fit. Stepwise and all-subsets regression can be carried out also. There are facilities for interactive data management (e.g. setting missing value flags, data transformations) and tools for constructing design matrices for the more commonly-used models such as factorials, cubic splines, and auto-regressions. 2 - Method of solution: The least squares computations are based on the orthogonal (QR) decomposition of the design matrix obtained using the modified Gram-Schmidt algorithm. 3 - Restrictions on the complexity of the problem: The current release of BRGLM allows maxima of 1000 observations, 99 variables, and 3000 words of main memory workspace. For a problem with N observations and P variables, the number of words of main memory storage required is MAX(N*(P+6), N*P+P*P+3*N, 3*P*P+6*N). Any linear model may be fit, although the in-memory workspace will have to be increased for larger problems.
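The solution method named in this record, least squares via a QR decomposition built with modified Gram-Schmidt, can be sketched as below. This is an illustrative reimplementation under stated assumptions, not BRGLM's code: the function names are hypothetical, the design matrix is assumed to have full column rank, and the workspace helper simply evaluates the MAX(...) expression quoted in the abstract, read as the maximum of its three terms.

```python
def mgs_least_squares(X, y):
    """Solve min ||X b - y|| using a modified Gram-Schmidt QR factorisation.

    X is a list of rows (n observations by p variables); y has length n.
    """
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    rhs = list(y)
    R = [[0.0] * p for _ in range(p)]
    qty = [0.0] * p  # accumulates Q^T y
    for j in range(p):
        norm = sum(v * v for v in cols[j]) ** 0.5
        if norm == 0.0:
            raise ValueError("design matrix is rank deficient")
        R[j][j] = norm
        q = [v / norm for v in cols[j]]
        # "Modified" step: remove q from every remaining column right away,
        # rather than projecting against the original columns later.
        for k in range(j + 1, p):
            R[j][k] = sum(a * b for a, b in zip(q, cols[k]))
            cols[k] = [b - R[j][k] * a for a, b in zip(q, cols[k])]
        # Treat y like one more column to obtain Q^T y in the same sweep.
        qty[j] = sum(a * b for a, b in zip(q, rhs))
        rhs = [b - qty[j] * a for a, b in zip(q, rhs)]
    # Back substitution for the upper triangular system R b = Q^T y.
    beta = [0.0] * p
    for j in range(p - 1, -1, -1):
        s = qty[j] - sum(R[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = s / R[j][j]
    return beta

def workspace_words(n, p):
    """Evaluate the storage expression quoted in the abstract."""
    return max(n * (p + 6), n * p + p * p + 3 * n, 3 * p * p + 6 * n)
```

Working from the QR factors avoids forming the normal equations X'X, which squares the condition number of the problem; that is the usual motivation for an orthogonal decomposition in regression software.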

1985-01-01

306

A new approach in regression analysis for modeling adsorption isotherms.  

Science.gov (United States)

Numerous regression approaches to isotherm parameter estimation appear in the literature. Real insight into the proper modeling pattern can be achieved only by testing methods on a very large number of cases. Experimentally, this cannot be done in a reasonable time, so the Monte Carlo simulation method was applied. The objective of this paper is to introduce and compare numerical approaches that involve different levels of knowledge about the noise structure of the analytical method used for initial and equilibrium concentration determination. Six levels of homoscedastic noise and five types of heteroscedastic noise precision models were considered. Performance of the methods was statistically evaluated based on median percentage error and mean absolute relative error in parameter estimates. The present study showed a clear distinction between two cases. When equilibrium experiments are performed only once, for the homoscedastic case, the winning error function is ordinary least squares, while for the case of heteroscedastic noise the use of orthogonal distance regression or Marquardt's percent standard deviation is suggested. It was found that when experiments are repeated three times the simple method of weighted least squares performed as well as the more complicated orthogonal distance regression method. PMID:24672394

Marković, Dana D; Lekić, Branislava M; Rajaković-Ognjanović, Vladana N; Onjia, Antonije E; Rajaković, Ljubinka V
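Two of the error functions compared in this abstract can be written down in a few lines. The sketch below uses the commonly cited form of Marquardt's percent standard deviation (MPSD); the abstract itself gives no formulas, so this form, the function names and the argument layout are assumptions.

```python
import math

def sum_squared_errors(q_exp, q_calc):
    """Ordinary least squares objective: sum of squared residuals."""
    return sum((e - c) ** 2 for e, c in zip(q_exp, q_calc))

def mpsd(q_exp, q_calc, n_params):
    """Marquardt's percent standard deviation (assumed common form).

    Residuals are scaled by the experimental value, so every point
    contributes in relative rather than absolute terms.
    """
    n = len(q_exp)
    s = sum(((e - c) / e) ** 2 for e, c in zip(q_exp, q_calc))
    return 100.0 * math.sqrt(s / (n - n_params))
```

Because MPSD weights relative errors, it behaves like least squares with weights proportional to 1/q², which is a plausible reason for it to do well when the noise grows with concentration (the heteroscedastic case), while the unweighted sum of squares suits constant noise.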

2014-01-01

307

A New Approach in Regression Analysis for Modeling Adsorption Isotherms  

Science.gov (United States)

Numerous regression approaches to isotherm parameter estimation appear in the literature. Real insight into the proper modeling pattern can be achieved only by testing methods on a very large number of cases. Experimentally, this cannot be done in a reasonable time, so the Monte Carlo simulation method was applied. The objective of this paper is to introduce and compare numerical approaches that involve different levels of knowledge about the noise structure of the analytical method used for initial and equilibrium concentration determination. Six levels of homoscedastic noise and five types of heteroscedastic noise precision models were considered. Performance of the methods was statistically evaluated based on median percentage error and mean absolute relative error in parameter estimates. The present study showed a clear distinction between two cases. When equilibrium experiments are performed only once, for the homoscedastic case, the winning error function is ordinary least squares, while for the case of heteroscedastic noise the use of orthogonal distance regression or Marquardt's percent standard deviation is suggested. It was found that when experiments are repeated three times the simple method of weighted least squares performed as well as the more complicated orthogonal distance regression method.

Onjia, Antonije E.

2014-01-01

308

Multi-stratified multiple regression tests of the linear/no-threshold theory of radon-induced lung cancer  

International Nuclear Information System (INIS)

A plot of lung-cancer rates versus radon exposures in 965 US counties, or in all US states, has a strong negative slope, b, in sharp contrast to the strong positive slope predicted by linear/no-threshold theory. The discrepancy between these slopes exceeds 20 standard deviations (SD). Including smoking frequency in the analysis substantially improves fits to a linear relationship but has little effect on the discrepancy in b, because correlations between smoking frequency and radon levels are quite weak. Including 17 socioeconomic variables (SEV) in multiple regression analysis reduces the discrepancy to 15 SD. Data were divided into segments by stratifying on each SEV in turn, and on geography, and on both simultaneously, giving over 300 data sets to be analyzed individually, but negative slopes predominated. The slope is negative whether one considers only the most urban counties or only the most rural; only the richest or only the poorest; only the richest in the South Atlantic region or only the poorest in that region, etc.; and for all the strata in between. Since this is an ecological study, the well-known problems with ecological studies were investigated and found not to be applicable here. The "ecological fallacy" was shown not to apply in testing a linear/no-threshold theory, and the vulnerability to confounding is greatly reduced when confounding factors are only weakly correlated with radon levels, as is generally the case here. All confounding factors known to correlate with radon and with lung cancer were investigated quantitatively and found to have little effect on the discrepancy.

1990-10-15

309

Use of Structure Coefficients in Published Multiple Regression Articles: Beta Is Not Enough.  

Science.gov (United States)

Reviewed articles published in the "Journal of Applied Psychology" (JAP) to determine how interpretations might have differed if standardized regression coefficients and structure coefficients (or bivariate "r"s of predictors with the criterion) had been interpreted. Summarizes some dramatic misinterpretations or incomplete interpretations.…

Courville, Troy; Thompson, Bruce

2001-01-01

310

Iterative weighted regression analysis of logit responses: a computer program for analysis of bioassays and immunoassays  

International Nuclear Information System (INIS)

A program, WRANL, is described for the analysis of immunoassays or bioassays which have a logistic dose-response relationship. Responses are transformed to logits and iterative weighted regression analysis is used to obtain log dose-logit response lines for all preparations compared in an assay. Potency estimates of preparations relative to the standard preparation are available for both unweighted and weighted regression analyses together with detailed analysis of variance, estimates of slope and other relevant parameters. The general comparisons of dose-response relationships produced by the program are a feature of particular interest. However, an option which suppresses the more general output is available if the program is to be used for analysis of a 'screening' assay comparing single dilutions or doses of test samples with a standard curve. Data input is designed to permit immediate running of the program by junior personnel. Data output is designed to facilitate record keeping. (Auth.)
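The core idea in this record, transforming responses to logits and fitting a log dose-logit response line by iteratively reweighted regression, can be sketched as follows. This is not WRANL's algorithm: the function names are hypothetical, responses are assumed to be fractions strictly between 0 and 1, and the weight (p(1-p))² is an assumption that follows from the delta method when the untransformed response has constant variance.

```python
import math

def logit(p):
    """Logit transform of a response fraction strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def weighted_line(x, y, w):
    """Weighted least squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def fit_logit_line(doses, responses, rounds=5):
    """Fit logit(response) = a + b * log10(dose) with iterated weights."""
    x = [math.log10(d) for d in doses]
    y = [logit(r) for r in responses]
    w = [1.0] * len(x)  # the first pass is unweighted
    for _ in range(rounds):
        a, b = weighted_line(x, y, w)
        fitted = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Assumed weighting: var(logit p) is about var(p) / (p(1-p))^2,
        # so points near 0 or 1 are down-weighted.
        w = [(p * (1.0 - p)) ** 2 for p in fitted]
    return a, b
```

For two preparations fitted with a common slope b, the horizontal distance between their lines, (a_test - a_standard) / b, gives the log relative potency; the abstract mentions such potency estimates without stating how the program computes them.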

1982-08-01

311

Estimation of streamflow, base flow, and nitrate-nitrogen loads in Iowa using multiple linear regression models  

Science.gov (United States)

Nineteen variables, including precipitation, soils and geology, land use, and basin morphologic characteristics, were evaluated to develop Iowa regression models to predict total streamflow (Q), base flow (Qb), storm flow (Qs) and base flow percentage (%Qb) in gauged and ungauged watersheds in the state. Discharge records from a set of 33 watersheds across the state for the 1980 to 2000 period were separated into Qb and Qs. Multiple linear regression found that 75.5 percent of long term average Q was explained by rainfall, sand content, and row crop percentage variables, whereas 88.5 percent of Qb was explained by these three variables plus permeability and floodplain area variables. Qs was explained by average rainfall and %Qb was a function of row crop percentage, permeability, and basin slope variables. Regional regression models developed for long term average Q and Qb were adapted to annual rainfall and showed good correlation between measured and predicted values. Combining the regression model for Q with an estimate of mean annual nitrate concentration, a map of potential nitrate loads in the state was produced. Results from this study have important implications for understanding geomorphic and land use controls on streamflow and base flow in Iowa watersheds and similar agriculture dominated watersheds in the glaciated Midwest. (JAWRA) (Copyright © 2005).

Schilling, K. E.; Wolter, C. F.

2005-01-01

312

Better prediction of protein contact number using a support vector regression analysis of amino acid sequence  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract. Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of Cβ atoms in other residues within a sphere around the Cβ atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either "contacted" or "non-contacted", the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary protein sequence and higher order consecutive protein structural and functional properties.

Yuan Zheng

2005-10-01

313

Problems of Interpretation and Use of the Residuals in Multiple Regression  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The allocation of the total explained variance to the predictors is discussed. With the help of the residuals, the beta coefficients and the part correlation coefficients, among others, are characterized. The explained variance can be decomposed: 1) 'from below' (using the squared part correlation coefficients), 2) 'from above' (using the squared simple correlations), 3) hierarchically in stepwise regression, 4) in an only formally satisfactory way using the products of the beta coeffi...

1980-01-01

314

Control of matrix interferences by multiple linear regression models in the determination of arsenic and lead concentrations in fly ashes by inductively coupled plasma optical emission spectrometry  

Energy Technology Data Exchange (ETDEWEB)

A multiple linear regression technique was used to evaluate and correct the matrix interferences in the determination of As and Pb concentrations in fly ashes by inductively coupled plasma optical emission spectrometry. The direct determination of As and Pb in SRM 1633b by ICP-OES failed to obtain the certified concentrations, except in a couple of cases. However, it proved possible to use the multiple linear regression (MLR) technique to correct the determined concentrations to a satisfactory level. This method of correction is based on the multiple regression line obtained from the analysis of 19 synthetic mixtures of matrix and analyte elements (Al, As, Ca, Fe, Pb, and Si) at five concentration levels. The matrix interferences in the determination of As were caused by Al, Pb, and Ca whereas the matrix interferences in the determination of Pb were caused by Al and Fe. The most suitable parameters for the determination of As and Pb were a plasma power of 1500 W and a nebulizer flow of 0.5 or 0.6 L min^-1. The accuracy of the method was shown with the analysis of SRM 1633b and two fly ash samples with the standard addition method. A recovery rate of 96% can be reached for Pb at 220.353 nm with three digestion methods (US-TSD, US1 and MW) by using both direct measurement with thoroughly optimized plasma conditions and the MLR method. A recovery rate of 93% was obtained for As when using the MLR method at 193.696 nm with the digestion method US2, a plasma power of 1500 W, and nebulizer flow of 0.6 L min^-1. The corrected and determined concentrations of As and Pb in samples analyzed resulted in a precision of 0.6 to 3.9%.

Hander, A.; Vaisanen, A. [University of Jyvaskyla, Jyvaskyla (Finland). Dept. of Chemistry]

2010-10-15

315

An R package to compute commonality coefficients in the multiple regression case: an introduction to the package and a practical example.  

Science.gov (United States)

Multiple regression is a widely used technique for data analysis in social and behavioral research. The complexity of interpreting such results increases when correlated predictor variables are involved. Commonality analysis provides a method of determining the variance accounted for by respective predictor variables and is especially useful in the presence of correlated predictors. However, computing commonality coefficients is laborious. To make commonality analysis accessible to more researchers, a program was developed to automate the calculation of unique and common elements in commonality analysis, using the statistical package R. The program is described, and a heuristic example using data from the Holzinger and Swineford (1939) study, readily available in the MBESS R package, is presented. PMID:18522056

Nimon, Kim; Lewis, Mitzi; Kane, Richard; Haynes, R Michael

2008-05-01
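
The commonality analysis described in the abstract above can be illustrated for the simplest case of two predictors. The following is a minimal Python sketch, not the R program the authors describe; the function names and the toy data are assumptions made for illustration. It uses the standard two-predictor identities: the unique contribution of each predictor is the full-model R² minus the R² of the other predictor alone, and the common component is what remains.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def commonality_two_predictors(y, x1, x2):
    """Split the R^2 of y ~ x1 + x2 into unique and common parts.

    Hypothetical helper for illustration; assumes x1 and x2 are not
    perfectly collinear (|r_12| < 1).
    """
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    r2_x1 = r_y1 ** 2                      # R^2 of y ~ x1
    r2_x2 = r_y2 ** 2                      # R^2 of y ~ x2
    # R^2 of the two-predictor model expressed through correlations.
    r2_full = (r2_x1 + r2_x2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return {
        "unique_x1": r2_full - r2_x2,
        "unique_x2": r2_full - r2_x1,
        "common": r2_x1 + r2_x2 - r2_full,
        "r2_full": r2_full,
    }


# Toy data (made up): two uncorrelated predictors and y = x1 + x2.
x1_demo = [1.0, 2.0, 3.0, 4.0]
x2_demo = [1.0, -1.0, -1.0, 1.0]
y_demo = [a + b for a, b in zip(x1_demo, x2_demo)]
demo_parts = commonality_two_predictors(y_demo, x1_demo, x2_demo)
print(demo_parts)
```

By construction the three components sum to the full-model R². With more than two predictors the number of components grows as 2^k - 1, which is why the abstract calls the manual computation laborious.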

316

A comparison of multiple regression and neural network techniques for mapping in situ pCO2 data  

International Nuclear Information System (INIS)

Using about 138,000 measurements of surface pCO2 in the Atlantic subpolar gyre (50-70 deg N, 60-10 deg W) during 1995-1997, we compare two methods of interpolation in space and time: a monthly distribution of surface pCO2 constructed using multiple linear regressions on position and temperature, and a self-organizing neural network approach. Both methods confirm characteristics of the region found in previous work, i.e. the subpolar gyre is a sink for atmospheric CO2 throughout the year, and exhibits a strong seasonal variability with the highest undersaturations occurring in spring and summer due to biological activity. As an annual average the surface pCO2 is higher than estimates based on available syntheses of surface pCO2. This supports earlier suggestions that the sink of CO2 in the Atlantic subpolar gyre has decreased over the last decade instead of increasing as previously assumed. The neural network is able to capture a more complex distribution than can be well represented by linear regressions, but both techniques agree relatively well on the average values of pCO2 and derived fluxes. However, when both techniques are used with a subset of the data, the neural network predicts the remaining data to a much better accuracy than the regressions, with a residual standard deviation ranging from 3 to 11 µatm. The subpolar gyre is a net sink of CO2 of 0.13 Gt-C/yr using the multiple linear regressions and 0.15 Gt-C/yr using the neural network, on average between 1995 and 1997. Both calculations were made with the NCEP monthly wind speeds converted to 10 m height and averaged between 1995 and 1997, and using the gas exchange coefficient of Wanninkhof

2005-11-01

317

Analysis of Maryland Poisoning Deaths Using Classification And Regression Tree (CART) Analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Our study is a cross-sectional analysis of Maryland poisoning deaths for years 2003 and 2004. We used Classification and Regression Tree (CART) methodology to classify 1,204 Maryland undetermined intent poisoning deaths as either unintentional or suicidal poisonings. The predictive ability of the selected set of variables (i.e., poisoned in the home or workplace, location type where poisoned, place of death, poison type, victim race and age, year of death) was extremely good. Of the 301 test ...

Pamer, Carol; Serpi, Tracey; Finkelstein, Joseph

2008-01-01

318

Predicting agility performance with other performance variables in pubescent boys: a multiple-regression approach.  

Science.gov (United States)

The goal was to investigate the influence of balance, jumping power, reactive-strength, speed, and morphological variables on five different agility performances in early pubescent boys (N = 71). The predictors included body height and mass, countermovement and broad jumps, overall stability index, 5 m sprint, and bilateral side jumps test of reactive strength. Forward stepwise regressions calculated on 36 randomly selected participants explained 47% of the variance in performance of the forward-backward running test, 50% of the 180 degrees turn test, 55% of the 20 yd. shuttle test, 62% of the T-shaped course test, and 44% of the zig-zag test, with the bilateral side jumps as the single best predictor. Regression models were cross-validated using the second half of the sample (n = 35). Correlation between predicted and achieved scores did not provide statistically significant validation statistics for the continuous-movement zig-zag test. Further study is needed to assess other predictors of agility in early pubescent boys. PMID:24897879

Sekulic, Damir; Spasic, Miodrag; Esco, Michael R

2014-04-01
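
The split-sample validation described in the abstract above (fit on one part of the sample, then correlate predicted with achieved scores in the other part) can be sketched as follows. This is a simplified Python illustration under stated assumptions: it fits a single-predictor line rather than the forward stepwise multiple regression used in the study, and the function names, seed and data are hypothetical.

```python
import random
from math import sqrt


def fit_line(x, y):
    """Least-squares intercept and slope for y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_validation(x, y, seed=0):
    """Fit on a random half, return r(predicted, observed) on the rest."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    half = (len(idx) + 1) // 2            # 36 of 71, as in the abstract
    train, held_out = idx[:half], idx[half:]
    intercept, slope = fit_line([x[i] for i in train],
                                [y[i] for i in train])
    predicted = [intercept + slope * x[i] for i in held_out]
    observed = [y[i] for i in held_out]
    return pearson(predicted, observed)


# Made-up, noise-free data for 71 hypothetical participants.
predictor_demo = [float(i) for i in range(71)]
score_demo = [2.0 * v + 1.0 for v in predictor_demo]
print(split_half_validation(predictor_demo, score_demo))
```

A validation correlation close to the fit obtained in the training half suggests the model generalizes; a weak or non-significant correlation, as reported for the zig-zag test, suggests it does not.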

319

Piecewise polynomial regression with fractional residuals for the analysis of calcium imaging data  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this work we deal with the mathematical analysis and application of piecewise (or segmented) polynomial regression. Motivated by an application in neurobiology we allow the residual processes of our model to exhibit long memory, short memory or antipersistence.

2012-01-01

320

Logistic regression analysis of the risk factors of acute renal failure complicating limb war injuries  

Directory of Open Access Journals (Sweden)

Full Text Available Objective: To explore the risk factors for acute renal failure (ARF) complicating war injuries of the limbs. Methods: The clinical data of 352 patients with limb injuries admitted to 303 Hospital of PLA from 1968 to 2002 were retrospectively analyzed. The patients were divided into an ARF group (n=9) and a non-ARF group (n=343) according to the occurrence of ARF, and a case-control study was carried out. Ten factors which might lead to death were analyzed by logistic regression to screen the risk factors for ARF, including causes of trauma, shock after injury, time of admission to hospital after injury, injured sites, combined trauma, number of surgical procedures, presence of foreign matters, features of fractures, amputation, and tourniquet time. Results: Fifteen of the 352 patients died (4.3%); among them, 7 patients (46.7%) died of ARF, 3 (20.0%) of pulmonary embolism, 3 (20.0%) of gas gangrene, and 2 (13.3%) of multiple organ failure. Univariate analysis revealed that shock, time before admission to hospital, amputation, and tourniquet time were risk factors for ARF in the wounded with limb injuries, while the logistic regression analysis showed that only amputation was a risk factor for ARF (P < 0.05). Conclusion: ARF is the primary cause of death in the wounded with limb injury. Prompt and accurate treatment and optimal timing of amputation may help decrease the incidence and mortality of ARF in the wounded with severe limb injury and ischemic necrosis.

Chang-zhi CHENG

2011-06-01

321

Análise de regressão múltipla das concentrações de PM10 em função de elementos meteorológicos para Porto Alegre, Estado do Rio Grande do Sul, em 2005 e 2006 = Multiple regression analysis of PM10 concentration concerning to meteorological elements for Porto Alegre, Rio Grande do Sul State, in 2005 and 2006  

Directory of Open Access Journals (Sweden)

Full Text Available Air is an efficient means of dispersing atmospheric pollutants, and its behavior depends on the atmospheric movements that occur in the troposphere. In Porto Alegre, Rio Grande do Sul State, there is heavy daily traffic and a concentration of industries that may be responsible for atmospheric emissions. In the present work we studied the behavior of daily concentrations of particulate matter (PM10) in this city, considering the influence of meteorological variables. Data analysis was performed using descriptive statistics, linear correlation and multiple regression. Data were provided by the State Foundation of Environmental Protection Henrique Luiz Roessler - RS (FEPAM) and the National Institute of Meteorology (INMET). Based on the analysis it was possible to verify that: the concentrations of PM10, measured every day at 4:00 p.m., did not exceed national air quality standards; the meteorological elements that influenced the concentrations of PM10 were the daily average wind speed and the daily average radiation, with negative relations, and the daily average air temperature and the north and northwest wind directions, with positive relations. The wind directions that contribute significantly to lowering concentrations at the measured sites are east and southeast.

Angela Radünz Lazzari

2011-01-01

322

Using Negative Binomial Regression Analysis to Predict Software Faults: A Study of Apache Ant  

Directory of Open Access Journals (Sweden)

Full Text Available Negative binomial regression has been proposed as an approach to predicting fault-prone software modules. However, little work has been reported on the strengths, weaknesses, and applicability of this method. In this paper, we present an in-depth study of the effectiveness of using negative binomial regression to predict fault-prone software modules under two different conditions, self-assessment and forward assessment. The performance of the negative binomial regression model is also compared with another popular fault prediction model, the binary logistic regression method. The study is performed on six versions of an open-source object-oriented project, Apache Ant. The study shows that (1) the performance of forward assessment is better than or at least the same as the performance of self-assessment; (2) in predicting fault-prone modules, the negative binomial regression model could not outperform the binary logistic regression model; and (3) negative binomial regression is effective in predicting multiple errors in one module.

Liguo Yu

2012-07-01

323

Additive Intensity Regression Models in Corporate Default Analysis  

DEFF Research Database (Denmark)

We consider additive intensity (Aalen) models as an alternative to the multiplicative intensity (Cox) models for analyzing the default risk of a sample of rated, nonfinancial U.S. firms. The setting allows for estimating and testing the significance of time-varying effects. We use a variety of model checking techniques to identify misspecifications. In our final model, we find evidence of time-variation in the effects of distance-to-default and short-to-long term debt. Also we identify interactions between distance-to-default and other covariates, and the quick ratio covariate is significant. None of our macroeconomic covariates are significant.

Lando, David; Medhat, Mamdouh

2013-01-01

324

The application of a multiple regression model for aero radiometric data  

International Nuclear Information System (INIS)

The data observed in the total channel of high-sensitivity airborne γ-ray spectrometric surveys are selected as the dependent variable, while those of the Th, K and U channels are considered as independent variables, and a linear statistical model is assumed to relate them as $\mathrm{Total}_i = \beta_0 + \beta_1 U_i + \beta_2 \mathrm{Th}_i + \beta_3 K_i + \varepsilon_i$, where $\beta_1$, $\beta_2$, $\beta_3$ are the partial regression coefficients and $\varepsilon_i$ is the error term. The estimated coefficients ($\beta_1$, $\beta_2$, $\beta_3$) are used to check the data acquisition system on board as well as to predict, occasionally, a more appropriate value of the data in case a single data item is not recorded correctly. (author)

1988-01-01
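
The linear model in the abstract above, which relates the total-count channel to the U, Th and K channels, can be fitted by ordinary least squares. The sketch below is a minimal pure-Python illustration via the normal equations; the function names, the coefficient names `b0` to `b3` and the channel readings are assumptions made for the example, not values or code from the paper.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_total_channel(u, th, k, total):
    """Least-squares [b0, b1, b2, b3] for total ~ b0 + b1*U + b2*Th + b3*K."""
    rows = [[1.0, ui, ti, ki] for ui, ti, ki in zip(u, th, k)]
    p = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, total)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_total(coeffs, u, th, k):
    """Predicted total-channel value for one record."""
    b0, b1, b2, b3 = coeffs
    return b0 + b1 * u + b2 * th + b3 * k


# Made-up channel readings; the totals follow an exact linear rule so
# that the recovered coefficients are easy to check.
u_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
th_demo = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
k_demo = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
total_demo = [10.0 + 2.0 * a + 3.0 * b + 1.5 * c
              for a, b, c in zip(u_demo, th_demo, k_demo)]
coeffs_demo = fit_total_channel(u_demo, th_demo, k_demo, total_demo)
print(coeffs_demo)
```

Once fitted, `predict_total` gives the value the model expects for a record, which is the kind of check or replacement value the abstract mentions for a single item that was not recorded correctly. For large or ill-conditioned surveys, a QR-based solver would be a safer choice than the normal equations.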

325

Analysis of Maryland poisoning deaths using classification and regression tree (CART) analysis.  

Science.gov (United States)

Our study is a cross-sectional analysis of Maryland poisoning deaths for years 2003 and 2004. We used Classification and Regression Tree (CART) methodology to classify undetermined intent Maryland poisoning deaths as either unintentional or suicidal poisonings. The predictive ability of the selected set of variables (i.e., poisoned in the home or workplace, location type, where poisoned, place of death, poison type, victim race and age, year of death) was extremely good. Of the 301 test cases, only eight were misclassified by the CART regression tree. Of 1,204 undetermined intent poisoning deaths, CART classified 903 as suicides and 301 as unintentional deaths. The major strength of our study is the use of CART to differentiate with a high degree of accuracy between unintentional and suicidal poisoning deaths among Maryland undetermined intent poisoning deaths. PMID:18999168

Pamer, Carol; Serpi, Tracey; Finkelstein, Joseph

2008-01-01

326

ESTIMATE OF CO2 EFFLUX OF SOIL, OF A TRANSITION FOREST IN NORTHWEST OF MATO GROSSO STATE, USING MULTIPLE REGRESSION  

Directory of Open Access Journals (Sweden)

Full Text Available Many research groups have been studying the contribution of tropical forests to the global carbon cycle and the climatic consequences of replacing forests with pastures. Considering that soil CO2 efflux is the largest component of the carbon cycle of the biosphere, this work derived an equation for estimating the soil CO2 efflux of an area of transition forest, using a multiple regression model for time series data of temperature and soil moisture. The study was carried out in the northwest of Mato Grosso, Brazil (11°24.75’S; 55°19.50’W), in a transition forest between cerrado and Amazon forest, 50 km from Sinop county. Each month, throughout one year, soil CO2 efflux, temperature and soil moisture were measured. The annual average soil CO2 efflux was 7.5 ± 0.6 µmol m^-2 s^-1 (mean ± SE), and the annual mean soil temperature was 25.06 ± 0.12 ºC (mean ± SE). The study indicated that humidity had a strong influence on soil CO2 efflux; however, the results were more significant using a multiple regression model that estimated the logarithm of soil CO2 efflux, considering time, soil moisture and the interaction between time duration and the inverse of soil temperature.

Carla Maria Abido Valentini

2008-03-01

327

Tree-Ring Growth Response of Common Ash (Fraxinus excelsior L.) to Climatic Variables Using Multiple Regressions  

Directory of Open Access Journals (Sweden)

Full Text Available This study was carried out in the Forest Park of Noor. To determine the tree-ring response to climatic variation, 35 cores were taken from a dominant natural stand of common ash (Fraxinus excelsior L.). The aim of the study was to find, using multiple regression models, which climatic variables affect the ring-width growth of ash in the current growing year and in previous years (one, two and three years before the current growing year) in the north of Iran. In total, 85 annual, monthly and growing-season climatic variables of precipitation, temperature, heat index, evapotranspiration and water balance were analyzed. The best multiple regression models explained 83 percent of the total variance in the growth of common ash. The results show that the growth of common ash was related more to the previous years' climatic variations than to those of the current year. The most effective role of climatic variations was due to the first and second preceding years (55%). Evapotranspiration in July and September and precipitation in May of the second previous year, and precipitation in March of the third previous year, all positively affected the growth of this species. This study revealed that ash favors warmer conditions in the early and middle growing season when moisture is available, and precipitation in the months of the early growing season (Ordibehesht-Khordad) of the two previous years.

H. Jalilvand

2008-01-01

328

An Application of the ABS Algorithm for Modeling Multiple Regression on Massive Data, Predicting the Most Influencing Factors  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Linear Least Square (LLS) is an approach for modeling regression analysis, applied for prediction and quantification of the strength of relationship between dependent and independent variables. There are a number of methods for solving the LLS problem but as soon as the data size increases and system becomes ill conditioned, the classical methods become complex at time and space with decreasing level of accuracy. Proposed work is based on prediction and quantification of the strength of relat...

Soniya Lalwani; Krishna Mohan, M.; Pooran Singh Solanki; Sorabh Singhal; Sandeep Mathur; Emilio Spedicato

2013-01-01

329

A conditional likelihood approach for regression analysis using biomarkers measured with batch-specific error  

Science.gov (United States)

Measurement error is common in epidemiological and biomedical studies. When biomarkers are measured in batches or groups, measurement error is potentially correlated within each batch or group. In regression analysis, most existing methods are not applicable in the presence of batch-specific measurement error in predictors. We propose a robust conditional likelihood approach to account for batch-specific error in predictors when batch effect is additive and the predominant source of error, which requires no assumptions on the distribution of measurement error. While a regression model with batch as a categorical covariable yields the same parameter estimates as the proposed conditional likelihood approach for linear regression, this result does not hold in general for all generalized linear models, in particular, logistic regression. Our simulation studies show that the conditional likelihood approach achieves better finite sample performance than the regression calibration approach or a naive approach without adjustment for measurement error. In the case of logistic regression, our proposed approach is shown to also outperform the regression approach with batch as a categorical covariate. In addition, we also examine a “hybrid” approach combining the conditional likelihood method and the regression calibration method, which is shown in simulations to achieve good performance in the presence of both batch-specific and measurement-specific error. We illustrate our method using data from a colorectal adenoma study.

Wang, Ming; Flanders, W. Dana; Bostick, Roberd M.; Long, Qi

2012-01-01

330

A method for meta-analysis of case-control genetic association studies using logistic regression.  

Science.gov (United States)

We propose here a simple and robust approach for meta-analysis of molecular association studies. Making use of the binary structure of the data, and by treating the genotypes as independent variables in a logistic regression, we apply a simple and commonly used methodology that performs satisfactorily, being at the same time very flexible. We present simple tests for detecting heterogeneity and we describe a random effects extension of the method in order to allow for between-studies heterogeneity. We also derive simple tests for assessing the most plausible genetic model of inheritance and its between-studies heterogeneity, as well as for adjusting for covariates. The methodology introduced here is easily extended to cases with polytomous or continuous outcomes as well as to cases with more than two alleles. We apply the methodology in several published meta-analyses of genetic association studies with very encouraging results. The main advantages of the proposed methodology are its flexibility and ease of use, while at the same time it covers almost every aspect of a meta-analysis, providing overall estimates without the need for multiple comparisons. We anticipate that this simple method will be used in the future in meta-analyses of genetic association studies. A STATA command performing all the available computations is available at http://bioinformatics.biol.uoa.gr/~pbagos/metagen/. PMID:17605724

Bagos, Pantelis G; Nikolopoulos, Georgios K

2007-01-01

331

Regression analysis for a bottom-up approach to analyzing semi-prompt fission gamma yields  

International Nuclear Information System (INIS)

Highlights: Fitting the semi-prompt non-resolved photon spectrum after fission. Energy–time dependence can be factorized. Physical model, statistical model, sampling procedure. The best fit is: lognormal for energy and F for time. Abstract: We present an empirical model that describes the yield of gamma rays emitted by fission in the time interval from 20 to 958 ns following a fission event. The analysis is based on experimental data from neutron-induced fission of 235U and 239Pu. The model is devised by first using regression analysis to identify likely patterns in the data and to choose plausible fitting functions. We provide statistical and physical arguments in support of time and energy independence. The intensity of the emitted gamma rays can be described as a bivariate distribution that is the product of independent variates for energy and time. We test several plausible distribution families for the energy and time variates and use maximum likelihood and minimum χ2 to estimate distribution parameters. Because of the uncertainty in the experimental data, multiple combinations of variate pairs give rise to a surface that plausibly fits the observations well. The best-fit variate turns out to be lognormal in energy and F in time. The findings illustrated in this paper can be used to simulate gamma ray de-excitation from fission in Monte Carlo codes.

2012-08-01

332

Using Multiple Regression in Estimating (semi) VOC Emissions and Concentrations at the European Scale  

DEFF Research Database (Denmark)

This paper proposes a simple method for estimating emissions and predicted environmental concentrations (PECs) in water and air for organic chemicals that are used in household products and industrial processes. The method has been tested on existing data for 63 organic high-production volume chemicals available in the European Chemicals Bureau risk assessment reports (RARs). The method suggests a simple linear relationship between Henry's Law constant, octanol-water coefficient, use and production volumes, and emissions and PECs on a regional scale in the European Union. Emissions and PECs are a result of a complex interaction between chemical properties, production and use patterns and geographical characteristics. A linear relationship cannot capture these complexities; however, it may be applied at a cost-efficient screening level for suggesting critical chemicals that are candidates for an in-depth risk assessment. Uncertainty measures are not available for the RAR data; however, uncertainties for the applied regression models are given in the paper. Evaluation of the methods reveals that between 79% and 93% of all emission and PEC estimates are within one order of magnitude of the reported RAR values. Bearing in mind that the domain of the method comprises organic industrial high-production volume chemicals, four chemicals, prioritized in the Water Framework Directive and the Stockholm Convention on Persistent Organic Pollutants, were used to test the method for estimated emissions and PECs, with corresponding uncertainty intervals, in air and water at regional EU level.

Fauser, Patrik; Thomsen, Marianne

2010-01-01
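
The evaluation criterion quoted in the abstract above (the share of estimates lying within one order of magnitude of the reported RAR values) is easy to state precisely. The sketch below is a hypothetical Python illustration; the function names and the example numbers are invented for the demonstration and are not data from the paper.

```python
from math import log10


def within_one_order_of_magnitude(estimate, reference):
    """True if estimate and reference differ by at most a factor of 10.

    Both values must be positive (e.g. emissions or concentrations).
    """
    return abs(log10(estimate / reference)) <= 1.0


def fraction_within_one_order(estimates, references):
    """Share of estimate/reference pairs within a factor of 10."""
    hits = sum(within_one_order_of_magnitude(e, r)
               for e, r in zip(estimates, references))
    return hits / len(estimates)


# Invented example values, for illustration only.
estimates_demo = [5.0, 1.0, 20.0]
references_demo = [40.0, 100.0, 3.0]
print(fraction_within_one_order(estimates_demo, references_demo))
```

Comparing on a log scale treats over- and underestimation symmetrically, which suits a screening-level method whose outputs span several orders of magnitude.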

333

EPSVR and EPMeta: prediction of antigenic epitopes using support vector regression and multiple server results  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background: Accurate prediction of antigenic epitopes is important for immunologic research and medical applications, but it is still an open problem in bioinformatics. The case for discontinuous epitopes is even worse: currently there are only a few discontinuous epitope prediction servers available, though discontinuous peptides constitute the majority of all B-cell antigenic epitopes. The small number of structures for antigen-antibody complexes limits the development of reliable discontinuous epitope prediction methods and an unbiased benchmark to evaluate developed methods. Results: In this work, we present two novel server applications for discontinuous epitope prediction: EPSVR and EPMeta, where EPMeta is a meta server. EPSVR, EPMeta, and datasets are available at http://sysbio.unl.edu/services. Conclusion: The server application for discontinuous epitope prediction, EPSVR, uses a Support Vector Regression (SVR) method to integrate six scoring terms. Furthermore, we combined EPSVR with five existing epitope prediction servers to construct EPMeta. All methods were benchmarked by our curated independent test set, in which all antigens had no complex structures with the antibody, and their epitopes were identified by various biochemical experiments. The area under the receiver operating characteristic curve (AUC) of EPSVR was 0.597, higher than that of any other existing single server, and EPMeta had a better performance than any single server, with an AUC of 0.638, significantly higher than PEPITO and Disctope (p-value

Yao Bo

2010-07-01

334

A Noncentral "t" Regression Model for Meta-Analysis  

Science.gov (United States)

In this article, three multilevel models for meta-analysis are examined. Hedges and Olkin suggested that effect sizes follow a noncentral "t" distribution and proposed several approximate methods. Raudenbush and Bryk further refined this model; however, this procedure is based on a normal approximation. In the current research literature, this…

Camilli, Gregory; de la Torre, Jimmy; Chiu, Chia-Yi

2010-01-01

335

OPTREG - An Interactive Computer Program for Optimization and Regression.  

Science.gov (United States)

OPTREG is a computer program which provides stepwise multiple regression analysis and optimization of a user defined function by the geometric simplex method. Both optimization and regression are performed interactively. This gives the user visibility and...

R. A. Erickson; R. Southall; D. W. Twigg; Y. Y. Wong

1977-01-01

336

A regressed phase analysis for coupled joint systems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This study aims to address shortcomings of the relative phase analysis, a widely-used method for assessment of coupling among joints of the lower limb. Goniometric data from 15 individuals with spastic diplegic cerebral palsy were recorded from the hip and knee joints during ambulation on a flat surface, and from a single healthy individual with no known motor impairment, over at least 10 gait cycles. The minimum relative phase (MRP) revealed substantial disparity in the timing and severity o...

Wininger, Michael

2011-01-01

337

An Application of the ABS Algorithm for Modeling Multiple Regression on Massive Data, Predicting the Most Influencing Factors  

Directory of Open Access Journals (Sweden)

Full Text Available Linear Least Squares (LLS) is an approach for modeling regression analysis, applied for prediction and quantification of the strength of the relationship between dependent and independent variables. There are a number of methods for solving the LLS problem, but as the data size increases and the system becomes ill-conditioned, the classical methods become costly in time and space, with a decreasing level of accuracy. The proposed work is based on prediction and quantification of the strength of the relationship between fasting sugar and Post-Prandial (PP) sugar and 73 factors that affect diabetes. Due to the large number of independent variables, the diabetes prediction problem presented similar complexities. The ABS method is an approach proven to be better than other classical approaches for LLS problems. The ABS algorithm has been applied to solve the LLS problem. Hence, separate regression equations were obtained for fasting sugar and PP severity.

Soniya Lalwani

2013-06-01

338

Evaluation of Logistic Regression and Neural Network Model With Sensitivity Analysis on Medical Datasets  

Directory of Open Access Journals (Sweden)

Full Text Available Logistic Regression (LR) is a well-known classification method in the field of statistical learning. It allows probabilistic classification and shows promising results on several benchmark problems. Logistic regression enables us to investigate the relationship between a categorical outcome and a set of explanatory variables. Artificial Neural Networks (ANNs) are popularly used as universal non-linear inference models and have gained extensive popularity in recent years. Research activities are considerable and the literature is growing. The goal of this research work is to compare the performance of logistic regression and neural network models on publicly available medical datasets. The evaluation process of the model is as follows. The logistic regression and neural network methods with sensitivity analysis have been evaluated for the effectiveness of the classification. The classification accuracy is used to measure the performance of both models. From the experimental results it is confirmed that the neural network model with sensitivity analysis gives the more efficient result.

Raghavendra B.K. & S.K. Srivatsa

2011-12-01

339

Multiple-outcome meta-analysis of clinical trials.  

Science.gov (United States)

When several clinical trials report multiple outcomes, meta-analyses ordinarily analyse each outcome separately. Instead, by applying generalized-least-squares (GLS) regression, Raudenbush et al. showed how to analyse the multiple outcomes jointly in a single model. A variant of their GLS approach, discussed here, can incorporate correlations among the outcomes within treatment groups and thus provide more accurate estimates. Also, it facilitates adjustment for covariates. In our approach, each study need not report all outcomes nor evaluate all treatments. For example, a meta-analysis may evaluate two or more treatments (one 'treatment' may be a control) and include all randomized controlled trials that report on any subset (of one or more) of the treatments of interest. The analysis omits other treatments that these trials evaluated but that are not of interest to the meta-analyst. In the proposed fixed-effects GLS regression model, study-level and treatment-arm-level covariates may be predictors of one or more of the outcomes. An analysis of rheumatoid arthritis data from trials of second-line drug treatments (used after initial standard therapies prove unsatisfactory for a patient) motivates and applies the method. Data from 44 randomized controlled trials were used to evaluate the effectiveness of injectable gold and auranofin on the three outcomes tender joint count, grip strength, and erythrocyte sedimentation rate. The covariates in the regression model were quality and duration of trial and baseline measures of the patients' disease severity and disease activity in each trial. The meta-analysis found that gold was significantly more effective than auranofin on all three treatment outcomes. For all estimated coefficients, the multiple-outcomes model produced moderate changes in their values and slightly smaller standard errors, relative to the three separate outcome models. PMID:8668877

Berkey, C S; Anderson, J J; Hoaglin, D C

1996-03-15

340

Prediction of Spontaneous Regression of Cervical Intraepithelial Neoplasia Lesions Grades 2 and 3 by Proteomic Analysis  

Science.gov (United States)

Regression of cervical intraepithelial neoplasia (CIN) 2-3 to CIN 1 or less is associated with immune response as demonstrated by immunohistochemistry in formaldehyde-fixed paraffin-embedded (FFPE) biopsies. Proteomic analysis of water-soluble proteins in supernatants of biopsy samples with LC-MS (LTQ-Orbitrap) was used to identify proteins predictive of regression of CIN2-3 lesions. CIN2-3 in the biopsies and persistence (CIN2-3) or regression (≤CIN1) in follow-up cone biopsies were validated histologically by two experienced pathologists. In a learning set of 20 CIN2-3 (10 regressions and 10 persistence cases), supernatants were depleted of seven high abundance proteins prior to unidimensional LC-MS/MS protein analysis. Mean protein concentration was 0.81 mg/mL (range: 0.55–1.14). Multivariate statistical methods were used to identify proteins that were able to discriminate between regressive and persistent CIN2-3. The findings were validated in an independent test set of 20 CIN2-3 (10 regressions and 10 persistence cases). Multistep identification criteria identified 165 proteins. In the learning set, zinc finger protein 441 and phospholipase D6 independently discriminated between regressive and persistent CIN2-3 lesions and correctly classified all 20 patients. Nine regression and all persistence cases were correctly classified in the validation set. Zinc finger protein 441 and phospholipase D6 in supernatant samples detected by LTQ-Orbitrap can predict regression of CIN2-3.

Uleberg, Kai-Erik; Øvestad, Irene Tveiteras; Munk, Ane Cecilie; van Diermen, Bianca; Gudlaugsson, Einar; Janssen, Emiel A. M.; Hjelle, Anne; Baak, Jan P. A.

2014-01-01

 
 
 
 
341

The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis  

DEFF Research Database (Denmark)

This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression function. However, the a priori specification of a functional form involves the risk of choosing one that is not similar to the 'true' but unknown relationship between the regressors and the dependent variable. This problem, known as parametric misspecification, can result in biased parameter estimates and hence, also in biased measures, which are derived from the estimated parameters. This, in turn, can result in incorrect economic conclusions and recommendations for managers, politicians and decision makers in general. This PhD thesis focuses on a nonparametric econometric approach that can be used to avoid this problem. The main objective is to investigate the applicability of the nonparametric kernel regression method in applied production analysis. The focus of the empirical analyses included in this thesis is the agricultural sector in Poland. Data on Polish farms are used to investigate practically and politically relevant problems and to illustrate how nonparametric regression methods can be used in applied microeconomic production analysis both in panel data and cross-section data settings. The thesis consists of four papers. The first paper addresses problems of parametric and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences within a nonparametric panel data regression framework.
The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric kernel methods are well-suited to econometric production analysis and can outperform traditional parametric methods. Although the empirical focus of this thesis is on the application of nonparametric kernel regression in applied production analysis, the findings are also applicable to econometric estimations in general.

Czekaj, Tomasz Gerard

2013-01-01

342

Calculation of Slater-Condon and Landé parameters in some Nd³⁺ complexes using partial and multiple regression method  

International Nuclear Information System (INIS)

The interelectronic repulsion and spin-orbit interaction parameters for some Nd³⁺ β-diketone complexes have been computed using the partial and multiple regression method from the observed absorption spectra in the region 1000-23500 cm⁻¹. A brief outline of this method, which is an alternative to a computer programming method, is given. The energy parameters (Slater-Condon and Landé) derived from intra-f^N transitions of the lanthanide ion are important for predicting the covalent tendency of the metal-ligand bond in the complex, on the basis of the decrease in the value of these parameters. The complexes have been arranged in increasing order of covalency, as indicated by the value of β or b^(1/2). (author)

1981-03-01

343

Multiple regression technique for estimating the insulation strength of series dielectrics on distribution systems: a statistical approach  

Energy Technology Data Exchange (ETDEWEB)

A method of estimating the lightning-impulse critical-flashover (CFO) insulation strength of two components used on distribution construction is presented. The multiple regression technique (MRT) has been applied to comprehensive CFO data of two dielectric materials in series. General CFO populations of two-component models and sample models for different combinations of two materials have been developed. A diagnostic correction test is performed for the general case and for each combination model, to decide which models fit well. Suggestions are made regarding the more accurate prediction model, and the main factors that might have affected the predicted results are emphasised. Also, a procedure to predict values outside the range of the experimental results is described for other sizes and lengths of the tested components. This procedure may be a good tool for finding better insulation added in distribution systems. (author)

Shwehdi, M.H.; Shahzad, F.; Izzularab, M.; Abu-Al-Feilat, E.A. [King Fahd University of Petroleum and Minerals, Dhahran (Saudi Arabia)]

1998-03-01

344

Modelling of Habitat Suitability Index for Muntjac Muntiacus muntjak Using Remote Sensing, GIS and Multiple Logistic Regression  

Directory of Open Access Journals (Sweden)

Full Text Available Habitat degradation and loss have been widely recognized as the main cause of the decline of wildlife populations. Evaluating the quality of wildlife habitat can provide essential information for wildlife refuge design and management. The purpose of this study was to produce georeferenced ecological information about suitable habitats available for muntjac, Muntiacus muntjak, in Chandoli tiger reserve, India (17° 04' 00" N to 17° 19' 54" N and 73° 40' 43" E to 73° 53' 09" E). Habitats were evaluated using multiple logistic regression integrated with remote sensing and a geographic information system. LISS-III satellite imagery from IRS-P6 of the study area was digitally processed. To generate collateral data, topographic maps were analysed in a GIS framework. Layers of different variables, such as land use/land cover, forest density, proximity to disturbances and water resources, and a digital terrain model, were created from satellite imagery and topographic sheets. These layers, along with GPS locations of muntjac presence/absence and multiple logistic regression (MLR) techniques, were integrated in a GIS environment to model the habitat suitability index of muntjac. The results indicate that approximately 222.39 km2 (75.4%) of the forest of the tiger reserve was least suitable for muntjac, whereas 29.53 km2 (10.02%) was moderately suitable, 22.12 km2 (7.5%) suitable and 20.70 km2 (7.0%) highly suitable. The accuracy level of this model was 97.6%. The model can be considered robust enough to support declaring the forests of this area a reserve for muntjac conservation, ultimately to provide a prey base for tigers.

Imam EKWAL

2012-12-01

345

Regression Analysis on the Chemical Descriptors of a Selected Class of DPP4 Inhibitors  

Directory of Open Access Journals (Sweden)

Full Text Available The activity of a selected class of DPP4 inhibitors was assessed using quantum-chemical and physical descriptors. Using a multiple linear regression model, it was found that ΔE, LUMO energy, dipole, area, volume, molecular weight and ΔH are the significant descriptors that can adequately assess the activity of the compounds. The model suggests that bulky and electrophilic inhibitors are desired. Furthermore, pair interactions between ΔE and dipole, as well as between LUMO energy and dipole, were determined. It is expected that the information derived herein will be beneficial for future design and development of DPP4 inhibitors. Key words: Multiple Linear Regression; Molecular Descriptors; 2D-QSAR; DPP4 Inhibitors

Jose Isagani B. Janairo

2010-11-01

346

Prediction of cartilage compressive modulus using multiexponential analysis of T(2) relaxation data and support vector regression.  

Science.gov (United States)

Evaluation of mechanical characteristics of cartilage by magnetic resonance imaging would provide a noninvasive measure of tissue quality both for tissue engineering and when monitoring clinical response to therapeutic interventions for cartilage degradation. We use results from multiexponential transverse relaxation analysis to predict equilibrium and dynamic stiffness of control and degraded bovine nasal cartilage, a biochemical model for articular cartilage. Sulfated glycosaminoglycan concentration/wet weight (ww) and equilibrium and dynamic stiffness decreased with degradation from 103.6 ± 37.0 µg/mg ww, 1.71 ± 1.10 MPa and 15.3 ± 6.7 MPa in controls to 8.25 ± 2.4 µg/mg ww, 0.015 ± 0.006 MPa and 0.89 ± 0.25 MPa, respectively, in severely degraded explants. Magnetic resonance measurements were performed on cartilage explants at 4 °C in a 9.4 T wide-bore NMR spectrometer using a Carr-Purcell-Meiboom-Gill sequence. Multiexponential T2 analysis revealed four water compartments with T2 values of approximately 0.14, 3, 40 and 150 ms, with corresponding weight fractions of approximately 3, 2, 4 and 91%. Correlations between weight fractions and stiffness based on conventional univariate and multiple linear regressions exhibited a maximum r(2) of 0.65, while those based on support vector regression (SVR) had a maximum r(2) value of 0.90. These results indicate that (i) compartment weight fractions derived from multiexponential analysis reflect cartilage stiffness and (ii) SVR-based multivariate regression exhibits greatly improved accuracy in predicting mechanical properties as compared with conventional regression. PMID:24519878

Irrechukwu, Onyi N; Thaer, Sarah Von; Frank, Eliot H; Lin, Ping-Chang; Reiter, David A; Grodzinsky, Alan J; Spencer, Richard G

2014-04-01

347

A viscoelastic model of blood capillary extension and regression: derivation, analysis, and simulation.  

Science.gov (United States)

This work studies a fundamental problem in blood capillary growth: how the cell proliferation or death induces the stress response and the capillary extension or regression. We develop a one-dimensional viscoelastic model of blood capillary extension/regression under nonlinear friction with surroundings, analyze its solution properties, and simulate various growth patterns in angiogenesis. The mathematical model treats the cell density as the growth pressure eliciting a viscoelastic response from the cells, which again induces extension or regression of the capillary. Nonlinear analysis captures two cases when the biologically meaningful solution exists: (1) the cell density decreases from root to tip, which may occur in vessel regression; (2) the cell density is time-independent and is of small variation along the capillary, which may occur in capillary extension without proliferation. The linear analysis with perturbation in cell density due to proliferation or death predicts the global biological solution exists provided the change in cell density is sufficiently slow in time. Examples with blow-ups are captured by numerical approximations and the global solutions are recovered by slow growth processes, which validate the linear analysis theory. Numerical simulations demonstrate this model can reproduce angiogenesis experiments under several biological conditions including blood vessel extension without proliferation and blood vessel regression. PMID:23149501

Zheng, Xiaoming; Xie, Chunjing

2014-01-01

348

Simultaneous Optimization of Nanocrystalline SnO2 Thin Film Deposition Using Multiple Linear Regressions  

Directory of Open Access Journals (Sweden)

Full Text Available A nanocrystalline SnO2 thin film was synthesized by a chemical bath method. The parameters affecting the energy band gap and surface morphology of the deposited SnO2 thin film were optimized using a semi-empirical method. Four parameters, including deposition time, pH, bath temperature and tin chloride (SnCl2·2H2O) concentration were optimized by a factorial method. The factorial used a Taguchi OA (TOA) design method to estimate certain interactions and obtain the actual responses. Statistical evidences in analysis of variance including high F-value (4,112.2 and 20.27), very low P-value (<0.012 and 0.0478), non-significant lack of fit, the determination coefficient (R2 equal to 0.978 and 0.977) and the adequate precision (170.96 and 12.57) validated the suggested model. The optima of the suggested model were verified in the laboratory and results were quite close to the predicted values, indicating that the model successfully simulated the optimum conditions of SnO2 thin film synthesis.

Saeideh Ebrahimiasl

2014-02-01

349

Simultaneous Optimization of Nanocrystalline SnO2 Thin Film Deposition Using Multiple Linear Regressions  

Science.gov (United States)

A nanocrystalline SnO2 thin film was synthesized by a chemical bath method. The parameters affecting the energy band gap and surface morphology of the deposited SnO2 thin film were optimized using a semi-empirical method. Four parameters, including deposition time, pH, bath temperature and tin chloride (SnCl2·2H2O) concentration were optimized by a factorial method. The factorial used a Taguchi OA (TOA) design method to estimate certain interactions and obtain the actual responses. Statistical evidences in analysis of variance including high F-value (4,112.2 and 20.27), very low P-value (<0.012 and 0.0478), non-significant lack of fit, the determination coefficient (R2 equal to 0.978 and 0.977) and the adequate precision (170.96 and 12.57) validated the suggested model. The optima of the suggested model were verified in the laboratory and results were quite close to the predicted values, indicating that the model successfully simulated the optimum conditions of SnO2 thin film synthesis.

Ebrahimiasl, Saeideh; Zakaria, Azmi

2014-01-01

350

Seasonal forecasting of Bangladesh summer monsoon rainfall using simple multiple regression model  

Science.gov (United States)

In this paper, the development of a statistical forecasting method for summer monsoon rainfall over Bangladesh is described. Predictors for Bangladesh summer monsoon (June-September) rainfall were identified from the large-scale ocean-atmospheric circulation variables (i.e., sea-surface temperature, surface air temperature and sea level pressure). The predictors exhibited a significant relationship with Bangladesh summer monsoon rainfall during the period 1961-2007. After a detailed analysis of various global climate datasets, three predictors were selected. The model performance was evaluated during the period 1977-2007. The model performed well in hindcasting seasonal monsoon rainfall over Bangladesh. The RMSE and Heidke skill score for the 31 years were 8.13 and 0.37, respectively, and the correlation between the predicted and observed rainfall was 0.74. The BIAS of the forecasts (% of long period average, LPA) was -0.85 and the Hit score was 58%. The experimental forecasts for the year 2008 summer monsoon rainfall based on the model were also found to be in good agreement with the observation.

Rahman, Md Mizanur; Rafiuddin, M.; Alam, Md Mahbub

2013-04-01
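
The verification scores quoted in the abstract above (RMSE, bias, and the correlation between predicted and observed rainfall) can be computed as in the minimal sketch below. The rainfall values and the function names are hypothetical, chosen only for illustration, and are not taken from the paper; the Heidke skill score and the hit score are left out.

```python
import math

def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

def mean_bias(predicted, observed):
    """Mean of (predicted - observed); negative values mean under-forecasting."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical seasonal rainfall, expressed as % of the long-period average.
observed = [98.0, 105.0, 91.0, 110.0, 102.0]
predicted = [100.0, 101.0, 95.0, 106.0, 99.0]

print(f"RMSE = {rmse(predicted, observed):.2f}")
print(f"bias = {mean_bias(predicted, observed):.2f}")
print(f"r    = {pearson_r(predicted, observed):.2f}")
```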

351

NEW IDEA FOR THE TOPOLOGICAL INDEX EVALUATION AND TREATISE MULTIPLE REGRESSION WITH THREE INDEPENDENT VARIABLES: SATURATED HYDROCARBONS USED LIKE A MODEL  

Directory of Open Access Journals (Sweden)

Full Text Available In the QSRR discipline, a novel, easy-to-use parameter (Vc) was designed to evaluate classical topological indices (W, ¹χ, Z, MTI) and two new-generation ones (Xu, ¹χh). Regression between Vc and ¹χh gave a correlation coefficient (r) of 0.9992, a surprisingly high value compared with those commonly found in the QSPR/QSAR discipline. Through the Vc parameter, an idea for treating multiple regression with three independent variables is presented. A set of 35 saturated hydrocarbons was used as the model.

E CORNWELL

2006-03-01

352

MODELOS DE OPTIMIZACIÓN POR METAS PARA EL CÁLCULO DE ESTIMADORES EN REGRESIÓN MÚLTIPLE / GOAL OPTIMIZATION MODELS FOR THE ESTIMATORS CALCULUS IN MULTIPLE REGRESSION PROBLEMS  

Scientific Electronic Library Online (English)

Full Text Available SciELO Colombia | Language: Spanish. Abstract (translated from Spanish): This introductory work presents and describes several multiple regression models and their respective formulation as goal-optimization problems. It describes the median regression, weighted median regression, quantile regression, weighted quantile regression and formulation [...] minimax models. In addition, the dual formulation of these models is described, and some simple examples are presented to explain the concepts developed and the applications of these models in engineering and science. Abstract in English: This introductory work presents several multiple regression models and their formulation as goal programming problems. It describes the median regression, weighted median regression, quantile regression, weighted quantile regression, and minimax formulati [...] on models. Furthermore, it describes their dual formulation. Some simple examples are described to explain the concepts developed and the applications of such models in engineering and sciences.

López Ospina, Héctor Andrés; López Ospina, Rafael David.
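
As a hedged illustration of the goal-optimization view described in the abstract above, median, quantile and minimax regression can each be written as a linear program. The notation below (responses y_i, regressor vectors x_i, coefficients beta, deviation variables d_i^+ and d_i^-, quantile level tau, bound z) is assumed here for illustration and is not taken from the paper.

```latex
% Median (least absolute deviations) regression as a goal program:
% each observation is a goal, and d_i^+ / d_i^- measure over- and under-achievement.
\begin{aligned}
\min_{\boldsymbol\beta,\, d^{+},\, d^{-}} \quad & \sum_{i=1}^{n} \left( d_i^{+} + d_i^{-} \right) \\
\text{s.t.} \quad & y_i - \mathbf{x}_i^{\top}\boldsymbol\beta = d_i^{+} - d_i^{-},
\qquad d_i^{+},\, d_i^{-} \ge 0, \qquad i = 1,\dots,n .
\end{aligned}

% Quantile regression at level \tau \in (0,1): same constraints, asymmetric weights.
\min_{\boldsymbol\beta,\, d^{+},\, d^{-}} \; \sum_{i=1}^{n} \left( \tau\, d_i^{+} + (1-\tau)\, d_i^{-} \right)

% Minimax (Chebyshev) regression: minimise the largest absolute deviation z.
\begin{aligned}
\min_{\boldsymbol\beta,\, z} \quad & z \\
\text{s.t.} \quad & -z \le y_i - \mathbf{x}_i^{\top}\boldsymbol\beta \le z, \qquad i = 1,\dots,n .
\end{aligned}
```

With tau = 1/2 the quantile objective is one half of the median objective, so both have the same minimisers; weighted variants multiply each term of the sum by a per-observation weight.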

353

Analysis of multiple primary cancers  

International Nuclear Information System (INIS)

From January 1971 to August 1979, 4156 patients with malignant tumors other than brain tumors were registered at the Department of Radiotherapy, National Sapporo Hospital. Seventy-one of them had multiple primary cancers, an incidence of 1.71% in our series. One patient had four separate primary cancers, arising in the cervix uteri, the sigmoid colon, the thymus and the stomach, respectively. In 27 cases (38.0%), the cancers occurred within 1 year of each other. The longest interval was 33 years. Five cases were considered to be radiation-induced cancers; they developed secondarily in the irradiated region between 5 and 26 years after the completion of irradiation. Twenty-five percent of the patients had a family history of cancer. (author)

1980-01-01

354

Development of an empirical model of turbine efficiency using the Taylor expansion and regression analysis  

International Nuclear Information System (INIS)

An empirical model of turbine efficiency is necessary for control- and/or diagnosis-oriented simulation and useful for the simulation and analysis of dynamic performances of turbine equipment and systems, such as air cycle refrigeration systems, power plants, turbine engines, and turbochargers. Existing empirical models of turbine efficiency are insufficient because there is no suitable form available for air cycle refrigeration turbines. This work performs a critical review of empirical models (called mean value models in some literature) of turbine efficiency and develops an empirical model in the desired form for air cycle refrigeration, the dominant cooling approach in aircraft environmental control systems. The Taylor series and regression analysis are used to build the model, with the Taylor series being used to expand functions with the polytropic exponent and the regression analysis to finalize the model. The measured data of a turbocharger turbine and two air cycle refrigeration turbines are used for the regression analysis. The proposed model is compact and able to present the turbine efficiency map. Its predictions agree with the measured data very well, with the corrected coefficient of determination Rc2 ≥ 0.96 and the mean absolute percentage deviation = 1.19% for the three turbines. Highlights: Performed a critical review of empirical models of turbine efficiency. Developed an empirical model in the desired form for air cycle refrigeration, using the Taylor expansion and regression analysis. Verified the method for developing the empirical model. Verified the model.

2011-05-01

355

Analysis of patients with complete histopathological tumour regression after neoadjuvant chemoradiotherapy for locally advanced rectal cancer  

Directory of Open Access Journals (Sweden)

Full Text Available Background: Our aim was to present the effect of neoadjuvant chemoradiation therapy on the development of complete histopathological tumour regression in patients with locally advanced rectal cancer and its influence on the five-year survival of these patients. Methods: In total, 223 patients were included in the analysis; 109 patients had locally advanced rectal cancer; 75 patients received neoadjuvant chemoradiation therapy, which was later followed by surgery; 34 patients were treated with surgery alone. The surgical procedure was done 6 to 8 weeks after the chemoradiation therapy and it was preceded by haematology and biochemical analyses. In addition, patients were examined by ultrasound and MRI imaging of the liver to evaluate the effects of neoadjuvant chemoradiation therapy. Accordingly, we had two patient groups: patients with complete histopathological tumour regression and patients with incomplete or no regression. We performed the statistical analysis of all locally advanced rectal cancer patients and determined their survival. Results: Complete histopathological tumour regression was found in 10.7% of the 75 patients who were treated with preoperative chemoradiation. Downstaging of the tumour appeared in 53.3% of patients. There were no stage changes in 21.3% of patients. The disease progressed into a more severe stage in 9.3% of patients, while the effects of the preoperative chemoradiation therapy could not be determined in 5.3% of patients. The survival of patients with complete histopathological tumour regression was 70% in a five-year period, while it was 40% in patients with incomplete histopathological regression. Conclusion: Preoperative chemoradiation therapy leads to complete histopathological tumour regression and increases five-year survival (70%). It also leads to an increase in the number of patients who undergo radical surgery.

Knežević-Ušaj Slavica

2010-01-01

356

Analysis of stresses on buried pipeline subjected to landslide based on numerical simulation and regression analysis  

Energy Technology Data Exchange (ETDEWEB)

Landslides have a serious impact on the integrity of oil and gas pipelines in the tough terrain of Western China. This paper introduces a solving method of axial stress, which uses numerical simulation and regression analysis for the pipelines subjected to landslides. Numerical simulation is performed to analyze the change regularity of pipe stresses for the five vulnerability assessment indexes, which are: the distance between pipeline and landslide tail; the thickness of landslide; the inclination angle of landslide; the pipeline length passing through landslide; and the buried depth of pipeline. A pipeline passing through a certain landslide in southwest China was selected as an example to verify the feasibility and effectiveness of this method. This method has practical applicability, but it would need large numbers of examples to better verify its reliability and should be modified accordingly. Also, it only considers the case where the direction of the pipeline is perpendicular to the primary slip direction of the landslide.

Han, Bing; Jing, Hongyuan; Liu, Jianping; Wu, Zhangzhong [PetroChina Pipeline R&D Center, Langfang, Hebei (China)]; Hao, Jianbin [School of Petroleum Engineering, Southwest Petroleum University, Chengdu, Sichuan (China)]

2010-07-01

357

Linear Maximum Likelihood Regression Analysis for Untransformed Log-Normally Distributed Data  

Directory of Open Access Journals (Sweden)

Full Text Available Medical research data are often skewed and heteroscedastic. It has therefore become practice to log-transform data in regression analysis, in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large simulation study. Log-normal observations were generated according to the simulation models and parameters were estimated using the new ML method, ordinary least-squares regression (LS) and weighted least-squares regression (WLS). All three methods produced unbiased estimates of parameters and expected response, and ML and WLS yielded smaller standard errors than LS. The approximate normality of the Wald statistic, used for tests of the ML estimates, in most situations produced correct type I error risk. Only ML and WLS produced correct confidence intervals for the estimated expected value. ML had the highest power for tests regarding β1.

Sara M. Gustavsson

2012-10-01

358

Quantile regression for the statistical analysis of immunological data with many non-detects  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Methods and results Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an implementation to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Conclusion Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.

Eilers Paul HC

2012-07-01
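
The reason quantiles tolerate non-detects, as the abstract above argues, can be seen with plain sample quantiles: a quantile that lies above the detection limit does not depend on what value is substituted for the non-detects. The sketch below is an illustration under assumptions, not the authors' implementation; the sample values, the detection limit and the function names are hypothetical, and it computes sample quantiles rather than fitting a quantile regression model.

```python
import math

def nearest_rank_quantile(values, q):
    """Nearest-rank sample quantile: the smallest observation such that
    at least a fraction q of the data lies at or below it."""
    ordered = sorted(values)
    k = max(math.ceil(q * len(ordered)) - 1, 0)
    return ordered[k]

def substitute_nondetects(values, limit, fill):
    """Replace every value below the detection limit by `fill`."""
    return [v if v >= limit else fill for v in values]

LIMIT = 1.0
# Hypothetical sample: 5 of the 8 values lie below the detection limit.
sample = [0.1, 0.2, 0.3, 0.4, 0.5, 2.0, 3.0, 4.0]

for fill in (0.0, LIMIT):
    data = substitute_nondetects(sample, LIMIT, fill)
    median = nearest_rank_quantile(data, 0.5)
    upper = nearest_rank_quantile(data, 0.75)
    print(f"fill={fill}: median={median}, 75th percentile={upper}")
```

In this hypothetical sample the median falls among the non-detects and so changes with the substituted value, while the 75th percentile lies above the detection limit and is the same for both substitutions.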

359

Spline Nonparametric Regression Analysis of Stress-Strain Curve of Confined Concrete  

Directory of Open Access Journals (Sweden)

Full Text Available Due to enormous uncertainties in confinement models associated with the maximum compressive strength and ductility of concrete confined by rectilinear ties, the implementation of spline nonparametric regression analysis is proposed herein as an alternative approach. The statistical evaluation is carried out based on 128 large-scale column specimens of either normal- or high-strength concrete tested under uniaxial compression. The main advantage of this kind of analysis is that it can be applied when the trend of the relation between predictor and response variables is not obvious. The error in the analysis can, therefore, be minimized so that it does not depend on the assumption of a particular shape of the curve. This provides higher flexibility in the application. The results of the statistical analysis indicate that the stress-strain curves of confined concrete obtained from the spline nonparametric regression analysis are in good agreement with the experimental curves available in the literature.

Tavio Tavio

2008-01-01

360

Regression Activity  

Science.gov (United States)

This activity focuses on basic ideas of linear regression. It covers creating scatterplots from data, describing the association between two variables, and correlation as a measure of linear association. After this activity students will have the knowledge to create output that yields R-square, the slope and intercept, as well as their interpretations. This activity also covers some of the basics about residual analysis and the fit of the linear regression model in certain settings.

2009-01-28
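
The quantities this activity names (slope, intercept, R-square and residuals) can be obtained from the usual least-squares formulas, as in the minimal sketch below. The data points and the function name `fit_line` are hypothetical and are not part of the activity itself.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared, residuals).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed value minus fitted value.
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared, residuals

# Hypothetical data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r_squared, residuals = fit_line(x, y)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R-square={r_squared:.3f}")
```

A plot of the residuals against x is the usual next step for judging whether a straight line is an adequate fit.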

 
 
 
 
361

Semiparametric modeling and estimation of heteroscedasticity in regression analysis of cross-sectional data  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We consider the problem of modeling heteroscedasticity in semiparametric regression analysis of cross-sectional data. Existing work in this setting is rather limited and mostly adopts a fully nonparametric variance structure. This approach is hampered by the curse of dimensionality in practical applications. Moreover, the corresponding asymptotic theory is largely restricted to estimators that minimize certain smooth objective functions. The asymptotic derivation thus excludes semiparametric quant...

Keilegom, Ingrid; Wang, Lan

2010-01-01

362

Application of logistic regression in an analysis of Polish households’ financial problems  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This article attempted to identify the socio-economic and demographic factors influencing the problems with arrears in Polish households. The micro data from Social Diagnosis were used. In order to achieve the main goal the logistic regression analysis was used.

2012-01-01

363

Land use regression modeling of intra-urban residential variability in multiple traffic-related air pollutants  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background There is a growing body of literature linking GIS-based measures of traffic density to asthma and other respiratory outcomes. However, no consensus exists on which traffic indicators best capture variability in different pollutants or within different settings. As part of a study on childhood asthma etiology, we examined variability in outdoor concentrations of multiple traffic-related air pollutants within urban communities, using a range of GIS-based predictors and land use regression techniques. Methods We measured fine particulate matter (PM2.5), nitrogen dioxide (NO2), and elemental carbon (EC) outside 44 homes representing a range of traffic densities and neighborhoods across Boston, Massachusetts and nearby communities. Multiple three to four-day average samples were collected at each home during winters and summers from 2003 to 2005. Traffic indicators were derived using Massachusetts Highway Department data and direct traffic counts. Multivariate regression analyses were performed separately for each pollutant, using traffic indicators, land use, meteorology, site characteristics, and central site concentrations. Results PM2.5 was strongly associated with the central site monitor (R2 = 0.68). Additional variability was explained by total roadway length within 100 m of the home, smoking or grilling near the monitor, and block-group population density (R2 = 0.76). EC showed greater spatial variability, especially during winter months, and was predicted by roadway length within 200 m of the home. The influence of traffic was greater under low wind speed conditions, and concentrations were lower during summer (R2 = 0.52). NO2 showed significant spatial variability, predicted by population density and roadway length within 50 m of the home, modified by site characteristics (obstruction), and with higher concentrations during summer (R2 = 0.56).
Conclusion Each pollutant examined displayed somewhat different spatial patterns within urban neighborhoods, and were differently related to local traffic and meteorology. Our results indicate a need for multi-pollutant exposure modeling to disentangle causal agents in epidemiological studies, and further investigation of site-specific and meteorological modification of the traffic-concentration relationship in urban neighborhoods.

Baxter Lisa K

2008-05-01

364

QUANTITATIVE STRUCTURE–PROPERTY RELATIONSHIP (QSPR) STUDY OF KOVATS RETENTION INDICES OF SOME ADAMANTANE DERIVATIVES BY THE GENETIC ALGORITHM AND MULTIPLE LINEAR REGRESSION (GA-MLR) METHOD  

Directory of Open Access Journals (Sweden)

Full Text Available A quantitative structure–property relationship (QSPR) study was performed to develop models that relate the structures of 65 adamantane derivatives to their Kovats retention indices (RI). Molecular descriptors were derived solely from the 3D structures of the compounds. A genetic algorithm was applied as a variable selection tool in the QSPR analysis. The models were constructed using 52 molecules as the training set, and predictive ability was tested using 13 compounds. The RI of the adamantane derivatives was modeled as a function of the theoretically derived descriptors by multiple linear regression (MLR). The usefulness of quantum chemical descriptors, calculated at the DFT level using the 6-311+G** basis set, for the QSAR study of adamantane derivatives was examined. The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows the estimation of RI for molecules not yet synthesized. Application of the developed model to a test set of 13 organic drug compounds demonstrates that the model is reliable, with good predictive accuracy and a simple formulation. The predictions are in good agreement with the experimental values. A multi-parametric equation containing at most four descriptors at the B3LYP/6-31+G** level with good statistical qualities (R2train=0.913, Ftrain=97.67, R2test=0.770, Ftest=3.21, Q2LOO=0.895, R2adj=0.904, Q2LGO=0.844) was obtained by multiple linear regression using the stepwise method.

Z. Bayat

2011-05-01

365

Stepwise Multiple Linear Regression.  

Science.gov (United States)

This is a statistical technique for analyzing a relationship between a dependent variable (y) and a set of independent variables (x1,x2,...,xm) and for selecting the independent variables in the order of their importance. ...Software Description: The prog...

1974-01-01
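
The selection of variables "in the order of their importance" described in the record above can be sketched as forward stepwise selection. This is a minimal illustration only, not the program the record refers to: the function names, the F-to-enter threshold of 4.0 and the pure-Python solver are all assumptions made for the example.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def rss(columns, y):
    """Residual sum of squares of an ordinary least-squares fit (with intercept)."""
    n = len(y)
    design = [[1.0] * n] + columns
    A = [[sum(a * b for a, b in zip(ci, cj)) for cj in design] for ci in design]
    rhs = [sum(a * b for a, b in zip(ci, y)) for ci in design]
    beta = solve(A, rhs)
    fitted = [sum(bk * col[i] for bk, col in zip(beta, design)) for i in range(n)]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def forward_stepwise(X, y, f_to_enter=4.0):
    """Return indices of the columns of X in the order they enter the model.

    X is a list of predictor columns. At each step the candidate with the
    largest partial F statistic is added, provided F >= f_to_enter
    (the threshold is an assumed, conventional value).
    """
    n = len(y)
    selected, remaining = [], list(range(len(X)))
    current = rss([], y)
    while remaining:
        best = None
        for j in remaining:
            df = n - len(selected) - 2  # residual degrees of freedom after adding j
            if df <= 0:
                continue
            trial = rss([X[k] for k in selected + [j]], y)
            f = (current - trial) / (trial / df) if trial > 1e-12 else float("inf")
            if best is None or f > best[0]:
                best = (f, j, trial)
        if best is None or best[0] < f_to_enter:
            break
        selected.append(best[1])
        remaining.remove(best[1])
        current = best[2]
    return selected
```

Calling `forward_stepwise(X, y)` with `X` given as a list of predictor columns returns the indices of the retained predictors, most important first; a full stepwise procedure would also re-test already entered variables for removal, which this sketch omits.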

366

Partially linear censored quantile regression  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Censored regression quantile (CRQ) methods provide a powerful and flexible approach to the analysis of censored survival data when standard linear models are felt to be appropriate. In many cases however, greater flexibility is desired to go beyond the usual multiple regression paradigm. One area of common interest is that of partially linear models: one (or more) of the explanatory covariates are assumed to act on the response through a non-linear function. Here the CRQ approach of Portnoy (...

2009-01-01

367

Parent Progeny regression analysis in F2 and F3 generations of rice  

Directory of Open Access Journals (Sweden)

Full Text Available Parent progeny regression analysis involving the F2 and F3 generations of two crosses in rice was undertaken to estimate the genetic potential transferred from one generation to the next by adopting three levels of selection for single plant yield. Significant positive correlation and regression were observed in both crosses at the positive level of selection (mean + 1 SD) between the F3 mean and the corresponding F2 values, indicating that selection for single plant yield at this level would be effective in both crosses. This indicates the chances of selecting high-yielding genotypes in early generations.

Anilkumar, C. Vanniarajan and J. Ramalingam

2011-12-01
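
The correlation and regression between F3 family means and the corresponding F2 plant values mentioned in the record above amount to a simple linear regression of progeny on parent. The sketch below is illustrative only; the function name and any yield figures passed to it are hypothetical and do not come from the study.

```python
import math


def parent_progeny_regression(parent, progeny_mean):
    """Regress progeny family means (e.g. F3) on parental values (e.g. F2).

    Returns (b, r): the regression slope of progeny on parent and the
    Pearson correlation coefficient between the two series.
    """
    n = len(parent)
    mean_p = sum(parent) / n
    mean_o = sum(progeny_mean) / n
    s_pp = sum((p - mean_p) ** 2 for p in parent)
    s_oo = sum((o - mean_o) ** 2 for o in progeny_mean)
    s_po = sum((p - mean_p) * (o - mean_o) for p, o in zip(parent, progeny_mean))
    b = s_po / s_pp                     # slope: change in progeny mean per unit of parent value
    r = s_po / math.sqrt(s_pp * s_oo)   # strength of the linear association
    return b, r
```

A slope close to 1 would suggest that most of the parental superiority is passed on to the next generation, while a slope near 0 would suggest little is; this reading is a general interpretation and not a result reported in the record.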

368

Fast algorithm of the robust Gaussian regression filter for areal surface analysis  

International Nuclear Information System (INIS)

In this paper, the general model of the Gaussian regression filter for areal surface analysis is explored. The intrinsic relationships between the linear Gaussian filter and the robust filter are addressed. A general mathematical solution for this model is presented. Based on this technique, a fast algorithm is created. Both simulated and practical engineering data (stochastic and structured) have been used in the testing of the fast algorithm. Results show that with the same accuracy, the processing time of the second-order nonlinear regression filters for a dataset of 1024*1024 points has been reduced to several seconds from the several hours of traditional algorithms

2010-05-01

369

Statistical methods in regression and calibration analysis of chromosome aberration data  

International Nuclear Information System (INIS)

The method of iteratively reweighted least squares for the regression analysis of Poisson distributed chromosome aberration data is reviewed in the context of other fit procedures used in the cytogenetic literature. As an application of the resulting regression curves, methods for calculating confidence intervals on dose from aberration yield are described and compared, and for the linear quadratic model a confidence interval is given. Emphasis is placed on the rational interpretation and the limitations of various methods from a statistical point of view. (orig./MG)

1983-02-01
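
Iteratively reweighted least squares for Poisson-distributed aberration counts, as reviewed in the record above, can be sketched for a linear quadratic yield curve. This is a minimal sketch under stated assumptions: an identity link, a yield per cell of c + alpha*D + beta*D^2, and hypothetical function names; the record does not specify the exact scheme that was reviewed.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def irls_linear_quadratic(doses, counts, cells, n_iter=25):
    """Fit yield = c + alpha*D + beta*D**2 to Poisson counts by IRLS.

    doses  : dose at each dose point
    counts : number of aberrations scored at each dose point
    cells  : number of cells scored at each dose point
    Returns [c, alpha, beta].
    """
    X = [[1.0, d, d * d] for d in doses]
    y = [k / n for k, n in zip(counts, cells)]   # observed aberrations per cell
    mu = [max(v, 1e-6) for v in y]               # start from the observed yields
    beta = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        # For Poisson counts Var(y_i) = mu_i / n_i, so the weight is n_i / mu_i.
        w = [n / m for n, m in zip(cells, mu)]
        A = [[sum(wi * xi[r] * xi[c] for wi, xi in zip(w, X)) for c in range(3)]
             for r in range(3)]
        rhs = [sum(wi * xi[r] * yi for wi, xi, yi in zip(w, X, y)) for r in range(3)]
        beta = solve(A, rhs)
        # Update the fitted yields, keeping them positive so the weights stay finite.
        mu = [max(sum(bj * xj for bj, xj in zip(beta, xi)), 1e-6) for xi in X]
    return beta
```

With the identity link the working response equals the observed yield, so each iteration is an ordinary weighted least-squares fit in which only the weights change.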

370

Spatial analysis of Snow Water Equivalent (SWE) using Multi Weighted Linear Regression (MWLR) in Czech Republic  

Science.gov (United States)

The poster describes a new method called Multi Weighted Linear Regression (MWLR), which we want to use for spatial analysis of Snow Water Equivalent (SWE) in the Czech Republic. MWLR is based on localized linear regression (SWE and sea level) between neighbouring meteorological stations. The neighbouring stations are selected according to terrain parameters such as altitude, aspect, and vertical and horizontal distance from the target grid point. The quality of the MWLR estimates was tested by an objective cross-validation method with about 50 control points. MWLR was compared by cross-validation with standard interpolation methods such as IDW, Spline and Kriging. MWLR is still under development, and the poster describes its current status and current results.

Striz, M.

2009-09-01

371

Automatic regression analysis for use in a complex system of evaluation of plant genetic resources  

Directory of Open Access Journals (Sweden)

Full Text Available In accordance with the general requirements regarding computerization in gene banks and germplasm research, a computer program has been compiled for the analysis of univariate response in crop germplasm evaluation. The program is written in COBOL and runs on a FELIX C-256 computer. The different modules of the program allow for: (1) data control and error listing; (2) computation of the regression function; (3) listing of the differences between the measured and computed values; (4) sorting of the individual samples; (5) construction of two-dimensional scattergrams of the measured values with simultaneous representation of the regression line; (6) listing of the examined samples in the sequence required in evaluation.

Attila T. SZABO

1984-08-01

372

Comparison of Neural Networks Prediction and Regression Analysis (MLR and PCR) in Modelling Nonlinear System  

Directory of Open Access Journals (Sweden)

Full Text Available Different methods for modelling nonlinear systems are investigated in this paper. Neural network (NN) techniques, multiple linear regression (MLR) and principal component regression (PCR) are applied to two nonlinear systems: a sine function and a distillation column. To compare these three distinct methods, all the data are taken from simulation and then separated into training, testing and validation sets. Among these approaches, the NN approach based on the nonlinear prediction technique performs very well in both case studies. It is also shown that the MLR model suffers from glitches due to the collinearity of the input variables, whereas the PCR model gives good results in the prediction output. In conclusion, the NN methods give consistent results, with the lowest sum of squared errors (SSE) on the unseen data compared with the other two techniques.

Zainal Ahmad

2007-10-01

373

Assessment of neural network, frequency ratio and regression models for landslide susceptibility analysis  

Science.gov (United States)

This paper presents the assessment results of three spatially based probabilistic models using Geoinformation Techniques (GIT) for landslide susceptibility analysis at Penang Island in Malaysia. Landslide locations within the study area were identified by interpreting aerial photographs and satellite images, supported by field surveys. Maps of the topography, soil type, lineaments and land cover were constructed from the spatial data sets. Nine landslide-related factors were extracted from the spatial database, and the neural network, frequency ratio and logistic regression coefficients of each factor were computed. Landslide susceptibility maps were drawn for the study area using the neural network, frequency ratio and logistic regression models. For verification, the results of the analyses were compared with actual landslide locations in the study area. The verification results show that the frequency ratio model provides higher prediction accuracy than the ANN and regression models.

Pradhan, B.; Buchroithner, M. F.; Mansor, S.

2009-04-01

374

Electricity Consumption Analysis Using Spline Regression Models: The Case of a Turkish Province  

Directory of Open Access Journals (Sweden)

Full Text Available Energy is one of the indispensable elements of human life, and electrical energy is the most frequently used energy type. As this type of energy cannot be stored at present, it has to be consumed instantly. In other words, consumer demand has to be met immediately. This paper models the electricity consumption of Erzurum province in 2011 by spline regression and tests whether this consumption shows statistically significant seasonal variation. The one-year data set was obtained from the Turkish Electricity Transmission Company Provincial Directorate of Erzurum and was analyzed by means of continuous piecewise polynomial spline regressions. The analysis determined three knots and fitted linear, quadratic and cubic spline regression models.

Omer Alkan

2013-05-01
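
A spline regression with three knots, as in the record above, can be sketched with a truncated power basis fitted by least squares. This is an assumed formulation: the record does not say which basis or knot positions were used, and the function names here are hypothetical.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def spline_basis(x, knots, degree=1):
    """Truncated power basis: 1, x, ..., x**degree, then (x - k)_+**degree per knot."""
    row = [float(x) ** p for p in range(degree + 1)]
    row += [max(x - k, 0.0) ** degree for k in knots]
    return row


def fit_spline(xs, ys, knots, degree=1):
    """Least-squares coefficients of a continuous spline of the given degree."""
    X = [spline_basis(x, knots, degree) for x in xs]
    p = len(X[0])
    A = [[sum(row[r] * row[c] for row in X) for c in range(p)] for r in range(p)]
    rhs = [sum(row[r] * y for row, y in zip(X, ys)) for r in range(p)]
    return solve(A, rhs)


def predict_spline(coef, x, knots, degree=1):
    """Evaluate the fitted spline at x."""
    return sum(c * b for c, b in zip(coef, spline_basis(x, knots, degree)))
```

Setting `degree` to 1, 2 or 3 gives the linear, quadratic and cubic variants; each knot coefficient is the change in the leading-order term at that knot, so a coefficient near zero suggests the knot is not needed.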

375

Multiple scattering in neutron polarization analysis  

International Nuclear Information System (INIS)

An analytical method of correcting measured neutron polarization analysis spectra for multiple scattering effects is presented. The multiple scattering corrections are applicable to any general instrumental and specimen geometries, and have been evaluated within the elastic quasi-isotropic scattering approximation. The results are appropriate for specimens where the total attenuation of the incident neutron beam is small and where the multiple scattering is no more than about 20% of the total scattered intensity. Specific results have been calculated for the geometry of the LONGPOL polarization analysis instrument at the AAEC Research Establishment, and the theoretical predictions have been tested by examination of the neutron spin-dependent cross sections of a vanadium standard specimen measured using LONGPOL. Finally, the use of the analytical multiple scattering corrections is illustrated by reference to some recent measurements of spin-dependent cross sections of an MnCu alloy obtained on LONGPOL

1981-01-01

376

Informal Housing in Greece: A Multinomial Logistic Regression Analysis at the Regional Level  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper deals with the primary causes of informal housing in Greece as well as the observed differentiations in informal housing patterns across space. The spatial level of analysis is the prefectural administrative level. The results of the multinomial logistic regression analysis indicate that Greek prefectures differ in the way they experience the informal housing phenomenon. An explanation for the observed differences may be the separate development paths followed and the ...

Polyzos, Serafeim; Minetos, Dionysios

2014-01-01

377

Ranking contributing areas of salt and selenium in the Lower Gunnison River Basin, Colorado, using multiple linear regression models  

Science.gov (United States)

Mitigating the effects of salt and selenium on water quality in the Grand Valley and lower Gunnison River Basin in western Colorado is a major concern for land managers. Previous modeling indicated means to improve the models by including more detailed geospatial data and a more rigorous method for developing the models. After evaluating all possible combinations of geospatial variables, four multiple linear regression models resulted that could estimate irrigation-season salt yield, nonirrigation-season salt yield, irrigation-season selenium yield, and nonirrigation-season selenium yield. The adjusted r-squared and the residual standard error (in units of log-transformed yield) of the models were, respectively, 0.87 and 2.03 for the irrigation-season salt model, 0.90 and 1.25 for the nonirrigation-season salt model, 0.85 and 2.94 for the irrigation-season selenium model, and 0.93 and 1.75 for the nonirrigation-season selenium model. The four models were used to estimate yields and loads from contributing areas corresponding to 12-digit hydrologic unit codes in the lower Gunnison River Basin study area. Each of the 175 contributing areas was ranked according to its estimated mean seasonal yield of salt and selenium.

Linard, Joshua I.

2013-01-01

378

Association of perceived stress and stiff neck/shoulder with health status: multiple regression models by gender.  

Science.gov (United States)

It is well known that psychological stress affects health status. Stiff neck and shoulder, in a broad sense, is one of the major somatic complaints among Japanese. The objective was to determine how much perceived stress and stiff neck/shoulder are associated with health-related quality of life (HRQoL) by gender. Participants (n = 512) completed the Japanese version of the Perceived Stress Scale, the SF-8 Japanese version and original questions on perceived stiff neck/shoulder. Muscle hardness around the shoulder was also measured with a muscle tension meter. The multiple regression model for the men demonstrated that perceived stress was associated not only with the mental component summary (MCS) (beta = -0.494) but also with the physical component summary (PCS) (beta = -0.319) of the SF-8. In the model for the women, perceived stress was associated with MCS (beta = -0.632) more strongly than in the men, whereas stiff neck/shoulder and age group (beta = -0.231 and -0.268, respectively), but not stress, were related to PCS. Subjective neck/shoulder stiffness was hardly correlated with objective shoulder muscle hardness. This study revealed the associations between perceived stress, stiff neck/shoulder and HRQoL, and their differences by gender. The hypothesized gender differences are discussed with a focus on the kind of stressors, perception of stress, admission of negative symptoms and cause of stiff neck/shoulder. PMID:17274540

Kimura, Tomoaki; Tsuda, Yasutami; Uchida, Seiya; Eboshida, Akira

2006-12-01

379

Linear regression analysis of the gamma dose in fast neutron beams  

International Nuclear Information System (INIS)

The dual dosimeter technique for determining the absorbed doses of both neutrons and photons in a mixed field has been applied to the use of multiple dosimeters. The data were analyzed by a linear regression method, which yields the neutron dose from the slope and the photon dose from the intercept; an estimate of the uncertainty of the photon dose can also be obtained. Measurements were made on a high energy neutron beam, and the photon dose was obtained as a function of both field size and depth in a tissue equivalent phantom.

1980-01-01
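
The slope-and-intercept reading of the regression in the record above can be sketched as follows. The sketch assumes, as an illustration only, that each dosimeter has unit photon sensitivity and a known relative neutron sensitivity k, so that its response is R = D_gamma + k * D_n; the function name and this response model are assumptions, not details given in the record.

```python
import math


def fit_mixed_field(k, response):
    """Straight-line fit of dosimeter response against relative neutron sensitivity.

    Under the assumed model R = D_gamma + k * D_n the slope estimates the
    neutron dose and the intercept estimates the photon dose.
    Returns (neutron_dose, photon_dose, photon_dose_standard_error).
    """
    n = len(k)
    mean_k = sum(k) / n
    mean_r = sum(response) / n
    s_kk = sum((ki - mean_k) ** 2 for ki in k)
    s_kr = sum((ki - mean_k) * (ri - mean_r) for ki, ri in zip(k, response))
    slope = s_kr / s_kk
    intercept = mean_r - slope * mean_k
    residuals = [ri - (intercept + slope * ki) for ki, ri in zip(k, response)]
    s2 = sum(e * e for e in residuals) / (n - 2)   # residual variance, needs n > 2
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_k ** 2 / s_kk))
    return slope, intercept, se_intercept
```

Using more than two dosimeters is what makes the uncertainty estimate possible: with exactly two points the line fits perfectly and no residual variance is left to estimate.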

380

Analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract. Background: Recently there has been an explosion in the availability of bacterial genomic sequences, making possible an analysis of genomic signatures across more than 800 different bacterial chromosomes from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable, with AT content, phyla, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors. Results: Many significant factors were associated with the genomic signature, most notably AT content. Phylum was also an important factor, although considerably less so than AT content. Small but significant improvements to the regression model were also obtained from factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. Conclusion: The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.

Skjerve Eystein

2009-10-01

 
 
 
 
381

Analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering  

DEFF Research Database (Denmark)

Recently there has been an explosion in the availability of bacterial genomic sequences, making possible an analysis of genomic signatures across more than 800 different bacterial chromosomes from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable, with AT content, phyla, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors. Many significant factors were associated with the genomic signature, most notably AT content. Phylum was also an important factor, although considerably less so than AT content. Small but significant improvements to the regression model were also obtained from factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.

Ussery, David; Bohlin, Jon

2009-01-01

382

Application of Binary Regression Analysis in the Prescription Pattern of Antidepressants  

Directory of Open Access Journals (Sweden)

Full Text Available Background: In Nepal, many research studies report results using percentages or cross-tabulation, while the use of logistic regression methodology lags behind among researchers. Objectives: The main objective of this study was to examine the role of logistic regression analysis in describing the prescription pattern of antidepressants among hospitalized patients in a tertiary care center in Western Nepal. Methods: A hospital-based study was conducted between 1 October 2009 and 31 March 2010 at the Psychiatry Ward of Manipal Teaching Hospital, Nepal. The Z test, chi-square test and binary logistic regression were used for the analysis. We calculated odds ratios (OR) and their 95% confidence intervals (95% CI) P-value 10000, 2.63 times more in Hindus and 1.197 times more in Brahmins than any other ethnic groups. The tendency to prescribe antidepressants by trade name was 9.179 times greater for unemployed patients than for employed patients in Nepal. Conclusion: Binary logistic regression plays an important role in understanding the drug utilization pattern of mood elevators in Western Nepal.

Dr.Indrajit Banerjee, MBBS, MD

2013-05-01

383

Classificação da composição iônica da água de irrigação usando regressão linear múltipla / Classification of the ionic composition of the irrigation water using multiple linear regression  

Scientific Electronic Library Online (English)

Full Text Available SciELO Brazil | Language: Portuguese Abstract in English: This work aimed to develop a methodology for classifying the ionic composition of irrigation water using multiple linear regression, with electrical conductivity as the dependent variable and the ion concentrations (calcium, sodium, potassium, carbonate, bicarbonate and chloride) as the independent variables in all tested models; the water is classified according to the weight of each ion in the statistical model. All water samples were collected by farmers of the region where the work was conducted. The regression models were fitted to the water analysis database of the Laboratório de Análise de Água e Fertilidade do Solo da Escola Superior de Agricultura de Mossoró (LAAFS/ESAM), a secondary data source, using the stepwise regression procedure. The results showed that the contribution of each variable to the fitted model varied, as estimated from the increase in the regression sum of squares when each independent variable was added to the model, and that the degree of fit depends on the geological formation of the watershed and on whether the water is collected from a river or from a tubular well. According to pre-established criteria, water from the calcareous region of the Chapada do Apodi was classified as calcic-sodic, calcic or chloride when it came from a tubular well, a piezometric well (drilled in unconfined groundwater and known in the region as poço amazonas) or surface water (rivers and lagoons), respectively. In the Baixo Açu region, the waters were classified as sodic, magnesian-sodic or sodic for tubular wells (drilled in the Açu sedimentary geological formation), piezometric wells or surface water, respectively.

Maia, Celsemy E.; Morais, Elís R.C. de; Oliveira, Maurício de.

384

Classificação da composição iônica da água de irrigação usando regressão linear múltipla Classification of the ionic composition of the irrigation water using multiple linear regression  

Directory of Open Access Journals (Sweden)

Full Text Available This work aimed to develop a methodology for classifying the ionic composition of irrigation water using multiple linear regression, with electrical conductivity as the dependent variable and the ion concentrations (calcium, sodium, potassium, carbonate, bicarbonate and chloride) as the independent variables in all tested models; the water is classified according to the weight of each ion in the statistical model. All water samples were collected by farmers of the region where the work was conducted. The regression models were fitted to the water analysis database of the Laboratório de Análise de Água e Fertilidade do Solo da Escola Superior de Agricultura de Mossoró (LAAFS/ESAM), a secondary data source, using the stepwise regression procedure. The results showed that the contribution of each variable to the fitted model varied, as estimated from the increase in the regression sum of squares when each independent variable was added to the model, and that the degree of fit depends on the geological formation of the watershed and on whether the water is collected from a river or from a tubular well. According to pre-established criteria, water from the calcareous region of the Chapada do Apodi was classified as calcic-sodic, calcic or chloride when it came from a tubular well, a piezometric well (drilled in unconfined groundwater and known in the region as poço amazonas) or surface water (rivers and lagoons), respectively. In the Baixo Açu region, the waters were classified as sodic, magnesian-sodic or sodic for tubular wells (drilled in the Açu sedimentary geological formation), piezometric wells or surface water, respectively.

Celsemy E. Maia

2001-04-01

385

Transferencia de información hidrológica mendiante regresión lineal múltiple, con selección óptima de regresores / Transference of hydrologic information through multiple linear regression, with best predictor variables selection  

Scientific Electronic Library Online (English)

Full Text Available SciELO Mexico | Language: Spanish Abstract in English: Long records of annual hydrological data are needed to obtain a truer picture of their variability and reliable estimates of their statistical properties. To obtain such records it is common to turn to additional data sources and transfer techniques. One such technique is multiple linear regression, whose numerical application implies the optimal selection of nearby long records (regressors) so that the extension of the short record is a reliable estimate. This selection process involves three analyses: 1) how to define the best estimates, 2) which regression equations to investigate, and 3) which model has the best predictive ability. For the first analysis, four criteria based on the residual sums of squares are presented; for the second, all possible regressions are investigated, because problems of hydrological information transfer involve at most five regressors; for the third, selecting the best predictive model, residual analysis and cross-validation are used. The numerical application described is an extension of the record of annual runoff volumes at the Platón Sánchez hydrometric station of the Tempoal river system, in Hydrological Region No. 26 (Pánuco, México). In this case four regressors are used, namely the records of the remaining gauging stations of that system. It is concluded that, even in problems with multicollinearity, the selection criteria and the analyses presented lead to consistent results and yield the best regression equations. The similarity of the results obtained with the selected regression models generates confidence in the adopted estimates.

Campos-Aranda, Daniel F..
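
The combination of "all possible regressions" with cross-validation described in the record above can be sketched as an exhaustive search scored by leave-one-out prediction error (PRESS). This is an illustrative sketch only: the record mentions four residual-based criteria without naming them, so the use of PRESS as the single score and the function names are assumptions.

```python
from itertools import combinations


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] for the given columns."""
    design = [[1.0] * len(y)] + columns
    A = [[sum(a * b for a, b in zip(ci, cj)) for cj in design] for ci in design]
    rhs = [sum(a * b for a, b in zip(ci, y)) for ci in design]
    return solve(A, rhs)


def press(columns, y):
    """Leave-one-out sum of squared prediction errors for one regression equation."""
    n = len(y)
    total = 0.0
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        beta = ols([[c[j] for j in keep] for c in columns], [y[j] for j in keep])
        pred = beta[0] + sum(b * c[i] for b, c in zip(beta[1:], columns))
        total += (y[i] - pred) ** 2
    return total


def best_subset(X, y):
    """Try every non-empty subset of regressors; return (PRESS, subset) with lowest PRESS."""
    scored = []
    for size in range(1, len(X) + 1):
        for subset in combinations(range(len(X)), size):
            scored.append((press([X[j] for j in subset], y), subset))
    return min(scored)
```

With at most five regressors, as the record notes, there are only 31 candidate equations, so the exhaustive search stays cheap even with a refit for every left-out year.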

386

Analysis of radial velocity variations in multiple planetary systems  

CERN Multimedia

The study of multiple extrasolar planetary systems offers the opportunity to constrain planetary masses and orbital inclinations through the detection of mutual perturbations. The analysis of precise radial velocity measurements might reveal these planet-planet interactions and yield a more accurate view of such planetary systems. As in generic data modelling problems, a fit to radial velocity data series has a set of unknown parameters whose parametric derivatives are needed both by the regression methods and by the uncertainty estimates. In this paper an algorithm is described that aids the computation of such derivatives when planetary perturbations are not neglected. The application of the algorithm is demonstrated on the planetary systems of HD 73526, HD 128311 and HD 155358. In addition to the functions related to radial velocity analysis, the actual implementation of the algorithm contains functions that compute spatial coordinates, velocities and barycent...

Pál, András

2010-01-01

387

Multiple-scale analysis of quantum systems  

International Nuclear Information System (INIS)

Conventional weak-coupling Rayleigh-Schroedinger perturbation theory suffers from problems that arise from resonant coupling of successive orders in the perturbation series. Multiple-scale analysis, a powerful and sophisticated perturbative method that quantitatively analyzes characteristic physical behaviors occurring on various length or time scales, avoids such problems by implicitly performing an infinite resummation of the conventional perturbation series. Multiple-scale perturbation theory provides a good description of the classical anharmonic oscillator. Here, it is extended to study (1) the Heisenberg operator equations of motion and (2) the Schroedinger equation for the quantum anharmonic oscillator. In the former case, it leads to a system of coupled operator differential equations, which is solved exactly. The solution provides an operator mass renormalization of the theory. In the latter case, multiple-scale analysis elucidates the connection between weak-coupling perturbative and semiclassical nonperturbative aspects of the wave function. copyright 1996 The American Physical Society

1996-12-01

388

Multiple-scale analysis of quantum systems  

CERN Document Server

Conventional weak-coupling Rayleigh-Schrödinger perturbation theory suffers from problems that arise from resonant coupling of successive orders in the perturbation series. Multiple-scale analysis, a powerful and sophisticated perturbative method that quantitatively analyzes characteristic physical behaviors occurring on various length or time scales, avoids such problems by implicitly performing an infinite resummation of the conventional perturbation series. Multiple-scale perturbation theory provides a good description of the classical anharmonic oscillator. Here, it is extended to study (1) the Heisenberg operator equations of motion and (2) the Schrödinger equation for the quantum anharmonic oscillator. In the former case, it leads to a system of coupled operator differential equations, which is solved exactly. The solution provides an operator mass renormalization of the theory. In the latter case, multiple-scale analysis elucidates the connection between weak-coupling perturbative and semiclassical...

Bender, C M; Bender, Carl M; Bettencourt, Luis M A

1996-01-01

389

Multiple linear regression to develop strength scaled equations for knee and elbow joints based on age, gender and segment mass  

DEFF Research Database (Denmark)

Background: The next fifty years will see a drastic increase in the older population. Among other effects, ageing causes a decrease in strength. It is necessary to provide safe and comfortable environments for the elderly. To achieve this, digital human modelling has proved to be a useful and valuable ergonomic tool. Objective: To investigate age and gender effects on the torque-producing ability of the knee and elbow in older adults; to create strength scaled equations based on age, gender, upper/lower limb lengths and masses using multiple linear regression; and to reduce the number of dependent parameters based on statistical redundancies and then validate these equations. Methods: 283 subjects (141 males, 142 females) aged 50-59 years (54.9 +/- 2.9), 60-69 years (65.4 +/- 2.9) and 70-79 years (73.7 +/- 2.7) were tested for maximal voluntary isometric torque of the right knee extensors and elbow flexors. Results: Males were significantly stronger than females across all age groups. Elbow peak torque (EPT) was better preserved from the 60s to the 70s, whereas knee peak torque (KPT) reduced significantly (P<0.05) across all age groups. This held true for males and females. Gender, thigh mass and age best predicted KPT (R2=0.60). Gender, forearm mass and age best predicted EPT (R2=0.75). Good cross-validation was established for both the elbow and knee models. Conclusion: This cross-sectional study of muscle strength created and validated strength scaled equations of EPT and KPT using only gender, segment mass and age.

D'Souza, Sonia; Rasmussen, John

2012-01-01

390

A Combinatorial Protocol in Multiple Linear Regression to Model Gas Chromatographic Response Factor of Organophosphonate Esters  

Directory of Open Access Journals (Sweden)

Full Text Available Organophosphorus compounds are a well-known class of toxic chemicals that find their way into the ecosystem because of their widespread use. Their detection, identification and quantification are a cause of concern the world over. In environmental samples these compounds are detected and estimated through the gas chromatographic response factor. This prompted us to study the quantitative structure-response relationships (QSRR) of the gas chromatographic response factor of organophosphonate esters. In this study, attempts have been made to rationalize the gas chromatographic response factor of twenty-eight organophosphonates in terms of their physicochemical and electronic descriptors. Combinatorial Protocol in Multiple Linear Regression (CP-MLR), a 'filter' based variable selection procedure for model development in structure-activity or property relationship studies, has been used for variable selection and for the identification of diverse QSRR models of the GC response factor of organophosphonates. The study has resulted in the identification of ten models (equations), each with two or three descriptors, that account for the response factor of organophosphonates (cross-validated R2, or Q2, of 0.88 to 0.95). The response factor of the compounds is strongly correlated with the total refractivity (TREF), molecular weight (MW) and thermodynamic properties, e.g., enthalpy of vaporization (ENTH). In the study, the alkyl groups of these compounds have shown a two-fold influence (namely, steric and branching effects) on the response factor. The study also suggests that the polarization of the (d-p)π bond of P=O in these compounds plays a critical role in the formation of the responding species. The steric and electronic properties of organophosphonates play a determining role in the predictive aspect of their gas chromatographic response factor. The study also suggests a mechanism for the formation of the responding species.

Yenamandra S. Prabhakar

2004-03-01

391

Analysis of the Evolution of the Gross Domestic Product by Means of Cyclic Regressions  

Directory of Open Access Journals (Sweden)

Full Text Available In this article, we analyse the regularity of the Gross Domestic Product of a country, in our case the United States. The analysis is based on a new method, cyclic regression, which builds on the Fourier series of a function. A further point of view is to consider, instead of the growth rate of GDP, the speed of variation of this rate, computed as a numerical derivative. The results show a cycle of 71 years for this indicator, with a mean square error of 0.93%. The method described allows a prognosis of short-term trends in GDP.

Catalin Angelo Ioan

2011-08-01
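A regression on Fourier terms, as described in this record, can be sketched as a least-squares fit of a constant plus cosine and sine terms of a trial period, with the period chosen to minimize the mean squared error. This is only an illustrative reading of the abstract: the function names, the number of harmonics and the period-selection rule are assumptions, not the author's published procedure.

```python
import math

def design_row(t, period, harmonics):
    """Basis values 1, cos(2*pi*k*t/T), sin(2*pi*k*t/T) for k = 1..harmonics."""
    row = [1.0]
    for k in range(1, harmonics + 1):
        angle = 2.0 * math.pi * k * t / period
        row += [math.cos(angle), math.sin(angle)]
    return row

def least_squares(X, y):
    """Solve the normal equations X'X b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    m = [[sum(r[j] * r[k] for r in X) for k in range(p)]
         + [sum(r[j] * yi for r, yi in zip(X, y))] for j in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(p):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][p] / m[i][i] for i in range(p)]

def fit_cycle(y, period, harmonics=1):
    """Return (coefficients, mean squared error) of a truncated Fourier fit."""
    X = [design_row(t, period, harmonics) for t in range(len(y))]
    beta = least_squares(X, y)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mse = sum((a - f) ** 2 for a, f in zip(y, fitted)) / len(y)
    return beta, mse

def best_period(y, candidates, harmonics=1):
    """Pick the candidate period with the smallest mean squared error."""
    return min(candidates, key=lambda T: fit_cycle(y, T, harmonics)[1])
```

Here `y` is taken to be an equally spaced series indexed by t = 0, 1, 2, ... (for example, yearly values). Very short trial periods, for which the sine terms vanish at every sample, make the normal equations singular and should be excluded from `candidates`.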

392

A logistic normal multinomial regression model for microbiome compositional data analysis.  

Science.gov (United States)

Changes in human microbiome are associated with many human diseases. Next generation sequencing technologies make it possible to quantify the microbial composition without the need for laboratory cultivation. One important problem of microbiome data analysis is to identify the environmental/biological covariates that are associated with different bacterial taxa. Taxa count data in microbiome studies are often over-dispersed and include many zeros. To account for such an over-dispersion, we propose to use an additive logistic normal multinomial regression model to associate the covariates to bacterial composition. The model can naturally account for sampling variabilities and zero observations and also allow for a flexible covariance structure among the bacterial taxa. In order to select the relevant covariates and to estimate the corresponding regression coefficients, we propose a group ℓ1 penalized likelihood estimation method for variable selection and estimation. We develop a Monte Carlo expectation-maximization algorithm to implement the penalized likelihood estimation. Our simulation results show that the proposed method outperforms the group ℓ1 penalized multinomial logistic regression and the Dirichlet multinomial regression models in variable selection. We demonstrate the methods using a data set that links human gut microbiome to micro-nutrients in order to identify the nutrients that are associated with the human gut microbiome enterotype. PMID:24128059

Xia, Fan; Chen, Jun; Fung, Wing Kam; Li, Hongzhe

2013-12-01

393

Mixed-effects Poisson regression analysis of adverse event reports: The relationship between antidepressants and suicide  

Digital Repository Infrastructure Vision for European Research (DRIVER)

A new statistical methodology is developed for the analysis of spontaneous adverse event (AE) reports from post-marketing drug surveillance data. The method involves both empirical Bayes (EB) and fully Bayes estimation of rate multipliers for each drug within a class of drugs, for a particular AE, based on a mixed-effects Poisson regression model. Both parametric and semiparametric models for the random-effect distribution are examined. The method is applied to data from Food and Drug Adminis...

2008-01-01

394

High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We consider the problem of high-dimensional regression under non-constant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows non-constant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a nov...

Daye, Z. John; Chen, Jinbo; Li, Hongzhe

2012-01-01

395

Fertility differences between married and cohabiting couples: a switching regression analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Little is known about why cohabiting couples have fewer children than married couples. We explore the factors that explain the difference in fertility between these two groups using a switching regression analysis, which enables us to quantify the contribution of different factors through a decomposition of the difference. We find that married couples have more children than cohabiting couples primarily because marriage provides stronger incentives for specialization in household production. ...

Zhang, Junfu; Song, Xue

2007-01-01

396

Quantile regression for the statistical analysis of immunological data with many non-detects  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Abstract. Background: Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Methods and results: ...

Hc, Eilers Paul; Röder Esther; Fj, Savelkoul Huub; van Wijk Roy

2012-01-01

397

Correlation Study and Regression Analysis of Drinking Water Quality in Kashan City, Iran  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Chemical and statistical regression analysis on drinking water samples at five fields (21 sampling wells) with hot and dry climate in Kashan city, central Iran was carried out. Samples were collected during October 2006 to May 2007 (25 - 30 °C). Comparing the results with drinking water quality standards issued by World Health Organization (WHO), it is found that some of the water samples are not potable. Hydrochemical facies using a Piper diagram indicate that in most parts of the city, the...

2013-01-01

398

Comparison of Artificial Neural Networks and Logistic Regression Analysis in the Credit Risk Prediction  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Credit scoring is a vital topic for banks, since limited financial resources need to be used more effectively. Banks use several credit scoring methods. One of them is to estimate whether a credit-demanding customer’s repayments will be regular or not. In this study, artificial neural networks and logistic regression analysis have been used to support the banks’ credit risk prediction and to estimate whether a credit demanding customers’ repa...

2012-01-01

399

A Logistic Regression Analysis of the Contractor's Awareness Regarding Waste Management  

Digital Repository Infrastructure Vision for European Research (DRIVER)

This study highlights a number of factors affecting contractors' awareness of construction waste management in the construction industry. The data in the present study are based on contractors registered with the Construction Industry Development Board of Malaysia. Binary logistic regression analysis is employed to explore the factors affecting this awareness. Contractor's awareness regarding waste management will tend to be significantly adequate with the increasing values in th...

Rawshan Ara Begum; Chamhuri Siwar; Joy Jacqueline Pereira; Abdul Hamid Jaafar

2006-01-01

400

Analysis of patients with complete histopathological tumour regression after neoadjuvant chemoradiotherapy for locally advanced rectal cancer  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Background: Our aim was to present the effect of the neoadjuvant chemoradiation therapy on the development of the complete histopathological tumour regression in patients with locally advanced rectal cancer and its influence on a five-year survival of these patients. Methods: In total, 223 patients were included in the analysis; 109 patients had the locally advanced rectal cancer; 75 patients received the neoadjuvant chemoradiation therapy, which was later followed by surgery; 34 patients wer...

Petrović Tomislav; Radovanović Zoran; Bokorov Bojana; Nikolić Ivan; Knežević-Ušaj Slavica; ?anković Milenko

2010-01-01

 
 
 
 
401

Ordinal Logistic Regression for the Estimate of the Response Functions in the Conjoint Analysis  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In the Conjoint Analysis (COA) model proposed here – a new approach to estimate more than one response function – an extension of the traditional COA, the polytomous response variable (i.e. evaluation of the overall desirability of alternative product profiles) is described by a sequence of binary variables. To link the categories of overall evaluation to the factor levels, we adopt – at the aggregate level – an ordinal logistic regression, based on a main effects experimental design. Th...

2011-01-01

402

LOGISTIC REGRESSION RESPONSE FUNCTIONS WITH MAIN AND INTERACTION EFFECTS IN THE CONJOINT ANALYSIS  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In the Conjoint Analysis (COA) model proposed here - an extension of the traditional COA - the polytomous response variable (i.e. evaluation of the overall desirability of alternative product profiles) is described by a sequence of binary variables. To link the categories of overall evaluation to the factor levels, we adopt - at the aggregate level - a multivariate logistic regression model, based on a main and two-factor interaction effects experimental design. The model provides several ove...

2011-01-01
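Both conjoint-analysis records above describe a polytomous response as a sequence of binary variables. One common way to do this, assumed here because the truncated abstracts do not specify the coding, is cumulative coding: each binary variable indicates whether the rating exceeds a given category. The function names below are hypothetical.

```python
def cumulative_binary(score, categories):
    """Encode an ordinal score as indicators [score > c] for each cut point c."""
    cuts = sorted(categories)[:-1]  # the highest category needs no indicator
    return [1 if score > c else 0 for c in cuts]

def exceedance_proportions(scores, categories):
    """Share of ratings above each cut point; non-increasing by construction."""
    coded = [cumulative_binary(s, categories) for s in scores]
    n_cuts = len(categories) - 1
    return [sum(row[j] for row in coded) / len(scores) for j in range(n_cuts)]
```

Each column of indicators can then serve as the binary response of its own logistic model, which is the sense in which a single ordinal rating yields more than one response function.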

403

Prediction of large esophageal varices in cirrhotic patients using classification and regression tree analysis  

Scientific Electronic Library Online (English)

Full Text Available SciELO Brazil | Language: English. Abstract in English: OBJECTIVES: Recent guidelines recommend that all cirrhotic patients should undergo endoscopic screening for esophageal varices. Identifying cirrhotic patients with esophageal varices by noninvasive predictors would allow endoscopy to be restricted to patients with a high risk of having varices. This study aimed to develop a decision model based on classification and regression tree analysis for the prediction of large esophageal varices in cirrhotic patients. METHODS: 309 cirrhotic patients (training sample, 187 patients; test sample, 122 patients) were included. Within the training sample, classification and regression tree analysis was used to identify predictors and a prediction model of large esophageal varices. The prediction model was then further evaluated in the test sample and in different Child-Pugh classes. RESULTS: The prevalence of large esophageal varices in cirrhotic patients was 50.8%. A tree model consisting of spleen width, portal vein diameter and prothrombin time, developed by classification and regression tree analysis, achieved a diagnostic accuracy of 84% for the prediction of large esophageal varices. When patients were regrouped into two groups, the rate of varices was 83.2% for the high-risk group and 15.2% for the low-risk group. The accuracy of the tree model was maintained in the test sample and in different Child-Pugh classes. CONCLUSIONS: A decision tree model that consists of spleen width, portal vein diameter and prothrombin time may be useful for the prediction of large esophageal varices in cirrhotic patients.

Wan-dong, Hong; Le-mei, Dong; Zen-cai, Jiang; Qi-huai, Zhu; Shu-Qing, Jin.

404

Prediction of large esophageal varices in cirrhotic patients using classification and regression tree analysis  

Directory of Open Access Journals (Sweden)

Full Text Available OBJECTIVES: Recent guidelines recommend that all cirrhotic patients should undergo endoscopic screening for esophageal varices. Identifying cirrhotic patients with esophageal varices by noninvasive predictors would allow endoscopy to be restricted to patients with a high risk of having varices. This study aimed to develop a decision model based on classification and regression tree analysis for the prediction of large esophageal varices in cirrhotic patients. METHODS: 309 cirrhotic patients (training sample, 187 patients; test sample, 122 patients) were included. Within the training sample, classification and regression tree analysis was used to identify predictors and a prediction model of large esophageal varices. The prediction model was then further evaluated in the test sample and in different Child-Pugh classes. RESULTS: The prevalence of large esophageal varices in cirrhotic patients was 50.8%. A tree model consisting of spleen width, portal vein diameter and prothrombin time, developed by classification and regression tree analysis, achieved a diagnostic accuracy of 84% for the prediction of large esophageal varices. When patients were regrouped into two groups, the rate of varices was 83.2% for hi