1

Multiple linear regression analysis

Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
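The stepwise idea described in this abstract can be illustrated with a minimal Python sketch. It is not the FORTRAN IV program: it only adds variables (forward selection), and it replaces the user-specified confidence level with a fixed F-to-enter threshold, which is an assumption made to keep the example free of statistical tables. All function and parameter names are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is non-singular (no collinear predictors).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def sse(columns, y):
    """Residual sum of squares for y regressed on an intercept plus columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def forward_stepwise(predictors, y, f_to_enter=4.0):
    """Return indices of the selected predictors, in order of entry.

    f_to_enter is an assumed fixed threshold on the partial F statistic.
    """
    selected = []
    current = sse([], y)  # intercept-only model
    while len(selected) < len(predictors):
        best = None
        for j in range(len(predictors)):
            if j in selected:
                continue
            dof = len(y) - (len(selected) + 2)  # n minus fitted parameters
            if dof <= 0:
                continue
            trial = sse([predictors[i] for i in selected + [j]], y)
            gain = current - trial
            if gain <= 0:
                f_stat = 0.0
            elif trial == 0:
                f_stat = float("inf")
            else:
                f_stat = gain / (trial / dof)
            if best is None or f_stat > best[0]:
                best = (f_stat, j, trial)
        if best is None or best[0] < f_to_enter:
            break  # no remaining variable is significant enough
        selected.append(best[1])
        current = best[2]
    return selected
```

A full stepwise procedure would also re-test variables already in the model and drop those that fall below an F-to-remove threshold; that step is omitted here.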

Edwards, T. R.

1980-01-01

2

Multiple Linear Regressions Analysis

This online calculator allows users to enter sixteen observations with up to four dependent variables and calculates the regression equation, the fitted values, R-Squared, the F-Statistic, mean, variance, first order serial-correlation, second order serial-correlation, the Durbin-Watson statistic, and the mean absolute errors. It also tests normality and gives the i-th residuals.
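Two of the residual statistics listed above, the serial correlation at a given lag and the Durbin-Watson statistic, can be sketched in a few lines of Python. This is an illustration using the textbook definitions, not the calculator's own code; the function names are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


def serial_correlation(residuals, lag=1):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    num = sum(centered[t] * centered[t - lag] for t in range(lag, n))
    den = sum(c * c for c in centered)
    return num / den
```

Values of the Durbin-Watson statistic near 2 suggest little first-order serial correlation; values toward 0 or 4 suggest positive or negative correlation, respectively.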

Arsham, Hossein

3

Importance of Diagnostics in Multiple Regression Analysis

The aim of this study was to obtain some valuable information from different diagnostics in Multiple Regression Analysis (MRA). Sample data set was composed of live weights at different periods (birth weight (X1), live weights in 30th (X2), 45th(X3), 60th (X4) and 75th (Y) days) of 18 Hamdani breed single-male lambs born in early March of 2001. According to results of MRA, although all independent variables including...

Eyduran, E.; Ozdemir, T.; Alarslan, E.

2005-01-01

4

Importance of Diagnostics in Multiple Regression Analysis

The aim of this study was to obtain some valuable information from different diagnostics in Multiple Regression Analysis (MRA). The sample data set was composed of live weights at different periods (birth weight (X1) and live weights at the 30th (X2), 45th (X3), 60th (X4) and 75th (Y) days) of 18 Hamdani breed single-male lambs born in early March of 2001. According to the results of MRA, although all independent variables included in the model explained approximately 92% of the variation in the dependent variable, Y, only the effect of independent variable X4 on Y was significant (p<0.01). With respect to residual analysis, the assumptions of normal distribution and homogeneity of error terms in MRA were met. As the Durbin-Watson statistic equaled 2.31, there was no serial correlation among the error terms; that is, the assumption that the error terms are independent of each other was satisfied. Considering the leverage and influence diagnostics calculated for the observations of the sample data set, only two observations (the 2nd and 16th), which were both outliers and potentially influential observations, should be carefully examined. It could be concluded that diagnostics are important statistics for researchers because they give an idea of whether the basic assumptions hold for the reliability of MRA, the data set and goodness of fit.

E. Eyduran

2005-01-01

5

The graduation thesis presents regression analysis with emphasis on linear regression and multiple regression. The first chapter describes basic concepts of the correlation analysis and analysis of variance, which are important for understanding the thesis. Further, a regression model is presented. The central part of the thesis consists of two chapters: linear regression and multiple regression. The first one describes the method of least squares, which is important for obtaining esti...

Korenjak, Andreja

2010-01-01

6

Multiple Correlation versus Multiple Regression.

Describes differences between multiple correlation analysis (MCA) and multiple regression analysis (MRA), showing how these approaches involve different research questions and study designs, different inferential approaches, different analysis strategies, and different reported information. (SLD)

Huberty, Carl J.

2003-01-01

7

MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM

This paper analyzes the relationship between GDP in the hotels and restaurants sector, as the dependent variable, and the following independent variables: overnight stays in tourist reception establishments, arrivals in tourist reception establishments and investments in the hotels and restaurants sector, over the period 1995-2007. With the multiple regression analysis I found that investments and tourist arrivals are significant predictors for the GDP dependent variable. Ba...

Kulcsár, Erika

2009-01-01

8

MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM

This paper analyzes the relationship between GDP in the hotels and restaurants sector, as the dependent variable, and the following independent variables: overnight stays in tourist reception establishments, arrivals in tourist reception establishments and investments in the hotels and restaurants sector, over the period 1995-2007. With the multiple regression analysis I found that investments and tourist arrivals are significant predictors for the GDP dependent variable. Based on these results, I identified those components of the marketing mix which, in my opinion, require investment and could contribute to the positive development of tourist arrivals in tourist reception establishments.

Erika KULCSÁR

2009-12-01

9

This page will perform basic multiple regression analysis for the case where there are several independent predictor variables, X1, X2, etc., and one dependent or criterion variable, Y. Requires import of data from a spreadsheet.

Lowry, Richard, 1940-

10

Applied multiple regression correlation analysis for the behavioral sciences

This classic text on multiple regression is noted for its nonmathematical, applied, and data-analytic approach. Readers profit from its verbal-conceptual exposition and frequent use of examples. The applied emphasis provides clear illustrations of the principles and provides worked examples of the types of applications that are possible. Researchers learn how to specify regression models that directly address their research questions. An overview of the fundamental ideas of multiple regression and a review of bivariate correlation and regression and other elementary statistical concepts provide a strong foundation for understanding the rest of the text. The third edition features an increased emphasis on graphics and the use of confidence intervals and effect size measures, and an accompanying CD with data for most of the numerical examples along with the computer code for SPSS, SAS, and SYSTAT. Applied Multiple Regression serves as both a textbook for graduate students and as a reference tool for researche...

Cohen, Patricia; Aiken, Leona S

2014-01-01

11

Analysis and Interpretation of Findings Using Multiple Regression Techniques

Multiple regression and correlation (MRC) methods form a flexible family of statistical techniques that can address a wide variety of different types of research questions of interest to rehabilitation professionals. In this article, we review basic concepts and terms, with an emphasis on interpretation of findings relevant to research questions…

Hoyt, William T.; Leierer, Stephen; Millington, Michael J.

2006-01-01

12

Multiple Regression Analysis Using ANCOVA in University Model

The government of the UAE is promoting Dubai as an academic hub. Dubai International Academic City (DIAC) is a free zone area with many national and international universities promoting higher education in almost all disciplines. The aspiration of every graduating student is to get a good placement. In Dubai, diverse job opportunities in national and multinational organizations are available. The objective of the paper is to review the placement opportunities in Dubai for the universities offering programs in Engineering. This paper attempts to study the effect of three independent variables, namely cumulative grade point average (CGPA), engineering discipline and the type of job that graduating students are offered, on the dependent variable, salary. The engineering disciplines under study are Mechanical, Electronics and Communication, Computer Science, and Electrical and Electronics Engineering. The types of jobs taken into consideration are marketing, technical marketing, design and logistics. The concepts of analysis of covariance (ANCOVA) and multiple regression are used to review placement opportunities vis-à-vis the salary structure.

Maneesha

2013-09-01

13

Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis

Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis, however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…

Williams, Ryan T.

2012-01-01

14

An improved multiple linear regression and data analysis computer program package

NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

Sidik, S. M.

1972-01-01

15

This site, created by Michelle Lacey of Yale University, gives an explanation, a definition and an example of multiple linear regression. Topics include: confidence intervals, tests of significance, and squared multiple correlation. While brief, this is still a valuable site for anyone interested in statistics.

Lacey, Michelle

16

This chapter deals with multiple linear regression. That is, we investigate the situation where the mean of a variable depends linearly on a set of covariables. The noise is assumed to be Gaussian. We develop the least squares method to obtain the parameter estimators and estimates of their precision. This leads to the design of confidence intervals, prediction intervals, global tests, individual tests and, more generally, tests of submodels defined by linear constraints. Methods for model choice and variable selection, measures of the quality of the fit, residual analysis and diagnostic methods are presented. Finally, identification of departures from the model's assumptions and ways to deal with these problems are addressed. A real data set is used to illustrate the methodology with the software R. Note that this chapter is intended to serve as a guide for other regression methods, like logistic regression or AFT models and Cox regression.
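The least squares estimators and their precision mentioned in this abstract can be sketched as follows. The chapter itself uses R; this Python version is only an illustration under the usual assumptions (independent Gaussian noise with constant variance), and the function names are hypothetical. It computes the coefficients as (X'X)^-1 X'y and the standard errors from the diagonal of s^2 (X'X)^-1.

```python
import math


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting.

    Assumes the matrix is non-singular.
    """
    n = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def ols(x_rows, y):
    """Return (coefficients, standard errors); the intercept comes first.

    Requires more observations than fitted parameters.
    """
    design = [[1.0] + list(row) for row in x_rows]
    n, k = len(design), len(design[0])
    xtx = [[sum(r[p] * r[q] for r in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(design, y)) for p in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[p][q] * xty[q] for q in range(k)) for p in range(k)]
    resid = [yi - sum(b * v for b, v in zip(beta, r))
             for r, yi in zip(design, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # unbiased noise variance
    se = [math.sqrt(sigma2 * inv[p][p]) for p in range(k)]
    return beta, se
```

A confidence interval for a coefficient is then the estimate plus or minus a Student t quantile (with n - k degrees of freedom) times its standard error; the quantile is not computed here because the standard library has no t distribution.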

Grégoire, G.

2014-01-01

17

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

The growing volume of data creates a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, viz. factor analysis and multiple linear regression, for different sample si...

Chauhan, R. K.; Abhishek Taneja

2011-01-01

18

Fungible Weights in Multiple Regression

Every set of alternate weights (i.e., nonleast squares weights) in a multiple regression analysis with three or more predictors is associated with an infinite class of weights. All members of a given class can be deemed "fungible" because they yield identical "SSE" (sum of squared errors) and R[superscript 2] values. Equations for generating…

Waller, Niels G.

2008-01-01

19

The aim of this study was to forecast the returns for the Stock Exchange of Thailand (SET) Index by adding some explanatory variables and a stationary autoregressive term of order p (AR(p)) to the mean equation of returns. In addition, we used Principal Component Analysis (PCA) to remove possible complications caused by multicollinearity. Results showed that the multiple regression based on PCA has the best performance.

Nop Sopipan

2013-01-01

20

Beyond Multiple Regression: Using Commonality Analysis to Better Understand R[superscript 2] Results

Multiple regression is one of the most common statistical methods used in quantitative educational research. Despite the versatility and easy interpretability of multiple regression, it has some shortcomings in the detection of suppressor variables and for somewhat arbitrarily assigning values to the structure coefficients of correlated…

Warne, Russell T.

2011-01-01

21

REVAAM Model to determine a company's value by multiple valuation and linear regression analysis

This paper shows an alternative model to the widely used method of multiple valuation (or relative valuation) for calculating the value of a company by using either the Price Earnings (PE) ratio and/or the ratio of Enterprise Value to Earnings Before Interest, Taxes, Depreciation and Amortization (EV/EBITDA). When calculating multiples, analysts tend to consider average multiples within an industry and apply them directly to the target company; however, we believe that this practice does not consider differences among the companies being compared, even though they belong to the same sector or industry. The REVAAM Model uses linear regression to calculate adjusted PE and EV/EBITDA multiples by taking into consideration profitability factors for each multiple in order to differentiate companies in the samples. Calculations are based on public data for US companies, but could be further expanded to other markets. Not only does the REVAAM Model provide a better estimate for relative valuation analysis than simply using average multiples, but it could also be used to compare under/overvalued companies or sectors and to analyze changes in multiple values over time as the intrinsic fundamentals change.

Luis G. Acosta-Calzado

2010-07-01

22

An application of a microcomputer compiler program to multiple logistic regression analysis.

Microcomputer programs for multiple logistic regression analysis were written in BASIC language to determine the usefulness of microcomputers for multivariate analysis, which is an important method in epidemiological studies. The program, carried out by an interpreter system, required a comparatively long computing time for a small amount of data. For example, it took approximately thirty minutes to compute the data of 6 independent variables and 63 matched sets of case and controls (1:4). The majority of the calculation time was spent computing a matrix. The matrix computation time increased cumulatively in proportion to additions in the number of subjects, and increased exponentially with the number of variables. A BASIC compiler was utilized for the program of multiple logistic regression analysis. The compiled program carried out the same computations as above, but within 4 minutes. Therefore, it is evident that a compiler can be an extremely convenient tool for computing multivariate analysis. The two programs produced here were also easily linked with spreadsheet packages to enter data. PMID:3405017

Sakai, R

1988-01-01

23

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

The growing volume of data creates a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, viz. factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two data mining techniques is compared on the following parameters: mean square error (MSE), R-square, adjusted R-square, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters have been computed using various data mining tools like SPSS, XLstat, Stata, and MS-Excel. It is seen that for all the given datasets, factor analysis outperforms multiple linear regression. But the absolute value of prediction accuracy varied between the three datasets, indicating that the data distribution and data characteristics play a major role in choosing the correct prediction technique.
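Several of the comparison parameters named above have simple closed forms. The Python sketch below computes MSE, RMSE, R-square and adjusted R-square from observed and predicted values. It is an illustration, not the study's procedure, and it assumes MSE is defined as the residual sum of squares divided by n; some packages divide by the residual degrees of freedom instead. The function name is hypothetical.

```python
import math


def fit_metrics(y_true, y_pred, n_predictors):
    """Return MSE, RMSE, R-square and adjusted R-square as a dict."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual SS
    sst = sum((t - mean_y) ** 2 for t in y_true)             # total SS
    mse = sse / n  # assumed convention: divide by n
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": r2, "adj_r2": adj_r2}
```

Adjusted R-square penalizes additional predictors, which matters when models with different numbers of variables are compared, as in this study.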

R.K.Chauhan

2011-04-01

24

Various types of ultrasonic techniques have been used for the estimation of the compressive strength of concrete structures. However, the conventional ultrasonic velocity method, which uses only the longitudinal wave, cannot determine the compressive strength of concrete structures accurately. In this paper, multiple parameters are introduced, e.g. velocity of the shear wave, velocity of the longitudinal wave, attenuation coefficient of the shear wave, attenuation coefficient of the longitudinal wave, combination condition, age and preservation method, and multiple regression analysis is applied to the determination of the compressive strength of concrete structures. The experimental results show that the velocity of the shear wave estimates the compressive strength of concrete more accurately than the velocity of the longitudinal wave, and that the estimation error for the compressive strength of concrete structures can be brought within a range of approximately ±10%.

25

This study encompasses columnar ozone modelling in peninsular Malaysia. A data set of eight atmospheric parameters [air surface temperature (AST), carbon monoxide (CO), methane (CH4), water vapour (H2Ovapour), skin surface temperature (SSKT), atmosphere temperature (AT), relative humidity (RH), and mean surface pressure (MSP)], retrieved from NASA's Atmospheric Infrared Sounder (AIRS) for the entire period (2003-2008), was employed to develop models to predict the value of columnar ozone (O3) in the study area. A combined method, based on multiple regression together with principal component analysis (PCA) modelling, was used to improve the prediction accuracy of columnar ozone. Separate analyses were carried out for the north east monsoon (NEM) and south west monsoon (SWM) seasons. O3 was negatively correlated with CH4, H2Ovapour, RH, and MSP, whereas it was positively correlated with CO, AST, SSKT, and AT during both the NEM and SWM seasons. Multiple regression analysis was used to fit the columnar ozone data using the atmospheric parameters as predictors. A variable selection method based on high loadings of varimax-rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model. It was found that an increase in the columnar O3 value is associated with an increase in the values of AST, SSKT, AT, and CO and with a drop in the levels of CH4, H2Ovapour, RH, and MSP. Fitting the best models for the columnar O3 value using eight of the independent variables gave about the same values of R (≈0.93) and R2 (≈0.86) for both the NEM and SWM seasons. The common variables that appeared in both regression equations were SSKT, CH4 and RH, and the principal precursor of the columnar O3 value in both the NEM and SWM seasons was SSKT.

Rajab, Jasim M.; MatJafri, M. Z.; Lim, H. S.

2013-06-01

26

Leaf pigments are key elements for plant photosynthesis and growth. Traditional manual sampling of these pigments is labor-intensive and costly, and it has difficulty capturing their temporal and spatial characteristics. The aim of this work is to estimate photosynthetic pigments at large scale by remote sensing. For this purpose, inverse models were proposed with the aid of stepwise multiple linear regression (SMLR) analysis. Furthermore, a leaf radiative transfer model (i.e. the PROSPECT model) was employed to simulate the leaf reflectance for wavelengths from 400 to 780 nm at 1 nm intervals, and these values were then treated as the data from remote sensing observations. Meanwhile, simulated chlorophyll concentration (Cab), carotenoid concentration (Car) and their ratio (Cab/Car) were each taken as the target to build a regression model. In this study, a total of 4000 samples were simulated via PROSPECT with different Cab, Car and leaf mesophyll structures; 70% of these samples were used for training and the remaining 30% for model validation. Reflectance (r) and its mathematical transformations (1/r and log(1/r)) were each employed to build a regression model. Results showed fair agreement between pigments and simulated reflectance, with all adjusted coefficients of determination (R2) larger than 0.8 when 6 wavebands were selected to build the SMLR model. The largest values of R2 for Cab, Car and Cab/Car are 0.8845, 0.876 and 0.8765, respectively. Meanwhile, mathematical transformations of reflectance showed little influence on regression accuracy. We concluded that it was feasible to estimate chlorophyll, carotenoids and their ratio based on a statistical model with leaf reflectance data.

Liu, Pudong; Shi, Runhe; Wang, Hong; Bai, Kaixu; Gao, Wei

2014-10-01

27

Since the fluctuations of the Persian Gulf Sea Surface Temperature (PGSST) have a significant effect on the winter precipitation, water resources and agricultural production of the south western parts of Iran, the possibility of predicting the winter SST was evaluated with a multiple regression model. The time series of PGSSTs for all seasons during 1947-1992 were considered as predictors, and the time series of MSSTs during 1948-1993 as the predictand. For the purpose of data reduction and principal component extraction, principal components analysis was applied. Only the scores of the first four PCs (PC1 to PC4), which accounted for the total variance in the predictor field, were considered as the input for the regression analysis. To find the dependency of each principal component on the original time series of the PGSST, Varimax rotation was applied. The results indicated that PC1 to PC4 are indicators of temperature changes during winter, autumn, spring and summer, respectively. According to the regression model, the components PC1, PC2 and PC4 were significant at the 5% level, but PC3 was insignificant. The results indicated that the significant variables account for 33.5% of the total variance in the winter PGSSTs. It became clear that, for the prediction of the winter PGSST, the PGSST during the winter of the previous year is of particular importance. Autumn and summer temperatures also play a role in the prediction of winter PGSST.

A. Shirvani

2005-10-01

28

Risk is not always avoidable, but it is controllable. The aim of this study is to identify whether those techniques are effective in reducing software failure. This motivates the authors to continue the effort to enrich the management of software project risks with mining and quantitative approaches on a large data set. In this study, two new techniques, namely stepwise multiple regression analysis and fuzzy multiple regression, are introduced to manage software risks. Two evaluation measures, MMRE and Pred(25), are used to compare the accuracy of the techniques. The model's accuracy is slightly better with stepwise multiple regression than with fuzzy multiple regression. This study will guide software managers in applying software risk management practices in real-world software development organizations and in verifying the effectiveness of the new techniques and approaches on a software project. The study was conducted on a group of software projects using a survey questionnaire. It is hoped that this will enable software managers to improve their decisions and increase the probability of software project success.
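The two accuracy measures named above are commonly defined as follows: MMRE is the mean of the magnitudes of relative error |actual - predicted| / actual, and Pred(25) is the fraction of cases whose relative error is at most 25%. The Python sketch below uses these standard definitions; the abstract does not spell out its own formulas, so treat this as an assumption. Function names are hypothetical.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error; actual values must be non-zero."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def pred(actual, predicted, level=25):
    """Fraction of predictions whose relative error is at most level percent."""
    hits = sum(1 for a, p in zip(actual, predicted)
               if abs(a - p) / abs(a) <= level / 100.0)
    return hits / len(actual)
```

Lower MMRE and higher Pred(25) both indicate a more accurate model, so the two measures are usually reported together.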

Abdelrafe Elzamly

2014-01-01

29

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

The growing volume of data creates a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, viz. factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two data mining techniques is compared on the following parameters: mean square error (MSE), R-square, adjusted R-square, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters have been computed using various data mining tools like SPSS, XLstat, Stata, and MS-Excel. It is seen that for all the given datasets, factor analysis outperforms multiple linear re...

Taneja, Abhishek

2011-01-01

30

A Multiple Regression Analysis on Influencing Factors of Urban Services Growth in China

The indicator of urban success is the success of its urban services. Although much research on services has been done, there is a major gap with regard to regional services, especially urban services within a country. As for urban services, there is little research on the factors influencing urban services and their effect on regional growth. In reaction to this, the government intends to accelerate the development of urban services and the regional economy in the present Twelfth Five-Year Plan 2011-2015. Thus, the main purpose of this paper is to investigate the factors that influence urban services growth from the demand, supply, institutional environment and spatial agglomeration sides. Using cross-section multiple regression analysis, the study examines the factors influencing urban services growth in China. The model indicated that, except for urbanization and division of labor, the other independent variables have contributed positively towards urban services growth in China.

ABDUL Razak bin Chik

2013-01-01

31

Multiple Regression and Its Discontents

Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.

Snell, Joel C.; Marsh, Mitchell

2012-01-01

32

International Nuclear Information System (INIS)

In this study, thermodynamic and statistical analyses were performed on a gas turbine system, to assess the impact of some important operating parameters like CIT (Compressor Inlet Temperature), PR (Pressure Ratio) and TIT (Turbine Inlet Temperature) on its performance characteristics such as net power output, energy efficiency, exergy efficiency and fuel consumption. Each performance characteristic was enunciated as a function of operating parameters, followed by a parametric study and optimization. The results showed that the performance characteristics increase with an increase in the TIT and a decrease in the CIT, except fuel consumption which behaves oppositely. The net power output and efficiencies increase with the PR up to certain initial values and then start to decrease, whereas the fuel consumption always decreases with an increase in the PR. The results of exergy analysis showed the combustion chamber as a major contributor to the exergy destruction, followed by stack gas. Subsequently, multiple regression models were developed to correlate each of the response variables (performance characteristic) with the predictor variables (operating parameters). The regression model equations showed a significant statistical relationship between the predictor and response variables. (author)

33

Incremental Net Effects in Multiple Regression

A regular problem in regression analysis is estimating the comparative importance of the predictors in the model. This work considers the 'net effects', or shares of the predictors in the coefficient of the multiple determination, which is a widely used characteristic of the quality of a regression model. Estimation of the net effects can be a…

Lipovetsky, Stan; Conklin, Michael

2005-01-01

34

A multiple regression model allows us to interpret differences in flood discharge between the Tajo and Guadiana watersheds. Statistical relationships between discharge, rainfall and the morphometric parameters of each subbasin show the significance of relief, stream length and either rainfall or permeability, depending on the return period considered.

Potenciano, A.; Garzón Heydt, Guillermina

2005-01-01

35

International Nuclear Information System (INIS)

The calculated >1-MeV pressure vessel fluence is used to determine the fracture toughness and integrity of the reactor pressure vessel. It is therefore of the utmost importance to ensure that the fluence prediction is accurate and unbiased. In practice, this assurance is provided by comparing the predictions of the calculational methodology with an extensive set of accurate benchmarks. A benchmarking database is used to provide an estimate of the overall average measurement-to-calculation (M/C) bias in the calculations. This average is used as an ad-hoc multiplicative adjustment to the calculations to correct for the observed calculational bias. However, this average only provides a well-defined and valid adjustment of the fluence if the M/C data are homogeneous; i.e., the data are statistically independent and there is no correlation between subsets of M/C data. Typically, the identification of correlations between the errors in the database M/C values is difficult because the correlation is of the same magnitude as the random errors in the M/C data and varies substantially over the database. In this paper, an evaluation of a reactor dosimetry benchmark database is performed to determine the statistical validity of the adjustment to the calculated pressure vessel fluence. Physical mechanisms that could potentially introduce a correlation between the subsets of M/C ratios are identified and included in a multiple regression analysis of the M/C data. Rigorous statistical criteria are used to evaluate the homogeneity of the M/C data and determine the validity of the adjustment. For the database evaluated, the M/C data are found to be strongly correlated with dosimeter response threshold energy and dosimeter location (e.g., cavity versus in-vessel). It is shown that because of the inhomogeneity in the M/C data, for this database, the benchmark data do not provide a valid basis for adjusting the pressure vessel fluence. The statistical criteria and methods employed in this analysis are generic and may be applied in benchmarking applications where the M/C comparisons are used to determine an adjustment of the calculations.

36

Anomalous particle pinch and scaling of vin/D based on transport analysis and multiple regression

International Nuclear Information System (INIS)

Predictions of density profiles in current tokamaks and ITER require a validated scaling relation for $v_{in}/D$, where $v_{in}$ is the anomalous inward drift velocity and $D$ is the anomalous diffusion coefficient. Transport analysis is necessary for determining the anomalous particle pinch from measured density profiles and for separating the impact of particle sources. A set of discharges in ASDEX Upgrade, DIII-D, JET and ASDEX is analysed using a special version of the 1.5-D BALDUR transport code. Profiles of $\rho_s v_{in}/D$, with $\rho_s$ the effective separatrix radius, five other dimensionless parameters and many further quantities in the confinement zone are compiled, resulting in the dataset VIND1.dat, which covers a wide parameter range. Weighted multiple regression is applied to the ASDEX Upgrade subset, which leads to a two-term scaling $\rho_s v_{in}(x')/D(x') = 0.0432\,[(L_{Te}(\bar{x}')/\rho_s)^{-2.58} + 7.13\,U_L^{1.55}\,\nu_e^*(\bar{x}')^{-0.42}]\,x'$ with $x' = \rho/\rho_s$, effective radius $\rho$ and average value $\bar{x}'$. The rmse value of the scaling equals 15.2%. The electron temperature gradient length $L_{Te}$ is the key parameter of the anomalous particle pinch which yields the main contribution. A further parameter is the loop voltage $U_L$, which introduces the electron collisionality parameter $\nu_e^*$. All exponents are statistically significant. The parameters $U_L$ and $\nu_e^*$ suggest a new anomalous particle pinch term driven by the Ohmic inductive electric field. The nonlinearities in the two-term scaling show that quasilinear theory is disproved by experiment. Regression analysis of the whole dataset VIND1.dat from four tokamaks shows that the $L_{Te}/\rho_s$ scaling covers the dependence of $\rho_s v_{in}/D$ on the effective plasma radius. It is further found that the $\rho_s v_{in}/D$ values from transport analysis do not respond to a change in collisionality regime and are not clearly related to the prevailing turbulence type. The new scaling law predicts for ITER high values of $\rho_s v_{in}/D$ and peaked density profiles, caused by the $L_{Te}/\rho_s$ term and central heating due to alpha particles. The density peaking improves the energy confinement by some 20%.

37

International Nuclear Information System (INIS)

Polycyclic aromatic hydrocarbons (PAHs) are contaminants that reside mainly in surface soils. Dietary intake of plant-based foods can make a major contribution to total PAH exposure. Little information is available on the relationship between root morphology and plant uptake of PAHs. An understanding of plant root morphologic and compositional factors that affect root uptake of contaminants is important and can inform both agricultural (chemical contamination of crops) and engineering (phytoremediation) applications. Five crop plant species are grown hydroponically in solutions containing the PAH phenanthrene. Measurements are taken for 1) phenanthrene uptake, 2) root morphology – specific surface area, volume, surface area, tip number and total root length and 3) root tissue composition – water, lipid, protein and carbohydrate content. These factors are compared through Pearson's correlation and multiple linear regression analysis. The major factors which promote phenanthrene uptake are specific surface area and lipid content. -- Highlights: •There is no correlation between phenanthrene uptake and total root length, and water. •Specific surface area and lipid are the most crucial factors for phenanthrene uptake. •The contribution of specific surface area is greater than that of lipid. -- The contribution of specific surface area is greater than that of lipid in the two most important root morphological and compositional factors affecting phenanthrene uptake

38

Penalized Functional Regression Analysis of White-Matter Tract Profiles in Multiple Sclerosis

Diffusion tensor imaging (DTI) enables noninvasive parcellation of cerebral white matter into its component fiber bundles or tracts. These tracts often subserve specific functions, and damage to the tracts can therefore result in characteristic forms of disability. Attempts to quantify the extent of tract-specific damage have been limited in part by substantial spatial variation of imaging properties from one end of a tract to the other, variation that can be compounded by the effects of disease. Here, we develop a “penalized functional regression” procedure to analyze spatially normalized tract profiles, which powerfully characterize such spatial variation. The central idea is to identify and emphasize portions of a tract that are more relevant to a clinical outcome score, such as case status or degree of disability. The procedure also yields a “tract abnormality score” for each tract and MRI index studied. Importantly, the weighting function used in this procedure is constrained to be smooth, and the statistical associations are estimated using generalized linear models. We test the method on data from a cross-sectional MRI and functional study of 115 multiple-sclerosis cases and 42 healthy volunteers, considering a range of quantitative MRI indices, white-matter tracts, and clinical outcome scores, and using training and testing sets to validate the results. We show that attention to spatial variation yields up to 15% (mean across all tracts and MRI indices: 6.4%) improvement in the ability to discriminate multiple sclerosis cases from healthy volunteers. Our results confirm that comprehensive analysis of white-matter tract-specific imaging data improves with knowledge and characterization of the normal spatial variation. PMID:21554962

Goldsmith, Jeff; Crainiceanu, Ciprian M.; Caffo, Brian S.; Reich, Daniel S.

2011-01-01

39

Since the fluctuations of the Persian Gulf Sea Surface Temperature (PGSST) have a significant effect on the winter precipitation, water resources and agricultural production of the south-western parts of Iran, the possibility of predicting the winter SST was evaluated using a multiple regression model. The time series of PGSSTs for all seasons, during 1947-1992, were considered as predictors, and the time series of MSSTs during 1948-1993, as the predictand. For the purpose of data reduction an...

Shirvani, A.; Nazemosadat, M. J.

2005-01-01

40

Investigations upon the indefinite rolls quality assurance in multiple regression analysis

International Nuclear Information System (INIS)

The quality of rolling-mill rolls has been enhanced mainly through improvements in the chemical composition of the roll materials. Achieving an optimal chemical composition can be a technically efficient way to assure the service properties, since the material from which the rolls are manufactured is of great importance in this respect. This paper continues to present the scientific results of our experimental research on rolling-mill rolls. The basic research contains concrete elements of immediate practical use in metallurgical enterprises for improving the quality of rolls, with the ultimate aim of increasing durability and safety in service. This paper presents an analysis of the chemical composition and its influence on the mechanical properties of indefinite cast iron rolls. We present some mathematical correlations and graphical interpretations between the hardness (on the working surface and on the necks) and the chemical composition. The double and triple correlations are really helpful in foundry practice, as they allow us to determine variation boundaries for the chemical composition with a view to obtaining optimal hardness values. We suggest a mathematical interpretation of the influence of the chemical composition on the hardness of these indefinite rolls. To this end we use multiple regression analysis, which can be an important statistical tool for the investigation of relationships between variables. The mathematical modeling results can be described through a number of multi-component equations determined for spaces with 3 and 4 dimensions. Also, the regression surfaces, level curves and volumes of variation can be represented and interpreted by technologists, who can treat them as correlation diagrams between the analyzed variables.
These research results can be used by engineering teams in foundries and rolling mills for quality assurance of rolls from the production phase onward, as well as in service, which inevitably contributes to the quality assurance of the rolled products. (Author) 16 refs.

41

Regression analysis by example

Praise for the Fourth Edition: ""This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable."" -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

Chatterjee, Samprit

2012-01-01

42

Multiple linear regression (MLR) was used to build a linear quantitative structure-property relationship (QSPR) model for the prediction of the molar diamagnetic susceptibility (χm) of 140 diverse organic compounds using three significant descriptors calculated from the molecular structures alone and selected by the stepwise regression method. Stepwise regression was employed to develop a regression equation based on 100 training compounds, and predictive ability was tested on 40 compounds reserved for that purpose. The stability of the proposed model was validated using Leave-One-Out cross-validation and a randomization test. Application of the developed model to a testing set of 40 organic compounds demonstrates that the new model is reliable, with good predictive accuracy and a simple formulation. By applying the MLR method we can predict the test set (40 compounds) with Q2ext of 0.9894 and average root mean square error (RMSE) of 2.2550. The model applicability domain was always verified by the leverage approach in order to propose reliable predicted data. The prediction results are in good agreement with the experimental values.

S . Saaidpour

2012-03-01
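The validation scheme described in the entry above (fit on 100 training compounds, test on 40 held-out compounds, Leave-One-Out cross-validation) can be sketched roughly as follows. This is only an illustration: the descriptors and responses are synthetic placeholders, the helper names `fit_ols`, `predict` and `loo_q2` are hypothetical, and the Q² definition used here (1 - PRESS / total sum of squares) is one common convention that the abstract does not spell out.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Predictions from coefficients returned by fit_ols."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def loo_q2(X, y):
    """Leave-one-out cross-validated Q^2 = 1 - PRESS / SS_total."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i          # drop observation i
        coef = fit_ols(X[keep], y[keep])  # refit without it
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in for 140 compounds with three descriptors (not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(140, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=140)

X_train, y_train = X[:100], y[:100]   # 100 training compounds
X_test, y_test = X[100:], y[100:]     # 40 compounds reserved for testing

coef = fit_ols(X_train, y_train)
rmse_test = np.sqrt(np.mean((y_test - predict(coef, X_test)) ** 2))
q2_loo = loo_q2(X_train, y_train)
print(f"LOO Q2 = {q2_loo:.4f}, test RMSE = {rmse_test:.4f}")
```

The explicit refit loop is kept for readability; for ordinary least squares the same PRESS value can also be obtained from the leverages without refitting.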

43

Absorption spectra of multicomponent hydrocarbon mixtures based on n-heptane and isooctane with addition of benzene (up to 1%) and toluene and o-xylene (up to 20%) were investigated experimentally in the region of the first overtones of the hydrocarbon groups (λ = 1620-1780 nm). It was shown that their concentrations could be determined separately by using a multiple linear regression method. The optimum result was obtained by including four wavelengths at 1671, 1680, 1685, and 1695 nm, which took into account absorption of CH groups in benzene, toluene, and o-xylene and CH3 groups, respectively.

Vesnin, V. L.; Muradov, V. G.

2012-09-01

44

Perturbations in the daily averages of the 'true' neutral atmospheric densities have been derived from High Accuracy Satellite Drag Models (HASDM) at different perigee heights of the satellites during disturbed periods of 2001/02. A Multiple Linear Regression Analysis is then performed on daily averages of Ap indices and the perigee latitudes to predict changes in the 'true' neutral densities in response to geomagnetic disturbances. The results and limitations are discussed.

Bhatnagar, V. P.; Bowman, B.; Tan, A.

2005-12-01

45

This article presents the possibility of using multiple regression analysis (MRA) and dynamic neural network (DNN) for prediction of the stability of Hydrocortisone 100 mg (in the form of hydrocortisone sodium succinate) freeze-dried powder for injection packed into a dual chamber container. Degradation products of hydrocortisone sodium succinate: free hydrocortisone and related substances (impurities A, B, C, D and E; unspecified impurities and total impurities) were followed during stres...

Vujić Zorica B.; Jocković Jelena M.; Stankovic Predrag D.; Solomun Ljiljana N.; Ibrić Svetlana R.; Pejanović Vjera M.; Đuriš Jelena D.

2012-01-01

46

This article presents the possibility of using multiple regression analysis (MRA) and dynamic neural network (DNN) for prediction of the stability of Hydrocortisone 100 mg (in the form of hydrocortisone sodium succinate) freeze-dried powder for injection packed into a dual chamber container. Degradation products of hydrocortisone sodium succinate: free hydrocortisone and related substances (impurities A, B, C, D and E; unspecified impurities and total impurities) were followed during stress and formal stability studies. All data obtained during stability studies were used for in silico modeling, with both multiple regression models and dynamic neural networks, in order to compare predicted and observed results. High values of the coefficient of determination (0.95-0.99) were obtained using MRA and DNN, so both methods are powerful tools for in silico stability studies, but the superiority of DNN over mathematical modeling of degradation was also confirmed.

Vujić Zorica B.

2012-01-01

47

Multiple Regressions in Analysing House Price Variations

The application of rigorous statistical analysis to aid investment decision making is gaining momentum in the United States of America as well as the United Kingdom. Nonetheless, in Malaysia the response from local academics has been rather slow, and slower still among practitioners. This paper illustrates how Multiple Regression Analysis (MRA) and its extension, Hedonic Regression Analysis, have been used to explain price variation for selected houses in Malaysia. Each attribute theoretically identified as a price determinant is priced, and the perceived contribution of each is explicitly shown. The paper demonstrates how the statistical analysis is capable of analyzing property investment by considering multiple determinants. Considering various characteristics in this more rigorous way enables better investment decision making.

Aminah Md Yusof

2012-03-01

48

This study was carried out to detect nitrogen content in lettuce leaves rapidly and non-destructively using visible and near infrared (VIS-NIR) hyperspectral imaging technology. Principal Component Analysis (PCA) was performed on the average spectra to reduce the spectral dimensionality, and the principal components (PCs) were extracted as the input vectors of the prediction models. Partial Least Square Regression (PLSR), Back Propagation Artificial Neural Network (BP-ANN), Extreme Learning Machine (ELM) and Support Vector Machine Regression (SVR) were each applied to relate the nitrogen content to the corresponding PCs and build prediction models of nitrogen content. R2p of the PLSR model for nitrogen content was 0.91 and RMSEP was 0.32. The BP model of structure 5-2-1 with R2p of 0.92 and RMSEP of 0.21, the ELM model of structure 5-10-1 with R2p of 0.95 and RMSEP of 0.19, and the SVR model with R2p of 0.96 and RMSEP of 0.18 all showed good prediction performance. Compared with the other three models, the SVR model has the best performance for predicting nitrogen content in lettuce leaves. This work demonstrated that the hyperspectral imaging technique coupled with PCA-SVR shows considerable promise for nondestructive detection of nitrogen content in lettuce leaves.

Sun Jun

2013-01-01

49

Multiple regression modeling of nonlinear data sets

Application of multiple polynomial regression modeling to observational and model generated data sets is discussed. Here the form of classical multiple linear regression is generalized to a model that is still linear in its parameters, but includes general multivariate polynomials of predictor variables as the basis functions. The system's low-frequency evolution is assumed to be the result of deterministic, possibly nonlinear, dynamics excited by a temporally white, but geographically coherent and normally distributed white noise. In determining the appropriate structure of the latter, the multi-level generalization of multiple polynomial regression, where the residual stochastic forcing at a given level is subsequently modeled as a function of variables at this, and all preceding levels, has turned out to be useful. The number of levels is determined so that lag-0 covariance of the residual forcing converges to a constant matrix, while its lag-1 covariance vanishes. The method has been applied to the output from a three-layer quasi-geostrophic model, to the analysis of the Northern Hemisphere wintertime geopotential height anomalies, and to global sea-surface temperature (SST) data. In the former two cases, the nonlinear multi-regime structure of probability density function (PDF) constructed in the phase subspace of a few leading empirical orthogonal functions (EOFs), as well as the detailed spectrum of the data's temporal evolution, have been well reproduced by the regression simulations. We have given a simple dynamical interpretation of these results in terms of synoptic-eddy feedback on the system's low-frequency variability. In modeling of SST data, a simple way to include the seasonal cycle into the regression model has been developed. 
The regression simulation in this case produces ENSO events with maximum amplitude in December/January, while the positive events generally tend to have a larger amplitude than the negative events -- a feature that cannot be adequately represented in linear models. The method is expected to work well provided a sample of data that is long enough. For short data records, such as SST record above, the wealth of techniques exists to improve the accuracy of the regression fit; the so-called partial least-square fit turns out to be most useful. The extreme numerical efficiency and ease of interpretation make multi-level multiple polynomial regression an appealing tool for dynamical analysis of geophysical data.

Kravtsov, S.; Kondrashov, D.; Ghil, M.

2003-04-01
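The central idea of the entry above, a regression that stays linear in its parameters while using multivariate polynomials of the predictors as basis functions, can be illustrated with a small sketch. It covers only a single-level quadratic fit on noise-free synthetic data; the multi-level treatment of the residual forcing described in the abstract is not attempted, and the helper name `poly_features` is hypothetical.

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_features(X, degree):
    """Design matrix with a constant column and all monomials up to `degree`."""
    n, p = X.shape
    cols = [np.ones(n)]
    for d in range(1, degree + 1):
        # each index tuple (i, j, ...) defines the monomial x_i * x_j * ...
        for idx in combinations_with_replacement(range(p), d):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)

# Synthetic two-predictor example with a known quadratic response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
# Column order for degree 2: 1, x0, x1, x0^2, x0*x1, x1^2
true_coef = np.array([1.0, 2.0, 0.0, 0.0, -1.0, 0.5])
Phi = poly_features(X, degree=2)
y = Phi @ true_coef

# The model is nonlinear in X but linear in the coefficients,
# so ordinary least squares still applies.
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(np.round(coef, 6))
```

Because the unknowns enter linearly, the fit reduces to one least-squares solve however nonlinear the basis functions are in the predictors.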

50

A variety of user-friendly spreadsheet templates have been developed for geoscience studies. However, the use of the built-in matrix functions within spreadsheet programs, such as Excel, is not particularly straightforward, lowering the value of spreadsheet programs for matrix-based computations, such as multiple regression analyses. Therefore, this study first developed two applications for Excel to perform multiple regression analyses in a much more user-friendly manner. Then using earthquake time histories from a reputable database, a series of regression analyses were performed. A new framework for on-site earthquake early warning based on multiple regression analyses is presented as an alternative to conventional models which were developed with single regression analyses.

Wang, J. P.; Huang, Duruo; Chang, Su-Chin; Brant, Logan

2013-08-01

51

Abstract Objectives. To reveal the suitable surface condition of an implant abutment for fibroblast attachment, the correlation between the surface characteristics of various materials and the human gingival fibroblast (HGF-1) attachment to the surfaces were analyzed. Methods. Six kinds of surfaces comprised of machined titanium alloy (SM), machined Co-Cr-Mo alloy (CCM), titanium nitride coated titanium alloy (TiN), anodized titanium alloy (AO), composite resin coating on titanium alloy (R) and zirconia (Zr) were used. The measured surface parameters were Sa, Sq, Sz, Sdr, Sdq, Sal, Str and water contact angle (WCA). The HGF-1 cell attachment was investigated and the correlations were analyzed using a multiple regression analysis. Results. The HGF-1 cell attachment was greater in the SM, TiN and Zr groups than the other groups and smallest in the CCM group (p = 0.0096). From the multiple regression analysis, the HGF-1 cell attachment was significantly correlated with Sdr, Sdq and WCA. When the R group was excluded, only WCA showed significant correlation with the fibroblast attachment. Conclusions. Within the limitations of this study, the cell attachment of human gingival fibroblasts was correlated with WCA, developed interfacial area ratio and surface slope. When the surfaces with Sa values of 0.2 μm or less were concerned, only WCA showed a correlation in a third order manner. PMID:25183254

Kim, Young-Sung; Shin, Seung-Yun; Moon, Seung-Kyun; Yang, Seung-Min

2015-01-01

52

Introduction to regression analysis

This book is an introduction to regression analysis for upper division and graduate students in science, engineering, social science and medicine. The emphasis is on the classical linear model using least squares estimation and inference. In addition, topics of current interest, such as regression diagnostics, ridge and logistic regression are treated as well. In contrast to other books at this level, the theoretical foundation of the subject is presented in some detail based on extensive use of matrix algebra. Throughout the text model building and evaluation are emphasised and illustrated wi

GOLBERG, M

2003-01-01

53

Multiple Instance Regression with Structured Data

This slide presentation reviews the use of multiple instance regression with structured data from multiple and related data sets. It applies the concept to a practical problem, that of estimating crop yield using remote sensed country wide weekly observations.

Wagstaff, Kiri L.; Lane, Terran; Roper, Alex

2008-01-01

54

This study by the U.S. Geological Survey, prepared in cooperation with the Virginia Department of Environmental Quality, quantifies the components of the hydrologic cycle across the Commonwealth of Virginia. Long-term, mean fluxes were calculated for precipitation, surface runoff, infiltration, total evapotranspiration (ET), riparian ET, recharge, base flow (or groundwater discharge) and net total outflow. Fluxes of these components were first estimated on a number of real-time-gaged watersheds across Virginia. Specific conductance was used to distinguish and separate surface runoff from base flow. Specific-conductance data were collected every 15 minutes at 75 real-time gages for approximately 18 months between March 2007 and August 2008. Precipitation was estimated for 1971–2000 using PRISM climate data. Precipitation and temperature from the PRISM data were used to develop a regression-based relation to estimate total ET. The proportion of watershed precipitation that becomes surface runoff was related to physiographic province and rock type in a runoff regression equation. Component flux estimates from the watersheds were transferred to flux estimates for counties and independent cities using the ET and runoff regression equations. Only 48 of the 75 watersheds yielded sufficient data, and data from these 48 were used in the final runoff regression equation. The base-flow proportion for the 48 watersheds averaged 72 percent using specific conductance, a value that was substantially higher than the 61 percent average calculated using a graphical-separation technique (the USGS program PART). Final results for the study are presented as component flux estimates for all counties and independent cities in Virginia.

Sanford, Ward E.; Nelms, David L.; Pope, Jason P.; Selnick, David L.

2012-01-01

55

Multiple linear regression analysis was used to determine an equation for estimating hot corrosion attack for a series of Ni base cast turbine alloys. The U transform (i.e., $\sin^{-1}\sqrt{\%A/100}$) was shown to give the best estimate of the dependent variable, y. A complete second degree equation is described for the "centered" weight chemistries for the elements Cr, Al, Ti, Mo, W, Cb, Ta, and Co. In addition, linear terms for the minor elements C, B, and Zr were added for a basic 47 term equation. The best reduced equation, with essentially 13 terms, was determined by the stepwise selection method. The Cr term was found to be the most important, accounting for 60 percent of the explained variability in hot corrosion attack.

Barrett, C. A.

1985-01-01

56

International Nuclear Information System (INIS)

A two layer perceptron with backpropagation of error is used for quantitative analysis in ICP-AES. The network was trained by emission spectra of two interfering lines of Cd and As and the concentrations of both elements were subsequently estimated from mixture spectra. The spectra of the Cd and As lines were also used to perform multiple linear regression (MLR) via the calculation of the pseudoinverse S+ of the sensitivity matrix S. In the present paper it is shown that there exist close relations between the operation of the perceptron and the MLR procedure. These are most clearly apparent in the correlation between the weights of the backpropagation network and the elements of the pseudoinverse. Using MLR, the confidence intervals over the predictions are exploited to correct for the wavelength shift of the optical device. (orig.)
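The MLR step via the pseudoinverse S+ mentioned in the entry above can be sketched as follows. Everything here is an assumption made for illustration: the two overlapping lines are synthetic Gaussians rather than measured Cd and As spectra, and the sensitivity matrix is laid out with one column per element and one row per wavelength channel, which the abstract does not specify.

```python
import numpy as np

# Hypothetical wavelength axis (arbitrary units) and line profiles.
wavelengths = np.linspace(-1.0, 1.0, 50)

def line(center, width):
    """Gaussian line profile, response per unit concentration."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

# Sensitivity matrix S: column 0 and column 1 are two interfering lines.
S = np.column_stack([line(-0.10, 0.2), line(0.15, 0.2)])
S_pinv = np.linalg.pinv(S)            # Moore-Penrose pseudoinverse S+

# Simulated mixture spectrum with known concentrations plus small noise.
c_true = np.array([3.0, 1.5])
rng = np.random.default_rng(1)
spectrum = S @ c_true + rng.normal(scale=0.01, size=wavelengths.size)

# Least-squares concentration estimate: c_hat = S+ @ spectrum
c_hat = S_pinv @ spectrum
print(np.round(c_hat, 3))
```

Each row of S+ acts as a fixed set of weights applied to the spectrum, which is the kind of structure the abstract compares with the weights of the trained network.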

57

Concise, mathematically clear, and comprehensive treatment of the subject.* Expanded coverage of diagnostics and methods of model fitting.* Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models.* More than 200 problems throughout the book plus outline solutions for the exercises.* This revision has been extensively class-tested.

Seber, George A F

2012-01-01

58

Correlation Weights in Multiple Regression

A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps in developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting…

Waller, Niels G.; Jones, Jeff A.

2010-01-01

59

In this article the authors show the influence of technological parameters and modification treatment on the structural properties of closed skeleton castings. The approach aimed at maximal refinement of the structure and minimal structural diversification. Skeleton castings were manufactured in accordance with the elaborated production technology. Experimental castings were manufactured under variable technological conditions: pouring temperature 953-1013 K, mould temperature 293-373 K and height of the gating system above the casting level 105-175 mm. Analysis of metallographic specimens, quantitative analysis of silicon crystals and analysis of the secondary dendrite-arm spacing of the α solution were performed. Average values of stereological parameters were determined for all castings. The (B/L) and (P/A) factors were determined. On the basis of the microstructural analysis the authors compared the samples. The aim of the analysis was to select the samples with the least diversification of the degree of structure refinement and the smallest silicon crystals. On the basis of the microstructural analysis the authors state that sample 5 (AlSi11, Tpour 1013 K, Tmould 333 K, h = 265 mm) has the best structural properties (least diversification of the degree of structure refinement and the smallest silicon crystals). A statistical analysis of the structural results was then carried out. On the basis of the statistical analysis the authors state that the best structural properties are obtained for the technological parameters Tpour = 1013 K, Tmould = 373 K and h = 230 mm [4]. The results of the statistical analysis are the prerequisite for optimization studies.

M. Cholewa

2011-07-01

60

Assumptions of Multiple Regression: Correcting Two Misconceptions

In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…

Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason

2013-01-01

61

On Investment Efficiency of China's Tourism Listed Companies Based on Multiple Regression Analysis

The paper is to investigate the conditions of efficient investment for China's tourism listed companies and to examine how other factors affect the level of investment for the companies, in order to establish a basis for further studying the effect of executive compensation incentives on the investment efficiency of the tourism listed companies. Fifteen tourism listed companies from 2002 to 2010 are selected as study samples. On the basis of analysis of literature, the paper builds tourism ...

Wei, Wei; Yan, Xing-hua

2013-01-01

62

On Investment Efficiency of China's Tourism Listed Companies Based on Multiple Regression Analysis

The paper investigates the conditions of efficient investment for China's tourism listed companies and examines how other factors affect the level of investment for the companies, in order to establish a basis for further studying the effect of executive compensation incentives on the investment efficiency of the tourism listed companies. Fifteen tourism listed companies from 2002 to 2010 are selected as study samples. On the basis of an analysis of the literature, the paper builds a capital investment model for tourism listed companies with reference to the Richardson expected investment model, and then uses it to process and analyze the data with SPSS 17.0 and EXCEL 2010. It is found that the mean residual of the fifteen companies' capital investment model is -0.000 000 744, with the mean residuals of seven companies less than zero and those of eight companies greater than zero. The minimum and maximum of the mean residuals are -0.040 181 25 (Beijing Capital Tourism Co., Ltd.) and 0.036 942 5 (Shenzhen Overseas Chinese Town Co., Ltd.), respectively. ROAi,t-1 (return on assets, p<0.10) and INVi,t-1 (scale of investment, p<0.01) have significant positive correlations with INVi,t, and Agei,t-1 (p<0.05) has a significant negative correlation with INVi,t. This suggests that the fifteen tourism listed companies from 2003 to 2010 under-invested on the whole, with seven companies under-investing and eight over-investing. In addition, the total return on assets and the level of investment in tourism listed companies significantly advance the level of investment of the company in the following year, and the listing age significantly inhibits the level of investment of the company in the following year.

WEI Wei

2013-09-01

63

Multiple Linear Regression Models in Outlier Detection

Identifying anomalous values in a real-world database is important both for improving the quality of the original data and for reducing the impact of anomalous values in the process of knowledge discovery in databases. Such anomalous values give useful information to the data analyst in discovering useful patterns. Through isolation, these data may be separated and analyzed. The analysis of outliers and influential points is an important step of the regression diagnostics. In this paper, our aim is to detect the points which are very different from the other points. They do not seem to belong to a particular population and behave differently. If these influential points are removed, a different model will result. The distinction between these points is not always obvious and clear. Hence several indicators are used for identifying and analyzing outliers. Existing methods of outlier detection are based on manual inspection of graphically represented data. In this paper, we present a new approach to automating the process of detecting and isolating outliers. The impact of anomalous values on the dataset has been established by using two indicators, DFFITS and Cook's D. The process is based on modeling the human perception of exceptional values by using multiple linear regression analysis.

S.M.A.Khaleelur Rahman

2012-02-01
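The two indicators named in the entry above, DFFITS and Cook's D, can be computed from a single least-squares fit using the standard textbook formulas. The sketch below is not the authors' automated procedure: the data are synthetic with one planted anomalous observation, the helper name `influence_measures` is hypothetical, and the cutoffs 4/n and 2*sqrt(p/n) are common rules of thumb assumed here for illustration.

```python
import numpy as np

def influence_measures(X, y):
    """Return leverages, Cook's distances and DFFITS for y ~ intercept + X."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    p = A.shape[1]                       # number of fitted parameters
    H = A @ np.linalg.pinv(A)            # hat matrix
    h = np.diag(H)                       # leverages
    resid = y - H @ y
    sse = resid @ resid
    s2 = sse / (n - p)
    # Cook's D from internally studentized residuals
    r = resid / np.sqrt(s2 * (1 - h))
    cooks_d = r ** 2 * h / (p * (1 - h))
    # DFFITS from externally studentized (leave-one-out variance) residuals
    s2_loo = (sse - resid ** 2 / (1 - h)) / (n - p - 1)
    t = resid / np.sqrt(s2_loo * (1 - h))
    dffits = t * np.sqrt(h / (1 - h))
    return h, cooks_d, dffits

# Synthetic data with one planted anomalous observation at index 10.
rng = np.random.default_rng(2)
x = rng.normal(size=30)
y = 2.0 + 3.0 * x + rng.normal(scale=0.3, size=30)
x[10], y[10] = 4.0, 6.0

h, cooks_d, dffits = influence_measures(x.reshape(-1, 1), y)
n, p = len(y), 2
# Rule-of-thumb cutoffs (assumed, not taken from the paper)
flagged = np.where((cooks_d > 4 / n) | (np.abs(dffits) > 2 * np.sqrt(p / n)))[0]
print("flagged observations:", flagged)
```

Both measures combine the size of the residual with the leverage of the point, so a single pass over the fitted model is enough to rank observations by influence without refitting.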

64

Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scales. Then, principal component analysis (PCA) is used in processing the subseries data in MLR for crude oil price forecasting. Particle swarm optimization (PSO) is used to select the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily West Texas Intermediate (WTI) crude oil market has been used as the case study. The time series prediction performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting the crude oil price series. PMID:24895666

Shabri, Ani; Samsudin, Ruhaidah

2014-01-01

65

The Geometry of Enhancement in Multiple Regression

In linear multiple regression, "enhancement" is said to occur when $R^2 = b'r > r'r$, where $b$ is a $p \times 1$ vector of standardized regression coefficients and $r$ is a $p \times 1$ vector of correlations between a criterion $y$ and a set of standardized regressors, $x$. When $p = 1$ then $b \cong r$ and enhancement cannot…

Waller, Niels G.

2011-01-01
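The enhancement condition defined in the entry above can be checked numerically for a small case. The correlation values below are hypothetical and chosen only to produce a classic suppressor pattern; the relation b = Rxx^{-1} r used to obtain the standardized coefficients is the usual least-squares result and is assumed here rather than taken from the abstract.

```python
import numpy as np

# Hypothetical correlations: two regressors correlated 0.6 with each other,
# the first correlated 0.5 with the criterion y, the second uncorrelated with y.
R_xx = np.array([[1.0, 0.6],
                 [0.6, 1.0]])
r = np.array([0.5, 0.0])

b = np.linalg.solve(R_xx, r)     # standardized coefficients: [0.78125, -0.46875]
r_squared = b @ r                # R^2 = b'r = 0.390625
sum_sq_corr = r @ r              # r'r = 0.25
enhancement = r_squared > sum_sq_corr

print(b, r_squared, sum_sq_corr, enhancement)
```

Here the second regressor has no correlation with the criterion, yet including it raises R² above the sum of squared correlations, which is the situation the term describes.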

66

We develop a new method for estimating the biochemistry of plant material using spectroscopy. Normalized band depths calculated from the continuum-removed reflectance spectra of dried and ground leaves were used to estimate their concentrations of nitrogen, lignin, and cellulose. Stepwise multiple linear regression was used to select wavelengths in the broad absorption features centered at 1.73 μm, 2.10 μm, and 2.30 μm that were highly correlated with the chemistry of samples from eastern U.S. forests. Band depths of absorption features at these wavelengths were found to also be highly correlated with the chemistry of four other sites. A subset of data from the eastern U.S. forest sites was used to derive linear equations that were applied to the remaining data to successfully estimate their nitrogen, lignin, and cellulose concentrations. Correlations were highest for nitrogen (R2 from 0.75 to 0.94). The consistent results indicate the possibility of establishing a single equation capable of estimating the chemical concentrations in a wide variety of species from the reflectance spectra of dried leaves. The extension of this method to remote sensing was investigated. The effects of leaf water content, sensor signal-to-noise and bandpass, atmospheric effects, and background soil exposure were examined. Leaf water was found to be the greatest challenge to extending this empirical method to the analysis of fresh whole leaves and complete vegetation canopies. The influence of leaf water on reflectance spectra must be removed to within 10%. Other effects were reduced by continuum removal and normalization of band depths.
If the effects of leaf water can be compensated for, it might be possible to extend this method to remote sensing data acquired by imaging spectrometers to give estimates of nitrogen, lignin, and cellulose concentrations over large areas for use in ecosystem studies.We develop a new method for estimating the biochemistry of plant material using spectroscopy. Normalized band depths calculated from the continuum-removed reflectance spectra of dried and ground leaves were used to estimate their concentrations of nitrogen, lignin, and cellulose. Stepwise multiple linear regression was used to select wavelengths in the broad absorption features centered at 1.73 ??m, 2.10 ??m, and 2.301 ??m that were highly correlated with the chemistry of samples from eastern U.S. forests. Band depths of absorption features at these wavelengths were found to also be highly correlated with the chemistry of four other sites. A subset of data from the eastern U.S. forest sites was used to derive linear equations that were applied to the remaining data to successfully estimate their nitrogen, lignin, and cellulose concentrations. Correlations were highest for nitrogen (R2 from 0.75 to 0.94). The consistent results indicate the possibility of establishing a single equation capable of estimating the chemical concentrations in a wide variety of species from the reflectance spectra of dried leaves. The extension of this method to remote sensing was investigated. The effects of leaf water content, sensor signal-to-noise and bandpass, atmospheric effects, and background soil exposure were examined. Leaf water was found to be the greatest challenge to extending this empirical method to the analysis of fresh whole leaves and complete vegetation canopies. The influence of leaf water on reflectance spectra must be removed to within 10%. Other effects were reduced by continuum removal and normalization of band depths. 
If the effects of leaf water can be compensated for, it might be possible to extend this method to remote sensing data acquired by imaging spectrometers to give estimates of nitrogen, lignin, and cellulose concentrations over large areas for use in ecosystem studies.

Kokaly, R.F.; Clark, R.N.

1999-01-01

67

Enhance-Synergism and Suppression Effects in Multiple Regression

Relations between pairwise correlations and the coefficient of multiple determination in regression analysis are considered. The conditions for the occurrence of enhance-synergism and suppression effects when multiple determination becomes bigger than the total of squared correlations of the dependent variable with the regressors are discussed. It…

Lipovetsky, Stan; Conklin, W. Michael

2004-01-01

68

This study investigated the relationship between punching acceleration and selected strength and power variables in 19 professional karate athletes from the Brazilian National Team (9 men and 10 women; age, 23 ± 3 years; height, 1.71 ± 0.09 m; and body mass [BM], 67.34 ± 13.44 kg). Punching acceleration was assessed under 4 different conditions in a randomized order: (a) fixed distance aiming to attain maximum speed (FS), (b) fixed distance aiming to attain maximum impact (FI), (c) self-selected distance aiming to attain maximum speed, and (d) self-selected distance aiming to attain maximum impact. The selected strength and power variables were as follows: maximal dynamic strength in bench press and squat-machine, squat and countermovement jump height, mean propulsive power in bench throw and jump squat, and mean propulsive velocity in jump squat with 40% of BM. Upper- and lower-body power and maximal dynamic strength variables were positively correlated to punch acceleration in all conditions. Multiple regression analysis also revealed predictive variables: relative mean propulsive power in squat jump (W·kg⁻¹), and maximal dynamic strength 1 repetition maximum in both bench press and squat-machine exercises. An impact-oriented instruction and a self-selected distance to start the movement seem to be crucial to reach the highest acceleration during punching execution. This investigation, while demonstrating strong correlations between punching acceleration and strength-power variables, also provides important information for coaches, especially for designing better training strategies to improve punching speed. PMID:24276310

Loturco, Irineu; Artioli, Guilherme Giannini; Kobal, Ronaldo; Gil, Saulo; Franchini, Emerson

2014-07-01

69

Multiple Outliers Detection Procedures in Linear Regression

This paper describes a procedure for identifying multiple outliers in linear regression. This procedure uses a robust fit, least trimmed squares (LTS), and the single linkage clustering method to obtain the potential outliers. Then multiple-case diagnostics are used to obtain the outliers from these potential outliers. The performance of this procedure is also compared to Serbert’s method. Monte Carlo simulations are used in determining which procedure performed best in all...

Robiah Adnan; Mohd Nor Mohamad; Halim Setan

2003-01-01

70

Factor and multiple regression analyses were carried out on morphological traits (body length, body width, bill length, bill width, bill height, shank length, body height, head length, head width, neck length, wing length, chest circumference and body weight) of male and female Muscovy ducks. Obvious sexual dimorphism was exhibited between the sexes. Relationships between body measurements and body weight were examined through factor and multiple linear regression analysis. Three factors, representing size and shape, had a significant positive effect on the body weight of male Muscovy ducks, while only one factor had a positive relationship with body weight in females; these accounted for 84.2% and 63.5% of the variation in body weight for males and females, respectively. The results indicate that body measurements can be better selected for improvement in weight for male Muscovy ducks than for females.

D.M. Ogah

2009-01-01

71

Salience Assignment for Multiple-Instance Regression

We present a Multiple-Instance Learning (MIL) algorithm for determining the salience of each item in each bag with respect to the bag's real-valued label. We use an alternating-projections constrained optimization approach to simultaneously learn a regression model and estimate all salience values. We evaluate this algorithm on a significant real-world problem, crop yield modeling, and demonstrate that it provides more extensive, intuitive, and stable salience models than Primary-Instance Regression, which selects a single relevant item from each bag.

Wagstaff, Kiri L.; Lane, Terran

2007-01-01

72

Multiple Regression Analyses in Clinical Child and Adolescent Psychology

A major form of data analysis in clinical child and adolescent psychology is multiple regression. This article reviews issues in the application of such methods in light of the research designs typical of this field. Issues addressed include controlling covariates, evaluation of predictor relevance, comparing predictors, analysis of moderation,…

Jaccard, James; Guilamo-Ramos, Vincent; Johansson, Margaret; Bouris, Alida

2006-01-01

73

In this study we investigated the relationship between the antifungal activity of some benzimidazole derivatives and some absorption, distribution, metabolism and excretion (ADME) parameters. The antifungal activity of the studied compounds against Saccharomyces cerevisiae was expressed as the minimal inhibitory concentration (MIC). A statistically significant quantitative structure-activity relationship (QSAR) model for predicting the antifungal activity of the investigated benzimidazole derivatives against Saccharomyces cerevisiae was obtained by multiple linear regression (MLR) using ADME parameters. The quality of the MLR model was validated by the leave-one-out (LOO) technique, as well as by the calculation of the statistical parameters for the developed model, and the results are discussed based on the statistical data. [Project of the Ministry of Science of the Republic of Serbia, no. 172012 and no. 172014]

Kalajdžija Nataša D.

2013-01-01
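Leave-one-out validation of an MLR model, as mentioned in the record above, does not require refitting the model n times: for ordinary least squares the deleted residual equals e_i / (1 − h_ii), where h_ii is the i-th diagonal entry of the hat matrix. The sketch below uses that identity to compute a cross-validated Q². The function name `loo_q2`, the definition of Q² relative to the full-sample mean, and the synthetic data are assumptions for illustration and are not taken from the study.

```python
import numpy as np

def loo_q2(X, y):
    """Leave-one-out Q^2 = 1 - PRESS / TSS for an OLS model with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])       # design matrix with intercept
    H = A @ np.linalg.pinv(A)                  # hat matrix
    resid = y - H @ y                          # ordinary residuals
    loo_resid = resid / (1.0 - np.diag(H))     # deleted (PRESS) residuals
    press = np.sum(loo_resid ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / tss

# Synthetic descriptors and activities, for illustration only
rng = np.random.default_rng(1)
X = rng.standard_normal((15, 2))
y = 1.0 + X @ np.array([0.5, -1.0]) + 0.1 * rng.standard_normal(15)
print(round(loo_q2(X, y), 3))
```

A Q² close to the ordinary R² suggests the model is not driven by a few influential observations; a much lower Q² is a warning sign of overfitting.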

74

Multiple-Regression Hidden Markov Model

This paper proposes a new class of hidden Markov model (HMM) called multiple-regression HMM (MR-HMM) that utilizes auxiliary features such as fundamental frequency (F0) and speaking styles that affect spectral parameters to better model the acoustic features of phonemes. Though such auxiliary features are considered to be the factors that degrade the performance of speech recognizers, the proposed MR-HMM adapts its model parameters, i.e. mean vectors of output probabili...

Fujinaga, Katsuhisa; Nakai, Mitsuru; Shimodaira, Hiroshi; Sagayama, Shigeki

2001-01-01

75

Flexible Model Selection Criterion for Multiple Regression

Predictors of a multiple linear regression equation selected by *GCV* (Generalized Cross Validation) may contain undesirable predictors that have no linear functional relationship with the target variable and are chosen only by accident. This is because GCV estimates prediction error, but does not control the probability of selecting irrelevant predictors of the target variable. To take this possibility into account, a new statistic “*GCV_{f}*”…

Kunio Takezawa

2012-01-01
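For reference, the ordinary GCV score that the record above starts from can be written for a least-squares fit as GCV = n · RSS / (n − p)², where p is the number of fitted parameters (the trace of the hat matrix). The sketch below computes only this standard score. It does not implement the modified statistic proposed in the paper, whose definition is not given in the abstract. The function name `gcv` and the toy data are hypothetical.

```python
import numpy as np

def gcv(X, y):
    """GCV score of an OLS fit with intercept: n * RSS / (n - p)^2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    p = A.shape[1]   # equals trace of the hat matrix when A has full column rank
    return n * rss / (n - p) ** 2

# Toy data: four observations, one predictor
print(gcv([[0.0], [1.0], [2.0], [3.0]], [0.0, 1.0, 0.0, 1.0]))
```

Comparing `gcv` across candidate predictor subsets and keeping the smallest value is the selection rule the abstract refers to. Because the score only estimates prediction error, an irrelevant predictor can still win by chance.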

76

Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.

Lee, L.; Helsel, D.

2005-01-01

77

The thermal inactivation of Enterococcus faecium under isothermal conditions in tryptic soy broth of different pH (4.0, 5.5 and 7.4) was studied. The bacterial cells were more sensitive at higher temperatures and in media of low pH. Decimal reduction times at 71 ºC were 2.56, 0.39 and 0.03 min at pH 7.4, 5.5 and 4.0, respectively. At all temperatures and pH values assayed, the survival curves obtained were linear. A mathematical model based on first-order kinetics accurately described these survival curves. The relationship between D_T values and temperature was also linear. A mean z-value of 5 ºC was established. A multiple linear regression model using four predictor variables (pH, T, pH² and T²) related the log of the D_T value to pH and treatment temperature. The developed tertiary model satisfactorily predicted the heat inactivation of Enterococcus faecium under the treatment conditions investigated.

S. CONDON

2014-06-01
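The z-value quoted in the record above is conventionally obtained from the slope of log10(D) against temperature: if log10 D = a − T/z, then z = −1/slope. The sketch below illustrates that calculation. Only the D-value at 71 ºC (2.56 min at pH 7.4) and the z-value of 5 ºC come from the abstract. The other two D-values are hypothetical, constructed to be consistent with z = 5 ºC, and the function name `z_value` is an assumption.

```python
import numpy as np

def z_value(temps_c, d_values_min):
    """Estimate z (in degrees C) from D-values via a straight-line fit of log10(D) on T."""
    slope, _intercept = np.polyfit(temps_c, np.log10(d_values_min), 1)
    return -1.0 / slope

# 71 C value from the abstract; 66 C and 76 C values are hypothetical
temps = np.array([66.0, 71.0, 76.0])
d_vals = np.array([25.6, 2.56, 0.256])
print(z_value(temps, d_vals))
```

The multiple regression model in the abstract extends this idea by regressing log10(D) on pH, T, pH² and T² together, rather than on temperature alone.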

78

Omnibus Hypothesis Testing in Dominance-Based Ordinal Multiple Regression

Often quantitative data in the social sciences have only ordinal justification. Problems of interpretation can arise when least squares multiple regression (LSMR) is used with ordinal data. Two ordinal alternatives are discussed, dominance-based ordinal multiple regression (DOMR) and proportional odds multiple regression. The Q[superscript 2]…

Long, Jeffrey D.

2005-01-01

79

Scientific Electronic Library Online (English)

This paper joins the main properties of joint regression analysis (JRA), a model based on the Finlay-Wilkinson regression to analyse multi-environment trials, and of the additive main effects and multiplicative interaction (AMMI) model. The study compares JRA and AMMI with particular focus on robustness with increasing amounts of randomly selected missing data. The application is made using a data set from a breeding program of durum wheat (Triticum turgidum L., Durum Group) conducted in Portugal. The two models yield similar dominant cultivars (JRA) and winners of mega-environments (AMMI) for the same environments. However, JRA had more stable results as the incidence of missing values increased.

Paulo Canas, Rodrigues; Dulce Gamito Santinhos, Pereira; João Tiago, Mexia.

2011-12-01

80

The objective of this research is to compare methods for estimating multiple regression coefficients when multicollinearity exists among the independent variables. The estimation methods are the Ordinary Least Squares method (OLS), the Restricted Least Squares method (RLS), the Restricted Ridge Regression method (RRR) and the Restricted Liu method (RL), considered both when the restrictions are true and when they are not. The study used Monte Carlo simulation, and the experiment was repeated 1,000 times under each situation. The results are as follows. CASE 1: The restrictions are true. In all cases, the RRR and RL methods have a smaller Average Mean Square Error (AMSE) than the OLS and RLS methods, respectively. The RRR method provides the smallest AMSE when the level of correlation is high, and also for all levels of correlation and all sample sizes when the standard deviation equals 5. However, the RL method provides the smallest AMSE when the level of correlation is low or medium, except that for a standard deviation of 3 and small sample sizes the RRR method provides the smallest AMSE. The AMSE varies directly with, from most to least influential, the level of correlation, the standard deviation and the number of independent variables, but inversely with sample size. CASE 2: The restrictions are not true. In all cases, the RRR method provides the smallest AMSE, except when the standard deviation equals 1 and the error of restrictions equals 5%: there the OLS method provides the smallest AMSE when the level of correlation is low or medium and the sample size is large, whereas for small sample sizes the RL method provides the smallest AMSE. In addition, when the error of restrictions is increased, the OLS method provides the smallest AMSE for all levels of correlation and all sample sizes, except when the level of correlation is high and the sample size is small. Moreover, in the cases where the OLS method provides the smallest AMSE, the RLS method mostly has a smaller AMSE than the RRR and RL methods when the level of correlation is low or medium and sample sizes are large. The AMSE varies directly with, from most to least influential, the error of restrictions, the level of correlation, the standard deviation and the number of independent variables, but inversely with sample size, except that the error of restrictions does not affect the AMSE of the OLS method.

Hukharnsusatrue, A.

2005-11-01
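The kind of Monte Carlo comparison described in the record above can be sketched in a few lines. The example below is a simplified illustration, not the study's design: it compares only OLS with ordinary (unrestricted) ridge regression, and the correlation level, sample size, ridge constant `k`, true coefficients and seed are all assumed values. The restricted estimators (RLS, RRR, RL) are not implemented.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares estimate (no intercept, for simplicity)."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, k):
    """Ridge estimate with ridge constant k > 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def simulate(rho=0.95, n=30, p=4, sigma=5.0, k=1.0, reps=1000, seed=0):
    """Average squared estimation error of OLS and ridge over `reps` replications."""
    rng = np.random.default_rng(seed)
    beta = np.ones(p)                                    # assumed true coefficients
    cov = np.full((p, p), rho) + (1.0 - rho) * np.eye(p) # equicorrelated regressors
    L = np.linalg.cholesky(cov)
    se_ols = se_ridge = 0.0
    for _ in range(reps):
        X = rng.standard_normal((n, p)) @ L.T            # rows have covariance `cov`
        y = X @ beta + sigma * rng.standard_normal(n)
        se_ols += np.sum((ols(X, y) - beta) ** 2)
        se_ridge += np.sum((ridge(X, y, k) - beta) ** 2)
    return se_ols / reps, se_ridge / reps

amse_ols, amse_ridge = simulate()
print(amse_ols, amse_ridge)
```

Varying `rho`, `sigma` and `n` in `simulate` reproduces the general pattern the abstract reports for the factors that drive AMSE, within the limits of this simplified setup.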

81

Linear calibration is usually performed using eight to ten calibration concentration levels in regulated LC-MS bioanalysis because a minimum of six are specified in regulatory guidelines. However, we have previously reported that two-concentration linear calibration is as reliable as or even better than using multiple concentrations. The purpose of this research is to compare two-concentration with multiple-concentration linear calibration through retrospective data analysis of multiple bioanalytical projects that were conducted in an independent regulated bioanalytical laboratory. A total of 12 bioanalytical projects were randomly selected: two validations and two studies for each of the three most commonly used types of sample extraction methods (protein precipitation, liquid-liquid extraction, solid-phase extraction). When the existing data were retrospectively linearly regressed using only the lowest and the highest concentration levels, no extra batch failure/QC rejection was observed and the differences in accuracy and precision between the original multi-concentration regression and the new two-concentration linear regression are negligible. Specifically, the differences in overall mean apparent bias (square root of mean individual bias squares) are within the ranges of -0.3% to 0.7% and 0.1-0.7% for the validations and studies, respectively. The differences in mean QC concentrations are within the ranges of -0.6% to 1.8% and -0.8% to 2.5% for the validations and studies, respectively. The differences in %CV are within the ranges of -0.7% to 0.9% and -0.3% to 0.6% for the validations and studies, respectively. The average differences in study sample concentrations are within the range of -0.8% to 2.3%. With two-concentration linear regression, an average of 13% of time and cost could have been saved for each batch together with 53% of saving in the lead-in for each project (the preparation of working standard solutions, spiking, and aliquoting). 
Furthermore, examples are given of how to evaluate the linearity over the entire concentration range when only two concentration levels are used for linear regression. To conclude, two-concentration linear regression is accurate and robust enough for routine use in regulated LC-MS bioanalysis and it significantly saves time and cost as well. PMID:23917407

Musuku, Adrien; Tan, Aimin; Awaiye, Kayode; Trabelsi, Fethi

2013-09-01
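The retrospective comparison in the record above amounts to refitting each calibration line from only the lowest and highest standards and comparing back-calculated concentrations. A minimal sketch follows. The concentration levels, responses and QC value are hypothetical, and the fit is an unweighted straight line, which is an assumption made here for brevity and may differ from the weighting used in the study.

```python
import numpy as np

def fit_line(conc, resp):
    """Unweighted straight-line calibration: returns (slope, intercept)."""
    slope, intercept = np.polyfit(conc, resp, 1)
    return slope, intercept

def back_calc(resp, slope, intercept):
    """Convert an instrument response back to a concentration."""
    return (np.asarray(resp, dtype=float) - intercept) / slope

# Hypothetical eight-level calibration curve with a perfectly linear response
conc = np.array([1.0, 2.0, 5.0, 10.0, 50.0, 100.0, 500.0, 1000.0])
resp = 0.02 * conc + 0.001

full = fit_line(conc, resp)                       # all eight levels
two = fit_line(conc[[0, -1]], resp[[0, -1]])      # lowest and highest only

qc_resp = 0.02 * 40.0 + 0.001                     # hypothetical QC at 40 units
print(back_calc(qc_resp, *full), back_calc(qc_resp, *two))
```

With noise-free linear data the two fits coincide. With real data, the difference between the two back-calculated values is the quantity the study summarizes as bias and %CV differences.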

82

The set of alcohol-metabolizing enzymes has considerable genetic and functional complexity. The relationships between some alcohol dehydrogenase (ADH) and aldehyde dehydrogenase (ALDH) genes and alcohol dependence (AD) have long been studied in many populations, but not comprehensively. In the present study, we genotyped 16 markers within the ADH gene cluster (including the ADH1A, ADH1B, ADH1C, ADH5, ADH6, and ADH7 genes), 4 markers within the ALDH2 gene, and 38 unlinked ancestry-informative markers in a case-control sample of 801 individuals. Associations between markers and disease were analyzed by a Hardy-Weinberg equilibrium (HWE) test, a conventional case-control comparison, a structured association analysis, and a novel diplotype trend regression (DTR) analysis. Finally, the disease alleles were fine mapped by a Hardy-Weinberg disequilibrium (HWD) measure (J). All markers were found to be in HWE in controls, but some markers showed HWD in cases. Genotypes of many markers were associated with AD. DTR analysis showed that ADH5 genotypes and diplotypes of ADH1A, ADH1B, ADH7, and ALDH2 were associated with AD in European Americans and/or African Americans. The risk-influencing alleles were fine mapped from among the markers studied and were found to coincide with some well-known functional variants. We demonstrated that DTR was more powerful than many other conventional association methods. We also found that several ADH genes and the ALDH2 gene were susceptibility loci for AD, and the associations were best explained by several independent risk genes. PMID:16685648

Luo, Xingguang; Kranzler, Henry R; Zuo, Lingjun; Wang, Shuang; Schork, Nicholas J; Gelernter, Joel

2006-06-01
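A Hardy-Weinberg equilibrium test of the kind mentioned in the record above is often carried out as a Pearson chi-square test on genotype counts for a biallelic marker. The sketch below shows that textbook version. It is an assumption that this matches the test used in the study, and the function name `hwe_chi2` and the genotype counts are hypothetical.

```python
def hwe_chi2(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Assumes both alleles are observed, so that all expected counts are positive.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)        # frequency of allele A
    q = 1.0 - p                            # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical genotype counts for one marker
stat = hwe_chi2(30, 50, 20)
print(stat, stat > 3.84)   # 3.84 is the 5% critical value of chi-square with 1 d.f.
```

Running the test separately in controls and in cases mirrors the abstract's observation that markers were in equilibrium in controls while some showed disequilibrium in cases.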

83

Dimension Reduction of the Explanatory Variables in Multiple Linear Regression

In classical multiple linear regression analysis problems will occur if the regressors are either multicollinear or if the number of regressors is larger than the number of observations. In this note a new method is introduced which constructs orthogonal predictor variables in a way to have a maximal correlation with the dependent variable. The predictor variables are linear combinations of the original regressors. This method allows a major reduction of the number of predictor...

Filzmoser, P.; Croux, C.

2003-01-01

84

Shrinkage Estimation and Selection for Multiple Functional Regression

Functional linear regression is a useful extension of simple linear regression and has been investigated by many researchers. However, functional variable selection problems when multiple functional observations exist, which is the counterpart in the functional context of multiple linear regression, is seldom studied. Here we propose a method using group smoothly clipped absolute deviation penalty (gSCAD) which can perform regression estimation and variable selection simulta...

Lian, Heng

2011-01-01

85

MORES: Online Incremental Multiple-Output Regression for Data Streams

Online multiple-output regression is an important machine learning technique for modeling, predicting, and compressing multi-dimensional correlated data streams. In this paper, we propose a novel online multiple-output regression method, called MORES, for streaming data. MORES can dynamically learn the structure of the regression coefficients to facilitate the model's continuous refinement. We observe that limited expressive ability of the regression model, especially...

Li, Changsheng; Dong, Weishan; Liu, Qingshan; Zhang, Xin

2014-01-01

86

Four Assumptions of Multiple Regression That Researchers Should Always Test.

Discusses assumptions of multiple regression that are not robust to violation: linearity, reliability of measurement, homoscedasticity, and normality. Stresses the importance of checking assumptions. (SLD)

Osborne, Jason W.; Waters, Elaine

2002-01-01

87

Background: There is a small but growing body of literature highlighting inequities in GP practice prescribing rates for many drug therapies. The aim of this paper is to further explore the equity of prescribing for five major CHD drug groups and to determine how much of the variation in GP practice prescribing rates can be explained by a range of healthcare needs indicators (HCNIs). Methods: The study involved a cross-sectional secondary analysis in four primary care trusts (PCTs 1–4) in the North West of England, including 132 GP practices. Prescribing rates (average daily quantities per registered patient aged over 35 years) and HCNIs were developed for all GP practices. Analysis was undertaken using multiple linear regression. Results: Between 22–25% of the variation in prescribing rates for statins, beta-blockers and bendrofluazide was explained in the multiple regression models. Slightly more variation was explained for ACE inhibitors (31.6%) and considerably more for aspirin (51.2%). Prescribing rates were positively associated with CHD hospital diagnoses and procedures for all drug groups other than ACE inhibitors. The proportion of patients aged 55–74 years was positively related to all prescribing rates other than aspirin, where they were positively related to the proportion of patients aged >75 years. However, prescribing rates for statins and ACE inhibitors were negatively associated with the proportion of patients aged >75 years in addition to the proportion of patients from minority ethnic groups. Prescribing rates for aspirin, bendrofluazide and all CHD drugs combined were negatively associated with deprivation. Conclusion: Although around 25–50% of the variation in prescribing rates was explained by HCNIs, this varied markedly between PCTs and drug groups. Prescribing rates were generally characterised by both positive and negative associations with HCNIs, suggesting possible inequities in prescribing rates on the basis of ethnicity, deprivation and the proportion of patients aged over 75 years (for statins and ACE inhibitors, but not for aspirin).

St Leger Antony S

2005-02-01

88

The relationship between maceral content plus mineral matter and gross calorific value (GCV) for a wide range of West Virginia coal samples (from 6518 to 15330 BTU/lb; 15.16 to 35.66 MJ/kg) has been investigated by multivariable regression and adaptive neuro-fuzzy inference system (ANFIS). The stepwise least square mathematical method comparison between liptinite, vitrinite, plus mineral matter as input data sets with measured GCV reported a nonlinear correlation coefficient (R²) of 0.83. Using the same data set the correlation between the predicted GCV from the ANFIS model and the actual GCV reported an R² value of 0.96. It was determined that the GCV-based prediction methods, as used in this article, can provide a reasonable estimation of GCV. Copyright © Taylor & Francis Group, LLC.

Chelgani, S.C.; Hart, B.; Grady, W.C.; Hower, J.C.

2011-01-01

89

Retail sales forecasting with application the multiple regression

The article begins with a formulation for predictive learning called the multiple regression model. The theoretical approach to constructing regression models is described. The key contribution of the article is the mathematical formulation of the linear forecast equation that estimates the multiple regression model. The calculation of the forecast value of the dependent variable under the influence of the independent variables is explained. The paper presents retail sales forecasting with multiple-model estimation. Information obtained from multiple regression can support some of the most important decisions a retailer makes. The changing retail environment is driven by expected consumer income and advertising costs. Checks of the model's goodness of fit and statistical significance are explored. Finally, the quantitative value of the retail sales forecast based on the multiple regression model is calculated.

Kuzhda, Tetyana

2012-05-01
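The forecast equation discussed in the record above has the general form ŷ = b0 + b1·x1 + b2·x2, with the coefficients estimated by least squares. The sketch below fits such an equation and evaluates it at new predictor values. The data, the choice of income and advertising cost as the two predictors, and the function names `fit_mlr` and `forecast` are hypothetical and serve only to illustrate the calculation.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares coefficients [b0, b1, ..., bp] of a model with intercept."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef

def forecast(coef, x_new):
    """Point forecast b0 + sum_j b_j * x_j for one new observation."""
    return float(coef[0] + np.dot(np.asarray(x_new, dtype=float), coef[1:]))

# Hypothetical history: columns are consumer income and advertising cost
X = np.array([[20.0, 1.0], [22.0, 1.5], [25.0, 1.2],
              [27.0, 2.0], [30.0, 2.2], [33.0, 2.1]])
y = 5.0 + 2.0 * X[:, 0] + 10.0 * X[:, 1]      # noise-free sales, for illustration

coef = fit_mlr(X, y)
print(coef, forecast(coef, [35.0, 2.5]))
```

In practice the fit would be followed by the goodness-of-fit and significance checks the abstract mentions, for example R² and the overall F test, before the forecast is used.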

90

Fuzzy multiple linear regression: A computational approach

This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.

Juang, C. H.; Huang, X. H.; Fleming, J. W.

1992-01-01

91

Leaf chlorophyll content provides valuable information about physiological status of plants; it is directly linked to photosynthetic potential and primary production. In vitro assessment by wet chemical extraction is the standard method for leaf chlorophyll determination. This measurement is expensive, laborious, and time consuming. Over the years alternative methods, rapid and non-destructive, have been explored. The aim of this work was to evaluate the applicability of a fast and non-invasive field method for estimation of chlorophyll content in quinoa and amaranth leaves based on RGB components analysis of digital images acquired with a standard SLR camera. Digital images of leaves from different genotypes of quinoa and amaranth were acquired directly in the field. Mean values of each RGB component were evaluated via image analysis software and correlated to leaf chlorophyll provided by standard laboratory procedure. Single and multiple regression models using RGB color components as independent variables have been tested and validated. The performance of the proposed method was compared to that of the widely used non-destructive SPAD method. Sensitivity of the best regression models for different genotypes of quinoa and amaranth was also checked. Color data acquisition of the leaves in the field with a digital camera was quick, more effective, and lower cost than SPAD. The proposed RGB models provided better correlation (highest R²) and prediction (lowest RMSEP) of the true value of foliar chlorophyll content and had a lower amount of noise in the whole range of chlorophyll studied compared with SPAD and other leaf image processing based models when applied to quinoa and amaranth.

Riccardi, M.; Mele, G.

2014-01-01

92

Leaf chlorophyll content provides valuable information about physiological status of plants; it is directly linked to photosynthetic potential and primary production. In vitro assessment by wet chemical extraction is the standard method for leaf chlorophyll determination. This measurement is expensive, laborious, and time consuming. Over the years alternative methods, rapid and non-destructive, have been explored. The aim of this work was to evaluate the applicability of a fast and non-invasive field method for estimation of chlorophyll content in quinoa and amaranth leaves based on RGB components analysis of digital images acquired with a standard SLR camera. Digital images of leaves from different genotypes of quinoa and amaranth were acquired directly in the field. Mean values of each RGB component were evaluated via image analysis software and correlated to leaf chlorophyll provided by standard laboratory procedure. Single and multiple regression models using RGB color components as independent variables have been tested and validated. The performance of the proposed method was compared to that of the widely used non-destructive SPAD method. Sensitivity of the best regression models for different genotypes of quinoa and amaranth was also checked. Color data acquisition of the leaves in the field with a digital camera was quick, more effective, and lower cost than SPAD. The proposed RGB models provided better correlation (highest R²) and prediction (lowest RMSEP) of the true value of foliar chlorophyll content and had a lower amount of noise in the whole range of chlorophyll studied compared with SPAD and other leaf image processing based models when applied to quinoa and amaranth. PMID:24442792

Riccardi, M; Mele, G; Pulvento, C; Lavini, A; d'Andria, R; Jacobsen, S-E

2014-06-01

93

Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression

The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…

Beckstead, Jason W.

2012-01-01

94

Estimation of transport airplane aerodynamics using multiple stepwise regression

This paper presents an application of multiple stepwise regression to the flight test data of a typical transport airplane. The flight test data was carefully preprocessed to eliminate aliasing, time skews and high frequency noise. The data consisted both of basic certification maneuvers, such as wind-up-turns and maneuvers suitable for parameter estimation, such as responses to elevator pulses and doublets. It is shown that the results of multiple stepwise regression techniques compare favorably with the results obtained from maximum likelihood estimation. Finally, it is concluded that multiple stepwise regression could be a fast economical way to estimate transport airplane aerodynamics.

Keskar, D. A.; Klein, V.; Batterson, J. G.

1985-01-01
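Stepwise regression of the kind applied in the record above adds candidate regressors one at a time according to a partial F statistic. The sketch below shows only the forward (entry) half of the procedure on synthetic data. Full stepwise regression also re-tests and removes terms, which is omitted here. The entry threshold `f_enter`, the function names and the data are assumptions and are not taken from the paper.

```python
import numpy as np

def rss(A, y):
    """Residual sum of squares of the least-squares fit of y on the columns of A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def forward_stepwise(X, y, f_enter=10.0):
    """Greedy forward selection; returns column indices in order of entry."""
    n, p = X.shape
    selected = []
    A = np.ones((n, 1))                  # start from the intercept-only model
    current = rss(A, y)
    while len(selected) < p:
        df = n - A.shape[1] - 1          # residual d.o.f. after adding one term
        if df <= 0:
            break
        best_f, best_j, best_rss = -np.inf, None, None
        for j in range(p):
            if j in selected:
                continue
            trial = rss(np.column_stack([A, X[:, j]]), y)
            # partial F for adding column j (guarded against a zero denominator)
            f = (current - trial) / max(trial / df, 1e-300)
            if f > best_f:
                best_f, best_j, best_rss = f, j, trial
        if best_j is None or best_f < f_enter:
            break                        # no remaining candidate is significant enough
        selected.append(best_j)
        A = np.column_stack([A, X[:, best_j]])
        current = best_rss
    return selected

# Synthetic data: only columns 0 and 2 actually influence y
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.standard_normal(200)
print(forward_stepwise(X, y))
```

In a flight-test setting the columns of `X` would be candidate terms of the aerodynamic model and `y` a measured force or moment coefficient. The procedure then retains only the terms that the data support.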

95

A variable group Y is assumed to depend upon R thematic variable groups X_1, ..., X_R. We assume that components in Y depend linearly upon components in the X_r's. In this work, we propose a multiple covariance criterion which extends that of PLS regression to this multiple predictor groups situation. On this criterion, we build a PLS-type exploratory method - Structural Equation Exploratory Regression (SEER) - that allows one to simultaneously perform dimension reduction in gr...

Bry, Xavier; Verron, Thomas; Cazes, Pierre

2008-01-01

96

Wire Electrical Discharge Machining (WEDM) is a specialized thermal machining process capable of accurately machining parts with varying hardness or complex shapes that have sharp edges and are very difficult to machine by mainstream machining processes. In WEDM a specific wire run-off speed is applied to compensate for wear and avoid wire breakage. Since the workpiece generally stays stationary and short discharge durations are applied, the relative displacement between wire and workpiece during a single discharge is very small. This study outlines the development of a model and its application to optimizing WEDM machining parameters using Taguchi's technique, which is based on robust design, and addresses electrode wear estimation in wire EDM. EN-8 and EN-19 were machined using different process parameters based on an L16 orthogonal array. Among the process parameters, voltage and flush rate were kept constant, while bed speed, current, pulse-on and pulse-off were varied. Molybdenum wire with a diameter of 0.18 mm was used as the electrode. Electrode wear was measured using a universal measuring machine. Electrode wear was estimated and compared using multiple regression analysis (MRA) and the group method of data handling (GMDH) technique. The results showed that measured and estimated electrode wear correlate better for MRA than for GMDH.

G. Ugrasen

2014-05-01

97

International Nuclear Information System (INIS)

Various statistical techniques were used on five-year data from 1998-2002 of average humidity, rainfall, and maximum and minimum temperatures. Regression analysis time series (RATS) relationships were developed for determining the overall trend of these climate parameters, on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit for our polynomial regression analysis time series (PRATS). Multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) correlations were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, respectively, which yielded homoscedasticity. We also employed Bartlett's test for homogeneity of variances on five-year data of rainfall and humidity, which showed that the variances in the rainfall data were not homogeneous while those in the humidity data were homogeneous. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)
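
The Breusch-Pagan check named in the entry above can be illustrated with a minimal sketch. It assumes the studentized (Koenker) form of the statistic, LM = n * R^2 from an auxiliary regression of squared OLS residuals on the regressors; the synthetic data and the function name `breusch_pagan_lm` are illustrative only and are not taken from the study.

```python
import numpy as np

def breusch_pagan_lm(X, y):
    """Studentized Breusch-Pagan statistic: n * R^2 of squared residuals on X."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # OLS fit of y on X
    u2 = (y - A @ beta) ** 2                       # squared residuals
    gamma, *_ = np.linalg.lstsq(A, u2, rcond=None) # auxiliary regression
    resid = u2 - A @ gamma
    r2 = 1.0 - resid @ resid / ((u2 - u2.mean()) ** 2).sum()
    return n * r2

# Synthetic data (assumption): one regressor, n = 500
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 5.0, size=n)
y_hom = 2.0 + 1.5 * x + rng.normal(scale=1.0, size=n)   # constant error variance
y_het = 2.0 + 1.5 * x + rng.normal(size=n) * x          # error sd grows with x

lm_hom = breusch_pagan_lm(x.reshape(-1, 1), y_hom)
lm_het = breusch_pagan_lm(x.reshape(-1, 1), y_het)
print(lm_hom, lm_het)
```

Under homoscedasticity the statistic is approximately chi-squared with as many degrees of freedom as there are regressors (one here, 5% critical value about 3.84), so a large value points to non-constant error variance.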

98

Using Cigarette Data for An Introduction to Multiple Regression

This article, created by Lauren McIntyre of North Carolina State University, describes a dataset containing information for twenty-five brands of domestic cigarettes. The information collected includes measurements of weight, tar, nicotine and carbon monoxide. The dataset can be used to illustrate multiple regression, outliers, and collinearity. Speaking to this, the author states: "The dataset is useful for introducing the ideas of multiple regression and provides examples of an outlier and a pair of collinear variables."

McIntyre, Lauren

99

Fuzzy Multiple Regression Model for Estimating Software Development Time

As software becomes more complex and its scope dramatically increases, the importance of research on developing methods for estimating software development time has perpetually increased, so accurate estimation is the main goal of software managers for reducing project risks. The purpose of this article is to introduce a new Fuzzy Multiple Regression approach, which has higher accuracy than other estimation methods. Furthermore, we compare Fuzzy Multiple Regression model with Fuzzy...

Venus Marza; Mir Ali Seyyedi

2009-01-01

100

In ordinary statistical methods, multiple outliers in a multiple linear regression model are detected sequentially, one after another, where smearing and masking effects give misleading results. If the potential multiple outliers can be detected simultaneously, smearing and...

Salena Akter; Khan, Mozammel H. A.

2010-01-01

101

Confidence Intervals for an Effect Size Measure in Multiple Linear Regression

The increase in the squared multiple correlation coefficient (ΔR²) associated with a variable in a regression equation is a commonly used measure of importance in regression analysis. The coverage probability that an asymptotic and percentile bootstrap confidence interval includes Δρ² was investigated. As expected,…

Algina, James; Keselman, H. J.; Penfield, Randall D.

2007-01-01
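
As an illustration of the effect size measure in the entry above, the following minimal sketch computes the increase in R^2 attributable to one predictor and a percentile bootstrap interval for it. The data are synthetic and the helper names (`r_squared`, `delta_r2`, `percentile_bootstrap_ci`) are hypothetical; this is not the authors' simulation design, and it does not cover the asymptotic interval they also examine.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit of y on X (an intercept column is added here)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - resid @ resid / ((y - y.mean()) ** 2).sum()

def delta_r2(X, y, j):
    """Increase in R^2 when column j is added to the remaining predictors."""
    return r_squared(X, y) - r_squared(np.delete(X, j, axis=1), y)

def percentile_bootstrap_ci(X, y, j, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap interval for delta R^2, resampling cases."""
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        stats[b] = delta_r2(X[idx], y[idx], j)
    alpha = 1.0 - level
    lo, hi = np.quantile(stats, [alpha / 2, 1.0 - alpha / 2])
    return lo, hi

# Synthetic data (assumption): three predictors, the third is irrelevant
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 1.0 + 0.8 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=200)

d = delta_r2(X, y, 0)
ci = percentile_bootstrap_ci(X, y, 0)
print(d, ci)
```

Because the two models are nested, the sample ΔR² cannot be negative, which is one reason the behavior of intervals near zero deserves the kind of coverage study the entry describes.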

102

Problem statement: This study presents a novel method for the determination of average winding temperature rise of transformers under their predetermined field operating conditions. Rise in the winding temperature was determined from the estimated values of winding resistance during the heat run test conducted as per IEC standard. Approach: The estimation of hot resistance was modeled using Multiple Variable Regression (MVR), Multiple Polynomial Regression (MPR) and soft computing...

Srinivasan, M.; Krishnan, A.

2012-01-01

103

In the first part of this work [1] a field operational test (FOT) on micro-HEVs (hybrid electric vehicles) and conventional vehicles was introduced. Valve-regulated lead-acid (VRLA) batteries in absorbent glass mat (AGM) technology and flooded batteries were applied. The FOT data were analyzed by kernel density estimation. In this publication multiple regression analysis is applied to the same data. Square regression models without interdependencies are used. Hereby, capacity loss serves as dependent parameter and several battery-related and vehicle-related parameters as independent variables. Battery temperature is found to be the most critical parameter. It is proven that flooded batteries operated in the conventional power system (CPS) degrade faster than VRLA-AGM batteries in the micro-hybrid power system (MHPS). A smaller number of FOT batteries were applied in a vehicle-assigned test design where the test battery is repeatedly mounted in a unique test vehicle. Thus, vehicle category and specific driving profiles can be taken into account in multiple regression. Both parameters have only secondary influence on battery degradation, instead, extended vehicle rest time linked to low mileage performance is more serious. A tear-down analysis was accomplished for selected VRLA-AGM batteries operated in the MHPS. Clear indications are found that pSoC-operation with periodically fully charging the battery (refresh charging) does not result in sulphation of the negative electrode. Instead, the batteries show corrosion of the positive grids and weak adhesion of the positive active mass.

Schaeck, S.; Karspeck, T.; Ott, C.; Weirather-Koestner, D.; Stoermer, A. O.

2011-03-01

104

Relative risk regression analysis of epidemiologic data.

Relative risk regression methods are described. These methods provide a unified approach to a range of data analysis problems in environmental risk assessment and in the study of disease risk factors more generally. Relative risk regression methods are most readily viewed as an outgrowth of Cox's regression and life model. They can also be viewed as a regression generalization of more classical epidemiologic procedures, such as that due to Mantel and Haenszel. In the context of an epidemiolog...

Prentice, R. L.

1985-01-01

105

Sample Sizes when Using Multiple Linear Regression for Prediction

When using multiple regression for prediction purposes, the issue of minimum required sample size often needs to be addressed. Using a Monte Carlo simulation, models with varying numbers of independent variables were examined and minimum sample sizes were determined for multiple scenarios at each number of independent variables. The scenarios…

Knofczynski, Gregory T.; Mundfrom, Daniel

2008-01-01

106

Vehicle Travel Time Predication based on Multiple Kernel Regression

With the rapid development of the transportation and logistics economy, vehicle travel time prediction and planning have become an important topic in logistics. Travel time prediction, which is indispensable for traffic guidance, has become a key issue for researchers in this field. At present, the prediction of travel time is mainly short-term prediction, and the prediction methods include artificial neural networks, the Kalman filter and support vector regression (SVR), etc. However, these algorithms still have some shortcomings, such as high computational complexity and slow convergence rate. This paper exploits the learning ability of multiple kernel learning regression (MKLR) in nonlinear prediction and applies MKLR-based logistics planning to vehicle travel time prediction. The method for vehicle travel time prediction includes the following steps: (1) preprocessing historical data; (2) selecting an appropriate kernel function, training on the historical data and performing analysis; (3) predicting the vehicle travel time based on the trained model. The experimental results show that the vehicle travel time prediction method proposed in this paper achieves higher accuracy than the other methods compared. They also illustrate the feasibility and effectiveness of the proposed prediction method.

Wenjing Xu

2014-07-01

107

On relationship between regression models and interpretation of multiple regression coefficients

In this paper, we consider the problem of treating linear regression equation coefficients in the case of correlated predictors. It is shown that in general there are no natural ways of interpreting these coefficients similar to the case of single predictor. Nevertheless we suggest linear transformations of predictors, reducing multiple regression to a simple one and retaining the coefficient at variable of interest. The new variable can be treated as the part of the old var...

Varaksin, A. N.; Panov, V. G.

2012-01-01

108

Steganalysis of LSB Image Steganography using Multiple Regression and Auto Regressive (AR) Model

The staggering growth in communication technology and usage of public domain channels (i.e., the Internet) has greatly facilitated the transfer of data. However, such open communication channels have greater vulnerability to security threats causing unauthorized information access. Traditionally, encryption is used to realize communication security. However, important information is not protected once decoded. Steganography is the art and science of communicating in a way which hides the existence of the communication. Important information is firstly hidden in host data, such as a digital image, text, video or audio, and then transmitted secretly to the receiver. Steganalysis is another important topic in information hiding, which is the art of detecting the presence of steganography. In this paper a novel technique for the steganalysis of images is presented. The proposed technique uses an auto-regressive model to detect the presence of hidden messages, as well as to estimate the relative length of the embedded messages. Various auto-regressive parameters are used to classify cover images as well as stego images with the help of an SVM classifier. Multiple regression analysis of the cover carrier along with the stego carrier has been carried out in order to find out the existence of a negligible amount of the secret message. Experimental results demonstrate the effectiveness and accuracy of the proposed technique.

Souvik Bhattacharyya

2011-07-01

109

Additive-multiplicative regression models for spatio-temporal epidemics.

An extension of the stochastic susceptible-infectious-recovered (SIR) model is proposed in order to accommodate a regression context for modelling infectious disease data. The proposal is based on a multivariate counting process specified by conditional intensities, which contain an additive epidemic component and a multiplicative endemic component. This allows the analysis of endemic infectious diseases by quantifying risk factors for infection by external sources in addition to infective contacts. Inference can be performed by considering the full likelihood of the stochastic process with additional parameter restrictions to ensure non-negative conditional intensities. Simulation from the model can be performed by Ogata's modified thinning algorithm. As an illustrative example, we analyse data provided by the Federal Research Centre for Virus Diseases of Animals, Wusterhausen, Germany, on the incidence of the classical swine fever virus in Germany during 1993-2004. PMID:20029897

Höhle, Michael

2009-12-01

110

Significant Tests of Coefficient Multiple Regressions by using Permutation Methods

Tests of significance of a single partial regression coefficient in a multiple regression model are often made in situations where the standard assumptions underlying the probability calculation (for example, the assumption of normality of the random error term) do not hold. When the random error term fails to fulfill some of these assumptions, one needs to resort to nonparametric methods to carry out statistical inference. Permutation methods are a branch of nonparametric methods. This study used simulation to compare the empirical type I error of different permutation strategies that have been proposed for testing the nullity of a partial regression coefficient in a multiple regression model, and shows that the type I error of the Freedman and Lane strategy is lower than that of the other methods.

Ali Shadrokh

2011-01-01
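
The Freedman and Lane strategy mentioned in the entry above can be sketched as follows, under the usual textbook description: fit the reduced model without the predictor of interest, permute its residuals, add them back to the reduced-model fitted values, and refit the full model to obtain a reference distribution for the t statistic. The function names and the synthetic data are assumptions for illustration and do not reproduce the study's simulation.

```python
import numpy as np

def t_stat(A, y, k):
    """t statistic of coefficient k in an OLS fit of y on design matrix A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (A.shape[0] - A.shape[1])
    cov = sigma2 * np.linalg.inv(A.T @ A)
    return beta[k] / np.sqrt(cov[k, k])

def freedman_lane(x, Z, y, n_perm=499, seed=0):
    """Permutation p-value for the partial coefficient of x given covariates Z."""
    rng = np.random.default_rng(seed)
    ones = np.ones(len(y))
    reduced = np.column_stack([ones, Z])      # model without x
    full = np.column_stack([ones, Z, x])      # model with x in the last column
    k = full.shape[1] - 1
    gamma, *_ = np.linalg.lstsq(reduced, y, rcond=None)
    fitted = reduced @ gamma
    resid = y - fitted
    t_obs = t_stat(full, y, k)
    count = 0
    for _ in range(n_perm):
        y_star = fitted + rng.permutation(resid)   # permute reduced-model residuals
        if abs(t_stat(full, y_star, k)) >= abs(t_obs):
            count += 1
    return t_obs, (count + 1) / (n_perm + 1)

# Synthetic data (assumption): x is correlated with one covariate in Z
rng = np.random.default_rng(2)
n = 100
Z = rng.normal(size=(n, 2))
x = 0.5 * Z[:, 0] + rng.normal(size=n)
y_effect = 1.0 + Z @ np.array([0.5, -0.3]) + 1.0 * x + rng.normal(size=n)
y_null = 1.0 + Z @ np.array([0.5, -0.3]) + rng.normal(size=n)

t_eff, p_eff = freedman_lane(x, Z, y_effect)
t_null, p_null = freedman_lane(x, Z, y_null)
print(p_eff, p_null)
```

Counting the observed statistic among the permutations, via the (count + 1) / (n_perm + 1) form, keeps the p-value strictly positive.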

111

Hierarchical regression for epidemiologic analyses of multiple exposures.

Many epidemiologic investigations are designed to study the effects of multiple exposures. Most of these studies are analyzed either by fitting a risk-regression model with all exposures forced in the model, or by using a preliminary-testing algorithm, such as stepwise regression, to produce a smaller model. Research indicates that hierarchical modeling methods can outperform these conventional approaches. These methods are reviewed and compared to two hierarchical methods, empirical-Bayes re...

Greenland, S.

1994-01-01

112

In the last few decades, techniques such as Artificial Neural Networks and Fuzzy Inference Systems have been used for developing predictive models to estimate the required parameters. Recently, Soft Computing techniques have been used as an alternative statistical tool. Determination of the nature of financial time series data is difficult, expensive, time consuming and involves complex tests. In this paper, we use Multi Layer Perceptron and Radial Basis Functions of Artificial Neural Networks and the Adaptive Neuro Fuzzy Inference System for prediction of S% (Financial Stress percent) of financial time series data and compare them with the traditional statistical tool of Multiple Regression. The accuracies of the Artificial Neural Network and Adaptive Neuro Fuzzy Inference System techniques are evaluated as relatively similar. It is found that the Radial Basis Functions constructed exhibit higher performance than Multi Layer Perceptron, Adaptive Neuro Fuzzy Inference System and Multiple Regression for predicting S%. The performance comparison shows that the Soft Computing paradigm is a promising tool for minimizing uncertainties in financial time series data. Further, Soft Computing also minimizes the potential inconsistency of correlations.

Arindam Chaudhuri

2012-09-01

113

Unified Multiple Linear Regression (UMLR) is a nonlinear programming model that unifies all kinds of multiple linear regression models, such as Principal Components Regression, Ridge Regression, Robust Regression and constrained regression. Although UMLR has exhibited excellent performance in some real applications, the optimization procedure is not yet satisfactory. This study proposes a novel Granular Computing-Particle Swarm Optimization (Grc-PSO) algorithm by ...

Chen Su-Fen

2013-01-01

114

AN EFFECTIVE TECHNIQUE OF MULTIPLE IMPUTATION IN NONPARAMETRIC QUANTILE REGRESSION

In this study, we consider the nonparametric quantile regression model with the covariates Missing at Random (MAR). Multiple imputation is becoming an increasingly popular approach for analyzing missing data, but its combination with quantile regression is not well developed. We propose an effective and accurate two-stage multiple imputation method for the model based on quantile regression, which consists of initial imputation in the first stage and multiple imputation in the second stage. The estimation procedure makes full use of the entire dataset to achieve increased efficiency, and we show the proposed two-stage multiple imputation estimator to be asymptotically normal. In a simulation study, we compare the performance of the proposed imputation estimator with the Complete Case (CC) estimator and other imputation estimators, e.g., the regression imputation estimator and the k-Nearest-Neighbor imputation estimator. We conclude that the proposed estimator is robust to the initial imputation and shows more desirable performance than the other methods compared. We also apply the proposed multiple imputation method to an AIDS clinical trial data set to show its practical application.

Yanan Hu

2014-01-01

115

The cytotoxicity of methanolic extracts from rice cultures of 53 Fusarium avenaceum strains, which had been isolated from different host organisms in Northern Europe, Canada and Australia/New Zealand, was investigated in a rat hepatoma (H4IIE-W), porcine epithelial kidney (PK-15), foetal feline lung fibroblast, dog lymphoblast (D3447), and a human hepatocarcinoma (Hep G2) cell line using the Alamar Blue™ assay. All extracts were screened for known fungal metabolites using high-performance liquid chromatography with photodiode array and mass spectrometric detection, and both known and unknown metabolites were semi-quantified. Known metabolites that were determined in the cultures include acuminatopyrone, 2-amino-14,16-dimethyloctadecan-3-ol (2-AOD-3-ol), antibiotic Y, aurofusarin, chlamydosporol, chlamydospordiol, enniatins, fusarin A and C, and moniliformin. Multiple regression analysis was used in order to relate fungal metabolites to the cytotoxicity of the extracts. Separate linear regression models were constructed for each cell line. Eleven different fungal metabolites were related to the cytotoxicity (P<0.05). Out of these, nine metabolites were significantly related to the cytotoxicity in only one of the five models, while two, namely enniatins and 2-AOD-3-ol, were significant contributors in three or four regression models, respectively. This paper describes how multiple regression analysis may be applied for the assignment of bioactivity/toxicity to the constituents of a multi-component mixture. PMID:16908037

Uhlig, Silvio; Jestoi, Marika; Kristin Knutsen, Ann; Heier, Berit T

2006-10-01

116

Interpreting Multiple Linear Regression: A Guidebook of Variable Importance

Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to typically reflect overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…

Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim

2012-01-01

117

Strategies for Identification and Detection of Outliers in Multiple Regression.

Outliers are frequently found in data sets and can cause problems for researchers if not addressed. Failure to identify and deal with outliers in an appropriate manner may lead researchers to report erroneous results. Using a multiple regression context, this paper examines some of the reasons for the presence of outliers and simple methods for…

Vannoy, Martha

118

Directional Hypotheses with the Multiple Linear Regression Approach.

Two well known directional (one-tailed) tests of significance, mean difference and correlation coefficient, are presented within the multiple linear regression framework. Adjustments on the computed probability level are indicated. The case for a directional interaction research hypothesis is defended. Conservative adjustments on the computed…

McNeil, Keith A.; Beggs, Donald L.

119

This study explores the relationship between student performance and instructional design. The research was conducted at the E-Learning School at a university in Turkey. A list of design factors that had potential influence on student success was created through a review of the literature and interviews with relevant experts. From this, the five most important design factors were chosen. The experts scored 25 university courses on the extent to which they demonstrated the chosen design factors. Multiple-regression and supervised artificial neural network (ANN) models were used to examine the relationship between student grade point averages and the scores on the five design factors. The results indicated that there is no statistical difference between the two models. Both models identified the use of examples and applications as the most influential factor. The ANN model provided more information and was used to predict the course-specific factor values required for a desired level of success.

Halil Ibrahim Cebeci

2009-12-01

120

Constituents with linear radiance gradients with concentration may be quantified from signals which contain nonlinear atmospheric and surface reflection effects for both homogeneous and non-homogeneous water bodies provided accurate data can be obtained and nonlinearities are constant with wavelength. Statistical parameters must be used which give an indication of bias as well as total squared error to insure that an equation with an optimum combination of bands is selected. It is concluded that the effect of error in upwelled radiance measurements is to reduce the accuracy of the least square fitting process and to increase the number of points required to obtain a satisfactory fit. The problem of obtaining a multiple regression equation that is extremely sensitive to error is discussed.

Whitlock, C. H., III

1977-01-01

121

Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis

When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…

Kim, Rae Seon

2011-01-01

122

Linear regression analysis theory and computing

This volume presents in detail the fundamental theories of linear regression analysis and diagnosis, as well as the relevant statistical computing techniques so that readers are able to actually model the data using the methods and techniques described in the book. It covers the fundamental theories in linear regression analysis and is extremely useful for future research in this area. The examples of regression analysis using the Statistical Application System (SAS) are also included. This book is suitable for graduate students who are either majoring in statistics/biostatistics or using line

Yan, Xin

2009-01-01

123

Moderation analysis using a two-level regression model.

Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model. PMID:24337935

Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

2014-10-01
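
For orientation, the moderated multiple regression (MMR) baseline that the entry above compares against can be sketched as an ordinary least squares fit with a product term. This is only the LS analysis of the MMR model on synthetic, homoscedastic data; it is not the two-level model or the NML estimator the authors propose, and all variable names are illustrative.

```python
import numpy as np

# Synthetic data (assumption): the effect of x on y depends on the moderator m
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)   # predictor
m = rng.normal(size=n)   # moderator
y = 1.0 + 0.5 * x + 0.2 * m + 0.4 * x * m + rng.normal(scale=0.5, size=n)

# MMR design matrix: intercept, x, m, and the product term x * m
A = np.column_stack([np.ones(n), x, m, x * m])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
b0, b_x, b_m, b_xm = beta

def slope_of_x(m_value):
    """Estimated slope of y on x at a given moderator value."""
    return b_x + b_xm * m_value

print(b_xm, slope_of_x(-1.0), slope_of_x(1.0))
```

A nonzero coefficient on the product term means the slope of y on x changes with m, which is the moderation effect being tested.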

124

Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…

Li, Spencer D.

2011-01-01

125

Modeling Oil Palm Yield Using Multiple Linear Regression and Robust M-regression

This study shows how a multiple linear regression model can be used to model palm oil yield. The methods are illustrated by examining the time series data of foliar nutrient compositions as one of the independent variables and fresh fruit bunch as the dependent variable. Other independent variables include the nutrient balance ratio and major nutrient composition. This modeling approach is capable of identifying the significant contribution of each independent variable in improving the modelin...

Azme Khamis; Zuhaimy Ismail; Khalid Haron; Ahmad Tarmizi Mohammed

2006-01-01

126

Selecting Local Models in Multiple Regression by Maximizing Power

This paper considers multiple regression procedures for analyzing the relationship between a response variable and a vector of covariates in a nonparametric setting where both tuning parameters and the number of covariates need to be selected. We introduce an approach which handles the dilemma that with high dimensional data the sparsity of data in regions of the sample space makes estimation of nonparametric curves and surfaces virtually impossible. This is accomplished by ...

Schafer, Chad M.; Doksum, Kjell A.

2006-01-01

127

A New Greedy Algorithm for Multiple Sparse Regression

This paper proposes a new algorithm for multiple sparse regression in high dimensions, where the task is to estimate the support and values of several (typically related) sparse vectors from a few noisy linear measurements. Our algorithm is a "forward-backward" greedy procedure that -- uniquely -- operates on two distinct classes of objects. In particular, we organize our target sparse vectors as a matrix; our algorithm involves iterative addition and removal of both (a) ind...

Jalali, Ali; Sanghavi, Sujay

2012-01-01

128

Outlier Detection for Multivariate Multiple Regression in Y-direction

This study focuses on outlier detection for Multivariate Multiple Regression in the Y-direction; we propose an alternative method based on the squared distances of the residuals. The proposed method refers to the robust estimates of location and covariance matrices derived from the squared distances of the residuals. The proposed method is compared to Mahalanobis Distance method, Minimum Covariance Determinant method and Minimum Volume Ellipsoid met...

Paweena Tangjuang; Pachitjanut Siripanich

2014-01-01

129

Executive functioning impairments have been demonstrated following consumption of drugs of abuse. These executive impairments could play an important role on the development of the addictive process and rehabilitation of substance abusers. Recent neuropsychological models of executive functioning assume a multicomponent organization of these processes, suggesting different functions could contribute differentially to performance on executive tasks. The aim of this study was to analyze the relationship between severity of consumption of different drugs and neuropsychological performance on tasks sensitive to impairment in the executive subprocesses of working memory, response inhibition, cognitive flexibility, and abstract reasoning. Instruments sensitive to impairment in these four components were administered to 38 polysubstance abusers along with a severity of drug consumption interview. Multiple regression analyses were used. Results showed a differential impact of severity of MDMA abuse on working memory and abstract reasoning indices, of cocaine severity on an inhibitory control index and of cannabis on a cognitive flexibility index. Metabolic reorganization of monoamine frontal-subcortical pathways after drug exposure are proposed as possible explanations for these impairments. PMID:15561451

Verdejo-García, Antonio J; López-Torrecillas, Francisca; Aguilar de Arcos, Francisco; Pérez-García, Miguel

2005-01-01

130

International Nuclear Information System (INIS)

The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy.

131

Application of wavelet-based multiple linear regression model to rainfall forecasting in Australia

In this study, a wavelet-based multiple linear regression model is applied to forecast monthly rainfall in Australia by using monthly historical rainfall data and climate indices as inputs. The wavelet-based model is constructed by incorporating the multi-resolution analysis (MRA) with the discrete wavelet transform and multiple linear regression (MLR) model. The standardized monthly rainfall anomaly and large-scale climate index time series are decomposed using MRA into a certain number of component subseries at different temporal scales. The hierarchical lag relationship between the rainfall anomaly and each potential predictor is identified by cross correlation analysis with a lag time of at least one month at different temporal scales. The components of predictor variables with known lag times are then screened with a stepwise linear regression algorithm to be selectively included into the final forecast model. The MRA-based rainfall forecasting method is examined with 255 stations over Australia, and compared to the traditional multiple linear regression model based on the original time series. The models are trained with data from the 1959-1995 period and then tested in the 1996-2008 period for each station. The performance is compared with observed rainfall values, and evaluated by common statistics of relative absolute error and correlation coefficient. The results show that the wavelet-based regression model provides considerably more accurate monthly rainfall forecasts for all of the selected stations over Australia than the traditional regression model.

He, X.; Guan, H.; Zhang, X.; Simmons, C.

2013-12-01

132

Multiple predictor smoothing methods for sensitivity analysis

International Nuclear Information System (INIS)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (1) locally weighted regression (LOESS), (2) additive models, (3) projection pursuit regression, and (4) recursive partitioning regression. The indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present.

133

Multiple Linear Regression for Extracting Phrase Translation Pairs

Phrase translation pairs are very useful for bilingual lexicography, machine translation systems, cross-lingual information retrieval and many other applications in natural language processing. Phrase translation pairs are always extracted from bilingual sentence pairs. In this paper, we extract phrase translation pairs based on word alignment results of Chinese-English bilingual sentence pairs and parse trees of Chinese sentences, in order to decrease the influence of the grammatical disagreement between Chinese and English. Discriminative features are proposed to evaluate the extracted phrase translation pairs, including translation literality, phrase alignment probability and phrase length difference. A multiple linear regression model combined with an N-best strategy is employed to filter phrase translation pairs, in order to improve the evaluating and filtering performance. Experimental results indicate that, of the three discriminative features, phrase alignment probability gives the best filtering performance for Chinese-English phrase translation pairs. When the multiple linear regression model combined with the N-best strategy is used, F1 reaches 86.24%.

Chun-Xiang Zhang

2011-05-01

134

Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity

While multicollinearity may increase the difficulty of interpreting multiple regression results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret multiple regression effects include...

Amanda Kraha; Linda Zientek

2012-01-01

135

Problem statement: This study presents a novel method for determining the average winding temperature rise of transformers under their predetermined field operating conditions. The rise in winding temperature was determined from the estimated values of winding resistance during the heat run test conducted as per the IEC standard. Approach: The estimation of hot resistance was modeled using Multiple Variable Regression (MVR), Multiple Polynomial Regression (MPR) and soft computing techniques such as Artificial Neural Network (ANN) and Adaptive Neuro Fuzzy Inference System (ANFIS). The modeled hot resistance helps to find the load losses at any load situation without using a complicated measurement setup in transformers. Results: These techniques were applied to hot resistance estimation for a dry-type transformer using the input variables cold resistance, ambient temperature and temperature rise. The results are compared and show good agreement between measured and computed values. Conclusion: The proposed methods are verified using experimental results obtained from a temperature rise test performed on a 55 kVA dry-type transformer.

M. Srinivasan

2012-01-01

136

The activity of a selected class of DPP4 inhibitors was preliminarily assessed using chemical descriptors derived from AM1-optimized geometries. Using a multiple linear regression model, it was found that ?E0, LUMO energy, area, molecular weight and ?H0 are the significant descriptors that can adequately assess the binding affinity of the compounds. The derived multiple linear regression (MLR) model was validated using rigorous statistical analysis. The preliminary model suggests that bulky and electrophilic inhibitors are desired.

Jose Isagani Janairo

2011-08-01

137

Precipitation interpolation in mountainous regions using multiple linear regression

Multiple linear regression (MLR) was used to spatially interpolate precipitation for simulating runoff in the Animas River basin of southwestern Colorado. MLR equations were defined for each time step using measured precipitation as dependent variables. Explanatory variables used in each MLR were derived for the dependent variable locations from a digital elevation model (DEM) using a geographic information system. The same explanatory variables were defined for a 5 × 5 km grid of the DEM. For each time step, the best MLR equation was chosen and used to interpolate precipitation onto the 5 × 5 km grid. The gridded values of precipitation provide a physically-based estimate of the spatial distribution of precipitation and result in reliable simulations of daily runoff in the Animas River basin.

Hay, L.; Viger, R.; McCabe, G.

1998-01-01
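
The interpolation step described in the record above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy, uses invented station coordinates and coefficients, and fits a single MLR equation for one time step (the selection of the "best" equation per time step is not shown). The names `fit_mlr`, `predict_mlr`, `stations` and `grid` are hypothetical.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Evaluate the fitted equation at new explanatory-variable values."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

# Hypothetical stations: elevation (m), easting (km), northing (km).
rng = np.random.default_rng(0)
stations = rng.uniform([1500, 0, 0], [4000, 50, 50], size=(30, 3))

# Invented coefficients used only to generate noise-free demo precipitation.
true_coef = np.array([2.0, 0.004, -0.01, 0.02])
precip = predict_mlr(true_coef, stations)

# Fit at the stations, then apply the same equation to grid-cell centres
# whose explanatory variables would come from the DEM (values invented here).
coef = fit_mlr(stations, precip)
grid = np.array([[2000.0, 2.5, 2.5], [3000.0, 7.5, 2.5]])
grid_precip = predict_mlr(coef, grid)
```

In a real application the fit would be repeated for every time step, with the measured station precipitation for that step as `y`.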

138

Outlier Detection for Multivariate Multiple Regression in Y-direction

This study focuses on outlier detection for Multivariate Multiple Regression in the Y-direction; we propose an alternative method based on the squared distances of the residuals. The proposed method is based on robust estimates of the location and covariance matrices derived from the squared distances of the residuals. The proposed method is compared to the Mahalanobis Distance method, the Minimum Covariance Determinant method and the Minimum Volume Ellipsoid method, which are used to detect multivariate outliers. An advantage of the proposed method is that it offers an alternative to the complicated resampling algorithms used in detecting multivariate outliers in the Y-direction when the sample size is large and the dependent variables are correlated.

Paweena Tangjuang

2014-01-01
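
For orientation, the classical Mahalanobis Distance baseline mentioned in the record above can be sketched on regression residuals as follows. This is an illustrative sketch assuming NumPy; it uses the ordinary (non-robust) sample covariance and is not the robust method the study proposes. The function name `residual_mahalanobis` is hypothetical.

```python
import numpy as np

def residual_mahalanobis(X, Y):
    """Squared Mahalanobis distances of the residuals of a multivariate
    multiple regression of Y (n x q) on X (n x p) with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    B, *_ = np.linalg.lstsq(A, Y, rcond=None)   # one coefficient column per response
    E = Y - A @ B                               # residual matrix, n x q
    S = np.cov(E, rowvar=False)                 # classical covariance of residuals
    centered = E - E.mean(axis=0)
    # d2[i] = centered[i] @ inv(S) @ centered[i]
    return np.einsum("ij,jk,ik->i", centered, np.linalg.inv(S), centered)
```

Observations with unusually large distances are candidates for Y-direction outliers; a cut-off (for example a chi-square quantile) would have to be chosen separately.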

139

LOGISTIC REGRESSION ANALYSIS WITH STANDARDIZED MARKERS

Two different approaches to analysis of data from diagnostic biomarker studies are commonly employed. Logistic regression is used to fit models for probability of disease given marker values, while ROC curves and risk distributions are used to evaluate classification performance. In this paper we present a method that simultaneously accomplishes both tasks. The key step is to standardize markers relative to the nondiseased population before including them in the logistic reg...

Huang, Ying; Pepe, Margaret S.; Feng, Ziding

2013-01-01

140

Robust mediation analysis based on median regression.

Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. PMID:24079925

Yuan, Ying; Mackinnon, David P

2014-03-01

141

Stukel's Extended Logistic Regression Analysis with R

Directory of Open Access Journals (Sweden)

Vilda PURUTÇUOĞLU

2011-01-01

142

A multiple regression model for urban traffic noise in Hong Kong

This article describes the roadside traffic noise surveys conducted in heavily built-up urban areas in Hong Kong. Noise measurements were carried out along 18 major roads in 1999. The measurement data included L10, L50, L90, Leq, Lmax, the number of light vehicles, the number of heavy vehicles, the total traffic flow, and the average speed of vehicles. Statistical analysis using the analysis of variance (ANOVA) and Tukey test (pnoise. Multiple regression was used to derive a set of empirical formulas for predicting L10 noise level due to road traffic. The accuracy of these empirical formulas is quantified and compared to that of another widely used prediction model in Hong Kong--the Calculation of Road Traffic Noise. The applicability of the selected multiple regression model is validated by the noise measurements performed in the winter of 2000. copyright 2002 Acoustical Society of America.

To, W. M.; Ip, Rodney C. W.; Lam, Gabriel C. K.; Yau, Chris T. H.

2002-08-01

143

Multiple regression models for energy use in air-conditioned office buildings in different climates

International Nuclear Information System (INIS)

An attempt was made to develop multiple regression models for office buildings in the five major climates in China: severe cold, cold, hot summer and cold winter, mild, and hot summer and warm winter. A total of 12 key building design variables were identified through parametric and sensitivity analysis, and considered as inputs in the regression models. The coefficient of determination R² varies from 0.89 in Harbin to 0.97 in Kunming, indicating that 89-97% of the variations in annual building energy use can be explained by the changes in the 12 parameters. A pseudo-random number generator based on three simple multiplicative congruential generators was employed to generate random designs for evaluation of the regression models. The differences between regression-predicted and DOE-simulated annual building energy use are largely within 10%. It is envisaged that the regression models developed can be used to estimate the likely energy savings/penalty during the initial design stage when different building schemes and design concepts are being considered.

144

The effects of two environmental temperatures (T; 16 degrees and 31 degrees), five diet dilutions (D; 0%, 12.5%, 25%, 37.5% and 50%), and five daily treadmill running periods (E; 10 minutes, 40 minutes, 70 minutes, 100 minutes, and 130 minutes) upon enzyme activities of liver and adipose tissue of male rats were observed. Liver enzymes studied were glucose-6-phosphatase (G6Pase), 6-P-gluconate dehydrogenase (6PGD), glucose-6-phosphate dehydrogenase (G6PD), fructose diphosphatase (FDPase), NADP-isocitrate dehydrogenase (ICDH), and malic enzyme (ME). Adipose tissue (epididymal fat) enzymes (6PGD, G6PD, and ME) were studied as well as the in vitro incorporation of the 14C of [U-14C] glucose into liberated 14CO2 and into the triglycerides, free fatty acids, and total lipids by adipose tissue slices. Equations describing regression surfaces for these responses (expressed as units/100 g body weight) could contain significant linear coefficients of the independent variables (T, D, and E), their first order interactions, and quadratic coefficients for D and E. Significant regression coefficients for activities of liver enzymes associated with increased lipogenesis (6PGD, G6PD, and ME) produced response surfaces with conformations generally concave downward. All enzymes possessed positive and negative linear and quadratic coefficients for D which caused response surfaces to be concave downward with respect to that variable. Also, 6PGD and G6PD (positive linear and negative quadratic coefficients for E) exhibited response surfaces concave downward with respect to E. Additionally, 6PGD showed greater activity at 31 degrees than at 16 degrees while G6PD showed no effect of temperature on activity. Liver ICDH, probably important in supplying reducing equivalents for fatty acid synthesis, evidenced response surfaces almost identical to those for 6PGD.
Significant regression coefficients for activity of liver enzymes associated with increased gluconeogenesis (FDPase and G6Pase) produced for FDPase a response surface concave downward with respect to both D and E with greater values at 31 degrees than at 16 degrees; but for G6Pase non-concave surfaces with lesser values at 31 degrees than at 16 degrees. Significant regression coefficients for activities of adipose enzymes associated with increased lipogenesis produced for 6PGD a response surface concave upward due to negative linear and positive quadratic coefficients for both D and E. For G6PD and ME regression surfaces were concave upward with respect to E, but these were modified by positive and negative linear coefficients, respectively, for D. Significant regression coefficients for incorporation of the 14C of glucose into triglycerides and free fatty acids of adipose tissue slices and their production of 14CO2 yielded response surfaces concave upward with respect to E (negative linear and positive quadratic coefficients). In addition, the surface for free fatty acids was concave upward with respect to D. The 14CO2 production was greater at 16 degrees than at 31 degrees... PMID:182937

Multiple Retrieval Models and Regression Models for Prior Art Search

This paper presents the system called PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS) realized for the IP track of CLEF 2009. Our approach presents three main characteristics: 1. The usage of multiple retrieval models (KL, Okapi) and term index definitions (lemma, phrase, concept) for the three languages considered in the present track (English, French, German) producing ten different sets of ranked results. 2. The merging of the different results based on mul...

146

Multiple Regression Model for Compressive Strength Prediction of High Performance Concrete

A mathematical model for the prediction of compressive strength of high performance concrete was performed using statistical analysis for the concrete data obtained from experimental work done in this study. The multiple non-linear regression model yielded excellent correlation coefficient for the prediction of compressive strength at different ages (3, 7, 14, 28 and 91 days). The coefficient of correlation was 99.99% for each strength (at each age). Also, the model gives high correlat...

147

Functional linear regression analysis for longitudinal data

We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may be modeled as a discrete random number and, accordingly, only a finite and asymptotically nonincreasing number of measurements are available for each subject or experimental unit. We propose a functional regression approach for this situation, using functional principal component analysis, where we estimate the functional principal component scores through conditional expectations. This allows the prediction of an unobserved response trajectory from sparse measurements of a predictor trajectory. The resulting technique is flexible and allow...

148

Unified Multiple Linear Regression (UMLR) is a nonlinear programming model that unifies all kinds of multiple linear regression models, such as Principal Components Regression, Ridge Regression, Robust Regression and constrained regression. Although UMLR has exhibited excellent performance in some real applications, its optimization procedure is not yet satisfactory. This study proposes a novel Granular Computing-Particle Swarm Optimization (Grc-PSO) algorithm, which introduces granular computing into standard PSO, for the optimization of the UMLR model. The experimental results show that the solution obtained by the Grc-PSO algorithm fits the real situation much better than those of other state-of-the-art algorithms.

149

On connectivity of fibers with positive marginals in multiple logistic regression

In this paper we consider exact tests of a multiple logistic regression, where the levels of covariates are equally spaced, via Markov bases. In the usual application of multiple logistic regression, the sample size is positive for each combination of levels of the covariates. In this case we do not need a whole Markov basis, which guarantees connectivity of all fibers. We first give an explicit Markov basis for multiple Poisson regression. By the Lawrence lifting of this basis,...

150

Regression Analysis for the Social Sciences

The book provides graduate students in the social sciences with the basic skills that they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly-available social science data or drawn from the literature; thorough integration of teaching statistical theory with teaching data processing and analysis; teaching of both SAS and Stata "side-by-side" and use of chapter exercises in which students practice programming

151

An Effect Size for Regression Predictors in Meta-Analysis

A new effect size representing the predictive power of an independent variable from a multiple regression model is presented. The index, denoted as r[subscript sp], is the semipartial correlation of the predictor with the outcome of interest. This effect size can be computed when multiple predictor variables are included in the regression model…
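
As a rough illustration of how such an index can be obtained from commonly reported regression output, the sketch below uses the standard identity r_sp = t * sqrt((1 - R²) / (n - p - 1)), which follows from the equivalence between the t test of a single coefficient and the F test of its R² increment. Whether the article computes r_sp in exactly this way is an assumption; the function name and arguments are hypothetical.

```python
import math

def semipartial_r(t_stat, r_squared, n, p):
    """Semipartial correlation of one predictor with the outcome.

    t_stat    : t statistic of the predictor's coefficient in the full model
    r_squared : R^2 of the full model containing all p predictors
    n         : sample size
    p         : number of predictors in the full model
    """
    df_resid = n - p - 1
    if df_resid <= 0:
        raise ValueError("need n > p + 1")
    return t_stat * math.sqrt((1.0 - r_squared) / df_resid)

# Example: t = 2.0, R^2 = 0.5, n = 53, p = 2 gives 2.0 * sqrt(0.5 / 50) = 0.2
r_sp = semipartial_r(2.0, 0.5, 53, 2)
```

The squared value, here 0.04, is the share of outcome variance uniquely attributable to the predictor.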

152

Sliced Inverse Regression for big data analysis

Modern advances in computing power have greatly widened scientists' scope in gathering and investigating information from many variables. We describe sliced inverse regression (SIR), for reducing the dimension of the input variable x without going through any parametric or nonparametric model-fitting process. This method explores the simplicity of the inverse view of regression. Instead of regressing the univariate output variable y against the multivariate x, we regress x against y. Forward r...
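
The inverse view described above can be sketched in a few lines. This is a minimal textbook-style version of SIR assuming NumPy, equal-count slices and at least two input variables; the function name and its parameters (`n_slices`, `n_directions`) are illustrative choices, not taken from the record.

```python
import numpy as np

def sliced_inverse_regression(X, y, n_slices=10, n_directions=1):
    """Estimate dimension-reduction directions for x (columns of the result)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardize x: z = cov^{-1/2} (x - mean), via the eigendecomposition of cov.
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - X.mean(axis=0)) @ inv_sqrt

    # "Regress x against y": slice the sorted y and average z within each slice.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)    # weighted covariance of slice means

    # Leading eigenvectors of M, mapped back to the original x scale.
    w, v = np.linalg.eigh(M)
    top = v[:, np.argsort(w)[::-1][:n_directions]]
    directions = inv_sqrt @ top
    return directions / np.linalg.norm(directions, axis=0)
```

For data generated as y = f(b·x) + noise with a monotone f, the first returned column should be close to b up to sign.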

153

A Software Tool for Regression Analysis and its Assumptions

Nowadays, among the forecasting methods, the most important one is regression analysis. In this method, the aim is to estimate the population regression model as accurately as possible, taking the sample regression function as a basis. Its results are valid under certain assumptions, and violations of these assumptions invalidate some properties of the estimators. In this study, a new object-oriented program concentrated only on the regression analysis and its assumptions has be...

154

Type I error rates in multiple regression, and hence the chance for false positive research findings, can be drastically inflated when multiple regression models are used to analyze data that contain random measurement error. This article shows the potential for inflated Type I error rates in commonly encountered scenarios and provides new…

155

Throughput Prediction of Fishing Goods Based on the Grey Multiple Linear Regression Method

Based on the grey prediction method and the multiple linear regression method, the grey multiple linear regression method is presented. This method was applied to the throughput prediction of fishing goods using the actual throughput data of five fishing ports. Comparing its results with those of the time series one-dimensional linear regression method and the grey prediction method showed that the proposed method was more effective and its forecasting precision was higher.

156

Forecasting Gold Prices Using Multiple Linear Regression Method

Problem statement: Forecasting is a function in management to assist decision making. It is also described as the process of estimation in unknown future situations. In more general terms it is commonly known as prediction, which refers to estimation of time series or longitudinal type data. Gold is a precious yellow commodity once used as money. It was made illegal in USA 41 years ago, but is now once again accepted as a potential currency. The demand for this commodity is on the rise. Approach: The objective of this study was to develop a forecasting model for predicting gold prices based on economic factors such as inflation, currency price movements and others. Following the melt-down of the US dollar, investors are putting their money into gold because gold plays an important role as a stabilizing influence for investment portfolios. Due to the increase in demand for gold in Malaysia and other parts of the world, it is necessary to develop a model that reflects the structure and pattern of the gold market and forecasts the movement of the gold price. The most appropriate approach to the understanding of gold prices is the Multiple Linear Regression (MLR) model. MLR is a study of the relationship between a single dependent variable and one or more independent variables, in this case with the gold price as the single dependent variable. The fitted MLR model will be used to predict future gold prices. A naive model known as "forecast-1" was considered as a benchmark model in order to evaluate the performance of the model. Results: Many factors determine the price of gold and, based on "a hunch of experts", several economic factors had been identified as having an influence on gold prices.
Variables such as the Commodity Research Bureau future index (CRB); USD/Euro Foreign Exchange Rate (EUROUSD); Inflation rate (INF); Money Supply (M1); New York Stock Exchange (NYSE); Standard and Poor 500 (SPX); Treasury Bill (T-BILL) and US Dollar index (USDX) were considered to have an influence on the prices. Parameter estimation for the MLR was carried out using the Statistical Package for the Social Sciences (SPSS), with Mean Square Error (MSE) as the fitness function to determine the forecast accuracy. Conclusion: Two models were considered. The first model considered all possible independent variables. The model appeared to be useful for predicting the price of gold, with 85.2% of sample variation in monthly gold prices explained by the model. The second model considered the following four independent variables to be significant: CRB lagged one, EUROUSD lagged one, INF lagged two and M1 lagged two. In terms of prediction, the second model achieved a high level of predictive accuracy. The amount of variance explained was about 70%, and the regression coefficients also provide a means of assessing the relative importance of individual variables in the overall prediction of the gold price.
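
A model with lagged predictors of the kind described above can be set up as in the following sketch. It assumes NumPy, uses synthetic series in place of the real CRB and M1 data, and fits by ordinary least squares rather than SPSS; the helper `lagged_design` and the coefficient values are hypothetical.

```python
import numpy as np

def lagged_design(series_by_name, lags):
    """Design matrix with an intercept and each predictor shifted by its lag.
    Rows are aligned with target[max_lag:]."""
    max_lag = max(lags.values())
    n = len(next(iter(series_by_name.values())))
    cols = [np.ones(n - max_lag)]
    for name, lag in lags.items():
        x = np.asarray(series_by_name[name], dtype=float)
        cols.append(x[max_lag - lag : n - lag])   # value at time t - lag
    return np.column_stack(cols), max_lag

# Synthetic monthly series standing in for two of the predictors.
rng = np.random.default_rng(0)
n = 60
crb = rng.normal(size=n)
m1 = rng.normal(size=n)

# Invented relationship: price depends on CRB lagged one and M1 lagged two.
gold = np.zeros(n)
gold[2:] = 5.0 + 2.0 * crb[1:-1] + 3.0 * m1[:-2]

X, max_lag = lagged_design({"CRB": crb, "M1": m1}, {"CRB": 1, "M1": 2})
y = gold[max_lag:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
mse = float(np.mean((y - X @ coef) ** 2))        # Mean Square Error of the fit
```

With real data the MSE would normally be evaluated on a hold-out period and compared against the benchmark model.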

157

Forecasting Electrical Load using ANN Combined with Multiple Regression Method

This paper combined artificial neural network and regression modeling methods to predict electrical load. We propose an approach for specific day, week and/or month load forecasting for electrical companies taking into account the historical load. Therefore, a modified technique, based on artificial neural network (ANN) combined with linear regression, is applied on the KSA electrical network dependent on its historical data to predict the electrical load demand forecasting up to year 2020. T...

158

MULTIPLE LOGISTIC REGRESSION MODEL TO PREDICT RISK FACTORS OF ORAL HEALTH DISEASES

Purpose: To analyze the dependence of oral health diseases, i.e. dental caries and periodontal disease, on a number of risk factors through the application of a logistic regression model. Method: The cross-sectional study involves a systematic random sample of 1760 individuals with permanent dentition aged between 18 and 40 years in Dharwad, Karnataka, India. Dharwad is situated in North Karnataka. The mean age was 34.26±7.28. The risk factors of dental caries and periodontal disease were established by a multiple logistic regression model using SPSS statistical software. Results: Factors such as frequency of brushing, timing of cleaning teeth and type of toothpaste are significant persistent predictors of dental caries and periodontal disease. For dental caries, the log likelihood of the full model is –1013.1364 and Akaike’s Information Criterion (AIC) is 1.1752, compared with -1019.8106 and 1.1748, respectively, for the reduced regression model. For periodontal disease, the log likelihood of the full model is –1085.7876 and the AIC is 1.2577, compared with -1019.8106 and 1.1748, respectively, for the reduced regression model. The area under the Receiver Operating Characteristic (ROC) curve for dental caries is 0.7509 (full model) and 0.7447 (reduced model); for periodontal disease it is 0.6128 (full model) and 0.5821 (reduced model). Conclusions: The frequency of brushing, timing of cleaning teeth and type of toothpaste are the main significant risk factors of dental caries and periodontal disease. The reduced logistic regression model fits slightly better than the full logistic regression model in identifying these risk factors for both dichotomous dental caries and periodontal disease.

159

Outcome misclassification is widespread in epidemiology, but methods to account for it are rarely used. We describe the use of multiple imputation to reduce bias when validation data are available for a subgroup of study participants. This approach is illustrated using data from 308 participants in the multicenter Herpetic Eye Disease Study between 1992 and 1998 (48% female; 85% white; median age, 49 years). The odds ratio comparing the acyclovir group with the placebo group on the gold-standard outcome (physician-diagnosed herpes simplex virus recurrence) was 0.62 (95% confidence interval (CI): 0.35, 1.09). We masked ourselves to physician diagnosis except for a 30% validation subgroup used to compare methods. Multiple imputation (odds ratio (OR) = 0.60; 95% CI: 0.24, 1.51) was compared with naive analysis using self-reported outcomes (OR = 0.90; 95% CI: 0.47, 1.73), analysis restricted to the validation subgroup (OR = 0.57; 95% CI: 0.20, 1.59), and direct maximum likelihood (OR = 0.62; 95% CI: 0.26, 1.53). In simulations, multiple imputation and direct maximum likelihood had greater statistical power than did analysis restricted to the validation subgroup, yet all 3 provided unbiased estimates of the odds ratio. The multiple-imputation approach was extended to estimate risk ratios using log-binomial regression. Multiple imputation has advantages regarding flexibility and ease of implementation for epidemiologists familiar with missing data methods. PMID:24627573

160

Spatial regression analysis on 32 years total column ozone data

Multiple-regression analyses have been performed on 32 years of total ozone column data that were spatially gridded with a 1° × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979–2008) and two years of assimilated SCIAMACHY ozone data (2009–2010). The two-dimensionality of this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on non-seasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño (ENSO) and stratospheric alternative halogens (EESC). For several explanatory variables, seasonally adjusted versions are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to those of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at high and mid-latitudes, the solar cycle affects ozone positively, mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high Northern latitudes, the effect of the QBO is positive in the tropics and negative at mid to high latitudes, and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific.
The contribution of explanatory variables describing seasonal ozone variation is generally large at mid to high latitudes. We observe ozone contributing effects for potential vorticity and day length, negative effect on ozone for geopotential height and variable ozone effects due to the polar vortex at regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. In particular the recovery rates over Antarctica might not be statistically significant. Furthermore, the results show that there is no spatial homogeneous pattern which regression model and explanatory variables provide the best fit to the data and the most accurate estimates of the recovery rates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.

161

Multiple Regression Model for Compressive Strength Prediction of High Performance Concrete

A mathematical model for the prediction of compressive strength of high performance concrete was developed using statistical analysis of the concrete data obtained from experimental work done in this study. The multiple non-linear regression model yielded an excellent correlation coefficient for the prediction of compressive strength at different ages (3, 7, 14, 28 and 91 days). The coefficient of correlation was 99.99% for each strength (at each age). Also, the model gives high correlation for strength prediction of concrete with different types of curing.

162

The purpose of this study is to test and compare artificial neural network (ANN) and regression models for estimating river streamflow affected by ice conditions. Three regression models are investigated including: multiple regression, stepwise regression and ridge regression. A case study conducted on the Fraser River in British Columbia (Canada) is presented in which various combinations of hydrological and meteorological explanatory variables were used. Discharge estimates obtained by statistical modeling were also compared to the official estimates made by Water Survey of Canada (WSC) hydrometric technologists. The case study shows that ANN models are relatively more successful than regression models for winter streamflow estimation purposes. However, due to data scarcity, it was difficult to make a definitive assessment. Stepwise regression was found to be the most effective of the three regressive approaches investigated. Statistical modeling is a viable approach for winter streamflow data estimation, but data completeness and reliability is a major limitation.

163

Regression Discontinuity Designs with Multiple Rating-Score Variables

In the absence of a randomized control trial, regression discontinuity (RD) designs can produce plausible estimates of the treatment effect on an outcome for individuals near a cutoff score. In the standard RD design, individuals with rating scores higher than some exogenously determined cutoff score are assigned to one treatment condition; those…
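
The standard single-rating design described above can be sketched as a local linear regression on either side of the cutoff. This is only an illustration assuming NumPy and a sharp design with treatment at or above the cutoff; it does not cover the multiple rating-score case that is the subject of the record, and the function name and `bandwidth` parameter are hypothetical.

```python
import numpy as np

def sharp_rd_estimate(rating, outcome, cutoff, bandwidth):
    """Treatment effect at the cutoff from a local linear fit with separate
    slopes on each side, using observations within `bandwidth` of the cutoff."""
    rating = np.asarray(rating, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    keep = np.abs(rating - cutoff) <= bandwidth
    r = rating[keep] - cutoff                 # centre the rating at the cutoff
    y = outcome[keep]
    treated = (r >= 0).astype(float)
    # Columns: intercept, treatment jump, slope, change in slope when treated.
    A = np.column_stack([np.ones_like(r), treated, r, treated * r])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(coef[1])                     # jump in the outcome at the cutoff
```

The estimate is only as credible as the choice of bandwidth and the assumption that nothing else changes discontinuously at the cutoff.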

164

Analysis of Multiple Phenotypes

The complex etiology of common diseases like cardiovascular disease, diabetes, hypertension, and rheumatoid arthritis has led investigators to focus on the genetics of correlated phenotypes and risk factors. Joint analysis of multiple disease-related phenotypes may reveal genes of pleiotropic effect and increase analytical power, but at the cost of increased analytical and computational complexity. All three data sets provided for analysis at the Genetic Analysis Workshop 16 offered multiple ...

165

Multiple functional regression with both discrete and continuous covariates

In this paper we present a nonparametric method for extending functional regression methodology to the situation where more than one functional covariate is used to predict a functional response. Borrowing the idea from Kadri et al. (2010a), the method, which supports mixed discrete and continuous explanatory variables, is based on estimating a function-valued function in reproducing kernel Hilbert spaces by virtue of positive operator-valued kernels.

166

In a traditional regression-discontinuity design (RDD), units are assigned to treatment on the basis of a cutoff score and a continuous assignment variable. The treatment effect is measured at a single cutoff location along the assignment variable. This article introduces the multivariate regression-discontinuity design (MRDD), where multiple…

167

Modeling Lateral and Longitudinal Control of Human Drivers with Multiple Linear Regression Models

In this paper, we describe results to model lateral and longitudinal control behavior of drivers with simple linear multiple regression models. This approach fits into the Bayesian Programming (BP) approach (Bessi

168

Multiple regression technique for Pth degree polynomials with and without linear cross products

A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs were developed for the case without linear cross products and for the case incorporating such cross products; these programs evaluate the various constants in the model regression equation. In addition, the significance of the estimated regression equation is considered, and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent error are evaluated. The computer programs and their manner of utilization are described. Sample problems are included to illustrate the use and capability of the technique; they show the output formats and typical plots comparing computer results to each set of input data.
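
The two cases described above can be illustrated with a minimal Python sketch. It is not the original computer programs: the function names `design_matrix` and `fit_polynomial`, the synthetic data, and the reading of "linear cross products" as pairwise products of two different variables are all assumptions made for illustration.

```python
import itertools
import numpy as np

def design_matrix(X, degree, cross_products=False):
    """Columns: 1, then x_j**p for p = 1..degree, then (optionally) x_j * x_k for j < k."""
    X = np.asarray(X, dtype=float)
    n_obs, n_vars = X.shape
    columns = [np.ones(n_obs)]
    for j in range(n_vars):
        for p in range(1, degree + 1):
            columns.append(X[:, j] ** p)
    if cross_products:
        # Assumed meaning of "linear cross products": pairwise products only.
        for j, k in itertools.combinations(range(n_vars), 2):
            columns.append(X[:, j] * X[:, k])
    return np.column_stack(columns)

def fit_polynomial(X, y, degree, cross_products=False):
    """Least-squares fit plus the summary statistics named in the abstract."""
    A = design_matrix(X, degree, cross_products)
    y = np.asarray(y, dtype=float)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ coef
    resid = y - fitted
    n, k = A.shape
    sse = float(resid @ resid)                       # residual sum of squares
    ssr = float(((fitted - y.mean()) ** 2).sum())    # regression sum of squares
    pct_err = 100.0 * np.abs(resid / y)              # requires y != 0
    return {
        "coef": coef,
        "std_error": np.sqrt(sse / (n - k)),
        "f_statistic": (ssr / (k - 1)) / (sse / (n - k)),
        "max_abs_pct_error": float(pct_err.max()),
        "mean_abs_pct_error": float(pct_err.mean()),
    }

# Synthetic data (hypothetical): two variables with a genuine interaction term.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 2.0, size=(40, 2))
y = (1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 0] ** 2 - X[:, 1]
     + 3.0 * X[:, 0] * X[:, 1] + rng.normal(0.0, 0.01, size=40))

plain = fit_polynomial(X, y, degree=2)
crossed = fit_polynomial(X, y, degree=2, cross_products=True)
print(plain["std_error"], crossed["std_error"])
```

Because the synthetic response contains an interaction, the fit with cross products should leave a much smaller residual standard deviation than the fit without them.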

169

Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and aspect ratio were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression. Soil types were highly correlated with the geologic units, and no significant relationships existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.
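
As a rough illustration of how logistic regression turns cell-level predictors into a landslide probability, the following Python sketch fits a model by Newton's method on simulated cells. The predictor names (`slope_deg`, `is_shale`), the simulated coefficients, and the fitting routine are assumptions for illustration and are not taken from the study, which worked with GIS data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton's method (IRLS)."""
    A = np.column_stack([np.ones(len(y)), X])   # prepend an intercept column
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = sigmoid(A @ beta)
        weights = p * (1.0 - p)
        gradient = A.T @ (y - p)
        hessian = A.T @ (A * weights[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Simulated geographic cells (hypothetical values, not the Atchison data).
rng = np.random.default_rng(1)
n_cells = 500
slope_deg = rng.uniform(0.0, 30.0, n_cells)
is_shale = rng.integers(0, 2, n_cells).astype(float)
true_logit = -4.0 + 0.15 * slope_deg + 1.0 * is_shale
landslide = (rng.random(n_cells) < sigmoid(true_logit)).astype(float)

beta = fit_logistic(np.column_stack([slope_deg, is_shale]), landslide)

# Predicted landslide probability for a steep shale cell and a flat non-shale cell.
hazard_steep = sigmoid(beta[0] + beta[1] * 25.0 + beta[2] * 1.0)
hazard_flat = sigmoid(beta[0] + beta[1] * 2.0 + beta[2] * 0.0)
print(hazard_steep, hazard_flat)
```

Evaluating the fitted probability for every cell of a grid is what would produce a hazard map in this setting.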

170

The feasibility of predicting flow characteristics from basin descriptors using multiple regression and neural networks has been investigated on 52 basins in Zimbabwe. Flow characteristics considered were average annual runoff, base flow index, flow duration curve, and average monthly runoff. Mean annual runoff is predicted using linear equations from mean annual precipitation, basin slope, and proportion of a basin underlain by granite and gneiss. A multiple regression equation is derived t...

171

Egg hatchability prediction by multiple linear regression and artificial neural networks

An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. Network trainings were made by the backpropagation algorithm based on data obtained from industrial incubations. The ANN model was chosen as it produced data that fit better the experimental data as compared to the multiple linear regression model, which used coefficients determi...

172

Estimation of crown closure from AVIRIS data using regression analysis

Crown closure is one of the input parameters used for forest growth and yield modelling. Preliminary work by Staenz et al. indicates that imaging spectrometer data acquired with sensors such as the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) have some potential for estimating crown closure on a stand level. The objectives of this paper are: (1) to establish a relationship between AVIRIS data and the crown closure derived from aerial photography of a forested test site within the Interior Douglas Fir biogeoclimatic zone in British Columbia, Canada; (2) to investigate the impact of atmospheric effects and the forest background on the correlation between AVIRIS data and crown closure estimates; and (3) to improve this relationship using multiple regression analysis.

173

Time series techniques have been extensively applied in many academic disciplines, particularly those concerned with economics and the environment. This paper presents the application of a time series multiple linear regression technique to a groundwater system to predict groundwater level and salinity fluctuations in a saline area in the northeastern part of Thailand. Surface and groundwater interaction is the major mechanism controlling the shallow subsurface system and salinity of the area. The basic technique is based on the lagged correlation between hydrologic, hydrogeological and environmental parameters. As a result of a large irrigation project in the area, several regulating gates have been installed to control flooding of the downstream rivers and to provide the upstream areas with sufficient irrigation water. From the lagged correlation analysis, the shallow groundwater level and groundwater salinity fluctuations in the irrigated area are shown to depend on the surface water levels at the installed regulating gates and on prior rainfall. A set of multiple linear regression equations with lagged time-dependent functions is then formulated. The dependent variables are groundwater level and groundwater salinity, while the independent variables are rainfall rates and water levels measured at the regulating gates. After calibration and verification, the model can be used to forecast and manage future groundwater systems, as an alternative to the conventional method, which requires detailed and continuous variables and is costlier.
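
A lagged multiple linear regression of this kind can be sketched in Python as follows. The series, the lags of 3 and 1 time steps, and the helper `lagged_design` are hypothetical; the paper selects its lags from a lagged correlation analysis, which this sketch assumes has already been done.

```python
import numpy as np

def lagged_design(predictors, lags):
    """Align each predictor series with the response after shifting it by its lag."""
    max_lag = max(lags)
    n = len(predictors[0])
    columns = [np.ones(n - max_lag)]               # intercept
    for series, lag in zip(predictors, lags):
        series = np.asarray(series, dtype=float)
        # Row t of the design matrix (t = max_lag .. n-1) gets series[t - lag].
        columns.append(series[max_lag - lag : n - lag])
    return np.column_stack(columns)

# Hypothetical daily series: rainfall and water level at a regulating gate.
rng = np.random.default_rng(2)
n = 200
rainfall = rng.gamma(2.0, 5.0, n)
gate_level = 10.0 + rng.normal(0.0, 1.0, n)

# Simulated groundwater level responding to rainfall 3 steps earlier
# and to the gate level 1 step earlier (assumed lags).
lags = (3, 1)
max_lag = max(lags)
groundwater = np.full(n, np.nan)
groundwater[max_lag:] = (2.0 + 0.5 * rainfall[:-3] + 0.8 * gate_level[2:-1]
                         + rng.normal(0.0, 0.05, n - max_lag))

A = lagged_design([rainfall, gate_level], lags)
coef, *_ = np.linalg.lstsq(A, groundwater[max_lag:], rcond=None)
print(coef)  # intercept, rainfall (lag 3), gate level (lag 1)
```

The first `max_lag` observations are dropped because their lagged predictors are not available.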

174

Stukel's Extended Logistic Regression Analysis with R

Objective: For a logistic regression model, the degree to which predicted probabilities agree with actual outcomes can be expressed as a classification table. Being crucial in model adequacy checking, such tables may be slightly different when the same data are modeled with different statistical packages. The underlying reason is that when classifying a set of binary data, if the observations used to fit the model are also used to estimate the classification error, the resulting error-count e...

175

Rate optimal multiple testing procedure in high-dimensional regression

Multiple testing and variable selection have gained much attention in statistical theory and methodology research. They are dealing with the same problem of identifying the important variables among many (Jin, 2012). However, there is little overlap in the literature. Research on variable selection has been focusing on selection consistency, i.e., both type I and type II errors converging to zero. This is only possible when the signals are sufficiently strong, contrary to ma...

176

Multiple predictor smoothing methods for sensitivity analysis: Example results

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described in the first part of this presentation: (i) locally weighted regression (LOESS), (ii) additive models, (iii) projection pursuit regression, and (iv) recursive partitioning regression. In this, the second and concluding part of the presentation, the indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

177

Multiple predictor smoothing methods for sensitivity analysis: Description of techniques

International Nuclear Information System (INIS)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (i) locally weighted regression (LOESS), (ii) additive models, (iii) projection pursuit regression, and (iv) recursive partitioning regression. Then, in the second and concluding part of this presentation, the indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

178

The triggers of forest area loss in Cameroon have not been properly understood. The measures used to curb forest area loss have been simplistic and generalized, with no clear-cut knowledge of the specific role of different potential factors. This study investigates the hypothesis that population growth is the main cause of loss in forest area and identifies which factors are of more significance in the causal equation. The open-source R software has been used to produce multiple linear regression models. The correlation between the dependent variable and the independent variables was established by a correlation matrix, and the strength of the models was tested by power analysis. The results support the hypothesis that population growth is the most dominant cause of deforestation in Cameroon, while arable production and permanent crop land and arable production per capita index are second and third respectively.

179

Analysis and forecasting of air quality parameters are important topics of atmospheric and environmental research today due to the health impact caused by air pollution. This study examines the transformation of nitrogen dioxide (NO2) into ozone (O3) in an urban environment using time series plots. Data on the concentration of environmental pollutants and meteorological variables were employed to predict the concentration of O3 in the atmosphere. The possibility of employing multiple linear regression models as a tool for prediction of O3 concentration was tested. Results indicated that the presence of NO2 and sunshine influences the concentration of O3 in Malaysia. The influence of the previous hour's ozone on the next hour's concentration was also demonstrated. PMID:19440846

180

Polyethylene glycol (PEG) is the most common preservative in use for bulking and maintaining structural integrity in waterlogged wood. Conservators therefore have a need to be able to determine PEG concentrations in wood in a non-destructive manner. We present a study highlighting the application of infrared spectroscopy coupled with multivariate analysis techniques to predict the concentration of polyethylene glycol 400 (PEG-400) and water simultaneously. This technique uses attenuated total reflectance (ATR) spectroscopy and unconstrained stepwise multiple linear regression (SMLR) analysis for prediction of multiple components in archaeological wood. Using this model we have calculated the concentration of PEG-400 and water in treated archaeological waterlogged wood samples.

181

A method for the analysis of capillary column polychlorinated biphenyl (PCB) data using regression analysis with outlier checking and elimination, COMSTAR, is presented and evaluated. This algorithm determines the combination of the commercial PCB mixtures which best fits the...

182

The cell defense mechanism of RNA interference has applications in gene function analysis and promising potential in human disease therapy. To effectively silence a target gene, it is desirable to select appropriate initiator siRNA molecules having satisfactory silencing capabilities. Computational prediction of the silencing efficacy of siRNAs can assist this screening process before they are used in biological experiments. String kernel functions, which operate directly on the string objects representing siRNAs and target mRNAs, have been applied to support vector regression for the prediction and improved accuracy over numerical kernels in multidimensional vector spaces constructed from descriptors of siRNA design rules. To fully utilize information provided by string and numerical data, we propose to unify the two in a kernel feature space by devising a multiple kernel regression framework where a linear combination of the kernels is used. We formulate the multiple kernel learning as a quadratically constrained quadratic programming (QCQP) problem, which, although it yields a globally optimal solution, is computationally demanding and requires a commercial solver package. We further propose three heuristics based on the principle of kernel-target alignment and predictive accuracy. Empirical results demonstrate that multiple kernel regression can improve accuracy, decrease model complexity by reducing the number of support vectors, and speed up computational performance dramatically. In addition, multiple kernel regression evaluates the importance of constituent kernels, which, for the siRNA efficacy prediction problem, compares the relative significance of the design rules. Finally, we give insights into the multiple kernel regression mechanism and point out possible extensions. PMID:19407344

183

In this paper, we define two restricted estimators for the regression parameters in a multiple linear regression model with measurement errors when prior information for the parameters is available. We then construct two sets of improved estimators which include the preliminary test estimator, the Stein-type estimator and the positive rule Stein type estimator for both slope and intercept, and examine their statistical properties such as the asymptotic distributional quadratic biases, the asy...

184

Multiple linear and principal component regressions for modelling ecotoxicity bioassay response.

The ecotoxicological response of the living organisms in an aquatic system depends on the physical, chemical and bacteriological variables, as well as the interactions between them. An important challenge to scientists is to understand the interaction and behaviour of factors involved in a multidimensional process such as the ecotoxicological response. With this aim, multiple linear regression (MLR) and principal component regression were applied to the ecotoxicity bioassay response of Chlorella vulgaris and Vibrio fischeri in water collected at seven sites of Leça river during five monitoring campaigns (February, May, June, August and September of 2006). The river water characterization included the analysis of 22 physicochemical and 3 microbiological parameters. The model that best fitted the data was MLR, which shows: (i) a negative correlation with dissolved organic carbon, zinc and manganese, and a positive one with turbidity and arsenic, regarding C. vulgaris toxic response; (ii) a negative correlation with conductivity and turbidity and a positive one with phosphorus, hardness, iron, mercury, arsenic and faecal coliforms, concerning V. fischeri toxic response. This integrated assessment may allow the evaluation of the effect of future pollution abatement measures over the water quality of Leça River. PMID:24645478

185

Comparison of Fuzzy Inference System and Multiple Regression to Predict Synthetic Envelopes Clogging

Geo-synthetic materials are being used with acceptable performance in soil and water projects worldwide. Geotextiles are one of the categories of geo-synthetics used in drainage systems. The first generation of geotextiles was used in the late 1950s as an alternative to gravel envelopes. In this research, two methods (multiple regression and a fuzzy inference system) are evaluated for predicting synthetic envelope clogging. In the multiple regression method, the correlation coefficients for PP450, PP700 and PP900 are 62.66%, 79.37% and 90.62%, respectively. The results of the fuzzy inference system with a decision tree showed that this method has high potential in comparison with multiple regression, with total classification accuracies for PP450, PP700 and PP900 of 98.6%, 97.3% and 98%, respectively. The final results of this research showed that fuzzy inference systems using a decision tree have high potential to predict clogging in envelopes.

186

Overdispersed logistic regression for SAGE: Modelling multiple groups and covariates

Background: Two major identifiable sources of variation in data derived from the Serial Analysis of Gene Expression (SAGE) are within-library sampling variability and between-library heterogeneity within a group. Most published methods for identifying differential expression focus on just the sampling variability. In recent work, the problem of assessing differential expression between two groups of SAGE libraries has been addressed by introducing a beta-binomial hier...

187

Analysis of genome-wide association data by large-scale Bayesian logistic regression.

Single-locus analysis is often used to analyze genome-wide association (GWA) data, but such analysis is subject to severe multiple comparisons adjustment. Multivariate logistic regression is proposed to fit a multi-locus model for case-control data. However, when the sample size is much smaller than the number of single-nucleotide polymorphisms (SNPs) or when correlation among SNPs is high, traditional multivariate logistic regression breaks down. To accommodate the scale of data from a GWA while controlling for collinearity and overfitting in a high-dimensional predictor space, we propose a variable selection procedure using Bayesian logistic regression. We explored a connection between Bayesian regression with certain priors and L1 and L2 penalized logistic regression. After analyzing a large number of SNPs simultaneously in a Bayesian regression, we selected important SNPs for further consideration. With far fewer SNPs of interest, problems of multiple comparisons and collinearity are less severe. We conducted simulation studies to examine the probability of correctly selecting disease-contributing SNPs and applied the developed methods to analyze Genetic Analysis Workshop 16 North American Rheumatoid Arthritis Consortium data. PMID:20018005
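
The connection between priors and penalties mentioned above can be written out for the standard textbook case. The abstract does not say which priors were used, so the Gaussian and Laplace priors below, and the symbols $\tau$ and $b$, are assumptions; the derivation shows only that the maximum a posteriori (MAP) estimate under these priors coincides with an L2- or L1-penalized maximum-likelihood estimate.

```latex
% Logistic log-likelihood and the MAP estimate under a prior p(beta)
\ell(\beta) = \sum_{i=1}^{n}\Bigl[\, y_i\, x_i^{\top}\beta
              - \log\bigl(1 + e^{x_i^{\top}\beta}\bigr) \Bigr],
\qquad
\hat{\beta}_{\mathrm{MAP}} = \arg\max_{\beta}\,\bigl[\ell(\beta) + \log p(\beta)\bigr]

% Independent Gaussian priors give an L2 (ridge) penalty
\beta_j \sim \mathcal{N}(0, \tau^2)
\;\Longrightarrow\;
\hat{\beta}_{\mathrm{MAP}} = \arg\min_{\beta}\Bigl[ -\ell(\beta)
      + \frac{1}{2\tau^{2}} \sum_{j} \beta_j^{2} \Bigr]

% Independent Laplace priors give an L1 (lasso) penalty
p(\beta_j) = \frac{1}{2b}\, e^{-|\beta_j|/b}
\;\Longrightarrow\;
\hat{\beta}_{\mathrm{MAP}} = \arg\min_{\beta}\Bigl[ -\ell(\beta)
      + \frac{1}{b} \sum_{j} |\beta_j| \Bigr]
```

A tighter prior (smaller $\tau$ or $b$) corresponds to a heavier penalty, and the L1 case can set coefficients exactly to zero, which is what makes it usable for selecting SNPs.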

188

Egg hatchability prediction by multiple linear regression and artificial neural networks

An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. The network was trained by the backpropagation algorithm based on data obtained from industrial incubations. The ANN model was chosen as it produced data that fit the experimental data better than the multiple linear regression model, which used coefficients determined by the least squares method. The simulation results of these approaches indicate that this ANN can be used for incubation performance prediction.

189

Egg hatchability prediction by multiple linear regression and artificial neural networks

SciELO Brazil

An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. The network was trained by the backpropagation algorithm based on data obtained from industrial incubations. The ANN model was chosen as it produced data that fit the experimental data better than the multiple linear regression model, which used coefficients determined by the least squares method. The simulation results of these approaches indicate that this ANN can be used for incubation performance prediction.

190

A multiple regression model for predicting airwaves in shallow water sea bed logging data

This paper focuses on formulating a multiple regression model using matrix notation that can be used to predict the magnitude of airwaves in shallow water sea bed logging (SBL) data. The term airwaves refers to the EM signals propagated from the source antenna via the atmosphere, which are induced along the air/sea surface and interfere with the subsurface signal. In shallow water, the airwaves can mask other subsurface responses that may contain valuable information about subsurface resistive structures such as hydrocarbon reservoirs. A fair representation of SBL environments was simulated to generate the airwaves data. The magnitude of airwaves at a selected offset is used as the dependent variable, whereas the predictor (independent) variables for the proposed multiple regression model are the frequency, seawater depth, seawater conductivity, sediment conductivity and offset. Akaike's Information Criterion (AIC) is used for selecting the multiple regression models. The formulated regression model is benchmarked against the well-known theoretical space-domain expression for airwave estimation. The model shows a good fit, with an R² of 0.9561 and an overall F-value of 19.35 for the statistical significance of the estimated parameters. The result indicates that the magnitudes of airwaves predicted by the regression model are approximately consistent with the theoretical model.
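
For readers unfamiliar with the matrix formulation, the following Python sketch computes the least-squares estimate from the normal equations, together with R², the overall F statistic and one common Gaussian form of AIC. The data are synthetic and the five unnamed predictors merely stand in for quantities such as frequency or seawater depth; the AIC expression `n*log(RSS/n) + 2k` is an assumption, since the abstract does not state which form was used.

```python
import numpy as np

def ols_summary(X, y):
    """Fit y = A @ beta by the normal equations and report fit statistics."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])         # design matrix with intercept
    k = A.shape[1]                               # number of estimated parameters
    beta = np.linalg.solve(A.T @ A, A.T @ y)     # (A'A)^{-1} A'y
    resid = y - A @ beta
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    return {
        "beta": beta,
        "r2": 1.0 - rss / tss,
        "f": ((tss - rss) / (k - 1)) / (rss / (n - k)),
        "aic": n * np.log(rss / n) + 2 * k,      # Gaussian AIC up to a constant
    }

# Synthetic stand-in data with five predictors (hypothetical).
rng = np.random.default_rng(3)
n = 120
X = rng.normal(size=(n, 5))
y = 1.0 + X @ np.array([2.0, -1.5, 0.8, 0.5, 1.2]) + rng.normal(0.0, 0.3, n)

full = ols_summary(X, y)
reduced = ols_summary(X[:, 1:], y)   # drops the first (strong) predictor
print(full["r2"], full["aic"], reduced["aic"])
```

Candidate models are compared by AIC, with the lower value preferred; here the model that omits a strong predictor should score clearly worse.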

191

Biplots in Reduced-Rank Regression

Regression problems with a number of related response variables are typically analyzed by separate multiple regressions. This paper shows how these regressions can be visualized jointly in a biplot based on reduced-rank regression. Reduced-rank regression combines multiple regression and principal components analysis and can therefore be carried out with standard statistical packages. The proposed biplot highlights the major aspects of the regressions by displaying the least-squares approxima...

192

Regression Model Optimization for the Analysis of Experimental Data

A candidate math model search algorithm was developed at Ames Research Center that determines a recommended math model for the multivariate regression analysis of experimental data. The search algorithm is applicable to classical regression analysis problems as well as wind tunnel strain gage balance calibration analysis applications. The algorithm compares the predictive capability of different regression models using the standard deviation of the PRESS residuals of the responses as a search metric. This search metric is minimized during the search. Singular value decomposition is used during the search to reject math models that lead to a singular solution of the regression analysis problem. Two threshold dependent constraints are also applied. The first constraint rejects math models with insignificant terms. The second constraint rejects math models with near-linear dependencies between terms. The math term hierarchy rule may also be applied as an optional constraint during or after the candidate math model search. The final term selection of the recommended math model depends on the regressor and response values of the data set, the user's function class combination choice, the user's constraint selections, and the result of the search metric minimization. A frequently used regression analysis example from the literature is used to illustrate the application of the search algorithm to experimental data.
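
The search metric and the singularity check can be sketched in Python as below. This is a simplified exhaustive search over a handful of candidate terms, not the Ames algorithm: the function names, the tolerance `rank_tol`, and the data are assumptions, and the two threshold constraints and the hierarchy rule are not implemented.

```python
import itertools
import numpy as np

def press_residuals(A, y, rank_tol=1e-10):
    """PRESS residuals e_i / (1 - h_ii); returns None if A is numerically singular."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    if s[-1] <= rank_tol * s[0]:
        return None                               # reject singular math model
    beta = Vt.T @ ((U.T @ y) / s)                 # least-squares solution via SVD
    resid = y - A @ beta
    leverage = (U ** 2).sum(axis=1)               # diagonal of the hat matrix
    return resid / (1.0 - leverage)

def search_models(terms, y, always=("1",)):
    """Try every subset of optional terms; keep the one with the smallest PRESS std."""
    names = [t for t in terms if t not in always]
    best = None
    for size in range(len(names) + 1):
        for subset in itertools.combinations(names, size):
            chosen = list(always) + list(subset)
            A = np.column_stack([terms[t] for t in chosen])
            press = press_residuals(A, y)
            if press is None:
                continue
            score = float(np.std(press, ddof=1))
            if best is None or score < best[0]:
                best = (score, chosen)
    return best

# Hypothetical data: the response depends on x1, x2 and their product.
rng = np.random.default_rng(4)
n = 60
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x1 * x2 + rng.normal(0.0, 0.1, n)
terms = {"1": np.ones(n), "x1": x1, "x2": x2, "x1*x2": x1 * x2, "x1^2": x1 ** 2}

score, chosen = search_models(terms, y)
print(score, chosen)
```

Each PRESS residual equals the prediction error obtained when that observation is left out of the fit, so the metric rewards models that predict well rather than models that merely fit well.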

193

A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants

A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…

194

Early cost estimating for road construction projects using multiple regression techniques

The objective of this study is to develop early cost estimating models for road construction projects using multiple regression techniques, based on 131 sets of data collected in the West Bank in Palestine. As the cost estimates are required at early stages of a project, considerations were given to the fact that t...

195

A Spreadsheet Tool for Learning the Multiple Regression F-Test, T-Tests, and Multicollinearity

This note presents a spreadsheet tool that allows teachers the opportunity to guide students towards answering on their own questions related to the multiple regression F-test, the t-tests, and multicollinearity. The note demonstrates approaches for using the spreadsheet that might be appropriate for three different levels of statistics classes,…
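
The interplay between the overall F-test and multicollinearity can also be explored outside a spreadsheet. The Python sketch below is an assumption-laden illustration rather than the tool described in the note: it computes the overall F statistic and the variance inflation factor (VIF), a common multicollinearity diagnostic that the note itself does not necessarily use, on made-up data.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of a least-squares fit of y on X with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

def overall_f(X, y):
    """F statistic for H0: all slope coefficients are zero."""
    n, k = X.shape
    r2 = r_squared(X, y)
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing x_j on the others."""
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        factors.append(1.0 / (1.0 - r_squared(others, X[:, j])))
    return np.array(factors)

# Made-up data: x2 is nearly a copy of x1, x3 is unrelated to both.
rng = np.random.default_rng(5)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + x1 + x2 + x3 + rng.normal(size=n)

print(overall_f(X, y), vif(X))
```

In data like these the overall F statistic is large while the individual coefficients of the two nearly collinear predictors are poorly determined, which is the situation such classroom exercises typically aim to expose.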

196

The Sage handbook of regression analysis and causal inference

Covering both general and advanced aspects of multivariate methods, this handbook focuses on regression analysis of cross-sectional and longitudinal data with an emphasis on causal analysis and provides readers with an introduction to and exploration of a large range of techniques.

197

Predicting Dropouts of University Freshmen: A Logit Regression Analysis.

Stepwise discriminant analysis coupled with logit regression analysis of freshmen data from Brandon University (Manitoba) indicated that six tested variables drawn from research on university dropouts were useful in predicting attrition: student status, residence, financial sources, distance from home town, goal fulfillment, and satisfaction with…

198

In this work, we introduce multiplicative drift analysis as a suitable way to analyze the runtime of randomized search heuristics such as evolutionary algorithms. We give a multiplicative version of the classical drift theorem. This allows easier analyses in those settings where the optimization progress is roughly proportional to the current distance to the optimum. To display the strength of this tool, we regard the classical problem of how the (1+1) Evolutionary Algorithm optimizes an arbitrary linear pseudo-Boolean function. Here, we first give a relatively simple proof for the fact that any linear function is optimized in expected time $O(n \log n)$, where $n$ is the length of the bit string. Afterwards, we show that in fact any such function is optimized in expected time at most $(1+o(1))\,1.39\,e\,n\ln(n)$, again using multiplicative drift analysis. We also prove a corresponding lower bound of $(1-o(1))\,e\,n\ln(n)$ which actually holds for all functions with a unique global optimum. We further demons...
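
For orientation, the (1+1) Evolutionary Algorithm on a linear pseudo-Boolean function can be sketched in a few lines of Python. The version below uses the standard textbook formulation (flip each bit independently with probability 1/n, accept the offspring if it is not worse); the positive integer weights and the iteration cap are assumptions chosen for illustration.

```python
import random

def linear_fitness(weights, bits):
    """Linear pseudo-Boolean function: sum of the weights of the bits set to 1."""
    return sum(w for w, b in zip(weights, bits) if b)

def one_plus_one_ea(weights, rng, max_iters=1_000_000):
    """Maximize a linear function with positive weights; return the number of steps used."""
    n = len(weights)
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = linear_fitness(weights, x)
    optimum = sum(weights)                 # reached only by the all-ones string
    for steps in range(max_iters + 1):
        if fx == optimum:
            return steps
        # Standard bit mutation: flip each bit independently with probability 1/n.
        y = [b ^ (rng.random() < 1.0 / n) for b in x]
        fy = linear_fitness(weights, y)
        if fy >= fx:                       # elitist selection, ties accepted
            x, fx = y, fy
    return None                            # cap reached without finding the optimum

steps = one_plus_one_ea(list(range(1, 31)), random.Random(0))
print(steps)
```

Averaging the returned step count over many seeds and several values of n gives an empirical curve that can be compared with the $n\ln(n)$ growth stated in the abstract.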

199

Applying Multiple Linear Regression and Neural Network to Predict Bank Performance

Globalization and technological advancement have created a highly competitive market in the banking and finance industry. Performance of the industry depends heavily on the accuracy of the decisions made at managerial level. This study uses the multiple linear regression technique and a feedforward artificial neural network to predict bank performance. The study then evaluates the performance of the two techniques with the goal of finding a powerful tool for predicting bank performance. Data from thirteen banks for the period 2001-2006 were used in the study. ROA was used as a measure of bank performance, and hence is the dependent variable for the multiple linear regression. Seven variables, namely liquidity, credit risk, cost to income ratio, size, concentration ratio, inflation and GDP, were used as independent variables. Under supervised learning, the dependent variable, ROA, was used as the target output for the artificial neural network. Seven inputs corresponding to the seven predictor variables were used for pattern recognition at the training phase. Experimental results from the multiple linear regression show that two variables, credit risk and cost to income ratio, are significant in determining bank performance. These two variables were found to explain about 60.9 percent of the total variation in the data with a mean square error (MSE) of 0.330. The artificial neural network was found to give optimal results with thirteen hidden neurons. Testing results show that the seven inputs explain about 66.9 percent of the total variation in the data with a very low MSE of 0.00687. Performance of both methods is measured by the mean square prediction error (MSPR) at the validation stage. The MSPR value for the neural network is lower than the MSPR value for multiple linear regression (0.0061 against 0.6190). The study concludes that the artificial neural network is the more powerful tool for predicting bank performance.

200

Detailed estimates of carbon dioxide emissions at fine spatial scales are critical to both modelers and decision makers dealing with global warming and climate change. Globally, traffic-related emissions of carbon dioxide are growing rapidly. This paper presents a new method based on a multiple linear regression model to disaggregate traffic-related CO2 emission estimates from the parish-level scale to a 1 × 1 km grid scale. Considering the allocation factors (population density, urban area, income, road density) together, we used a correlation and regression analysis to determine the relationship between these factors and traffic-related CO2 emissions, and developed the best-fit model. The method was applied to downscale the traffic-related CO2 emission values by parish (i.e. county) for the State of Louisiana into 1 km² grid cells. Among the four parishes with the highest traffic-related CO2 emissions, the biggest area that has above-average CO2 emissions is found in East Baton Rouge, and the smallest area with no CO2 emissions is also in East Baton Rouge, but Orleans has the most CO2 emissions per unit area. The result reveals that high CO2 emissions are concentrated in the dense road networks of urban areas with high population density, and low CO2 emissions are distributed in rural areas with low population density and sparse road networks. The proposed method can be used to identify the emission "hot spots" at fine scale and is considered more accurate and less time-consuming than the previous methods.

201

Fundamental parameters calculations are used for the analysis of europium in the concentration range of 0.1 wt% to 30.0 wt% in the oxidic catalyst supports alumina, calcia, magnesia, lanthania, and thoria. The precision and accuracy of this method depend on how the sample matrix is defined in the fundamental parameters program and on the number and concentration of the standards used. Results comparable to the multiple regression method are obtained when the matrix stoichiometry is defined as Eu2O3 and the catalyst oxide (i.e. Al2O3 etc). It is also necessary to use standards which bracket the europium concentration in the samples. When these conditions are met, the results are comparable to those obtained from a ten point multiple regression calibration curve but with a considerable saving of standard preparation time. The precision is better than ±2% relative. The % relative difference between the fundamental parameters and multiple regression results is also 2%. Data are presented which illustrate the effect of defining the sample stoichiometry in the XRF11 computer program.

202

MONEY DEMAND IN ROMANIAN ECONOMY, USING MULTIPLE REGRESSION METHOD AND UNRESTRICTED VAR MODEL

Directory of Open Access Journals (Sweden)

The paper describes money demand in the Romanian economy using two econometric models. The first model consists of a multiple regression between money demand, the monthly inflation rate, the industrial production index and the foreign exchange rate RON/Euro. The second model (an unrestricted vector autoregressive model) is applied to the same variables used in the first model. Identifying a statistically strong model, capable of stable estimations for the money demand function in Romania’s economy, constitutes a prerequisite to the application of an efficient monetary policy.

203

Introduction to mixed modelling beyond regression and analysis of variance

Mixed modelling is one of the most promising and exciting areas of statistical analysis, enabling more powerful interpretation of data through the recognition of random effects. However, many perceive mixed modelling as an intimidating and specialized technique. This book introduces mixed modelling analysis in a simple and straightforward way, allowing the reader to apply the technique confidently in a wide range of situations. Introduction to Mixed Modelling shows that mixed modelling is a natural extension of the more familiar statistical methods of regression analysis and analysis of variance. In doing so, it provides the ideal introduction to this important statistical technique for those engaged in the statistical analysis of data. This essential book: Demonstrates the power of mixed modelling in a wide range of disciplines, including industrial research, social sciences, genetics, clinical research, ecology and agricultural research. Illustrates how the capabilities of regression analysis can be combined ...

204

Regression analysis of creep-rupture data: a practical approach

International Nuclear Information System (INIS)

A generalized linear regression approach to the analysis of creep and creep-rupture data appears to have great promise for future applications. Uncertainties in predictions of creep behavior can be large due to heat treatment, heat-to-heat and other variations in properties. For types 304 and 316 stainless steels and for 2 1/4 Cr--1 Mo steel these uncertainties can be reduced by using regression models that include terms involving the ultimate tensile strength or 100-hr rupture strength of a given heat. A model for Alloy 800H was developed to predict the middle of the scatter band on behavior. Regression analysis of single heat data sets for a variety of materials yielded generally good results. Extrapolation of any model must be done with extreme caution. Possible metallurgical instabilities or changes in creep mechanism can cause serious errors in extrapolated results

205

Aerosol optical depth (AOD) from AERONET data has a very fine resolution, but air pollution index (API), visibility and relative humidity from ground truth measurements are coarse. To obtain the local AOD in the atmosphere, the relationship between AOD and these three parameters was determined using multiple regression analysis. Data from the southwest monsoon period (August to September, 2012) taken in Penang, Malaysia, were used to establish a quantitative relationship in which the AOD is modeled as a function of API, relative humidity, and visibility. The model with the highest correlation was used to predict AOD values during the southwest monsoon period. When aerosol is not uniformly distributed in the atmosphere, the predicted AOD can deviate strongly from the measured values. These deviating data can therefore be removed by comparing the predicted AOD values with the actual AERONET data, which helps to investigate whether the non-uniform source of the aerosol is at the ground surface or at a higher altitude. The model can accurately predict AOD only if the aerosol is uniformly distributed in the atmosphere. However, further study is needed to determine whether the model is suitable for predicting AOD not only in Penang but also in other states of Malaysia or even globally.

206

The retention behavior and lipophilicity parameters of some antipsychotics were determined using reversed-phase thin layer chromatography. Quantitative structure-activity relationship studies have been performed to correlate the molecular characteristics of the observed compounds with their retention as well as with their chromatographically determined lipophilicity parameters. The effect of different organic modifiers (acetone, tetrahydrofuran, and methanol) has been studied. The retention of the investigated compounds decreases linearly with increasing concentration of organic modifier. The chemical structures of the antipsychotics have been characterized by molecular descriptors which are calculated from the structure and related to chromatographically determined lipophilicity parameters by multiple linear regression analysis. This approach gives us the possibility to gain insight into factors responsible for the retention as well as the lipophilicity of the investigated set of compounds. The most prominent factors affecting lipophilicity of the investigated substances are solubility, energy of the highest occupied molecular orbital, and energy of the lowest unoccupied molecular orbital. The obtained models were used for interpretation of the lipophilicity of the investigated compounds. The prediction results are in good agreement with the experimental values. This study provides good information about pharmacologically important physico-chemical parameters of the observed antipsychotics relevant to variations in molecular lipophilicity and chromatographic behavior. The established QSAR models could be helpful in the design of novel multitarget antipsychotic compounds.

207

Gene-environment (G×E) interactions are biologically important for a wide range of environmental exposures and clinical outcomes. Because of the large number of potential interactions in genomewide association data, the standard approach fits one model per G×E interaction with multiple hypothesis correction (MHC) used to control the type I error rate. Although sometimes effective, using one model per candidate G×E interaction test has two important limitations: low power due to MHC and omitted variable bias. To avoid the coefficient estimation bias associated with independent models, researchers have used penalized regression methods to jointly test all main effects and interactions in a single regression model. Although penalized regression supports joint analysis of all interactions, can be used with hierarchical constraints, and offers excellent predictive performance, it cannot assess the statistical significance of G×E interactions or compute meaningful estimates of effect size. To address the challenge of low power, researchers have separately explored screening-testing, or two-stage, methods in which the set of potential G×E interactions is first filtered and then tested for interactions with MHC only applied to the tests actually performed in the second stage. Although two-stage methods are statistically valid and effective at improving power, they still test multiple separate models and so are impacted by MHC and biased coefficient estimation. To remedy the challenges of both poor power and omitted variable bias encountered with traditional G×E interaction detection methods, we propose a novel approach that combines elements of screening-testing and hierarchical penalized regression. Specifically, our proposed method uses, in the first stage, an elastic net-penalized multiple logistic regression model to jointly estimate either the marginal association filter statistic or the gene-environment correlation filter statistic for all candidate genetic markers. 
In the second stage, a single multiple logistic regression model is used to jointly assess marginal terms and G×E interactions for all genetic markers that pass the first stage filter. A single likelihood-ratio test is used to determine whether any of the interactions are statistically significant. We demonstrate the efficacy of our method relative to alternative G×E detection methods on a bladder cancer data set. PMID:25592580

208

Application of stepwise multiple regression techniques to inversion of Nimbus 'IRIS' observations.

Exploratory studies with Nimbus-3 infrared interferometer-spectrometer (IRIS) data indicate that, in addition to temperature, such meteorological parameters as geopotential heights of pressure surfaces, tropopause pressure, and tropopause temperature can be inferred from the observed spectra with the use of simple regression equations. The technique of screening the IRIS spectral data by means of stepwise regression to obtain the best radiation predictors of meteorological parameters is validated. The simplicity of application of the technique and the simplicity of the derived linear regression equations - which contain only a few terms - suggest usefulness for this approach. Based upon the results obtained, suggestions are made for further development and exploitation of the stepwise regression analysis technique.

209

In this study, we propose a Leverage Based Near-Neighbor (LBNN) method where prior information on the structure of the heteroscedastic error is not required. In the proposed LBNN method, weights are determined not from the near-neighbor values of the explanatory variables, but from their corresponding leverage values so that it can be readily applied to a multiple regression model. Both the empirical and Monte Carlo simulation results show that the LBNN method offers substantial improvement over the existing methods. The LBNN has significantly reduced the standard errors of the estimates and also the standard errors of residuals for both simple and multiple linear regression models. Hence, the LBNN can be established as one reliable alternative approach to other existing methods that deal with heteroscedastic errors when the form of heteroscedasticity is unknown.
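The leverage values that the abstract refers to can be illustrated for the simplest case. The sketch below computes hat-matrix diagonals for a one-predictor regression with an intercept, using the standard closed form h_i = 1/n + (x_i - mean)^2 / Sxx. It does not reproduce the LBNN weighting scheme itself, which the abstract does not spell out; the function name and data are hypothetical.

```python
def simple_regression_leverages(x_values):
    """Hat-matrix diagonals for a straight-line fit with an intercept."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    # Each leverage grows with the squared distance of x_i from the mean.
    return [1.0 / n + (x - mean_x) ** 2 / sxx for x in x_values]


# Hypothetical predictor values; the last point lies far from the others.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
leverages = simple_regression_leverages(x)
print(leverages)
```

The leverages always sum to the number of fitted parameters (two here), so a single remote point such as x = 10 takes up most of that total.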

210

User's Guide to the Weighted-Multiple-Linear Regression Program (WREG version 1.0)

Streamflow is not measured at every location in a stream network. Yet hydrologists, State and local agencies, and the general public still seek to know streamflow characteristics, such as mean annual flow or flood flows with different exceedance probabilities, at ungaged basins. The goals of this guide are to introduce and familiarize the user with the weighted multiple-linear regression (WREG) program, and to also provide the theoretical background for program features. The program is intended to be used to develop a regional estimation equation for streamflow characteristics that can be applied at an ungaged basin, or to improve the corresponding estimate at continuous-record streamflow gages with short records. The regional estimation equation results from a multiple-linear regression that relates the observable basin characteristics, such as drainage area, to streamflow characteristics.

211

Mass estimation of loose parts in nuclear power plant based on multiple regression

International Nuclear Information System (INIS)

According to the application of the Hilbert–Huang transform to the non-stationary signal and the relation between the mass of loose parts in nuclear power plant and corresponding frequency content, a new method for loose part mass estimation based on the marginal Hilbert–Huang spectrum (MHS) and multiple regression is proposed in this paper. The frequency spectrum of a loose part in a nuclear power plant can be expressed by the MHS. The multiple regression model that is constructed by the MHS feature of the impact signals for mass estimation is used to predict the unknown masses of a loose part. A simulated experiment verified that the method is feasible and the errors of the results are acceptable. (paper)

212

General regression neural network in energy cost analysis

International Nuclear Information System (INIS)

Previous research on energy cost evaluation in industrial processes was carried out by the authors using variance analysis techniques (MANOVA). The results were satisfactory, and the codes developed using these techniques on process computers were capable of handling various factors. Nevertheless, either many hypotheses had to be made on the analytical form of the regression surfaces, or a pure MANOVA model had to be used, losing information on the possible interpolation. Moreover, the regression approach was hardly extensible to on-line acquisition of new data. In order to achieve this goal and to simplify the processing of data, we adopted neural network techniques. We tested various types of networks and found empirical evidence that the General Regression Neural Network (GRNN) structure could behave consistently better than back-propagation algorithms.

213

Regression Analysis between Properties of Subgrade Lateritic Soil

The results of a study that considered the use of regression analysis that may have correlation between index properties and California Bearing Ratio (CBR) of some lateritic soil within Osogbo town of South Western Nigeria have been presented. For an appreciable conclusion to be established, lateritic soil samples were collected from eight (8) different borrow pits within the town and various laboratory tests including Atterberg Limits, Gradation analysis, California Bearing Ratio, Compaction...

214

Multiple-regression equations for estimating low flows at ungaged stream sites in Ohio

This report presents multiple-regression equations for estimating selected low-flow characteristics for most unregulated Ohio streams at sites where little or no discharge data are available. The equations relate combinations of drainage area, main-channel length, main-channel slope, average basin elevation, forested area, average annual precipitation, and an index of infiltration to low flows with durations of 7 and 30 days and average recurrence intervals of 2 and 10 years. Data from 132 long-term continuous-record gaging stations and partial-record sites in Ohio were used in the analyses. Multiple-regression analyses were first performed by using data from all 132 sites in an attempt to develop equations that would be applicable statewide. Standard errors for the statewide equations were too high (111 to 189 percent) for them to be of practical use in estimating low streamflows. Data for the state were then subdivided into five regions, and multiple-regression equations were developed for each region. Standard errors for four of the five regions improved, and ranged from 43 to 106 percent. Standard errors for region 5 remained high (74 to 129 percent). The multiple-regression equations presented in this report are not applicable to streams with significant low-flow regulation. The equations also are not applicable if (1) the site has been gaged and low-flow estimates have been developed from gaging-station records, (2) low flow can be estimated by the drainage-area transference method from data for a nearby gaged site, or (3) a sufficient number of partial-record measurements made at the site can be adequately correlated with concurrent base flows at a suitable index station.

215

Affine Invariant Descriptors of 3D Object Using Multiple Regression Model

In this work, a new invariant method [1,2,3] for 3D objects is proposed using a multiple regression model. This method consists of extracting an invariant vector using the multiple linear parameters model applied to the 3D object; it is invariant against affine transformation of this object. The concerned 3D objects are transformations of 3D objects by one element of the overall transformation. The set of transformations considered in this work is the general affine group.

216

Least Squares Adjustment: Linear and Nonlinear Weighted Regression Analysis

This note primarily describes the mathematics of least squares regression analysis as it is often used in geodesy, including land surveying and satellite positioning applications. In these fields regression is often termed adjustment. The note also contains a couple of typical land surveying and satellite positioning application examples. In these application areas we are typically interested in the parameters in the model, typically 2- or 3-D positions, and not in predictive modelling, which is often the main concern in other regression analysis applications. Adjustment is often used to obtain estimates of relevant parameters in an over-determined system of equations which may arise from deliberately carrying out more measurements than actually needed to determine the set of desired parameters. An example may be the determination of a geographical position based on information from a number of Global Navigation Satellite System (GNSS) satellites, also known as space vehicles (SV). It takes at least four SVs to determine the position (and the clock error) of a GNSS receiver. Often more than four SVs are used and we use adjustment to obtain a better estimate of the geographical position (and the clock error) and to obtain estimates of the uncertainty with which the position is determined. Regression analysis is used in many other fields of application in the natural, the technical and the social sciences. Examples may be curve fitting, calibration, establishing relationships between different variables in an experiment or in a survey, etc. Regression analysis is probably one of the most used statistical techniques around. Dr. Anna B. O. Jensen provided insight and data for the Global Positioning System (GPS) example. Matlab code and sections that are considered as either traditional land surveying material or as advanced material are typeset with smaller fonts. Comments in general or on for example unavoidable typos, shortcomings and errors are most welcome.
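The idea of adjusting an over-determined system, as described in this abstract, can be sketched with the normal equations (A^T A) x = A^T b. This is a minimal illustration assuming equal weights for all observations; the function names and the three made-up measurements are hypothetical and do not come from the note, which uses Matlab.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def least_squares_adjust(design, observations):
    """Unweighted least squares estimate and residuals via the normal equations."""
    rows, cols = len(design), len(design[0])
    normal = [[sum(design[k][i] * design[k][j] for k in range(rows))
               for j in range(cols)] for i in range(cols)]
    rhs = [sum(design[k][i] * observations[k] for k in range(rows))
           for i in range(cols)]
    estimate = solve_linear_system(normal, rhs)
    residuals = [observations[k] - sum(design[k][j] * estimate[j] for j in range(cols))
                 for k in range(rows)]
    return estimate, residuals


# Hypothetical example: two unknown lengths x1 and x2, measured separately
# and once more as their sum (three observations, two unknowns).
design = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
observations = [1.0, 2.0, 3.3]
estimate, residuals = least_squares_adjust(design, observations)
print(estimate, residuals)
```

The redundant third measurement disagrees with the first two by 0.3, and the adjustment spreads that misclosure evenly: each unknown moves by 0.1 and every residual has magnitude 0.1.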

217

A random regression model for daily feed intake and a conventional multiple trait animal model for the four traits average daily gain on test (ADG), feed conversion ratio (FCR), carcass lean content and meat quality index were combined to analyse data from 1 449 castrated male Large White pigs performance tested in two French central testing stations in 1997. Group housed pigs fed ad libitum with electronic feed dispensers were tested from 35 to 100 kg live body weight. A quadratic polynomial in days on test was used as a regression function for weekly means of daily feed intake and to describe its residual variance. The same fixed (batch) and random (additive genetic, pen and individual permanent environmental) effects were used for regression coefficients of feed intake and single measured traits. Variance components were estimated by means of a Bayesian analysis using Gibbs sampling. Four Gibbs chains were run for 550 000 rounds each, from which 50 000 rounds were discarded from the burn-in period. Estimates of posterior means of covariance matrices were calculated from the remaining two million samples. Low heritabilities of linear and quadratic regression coefficients and their unfavourable genetic correlations with other performance traits reveal that altering the shape of the feed intake curve by direct or indirect selection is difficult. PMID:11929625

218

A random regression model for daily feed intake and a conventional multiple trait animal model for the four traits average daily gain on test (ADG), feed conversion ratio (FCR), carcass lean content and meat quality index were combined to analyse data from 1 449 castrated male Large White pigs performance tested in two French central testing stations in 1997. Group housed pigs fed ad libitum with electronic feed dispensers were tested from 35 to 100 kg live body weight. A quadratic polynomial in days on test was used as a regression function for weekly means of daily feed intake and to describe its residual variance. The same fixed (batch) and random (additive genetic, pen and individual permanent environmental) effects were used for regression coefficients of feed intake and single measured traits. Variance components were estimated by means of a Bayesian analysis using Gibbs sampling. Four Gibbs chains were run for 550 000 rounds each, from which 50 000 rounds were discarded from the burn-in period. Estimates of posterior means of covariance matrices were calculated from the remaining two million samples. Low heritabilities of linear and quadratic regression coefficients and their unfavourable genetic correlations with other performance traits reveal that altering the shape of the feed intake curve by direct or indirect selection is difficult.

219

Bayesian Method of Moments (BMOM) Analysis of Mean and Regression Models

A Bayesian method of moments/instrumental variable (BMOM/IV) approach is developed and applied in the analysis of the important mean and multiple regression models. Given a single set of data, it is shown how to obtain posterior and predictive moments without the use of likelihood functions, prior densities and Bayes' Theorem. The posterior and predictive moments, based on a few relatively weak assumptions, are then used to obtain maximum entropy densities for parameters, re...

220

This paper deals with the multiple annotation problem in medical application of cancer detection in digital images. The main assumption is that though images are labeled by many experts, the number of images read by the same expert is not large. Thus differing with the existing work on modeling each expert and ground truth simultaneously, the multi annotation information is used in a soft manner. The multiple labels from different experts are used to estimate the probability...

221

Regression equations have many useful roles in psychological assessment. Moreover, there is a large reservoir of published data that could be used to build regression equations; these equations could then be employed to test a wide variety of hypotheses concerning the functioning of individual cases. This resource is currently underused because…

222

In this study, the application of Artificial Neural Networks (ANN) and Multiple Regression analysis (MR) to forecast long-term seasonal spring rainfall in Victoria, Australia was investigated using lagged El Nino Southern Oscillation (ENSO) and Indian Ocean Dipole (IOD) as potential predictors. The use of dual (combined lagged ENSO-IOD) input sets for calibrating and validating ANN and MR models is proposed to investigate the simultaneous effect of past values of these two major climate modes on long-term spring rainfall prediction. The MR models that did not violate the limits of statistical significance and multicollinearity were selected for future spring rainfall forecasts. The ANN was developed in the form of a multilayer perceptron using the Levenberg-Marquardt algorithm. Both MR and ANN modelling were assessed statistically using mean square error (MSE), mean absolute error (MAE), Pearson correlation (r) and Willmott index of agreement (d). The developed MR and ANN models were tested on out-of-sample test sets; the MR models showed very poor generalisation ability for east Victoria, with correlation coefficients of -0.99 to -0.90, compared to ANN with correlation coefficients of 0.42-0.93; ANN models also showed better generalisation ability for central and west Victoria, with correlation coefficients of 0.68-0.85 and 0.58-0.97 respectively. The ability of multiple regression models to forecast out-of-sample sets is comparable with ANN for Daylesford in central Victoria and Kaniva in west Victoria (r = 0.92 and 0.67 respectively). The errors of the testing sets for ANN models are generally lower than those for multiple regression models. The statistical analysis suggests the potential of ANN over MR models for rainfall forecasting using large scale climate modes.
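Two of the assessment statistics listed in this abstract, the mean absolute error and the Willmott index of agreement, can be sketched as follows. The index is written in its commonly cited form, d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2); whether the study used exactly this variant is an assumption, and the function names and rainfall values are hypothetical.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute gap between observations and predictions."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def willmott_index(observed, predicted):
    """Willmott index of agreement d (1 means perfect agreement)."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    potential_error = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                          for o, p in zip(observed, predicted))
    if potential_error == 0:
        raise ValueError("index is undefined when all values equal the observed mean")
    return 1.0 - squared_error / potential_error


# Hypothetical seasonal rainfall totals (arbitrary units, illustration only).
rain_observed = [1.0, 2.0, 3.0]
perfect_forecast = [1.0, 2.0, 3.0]
climatology_forecast = [2.0, 2.0, 2.0]  # always predicts the observed mean

d_perfect = willmott_index(rain_observed, perfect_forecast)
d_climatology = willmott_index(rain_observed, climatology_forecast)
mae_climatology = mean_absolute_error(rain_observed, climatology_forecast)
print(d_perfect, d_climatology, mae_climatology)
```

A forecast that only ever returns the observed mean scores d = 0, which makes the index a useful complement to the correlation coefficient when judging out-of-sample skill.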

223

Bayesian Method of Moments (BMOM) Analysis of Mean and Regression Models

A Bayesian method of moments/instrumental variable (BMOM/IV) approach is developed and applied in the analysis of the important mean and multiple regression models. Given a single set of data, it is shown how to obtain posterior and predictive moments without the use of likelihood functions, prior densities and Bayes' Theorem. The posterior and predictive moments, based on a few relatively weak assumptions, are then used to obtain maximum entropy densities for parameters, realized error terms and future values of variables. Posterior means for parameters and realized error terms are shown to be equal to certain well known estimates and rationalized in terms of quadratic loss functions. Conditional maxent posterior densities for means and regression coefficients given scale parameters are in the normal form while scale parameters' maxent densities are in the exponential form. Marginal densities for individual regression coefficients, realized error terms and future values are in the Laplace or double-exponenti...

224

Early cost estimating for road construction projects using multiple regression techniques

Directory of Open Access Journals (Sweden)

The objective of this study is to develop early cost estimating models for road construction projects using multiple regression techniques, based on 131 sets of data collected in the West Bank in Palestine. As the cost estimates are required at early stages of a project, consideration was given to the fact that the input data for the required regression model should be easily extracted from sketches or the scope definition of the project. Eleven regression models are developed to estimate the total cost of a road construction project in US dollars; 5 of them include bid quantities as input variables and 6 include road length and road width. The coefficient of determination r² for the developed models ranges from 0.92 to 0.98, which indicates that the predicted values from the forecast models fit the real-life data. The values of the mean absolute percentage error (MAPE) of the developed regression models range from 13% to 31%; these results compare favorably with past research, which has shown that the estimate accuracy in the early stages of a project is between ±25% and ±50%.
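The accuracy measure quoted in this abstract, the mean absolute percentage error, is straightforward to compute. The sketch below is illustrative only: the function name and the cost figures are hypothetical and are not drawn from the study's 131 data sets.

```python
def mean_absolute_percentage_error(actual, forecast):
    """MAPE in percent; actual values must be non-zero."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical project costs in thousands of US dollars (illustration only).
actual_cost = [100.0, 200.0]
estimated_cost = [110.0, 180.0]  # 10% over and 10% under

mape = mean_absolute_percentage_error(actual_cost, estimated_cost)
print(mape)
```

Because each error is scaled by the actual cost, a small project and a large project contribute equally, so the measure says nothing about the total monetary error across a portfolio.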

225

Entrepreneurship programs in developing countries: A meta regression analysis

This paper provides a synthetic and systematic review on the effectiveness of various entrepreneurship programs in developing countries. We adopt a meta-regression analysis using 37 impact evaluation studies that were in the public domain by March 2012, and draw out several lessons on the design of the programs. We observe a wide variation in program effectiveness across different interventions depending on outcomes, types of beneficiaries, and country context. Overall, entrepreneurship progr...

226

Bayesian residual analysis for beta-binomial regression models

The beta-binomial regression model is an alternative model to the sum of any sequence of equicorrelated binary variables with common probability of success p. In this work a Bayesian perspective of this model is presented considering different link functions and different correlation structures. A general Bayesian residual analysis for this model, an issue which is often neglected in Bayesian analysis, is presented in order to check the assumptions in the model, using the residuals based on the predicted values obtained by the conditional predictive ordinate [1], the residuals based on the posterior distribution of the model parameters [2] and the Bayesian deviance residual [3].

227

This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)

228

This study deals with the problem of obtaining important information about infected fish in fish farms. The mathematical formulas obtained in some related research, such as that for the maximum number of infected fish in a fish farm using the queue system, have always been difficult to handle because they require many computational procedures. The multiple linear regression formula for the probability function of the maximum queue length of infected fish during a finite time, measured in days, and the cumulative distribution function were obtained.

229

A regressed phase analysis for coupled joint systems.

This study aims to address shortcomings of the relative phase analysis, a widely used method for assessment of coupling among joints of the lower limb. Goniometric data from 15 individuals with spastic diplegic cerebral palsy were recorded from the hip and knee joints during ambulation on a flat surface, and from a single healthy individual with no known motor impairment, over at least 10 gait cycles. The minimum relative phase (MRP) revealed substantial disparity in the timing and severity of the instance of maximum coupling, depending on which reference frame was selected: MRP(knee-hip) differed from MRP(hip-knee) by 16.1±14% of gait cycle and 50.6±77% difference in scale. Additionally, several relative phase portraits contained discontinuities which may contribute to error in phase feature extraction. These vagaries can be attributed to the predication of relative phase analysis on a transformation into the velocity-position phase plane, and the extraction of phase angle by the discontinuous arc-tangent operator. Here, an alternative phase analysis is proposed, wherein kinematic data is transformed into a profile of joint coupling across the entire gait cycle. By comparing joint velocities directly via a standard linear regression in the velocity-velocity phase plane, this regressed phase analysis provides several key advantages over relative phase analysis including continuity, commutativity between reference frames, and generalizability to many-joint systems. PMID:20971643

230

Arch Height: A Regression Analysis of Different Measuring Parameters

Directory of Open Access Journals (Sweden)

Hironmoy Roy

Spectral Regression Discriminant Analysis for Hyperspectral Image Classification

Dimensionality reduction algorithms, which aim to select a small set of efficient and discriminant features, have attracted great attention for hyperspectral image classification. Manifold learning methods such as Locally Linear Embedding, Isomap, and Laplacian Eigenmap are popular for dimensionality reduction. However, a disadvantage of many manifold learning methods is that their computations usually involve eigen-decomposition of dense matrices, which is expensive in both time and memory. In this paper, we introduce a new dimensionality reduction method, called Spectral Regression Discriminant Analysis (SRDA). SRDA casts the problem of learning an embedding function into a regression framework, which avoids eigen-decomposition of dense matrices. Also, with the regression-based framework, different kinds of regularizers can be naturally incorporated into our algorithm, which makes it more flexible. It can make efficient use of data points to discover the intrinsic discriminant structure in the data. Experimental results on the Washington DC Mall and AVIRIS Indian Pines hyperspectral data sets demonstrate the effectiveness of the proposed method.

232

MULTIPLE LOGISTIC REGRESSION MODEL TO PREDICT RISK FACTORS OF ORAL HEALTH DISEASES

Purpose: To analyze the dependence of oral health diseases, i.e. dental caries and periodontal disease, on a number of risk factors through the application of a logistic regression model. Method: The cross-sectional study involves a systematic random sample of 1760 subjects with permanent dentition, aged 18-40 years, in Dharwad, Karnataka, India. Dharwad is situated in North Karnataka. The mean age was 34.26±7.28 years. The risk factors of dental caries and periodontal disease were established...

233

Summary: This study attempts to compare the performance of two statistical downscaling frameworks in downscaling hydrological indices (descriptive statistics) characterizing the low flow regimes of three rivers in Eastern Canada: Moisie, Romaine and Ouelle. The statistical models selected are Relevance Vector Machine (RVM), an implementation of Sparse Bayesian Learning, and the Automated Statistical Downscaling tool (ASD), an implementation of Multiple Linear Regression. Inputs to both frameworks involve climate variables significantly (α = 0.05) correlated with the indices. These variables were processed using Canonical Correlation Analysis, and the resulting canonical variate scores were used as input to RVM to estimate the selected low flow indices. In ASD, the significantly correlated climate variables were subjected to backward stepwise predictor selection, and the selected predictors were subsequently used to estimate the selected low flow indices using Multiple Linear Regression. With respect to the correlation between climate variables and the selected low flow indices, it was observed that all indices are influenced primarily by wind components (Vertical, Zonal and Meridional) and humidity variables (Specific and Relative Humidity). The downscaling performance of the framework involving RVM was found to be better than that of ASD in terms of Relative Root Mean Square Error, Relative Mean Absolute Bias and Coefficient of Determination. In all cases, the former resulted in less variability of the performance indices between calibration and validation sets, implying better generalization ability than for the latter.

234

Residential behavioural energy savings : a meta-regression analysis

Increasing attention is being given to opportunities for residential energy behavioural savings, as developed countries attempt to reduce energy use and greenhouse gas emissions. Several utility companies have undertaken pilot programs geared at understanding which interventions are most effective in reducing residential energy consumption through behavioural change. This paper presented the first meta-regression analysis of residential energy behavioural savings. This study focused on interventions which affected household energy-related behaviours and, as a result, affected household energy use. The paper described rational choice theory, the theory of planned behaviour, and the integration of rational choice theory and the adjusted expectancy values theory in a simple framework. The paper also discussed the review of various social, psychological and economics journals and databases. The results of the studies were presented. A basic concept in meta-regression analysis is the effect size, which is defined as the program effect divided by the standard error of the program effect. A lengthy review of the literature found twenty-eight treatments from ten experiments for which an effect size could be calculated. The experiments involved classifying treatments according to whether the interventions were information, goal setting, feedback, rewards or combinations of these interventions. The impact of these alternative interventions on the effect size was then modelled using White's robust regression. Five regression models were compared on the basis of Akaike's information criterion. It was found that model 5, which used all of the regressors, was the preferred model. It was concluded that the theory of planned behaviour is more appropriate in the context of analysis of behavioural change and energy use. 21 refs., 4 tabs.
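
The two quantities named in this abstract can be sketched in a few lines of Python. This is only an illustration: the function names, the treatment labels and all numbers are hypothetical and are not taken from the paper. The effect size follows the definition given above (program effect divided by its standard error), and the model comparison uses the standard form of Akaike's information criterion, 2k - 2 ln(L).

```python
def effect_size(program_effect, standard_error):
    """Effect size as defined in the abstract: effect divided by its standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return program_effect / standard_error


def aic(log_likelihood, n_params):
    """Akaike's information criterion, 2k - 2 ln(L); a lower value is preferred."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (effect, standard error) pairs, for illustration only.
treatments = {"feedback": (0.08, 0.02), "goal_setting": (0.03, 0.03)}
sizes = {name: effect_size(e, se) for name, (e, se) in treatments.items()}

# Hypothetical (log-likelihood, number of parameters) pairs for two candidate models.
candidate_models = {"model_1": (-40.0, 2), "model_5": (-30.0, 6)}
scores = {name: aic(ll, k) for name, (ll, k) in candidate_models.items()}
preferred = min(scores, key=scores.get)
```

With these made-up inputs the larger model wins because its gain in log-likelihood outweighs the penalty for its extra parameters.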

235

Multivariate study and regression analysis of gluten-free granola

This study developed a gluten-free granola and evaluated it during storage with the application of multivariate and regression analysis of the sensory and instrumental parameters. The physicochemical, sensory, and nutritional characteristics of a product containing quinoa, amaranth and linseed were evaluated. The crude protein and lipid contents were 97.49 and 122.72 g kg-1 of food, respectively. The polyunsaturated/saturated and n-6:n-3 fatty acid ratios were 2.82 and 2.59:1, respectively. Granola had the best alpha-linolenic acid content, nutritional indices in the lipid fraction, and mineral content. There were good hygienic and sanitary conditions during storage, probably due to the low water activity of the formulation, which contributed to inhibiting microbial growth. The sensory attributes ranged from 'like very much' to 'like slightly', and the regression models were highly fitted and correlated during the storage period. A reduction in the sensory attribute levels and in the product physical stabilisation was verified by principal component analysis. The use of the affective acceptance test and instrumental analysis combined with statistical methods allowed us to obtain promising results about the characteristics of gluten-free granola.

236

Objective: To analyze the correlations between liver lipid level determined by liver 3.0 T 1H-MRS in vivo and influencing factors using multiple linear stepwise regression. Methods: The prospective study of liver 1H-MRS was performed with a 3.0 T system and eight-channel torso phased-array coils using a PRESS sequence. Forty-four volunteers were enrolled in this study. Liver spectra were collected with a TR of 1500 ms, TE of 30 ms, volume of interest of 2 cm×2 cm×2 cm, and NSA of 64. The acquired raw proton MRS data were processed using the software program SAGE. For each MRS measurement, using water as the internal reference, the amplitude of the lipid signal was normalized to the sum of the signal from lipid and water to obtain the percentage of lipid within the liver. The statistical descriptions of height, weight, age, BMI, line width and water suppression were recorded, and Pearson analysis was applied to test their relationships. Multiple linear stepwise regression was used to set the statistical model for the prediction of liver lipid content. Results: Age (39.1±12.6) years, body weight (64.4±10.4) kg, BMI (23.3±3.1) kg/m2, line width (18.9±4.4) and water suppression (90.7±6.5)% had significant correlation with liver lipid content (0.00 to 0.96%, median 0.02%); r values were 0.11, 0.44, 0.40, 0.52 and -0.73, respectively (P<0.05). But only age, BMI, line width, and water suppression entered into the multiple linear regression equation. The liver lipid content prediction equation was as follows: Y = 1.395 - (0.021×water suppression) + (0.022×BMI) + (0.014×line width) - (0.004×age); the coefficient of determination was 0.613 and the corrected coefficient of determination was 0.59. Conclusion: The regression model fitted well, since the variables of age, BMI, line width, and water suppression can explain about 60% of liver lipid content changes. (authors)
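
As a worked example, the prediction equation reported above can be evaluated directly. The sketch below copies the coefficients from the abstract; the function and argument names are hypothetical, the input units are assumed to be those in which the abstract reports each variable, and the output is in whatever units the study used for Y.

```python
def predict_liver_lipid(water_suppression, bmi, line_width, age):
    """Evaluate the regression equation reported in the abstract.

    Assumed units: water suppression in %, BMI in kg/m2, age in years,
    line width in the unit used by the study (not stated in the abstract).
    """
    return (1.395
            - 0.021 * water_suppression
            + 0.022 * bmi
            + 0.014 * line_width
            - 0.004 * age)


# Prediction for a subject sitting at the reported sample means.
y_mean_subject = predict_liver_lipid(water_suppression=90.7, bmi=23.3,
                                     line_width=18.9, age=39.1)
```

Evaluated at the reported sample means, the equation gives approximately 0.111.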

237

International Nuclear Information System (INIS)

Parameters derived from computer analysis of digital radio-frequency (rf) ultrasound scan data of untreated uveal malignant melanomas were examined for correlations with tumor regression following cobalt-60 plaque. Parameters included tumor height, normalized power spectrum and acoustic tissue type (ATT). Acoustic tissue type was based upon discriminant analysis of tumor power spectra, with spectra of tumors of known pathology serving as a model. Results showed ATT to be correlated with tumor regression during the first 18 months following treatment. Tumors with ATT associated with spindle cell malignant melanoma showed over twice the percentage reduction in height as those with ATT associated with mixed/epithelioid melanomas. Pre-treatment height was only weakly correlated with regression. Additionally, significant spectral changes were observed following treatment. Ultrasonic spectrum analysis thus provides a noninvasive tool for classification, prediction and monitoring of tumor response to cobalt-60 plaque

238

Spontaneous regression of multiple pulmonary metastatic nodules of hepatocarcinoma: a case report

International Nuclear Information System (INIS)

Although rare, spontaneous regression of either primary or metastatic malignant tumors in the absence of therapy or with inadequate therapy has been well documented. Since the earliest days of this century various malignant tumors have been reported to spontaneously disappear or to be arrested in their growth, but cases of hepatocarcinoma have been very rare. From the literature, we were able to find 5 previously reported cases of hepatocarcinoma which showed spontaneous regression at the primary site. Recently we have seen a case of multiple pulmonary metastatic nodules of hepatocarcinoma which completely regressed spontaneously, and this forms the basis of the present case report. The patient was a 55-year-old male admitted to St. Mary's Hospital, Catholic Medical College, because of a hard palpable mass in the epigastrium on April 26, 1978. The admission PA chest roentgenogram revealed multiple small nodular densities scattered throughout both lung fields, especially in the lower zones and toward the peripheral portion. A hepatoscintigram revealed a large cold area involving the left lobe and intermediate zone of the liver. Alpha-fetoprotein and hepatitis B serum antigen tests were positive, whereas many other standard liver function tests turned out to be negative. A needle biopsy of the tumor revealed well differentiated hepatocellular carcinoma. The patient was put under chemotherapy which consisted of 5 FU 500 mg intravenously for 6 days from April 28 to May 3, 1978. The patient was discharged after this single course of 5 FU treatment and was on a herb medicine, the nature and quantity of which are obscure. No other specific treatment was given. The second admission took place on Dec. 3, 1980 because of irregularity in bowel habits and dyspepsia. A follow up PA chest roentgenogram obtained on the second admission revealed complete disappearance of previously noted multiple pulmonary nodular lesions (Fig. 3).
Follow up liver scan revealed persistence of the cold area in the left lobe with slight decrease in size. The patient was discharged again without any specific prescription after confirming negative results of various clinical studies including upper GI series and colon study. At the time of finishing this paper the patient is doing well without apparent medical problems

239

Determination of ventilatory threshold through quadratic regression analysis.

Ventilatory threshold (VT) has been used to measure physiological occurrences in athletes through models via gas analysis with limited accuracy. The purpose of this study is to establish a mathematical model to more accurately detect the ventilatory threshold using the ventilatory equivalent of carbon dioxide (VE/VCO2) and the ventilatory equivalent of oxygen (VE/VO2). The methodology is primarily a mathematical analysis of data. The raw data used were archived from the cardiorespiratory laboratory in the Department of Kinesiology at Midwestern State University. Procedures for archived data collection included breath-by-breath gas analysis averaged every 20 seconds (ParVoMedics, TrueMax 2400). A ramp protocol on a Velotron bicycle ergometer was used with work increased at 25 W/min, beginning at 150 W, until volitional fatigue. The subjects consisted of 27 healthy, trained cyclists with ages ranging from 18 to 50 years. All subjects signed a university-approved informed consent before testing. Graphic scatterplots and statistical regression analyses were performed to establish the crossover and subsequent dissociation of VE/VO2 and VE/VCO2. A polynomial trend line along the scatterplots for VE/VO2 and VE/VCO2 was used because of the high correlation coefficient, the coefficient of determination, and trend line. The equations derived from the scatterplots and trend lines were quadratic in nature because they have a polynomial degree of 2. A graphing calculator in conjunction with a spreadsheet was used to find the exact point of intersection of the 2 trend lines. After the quadratic regression analysis, the exact point of VE/VO2 and VE/VCO2 crossover was established as the VT. This application will allow investigators to more accurately determine the VT in subsequent research. PMID:20802290

Gregg, Joey S; Wyatt, Frank B; Kilgore, J Lon

2010-09-01
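
The procedure described in this record (fit a degree-2 trend line to each ventilatory equivalent, then locate the intersection of the two curves) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' spreadsheet workflow: the function names and the synthetic workload and ventilatory values are hypothetical, and the workload axis is rescaled to [0, 1] internally only to keep the normal equations well conditioned.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic(t, y):
    """Least-squares coefficients (c0, c1, c2) of y = c0 + c1*t + c2*t**2."""
    s = [sum(ti ** k for ti in t) for k in range(5)]
    a = [[s[i + j] for j in range(3)] for i in range(3)]
    b = [sum(yi * ti ** i for ti, yi in zip(t, y)) for i in range(3)]
    return solve3(a, b)


def crossover(x, y1, y2):
    """Return the x values inside the sampled range where the two fitted quadratics meet."""
    x0, scale = min(x), max(x) - min(x)
    t = [(xi - x0) / scale for xi in x]          # rescale to [0, 1]
    p, q = fit_quadratic(t, y1), fit_quadratic(t, y2)
    c0, c1, c2 = (pi - qi for pi, qi in zip(p, q))
    if abs(c2) < 1e-12:                          # difference is (nearly) linear
        roots = [-c0 / c1] if abs(c1) > 1e-12 else []
    else:
        disc = c1 * c1 - 4.0 * c2 * c0
        if disc < 0:
            return []
        roots = [(-c1 + sign * math.sqrt(disc)) / (2.0 * c2) for sign in (1, -1)]
    return sorted(x0 + scale * r for r in roots if 0.0 <= r <= 1.0)


# Hypothetical, noise-free data: workload in watts and two synthetic quadratic curves.
watts = list(range(150, 401, 25))
ve_vo2 = [30.0 - 0.10 * w + 0.0004 * w * w for w in watts]
ve_vco2 = [32.0 - 0.08 * w + 0.0002 * w * w for w in watts]
vt_candidates = crossover(watts, ve_vo2, ve_vco2)
```

For these synthetic curves the only intersection inside the sampled range lies near 161.8 W; with real breath-by-breath data the fitted curves, and therefore the crossover, would depend on measurement noise and on the range of workloads included.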

240

Hypoglycemia, or low blood glucose, is dangerous and can result in unconsciousness, seizures, and even death. It is a common and serious side effect of insulin therapy in patients with diabetes. A hypoglycemic monitor is a noninvasive monitor that measures some physiological parameters continuously to provide detection of hypoglycemic episodes in type 1 diabetes mellitus (T1DM) patients. Based on heart rate (HR), corrected QT interval of the ECG signal, change of HR, and change of corrected QT interval, we develop a genetic algorithm (GA)-based multiple regression with fuzzy inference system (FIS) to classify the presence of hypoglycemic episodes. GA is used to find the optimal fuzzy rules and membership functions of the FIS and the model parameters of the regression method. From a clinical study of 16 children with T1DM, natural occurrence of nocturnal hypoglycemic episodes is associated with HRs and corrected QT intervals. The overall data were organized into a training set (eight patients) and a testing set (another eight patients), randomly selected. The results show that the proposed algorithm achieves good sensitivity with acceptable specificity. PMID:21349796

241

The objectives of this study were to estimate (co)variance functions for additive genetic and permanent environmental effects, as well as the genetic parameters for milk yield over multiple parities, using random regression models (RRM). Records of 4,757 complete lactations of Murrah breed buffaloes from 12 herds were analyzed. Ages at calving were between 2 and 11 years. The model included the additive genetic and permanent environmental random effects and the fixed effects of contemporary groups (herd, year and calving season) and milking frequency (1 or 2). A cubic regression on Legendre orthogonal polynomials of ages was used to model the mean trend. The additive genetic and permanent environmental effects were modeled by Legendre orthogonal polynomials. Residual variances were considered homogeneous or heterogeneous, modeled through variance functions or step functions with 5, 7 or 10 classes. Results from Akaike’s information criterion and Schwarz’s Bayesian information criterion indicated that a RRM considering a third order polynomial for the additive genetic and permanent environmental effects and a step function with 5 classes for residual variances fitted best. Heritability estimates obtained by this model varied from 0.10 to 0.28. Genetic correlations were high between consecutive ages, but decreased when intervals between ages increased
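
The Legendre covariates used in random regression models of this kind can be sketched as follows. This is an illustration only: the function name is hypothetical, the age range of 2 to 11 years is taken from the abstract, and the polynomials are left unnormalized, whereas animal-breeding software often rescales each P_n by sqrt((2n+1)/2). Which convention the study used is not stated.

```python
def legendre_covariates(age, age_min, age_max, order=3):
    """Legendre polynomials P_0..P_order evaluated at age rescaled to [-1, 1]."""
    t = 2.0 * (age - age_min) / (age_max - age_min) - 1.0
    values = [1.0, t]
    for n in range(1, order):
        # Bonnet's recursion: (n + 1) P_{n+1}(t) = (2n + 1) t P_n(t) - n P_{n-1}(t)
        values.append(((2 * n + 1) * t * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]


# Covariates for the youngest, middle and oldest calving ages in the reported range.
youngest = legendre_covariates(2.0, 2.0, 11.0)
middle = legendre_covariates(6.5, 2.0, 11.0)
oldest = legendre_covariates(11.0, 2.0, 11.0)
```

Each animal's regression coefficients would then multiply these covariates; estimating the (co)variances of those coefficients requires mixed-model software and is outside the scope of this sketch.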

242

Much attention is focused on increasing energy efficiency to decrease fuel costs and CO2 emissions throughout industrial sectors. The ORC (organic Rankine cycle) is a relatively simple but efficient process that can be used for this purpose by converting low and medium temperature waste heat to power. In this study we propose four linear regression models to predict the maximum obtainable thermal efficiency for simple and recuperated ORCs. A previously derived methodology is able to determine the maximum thermal efficiency among many combinations of fluids and processes, given the boundary conditions of the process. Hundreds of optimised cases with varied design parameters are used as observations in four multiple regression analyses. We analyse the model assumptions, prediction abilities and extrapolations, and compare the results with recent studies in the literature. The models are in agreement with the literature, and they present an opportunity for accurate prediction of the potential of an ORC to convert heat sources with temperatures from 80 to 360 °C, without detailed knowledge or need for simulation of the process. © 2013 Elsevier Ltd. All rights reserved

243

The use of weighted multiple linear regression to estimate QTL-by-QTL epistatic effects

Knowledge of the nature and magnitude of gene effects, as well as their contribution to the control of metric traits, is important in formulating efficient breeding programs for the improvement of plant genetics. Information concerning a genetic parameter such as the additive-by-additive epistatic effect can be useful in traditional breeding. This report describes the results obtained by applying weighted multiple linear regression to estimate the parameter connected with an additive-by-additive epistatic interaction. Three weight variants were used: (1) standard weights based on estimated variances, (2) different weights for minimal, maximal and other lines, and (3) different weights for extreme and other lines. The approach described here combines two methods of estimation, one based on phenotypic observations and the other using molecular marker data. The comparison was done using Monte Carlo simulations. The results show that the application of weighted regression to the marker data yielded estimates similar to those obtained by phenotypic methods.
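
Weighted multiple linear regression of the kind mentioned above solves (X'WX) b = X'Wy. The sketch below is a generic illustration, not the authors' estimator: the function names, the -1/+1 coding of two loci, the product column standing in for an additive-by-additive interaction, and all numbers are assumptions. The weights are taken as inverse variances, in the spirit of weight variant (1).

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def weighted_least_squares(rows, y, weights):
    """Solve the weighted normal equations (X'WX) beta = X'Wy."""
    p = len(rows[0])
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(p)]
            for i in range(p)]
    xtwy = [sum(w * r[i] * yi for r, yi, w in zip(rows, y, weights)) for i in range(p)]
    return solve(xtwx, xtwy)


# Hypothetical lines: two loci coded -1/+1, plus their product as the interaction term.
genotypes = [(-1, -1), (-1, 1), (1, -1), (1, 1), (1, 1), (-1, -1)]
rows = [[1.0, a, b, a * b] for a, b in genotypes]
true_beta = [2.0, 1.0, 0.5, 0.25]              # mean, two additive effects, interaction
y = [sum(c * v for c, v in zip(true_beta, r)) for r in rows]
variances = [0.5, 1.0, 2.0, 0.5, 1.0, 2.0]     # hypothetical per-line variances
weights = [1.0 / v for v in variances]
beta = weighted_least_squares(rows, y, weights)
```

Because the synthetic responses are noise-free, the fit recovers the generating coefficients whatever the weights; with noisy data the weights determine how strongly each line influences the estimates.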

244

The Gauss-Newton algorithm has been used to evaluate tracer binding parameters of RIA by nonlinear regression analysis. The calculations were carried out on the K1003 desk computer. Equations for simple binding models and their derivatives are presented. The advantages of nonlinear regression analysis over linear regression are demonstrated
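
A Gauss-Newton iteration for a two-parameter binding curve can be sketched as below. The abstract does not give its binding equations, so the single-site saturation model bound = bmax * free / (kd + free), the function name, the starting values and the synthetic data are all assumptions made for illustration.

```python
def gauss_newton_binding(free, bound, bmax, kd, iterations=20):
    """Fit bound = bmax * free / (kd + free) by Gauss-Newton, starting from (bmax, kd)."""
    for _ in range(iterations):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, y in zip(free, bound):
            denom = kd + x
            residual = y - bmax * x / denom
            j1 = x / denom                  # derivative of the model w.r.t. bmax
            j2 = -bmax * x / denom ** 2     # derivative of the model w.r.t. kd
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * residual
            g2 += j2 * residual
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-15:                # normal equations are singular; stop
            break
        # Solve the 2x2 normal equations (J'J) step = J'r by Cramer's rule.
        bmax += (g1 * a22 - a12 * g2) / det
        kd += (a11 * g2 - a12 * g1) / det
    return bmax, kd


# Hypothetical noise-free data generated with bmax = 10 and kd = 2.
free = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
bound = [10.0 * x / (2.0 + x) for x in free]
bmax_hat, kd_hat = gauss_newton_binding(free, bound, bmax=8.0, kd=1.5)
```

Plain Gauss-Newton takes full steps and can diverge from a poor starting point; damped variants such as Levenberg-Marquardt are a common safeguard.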

245

Evaluating the Sustainable Development of Agriculture Based on Multiple Linear Regression

Directory of Open Access Journals (Sweden)

Agriculture is the base of the national economy, the rural area is the basic community, and agricultural sustainable development is the base of the sustainable development of society as a whole. Studying the evaluation index system of the agricultural sustainable development level and constructing a reasonable evaluation model are significant for path selection and level promotion. An evaluation index system based on input and output has been built with the method of multiple regression; the interrelation between agricultural investment in fixed assets and related output indexes of agricultural sustainable development, the degree of closeness, and the changing law have been analyzed to find the interrelation mode existing among the indexes; and a set of comprehensive evaluation methods for agricultural sustainable development has been constructed. This evaluation method was used to evaluate the agricultural sustainable development level in China’s 31 provinces; it can help local governments scientifically assess the agricultural sustainable development level and provide a scientific basis for decision-making on agricultural sustainable development.

246

Estimation of Saturation Percentage of Soil Using Multiple Regression, ANN, and ANFIS Techniques

The saturation percentage (SP) of soils is an important index in hydrological studies. In this paper, artificial neural networks (ANNs), multiple regression (MR), and adaptive neural-based fuzzy inference system (ANFIS) were used for estimation of the saturation percentage of soils collected from the Boukan region in the northwestern part of Iran. Percent clay, silt, sand and organic carbon (OC) were used to develop the applied methods. In addition, the contribution of each input variable to the estimation of the SP index was assessed. Two performance functions, namely root mean square error (RMSE) and determination coefficient (R2), were used to evaluate the adequacy of the models. The ANFIS method was found to be superior to the other methods. It is therefore proposed that the ANFIS model can be used for reasonable estimation of SP values of soils.
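
The two performance functions named in this abstract can be computed as follows. The sketch assumes that the determination coefficient is defined as 1 - SS_res/SS_tot; some studies report the squared correlation between observed and predicted values instead, and the abstract does not say which was used. The function names and sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, taken here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
model_rmse = rmse(observed, predicted)
model_r2 = r_squared(observed, predicted)
```

A lower RMSE and a higher R2 on held-out samples would favour one model over another, which is how the abstract ranks ANFIS above the other two methods.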

247

A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. This chapter provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

248

Multivariate Regression Analysis of Gravitational Waves from Rotating Core Collapse

We present a new multivariate regression model for analysis and parameter estimation of gravitational waves observed from well but not perfectly modeled sources such as core-collapse supernovae. Our approach is based on a principal component decomposition of simulated waveform catalogs. Instead of reconstructing waveforms by direct linear combination of physically meaningless principal components, we solve via least squares for the relationship that encodes the connection between chosen physical parameters and the principal component basis. Although our approach is linear, the waveforms' parameter dependence may be non-linear. For the case of gravitational waves from rotating core collapse, we show, using statistical hypothesis testing, that our method is capable of identifying the most important physical parameters that govern waveform morphology in the presence of simulated detector noise. We also demonstrate our method's ability to predict waveforms from a principal component basis given a set of physical ...

Engels, William J; Ott, Christian D

2014-01-01

249

A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

250

A Logistic Regression Analysis of the Ischemic Heart Disease Risk

The main objective of the present study is to investigate factors that contribute significantly to enhancing the risk of ischemic heart disease. The dependent variable of the study is the diagnosis: whether or not the patient has the disease. Logistic regression analysis is applied to explore the factors affecting the disease. The results of the study show that the factors that contribute significantly to enhancing the risk of ischemic heart disease are the use of banaspati ghee, living in an urban area, high cholesterol level, and belonging to the age group of 51 to 60 years. Other significant factors are Apo Protein A, Apo Protein B, cholesterol level, high-density lipoprotein, low-density lipoprotein, phospholipids, total lipid and uric acid.

251

A Quantile Regression Analysis of Micro-lending's Poverty Impact

This paper aims to evaluate the impact of a micro-lending program on ameliorating measured poverty within its client population, with the aim of improving that impact. We analyze over 18,000 women micro-finance clients of the Negros Women for Tomorrow Foundation (NWTF), a database using the Progress out of Poverty (PPI) Scorecard as a measure of poverty. Analysis using both OLS and quantile multivariate regression models shows how observable borrower attributes affect the ability of clients to reduce their measured poverty. Loan size, duration, and the economic activity supported all have strongly identifiable effects. Moreover, estimates suggest which among the poor are receiving the greatest effective help from the program. Results offer specific advice to the NWTF and other micro-lenders: impact is greatest with fewer, larger loans in particular economic sectors (sari-sari, service and trade) but requires patience, as each additional year increases the client’s average change in poverty score.
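
Quantile regression differs from OLS in the loss it minimizes: squared error is replaced by the asymmetric "pinball" (check) loss. The sketch below shows only that loss and the constant that minimizes it, which is the sample quantile; a full quantile regression would minimize the same loss over regression coefficients. The function names and the score changes are hypothetical and unrelated to the NWTF data.

```python
def pinball_loss(observed, predicted, tau):
    """Mean check loss for quantile level tau (0 < tau < 1)."""
    total = 0.0
    for y, p in zip(observed, predicted):
        diff = y - p
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(observed)


def best_constant(observed, tau):
    """Observed value that minimizes the pinball loss when used as a constant prediction."""
    n = len(observed)
    return min(observed, key=lambda c: pinball_loss(observed, [c] * n, tau))


# Hypothetical changes in a poverty score, with one extreme client.
score_changes = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(score_changes, tau=0.5)
upper_fit = best_constant(score_changes, tau=0.9)
mean_fit = sum(score_changes) / len(score_changes)
```

Here the median fit stays at 3 while the mean is pulled to 22 by the single extreme value, which illustrates why quantile models can describe different parts of a skewed outcome distribution separately.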

252

Logistic regression analysis on the risk factors of radiation pneumonitis

International Nuclear Information System (INIS)

Objective: To identify the risk factors of radiation pneumonitis (RP). Methods: A retrospective study was conducted on 101 patients with radiation pneumonitis using SPSS 8.0 software. Factors evaluated included: gender, age, pathology, clinical stage, irradiation dose, irradiation field size, history of smoking, cardiovascular disease, bronchitis, surgery, chemotherapy, lung infection, atelectasis, obstructive infection and pleural effusion. Univariate analysis was performed using the Chi-Square test and multivariate analysis was performed using a Logistic regression model. Results: Univariate analysis revealed that 10 factors were significantly related to radiation pneumonitis: pulmonary infection, atelectasis, obstructive infection, cardiovascular disease, bronchitis, chemotherapy, irradiation dose, number of days of radiation and irradiation field size. Multivariate analysis showed that 9 factors were independent factors: pulmonary infection, obstructive infection, atelectasis, pleural effusion, bronchitis, cardiovascular disease, chemotherapy, irradiation dose, and irradiation field size. Conclusion: Comprehensive consideration of the accompanying disease, chemotherapy, dose, field size, etc. during the planning of radiotherapy can minimize the possibility of developing radiation pneumonitis

253

Low-Cost Housing in Sabah, Malaysia: A Regression Analysis

Low-cost housing plays a vital role in the development process, especially in providing accommodation to those who are less fortunate and to the lower income group. This effort is also a step in overcoming the squatter problem, which could cripple the competitive drive of the local community, especially in the state of Sabah, Malaysia. This article attempts to look into the factors influencing low-cost housing in Sabah, namely the government’s budget (allocation) for low-cost housing projects and Sabah’s total population. At the same time, this study attempts to show the implications of the development and economic crises which occurred during the period 1971 to 2000 for the provision of low-cost houses in Sabah. Empirical analyses were conducted using the multiple linear regression method, the stepwise method and the dummy variable approach to demonstrate the link. The empirical results show that the government’s budget for low-cost housing is the main contributor to the provision of low-cost housing in Sabah. The empirical results also suggest that economic growth, namely Gross Domestic Product (GDP), did not have a significant effect on low-cost housing in Sabah. However, almost all major crises that have beset Malaysia’s economy had a significant and consistent effect on low-cost housing in Sabah, especially the financial crisis which occurred in mid 1997.

254

We compared probability surfaces derived using one set of environmental variables in three Geographic Information Systems (GIS)-based approaches: logistic regression and Akaike’s Information Criterion (AIC), Multiple Criteria Evaluation (MCE), and Bayesian Analysis (specifically Dempster-Shafer theory). We used lynx Lynx canadensis as our focal species, and developed our environment relationship model using track data collected in Banff National Park, Alberta, Canada, during winters from 1...

255

oped for prediction of particulate matter. The performance of the multiple regression models was assessed. For the development of neural network models, a feed forward with back propagation learning algorithm was used to train the network. The performance of the neural network was determined in terms of correlation coefficient (R) and Mean Square Error (MSE). The optimum number of hidden neurons was found for obtaining the lowest value of MSE and the highest value of R. The results indicated that the network can predict particulate concentrations better than multiple regression models.

256

International Nuclear Information System (INIS)

In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and ~200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs

257

Classification and regression tree analysis in acute coronary syndrome patients

Heng-Hsin Tung; Chiang-Yi Chen; Kuan-Chia Lin; Nai-Kuan Chou; Jyun-Yi Lee; Clinciu, Daniel L.; Ru-Yu Lien

2012-01-01

258

Nonparametric regression analysis of data from the Ames mutagenicity assay.

The Ames assay has received widespread attention from statisticians because of its popularity and importance to risk assessment. However, investigators have yet to routinely apply modern regression methods that have been available for more than a decade. We study yet another approach, the application of nonparametric regression techniques, not as the ultimate solution but rather as a framework within which to address some of the shortcomings of other methods. But nonparametric regression is i...

259

Regression Analysis between Properties of Subgrade Lateritic Soil

Full Text Available The results of a study that used regression analysis to examine possible correlations between index properties and the California Bearing Ratio (CBR) of some lateritic soils within Osogbo town in South Western Nigeria are presented. To support a sound conclusion, lateritic soil samples were collected from eight (8) different borrow pits within the town, and various laboratory tests, including Atterberg limits, gradation analysis, California Bearing Ratio, compaction and specific gravity, were performed on the soil samples. Various linear relationships between the index properties and CBR of the samples were investigated, and predictive equations estimating CBR from the experimental index values were developed. The findings indicate that good correlation exists between the two groups (i.e. index properties and CBR values). However, the CBR values computed from the models should be used only for preliminary purposes, in view of their simplicity and economy, and are not acceptable alternatives to laboratory testing because of the anisotropic nature of lateritic soil and its heterogeneity.

260

It is well known that regression analyses involving compositional data need special attention because the data are not of full rank. For a regression analysis where both the dependent and independent variable are components we propose a transformation of the components emphasizing their role as dependent and independent variables. A simple linear regression can be performed on the transformed components. The regression line can be depicted in a ternary diagram facilitating the interpretation ...

261

Nonlinear Robust Regression Using Kernel Principal Component Analysis and R-Estimators

In recent years, many algorithms based on kernel principal component analysis (KPCA) have been proposed including kernel principal component regression (KPCR). KPCR can be viewed as a non-linearization of principal component regression (PCR) which uses the ordinary least squares (OLS) for estimating its regression coefficients. We use PCR to dispose the negative effects of multicollinearity in regression models. However, it is well known that the main disadvantage of OLS is its sensitiveness ...

262

The Analysis of Bootstrap Method in Linear Regression Effect

Directory of Open Access Journals (Sweden)

263

Adaptive regression analysis: theory and applications in econometrics

Full Text Available In this work we (a) discuss some theoretical and computational difficulties in the regression analysis of dependences describing the behaviour of heterogeneous systems, (b) offer a set of new techniques adapted to the regression analysis of heterogeneous dependences and (c) demonstrate the advantages of applying these new techniques in econometrics.

264

Multiple regression and model building with mediator variables was addressed to avoid double counting when economic values are estimated from data simulated with herd simulation modeling (using the SimHerd model). The simulated incidence of metritis was analyzed statistically as the independent variable, while using the traits representing the direct effects of metritis on yield, fertility and occurrence of other diseases as mediator variables. The economic value of metritis was estimated to be €78 per 100 cow-years for each 1% increase of metritis in the period of 1-100 days in milk in multiparous cows. The merit of using this approach was demonstrated since the economic value of metritis was estimated to be 81% higher when no mediator variables were included in the multiple regression analysis

265

Abstract Background Multiple imputation is becoming increasingly popular. Theoretical considerations as well as simulation studies have shown that the inclusion of auxiliary variables is generally of benefit. Methods A simulation study of a linear regression with a response Y and two predictors X1 and X2 was performed on data with n = 50, 100 and 200 using complete cases or multiple imputation with 0, 10, 20, 40 and 80 auxiliary va...

2012-01-01

266

This article reviews three traditional methods for the analysis of multicenter trials with persons nested within clusters, i.e., centers, namely naïve regression (persons as units of analysis), fixed effects regression, and the use of summary measures (clusters as units of analysis), and compares these methods with multilevel regression. The comparison is made for continuous (quantitative) outcomes, and is based on the estimator of the treatment effect and its standard error, beca...

267

Mapping of multiple quantitative trait loci by simple regression in half-sib designs

Detection of QTL in outbred half-sib family structures has mainly been based on interval mapping of single QTL on individual chromosomes. Methods to account for linked and unlinked QTL have been developed, but most of them are only applicable in designs with inbred species or pose great demands on computing facilities. This study describes a strategy that allows for rapid analysis, involving multiple QTL, of complete genomes. The methods combine information from individual analyses after whic...

268

This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

269

In recent years, new techniques such as artificial neural networks and fuzzy inference systems have been employed to develop predictive models for estimating required parameters. Soft computing techniques are now being used as alternative statistical tools. Determining the swell potential of soil is difficult, expensive and time consuming, and involves destructive tests. In this paper, the use of the MLP and RBF functions of ANN (artificial neural networks) and of ANFIS (adaptive neuro-fuzzy inference system) for predicting S% (swell percent) of soil is described and compared with the traditional statistical model of MR (multiple regression). The accuracies of the ANN and ANFIS models may be regarded as relatively similar. It was found that the constructed RBF exhibited higher performance than MLP, ANFIS and MR for predicting S%. The performance comparison showed that the soft computing system is a good tool for minimizing uncertainties in soil engineering projects. The use of soft computing may also provide new approaches and methodologies, and minimize the potential inconsistency of correlations.

270

Dental malocclusion and body posture in young subjects: A multiple regression study

Full Text Available OBJECTIVES: Controversial results have been reported on potential correlations between the stomatognathic system and body posture. We investigated whether malocclusal traits correlate with body posture alterations in young subjects to determine possible clinical applications. METHODS: A total of 122 subjects, including 86 males and 36 females (age range of 10.8-16.3 years), were enrolled. All subjects tested negative for temporomandibular disorders or other conditions affecting the stomatognathic systems, except malocclusion. A dental occlusion assessment included phase of dentition, molar class, overjet, overbite, anterior and posterior crossbite, scissorbite, mandibular crowding and dental midline deviation. In addition, body posture was recorded through static posturography using a vertical force platform. Recordings were performed under two conditions, namely, i) mandibular rest position (RP) and ii) dental intercuspidal position (ICP). Posturographic parameters included the projected sway area and velocity and the antero-posterior and right-left load differences. Multiple regression models were run for both recording conditions to evaluate associations between each malocclusal trait and posturographic parameters. RESULTS: All of the posturographic parameters had large variability and were very similar between the two recording conditions. Moreover, a limited number of weakly significant correlations were observed, mainly for overbite and dentition phase, when using multivariate models. CONCLUSION: Our current findings, particularly with regard to the use of posturography as a diagnostic aid for subjects affected by dental malocclusion, do not support the existence of clinically relevant correlations between malocclusal traits and body posture.

271

The successive projections algorithm (SPA) is widely used to select variables for multiple linear regression (MLR) modeling. However, SPA used only once may not obtain all the useful information of the full spectra, because the number of selected variables cannot exceed the number of calibration samples in the SPA algorithm. Therefore, the SPA-MLR method risks the loss of useful information. To make full use of the useful information in the spectra, a new method named "consensus SPA-MLR" (C-SPA-MLR) is proposed herein. This method combines the consensus strategy with the SPA-MLR method. In the C-SPA-MLR method, SPA-MLR is used to construct member models with different subsets of variables, which are selected from the remaining variables iteratively. A consensus prediction is obtained by combining the predictions of the member models. The proposed method is evaluated by analyzing the near infrared (NIR) spectra of corn and diesel. The C-SPA-MLR method showed better prediction performance than the SPA-MLR and full-spectra PLS methods. Moreover, these results could serve as a reference for combining the consensus strategy with other variable selection methods when analyzing NIR spectra and data from other spectroscopic techniques. PMID:25597797

272

In our earlier study [12], we suggested a new alignment algorithm, the Multiple Design Configuration Optimization (MDCO hereafter) method, which combines the merit function regression (MFR) computation with the differential wavefront sampling (DWS) method. In this study, we report the alignment state estimation performance of the method for three target optical systems: i) a two-mirror Cassegrain telescope of 58 mm in diameter for deep space earth observation, ii) a three-mirror anastigmat of 210 mm in aperture for ocean monitoring from geostationary orbit, and iii) on-axis/off-axis pairs of an extremely large telescope of 27.4 m in aperture. First, we introduced known amounts of alignment state disturbance to the target optical system elements. Example alignment parameter ranges include, but are not limited to, 800 microns to 10 mm in decenter and 0.1 to 1.0 degree in tilt. We then ran alignment state estimation simulations using MDCO, MFR and DWS. The simulation results show that MDCO yields much better estimation performance than MFR and DWS for alignment disturbance levels of up to 150 times the required tolerances. In particular, with its simple single-field measurement, MDCO exhibits greater practicality and application potential for shop floor optical testing environments than MFR and DWS.

273

A simplified procedure of linear regression in a preliminary analysis

Full Text Available The analysis of a large statistical data set can be guided by the study of a particularly interesting variable Y (the regressed variable) and an explanatory variable X, chosen among the remaining variables and jointly observed. The study gives a simplified procedure for obtaining the functional link between the variables, y = y(x), by partitioning the data set into m subsets, in which the observations are summarized by location indices (mean or median) of X and Y. Polynomial models for y(x) of order r are considered to verify the characteristics of the procedure; in particular, r = 1 and r = 2 are assumed. The distributions of the parameter estimators are obtained by simulation when the fitting is done for m = r + 1. The results are compared, in terms of distribution and efficiency, with those obtained by ordinary least squares. The study also gives some considerations on the consistency of the parameters estimated by the procedure.
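The procedure described in the abstract above can be sketched as follows. This is one reading of the abstract rather than the authors' own implementation: the function name `grouped_fit`, the choice of consecutive, nearly equal-sized subsets ordered by X, and the sample data are all assumptions made for illustration. With m = r + 1 subsets, the polynomial of order r passes exactly through the m pairs of location indices, so no least-squares step is needed.

```python
from statistics import mean

def grouped_fit(x, y, r=1, location=mean):
    """Fit a polynomial of order r through m = r + 1 group summaries.

    The data are ordered by x, cut into m consecutive subsets, and each
    subset is reduced to (location of x, location of y). Pass
    statistics.median as `location` to use medians instead of means.
    """
    m = r + 1
    pairs = sorted(zip(x, y))
    n = len(pairs)
    if n < m:
        raise ValueError("need at least r + 1 observations")
    # Integer bounds guarantee every subset is non-empty when n >= m.
    bounds = [j * n // m for j in range(m + 1)]
    knots = []
    for j in range(m):
        chunk = pairs[bounds[j]:bounds[j + 1]]
        knots.append((location([p[0] for p in chunk]),
                      location([p[1] for p in chunk])))

    def predict(x0):
        # Lagrange form of the unique order-r polynomial through the knots.
        # Assumes the m summary x-values are distinct.
        total = 0.0
        for i, (xi, yi) in enumerate(knots):
            term = yi
            for j, (xj, _) in enumerate(knots):
                if j != i:
                    term *= (x0 - xj) / (xi - xj)
            total += term
        return total

    return knots, predict

# Hypothetical noise-free linear data, y = 2x + 1.
x = [0, 1, 2, 3, 4, 5]
y = [2 * v + 1 for v in x]
knots, predict = grouped_fit(x, y, r=1)
```

For r = 1 with means as location indices, the fitted line recovers an exact linear relationship, because the group means of a linear function lie on the line. For r = 2 this no longer holds exactly, which is one reason the abstract compares the estimators with ordinary least squares by simulation.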

274

Full Text Available OBJECTIVE: To systematically review randomized controlled trials comparing the effect of supplementation with multiple micronutrients versus iron and folic acid on pregnancy outcomes in developing countries. METHODS: MEDLINE and EMBASE were searched. Outcomes of interest were birth weight, low birth weight, small size for gestational age, perinatal mortality and neonatal mortality. Pooled relative risks (RRs) were estimated by random effects models. Sources of heterogeneity were explored through subgroup meta-analyses and meta-regression. FINDINGS: Multiple micronutrient supplementation was more effective than iron and folic acid supplementation at reducing the risk of low birth weight (RR: 0.86; 95% confidence interval, CI: 0.79-0.93) and of small size for gestational age (RR: 0.85; 95% CI: 0.78-0.93). Micronutrient supplementation had no overall effect on perinatal mortality (RR: 1.05; 95% CI: 0.90-1.22), although substantial heterogeneity was evident (I²=58%; P for heterogeneity=0.008). Subgroup and meta-regression analyses suggested that micronutrient supplementation was associated with a lower risk of perinatal mortality in trials in which >50% of mothers had formal education (RR: 0.93; 95% CI: 0.82-1.06) or in which supplementation was initiated after a mean of 20 weeks of gestation (RR: 0.88; 95% CI: 0.80-0.97). CONCLUSION: Maternal education or gestational age at initiation of supplementation may have contributed to the observed heterogeneous effects on perinatal mortality. The safety, efficacy and effective delivery of maternal micronutrient supplementation require further research.

275

Penalized-regression-based multimarker genotype analysis of Genetic Analysis Workshop 17 data

Abstract Testing for association between multiple markers and a phenotype can not only capture untyped causal variants in weak linkage disequilibrium with nearby typed markers but also identify the effect of a combination of markers. We propose a sliding window approach that uses multimarker genotypes as variables in a penalized regression. We investigate a penalty with three separate components: (1) a group least absolute shrinkage and selection operator (LASSO) that selects multim...

276

Watershed Regressions for Pesticides for multiple pesticides (WARP-MP) are statistical models developed to predict concentration statistics for a wide range of pesticides in unmonitored streams. The WARP-MP models use the national atrazine WARP models in conjunction with an adjustment factor for each additional pesticide. The WARP-MP models perform best for pesticides with application timing and methods similar to those used with atrazine. For other pesticides, WARP-MP models tend to overpredict concentration statistics for the model development sites. For WARP and WARP-MP, the less-than-ideal sampling frequency for the model development sites leads to underestimation of the shorter-duration concentration; hence, the WARP models tend to underpredict 4- and 21-d maximum moving-average concentrations, with median errors ranging from 9 to 38%. As a result of this sampling bias, pesticides that performed well with the model development sites are expected to have predictions that are biased low for these shorter-duration concentration statistics. The overprediction by WARP-MP apparent for some of the pesticides is variably offset by underestimation of the model development concentration statistics. Of the 112 pesticides used in the WARP-MP application to stream segments nationwide, 25 were predicted to have concentration statistics with a 50% or greater probability of exceeding one or more aquatic life benchmarks in one or more stream segments. Geographically, many of the modeled streams in the Corn Belt Region were predicted to have one or more pesticides that exceeded an aquatic life benchmark during 2009, indicating the potential vulnerability of streams in this region.

277

Correlation analysis, in conjunction with principal-component and multiple-regression analyses, was applied to laboratory chemical and petrographic data to assess the usefulness of these techniques in evaluating selected physical and hydraulic properties of carbonate-rock aquifers in central Pennsylvania. Correlation and principal-component analyses were used to establish relations and associations among variables, to determine dimensions of property variation of samples, and to filter the variables containing similar information. Principal-component and correlation analyses showed that porosity is related to other measured variables and that permeability is most related to porosity and grain size. Four principal components are found to be significant in explaining the variance of data. Stepwise multiple-regression analysis was used to see how well the measured variables could predict porosity and (or) permeability for this suite of rocks. The variation in permeability and porosity is not totally predicted by the other variables, but the regression is significant at the 5% significance level. © 1993.

278

Classification and regression tree analysis in acute coronary syndrome patients

Full Text Available Objectives: The objectives of this study are to use CART (classification and regression tree) and stepwise regression to 1) define the predictors of quality of life in ACS (acute coronary syndrome) patients, using demographics, ACS symptoms, and anxiety as independent variables; and 2) discuss and compare the results of these two statistical approaches. Background: In outcome studies of ACS, CART is a good alternative approach to linear regression; however, CART is rarely used. Methods: A descriptive survey design was used with 100 samples recruited. Results and Conclusions: Anxiety is the most significant predictor of quality of life and also a stronger predictor than symptoms of ACS. The anxiety level patients experienced at the time the heart attack occurred can be used to predict quality of life a month later. Furthermore, the majority of ACS patients experienced a moderate to high level of anxiety during a heart attack.

279

The turbulent boundary layer wall pressure on the surface of a vehicle can induce structural vibration which may produce a high level of noise or may result in structural damage. To estimate the wavevector frequency of turbulent boundary layer wall pressure in the intermediate and high frequency range, a linear regression model is presented. The actual spectrum of the wavevector frequency is obtained by multiplying values for a trial spectrum with a correction spectrum. The correction spectrum is approximated by a polynomial which is a function of the Strouhal wavenumber and a set of coefficients determined by a least square minimization of a set of measured response data. A linear multiple regression relates the response of a measuring system to the coefficients. The advantage of the regression approach is that it relaxes the requirements of the wavevector filter's ability to discriminate against the spectral elements outside the wavenumber bandwidth of the filter. Some numerical results of the model are presented.

280

Grades, Gender, and Encouragement: A Regression Discontinuity Analysis

The author employs a regression discontinuity design to provide direct evidence on the effects of grades earned in economics principles classes on the decision to major in economics and finds a differential effect for male and female students. Specifically, for female students, receiving an A for a final grade in the first economics class is…

281

Measurement and Analysis of Test Suite Volume Metrics for Regression Testing

Full Text Available Regression testing is intended to ensure that a software application works as specified after changes are made to it during maintenance. It is an important phase in the software development lifecycle. Regression testing is the re-execution of some subset of test cases that have already been executed. It is an expensive process used to detect defects due to regressions. Regression testing has been used to support software-testing activities and to assure appropriate quality through several versions of a software product during its development and maintenance. Regression testing assures the quality of modified applications. In this proposed work, a study and analysis of metrics related to test suite volume was undertaken. It was shown that the software under test needs more test cases after changes were made to it. A comparative analysis was performed to find the change in test suite size before and after the regression test.

282

Many research groups have been studying the contribution of tropical forests to the global carbon cycle, and the climatic consequences of substituting the forests for pastures. Considering that soil CO2 efflux is the greater component of the carbon cycle of the biosphere, this work found an equation for estimating the soil CO2 efflux of an area of the Transition Forest, using a model of multiple regression for time series data of temperature and soil moisture. The study was carried out in the n...

283

Application of a multiple least-squares regression program to dual energy NaI-CsI(Tl) measurements

In conjunction with the development of an optimum background subtraction routine, a multiple least-squares regression program for simultaneous utilization of both the NaI(Tl) and CsI(Tl) energy ranges of a dual anti-coincidence detection system was applied. To experimentally evaluate the program for whole body counting purposes, an Am-241 contaminated subject was measured in the whole body counter using the standard three phoswich detector array surrounding the head.

284

Taking the determination of micro-Fe by 1,10-phenanthroline spectrophotometry as an example, the application of the combinatorial measurement and regression analysis method in instrumental analysis is systematically introduced, covering its methodological principle, operation steps and data processing, including: establishing the best linear equation of the calibration curve, establishing the best linear equation of the measurand, and calculating the best value of a concentration. The results showed that the mean of three determinations ...

285

The aim of this paper is to generalize permutation methods for multiple testing adjustment of significant partial regression coefficients in a linear regression model used for microarray data. Using a permutation method outlined by Anderson and Legendre [1999] and the permutation P-value adjustment from Simon et al. [2004], the significance of disease related gene expression will be determined and adjusted after accounting for the effects of covariates, which are not restricted to be categori...

286

Rock mass classification systems are one of the most common ways of determining rock mass excavatability and related equipment assessment. However, the strength and weak points of such rating-based classifications have always been questionable. Such classification systems assign quantifiable values to predefined classified geotechnical parameters of rock mass. This causes particular ambiguities, leading to the misuse of such classifications in practical applications. Recently, intelligence system approaches such as artificial neural networks (ANNs) and neuro-fuzzy methods, along with multiple regression models, have been used successfully to overcome such uncertainties. The purpose of the present study is the construction of several models by using an adaptive neuro-fuzzy inference system (ANFIS) method with two data clustering approaches, including fuzzy c-means (FCM) clustering and subtractive clustering, an ANN and non-linear multiple regression to estimate the basic rock mass diggability index. A set of data from several case studies was used to obtain the real rock mass diggability index and compared to the predicted values by the constructed models. In conclusion, it was observed that ANFIS based on the FCM model shows higher accuracy and correlation with actual data compared to that of the ANN and multiple regression. As a result, one can use the assimilation of ANNs with fuzzy clustering-based models to construct such rigorous predictor tools.

287

REGRESSION ANALYSIS OF PRODUCTIVITY USING MIXED EFFECT MODEL

Full Text Available Production plants of a company are located in several areas spread across Middle and East Java. As the production process employs mostly manpower, we suspected that each location has different characteristics affecting productivity. Thus, the production data may have a spatial and hierarchical structure. Fitting a linear regression using ordinary techniques requires some assumptions about the nature of the residuals, i.e. that they are independent, identically and normally distributed. However, these assumptions are rarely fulfilled, especially for data that have a spatial and hierarchical structure. We worked out the problem using a mixed effect model. This paper discusses the construction of a model of productivity and several characteristics of the production line, taking location as a random effect. A simple model with high utility that satisfies the necessary regression assumptions was built using the free statistical software R, version 2.6.1.

288

Full Text Available Abstract Background Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of Cβ atoms in other residues within a sphere around the Cβ atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either "contacted" or "non-contacted", the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary protein sequence and higher order consecutive protein structural and functional properties.

289

Grades, gender, and encouragement: A regression discontinuity analysis

This study employs a regression discontinuity design in order to provide direct evidence on the effects of grades earned in economics principles classes on the decision to major in economics and finds a differential effect for male and female students. Specifically, for female students, receiving an “A” for a final grade in the first economics class is associated with a meaningful increase in the probability of majoring in economics, even after controlling for the numerical grade earned ...

290

Model performance analysis and model validation in logistic regression

Full Text Available In this paper a new model validation procedure for a logistic regression model is presented. At first, we illustrate a brief review of different techniques of model validation. Next, we define a number of properties required for a model to be considered "good", and a number of quantitative performance measures. Lastly, we describe a methodology for the assessment of the performance of a given model by using an example taken from a management study.

291

Retirement patterns in Hong Kong: A censored regression analysis

This paper provides an overview of retirement patterns in Hong Kong on the basis of limited data. A censored regression model is used to infer the retirement age from people's current retirement status and their current age. This model is equivalent to a restricted probit model, and the interpretation of parameters is straightforward. The results clearly show a negative income effect on the retirement decision. The retirement age seems to be positively related to lifetime earnings but negativ...

292

BRGLM, Interactive Linear Regression Analysis by Least Square Fit

1 - Description of program or function: BRGLM is an interactive program written to fit general linear regression models by least squares and to provide a variety of statistical diagnostic information about the fit. Stepwise and all-subsets regression can also be carried out. There are facilities for interactive data management (e.g. setting missing value flags, data transformations) and tools for constructing design matrices for the more commonly used models such as factorials, cubic splines, and auto-regressions. 2 - Method of solution: The least squares computations are based on the orthogonal (QR) decomposition of the design matrix obtained using the modified Gram-Schmidt algorithm. 3 - Restrictions on the complexity of the problem: The current release of BRGLM allows maxima of 1000 observations, 99 variables, and 3000 words of main memory workspace. For a problem with N observations and P variables, the number of words of main memory storage required is MAX(N*(P+6), N*P+P*P+3*N, 3*P*P+6*N). Any linear model may be fit, although the in-memory workspace will have to be increased for larger problems.
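The method of solution named above, least squares through a modified Gram-Schmidt QR decomposition, can be sketched in a few lines. This is an illustration of the general technique, not BRGLM's source code: the function names `mgs_least_squares` and `workspace_words` are hypothetical, the data are invented, and the storage function simply evaluates the MAX expression quoted in the abstract.

```python
def mgs_least_squares(X, y):
    """Solve min ||X b - y|| by modified Gram-Schmidt QR.

    X is a list of rows (n observations by p variables); y has length n.
    The right-hand side is orthogonalized in the same sweep as the
    columns, which yields Q^T y without forming Q explicitly.
    """
    n, p = len(X), len(X[0])
    cols = [[float(X[i][j]) for i in range(n)] for j in range(p)]
    rhs = [float(v) for v in y]
    R = [[0.0] * p for _ in range(p)]
    z = [0.0] * p
    for k in range(p):
        norm = sum(v * v for v in cols[k]) ** 0.5
        if norm == 0.0:
            raise ValueError("design matrix is rank deficient")
        R[k][k] = norm
        q = [v / norm for v in cols[k]]
        # "Modified" step: remove q from every remaining column right away,
        # using the already-updated columns rather than the originals.
        for j in range(k + 1, p):
            R[k][j] = sum(a * b for a, b in zip(q, cols[j]))
            cols[j] = [b - R[k][j] * a for a, b in zip(q, cols[j])]
        z[k] = sum(a * b for a, b in zip(q, rhs))
        rhs = [b - z[k] * a for a, b in zip(q, rhs)]
    # Back-substitution on the upper-triangular system R b = z.
    b = [0.0] * p
    for k in range(p - 1, -1, -1):
        s = z[k] - sum(R[k][j] * b[j] for j in range(k + 1, p))
        b[k] = s / R[k][k]
    return b

def workspace_words(n, p):
    """Words of storage from the formula quoted in the abstract."""
    return max(n * (p + 6), n * p + p * p + 3 * n, 3 * p * p + 6 * n)

# Hypothetical data lying exactly on y = 1 + 2x (intercept column first).
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [1, 3, 5, 7]
coef = mgs_least_squares(X, y)
words = workspace_words(10, 2)
```

Working from an orthogonal decomposition of the design matrix avoids forming the normal equations X'X, whose condition number is the square of that of X; this is a common motivation for QR-based least squares.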

293

Analysis of some methods for reduced rank Gaussian process regression

DEFF Research Database (Denmark)

While there is strong motivation for using Gaussian Processes (GPs) due to their excellent performance in regression and classification problems, their computational complexity makes them impractical when the size of the training set exceeds a few thousand cases. This has motivated the recent proliferation of a number of cost-effective approximations to GPs, both for classification and for regression. In this paper we analyze one popular approximation to GPs for regression: the reduced rank approximation. While generally GPs are equivalent to infinite linear models, we show that Reduced Rank Gaussian Processes (RRGPs) are equivalent to finite sparse linear models. We also introduce the concept of degenerate GPs and show that they correspond to inappropriate priors. We show how to modify the RRGP to prevent it from being degenerate at test time. Training RRGPs consists both in learning the covariance function hyperparameters and the support set. We propose a method for learning hyperparameters for a given support set. We also review the Sparse Greedy GP (SGGP) approximation (Smola and Bartlett, 2001), which is a way of learning the support set for given hyperparameters based on approximating the posterior. We propose an alternative method to the SGGP that has better generalization capabilities. Finally we make experiments to compare the different ways of training a RRGP. We provide some Matlab code for learning RRGPs.

294

A plot of lung-cancer rates versus radon exposures in 965 US counties, or in all US states, has a strong negative slope, b, in sharp contrast to the strong positive slope predicted by linear/no-threshold theory. The discrepancy between these slopes exceeds 20 standard deviations (SD). Including smoking frequency in the analysis substantially improves fits to a linear relationship but has little effect on the discrepancy in b, because correlations between smoking frequency and radon levels are quite weak. Including 17 socioeconomic variables (SEV) in multiple regression analysis reduces the discrepancy to 15 SD. Data were divided into segments by stratifying on each SEV in turn, and on geography, and on both simultaneously, giving over 300 data sets to be analyzed individually, but negative slopes predominated. The slope is negative whether one considers only the most urban counties or only the most rural; only the richest or only the poorest; only the richest in the South Atlantic region or only the poorest in that region, etc.; and for all the strata in between. Since this is an ecological study, the well-known problems with ecological studies were investigated and found not to be applicable here. The "ecological fallacy" was shown not to apply in testing a linear/no-threshold theory, and the vulnerability to confounding is greatly reduced when confounding factors are only weakly correlated with radon levels, as is generally the case here. All confounding factors known to correlate with radon and with lung cancer were investigated quantitatively and found to have little effect on the discrepancy.

295

In this paper, the most relevant multiple regression models for sales forecasting of gas stations, developed over the past ten years, are reviewed. The most significant variables related to gas station sales, the types of multiple regression models (linear or non-linear), their most common uses in supporting decision making, and their limits are presented. The predictive power of each model and its impact on decision-making, such as sensitivity analysis and confidence intervals for independent variables, are also discussed. Four models are presented, based on studies conducted in South Africa, Portugal and Brazil. In conclusion, suggestions for future developments are presented based on past developments. (author)

296

Nineteen variables, including precipitation, soils and geology, land use, and basin morphologic characteristics, were evaluated to develop Iowa regression models to predict total streamflow (Q), base flow (Qb), storm flow (Qs) and base flow percentage (%Qb) in gauged and ungauged watersheds in the state. Discharge records from a set of 33 watersheds across the state for the 1980 to 2000 period were separated into Qb and Qs. Multiple linear regression found that 75.5 percent of long term average Q was explained by rainfall, sand content, and row crop percentage variables, whereas 88.5 percent of Qb was explained by these three variables plus permeability and floodplain area variables. Qs was explained by average rainfall and %Qb was a function of row crop percentage, permeability, and basin slope variables. Regional regression models developed for long term average Q and Qb were adapted to annual rainfall and showed good correlation between measured and predicted values. Combining the regression model for Q with an estimate of mean annual nitrate concentration, a map of potential nitrate loads in the state was produced. Results from this study have important implications for understanding geomorphic and land use controls on streamflow and base flow in Iowa watersheds and similar agriculture dominated watersheds in the glaciated Midwest. (JAWRA) (Copyright © 2005).

297

This paper introduces a generalization of the regression-discontinuity design (RDD). Traditionally, RDD is considered in a two-dimensional framework, with a single assignment variable and cutoff. Treatment effects are measured at a single location along the assignment variable. However, this represents a specialized (and straight-forward)…

298

Use of Structure Coefficients in Published Multiple Regression Articles: Beta Is Not Enough.

Reviewed articles published in the "Journal of Applied Psychology" (JAP) to determine how interpretations might have differed if standardized regression coefficients and structure coefficients (or bivariate "r"s of predictors with the criterion) had been interpreted. Summarizes some dramatic misinterpretations or incomplete interpretations.…

299

We compared probability surfaces derived using one set of environmental variables in three Geographic Information Systems (GIS)-based approaches: logistic regression and Akaike’s Information Criterion (AIC), Multiple Criteria Evaluation (MCE), and Bayesian Analysis (specifically Dempster-Shafer theory). We used lynx (Lynx canadensis) as our focal species, and developed our environment relationship model using track data collected in Banff National Park, Alberta, Canada, during winters from 1997 to 2000. The accuracy of the three spatial models was compared using a contingency table method. We determined the percentage of cases in which both presence and absence points were correctly classified (overall accuracy), the failure to predict a species where it occurred (omission error) and the prediction of presence where there was absence (commission error). Our overall accuracy showed the logistic regression approach was the most accurate (74.51%). The multiple criteria evaluation was intermediate (39.22%), while the Dempster-Shafer (D-S) theory model was the poorest (29.90%). However, omission and commission error tell us a different story: logistic regression had the lowest commission error, while D-S theory produced the lowest omission error. Our results provide evidence that habitat modellers should evaluate all three error measures when ascribing confidence in their model. We suggest that for our study area at least, the logistic regression model is optimal. However, where sample size is small or the species is very rare, it may also be useful to explore and/or use a more ecologically cautious modelling approach (e.g. Dempster-Shafer) that would over-predict, protect more sites, and thereby minimize the risk of missing critical habitat in conservation plans [Current Zoology 55(1): 28–40, 2009].
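The three error measures named in this abstract can be computed from a 2×2 contingency table, as in the sketch below. The denominators are an assumption (omission error as a share of observed presence points, commission error as a share of observed absence points), since the abstract does not state them; the function name `accuracy_measures` and the counts in the usage note are hypothetical.

```python
def accuracy_measures(tp, fn, fp, tn):
    """Overall accuracy, omission error and commission error.

    tp: presence points predicted present   fn: presence points predicted absent
    fp: absence points predicted present    tn: absence points predicted absent
    Denominators are an assumption; the abstract does not define them.
    """
    total = tp + fn + fp + tn
    overall = (tp + tn) / total      # share of all points classified correctly
    omission = fn / (tp + fn)        # species occurred but was not predicted
    commission = fp / (fp + tn)      # presence predicted where species was absent
    return overall, omission, commission
```

With invented counts, `accuracy_measures(tp=40, fn=10, fp=5, tn=45)` gives an overall accuracy of 0.85, an omission error of 0.2, and a commission error of 0.1, which shows how a model can score well overall while still missing a fifth of the presence points.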

300

A comparison of multiple regression and neural network techniques for mapping in situ pCO2 data

Using about 138,000 measurements of surface pCO2 in the Atlantic subpolar gyre (50-70 deg N, 60-10 deg W) during 1995-1997, we compare two methods of interpolation in space and time: a monthly distribution of surface pCO2 constructed using multiple linear regressions on position and temperature, and a self-organizing neural network approach. Both methods confirm characteristics of the region found in previous work, i.e. the subpolar gyre is a sink for atmospheric CO2 throughout the year, and exhibits a strong seasonal variability with the highest undersaturations occurring in spring and summer due to biological activity. As an annual average the surface pCO2 is higher than estimates based on available syntheses of surface pCO2. This supports earlier suggestions that the sink of CO2 in the Atlantic subpolar gyre has decreased over the last decade instead of increasing as previously assumed. The neural network is able to capture a more complex distribution than can be well represented by linear regressions, but both techniques agree relatively well on the average values of pCO2 and derived fluxes. However, when both techniques are used with a subset of the data, the neural network predicts the remaining data to a much better accuracy than the regressions, with a residual standard deviation ranging from 3 to 11 μatm. The subpolar gyre is a net sink of CO2 of 0.13 Gt-C/yr using the multiple linear regressions and 0.15 Gt-C/yr using the neural network, on average between 1995 and 1997. Both calculations were made with the NCEP monthly wind speeds converted to 10 m height and averaged between 1995 and 1997, and using the gas exchange coefficient of Wanninkhof

301

Multiple linear regression models are often used to predict levels of fecal indicator bacteria (FIB) in recreational swimming waters based on independent variables (IVs) such as meteorologic, hydrodynamic, and water-quality measures. The IVs used for these analyses are traditiona...

302

This paper proposes a five-step process by which to analyze whether the salary ratio between junior and senior college faculty exhibits salary compression, a term used to describe an unusually small differential between faculty with different levels of experience. The procedure utilizes commonly used statistical techniques (multiple regression…

303

The all rocket mode of operation is shown to be a critical factor in the overall performance of a rocket based combined cycle (RBCC) vehicle. An axisymmetric RBCC engine was used to determine specific impulse efficiency values based upon both full flow and gas generator configurations. Design of experiments methodology was used to construct a test matrix and multiple linear regression analysis was used to build parametric models. The main parameters investigated in this study were: rocket chamber pressure, rocket exit area ratio, injected secondary flow, mixer-ejector inlet area, mixer-ejector area ratio, and mixer-ejector length-to-inlet diameter ratio. A perfect gas computational fluid dynamics analysis, using both the Spalart-Allmaras and k-omega turbulence models, was performed with the NPARC code to obtain values of vacuum specific impulse. Results from the multiple linear regression analysis showed that for both the full flow and gas generator configurations increasing mixer-ejector area ratio and rocket area ratio increase performance, while increasing mixer-ejector inlet area ratio and mixer-ejector length-to-diameter ratio decrease performance. Increasing injected secondary flow increased performance for the gas generator analysis, but was not statistically significant for the full flow analysis. Chamber pressure was found to be not statistically significant.

304

Multiple neutron spectrum activation analysis

A new nuclear analytical technique based on neutron source spectrum differentiation has been developed to solve complex problems in multinuclide activation analysis with wide nuclide concentration, half-life, cross-section, radioactivity and counting rate ranges. Thus the isotopic and elemental analysis of various isotopic compositions of uranium for nuclear safeguard purposes and of uranium-thorium mixtures in geological, archaeological and other samples, as well as the identification and quantitative determination of multiple photopeaks in instrumental neutron activation analysis, is facilitated by the new technique. In the case of short and medium-lived nuclides the technique is particularly powerful if it is combined with cyclic activation, intermediate sample storage and proper choice of timing sequences and sample sizes, as well as with irradiation position selection for neutron flux adjustment, in order to optimize the experimental conditions. Thus the counting statistics and consequently the accuracy and sensitivity of the measurements can be improved, high counting rates and radiation build-up, which could cause dead-time losses and pulse pile-up effects, can be avoided, and timing and sample positioning uncertainties, as well as matrix interferences and other negative effects on the measurements, can be reduced. Recent intercomparisons with other laboratories, such as the Safeguards Analytical Laboratory of the International Atomic Energy Agency, showed that the new technique can provide high accuracy in uranium element and U-235 abundance determination by delayed fission neutron counting after neutron activation, with certain advantages such as non-destructive sample preparation, high throughput and low analytical cost, and thus can complement or even compete with other well established analytical techniques.

305

[An applied study on Fourier transform near-infrared whole spectroscopy regression analysis].

In the present paper, 66 wheat samples were used as experimental materials; 33 of them were used for building the quantitative analysis model of protein content, and the rest composed the prediction set. Using the Moore-Penrose matrix, we estimated directly the regression coefficients of the regression analysis model with Fourier transform near-infrared (FTNIR) whole spectroscopy. The samples of the prediction set were analyzed, and the correlation coefficient between the prediction values of the near-infrared model and the standard chemical ones by Kjeldahl's method is 0.9799, and the average relative error is 1.76%. Using the Moore-Penrose matrix, we can not only get the near-infrared spectroscopy analysis model's regression coefficients, but also know their contribution at every wavelength point. Consequently we can understand and explain the physical and chemical significance of the FTNIR whole spectroscopy regression model. PMID:16544481

306

This study was conducted in the Forest Park of Noor. To determine the tree-ring response to climatic variations, 35 cores were taken from a dominant natural stand of common ash (Fraxinus excelsior L.). The aim of this study was to find, using multiple regression models, which climatic variables affect the ring-width growth of ash in the current growing year and in previous years (one, two and three years before the current growing year) in the north of I.R. Iran. Totally, 85 annually, monthly s...

307

The multiple linear regression formula of the probability of the averaged daily solar energy reaching a specific location on the earth's surface in a calendar month was obtained with the assumption that the arrival process of clouds and solar energy during the day follows the exponential distribution. This formula enables any user to find out some of the required information, such as the maximum probability for the averaged daily solar energy and the corresponding amount of cloud. In addition, the cumulative distribution function of this probability was obtained.

Mohammed Mohammed El Genidy

2012-01-01

308

Air is an efficient means of dispersing atmospheric pollutants, and its behavior depends on the atmospheric movements that occur in the troposphere. In Porto Alegre, Rio Grande do Sul State, there is heavy daily traffic and a concentration of industries that may be responsible for atmospheric emissions. In the present work we studied the behavior of daily concentrations of particulate matter (PM10) in this city, considering the influence of meteorological variables. Data analysis was performed using descriptive statistics, linear correlation and multiple regression. Data were provided by the State Foundation of Environmental Protection Henrique Luiz Roessler - RS (FEPAM) and the National Institute of Meteorology (INMET). The analysis showed that the concentrations of PM10, measured every day at 4:00 p.m., did not exceed national air quality standards; that the daily average wind speed and the daily average radiation were negatively related to PM10 concentrations; and that the daily average air temperature and the north and northwest wind directions were positively related to them. The wind directions that contribute significantly to lower concentrations at the measured sites are east and southeast.

Angela Radünz Lazzari

2011-01-01

309

Regression analysis for recurrent events data under dependent censoring.

Hsieh, Jin-Jian; Ding, A Adam; Wang, Weijing

310

This study encompasses air surface temperature (AST) modeling in the lower atmosphere. Data on four atmospheric pollutant gases (CO, O3, CH4, and H2O), retrieved from the National Aeronautics and Space Administration Atmospheric Infrared Sounder (AIRS) from 2003 to 2008, were employed to develop a model to predict AST values in the Malaysian peninsula using the multiple regression method. For the entire period, the pollutants were highly correlated (R=0.821) with predicted AST. Comparisons among five stations in 2009 showed close agreement between the predicted AST and the observed AST from AIRS, especially in the southwest monsoon (SWM) season, within 1.3 K, and for in situ data, within 1 to 2 K. The validation of predicted AST against AST from AIRS showed high correlation coefficients (R=0.845 to 0.918), indicating the model's efficiency and accuracy. Statistical analysis in terms of β showed that H2O (0.565 to 1.746) tended to contribute significantly to high AST values during the northeast monsoon season. Generally, these results clearly indicate the advantage of using the satellite AIRS data and a correlation analysis study to investigate the impact of atmospheric greenhouse gases on AST over the Malaysian peninsula. A model was developed that is capable of retrieving the peninsular Malaysian AST in all weather conditions, with total uncertainties ranging between 1 and 2 K.

311

Additive Intensity Regression Models in Corporate Default Analysis

We consider additive intensity (Aalen) models as an alternative to the multiplicative intensity (Cox) models for analyzing the default risk of a sample of rated, nonfinancial U.S. firms. The setting allows for estimating and testing the significance of time-varying effects. We use a variety of model checking techniques to identify misspecifications. In our final model, we find evidence of time-variation in the effects of distance-to-default and short-to-long term debt. Also we identify interactions between distance-to-default and other covariates, and the quick ratio covariate is significant. None of our macroeconomic covariates are significant.
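For orientation, the two model families contrasted in this abstract are usually written as follows. This is the standard textbook notation, not notation taken from the paper itself: $\lambda_i(t)$ is the default intensity of firm $i$, $X_i(t)$ its covariate vector, and $\lambda_0(t)$, $\beta_0(t)$ are baseline terms.

```latex
% Multiplicative intensity (Cox): covariates scale a baseline hazard
\lambda_i(t) = \lambda_0(t)\,\exp\!\bigl(\beta^{\top} X_i(t)\bigr)

% Additive intensity (Aalen): covariates shift the hazard, and the
% coefficients themselves may vary with time
\lambda_i(t) = \beta_0(t) + \beta(t)^{\top} X_i(t)
```

The time argument in $\beta(t)$ is what makes it possible to estimate and test time-varying effects, such as those reported for distance-to-default.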

312

Multivariate linear regression of high-dimensional fMRI data with multiple target variables.

Valente, Giancarlo; Castellanos, Agustin Lage; Vanacore, Gianluca; Formisano, Elia

313

Evaluating the economic feasibility of a bio-gasification facility requires understanding its unit cost under different production capacities. The objective of this study was to evaluate the unit cost of syngas production at capacities from 60 through 1800 Nm³/h using an economic model with three regression analysis techniques (simple regression, reciprocal regression, and log-log regression). The preliminary result of this study showed that the reciprocal regression analysis technique gave the best-fit curve between per unit cost and production capacity, with a sum of error squares (SES) lower than 0.001 and a coefficient of determination (R²) of 0.996. The regression analysis techniques determined the minimum unit cost of syngas production for micro-scale bio-gasification facilities of $0.052/Nm³, under the capacity of 2,880 Nm³/h. The results of this study suggest that to reduce cost, facilities should run at a high production capacity. In addition, the contribution of this technique could be a new categorical criterion for evaluating micro-scale bio-gasification facilities from the perspective of economic analysis.
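Two of the three curve forms named in this abstract reduce to ordinary straight-line fits after a change of variable, as the sketch below shows. The function names and the synthetic data in the usage note are hypothetical, and the paper's own economic model and data are not reproduced here.

```python
import math


def simple_ols(x, y):
    """Intercept and slope of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_reciprocal(capacity, unit_cost):
    """Fit unit_cost = a + b / capacity by regressing cost on 1/capacity."""
    return simple_ols([1.0 / c for c in capacity], unit_cost)


def fit_loglog(capacity, unit_cost):
    """Fit unit_cost = a * capacity**b by regressing ln(cost) on ln(capacity)."""
    ln_a, b = simple_ols([math.log(c) for c in capacity],
                         [math.log(u) for u in unit_cost])
    return math.exp(ln_a), b


def sum_of_error_squares(predicted, observed):
    """Sum of squared residuals, the SES criterion mentioned in the abstract."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))
```

As a check on made-up data, costs generated as `0.05 + 6 / c` for capacities `[60, 120, 600, 1800]` are recovered by `fit_reciprocal` as `a ≈ 0.05`, `b ≈ 6`. In a reciprocal model with positive `b`, the fitted unit cost falls toward the floor `a` as capacity grows, which matches the abstract's conclusion that higher capacity lowers cost.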

Deng, Yangyang; Parajuli, Prem B.

314

Highlights: • Thermodynamic models of simple and regenerative cycles are defined. • Exergy destruction rate of different components was determined. • Impact of important operating parameters on cycles’ characteristics was determined. • Multiple polynomial regression models were developed. • Optimization for optimal operating parameters was performed. - Abstract: In this paper, thermo-environmental, economic and regression analyses of simple and regenerative gas turbine cycles are presented. Firstly, thermodynamic models for both cycles are defined; the exergy destruction rate of different components is determined and a parametric study is carried out to investigate the effects of compressor inlet temperature, turbine inlet temperature and compressor pressure ratio on the parameters that measure the cycles’ performance, environmental impact and costs. Subsequently, multiple polynomial regression (MPR) models are developed to correlate important response variables with predictor variables, and finally optimization is performed for optimal operating conditions. The results of the parametric study have shown a significant impact of operating parameters on the performance parameters, environmental impact and costs. According to the exergy analysis, the combustion chamber and exhaust stack are the two major sites where the largest exergy destruction/losses occur. Also, the total exergy destruction in the regenerative cycle is relatively lower, resulting in a higher exergy efficiency of the cycle. The MPR models also proved to be good estimators of the response variables, as indicated by very high R2 values. Finally, these models are used to determine the optimal operating parameters, which maximize the cycles’ performance and minimize CO2 emissions and costs.

315

Regression Models for Demand Reduction based on Cluster Analysis of Load Profiles

This paper provides new regression models for demand reduction of Demand Response programs for the purpose of ex ante evaluation of the programs and screening for recruiting customer enrollment into the programs. The proposed regression models employ load sensitivity to outside air temperature and a representative load pattern derived from cluster analysis of customer baseline load as explanatory variables. The performance of the proposed models is examined from the viewpoint of validity of explanatory variables and fitness of regressions, using actual load profile data of Pacific Gas and Electric Company's commercial and industrial customers who participated in the 2008 Critical Peak Pricing program including Manual and Automated Demand Response.

316

Multivariate Bayesian Logistic Regression for Analysis of Clinical Study Safety Issues

Dumouchel, William

2012-01-01

317

Multilayer Perceptron for Robust Nonlinear Interval Regression Analysis Using Genetic Algorithms

Yi-Chung Hu

318

The effectiveness of multiple linear regression approaches in removing solar, volcanic, and El Nino Southern Oscillation (ENSO) influences from the recent (1979-2012) surface temperature record is examined, using simple energy balance and global climate models (GCMs). These multiple regression methods are found to incorrectly diagnose the underlying signal - particularly in the presence of a deceleration - by generally overestimating the solar cooling contribution to an early 21st century pause while underestimating the warming contribution from the Mt. Pinatubo recovery. In fact, one-box models and GCMs suggest that the Pinatubo recovery has contributed more to post-2000 warming trends than the solar minimum has contributed to cooling over the same period. After adjusting the observed surface temperature record based on the natural-only multi-model mean from several CMIP5 GCMs and an empirical ENSO adjustment, a significant deceleration in the surface temperature increase is found, ranging in magnitude from -0.06 to -0.12 K dec-2 depending on model sensitivity and the temperature index used. This likely points to internal decadal variability beyond these solar, volcanic, and ENSO influences.

Masters, T.

319

In this paper, we propose a rapid model adaptation technique for emotional speech recognition which enables us to extract paralinguistic information as well as linguistic information contained in speech signals. This technique is based on style estimation and style adaptation using a multiple-regression HMM (MRHMM). In the MRHMM, the mean parameters of the output probability density function are controlled by a low-dimensional parameter vector, called a style vector, which corresponds to a set of the explanatory variables of the multiple regression. The recognition process consists of two stages. In the first stage, the style vector that represents the emotional expression category and the intensity of its expressiveness for the input speech is estimated on a sentence-by-sentence basis. Next, the acoustic models are adapted using the estimated style vector, and then standard HMM-based speech recognition is performed in the second stage. We assess the performance of the proposed technique in the recognition of simulated emotional speech uttered by both professional narrators and non-professional speakers.
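The central idea of the MRHMM, a state mean that is a multiple regression on a low-dimensional style vector, can be sketched as below. The regression-matrix layout (a bias column followed by one column per style dimension), the function name `mrhmm_mean`, and the numbers in the usage note are all assumptions for illustration; the abstract does not give the exact parameterisation.

```python
def mrhmm_mean(regression_matrix, style_vector):
    """Mean of one state's output density as a function of the style vector.

    regression_matrix: one row per acoustic feature dimension; column 0 is a
    bias term and the remaining columns weight the style components
    (this layout is an assumption).
    style_vector: low-dimensional vector of explanatory variables, e.g. the
    estimated intensity of each emotional expression category.
    """
    xi = [1.0] + list(style_vector)      # augment with a constant term
    if any(len(row) != len(xi) for row in regression_matrix):
        raise ValueError("matrix width must be len(style_vector) + 1")
    return [sum(h * x for h, x in zip(row, xi)) for row in regression_matrix]
```

For instance, `mrhmm_mean([[1.0, 0.5, -0.2], [0.0, 1.0, 0.3]], [2.0, 1.0])` yields approximately `[1.8, 2.3]`. Changing only the style vector shifts every state mean at once, which is why adaptation can be done from a single sentence-level estimate.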

320

This study was conducted in the Forest Park of Noor. To determine the tree-ring response to climatic variations, 35 cores were taken from a dominant natural stand of common ash (Fraxinus excelsior L.). The aim of this study was to find, using multiple regression models, which climatic variables affect the ring-width growth of ash in the current growing year and in previous years (one, two and three years before the current growing year) in the north of I.R. Iran. In total, 85 annual, monthly and seasonal climatic variables covering precipitation, temperature, heat index, evapotranspiration and water balance were analyzed. The best multiple regression models explained 83 percent of the total variance in the growth of common ash. The results show that the growth of common ash was related more to the previous years' climatic variations than to those of the current year. The most effective role of climatic variations was due to the first and second preceding years (55%). Evapotranspiration of July and September and precipitation of May in the second previous year, and precipitation of March in the third previous year, all positively affected the growth of this species. This study revealed that ash favors warmer conditions in the early and middle growing season when moisture is available, and precipitation in the months of the early growing season (Ordibehesht-Khordad) of the two previous years.

H. Jalilvand

2008-01-01

321

Many research groups have been studying the contribution of tropical forests to the global carbon cycle, and the climatic consequences of substituting pastures for forests. Considering that soil CO2 efflux is the major component of the carbon cycle of the biosphere, this work found an equation for estimating the soil CO2 efflux of an area of the Transition Forest, using a multiple regression model for time series data of temperature and soil moisture. The study was carried out in the northwest of Mato Grosso, Brazil (11°24.75’S; 55°19.50’W), in a transition forest between cerrado and Amazon Forest, 50 km from Sinop county. Each month, throughout one year, soil CO2 efflux, temperature and soil moisture were measured. The annual average soil CO2 efflux was 7.5 ± 0.6 (mean ± SE) μmol m-2 s-1, and the annual mean soil temperature was 25.06 ± 0.12 (mean ± SE) ºC. The study indicated that humidity had a strong influence on soil CO2 efflux; however, the results were more significant using a multiple regression model that estimated the logarithm of soil CO2 efflux, considering time, soil moisture and the interaction between time duration and the inverse of soil temperature.

322

The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis

This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression function. However, the a priori specification of a functional form involves the risk of choosing one that is not similar to the “true” but unknown relationship between the regressors and the dependent variable. This problem, known as parametric misspecification, can result in biased parameter estimates and hence, also in biased measures, which are derived from the estimated parameters. This, in turn, can result in incorrect economic conclusions and recommendations for managers, politicians and decision makers in general. This PhD thesis focuses on a nonparametric econometric approach that can be used to avoid this problem. The main objective is to investigate the applicability of the nonparametric kernel regression method in applied production analysis. The focus of the empirical analyses included in this thesis is the agricultural sector in Poland. Data on Polish farms are used to investigate practically and politically relevant problems and to illustrate how nonparametric regression methods can be used in applied microeconomic production analysis both in panel data and cross-section data settings. The thesis consists of four papers. The first paper addresses problems of parametric and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences within a nonparametric panel data regression framework. 
The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric kernel methods are well-suited to econometric production analysis and can outperform traditional parametric methods. Although the empirical focus of this thesis is on the application of nonparametric kernel regression in applied production analysis, the findings are also applicable to econometric estimations in general.
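As a rough illustration of what a nonparametric kernel regression does, the sketch below implements a Nadaraya-Watson (local-constant) estimator with a Gaussian kernel. This is only one common kernel estimator and is not necessarily the one used in the thesis; the function names, the bandwidth and the data are assumptions made for the example.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x_train, y_train, x0, bandwidth):
    """Kernel-weighted average of y_train; weights decay with |x - x0|."""
    weights = [gaussian_kernel((x - x0) / bandwidth) for x in x_train]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, y_train)) / total


# Hypothetical data: an output that depends nonlinearly on one input.
x_obs = [0.0, 1.0, 2.0, 3.0, 4.0]
y_obs = [0.0, 1.0, 4.0, 9.0, 16.0]
fit_at_2 = nadaraya_watson(x_obs, y_obs, 2.0, bandwidth=0.5)
```

No functional form is specified anywhere in the estimator; the bandwidth alone controls how local the fit is, which is why its choice matters in practice.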

323

The application of a multiple regression model for aero radiometric data

The data observed in the total channel of high-sensitivity airborne γ-ray spectrometric surveys are selected as the dependent variable, while those of the Th, K and U channels are considered independent variables, and a linear statistical model is assumed to relate them: Total_i = β_0 + β_1 U_i + β_2 Th_i + β_3 K_i + ε_i, where β_1, β_2 and β_3 are the partial regression coefficients and ε_i is the error term. The estimated coefficients (β_1, β_2, β_3) are used to check the data acquisition system on board, as well as to predict occasionally a more appropriate value when a single data item is not recorded correctly. (author)
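A minimal sketch of fitting such a model, with the total-channel count as a linear function of the U, Th and K channel counts, is given below. It solves the least-squares normal equations in plain Python. The channel values and the "true" coefficients are made up for illustration, and the helper names are hypothetical rather than taken from the paper.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [b0, b1, ...] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up (U, Th, K) channel counts; totals generated from known coefficients.
channels = [(2.0, 5.0, 1.0), (3.0, 4.0, 2.0), (1.0, 6.0, 3.0),
            (4.0, 2.0, 5.0), (5.0, 3.0, 4.0), (2.5, 1.0, 6.0)]
total = [10.0 + 1.5 * u + 2.0 * th + 0.5 * k for u, th, k in channels]
coefs = fit_ols(channels, total)  # [intercept, U, Th, K]
```

With fitted coefficients in hand, a suspect total-channel reading could be compared with the value predicted from the other three channels, which is the kind of check the abstract describes.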

324

Data analysis in multiple-frequency bioelectrical impedance analysis.

The performance of three analytical methods for multiple-frequency bioelectrical impedance analysis (MFBIA) data was assessed. The methods were the established method of Cole and Cole, the newly proposed method of Siconolfi and co-workers and a modification of this procedure. Method performance was assessed from the adequacy of the curve fitting techniques, as judged by the correlation coefficient and standard error of the estimate, and the accuracy of the different methods in determining the theoretical values of impedance parameters describing a set of model electrical circuits. The experimental data were well fitted by all curve-fitting procedures (r = 0.9 with SEE 0.3 to 3.5% or better for most circuit-procedure combinations). Cole-Cole modelling provided the most accurate estimates of circuit impedance values, generally within 1-2% of the theoretical values, followed by the Siconolfi procedure using a sixth-order polynomial regression (1-6% variation). None of the methods, however, accurately estimated circuit parameters when the measured impedances were low (Cole-Cole modelling remains the preferred method for the analysis of MFBIA data. PMID:9626691
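For orientation, the sketch below evaluates one common form of the Cole impedance model, Z = R_inf + (R0 - R_inf) / (1 + (jωτ)^α), across frequency. The exponent convention varies between authors (some write 1 - α), and the circuit parameters here are hypothetical, so this is an illustration of the model family rather than the exact formulation assessed in the paper.

```python
import math


def cole_impedance(freq_hz, r0, r_inf, tau, alpha):
    """Complex impedance Z = R_inf + (R0 - R_inf) / (1 + (j*omega*tau)**alpha)."""
    omega = 2.0 * math.pi * freq_hz
    return r_inf + (r0 - r_inf) / (1.0 + (1j * omega * tau) ** alpha)


# Hypothetical circuit parameters (ohms, seconds), chosen only for illustration.
R0, R_INF, TAU, ALPHA = 600.0, 400.0, 3.0e-6, 0.8

low = cole_impedance(1.0, R0, R_INF, TAU, ALPHA)      # approaches R0
high = cole_impedance(1.0e12, R0, R_INF, TAU, ALPHA)  # approaches R_inf
mid = cole_impedance(1.0 / (2.0 * math.pi * TAU), R0, R_INF, TAU, ALPHA)
```

Curve fitting of measured multi-frequency data then amounts to choosing R0, R_inf, τ and α so that this curve matches the measured resistance and reactance.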

Cornish, B H; Ward, L C

1998-05-01

325

Nonlinear Robust Regression Using Kernel Principal Component Analysis and R-Estimators

In recent years, many algorithms based on kernel principal component analysis (KPCA) have been proposed, including kernel principal component regression (KPCR). KPCR can be viewed as a nonlinear extension of principal component regression (PCR), which uses ordinary least squares (OLS) to estimate its regression coefficients. PCR is used to remove the negative effects of multicollinearity in regression models. However, it is well known that the main disadvantage of OLS is its sensitivity to outliers. Therefore, KPCR can be inappropriate for data sets containing outliers. In this paper, we propose a novel nonlinear robust technique using a hybridization of KPCA and R-estimators. The proposed technique is compared to KPCR and gives better results than KPCR.

Antoni Wibowo

326

Regression analysis of ESR/TL dose-response data

Methods are described for the analysis of ESR (electron spin resonance) or TL (thermoluminescence) dose-response data. When fitting data to a straight line, an expression is derived which allows the error in the accumulated dose, AD, to be estimated. For fitting data to a saturating exponential, the simplex algorithm with quadratic convergence is proposed. This allows the errors in the parameters, including the AD, to be estimated. An alternative method for estimating the parameter errors, using analytical expressions for the required partial derivatives, is also described. These techniques are more satisfactory than jackknifing for estimating uncertainties in ADs. (author)

327

Criteria for the use of regression analysis for remote sensing of sediment and pollutants

An examination of the limitations, requirements, and precision of the linear multiple-regression technique for quantification of marine environmental parameters is conducted. Both environmental and optical physics conditions have been defined for which an exact solution to the signal response equations is of the same form as the multiple regression equation. Various statistical parameters are examined to define criteria for selecting an unbiased fit when upwelled radiance values contain error and are correlated with each other. Field experimental data are examined to define data smoothing requirements in order to satisfy the criteria of Daniel and Wood (1971). Recommendations are made concerning improved selection of ground-truth locations to maximize variance and to minimize physical errors associated with the remote sensing experiment.

328

Regression Analysis on the Chemical Descriptors of a Selected Class of DPP4 Inhibitors

The activity of a selected class of DPP4 inhibitors was assessed using quantum-chemical and physical descriptors. Using a multiple linear regression model, it was found that ΔE, LUMO energy, dipole, area, volume, molecular weight and ΔH are the significant descriptors that can adequately assess the activity of the compounds. The model suggests that bulky and electrophilic inhibitors are desired. Furthermore, pair interactions between ΔE and dipole, as well as between LUMO energy and dipole, were determined. It is expected that the information derived herein will be beneficial for future design and development of DPP4 inhibitors. Key words: Multiple Linear Regression; Molecular Descriptors; 2D-QSAR; DPP4 Inhibitors

329

Evaluation of mechanical characteristics of cartilage by magnetic resonance imaging would provide a noninvasive measure of tissue quality both for tissue engineering and when monitoring clinical response to therapeutic interventions for cartilage degradation. We use results from multiexponential transverse relaxation analysis to predict equilibrium and dynamic stiffness of control and degraded bovine nasal cartilage, a biochemical model for articular cartilage. Sulfated glycosaminoglycan concentration/wet weight (ww) and equilibrium and dynamic stiffness decreased with degradation from 103.6 ± 37.0 µg/mg ww, 1.71 ± 1.10 MPa and 15.3 ± 6.7 MPa in controls to 8.25 ± 2.4 µg/mg ww, 0.015 ± 0.006 MPa and 0.89 ± 0.25MPa, respectively, in severely degraded explants. Magnetic resonance measurements were performed on cartilage explants at 4 °C in a 9.4 T wide-bore NMR spectrometer using a Carr-Purcell-Meiboom-Gill sequence. Multiexponential T2 analysis revealed four water compartments with T2 values of approximately 0.14, 3, 40 and 150 ms, with corresponding weight fractions of approximately 3, 2, 4 and 91%. Correlations between weight fractions and stiffness based on conventional univariate and multiple linear regressions exhibited a maximum r(2) of 0.65, while those based on support vector regression (SVR) had a maximum r(2) value of 0.90. These results indicate that (i) compartment weight fractions derived from multiexponential analysis reflect cartilage stiffness and (ii) SVR-based multivariate regression exhibits greatly improved accuracy in predicting mechanical properties as compared with conventional regression. PMID:24519878

Irrechukwu, Onyi N; Thaer, Sarah Von; Frank, Eliot H; Lin, Ping-Chang; Reiter, David A; Grodzinsky, Alan J; Spencer, Richard G

2014-04-01

330

This paper proposes a simple method for estimating emissions and predicted environmental concentrations (PECs) in water and air for organic chemicals that are used in household products and industrial processes. The method has been tested on existing data for 63 organic high-production volume chemicals available in the European Chemicals Bureau risk assessment reports (RARs). The method suggests a simple linear relationship between Henry's Law constant, octanol-water coefficient, use and production volumes, and emissions and PECs on a regional scale in the European Union. Emissions and PECs are a result of a complex interaction between chemical properties, production and use patterns and geographical characteristics. A linear relationship cannot capture these complexities; however, it may be applied at a cost-efficient screening level for suggesting critical chemicals that are candidates for an in-depth risk assessment. Uncertainty measures are not available for the RAR data; however, uncertainties for the applied regression models are given in the paper. Evaluation of the methods reveals that between 79% and 93% of all emission and PEC estimates are within one order of magnitude of the reported RAR values. Bearing in mind that the domain of the method comprises organic industrial high-production volume chemicals, four chemicals, prioritized in the Water Framework Directive and the Stockholm Convention on Persistent Organic Pollutants, were used to test the method for estimated emissions and PECs, with corresponding uncertainty intervals, in air and water at regional EU level.
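The "within one order of magnitude" criterion used in the evaluation can be made concrete with a short sketch: an estimate passes if it differs from the reference value by at most a factor of ten. The function names and all numbers below are invented for illustration and are not taken from the risk assessment reports.

```python
import math


def within_one_order(estimate, reference):
    """True if estimate and reference differ by at most a factor of 10."""
    return abs(math.log10(estimate / reference)) <= 1.0


def share_within_one_order(estimates, references):
    """Fraction of estimates that fall within one order of magnitude."""
    hits = sum(within_one_order(e, r) for e, r in zip(estimates, references))
    return hits / len(estimates)


# Made-up estimated vs. reported values (same units within each pair).
estimated = [1.0, 50.0, 0.02, 3000.0]
reported = [2.0, 4.0, 0.01, 100.0]
share = share_within_one_order(estimated, reported)
```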

Fauser, Patrik; Thomsen, Marianne

2010-01-01

331

Background: Accurate prediction of antigenic epitopes is important for immunologic research and medical applications, but it is still an open problem in bioinformatics. The case for discontinuous epitopes is even worse: currently there are only a few discontinuous epitope prediction servers available, though discontinuous peptides constitute the majority of all B-cell antigenic epitopes. The small number of structures for antigen-antibody complexes limits the development of reliable discontinuous epitope prediction methods and of an unbiased benchmark to evaluate developed methods. Results: In this work, we present two novel server applications for discontinuous epitope prediction: EPSVR and EPMeta, where EPMeta is a meta server. EPSVR, EPMeta, and datasets are available at http://sysbio.unl.edu/services. Conclusion: The server application for discontinuous epitope prediction, EPSVR, uses a Support Vector Regression (SVR) method to integrate six scoring terms. Furthermore, we combined EPSVR with five existing epitope prediction servers to construct EPMeta. All methods were benchmarked on our curated independent test set, in which no antigen had a complex structure with the antibody and the epitopes were identified by various biochemical experiments. The area under the receiver operating characteristic curve (AUC) of EPSVR was 0.597, higher than that of any other existing single server, and EPMeta performed better than any single server, with an AUC of 0.638, significantly higher than PEPITO and DiscoTope (p-value

Yao Bo

2010-07-01

332

Predicting sediment yield is necessary for good land and water management in any river basin. However, sediment data are sometimes either unavailable or sparse, which makes estimating sediment yield a daunting task. The present study investigates the factors influencing suspended sediment yield using principal component analysis (PCA). Additionally, regression relationships for estimating suspended sediment yield, based on the key factors selected by the PCA, are developed. The PCA shows six components of key factors that can explain up to 86.7% of the variation of all variables. The regression models show that basin size, channel network characteristics, land use, basin steepness and rainfall distribution are the key factors affecting sediment yield. Validation of the regression relationships shows estimation errors ranging from −55% to +315% for suspended sediment yield and from −59% to +259% for area-specific suspended sediment yield. The proposed relationships may be considered useful for predicting suspended sediment yield in ungauged basins of Northern Thailand that have geologic, climatic and hydrologic conditions similar to the study area.

Piyawat Wuttichaikitcharoen

2014-08-01

333

In this paper, we propose a technique for estimating the degree or intensity of emotional expressions and speaking styles appearing in speech. The key idea is based on a style control technique for speech synthesis using a multiple regression hidden semi-Markov model (MRHSMM), and the proposed technique can be viewed as the inverse of the style control. In the proposed technique, the acoustic features of spectrum, power, fundamental frequency, and duration are simultaneously modeled using the MRHSMM. We derive an algorithm for estimating explanatory variables of the MRHSMM, each of which represents the degree or intensity of emotional expressions and speaking styles appearing in acoustic features of speech, based on a maximum likelihood criterion. We show experimental results to demonstrate the ability of the proposed technique using two types of speech data, simulated emotional speech and spontaneous speech with different speaking styles. It is found that the estimated values have correlation with human perception.

Nose, Takashi; Kobayashi, Takao

The Sea Level Thematic Assembly Center in the EU FP7 MyOcean project aims to build a sea level service for multiple satellite sea level observations at a European level for GMES marine applications. It aims to improve the sea level related products to guarantee the sustainability and the quality of the GMES marine core service. One such added value will be a multivariate regression model of sea level variability from multi-satellite and in-situ tide gauge observations, with the aim of improving future sea level prediction at high spatial and temporal resolution, e.g. for human safety. Tide gauge and satellite altimetry data from the last seventeen years have been compared for an area around the UK, and temporal correlation coefficients between them were calculated. The results are extremely encouraging: the detided signal from the response method correlates at more than 90% with satellite altimetry for nearly all tide gauge stations.
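The temporal correlation coefficient between a tide gauge series and a co-located altimetry series can be sketched as a plain Pearson correlation. The two short series below are invented sea level anomalies in metres; the real comparison covers seventeen years of data and involves detiding steps not shown here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sea level anomalies (m) at matching times.
tide_gauge = [0.10, 0.25, -0.05, 0.30, 0.00]
altimetry = [0.12, 0.22, -0.02, 0.33, 0.03]
r = pearson_r(tide_gauge, altimetry)
```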

Cheng, Yongcun; Andersen, Ole Baltazar

2010-01-01

334

Background: During community epidemics, infections may be imported into hospitals and transmitted to hospitalized patients. Hospital outbreaks of communicable diseases have been increasingly reported during the last decades and have had significant consequences in terms of patient morbidity, mortality, and associated costs. Quantitative studies are thus needed to estimate the risks of communicable diseases among hospital patients, taking into account the epidemiological process outside the hospital, hospital- and host-related risk factors of infection, and the role of other patients and healthcare workers as sources of infection. Methods: We propose a multiplicative hazard regression model to analyze the risk of acquiring a communicable disease by patients at hospital. This model derives from epidemiological data on communicable disease epidemics in the community, hospital ward, patient susceptibility to infection, and exposure of patients to infection at hospital. The model estimates the relative effect of each of these factors on a patient's risk of communicable disease. Results: Using individual data on patients and health care workers in a teaching hospital during the 2004-2005 influenza season in Lyon (France), we show the ability of the model to assess the risk of influenza-like illness among hospitalized patients. The significant effects on the risk of influenza-like illness were those of old age, exposure to infectious patients or health care workers, and a stay in a medical care unit. Conclusions: The proposed multiplicative hazard regression model could be a useful epidemiological tool to quantify the risk of communicable disease at hospital during community epidemics and the uncertainty inherent in such quantification. Furthermore, key epidemiological, environmental, host, or exposure factors that influence this risk can be identified.

Vanhems Philippe

2011-04-01

335

Habitat degradation and loss have been widely recognized as the main cause of the decline of wildlife populations. Evaluating the quality of wildlife habitat can provide essential information for wildlife refuge design and management. The purpose of this study was to produce georeferenced ecological information about suitable habitats available for muntjac (Muntiacus muntjak) in Chandoli tiger reserve, India (17° 04' 00" N to 17° 19' 54" N and 73° 40' 43" E to 73° 53' 09" E). Habitats were evaluated using multiple logistic regression integrated with remote sensing and a geographic information system. LISS-III satellite imagery from IRS-P6 covering the study area was digitally processed. To generate collateral data, topographic maps were analysed in a GIS framework. Layers of different variables, such as land use/land cover, forest density, proximity to disturbances and water resources, and a digital terrain model, were created from satellite imagery and topographic sheets. These layers, along with GPS locations of muntjac presence/absence and multiple logistic regression (MLR) techniques, were integrated in a GIS environment to model a habitat suitability index for muntjac. The results indicate that approximately 222.39 km2 (75.4%) of the forest of the tiger reserve was least suitable for muntjac, whereas 29.53 km2 (10.02%) was moderately suitable, 22.12 km2 (7.5%) suitable and 20.70 km2 (7.0%) highly suitable. The accuracy level of this model was 97.6%. The model can be considered robust enough to support declaring the forests of this area a reserve for muntjac conservation, ultimately to provide a prey base for tiger.

Imam EKWAL

2012-12-01

336

In genome-wide association studies (GWAS), regression analysis has been most commonly used to establish an association between a phenotype and genetic variants, such as single nucleotide polymorphism (SNP). However, most applications of regression analysis have been restricted to the investigation of single marker because of the large computational burden. Thus, there have been limited applications of regression analysis to multiple SNPs, including gene–gene interaction (GGI) in large-scale GWAS data. In order to overcome this limitation, we propose CARAT-GxG, a GPU computing system-oriented toolkit, for performing regression analysis with GGI using CUDA (compute unified device architecture). Compared to other methods, CARAT-GxG achieved almost 700-fold execution speed and delivered highly reliable results through our GPU-specific optimization techniques. In addition, it was possible to achieve almost-linear speed acceleration with the application of a GPU computing system, which is implemented by the TORQUE Resource Manager. We expect that CARAT-GxG will enable large-scale regression analysis with GGI for GWAS data. PMID:25574130

Lee, Sungyoung; Kwon, Min-Seok; Park, Taesung

338

The empirical model of turbine efficiency is necessary for the control- and/or diagnosis-oriented simulation and useful for the simulation and analysis of dynamic performances of the turbine equipment and systems, such as air cycle refrigeration systems, power plants, turbine engines, and turbochargers. Existing empirical models of turbine efficiency are insufficient because there is no suitable form available for air cycle refrigeration turbines. This work performs a critical review of empirical models (called mean value models in some literature) of turbine efficiency and develops an empirical model in the desired form for air cycle refrigeration, the dominant cooling approach in aircraft environmental control systems. The Taylor series and regression analysis are used to build the model, with the Taylor series being used to expand functions with the polytropic exponent and the regression analysis to finalize the model. The measured data of a turbocharger turbine and two air cycle refrigeration turbines are used for the regression analysis. The proposed model is compact and able to present the turbine efficiency map. Its predictions agree with the measured data very well, with the corrected coefficient of determination Rc2 ? 0.96 and the mean absolute percentage deviation = 1.19% for the three turbines. -- Highlights: ? Performed a critical review of empirical models of turbine efficiency. ? Developed an empirical model in the desired form for air cycle refrigeration, using the Taylor expansion and regression analysis. ? Verified the method for developing the empirical model. ? Verified the model.
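The two goodness-of-fit figures quoted above can be computed as in the following sketch. It assumes that the "corrected coefficient of determination" is the usual adjusted R², which the abstract does not confirm, and it uses made-up efficiency values and a hypothetical parameter count.

```python
def r_squared(measured, predicted):
    """Coefficient of determination R^2."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(measured, predicted, n_params):
    """Adjusted R^2 for a model with n_params regressors plus an intercept."""
    n = len(measured)
    r2 = r_squared(measured, predicted)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)


def mean_abs_pct_deviation(measured, predicted):
    """Mean absolute percentage deviation, in percent."""
    terms = [abs(p - m) / abs(m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(terms) / len(terms)


# Made-up turbine efficiencies: measured vs. model prediction.
eff_measured = [0.60, 0.70, 0.80, 0.75, 0.65]
eff_predicted = [0.61, 0.69, 0.79, 0.76, 0.66]
r2 = r_squared(eff_measured, eff_predicted)
r2_adj = adjusted_r_squared(eff_measured, eff_predicted, n_params=2)
mapd = mean_abs_pct_deviation(eff_measured, eff_predicted)
```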

339

Héctor Andrés, López Ospina; Rafael David, López Ospina.

2010-06-01

340

Linear Maximum Likelihood Regression Analysis for Untransformed Log-Normally Distributed Data

Medical research data are often skewed and heteroscedastic. It has therefore become common practice to log-transform data in regression analysis in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large simulation study. Log-normal observations were generated according to the simulation models, and parameters were estimated using the new ML method, ordinary least-squares regression (LS) and weighted least-squares regression (WLS). All three methods produced unbiased estimates of parameters and expected response, and ML and WLS yielded smaller standard errors than LS. The approximate normality of the Wald statistic, used for tests of the ML estimates, in most situations produced the correct type I error risk. Only ML and WLS produced correct confidence intervals for the estimated expected value. ML had the highest power for tests regarding β_{1}.

Sara M. Gustavsson

2012-10-01

341

Factor scale scores are sometimes used as weights to create composite variables representing the variables included in a factor analysis. If these composite variables are then used to predict some dependent variable, serious theoretical and methodological problems arise. This paper explores these problems and suggests strategies for circumventing…

Educational and Psychological Measurement, 1979

342

Multiple Imputation, Maximum Likelihood and Fully Bayesian methods are the three most commonly used model-based approaches in missing data problems. Although it is easy to show that when the responses are missing at random (MAR), the complete case analysis is unbiased and efficient, the aforementioned methods are still commonly used in practice for this setting. To examine the performance of and relationships between these three methods in this setting, we derive and investigate small sample and asymptotic expressions of the estimates and standard errors, and fully examine how these estimates are related for the three approaches in the linear regression model when the responses are MAR. We show that when the responses are MAR in the linear model, the estimates of the regression coefficients using these three methods are asymptotically equivalent to the complete case estimates under general conditions. One simulation and a real data set from a liver cancer clinical trial are given to compare the properties of these methods when the responses are MAR. PMID:25309677

343

Given the potential importance of marginal artery localization in automated registration in computed tomography colonography (CTC), we have devised a semi-automated method of marginal vessel detection employing sequential Monte Carlo tracking (also known as particle filtering tracking) by multiple cue fusion based on intensity, vesselness, organ detection, and minimum spanning tree information for poorly enhanced vessel segments. We then employed a random forest algorithm for intelligent cue fusion and decision making which achieved high sensitivity and robustness. After applying a vessel pruning procedure to the tracking results, we achieved statistically significantly improved precision compared to a baseline Hessian detection method (2.7% versus 75.2%, p<0.001). This method also showed statistically significantly improved recall rate compared to a 2-cue baseline method using fewer vessel cues (30.7% versus 67.7%, p<0.001). These results demonstrate that marginal artery localization on CTC is feasible by combining a discriminative classifier (i.e., random forest) with a sequential Monte Carlo tracking mechanism. In so doing, we present the effective application of an anatomical probability map to vessel pruning as well as a supplementary spatial coordinate system for colonic segmentation and registration when this task has been confounded by colon lumen collapse. PMID:25461335

344

What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

345

Meta-regression analysis of commensal and pathogenic Escherichia coli survival in soil and water.

Franz, Eelco; Schijven, Jack; de Roda Husman, Ana Maria; Blaak, Hetty

346

A novel approach for setting interpretive breakpoints in disk diffusion antibiotic susceptibility testing according to determined minimum inhibitory concentration (MIC) limits is described, using the method of single-strain regression analysis. The procedure was tested on reference strains Staphylococcus aureus (ATCC 25923), Streptococcus faecalis (ATCC 29212), Escherichia coli (ATCC 25922), and Pseudomonas aeruginosa (ATCC 27853), using published results from cefoperazone disk diffusion expe...

Kronvall, G.

347

We consider the problem of modeling heteroscedasticity in semiparametric regression analysis of cross-sectional data. Existing work in this setting is rather limited and mostly adopts a fully nonparametric variance structure. This approach is hampered by the curse of dimensionality in practical applications. Moreover, the corresponding asymptotic theory is largely restricted to estimators that minimize certain smooth objective functions. The asymptotic derivation thus excludes semiparametric quant...

348

Application of logistic regression in an analysis of Polish households’ financial problems

This article attempts to identify the socio-economic and demographic factors influencing problems with arrears in Polish households. Micro data from Social Diagnosis were used, and logistic regression analysis was applied to achieve this goal.

Zbigniew Gołaś

349

Logistic regression was originally intended to explain the relationship between the probability of an event and a set of covariables. The model's coefficients can be interpreted via the odds and the odds ratio, which are presented in the introduction of the chapter. When the observations are obtained individually, we speak of binary logistic regression; when they are grouped, the logistic regression is said to be binomial. In our presentation we mainly focus on the binary case. For statistical inference the main tool is the maximum likelihood methodology: we present the Wald, Rao and likelihood ratio results and their use to compare nested models. The problems we intend to deal with are essentially the same as in multiple linear regression: testing global effect, individual effect, selection of variables to build a model, measure of the fitness of the model, prediction of new values… The methods are demonstrated on data sets using R. Finally we briefly consider the binomial case and the situation where we are interested in several events, that is the polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
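The odds-ratio interpretation of a logistic coefficient can be checked numerically: in a binary logistic model, raising a covariable by one unit multiplies the odds of the event by exp(coefficient). The chapter demonstrates its methods in R; the sketch below uses Python, and the intercept and slope are hypothetical values, not estimates from any data set.

```python
import math


def probability(intercept, slope, x):
    """P(event | x) under a binary logistic model with one covariable."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


B0, B1 = -2.0, 0.7  # hypothetical intercept and slope

# Odds ratio for a one-unit increase in x (from 2 to 3).
odds_ratio = odds(probability(B0, B1, 3.0)) / odds(probability(B0, B1, 2.0))
```

The ratio does not depend on the starting value of x, which is what makes the odds ratio a convenient summary of a coefficient.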

350

Methods and applications of linear models regression and the analysis of variance

Praise for the Second Edition: "An essential desktop reference book . . . it should definitely be on your bookshelf." -Technometrics. A thoroughly updated book, Methods and Applications of Linear Models: Regression and the Analysis of Variance, Third Edition features innovative approaches to understanding and working with models and theory of linear regression. The Third Edition provides readers with the necessary theoretical concepts, which are presented using intuitive ideas rather than complicated proofs, to describe the inference that is appropriate for the methods being discussed. The book

351

Fast algorithm of the robust Gaussian regression filter for areal surface analysis

352

353

Multiplicative interaction in network meta-analysis.

Piepho, Hans-Peter; Madden, Laurence V; Williams, Emlyn R

353

This paper presents the assessment results of three spatially based probabilistic models using Geoinformation Techniques (GIT) for landslide susceptibility analysis at Penang Island in Malaysia. Landslide locations within the study area were identified by interpreting aerial photographs and satellite images, supported by field surveys. Maps of the topography, soil type, lineaments and land cover were constructed from the spatial data sets. Nine landslide-related factors were extracted from the spatial database, and the neural network, frequency ratio and logistic regression coefficients of each factor were computed. Landslide susceptibility maps were drawn for the study area using neural network, frequency ratio and logistic regression models. For verification, the results of the analyses were compared with actual landslide locations in the study area. The verification results show that the frequency ratio model provides higher prediction accuracy than the ANN and regression models.

354

Energy is one of the indispensable elements of human life, and electrical energy is the most frequently used energy type. As this type of energy cannot currently be stored, it has to be consumed instantly. In other words, consumer demand has to be met immediately. This paper models the electrical consumption of Erzurum province in 2011 by spline regression and tests whether a statistically significant seasonal variation exists in this consumption. The one-year data set was obtained from the Turkish Electricity Transmission Company Provincial Directorate of Erzurum and was analyzed by means of continuous piecewise polynomial spline regressions. The analysis determined three knots and fitted linear, quadratic and cubic spline regression models.
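A continuous piecewise polynomial spline with fixed knots can be written as an ordinary linear regression on a truncated power basis, which is one standard way to set up linear, quadratic and cubic spline models. The sketch below builds that basis and evaluates a linear spline with three knots; the knot positions and coefficients are invented and do not come from the Erzurum data.

```python
def truncated_power_row(x, knots, degree):
    """Basis row [1, x, ..., x^d, (x-k1)_+^d, ..., (x-km)_+^d]."""
    powers = [x ** p for p in range(degree + 1)]
    hinges = [max(x - k, 0.0) ** degree for k in knots]
    return powers + hinges


def spline_value(x, coefs, knots, degree):
    """Evaluate the spline as a dot product of coefficients and basis row."""
    row = truncated_power_row(x, knots, degree)
    return sum(c * b for c, b in zip(coefs, row))


KNOTS = [3.0, 6.0, 9.0]  # hypothetical knot positions (months)
LINEAR_COEFS = [100.0, -5.0, 8.0, -6.0, 7.0]  # hypothetical coefficients

row_at_7 = truncated_power_row(7.0, KNOTS, degree=1)
value_at_6 = spline_value(6.0, LINEAR_COEFS, KNOTS, degree=1)
```

Each hinge term changes the slope (or higher derivative) only to the right of its knot, so the fitted curve stays continuous while its shape is allowed to differ between seasons.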

Omer Alkan

355

356

A quantitative structure–property relationship (QSPR) study was performed to develop models that relate the structures of 65 adamantane derivatives to their Kovats retention index (RI). Molecular descriptors were derived solely from the 3D structures of the compounds. A genetic algorithm was also applied as a variable selection tool in the QSPR analysis. The models were constructed using 52 molecules as the training set, and their predictive ability was tested using 13 compounds. Modeling of the RI of adamantane derivatives as a function of the theoretically derived descriptors was established by multiple linear regression (MLR). The usefulness of the quantum chemical descriptors, calculated at the DFT level of theory using the 6-311+G** basis set, for the QSAR study of adamantane derivatives was examined. The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows the RI to be estimated for molecules not yet synthesized. Application of the developed model to a test set of 13 organic drug compounds demonstrates that the model is reliable, with good predictive accuracy and a simple formulation. The prediction results are in good agreement with the experimental values. A multi-parametric equation containing at most four descriptors at the B3LYP/6-31+G** level with good statistical qualities (R2train=0.913, Ftrain=97.67, R2test=0.770, Ftest=3.21, Q2LOO=0.895, R2adj=0.904, Q2LGO=0.844) was obtained by multiple linear regression using the stepwise method.

Z. Bayat

356

357

Redirecting T lymphocyte antigen specificity by gene transfer can provide large numbers of tumor-reactive T lymphocytes for adoptive immunotherapy. However, safety concerns associated with viral vector production have limited clinical application of T cells expressing chimeric antigen receptors (CAR). T lymphocytes can be gene modified by RNA electroporation without integration-associated safety concerns. To establish a safe platform for adoptive immunotherapy, we first optimized the vector backbone for RNA in vitro transcription to achieve high-level transgene expression. CAR expression and function of RNA-electroporated T cells could be detected up to a week after electroporation. Multiple injections of RNA CAR-electroporated T cells mediated regression of large vascularized flank mesothelioma tumors in NOD/scid/γc(-/-) mice. Dramatic tumor reduction also occurred when the preexisting intraperitoneal human-derived tumors, which had been growing in vivo for >50 days, were treated by multiple injections of autologous human T cells electroporated with anti-mesothelin CAR mRNA. This is the first report using matched patient tumor and lymphocytes showing that autologous T cells from cancer patients can be engineered to provide an effective therapy for a disseminated tumor in a robust preclinical model. Multiple injections of RNA-engineered T cells are a novel approach for adoptive cell transfer, providing a flexible platform for the treatment of cancer that may complement the use of retroviral and lentiviral engineered T cells. This approach may increase the therapeutic index of T cells engineered to express powerful activation domains without the associated safety concerns of integrating viral vectors. PMID:20926399

Zhao, Yangbing; Moon, Edmund; Carpenito, Carmine; Paulos, Chrystal M; Liu, Xiaojun; Brennan, Andrea L; Chew, Anne; Carroll, Richard G; Scholler, John; Levine, Bruce L; Albelda, Steven M; June, Carl H

357

Critical Regression Analysis of Real Time Industrial Web Data Set Using Data Mining Tool

Kohli, Shruti; Gupta, Ankit

358

Factors predicting the failure of Bernese periacetabular osteotomy: a meta-regression analysis

Sambandam, Senthil Nathan; Hull, Jason; Jiranek, William A.

359

Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using...

360

Gene-level pharmacogenetic analysis on survival outcomes using gene-trait similarity regression

Tzeng, Jung-ying; Lu, Wenbin; Hsu, Fang-chi

361

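A nanocrystalline SnO2 thin film was synthesized by a chemical bath method. The parameters affecting the energy band gap and surface morphology of the deposited SnO2 thin film were optimized using a semi-empirical method. Four parameters, including deposition time, pH, bath temperature and tin chloride (SnCl2·2H2O) concentration, were optimized by a factorial method. The factorial used a Taguchi OA (TOA) design method to estimate certain interactions and obtain the actual responses. Statistical evidence in the analysis of variance, including high F-values (4,112.2 and 20.27), very low P-values (<0.012 and 0.0478), non-significant lack of fit, the determination coefficient (R2) equal to 0.978 and 0.977 and the adequate precision (170.96 and 12.57), validated the suggested model. The optima of the suggested model were verified in the laboratory and the results were quite close to the predicted values, indicating that the model successfully simulated the optimum conditions of SnO2 thin film synthesis. PMID:24509767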

362

Multi-task Regression using Minimal Penalties

Solnon, Matthieu; Arlot, Sylvain; Bach, Francis

363

Risk factors for severe bacterial infections, that is, deep sternal wound infection, pneumonia, septicemia, and prosthetic valve endocarditis, were evaluated in 246 consecutive patients undergoing valve replacement (N = 84) or aortocoronary bypass operation (N = 162). Multiple logistic regression analysis was applied to determine the ability of putative risk factors to predict infection. The risk factors considered were age, sex, diabetes mellitus, duration of cardiopulmonary bypass (CPB), duration of operation, amount of blood restored on the day of operation, repeat thoracotomy for bleeding, intraaortic balloon pumping, reoperation, emergency operation, and the professional status of the surgeon. Severe infections occurred in similar frequency after valve replacement (8/84; 9.5%) and aortocoronary bypass (11/162; 6.8%). For patients who had a bypass procedure, repeat thoracotomy was the only factor significantly associated with infection (p = 0.0004). However, the classification analysis revealed that this variable alone is too unspecific for a reliable prediction. Univariate analysis indicated that restoration of more than 2,500 ml of blood (p = 0.0001), reoperation (p = 0.0821), duration of operation (p = 0.0061), duration of CPB (p = 0.0318), and intraaortic balloon pumping (p = 0.0281) were associated with infection following valve replacement. A model with three variables emerged from the multiple logistic regression: after correction for blood restoration, reoperation, and duration of CPB, no other variable was of additional predictive value. For patients who underwent valve replacement, the model performed well in predicting complications. The classification analysis revealed a high correspondence between observed and predicted instances of infection: it correctly predicted 75% of the patients with infection and 96% of those without infection.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:3876084
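The classification figures quoted at the end of this abstract are the sensitivity and specificity of the fitted model. The sketch below shows how such rates are computed from a confusion table. The four counts are illustrative values chosen only to be consistent with the reported 8/84 infection rate and the 75% and 96% figures; they are not taken from the paper.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-table counts."""
    sensitivity = tp / (tp + fn)   # share of infected patients flagged
    specificity = tn / (tn + fp)   # share of non-infected patients cleared
    return sensitivity, specificity


# Illustrative counts (assumed, not reported): 8 infected, 76 not infected.
sens, spec = sensitivity_specificity(tp=6, fn=2, tn=73, fp=3)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```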

364

Water-dispersible soil colloids (WDC) act as carriers for sorbing chemicals in macroporous soils and hence constitute a significant risk for the aquatic environment. The prediction of WDC readily available for facilitated chemical transport is an unsolved challenge. This study identifies key parameters and predictive indicators for assessing field-scale variation of WDC. Samples representing three measurement scales (1- to 2-mm aggregates, intact 100-cm³ rings, and intact 6283-cm³ columns) were retrieved from the topsoil of a 1.69-ha agricultural field in a 15-m by 15-m grid to determine colloid dispersibility, mobilization, and transport. The amount of WDC was determined using (i) a laser diffraction method on 1- to 2-mm aggregates and (ii) an end-over-end shaking method on 100-cm³ intact rings. The accumulated amount of colloids leached from 20-cm by 20-cm intact columns was determined as a measure of the integrated colloid mobilization and transport. The WDC and the accumulated colloid transport were higher in samples from the northern part of the field. Using multiple linear regression (MLR) analyses, WDC or amount of colloids transported were predicted at the three measurement scales from 24 measured, geo-referenced parameters to identify parameters that could serve as indicator parameters for screening for colloid dispersibility, mobilization, and transport. The MLR analyses were performed at each sample scale using all, only northern, and only southern field locations. Generally, the predictive power of the regression models was best on the smallest 1- to 2-mm aggregate scale. Overall, our results suggest that different drivers controlled colloid dispersibility and transport at the three measurement scales and in the two subareas of the field. PMID:25603261

2014-09-01

365

Directory of Open Access Journals (Sweden)

A nanocrystalline SnO2 thin film was synthesized by a chemical bath method. The parameters affecting the energy band gap and surface morphology of the deposited SnO2 thin film were optimized using a semi-empirical method. Four parameters, including deposition time, pH, bath temperature and tin chloride (SnCl2·2H2O) concentration, were optimized by a factorial method. The factorial used a Taguchi OA (TOA) design method to estimate certain interactions and obtain the actual responses. Statistical evidence in the analysis of variance, including high F-values (4,112.2 and 20.27), very low P-values (<0.012 and 0.0478), non-significant lack of fit, the determination coefficient (R2) equal to 0.978 and 0.977 and the adequate precision (170.96 and 12.57), validated the suggested model. The optima of the suggested model were verified in the laboratory and the results were quite close to the predicted values, indicating that the model successfully simulated the optimum conditions of SnO2 thin film synthesis.

366

Application of appropriate models to approximate the performance function warrants more precise prediction and helps to make the best decisions in the poultry industry. This study reevaluated the factors affecting hatchability in laying hens from 29 to 56 wk of age. Twenty-eight data lines representing 4 inputs consisting of egg weight, eggshell thickness, egg sphericity, and yolk/albumin ratio and 1 output, hatchability, were obtained from the literature and used to train an artificial neural network (ANN). The prediction ability of the ANN was compared with that of fuzzy logic to evaluate the fitness of these 2 methods. The models were compared using R(2), mean absolute deviation (MAD), mean squared error (MSE), mean absolute percentage error (MAPE), and bias. The developed model was used to assess the relative importance of each variable on hatchability by calculating the variable sensitivity ratio. The statistical evaluations showed that the ANN-based model predicted hatchability more accurately than fuzzy logic. The ANN-based model had a higher coefficient of determination (R(2) = 0.99) and smaller residual measures (MAD = 0.005; MSE = 0.00004; MAPE = 0.732; bias = 0.0012) than fuzzy logic (R(2) = 0.87; MAD = 0.014; MSE = 0.0004; MAPE = 2.095; bias = 0.0046). The sensitivity analysis revealed that the most important variable in the ANN-based model of hatchability was egg weight (variable sensitivity ratio, VSR = 283.11), followed by yolk/albumin ratio (VSR = 113.16), eggshell thickness (VSR = 16.23), and egg sphericity (VSR = 3.63). The results of this research showed that the universal approximation capability of ANN made it a powerful tool to approximate complex functions such as hatchability in the incubation process. PMID:23472039
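The comparison metrics named in this abstract can be sketched as below. The definitions are the usual textbook ones, and the sign convention for bias (observed minus predicted) is an assumption, since the abstract does not state it. The hatchability values are invented for illustration.

```python
def error_metrics(observed, predicted):
    """Return MAD, MSE, MAPE (in percent) and bias for paired values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # MAPE divides by the observed value, so observed values must be nonzero.
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n
    bias = sum(errors) / n
    return {"MAD": mad, "MSE": mse, "MAPE": mape, "bias": bias}


# Hypothetical observed and predicted hatchability fractions.
observed = [0.80, 0.90, 0.85, 0.75]
predicted = [0.82, 0.88, 0.85, 0.76]
metrics = error_metrics(observed, predicted)
for name, value in metrics.items():
    print(name, round(value, 6))
```

MAD and MSE measure the size of the errors, while bias keeps their sign and so reveals systematic over- or under-prediction that the other two hide.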

Mehri, M

2013-04-01

367

Application of Binary Regression Analysis in the Prescription Pattern of Antidepressants

Background: In Nepal, many research studies report results using percentages or cross-tabulation, while the use of logistic regression methodology lags behind among researchers. Objectives: The main objective of this study was to find the role of logistic regression analysis in the prescription pattern of antidepressants among hospitalized patients in a tertiary care center in Western Nepal. Methods: A hospital-based study was done between 1st October 2009 and 31st March 2010 at the Psychiatry Ward of Manipal Teaching Hospital, Nepal. The Z test, Chi-square test and binary logistic regression were used for the analysis. We calculated odds ratios (OR) and their 95% confidence intervals (95% CI) ... P-value ... 10000, 2.63 times more in Hindus and 1.197 times more in Brahmins than in any other ethnic group. The tendency to prescribe antidepressants by trade name was 9.179 times higher for unemployed patients than for employed patients in Nepal. Conclusion: Binary logistic regression plays an important role in understanding the drug utilization pattern of mood elevators in Western Nepal.

Dr.Indrajit Banerjee, MBBS, MD

368

A new cluster-histo-regression analysis for incremental learning from temporal data chunks

Directory of Open Access Journals (Sweden)

In scenarios where data chunks arrive temporally, a good algorithm for exploratory analysis should be able to generate the knowledge and, when the next chunk of data arrives, simply update it online by accumulating the knowledge derived from the recent chunk. Such an incremental learning process in most cases demands a lot of memory, since all earlier data must be carried along while updating the knowledge successively. In this research work we propose to employ a novel Cluster-Histo-Regression analysis of the chunk to extract the knowledge for the temporal instant and fuse this knowledge, through Histo-Regression-Distance analysis, with the already accumulated knowledge. We have designed a methodology which (i) discards all those data samples from the chunk which have participated in the knowledge generation process, (ii) requires a minimum amount of memory to carry the accumulated knowledge and (iii) proposes to carry forward only those limited data samples (referred to as hard samples) which could not contribute to the knowledge generated at that moment. The knowledge of each cluster is represented in the form of a histogram for each dimension of the clustered data and is transformed to a regression line for a compact representation of the knowledge. The regression line parameters of the clusters obtained by incremental augmentation have shown an accuracy of up to 100% for some of the data sets considered for experimentation.

Nagabhushan P.

369

370

SciELO Chile | Language: Spanish. The incorporation of new personnel, or the reassignment of existing personnel to specific tasks, is an important decision, since getting it right will determine the very survival of the company. In this context, it becomes relevant to have a personnel selection model that considers the ambiguous information and the degrees of uncertainty associated with evaluating the qualitative assessments of applicants, and that can deliver accurate and precise results, thus ensuring good performance in the position and reducing the risk involved in incorporating new people. In this work, a personnel selection model under conditions of uncertainty was developed by applying fuzzy logic, using as input data the job descriptions of a retail company, with triangular, overlapping fuzzy variables. This model was compared with a classical multiple regression model. The results showed that, in this case, the multiple regression model is more efficient than the chosen fuzzy logic model.

Carlos A, Díaz-Contreras; Alejandra, Aguilera-Rojas; Nathaly, Guillén-Barrientos.

370

371

A multivariate regression analysis is applied to decay measurements of ?-resp. ?-filter activity. Activity concentrations for Po-218, Pb-214 and Bi-214, and for the Rn-222 equilibrium equivalent concentration, are obtained explicitly. The regression analysis properly takes into account the variances of the measured count rates and their influence on the resulting activity concentrations. (orig.)

371

Robust Outlier Detection in Linear Regression

Jajo, Nethal K.; Xizhi Wu

372

When the independent variables in a multiple linear regression model are highly linearly correlated, an analysis based on the common Ordinary Least Squares (OLS) method can give misleading results. In this situation, the ridge regression estimator is recommended. We conduct a simulation study to compare the performance of the ridge regression estimator and OLS. We found that Hoerl and Kennard ridge regression estimation method has better performan...
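To illustrate why ridge regression is suggested under high collinearity, the sketch below compares OLS and ridge coefficients for two centered, nearly collinear predictors. The data and the penalty value `k = 1.0` are hypothetical; the abstract's Hoerl and Kennard rule for choosing the penalty is not reproduced here.

```python
def ridge_2d(x1, x2, y, k=0.0):
    """Ridge coefficients for two centered predictors; k = 0 gives OLS.

    Solves (X'X + k I) b = X'y for the 2x2 case with Cramer's rule.
    """
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return ((t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det)


# Hypothetical centered data: x2 is almost a copy of x1.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.2, -1.7, 0.3, 1.9, 3.7]

ols = ridge_2d(x1, x2, y, k=0.0)
ridge = ridge_2d(x1, x2, y, k=1.0)
print("OLS  :", [round(b, 3) for b in ols])
print("ridge:", [round(b, 3) for b in ridge])
```

With these numbers OLS splits the shared signal unevenly and assigns one predictor a negative coefficient, while the penalized solution shrinks both coefficients toward similar positive values. The price of that stability is some bias, which is why the choice of penalty matters.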

373

International Nuclear Information System (INIS)

374

Mitigating the effects of salt and selenium on water quality in the Grand Valley and lower Gunnison River Basin in western Colorado is a major concern for land managers. Previous modeling indicated means to improve the models by including more detailed geospatial data and a more rigorous method for developing the models. After evaluating all possible combinations of geospatial variables, four multiple linear regression models resulted that could estimate irrigation-season salt yield, nonirrigation-season salt yield, irrigation-season selenium yield, and nonirrigation-season selenium yield. The adjusted r-squared and the residual standard error (in units of log-transformed yield) of the models were, respectively, 0.87 and 2.03 for the irrigation-season salt model, 0.90 and 1.25 for the nonirrigation-season salt model, 0.85 and 2.94 for the irrigation-season selenium model, and 0.93 and 1.75 for the nonirrigation-season selenium model. The four models were used to estimate yields and loads from contributing areas corresponding to 12-digit hydrologic unit codes in the lower Gunnison River Basin study area. Each of the 175 contributing areas was ranked according to its estimated mean seasonal yield of salt and selenium.
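The two fit statistics reported for these models can be sketched with the standard formulas below. The sample size, number of predictors and error sum of squares in the usage lines are hypothetical, since the abstract does not report them.

```python
import math


def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors (plus intercept)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def residual_standard_error(sse, n, p):
    """Square root of the residual sum of squares over its degrees of freedom."""
    return math.sqrt(sse / (n - p - 1))


# Hypothetical values for illustration only.
print(round(adjusted_r2(0.90, n=30, p=4), 3))
print(round(residual_standard_error(50.0, n=30, p=4), 3))
```

Both statistics penalize extra predictors through the degrees of freedom, which makes them more suitable than plain R-squared when many candidate variable combinations are compared.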

Linard, Joshua I.

375

In the present work, support vector machines (SVMs) and multiple linear regression (MLR) techniques were used for quantitative structure–property relationship (QSPR) studies of retention time (tR) in standardized liquid chromatography–UV–mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins) based on molecular descriptors calculated from the optimized 3D structures. By applying missing value, zero and multicollinearity tests with a cutoff value of 0.95, and a genetic algorithm method of variable selection, the most relevant descriptors were selected to build the QSPR models. MLR and SVM methods were employed to build the QSPR models. The robustness of the QSPR models was characterized by statistical validation and the applicability domain (AD). The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability, measured by r2 and q2, are 0.931 and 0.932, respectively, for SVM and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using a Williams plot. The effects of different descriptors on the retention times are described.

376

Analysis of the Evolution of the Gross Domestic Product by Means of Cyclic Regressions

In this article, we analyze the regularity of the Gross Domestic Product of a country, in our case the United States. The analysis is based on a new method: cyclic regressions based on the Fourier series of a function. A further point of view is to consider, instead of the growth rate of GDP, the speed of variation of this rate, computed as a numerical derivative. The results show a cycle of 71 years for this indicator, with a mean square error of 0.93%. The method described allows a prognosis of short-term trends in GDP.
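A regression on Fourier terms can be sketched as below for a single harmonic. This is a simplified illustration, not the authors' procedure: it assumes a known period of at least three samples, evenly spaced data, and a series covering a whole number of periods, in which case the least-squares coefficients reduce to simple projections. The series and period are synthetic.

```python
import math


def fit_harmonic(y, period):
    """Fit y_t ~ mean + a*cos(w t) + b*sin(w t) with w = 2*pi/period.

    Assumes len(y) is a whole multiple of the period and period >= 3, so the
    constant, cosine and sine columns are mutually orthogonal.
    """
    n = len(y)
    w = 2.0 * math.pi / period
    mean = sum(y) / n
    a = 2.0 / n * sum(v * math.cos(w * t) for t, v in enumerate(y))
    b = 2.0 / n * sum(v * math.sin(w * t) for t, v in enumerate(y))
    return mean, a, b


# Synthetic series with a known 8-step cycle (hypothetical values).
period = 8
w = 2.0 * math.pi / period
series = [2.0 + 1.5 * math.cos(w * t) - 0.5 * math.sin(w * t) for t in range(24)]
mean, a, b = fit_harmonic(series, period)
print(round(mean, 6), round(a, 6), round(b, 6))
```

When the cycle length is unknown, one option is to repeat a fit of this kind over candidate periods and keep the one with the smallest residual error, although unevenly covered periods then call for a full least-squares solve rather than these projections.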

377

Production planning and control (PPC) systems have to deal with rising complexity and dynamics. The complexity of planning tasks is due to multiple variables and dynamic factors derived from the uncertainties surrounding PPC. Although the literature on exact scheduling algorithms, simulation approaches, and heuristic methods in production planning is extensive, these approaches seem to be inefficient because of daily fluctuations in real factories. Decision support systems can provide productive tools that help production planners reach feasible and prompt decisions for effective and robust production planning. In this paper, we propose a robust decision support tool for detailed production planning based on a statistical multivariate method including principal component analysis and logistic regression. The proposed approach has been used in a real case in the Iranian automotive industry. In the presence of multisource uncertainties, the results of applying the proposed method in the selected case show that the accuracy of daily production planning increases in comparison with the existing method.

2013-05-01

378

SciELO Mexico | Language: Spanish. It is necessary to have long records of annual hydrological data to obtain a picture closer to the reality of their variability, as well as reliable estimates of their statistical properties. To obtain such records, it is common to seek additional data sources and transfer techniques. One such technique is multiple linear regression, whose numerical application implicitly involves the optimal selection of nearby long records (regressors) so that the extension of the short record is a reliable estimate. This selection process involves three analyses: 1) how to define the best estimates, 2) which regression equations to investigate, and 3) which model has the best predictive ability. For the first analysis, four criteria based on the sums of squared residuals are presented; for the second, all possible regressions are investigated, because in hydrological information transfer problems at most five regressors will be available; for the third, selecting the best predictive model, residual analysis and cross-validation are used. The numerical application described is an extension of the annual runoff volume record at the Platón Sánchez hydrometric station of the Tempoal river system, in Hydrological Region No. 26 (Pánuco, México). In this case four regressors are used, namely the records of the remaining gauging stations of that system. It is concluded that even in problems with multicollinearity, the selection criteria and analyses presented lead to consistent results and make it possible to obtain the best regression equations. The similarity of the results obtained with the selected regression models generates confidence in the estimates adopted.
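The combination of all possible regressions with cross-validation described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the selection criterion is the leave-one-out prediction error sum of squares (PRESS), which is only one of several criteria the abstract alludes to, and the data and function names are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, ys):
    """Least-squares coefficients from the normal equations."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)


def press(xs, ys, cols):
    """Leave-one-out prediction error sum of squares for one regressor subset."""
    total = 0.0
    for i in range(len(ys)):
        keep = [j for j in range(len(ys)) if j != i]
        train = [[1.0] + [xs[j][c] for c in cols] for j in keep]
        coef = ols(train, [ys[j] for j in keep])
        row = [1.0] + [xs[i][c] for c in cols]
        total += (ys[i] - sum(c * v for c, v in zip(coef, row))) ** 2
    return total


def best_subset(xs, ys):
    """Try every non-empty regressor subset and keep the smallest PRESS."""
    k = len(xs[0])
    subsets = [c for r in range(1, k + 1) for c in combinations(range(k), r)]
    return min(subsets, key=lambda cols: press(xs, ys, cols))


# Hypothetical records: the target depends on regressor 0 only, plus noise.
x0 = [1, 2, 3, 4, 5, 6, 7, 8]
x1 = [2, 1, 4, 3, 6, 5, 8, 7]
x2 = [1, 0, 1, 0, 0, 1, 1, 0]
noise = [0.1, -0.1, 0.05, -0.05, -0.1, 0.1, 0.0, 0.0]
xs = list(zip(x0, x1, x2))
ys = [2 + 3 * a + e for a, e in zip(x0, noise)]
best = best_subset(xs, ys)
print("selected regressors:", best)
```

Exhaustive search stays cheap here because k regressors give only 2^k - 1 subsets, which is 31 for the five-regressor maximum mentioned in the abstract.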

Daniel F., Campos-Aranda.

2011-12-01

379

The effects of exchange rate variability on international trade: a Meta-Regression Analysis

Ćorić, Bruno; Pugh, Geoffrey Thomas

380

Functional MRI studies have revealed changes in default-mode and salience networks in neurodegenerative dementias, especially in Alzheimer’s disease. The purpose of this study was to analyze the whole brain cortex resting state networks in patients with behavioral variant frontotemporal dementia by using resting state functional MRI. The group specific resting state networks were identified by high model order independent component analysis and a dual regression technique was used to detect...

381

A new statistical methodology is developed for the analysis of spontaneous adverse event (AE) reports from post-marketing drug surveillance data. The method involves both empirical Bayes (EB) and fully Bayes estimation of rate multipliers for each drug within a class of drugs, for a particular AE, based on a mixed-effects Poisson regression model. Both parametric and semiparametric models for the random-effect distribution are examined. The method is applied to data from Food and Drug Adminis...

382

Strategic Investments in the Pulp and Paper Industry: A Count Data Regression Analysis

Bergman, Mats A.; Johansson, Per

383

Robust best linear estimation for regression analysis using surrogate and instrumental variables

Wang, C. Y.

384

A Logistic Regression Analysis of the Contractor's Awareness Regarding Waste Management

Rawshan Ara Begum; Chamhuri Siwar; Joy Jacqueline Pereira; Abdul Hamid Jaafar

385

LOGISTIC REGRESSION RESPONSE FUNCTIONS WITH MAIN AND INTERACTION EFFECTS IN THE CONJOINT ANALYSIS

Luca, Amedeo; Ciapparelli, Sara

386

The Use of Logistic Regression in the Analysis of Data Concerning Good Medical Practice

Mn, Damon; Aminot I

387

OBJECTIVES: Recent guidelines recommend that all cirrhotic patients should undergo endoscopic screening for esophageal varices. Identifying cirrhotic patients with esophageal varices by noninvasive predictors would allow endoscopy to be restricted to patients with a high risk of having varices. This study aimed to develop a decision model based on classification and regression tree analysis for the prediction of large esophageal varices in cirrhotic patients. METHODS: 309 cirrhotic patients (training sample, 187 patients; test sample, 122 patients) were included. Within the training sample, classification and regression tree analysis was used to identify predictors and a prediction model of large esophageal varices. The prediction model was then further evaluated in the test sample and in different Child-Pugh classes. RESULTS: The prevalence of large esophageal varices in cirrhotic patients was 50.8%. A tree model consisting of spleen width, portal vein diameter and prothrombin time, developed by classification and regression tree analysis, achieved a diagnostic accuracy of 84% for prediction of large esophageal varices. When reconstructed into two groups, the rate of varices was 83.2% for the high-risk group and 15.2% for the low-risk group. The accuracy of the tree model was maintained in the test sample and in different Child-Pugh classes. CONCLUSIONS: A decision tree model that consists of spleen width, portal vein diameter and prothrombin time may be useful for prediction of large esophageal varices in cirrhotic patients.

Wan-dong Hong

2011-01-01

388

Paine, Michael D.; Skinner, Marc A.; Kilgour, Bruce W.; DeBlois, Elisabeth M.; Tracy, Ellen

2014-12-01

389

SciELO Brazil | Language: English. OBJECTIVES: Recent guidelines recommend that all cirrhotic patients should undergo endoscopic screening for esophageal varices. Identifying cirrhotic patients with esophageal varices by noninvasive predictors would allow endoscopy to be restricted to patients with a high risk of having varices. This study aimed to develop a decision model based on classification and regression tree analysis for the prediction of large esophageal varices in cirrhotic patients. METHODS: 309 cirrhotic patients (training sample, 187 patients; test sample, 122 patients) were included. Within the training sample, classification and regression tree analysis was used to identify predictors and a prediction model of large esophageal varices. The prediction model was then further evaluated in the test sample and in different Child-Pugh classes. RESULTS: The prevalence of large esophageal varices in cirrhotic patients was 50.8%. A tree model consisting of spleen width, portal vein diameter and prothrombin time, developed by classification and regression tree analysis, achieved a diagnostic accuracy of 84% for prediction of large esophageal varices. When reconstructed into two groups, the rate of varices was 83.2% for the high-risk group and 15.2% for the low-risk group. The accuracy of the tree model was maintained in the test sample and in different Child-Pugh classes. CONCLUSIONS: A decision tree model that consists of spleen width, portal vein diameter and prothrombin time may be useful for prediction of large esophageal varices in cirrhotic patients.

Wan-dong, Hong; Le-mei, Dong; Zen-cai, Jiang; Qi-huai, Zhu; Shu-Qing, Jin.

390

Directory of Open Access Journals (Sweden)

Estimating forest canopy height from large-footprint satellite LiDAR waveforms is challenging given the complex interaction between LiDAR waveforms, terrain, and vegetation, especially in dense tropical and equatorial forests. In this study, canopy height in French Guiana was estimated using multiple linear regression models and the Random Forest (RF) technique. This analysis was based either on LiDAR waveform metrics extracted from the GLAS (Geoscience Laser Altimeter System) spaceborne LiDAR data and terrain information derived from the SRTM (Shuttle Radar Topography Mission) DEM (Digital Elevation Model), or on Principal Component Analysis (PCA) of GLAS waveforms. Results show that the best statistical model for estimating forest height based on waveform metrics and digital elevation data is a linear regression of waveform extent, trailing edge extent, and terrain index (RMSE of 3.7 m). For the PCA-based models, better canopy height estimation results were observed using a regression model that incorporated both the first 13 principal components (PCs) and the waveform extent (RMSE = 3.8 m). Random Forest regressions revealed that the best configuration for canopy height estimation used all the following metrics: waveform extent, leading edge, trailing edge, and terrain index (RMSE = 3.4 m). Waveform extent was the variable that best explained canopy height, with an importance factor almost three times higher than those for the other three metrics (leading edge, trailing edge, and terrain index). Furthermore, the Random Forest regression incorporating the first 13 PCs and the waveform extent had a slightly improved canopy height estimation in comparison to the linear model, with an RMSE of 3.6 m. In conclusion, multiple linear regressions and RF regressions provided canopy height estimations with similar precision using either LiDAR metrics or PCs. However, a regression model (linear regression or RF) based on the PCA of waveform samples with waveform extent information is an interesting alternative for canopy height estimation as it does not require several metrics that are difficult to derive from GLAS waveforms in dense forests, such as those in French Guiana.
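The comparison above rests on fitting a multiple linear regression and scoring it by RMSE. A minimal sketch of that calculation follows, assuming three hypothetical predictors standing in for waveform extent, trailing edge extent, and terrain index. The data are synthetic and noise-free, and the function names (`fit_ols`, `predict`, `rmse`) are illustrative, not taken from the study.

```python
from math import sqrt

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    p = len(rows[0])
    # Augmented normal-equation system [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(beta, X):
    """Apply fitted coefficients (intercept first) to each row of X."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

# Synthetic rows (hypothetical): (waveform_extent, trailing_edge, terrain_index)
X = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1), (2, 1, 3)]
y = [1 + 2 * a - b + 0.5 * c for a, b, c in X]
beta = fit_ols(X, y)
error = rmse(y, predict(beta, X))
```

On real data the RMSE would normally be computed on held-out observations rather than on the rows used for fitting.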

Ibrahim Fayad

391

Background: About 10-20% of neonates with suspected or proven early onset sepsis (EOS) fail on the empiric antibiotic regimen of ampicillin or penicillin and gentamicin. We aimed to identify clinical and laboratory markers associated with empiric antibiotic treatment failure in neonates with suspected EOS. Methods: Maternal and early neonatal characteristics predicting failure of empiric antibiotic treatment were identified by univariate logistic regression analysis from a prospective database of 283 neonates admitted to a neonatal intensive care unit within 72 hours of life and requiring antibiotic therapy with penicillin or ampicillin and gentamicin. Variables identified as significant by univariate analysis were entered into stepwise multiple logistic regression (MLR) analysis and classification and regression tree (CRT) analysis to develop a decision algorithm for clinical application. To ensure the earliest possible timing, separate analyses were performed for 24 and 72 hours of age. Results: At 24 hours of age, neonates with hypoglycaemia ≤ 2.55 mmol/L together with CRP values > 1.35 mg/L, or those with BW ≤ 678 g, had more than 30% likelihood of treatment failure. In normoglycaemic neonates with higher BW, the best predictors of treatment failure at 24 hours were GA ≤ 27 weeks and, among those with higher GA, WBC ≤ 8.25 × 10^9 L^-1 together with platelet count ≤ 143 × 10^9 L^-1. The algorithm allowed capture of 75% of treatment failure cases with a specificity of 89%. By 72 hours of age, minimum platelet count ≤ 94.5 × 10^9 L^-1 with need for vasoactive treatment, or leukopaenia ≤ 3.5 × 10^9 L^-1, or leukocytosis > 39.8 × 10^9 L^-1, or blood glucose ≤ 1.65 mmol/L allowed capture of 81% of treatment failure cases with a specificity of 88%. The performance of the MLR and CRT models was similar, except for the higher specificity of the CRT at 72 h compared to MLR analysis. Conclusion: There is an identifiable group of neonates with high risk of EOS who are likely to fail on conventional antibiotic therapy.
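The "capture" rate and specificity quoted above are the usual sensitivity and specificity of a binary decision rule. A minimal sketch of how these two figures come out of paired outcome and flag vectors is given below. The function name and the toy vectors are hypothetical and carry no clinical data.

```python
def sensitivity_specificity(failed, flagged):
    """Return (sensitivity, specificity) for paired boolean sequences.

    failed[i]  -- True if case i was a treatment failure (reference outcome)
    flagged[i] -- True if the decision rule flagged case i as high risk
    """
    pairs = list(zip(failed, flagged))
    tp = sum(1 for t, f in pairs if t and f)          # failures that were flagged
    fn = sum(1 for t, f in pairs if t and not f)      # failures that were missed
    tn = sum(1 for t, f in pairs if not t and not f)  # non-failures left unflagged
    fp = sum(1 for t, f in pairs if not t and f)      # non-failures flagged in error
    return tp / (tp + fn), tn / (tn + fp)

# Toy example: 4 failures (3 flagged), 4 non-failures (1 flagged in error)
failed = [True, True, True, True, False, False, False, False]
flagged = [True, True, True, False, False, False, False, True]
sens, spec = sensitivity_specificity(failed, flagged)
```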

Merila Mirjam

392

393

Buston, Peter M; Elith, Jane

393

A wavelet-based latent variable regression (WLVR) method was developed to perform simultaneous quantitative analysis of overlapping spectrophotometric signals. The quality of the noise removal was improved by combining wavelet thresholding with principal component analysis (PCA). A method for selecting the optimum threshold was also developed. Eight error functions were calculated for deducing the number of factors. The latent variables were made by projecting the wavelet-processed signals onto orthogonal basis eigenvectors. Two programs, WMRA and WLVR, were designed to perform wavelet thresholding and simultaneous multicomponent determination. Experimental results showed the WLVR method to be successful even where there was severe overlap of spectra.

Gao, Ling; Ren, Shouxin

2008-12-01

394

Cigarette Smoking Habits among Men and Women in Turkey: A Meta Regression Analysis

Sahin Mutlu, F.; Ayranci, U.; Ozdamar, K.

395

Sub-pixel estimation of tree cover and bare surface densities using regression tree analysis

Sub-pixel analysis is capable of generating continuous fields, which represent the spatial variability of certain thematic classes. The aim of this work was to develop numerical models to represent the variability of tree cover and bare surfaces within the study area. This research was conducted in the riparian buffer within a watershed of the São Francisco River in the North of Minas Gerais, Brazil. IKONOS and Landsat TM imagery were used with the GUIDE algorithm to construct the models. The results were two index images derived with regression trees for the entire study area, one representing tree cover and the other representing bare surface. The use of non-parametric and non-linear regression tree models presented satisfactory results to characterize wetland, deciduous and savanna patterns of forest formation.

Carlos Augusto Zangrando Toneli

396

Identification of cotton properties to improve yarn count quality by using regression analysis

The raw material characteristics that drive yarn count variation were identified using statistical techniques. Regression analysis is used to meet the objective. Stepwise regression is used for model selection, and the coefficient of determination and mean squared error (MSE) criteria are used to identify the cotton properties that contribute to yarn count. The statistical assumptions of normality, autocorrelation and multicollinearity are evaluated using the probability plot, the Durbin-Watson test and the variance inflation factor (VIF), and then model fitting is carried out. It is found that invisible (INV), nepness (Nep), grayness (RD), cotton trash (TR) and uniformity index (VI) are the main cotton properties contributing to yarn count variation. The results are also verified by a Pareto chart. (author)
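The Durbin-Watson test mentioned above checks regression residuals for first-order autocorrelation. A minimal sketch of the statistic is shown here, using made-up residual sequences rather than any data from the study. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Illustrative residual sequences (not from the study)
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # strongly negative autocorrelation
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])       # strongly positive autocorrelation
```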

397

Multifractal analysis of some multiple ergodic averages

In this paper we study the multiple ergodic averages $$ \frac{1}{n}\sum_{k=1}^n \varphi(x_k, x_{kq}, \cdots, x_{k q^{\ell-1}}), \qquad (x_n) \in \Sigma_m $$ on the symbolic space $\Sigma_m =\{0, 1, \cdots, m-1\}^{\mathbb{N}^*}$ where $m\ge 2, \ell\ge 2, q\ge 2$ are integers. We give a complete solution to the problem of multifractal analysis of the limit of the above multiple ergodic averages. Actually we develop a non-invariant and non-linear version of thermodynamic formalism that is of its...

Fan, Ai-hua; Schmeling, Joerg; Wu, Meng

398

399

Background: The next fifty years will see a drastic increase in the older population. Among other effects, ageing causes a decrease in strength. It is necessary to provide safe and comfortable environments for the elderly. To achieve this, digital human modelling has proved to be a useful and valuable ergonomic tool. Objective: To investigate age and gender effects on the torque-producing ability in the knee and elbow in older adults. To create strength scaled equations based on age, gender, upper/lower limb lengths and masses using multiple linear regression. To reduce the number of dependent parameters based on statistical redundancies, and then validate these equations. Methods: 283 subjects (141 males, 142 females) aged 50-59 years (54.9 +/- 2.9), 60-69 years (65.4 +/- 2.9) and 70-79 years (73.7 +/- 2.7) were tested for maximal voluntary isometric torque of right knee extensors and elbow flexors. Results: Males were significantly stronger than females across all age groups. Elbow peak torque (EPT) was better preserved from the 60s to the 70s, whereas knee peak torque (KPT) reduced significantly (P<0.05) across all age groups. This held true for males and females. Gender, thigh mass and age best predicted KPT (R^2 = 0.60). Gender, forearm mass and age best predicted EPT (R^2 = 0.75). Good cross-validation was established for both elbow and knee models. Conclusion: This cross-sectional study of muscle strength created and validated strength scaled equations of EPT and KPT using only gender, segment mass and age.

2012-01-01

400

A general method for predicting the intestinal absorption of a wide range of drugs using multiple regression analysis of their physicochemical properties and the drug-membrane electrostatic interaction was developed. The absorption rates of tested drugs from rat jejunum were measured by the in situ single-pass perfusion technique. The drugs used in this study were divided into three groups for regression analysis, and a smaller "test" set of compounds was used to assess the predictive capacity of the regression equation. When the analysis was applied to each respective group of drugs (i.e., anionic, cationic, and nonionized compounds), the regression coefficients obtained were 0.569, 0.821, 0.728 using the organic solvent (n-octanol)/buffer partition coefficient, 0.730, 0.734, 0.914 using the permeation rate across a silicon membrane, and 0.790, 0.915, 0.941 using an EVA membrane, respectively. However, smaller regression coefficients of 0.377, 0.468, and 0.718 were obtained when these three groups of drugs were put together for prediction. Meanwhile, correlation was improved remarkably when drug-membrane electrostatic interactions, namely, hydrogen-bonding donor (Halpha) and acceptor (Hbeta) activity or index of electricity (Ec), were added to the other parameters of lipophilicity and permeation rate across the EVA membrane (r = 0.880 and 0.883, respectively). Moreover, the equation obtained from these regression analyses was applicable even to the prediction of the absorption of the zwitterionic drugs. These results suggest that including the electrostatic interaction parameters in addition to lipophilicity and permeability across artificial membranes would afford a better prediction for the intestinal absorption of the vast majority of drugs. PMID:9687340

Sugawara, M; Takekuma, Y; Yamada, H; Kobayashi, M; Iseki, K; Miyazaki, K

400

International Nuclear Information System (INIS)

Computation of distance to fault on an electrical transmission line is affected by many sources of uncertainty, including parameter setting errors, measurement errors, as well as absence of information and incomplete modelling of a system under fault condition. In this paper we propose an application of variance-based global sensitivity measures for the evaluation of fault location algorithms. The main goal of the evaluation is to identify factors and their interactions that contribute to the fault locator output variability. This analysis is based on the results of Sparse Grid Regression. The method compiles the Functional ANOVA model to represent fault locator output as a function of uncertain factors. The ANOVA model provides a tool for interpretation and sensitivity analysis. In practice, such analysis can help in functional performance tests, especially in: selection of the optimal fault location algorithm (device) for a specific application, the calibration process, and building confidence in a fault location function result. The paper concludes with an application example which demonstrates use of the proposed methodology in testing and comparing some commonly used fault location algorithms. This example is also used to demonstrate the numerical efficiency of the proposed Sparse Grid Regression method for this type of application in comparison to the Quasi-Monte Carlo approach. Highlights: The Sparse Grid Regression (SGR) method has been developed and presented in the paper. The SGR method is able to fit an ANOVA model to input/output data of a black-box function. The SGR provides variance-based sensitivities to be used for Global Sensitivity Analysis (GSA). The SGR algorithm relies on numerical multi-dimensional integration on a sparse grid. The application example presented is GSA of fault-locating algorithms used in electrical networks.
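For orientation, the functional ANOVA model and the variance-based sensitivity measures referred to above are commonly written as follows. This is the standard textbook form for independent input factors, given here as a sketch; the notation ($f_0$, $f_i$, $V_i$, $S_i$) is assumed rather than taken from the paper.

```latex
% Functional ANOVA decomposition of the output f(x), x = (x_1, ..., x_d)
f(x) = f_0 + \sum_{i=1}^{d} f_i(x_i) + \sum_{1 \le i < j \le d} f_{ij}(x_i, x_j) + \cdots + f_{1 \ldots d}(x_1, \ldots, x_d)

% Corresponding decomposition of the output variance
V = \operatorname{Var}[f(x)] = \sum_{i=1}^{d} V_i + \sum_{1 \le i < j \le d} V_{ij} + \cdots,
\qquad V_i = \operatorname{Var}[f_i(x_i)]

% First-order (main effect) sensitivity index of factor i
S_i = \frac{V_i}{V}
```

Each $S_i$ lies between 0 and 1, and the interaction terms $V_{ij}$ account for the part of the variance that the main effects leave unexplained.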

401

Palatal rugae patterns are relatively unique to an individual and are well protected by the lips, buccal pad of fat and teeth. They are considered to be stable throughout life following completion of growth. Although there is considerable debate on the matter, they can be used successfully in post-mortem identification provided an antemortem record exists. Thus the aim of this study was to examine palatal rugae shape among two Indian populations and determine the accuracy in defining the Indian population using logistic regression analysis. The study comprised two groups from geographically different regions of India, with basic origin from Maharashtra and Karnataka states. The sample included 100 plaster casts equally distributed between the two populations and genders, with ages ranging between 18 and 40 years. An impression of the maxillary arch was obtained using alginate impression material and a plaster cast was made. The rugae were delineated on the cast using a sharp graphite pencil under adequate light and magnification and recorded according to the classification given by Kapali et al. and Thomas and Kotze (1983). Chi-square analysis showed a significant difference in wavy, circular and divergent patterns between the two populations. The straight and wavy forms were significant in logistic regression analysis. A predictive value of 71% was obtained in determining the original cases correctly when straight, wavy, curved and circular patterns were assessed. A predictive value of 70% was achieved when all rugae patterns were assessed. The mean number of rugae was greater in females than in males, with the straight pattern showing a statistically significant difference between males and females. A significant difference was recorded among straight, wavy, circular and divergent patterns between the two populations. Consequently, this study demonstrates moderate accuracy of palatal rugae pattern using logistic regression analysis in the identification of Indians. PMID:22018168

402

SciELO Colombia | Language: Spanish

One of the main assumptions of linear regression analysis is the existence of a causal relationship between the variables analyzed, which the regression analysis itself does not demonstrate. This paper demonstrates the causality between the variables analyzed through the construction and analysis of the feedback between the variables under study, expressed in a causal diagram and validated through dynamic simulation. The major contribution of this research is the proposal to use a system dynamics approach to develop a method of transition from a predictive multiple linear regression model to an explanatory simple nonlinear regression model, which increases the level of prediction of the model. The mean square error (MSE) is taken as the criterion for prediction. The validation of the transition model was performed with three linear regression models obtained experimentally in a textile company, showing a method for increasing the reliability of prediction models.

Roberto, Baeza-Serrato; José Antonio, Vázquez-López.

2014-06-01

403

The aim of this work was to use an iterative regression analysis for the determination of geophysical time series periodicities. This method gives the standard deviation associated with each of the three parameters determined for each periodicity. This feature allows the selection of the amplitudes with a higher amplitude/deviation ratio. The method used was compared with the periodogram method, Fourier analysis, and maximum entropy spectral analysis. It was then applied to the analysis of the main periodicities in time series of atmospheric cosmonuclides (atmospheric carbon-14 and beryllium-10 of ice cores from Greenland and Antarctica), mean surface temperatures, and indicators of atmospheric volcanic dust. For the time interval covered by these series, the periodicities found were compared from the point of view of possible causal associations between phenomena such as solar activity, cosmonuclides in the terrestrial atmosphere, atmospheric circulation, temperatures of the atmosphere, and volcanic dust in the atmosphere.

1994-03-01

404

Robust best linear estimation for regression analysis using surrogate and instrumental variables.

Wang, C Y

405

Directory of Open Access Journals (Sweden)

Javali Shivalingappa

2010-01-01

406

Analysis of multiple-choice items.

1991-04-01

407

Multiple scattering in neutron polarization analysis experiments

International Nuclear Information System (INIS)

A simple analytic method to correct the experimentally observed spin-flip and non-spin-flip scattering cross sections in neutron polarization analysis experiments for the effects of multiple scattering is presented. From known parameters of the constituent elements of a specimen and from the measured experimental cross sections the single scattering cross sections can readily be determined. This is particularly useful in situations where the scattering is isotropic or exhibits only slight angular dependence. (orig.)

408

A software tool was created in Fiscal Year 2010 (FY11) that enables multiple-regression correction of well water levels for river-stage effects. This task was conducted as part of the Remediation Science and Technology project of CH2MHILL Plateau Remediation Company (CHPRC). This document contains an overview of the correction methodology and a user’s manual for Multiple Regression in Excel (MRCX) v.1.1. It also contains a step-by-step tutorial that shows users how to use MRCX to correct river effects in two different wells. This report is accompanied by an enclosed CD that contains the MRCX installer application and files used in the tutorial exercises.

Mackley, Rob D.; Spane, Frank A.; Pulsipher, Trenton C.; Allwardt, Craig H.

409

410

KINETIC ANALYSIS OF HIGH-NITROGEN ENERGETIC MATERIALS USING MULTIVARIATE NONLINEAR REGRESSION

411

International Nuclear Information System (INIS)

The experimental data of ammonium exchange by natural Bigadic clinoptilolite were evaluated using nonlinear regression analysis. Three two-parameter isotherm models (Langmuir, Freundlich and Temkin) and three three-parameter isotherm models (Redlich-Peterson, Sips and Khan) were used to analyse the equilibrium data. The fit of the isotherm models was assessed using the sum of normalized errors (SNE) procedure and the coefficient of determination (R^2). The HYBRID error function provided the lowest sum of normalized errors, and the Khan model performed better in modeling the equilibrium data. Thermodynamic investigation indicated that ammonium removal by clinoptilolite was favorable at lower temperatures and exothermic in nature.
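To make the fitting criterion concrete, the sketch below evaluates the Langmuir isotherm and a HYBRID-type error between measured and modelled uptake. It assumes the commonly used forms q = q_max·K·C / (1 + K·C) and HYBRID = 100/(n − p) · Σ (q_exp − q_calc)² / q_exp. The abstract does not spell these out, and the parameter names and numbers are illustrative only.

```python
def langmuir(c_eq, q_max, k_l):
    """Langmuir isotherm: uptake at equilibrium concentration c_eq."""
    return q_max * k_l * c_eq / (1.0 + k_l * c_eq)

def hybrid_error(q_exp, q_calc, n_params):
    """HYBRID fractional error (assumed form) for n_params fitted parameters."""
    n = len(q_exp)
    total = sum((qe - qc) ** 2 / qe for qe, qc in zip(q_exp, q_calc))
    return 100.0 / (n - n_params) * total

# Illustrative values only (not data from the study)
half_saturation = langmuir(1.0, q_max=10.0, k_l=1.0)
perfect_fit = hybrid_error([2.0, 4.0, 5.0], [2.0, 4.0, 5.0], n_params=2)
one_miss = hybrid_error([2.0, 4.0, 5.0], [1.0, 4.0, 5.0], n_params=2)
```

In a nonlinear regression, an optimizer would adjust `q_max` and `k_l` to minimize such an error function. Normalizing each squared residual by the measured uptake keeps high-concentration points from dominating the fit.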

412

Gunay, Ahmet [Department of Environmental Engineering, Faculty of Engineering and Architecture, Balikesir University (Turkey)], E-mail: ahmetgunay2@gmail.com

413

Analysis of reactor noise by multi-variate auto-regressive model

The multi-variate auto-regressive model has recently been applied to the noise analysis of nuclear reactor systems. From such a standpoint a system identification study was performed at the Japan Power Demonstration Reactor-2 (JPDR-2), 45 MWt, using pseudo-random signals. The aim of this paper is to further extend and refine this identification problem based on the measured data. Emphasis is on the fact that the results obtained by the non-parametric method can be justified by the parametric one. The feedback map is also elucidated by estimating the noise contribution rate. Results of computation show the effectiveness of the procedure. (author)

414

Soil colour and spectral analysis employing linear regression models I. Effect of organic matter

Directory of Open Access Journals (Sweden)

Moustakas N.K.

415

International Nuclear Information System (INIS)

An uncertainty analysis method is proposed here, which uses the Fourier Amplitude Sensitivity Test (FAST) and the Stepwise Regression Technique (SRT). This method is a compromise between the approximation methods [response surface method (RSM) or moments method] and the Monte Carlo method (MCM). It is concluded that: 1. FAST gives the partial variance for each input parameter, which can be used as a global sensitivity ranking between input parameters, with a moderate number of sampling points compared to crude MCM. 2. SRT is a good tool to construct the later-used first- or second-order response surface model consisting of comparatively important parameters. 3. The combined uncertainty analysis method using FAST and SRT can be used for uncertainty/sensitivity analysis of large computer codes at moderate cost, and it will be a useful tool to analyze the feasibility of newly developed, highly uncertain system models.

416

This article discusses applications of stepwise and hierarchical multiple regression analyses to research in organizational psychology. Strategies for identifying type I and II errors, and solutions to potential problems that may arise from such errors, are proposed. In addition, phenomena such as suppression, complementarity, and redundancy are reviewed. The article presents examples of research where these phenomena occurred, and the manner in which they were explained by researchers. Some applications of multiple regression analyses to studies involving between-variable interactions are presented, along with tests used to analyze the presence of linearity among variables. Finally, some suggestions are provided for dealing with limitations implicit in multiple regression analyses (stepwise and hierarchical).

Gardênia Abbad

2002-01-01

417

We developed a new, indirect method for the determination of mineral substances, expressed as total ash content in bee honey varieties, based on a multiple regression model. This time-saving and effective method could serve as a new procedure in routine quality control plans of bee honey varieties. PMID:12653435

2003-02-01

418

Regression analysis of growth responses to water depth in three wetland plant species

Background and aims Plant species composition in wetlands and on lakeshores often shows dramatic zonation which is frequently ascribed to differences in flooding tolerance. This study compared the growth responses to water depth of three species (Phormium tenax, Carex secta, Typha orientalis) differing in depth preferences in wetlands, using non-linear and quantile regression analyses to establish how flooding tolerance can explain field zonation. Methodology Plants were established for 8 months in outdoor cultures in waterlogged soil without standing water, and then randomly allocated to water depths from 0 – 0.5 m. Morphological and growth responses to depth were followed for 54 days before harvest, and then analysed by repeated measures analysis of covariance, and non-linear and quantile regression analysis (QRA), to compare flooding tolerances. Principal results Growth responses to depth differed between the three species, and were non-linear. P. tenax growth rapidly decreased in standing water > 0.25 m depth, C. secta growth increased initially with depth but then decreased at depths > 0.30 m, accompanied by increased shoot height and decreased shoot density, and T. orientalis was unaffected by the 0 – 0.50 m depth range. In P. tenax the decrease in growth was associated with a decrease in the number of leaves produced per ramet and in C. secta the effect of water depth was greatest for the tallest shoots. Allocation patterns were unaffected by depth. Conclusions The responses are consistent with the principle that zonation in the field is primarily structured by competition in shallow water and by physiological flooding tolerance in deep water. Regression analyses, especially QRA, proved to be powerful tools in distinguishing genuine phenotypic responses to water depth from non-phenotypic variation due to size and developmental differences.
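Quantile regression, as used above, replaces the squared-error criterion of ordinary regression with an asymmetric absolute loss (often called the check or pinball loss). The minimal sketch below shows that loss and how minimizing it over a constant recovers a sample quantile. The data and function names are illustrative and unrelated to the study.

```python
def pinball_loss(residual, tau):
    """Check loss for quantile level tau (0 < tau < 1).

    Positive residuals are weighted by tau, negative ones by (1 - tau).
    """
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def best_constant(data, tau):
    """Value among the data points that minimizes the total pinball loss."""
    return min(data, key=lambda c: sum(pinball_loss(x - c, tau) for x in data))

# Illustrative data with one extreme value
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
median_like = best_constant(sample, tau=0.5)
```

Because the loss grows linearly rather than quadratically, the fitted quantile is little affected by the extreme value in `sample`. A full quantile regression minimizes the same loss over the coefficients of a model instead of a single constant.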

Sorrell, Brian K; Tanner, Chris C

2012-01-01

419

Performance Analysis of Multiple User Optical Code Division Multiple Access

In this paper, we discuss and analyze an optical code division multiple access for multiple user system. Media access control implementation has been considered. For fulfilling the huge need of bandwidth services, technology tends to move to optical networks and three major optical systems come into existence. Code division of the optical network is most used and real concept i...

Himanshu Monga; Kaler, R. S.

420

Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem that can be treated as a regression problem with multiple inputs and multiple outputs, and which can be solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important task for pre-processing the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. Also, the genetic algorithm and the simplex optimization method are invoked to determine the hyper-parameters of the SVR. Based on the realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) can perform better than that without feature extraction (single SVR) in terms of the reconstruction of the TMPs on epi- and endocardial surfaces. Moreover, compared with the PCA-SVR, the KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.

Jiang, Mingfeng; Zhu, Lingyan; Wang, Yaming; Xia, Ling; Shou, Guofa; Liu, Feng; Crozier, Stuart

2011-03-01

421

Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem that can be treated as a regression problem with multiple inputs and multiple outputs, and which can be solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important task for pre-processing the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. Also, the genetic algorithm and the simplex optimization method are invoked to determine the hyper-parameters of the SVR. Based on the realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) can perform better than that without feature extraction (single SVR) in terms of the reconstruction of the TMPs on epi- and endocardial surfaces. Moreover, compared with the PCA-SVR, the KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.

Jiang Mingfeng; Wang Yaming [College of Electronics and Informatics, Zhejiang Sci-Tech University, Hangzhou 310018 (China); Zhu Lingyan [Dongfang College, Zhejiang University of Finance and Economics, Hangzhou, 310018 (China); Xia Ling; Shou Guofa; Liu Feng [Department of Biomedical Engineering, Zhejiang University, Hangzhou 310027 (China); Crozier, Stuart, E-mail: peterjiang0517@163.com, E-mail: jiang.mingfeng@hotmail.com [School of Information Technology and Electrical Engineering, University of Queensland, St Lucia, Brisbane, Queensland 4072 (Australia)

422

Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem that can be treated as a regression problem with multiple inputs and multiple outputs, and which can be solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important task for pre-processing the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. Also, the genetic algorithm and the simplex optimization method are invoked to determine the hyper-parameters of the SVR. Based on the realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) can perform better than that without feature extraction (single SVR) in terms of the reconstruction of the TMPs on epi- and endocardial surfaces. Moreover, compared with the PCA-SVR, the KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.

423

Local linear regression for function learning: an analysis based on sample discrepancy.

Local linear regression models, a kind of nonparametric structure that locally performs a linear estimation of the target function, are analyzed in the context of empirical risk minimization (ERM) for function learning. The analysis is carried out with emphasis on geometric properties of the available data. In particular, the discrepancy of the observation points used both to build the local regression models and to compute the empirical risk is considered. This allows the case in which the samples come from a random external source and the case in which the input space can be freely explored to be treated in the same way. Both the consistency of the ERM procedure and the approximating capabilities of the estimator are analyzed, proving conditions to ensure convergence. Since the theoretical analysis shows that the estimation improves as the discrepancy of the observation points becomes smaller, low-discrepancy sequences, a family of sampling methods commonly employed for efficient numerical integration, are also analyzed. Simulation results involving two different examples of function learning are provided. PMID:25330431
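As a concrete picture of what a local linear model computes, here is a minimal one-dimensional sketch: at a query point, a straight line is fitted by kernel-weighted least squares and its value at that point is returned. The Gaussian kernel, the bandwidth `h`, and the function name are assumptions made for illustration. The paper's own estimator and weighting scheme may differ.

```python
from math import exp

def local_linear(xs, ys, x0, h):
    """Local linear estimate at x0 using a Gaussian kernel of bandwidth h."""
    w = [exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]  # kernel weights
    d = [x - x0 for x in xs]                           # centred inputs
    s0 = sum(w)
    s1 = sum(wi * di for wi, di in zip(w, d))
    s2 = sum(wi * di * di for wi, di in zip(w, d))
    t0 = sum(wi * yi for wi, yi in zip(w, ys))
    t1 = sum(wi * di * yi for wi, di, yi in zip(w, d, ys))
    # Intercept of the weighted least-squares line a + b * (x - x0)
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)

# Illustrative check on exactly linear data y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
estimate = local_linear(xs, ys, x0=2.5, h=1.0)
```

On exactly linear data the estimate reproduces the line at any query point, which is a quick sanity check. How well the observation points cover the input space then governs accuracy on nonlinear targets, which is where the discrepancy of the sample enters.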

2014-11-01

Medium Density Fiberboard (MDF) panels are appropriate for many exterior and interior industrial applications. The degree of surface roughness of MDF plays an important role, since any surface irregularities will affect the final quality of the product. In the present study, regression models were developed to predict surface roughness in drilling MDF panels with carbide step drills. In the development of the predictive models, the drilling parameters of spindle speed, feed rate and drill diameter were considered as model variables. For this purpose, Taguchi's design of experiments was carried out in order to collect surface roughness values. The Orthogonal Array (OA) and Analysis of Variance (ANOVA) were employed to study the surface roughness characteristics in the drilling of MDF panels. The objective was to establish a correlation between spindle speed, feed rate and drill diameter and the surface roughness in an MDF panel. The experiments were conducted as per the Taguchi L27 orthogonal array with different cutting conditions. ANOVA and the F-test were used to check the validity of the regression model and to determine the significant parameters affecting the surface roughness. The statistical analysis showed that feed rate was the most influential parameter on surface roughness. The microstructure of the drilled surfaces was also studied by scanning electron microscopy (SEM). The SEM investigations revealed that drilling MDF panels with a step drill produces surface striations and waviness, which increased significantly with feed rate.

424

2012-01-01

425

Air is an efficient medium for the dispersal of atmospheric pollutants, and its behavior depends on the atmospheric movements that occur in the troposphere. In Porto Alegre, Rio Grande do Sul State, there is heavy daily traffic and a concentration of industries that may be responsible for atmospheric emissions. In the present work we studied the behavior of daily concentrations of particulate matter (PM10) in this city, considering the influence of meteorological variables. Data analysis was performed using descriptive statistics, linear correlation and multiple regression. Data were provided by the Fundação Estadual de Proteção Ambiental Henrique Luiz Roessler - RS (FEPAM) and the Instituto Nacional de Meteorologia (INMET). The analysis showed that the concentrations of PM10, measured every day at 4:00 p.m., did not exceed national air quality standards; that the meteorological elements negatively related to PM10 concentrations were the daily average wind speed and the daily average radiation; and that the daily average air temperature and the north and northwest wind directions were positively related to them. The wind directions that contribute significantly to lower concentrations at the measured sites are east and southeast.

Rosana de Cassia de Souza Schneider

2011-03-01

425

International Nuclear Information System (INIS)

427

We report the first systems biology investigation of regulators controlling arterial plaque macrophage transcriptional changes in response to lipid lowering in vivo in two distinct mouse models of atherosclerosis regression. Transcriptome measurements from plaque macrophages from the Reversa mouse were integrated with measurements from an aortic transplant-based mouse model of plaque regression. Functional relevance of the genes detected as differentially expressed in plaque macrophages in response to lipid lowering in vivo was assessed through analysis of gene functional annotations, overlap with in vitro foam cell studies, and overlap of associated eQTLs with human atherosclerosis/CAD risk SNPs. To identify transcription factors that control plaque